
Power law extrapolation

(Warning: Just a toy model / very crude extrapolation, with no connection to reality.)

Background: https://twitter.com/svat/status/1814587528986419633

As of this revision, the article said:

wide use of Microsoft Windows and CrowdStrike software by large and global corporations in many business sectors. At the time of the incident, CrowdStrike said it had more than 24,000 customers, including nearly 60% of Fortune 500 companies and more than half of the Fortune 1,000.

Just as a toy mathematical model, can we, from these numbers and some power-law assumption, get some estimate of what fraction of organizations were affected, as a function of their size/rank? That is, some number that would vary from 0.6 for “Fortune 500 companies” down to 0 for home users and small companies (not using CrowdStrike).

Well sure, it’s possible to draw a power law curve through two points :-)

(Unjustified) Assumptions:

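With these assumptions, we can work out the values of the constants $c$ and $\alpha$ from the two data points. Let $S_n = \sum_{k=1}^{n} p_k$. Then

$$S_{500} = c \sum_{k=1}^{500} k^{-\alpha} = 300, \qquad S_{1000} = c \sum_{k=1}^{1000} k^{-\alpha} = 500.$$

By dividing, we can calculate the value of $\alpha$ numerically: it is the value such that, if we define

$$f(\alpha) = \frac{\sum_{k=1}^{500} k^{-\alpha}}{\sum_{k=1}^{1000} k^{-\alpha}},$$

then $f(\alpha) = 3/5$.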

(More standard here may be the continuous approximation, or something formal with Hurwitz zeta function or whatever. But we can just do things numerically…)
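For comparison, here is a rough sketch of that continuous approximation. It assumes each sum can be replaced by the integral of $x^{-\alpha}$ from 0 to $n$ (which needs $\alpha < 1$), so the ratio becomes $(1/2)^{1-\alpha}$ and $\alpha$ has a closed form. The name `alpha_cont` is illustrative only, and `s` and `f` are written out here just to keep the snippet self-contained.

```python
import math

def s(a, n):
    # Discrete sum of k^(-a) for k = 1..n.
    return sum(k ** -a for k in range(1, n + 1))

def f(a):
    # Discrete ratio S_500 / S_1000 (the constant c cancels).
    return s(a, 500) / s(a, 1000)

# Continuous approximation (an assumption): sum_{k<=n} k^(-a) ~ n^(1-a) / (1-a),
# so f(a) ~ (500/1000)^(1-a). Setting that equal to 3/5 and solving for a:
alpha_cont = 1 - math.log(3 / 5) / math.log(1 / 2)

print(alpha_cont)     # closed-form estimate of alpha
print(f(alpha_cont))  # discrete ratio at that estimate, to compare with 0.6
```

The estimate comes out near 0.263, close to but not identical to what the discrete sums give.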

This is what f looks like as a function of α:

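# prompt: Plot f as a function of a, from a = -10 to a = 10
import matplotlib.pyplot as plt
import numpy as np

# Helper definitions, written to match how s and f are used in this post.
def s(a, n):
  # Sum of k^(-a) for k = 1..n, so that S_n = c * s(a, n).
  return sum(k**-a for k in range(1, n + 1))

def f(a):
  # Ratio S_500 / S_1000; the constant c cancels out.
  return s(a, 500) / s(a, 1000)

x = np.linspace(-10, 10, 100)
y = [f(a) for a in x]

plt.plot(x, y)
plt.xlabel('a')
plt.ylabel('f(a)')
plt.show()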

So finding α numerically:

lo = -10.0
hi = 10.0
eps = 1e-8
while hi - lo > eps:
  mid = (lo + hi) / 2
  if f(mid) < 0.6: lo = mid
  else: hi = mid
print(lo, hi)
# 0.2662351727485657 0.2662351820617914

gives $\alpha \approx 0.266235$, which we can plug back in to get $c$:

a = 0.2662351789580996
print(300 / s(a, 500), 500 / s(a, 1000))
# 2.3161288835449767 2.3161288835449767

So we have our power-law distribution: $p_k = c k^{-\alpha}$ with $c \approx 2.3161$ and $\alpha \approx 0.2662$.

We can further truncate this to have total 24000:

def p(k): return 2.3161288835449767 * k**-0.2662351789580996
# Running total of p(k); named "total" so it does not shadow the helper s(a, n).
total = 0
n = 0
while total < 24000:
  n += 1
  total += p(n)
print(n, total)
# 194633 24000.062311828504
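As a sanity check on that cutoff, the same continuous approximation gives a closed-form estimate. This is only a sketch: it assumes the partial sum behaves like $c\,n^{1-\alpha}/(1-\alpha)$, and the names `total_customers` and `n_est` are illustrative.

```python
c = 2.3161288835449767
alpha = 0.2662351789580996
total_customers = 24000

# Continuous approximation (an assumption):
#   c * n^(1 - alpha) / (1 - alpha) ~ total_customers
# Solving for n gives a quick estimate of the cutoff rank.
n_est = (total_customers * (1 - alpha) / c) ** (1 / (1 - alpha))
print(round(n_est))
```

The estimate lands within about 1% of the cutoff found by the loop, which is as much agreement as a toy model deserves.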

This gives an updated function:

$$p_k = \begin{cases} c k^{-\alpha} & \text{for } k \le 194633 \\ 0 & \text{for } k > 194633 \end{cases}$$

(The sharp cut-off from $p_k \approx 0.09$ to $p_k = 0$ shows that this toy power-law model is probably not a good reflection of reality, in case there was any belief left… but whatever.)

Let’s make it an interactive tool (used Claude): change the value of $k$ below (the “rank” of your company) to see the corresponding value $p_k$, or vice versa:

For companies around rank 10000, about 19.94% of them are customers (per the toy model)
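For readers without the interactive widget, the same lookup can be sketched in plain Python. The function names `p_of_k` and `k_of_p` are made up for this example, and the inverse uses the untruncated curve, so it is only meaningful for ranks up to the cutoff.

```python
C = 2.3161288835449767
ALPHA = 0.2662351789580996
CUTOFF = 194633

def p_of_k(k):
    """Modelled fraction of companies around rank k that are customers."""
    return C * k ** -ALPHA if 1 <= k <= CUTOFF else 0.0

def k_of_p(p):
    """Rank at which the untruncated curve c * k^(-alpha) equals p (needs p > 0)."""
    return (C / p) ** (1 / ALPHA)

print(f"{p_of_k(10000):.2%}")         # fraction at rank 10000
print(round(k_of_p(p_of_k(10000))))   # round trip back to the rank
```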

Edit: The same thing as a plot, again thanks to Claude:

And when viewed as a plot it shows why the model is nonsense, as it gives $p_k$ values greater than 1 for small $k$.
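To see how far the problem extends, here is a small sketch that counts the ranks where the fitted curve exceeds 1. It relies on $p_k$ being decreasing in $k$, and the name `k_max` is illustrative.

```python
c = 2.3161288835449767
alpha = 0.2662351789580996

def p(k):
    # Fitted toy model: p_k = c * k^(-alpha).
    return c * k ** -alpha

# p(k) decreases with k, so the ranks with p(k) > 1 form a prefix 1..k_max.
k_max = 0
while p(k_max + 1) > 1:
    k_max += 1

print(k_max, p(k_max), p(k_max + 1))
```

Equivalently, $p_k > 1$ exactly when $k < c^{1/\alpha}$, so roughly the first two dozen ranks get a “fraction” above 100%.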