Power law extrapolation
(Warning: Just a toy model / very crude extrapolation, with no connection to reality.)
Background: https://twitter.com/svat/status/1814587528986419633
As of this revision, the article said:
wide use of Microsoft Windows and CrowdStrike software by large and global corporations in many business sectors. At the time of the incident, CrowdStrike said it had more than 24,000 customers, including nearly 60% of Fortune 500 companies and more than half of the Fortune 1,000.
Just as a toy mathematical model, can we, from these numbers and some power-law assumption, get some estimate of what fraction of organizations were affected, as a function of their size/rank: some number that would vary from 0.6 for “Fortune 500 companies”, down to 0 for home users/small companies (not using CrowdStrike)?
Well sure, it’s possible to draw a power law curve through two points :-)
(Unjustified) Assumptions:
- Every company has a distinct rank 1, 2, 3, ….
- For the company with rank $k$, there is an associated number $p_k$, where $0 \le p_k \le 1$.
- The sum of $p_k$ for $1 \le k \le 500$ is $300$. (This is treating “nearly 60% of Fortune 500 companies” as “exactly 300 of the top 500 companies”.)
- The sum of $p_k$ for $1 \le k \le 1000$ is $500$. (This is treating “more than half of the Fortune 1000” as “exactly 500 of the top 1000 companies”.)
- $p_k$ is a decreasing/non-increasing function of $k$, that moreover varies as a power law: $p_k = c\,k^{-a}$ for some constants $c$ and $a$.
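With these assumptions, we can work out the values of the constants $c$ and $a$:
$$c \sum_{k=1}^{500} k^{-a} = 300, \qquad c \sum_{k=1}^{1000} k^{-a} = 500.$$
By dividing, we can calculate the value of $a$:
$$f(a) = \frac{\sum_{k=1}^{500} k^{-a}}{\sum_{k=1}^{1000} k^{-a}} = \frac{300}{500} = 0.6.$$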
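The code further down calls `s(a, n)` and `f(a)` without showing their definitions. A minimal sketch, assuming `s(a, n)` is the partial sum of `k**-a` for `k` from 1 to `n` and `f(a)` is the ratio of the top-500 sum to the top-1000 sum (an assumption, but one that matches the later calls `s(a, 500)`, `s(a, 1000)` and the test `f(mid) < 0.6`):

```python
def s(a, n):
    """Partial sum 1**-a + 2**-a + ... + n**-a (assumed definition)."""
    return sum(k ** -a for k in range(1, n + 1))


def f(a):
    """Ratio of the top-500 sum to the top-1000 sum; the constant c cancels."""
    return s(a, 500) / s(a, 1000)


# With a = 0 every term is 1, so the ratio is simply 500 / 1000.
print(f(0.0))  # 0.5
```

Under this definition `f` increases with `a`: a larger exponent puts more of the total weight on the top ranks.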
(More standard here may be the continuous approximation, or something formal with Hurwitz zeta function or whatever. But we can just do things numerically…)
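For comparison, the continuous approximation can be sketched in a couple of lines. Assuming each sum is replaced by the integral of $x^{-a}$ from 0 to $n$, which is $n^{1-a}/(1-a)$ for $a < 1$, the ratio becomes $(500/1000)^{1-a} = 0.6$ and can be solved for $a$ directly. This is only a rough check, not the calculation used below:

```python
import math

# Continuous approximation (an assumption): sum_{k<=n} k**-a ~ n**(1-a) / (1-a).
# Then (500/1000)**(1-a) = 0.6, so 1 - a = log(0.6) / log(0.5).
a_cont = 1 - math.log(0.6) / math.log(0.5)
print(a_cont)
```

The result is roughly 0.263, which should be close to, but not identical with, the value found numerically from the discrete sums.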
This is what $f(a)$ looks like:
# prompt: Plot f as a function of a, from a = -10 to a = 10
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-10, 10, 100)
y = [f(a) for a in x]
plt.plot(x, y)
plt.xlabel('a')
plt.ylabel('f(a)')
plt.show()
So finding the $a$ for which $f(a) = 0.6$, by bisection:
lo = -10.0
hi = 10.0
eps = 1e-8
while hi - lo > eps:
    mid = (lo + hi) / 2
    # f is increasing in a, so the root of f(a) = 0.6 lies above mid when f(mid) < 0.6
    if f(mid) < 0.6: lo = mid
    else: hi = mid
print(lo, hi)
# 0.2662351727485657 0.2662351820617914
gives
a = 0.2662351789580996
print(300 / s(a, 500), 500 / s(a, 1000))
# 2.3161288835449767 2.3161288835449767
So we have our power-law distribution: $p_k \approx 2.316\,k^{-0.2662}$.
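As a quick sanity check, the two fitted constants printed above can be plugged back into the original constraints. A small sketch (the names `top500` and `top1000` are just for illustration):

```python
a = 0.2662351789580996
c = 2.3161288835449767


def p(k):
    """Toy-model value for the company of rank k."""
    return c * k ** -a


# The fit was chosen so that these sums come out as 300 and 500.
top500 = sum(p(k) for k in range(1, 501))
top1000 = sum(p(k) for k in range(1, 1001))
print(round(top500, 3), round(top1000, 3))  # 300.0 500.0
```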
We can further truncate this to have total $\sum_k p_k = 24000$ (the number of customers):
def p(k): return 2.3161288835449767 * k**-0.2662351789580996
total = 0  # running sum of p(k); named `total` to avoid shadowing the function s(a, n)
n = 0
while total < 24000:
    n += 1
    total += p(n)
print(n, total)
# 194633 24000.062311828504
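The cutoff can also be cross-checked against a closed form. A sketch, assuming the sum is replaced by the integral $c\,n^{1-a}/(1-a)$ (the continuous approximation, not the exact sum) and setting it equal to 24000:

```python
a = 0.2662351789580996
c = 2.3161288835449767
customers = 24000

# Solve c * n**(1 - a) / (1 - a) = customers for n (continuous approximation).
n_est = (customers * (1 - a) / c) ** (1 / (1 - a))
print(round(n_est))
```

This should land close to the 194633 found by the loop; the small gap comes from approximating the sum by an integral.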
This gives an updated function: $p_k = 2.316\,k^{-0.2662}$ for $1 \le k \le 194633$, and $p_k = 0$ for $k > 194633$.
(The sharp cut-off from $p_{194633} \approx 0.09$ to $p_{194634} = 0$ is an artifact of the truncation.)
Let’s make it an interactive tool (used Claude): change the value of the rank $k$ to see the modeled fraction of customers around that rank. For example:
For companies around rank 10000, about 19.94% of them are customers (per the toy model)
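The same lookup can be sketched as a plain function, using the constants and cutoff computed above. The function name `fraction_at_rank` is made up for illustration, and the value is deliberately left uncapped, exactly as the toy model gives it:

```python
A = 0.2662351789580996
C = 2.3161288835449767
CUTOFF = 194633  # last rank with a nonzero value after truncation


def fraction_at_rank(k):
    """Toy-model fraction of companies around rank k that are customers."""
    if k < 1 or k > CUTOFF:
        return 0.0
    return C * k ** -A


print(f"{100 * fraction_at_rank(10000):.2f}%")  # 19.94%
```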
Edit: The same thing as a plot, again thanks to Claude:
And when viewed as a plot it shows why the model is nonsense, as it gives values of $p_k$ greater than 1 for the top-ranked companies.
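One concrete problem can be read off the fitted constants directly: since $c \approx 2.316 > 1$, the curve exceeds 1 for the smallest ranks, which is impossible for a fraction. A short sketch that counts how many ranks are affected:

```python
a = 0.2662351789580996
c = 2.3161288835449767

# Ranks for which the toy model claims more than 100% of companies are customers.
over = [k for k in range(1, 1001) if c * k ** -a > 1]
print(len(over), over[-1])  # 23 23
```

So, per the toy model, each of the top 23 companies would be "more than certain" to be a customer.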