Interesting article, and a phenomenon I hadn't thought about before. One thing t...

time_management · on Sept 24, 2008

Benford's Law applies to distributions that span several orders of magnitude and are locally flat in log space. It obviously doesn't apply to adult male heights in inches, which are roughly normal with mean 69 and standard deviation 3, so leading digits of 6 and 7 dominate. In log space, this distribution is not "flat" over an order of magnitude; it's a steep "bump". However, incomes, urban populations, file sizes, and natural disaster tolls (measured in dollar cost or fatalities) do follow such distributions, and hence we observe Benford's Law.

To a person with some statistical background, I usually explain Benford's Law in the context of town and city populations. Obviously, there's great variation in the populations of chartered cities, from as low as 10 to over 10 million. I note there is nothing "natural" about normal distributions other than the fact that they emerge from the addition of a large number of (finite variance) variables, a process that approximately describes the determination of adult height from genes. Then consider the variables that determine a city's population. Being near water might increase the population by 50%. A strong economy over a decade might lead to 40% growth. Oppressive taxes might decrease the population by 20%. It doesn't matter what these numbers actually are; the point is that the population is the product rather than the sum of such variables, so we get an approximately normal distribution in log space. The base-10 log of population ranges from as low as 1.0 to over 7.0, which means we can expect approximate flatness over [4.0, 5.0). Then approximately 30% of cities between 10,000 and 100,000 people will be between 10,000 and 20,000, which is [4.0, 4.3) in log space, since log10(2) is about 0.30.
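The multiplicative story above can be tried out with a toy simulation. This is only a sketch under made-up assumptions: the starting size of 1000, the 60 rounds, and growth factors drawn uniformly between 0.5 and 2.0 are all hypothetical, chosen so the products spread over several orders of magnitude. The names `simulate_populations` and `digit_shares` are illustrative, not from the thread.

```python
import random
from collections import Counter


def leading_digit(x):
    """First significant decimal digit of a positive float."""
    # Scientific notation always starts with the leading digit, e.g. "3.28...e+00".
    return int(f"{x:.17e}"[0])


def simulate_populations(n_cities=10_000, n_factors=60, seed=0):
    """Toy model: each 'population' is a product of many random factors."""
    rng = random.Random(seed)
    populations = []
    for _ in range(n_cities):
        pop = 1000.0  # arbitrary starting size (assumption)
        for _ in range(n_factors):
            # Hypothetical factor: anything from halving to doubling.
            pop *= rng.uniform(0.5, 2.0)
        populations.append(pop)
    return populations


def digit_shares(values):
    """Fraction of values starting with each digit 1..9."""
    counts = Counter(leading_digit(v) for v in values)
    return {d: counts[d] / len(values) for d in range(1, 10)}


if __name__ == "__main__":
    for digit, share in digit_shares(simulate_populations()).items():
        print(digit, f"{share:.3f}")
```

With these settings the share of leading 1s should land close to 30%, and the shares should fall off toward 9. The absolute sizes produced are meaningless; only the spread in log space matters.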

I like the fixed-point/scale-invariance explanation of Benford's Law better though, because it's more intuitive than the one I use. Still, it's not completely satisfying. It doesn't explain why Benford's Law holds for quantities that have no arbitrary unit, such as file sizes, urban populations, and fatality figures in natural disasters (inches and dollars are purely arbitrary units, while numbers of people or bits are not).

scott_s · on Sept 24, 2008

To be clear: are you saying denglish's rationale is incorrect? (I ask because it feels legit, but, alas, that doesn't mean it is.)

time_management · on Sept 24, 2008

His rationale is not incorrect but incomplete.

Essentially, he's arguing that since the Benford distribution of leading digits is the sole fixed point under the scaling operation, it's the most natural distribution to expect in a large collection of measurements. Since units of measurement (e.g. dollars, meters, miles) represent arbitrary quantities, and the data set could be examined using literally any unit of measure (a change of unit being a scaling operation, e.g. meters -> feet multiplies each datum by 3.28), a sufficiently large set of measured data (e.g. an almanac) can be expected to obey Benford's distribution.
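The fixed-point claim is easy to check numerically. The sketch below builds a deterministic stand-in for Benford-distributed data (values whose base-10 logs are evenly spread over one decade, an assumption made purely for illustration), rescales it by roughly 3.28 feet per meter, and compares leading-digit shares before and after. The names `meters`, `feet`, and `benford` are hypothetical.

```python
import math


def leading_digit(x):
    """First significant decimal digit of a positive float."""
    return int(f"{x:.17e}"[0])


def benford(d):
    """Benford probability of leading digit d."""
    return math.log10(1 + 1 / d)


def digit_shares(values):
    """List where index d holds the fraction of values with leading digit d."""
    counts = [0] * 10
    for v in values:
        counts[leading_digit(v)] += 1
    return [c / len(values) for c in counts]


N = 50_000
# Stand-in data: log10 of the values is evenly spaced over [0, 1).
meters = [10 ** ((i + 0.5) / N) for i in range(N)]
# Change of unit = multiply every datum by the same constant.
feet = [m * 3.28 for m in meters]

if __name__ == "__main__":
    before, after = digit_shares(meters), digit_shares(feet)
    for d in range(1, 10):
        print(d, f"{benford(d):.4f}", f"{before[d]:.4f}", f"{after[d]:.4f}")
```

Any other positive scaling factor in place of 3.28 should leave the digit shares equally unchanged, which is the sense in which the distribution is a fixed point.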

Benford's Law is also not true of specific distributions that are very tight. Consider IQ. That the mean is 100 is completely arbitrary, but the standard deviation of ~15% is not. Observed ratio IQs in healthy children are log-normal with a multiplicative standard deviation of 1.15-1.16; in other words, the 85th-percentile 6-year-old will have the cognitive maturity of an average 7-year-old, a fact that is independent of the unit of measure. (Adult "deviation IQs" are a different matter entirely, as they are "forced" to conform to a normal distribution, e.g. a person who scores in the 99.0th percentile will be "assigned" a z-score of 2.33, corresponding to an IQ of 135.) Obviously, with 50% of IQs having a leading digit of 1 and almost none having a leading digit of 2 or 3, this is not a Benford distribution. You could use a different arbitrary scaling factor, setting the median to 50 instead of 100, but then leading digits of 4, 5, and 6 would be overrepresented, with virtually no 1s or 2s. The issue, of course, is that normal IQs are very tightly distributed in log space and come nowhere near spanning an order of magnitude, so we will never get a Benford distribution no matter what scaling factor we choose.
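To put numbers on the IQ example, here is a small sketch that computes leading-digit probabilities for a log-normal variable directly from the normal CDF. It assumes the log-normal model with a multiplicative standard deviation of 1.15 described above; the function name `lognormal_digit_probs` and the range of decades summed over are illustrative choices.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def lognormal_digit_probs(median, mult_sd, decades=range(-3, 6)):
    """Leading-digit probabilities for a log-normal with the given median
    and multiplicative standard deviation."""
    mu, sigma = math.log(median), math.log(mult_sd)
    probs = {}
    for d in range(1, 10):
        total = 0.0
        for k in decades:
            # Values in [d * 10^k, (d + 1) * 10^k) have leading digit d.
            lo, hi = d * 10.0 ** k, (d + 1) * 10.0 ** k
            total += normal_cdf((math.log(hi) - mu) / sigma)
            total -= normal_cdf((math.log(lo) - mu) / sigma)
        probs[d] = total
    return probs


if __name__ == "__main__":
    for median in (100, 50):
        probs = lognormal_digit_probs(median, 1.15)
        print(median, {d: round(p, 3) for d, p in probs.items()})
```

With the median at 100, about half the mass sits on a leading 1 and the rest mostly on 8 and 9. Moving the median to 50 shifts the mass to the middle digits, but in neither case does the result resemble Benford's distribution.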

The other problem with the OP's argument is that it doesn't apply to quantities like fatality counts in natural disasters, or sizes of cities, neither of which involves an arbitrary unit, but both of which exhibit Benford-esque distributions, due to the multiplicative rather than additive compilation of the variables involved. An additive compilation (e.g. sum) of a large number of variables (e.g. height from genes) exhibits a normal distribution, to which Benford's Law does not apply. However, a multiplicative compilation (e.g. product) of a large number of random variables will have a log-normal distribution, and if the variation of X is over many orders of magnitude, its distribution will be locally flat enough (in log space) that Y - floor(Y), where Y = log10 X, will be approximately a uniform choice out of [0, 1), leading to the Benford distribution.
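That last step can be written out explicitly. As a sketch, take base-10 logs and write D for the leading digit of X (notation introduced here for convenience), and assume Y - floor(Y) is approximately uniform on [0, 1):

```latex
\begin{align*}
D = d &\iff d \cdot 10^{k} \le X < (d+1) \cdot 10^{k} \text{ for some integer } k \\
      &\iff \log_{10} d \le Y - \lfloor Y \rfloor < \log_{10}(d+1), \\
P(D = d) &\approx \log_{10}(d+1) - \log_{10} d
          = \log_{10}\!\left(1 + \tfrac{1}{d}\right), \qquad d = 1, \dots, 9.
\end{align*}
```

For d = 1 this gives log10(2), or about 0.301, and the nine probabilities sum to log10(10) = 1.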