
Benford's Law - Hooke
https://en.wikipedia.org/wiki/Benford%27s_law
======
Symmetry
Did my thesis trying to apply insights from Benford's law to the design of the
adders in your CPU. The idea was you make your low order bits, which see more
activity, slower and more power efficient and your high order bits faster and
less efficient. Meet the same timing overall but with less energy on average.
Sadly, memory addresses were close to pure entropy for my purposes and the
savings I was able to get were only around 5%, not enough for all the effort.

~~~
lisper
Benford's law works for random variables drawn from a uniform distribution
whose upper bound is itself a random variable with a uniform distribution. But
memory addresses are not drawn from such a distribution. The upper bound of
the distribution of memory addresses is almost always a power of 2.
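
A quick way to probe this claim is to simulate it. The sketch below is only an illustration under assumed parameters: the 10**6 cap on the upper bound, the sample size, and the function names are all made up here. It draws an upper bound uniformly, then a value uniformly below that bound, and tallies leading digits so they can be compared against log10(1 + 1/d).

```python
import math
import random
from collections import Counter


def leading_digit(x):
    """Return the first significant decimal digit of a positive number."""
    while x < 1:
        x *= 10
    while x >= 10:
        x /= 10
    return int(x)


def simulate(n_samples=100_000, max_bound=10**6, seed=0):
    """Leading-digit frequencies for uniform draws under a uniform random bound.

    max_bound is an arbitrary assumption, not something from the thread.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        upper = rng.uniform(1, max_bound)  # the random upper bound
        x = rng.uniform(0, upper)          # the value drawn below that bound
        if x > 0:                          # skip the (very unlikely) exact zero
            counts[leading_digit(x)] += 1
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}


if __name__ == "__main__":
    freqs = simulate()
    for d in range(1, 10):
        print(d, round(freqs[d], 4), round(math.log10(1 + 1 / d), 4))
```

Printing the two columns side by side shows how closely this particular construction tracks the Benford frequencies for the chosen cap.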

~~~
Symmetry
Yes, the fact that adders deal with so many memory addresses in practice means
that the distribution of numbers I saw didn't end up working very much like
Benford's law. The non-address data did but they weren't a large enough
fraction of what was added together.

------
corpMaverick
Benford's Law has been widely criticized as a technique to detect election
fraud. A political scientist wrote a paper describing a variation.
[http://www-personal.umich.edu/~wmebane/pm06.pdf](http://www-personal.umich.edu/~wmebane/pm06.pdf)
He used some data from the 2006 election in Mexico just to illustrate the
proposed technique.

The paper has been used extensively by critics of the election as proof of
fraud. But they haven't been able to prove fraud any other way. The paper was
never meant to be used as strong proof of anything; it was mostly
exploratory.

~~~
panarky
_Benford's law does not apply to unary systems such as tally marks._

Does that mean Benford doesn't work for counting things like votes or people?

This site tests Benford's Law on 30 different public datasets.

[http://testingbenfordslaw.com/mexico-population-by-county](http://testingbenfordslaw.com/mexico-population-by-county)

Looks to me like Benford works just fine for tally-type data like votes and
populations.

~~~
corpMaverick
Per your example: the numbers fit the population by municipality, probably
because the counts span several orders of magnitude. Some have millions (e.g.
Delegaciones in Mexico City); some have hundreds (e.g. Oaxaca). Voting
places, however, are roughly the same size.

------
tmaic
Data Genetics did a great job doing a layman's write up on Benford's law,
which is still one of the most fun math things to tell my family.

Here it is:

[http://www.datagenetics.com/blog/march52012/index.html](http://www.datagenetics.com/blog/march52012/index.html)

------
magoghm
Here is an interesting analysis of Benford's Law from a digital signal
processing perspective:
[http://www.dspguide.com/ch34.htm](http://www.dspguide.com/ch34.htm)

~~~
ClintEhrlich
That was phenomenal. Thank you. I'd encourage anyone else interested in
Benford's Law to skip straight to this book chapter, because it seems that
most of the other material online overlooks the heart of the matter: i.e.,
that the law applies to distributions that are wide compared with unit
distance along the logarithmic scale, but not to distributions that are
narrow.
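
To make the wide-versus-narrow point concrete, here is a minimal sketch. It assumes log-normal data (my choice, not something taken from the chapter) whose spread is given in decades, i.e. units of log10. The names `first_digit_freqs` and `max_deviation`, and the centre of 0.5 decades, are hypothetical.

```python
import math
import random

# Benford's expected frequency for each leading digit 1..9.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit_freqs(sigma_decades, n=50_000, seed=1):
    """Leading-digit frequencies of x = 10**g, with g ~ Normal(0.5, sigma_decades)."""
    rng = random.Random(seed)
    counts = [0] * 10
    for _ in range(n):
        log_x = rng.gauss(0.5, sigma_decades)
        mantissa = log_x - math.floor(log_x)  # fractional part of log10(x)
        digit = int(10 ** mantissa)           # leading digit, normally 1..9
        counts[min(digit, 9)] += 1            # guard against float rounding to 10
    return {d: counts[d] / n for d in range(1, 10)}


def max_deviation(freqs):
    """Largest absolute gap between observed and Benford frequencies."""
    return max(abs(freqs[d] - BENFORD[d]) for d in range(1, 10))


if __name__ == "__main__":
    for sigma in (0.1, 0.5, 1.0, 3.0):
        print(sigma, round(max_deviation(first_digit_freqs(sigma)), 4))
```

Comparing the deviation for a spread of 0.1 decades with the one for 3 decades shows the effect: the narrow case piles up on a few digits, while the wide case lands much closer to the Benford frequencies.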

------
thinkr42
Used this to detect tax fraud for a state government and uncover a breach (by
poor suppression techniques) of Medicaid data. This little law is
extraordinarily useful!

------
jumpmanjr
I applied Benford’s law against hard drive bad block addresses across 20,000
enterprise storage arrays that called home. In theory, drive bad block LBAs
should map perfectly to Benford’s distribution. In our system, there were a
number of anomalies. Digging in further, I discovered that the engineers
periodically had the drives seek to a certain location, and write a status
block. This happened frequently enough that it interrupted the drive’s internal
“swirl” algorithm that was developed to keep the head from carving a “canyon”
into the medium. At a microscopic level, our drives looked like the Grand
Canyon.

~~~
romwell
This is fascinating!

The only time I've heard people talking about using Benford's law to detect
anomalies was in the context of election fraud. This is much more exciting and
practical.

~~~
jumpmanjr
Thanks! Unfortunately the RAID controller needed those blocks during boot
time, or it couldn’t recover properly. I may have convinced the engineers to
turn on “read after write” for those blocks.

After explaining this discovery to my manager, I also explained how Benford’s
law could be used to detect fraud in his corporate travel expenses. He seemed
more interested in that application.....

------
briandoll
How Benford's Law was discovered is fascinating - one of my favorite Radiolab
episodes:
[http://www.radiolab.org/story/91699-from-benford-to-erdos/](http://www.radiolab.org/story/91699-from-benford-to-erdos/)

~~~
acqq
Can somebody summarize for those of us who can't listen?

~~~
jloughry
This was in the days before electronic calculators, when even mechanical
adding machines were expensive and people used tables of logarithms (and trig
functions) to calculate. [Think of Manhattan Project days.]

Benford was using a book of such tables (even random numbers came in books, in
those days) and noticed that some pages of the book were much more dog-eared
than others. That led him to wonder why those particular pages were being used
more than others. He discovered that it correlated with the first digit of the
numbers. Pages starting numbers with low digits were used more often than
pages starting with higher digits.

------
donovanbiela
Interesting note from the page: "Some well-known infinite integer sequences
provably satisfy Benford's Law exactly including Fibonacci numbers,
factorials, and the powers of almost any number."

~~~
tgb
What are the numbers whose powers don't satisfy the law? 0 of course and any
power of 10 will have only leading 1's. Any product of just 2's and 5's,
probably?

~~~
madcaptenor
Powers of two satisfy Benford's law. In order for powers of an integer k to
not satisfy Benford's law, k would have to be a rational power of 10; in fact
it would have to be an integer power of 10 because if 10^(m/n) is rational
then it's an integer.

But a quick calculation makes it look like powers of two don't satisfy
Benford's law. For example, the first power of 2 which begins with 7 is 2^46,
and the first power of 2 which begins with 9 is 2^53. This takes so long,
roughly speaking, because 2^10 = 1024 is so close to a power of ten.
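
Both observations are easy to check with exact integer arithmetic. This is a small sketch with hypothetical helper names; it finds the first exponent whose power starts with a given digit, and it tabulates leading-digit frequencies over many powers for comparison with log10(1 + 1/d).

```python
import math
from collections import Counter


def first_digit(n):
    """Leading decimal digit of a positive integer."""
    return int(str(n)[0])


def first_power_with_digit(base, digit, limit=1000):
    """Smallest exponent e < limit such that base**e starts with `digit`, else None."""
    for e in range(limit):
        if first_digit(base ** e) == digit:
            return e
    return None


def leading_digit_freqs(base, n_powers=2000):
    """Leading-digit frequencies over base**0 .. base**(n_powers - 1)."""
    counts = Counter(first_digit(base ** e) for e in range(n_powers))
    return {d: counts[d] / n_powers for d in range(1, 10)}


if __name__ == "__main__":
    print(first_power_with_digit(2, 7), first_power_with_digit(2, 9))
    freqs = leading_digit_freqs(2)
    for d in range(1, 10):
        print(d, round(freqs[d], 4), round(math.log10(1 + 1 / d), 4))
```

With these definitions, `first_power_with_digit(2, 7)` returns 46 and `first_power_with_digit(2, 9)` returns 53, matching the figures above, while the frequency table over a couple of thousand powers sits close to the Benford values.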

~~~
tgb
That surprised me since I saw the "generalization to digits beyond the first"
in the article and it's clear that powers of 2 or 5 don't have a
Benford-style distribution of their _last_ digits. However, that was a
misunderstanding on my part, since that generalization says something about
the _first_ n digits, and any fixed n will eventually not include the last
digit of the number.

~~~
madcaptenor
In fact powers of any integer don't have a Benford-style distribution of their
last digits, since those digits will repeat periodically. For example powers
of 2 end in 2, 4, 8, 6, 2, 4, 8, 6, ...
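
The cycle can be listed directly with modular exponentiation. This is a tiny sketch and the function name is made up for illustration.

```python
def last_digit_cycle(base, n_terms=8):
    """Last decimal digits of base**1 .. base**n_terms, via pow(base, e, 10)."""
    return [pow(base, e, 10) for e in range(1, n_terms + 1)]


if __name__ == "__main__":
    print(last_digit_cycle(2))  # [2, 4, 8, 6, 2, 4, 8, 6]
```

Because there are only ten possible last digits, the sequence has to repeat eventually for any integer base.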

------
parallel_item
I applied it to a sizable chunk of company expense data when auditing accounts
payable. It highlighted some managers who made consistent purchases from the
same vendor, not the multi-million dollar fraud I was hoping to find.
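
For anyone curious what such a first-digit screen might look like, here is a minimal sketch. It is not the tooling I actually used: the function names are hypothetical, and the chi-square statistic is just one common way to summarize the gap between observed and expected digit counts.

```python
import math
from collections import Counter


def leading_digit(amount):
    """First significant digit of a nonzero number, read off scientific notation."""
    return int(f"{abs(amount):.16e}"[0])


def benford_chi_square(amounts):
    """Chi-square statistic of leading-digit counts against Benford's expected counts."""
    digits = [leading_digit(a) for a in amounts if a != 0]
    n = len(digits)
    if n == 0:
        raise ValueError("need at least one nonzero amount")
    observed = Counter(digits)
    chi2 = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)  # Benford's expected count for digit d
        chi2 += (observed[d] - expected) ** 2 / expected
    return chi2


if __name__ == "__main__":
    sample = [2.0 ** e for e in range(200)]  # stand-in data, not real expenses
    print(round(benford_chi_square(sample), 2))
```

With nine digits there are 8 degrees of freedom, so a statistic above roughly 15.5 would be unusual at the 5% level. That flags a batch worth a closer look, as with the repeat vendor purchases here, rather than demonstrating fraud.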

~~~
ska

        not the multi-million dollar fraud I was hoping to find.
    

Why were you hoping to find this?

~~~
parallel_item
Like Midas, it is the dream of all green auditors until they find it and
realize the downsides.

------
schemathings
Was a contributor to a short-lived website/blog for business folks. My nom de
plume was Benford :)

------
amelius
But prices are often $0.99, or $9.99, or $99.99, etc.

EDIT: Ok, ignore. Article mentioned it.

~~~
pliny
FTA:

>Distributions that would not be expected to obey Benford's Law

> ...

>Where numbers are influenced by human thought: e.g. prices set by
psychological thresholds ($1.99)

~~~
_qhtn
> Where numbers are influenced by human thought

Fraud?

~~~
dredmorbius
Yes.

