
Loved this article. There are so many applications of entropy and statistical physics in computer science, and I find it fascinating that the same general properties are useful in such different contexts.

For example, there's a well-known phenomenon in probability called concentration of measure. One of the most important examples in computer science: if you flip n coins independently, then the number of heads concentrates very tightly around 0.5n. In particular, the probability that you end up more than an epsilon-fraction away from 0.5n heads is at most around e^{-epsilon^2 n}. This is exactly the setting described in the article with the sheep in two pens, and this inequality is used in the design of many important randomized algorithms! A classic example is load balancing, where random assignment sometimes produces 'nice' configurations that look very much like the high-entropy states described in this article (but unfortunately, random assignments often don't behave very well; see e.g. the birthday paradox).
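
To make the coin-flip claim concrete, here's a minimal simulation sketch (the parameters n = 1000, eps = 0.05 are my choices, and I compare against the standard Hoeffding bound 2e^{-2 eps^2 n}):

    # Estimate P(|heads - n/2| > eps*n) for n fair coin flips and compare it
    # with the Hoeffding bound 2*exp(-2*eps^2*n).
    import numpy as np

    rng = np.random.default_rng(0)
    n, eps, trials = 1000, 0.05, 100_000

    heads = rng.binomial(n, 0.5, size=trials)     # heads count in each trial
    far = np.abs(heads - 0.5 * n) > eps * n       # more than an eps-fraction off
    print("empirical tail probability:", far.mean())
    print("Hoeffding bound:           ", 2 * np.exp(-2 * eps**2 * n))

The empirical tail probability comes out well below the bound, which is the "very tight" concentration in action.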

The sum of independent random variables is well known to have concentration properties. An interesting question to me is what other sorts of statistics exhibit these concentration phenomena. An important result in this area is the bounded differences inequality, also known as McDiarmid's inequality (https://web.eecs.umich.edu/~cscott/past_courses/eecs598w14/n...), which roughly states that any function that doesn't "depend too much" on any individual random argument (and a sum of bounded variables satisfies this assumption) exhibits the same concentration phenomenon. There are applications in statistical learning, where we can bound the estimation error using certain learning complexity measures whose analysis relies on the bounded differences inequality. In the context of this article, this means there's a whole class of statistics that will concentrate similarly, and perhaps exhibit irreversibility at a macroscopic level.
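
As a toy illustration of the bounded-differences condition (my example, not the article's or the linked notes'): let f(x_1, ..., x_n) be the number of distinct values among n independent uniform draws from {1, ..., n}. Changing any single draw moves f by at most 1, so McDiarmid's inequality gives P(|f - E[f]| > t) <= 2 e^{-2 t^2 / n}, even though f is not a sum.

    # Simulate the 'number of distinct values' statistic and compare its spread
    # with the McDiarmid bound. Parameter choices are arbitrary, for illustration.
    import numpy as np

    rng = np.random.default_rng(1)
    n, trials, t = 1000, 5_000, 50

    draws = rng.integers(1, n + 1, size=(trials, n))
    distinct = np.array([len(np.unique(row)) for row in draws])

    empirical = np.mean(np.abs(distinct - distinct.mean()) > t)
    bound = 2 * np.exp(-2 * t**2 / n)
    print("empirical P(|f - mean f| > t):", empirical)
    print("McDiarmid bound:              ", bound)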




There's a related anecdote about John von Neumann: he used to joke that he had superpowers and could easily tell truly random and pseudo-random sequences apart. He would ask people to sit in another room and generate a 0/1 sequence via coin flips, recording it, and then to write down a second sequence out of their head, trying to mimic randomness as much as possible. When people finally showed him the two sequences, von Neumann could instantly declare which one was which.

People were amazed.

The trick he used was based on the "burstiness" rule you describe: a long enough random sequence will very likely contain a long homogeneous block, while humans tend to avoid long streaks of the same digit because they don't feel random enough.

So all he did was check at a glance which of the two sequences contained the longer homogeneous block, and identify that one as the sequence generated via coin flips.
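
For fun, here's roughly what that trick looks like in code (my reconstruction; the two example strings are hand-written for illustration, not actual data):

    # Guess which of two 0/1 sequences came from real coin flips: pick the one
    # with the longer homogeneous run.
    from itertools import groupby

    def longest_run(bits):
        # length of the longest block of identical consecutive symbols
        return max(sum(1 for _ in group) for _, group in groupby(bits))

    def guess_coin_flips(seq_a, seq_b):
        # return whichever sequence looks more like genuine coin flips
        return seq_a if longest_run(seq_a) >= longest_run(seq_b) else seq_b

    humanlike = "0101101001011010010110100101101001011010"   # avoids streaks
    coinlike  = "0110001111100101000011010111000110010110"   # contains a run of five 1s
    print(guess_coin_flips(humanlike, coinlike) == coinlike)  # True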


That's a cool anecdote :-) I wouldn't say it uses concentration of measure exactly, but I see how it is related: the anecdote is about asymptotic properties of random sequences, and concentration of measure is about those too. In this case, I think you can show that homogeneous blocks of length log(n) - log log(n) occur at least with constant probability as n gets large. In other words, the length of the longest homogeneous block is basically guaranteed to grow with n. I suppose a human trying to generate a random sequence will prevent homogeneous blocks above some constant length from appearing regardless of the length of the sequence, which would make distinguishing the sequences quite easy for large n!
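
A quick simulation backs this up (the comparison against log2(n) is my addition; the exact constants depend on the base of the log):

    # Average longest homogeneous run in a fair-coin 0/1 sequence of length n,
    # compared with log2(n). The run length keeps growing as n grows.
    import numpy as np

    rng = np.random.default_rng(2)

    def longest_run(bits):
        changes = np.flatnonzero(np.diff(bits)) + 1                 # positions where the bit flips
        blocks = np.diff(np.concatenate(([0], changes, [len(bits)])))
        return blocks.max()                                         # longest constant block

    for n in (100, 1_000, 10_000, 100_000):
        runs = [longest_run(rng.integers(0, 2, size=n)) for _ in range(200)]
        print(f"n={n:>6}: mean longest run {np.mean(runs):5.1f}, log2(n) = {np.log2(n):.1f}")

A human-style generator that never allows more than, say, four identical symbols in a row would stay flat in this table, which is exactly what makes the two sequences easy to tell apart for large n.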

I think there is also a quite strong connection in this anecdote to the information-theoretic notion of entropy, which takes us all the way back to the idea of entropy in the article :-) Information-theoretically, the per-symbol information content of a long random sequence concentrates as well, around the entropy of the underlying random variable (this is the asymptotic equipartition property). The implication is that, with high probability, a sampled long random sequence will have an information content close to one specific value.
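
Here's a small sketch of that concentration for a biased coin (the bias p = 0.3 and the other parameters are my choices): the per-symbol information content -(1/n) log2 P(X_1 ... X_n) of a sampled sequence lands close to the entropy H(p) of a single flip.

    # Asymptotic equipartition in simulation: per-symbol log-probability of long
    # Bernoulli(p) sequences concentrates around the single-flip entropy H(p).
    import numpy as np

    rng = np.random.default_rng(3)
    p, n, trials = 0.3, 10_000, 500

    H = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))    # entropy of one flip
    ones = (rng.random((trials, n)) < p).sum(axis=1)    # count of 1s per sequence
    info = -(ones * np.log2(p) + (n - ones) * np.log2(1 - p)) / n

    print(f"H(p) = {H:.4f}")
    print(f"per-symbol information: mean {info.mean():.4f}, std {info.std():.4f}")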

Human intuition is actually somewhat correct in the anecdote, though! The longer the homogeneous substring, the lower the sequence's entropy and the less likely the sequence is to appear (as a limiting example, the all-0s or all-1s sequence is extremely ordered, but extremely unlikely to appear). I think where the intuition breaks down is that there are sequences with relatively long homogeneous substrings whose entropy is still close to the typical value (in the sense that the run length is, e.g., log(n) - log log(n), as in the calculation above), whereas the human sense of a sequence's entropy is based on local factors (have I generated 'too many' 0s in a row?) and leads us astray.



