
What does randomness look like? - cyrusradfar
http://www.empiricalzeal.com/2012/12/21/what-does-randomness-look-like/
======
henrikschroder
Back in the day I was in a raiding guild in World of Warcraft. And back then,
each boss monster in a raid dungeon had a loot table, and when you killed it,
your raid got a few pieces of random loot.

And I don't know how many times I, and the other guy in the guild who
understood statistics, had to explain to the others that random doesn't mean
uniform. If a certain piece of armour has a 1/X chance to drop from a boss,
what people _think_ should happen is that if they kill that boss X times, they
should see it drop once.
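
To put a number on that intuition, here is a minimal sketch (the function name is made up for illustration, and it assumes every kill is an independent roll). With a 1/X drop chance, the chance of seeing the item at least once in X kills is 1 - (1 - 1/X)^X, which hovers around 63% instead of reaching 100%.

```python
def chance_of_at_least_one_drop(x: int, kills: int) -> float:
    """Probability of at least one drop in `kills` independent kills,
    each with a drop chance of 1/x."""
    return 1.0 - (1.0 - 1.0 / x) ** kills


if __name__ == "__main__":
    for x in (5, 10, 20, 100):
        p = chance_of_at_least_one_drop(x, kills=x)
        print(f"1/{x} chance, {x} kills: {p:.1%} chance of at least one drop")
```

So roughly a third of the guilds that kill the boss X times would still be waiting for their first drop.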

But the reality was of course that loot was very non-uniform. Some pieces we
saw lots of times, and other pieces very rarely, despite them having the same
drop chance. And the players who wanted those pieces that happened to be rare
for our guild, got very, very angry.

We saw the same things on the official message boards: players were _furious_
after having spent a year killing the same raid boss once a week, and _never_
seeing a certain piece drop for them. But simple math shows that with millions
of players and tens of thousands of raiding guilds, some of those will see
very streaky results.

These days in World of Warcraft, boss monsters drop tokens instead, and when
you have X tokens, you can exchange that for a piece of armour, or a weapon,
guaranteed. And no one complains about the random loot anymore.

~~~
thurn
This is actually an important lesson for game designers. Real randomness is
very frustrating for players! Games should be designed to _not_ be random. For
example, there's an expansion to _The Settlers of Catan_ that lets you use
cards instead of dice to ensure a nice smooth distribution of resources... I
think it's a lot more fun!

~~~
henrikschroder
Oh yes. Our intuition is that random is fair and uniform, but it's absolutely
not.

I was bitten by another random quirk in World of Warcraft. There was a
long-running achievement called "What a Long, Strange Trip It's Been", which
took at least a year in real-time to complete, and you had to actively play
during each of the ten or so in-game festivals and holidays, and do some
quests and tasks during each. If you missed a festival, you had to wait
another year to do it again, so if you were aiming for the achievement, you
really wanted to do it in one go.

During their version of Valentine's, each player got a bag of heart-shaped
candy, and the task you had to complete was to pull out at least one each of
the eight different heart candies. You could only pull a piece of candy once
every hour or so, and each pull had a 1/8 chance of giving you any particular
piece. The holiday was time-limited, so you had about two weeks to complete
it.

Sounds easy and fair, right? 1/8 chance, get all eight pieces, two whole
weeks, easy! Except that there was one piece that I just never got. The piece
that said "I LOVE YOU!". And as the time went by, I got more and more frantic,
logged in more often so as not to miss any opportunity to pull a piece of
candy, but no luck.

So, I did a quick bit of math. You can pull one piece every hour for two
weeks, but sleeping, not playing, missing days etc. meant that I effectively
pulled ~100 pieces. The chance of missing a certain piece on one pull is 7/8,
so missing it 100 times in a row is (7/8)^100, which is about 1.6 in a
million.

With ten million players all doing the same thing, there's going to be a
number of them that will hit that "one in a million" chance, which means that
whoever designed that part of the achievement didn't do their homework, didn't
do the math. Because intuition tells you that 1/8 chance is plenty! Hundreds
of tries, of course everyone will get all pieces! Except proper math tells you
otherwise, and I was one of the "lucky" outliers.
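
The arithmetic above can be sketched in a few lines. The 100 pulls, eight candy types and ten million players come from the comment; treating every player as making exactly 100 independent pulls is a simplifying assumption, and the function names are made up for illustration. The second function uses inclusion-exclusion to get the chance of missing any one of the eight pieces, not just a specific one.

```python
from math import comb


def p_miss_specific(pulls: int, kinds: int = 8) -> float:
    """Chance that one particular candy never shows up in `pulls` pulls."""
    return ((kinds - 1) / kinds) ** pulls


def p_miss_any(pulls: int, kinds: int = 8) -> float:
    """Chance that at least one of the `kinds` candies never shows up
    (inclusion-exclusion over the set of missing candies)."""
    return sum(
        (-1) ** (j + 1) * comb(kinds, j) * ((kinds - j) / kinds) ** pulls
        for j in range(1, kinds + 1)
    )


if __name__ == "__main__":
    pulls, players = 100, 10_000_000
    specific = p_miss_specific(pulls)
    any_piece = p_miss_any(pulls)
    print(f"miss one given piece: {specific:.3e} "
          f"(~{specific * players:.0f} of {players:,} players)")
    print(f"miss at least one piece: {any_piece:.3e} "
          f"(~{any_piece * players:.0f} of {players:,} players)")
```

Under those assumptions the number of players stuck on some piece is about eight times the number stuck on "I LOVE YOU!" specifically, because any of the eight candies can be the missing one.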

(Later, they apologized and retroactively removed that part of the
achievement, so I got my purple nether-drake mount without having to wait a
year extra.)

~~~
baddox
> Oh yes. Our intuition is that random is fair and uniform, but it's
> absolutely not.

But a uniform distribution _is_ random, isn't it? This sort of phrasing is
rampant in this thread, and I'm confused by it. How can a uniform distribution
be considered non-random, or "less" random than another distribution?

~~~
InclinedPlane
Randomness is about having a _uniform probability_ of results, but that does
not translate into a uniform distribution of results, nor can it, because of
the de-correlation between random results. Specifically, the chance of getting
the same result multiple times is non-zero, and actually can be fairly high
with a lot of samples, whereas the chance of duplicate results in a uniform
distribution is zero.

Read the article for more details and examples.

~~~
scott_s
_Randomness is about having a uniform probability of results_

Well, you're still assuming the colloquial definition of "random," which
implies "uniformly random." Of course, we can have a random process that is
_not_ uniformly random.

------
marvin
Regarding the bomb plot at the very bottom. Examining the number of bombs
dropped in each square and comparing it to the Poisson distribution, it
appears that the distribution of the bombs is random.

But looking at the plot on the map, it appears that the higher incidence of
bombs is focused on a specific region. The Poisson distribution doesn't
account for the fact that a lot of the squares with a high incidence of bombs
are adjacent to each other. From my layman's understanding, it appears that
the bombs were in fact targeted on a specific area, but that there was a
random offset from this area regarding where the bombs actually landed.
Because of this, you'd see randomness in the distribution. But the
distribution of bombs wasn't really perfectly random.

Is the author deliberately avoiding this point, or is there something I've
misunderstood?

~~~
feral
If you read the linked post it's clearer what's going on:
<http://madvis.blogspot.ie/2010/09/flying-bombs-on-london-summer-of-1944.html>

The Poisson analysis is only done on a subset of the area that's in the 3D
plots: "Clarke's analysis was focused on the central area of higher density
here and with finer geographic coordinates. Within that area his analysis
found no evidence of clustering that cannot be accounted for by a Poisson
process."
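
For anyone who wants to try this kind of per-square comparison, here is a rough sketch. It is not Clarke's data or method: it scatters points uniformly at random over a grid (the sizes 500 and 20 are arbitrary, and the function names are made up), counts how many squares got 0, 1, 2, ... hits, and prints that next to what a Poisson distribution with the same mean predicts.

```python
import random
from collections import Counter
from math import exp, factorial


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events when the expected count is `mean`."""
    return exp(-mean) * mean ** k / factorial(k)


def hits_per_square(n_points: int, grid: int, rng: random.Random) -> Counter:
    """Drop n_points uniformly on a grid x grid board.

    Returns a Counter mapping "hits in a square" -> "number of such squares".
    """
    hits = Counter()
    for _ in range(n_points):
        cell = (int(rng.random() * grid), int(rng.random() * grid))
        hits[cell] += 1
    histogram = Counter(hits.values())
    histogram[0] = grid * grid - len(hits)  # squares that were never hit
    return histogram


if __name__ == "__main__":
    n_points, grid = 500, 20  # arbitrary illustrative sizes
    squares = grid * grid
    mean = n_points / squares
    observed = hits_per_square(n_points, grid, random.Random(1))
    print("hits  observed  expected (Poisson)")
    for k in range(max(observed) + 1):
        print(f"{k:4d}  {observed[k]:8d}  {squares * poisson_pmf(k, mean):8.1f}")
```

Note that this only looks at the histogram of counts. It says nothing about whether the heavily hit squares are next to each other, which is the concern raised earlier in this subthread.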

~~~
jerf
Yes, it could have been a bit more clear in how they "weren't being aimed".
Obviously the Germans did not just point them straight up and light them off,
hoping for the best. What this really demonstrated was something more like
where the noise floor for the bomb distribution was. They could hit "London"
but they couldn't hit any particular place in it. Very important for the
military to know that information.

------
Xcelerate
Here's a weird thing about randomness that has always bothered me:

In quantum mechanics, if you measure two incompatible observables (like
position and momentum) of a system, and then repeat that experiment many
times, you will get two lists of real numbers. QM says you can predict the
distribution of these numbers, but you cannot predict the individual numbers
themselves. The popular way of thinking nowadays is that "the universe is just
inherently random".

So I posed the question on the Physics Stack Exchange: how do we know these
numbers are truly random, and not the result of some as-yet-undiscovered
pseudorandom number generator that is nonetheless deterministic? Luboš Motl
(Czech string theorist) replied (a bit abrasively I might add) that yes, the
numbers are truly random and plenty of experiments have ruled out the
loopholes. Now, there's no way to determine if a set of numbers are truly
random, so how he made this bold matter-of-fact statement is beyond me.

Einstein initially believed in "hidden variable" theories, undiscovered
properties of quantum systems. Most of these have been ruled out by experiment
(this is what Luboš mentioned), but really, this doesn't apply at all to my
question of whether those numbers are random or not. Superdeterminism seems to
still allow non-randomness, but for some reason, most physicists (notably
excepting Gerard 't Hooft) have discounted superdeterminism as nonsense.

~~~
pfortuny
Yes, that is the thing: Bell's inequality has been verified experimentally and
that leads to a non-hidden variables reality (for the usual meaning of
'reality'):

<http://en.wikipedia.org/wiki/Bell%27s_theorem>

~~~
sp332
Even weirder:
<http://arstechnica.com/science/2012/04/decision-to-entangle-effects-results-of-measurements-taken-beforehand/>

------
praptak
You can try it out yourself in your browser. This scriptlet generates random
and non-random dot distributions side by side (refreshing regenerates, at
least in Firefox 17):

javascript:"<html><body><canvas id=\"tutorial\" width=\"200\"
height=\"200\">foo</canvas><-><canvas id=\"tutorial2\" width=\"200\"
height=\"200\">foo</canvas><script>var canvas =
document.getElementById('tutorial');var ctx =
canvas.getContext('2d');ctx.fillStyle = \"rgb(000,0,0)\";for (var
i=0;i<400;i++) {ctx.fillRect (Math.random() * 200,Math.random() * 200, 2, 2);
};</script><script>var canvas = document.getElementById('tutorial2');var ctx =
canvas.getContext('2d');ctx.fillStyle = \"rgb(000,0,0)\";for (var
i=0;i<20;i++) for(var j=0;j<20;j++) for(k=0;k<1;k++) {ctx.fillRect (i * 10 +
Math.random() * 10, j * 10 + Math.random() * 10, 2, 2);
};</script></body></html>"

(just paste the above into the address bar)

~~~
roryokane
Here’s a web page displaying those dot distributions plus a more readable
version of that code: <http://bl.ocks.org/4358325>.

~~~
praptak
Thanks, this looks much better. My code was like this because I used a pasted
canvas tutorial example + my browser's URL bar as the IDE :)

------
Uhhrrr
"Gravity's Rainbow", by Thomas Pynchon, has an extended section about the
Poisson distribution as applied to falling bombs, maternity wards, etc.:

http://books.google.com/books?id=GGPm4I3BbxAC&pg=PT194&#...

(possibly NSFW)

~~~
krymise
That book is exactly what I thought of when reading the article. Every day I
learn something new that seems to unlock a section of that novel that was
previously opaque.

------
suby
"The one on the left, with the clumps, strands, voids, and filaments is the
array that was plotted at random, like stars."

Are stars really plotted at random?

~~~
davidcuddeback
That's a good question. There are obviously more stars in the direction of the
galactic plane (what we observe as the Milky Way), but the individual stars
that we see in the night sky tend to only be the ones that are really close to
us (relatively speaking). So it's a question of which effect dominates the
other. I think it'd be interesting to see how well the distribution of stars
fits a Poisson distribution.

~~~
morsch
Imagine you were one of the dots on the plane in the dot illustrations. Would
the "constellation" of all the other dots from your position be Poisson
distributed?

------
thaumaturgy
I would love to see more like this on HN.

~~~
cyrusradfar
The author is @aatishb (<https://twitter.com/aatishb>) on Twitter. I've
followed him for some time and he's an enjoyable science/math blogger if you're
interested in that.

~~~
thaumaturgy
Thank you! I've added him to my reading list.

------
tdyo
This is exactly how they determine species distributions in ecology as well -
<http://en.wikipedia.org/wiki/Species_distribution>

Uniform dispersion would suggest some territoriality aspect of the species,
and clumped dispersion would suggest a heterogeneity of resources (or any
other hypothesis that could then be tested).

~~~
aatish
Nice! That's a really interesting and relevant example.

------
dgallagher
Python script I wrote a while back that outputs random images. PIL must be
installed to work. Change |file_name| accordingly. JPEG images can be up to
65535 x 65535 in size, which are the max values for |width| and |height|. It's
not optimized, so keep the resolution small unless you want to wait a while.

It'll output images that look like this:
<http://dave-gallagher.net/pics/666x666.png>

    
    
        from PIL import Image, ImageDraw
        from random import randint
        
        def random_image():
            
            width       = 666
            height      = 666
            file_name   = '/Users/Dave/%dx%d' % (width, height)
            path_png    = file_name + '.png'
            path_jpg    = file_name + '.jpg'
            path_bmp    = file_name + '.bmp'
            path_tif    = file_name + '.tif'
            
            img  = Image.new("RGB", (width, height), "#FFFFFF")
            draw = ImageDraw.Draw(img)
            
            for height_pixel in range(height):
                # Print progress every 100 rows.
                if height_pixel % 100 == 0:
                    print(height_pixel)
                
                for width_pixel in range(width):
                    r  = randint(0, 255)
                    g  = randint(0, 255)
                    b  = randint(0, 255)
                    
                    dr = (randint(0, 255) - r) / 300.0
                    dg = (randint(0, 255) - g) / 300.0
                    db = (randint(0, 255) - b) / 300.0
                    
                    r  = r + dr
                    g  = g + dg
                    b  = b + db
                    
                    draw.line((width_pixel, height_pixel, width_pixel, height_pixel), fill=(int(r), int(g), int(b)))
                
            img.save(fp=path_png, format="PNG")
            img.save(fp=path_jpg, format="JPEG", quality=95, subsampling=0)     # 100 quality is 2x to 3x file size, but you won't see a difference visually.
            img.save(fp=path_bmp, format="BMP")
            img.save(fp=path_tif, format="TIFF")
        
        
        if __name__ == "__main__":
            random_image()

------
y4m4
Such a great article! - something nice at the HN top after a long time.

------
gluegeorge
Were the bombs actually mostly random?

~~~
davidvaughan
No, they were aimed but it was a matter of trial and error. To get the data
about where they'd landed, some had radio transmitters. They sent a message
back to armourers in France (I assume some were deliberately non-explosive).

Meanwhile German agents in England were also observing the success or failure
of the V2s, and in particular where they'd landed.

However, in an effort to deceive the Germans, the British started reporting
the correct time of successful attacks, while mentioning an incorrect
location.

Moreover, a double agent called Eddie Chapman also fed false information back
to the Germans.

As a result, the aimers never really got a grip on ranging accurately. The
bombs started landing to the south east of London.

There's considerably more detail about this in Most Secret War by R V Jones,
who was involved in all sorts of ruses to confuse the enemy. Well worth a
read.

Eddie Chapman (Agent ZigZag) was played by Christopher Plummer in the film
Triple Cross. It seems to be on YouTube.

~~~
andrewcooke
does that book discuss whether / how the misinformation was used to select the
south east? i suspect that was a poorer area of london and i vaguely remember
some kind of scandal about the relative suffering of various parts of london
and class, etc. so i wonder to what extent the south east was chosen (by the
british) as a target?

------
ChrisNorstrom
True Randomness = If you roll a six sided die 60 times, each side will almost
certainly NOT get chosen exactly 10 times. Some sides will get chosen more
than others, and a side might not get chosen at all.
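
A small sketch to check the dice claim, assuming a fair die and independent rolls (the function names are made up for illustration). It computes the exact chance of a perfectly even 10/10/10/10/10/10 split from the multinomial count, the chance that one particular side never comes up, and then prints one simulated batch of 60 rolls.

```python
import random
from collections import Counter
from math import factorial


def p_perfectly_even(rolls: int = 60, sides: int = 6) -> float:
    """Probability that every side shows up exactly rolls/sides times.

    Assumes `rolls` is a multiple of `sides`.
    """
    per_side = rolls // sides
    arrangements = factorial(rolls) // factorial(per_side) ** sides
    return arrangements / sides ** rolls


def p_side_never_rolled(rolls: int = 60, sides: int = 6) -> float:
    """Probability that one particular side never shows up."""
    return ((sides - 1) / sides) ** rolls


if __name__ == "__main__":
    print(f"exactly 10 of each side:   {p_perfectly_even():.6%}")
    print(f"a given side never rolled: {p_side_never_rolled():.6%}")
    rng = random.Random(60)
    print(sorted(Counter(rng.randint(1, 6) for _ in range(60)).items()))
```

Both extremes turn out to be rare: the perfectly even split happens well under one time in ten thousand, and a side being skipped entirely is rarer still. The typical batch is somewhere in between, lopsided but with every side present.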

So for all you entrepreneurs, if you fail, don't fall into a depression. Sure
you worked just as hard as everyone else, maybe even harder. And yeah it's
annoying to see others surpass you even though you've got everything they do.
But that's life, you just got a bad batch of rolls.

------
laureny
Here is an article on how to assess whether your data sample is "random
enough": <http://goo.gl/opl52>

------
pm90
Very interesting! So, the basic point being made is that if you know that a
set of events are random and independent and you know their mean value, then
you can predict their spread? (or aggregation)

edit: Hmm...another question that comes to mind: is the converse true? If the
spread of values of these events does not match the Poisson distribution, then
can we presume them to be nonrandom? Or nonindependent? Or both?

~~~
quantumet
The Poisson distribution is just one random number distribution; there are
several others for situations where the events are correlated, or have other
properties. Half the fun of probability is figuring out which distribution is
the right one to apply to the question at hand. So if your measurements don't
match up to Poisson, it doesn't mean they're not random - they could just be
interdependent.

So yes, for a Poisson process, the spread (standard deviation) is equal to the
square root of the mean; as the number of events gets large, the Poisson
distribution approaches the normal distribution, but the relationship between
the standard deviation and the mean continues to hold.
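
The square-root relationship is easy to check numerically. This is only a sketch (the function name and the cutoff of 200 terms are arbitrary choices): it sums the Poisson probabilities directly to get the mean and standard deviation, without any simulation.

```python
from math import exp, sqrt


def poisson_mean_and_std(mean: float, terms: int = 200):
    """Compute (mean, standard deviation) of a Poisson distribution by
    summing k * P(k) and k^2 * P(k) over the first `terms` values of k."""
    p = exp(-mean)  # P(k = 0)
    first_moment = 0.0
    second_moment = 0.0
    for k in range(terms):
        first_moment += k * p
        second_moment += k * k * p
        p *= mean / (k + 1)  # P(k + 1) from P(k)
    return first_moment, sqrt(second_moment - first_moment ** 2)


if __name__ == "__main__":
    for mean in (1.0, 4.0, 25.0):
        m, s = poisson_mean_and_std(mean)
        print(f"mean {m:6.3f}  std {s:6.3f}  sqrt(mean) {sqrt(mean):6.3f}")
```

The 200-term cutoff is fine for small means like these; a much larger mean would need more terms (and exp(-mean) eventually underflows).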

------
Avshalom
As a side note this is one of those things that several Tetris clones have to
deal with: long runs are probable but can frustrate a player, so a lot of the
time you want to add extra logic to keep it from being actually random.

A uniform "deck" a couple of times larger than the number of pieces is the
usual suggestion; it prevents long runs and makes sure that you see every
piece more regularly.
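
A minimal sketch of that "deck" idea, assuming the standard seven Tetris pieces and a deck holding two copies of each (both are illustrative choices, not taken from any particular clone). The deck is shuffled, dealt out completely, then rebuilt and reshuffled.

```python
import random
from itertools import islice
from typing import Iterator, Sequence

PIECES = "IJLOSTZ"


def bag_randomizer(pieces: Sequence[str], copies: int,
                   rng: random.Random) -> Iterator[str]:
    """Yield pieces forever by dealing from a shuffled deck that holds
    `copies` of each piece, reshuffling a fresh deck when it runs out."""
    while True:
        bag = list(pieces) * copies
        rng.shuffle(bag)
        yield from bag


def longest_run(seq) -> int:
    """Length of the longest streak of identical consecutive items."""
    best = current = 0
    previous = None
    for item in seq:
        current = current + 1 if item == previous else 1
        best = max(best, current)
        previous = item
    return best


if __name__ == "__main__":
    rng = random.Random(3)
    dealt = list(islice(bag_randomizer(PIECES, 2, rng), 7000))
    pure = [rng.choice(PIECES) for _ in range(7000)]
    print("longest run, deck of 2 copies:", longest_run(dealt))
    print("longest run, independent picks:", longest_run(pure))
```

With `copies` of each piece per deck, a streak can be at most 2 * copies long (the end of one deck plus the start of the next), and the gap between two appearances of the same piece is bounded too.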

~~~
wfn
Marginally related --

on the other hand, here's "Evil Tetris": <http://qntm.org/hatetris>

------
jebblue
I was happy that the first example with the dots looked like the second one
was natural. My eyes felt better. With the student example my eyes struggled
so I "reasoned" that the first one had more clumps and was wrong since the
first dot example had more clumps and was wrong.

I guess I'm not a good randomness detective.

------
ashutosh2000
This article is really interesting. It says that randomness doesn't mean
uniformity but true randomness can have clusters. So what exactly is
randomness? How can we define randomness? Is Math.random() really random?
Which is the best random function and how can we find if it is purely random?

------
Zenst
True randomness is a system that, say, generates a 10 digit number and 11111111
has the same odds of appearing as the other permutations. But then that's not
strictly true, as we all know any system that monitors the outcome (a good
example will be fruit machines) to make sure there is an even distribution of
all permutations will in essence remove that level of definition we like to
think of as truly random.

With that it gets hard to truly say what is random or what is an as yet
unknown pattern. This is why many have taken the approach of not having a
single source of random numbers but use many and average out from there. There
again is that random as the chances with such an approach of getting a high or
low value would be biased out.

So with that I postulate one man's random string is another man's non-random
string. So with that I define randomness as an as yet undetermined sequence of
data. So the included Dilbert post is with that extremely clever and totally
true.

~~~
Steko
"hard to truly say what is random or what is an as yet unknown pattern."

No, it's just saying that it's much easier to get pseudorandomness out of a
computer than true randomness.

"This is why many have taken the approach of not having a single source of
random numbers but use many and average out from there. There again is that
random as the chances with such an approach of getting a high or low value
would be biased out."

Technically speaking you wouldn't average; you'd add them together and take
the fractional part (modulo 1). That can negate bias as long as one of the
sources is good and independent of the others, even if you don't know which
one.
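
A quick sketch of the add-and-take-modulo-1 idea, under the assumption that the sources are independent. The "biased" source here is hypothetical (it only ever returns values below 0.1) and Python's own generator stands in for the good one.

```python
import random


def combine_mod_1(*samples: float) -> float:
    """Add samples from several sources and keep only the fractional part."""
    return sum(samples) % 1.0


def bin_counts(values, bins: int = 10) -> list:
    """Count how many values in [0, 1) fall into each of `bins` equal bins."""
    counts = [0] * bins
    for v in values:
        counts[min(int(v * bins), bins - 1)] += 1
    return counts


if __name__ == "__main__":
    rng = random.Random(7)
    # Hypothetical sources: one badly biased (always below 0.1), one good.
    biased = [rng.random() * 0.1 for _ in range(10_000)]
    good = [rng.random() for _ in range(10_000)]
    combined = [combine_mod_1(b, g) for b, g in zip(biased, good)]
    print("biased alone:", bin_counts(biased))
    print("combined:    ", bin_counts(combined))
```

The biased source piles everything into the first bin, while the combined values spread roughly evenly over all ten. Averaging the two instead would squeeze the values toward the middle, which is the objection raised above.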

Of course you can remove bias from a single source by Von Neumann's method
although this might be computationally harder than the above:

[http://en.wikipedia.org/wiki/Fair_coin#Fair_results_from_a_b...](http://en.wikipedia.org/wiki/Fair_coin#Fair_results_from_a_biased_coin)
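
Von Neumann's trick is short enough to sketch here. It assumes the flips are independent and all share the same (unknown) bias; the `biased_flips` helper is a hypothetical stand-in for a real biased source.

```python
import random
from typing import Iterable, Iterator


def von_neumann_extract(bits: Iterable[int]) -> Iterator[int]:
    """Turn independent, identically biased bits into unbiased bits.

    Read bits in non-overlapping pairs: (0, 1) -> 0, (1, 0) -> 1,
    and discard (0, 0) and (1, 1). A trailing unpaired bit is dropped.
    """
    it = iter(bits)
    for first in it:
        second = next(it, None)
        if second is None:
            return
        if first != second:
            yield first


def biased_flips(n: int, p_one: float, rng: random.Random) -> list:
    """Hypothetical biased source: n flips, each 1 with probability p_one."""
    return [1 if rng.random() < p_one else 0 for _ in range(n)]


if __name__ == "__main__":
    flips = biased_flips(100_000, 0.8, random.Random(42))
    fair = list(von_neumann_extract(flips))
    print(f"input:  {len(flips)} bits, {sum(flips) / len(flips):.3f} ones")
    print(f"output: {len(fair)} bits, {sum(fair) / len(fair):.3f} ones")
```

The cost shows up as discarded input more than as computation: with an 80/20 coin only about 16% of the input flips produce an output bit.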

------
baddox
I'm a bit confused. What is this notion of "pure randomness" that this article
and many comments seem to be eluding to? Perhaps I'm just not using the same
definitions of terms, but I thought a uniform distribution is still considered
random.

~~~
lutusp
> Perhaps I'm just not using the same definitions of terms, but I thought a
> uniform distribution is still considered random.

No, a uniform distribution is not evidence of randomness. Consider the digits
0 - 9 repeated endlessly:

0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
...

Uniformly distributed? Yes. Random? No.

> What is this notion of "pure randomness" that this article and many comments
> seem to be eluding [sic] to?

First, s/eluding/alluding/

Second, although the topic is complex, one test of randomness is that an ideal
compression method, one able to find and exploit any repetitive pattern,
cannot compress a random sequence.
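
As a rough illustration of the compression test, here is a sketch using zlib as a stand-in. zlib is nowhere near an "ideal" compressor, so this only shows the direction of the effect: failing to compress is suggestive of randomness, not proof of it.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower = more compressible)."""
    return len(zlib.compress(data, 9)) / len(data)


if __name__ == "__main__":
    periodic = b"0123456789" * 10_000      # the repeating digits from above
    random_bytes = os.urandom(100_000)     # operating system randomness
    print(f"repeating digits: {compression_ratio(periodic):.4f}")
    print(f"os.urandom bytes: {compression_ratio(random_bytes):.4f}")
```

The repeating digits shrink to a tiny fraction of their size, while the random bytes come out about the same size or very slightly larger.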

Third, the term "entropy" as used in information theory is tied to randomness,
as explained here:

<http://en.wikipedia.org/wiki/Entropy_(information_theory)>

A quote: "The entropy rate for a [fair coin] toss is one bit per toss.
However, if the coin is not fair, then the uncertainty, and hence the entropy
rate, is lower."

Based on that, high entropy -> high randomness.

Not to oversimplify a complex topic.
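
The coin example from that quote can be worked out with the standard Shannon entropy formula, H = -(p log2 p + q log2 q). A minimal sketch (the function name is made up for illustration):

```python
from math import log2


def coin_entropy_bits(p_heads: float) -> float:
    """Shannon entropy, in bits per toss, of a coin that lands heads
    with probability p_heads."""
    if p_heads in (0.0, 1.0):
        return 0.0  # a certain outcome carries no information
    q = 1.0 - p_heads
    return -(p_heads * log2(p_heads) + q * log2(q))


if __name__ == "__main__":
    for p in (0.5, 0.6, 0.9, 0.99, 1.0):
        print(f"P(heads) = {p:4.2f}: {coin_entropy_bits(p):.3f} bits per toss")
```

A fair coin gives exactly one bit per toss, and the value drops toward zero as the coin gets more lopsided.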

~~~
endtime
>Uniformly distributed? Yes. Random? No.

I think you're fudging what is supposed to be uniform here. In your example,
the unigrams (i.e. single digits) may be uniform, but the n-grams for n >= 2
are not.

~~~
lutusp
Actually, the n-grams are periodic and repetitive also. They certainly aren't
random.

~~~
endtime
They may be periodic and repetitive, but they aren't uniform. For example,
there are lots of "01"s and no "02"s.
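
This is easy to check by counting. A small sketch over ten repetitions of the digit string (the function name and the number of repetitions are arbitrary):

```python
from collections import Counter


def ngram_counts(text: str, n: int) -> Counter:
    """Count every run of n consecutive characters in `text`."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


if __name__ == "__main__":
    digits = "0123456789" * 10
    unigrams = ngram_counts(digits, 1)
    bigrams = ngram_counts(digits, 2)
    print("unigram counts:", sorted(unigrams.items()))
    print("distinct bigrams seen:", len(bigrams), "of 100 possible")
    print('count of "01":', bigrams["01"], ' count of "02":', bigrams["02"])
```

Every single digit appears equally often, but only 10 of the 100 possible two-digit pairs ever show up.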

------
tome
Strange terminology. In mathematical parlance the arrangement of the glowworms
is still described as "random", but the positions of the worms simply are not
independent of one another.

~~~
bluedanieru
Can you expand on this? I'm curious. It seems to me that with the glowworms,
knowing the position of one worm tells you something about the position of
every other worm in the sample. If it was truly random, wouldn't it tell you
nothing?

~~~
tome
No, in mathematics the eye colour and hair colour of a randomly selected human
being are considered random, even though knowing one tells you something about
what the other is likely to be. Mathematically, "random" means distributed
according to some definite (though perhaps unknown) distribution whereas in
common parlance it means something like "not distributed according to anything
definite at all" (though it's not clear that this way of defining things is
even meaningful).

~~~
jwmerrill
Common parlance "random" seems to mean something more like "uniformly
distributed and independent."

------
michaelochurch
One under-stated but awesome thing about Poisson distributions: fishing has a
Poisson distribution. (Poisson is French for "fish".)

~~~
endtime
I think that would only be true in an infinite lake.

------
saosebastiao
TL;DR <http://www.google.com/finance>

