
The Art of Computer Programming: Random Numbers - emanuelez
http://www.informit.com/articles/article.aspx?p=2221790
======
antics
When you step back to think about it, a _huge_ number of CS theory papers rely
on this hand-wavy definition of "randomness" to prove something. Oddly,
though, if you actually press a CS theory person for a mathematical definition
of this concept, they virtually always draw a blank.

As a student I remember thinking that this was _incredibly_ alarming. Without
a good understanding of this concept, how can we be sure any of what we're
saying is even remotely correct?

What Knuth has given us here is the ideal treatment of the subject,
essentially putting the question of what randomness is to rest for good. He
starts with a comprehensive, entertaining history of the landscape (in short:
researchers ping-ponged between definitions of randomness so strict that
nothing qualified as random, and so loose that even pathologically predictable
sequences qualified) before finally proving a theorem that completely solves
the problem.

It is a monumental accomplishment in the field, and it is quite shocking to me
that it's still so obscure. It's one of my favorite results in all of CS,
anywhere.

If you haven't had the chance, I highly recommend the rest of this volume
(Seminumerical Algorithms). It is just stuffed with other surprising, amazing
results.

~~~
sidww2
What? The modern definition of pseudorandomness
([https://en.wikipedia.org/wiki/Pseudorandomness#Pseudorandomn...](https://en.wikipedia.org/wiki/Pseudorandomness#Pseudorandomness_in_computational_complexity))
was figured out in the 80s through the works of Blum, Goldwasser, Micali,
Goldreich, etc. and is not hand-wavy at all. It's pretty rigorous and
reliable.

~~~
antics
You are 100% mistaken if you think this implies we had a good definition of
random sequences before Knuth. In the article you link, they discuss the
uniform distribution, but any distribution (and the modern notion of
probability in general) absolutely depends on a mathematically precise notion
of random sequences.

If you don't believe me, read the chapter. Early probability theorists (e.g.,
von Mises, Kolmogorov) literally started thinking about randomness _in order
to define probability_.

EDIT: And, I don't suppose it's worth pointing out that pseudorandomness is
_not at all_ the same thing as randomness. The fact that you seem to use them
interchangeably is not a good sign IMHO.

EDIT 2: Why the unexplained downvote, HN? :(

~~~
sidww2
I skimmed through the extract presented (didn't have time to go into detail)
but I don't see a formal definition of any kind presented in the extract.
Could you point me to where it is? And if it's not in the extract, then could
you quote it here?

Pseudorandomness is not the same thing as randomness, but most algorithms
today work on pseudorandom numbers, so the concept is important. My impression
was that that's what you were referring to.

PS: FYI, I didn't downvote your comment. Actually upvoted, as your post made
me discover some new math (various notions of randomness by Kolmogorov, von
Mises, Martin-Löf) :)

~~~
antics
So, the excerpt here is chapter 3, section 1. The actual definition happens in
chapter 3, section 5 ("What Is a Random Sequence?"). I have the book at home,
though, so I can't quote it here. Sorry. But the intuition is: if you have an
infinite sequence of random numbers, then the numbers in all infinite
subsequences should be equidistributed. So, for example, if you have a stream
of random 0's and 1's, then if you pick only every other number, the 0's and
1's have to be equiprobable, and if you pick every third number, they still
have to be equiprobable, etc. This is slightly wrong, but it's on the right
track to the actual definition.
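For what it's worth, here's a toy sketch of that subsequence check (my own
illustration in Python, not from the book), using the standard library's PRNG
as the stream:

```python
import random

def subsequence_balance(bits, max_step=5):
    """For each step k, the fraction of 1's among every k-th bit.

    The intuition above: in a random bit sequence, each such subsequence
    should itself be equidistributed, i.e. every fraction should be near 0.5.
    (Knuth's actual definition in 3.5 is more careful than this.)
    """
    return {k: sum(bits[::k]) / len(bits[::k]) for k in range(1, max_step + 1)}

random.seed(42)  # arbitrary seed, just for reproducibility
bits = [random.randint(0, 1) for _ in range(100_000)]
fractions = subsequence_balance(bits)
# All fractions come out close to 0.5 for this stream.
assert all(abs(f - 0.5) < 0.02 for f in fractions.values())
```

Of course, passing this check is nowhere near sufficient: a deterministic
alternating 0101... stream still passes it for every odd step size, which is
exactly the kind of pathology the full definition has to rule out.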

Re: pseudorandomness, the point of it is the following.

1. A lot of algorithms use randomness to make pathologically bad cases
extremely unlikely. For example, choosing a random pivot in quicksort makes
the worst case _very_ unlikely.

2. But in a lot of cases, this leads to huge amounts of space consumption.
For example, most frequency moment estimations involve a matrix of random
numbers. So if you're getting those numbers from a "truly random" source, then
you have to store the entire matrix, which can be huge.

3. So, a better solution is to use a pseudorandom number generator! That way
you can store a seed of s bits, and do something clever, like
deterministically re-generate the matrix as you need it, rather than storing
it outright.

Notice though, that this is not independent of the notion of randomness! In
fact they are quite intimately tied together.
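To make point 3 concrete, a minimal sketch (my own hypothetical example, not
from any particular paper): store only a seed and regenerate matrix entries on
demand.

```python
import random

def random_matrix_entry(seed, i, j, n_cols):
    """Deterministically regenerate entry (i, j) of a +/-1 "random" matrix.

    Instead of storing the whole matrix, we keep only `seed` and re-derive
    each entry on demand from a position-dependent sub-seed. Toy sketch only:
    real streaming algorithms use hash families with provable independence
    guarantees, not ad-hoc reseeding like this.
    """
    rng = random.Random(seed * 1_000_003 + i * n_cols + j)
    return rng.choice([-1, 1])

# Same (seed, i, j) always yields the same entry -- O(1) storage, on demand.
x = random_matrix_entry(7, 3, 4, n_cols=10)
assert x == random_matrix_entry(7, 3, 4, n_cols=10)
assert x in (-1, 1)
```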

~~~
sidww2
Your definition relies on the notion of probability though. So I'm not sure
why you seemingly view Knuth's work as more fundamental than Kolmogorov's,
etc.

~~~
antics
Because the trick Knuth pulls is to express this intuition without appealing
to the definition of probability. It's quite clever.

~~~
tel
What's the sketch of the trick? I can define randomness by appealing to some
of the same basic theory used to develop probability, but it's not really
independent despite looking that way from the outside. Does Knuth do this
uniquely?

------
thegeomaster
This reminds me of a (somewhat) funny situation that happened to me one time.
My parents bought some Lotto tickets and wanted to play, so they filled most
of the combinations themselves and left two of them for me. I was annoyed by
having to do that, so I took the tickets and asked my dad, "So, you know that
every combination is equally likely to win?", and he casually replied
something like, "Yeah, sure, I'm not dumb".

So, just to get it done with, I picked 1, 2, 3, 4, 5, 6 and 7 on one combo and
8, 9, 10, 11, 12, 13 and 14 on the other. I gave them to him, at which point
he threw his hands in the air and angrily said, "What the hell? You just
wasted two combinations, these numbers are never going to get drawn!"

It handily illustrates how hard it is to grasp the concept of true randomness
and probability, and even if you do get it, sometimes you'll be caught off
guard and your psychological biases will kick in.

~~~
abecedarius
I ran through the same script with my dad once, except he answered "Yeah, but
you'll have to split the pot with more smartasses like you."

~~~
bronson
"You know, the most amazing thing happened to me tonight. I was coming here,
on the way to the lecture, and I came in through the parking lot. And you
won't believe what happened. I saw a car with the license plate ARW 357. Can
you imagine? Of all the millions of license plates in the state, what was the
chance that I would see that particular one tonight? Amazing!" -- Feynman

------
nwhitehead
Once you've got pseudorandom bits working, the next step is generating
variates from different distributions. Luc Devroye has a nice book on this
freely available online, "Non-Uniform Random Variate Generation".
[http://luc.devroye.org/rnbookindex.html](http://luc.devroye.org/rnbookindex.html)
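As a tiny taste of the subject: inverse transform sampling, one of the basic
techniques the book covers (illustration mine, using Python's standard
library):

```python
import math
import random

def exponential_variate(rate, rng=random):
    """Sample Exponential(rate) by inverse transform sampling: if U is
    uniform on [0, 1), then -ln(1 - U) / rate has the exponential CDF
    1 - exp(-rate * x)."""
    u = rng.random()
    return -math.log(1.0 - u) / rate

random.seed(0)  # arbitrary seed, just for reproducibility
samples = [exponential_variate(2.0) for _ in range(100_000)]
mean = sum(samples) / len(samples)
# The mean of Exponential(rate) is 1/rate, so this lands near 0.5.
assert abs(mean - 0.5) < 0.01
```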

~~~
tel
This book deserves lots of recommendations. Luc Devroye does generally, as
well.

------
adolgert
Do the Monte Carlo people all know about Barash and Shchur's work [1]? They
made RNGSSELIB and PRAND (using NVIDIA cards), but the main contribution is
that they incorporated a new way to generate multiple streams: not by
re-seeding the generator but by skipping ahead, even with Mersenne twisters.
It is the only simple way to get parallel streams, and it didn't exist just a
few years ago. You still need to be careful in various ways, but this helps a
lot. A lot, a lot.

[1] [http://arxiv.org/pdf/1307.5866.pdf](http://arxiv.org/pdf/1307.5866.pdf)

~~~
pbsd
Skipping ahead is not new; it is a well-known fact that any linearly recurrent
generator, such as WELL, Xorshift, or Mersenne Twister, can skip ahead via
matrix exponentiation.
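For linear generators the idea is simple (toy illustration mine, with a 4-bit
LFSR rather than a real generator): the state update is multiplication by a
fixed matrix over GF(2), so jumping n steps is just raising that matrix to the
n-th power by repeated squaring.

```python
def mat_mul_gf2(A, B):
    """Matrix product over GF(2): entries are bits, arithmetic is mod 2."""
    n = len(A)
    return [[sum(A[i][k] & B[k][j] for k in range(n)) & 1
             for j in range(n)] for i in range(n)]

def mat_pow_gf2(M, e):
    """M**e over GF(2) by repeated squaring -- this is the skip-ahead."""
    n = len(M)
    R = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    while e:
        if e & 1:
            R = mat_mul_gf2(R, M)
        M = mat_mul_gf2(M, M)
        e >>= 1
    return R

def mat_vec_gf2(M, v):
    return [sum(M[i][k] & v[k] for k in range(len(v))) & 1
            for i in range(len(v))]

# Toy 4-bit LFSR: next state is (s2^s3, s0, s1, s2), i.e. s' = M s mod 2.
def step(s):
    return [s[2] ^ s[3], s[0], s[1], s[2]]

M = [[0, 0, 1, 1],
     [1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0]]

s = [1, 0, 1, 1]
t = s
for _ in range(1000):  # step-by-step: 1000 updates
    t = step(t)
# skip-ahead: about log2(1000) matrix squarings instead of 1000 steps
jumped = mat_vec_gf2(mat_pow_gf2(M, 1000), s)
assert t == jumped
```

The same idea applies to Mersenne Twister, just with a 19937-bit state; the
newer techniques mentioned in this thread are about making that huge jump
cheap in practice.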

And it is not the only way to get good independent streams, either; hooking a
cipher up to a counter is arguably a superior way to do it, cf. [1].

[1]
[http://www.thesalmons.org/john/random123/papers/random123sc1...](http://www.thesalmons.org/john/random123/papers/random123sc11.pdf)
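The counter-based idea in [1] can be sketched like this (illustration mine;
SHA-256 as a stand-in keyed function, where Random123 uses much faster
primitives like Threefry and Philox):

```python
import hashlib

def counter_stream_word(key, stream_id, counter):
    """The i-th output of stream `stream_id` is a keyed function of
    (stream_id, counter). There is no state to advance: parallel workers
    just use distinct stream ids, and every output is random-access."""
    msg = f"{key}:{stream_id}:{counter}".encode()
    digest = hashlib.sha256(msg).digest()
    return int.from_bytes(digest[:8], "big")  # take a 64-bit word

# Independent parallel streams: same counters, different stream ids.
a = counter_stream_word(key=1234, stream_id=0, counter=0)
b = counter_stream_word(key=1234, stream_id=1, counter=0)
assert a != b
# Deterministic: regenerating the same word needs no skip-ahead at all.
assert counter_stream_word(1234, 0, 0) == a
```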

~~~
adolgert
I like that paper. Thank you. It has references which criticize the skip-ahead
technique. I need to look into whether I believe that paper, though. There are
lots of ideas about how to do this, and subtle flaws abound. While many
generators are known to support skip-ahead, there is a quite new technique for
MT that makes it much faster (not by Barash and Shchur, but they used it).
Again, thanks.

------
contingencies
I feel compelled to share this quote from the author of _cfengine_ with whom I
have had some emails of late, because it neatly summarizes a more holistic,
physics-inspired approach to the computational world's obsession with
invariance, which could be seen as present in the quest for 'randomness'.

 _The simplest idea of stability is constancy, or invariance. A thing that has
no possibility to change is, by definition, immune to external perturbations.
[...] Invariance is an important concept, but also one that has been shattered
by modern ideas of physics. What was once considered invariant is usually only
apparently invariant on a certain scale. When one looks in more detail, we
find that we may only have invariance of an average._ -- Mark Burgess, _In
Search of Certainty: The Science of Our Information Infrastructure_ (2013)

This accords well with the opening quotation: _Lest men suspect your tale
untrue, Keep probability in view._ -- John Gay, English poet and dramatist and
member of the Scriblerus Club (1727)
[https://en.wikipedia.org/wiki/John_Gay](https://en.wikipedia.org/wiki/John_Gay)

------
mathattack
I'm not sure why, but I struggled with the concept of randomness in my
algorithms class. On the last day of class, the professor signed his book,
"May you one day explain randomness to all of us."

------
msie
This is old material, right? Or is Knuth publishing a new fascicle?

~~~
tjr
This is old material, from Volume 2. What's new is the recently published
electronic editions and, I guess, the publishing of some excerpts in online
article form, like this one.

~~~
krazydad
...and the reference to rap music.

