Random numbers need not be uniform (wikipedia.org)
25 points by RiderOfGiraffes on June 14, 2010 | 32 comments



Of course not. There are whole books on the subject: http://cg.scs.carleton.ca/~luc/rnbookindex.html


For what it is worth: here is an example of a non-uniform random variate:

  # The sum of two uniform draws has a triangular density, so the middle bins fill up most.
  a = [0,0,0,0,0,0,0,0,0,0]
  1000.times do
    a[(5 * (rand + rand)).floor] += 1
  end
results in (actual numbers may vary randomly):

  irb(main):020:0> a
  => [28, 60, 96, 151, 171, 168, 146, 96, 63, 21]


I am disappointed; I thought you were about to link something stating that "it is not correct to always assume unknown random variables are uniformly distributed" (take 1/X, for instance, to see why this is wrong). Instead, Benford's law...


The phrase "Random numbers need not be uniform" is a really bad title for this. Random numbers are by definition uniformly distributed.

What you are referring to here is Benford's Law. Why not simply say that in the title?


This really is a bad title, but random numbers indeed need not be uniformly distributed - for example, Gaussian-distributed numbers aren't distributed uniformly (by definition :).


I don't understand in what sense numbers drawn from a Gaussian distribution would be considered random.


If the random variable that we measure has a Gaussian distribution, the numbers we read will have a Gaussian distribution, and they will be random.

Another example - rolling a die with 5 sides labeled "6" and one side labeled "1" - the results will be random, but the resulting distribution won't be uniform (a quick sketch follows below).

A uniform distribution is a special case of a random distribution.
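
For what it's worth, a minimal Ruby sketch of that weighted die (the 10,000-roll count and the variable names are just illustrative):

  # Weighted die: five faces show 6, one face shows 1.
  faces = [6, 6, 6, 6, 6, 1]
  counts = Hash.new(0)
  10_000.times { counts[faces[rand(6)]] += 1 }
  p counts   # 6 comes up roughly five times as often as 1: random, but not uniform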


"Uniform distribution is special case of random distribution."

What is a random distribution? I think you meant probability distribution.


Indeed - sorry - my math English isn't very good.


Ah. Fair enough. That's a rather interesting case, but I assume it's not what people think of when you say "random number". I assume that people think of a uniform distribution (i.e. equally likely numbers).


Speak for yourself. Gaussian distributions are ubiquitous (thanks to the Central Limit Theorem), while uniform distributions only exist in the realm of pseudo-random number generators and little else.

I think the point you're trying to make is that a random variable with a uniform distribution is more "random" than a random variable with a Gaussian distribution. The metric you're implying is called entropy, as a uniform distribution is maximally entropic.
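
A quick Ruby sketch of that entropy comparison, using a made-up "peaked" distribution for contrast (the probabilities are only illustrative):

  # Shannon entropy (in bits) of a discrete distribution given as probabilities.
  def entropy(probs)
    probs.reject { |p| p.zero? }.sum { |p| -p * Math.log2(p) }
  end

  uniform = Array.new(10, 0.1)                                  # 10 equally likely outcomes
  peaked  = [0.01, 0.04, 0.1, 0.2, 0.3, 0.2, 0.1, 0.04, 0.01]   # roughly bell-shaped, sums to 1

  puts entropy(uniform)   # ~3.32 bits, the maximum for 10 outcomes (log2 10)
  puts entropy(peaked)    # ~2.62 bits: still random, just less entropic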


You're probably right that people who haven't taken a course in probability are likely to use "random" to mean "uniformly at random". And of course, most people haven't taken a course in probability.


I would assert that people don't even know how to conceptualize the question in these terms ("random number" == "uniform distribution").

After you do study probability (by gambling, by lab work, by quantitative programming, or, least fun, in coursework), you get introduced to the "right questions to ask", like:

"what's the distribution", "are the samples independent", "what are you conditioning upon"

For instance, many people don't know that the distribution of the sum of two dice looks very different from that of a single die. They're both random, but not both uniform. Often people know something's weird, but can't formulate the question because they don't have the language.
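
To make that concrete, here is a small Ruby sketch of one die versus the sum of two (the sample size is arbitrary):

  one_die  = Hash.new(0)
  two_dice = Hash.new(0)

  100_000.times do
    one_die[1 + rand(6)] += 1              # flat over 1..6
    two_dice[2 + rand(6) + rand(6)] += 1   # tent-shaped over 2..12, peaked at 7
  end

  p one_die.sort.to_h    # counts are roughly equal
  p two_dice.sort.to_h   # 7 shows up about six times as often as 2 or 12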


I've actually looked at ways of combining die rolls so that there are x and y variables applied to them and they produce a Gaussian distribution:

Increasing the x would decrease the range of the distribution, while increasing the y would increase the magnitude of the distribution.

I never found something simple enough for what I wanted to apply it to (a game).


I thought the sum of several dice rolls was approximately Gaussian to start with. What more do you need? Just roll, say, 10 dice and then pick a threshold value.


Exactly. 3d6 is already pretty close, and it just gets closer and closer to the Gaussian as you increase the number of dice. And it's bounded, too, so you get minimum and maximum values.
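
A minimal Ruby check of that (plain `rand`, nothing fancy; the roll count is arbitrary):

  # Sum of three dice: bounded at 3 and 18, mean 10.5, sd about 2.96 (sqrt(3 * 35/12)).
  rolls = Array.new(100_000) { 3.times.sum { 1 + rand(6) } }
  mean  = rolls.sum.to_f / rolls.size
  sd    = Math.sqrt(rolls.sum { |r| (r - mean)**2 } / rolls.size)
  puts "min=#{rolls.min} max=#{rolls.max} mean=#{mean.round(2)} sd=#{sd.round(2)}"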


I don't understand in what sense numbers drawn from a uniform distribution would be considered random.


Do you think the result of rolling two dice and adding them is random?


The numbers' only relationship is to the distribution, not to each other or to some other process.


Benford's Law is a special case of Zipf's Law (http://en.wikipedia.org/wiki/Zipfs_law), as they both originate from the same scale-invariant functional relation from stat mech.
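
As a quick sanity check of Benford's law itself, the leading digits of powers of 2 follow it closely; here is a small Ruby sketch (the range 1..1000 is arbitrary):

  # Leading digits of 2**1 .. 2**1000 versus Benford's prediction log10(1 + 1/d).
  counts = Hash.new(0)
  (1..1000).each { |n| counts[(2**n).to_s[0].to_i] += 1 }

  (1..9).each do |d|
    observed  = counts[d] / 1000.0
    predicted = Math.log10(1 + 1.0 / d)
    puts format("%d  observed %.3f  Benford %.3f", d, observed, predicted)
  end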


Sweet... a Wikipedia page that reliably causes an "Aw, snap" in Chrome (5.0.375.70 on Mac OS X). Is it just mine?

(In case anybody is wondering whether I've done the right thing and reported the problem: yes, I have.)


Strange. Loads fine for me. Same version of Chrome on 10.5.8.


Interesting. I'm on 10.6.3. I'm also using the 64-bit kernel, which isn't the default. I wonder if one or both of these variables accounts for the difference.


I get "snap" reliably from Chrome 5.0.375.70. I get crashes from Safari 4. OS X 10.6.3.


People are going to be confused by this, because the definitions of "random" used by mathematicians and by programmers are different.

"Random" to mathematicians means the outcome of a probability experiment. If you flip a two-headed coin or roll a one-sided die, and record the results, the results are "random" - random within the set of possible outcomes, according to the function that generates them. So you dutifully record "heads", "heads", "heads" and "1", "1", "1" for your results. It's random!

"Random" for mathematicians inherently means "predictable". There's a known probability distribution.

The word "random" for programmers means something very different. It inherently means "unpredictable". The probability distribution must be flat across the space of possible outcomes. If we are operating in a base-10 system, the chance of any of the next digits occurring must be exactly 1/10.

My prediction is most of the discussion here will be people talking past each other, using different definitions of the word "random".


Speak for yourself. I am a programmer, and to me 'random' is a characterization of any variable drawn from a probability distribution, and it need not be uniform. In fact, most of the random time series I generate are produced by drawing from non-uniform (e.g. power-law, Gaussian) distributions. And I think calling them "predictable" is a bit of a stretch. They are predictable only insofar as I can say, "This variable will probably not go above or below X."
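
As an illustration of drawing non-uniform values from a uniform generator (not the parent's actual code, just a sketch), here is a minimal Box-Muller transform in Ruby:

  # Box-Muller: turn two independent uniform(0,1) draws into one standard normal draw.
  def gaussian_rand
    u1 = 1.0 - rand            # shift to (0,1] so Math.log never sees 0
    u2 = rand
    Math.sqrt(-2.0 * Math.log(u1)) * Math.cos(2.0 * Math::PI * u2)
  end

  samples = Array.new(10_000) { gaussian_rand }
  mean = samples.sum / samples.size
  puts "sample mean ~ #{mean.round(3)} (should be near 0)"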


I'm a programmer - for me a random string is a string that is shorter than any program that generates it (a random string has Kolmogorov complexity greater than or equal to its length).

I think this is the definition for mathematicians as well.

It's just that "random" is a common shortcut for "a string coming out of a probability experiment" - such a string is very likely to be random according to Kolmogorov complexity, because most strings are random, and the space is very, very big.
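
For what it's worth, the standard counting argument behind "most strings are random": there are 2^n binary strings of length n, but only 2^0 + 2^1 + ... + 2^(n-k-1) = 2^(n-k) - 1 programs shorter than n - k bits, and each program outputs at most one string. So fewer than a 2^(-k) fraction of length-n strings can be compressed by more than k bits - less than one string in 1024 can be shortened by more than 10 bits, for example.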


I think that while most mathematicians are aware of Kolmogorov/Chaitin complexity theory, the first thing that comes to mind when they hear "random" is the theory of probability. In probability, a random variable is, roughly speaking, an experiment that has a set of possible outcomes, each occurring with a certain probability. The probabilities are not necessarily the same.

I think most mathematicians would agree that probability is a much larger and more important branch of mathematics than Kolmogorov-Chaitin complexity, and they usually use "random" in its probabilistic meaning.


I was always confused by the lack of a definition of randomness in the statistical/probabilistic sense, so I somewhat mixed these two branches of math.

Thanks for clearing this up.


I don't agree with that. Yes, most people interpret the word "random" as "uniformly distributed" or "unpredictable", depending on context ("pick a number at random" vs. "some random glitch"), but statisticians and mathematicians interpret it as "nondeterministic".

That's not at all the same as "predictable", but the goal is usually to be able to characterize the phenomenon statistically such that certain properties, such as general location on the number line, can be identified. If such properties cannot be identified, that doesn't make the phenomenon any less random - just less understood and by extension less predictable.


This phenomenon can be witnessed below, but, hey, people cleared up the confusion! Yay for civil discussions about technical topics.

Although I do disagree with the words you used to characterize the differences. While most people use "random" to mean "unpredictable," as you demonstrated, a uniform distribution is just as "predictable" as any other kind of distribution.

I think the real difference is that most people only associate "random" with uniformly random distributions. Since you likely need some formal probability education to be aware of the others, this isn't surprising.


>and "1", "1", "1" for your results. It's random!

That's the problem with random, you can never be sure: http://www.dilbert.com/dyn/str_strip/000000000/00000000/0000...



