Random numbers need not be uniform (wikipedia.org)
25 points by RiderOfGiraffes on June 14, 2010 | 32 comments



Of course not. There are whole books on the subject: http://cg.scs.carleton.ca/~luc/rnbookindex.html


For what it is worth: here is an example of a non-uniform random variate:

  # The sum of two uniform draws has a triangular density, so the middle bins fill up most.
  a = [0,0,0,0,0,0,0,0,0,0]
  1000.times do
    a[(5 * (rand + rand)).floor] += 1
  end
results in (actual numbers may vary randomly):

  irb(main):020:0> a
  => [28, 60, 96, 151, 171, 168, 146, 96, 63, 21]


I am disappointed; I thought you were about to link something stating that "it is not correct to always assume unknown random variables are uniformly distributed" (take 1/X, for instance, to see why this is wrong). Instead, Benford's law...


The phrase "Random numbers need not be uniform" is a really bad title for this. Random numbers are by definition uniformly distributed.

What you are referring to here is Benford's Law. Why not simply say that in the title?


This really is a bad title, but random numbers indeed need not be uniformly distributed - for example, Gaussian-distributed numbers aren't distributed uniformly (by definition :).


I don't understand in what sense numbers drawn from a Gaussian distribution would be considered random.


If the random variable that we measure has a Gaussian distribution, the numbers we read will have a Gaussian distribution, and they will be random.

Another example - rolling a die with 5 sides labeled "6" and one side labeled "1" - the results will be random, but the resulting distribution won't be uniform (a quick sketch follows below).

A uniform distribution is a special case of a random distribution.
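
For what it's worth, a minimal Ruby sketch of that weighted die (the 10,000-roll count and the variable names are just illustrative):

  # Weighted die: five faces show 6, one face shows 1.
  faces = [6, 6, 6, 6, 6, 1]
  counts = Hash.new(0)
  10_000.times { counts[faces[rand(6)]] += 1 }
  p counts   # 6 comes up roughly five times as often as 1: random, but not uniform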


"Uniform distribution is special case of random distribution."

What is a random distribution? I think you meant probability distribution.


Indeed - sorry - my math English isn't very good.


Ah. Fair enough. That's a rather interesting case, but I assume it's not what people think of when you say "random number". I assume that people think of a uniform distribution (i.e. equally likely numbers).


Speak for yourself. Gaussian distributions are ubiquitous (thanks to the Central Limit Theorem), while uniform distributions only exist in the realm of pseudo-random number generators and little else.

I think the point you're trying to make is that a random variable with a uniform distribution is more "random" than a random variable with a Gaussian distribution. The metric you're implying is called entropy, as a uniform distribution is maximally entropic.
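
A quick Ruby sketch of that entropy comparison, using a made-up "peaked" distribution for contrast (the probabilities are only illustrative):

  # Shannon entropy (in bits) of a discrete distribution given as probabilities.
  def entropy(probs)
    probs.reject { |p| p.zero? }.sum { |p| -p * Math.log2(p) }
  end

  uniform = Array.new(10, 0.1)                                  # 10 equally likely outcomes
  peaked  = [0.01, 0.04, 0.1, 0.2, 0.3, 0.2, 0.1, 0.04, 0.01]   # roughly bell-shaped, sums to 1

  puts entropy(uniform)   # ~3.32 bits, the maximum for 10 outcomes (log2 10)
  puts entropy(peaked)    # ~2.62 bits: still random, just less entropic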


You're probably right that people who haven't taken a course in probability are likely to use "random" to mean "uniformly at random". And of course, most people haven't taken a course in probability.


I would assert that people don't even know how to conceptualize the question in these terms ("random number" == "uniform distribution").

After you do study probability (by gambling, by lab work, by quantitative programming, or, least fun, in coursework), you get introduced to the "right questions to ask", like:

"what's the distribution", "are the samples independent", "what are you conditioning upon"

For instance, many people don't know that the distribution of the sum of two dice looks very different from that of a single die. They're both random, but not both uniform. Often people know something's weird, but can't formulate the question because they don't have the language.
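
To make that concrete, here is a small Ruby sketch of one die versus the sum of two (the sample size is arbitrary):

  one_die  = Hash.new(0)
  two_dice = Hash.new(0)

  100_000.times do
    one_die[1 + rand(6)] += 1              # flat over 1..6
    two_dice[2 + rand(6) + rand(6)] += 1   # tent-shaped over 2..12, peaked at 7
  end

  p one_die.sort.to_h    # counts are roughly equal
  p two_dice.sort.to_h   # 7 shows up about six times as often as 2 or 12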


I've actually looked at ways of combining die rolls so that there are x and y variables applied to them and they produce a Gaussian distribution:

Increasing the x would decrease the range of the distribution, while increasing the y would increase the magnitude of the distribution.

I never found something simple enough for what I wanted to apply it to (a game).


I thought the sum of several dice rolls was approximately Gaussian to start with. What more do you need? Just roll, say, 10 dice and then pick a threshold value.


Exactly. 3d6 is already pretty close, and it just gets closer and closer to the Gaussian as you increase the number of dice. And it's bounded, too, so you get minimum and maximum values.
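
A minimal Ruby check of that (plain `rand`, nothing fancy; the roll count is arbitrary):

  # Sum of three dice: bounded at 3 and 18, mean 10.5, sd about 2.96 (sqrt(3 * 35/12)).
  rolls = Array.new(100_000) { 3.times.sum { 1 + rand(6) } }
  mean  = rolls.sum.to_f / rolls.size
  sd    = Math.sqrt(rolls.sum { |r| (r - mean)**2 } / rolls.size)
  puts "min=#{rolls.min} max=#{rolls.max} mean=#{mean.round(2)} sd=#{sd.round(2)}"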


I don't understand in what sense numbers drawn from a uniform distribution would be considered random.


Do you think the result of rolling two dice and adding them is random?


The numbers' only relationship is to the distribution, not to each other or to some other process.


Benford's Law is a special case of Zipf's Law (http://en.wikipedia.org/wiki/Zipfs_law), as they both originate from the same scale-invariant functional relation from stat mech.
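
As a quick sanity check of Benford's law itself, the leading digits of powers of 2 follow it closely; here is a small Ruby sketch (the range 1..1000 is arbitrary):

  # Leading digits of 2**1 .. 2**1000 versus Benford's prediction log10(1 + 1/d).
  counts = Hash.new(0)
  (1..1000).each { |n| counts[(2**n).to_s[0].to_i] += 1 }

  (1..9).each do |d|
    observed  = counts[d] / 1000.0
    predicted = Math.log10(1 + 1.0 / d)
    puts format("%d  observed %.3f  Benford %.3f", d, observed, predicted)
  end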


Sweet... a Wikipedia page that reliably causes an "Aw, snap" in Chrome (5.0.375.70 on Mac OS X). Is it just mine?

(In case anybody is wondering whether I've done the right thing and reported the problem: yes, I have.)


Strange. Loads fine for me. Same version of Chrome on 10.5.8.


Interesting. I'm on 10.6.3. I'm also using the 64-bit kernel, which isn't the default. I wonder if one or both of these variables accounts for the difference.


I get "snap" reliably from Chrome 5.0.375.70. I get crashes from Safari 4. OS X 10.6.3.


People are going to be confused by this, because the definitions of "random" used by mathematicians and by programmers are different.

"Random" to mathematicians means the outcome of a probability experiment. If you flip a two-headed coin or roll a one-sided die, and record the results, the results are "random" - random within the set of possible outcomes, according to the function that generates them. So you dutifully record "heads", "heads", "heads" and "1", "1", "1" for your results. It's random!

"Random" for mathematicians inherently means "predictable". There's a known probability distribution.

The word "random" for programmers means something very different. It inherently means "unpredictable". The probability distribution must be flat across the space of possible outcomes. If we are operating in a base-10 system, the chance of any of the next digits occurring must be exactly 1/10.

My prediction is most of the discussion here will be people talking past each other, using different definitions of the word "random".


Speak for yourself. I am a programmer, and to me 'random' is a characterization of any variable drawn from a probability distribution, and it need not be uniform. In fact, most of the random time series I generate are produced by drawing from non-uniform (e.g. power-law, Gaussian) distributions. And I think calling them "predictable" is a bit of a stretch. They are predictable only insofar as I can say, "This variable will probably not go above or below X."
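
As an illustration of drawing non-uniform values from a uniform generator (not the parent's actual code, just a sketch), here is a minimal Box-Muller transform in Ruby:

  # Box-Muller: turn two independent uniform(0,1) draws into one standard normal draw.
  def gaussian_rand
    u1 = 1.0 - rand            # shift to (0,1] so Math.log never sees 0
    u2 = rand
    Math.sqrt(-2.0 * Math.log(u1)) * Math.cos(2.0 * Math::PI * u2)
  end

  samples = Array.new(10_000) { gaussian_rand }
  mean = samples.sum / samples.size
  puts "sample mean ~ #{mean.round(3)} (should be near 0)"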


I'm a programmer - for me a random string is a string that is shorter than any program that generates it (a random string has Kolmogorov complexity greater than or equal to its length).

I think this is the definition for mathematicians as well.

It's just that "random" is a common shortcut for "a string coming out of a probability experiment" - such a string is very likely to be random according to Kolmogorov complexity, because most strings are random, and the space is very, very big.
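
For what it's worth, the standard counting argument behind "most strings are random": there are 2^n binary strings of length n, but only 2^0 + 2^1 + ... + 2^(n-k-1) = 2^(n-k) - 1 programs shorter than n - k bits, and each program outputs at most one string. So fewer than a 2^(-k) fraction of length-n strings can be compressed by more than k bits - less than one string in 1024 can be shortened by more than 10 bits, for example.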


I think that while most mathematicians are aware of Kolmogorov/Chaitin complexity theory, the first thing that comes to mind when they hear "random" is the theory of probability. In probability, a random variable is, roughly speaking, an experiment that has a set of possible outcomes, each occurring with a certain probability. The probabilities are not necessarily the same.

I think most mathematicians would agree that probability is a much larger and more important branch of mathematics than Kolmogorov-Chaitin complexity, and they usually use "random" in its probabilistic meaning.


I was always confused by the lack of a definition of randomness in the statistical/probabilistic sense, so I somewhat mixed these two branches of math.

Thanks for clearing this up.


I don't agree with that. Yes, most people interpret the word "random" as "uniformly distributed" or "unpredictable", depending on context ("pick a number at random" vs. "some random glitch"), but statisticians and mathematicians interpret it as "nondeterministic".

That's not at all the same as "predictable", but the goal is usually to be able to characterize the phenomenon statistically such that certain properties, such as general location on the number line, can be identified. If such properties cannot be identified, that doesn't make the phenomenon any less random - just less understood and by extension less predictable.


This phenomenon can be witnessed below, but, hey, people cleared up the confusion! Yay for civil discussions about technical topics.

Although I do disagree with the words you used to characterize the differences. While most people use "random" to mean "unpredictable," as you demonstrated, a uniform distribution is just as "predictable" as any other kind of distribution.

I think the real difference is that most people only associate "random" with uniformly random distributions. Since you likely need some formal probability education to be aware of the others, this isn't surprising.


>and "1", "1", "1" for your results. It's random!

That's the problem with random, you can never be sure: http://www.dilbert.com/dyn/str_strip/000000000/00000000/0000...



