
How Do Computers Generate Random Numbers? - aryamansharda
https://digitalbunker.dev/2020/09/08/how-do-computers-generate-random-numbers/
======
lsc36
1\. You don't turn a PRNG into a "true" RNG simply by picking seeds from
environmental randomness. The seed is just the initial state; as long as the
output is generated by a deterministic algorithm, it is by definition a PRNG.
At the very best you can make a CSPRNG, but not a "true" RNG.

2\. The dice roll example is _not_ a uniform distribution; I think this is a
common pitfall when generating random integers in a range. `randomNumber % 6`
results in a slight bias towards 0 and 1: since 2^31 % 6 == 2, there are more
numbers in the range [0, 2^31-1] that map to 0 and 1 than to 2...5. To make it
uniform you could, for example, always discard the value if
`randomNumber < 2` and generate another number to use.
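
A minimal sketch of point 2 in Python. The names `residue_counts` and
`roll_die_unbiased` are made up for illustration, and `secrets.randbits` only
stands in for whatever 31-bit generator the article uses. It rejects values
past the last full cycle of 6 instead of values below 2; either choice leaves
a count that is a multiple of 6.

```python
import secrets


def residue_counts(span, sides):
    """How many raw values in [0, span) map to each residue under `% sides`."""
    full_cycles, leftover = divmod(span, sides)
    # The first `leftover` residues each get one extra raw value.
    return [full_cycles + (1 if r < leftover else 0) for r in range(sides)]


def roll_die_unbiased(random_bits=31, sides=6, source=secrets.randbits):
    """Return a value in [0, sides) without modulo bias (rejection sampling)."""
    span = 1 << random_bits          # number of possible raw values
    limit = span - (span % sides)    # largest multiple of `sides` <= span
    while True:
        raw = source(random_bits)
        if raw < limit:              # keep only values inside full cycles
            return raw % sides


if __name__ == "__main__":
    print(residue_counts(2**31, 6))  # residues 0 and 1 get one extra value
    print(roll_die_unbiased() + 1)   # a die face in 1..6
```

With a 31-bit source the loop almost never repeats, since only 2 of the 2^31
raw values are rejected.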

~~~
ficklepickle
#2 reminds me of Benford's Law, which I recently learned about and find truly
fascinating.

[https://en.wikipedia.org/wiki/Benford%27s_law](https://en.wikipedia.org/wiki/Benford%27s_law)

~~~
chriselles
Interesting.

On first pass, Benford’s Law looks a lot like Zipf’s Law.

What differentiates Benford’s Law from Zipf’s Law?

~~~
pmiller2
From
[https://en.wikipedia.org/wiki/Zipf%27s_law](https://en.wikipedia.org/wiki/Zipf%27s_law)
:

> It has been argued that Benford's law is a special bounded case of Zipf's
> law,[22] with the connection between these two laws being explained by their
> both originating from scale invariant functional relations from statistical
> physics and critical phenomena.[24] The ratios of probabilities in Benford's
> law are not constant. The leading digits of data satisfying Zipf's law with
> s = 1 satisfy Benford's law.

~~~
chriselles
Thank you!

I’ll take some time to try and better understand your post.

------
jrnkntl
Cloudflare has a fun solution to this involving lava lamps[0][1]

[0] [https://blog.cloudflare.com/randomness-101-lavarand-in-produ...](https://blog.cloudflare.com/randomness-101-lavarand-in-production/)

[1] [https://blog.cloudflare.com/lavarand-in-production-the-nitty...](https://blog.cloudflare.com/lavarand-in-production-the-nitty-gritty-technical-details/)

------
shadowprofile77
Just out of layman's curiosity, what would be the problem or difficulty of
somehow connecting a more compact type of radio telescope that detects some
level of cosmic background radiation and hooking that up to a computer. Would
this not be a guaranteed way of generating truly random numbers for any need
flawlessly?

I know that serious radio telescopes cost way more than any random person
could afford to pay but I've certainly seen plans for smaller DIY homemade
models.

~~~
Animats
A small radiation source works well, although the data rate is low.[1]

The idea is to count the number of events (beta particles here) per time
interval. Do this twice. If count A > count B, output a 1. If count A < count
B, output a 0. If count A = count B, skip that result. Von Neumann came up
with that trick.

Don't use the low-order bit of the count. That has a bias.

[1] [https://www.fourmilab.ch/hotbits/](https://www.fourmilab.ch/hotbits/)
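
The count-comparison scheme above could be sketched like this in Python. The
decay source is simulated with exponential inter-arrival times, which is an
assumption standing in for a real detector, and the function names are
hypothetical.

```python
import random


def count_events(rate, interval, rng):
    """Count simulated decay events in one interval (Poisson arrivals)."""
    elapsed, count = 0.0, 0
    while True:
        elapsed += rng.expovariate(rate)  # time until the next event
        if elapsed > interval:
            return count
        count += 1


def bits_from_counts(counts):
    """Compare counts pairwise: A > B gives 1, A < B gives 0, ties are skipped."""
    bits = []
    for a, b in zip(counts[0::2], counts[1::2]):
        if a > b:
            bits.append(1)
        elif a < b:
            bits.append(0)
    return bits


if __name__ == "__main__":
    rng = random.Random()  # simulation only; a real setup would read a detector
    counts = [count_events(rate=10.0, interval=1.0, rng=rng) for _ in range(32)]
    print(bits_from_counts(counts))
```

Each output bit costs two counting intervals, and ties cost two more, which
is one reason the data rate stays low.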

~~~
DanBC
The Von Neumann extractor is interesting.

> Von Neumann originally proposed the following technique for getting an
> unbiased result from a biased coin:

> > If independence of successive tosses is assumed, we can reconstruct a
> > 50-50 chance out of even a badly biased coin by tossing twice. If we get
> > heads-heads or tails-tails, we reject the tosses and try again. If we get
> > heads-tails (or tails-heads), we accept the result as heads (or tails).
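
A small Python sketch of the quoted procedure, treating heads as 1 and tails
as 0. The function name and the 80% bias in the usage example are
illustrative assumptions; the tosses are assumed independent, as in the
quote.

```python
import random


def von_neumann_extract(bits):
    """Turn independent biased bits into unbiased ones, two tosses at a time."""
    out = []
    for first, second in zip(bits[0::2], bits[1::2]):
        if first != second:    # heads-tails or tails-heads: accept
            out.append(first)  # the first toss of the pair is the result
        # heads-heads or tails-tails: reject and move to the next pair
    return out


if __name__ == "__main__":
    rng = random.Random()
    biased = [1 if rng.random() < 0.8 else 0 for _ in range(10_000)]
    fair = von_neumann_extract(biased)
    print(len(fair), sum(fair) / len(fair))
```

The price is throughput: with an 80/20 coin only about a third of the pairs
survive.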

------
mywittyname
The chart shows the distribution of "10,000 dice rolls," yet each potential
value, 1-6, has between 6000 and 7000 hits, indicating the true number of
rolls to be between 36,000 and 42,000.

~~~
aryamansharda
You're absolutely right. The attached code was for 40K and I just mislabeled
the graph. I've updated it - thanks for catching that!

------
rurban
The Mersenne-Twister approach should certainly not be studied anymore, even if
some popular old libraries still use it. It fell out of favor long ago, is too
slow, and is not good enough.

Modern PRNGs can be tested with Dieharder, TestU01 or STS and benchmarked.
This article only talks about primitive old LCGs (not any good one) or MT.

~~~
aryamansharda
When you say Mersenne-Twister isn't good enough, what are the other
shortcomings apart from speed? It seems that even modern versions of Python
are continuing to use it...

~~~
Straw
It's slow, large, and statistically worse than modern PRNGs, and jumping ahead
takes longer and needs a more complicated algorithm.

Even a truncated 128-bit LCG has far better properties.

See [https://www.pcg-random.org/index.html](https://www.pcg-random.org/index.html)

The homepage might come across as a little overzealous (for example, ChaCha
quality is listed as good rather than excellent), but it generally makes good
points.

~~~
benibela
However, this page claims PCG is rather bad:
[http://pcg.di.unimi.it/pcg.php](http://pcg.di.unimi.it/pcg.php)

They recommend using their xoshiro PRNG.

~~~
Straw
That author has a history of extreme bias and almost-vindictive personal
attacks on the author of PCG. See the reddit comments:

[https://www.reddit.com/r/programming/comments/8jbkgy/the_wra...](https://www.reddit.com/r/programming/comments/8jbkgy/the_wrapup_on_pcg_generators/)

And the PCG author's response:
[https://www.pcg-random.org/posts/on-vignas-pcg-critique.html](https://www.pcg-random.org/posts/on-vignas-pcg-critique.html)

For example, for one of his arguments, he specifically chose a generator
called pcg32_once_insecure, which the PCG author does not recommend due to its
invertible output function!

Personally, I have read both arguments in detail and I would always use PCG or
even a truncated LCG over xoshiro, which has a large size in comparison,
potentially worse statistical properties, and no clear gain: it is faster in
some benchmarks and slower in others.

~~~
benibela
Well, I had only skimmed the page

But I am using xoshiro in my projects, because I thought xor was simpler than
multiplication.

~~~
Straw
Yeah, xor is simpler than multiplication in terms of hardware complexity.
Luckily, we have the multiplication circuits built in, so we may as well take
advantage of them.

------
mschuetz
My favourite is this one:
[https://preshing.com/20121224/how-to-generate-a-sequence-of-...](https://preshing.com/20121224/how-to-generate-a-sequence-of-unique-random-integers/)

* Creates a sequence of unique integers

* Uses prime numbers P that are congruent to 3 (mod 4).

* A single iteration has noticeable patterns but applying it twice already results in randomness that is sufficiently good for many use cases.

* It is "embarrassingly parallel", a simple mapping of randomValue = randomize(i). You can calculate unique and deterministic random numbers from input i in parallel threads with no sync between threads.

* Since it is a unique mapping of i to r, you can use it to shuffle data sets virtually instantly. Take the index of a value in the original array, and use it to compute the target index in the shuffled array.

* I've used it to shuffle up to 800 million items per second on a GPU, including the time it took to transfer the data from RAM to GPU. So without the IO, you could probably shuffle billions of values per second, probably mostly bound by GPU bandwidth. E.g., 700GB/s and each item is 70 bytes -> could perhaps shuffle 10 billion items per second.
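
A rough Python sketch of the quadratic-residue mapping from the linked
article, as I understand it. The default prime 2^32 - 5 and the `offset`
parameter are assumptions for illustration; any prime congruent to 3 (mod 4)
should work, and the small prime 251 is used below so the whole permutation
can be checked.

```python
DEFAULT_PRIME = 4294967291  # 2**32 - 5, assumed; prime and congruent to 3 (mod 4)


def permute_qpr(x, prime=DEFAULT_PRIME):
    """Map x in [0, prime) to a unique value in [0, prime)."""
    if x >= prime:
        return x  # values outside the range map to themselves
    residue = (x * x) % prime
    # The lower half keeps the residue, the upper half takes its negation.
    return residue if x <= prime // 2 else prime - residue


def randomize(i, prime=DEFAULT_PRIME, offset=0):
    """Apply the mapping twice, with a hypothetical offset in between."""
    return permute_qpr((permute_qpr(i, prime) + offset) % prime, prime)


def shuffle_by_index(data, prime, offset=0):
    """Shuffle a list whose length equals `prime` by computing target indices."""
    assert len(data) == prime
    shuffled = [None] * prime
    for i, value in enumerate(data):
        shuffled[randomize(i, prime, offset)] = value  # each slot is hit once
    return shuffled


if __name__ == "__main__":
    print(shuffle_by_index(list(range(251)), prime=251, offset=17)[:10])
```

Every call to `randomize(i)` depends only on `i`, so the loop in
`shuffle_by_index` could be split across threads without coordination.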

------
cm2187
Is a hardware random number generator that would use environment or electrical
sensors to generate noise expensive or hard to manufacture? I would assume
this should be part of any standard motherboard given the importance of
cryptography. Or does it create an attack vector?

~~~
jfindley
Cryptography doesn't depend on random numbers for its random seed. It depends
on unpredictable numbers. These are not the same thing. As long as no-one can
predict what the next number will be it doesn't matter how "random" they are.
TFA is sufficiently vague it's unclear if the author understands this.

The more I read it, the more confused the article appears to be (e.g. Mersenne
Twister is NOT a good example of a modern or high quality PRNG). For more
about secure random numbers in Linux, I'd suggest reading [0].

0: [https://buttondown.email/cryptography-dispatches/archive/cry...](https://buttondown.email/cryptography-dispatches/archive/cryptography-dispatches-the-linux-csprng-is-now/)

~~~
UncleMeat
I wouldn't be so universal with this statement.

Some cryptosystems really do need uniform randomness (ECDSA) rather than just
negligible probability of choosing values. Other cryptosystems depend on not
reusing values, though the values could be predictable. Sometimes there are
subtle shifts in these needs based on modes (AES/CBC vs AES/GCM is a good
example).

~~~
jfindley
Thanks for the comment, yes I should have phrased that very differently.

What I was trying to say is that the kernel CSPRNG (exact mechanism depends on
version) mixes together a bunch of things that aren't truly random from an
information-theoretical perspective in order to produce uniform random output
from the CSPRNG function - and that it doesn't actually matter that those
sources aren't information-theoretically random. That'll teach me to comment
in haste!
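
For application code, the usual way to get at that kernel CSPRNG from Python
is the standard `secrets` module, which draws from the operating system's
secure source. A minimal sketch follows; the helper names `new_key` and
`roll_die` are made up for illustration.

```python
import secrets


def new_key(n_bytes=32):
    """Return unpredictable bytes from the OS CSPRNG, e.g. for a secret key."""
    return secrets.token_bytes(n_bytes)


def roll_die():
    """Return a die face in 1..6 drawn from the OS CSPRNG."""
    return secrets.randbelow(6) + 1


if __name__ == "__main__":
    print(new_key().hex())
    print(roll_die())
```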

------
ImaCake
For those interested, one can get from a uniform distribution to pretty much
any defined statistical distribution using just the uniform random number
generator and the inverse cumulative distribution function of the desired
random distribution. A useful trick for non-standard distributions with no
function available in your preferred language.

Example from MATLAB:
[https://www.mathworks.com/help/stats/generate-random-numbers...](https://www.mathworks.com/help/stats/generate-random-numbers-using-the-uniform-distribution-inversion-method.html)
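
A minimal Python sketch of the same inversion trick, using the exponential
distribution because its inverse CDF has a closed form: if U is uniform on
[0, 1), then -ln(1 - U) / rate is exponential with that rate. The function
name is hypothetical.

```python
import math
import random


def sample_exponential(rate, rng=random):
    """Draw from Exp(rate) by pushing a uniform draw through the inverse CDF."""
    u = rng.random()                    # uniform on [0, 1)
    return -math.log(1.0 - u) / rate    # 1 - u is in (0, 1], so log is defined


if __name__ == "__main__":
    draws = [sample_exponential(2.0) for _ in range(100_000)]
    print(sum(draws) / len(draws))  # should land near 1 / rate
```

For a distribution without a closed-form inverse, the same idea works with a
numerically inverted or tabulated CDF.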

------
johnatwork
Roll20 has an interesting way of generating random numbers as well -
[https://wiki.roll20.net/QuantumRoll](https://wiki.roll20.net/QuantumRoll)

------
IIAOPSW
My personal favorite RNG is the logistic map. x_{n+1} = 4x_n(1-x_n). There
is no hidden seed beyond the current output. Thus if you have a scientific
calculator that lets you refer to the value on screen then you can rig it into
an RNG. Seeing a simple, non-programmable machine "misbehave" and act random
melts my mind a little.
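
A tiny Python sketch of that iteration. Turning each value into a bit by
thresholding at 0.5 is my own illustrative choice, not part of the map, and
floating-point orbits are not a vetted random source.

```python
def logistic_map(x0, n):
    """Iterate x -> 4x(1 - x) starting from x0 and return the next n values."""
    values = []
    x = x0
    for _ in range(n):
        x = 4.0 * x * (1.0 - x)  # the quadratic term is what makes it non-linear
        values.append(x)
    return values


def logistic_bits(x0, n):
    """Illustrative bit stream: 1 if the value is at least 0.5, else 0."""
    return [1 if x >= 0.5 else 0 for x in logistic_map(x0, n)]


if __name__ == "__main__":
    print(logistic_map(0.2, 5))
    print(logistic_bits(0.2, 32))
```

The raw values are not uniformly distributed on [0, 1]; they pile up near 0
and 1, which is why the sketch emits bits instead of using the values
directly.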

~~~
bregma
That's still just a linear congruential PRNG with state sizeof(unsigned int).
LCG is simple and has well-known flaws.

~~~
IIAOPSW
No, it is not an LCG. When you distribute the terms there is a -4x_n^2.
Therefore it is non-linear.

