
OpenBSD bug in the random() function - muks
https://banu.com/blog/42/openbsd-bug-in-the-random-function/
======
Yrlec
This isn't exactly a bug, but if you run this code in Java:

    
    
      // Create a fresh generator for each consecutive seed and print its first double.
      for (int i = 0; i < 100; i++) {
          Random random = new Random(i);
          System.out.println(random.nextDouble());
      }

The output begins with the following values (at least on JDK 7 and Win 7):

    
    
      0.730967787376657
      0.7308781907032909
      0.7311469360199058
      0.731057369148862
      0.7306094602878371
      0.730519863614471
      0.7307886238322471
      0.7306990420600421
      0.7302511331990172
      0.7301615514268123
    

I know that you're not supposed to recreate the Random instance like that, but
it's still a bit odd that the initial values in each sequence are so similar
to each other.

~~~
Yrlec
Any reason for the down-vote? I honestly want to learn.

~~~
gcp
The Java Random class uses a 48-bit LCG with a 35-bit multiplier. Because of
this, small seed values won't be able to "wrap around" the full range of the
LCG and will cause starting sequences that are anything but random relative to
each other.

Put differently, you're seeing that 35/48 = 0.73.

I'd consider this a bug in Java, but it's a common one. Qt has the same
problem. Could have been avoided by cycling the seed through the LCG once,
instead of using XOR.
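
To make the explanation above concrete, here is a minimal sketch that reimplements the generator using the constants given in the java.util.Random documentation (multiplier 0x5DEECE66D, increment 0xB, 48-bit state). The class and method names are made up for illustration, and cycleScramble is only one possible reading of "cycling the seed through the LCG once", not something the JDK does.

```java
import java.util.Random;

public class SeedScramble {
    static final long MULT = 0x5DEECE66DL;    // multiplier from the java.util.Random docs
    static final long ADD = 0xBL;             // increment from the same docs
    static final long MASK = (1L << 48) - 1;  // state is kept to 48 bits

    // One LCG step: state * MULT + ADD, reduced mod 2^48.
    static long step(long state) {
        return (state * MULT + ADD) & MASK;
    }

    // First nextDouble() produced from a given internal state:
    // 26 high bits from one step, 27 from the next, scaled by 2^-53.
    static double firstDouble(long state) {
        long s1 = step(state);
        long s2 = step(s1);
        return (((s1 >>> 22) << 27) + (s2 >>> 21)) * 0x1.0p-53;
    }

    // Seeding as documented: XOR the seed with the multiplier.
    static long xorScramble(long seed) {
        return (seed ^ MULT) & MASK;
    }

    // Hypothetical alternative (an assumption): run the seed through the LCG once.
    static long cycleScramble(long seed) {
        return step(seed & MASK);
    }

    // Max minus min of the first double over seeds 0..99.
    static double spread(boolean cycle) {
        double min = 1.0, max = 0.0;
        for (long seed = 0; seed < 100; seed++) {
            double d = firstDouble(cycle ? cycleScramble(seed) : xorScramble(seed));
            min = Math.min(min, d);
            max = Math.max(max, d);
        }
        return max - min;
    }

    public static void main(String[] args) {
        // Compare the reimplementation with the real class for a few seeds.
        for (long seed = 0; seed < 3; seed++) {
            System.out.println(firstDouble(xorScramble(seed)) + " vs " + new Random(seed).nextDouble());
        }
        System.out.println("step per unit of scrambled seed: " + (double) MULT / (1L << 48));
        System.out.println("spread with XOR seeding:    " + spread(false));
        System.out.println("spread with cycled seeding: " + spread(true));
    }
}
```

Because seeding XORs the seed with the multiplier, a small seed only perturbs the multiplier's low bits. Each unit of difference in the scrambled seed then moves the first output by about MULT / 2^48, roughly 9e-5, which is in line with the gaps between the values printed earlier in the thread.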

~~~
Yrlec
Interesting, thanks! Any particular reason they limit the multiplier to 35
bits and the output to 48 bits?

Edit: just noticed that Java limits the output to 32 bits, not 48
(<http://en.wikipedia.org/wiki/Linear_congruential_generator>). How does it
create 64-bit values, like long and double?
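
On the 64-bit question, the java.util.Random documentation describes nextLong() as two 32-bit draws glued together and nextDouble() as a 26-bit draw plus a 27-bit draw scaled by 2^-53. The sketch below checks that against the public API only; the class and helper names are hypothetical, and it relies on nextInt() returning the top 32 bits of the state, so shifting it right yields the shorter draws.

```java
import java.util.Random;

public class WideValues {
    // nextLong() as documented: high word shifted up, low word added (sign-extended).
    static long longFromInts(int hi, int lo) {
        return ((long) hi << 32) + lo;
    }

    // nextDouble() as documented: 26 + 27 = 53 bits of mantissa.
    // a >>> 6 keeps the top 26 bits of a 32-bit draw, b >>> 5 the top 27.
    static double doubleFromInts(int a, int b) {
        return (((long) (a >>> 6) << 27) + (b >>> 5)) * 0x1.0p-53;
    }

    public static void main(String[] args) {
        long seed = 42L; // arbitrary seed for the demonstration

        Random direct = new Random(seed);
        Random pieces = new Random(seed);
        System.out.println(direct.nextLong() + " vs "
                + longFromInts(pieces.nextInt(), pieces.nextInt()));

        direct = new Random(seed);
        pieces = new Random(seed);
        System.out.println(direct.nextDouble() + " vs "
                + doubleFromInts(pieces.nextInt(), pieces.nextInt()));
    }
}
```

So a single 64-bit value consumes two steps of the 48-bit generator, and each step contributes at most its top 32 bits.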

~~~
gcp
_Any particular reason they limit the multiplier to 35 bits and the output to
48 bits?_

Good question. There appears to be no good justification for this, but the
generator is guaranteed by the docs. So it's possible the initial
implementation was bad and everybody has been required to follow it ever since.

------
calloc
Or maybe this is why using random() is a terrible idea. Use arc4random()
instead on FreeBSD/OpenBSD/Mac OS X for MUCH better random number generation;
best of all, it is auto-seeded.

Obligatory XKCD: <http://xkcd.com/221/>

~~~
DHowett
The only caveat is that you then have to wrap it in an #ifdef if you want
source portability. The thing random() has over arc4random() is exactly that -
it's part of the standard C library on most platforms.

For discussion: Why is random() not already arc4random() on platforms that
provide the arc4 variant? Is it for speed's sake? Different implementations of
libc functions will seed differently, so it's not a cross-platform seed
stability concern. Is the problem that you can't seed it with a fixed value
and get the same pseudorandom sequence?

~~~
caladri
Yes, because of the need for pseudorandom sequences. In FreeBSD this comes up
every so often, but the reality is that there's a lot of AI and
simulation/modeling code that uses the libc random functions (either rand(3)
or random(3)) and expects reproducible behavior with the same seed both
throughout the life of a program and across multiple executions.

~~~
harshreality
That could easily be accommodated by implementing random_ng() to take an
optional buffer that the PRNG would use to initialize its state. If a buffer
is not passed, use a random or pseudorandom entropy source... whatever's
available on the system. From Ivy Bridge on, Intel CPUs will have the RdRand
instruction, or there's /dev/urandom.

That offers the best of both worlds. If you want repeatability, initialize
random_ng() with a known buffer. If you want reasonable unpredictability, let
the PRNG initialize itself using whatever it wants. (Not to confuse that PRNG
with good entropy randomness that might be accessible from RdRand, or which is
usually obtained by asking the user to move the mouse.)
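
The shape of that interface can be sketched in a few lines. The example is in Java only to stay consistent with the code earlier in the thread; RandomNg and create are hypothetical names, and java.util.Random merely stands in for whatever PRNG such a function would wrap.

```java
import java.nio.ByteBuffer;
import java.security.SecureRandom;
import java.util.Random;

public class RandomNg {
    // Hypothetical factory: pass a buffer of at least 8 bytes for a reproducible
    // sequence, or null to let the generator seed itself from the system source.
    static Random create(byte[] seedBuffer) {
        long seed;
        if (seedBuffer != null) {
            // Only the first 8 bytes are used; shorter buffers throw BufferUnderflowException.
            seed = ByteBuffer.wrap(seedBuffer).getLong();
        } else {
            // No buffer given: draw the seed from the platform's secure generator.
            seed = new SecureRandom().nextLong();
        }
        return new Random(seed);
    }

    public static void main(String[] args) {
        byte[] known = {1, 2, 3, 4, 5, 6, 7, 8};
        // Same buffer, same sequence.
        System.out.println(create(known).nextInt() == create(known).nextInt());
        // No buffer: self-seeded, value differs from run to run.
        System.out.println(create(null).nextInt());
    }
}
```

The seeded path gives the repeatability that simulation code wants, and the unseeded path gives reasonable unpredictability without the caller having to think about where the entropy comes from.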

~~~
caladri
Right, and there are other RNG and PRNG sources and interfaces for precisely
that reason. The question was why random(3) isn't arc4random(3).

------
michaelni
One issue with the OpenBSD "bug" that I think hasn't been mentioned: while
OpenBSD's srandom(0) leading to a 0 sequence sucks, the fix everyone is using
(including up-to-date OpenBSD trunk) causes srandom(X) and srandom(Y) to
produce the same sequence for at least one pair of distinct X and Y. This is
probably less of an issue, but still. For example, Debian Linux with GNU libc
produces the same sequence after srandom(0) and srandom(1), namely 1804289383
846930886 1681692777 1714636915 ...
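
The trade-off can be shown with a toy generator. The sketch below is not the OpenBSD or glibc code; it assumes a plain multiplicative generator with the classic parameters 16807 and 2^31 - 1, where zero is a fixed point, and a hypothetical seedFixed helper that remaps seed 0 to 1.

```java
import java.util.Arrays;

public class ZeroSeed {
    static final long A = 16807L;        // classic "minimal standard" multiplier
    static final long M = 2147483647L;   // 2^31 - 1

    // One step of the multiplicative generator: x -> A * x mod M.
    static long next(long x) {
        return (A * x) % M;
    }

    // Hypothetical fix for the stuck-at-zero state: treat seed 0 as seed 1.
    static long seedFixed(long seed) {
        return seed == 0 ? 1 : seed;
    }

    // The first n outputs starting from the given state.
    static long[] sequence(long state, int n) {
        long[] out = new long[n];
        for (int i = 0; i < n; i++) {
            state = next(state);
            out[i] = state;
        }
        return out;
    }

    public static void main(String[] args) {
        // Without the fix, a zero state never leaves zero.
        System.out.println(Arrays.toString(sequence(0, 4)));
        // With the fix, seeds 0 and 1 become indistinguishable.
        System.out.println(Arrays.toString(sequence(seedFixed(0), 4)));
        System.out.println(Arrays.toString(sequence(seedFixed(1), 4)));
    }
}
```

Any fix that maps the bad seed onto an existing good one has this property: the zero sequence goes away, but two distinct seeds now share a sequence.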

------
dfc
It seems like the right thing to do would be to spend the time composing the
email to tech@o.o and then write the blog post.

~~~
marshray
I've tried pointing out a deficiency in the system RNG to those guys before.

They're not as grateful as you'd think.

~~~
gonzo
OpenBSD is full of navel-gazing.

