
Violating randomization standards - pdw
http://article.gmane.org/gmane.os.openbsd.tech/39839
======
Negitivefrags
This seems like a very bad idea to me.

It's highly ingrained in my mind that rand() provides a deterministic
sequence, and I use deterministic pseudorandom sequences all the time.

If you are changing this just in OpenBSD isn't that going to be a portability
nightmare? I would find this behaviour in OpenBSD extremely surprising.

I'm very sceptical of the concept that developers expect true randomness out
of rand(). Unless the way people are taught this stuff has changed
drastically, the very concept of randomness in computing was always introduced
right alongside the idea that they don't produce true randomness.

In fact, the need to call srand() or get the same sequence every time can't
help but teach you the fact that it's a deterministic sequence.
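
A minimal sketch in C of the determinism described above, assuming a libc whose
rand() follows the traditional behaviour (the same seed replays the same
sequence). `seeded_sequence` is a hypothetical helper made up for illustration.
With the OpenBSD change under discussion, this replay presumably would not hold
there for plain srand().

```c
#include <stdlib.h>

/* Hypothetical helper: fill out[0..n-1] with the first n values that
   rand() yields after seeding with srand(seed). */
static void seeded_sequence(unsigned int seed, int *out, size_t n)
{
    srand(seed);
    for (size_t i = 0; i < n; i++)
        out[i] = rand();
}
```

Calling `seeded_sequence(42u, a, 8)` twice into two different arrays should
give identical contents on such a libc, which is the property that code
relying on repeatable sequences depends on.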

~~~
tedunangst
If people wanted deterministic sequences, they wouldn't be calling
srand(time(NULL)). Or srand((time(NULL) + buf_len) ^ rand()). Or, my personal
favorite, srand(getpid() * time(NULL) % ((unsigned int) -1)). Though I will
admit it is very difficult for me to properly imagine what that developer was
expecting.

~~~
bmm6o
I tend to agree. The determinism of random() is a feature, but it's a feature
that almost nobody actually needs. It's just that the alternatives are either
more trouble than they're worth or platform-dependent.

~~~
fryguy
Why are things like Mersenne Twister or linear congruential generators
platform dependent?

To me, there are three classes of reasons you would need random numbers:

1\. Simulation. Things like Monte Carlo. In this case, only the statistical
quality of the numbers matters.

2\. Repeatable simulation. Things like fuzzing tests or benchmarks, where you
want them to be repeatable. In this case, you care about determinism and the
statistical quality of the numbers.

3\. Resource allocation. In this case, you have a set of things that you want
to be unique, but to either make them unpredictable (process numbers, invoice
numbers) or unique without sharing state (guids). You care about the
statistical quality and unpredictability here.

The problem is that 2 and 3 are at odds with each other. 3 requires a constant
source of entropy, and 2 requires there to be no entropy.

I think the following slides are relevant:
[http://www.openbsd.org/papers/hackfest2014-arc4random/mgp000...](http://www.openbsd.org/papers/hackfest2014-arc4random/mgp00001.html)

~~~
nitrogen
I would add a #4: cryptography.

You don't need to add entropy for #3. If a CSPRNG with n bits of initial
entropy as a seed can satisfy #4, then it's also good enough for #3.

~~~
fryguy
My understanding is that cryptography is done by having a hardware entropy
source in the CPU fed into some sort of hashing algorithm. Having a CSPRNG
with only n bits of initial entropy exposes a problem: if an attacker can
somehow get the state of the CSPRNG, they can predict future random numbers
that are generated (and previous ones, depending on implementation).
Constantly reseeding with entropy means that it's not possible to do that.

------
leni536
You don't always need good random numbers but "random looking" numbers.
Examples:

\- Map generation for games

\- AI for games

\- Random looking texture generation (like Perlin noise)

\- Shuffle a playlist in your music player

\- Clicking "random" on a webcomic's page

I'm sure there are other examples. I think that rand() is much faster than any
good pseudorandom algorithm too.

~~~
nightcracker
> I think that rand() is much faster than any good pseudorandom algorithm too.

This is simply not true. It's often _slower_, since it uses the integer div
instruction.

For an example of a good fast PRNG, see Tyche:
[https://eden.dei.uc.pt/~sneves/pubs/2011-snfa2.pdf](https://eden.dei.uc.pt/~sneves/pubs/2011-snfa2.pdf)

It's faster and produces infinitely better distributions.

~~~
clarry
> This is simply not true. It's often _slower_, since it uses the integer div
> instruction.

Actually I think most implementations reduce modulo a power of two (e.g. 2^31
or 2^32):

    
    
        0000000000048ee0 <rand_r>:
           48ee0:       8b 07                   mov    (%rdi),%eax
           48ee2:       69 c0 6d 4e c6 41       imul   $0x41c64e6d,%eax,%eax
           48ee8:       05 39 30 00 00          add    $0x3039,%eax
           48eed:       89 07                   mov    %eax,(%rdi)
           48eef:       25 ff ff ff 7f          and    $0x7fffffff,%eax
           48ef4:       c3                      retq   
    

[https://en.wikipedia.org/wiki/Linear_congruential_generator#...](https://en.wikipedia.org/wiki/Linear_congruential_generator#Parameters_in_common_use)
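
For readers who don't read x86 disassembly, here is a rough C equivalent of the
listing above. It is a sketch reconstructed from the constants in the
disassembly (multiplier 0x41c64e6d, increment 0x3039, mask 0x7fffffff), not the
library's actual source; the name `lcg_rand_r` is made up.

```c
#include <stdint.h>

/* One step of the linear congruential generator seen in the listing:
   state = state * 1103515245 + 12345 (wrapping mod 2^32),
   result = low 31 bits of the new state. No division is involved. */
static int lcg_rand_r(uint32_t *state)
{
    *state = *state * UINT32_C(0x41c64e6d) + UINT32_C(0x3039);
    return (int)(*state & UINT32_C(0x7fffffff));
}
```

Starting from a state of 1, the first call returns 1103527590, since
1103515245 + 12345 fits in 31 bits and the mask leaves it unchanged.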

~~~
nightcracker
Whoops, I totally forgot about that, I think I was confused with range
reduction on the final result. Nevertheless, Tyche-i is also blazing fast and
small:

    
    
        0000000000000000 <_Z6tycheiv>:
           0:   8b 0d 0c 00 00 00       mov    0xc(%rip),%ecx        
           6:   c4 e3 7b f0 05 07 00    rorx   $0x7,0x7(%rip),%eax   
           d:   00 00 07
          10:   c4 e3 7b f0 15 ff ff    rorx   $0x8,-0x1(%rip),%edx  
          17:   ff ff 08
          1a:   44 8b 05 04 00 00 00    mov    0x4(%rip),%r8d        
          21:   31 ca                   xor    %ecx,%edx
          23:   44 31 c0                xor    %r8d,%eax
          26:   41 29 d0                sub    %edx,%r8d
          29:   c4 e3 7b f0 d2 10       rorx   $0x10,%edx,%edx
          2f:   29 c1                   sub    %eax,%ecx
          31:   c4 e3 7b f0 c0 0c       rorx   $0xc,%eax,%eax
          37:   44 31 c0                xor    %r8d,%eax
          3a:   31 ca                   xor    %ecx,%edx
          3c:   29 c1                   sub    %eax,%ecx
          3e:   89 05 08 00 00 00       mov    %eax,0x8(%rip)        
          44:   41 29 d0                sub    %edx,%r8d
          47:   89 15 00 00 00 00       mov    %edx,0x0(%rip)        
          4d:   44 89 05 04 00 00 00    mov    %r8d,0x4(%rip)        
          54:   89 0d 0c 00 00 00       mov    %ecx,0xc(%rip)        
          5a:   c3                      retq

------
viraptor
It's slightly disappointing that he didn't look through at least a sample of
the apps using something like `srand(silly_value)` to see what the numbers are
used for. (at least that's what I understand from his post) They're using
silly pseudo-random numbers, but what if all of them expect nothing more?
They're breaking a known interface that may just happen to affect some people
when they don't expect it, but there's a chance the "good side" of the change
is not even going to be noticed.

I mean, use cases like choosing a tip of the day to display, or choosing a
starting colour for displaying a number of objects, or a million other ideas
probably wouldn't care if 0 was chosen 10% of the time as a result.

------
sehugg
_All three subsystems produce deterministic results. The reasons for this are
history not known to me, but it might even be linked to Dual_EC_DRBG._

Seriously? I mean rand/srand have been around since what, at least 1980?

~~~
nullc
And then he goes and replaces it with something that calls itself RC4 (but
isn't!)? April fools must have come early.

------
emmelaich
There's been a warning in 'man rand' for as long as I can remember. The man
page on OS X has "bad random number generator" in the NAME.

The amazing thing is that they're still being used.

------
plg
What's the current recommendation for a better random number generator? (C
please)

~~~
kbaker
Maybe arc4random(), from OpenBSD. There were some slides linked earlier about
it [1], it might be worth looking into.

[1]
[http://www.openbsd.org/papers/hackfest2014-arc4random/mgp000...](http://www.openbsd.org/papers/hackfest2014-arc4random/mgp00001.html)

~~~
carey
Is it really wise to choose a random number generator based on RC4, which is
known to have remarkably strong biases? Surely it would be better to use
something based on ChaCha20, Keccak or Spritz if you have the choice?

~~~
grogers
It does use ChaCha20, and the implementation will use something else in
another 20 years most likely.

[http://www.openbsd.org/papers/hackfest2014-arc4random/mgp000...](http://www.openbsd.org/papers/hackfest2014-arc4random/mgp00038.html)

~~~
carey
Serves me right for not reading through all the slides. I would argue that
it's one of the more misleading function names out there, though.

~~~
justincormack
Yes, it is misleading; it used to use RC4. POSIX are proposing to standardise
it under a new generic name, so it is unlikely to get changed until this
happens.

------
platz
Considering an rtl-sdr radio is $15, why not just make it standard to include
a radio which samples atmospheric noise to generate random numbers?

~~~
dfox
First, sampling atmospheric noise is not a good general entropy source,
because it needs careful setup to ensure that what you are sampling actually
is atmospheric noise and not some signal. Also, such noise might well be
unpredictable, but it certainly is not secret (i.e. not readily usable for
cryptographic purposes).

Second, there are significantly cheaper ways to get quality entropy in a
modern digital system. The relative timing of various internal events in a
system with multiple clock sources is one of the meaningful sources. Also, a
purpose-built hardware RNG integrated into some silicon is essentially free
(and mostly cheaper than its interface circuitry).

The problem is not that the system does not have usable entropy, but that
applications tend not to use the good sources that are already there (i.e.
/dev/urandom, CryptGenRandom/rand_s and such).
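
As a rough illustration of using one of those existing sources, here is a
minimal C sketch that reads bytes from /dev/urandom. It assumes a Unix-like
system where that device exists; `urandom_bytes` is a hypothetical helper name,
and real code might prefer a platform call such as arc4random() where
available.

```c
#include <stdio.h>

/* Hypothetical helper: fill buf with len bytes from /dev/urandom.
   Returns 0 on success, -1 if the device cannot be opened or read fully. */
static int urandom_bytes(void *buf, size_t len)
{
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL)
        return -1;
    /* Disable stdio buffering so we only pull as many bytes as requested. */
    setvbuf(f, NULL, _IONBF, 0);
    size_t got = fread(buf, 1, len, f);
    fclose(f);
    return got == len ? 0 : -1;
}
```

A caller would do something like `unsigned char key[16];` followed by
`if (urandom_bytes(key, sizeof key) != 0) { /* handle failure */ }` and must
not fall back to rand() silently if the read fails.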

