

/dev/random is not robust - techtalsky
http://eprint.iacr.org/2013/338

======
tptacek
This is the third time this post has surfaced on HN. Last time, we got a
thread with Ted Ts'o participating:

[https://news.ycombinator.com/item?id=6548893](https://news.ycombinator.com/item?id=6548893)

There are things not to like about Linux's kernel CSPRNG; most notable among
them is the pointless distinction made between random and urandom.

But it's not clear that this paper has a lot of practical impact.
Specifically, it refers to scenarios in which attackers control all the
entropy inputs to the CSPRNG; it would be relevant, perhaps, in the case of a
CSPRNG design that depended _entirely_ on a closed HWRNG like RDRAND (Linux
does not, nor does any other reputable CSPRNG). In reality, though, the only
way an attacker can find themselves with the vantage point to launch
theoretical attacks like these is if they've already thoroughly owned your
kernel.

~~~
throwaway2048
It is worth noting that OpenBSD got rid of /dev/random and replaced it with
/dev/urandom several years ago in response to the issues surrounding it.

~~~
tedunangst
OpenBSD has done several things with the random device and its userland
interface, but nothing like what you have described. The opposite, in fact.
Until recently there was no /dev/random node; now there is.

------
SEMW
Initial thoughts on the paper from Ted Ts'o (maintainer of /dev/random), in
the comments on Bruce Schneier's blog: [1], [2]. (TL;DR: "_What the authors of
this paper seem to be worried about is not even close to the top of my list in
terms of things [about /dev/random] I'm worried about._")

[1]
[https://www.schneier.com/blog/archives/2013/10/insecurities_...](https://www.schneier.com/blog/archives/2013/10/insecurities_in.html#c1909001)

[2]
[https://www.schneier.com/blog/archives/2013/10/insecurities_...](https://www.schneier.com/blog/archives/2013/10/insecurities_in.html#c1909182)

------
tbrownaw
_This property states that a good PRNG should be able to eventually recover
from compromise even if the entropy is injected into the system at a very slow
pace, and expresses the real-life expected behavior of existing PRNG designs._

If someone figures out the internal state of a PRNG that doesn't take input
entropy, they can predict all future outputs.

If someone figures out the internal state of a PRNG that _does_ take input
entropy, they can mostly predict near-future output but shouldn't be able to
predict what the output will be after enough new input entropy has been
provided.

I think this paper is saying that that "shouldn't" either isn't true of Linux,
or requires more input entropy to kick in than it ought to?
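
A toy sketch of that recover-from-compromise property (a hypothetical
hash-chain construction, for illustration only; not the kernel's actual
design):

```python
import hashlib
import os

class ToyPRNG:
    """Toy hash-based PRNG, for illustration only."""

    def __init__(self, seed: bytes):
        self.state = hashlib.sha256(seed).digest()

    def mix_in(self, entropy: bytes) -> None:
        # Refresh the internal state with newly gathered entropy.
        self.state = hashlib.sha256(self.state + entropy).digest()

    def output(self) -> bytes:
        out = hashlib.sha256(b"out" + self.state).digest()
        # Step the state forward so outputs don't repeat.
        self.state = hashlib.sha256(b"step" + self.state).digest()
        return out

prng = ToyPRNG(os.urandom(32))

# Attacker learns the internal state ("compromise"):
shadow = ToyPRNG(b"")
shadow.state = prng.state

# Near-future output is now fully predictable...
assert shadow.output() == prng.output()

# ...until fresh entropy the attacker can't see is mixed in ("recovery").
prng.mix_in(os.urandom(16))
assert shadow.output() != prng.output()
```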

------
berntb
This is certainly naive, but I have wondered about it for a long time, so I'll
stick my foot in my mouth. :-)

Shouldn't an easy way of getting reasonably good random values be to take
three or four best-of-breed implementations of algorithms, with as much
entropy injection as possible, and then XOR their outputs together?

(Or rather, use one of several alternative ways of mixing the bit streams
together, of which XOR is one, chosen based on a random byte from one of the
streams. Another byte determines how long until the mixing algorithm changes.)

Sure, you lose a third of the random data.

Hell, it should be enough to randomly throw away most of the generated random
bits?

It's been a long time since I did any computer security, but shouldn't this
work?
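
For concreteness, the mechanics of the idea as a minimal sketch; the three
generators here are arbitrary stand-ins, not actual best-of-breed PRNGs:

```python
import os
import random
import secrets

def xor_bytes(*streams: bytes) -> bytes:
    """XOR several equal-length byte strings together."""
    out = bytearray(len(streams[0]))
    for s in streams:
        for i, b in enumerate(s):
            out[i] ^= b
    return bytes(out)

# Stand-ins for three independent generators, each with its own entropy:
a = os.urandom(32)
b = secrets.token_bytes(32)
c = random.getrandbits(256).to_bytes(32, "big")  # NOT cryptographic; placeholder

print(xor_bytes(a, b, c).hex())
```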

~~~
keypusher
No. The strength is only as good as your best algorithm; XORing two mediocre
algorithms does not give you a better one. In fact, you might be more likely
to end up with collisions in the final product after XORing two results
together than you would have had in either result on its own, making your
random output even worse.

To quote: _in any iterated hash function it is relatively easy to find
exponential sized multicollisions, and thus the concatenation of several hash
functions does not increase their security._

Source:
[http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.99.9...](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.99.9368)

See also discussion of why XOR is often a bad idea for combining hashes:
[http://stackoverflow.com/questions/5889238/why-is-xor-the-de...](http://stackoverflow.com/questions/5889238/why-is-xor-the-default-way-to-combine-hashes)

And this question about combining different hashes:
[http://crypto.stackexchange.com/questions/270/guarding-again...](http://crypto.stackexchange.com/questions/270/guarding-against-cryptanalytic-breakthroughs-combining-multiple-hash-functions)

~~~
berntb
Uh, I am stressed at work -- but most of that looked like a discussion of
cryptographic hash algorithms, not generation of random numbers?

Your point is obviously true, then. Sorry if I was unclear?

~~~
keypusher
I am quite new to both topics as well, but it seems like there are some
similarities. That is, from the excerpt in the original article, the PRNG used
by /dev/random essentially takes noise from some external input, and runs it
through "a cryptographic function that outputs random numbers from the
continually internal state". In both cases (cryptography and random numbers,)
the goal is to transform known data into something that is indistinguishable
from random. The way I understand it is that the entropy in the random number
generator doesn't come from the algorithms, entropy comes from noise gathered
off the hardware signals. The algorithms just make sure that even if you have
biased noise generating the input, attackers will still have a very hard time
using that to control the end result. I think this a very interesting question
by the way, and hope my previous answer didn't come off as dismissive. I am
still a novice in the world of security, and so I wanted to put links in
instead of trying to explain this myself, as I probably would get it horribly
wrong.
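
A toy illustration of that whitening step (the biased source here is
hypothetical; a real extractor also has to account for how much entropy the
pool actually holds):

```python
import hashlib
import random

def biased_noise(nbits: int) -> bytes:
    """Simulate a heavily biased hardware noise source (hypothetical):
    mostly zero bits, with occasional unpredictable ones."""
    bits = [1 if random.random() < 0.05 else 0 for _ in range(nbits)]
    return bytes(
        sum(bit << j for j, bit in enumerate(bits[i:i + 8]))
        for i in range(0, nbits, 8)
    )

# Whitening: hash a large pool of biased samples down to 32 bytes.
# The output looks uniform only if the pool really contained enough
# unpredictability -- the hash cannot create entropy, only concentrate it.
pool = biased_noise(8192)
print(hashlib.sha256(pool).hexdigest())
```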

~~~
berntb
Thanks for the time. (I never read the second Knuth book at university. :-) )

OK, so obviously the obfuscation algorithms that consume entropy are
equivalent to some form of hashing/encryption function.

So, if I understand this:

Just keeping every Xth bit (where X varies over time) of the generated data
doesn't help (it only adds a little obfuscation to the algorithm?). And
combining multiple algorithms, each with its own source of entropy, is not
superior to e.g. just feeding all three entropy sources into a single
generator to get better data?
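
On the decimation point, a quick sketch of why it doesn't help (toy
counter-mode generator, hypothetical): an attacker who knows the state can
compute every output, so discarding some of it discards nothing from the
attacker's point of view.

```python
import hashlib

def toy_stream(seed: bytes, nbytes: int) -> bytes:
    """Toy generator (SHA-256 in counter mode); fully determined by the seed."""
    out = b""
    i = 0
    while len(out) < nbytes:
        out += hashlib.sha256(seed + i.to_bytes(8, "big")).digest()
        i += 1
    return out[:nbytes]

seed = b"state the attacker already knows"

victim   = toy_stream(seed, 1024)[::7]  # keep every 7th byte (decimation)
attacker = toy_stream(seed, 1024)[::7]  # attacker computes the same thing

assert victim == attacker  # decimation hides nothing once the state is known
```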

