
Myths about /dev/urandom - petrosagg
http://www.2uo.de/myths-about-urandom/
======
po
Why hasn't someone qualified re-written that unix man page by now? I've been
reading cryptographers trying to explain all of this for a while now. I feel
like that would help put a lot of this to rest and save everyone time.

~~~
AndyKelley
It would have saved me a lot of time, that's for sure. And here I thought I
was being responsible and informed by reading the man page.

~~~
cbsmith
Honestly, anyone who cares & can understand will look at the implementation of
the two and grok it. Those who don't will screw up the rest of it anyway, and
that will create more than enough trouble even with some helpful guidance.
Best to have them fail in pretty obvious ways.

~~~
tedunangst
I would not classify the /dev/random failure modes as "pretty obvious".

~~~
cbsmith
I guess it is a matter of perspective. If you understand what you need for
your cryptography, you'll know if you want /dev/random or not. The fact that
it blocks is very, very well advertised and clear. The fact that the
difference is that /dev/urandom won't block is also very, very clear. If you
don't know which you want, I'd wager you don't understand your cryptographic
protocol's needs terribly well, and are best off outsourcing that work to
someone else.

~~~
ghshephard
Last time I tried using gpg on a VM it failed to work (literally would not do
anything) because it blocked on /dev/random. Would you say that the gpg people
should be outsourcing their work to someone else?

~~~
dfc
Things have improved a little on this front. It turns out GnuPG was being a little gluttonous when it came to entropy:

"Daniel Kahn Gillmor observed that GnuPG reads 300 bytes from /dev/random when
it generates a long-term key, which, he observed, is a lot given /dev/random's
limited entropy . Werner explained that GnuPG has always done this. In
particular, GnuPG maintains a 600-byte persistent seed file and every time a
key is generated it stirs in an additional 300 bytes. Daniel pointed out an
interesting blog post by DJB explaining that a proper CSPRNG should never need
more than about 32 bytes of entropy. Peter Gutmann chimed in and noted that a
2048-bit RSA key needs about about 103 bits of entropy and a 4096-bit RSA key
needs about 142 bits, but, in practice, 128-bits is enough. Based on this,
Werner proposed a patch for Libgcrypt that reduces the amount of seeding to
just 128-bits."[^1]

On a related note, why are you generating keys on a remote VM? It's probably not fair to say that gpg "failed to work (literally would not do anything)." It was doing something: GnuPG was waiting for more entropy. Needing immediate access to cryptographic keys that you just generated on a recently spun-up remote VM is kind of a strange use case.

[^1]: [https://www.gnupg.org/blog/20150607-gnupg-in-may.html](https://www.gnupg.org/blog/20150607-gnupg-in-may.html)

~~~
ghshephard
Thanks very much for the updates on entropy requirements.

Re: "Why are you generating keys on a remote VM" \- prior to this, it hadn't
occured to me I couldn't generate a gpg key on linode/digital ocean, VMs. I
realize now that keys should be generated on local laptops (or such), and
copied up.

Re: "Fair to say failed to work" \- It just sat their for 90+ minutes - I
spent a couple hours researching, and found a bug (highly voted on) that other
people had run into the same issue. But, honestly - don't you think that gpg
just hanging for 90+ minutes for something like generating a 2048 bit RSA key
should be considered, "failing to work?" \- I realize under the covers (now)
what was happening - but 99% of the naive gpg using population would just give
up in the same scenario instead of trying to debug it.

~~~
cbsmith
Yeah, the bug was really how it handled the case of waiting forever without telling you why. In GPG's defense, before it actually starts reading from /dev/random, it does give you all kinds of warnings that it needs sources of entropy before it can make any progress.

Hard to get that kind of thing right, but fundamentally it did stop you from
making exactly the kind of terrible mistake that I was talking about. ;-)

------
tytso
Actually, the preferred and recommended way to get randomness on modern
kernels is to use the new getrandom(2) system call, with the flags argument
set to zero.

[http://man7.org/linux/man-pages/man2/getrandom.2.html](http://man7.org/linux/man-pages/man2/getrandom.2.html)

With the flags set to zero, it works like the getentropy(2) system call in
OpenBSD. In fact, code that uses getentropy(buf, buflen) can be trivially
ported to Linux as getrandom(buf, buflen, 0).
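A minimal sketch of that call, assuming glibc 2.25+ which exposes the wrapper in <sys/random.h> (older systems would need syscall(SYS_getrandom, ...)):

    #include <stdio.h>
    #include <sys/random.h>   /* glibc >= 2.25; otherwise use syscall(SYS_getrandom, ...) */

    int main(void) {
        unsigned char buf[32];
        /* flags = 0: behaves like OpenBSD's getentropy(); blocks only until
           the kernel pool has been initialized once, then never again. */
        if (getrandom(buf, sizeof buf, 0) != (ssize_t)sizeof buf) {
            perror("getrandom");
            return 1;
        }
        for (size_t i = 0; i < sizeof buf; i++)
            printf("%02x", buf[i]);
        putchar('\n');
        return 0;
    }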

~~~
beefhash
This is all fine and dandy for seeding a CSPRNG, but we still don't actually
have a CSPRNG in the C stdlib or glibc. getrandom(2) seems to target
exclusively getting entropy for seeding, looking at the man page ("These bytes
can be used to seed userspace random number generators or for cryptographic
purposes."). And of course, the syscall is very recent, introduced in Linux
3.17, coming after the kernels in Debian stable (jessie), which features 3.16,
and CentOS/RHEL7, which features 3.10.

OpenBSD and NetBSD feature a ChaCha-based arc4random; FreeBSD and libbsd still seem stuck with RC4-based arc4random[1]. An equivalent is sorely missing on Linux. /dev/urandom requires messing with file descriptors, which you may run out of, and reads from it require error handling, plus all kinds of security precautions to make sure you're actually looking at the right /dev/urandom[2].

[1]
[https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=182610](https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=182610)
and
[https://bugs.freedesktop.org/show_bug.cgi?id=85827](https://bugs.freedesktop.org/show_bug.cgi?id=85827)

[2] [http://insanecoding.blogspot.ch/2014/05/a-good-idea-with-bad-usage-devurandom.html](http://insanecoding.blogspot.ch/2014/05/a-good-idea-with-bad-usage-devurandom.html)
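To make the "messing with file descriptors" point concrete, here is a minimal sketch of the bookkeeping a careful /dev/urandom reader needs (it does not attempt the device-node sanity checks discussed in [2]):

    #include <errno.h>
    #include <fcntl.h>
    #include <stddef.h>
    #include <unistd.h>

    /* Fill buf with len random bytes from /dev/urandom; returns 0 on success. */
    int read_urandom(unsigned char *buf, size_t len) {
        int fd = open("/dev/urandom", O_RDONLY | O_CLOEXEC);
        if (fd < 0)
            return -1;                    /* can fail, e.g. out of descriptors */
        size_t got = 0;
        while (got < len) {
            ssize_t n = read(fd, buf + got, len - got);
            if (n < 0) {
                if (errno == EINTR)
                    continue;             /* interrupted by a signal: retry */
                close(fd);
                return -1;
            }
            got += (size_t)n;             /* short reads are allowed */
        }
        close(fd);
        return 0;
    }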

~~~
justincormack
There is an open request to add a posix_random (basically arc4random) to POSIX:
[http://austingroupbugs.net/view.php?id=859](http://austingroupbugs.net/view.php?id=859)

------
antirez
The root problem is that cryptography is not usable without some understanding. Lately we have developed an addiction to "best practices" like "use bcrypt" or "use /dev/random", which in the field of cryptography are not enough: understanding is the only possible path. Actually, this is true for most things; it just shows up more obviously in the field of cryptography because of the security concerns. So we should replace "use bcrypt" with "read Applied Cryptography and other books".

~~~
ReidZB
The average guy doesn't have the time to read enough about cryptography to gain the sort of knowledge you want. This very article is debunking a misunderstanding born of incomplete and poor cryptographic knowledge.

If there's anything I've learned by studying cryptography, it's that the
average person needs to invest _significant_ resources to become a pseudo-
expert in cryptography - all before they can make any meaningful decision.

Applied Cryptography is not a terribly good starting point for cryptographic
understanding these days. That's just another facet of difficulty: where does
one even start? Do you want a theoretical baseline understanding? Do you want
the high-level, quick-and-dirty overview which only gives a summary? (That'll
make it hard to make real, informed decisions...)

If we really want to make cryptography accessible to many developers, the best
solution is for the cryptographic community to make our libraries and
interfaces better. At the same time, there comes a point where a developer has
to stop and say "this is beyond what is standardized in ${widely used
library}; we need to hire a cryptographer." In that sense, we need to instill
a better anti-crypto ethic.

~~~
antirez
Basically, to become an expert cryptographer you need a math degree and ten years of experience, so that is out of the question indeed. What I'm referring to is getting enough information to understand the big picture: what a stream cipher is, what a block cipher is, what a cryptographic hash function is and its main properties, how many of those primitives are sometimes roughly equivalent and can be used to build one another, the tradeoff between speed and security (and how the number of rounds affects the security of crypto building blocks), analyzing simple algorithms in order to _really_ understand why it is so hard to create something secure yourself, secure versus weak PRNG generation (and how to break a linear congruential generator; see the sketch below), algorithms like DH and RSA, basic number theory, and so forth. This will not make you an expert, but it will give you enough understanding to actually see _why_ a rule or a best practice is used and when it is safe or not to break it.
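As a hedged toy illustration of the LCG item above (the modulus here is assumed known and prime; real attack settings can need more work):

    #include <stdint.h>
    #include <stdio.h>

    static const int64_t M = 2147483647;            /* 2^31 - 1, prime */

    static int64_t mulmod(int64_t a, int64_t b) { return (int64_t)((__int128)a * b % M); }

    static int64_t powmod(int64_t b, int64_t e) {   /* b^e mod M, for the inverse b^(M-2) */
        int64_t r = 1;
        for (; e; e >>= 1, b = mulmod(b, b))
            if (e & 1) r = mulmod(r, b);
        return r;
    }

    int main(void) {
        /* The "victim" LCG: x_{n+1} = a*x_n + c (mod M), parameters secret. */
        int64_t a = 1103515245, c = 12345, x0 = 42;
        int64_t x1 = (mulmod(a, x0) + c) % M;
        int64_t x2 = (mulmod(a, x1) + c) % M;
        int64_t x3 = (mulmod(a, x2) + c) % M;

        /* Attacker sees only x1, x2, x3:
           a = (x3 - x2) * (x2 - x1)^-1 (mod M),  c = x2 - a*x1 (mod M). */
        int64_t a_rec = mulmod(((x3 - x2) % M + M) % M,
                               powmod(((x2 - x1) % M + M) % M, M - 2));
        int64_t c_rec = ((x2 - mulmod(a_rec, x1)) % M + M) % M;

        printf("recovered a=%lld c=%lld\n", (long long)a_rec, (long long)c_rec);
        return 0;
    }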

About the starting point, this is incredibly sad but true: there is no Applied Cryptography of 2015. The book is by now somewhat outdated and no replacement exists; what you can do, however, is read it and then read the documents that are around to get updated information. There are also online cryptography courses now that really help. This may look like overkill, but at this point crypto is everywhere and is the foundation of most things secure, so it is a requirement for everybody involved with computer security.

~~~
tedunangst
Well, this is the point of nacl (and successors). You don't need to know that
it's using a stream cipher. You only need to know you want to send a secret,
and this is the function that does that.
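For instance, a minimal sketch with libsodium (one of the NaCl successors); the parties and message here are just made up for illustration:

    #include <sodium.h>
    #include <stdio.h>

    int main(void) {
        if (sodium_init() < 0)
            return 1;

        /* Each party has a keypair; assume the public keys were exchanged already. */
        unsigned char a_pk[crypto_box_PUBLICKEYBYTES], a_sk[crypto_box_SECRETKEYBYTES];
        unsigned char b_pk[crypto_box_PUBLICKEYBYTES], b_sk[crypto_box_SECRETKEYBYTES];
        crypto_box_keypair(a_pk, a_sk);
        crypto_box_keypair(b_pk, b_sk);

        const unsigned char msg[] = "attack at dawn";
        unsigned char nonce[crypto_box_NONCEBYTES];
        unsigned char ct[sizeof msg + crypto_box_MACBYTES];
        randombytes_buf(nonce, sizeof nonce);       /* never reuse a nonce for this key pair */

        /* "Send a secret": encrypts and authenticates in one call. */
        crypto_box_easy(ct, msg, sizeof msg, nonce, b_pk, a_sk);

        /* Recipient decrypts and verifies the sender in one call. */
        unsigned char out[sizeof msg];
        if (crypto_box_open_easy(out, ct, sizeof ct, nonce, a_pk, b_sk) != 0)
            return 1;                               /* forged or corrupted */
        printf("%s\n", out);
        return 0;
    }

No stream cipher, MAC, or key-derivation details leak into the calling code.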

~~~
falcolas
How secret is it? How much effort do you want an adversary to have to spend, versus your intended recipient? How do you want to manage keys between yourself and the recipient? What is the size of the secret? How much do you trust the channels over which you are sending the message? Do you need to validate the identity of the secret's recipient? How many secrets do you need to send to how many recipients each minute?

All of these influence how the secret should be bundled up and sent, and it
takes more than a library to pick the appropriate method.

~~~
tedunangst
OK, I admit, if you like you can make it much harder than it needs to be.

------
AndyKelley
Thanks for this. Time to go fix my code[1].

[1]:
[https://github.com/andrewrk/genesis/blob/0d545d692110d33068d...](https://github.com/andrewrk/genesis/blob/0d545d692110d33068dee36bbc2492d7aa8aa325/src/os.cpp#L47)

~~~
masklinn
Since you're only targeting Linux, if you are fine with requiring recent
kernels (3.17), you should probably use getrandom(2):
[http://man7.org/linux/man-pages/man2/getrandom.2.html](http://man7.org/linux/man-pages/man2/getrandom.2.html)

------
msm23
I prefer to trust the NSA on these matters. They end up saying much of what
the author has written, but they make it clear why you want to use one vs the
other.

The excerpt below is from
[https://www.nsa.gov/ia/_files/factsheets/I43V_Slick_Sheets/S...](https://www.nsa.gov/ia/_files/factsheets/I43V_Slick_Sheets/Slicksheet_RNG_IntroForAppDev.pdf)
(which in turn also references
[https://www.nsa.gov/ia/_files/factsheets/I43V_Slick_Sheets/S...](https://www.nsa.gov/ia/_files/factsheets/I43V_Slick_Sheets/Slicksheet_RNG_IntroForOpSysDev.pdf)
)

Unix-like Platforms (e.g. Linux, Android, and Mac OS X):

Application developers should use the fread function to read random bytes from
/dev/random for cryptographic RNG services. Because /dev/random is a blocking
device, /dev/random may cause unacceptable delays, in which case application
developers may prefer to implement a DRBG using /dev/random as a conditioned
seed.

Application developers should use the “Random Number Generators: Introduction
for Operating System Developers” guidance in developing this solution. If
/dev/random still produces unacceptable delays, developers should use
/dev/urandom which is a non-blocking device, but only with a number of
additional assurances:

- The entropy pool used by /dev/urandom must be saved between reboots.

- The Linux operating system must have estimated that the entropy pool contained the appropriate security strength entropy at some point before calling /dev/urandom. The current pool estimate can be read from /proc/sys/kernel/random/entropy_avail.

At most 2^80 bytes may be read from /dev/urandom before the developer must
ensure that new entropy was added to the pool.
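The fread()-based pattern the sheet describes looks roughly like this as a sketch (whether the blocking behaviour buys you anything is exactly what the rest of this thread disputes):

    #include <stdio.h>

    int main(void) {
        unsigned char seed[32];
        FILE *f = fopen("/dev/random", "rb");
        if (!f)
            return 1;
        /* Blocking device: this can stall for a long time on an idle machine. */
        if (fread(seed, 1, sizeof seed, f) != sizeof seed) {
            fclose(f);
            return 1;
        }
        fclose(f);
        /* ... per the guidance, use seed to key a userspace DRBG ... */
        return 0;
    }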

~~~
wfleming
Maybe I'm just tired & not able to detect sarcasm right now, but isn't 2^80
bytes more bytes than are currently stored in the world? That's on the order
of 10^24, which is something like 1 million exabytes, which is a million
squared terabytes, right?

~~~
raverbashing
No, it's called being safe.

And I wouldn't put it past them to have an attack (at least theoretical) that exploits this.

~~~
viraptor
There's safe and there's FUD. Nobody will read 2^80 bytes from urandom. You'll
literally run out of time before that. It would take around 1,782,051,134
years to do on my system.
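For scale, a quick back-of-the-envelope check (the ~20 MB/s sustained read rate is an assumption, roughly what older urandom implementations managed):

    #include <math.h>
    #include <stdio.h>

    int main(void) {
        double bytes = pow(2.0, 80);                 /* the NSA sheet's limit */
        double rate  = 20e6;                         /* assumed bytes per second */
        double years = bytes / rate / (365.25 * 24 * 3600);
        printf("%.2g years\n", years);               /* prints ~1.9e+09 */
        return 0;
    }

i.e. on the order of a billion years, consistent with the figure above.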

So if they write that there's a vulnerability after reading 2^80 bytes -
that's great! We're secure. If they write that you must ensure to do something
after 2^80 bytes - that's complete bullshit.

~~~
raverbashing
Yes, reading 2^80 bytes for a practical attack is impossible today (and for the near future).

However, remember when attacks on 3DES and MD5 were only theoretical?

Also, you may not even need to read 2^80 bytes; there might be a (future) vulnerability that allows you to shortcut this.

~~~
johncolanduoni
The difference is that weaknesses were found in 3DES and MD5. Increasing
computing power was not the main factor. "Only" being able to produce 2^80
random bytes is a known and expected limitation. Sure, the CSPRNG could in
theory be found to have a weakness, but that has nothing to do with the 2^80
bytes and the same could be said for virtually any cryptographic algorithm.

~~~
tptacek
What weaknesses in 3DES are you thinking about that yield practical attacks?

~~~
johncolanduoni
I am not arguing that there are; that was the parent comment. However, while the 3DES weaknesses don't yield practical attacks now, they still reduce the effective key length. My point was not that 3DES is different in that it is exploitable, but that it differs from the 2^80 limit on the CSPRNG in that the latter is not the result of a mistake in the algorithm's design but an expected feature, just like the fact that any fixed-size-key symmetric cipher is "limited" by that key size.

Now, if someone found a lower limit based on exploiting some weakness in the random number generation, the analogy with 3DES and MD5 would make more sense.

------
panic
Why doesn't Linux follow FreeBSD and provide a single interface (under two
names, maybe, for compatibility)? Is there any reason to use /dev/random?

~~~
geofft
Ego. They've put a lot of work into the entropy-tracking thing and they don't
want to admit that it wasn't needed.

(Ego is the reason for a bunch of other security misfeatures in Linux.
Securelevel comes to mind, where Linus explicitly said after fifteen years,
okay, this was in fact the right model all along, you can merge it, just call
it something other than securelevel so I don't have to eat my words about
securelevel being a mistake.)

------
aidenn0
Is there a patch to make /dev/random behave like /dev/urandom? I've had at
least some server software stall blocking on /dev/random, and I already build
my own kernels, so incorporating such a patch is easy.

~~~
mappu

      rm /dev/random && mknod -m 444 /dev/random c 1 9

~~~
throwaway2048
this won't survive a reboot; you need udev rules.

------
1_player
Is there anyone that, with full access to the machine/kernel, has managed to
predict the output of random/urandom?

I know that a PRNG is predictable if you know all the input variables, and the
code for it is publicly available, but has anybody in practice been able to
exploit that?

EDIT: that's an honest question. I'd like to read a paper about that.

------
theophrastus
Interesting and edifying, thank you. I wonder how the recent 4.2 kernel
release affects this: Linux 4.2 Released Improving Cryptography Options[1]

[1]: [http://www.linuxplanet.com/news/linux-4.2-released-improving-cryptography-options.html](http://www.linuxplanet.com/news/linux-4.2-released-improving-cryptography-options.html)

------
snorrah
Very interesting article. Clears up the preconceptions I had about using
/dev/random over /dev/urandom ("it's more secure!") and explains why in very
straightforward language.

------
vbezhenar
Java is known to read /dev/random when dealing with the SecureRandom class. In particular, it can cause extremely slow startup times for Tomcat on a fresh virtual machine. There's the parameter "-Djava.security.egd=file:/dev/./urandom", and I always felt unsafe using it. Thanks to this article, now I won't regret it even theoretically.

Fun thing is, if you pass "/dev/urandom" to this parameter, Java will read /dev/random anyway. Maybe that was a wise decision 20 years ago.

------
wangweij
This is my understanding: /dev/urandom is just a blind producer returning any number from some pool; /dev/random is a producer as well as an inspector: it returns the same numbers from the pool, but it also keeps an extra eye on the source of that pool and refuses to work if it believes the source is of low quality -- here, not enough entropy.

~~~
mikeash
The problem is that /dev/random is too paranoid. It refuses to work if there
isn't enough entropy in the pool, yes. But when it produces numbers, it
subtracts that amount from the supposed amount of entropy in the pool. Thus it
will refuse to work again pretty soon in a lot of circumstances where it's
still completely safe to proceed.

For a concrete example, you start out with zero bits in the pool. /dev/urandom will produce a predictable stream of bits. This is extremely bad. /dev/random will block. This is good.

Now you add 1024 bits to the pool. Both /dev/random and /dev/urandom will
produce good numbers.

Now let's say you read 1024 bits from /dev/random. This will reduce the
entropy counter back to zero. If you then try to read another 1024 bits from
/dev/random, it will block.

But this is nonsense! Those 1024 bits you added before aren't depleted just
because you pulled 1024 from /dev/random! It is perfectly safe to proceed
generating more numbers at this point (or at least it's as safe as it was
before), but /dev/random refuses to.

Many systems don't add entropy to the pool very quickly so it's entirely
possible to "deplete" it in this way. Then your code using /dev/random wedges
and you have a problem. However, few systems are ever in a state where they
have zero entropy. Thus the advice to use /dev/urandom.

Ideally, you'd want a device which blocks if and only if the entropy pool
hasn't been properly seeded yet, but which never blocks again after that
point. Apparently this is what the BSDs do, but Linux doesn't have one.
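You can watch the accounting described above happen; a rough sketch (the exact numbers depend on how busy your entropy sources are):

    #include <stdio.h>

    /* Read the kernel's current entropy estimate, in bits. */
    static long entropy_avail(void) {
        long bits = -1;
        FILE *f = fopen("/proc/sys/kernel/random/entropy_avail", "r");
        if (f) {
            if (fscanf(f, "%ld", &bits) != 1)
                bits = -1;
            fclose(f);
        }
        return bits;
    }

    int main(void) {
        printf("before: %ld bits\n", entropy_avail());

        unsigned char buf[128];                      /* 1024 bits */
        FILE *r = fopen("/dev/random", "rb");
        if (!r)
            return 1;
        if (fread(buf, 1, sizeof buf, r) != sizeof buf) {   /* may block while the pool refills */
            fclose(r);
            return 1;
        }
        fclose(r);

        /* The counter drops by roughly what was read, even though the
           underlying generator is no weaker than it was a moment ago. */
        printf("after:  %ld bits\n", entropy_avail());
        return 0;
    }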

~~~
lsh123
"But this is nonsense! Those 1024 bits you added before aren't depleted just
because you pulled 1024 from /dev/random!"

An observer of the produced random numbers can potentially deduce the next numbers from the first 1024 random numbers. This is the reason why /dev/random requires more randomness to be added -- to prevent guessing of the next number.

~~~
tptacek
No, they can't. That's not how crypto DRBGs work. If you could do that, you'd
have demonstrated a flaw in the entire /dev/random apparatus, not a reason not
to use urandom.

Think of a CSPRNG almost exactly the way you would a stream cipher --- that's
more or less all a CSPRNG is. Imagine you'd intercepted the ciphertext of a
stream cipher and that you knew the first 1024 plaintext bytes, because of a
file header or because it contained a message you sent, or something like
that. Could you XOR out the known plaintext, recover the first 1024 bytes of
keystream, and use it to predict the next 1024 bytes of keystream? If so,
you'd have demolished the whole stream cipher. Proceed immediately to your
nearest crypto conference; you'll be famous.

Modern CSPRNGs, and Linux's, work on the same principle. They use the same
mechanisms as a stream cipher (you can even turn a DRBG into a stream cipher).
The only real difference is that you select keys for a stream cipher, and you
use a feed of entropy as the key/rekey for a CSPRNG.
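A sketch of that construction (using libsodium's ChaCha20 keystream purely for illustration; this is not how the Linux in-kernel generator is built):

    #include <sodium.h>
    #include <stdio.h>
    #include <sys/random.h>

    int main(void) {
        if (sodium_init() < 0)
            return 1;

        /* The "key" of the stream cipher is just a feed of entropy. */
        unsigned char seed[crypto_stream_chacha20_KEYBYTES];
        if (getrandom(seed, sizeof seed, 0) != (ssize_t)sizeof seed)
            return 1;

        /* The keystream itself is the DRBG output. Knowing the first N bytes
           of it tells you nothing about the next N without the key. */
        unsigned char nonce[crypto_stream_chacha20_NONCEBYTES] = {0};
        unsigned char out[64];
        crypto_stream_chacha20(out, sizeof out, nonce, seed);

        for (size_t i = 0; i < sizeof out; i++)
            printf("%02x", out[i]);
        putchar('\n');
        return 0;
    }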

It's facts like this that make the Linux man page so maddening, with its weird
reference to attacks "not in the unclassified literature".

------
frankzinger
Previously:
[https://news.ycombinator.com/item?id=7359992](https://news.ycombinator.com/item?id=7359992)

