First of all, the paper is incorrect in some minor details; the most significant error is its (untrue) claim that we stop gathering entropy when the entropy estimate for a given entropy pool is "full". Before July 2012, we went into a trickle mode where we only took in 1 in 096 values. Since then, the main way that we gather entropy, which is via add_interrupt_randomness(), has no such limit. This means that we will continue to collect entropy even if the input pool is apparently "full".
This is critical, because secondly their hypothetical attacks presume certain input distributions which have an incorrect entropy estimate --- that is, either zero actual entropy but a high entropy estimate, or a high entropy, but a low entropy estimate. There has been no attempt by the paper's authors to determine whether the entropy gathered by Linux meets either of their hypothetical models, and in fact in the "Linux Pseudorandom Number Generator Revisited", the analysis showed that our entropy estimator was actually pretty good, given the real-life inputs that we are able to obtain from an actual running Linux system.
The main thing which I am much more worried about is that on various embedded systems, which do not have a fine-grained clock, and which is reading from flash which has a much more deterministic timing for their operations, is that when userspace tries to generate long-term public keys immediately after the machine is taken out of the box and plugged in, that there isn't a sufficient amount of entropy, and since most userspace applications use /dev/urandom since they don't want to block, that they end up with keys that aren't very random. We had some really serious problems with this, which was written up in the "Mining Your Ps and Qs: Detection of Widespread Weak Keys in Network Devices" paper, and the changes made in July 2012 were specifically designed to address these worries.
However, it may be that on certain systems, in particular ARM and MIPS based systems, where a long-term public key is generated very shortly after the first power-on, that there's enough randomness that the techniques used in  would not find any problems, but that might be not enough randomness to prevent our friends in Fort Meade from being able to brute force guess the possible public-private key pairs.
Speaking more generally, I'm a bit dubious about academic analysis which are primarily worried about recovering from the exposure of the state of the random pool. In practice, if the bad guy can grab the state of random pool, they probably have enough privileged access that they can do much more entertaining things, such as grabbing the user's passphrase or just grabbing their long-term private key. Trying to preserve the amount of entropy in the pool, and making sure that we can extract as much uncertainty from the system as possible, are much higher priority things to worry about.
That's not to say that I might not make changes to /dev/random in reaction to academic analysis; I've made changes in reaction to , and I have changes queued for the next major kernel release up to make some changes to address concerns raised in . However, protection against artificially constructed attacks is not the only thing which I am worried about. Things like making sure we have adequate entropy collection on all platforms, especially embedded ones, and adding some conservatism just in case SHA isn't a perfect random function are some of the other things which I am trying to balance as we make changes to /dev/random.
The term for the "academic analysis" of "recovering from exposure of state" is "forward secrecy", and from what I can tell the whole of the crypto literature that discusses CSPRNGs in any depth seems to be unified on the opinion that forward secrecy is an important property for a CSPRNG to have.
(But that's a reaction to your comment, not your opinion of the paper, because I agree that the attacks here don't seem very practical.)
What do you think of Ferguson's belief that entropy estimators are a flawed design approach for CSPRNGs? FreeBSD uses Yarrow for this, which has two benefits --- first, it loses the silly notion of a "urandom" that is somehow marginally less secure than "random" but is in practice not actually less secure, and second, it comes from a design that has a formal research track record, that people can analyze from the paper and not solely by reversing code.
(I sound snippy but am in fact thrilled that you took the time to comment, and thank you).
We have been moving away from using entropy estimators in /dev/random, actually. We still use it for certain entropy sources, most notably for the keyboard and mouse inputs, where it is useful for filtering out event timings caused by the user leaning on the key and triggering autorepeat. But these days we have a "fast mix pool" (which is per-CPU) where we sample on every single interrupt, and then down-mix from there to much larger entropy pool, and the bulk of the entropy, especially on servers, comes from the fast mix pool.
I tend to focus much more on engineering/social aspects; for example, the fundamental cause of the failure found by the Mining Your P's and Q's paper (as opposed to the hypothetical failures of things like the forward secrecy papers), was people disabling the entropy collection in many/most device drivers because they thought it was too slow for high speed devices, in particular network devices. They made arguments that it was because an adversary could monitor the packet arrival times on the Ethernet (which is not true in a switched fabric anyway, and if the attacker is sitting at Fort Meade, or even at AT&T's data center, they won't know about your local area network), but that was just an excuse; the main reason was engineers who were worried about performance.
So using a fancy "approved by academics" design which uses crypto hashing to mix entropy into the mixing pools may be pointless, if downstream kernel hackers disable the entropy collection "because it's too slow". We now have something hard-wired into the main interrupt code path (it's now no longer optional, so it can't be configured on or off on an individual device driver basis), and I've gone through a lot of work to make it be extremely fast, including using per-CPU pools, worrying about cache effects, etc. These are concerns that I'm sure would never help a professor achieve tenure, but as a practicing engineer, I've been painfully aware about how important these sorts of concerns really are.
Don't get me wrong; I do read the papers from those academics who try to analyze /dev/random, and I appreciate their attention. But I do come from a someone different perspective than they do, and over the years I've learned that it's not wise to blindly take the advice of every single bug report that you get (and I consider papers to be simply formal bug reports :-) since sometimes you can't satisfy every single user and every single bug report.
1. The distinction between /dev/random and /dev/urandom is silly. On a machine that's been been collecting entropy for weeks, /dev/random will still block after reading a few bytes. On the other hand, on an embedded device that's just left the factory, /dev/urandom will happily return "random" bytes that aren't. The result is that /dev/random is unusable for anyone except cipherpunks, and /dev/urandom is unsafe on embedded devices.
2. /dev/urandom is slow, so libraries (including mine) all end up shipping their own PRNG implementations in userspace, where they're susceptible to threading issues, having their entire state duplicated by fork(), "//MD_Update(&m,buf,j)", etc. It's surprisingly difficult to do this correctly in a user-space library, and I bet the NSA and others are having a lot of fun at the expense of our users.
What I'd like is a single, fast, trustworthy /dev/?random device that is good enough to obsolete user-space PRNGs entirely. This means it would have the following properties:
1. It blocks until the kernel decides that it can safely generate cryptographically-strong random bytes (i.e. first boot), and never blocks again unless something extraordinary happens (i.e. it's a VM guest that's just been cloned, or something). The idea would be that it never produces output that isn't cryptographically strong, but is otherwise reliable enough that applications can behave as if it never blocks.
2. It produces output so quickly that userspace programs never have an excuse to use anything else. i.e. Its speed should be comparable to what you get using AES-NI in counter mode (I know you don't like Fortuna's entropy pooling, but its output-side PRNG is exactly what you'd want here.) Ideally, this could be fast enough that developers wouldn't even need to distinguish between "random numbers" and "cryptographically-strong random numbers"---just use it everywhere, even if you might otherwise use Mersenne Twister.
TL;DR - I want a faster, reliable, trust-inspiring /dev/?random that obsoletes all user-space CSPRNGs and possibly all user-space PRNGs.
I've been tempted to write something like this myself, but from my casual reading of the lkml archives, it hasn't been clear whether you were actually interested in RNG patches like this. Are you?
That the problems with the two sources are completely different tells you that the distinction is not silly.
Your speed requirement sounds to me like you don't want a /dev at all, but maybe a daemon that seeds itself from /dev/random and frequently refreshes itself from /dev/urandom. This wouldn't be kernel space and it wouldn't have threading issues.
Second, he just explained why the distinction isn't useful in practice. The sole difference between random and urandom is that random blocks based on an entropy estimate. But that's silly: the only time you want random to block in practice is during cold start, but instead random blocks unpredictably all the time and is unusable. Meanwhile, urandom is effective virtually all of the time... except on cold start (on solid state devices), where it would be OK for it just once to block.
I'm not sure how a userspace daemon helps performance, since it implies roundtripping through the kernel anyways.
Distinction: FreeBSD's /dev/random device allows the desired behaviour (modulo the performance concern), once you issue `sysctl kern.random.sys.seeded=0`, which ensures /dev/random blocks if there isn't enough entropy (the command, in effect, says the source needs more entropy to be trustworthy). I think this is a better behaviour than Linux's, but Linux's behaviour in turn seems to be better than Darwin/OSX's, which AFAICS (`sysctl kern` there shows no settings that obviously influence /dev/random) gives you no opportunity to block before you have enough entropy.
Being able to block until there is enough entropy is essential to what dlitz wants, and the random/urandom distinction offers that.
Colin may correct me on this, but I looked it up again to be sure.
/dev/urandom is the Unix standard for crypto randomness; I don't think there's much point in being more portable. Any Unix on which you're going to deploy strong crypto should provide a workable /dev/random.
I'm not sure what a userland daemon could do for you that the kernel couldn't do. The parent commenter's demands are as rigorous as any CSPRNG user's could be: he's building a crypto library and can't assume his users are casual. If a CSPRNG can be made faster for his use case, there's a strong argument that the kernel should provide that performance by default.
It's also worth noting that Yarrow is a Ferguson/Kelsey design, and Ferguson has foresworn entropy estimation; hence Fortuna, his newer design.
The whole notion of the kernel CSPRNG having "enough" entropy is pretty silly for systems in steady state. "enough" entropy is important at cold start, but rarely otherwise.
Also worth noting that "just read urandom" is Daniel J. Bernstein and Peter Schwabe's recommendation, based on the NaCl design papers. It's also Golang's whole CSPRNG approach, which inasmuch as it's owned by anyone is owned by Adam Langley.
whose HISTORY section reads:
A random device appeared in FreeBSD 2.2. The early version was taken
from Theodore Ts'o's entropy driver for Linux. The current software
implementation, introduced in FreeBSD 5.0, is a complete rewrite by Mark
R V Murray, and is an implementation of the Yarrow algorithm by Bruce
Schneier, et al. The only hardware implementation currently is for the
VIA C3 Nehemiah (stepping 3 or greater) CPU. More will be added in the
[+] Let's think of the topical "sufficiently sophisticated attacker", who might have secret knowledge of problems with the CSPRNG we use.
There are plenty of uses of randomness. Let's look at some of them:
1. Monte Carlo Simulations
2. Wiping disk files that might contain sensitive information.
3. Padding in crypto applications
4. Initialization Vectors (IV's)
5. Session keys / D-H key agreement
6. Long-term keys for users
7. Country-level TLD keys for things like DNSSEC
These uses have been rank ordered roughly in order of cryptographic strength that might required. Very often, I've found that people who are complaining about the speed of /dev/urandom are using it for something that it wasn't designed for; for example, "dd if=/dev/urandom of=/dev/hda" to wipe a disk.
And similarly, for people who insist that there is no good difference/distinction between /dev/urandom and /dev/random, they are only thinking of their use case, and/or they aren't considering that there might be a difference between generating a long-term public key for GPG (which today tends to use /dev/random for this purpose, and I think rightly so), and generating a one-time session key when encrypting a message using GPG. Since you only do the first once, a bit more inconvenience might be tolerated by the user. (Today this takes about ten minutes on a laptop, and part of the reason for this is that Chrome is positively profligate with how it uses randomness, including a 32k read from /dev/urandom[!] at startup, and this is draining the entropy available for /dev/random, which slows it down significantly. I have a change pending upstream which time-limits the amount of entropy that can be drawn from the input pool to the nonblocking pool to protect our entropy from abusive users of /dev/urandom, and I have on my todo list tracking down some chrome developers to figure out what the heck they are doing with all of their demands on /dev/urandom.)
The reality is that strong random numbers is a precious resource, and you can't use it willy-nilly. Furthermore, the strength of random numbers is a continuum; it is not a binary "strong" vs. "not strong" distinction. I'm sure it would be very convenient for userspace programmers to be able to assume that randomness is always freely available, and that it only comes in two flavors, "insecure" and "secure", but that's not the way the world works.
Even in the case of hardware random number generators, there are tradeoffs. This is why future Intel processors will be making a distinction between RDRAND (for which you can't assume that two 128 bit RDRAND outputs, when put together, will give you 256 bits of entropy), and RDSEED (which does have this guarantee). And of course, there is also the tradeoff between the auditability of /dev/random and the non-auditability of something which is locked in a complex piece of Silicon.
TL;DR. I'd like to have a Personal Gulfstream Jet, and enough money that it's not a big deal to consider a $12 million dollar home in La Jolla to be teardwon fodder. But sometimes reality intrudes on your dreams/fantasies....
The hierarchy of secrets you've come up with sounds useful, but in practice:
* An attack based on entropy on a CSPRNG that ran months or even days ago, barring blatant flaws like zero cold start entropy, isn't realistic†. And, if you're paranoid about your "long term keys", generate them on a different machine than the one on which you serve up disk-wiping entropy. That's what you're supposed to do anyways. But I'm a little concerned that such a suggestion dignifies an unrealistic risk.
* Meanwhile, the notion that "lesser" secrets, like IV and padding, can deal with less randomness doesn't seem to bear empirical scrutiny. The attacks that have tended to matter against cryptosystems in the last few years have, with the exception of the P's and Q's zero-cold-start-entropy issue, not targeted long-term secrets, but rather have hinged on details like padding (random padding in the OAEP/PKCS#1 case, I guess; most crypto padding is deterministic).
* The mentality of "long term secrets" and "short term crypto parameters" that you laid out already missed a doozy: DSA and ECDSA nonces, which people who think like your comment did thought of as "undemanding", but for which it turns out that just being able to predict a couple of bits of that parameter can get you with a few thousand samples all the way back to private keys.
I think you'd be better off, cryptographically and from a software design perspective, with a single CSPRNG that was suitable for all cryptographic use cases, and, if users are paranoid, have them tap a physically different CSPRNG for their "long term keys" --- which they need to do anyways on the presumption that the systems that they use routinely are also probably owned up.
Also, I have to again point out that the Linux CSPRNG design doesn't handle any of these cases well. Its /dev/random is unusable by demanding applications in steady state, with the possible (terrible) workaround of reimplementing CSPRNGs seeded from small amounts of /dev/random data in userspace (and all the attendant flaws); its /dev/urandom is unsafe at cold start. Something's gotta give in that design: it's doing extra work internally and inflicting extra work on developers, all for a worse result than FreeBSD.
† Re: realism: it's always easy to argue "attacks always get better, we should account for every attack, realistic or not". But engineering is about tradeoffs, and here the tradeoff Linux is making is to service an attack unknown to the literature that is likely precluded by the basic design of a modern refreshing CSPRNG, and doing so at the expense of protecting against very realistic attacks that have been discovered in the wild; it doesn't help that Linux (in general) seems to respond to that point by suggesting that userland devs should have done a better job.
Oh, dubiously-informed CSPRNG opinions; I sure do have them.
You're right that for DSA nonces, even a partial failure of the RNG is a disaster; to me, that's more of an indictment of DSA as being really extremely fragile.
I agree that making /dev/urandom to be as secure as is possible, as quickly as possible after a cold start, is a good thing, and indeed most of my efforts have been focused in that area. I'm not convinced the FreeBSD approach is a panacea, since that presumes FreeBSD's entropy estimator is sufficiently accurate. And I'll note that even with the Fortuna design, which completely gives up on the entropy estimator, it's still subject to the cold start problem --- you just don't know when Fortuna has accumulated enough entropy such that it is secure. In that regard, it's no better than Linux's /dev/random approach.
I'm not opposed to some adding a cold-start block to /dev/urandom, and ultimately, I might go there just because it's easier than making recommendations that userspace wait until the last possible moment to lazily generate keying materials. However, it's a bit of a bandaid, so I'm more interested in trying to collect as much entropy as possible right after boot, to narrow the window after a cold start.
FreeBSD's approach certainly won't help with the criticism leveled by the "/dev/random is not robust" paper, since it assumes that fundamentally the entropy estimator is broken. But if it's broken, then FreeBSD Yarrow will also not be robust in the face of the cold start problem. So it's certainly not going to help people who are going to kvetch about entropy estimators being hopeless.
Entropy may be a scarce resource on the input side, but it's a big mistake to export that scarcity to userspace, rather than abstracting it away. It's the sort of intuitive decision that made sense in the late 1990s---when we also thought that the "error propagation properties" of CBC mode were useful rather than a liability, that CTR mode was scary, and that compressing plaintext was a good idea---but it was wrong and it's due to be revisited. The idea that developers---as a group---can reliably distinguish between multiple types of strong, random numbers, and then correctly implement PRNGs in userspace that take into account those differences, is nothing but pure fantasy.
Applications don't actually need scarce "entropy" (of the information-theoretic kind that you seem so obsessed about). What they need is an abundance of unpredictable bytes that pass the next-bit test, as quickly and straightforwardly as possible.
There is an entire class of vulnerabilities that only exists because of the artificial scarcity of unpredictable bytes: Since applications can't just get all of their unpredictable bytes from /dev/urandom, they use crypto libraries that implement their own PRNGs in userspace (where it's more complex to do, especially in a library, thanks to forking, threading, signals, etc.). Inevitably, these libraries get something wrong, but nobody notices for years.
Quoting a random link: http://comments.gmane.org/gmane.comp.emulators.libvirt/73841
[libvirt] [PATCHv5] virtio-rng: Add rate limiting options for virtio-RNG
Qemu's implementation of virtio RNG supports rate limiting of the
entropy used. This patch exposes the option to tune this functionality.
environment -> host kernel -> egd -> host kernel -> qemu master -> fork -> rate limiter -> qemu slaves -> guest kernel -> egd -> guest kernel -> apache+openssl -> fork -> apache+openssl
All of this complexity is madness, and completely unnecessary. Cryptographic protocols don't actually distinguish between varying "quality" of random numbers, and neither does the literature, so it makes no sense to export an interface that expects people to make that distinction. /dev/random is nearly useless and it doesn't even make up for its lack of availability by being academically rigorous. As tptacek mentioned, key generation actually isn't anything special (DSA needs unpredictable numbers every time, for example). Even where /dev/urandom might be good enough, the lack of rigor in its design makes some people nervous, so they use some userspace PRNG anyway. The result is that random number generation on most machines is a complex mishmash of overcomplicated PRNGs connected in overcomplicated ways, resulting in an endless stream of bugs that are as embarrassing and harmful as all of the buffer overflows caused by the continued misuse of strcat().
Speaking of RNG bugs, here's one that I fixed today: http://lists.dlitz.net/pipermail/pycrypto/2013q4/000702.html
An application may be affected if, within 100 milliseconds, it performs
the following steps (which may be summarized as "read-fork-read-read"):
1. Read from the Crypto.Random PRNG, causing an internal reseed;
2. Fork the process and invoke Crypto.Random.atfork() in the child;
3. Read from the Crypto.Random PRNG again, in at least two different
processes (parent and child, or multiple children).
The fundamental problem is that you're treating /dev/*random as a device driver that provides scarce "entropy", when what's needed is a cryptographic service that provides unpredictable bytes in abundance. You say that /dev/urandom wasn't designed to do "dd if=/dev/urandom of=/dev/hda", and my point is that it ought to be.
Have you tried Haveged?
We have tested Havege and on modern x86 systems you do get good numbers. But there are embedded SSL stacks that use Havege on such things as AVR32 cpus where the exection time is much more predictable. I have yet to test in those environments, but it does not look like a smart move.
Also, Havege includes an entropy estimator that seems to be broken. We used the Simics system simulator and forced the TSC to be stuck and could see that we got bad entropy from Havege. But the estimator happily said that the entropy was good.
My suggestion is that if you are to use Havege and friends, use them to feed the OS entropy pool, not as a replacement for /dev/random, /dev/uramndom.
And speaking of friends. Jytter is one such friend. I have yet to test it with Dieharder though.
Just look at all those "OpenSSL hangs on my server" bug reports and forum posts.
And the man pages are better today than they used to be, but they still don't forcefully dispel that myth.
I really would love to see FreeBSDs /dev/urandom behavior on Linux, i.e. "block once at system start". Yes, you need some kind of entropy estimator for that.
Excellent, I can ask instead of going to read the source!
Does the code managing /dev/random make any effort to detect hardware RNGs where available and use those to feed its pool? That way an available proper" random source is feeding the pool as early as possible, before any user-space tool pumping data from a generator to the pool could be started. It is increasingly common for chipsets/CPUs to have some form of HWRNG circuitry built in.
Or is that deemed an over-complication (and potential point of failure) for something that is not currently available on most devices? I'm thinking of chipsets like the SoC used in the RPi which has a built-in HWRNG that is easily accessible once the appropriate module is loaded, and recent Intel CPUs which have one (though I don't know if that is as easy to access ATM).
> is that when userspace tries to generate long-term public keys immediately after the machine is taken out of the box and plugged in, that there isn't a sufficient amount of entropy
I've wondered if this is something we can at least partially address by adding a property to those assigned via DHCP or similar protocols so as well as telling a new device what address it should respond to (and the usual other properties such as name resolution settings and potentially hostname and so forth) would there be room in the protocol to include a small block of random data? If the network setup is done early enough during boot then there will be at least some entropy in the pool before other process start needing it.
Of course this would require that the device handing out addresses and names had a sufficient entropy source of its own, potentially a HWRNG, in order to be able to provide this random data at the rate needed by the prevailing network conditions (the worst case being at if all the hosts on a large subnet pop online at the same time due to recovery from a power-cut and so all want their block of bits at the same time). Also it wouldn't help devices that might not have any network access on first boot but might still need to generate keys ready for when access is available.