Hacker News new | past | comments | ask | show | jobs | submit login
Insecurities in the Linux /dev/random (schneier.com)
123 points by e1ven on Oct 14, 2013 | hide | past | favorite | 74 comments

This is an interesting paper that was discussed widely on Twitter a few months back. It includes a very useful breakdown of the Linux kernel CSPRNG starting on page 17. The attacks the paper describes start on page 22. The front part of the paper is dominated by a discussion of the theoretical modeling of CSPRNGs, in part because the theoretical foundations for CSPRNGs and their interactions with applications are not well documented.

The most important bit of information you probably want to know about this paper is that it is modeling the security of a CSPRNG when its entropy inputs are entirely controlled by attackers. For instance, one of the attacks involves tricking the entropy estimator into believing it's seeding itself with high-quality random data when that data in fact has no entropy.

This is not a realistic attack scenario for the vast majority (maybe all) of applications.

Having said that, the Linux kernel CSPRNG doesn't (as far as I know) have a particularly well formalized design. There are good CSPRNG designs out there; for instance, the BSDs use Yarrow, and Ferguson followed that design up with Fortuna, which dispenses with the need for entropy estimators altogether.

Wouldn't this in fact be completely unrealistic? If attackers have full control of entropy inputs, they can just hook up a very high quality randomness source, but record every single bit before it gets to /dev/random.

Of course, why bother with that if you have that kind of access?

Using a "false entropy" model feels partially like a response to criticism of things like RdRand, for example.

So I'm the maintainer for Linux's /dev/random driver. I've only had a chance to look at the paper very quickly, and I will at it more closely when I have more time, but what the authors of this paper seem to be worried about is not even close to the top of my list in terms of things I'm worried about.

First of all, the paper is incorrect in some minor details; the most significant error is its (untrue) claim that we stop gathering entropy when the entropy estimate for a given entropy pool is "full". Before July 2012, we went into a trickle mode where we only took in 1 in 096 values. Since then, the main way that we gather entropy, which is via add_interrupt_randomness(), has no such limit. This means that we will continue to collect entropy even if the input pool is apparently "full".

This is critical, because secondly their hypothetical attacks presume certain input distributions which have an incorrect entropy estimate --- that is, either zero actual entropy but a high entropy estimate, or a high entropy, but a low entropy estimate. There has been no attempt by the paper's authors to determine whether the entropy gathered by Linux meets either of their hypothetical models, and in fact in the "Linux Pseudorandom Number Generator Revisited"[1], the analysis showed that our entropy estimator was actually pretty good, given the real-life inputs that we are able to obtain from an actual running Linux system.

[1] http://eprint.iacr.org/2012/251.pdf

The main thing which I am much more worried about is that on various embedded systems, which do not have a fine-grained clock, and which is reading from flash which has a much more deterministic timing for their operations, is that when userspace tries to generate long-term public keys immediately after the machine is taken out of the box and plugged in, that there isn't a sufficient amount of entropy, and since most userspace applications use /dev/urandom since they don't want to block, that they end up with keys that aren't very random. We had some really serious problems with this, which was written up in the "Mining Your Ps and Qs: Detection of Widespread Weak Keys in Network Devices"[2] paper, and the changes made in July 2012 were specifically designed to address these worries.

[2] https://www.factorable.net/paper.html

However, it may be that on certain systems, in particular ARM and MIPS based systems, where a long-term public key is generated very shortly after the first power-on, that there's enough randomness that the techniques used in [2] would not find any problems, but that might be not enough randomness to prevent our friends in Fort Meade from being able to brute force guess the possible public-private key pairs.

Speaking more generally, I'm a bit dubious about academic analysis which are primarily worried about recovering from the exposure of the state of the random pool. In practice, if the bad guy can grab the state of random pool, they probably have enough privileged access that they can do much more entertaining things, such as grabbing the user's passphrase or just grabbing their long-term private key. Trying to preserve the amount of entropy in the pool, and making sure that we can extract as much uncertainty from the system as possible, are much higher priority things to worry about.

That's not to say that I might not make changes to /dev/random in reaction to academic analysis; I've made changes in reaction to [2], and I have changes queued for the next major kernel release up to make some changes to address concerns raised in [1]. However, protection against artificially constructed attacks is not the only thing which I am worried about. Things like making sure we have adequate entropy collection on all platforms, especially embedded ones, and adding some conservatism just in case SHA isn't a perfect random function are some of the other things which I am trying to balance as we make changes to /dev/random.

Nit: the cold start entropy problem (specifically in the Linux kernel CSPRNG) was published way before Heninger; Zvi Gutterman presented it at Black Hat in 2006. "P's and Q's" got a lot of publicity, but could some of that exposure have been mitigated had earlier reports been acted on? (I seriously have no idea and am just asking).

The term for the "academic analysis" of "recovering from exposure of state" is "forward secrecy", and from what I can tell the whole of the crypto literature that discusses CSPRNGs in any depth seems to be unified on the opinion that forward secrecy is an important property for a CSPRNG to have.

(But that's a reaction to your comment, not your opinion of the paper, because I agree that the attacks here don't seem very practical.)

What do you think of Ferguson's belief that entropy estimators are a flawed design approach for CSPRNGs? FreeBSD uses Yarrow for this, which has two benefits --- first, it loses the silly notion of a "urandom" that is somehow marginally less secure than "random" but is in practice not actually less secure, and second, it comes from a design that has a formal research track record, that people can analyze from the paper and not solely by reversing code.

(I sound snippy but am in fact thrilled that you took the time to comment, and thank you).

Yes, I'm aware of the concept of forward secrecy, and the formalism around it. Without intending to sound snippy myself, I think there is a tendency for academicians (whose promotion/tenure prospects are directly related being able to generate papers that get published in high impact journals) to be biased in favor of those things for which formal models can be created. We saw that for example in the academic focus on formal proof of correctness of programs, which at this point is recognized as a blind alley, as opposed to a more engineering approach of using tools like valgrind and lint and static program analysis. So I am a bit unpersuaded regarding the academic focus on forward secrecy. It's something that if we can add to a design, I'll certainly do it --- and indeed there is a Yarrow-style "catastrophic reseeding" in the /dev/random design.

We have been moving away from using entropy estimators in /dev/random, actually. We still use it for certain entropy sources, most notably for the keyboard and mouse inputs, where it is useful for filtering out event timings caused by the user leaning on the key and triggering autorepeat. But these days we have a "fast mix pool" (which is per-CPU) where we sample on every single interrupt, and then down-mix from there to much larger entropy pool, and the bulk of the entropy, especially on servers, comes from the fast mix pool.

I tend to focus much more on engineering/social aspects; for example, the fundamental cause of the failure found by the Mining Your P's and Q's paper (as opposed to the hypothetical failures of things like the forward secrecy papers), was people disabling the entropy collection in many/most device drivers because they thought it was too slow for high speed devices, in particular network devices. They made arguments that it was because an adversary could monitor the packet arrival times on the Ethernet (which is not true in a switched fabric anyway, and if the attacker is sitting at Fort Meade, or even at AT&T's data center, they won't know about your local area network), but that was just an excuse; the main reason was engineers who were worried about performance.

So using a fancy "approved by academics" design which uses crypto hashing to mix entropy into the mixing pools may be pointless, if downstream kernel hackers disable the entropy collection "because it's too slow". We now have something hard-wired into the main interrupt code path (it's now no longer optional, so it can't be configured on or off on an individual device driver basis), and I've gone through a lot of work to make it be extremely fast, including using per-CPU pools, worrying about cache effects, etc. These are concerns that I'm sure would never help a professor achieve tenure, but as a practicing engineer, I've been painfully aware about how important these sorts of concerns really are.

Don't get me wrong; I do read the papers from those academics who try to analyze /dev/random, and I appreciate their attention. But I do come from a someone different perspective than they do, and over the years I've learned that it's not wise to blindly take the advice of every single bug report that you get (and I consider papers to be simply formal bug reports :-) since sometimes you can't satisfy every single user and every single bug report.

Oh, cool! Have you considered making changes to the interface that's presented to userspace? As a crypto library author, I have a few gripes about the current interface:

1. The distinction between /dev/random and /dev/urandom is silly. On a machine that's been been collecting entropy for weeks, /dev/random will still block after reading a few bytes. On the other hand, on an embedded device that's just left the factory, /dev/urandom will happily return "random" bytes that aren't. The result is that /dev/random is unusable for anyone except cipherpunks, and /dev/urandom is unsafe on embedded devices.

2. /dev/urandom is slow, so libraries (including mine) all end up shipping their own PRNG implementations in userspace, where they're susceptible to threading issues, having their entire state duplicated by fork(), "//MD_Update(&m,buf,j)", etc. It's surprisingly difficult to do this correctly in a user-space library, and I bet the NSA and others are having a lot of fun at the expense of our users.

What I'd like is a single, fast, trustworthy /dev/?random device that is good enough to obsolete user-space PRNGs entirely. This means it would have the following properties:

1. It blocks until the kernel decides that it can safely generate cryptographically-strong random bytes (i.e. first boot), and never blocks again unless something extraordinary happens (i.e. it's a VM guest that's just been cloned, or something). The idea would be that it never produces output that isn't cryptographically strong, but is otherwise reliable enough that applications can behave as if it never blocks.

2. It produces output so quickly that userspace programs never have an excuse to use anything else. i.e. Its speed should be comparable to what you get using AES-NI in counter mode (I know you don't like Fortuna's entropy pooling, but its output-side PRNG is exactly what you'd want here.) Ideally, this could be fast enough that developers wouldn't even need to distinguish between "random numbers" and "cryptographically-strong random numbers"---just use it everywhere, even if you might otherwise use Mersenne Twister.

TL;DR - I want a faster, reliable, trust-inspiring /dev/?random that obsoletes all user-space CSPRNGs and possibly all user-space PRNGs.

I've been tempted to write something like this myself, but from my casual reading of the lkml archives, it hasn't been clear whether you were actually interested in RNG patches like this. Are you?

Why do you need so much random numbers such that /dev/urandom is too slow for you?

There are plenty of uses of randomness. Let's look at some of them:

1. Monte Carlo Simulations

2. Wiping disk files that might contain sensitive information.

3. Padding in crypto applications

4. Initialization Vectors (IV's)

5. Session keys / D-H key agreement

6. Long-term keys for users

7. Country-level TLD keys for things like DNSSEC

These uses have been rank ordered roughly in order of cryptographic strength that might required. Very often, I've found that people who are complaining about the speed of /dev/urandom are using it for something that it wasn't designed for; for example, "dd if=/dev/urandom of=/dev/hda" to wipe a disk.

And similarly, for people who insist that there is no good difference/distinction between /dev/urandom and /dev/random, they are only thinking of their use case, and/or they aren't considering that there might be a difference between generating a long-term public key for GPG (which today tends to use /dev/random for this purpose, and I think rightly so), and generating a one-time session key when encrypting a message using GPG. Since you only do the first once, a bit more inconvenience might be tolerated by the user. (Today this takes about ten minutes on a laptop, and part of the reason for this is that Chrome is positively profligate with how it uses randomness, including a 32k read from /dev/urandom[!] at startup, and this is draining the entropy available for /dev/random, which slows it down significantly. I have a change pending upstream which time-limits the amount of entropy that can be drawn from the input pool to the nonblocking pool to protect our entropy from abusive users of /dev/urandom, and I have on my todo list tracking down some chrome developers to figure out what the heck they are doing with all of their demands on /dev/urandom.)

The reality is that strong random numbers is a precious resource, and you can't use it willy-nilly. Furthermore, the strength of random numbers is a continuum; it is not a binary "strong" vs. "not strong" distinction. I'm sure it would be very convenient for userspace programmers to be able to assume that randomness is always freely available, and that it only comes in two flavors, "insecure" and "secure", but that's not the way the world works.

Even in the case of hardware random number generators, there are tradeoffs. This is why future Intel processors will be making a distinction between RDRAND (for which you can't assume that two 128 bit RDRAND outputs, when put together, will give you 256 bits of entropy), and RDSEED (which does have this guarantee). And of course, there is also the tradeoff between the auditability of /dev/random and the non-auditability of something which is locked in a complex piece of Silicon.

TL;DR. I'd like to have a Personal Gulfstream Jet, and enough money that it's not a big deal to consider a $12 million dollar home in La Jolla to be teardwon fodder. But sometimes reality intrudes on your dreams/fantasies....

This is a great comment, but I disagree with it.

The hierarchy of secrets you've come up with sounds useful, but in practice:

* An attack based on entropy on a CSPRNG that ran months or even days ago, barring blatant flaws like zero cold start entropy, isn't realistic†. And, if you're paranoid about your "long term keys", generate them on a different machine than the one on which you serve up disk-wiping entropy. That's what you're supposed to do anyways. But I'm a little concerned that such a suggestion dignifies an unrealistic risk.

* Meanwhile, the notion that "lesser" secrets, like IV and padding, can deal with less randomness doesn't seem to bear empirical scrutiny. The attacks that have tended to matter against cryptosystems in the last few years have, with the exception of the P's and Q's zero-cold-start-entropy issue, not targeted long-term secrets, but rather have hinged on details like padding (random padding in the OAEP/PKCS#1 case, I guess; most crypto padding is deterministic).

* The mentality of "long term secrets" and "short term crypto parameters" that you laid out already missed a doozy: DSA and ECDSA nonces, which people who think like your comment did thought of as "undemanding", but for which it turns out that just being able to predict a couple of bits of that parameter can get you with a few thousand samples all the way back to private keys.

I think you'd be better off, cryptographically and from a software design perspective, with a single CSPRNG that was suitable for all cryptographic use cases, and, if users are paranoid, have them tap a physically different CSPRNG for their "long term keys" --- which they need to do anyways on the presumption that the systems that they use routinely are also probably owned up.

Also, I have to again point out that the Linux CSPRNG design doesn't handle any of these cases well. Its /dev/random is unusable by demanding applications in steady state, with the possible (terrible) workaround of reimplementing CSPRNGs seeded from small amounts of /dev/random data in userspace (and all the attendant flaws); its /dev/urandom is unsafe at cold start. Something's gotta give in that design: it's doing extra work internally and inflicting extra work on developers, all for a worse result than FreeBSD.

Re: realism: it's always easy to argue "attacks always get better, we should account for every attack, realistic or not". But engineering is about tradeoffs, and here the tradeoff Linux is making is to service an attack unknown to the literature that is likely precluded by the basic design of a modern refreshing CSPRNG, and doing so at the expense of protecting against very realistic attacks that have been discovered in the wild; it doesn't help that Linux (in general) seems to respond to that point by suggesting that userland devs should have done a better job.

Oh, dubiously-informed CSPRNG opinions; I sure do have them.

It's true that many of the attacks (such as BEAST) rely on attacks related to the IV. However, look at underlying mechanism of that attack; it requires that the attacker be able to predict the entire IV (which it can do since it can use the CBC residue from the previous packet). If the attacker can predict 64 out of the 128 bits of an IV used for AES, that's less of a disaster than if the attacker can predict 64 out of the 128 bits for an AES key. So there is a difference, although I might agree with you that relying on application developers to understand these distinctions might be unrealistic.

You're right that for DSA nonces, even a partial failure of the RNG is a disaster; to me, that's more of an indictment of DSA as being really extremely fragile.

I agree that making /dev/urandom to be as secure as is possible, as quickly as possible after a cold start, is a good thing, and indeed most of my efforts have been focused in that area. I'm not convinced the FreeBSD approach is a panacea, since that presumes FreeBSD's entropy estimator is sufficiently accurate. And I'll note that even with the Fortuna design, which completely gives up on the entropy estimator, it's still subject to the cold start problem --- you just don't know when Fortuna has accumulated enough entropy such that it is secure. In that regard, it's no better than Linux's /dev/random approach.

I'm not opposed to some adding a cold-start block to /dev/urandom, and ultimately, I might go there just because it's easier than making recommendations that userspace wait until the last possible moment to lazily generate keying materials. However, it's a bit of a bandaid, so I'm more interested in trying to collect as much entropy as possible right after boot, to narrow the window after a cold start.

FreeBSD's approach certainly won't help with the criticism leveled by the "/dev/random is not robust" paper, since it assumes that fundamentally the entropy estimator is broken. But if it's broken, then FreeBSD Yarrow will also not be robust in the face of the cold start problem. So it's certainly not going to help people who are going to kvetch about entropy estimators being hopeless.

> The reality is that strong random numbers is a precious resource, and you can't use it willy-nilly. Furthermore, the strength of random numbers is a continuum; it is not a binary "strong" vs. "not strong" distinction. I'm sure it would be very convenient for userspace programmers to be able to assume that randomness is always freely available, and that it only comes in two flavors, "insecure" and "secure", but that's not the way the world works.

Entropy may be a scarce resource on the input side, but it's a big mistake to export that scarcity to userspace, rather than abstracting it away. It's the sort of intuitive decision that made sense in the late 1990s---when we also thought that the "error propagation properties" of CBC mode were useful rather than a liability, that CTR mode was scary, and that compressing plaintext was a good idea---but it was wrong and it's due to be revisited. The idea that developers---as a group---can reliably distinguish between multiple types of strong, random numbers, and then correctly implement PRNGs in userspace that take into account those differences, is nothing but pure fantasy.

Applications don't actually need scarce "entropy" (of the information-theoretic kind that you seem so obsessed about). What they need is an abundance of unpredictable bytes that pass the next-bit test, as quickly and straightforwardly as possible.

There is an entire class of vulnerabilities that only exists because of the artificial scarcity of unpredictable bytes: Since applications can't just get all of their unpredictable bytes from /dev/urandom, they use crypto libraries that implement their own PRNGs in userspace (where it's more complex to do, especially in a library, thanks to forking, threading, signals, etc.). Inevitably, these libraries get something wrong, but nobody notices for years.

Quoting a random link: http://comments.gmane.org/gmane.comp.emulators.libvirt/73841

    [libvirt] [PATCHv5] virtio-rng: Add rate limiting options for virtio-RNG

    Qemu's implementation of virtio RNG supports rate limiting of the
    entropy used. This patch exposes the option to tune this functionality.
Rate limiting?? So, if we have Apache running on a VM, we end up with something like this:

environment -> host kernel -> egd -> host kernel -> qemu master -> fork -> rate limiter -> qemu slaves -> guest kernel -> egd -> guest kernel -> apache+openssl -> fork -> apache+openssl

All of this complexity is madness, and completely unnecessary. Cryptographic protocols don't actually distinguish between varying "quality" of random numbers, and neither does the literature, so it makes no sense to export an interface that expects people to make that distinction. /dev/random is nearly useless and it doesn't even make up for its lack of availability by being academically rigorous. As tptacek mentioned, key generation actually isn't anything special (DSA needs unpredictable numbers every time, for example). Even where /dev/urandom might be good enough, the lack of rigor in its design makes some people nervous, so they use some userspace PRNG anyway. The result is that random number generation on most machines is a complex mishmash of overcomplicated PRNGs connected in overcomplicated ways, resulting in an endless stream of bugs that are as embarrassing and harmful as all of the buffer overflows caused by the continued misuse of strcat().

Speaking of RNG bugs, here's one that I fixed today: http://lists.dlitz.net/pipermail/pycrypto/2013q4/000702.html

    An application may be affected if, within 100 milliseconds, it performs
    the following steps (which may be summarized as "read-fork-read-read"):

    1. Read from the Crypto.Random PRNG, causing an internal reseed;
    2. Fork the process and invoke Crypto.Random.atfork() in the child;
    3. Read from the Crypto.Random PRNG again, in at least two different
         processes (parent and child, or multiple children).
This bug wouldn't even have the opportunity to exist if I could have just read everything from /dev/urandom---which is what I proposed back in 2008, but some of my users complained, citing the lack of speed and rigor in /dev/urandom, so I implemented the most reliable RNG that I knew about at the time (Fortuna), but it was broken in a case where the state-copying behavior of fork() interacts with the rate-limiting behavior of Fortuna, which---ironically---is designed to protect users against attacks that exploit the scarcity of random numbers. Perhaps I should have ignored them, and been as "profligate" with /dev/urandom as Chrome is, but then they might have individually just implemented something on their own, which wouldn't necesasrily be better for end-users.

The fundamental problem is that you're treating /dev/*random as a device driver that provides scarce "entropy", when what's needed is a cryptographic service that provides unpredictable bytes in abundance. You say that /dev/urandom wasn't designed to do "dd if=/dev/urandom of=/dev/hda", and my point is that it ought to be.

> The distinction between /dev/random and /dev/urandom is silly

That the problems with the two sources are completely different tells you that the distinction is not silly.

Your speed requirement sounds to me like you don't want a /dev at all, but maybe a daemon that seeds itself from /dev/random and frequently refreshes itself from /dev/urandom. This wouldn't be kernel space and it wouldn't have threading issues.

First, other operating systems lose the distinction, and maintain the two device filesystem names solely for historical purposes. FreeBSD, for instance, uses Yarrow and doesn't distinguish between urandom and random.

Second, he just explained why the distinction isn't useful in practice. The sole difference between random and urandom is that random blocks based on an entropy estimate. But that's silly: the only time you want random to block in practice is during cold start, but instead random blocks unpredictably all the time and is unusable. Meanwhile, urandom is effective virtually all of the time... except on cold start (on solid state devices), where it would be OK for it just once to block.

I'm not sure how a userspace daemon helps performance, since it implies roundtripping through the kernel anyways.

Userspace: Being in userspace obviously doesn't increase speed, but I don't think the overhead is material to dlitz's speed concern. The point is that we can cook up things in userspace that might meet dlitz's requirements: this offers flexibility and the possibility of portability, provided the basic ingredients are available from the OS.

Distinction: FreeBSD's /dev/random device allows the desired behaviour (modulo the performance concern), once you issue `sysctl kern.random.sys.seeded=0`, which ensures /dev/random blocks if there isn't enough entropy (the command, in effect, says the source needs more entropy to be trustworthy). I think this is a better behaviour than Linux's, but Linux's behaviour in turn seems to be better than Darwin/OSX's, which AFAICS (`sysctl kern` there shows no settings that obviously influence /dev/random) gives you no opportunity to block before you have enough entropy.

Being able to block until there is enough entropy is essential to what dlitz wants, and the random/urandom distinction offers that.

OS X took its /dev/random from FreeBSD. The CSPRNG design is Yarrow. It explicitly unifies urandom/random; they're the same device.

Colin may correct me on this, but I looked it up again to be sure.

/dev/urandom is the Unix standard for crypto randomness; I don't think there's much point in being more portable. Any Unix on which you're going to deploy strong crypto should provide a workable /dev/random.

I'm not sure what a userland daemon could do for you that the kernel couldn't do. The parent commenter's demands are as rigorous as any CSPRNG user's could be: he's building a crypto library and can't assume his users are casual. If a CSPRNG can be made faster for his use case, there's a strong argument that the kernel should provide that performance by default.

FreeBSD's current implementation behind /dev/random is more sophisticated that Darwin's - just look at the manpages. I don't see how you can tell if Darwin doesn't have enough entropy, which you can with both Linux and FreeBSD.

I might be looking at an older FreeBSD man page. COLIN? Where are you?

It's also worth noting that Yarrow is a Ferguson/Kelsey design, and Ferguson has foresworn entropy estimation; hence Fortuna, his newer design.

The whole notion of the kernel CSPRNG having "enough" entropy is pretty silly for systems in steady state. "enough" entropy is important at cold start, but rarely otherwise.

Also worth noting that "just read urandom" is Daniel J. Bernstein and Peter Schwabe's recommendation, based on the NaCl design papers. It's also Golang's whole CSPRNG approach, which inasmuch as it's owned by anyone is owned by Adam Langley.

The current man page for FreeBSD random (4) is at http://www.freebsd.org/cgi/man.cgi?query=random&apropos=0&se...

whose HISTORY section reads:

     A random device appeared in FreeBSD 2.2.  The early version was taken
     from Theodore Ts'o's entropy driver for Linux.  The current software
     implementation, introduced in FreeBSD 5.0, is a complete rewrite by Mark
     R V Murray, and is an implementation of the Yarrow algorithm by Bruce
     Schneier, et al.  The only hardware implementation currently is for the
     VIA C3 Nehemiah (stepping 3 or greater) CPU.  More will be added in the
dlitz's worry was about initial entropy, which is why he wanted his ideal random device to block initially - this makes a lot of sense. That said, I guess if you are worried about sophisticated attackers [+] getting to see very long stretches of output from your random device, you might want to ensure you are periodically injecting some fresh entropy in there. This should be ensured in the system, so the luser can just read /dev/urandom, but it should be possible to check that the system has fresh enough entropy. FreeBSD's interface seems better suited for this than Linux's.

[+] Let's think of the topical "sufficiently sophisticated attacker", who might have secret knowledge of problems with the CSPRNG we use.

> /dev/random will still block after reading a few bytes.

Have you tried Haveged? http://www.issihosts.com/haveged/

Havege and other CPU-based entropy sources are interesting. But they do implicit assumptions such as that the TSC is somehow giving you real measurements (and is not virtualized for example) and that the CPU has cache memories, do out-of-order issue etc which makes the exection time variable.

We have tested Havege and on modern x86 systems you do get good numbers. But there are embedded SSL stacks that use Havege on such things as AVR32 cpus where the exection time is much more predictable. I have yet to test in those environments, but it does not look like a smart move.

Also, Havege includes an entropy estimator that seems to be broken. We used the Simics system simulator and forced the TSC to be stuck and could see that we got bad entropy from Havege. But the estimator happily said that the entropy was good.

My suggestion is that if you are to use Havege and friends, use them to feed the OS entropy pool, not as a replacement for /dev/random, /dev/uramndom.

And speaking of friends. Jytter is one such friend. I have yet to test it with Dieharder though.


I have now tried Jytter on OSX (MBA with Ivy Bridge Core i7). I get about 50 kbps from jytter. Testing the data with Dieharder I get about 82% success rate on the test cases.

I'm really happy to hear that entropy collection on other interrupts is being brought back. After struggling with "entropy exhaustion" on servers (non virtual ones!) that were only getting 50 bits per second or whatever from the timing interrupt this should remove some unfortunate sources of annoyance.

Actually my impression is that many many user space programs use /dev/random, even for short term keys and nonces and stuff, just because they have learned by "more experienced Linux users" that /dev/urandom is not secure.

Just look at all those "OpenSSL hangs on my server" bug reports and forum posts.

And the man pages are better today than they used to be, but they still don't forcefully dispel that myth.

I really would love to see FreeBSDs /dev/urandom behavior on Linux, i.e. "block once at system start". Yes, you need some kind of entropy estimator for that.

Its academic papers and discussion from the maintainer like this that makes open source so great.

> So I'm the maintainer for Linux's /dev/random driver.

Excellent, I can ask instead of going to read the source!

Does the code managing /dev/random make any effort to detect hardware RNGs where available and use those to feed its pool? That way an available proper" random source is feeding the pool as early as possible, before any user-space tool pumping data from a generator to the pool could be started. It is increasingly common for chipsets/CPUs to have some form of HWRNG circuitry built in.

Or is that deemed an over-complication (and potential point of failure) for something that is not currently available on most devices? I'm thinking of chipsets like the SoC used in the RPi which has a built-in HWRNG that is easily accessible once the appropriate module is loaded, and recent Intel CPUs which have one (though I don't know if that is as easy to access ATM).

> is that when userspace tries to generate long-term public keys immediately after the machine is taken out of the box and plugged in, that there isn't a sufficient amount of entropy

I've wondered if this is something we can at least partially address by adding a property to those assigned via DHCP or similar protocols so as well as telling a new device what address it should respond to (and the usual other properties such as name resolution settings and potentially hostname and so forth) would there be room in the protocol to include a small block of random data? If the network setup is done early enough during boot then there will be at least some entropy in the pool before other process start needing it.

Of course this would require that the device handing out addresses and names had a sufficient entropy source of its own, potentially a HWRNG, in order to be able to provide this random data at the rate needed by the prevailing network conditions (the worst case being at if all the hosts on a large subnet pop online at the same time due to recovery from a power-cut and so all want their block of bits at the same time). Also it wouldn't help devices that might not have any network access on first boot but might still need to generate keys ready for when access is available.

Tips: If you're cloning or creating Linux virtual machines, run virt-sysprep[1] on them. It will insert a new random seed into the cloned VM, ensuring it boots with different entropy from the clone source or other clones from the same source. Secondly, run your VMs with the new virtio-rng driver[2].

[1] http://libguestfs.org/virt-sysprep.1.html

[2] https://fedoraproject.org/wiki/Features/Virtio_RNG

Their "solution" is also pretty blunt. Basically replacing the whole thing with a construct that looks a lot like AES-GCM (basically using a field multiply to mix in new random state)... which they show is strong for their particular complaint, but would abandon all the existing reasons to have any confidence in the current approach. (and also appears(?) to substantial shrink the random pool size)

From the abstract: "A pseudo-random number generator (PRNG) is a deterministic algorithm that produces numbers whose distribution is indistinguishable from uniform."

I think this is incorrect, since it would imply that a program that produces the sequence 1,2,1,2,1,2,1.... is a PRNG.

Clearly, a PRNG is a subtle concept and it's common for abstracts to not be completely precise. In context, it's clear to me that by "uniform" they mean "a random process which generates numbers independently from a uniform distribution." Your process is generating numbers uniformly, but successive outputs are definitely not independent.

One reason why I feel your interpretation is strained is that the phrasing makes "uniform" sound like a single process. You seem to be reading it as "numbers which are distributed uniformly," but the word indistinguishable really suggests that it's more than just the distribution that matters.

Using ent, and 0's and 1's (instead of 1's and 2's).

(By this I mean a pattern of 0,1,0,1 bits etc. - 4248759 bytes worth in this case)

Below is the output from ent:


Entropy = 0.000000 bits per byte.

Optimum compression would reduce the size of this 4248759 byte file by 100 percent.

Chi square distribution for 4248759 samples is 1083433545.00, and randomly would exceed this value less than 0.01 percent of the times.

Arithmetic mean value of data bytes is 170.0000 (127.5 = random). Monte Carlo value for Pi is 4.000000000 (error 27.32 percent). Serial correlation coefficient is undefined (all values equal!).

No, your sequence IS distinguishable from a uniform sequence. At least it is if I understand what you mean by "...".

If you mean that it generates 1,2,1,2,1,2,1 and then after that generates a mix of different values that include all possible values in approximately equal frequency, then you may have a fairly good PRNG. If you mean that it produces 1,2,1,2,1,2,1 and then after that it continues to alternate between 2 and 1, then this is easily distinguished from uniform.

>"No, your sequence IS distinguishable from a uniform sequence." No, it is not distinguishable. The definition according to wikipedia: The discrete uniform distribution is a symmetric probability distribution whereby a finite number of values are equally likely to be observed; every one of n values has equal probability 1/n.

So the distribution is indeed uniform but it fails in being random.

The point is that, in the context of the paper, you are supposed to group consecutive sequence elements and find their distribution.

Consider the sequence 1,2,1,2,1,2,1.... When you take elements one at a time, you have a uniform distribution, but the sequence is clearly not random. However, when you take elements two at a time, you have; 12,12,12,12... In this case, the distribution is not uniform, which indicates non-randomness.

The paper's definition implies that you're grouping the sequence elements like this and, for all groupings, you find uniform distributions. This is of course common knowledge to experts, and is made clear in the body of the paper.

Actually, it is more general than that. Any polynomial time algorithm that takes as input a sequence and which has access to a unbounded source of uniform random numbers (i.e. a "true" random number generator) should output a 1 when given a uniform random sequence with "nearly" the same probability as when given the output of a PRNG ("nearly" here means within a negligible amount e.g. the algorithm might just try to guess the PRNG seed). In other words,

  Pr[A(prng(seed, n))] - Pr[A(uniform(n))] < 1/negl(n)
Where for all polynomial functions p(n), there exists an N such that for all n > N, 1/p(n) > 1/negl(n); uniform(n) is an n element long uniform random sequence, and prng(seed, n) is the first n elements of the output of the PRNG for a (uniform) random seed. This is a common definition in cryptography and a variant is mentioned further down in the paper in Section 2.

All that says is that 1,2,1,2,1,2... is a possible uniform random sequence. It is not the only such sequence, and it is a very unlikely sequence.

The notion of "distinguishible" here refers to an experiment where an algorithm can receive one of two possible inputs, with equal probability: a sequence of uniformly sampled numbers, or a sequence generated by a PRNG. If the PRNG is secure, the algorithm should output "1" when it receives the uniform sequence with (very nearly i.e. different by a negligible amount) equal probability as when it receives the PRNG's sequence. So, for example, if you test if even indexed elements are 1 and odd indexed elements are 2, and output 1 if both tests pass, then you will almost never output 1 for a uniform sequence; on the other hand, if the PRNG always and very frequently outputs 1,2,1,2,1,2... then you will always or very frequently output 1 when you get the PRNG's sequence as input, and thus your probability of outputting a 1 is greater in that case.

What is "...", here? If it's infinite then we're talking about something abstract and you need to be more precise. If we're talking about something finite then you need the specify the finite-ness.

Note that (1,2,1,2,...,1,2) is not a "distribution," as you seem to be saying. It's a sequence of values drawn from a distribution.

For a fixed length, is it possible for the sequence (1,2,1,2,1,2,...,1,2) to be drawn from a uniformly random distribution that contains 1 and 2 as possible values? Yes. How likely is it? It depends on the number of possible values in the distribution and the length of the sequence. The more values besides 1 and 2 and the longer the sequence, the less likely that sequence will have been drawn from a uniformly random distribution.

In fact, as the length of the sequence increases, the probability of any particular sequence approaches 0 but is always greater than 0.

The crux of your confusion appears to be that you think "distinguishable" means "a sequence that couldn't possibly have been drawn from a particular distribution," which is not the case.

Please note that I am not the OC. It seems like you wrote in the wrong thread (for instance, I didn't use any ellipsis on my comment)

Ah, so your actual complaint was that they used the term "uniform sequence" as shorthand for the more accurate "uniformly random sequence".

But I never said it was indistinguishable. What I'm saying is that the definition in the first sentence of the paper is wrong or, at least, incomplete.

"whose distribution is indistinguishable from uniform"

Emphasis mine. Distribution here is not its English meaning, it is referring to the statistical term. Thus, it is talking about numbers "drawn from the uniform distribution", not "uniform numbers".

A PRNG drawing from the uniform distribution may produce that sequence (for any set that includes 1 and 2), but a PRNG that only produced that sequence would certainly not be drawing from the uniform distribution.

Perhaps you don't understand the terms in context. If you looked at the distribution of your example, it would have two big spikes in it, at numbers one and two.

The sequence 1,2,1,2,1,2,1... is not indistinguishable from a uniform sequence, at least in the context of cryptography. With high probability, an algorithm that checks if all the even indexed inputs are 1 and all the odd indexed inputs are 2 will distinguish that sequence from a uniform random sequence (this is easy to see, as the probability of a sequence of that form being generated by repeatedly tossing a fair coin is very small).

you can also check that (1,1), (1,2), (2,1), and (2,2) all happen with expected frequencies (i.e. roughly 25%), where the two numbers in the pair are just two numbers in a row in the sequence. You would have gotten 100% (1,2) and 0% for the rest, so that's suspicious.

same for 3-tuples, 4-tuples, ..., and n-tuples.

There's probably other, more sophisticated tests as well, but I'm not an expert.

I think you missed my point, which is this: the first sentence in the paper's abstract is wrong.

My examples with pairs and n-tuples are also distributions.

That's a good point, and that may well be what the authors meant.

Since all PRNGs are periodic (at least all I now of), the definition eventually breaks down. I can see how it can be made to work with a bit of extra formalism, though.


Any finite state PRNG has to be periodic. Good luck implementing an infinite state one.

you're quoting the first sentence from the abstract in isolation; there's a lot more detail in the paper. in particular the define the notion of epsilon-closeness (section 2) for any distinguisher (which could include one that detects repeated patterns).

My intention is not to criticize the paper, which I don't even claim to understand. I just thought the authors definition of a PNRG is wrong (or at least incomplete) and that caught my attention. If I'm wrong, I hope somebody will set me right.

i just did.

(1) an abstract is just a sketch of the entire paper contents.

(2) in section 2 of the paper they give a more precise meaning to indistinguishable which allows you to use any function you can think of to detect non-uniformity (roughly - there are details like it's a limit as you go to large sequences, so that any finite random pattern becomes progressively less likely, etc etc). so you can cover the case you are worried about by using a test that looks for repeated patterns of values.

[more generally, even with hn as it is these days, if you're asking for clarification rather than raising an objection, then you should probably prefix your comment with something like "i don't understand the paper, can someone please understand why...".]

>"(1) an abstract is just a sketch of the entire paper contents."

That is not an excuse for containing a bogus definition without a warning, especially in the abstract which is the first thing that you read to tell if the paper is interesting enough to look it in depth.

i agree it's kind-of a weird sentence for the start of the abstract - seems too introductory for the rest. i suspect it was added because a referee wanted the abstract to be more general, or just because they didn't know how else to start referring to PRNGs. writing papers is hard.

but it's not bogus, it's just incomplete. what i was trying to explain is that if something in an abstract doesn't make sense, you should look to the paper for a more detailed explanation.

and finally, in a "deep" way it's actually a very good definition of a PRNG. it's exactly what section 2 expands on. there's nothing you can fall back to that's more fundamental, that i can see, than it being indistinguishable from random.

It's a fine definition for an abstract. If it contained a full, precise definition, it wouldn't be an abstract. It's a single paragraph that summarizes a dense, academic 30 page paper. Everyone who reads them understands that the abstract is about the general idea, not formal correctness.

>"It's a fine definition for an abstract"

It is not. An abstract is one of the most important parts in a paper and that's why authors tend to avoid writing definitions on them but stating something like "A definition for X is given in the paper". It is meant to "describe" the paper rather than a summary or introduction of the subject in question.

In any case, writing a good abstract is hard, and it doesn't diminish the quality of this paper.

Abstracts should have an introductory sentence, and they quite often are a one paragraph summary of the work.


I don't think people understand what your point is. What do you think is missing from the sentence? An explicit mention of independence? Why do you think 1,2,1,2,1,2,... is indistinguishable from uniform? In another response you said

> But I never said it was indistinguishable

which is confusing in context. What properties do yo claim your sequence has?

I'm sorry for being unclear.

The authors define a a PRNG as a sequence such that, if you find its distribution (histogram), it resembles the uniform distribution (all values appear the same number of times).

In 1,2,1,2,1,2,..., all values appear the same number of times and its histogram is uniform. However, it is not random. It is clearly not random -- so it is distinguishable from a uniform "true random" sequence, which was my point.

However, as was explained to me by another commenter, what I missed is that you may group a number of consecutive numbers together and find the histogram of that, and, in that case, my sequence fails to have a uniform histogram.

So, at this point I'm willing to accept that the author's definition is correct and I just had a hole in my understanding.

I guess I could see your interpretation, but it requires that you ignore all of the context of the paper. For instance, you take distribution to equal histogram. This may make sense in some contexts, but here it's absolutely not true; the values are supposed to be taken independently from some probability distribution.

Pro tip: if you think you've spotted a major problem in the first line of an abstract, maybe give the author the benefit of the doubt. Especially if you're not familiar with the field.

I take your point, I should have phrased my thoughts differently. I did start with "I think", but I could have made it clearer that I didn't mean to criticize the paper, but rather to invite explanations from people who know the subject better than I do.

Yes, it's always interesting how different presentations of the same basic idea are received. "Is X false?" gets a much different response than "why is X true?". It's good to keep in mind that the second one tends to elicit more helpful replies.

Are you missing the point of a limit?

As t->inf, if we counted all of the 1's and all of the 2's and all of the 3's, etc. and there were no 3's, it's not random according to this definition or any definition I'm aware of.

I guess he is restricting the domain to [1,2] in which case 1,2,1,2,1,2... is as "random" as 1,2,2,2,1,2,1,1 ...

I guess what is bothering him is the lack of entropy in the sequence.

I am not terribly good with statistical terminology, but I don't think that the definition you're assuming for "uniform distribution" is the one in general use. Once you know a[n], a[n+1] will follow a delta distribution, not a uniform distribution.

This sequence is trivially distinguishable from something uniform.

Not all deterministic algorithms with uniform distributions are PRNGs.

I know that, but that is what the first sentence in the paper's abstract is claiming. As I said, I think it's wrong.

Breaking Bad is a TV show. American Idol is also a TV show.

However, Breaking Bad is not American Idol.

A dog is a mammal, but not all mammals are dogs.

The paper states "[a] pseudo-random number generator (PRNG) is a deterministic algorithm that produces numbers whose distribution is indistinguishable from uniform."

You are asserting the converse, that all deterministic algorithms with uniform distributions are PRNGs.

Nope, because you don't have the same number of 9s and 8s as 1s and 2s, so it's not uniform.

I don't think you're correct. A discrete distribution can be defined for any number of values. Linux, which is the subject of the paper, uses a discrete distribution too.

Ok, that's fine, you just want to use 1s and 2s (though it would have been more obvious if you'd just used 0s and 1s :) ).

In that case, you don't have as many 11s and 22s as 12s and 21s (I'm ignoring the commas since they're irrelevant).

why not 4, 4, 4, 4, 4,... ;)

obligatory: http://xkcd.com/221/

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact