Fixing Getrandom() (lwn.net)
132 points by Tomte 15 days ago | 157 comments

It's recommended that people check out Theo de Raadt's Hackfest 2014 presentation on OpenBSD's arc4random(3).


I'd also suggest reading from page 19 on, which covers OpenBSD's random(4) subsystem and the getentropy(2) syscall, which suffers from none of the issues plaguing Linux's getrandom() today.


https://man.openbsd.org/man4/random.4

https://man.openbsd.org/arc4random.3

There may be some systems where storing entropy in a file across reboots isn't an option, e.g. diskless situations or non-writable filesystems. Though those should be a rare case. Of course, this would actually require cooperation from userspace, and given Linux is just a kernel and userland kind of free to do what they want, it's not as easy as it sounds.

Incidentally, I really wish we could get the arc4random(3) family on GNU/Linux already. It's the only notable *NIX platform that still doesn't provide it; illumos has it, every BSD has it, macOS has it. Given the difficulties with entropy as noted above, however, the whole "non-failing cryptographically secure PRNG" model just won't work with Linux.

arc4random almost made it into glibc 2.28:

That patch was held over for 2.29 but then disappeared.

If glibc had added it then musl libc would have and all would be right with the world, at least from the perspective of userland. The kernel would still need to get its story straight.

The bugzilla is classic Ulrich Drepper. Ugh.

So much truth to this. Those who think Linus or RMS are abrasive have obviously never met Drepper.

I had the (dis)pleasure of meeting him once at a former company when he came onsite, and he came across as every bit the condescending asshole he is on Bugzilla. I didn't really get the chance to know him as a person, but as they say, first impressions... I get that he's in a tough spot maintaining glibc, but a bit of tact in correspondence would go a long way.

His whitepaper on memory is a fantastic read, though (if with a somewhat clickbaity title).


I'm not saying he hasn't made contributions to the community. He surely has made more than I have, and very valuable ones at that. I'm just saying interacting with him is not pleasant.

Drepper has been gone from glibc for many years now.

I figured that the reason the Linux kernel originally gained popularity was that Torvalds was so open about accepting contributions, compared to BSD, which had "standards".

I get where either were coming from (BSD was open-sourcing an established codebase, Linux was a from-scratch effort), but it does seem like that foundational culture persisted.

Sounds like GLibc is more of the latter.

I see where you're going, but I don't agree we're at the same endpoint. One should generally be accepting of changes, but they should be worthwhile.

Theo de Raadt doesn't suffer any fools. He calls bullshit as soon as he sees it (ok, Linus does too) and is fairly uncompromising when it comes to code quality and security, regardless of whether it's kernel or userspace.

Seems to me Linus is primarily focused on kernel space being secure, and as long as kernel changes don't break userspace, he's cool.

I appreciate both views, but tend to like Theo's better.

I haven't seen anything criminal in his reply. Can you clarify what's so offensive about this discussion, please?

If I submit a patch and get a response like Ulrich's, that's a great way to ensure I'm never going to bother contributing to that project again.

The API was being added to Unix libcs left and right, giving it wide adoption and validation, making it an informal or de facto standard. He was dismissing it as some "dumping ground" code and sarcastically saying "ok, nice work, that would go great in your own project ..."

> If glibc had added it then musl libc would have

Has musl committed to adding it if glibc does? If not, why would musl add it?

See Rich Felker's original message to which Florian Weimer (glibc patch author) replied: https://www.openwall.com/lists/musl/2018/07/02/5

> ... glibc has adding [sic] the arc4random interfaces and it seems reasonable that we should too....

systemd does store the entropy in a file, but Lennart says this in the comments of the article: "we can only credit a random seed read from disk when we can also update it on disk, so that it is never reused. This means /var needs to be writable, which is really later during boot, long after we already needed entropy".

How does OpenBSD deal with this issue?

From rc(8):

  # Push the old seed into the kernel, create a future seed and create a
  # seed file for the boot-loader.
  random_seed() {
        dd if=/var/db/host.random of=/dev/random bs=65536 count=1 status=none
        chmod 600 /var/db/host.random
        dd if=/dev/random of=/var/db/host.random bs=65536 count=1 status=none
        dd if=/dev/random of=/etc/random.seed bs=512 count=1 status=none
        chmod 600 /etc/random.seed
  }
* https://cvsweb.openbsd.org/src/etc/rc?rev=1.537


* https://svnweb.freebsd.org/base/head/libexec/save-entropy/

* https://www.freebsd.org/cgi/man.cgi?loader.conf(5) (see "entropy_cache_load")

OpenBSD's boot loader also injects entropy into the kernel.

Which is to say, /var/ is writable when they do it.

This business of /var/ not being writable early enough is probably to do with the wonderful complexities of modern Linux, and is just not a problem that OpenBSD has.

> Which is to say, /var/ is writable when they do it.

They also do it at shutdown, before /var is R/O or unmounted.

One can also have multiple files, and have a regular cron job that replaces one of them occasionally while the system is running. This is what the FreeBSD script does.

* https://svnweb.freebsd.org/base/head/libexec/save-entropy/

* https://svnweb.freebsd.org/base/head/libexec/rc/rc.d/random

Other systems have solved this problem, so I'm not sure why all the high drama with Linux.

What if a filesystem error forces /var (or whichever mount point hosts the entropy seed) to remain read-only and the seed is never replaced? It seems that, unlike systemd, the BSDs are fine with crediting entropy even if it has already been used in a previous boot. That's wrong, though.

> Seems like unlike systemd the BSDs are fine with crediting entropy even if it has been used already in a previous boot.

FreeBSD runs a regular cron job so that there are multiple files. If /var goes R-O, you still have a bunch of files from when it wasn't and they are not the same as on initial boot (assuming that (a) /var was mounted R-W at some point, and (b) cron managed to run as well).

Also, on shutdown there is an attempt to write 4096B to both /entropy and /boot/entropy:

* https://svnweb.freebsd.org/base/head/libexec/rc/rc.d/random?...

Further, all of these various seed files are not the only sources of entropy (timers, RDRAND, etc).

> (assuming that (a) /var was mounted R-W at some point

Not a good assumption if the initial r/w mount is the one that fails...

If the initial /var (or /) is broken, then one has bigger problems and probably has to go in manually to fix things.

But for >99% of cases where the system comes up cleanly, and runs cleanly for a period of time, you'll have a bunch of seed files ready to go for the next boot. This configuration optimizes for the common case.

This is a fancy way of saying it's systemd's fault. (/s)

FreeBSD saves entropy in two places; the one you've linked is only loaded after userspace starts (via rc.d/random). The boot-time entropy mentioned in loader.conf is also created and saved in rc.d/random:


Similar idea to what OpenBSD does.

Read the seed on boot, replace it when possible.
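The read-then-immediately-replace pattern both BSD scripts follow can be sketched in a few lines of Python; paths and sizes here are illustrative, not any distribution's actual layout:

```python
import os
import tempfile

SEED_LEN = 512
# Stand-in for a real seed path like /var/db/host.random.
SEED_PATH = os.path.join(tempfile.gettempdir(), "demo-seed")

def load_and_replace_seed(urandom_path="/dev/urandom"):
    """Feed the saved seed to the kernel pool, then immediately replace it."""
    # 1. Write the old seed into the pool. Writing to /dev/urandom mixes the
    #    bytes in but does NOT credit entropy; crediting needs the privileged
    #    RNDADDENTROPY ioctl, which is exactly the systemd complaint above.
    if os.path.exists(SEED_PATH):
        with open(SEED_PATH, "rb") as f, open(urandom_path, "wb") as dev:
            dev.write(f.read())
    # 2. Overwrite the seed file right away so the same seed is never reused.
    fresh = os.urandom(SEED_LEN)
    fd = os.open(SEED_PATH, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.write(fd, fresh)
    finally:
        os.close(fd)
    return fresh
```

The ordering is the whole point: the seed file is overwritten before anything else gets a chance to consume the old value.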

It's also said that adding entropy to the pool can never make it worse, no? So why not just add it anyway? If you get to replace it, great! If not, well, you tried.

Yes, but you should not credit the entropy you added if you're not sure it's never been used before. And you need to credit it to avoid hangs.

OpenBSD doesn't have systemd.

The boot-loader injects randomness into the kernel before starting the kernel (using the saved random seed file), so you can have randomness from the very first bit of initialization.

If you cannot guarantee that the entropy has never been used before, that's bad.

It is guaranteed not to be worse than no injected entropy at all, which is the default and only possible alternative state before the kernel starts; it always gets mixed with whatever else can be gathered.

Sure it would, except nobody is brave enough to do it. They're too busy attacking some detail (this week it's diskless situations, next week it's "untrusted" random hardware/RDRAND) and ignoring a system that mixes components together.

I suppose that's the curse of the success of Linux. Now that it's everywhere, it also has to answer to all kinds of usage scenarios.

The problem is not intractable if you are willing to solve pieces of it. Then you solve another piece. Then somebody realizes they're the odd man out, and they solve their piece.

Read-only file systems are common in embedded systems, where Linux is often used.

> which suffers from none of the issue that are plaguing Linux getrandom today

Because OpenBSD assumes existence of RDRAND and a previous entropy file. Linus stated that not all systems have RDRAND or r/w media.

OpenBSD doesn't have a magic fix for availability on readonly media, right? You can use jitter entropy, and/or RDRAND (if available), but otherwise availability is a problem.

Linus merged a jitter entropy fallback that calls the kernel's schedule() function in a loop:
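The principle behind such a fallback can be sketched in userspace Python: repeatedly do variable-latency work, sample a high-resolution clock, and fold the noisy timing differences into a hash-based pool. This is only an illustration of the idea; the kernel's fallback mixes schedule() timing into its own input pool.

```python
import hashlib
import time

def jitter_bytes(n=32, rounds=4096):
    """Harvest timing jitter into a SHA-256 pool and return n bytes."""
    pool = hashlib.sha256()
    for _ in range(rounds):
        t0 = time.perf_counter_ns()
        # Some memory traffic whose duration varies with cache/scheduler state.
        _ = sum(bytearray(64))
        dt = time.perf_counter_ns() - t0
        pool.update(dt.to_bytes(8, "little"))  # fold the noisy sample in
    return pool.digest()[:n]
```

How much real entropy each sample carries is exactly the contested question; the hash only concentrates whatever unpredictability the timings actually have.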


Classic maintainer stuff, love it. Getting adequate fixes in place when nobody else has stepped up to submit one, so that the next release can have one less bug.

What a mess. The problem here is systemd and/or things like GNOME wanting high-quality entropy too soon, often without actually needing it. systemd shouldn't need high-quality entropy for itself, and most services it starts shouldn't need any either. Eventually some service is bound to need it (e.g., for TLS), but each service should make its own decisions about early-in-boot entropy. Then, instead of a deadly embrace between the kernel and systemd that stalls boot progress, you'd have some services not quite operational right away: still a big problem, but a lesser one.

Still, if you're booting a system that will be doing one thing, and that one thing needs entropy, then you have a problem. And the only practical solution then is to accept using things like RDRAND and a previous shutdown's saved seed to seed the system's SRNG, then add better entropy slowly as you can get it.

EDIT: I agree with tytso: if you trust the CPU enough to run on it, then you should trust its RDRAND or equivalent (if it has it).

RPis don't necessarily have RDRAND, and they especially suffer from increasingly fast boot, where waiting for the entropy pool is an actual concern.

Of course, not all applications need that kind of cryptographic entropy, but during early boot it's pretty much the best randomness you can get (seeding from the date/time isn't available on the RPi, which has no battery-backed RTC, so until you can bring up the network and contact an NTP server, you're stuck with whatever getrandom() returns).

RPi 2s have a BCM (binary-only) hardware RNG driver for entropy generation.

I have an RPi1 that suffers from this problem and will for a long while still.

This is strictly bad. getrandom() was fine as it was; it's not the kernel's problem if people don't read the explicitly documented man page for it. This is now changing behaviour, breaking the god damn userspace, which is exactly 100% what the kernel shouldn't be doing.

Indeed. I don't see how the solution to this problem isn't just to patch gdm.

Because the kernel has a policy of not introducing changes that break userspace, even if userspace is doing something weird.

Changing the semantics of getrandom(flags=0) breaks a lot of userspace ABI.

Yeah, this breaks userspace. It's much easier to argue that userspace is already broken under the current definition.

> systemd reads a random seed off disk just fine for you, no need to write any script for that. Problem with the approach is that it's waaaay too late: we can only credit a random seed read from disk when we can also update it on disk, so that it is never reused. This means /var needs to be writable, which is really later during boot, long after we already needed entropy, and long after the initrd

I understand in general why seed reuse is bad. If an attacker can get access to two things that used the same seed, that attacker can often learn things that the randomness was supposed to make unlearnable.

In this particular case, though, I wonder if that is actually a problem. If the seed is used before you have writable storage, and then the system is rebooted without writable storage ever becoming available (and so before the seed could be updated), and then the system reuses that seed on the next boot--the only way an attacker can get access to two things that used the same seed is if something that used it on the first boot has persisted somewhere.

Since there was no local writable storage the first time, this can only happen if the results of using the seed were communicated off the system, via networking or via a terminal, to someplace that did have working writable storage.

Thus, it would seem, that if you block networking with other machines and terminal access until /var becomes writable and the seed is updated, there is no problem with seed reuse.

Blocking doesn’t buy you anything because the seed could have already been used to generate data in-memory, which can then later be sent over a network or persisted to permanent storage.

The light goes on! :) Yes, the problem is not as impossible as it seems if you start with a best effort here, best effort there approach and converge to a point where the exceptions don't matter so much.

> we need to get over people's distrust of Intel and RDRAND.

Am I misreading this or is Ted Ts'o really suggesting that we should all just stop worrying and love the secret and completely unauditable RNG offered by the same company that has literally backdoored every CPU they've sold in the past 12 years?

If you don't trust Intel, then don't use Intel CPU's at all. Using Intel CPU's and simultaneously saying, "but we can't trust RDRAND because could be backdoored" is completely insane.

Intel hid an entire x86 core running minix --- with security holes --- and told no one. Between the firmware code which runs in System Management Mode and in UEFI, which persists after the OS is booted and can take over the system and read and write arbitrary memory locations --- if you fear a backdoor in the CPU, RDRAND is not the only place where an attacker can screw you over. The bottom line is the entire CPU cannot be audited. So trust it. Or don't trust it. Use another CPU architecture. Or go back to using pen and paper, and don't use any computers or cell phones at all.

We have to trust the bootloader to verify the digital signature on the kernel, so we might as well trust the bootloader to get a secure random seed. But from where? It could call UEFI or try to use the RNG from the TPM (if available). But now you have to trust Intel, and/or the motherboard manufacturer, and/or the TPM. The bootloader could read a seed file from the HDD/SSD. But we know that nation-state intelligence agencies (like the NSA) have the capability of implanting malware into HDD firmware (which is also unauditable), and we also know that we can't trust most consumer-grade manufacturers or IoT devices to correctly insert device-specific secure entropy into the seed file at manufacturing time. Otherwise, all of the seed files will likely have the same value; then, when the device generates its long-term private key immediately after it is first plugged in, the key will very likely be weak (see the "Mining your p's and q's"[1] paper).
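The failure mode from that paper is easy to demonstrate: if two devices' RNGs ever emit the same RSA prime, one gcd over the public moduli factors both keys. A toy Python sketch, with deliberately tiny, hard-coded primes standing in for RNG output:

```python
# If two devices' RNGs produce the same prime, a single gcd factors both
# RSA moduli. Primes here are tiny and hard-coded for demonstration only.
from math import gcd

p_shared = 1000003          # prime "generated" from identical seed material
q1, q2 = 1000033, 1000037   # the second primes happen to differ

n1 = p_shared * q1          # device A's public modulus
n2 = p_shared * q2          # device B's public modulus

common = gcd(n1, n2)        # an observer needs only the two public keys
assert common == p_shared
assert n1 // common == q1 and n2 // common == q2  # both keys fully factored
```

No factoring breakthrough required; the attack in the paper is exactly this, run pairwise across millions of scraped public keys.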

The bottom line is that unless you propose to personally wire up your CPU from transistors, and then create your own compiler from assembly language (see the "Reflections on Trusting Trust"[2] paper by Ken Thompson about what can be hidden inside an untrustworthy compiler), and then personally audit all of the open source code you propose to compile using that compiler and use on your system, you have to trust someone.

What makes this political, and difficult to address, is everyone has different opinions on what they are willing to trust and not trust. But some combinations, such as "we can't trust Intel because their firmware can't be audited", and "but we insist on using Intel CPU's", really don't make much sense.

[1] https://factorable.net

[2] https://www.archive.ece.cmu.edu/~ganger/712.fall02/papers/p7...

RDRAND might not be the only place where an attacker can screw me over, but it's a much easier exploit than many others, and one that is by design impossible to detect.

It would probably not be possible to get away with an XOR instruction that doesn't behave as advertised. There is simply no way to determine if RDRAND is behaving as advertised or not.

> you have to trust someone

Well, yes. Intel specifically has, to my mind, demonstrated beyond a shadow of a doubt that they are not to be trusted (that "entire x86 core running minix --- with security holes" thing you just mentioned; I thought it was ARM, though). So if the only difference is moving my root of trust from Intel to a random, unknown party that is not Intel, that is already something of an improvement.

How would you measure the jitter from your disk? Certainly you can't trust rdtsc either.

It was never ARM. It used to be ARC, but these days it is x86.

> Using Intel CPU's and simultaneously saying, "but we can't trust RDRAND because could be backdoored" is completely insane.

The main difference is that the behavior of the rest of the CPU is deterministic, so it can be verified. When you use the ADD instruction, the output will always be the sum of its inputs; when you use the RDRAND instruction, the output comes from a black box which is supposed to mix internal non-deterministic noise sources in a non-reversible way. How can you distinguish a correct RDRAND from a malicious or broken one which generates its output in a reversible way or based on a fixed key, since the apparent output (an arbitrary number) is the same in both cases?
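This is easy to make concrete: a generator that emits a keyed PRF of a counter is computationally indistinguishable from true randomness to anyone without the key, yet fully predictable to whoever holds it. A Python sketch, with SHA-256 standing in for the PRF and an entirely hypothetical key:

```python
import hashlib

SECRET = b"vendor-only-backdoor-key"   # hypothetical; known only to the vendor

def backdoored_rdrand(counter):
    """Emit 32 'random' bytes that are really a keyed PRF of a counter."""
    return hashlib.sha256(SECRET + counter.to_bytes(8, "big")).digest()

# To an auditor, the concatenated stream passes any statistical test...
stream = b"".join(backdoored_rdrand(i) for i in range(4))
# ...but the key holder reproduces every output exactly.
assert backdoored_rdrand(2) == stream[64:96]
```

That is the parent's point: the observable behavior of a correct RDRAND and this one are identical, so no black-box test can tell them apart.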

I don't understand why you believe an adversarial CPU must be deterministic simply because it says it will be.

I don't understand why you got downvoted. Even if GP meant that a CPU that is advertised as formally verified must be deterministic... the devil is in the details, and any number of bugs and undocumented features, including vulnerabilities and backdoors, can hide in a formally verified CPU that has billions of transistors.

Even if it seems deterministic now, there's no guarantee it won't become non deterministic in the future. https://ieeexplore.ieee.org/document/7546493

This is a very valid point, and there is even anecdotal evidence that operations we expect to be deterministic may not always be so.[1] Just because ADD works as documented in most scenarios doesn't mean we can verify that it will always work as documented in every scenario.

[1]: https://techreport.com/review/17732/intel-graphics-drivers-e...

> Using Intel CPU's and simultaneously saying, "but we can't trust RDRAND because could be backdoored" is completely insane.

The claim is not "can't trust RDRAND [to keep your system secure] because could be backdoored", the claim is "can't trust RDRAND [to generate unique unguessable numbers] because could be backdoored".

RDRAND not only can be backdoored; its implementation can also be flawed or too weak against some attackers. Or it can be a pinnacle of RNGs (unlikely). In any case, it is reasonable not to trust it to generate unique unguessable numbers. That trust requires control of the generation.

The impact this distrust has on decision of using Intel CPU's is specific to the use case. I trust my Intel CPU to do deterministic computational work for me, because in some cases I can rerun the job and compare the results, or check the results in another way. But I don't trust it to generate unguessable numbers for me. That is better done by separate HW device, ideally one that I would construct and control.

I don't have to trust my computer implicitly to use it. At the same time, I do not want my kernel to rely on RDRAND to get random numbers if a more trustworthy source is viable. Not that perfect randomness is required, but if the authors can do better than relying on RDRAND, they should.

> The bottom line is the entire CPU can not be audited. So trust it. Or don't trust it.

I can trust it to evaluate 1234+1234, or play a video for me, but I don't trust it to generate unique unguessable numbers.

Aren't there multiple forms of trusting Intel?

There is trusting them not to be malicious.

Then there is trusting them not to, more innocently, have some errata affecting the quality of randomness.

> We have to trust the bootloader to verify the digital signature on the kernel

Do we? I was under the impression that most people would not care if the bootloader did that or not.

If you want secure boot then you want the BIOS/UEFI to verify a signature on a boot loader, which then has to verify the signature on a kernel, which then has to verify the filesystems' uberblock checksums (which should be signed in the boot manifest, whose signature should be verified too).

Most people to my knowledge do not use secure boot out of choice but rather because it was forced on them.

Why not mix rdrand with jitter entropy?

On most modern systems, especially laptops and servers, RDRAND is only one of multiple hardware RNGs available. And except on first boot there's also the seed--partly derived from interrupts, etc--from previous boots that can be mixed back in.

I think the sentiment (or at least my sentiment) is that the number of systems where early entropy is used for something important like key generation but that lack strong sources is diminishing. Many, if not most, x86 and ARM embedded platforms over the past ~5 years have had at least an on-chip RNG, and over the past 5-10 years at least some on-board RNG (e.g. Intel NIC controllers often contain an RNG, AMD's PSP provides an RNG, plus various other options like SPI RNG chips).

Ideally there would be multiple sources, but ultimately it's the designer's choice. Likewise, any new platform that lacks any source but has software that expects strong entropy is unequivocally broken. Older systems are disappearing and are unlikely to be upgraded, anyhow.

Excessive paranoia about RDRAND is counterproductive. It's speculation that it's weak, while we know for a fact that the machinations userland code goes through to work around the poor semantics and unreliability of Linux kernel randomness interfaces have caused and are causing real problems, including real security problems. At some point the risk of trusting a bugged RDRAND is less than the risk of the workarounds adopted by refusing to trust it. Plus, if the kernel is loud about its expectations for quality hardware sources it creates an environment where people (embedded designers, savvy laptop purchasers) demand quality sources and manufacturers will compete along that axis (more so than they already do--the VIA C3 was the first to add an on-chip RNG on a mass production chip nearly 15 years ago).

I got wedged about six months ago trying to fast iterate on creating a Docker image on Docker for OS X. I exhausted the entropy on the Linux VM and IIRC, one of the package managers started hanging trying to fetch packages over SSL.

Ten years back I had an embedded app that was struggling with this. It's always been my assumption/fear that jitter on an RTOS is not a particularly rich source of entropy. Not to the degree of a Linux kernel sitting on a loud network (maybe switches have worsened this as well?)

Thankfully the system had a few things going for it. First, the flavor of PRNG we were using had a seed function that was cumulative (not all are). Second, a useful implementation of system time in nanoseconds (at the time, not a given). And third, the writers had a longish list of bootstrapping tasks to perform prior to making any network calls.

The code that started these tasks was all located in one bootstrapping file. So in addition to whatever crap entropy the libraries could gin up, we pushed another handful of bits of entropy in for each task by taking the nanoseconds and stuffing it into the seed data. doTask1(), seed(nanos()), doTask2(), etc.
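A minimal sketch of that cumulative-seeding trick, assuming a hash-based pool; the class and task stand-ins are illustrative, not the original embedded code:

```python
import hashlib
import time

class CumulativePool:
    """A seed pool where seed() accumulates state instead of replacing it."""
    def __init__(self):
        self._state = hashlib.sha256()

    def seed(self, data: bytes):
        self._state.update(data)      # accumulate; earlier seeds still count

    def output(self) -> bytes:
        return self._state.digest()

pool = CumulativePool()
for task in (lambda: None, lambda: None):  # stand-ins for doTask1(), doTask2()
    task()
    # Fold the nanosecond clock in after each task, as described above.
    pool.seed(time.monotonic_ns().to_bytes(8, "little"))
```

The cumulative property is what makes this safe to do repeatedly: each call can only add unpredictability, never wipe out what was already gathered.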

This is the same observation made in Linux's RNG, and the one demonstrated by the classic algorithm for extracting randomness from a biased coin: a high-quality mixing algorithm and a large volume of biased data can be transformed into a smaller quantity of high-quality entropy. Which makes me wonder why Linux doesn't use RDRAND with a low entropy estimate to augment the early sources.
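The biased-coin algorithm alluded to here is the classic von Neumann extractor: pair up flips, output one bit for an unequal pair, and discard equal pairs. For independent flips the output is unbiased regardless of the coin's bias, at the cost of discarding most of the input:

```python
def von_neumann(bits):
    """Debias a stream of independent coin flips (1/0) pairwise."""
    out = []
    for a, b in zip(bits[::2], bits[1::2]):
        if a != b:
            out.append(a)   # (1,0) -> 1 and (0,1) -> 0 are equally likely
        # equal pairs carry no usable bit and are discarded
    return out

assert von_neumann([1, 0, 0, 1, 1, 1, 0, 0]) == [1, 0]
```

The kernel's pool does the same job with a cryptographic hash instead of pairwise discarding, which wastes far less of the biased input.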

> I exhausted the entropy on the Linux VM

How is that supposed to work? Entropy does not get exhausted.

Pretty sure the Linux entropy pool is "consumed" when read and thus will become exhausted when not replenished fast enough.

You can see this when reading /dev/random on a low-entropy machine (such as a VM), and adding whatever you want to the entropy pool manually. The reading process will unblock, read a few bytes, then block again until you add more.

This is why you're supposed to use getrandom() instead of /dev/random. It doesn't block on "low" entropy.
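As exposed by Python on Linux, for example; with flags=0, getrandom() blocks only until the pool is initialized once and never again, unlike reading /dev/random on pre-5.6 kernels. The hasattr guard is just for portability to non-Linux systems:

```python
import os

def get_random_bytes(n: int) -> bytes:
    """Prefer getrandom(); fall back to os.urandom() where it's unavailable."""
    if hasattr(os, "getrandom"):       # Linux, Python >= 3.6
        return os.getrandom(n)         # flags=0: no re-blocking after init
    return os.urandom(n)               # portable fallback

key = get_random_bytes(32)
assert len(key) == 32
```

os.GRND_NONBLOCK also exists if you'd rather get an immediate error than wait out the one initial seeding.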

What were you using that uses /dev/random?

Right, or just use *BSD where /dev/{u,}random both behave correctly :)

I'm not the same guy, so I have no idea what he was reading /dev/random for. Just pointing out that it can in fact be "exhausted."

I'm not a crypto expert either, but wouldn't it still make sense to use /dev/random when generating long-lived keys (e.g. PGP, SSL CA, ssh keys...)?

I understand, I think, that a CSPRNG seeded with just the minimum amount of entropy should be "good enough." More entropy has to be at least theoretically better though, doesn't it?

If the only tradeoff is having to wait a few more seconds or possibly minutes to generate a key I'm going to use for years... why wouldn't I want to do that?

> I'm not a crypto expert either, but wouldn't it still make sense to use /dev/random when generating long-lived keys (e.g. PGP, SSL CA, ssh keys...)?

No. https://sockpuppet.org/blog/2014/02/25/safely-generate-rando...

I’m pretty sure it was https session keys for pulling packages. I can’t recall how I fixed it. My best bet is that I logged into the Linux host and symlinked /dev/urandom over /dev/random, or just restarted the host. It hasn’t happened again, and I was just experimenting with a couple of images over the last few weeks.

This is extremely weird. Would you happen to have an idea of what the reasoning behind it is?

Well, not really, but before Linux 3.17 there was no other way to be safe: /dev/urandom was the only other option, and that happily returns data even if the entropy pool isn't (yet) initialized and the CSPRNG can't be properly seeded.

Since then there is getrandom(2), which does the right thing for most applications.

The man page also has this to say:

> The /dev/random device is a legacy interface which dates back to a time where the cryptographic primitives used in the implementation of /dev/urandom were not widely trusted.

I must admit though, that I still don't understand why blocking on insufficient entropy is a bad thing for generating long-lived keys. I'm going to use the key for years, why wouldn't I just wait a couple of seconds or minutes?

Because the entire idea of "insufficient entropy" past an initialization amount is a cargo-culted fantasy.

In the Linux kernel view of the world, /dev/random "loses" 1 bit of entropy for every bit that gets read.

That is not how cryptographic and hashing primitives work; if it were, it would be dangerous to use an AES key to encrypt more than a few megabytes, for example.
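To make the point concrete: a CSPRNG stretches one fixed-size seed into an effectively unlimited keystream without "using up" anything. A minimal counter-mode sketch over SHA-256 (illustrative only, not a vetted DRBG construction):

```python
import hashlib

def stretch(seed: bytes, nbytes: int) -> bytes:
    """Expand a fixed seed into nbytes of keystream via hash(seed || counter)."""
    out = bytearray()
    counter = 0
    while len(out) < nbytes:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:nbytes])

ks = stretch(b"\x00" * 32, 1 << 20)   # a megabyte from one 256-bit seed
assert len(ks) == 1 << 20
```

If reading output really consumed entropy bit-for-bit, this construction (and AES-CTR, and every stream cipher) would be broken; the actual security argument is computational, resting on the strength of the primitive, not on an entropy budget.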

The /dev/random re-blocking behavior, where entropy is "consumed" (after initial entropy seeding), is a result of lack of crypto understanding by Linux shot-callers, and Linus in particular.

You wouldn't blame them for it 20-25 years ago, but we've known better for at least 15 years so it'd be nice if they'd get with the program. And it can be fixed without breaking their userspace ABI that they're usually fanatical about not breaking. Linus wanting to break getrandom's ABI just shows he still doesn't understand how CSPRNGs work.

Surely someone must have explained that to them years ago, right? Would you happen to have a discussion in mind where the maintainers of the RNG generators commented on the issue? I would like to see their perspective but I am kind of lost.

I imagine so, but I'm not close to the Linux kernel community and don't know. I don't get the impression that Linus, for example, changes his mind particularly easily.

If you're talking about Zen, AMD's PSP RNG is the same one used by RDRAND. That's not an independent source.

> Am I misreading this or is Ted Ts'o really suggesting that we should all just stop worrying and love the secret and completely unauditable RNG offered by the same company that has literally backdoored every CPU they've sold in the past 12 years?

Doesn't it all depend on how you are using it? As far as I recall, Linux does not use RDRAND as a direct source for random numbers, but rather mixes it with the entropy pool. Since xor'ing with (potentially) low-entropy data does not reduce the entropy, the argument is that it is safe. A potentially backdoored RDRAND is problematic as a direct source of random numbers, or when the pool has too little entropy (since RDRAND would then determine the output).

> Linux does not use RDRAND as a direct source for random numbers, but rather mixes it with the entropy pool. Since xor'ing with (potentially) low-entropy data does not reduce the entropy

And this is still insecure https://blog.cr.yp.to/20140205-entropy.html

DJB is not saying that an entropy pool is insecure, as you put it. Nor is he saying that one shouldn't bother with a PRNG seeded with RDRAND and/or other hardware sources of entropy.

DJB has an axe to grind (and rightly so) here in that he's pushing cryptographic algorithms that need less (or no) entropy. Specifically EdDSA instead of ECDSA. And he's also dismantling some of the bad arguments about how /dev/urandom should operate.

An operating system doesn't know why an application needs entropy, just that it does, and I don't see the problem with having an SRNG for that purpose. DJB doesn't even say that a system SRNG is not good, but, rather, he's quite sensibly saying that _applications_ should stretch one good 256-bit chunk of entropy. And yes, DJB correctly points out that there's no need to have security relative to an attacker that once saw the entire state of the SRNG at an earlier point in time -- but that does not mean that a kernel-land SRNG shouldn't add entropy to its pool, just that it doesn't buy you much.

He is saying that using a malicious source of entropy that can read the current state of the pool (such as RDRAND) can harm security.
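Sketched concretely, with a simplified XOR mixer standing in for the pool's real hash-based one: if the malicious source is mixed in last and can see the pool, it can choose its "random" contribution to force any final output it likes.

```python
import os

def final_output(pool: bytes, contribution: bytes) -> bytes:
    """Simplified final mixing step: XOR the last source into the pool."""
    return bytes(a ^ b for a, b in zip(pool, contribution))

pool = os.urandom(32)                       # honest entropy gathered so far
target = bytes(32)                          # value the attacker wants emitted
evil = final_output(pool, target)           # contribution = pool XOR target
assert final_output(pool, evil) == target   # honest entropy fully cancelled
```

The flip side is the grandparent's point: an honest-but-weak source mixed by XOR can never reduce entropy, since the operation is invertible. It's only the combination of malice plus visibility into the pool state that hurts.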

This is an extremely weird threat model, in that it implies an actively adversarial CPU that can't be trusted with or without RDRAND.

If you only have one source that you suspect might be malicious (such as RDRAND), can't you just use that as the first source?

Well yes. Except if, as others have noted, the hypothetical malicious RDRAND can read the state of the entropy pool as well.

Of course that's a totally different and more complex attack. Is it still plausible enough to possibly worry about? I have no idea, honestly -- this is way outside my area of expertise.

What I do know is that hardware RNGs made from open-source and verifiable hardware is readily available for about US$50 for qty 1, cheaper in bulk.

Given that, it doesn't seem worth more than a couple of minutes investigating whether or not RDRAND can be trusted. If there is the slightest hint of doubt, I just disable it and move on with life.

You're cutting off the first half of the quote which is pretty damn relevant:

> we need to find some way to fix userspace's assumptions that they can always get high quality entropy in early boot, or we need to get over people's distrust of Intel and RDRAND

Though ts'o is also ignoring the third option — and the one used by openbsd — of storing an rng seed after boot (and on shutdown).

The very first boot does remain problematic if the image isn't seeded though.

Ubuntu's man pages say "all major Linux distributions have [saved a seed file across reboots] since 2000 at least":


Lennart, posting as mezcalero, addresses that in the LWN discussion. TL;DR: systemd loads the seed too late in the boot process, which is a deliberate choice on their part.

But is it really that difficult to seed the image for that first boot?

For example, an installation process using an .iso image could, after verifying the .iso, put a random seed inside the .iso (unpack-modify-pack) as a pre-configuration step. Like an extra preparation step after downloading. New installation = new pre-configuration step, to avoid seed reuse.

The previous would apply to VMs and containers and such.

For traditional embedded/IoT style devices, the step could be done at the factory when flashing the image. Or maybe there could be a small flash area with the seed as part of the initial flashing procedure.

I guess some forms of installation end up being "problematic" to handle (ro root filesystems maybe).

A VirtIO-RNG driver has been in Linux for over 10 years: https://wiki.qemu.org/Features/VirtIORNG

Randomness in VMs isn't a problem. If VM hypervisors aren't providing VirtIO-RNG, it's still just a quick fix and reboot away. The quickest way to make sure all hypervisors and hosting providers enable VirtIO-RNG, or something similar, is for the kernel to begin assuming, loudly, that strong entropy is available at boot time.

For embedded stuff SPI RNG chips and similar solutions have been available since forever, not to mention the on-chip sources on most modern ARM and x86 CPUs and SoCs.

If a product designer (e.g. embedded device) or solution provider (cloud hosting) chooses to put their trust in RDRAND alone, then that's their choice--Linux shouldn't second-guess it, or at least shouldn't let such a choice dictate the semantics of its APIs.

VirtIO-RNG has not been officially accepted into Xen (it seems to be easier to incorporate into a PV DOMU).

If you have RDRAND hardware, you could (I suppose) pipe its output from DOM0 into an encrypted root terminal session on DOMU, and feed that into the pool.

I personally don't want to rely on RDRAND. Unlike a tampered ADD instruction, a tampered RDRAND instruction would appear to be undetectable, even in principle. I wish Intel had given me the raw bits from their DRBG, instead of providing only the whitened stream. Or hell, give me both - but not just an opaque stream.

> unpack-modify-pack

This seems eminently unfeasible for many applications, and it won't work at all for read-only filesystems as you've mentioned.

How is it relevant? The fact that he's proposing an alternative option doesn't change the fact that he's also presenting a completely insane one.

In the same way "We need to stop driving the boat into rocks or..." is relevant context to "...we need to start liking to swim". Just because something is listed as an option does not mean the speaker is interested in actually doing that.

It's a speaking pattern meant to convey "we've got to do something about <x> and people don't like <y> (so it's gotta be <x>)". In reality there are almost always more <options> than the duality implies but at no point does the person ever actually want <y>.

I can't quite put my finger on it, but there is something in Tso's statement that seems to me to be quite different from yours.

In your example, the fact that the first option is self-evidently correct even in a vacuum, is something of a cue that the second option isn't meant to be taken as seriously.

In Tso's statement, it's not immediately clear to me which of the two options he prefers. In fact, his use of the phrase "get over it" leads me to think that is actually his preferred variant. To me, that phrasing means if not an outright trivial and unworthy concern, then at least one that only needs the passage of time. There doesn't seem to be any indication that he's using the phrase ironically or anything.

Of course it is always possible that I'm just misreading or misinterpreting it. But that is why the first half didn't, and still doesn't, seem relevant to me.

If you hypothesize that your CPU is completely compromised, it is pointless to be hysterical about rdrand specifically. Any sane random generator does not pass its output directly to consumers anyway (it always goes through at least an irreversible hash function along with other inputs).

It isn't completely vain - consider djb's blog post that discusses this exact scenario:


That was exactly what I was thinking about. This is a quite impractical attack, in the sense that if you are able to do that, you are able to mount far simpler attacks, like tuning the final output without even going through the pretend & mine phase -- which could be more effective, depending on the mixing algorithm and whether true randomness is in the mix or not, though arguably slightly easier to detect. So the hardware RNG attack path would only be reserved for extremely high-profile targets subject to APTs, and those targets had better generate their keys on dedicated hardware, or there will be 10000 other ways for them to be owned. (And even if they do use dedicated hardware, against an attacker that motivated, I predict they are still owned...)

And above all: you will also probably be able to modify other supposed sources of entropy (and if you restrict yourself to those sources, I still think it is a terrible idea given an attacker with that kind of capability), so short of analyses showing why this would be way more difficult, dropping the dedicated hw gen over a theoretical problem that would also be an end-game for virtually everything else is insane. You get no actual advantages, and tons of drawbacks.

Really, the hypothesis of what the attackers are able to do and what they are not makes no sense IMO; like:

> For example, an attacker can't exert any serious control over the content of my keystrokes while I'm logged in; I don't see how hashing this particular content into my laptop's entropy pool can allow any attacks.

Yeah, so that mythical unicorn can invoke utter craziness in RDRAND, but is somehow unable to modify the key inputs and/or timings just when that data reaches the entropy input algos? I just don't buy it. Like I don't buy that the user is somehow recomputing everything on a second computer and comparing both results (or using other fancy proof of work things, that I'm not aware that even exist on real code we are talking about).

My litmus test is: imagining I had to attack, I would never do it by such a convoluted method. Or maybe only if Bin Laden Two is the target, AND if it seems to actually make sense in the technical context of the attack: the probability of both occurring is, I guess, quite low...

This feels too obvious to work, but:

The kernel knows it's early in the boot process, and it knows when there isn't enough entropy and that it will have to block getrandom() calls. We also know that what caused this issue was some FS efficiency improvements that result in fewer disk accesses during early boot, which results in fewer interrupts, which results in less entropy.

So... when we get into this situation, why doesn't the kernel just start issuing some disk accesses (perhaps with weakly-random offsets and sizes) until enough interrupts are generated to fill the entropy pool to a safe level?

Not all Linux systems have disks. The jitter entropy approach described elsewhere is more comprehensive and less dependent on hardware components outside of the CPU.

Fair; it does make sense to re-evaluate the whole thing and find the most generic of solutions, but the issue that triggered this discussion was a sort of regression on systems that do have disks.

(Your point also raises the question: on systems that don't have disks, how are requests for early-boot entropy handled?)

> Fair; it does make sense to re-evaluate the whole thing and find the most generic of solutions, but the issue that triggered this discussion was a sort of regression on systems that do have disks.

Right. On those kinds of systems, persisting some saved state between boots is your best first option. (As other commenters point out, you really don't need very much! 256 bits or 32 bytes is basically adequate.) Any additional entropy you get from device interrupts or machine-dependent random sources like RDRAND are good to mix in.

(I don't think the adversarial RDRAND threat model is a reasonable one, and that has been adequately addressed elsewhere in this thread by tytso, tptacek, and tedunangst. Any adversarial CPU that can defeat cryptographic digest whitening of entropy sources is so sophisticated/privileged it already fully owns your operating system; game over.)

Linux systems already do this, kind of! But Linux has taken the stance that entropy you can't delete from the boot media at the time it is fed into the CSPRNG as entropy doesn't count ("credit") as far as initial seeding, and that decision is what moots its utility for early CSPRNG availability. It's not an unreasonably paranoid take, but not an especially practical one (IMO). I suspect the net effect on security, holistically, is negative.

> (Your point also raises the question: on systems that don't have disks, how are requests for early-boot entropy handled?)

Depends which subsystem you ask and how you ask it, right? And whether or not a jitter entropy scheme is enabled in the kernel. (I might be misunderstanding your question, sorry if that's the case.) If you ask /dev/random or getrandom(~GRND_NONBLOCK), they'll block until the CSPRNG is satisfied enough entropy is available. If you ask /dev/urandom, it just spews unseeded and likely predictable garbage at you. If you ask getrandom(GRND_NONBLOCK), it returns -1/EAGAIN.
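Those getrandom() behaviors can be observed directly. A small Linux-only sketch using Python's `os.getrandom` wrapper (Python ≥ 3.6) over the syscall -- on a system booted long enough for the pool to be seeded, the non-blocking call just succeeds; in early boot it would fail with EAGAIN instead:

```python
# Demonstrates the GRND_NONBLOCK behavior described above (Linux only).
import errno
import os

def try_getrandom(nbytes=32):
    """Non-blocking request: returns None instead of blocking if the
    kernel CSPRNG has not been seeded yet (early boot)."""
    try:
        return os.getrandom(nbytes, os.GRND_NONBLOCK)
    except BlockingIOError as e:
        # Before the pool is initialized, the syscall fails with EAGAIN.
        assert e.errno == errno.EAGAIN
        return None

data = try_getrandom()
print(len(data) if data is not None else "not seeded yet")
```

After boot on any normal system this prints `32`; the `None` branch is only reachable in the early-boot window the thread is arguing about.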

> (I might be misunderstanding your question, sorry if that's the case.)

Not entirely, but I think my question was just more narrow. Considering the case that started this off (machines that used to work, now hanging on boot because of lack of entropy, triggered by too few disk interrupts), I don't get how this generalizes. If you take that same system, and remove the disk (replacing it with netboot or whatever), how would it ever boot, at least not without making some changes to how early entropy is seeded? I guess in the netboot case you end up with the timing of network interrupts that you can feed into the entropy pool, but this whole thing just feels off to me.

If a system was relying on a specific, inefficient set of disk access to give it just enough entropy to get through early boot, and that making the FS code more efficient caused it to fail to have enough entropy, I'd suggest that (likely unbeknownst to the system administrators), this system was already pretty broken. I get why Torvalds decided to bucket this under the "we don't break userspace" umbrella, and appreciate the care he takes to that sort of thing, but suggesting that "FS access patterns during early boot" is sorta a part of the kernel's ABI is... a bit too incredible for me to take seriously.

So the discussed sources of entropy are

   - interrupts
   - rdrand
   - timer/instruction jitter
Why don't they throw more sources at the problem? LSBs from analog sources such as audio inputs, stray network packets (put the nic in promiscuous mode). Even uninitialized memory might be worth a few bits (not all systems zero it on boot).

This is happening so early that audio and network (and /var where the persistent seed is stored) aren't up. The kernel doesn't want to be susceptible to hangs unless userspace brings up all entropy sources early enough.

This is about GDM, which is the last thing that gets started.

Hardware devices are initialized before init is started, so you should still be able to gather entropy from the various input devices.

Lack of entropy can't cause the kernel to not boot; it's just userspace (typically services which require encryption) which can have a problem. It would be silly to start a webserver or sshd before the FS is remounted rw.

Not if you use modules.

Can you elaborate on what you mean?

Init is executed before modules start being loaded.

The issue was GDM/X11 starting though, that's later into the boot sequence.

More sources don't help; the problem is very early availability. Networking isn't online yet. Uninitialized memory is probably worthless.

They should display a prompt: "Press any number of keys to continue".

I'm not sure that analog sources can be relied on to be present in all systems. And I'm not sure that your other suggestions (stray network packets or uninitialized memory) can be relied on to be sufficiently random in all systems. It seems to me like they're looking for solutions that rely on the processor for the randomness and not on any peripherals.

Sorry if I'm missing something, but what would promiscuous mode do in this age, exactly? I personally haven't seen a hub in at least a decade, and on a switched network, is there any difference?

Yes, there's tons of non-unicast traffic: ARP requests, spanning tree, various UDP broadcast packets, etc...

The NIC would receive broadcast traffic regardless of whether promiscuous mode is enabled, wouldn't it?

Broadcast packets are received in non-promiscuous mode, obviously. Otherwise none of this would work without it.

Yes, you are correct.. I was more commenting that the "stray" network packets could still be used for timing.

I'm surprised nobody in the thread mentioned using a TPM device when available. My 5-year old MB comes with one builtin, and probably all new ones have one. TPM also provides a limited amount of persistent storage.

I used to work on a Java application that used a bunch of libraries, and specifically I remember one of the libraries generated some identifier using urandom. I think it was axis2 or something like that. There were unexplainable random hangs when trying to read from urandom. The wikipedia article says that urandom should not block. Can anyone smarter than me who understands this article explain why urandom hangs?

Java on Linux used to have this bad behavior where it would short-circuit configuration pointing at /dev/urandom and instead use its own internal RNG seeded from /dev/random. It did not create a single global seed for the internal random number generator, so some systems could overwhelm the entropy pool.

I've solved this for some systems by changing the java security properties file to instead point to /dev/./urandom .

I did that and even then it still happened :( one of life's great mysteries

Impossible to say without a deeper inspection of the system, but given `/dev/random` does block, the answer is that something is redirecting urandom to random. On the system, does cat urandom ever block? Step one is to reproduce it outside of Java; if that's not possible, start digging into the Java library's code. If it is, then start digging into `glibc`, devfs, the kernel.

Something in the chain switched urandom and random. Isolate each component until the culprit is found.

Java app servers with high request rates suffered this problem, leading to throttling of SSL session creation and thus reducing effective requests per second on the server.

I doubled throughput on two different projects by 'fixing' the /dev/random-/dev/urandom problem.

But writing your own can be worse. I stumbled onto a substantial CSRNG bug on an embedded system that was ignoring most of the seed data due to a premature optimization in the ingestion code. Even after that, the people who wrote the glue code wanted to do their own entropy mixing. Lesson not learned, at all.

After boot, and once it's been initialized with sufficient entropy, the /dev/urandom PRNG should not block. Can you provide more detail as to what it was doing?

When the issue came up with the OpenSSH hang, I failed to be convinced why this should be a problem at all. Perhaps someone can enlighten me after reading my thoughts in the thread there? https://news.ycombinator.com/item?id=20463586

> Sounds like the real problem is some part of the system can't be bothered to persist its state.

That's just a special case when you do a reboot. There are many cases where the system starts and there's either no previous state, or the previous state is explicitly killed to prevent sharing.

For example a lot of AWS instances will boot only once in their lifetime, and will do that from a shared AMI. You can't reuse a shared random seed in that case or getting random values from one VM would allow you to predict new values on another.

So persisting the state will help desktop users, but that's addressing only a specific version of this problem, not solving it for everyone.

> You can't reuse a shared random seed in that case or getting random values from one VM would allow you to predict new values on another.

Why can't AWS seed you with a random number before giving you control? In fact does it not do that already? If it doesn't, either I'm surprised, or I don't understand why it's not possible.

AWS does not touch your drives itself. (Apart from what you configure) It would be a drama on its own if they tried. And even if they did try, what about encrypted partitions, filesystems which they don't support, systems configured for a different seed location, etc?

For some hypervisors this is solved in a different way: https://wiki.qemu.org/Features/VirtIORNG but as far as I know AWS does not support it yet.

What do you mean they don't touch your drive? Do they just give you a blank disk with no bootable OS (in which case, doesn't that mean you yourself are now responsible for seeding when you install the OS)? I know whenever I've dealt with a cloud VM, it's come with an OS set up by the vendor, but if it's not, then the OS installer needs to seed the image. So my point is, whoever sets up the system seems responsible for seeding it. If AWS gives me a VM, I expect it to come seeded with a unique number, and of course I don't expect them to touch anything after giving me control. Anything I set up beyond that is my responsibility to seed properly. I don't understand why this would be a drama.

They give you a VM with an image you choose. It may be an image you prepare or one they provide. But no modifications to the file contents since creation.

Drama would come from the same place as "we shouldn't trust rdrand". That's still avoiding the issue of: sometimes can't write that seed in some configurations.

Yeah, so while creating it, they seed it. If you don't trust that, you install another OS yourself, with a kernel your trust, and seed it yourself when installing, from whatever sources of randomness you want. I fail to see why this wouldn't work.

They can't seed it while creating the AMI. If they did that, everyone would share the initial seed and I could boot up the published image, run the generator with that seed, and be able to predict (for example) what ssh key your server will have.

(Or if you meant creating the VM, as explained before they can't do that for encrypted volumes for example)

If you can't trust your cloud provider to give you good boot time entropy you shouldn't host your VMs on them at all.

Sure. Same applies to CPUs and rdrand... yet here we are.

Not all CPUs have rdrand? Yes, trust it on x86 when it is present. Otherwise... Linux runs on a lot of arch that are not x86.

I agree that the LKML fear-mongering over RDRAND is doing more harm than good.

"EFI variables." Or shove it in an ACPI table. Or any of the myriad other ways BIOSes communicate data to the operating system via RAM rather than disk image. Edit: or the virtio-rng interface, as another commenter reminds me.

Yup. There are ways to do that without a disk seed. They need to be actually implemented on both sides though, so some standardisation is needed. (Or at least one popular solution)

It absolutely could and should! This is called out in like, 2003 in FS&K Fortuna.

Seems to me that providing a way to either store a random seed or provide a canned one on boot is a bootloader/system problem not a Linux OS problem.

Why isn't it standard that hypervisors provide a random seed for each of their vms?


So, the Linux kernel can't just persist 32 little random bytes? That shouldn't be hard. Even giving a seed for the first boot (or when booting a VM image) shouldn't be hard. Heck, we don't even need to collect entropy. Like, ever. Just use Chacha20 with fast key erasure and be done with it.

If your system is too small to be able to persist 32 bytes, it's probably too small to perform any amount of meaningful cryptography anyway.
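A rough sketch of the "ChaCha20 with fast key erasure" construction the comment alludes to: each request generates key-stream, immediately overwrites the key with the first 32 bytes, and returns the rest, so a later compromise can't reveal past outputs. The real construction uses ChaCha20; SHAKE256 stands in here purely because Python's stdlib has no ChaCha20, so treat this as an illustration of the key-erasure pattern, not the genuine article.

```python
# Fast-key-erasure RNG pattern (SHAKE256 standing in for ChaCha20).
import hashlib

class FastKeyErasureRNG:
    def __init__(self, seed: bytes):
        assert len(seed) == 32  # "just 256 bits", persisted across boots
        self._key = seed

    def random_bytes(self, n: int) -> bytes:
        # Generate 32 + n bytes of stream from the current key...
        stream = hashlib.shake_256(self._key).digest(32 + n)
        self._key = stream[:32]  # ...ratchet: overwrite the key first,
        return stream[32:]       # then hand out the remainder.

rng = FastKeyErasureRNG(b"\x01" * 32)  # toy seed: never do this for real
a = rng.random_bytes(16)
b = rng.random_bytes(16)
assert a != b  # the key was replaced between calls
print(len(a), len(b))  # → 16 16
```

The ratchet is the point: once `_key` is overwritten, the state that produced `a` no longer exists anywhere.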

Yes, the availability problem hinges on being able to persist a handful of bytes. The problem isn't the number, which, sure, is small. The problem is the availability of any persistent media in general.

There are absolutely devices with exclusively RO media with plenty of processing power to perform meaningful crypto (e.g., embedded devices with 802.11 wifi). So:

> If your system is too small to be able to persist 32 bytes, it's probably too small to perform any amount of meaningful cryptography anyway.

This sentence is not based in reality.

> There are absolutely devices with exclusively RO media with plenty of processing power to perform meaningful crypto (e.g., embedded devices with 802.11 wifi).

Well, those devices are flawed. Unless they have special hardware to generate random numbers? I mean, the ability to generate random number is such a basic feature. If your communication hardware doesn't have it, your communication hardware is crap, and that's the end of it.

I maintain what I said: if a system is big enough to do crypto, it is big enough to persist 32 bytes. When you can process a 64-byte Chacha20 block in RAM, you can spare 32 bytes of EEPROM or flash. I think this is reasonable enough to be "grounded in reality".

That said, of course many devices out there are so badly designed that even though they're powerful enough to process elliptic curves (and I presume the entire Linux kernel), they don't even have enough persistent memory to store 256 bits of entropy, and they don't have a reliable, fast way to generate 256 bits of entropy. Stupidity is a thing, I know.

Acknowledging such stupidly designed devices is of course very important. We do what we can with what we have. But let's not go the C standard route, and use that stupidity as an excuse to never use persistent state or random generation instructions. Let's use the capabilities of our platforms, dammit.

For instance, that stupid WiFi router is not an excuse to not persist state on a freaking desktop. (Or laptop. Or palmtop. Or R-Pi. Or virtualised server. Etc.)

> There are absolutely devices with exclusively RO media with plenty of processing power to perform meaningful crypto (e.g., embedded devices with 802.11 wifi).

Don't most microcontrollers have some EEPROM that could be used for this? I'm assuming even 4 bytes would help a lot. I am pretty sure STM32's with hardware crypto have EEPROM good for a million write cycles.

4 bytes isn't enough, but yes, it's quite possible some SoCs could persist 256-512 bits of entropy in some hardware-specific way. It's hard to abstract that into a general-purpose operating system, though it could be a good solution for specific applications.

> It's hard to abstract that into a general-purpose operating system

Linux has lots and lots of hardware specific drivers. This shouldn't be any different.

Feel free to start writing device drivers for 1000s of individual SoCs and figure out a way to solicit testing. I agree hardware-specific drivers are a solution for the SoCs with RW media of some kind, but it's not as trivial as you paint it. And (AFAIK) there are still SoCs without RW media. I think jitterentropy is a better and more easily tested use of developer hours than implementing 1000s of SoC drivers, but it's open source; work on what you like.

The onus is not on Linux to support devices. The onus is on the devices to support Linux. Linux support is a selling point, that should be leveraged.

The current way of gathering initial entropy should not be the default, it should be a fallback. If a device have neither persistent storage (32 bytes for crying out loud) nor a hardware random generator, they should be treated like second class citizen, with only best effort random numbers.

Right now, everyone is a second class citizen. Kind of insane, don't you think?

> it's not as trivial as you paint it

It is, under one (extremely hard to fulfil) condition: collaboration from hardware vendors. It's just 32 persistent bytes. Just bury them in a chip and memory map the damn thing! The driver can be reduced to an offset.

> I think jitterentropy is a better and more easily tested use of developer hours than implementing 1000s of SoC drivers

That one? https://fuchsia.dev/fuchsia-src/zircon/jitterentropy/config-...

It's one of the first links I found, and I hope it is obsolete: they say the internal state of Jitterentropy is a puny 64-bit number. What were they thinking, we need four times as much! That kind of thing is why people are tempted to ditch the system's RNG and ask users to wiggle their mouse instead.

I'll pass on most of this rant, just want to correct this misunderstanding:

> That one? https://fuchsia.dev/fuchsia-src/zircon/jitterentropy/config-....

> It's one of the first links I found, I hope it is obsolete: they say the internal state of Jitterentropy is puny 64 bits number

You skimmed too quickly and came to a judgement based on a misreading or misunderstanding of the concept. Obviously 64 bits is not sufficient.

The idea of the jitter entropy mechanism is that most modern CPUs have super high resolution clock or cycle counters available (even embedded boards), and instruction execution speed has mild variance.

You run a series of trials, each of which produces one or more bits of output (classically: 1). You run as many as you want, producing an infinite stream of weak entropy, until you have collected a satisfactory number of output bits. Trials can be run relatively quickly (many nanoseconds or a handful of milliseconds per trial).

For each trial, you perform some minimal workload intended to exacerbate CPU runtime variance (this might be where you saw "64 bits"), and extract some number of output bits (maybe 1) from one or more of: the low bits of the cyclecounter, nanosecond clock, or something similar of that nature.

Caveat: jitter is an even weaker entropy source than most non-HWRNG sources typically consumed by entropy gatherers. Your motto of "just 256 bits" assumes total independence and 8 bits per byte of entropy. It isn't met by most real-world entropy sources on server systems (expect 4-5 bits/byte) and especially not by jitter entropy. Empirically, jitter seems to come closer to 1 bit per byte minimum in SP800-90B evaluations on the raw output — it's a pretty weak entropy source.
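A minimal sketch of a jitter-entropy trial as described above: run a small workload whose duration varies with CPU/cache/scheduler noise, read a high-resolution counter, keep the low bit, repeat, and then condition the weak raw bits with a hash. (The workload and trial counts here are arbitrary illustrations, not tuned values from any real implementation.)

```python
# Toy jitter-entropy gatherer: ~1 weak bit per trial, conditioned by SHA-256.
import hashlib
import time

def jitter_bit() -> int:
    # Tiny workload that burns a variable amount of time.
    acc = 0
    for i in range(200):
        acc = (acc * 1103515245 + i) & 0xFFFFFFFF
    # Low bit of a nanosecond-resolution counter is the trial's output.
    return time.perf_counter_ns() & 1

def gather(nbits: int = 2048) -> bytes:
    bits = 0
    for _ in range(nbits):
        bits = (bits << 1) | jitter_bit()
    raw = bits.to_bytes(nbits // 8, "big")
    # Condition: collapse many weak bits down to a 32-byte seed.
    return hashlib.sha256(raw).digest()

seed = gather()
print(len(seed))  # → 32
```

Note the caveat above applies directly: 2048 raw trial bits are collected for a 256-bit seed precisely because each trial yields far less than one full bit of entropy.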

Anyway: feel free to read more about the concept at any of the sources. The Fuschia writeup you linked is a good one if you read it more closely; there's also:

* From the original (2013) proponent, Stephan Mueller: description: http://www.chronox.de/jent.html and sources: https://github.com/smuellerDD/jitterentropy-library

* LWN's 2015 writeup of the concept as a Linux entropy source: https://lwn.net/Articles/642166/

* And this is all suddenly topical due to the latest nonsense from Linus (TFA, 2019), who still does not understand CSPRNGs. Anyway, Linus wrote and merged a version of jitter entropy quite recently: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin... This is a relatively happy outcome in that Linus didn't just break the ABI to be completely insecure by default.

> Your motto of "just 256 bits" assumes total independence and 8 bits per byte of entropy.

Yes of course. This is easily obtained by hashing a much bigger input. The problem is determining how big the input should be. That is, how much entropy it actually holds. You can also hash piecemeal (H is whatever you think is secure):

  H0 = H(I0)
  H1 = H(I1 || H0)
  H2 = H(I2 || H1)
  H3 = H(I3 || H2)
You can stop as soon as you gathered enough input to be happy about its entropy. Then just switch to fast key erasure with Chacha20 and stop wasting cycles on entropy gathering.
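A runnable version of that piecemeal chain, with SHA-256 standing in for H: each step absorbs one input chunk plus the previous digest, so you can stop whenever you believe enough entropy has been accumulated.

```python
# Piecemeal entropy hashing: H(i) = H(I(i) || H(i-1)).
import hashlib

def chain_entropy(inputs):
    h = b""
    for chunk in inputs:
        h = hashlib.sha256(chunk + h).digest()
    return h  # 32 bytes: hand this to the fast-key-erasure stage

# Toy inputs standing in for interrupt timings, RDRAND output, etc.
seed = chain_entropy([b"timings", b"rdrand", b"jitter", b"boot-seed"])
print(len(seed))  # → 32
```

Because each digest is folded into the next step, a single honest chunk anywhere in the sequence is enough to make the final value unpredictable to an attacker who controls the others.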

> Anyway, Linus wrote and merged a version of jitter entropy quite recently. […] This is a relatively happy outcome in that Linus didn't just break the ABI to be completely insecure by default.

I'm genuinely relieved. This would have been the worst way to break userspace. Still, tiny embedded systems might need to persist (properly seeded) 32 bytes instead of relying on jitter entropy.
