
Removing the Linux /dev/random blocking pool - lukastyrychtr
https://lwn.net/SubscriberLink/808575/9fd4fea3d86086f0/
======
CiPHPerCoder
Making /dev/random behave like getrandom(2) will finally put to rest one of
the most frustrating arguments in the public understanding of cryptography.
Please do it.

~~~
jdormit
What argument are you referring to?

~~~
ATsch
The idea of "randomness" being "used up", and then "running out of
randomness", somehow.

So let's look at how a hypothetical CSPRNG might work. We get our random
numbers by repeatedly hashing a pool of bytes, and then feeding the result,
and various somewhat random events, back into the pool. Since our hash does
not leak any information about the input (if it did, we'd have much bigger
problems), this means attackers must guess, bit for bit, what the value of the
internal pool of entropy is.

This is essentially how randomness works on Linux (the kernel just uses a
stream cipher instead, for performance).
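ATsch's hypothetical construction can be sketched in a few lines of Python (a toy model for illustration only, using SHA-256; as the comment notes, the kernel's real implementation differs and uses a stream cipher):

```python
import hashlib

# Toy model of a hash-based CSPRNG pool. The pool is secret internal state;
# output never reveals it because the hash is one-way.
pool = b"\x00" * 32

def add_event(event: bytes) -> None:
    """Stir a (possibly low-quality, possibly attacker-known) event into the pool."""
    global pool
    pool = hashlib.sha256(pool + event).digest()

def get_random_bytes(n: int) -> bytes:
    """Produce output by hashing the pool, then feed the result back in."""
    global pool
    out = b""
    while len(out) < n:
        block = hashlib.sha256(pool + b"output").digest()
        out += block
        pool = hashlib.sha256(pool + block).digest()  # feed back into the pool
    return out[:n]

add_event(b"keyboard interrupt at t=12345")
r = get_random_bytes(16)
```

Even if some `add_event` inputs are attacker-known, recovering the pool from the output would require inverting the hash, guessing the unknown inputs bit for bit.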

This clarifies a few things:

1\. Even if you assume Intel's randomness instructions are compromised, it
still is not an issue to stir them into the pool. Attackers need to guess
every single source of randomness.

2\. "Running out of randomness" is nonsensical. If you couldn't guess the
exact pool before, you can't suddenly start guessing the pool after pulling
out 200 exabytes of randomness either.

~~~
throw0101a
> _2. "Running out of randomness" is nonsensical. If you couldn't guess the
> exact pool before, you can't suddenly start guessing the pool after pulling
> out 200 exabytes of randomness either._

Not entirely.

/dev/random and arc4random(3) under OpenBSD originally used the output of RC4,
which has a finite state size:

* [https://en.wikipedia.org/wiki/RC4](https://en.wikipedia.org/wiki/RC4)

Rekeying / mixing up the state semi-regularly would reset things. It's the
occasional shuffling that really helps with forward security, especially if a
system has been compromised at the kernel level.

~~~
tptacek
No, Arc4random didn't reveal its internal RC4 state as it ran, in the same
sense that actually encrypting with RC4 doesn't deplete RC4's internal state.

~~~
cperciva
Many implementations didn't do enough mixing before generating output, though.

Also, when you look at cache side channel attacks -- RC4 _definitely_
publishes its internal state.

~~~
ben_bai
That's why OpenBSD cut away the start of the RC4 stream (don't remember how
many bytes) to make backtracking harder.

But the point is moot because the stream cipher used was switched from RC4 to
ChaCha20 about 5 years ago. And there is no side channel attack on ChaCha20,
yet.

~~~
cperciva
 _why OpenBSD cut away the start of the RC4 stream (don't remember how many
bytes) to make backtracking harder_

Yes, everybody does that. But _how many_ bytes you drop matters; over the
years the recommendations have gone from 256 bytes to 512 bytes to 768 bytes
to 1536 bytes to 3072 bytes as attacks have gotten better.
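The countermeasure being discussed is often called RC4-drop[n]; it can be sketched as follows (illustration only; the key `b"example key"` is made up, and RC4 itself is long obsolete):

```python
def rc4_drop(key: bytes, drop: int = 3072):
    """RC4 keystream generator that discards the first `drop` bytes,
    where the early-keystream biases are strongest."""
    # Key-scheduling algorithm (KSA)
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]

    # Pseudo-random generation algorithm (PRGA)
    def prga():
        i = j = 0
        while True:
            i = (i + 1) % 256
            j = (j + S[i]) % 256
            S[i], S[j] = S[j], S[i]
            yield S[(S[i] + S[j]) % 256]

    gen = prga()
    for _ in range(drop):
        next(gen)          # throw away the biased prefix
    return gen

ks = rc4_drop(b"example key", drop=3072)
first = bytes(next(ks) for _ in range(16))
```

The `drop` parameter is exactly the number under discussion: the biases get weaker the further into the stream you go, so recommendations for it kept growing as the attacks improved.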

------
devit
There are two threat models against code using RNGs:

1\. The adversary has an amount of computing power that is feasible as
currently foreseeable: in this case, all you need are K "truly random" bits,
where K = 128/256/512. You can then use a strong stream cipher or equivalent
to generate unlimited random bits, so you only need to block at boot to get
those K bits; you can even store them on disk from the previous boot, or have
them passed in from an external source at install time.

2\. The adversary has unlimited computing power: in this case, you need
hardware that can generate truly random bits and can only return randomness at
the rate the hardware gives you the bits

Now obviously if you are using the randomness to feed an algorithm that is
only strong against feasible computing power (i.e. all crypto except one-time
pads) then it doesn't make sense to require resistance against unlimited
computing power for the RNG.
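Point 1 above can be sketched as follows (using SHAKE-256 as a stand-in for the "strong stream cipher or equivalent"; the all-zero 32-byte seed is a placeholder, not real kernel output):

```python
import hashlib

# Once you have K truly random seed bits, an expandable-output function
# can generate an effectively unlimited stream that no *feasible*
# adversary can distinguish from random.
seed = bytes(32)  # placeholder for 256 truly random bits gathered at boot

def stream(seed: bytes, n: int, counter: int = 0) -> bytes:
    # Domain-separate by counter so separate requests get fresh output.
    return hashlib.shake_256(seed + counter.to_bytes(8, "big")).digest(n)

first_mb = stream(seed, 1024 * 1024, counter=0)
```

Against an adversary with *unlimited* computing power this fails, of course: they can enumerate all 2^256 seeds. That is the distinction the two threat models draw.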

So in practice /dev/urandom, /dev/random, getrandom(), etc. should all only
resist feasible computing power, and resisting unlimited computing power
should be a special interface that is never used by default, except by tools
that generate one-time pads or equivalent.

~~~
xyzzyz
> 2\. The adversary has unlimited computing power: in this case, you need
> hardware that can generate truly random bits and can only return randomness
> at the rate the hardware gives you the bits

What would you need those bits for in that case? Literally the only thing
that comes to my mind is generating one-time pads, as standard cryptography is
useless in such a scenario.

~~~
firethief
Game-theoretically, you want a source of random numbers when you need to make
a decision your adversary can't predict. Traditionally some cultures have
(accidentally?) used bird augury for this, but obviously that won't do when
you're up against Unlimited Computing Power, as birds are basically
deterministic.

------
pczy
This is the best explanation of this issue that I know of:
[https://www.2uo.de/myths-about-urandom](https://www.2uo.de/myths-about-
urandom)

------
brohee
That this happens right after Thomas Pornin ridiculed the blocking pool
([https://research.nccgroup.com/2019/12/19/on-linuxs-random-
nu...](https://research.nccgroup.com/2019/12/19/on-linuxs-random-number-
generation/), HN discussion
[https://news.ycombinator.com/item?id=21843081](https://news.ycombinator.com/item?id=21843081))
is obviously purely coincidental, right? Especially as it was read and
commented upon by Theodore Ts'o, who at last changed his mind...

~~~
tytso
Hardly; the first version of this patch series was from August 2019 (which is
before the brouhaha caused by ext4 getting optimized and causing boot-time
hangs for some combinations of hardware plus some versions of systemd/udev),
and the second version was from September 2019. In the second version, Andy
mentioned he wanted to make further changes, and so I waited for it to be
complete. I had also discussed making this change with Linus in Lisbon at the
Kernel Summit last year. So this was a very well considered change that had
been pending for a while, and it predates the whole getrandom boot hang issue
last September. I don't like making changes like this without careful
consideration.

The strongest argument in favor of not making this change was there are some
(misguided) PCI compliance labs which had interpreted the PCI spec as
requiring /dev/random, and changing /dev/random to work like getrandom(2)
might cause problems for some enterprise companies that need PCI compliance.
However, the counter-argument is that it wasn't clear that the PCI compliance
labs somehow thought that /dev/random was better than getrandom(2); it was
just as likely they were so clueless that they hadn't even heard about
getrandom(2). And if they were that clueless, they probably wouldn't notice
that /dev/random had changed.

If they really _did_ want TrueRandom (whatever the hell that means; can _you_
guarantee your equipment wasn't intercepted by the NSA while it was in transit
to the data center?) then the companies probably really should be using some
real hardware random number generator, since on some VMs with virtio-rng,
/dev/random on the guest was simply getting information piped from
/dev/urandom on the host system --- and apparently _that_ was Okey-Dokey with
the PCI labs. Derp derp derpity derp....

~~~
rrauenza
For anyone following along not familiar with all security acronyms, in this
context PCI is Payment Card Industry not Peripheral Component Interconnect.

I was confused for a bit since we're talking about the kernel...

------
JdeBP
Interestingly, this follows the systemd people back in 2018 changing its seed-
at-boot tool, systemd-random-seed, to write the machine-ID as the first 16
bytes of seed data to /dev/random at every seed write.

* [https://github.com/systemd/systemd/commit/8ba12aef045ba1a766...](https://github.com/systemd/systemd/commit/8ba12aef045ba1a766a73f535a114781dbb763c2)

* [https://www.freedesktop.org/software/systemd/man/systemd-ran...](https://www.freedesktop.org/software/systemd/man/systemd-random-seed.service.html)

* [http://jdebp.uk./Softwares/nosh/guide/commands/machine-id.xm...](http://jdebp.uk./Softwares/nosh/guide/commands/machine-id.xml)

------
zaarn
It's very amusing that the various kernel developers are bashing on GnuPG,
going as far as calling its behaviour a "misuse. Full stop."

PGP/GPG has certainly fallen out of favor.

~~~
grammarxcore
So is GnuPG bad because it reads directly from /dev/random instead of using an
interface like getrandom()? I'm naive enough to not know reading directly from
/dev/random is bad and would love to know more.

~~~
toast0
The getrandom() syscall is relatively new. Before it was available, you had
two choices.

Use a non-Linux OS with reasonable /dev/(u)random or use Linux with its
Sophie's choice:

/dev/random will give you something that's probably good, but will block for
good and bad reasons.

/dev/urandom will never block, including when the random system is totally
unseeded.

GnuPG could not use /dev/urandom, since there was no indication of seeding, so
it had to use /dev/random, which blocks until the system is seeded and also
whenever the entropy count of nebulous value was low. Most (all?) BSDs have
/dev/urandom behave the same as /dev/random: it blocks until seeded and then
never blocks again. This behavior is available in Linux with the getrandom()
syscall, but perhaps GnuPG hasn't been updated to use it? Also, there was some
discussion in the last few months of changing the behavior of that syscall,
which thankfully didn't happen, in favor of having the kernel generate some
best-effort entropy on demand in case there is a caller blocked on random with
an unseeded pool.

~~~
grammarxcore
So the issue is the block? I make a blocking call and another app attempts to
make a call during the block and will fail if it's not expecting to wait? Is
that (one of) the problem(s)?

Thanks for breaking that down for me!

~~~
toast0
So, if the random system hasn't been properly seeded, you do _need_ to block
if you're using the randomness for security; especially for long-term
security, e.g. long-lived keys.

The problem is, before this patch, Linux keeps track of an entropy estimate
for /dev/random, and if the estimate gets too low, read requests will block.
Each read reduces the estimate significantly, so something that does a lot of
reads makes it hard for other programs to do any reads in a reasonable amount
of time.

If you knew the system was seeded, you could use urandom instead, but there's
not a great way to know. Perhaps you could read from random the first time,
and urandom for future requests in the same process... but that only helps in
long-running processes; also, reading once from random and using it as a seed
for an in-process secure random generator works almost as well. The getrandom()
syscall is really the way forward, but you would need to keep the old logic
conditionally or accept loss of compatibility with older releases.

In summary, it's not really fair to say GnuPG is doing it wrong, when they
didn't have a way to do it right.
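The strategy outlined above can be sketched like this (a hypothetical fallback, not GnuPG's actual code):

```python
import os

_seeded = False  # remembered across calls within this process

def secure_random(n: int) -> bytes:
    """Prefer getrandom(2), which blocks only until the pool is first seeded;
    fall back to the device files on kernels/Pythons without it."""
    global _seeded
    try:
        return os.getrandom(n)          # Linux >= 3.17, Python >= 3.6
    except (AttributeError, OSError):
        if not _seeded:
            # Crude "wait until seeded": block once on /dev/random,
            # then trust /dev/urandom for the rest of the process.
            with open("/dev/random", "rb") as f:
                f.read(1)
            _seeded = True
        with open("/dev/urandom", "rb") as f:
            return f.read(n)

key = secure_random(32)
```

The fallback branch is exactly the "read from random the first time, urandom afterwards" compromise, with the limitation noted above: the `_seeded` flag only helps long-running processes.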

~~~
grammarxcore
Thanks! That makes sense. I appreciate you taking the time to break all that
down.

------
latchkey
Filed this one in 2011 and it got a lot of heated discussion...

[https://bugs.launchpad.net/bugs/706011](https://bugs.launchpad.net/bugs/706011)

~~~
jancsika
A professional response on a bug report seeks to narrow down the possible
source of a bug (if any) so it may be understood, tested, and addressed
properly.

The first response to start such a process is taligent's, in response #22.

A useful addition to that is #23 where Steven Ayre suggests opening that as a
separate bug that focuses solely on this issue.

I'm not sure what the purpose is for the other responses you received. They
seem to seek to use the breadth of your issue report to _widen_ the discussion
to maximally contentious security topics.

~~~
JdeBP
That's not a fair assessment of the other responses. Steve McIntyre's in #7,
for one example.

------
csours
I was looking at Java properties the other day and I thought to myself, "We
still need to set /dev/./urandom in 2019?"

~~~
ktpsns
This is a valid point -- most high-level programming languages offer some
kind of function that provides random numbers in a given interval, such as [0,1].
See also [https://stackoverflow.com/questions/2572366/how-to-use-
dev-r...](https://stackoverflow.com/questions/2572366/how-to-use-dev-random-
or-urandom-in-c)

This is even true for shell scripting, see for instance
[http://www.tldp.org/LDP/abs/html/randomvar.html](http://www.tldp.org/LDP/abs/html/randomvar.html)

------
saagarjha
Am I correct in my understanding that /dev/random will not block anymore and
behave similarly to /dev/urandom after it has been initialized? Or is there
still some inherent difference between the two?

~~~
hannob
The "after it has been initialized" is the inherent difference.

------
SignalsFromBob
Are hardware RNGs, such as the ones that plug into a USB port, of any value
when the RNG in Linux is good enough for generating GPG keys? I'm wondering
what the use case is for people that buy them.

------
walterbell
On the subject of TRNG, John Denker wrote a 2005 paper for using soundcard
data as a source of randomness,
[http://www.av8n.com/turbid/](http://www.av8n.com/turbid/)

 _> We discuss how to configure and use turbid, which is a Hardware Random
Number Generator (HRNG), also called a True Random Generator (TRNG). It is
suitable for a wide range of applications, from the simplest benign
applications to the most demanding high-stakes adversarial applications,
including cryptography and gaming. It relies on a combination of physical
process and cryptological algorithms, rather than either of those separately.
It harvests randomness from physical processes, and uses that randomness
efficiently. The hash saturation principle is used to distill the data, so
that the output is virtually 100% random for all practical purposes. This is
calculated based on physical properties of the inputs, not merely estimated by
looking at the statistics of the outputs. In contrast to a Pseudo-Random
Generator, it has no internal state to worry about. In particular, we describe
a low-cost high-performance implementation, using the computer’s audio I/O
system._

On randomness, [http://www.av8n.com/turbid/paper/turbid.htm#sec-raw-
randomne...](http://www.av8n.com/turbid/paper/turbid.htm#sec-raw-randomness)

 _> Understanding turbid requires some interdisciplinary skills. It requires
physics, analog electronics, and cryptography._

~~~
amelius
CPUs already have a physics-based random number generator.

[https://en.wikipedia.org/wiki/RDRAND](https://en.wikipedia.org/wiki/RDRAND)

~~~
saagarjha
Which has concerns that it may be backdoored by intelligence agencies:
[https://github.com/torvalds/linux/blob/6398b9fc818eea79dcd6e...](https://github.com/torvalds/linux/blob/6398b9fc818eea79dcd6e70f981ce782da22cdee/drivers/char/random.c#L1885)

~~~
Jasper_
What's stopping the NSA from inserting a backdoor that recognizes it's running
kernel randomness code and changes the results too? If you don't trust your
CPU, you can't trust anything it does. Expecting the backdoor to show up
solely in one instruction is hopelessly naive.

~~~
feanaro
Why does anyone even continue to bother arguing this?

There are ways of mixing RDRAND into the entropy pool safely and this can be
done easily. Why would you deliberately _choose to_ not mix RDRAND and use it
directly instead? You wouldn't. It makes no sense. Therefore, RDRAND should be
mixed into the pool, it _is_ being mixed into the pool and there is no more
reason to debate this.
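The mixing argument can be illustrated with a simple hash-based combiner (a sketch; the kernel's actual mixing function is different):

```python
import hashlib
import os

def mix(*sources: bytes) -> bytes:
    """Combine entropy sources. The result is unpredictable as long as
    *any one* input was unpredictable: an attacker must guess every input."""
    h = hashlib.sha256()
    for s in sources:
        h.update(len(s).to_bytes(4, "big"))  # length-prefix each input
        h.update(s)
    return h.digest()

rdrand_output = b"\x00" * 32   # worst case: fully attacker-controlled/known
timer_jitter = os.urandom(32)  # stand-in for other, independent entropy
pool = mix(rdrand_output, timer_jitter)
```

Even with `rdrand_output` fully known to the attacker, predicting `pool` still requires guessing `timer_jitter`, which is why stirring an untrusted source into the pool cannot make things worse.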

~~~
tytso
Yes, and Linux has done it for years. The problem is whether or not RDRAND
should be trusted, in the absence of sufficient estimated entropy, to unblock
the CRNG during the boot process. This is what CONFIG_RANDOM_TRUST_CPU and
the random.trust_cpu=on boot command line option are all about. Should RDRAND
be trusted in isolation? And I'm not going to answer that
for you; a cypherpunk and someone working at the NSA might have different
answers to that question. And it's fundamentally a social, not a technical
question.

~~~
throwaway2048
The blocking CRNG (besides the necessary early seeding) is an entirely
artificial problem, however.

------
commandersaki
I don't see why this is so difficult.

1\. Make a kernel config option that makes /dev/urandom block until the
entropy pool initialises.

2\. Make a dependent kernel config option so /dev/random is /dev/urandom.

There, done. Everyone can have their own choice of what security they want.

------
edoceo
This looks like it explains why syslog-ng hangs on boot? It's trying to
read from /dev/random. At least, it hangs until there is some randomness (I
have to mash the keyboard a bit).

~~~
beefhash
I am now somewhat curious why syslog-ng needs random bytes on boot.

------
emilfihlman
Changing getrandom would have been just idiotic and would break god damn
userspace. The man page documentation was extremely clear in the first place.

------
Erlich_Bachman
Summarizing the article, `cat /dev/random` will still work but will never
block, possibly returning random data based on less entropy than before. They
claim that in the modern situation there is already enough entropy in it, even
for secure key generation. There will seemingly still be a programmatic way
to get a random stream based on a predictable amount of entropy, but not
through reading this filesystem node.

~~~
gioele
> Summarizing the article, `cat /dev/random` will still work but will never
> block

`cat /dev/random` may still block, but only once per reboot. It may block if
it is called so early that not enough entropy has been gathered yet. Once
enough entropy has been gathered, it will never block again.

~~~
simias
As mentioned by the article that's already the default behaviour of
getrandom() and the BSDs have symlinked /dev/random to /dev/urandom for a long
time already.

I think this is a change for the best, in particular this bit sounds
completely true to my ears:

> Theodore Y. Ts'o, who is the maintainer of the Linux random-number
> subsystem, appears to have changed his mind along the way about the need for
> a blocking pool. He said that removing that pool would effectively get rid
> of the idea that Linux has a true random-number generator (TRNG), which "is
> not insane; this is what the *BSD's have always done". He, too, is concerned
> that providing a TRNG mechanism will just serve as an attractant for
> application developers. He also thinks that it is not really possible to
> guarantee a TRNG in the kernel, given all of the different types of hardware
> supported by Linux. Even making the facility only available to root will not
> solve the problem: Application programmers would give instructions requiring
> that their application be installed as root to be more secure, "because that
> way you can get access the _really_ good random numbers".

The number of times I've had to deal with security-related software and
scripts that insisted on sampling /dev/random and stalling for minutes at a
time...

~~~
JdeBP
A minor note:

* Only FreeBSD symbolically links, and it does it in the other direction. urandom is the symbolic link to random.

* OpenBSD has four distinct character device files: random, arandom, srandom, and urandom.

* NetBSD (as of 2019) has two distinct character device files: random and urandom. They have different semantics from each other. [https://netbsd.gw.com/cgi-bin/man-cgi?rnd+4+NetBSD-current](https://netbsd.gw.com/cgi-bin/man-cgi?rnd+4+NetBSD-current)

~~~
aquabeagle
On OpenBSD:

        $ ls -l /dev/*random*
        lrwxr-xr-x  1 root  wheel         7 Dec 10 15:05 /dev/random@ -> urandom
        crw-r--r--  1 root  wheel   45,   0 Jan  6 15:30 /dev/urandom

~~~
JdeBP
That must be a recent change.

        $ ls -F /dev/*random*
        /dev/arandom  /dev/random   /dev/srandom  /dev/urandom
        $

~~~
ben_bai
Deleted in 2017: [https://marc.info/?l=openbsd-
cvs&m=151069089605712&w=2](https://marc.info/?l=openbsd-
cvs&m=151069089605712&w=2) ("you can delete arandom and srandom")

Edit: better link

------
quotemstr
Not blocking under insufficient entropy does not suddenly make that entropy
available. Punting entropy collection to userspace doesn't magically allow for
DoS-free random number generation --- it just transforms, silently, a
condition of insufficient entropy into a subtle security vulnerability. It
feels like a form of reality denial, a bit like overcommit. The more time goes
by, the more I wish there were a unixlike built on robustness, determinacy,
and strict resource accounting.

~~~
Hendrikto
> Not blocking under insufficient entropy does not suddenly make that entropy
> available.

That’s why it is still blocking until it has been sufficiently initialized.
After it has gathered sufficient entropy, the pool’s entropy is _not_
exhausted by reading from it. /dev/random assumes that reading 64 bits from it
will decrease the entropy in its pool by 64 bits, which is nonsense.

Linux’s PRNG is based on cryptographically strong primitives, and reading
output from /dev/random does _not_ expose its internal state.

Your pointless rant just indicates that you do not really understand what’s
going on.

~~~
jerf
"/dev/random assumes that reading 64 bits from it will decrease the entropy in
its pool by 64 bits, which is nonsense."

To amplify Hendrikto's point, /dev/random is implemented to "believe" that if
it has 128 bits of randomness, and you get 128 bits from it, it now has 0 bits
of randomness in it. 0 bits of randomness means that you ought to now be able
to tell me exactly what the internal state of /dev/random is. I don't mean it
vaguely implies that in the English sense, I mean, that's what it
_mathematically means_. To have zero bits of randomness _is_ to be fully
determined. Yet this is clearly false. There is no known and likely no
feasible process to read all the "bits" out of /dev/random and tell me the
resulting internal state. Even if there was some process to be demonstrated,
it would still not necessarily result in a crack of any particular key, and it
would be on the order of a high-priority security bug, but nothing more. It's
not an "end of the world" scenario.

~~~
quotemstr
> There is no known and likely no feasible process to read all the "bits" out
> of /dev/random and tell me the resulting internal state

That's fine if you trust the PRNG. Linux used to at least attempt to provide a
source of _true_ randomness. You and Hendrikto are essentially asserting that
everyone ought to accept the PRNG output in lieu of true randomness. Given
various compromises in RNG primitives over the years, I'm not so sure it's a
good idea to completely close off the true entropy estimation to userspace. I
prefer punting that choice to applications, which can use urandom or random
today at their choice.

Maybe everyone _should_ be happy with the PRNG output. Ts'o goes further and
argues, however, that if you provide any mechanism to block on entropy (even
to root only), applications will block on it (due to a perception of
superiority), and so the interface must be removed from the kernel. I see this
change as an imposition of policy on userspace.

~~~
aidenn0
> That's fine if you trust the PRNG. Linux used to at least attempt to provide
> a source of true randomness. You and Hendrikto are essentially asserting
> that everyone ought to accept the PRNG output in lieu of true randomness.
> Given various compromises in RNG primitives over the years, I'm not so sure
> it's a good idea to completely close off the true entropy estimation to
> userspace. I prefer punting that choice to applications, which can use
> urandom or random today at their choice.

Linux never provided a source of true randomness through /dev/random. The
output of both /dev/random and /dev/urandom is from the same PRNG. The
difference is that /dev/random would provide an _estimate_ of the entropy that
was input to the PRNG, and if the estimate was larger than the number of bits
output, it would block.
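The old accounting described above can be modeled in a few lines (a toy model of the pre-change behavior, not the actual kernel code):

```python
# Toy model of the old /dev/random bookkeeping: an entropy *estimate* is
# credited on input and debited on output, and reads block when the
# estimate runs out -- even though the PRNG output never actually
# reveals the pool's internal state.
class BlockingPool:
    def __init__(self) -> None:
        self.entropy_bits = 0

    def credit(self, bits: int) -> None:
        """Input events raise the estimate."""
        self.entropy_bits += bits

    def read(self, nbytes: int) -> bool:
        """Return True if the read proceeds, False if it would block."""
        needed = nbytes * 8
        if self.entropy_bits < needed:
            return False              # old /dev/random: block here
        self.entropy_bits -= needed   # debit 1 bit per bit of output
        return True

pool = BlockingPool()
pool.credit(128)
assert pool.read(16) is True   # 128 bits estimated -> 16-byte read succeeds
assert pool.read(1) is False   # estimate exhausted -> would block
```

The "nonsense" both jerf and Hendrikto point at is the debit line: it treats every output bit as subtracting a bit from the pool, which would only be true if the output revealed the state.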

