
Web's random numbers are too weak, researchers warn - amouat
http://www.bbc.co.uk/news/technology-33839925
======
tptacek
This presentation makes very little sense to me.

It appears to revolve around the idea that Linux servers "produce" entropy at
an unexpectedly low rate over time, and "consume" entropy quickly.

But that's not how CSPRNGs work. A CSPRNG is a stream cipher, keyed by
whatever entropy is available to seed it at boot, and, for purposes of forward
secrecy, periodically (or continually) rekeyed by more entropy.

Just as for all intents and purposes AES-CTR never "runs out" of AES key, a
CSPRNG doesn't "run out of" or "deplete" entropy. The entire job of a CSPRNG,
like that of a stream cipher, is to take a very small key and stretch it out
into a gigantic keystream. That's a very solved problem in cryptography.
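
To make that concrete, a minimal sketch (assuming the pyca/cryptography
package, which is just one convenient choice) of stretching a single
256-bit seed into an arbitrarily long keystream with AES-CTR:

    # Encrypting zeros under AES-CTR yields the raw keystream: "random"
    # bytes on demand, all derived from one small key.
    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    seed = os.urandom(32)                 # 256-bit key, seeded once
    counter = b"\x00" * 16                # initial counter block
    enc = Cipher(algorithms.AES(seed), modes.CTR(counter)).encryptor()

    random_bytes = enc.update(b"\x00" * 4096)   # 4 KiB of keystream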

I am automatically wary of anyone who discusses "entropy depletion", even
more so when they appear to be selling "entropy generators".

~~~
JoshTriplett
So you're saying that Linux's /dev/random should go away, /dev/urandom or
getrandom is always what you want as long as the system has collected enough
entropy (even for key material), and Linux could stop "collecting entropy"
once it has enough to seed the CSPRNG?

~~~
witty_username
Correct. If the entropy is, say, 256 bits, then the attacker has to try 2^256
combinations. It's quite like encryption; you can encrypt a 1 TB file with a
256-bit key. (And indeed some stream ciphers like RC4 are just a PRNG XORed
with plaintext.)
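
As a toy illustration of that "PRNG XORed with plaintext" idea (SHA-256 in
counter mode stands in for the PRNG here; use a real stream cipher in
practice):

    # XORing the keystream in encrypts; XORing it in again decrypts.
    import hashlib

    def keystream(key: bytes, n: int) -> bytes:
        out = b""
        counter = 0
        while len(out) < n:
            out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
            counter += 1
        return out[:n]

    key = b"\x01" * 32
    msg = b"attack at dawn"
    ct = bytes(a ^ b for a, b in zip(msg, keystream(key, len(msg))))
    pt = bytes(a ^ b for a, b in zip(ct, keystream(key, len(msg))))
    assert pt == msg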

~~~
wolf550e
AES-CTR is also a CSPRNG output (the keystream) XORed with plaintext, and is
a better example because CTR is the recommended encryption mode (AES-GCM is
AES-CTR + GMAC and is what everyone recommends).

~~~
tptacek
Pedantic: there are better modes to use than GCM, for a couple reasons. GCM is
the most performant widely available AEAD though.

~~~
JoshTriplett
> there are better modes to use than GCM, for a couple reasons.

What are the better modes, and the reasons?

~~~
sdevlin
The authentication mode for GCM is sort of fragile. While nonce reuse is
always bad, it's particularly disastrous in GCM in that it immediately leaks
the authentication key. Similarly, using GCM with a truncated authentication
tag makes forgery easier than you'd expect and again leaks the authentication
key in the process.

GCM is also difficult to implement in software for the same reasons AES is:
the high-performance implementation strategies tend to rely on precomputed
tables. This puts memory pressure on servers that handle a large number of
keys concurrently. Table-based implementations also tend to expose cache-
timing side channels. Fortunately, modern Intel machines have instructions
(e.g. PCLMULQDQ) that aid implementations, though I'm not sure how widespread
their use is in practice.

To be very clear, GCM is still a fine choice, and much safer than composing
authentication and encryption yourself.

If you have access to it, NaCl's Secret Box is a good choice that avoids these
problems. Libsodium implements NaCl and is pretty widely available, I think.
OCB is also a good choice, though I haven't seen many implementations of this.

EDIT: For those interested, Niels Ferguson's criticism of GCM
([http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/comment...](http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/comments/CWC-GCM/Ferguson2.pdf))
is a great read. Lots of minor practical issues (e.g. specifying bit strings
rather than byte strings, performance measurement across platforms, etc.)
along with the aforementioned attack on short authentication tags.
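
For reference, a minimal AES-GCM sketch (using pyca/cryptography's AESGCM
helper as one example API); the critical discipline is a fresh 96-bit nonce
for every message under a given key:

    # AES-GCM with a random, never-reused nonce per message.
    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    key = AESGCM.generate_key(bit_length=256)
    nonce = os.urandom(12)                # unique per message under this key
    ct = AESGCM(key).encrypt(nonce, b"attack at dawn", None)
    pt = AESGCM(key).decrypt(nonce, ct, None)
    assert pt == b"attack at dawn"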

------
__Joker
The Black Hat whitepaper[1] and presentation[2] have more info, which might
whet the appetite of curious HNers.

[1] [https://www.blackhat.com/docs/us-15/materials/us-15-Potter-U...](https://www.blackhat.com/docs/us-15/materials/us-15-Potter-Understanding-And-Managing-Entropy-Usage-wp.pdf)

[2] [https://www.blackhat.com/docs/us-15/materials/us-15-Potter-U...](https://www.blackhat.com/docs/us-15/materials/us-15-Potter-Understanding-And-Managing-Entropy-Usage.pdf)

~~~
qrmn
Honestly, this sales brochure of a "paper" tastes even worse than the BBC
fluff piece. This is below the standard of paper I would have expected Black
Hat to accept.

Good CSPRNG design is not a "dark art", and entropy is not "consumed" when a
good CSPRNG is used. Any good CSPRNG uses a good PRF - any good block cipher
in CTR mode, a hash, or an HMAC, perhaps - to stretch one good, solid,
256-bit entropy seed into as much cryptographically secure random data as
you'll ever need over the lifetime of your cryptosystem, and ratchets forward
through the PRF after each call so the state cannot later be reversed
(without breaking the PRF, but you're using a good one, so you'll be fine).
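
A minimal sketch of that construction, with HMAC-SHA256 standing in for the
PRF (illustrative only, not a production CSPRNG):

    # Stretch one 256-bit seed via a PRF and ratchet the state forward
    # after every call, so old outputs can't be recovered from a later
    # state compromise.
    import hashlib, hmac, os

    state = os.urandom(32)                # one good 256-bit seed

    def random_bytes(n: int) -> bytes:
        global state
        out = b""
        counter = 0
        while len(out) < n:
            out += hmac.new(state, b"out" + counter.to_bytes(8, "big"),
                            hashlib.sha256).digest()
            counter += 1
        state = hmac.new(state, b"ratchet", hashlib.sha256).digest()
        return out[:n]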

You need _quality_ entropy to seed a CSPRNG - not quantity. Yes, as we all
know, it is very important that you don't try to use a CSPRNG before it has
collected enough good entropy for its initial seed - which is, yes, a
particular problem on embedded systems and headless servers - but after that,
the entropy in your CSPRNG seed isn't something that magically disappears, as
you'll see from the design of Linux's newest random-number API, getrandom,
patterned after the OpenBSD folks' ideas.

Reseeding a CSPRNG's state with more entropy is not a benefit, but in fact a
potential risk every time you do it: it can result in entropy injection
attacks if an attacker can observe the state and control some of the entropy.
That, in turn, could break your whole cryptosystem, especially if you're using
fragile primitives like DSA or ECDSA. One source:
[http://blog.cr.yp.to/20140205-entropy.html](http://blog.cr.yp.to/20140205-entropy.html)

Detuned ring oscillator TRNGs [p2] can be troublesome to protect from RF side-
channel attacks, or even RF injection attacks in pathological cases. Carefully
used, they are fine, but they're best combined with shot-noise-based TRNGs.
You can find those in surprising places: even the Raspberry Pi's
BCM2835/BCM2836 has a perfectly serviceable one, available from /dev/hwrng
after bcm2708-rng/bcm2835-rng has been loaded, and which rngd can use with no
trouble.

Forgive me if, therefore, I perhaps wouldn't like to buy a "quantum random
number generator" from Allied Minds Federal Innovations, Inc, who are behind
this "paper", or to replace the OpenSSL RNG with theirs. That all feels far
too much like a black box, and Dual_EC_DRBG is still fresh in our memory. I'd
rather use the one Alyssa Rowan described to me on a napkin, thanks, or
LibreSSL's/OpenBSD's arc4random with ChaCha20, or CTR_DRBG, or HMAC_DRBG.

~~~
bhickey
> You need _quality_ entropy to seed a CSPRNG - not _quantity_.

For the most part I think you're spot on, but I don't follow here. Entropy is
measured in bits, and bits are bits are bits. When we ask /dev/random for
256 bits it should return a 256-bit sequence, possibly after blocking. If that
sequence exhibits less than 256 bits of entropy, it just means that the pool
had a bad entropy estimate. Is there some nuance I'm missing?

~~~
qrmn
The point I'm making is that people should probably balk at the suggestion
that they need 200 Mbit/s of entropy from a mysterious black box on a PCIe
card sold to them by an NSA affiliate who wants them to put it into their
critical servers. No. Just... no. Don't do that.

256 good bits, once, is quite enough, as long as they are good. You might well
try to collect more entropy, and your CSPRNG's setup might use a compression
function (e.g. a cryptographic hash) to combine them into the seed to try to
hedge against failures. That's quite a good idea, as long as the last one you
sample is your most trusted (see djb's blog for why). But you don't need
megabits of entropy, and you don't need it on an ongoing basis. That task is
solved by the PRF.
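
For illustration, a sketch of that hedging idea, with SHA-256 as the
compression function (the particular sources listed are hypothetical):

    # Hash several entropy sources into one seed; any single good source
    # makes the combined seed good.
    import hashlib

    def combine(sources: list) -> bytes:
        h = hashlib.sha256()
        for s in sources:
            h.update(len(s).to_bytes(8, "big") + s)   # length-prefix each
        return h.digest()

    seed = combine([
        open("/dev/urandom", "rb").read(32),   # kernel pool
        open("/dev/hwrng", "rb").read(32),     # hardware RNG, if present
    ])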

So what you should perhaps be doing is not using /dev/random at all, but
using Linux's getrandom syscall in its default mode to get 256 bits to seed
your userspace CSPRNG instead. In that (urandom) mode it will block if the
kernel hasn't yet collected enough entropy, will never block thereafter, and
doesn't need a device node handy.
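
Concretely (Python exposes the syscall as os.getrandom on Linux):

    # Seed a userspace CSPRNG once from getrandom(2). In the default
    # (urandom) mode the call blocks only until the kernel pool has been
    # initialized at boot, and never again after that.
    import os

    seed = os.getrandom(32)               # 256 bits is plenty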

Even attempting to estimate entropy is perilous, so most modern CSPRNGs don't
try. (Note, by way of example, the difference between the earlier Yarrow and
the later Fortuna.)

~~~
pimlottc
So basically, entropy bits do get used up, but it's not the problem you should
be worrying about.

~~~
tptacek
In the sense you're thinking of, entropy bits do not get "used up". The
reason they're continuously refreshed is that something could theoretically
happen to your system that compromises your CSPRNG's internal state, and if
the CSPRNG weren't rekeyed you'd be permanently compromised.
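
A minimal sketch of that recovery property (hashing fresh entropy into the
old state stands in here for the kernel's actual mixing function):

    # Rekeying: fold freshly gathered entropy into the CSPRNG state, so a
    # past state compromise stops mattering once good entropy arrives.
    import hashlib, os

    state = b"\x00" * 32                  # pretend this leaked to an attacker
    fresh = os.urandom(32)                # newly gathered entropy
    state = hashlib.sha256(state + fresh).digest()   # attacker loses track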

------
rsy96
A cryptographic pseudorandom generator is defined as a deterministic function
that turns a truly random number (the seed) of bit length k into a stream of
seemingly random numbers of bit length n, where n is much larger than k. A
computationally bounded attacker cannot distinguish this stream of
pseudorandom numbers from truly random numbers unless he/she knows the seed.

So if the CSPRNG is truly cryptographically secure, you don't need a constant
stream of high-entropy input. You only need enough starting entropy, say 256
bits, and you will be fine for a long time.

------
pilsetnieks
Is this really the kind of article we want here?

> A study found shortcomings in the generation of the random numbers used to
> scramble or encrypt data.

> The hard-to-guess numbers are vital to many security measures that prevent
> data theft.

> But the sources of data that some computers call on to generate these
> numbers often run dry.

> This, they warned, could mean random numbers are more susceptible to well-
> known attacks that leave personal data vulnerable.

I get that you have to simplify for ordinary people, but this looks like
talking to a five-year-old.

~~~
colinbartlett
Funny, I came to the comments to remark about how nice it was to see an
article very clearly articulate the issue.

Yes, I generally understand these concepts, but I am not a security
professional and found the explanation useful both for my own comprehension
and for improving my ability to relay complex technical topics to non-
technical people.

~~~
pilsetnieks
I get that it's good to educate the general public, but first of all, we're
not exactly the general public here, and something even a little bit more
substantial would have been nice.

Second, this description is so vague it could apply to almost anything. When
the next OpenSSL vulnerability, or Android bug, or any other crypto weakness
appears, you could almost run the same article: "Web's secret numbers are too
weak. These numbers are vital to security. They prevent data theft. Your data
could be vulnerable." While true, it's pointless.

------
sarciszewski
Uh oh, is this another Debian bug or the underpinnings of a fundamental
weakness in Linux's CSPRNG?

I wonder if OpenBSD's arc4random_buf() is unaffected?

cc 'tptacek :)

~~~
kaesve
As far as I could see, it's not a problem in the CSPRNG itself, but in how it
is used. More specifically, it seems like a lot of applications use more
entropy bits per second than servers generate through normal use. I'd say this
is the result of not understanding how a CSPRNG works and how to use it
safely. Adding more or better sources of entropy to your systems would solve
this.

~~~
pedrocr
>I'd say this is the result of not understanding how CSPRNG works and how to
use it safely.

My take from previous discussions is that once you seed a CSPRNG properly you
can take secure random numbers from it forever. So on a Linux server, once
/dev/urandom has been properly seeded, you can take random numbers from it
forever with no issues.

So if what they discovered in this research is that "Linux's /dev/urandom
entropy pool often runs empty on servers", this shouldn't really be much of an
issue.

~~~
sarciszewski
So it's, once again, an issue where cloud servers aren't being seeded before
their keys are generated?

------
atoponce
This paper asserts something inaccurate: that entropy pools can be "used".
Entropy is not an object, it's an estimation. Just as you don't use
temperature, or barometric pressure, you don't "use" entropy.

Further, once a CSPRNG is properly seeded, there is no need to concern
yourself with whether or not it can produce "high quality random numbers",
provided the cryptographic primitive behind the CSPRNG contains conservative
security margins. The Linux kernel CSPRNG uses SHA-1 as the underlying
primitive. While SHA-1 is showing weaknesses in blind collision attacks, it
still contains wide security margins for preimage attacks, which is what would
need to be applied to a key generated by a CSPRNG (you have the output, now
predict the state that produced it). Even MD5 remains secure against preimage
and second preimage attacks.

Again, once properly seeded, the Linux CSPRNG can produce data
indistinguishable from true random indefinitely until SHA-1 is sufficiently
broken.

------
praseodym
I have a VM running Debian Jessie (Linux 3.16) that has very low entropy
available (cat /proc/sys/kernel/random/entropy_avail returns <200 most of the
time) even though the Intel RDRAND instruction is available. Shouldn't it be
using that to fill up the entropy pool, or am I misunderstanding how the
entropy_avail value works?

~~~
Freaky
Linux is very conservative with how much entropy it credits to things like
RDRAND since they can't be easily trusted. Looks like you get one extra bit
per interrupt:

[https://github.com/torvalds/linux/blob/master/drivers/char/r...](https://github.com/torvalds/linux/blob/master/drivers/char/random.c#L937)

You'll note no other use of arch_get_random_* throws anything at
credit_entropy_bits().

Linux 3.17 did bump up the assumed quality of virtio-rng:

[https://github.com/torvalds/linux/commit/34679ec7a0c45da8161...](https://github.com/torvalds/linux/commit/34679ec7a0c45da8161507e1f2e1f72749dfd85c)

------
mukyu
The talk about needing to constantly add more entropy or 'manage' it is
nonsense. djb says it best:
[http://blog.cr.yp.to/20140205-entropy.html](http://blog.cr.yp.to/20140205-entropy.html)

Briefly, once you have say 256 random bits it is trivial to use AES in CTR
mode and turn that into 2^71 random bits until you need to rekey. If you
cannot get more entropy in the time it takes to use up all of those numbers,
something is _completely_ broken. The only problem you can have is not having
enough entropy to bootstrap (such as in VMs, or needing to generate a key at
power-on on an embedded device), but this paper gives little more than lip
service to it.

------
im3w1l
How big a problem is this in practice? Let's say you only have 256 bits of
"real entropy" and you then stretch that into large amounts of pseudorandom
bits using a state-of-the-art CSPRNG, and use those bits for all your
randomness needs. Let's say worst-case scenario here, so a server that is
online for several years with no reseeding at all. Are there any practical
attacks against that?

~~~
marcosdumay
If you stretch it into a set of numbers with 256 bits or less, you are good.
If you expect to generate bigger random numbers from it, you have a problem.

But the pool does not stay at only 256 bits for long (if at all). It's
always accumulating more.

Anyway, if the pool ever gets to zero, it means that an attacker with
infinite resources who can see the entire sequence generated by the CSPRNG
could predict the next numbers it will generate. In practice, neither of
those conditions is met.

~~~
Tomte
So you don't trust modern block ciphers and avoid encrypting more than 128 (or
256) bits of data with a single key?

Isn't that extreme key rotation a bit bothersome?

No, stretching your seed of 256 bits into terabytes of pseudorandom numbers is
normal and absolutely fine.

------
mangeletti
It would be interesting to get Bruce Schneier's take on this.

He wrote this in '99, and it talks about key length near the bottom (though it
doesn't cover this exact scenario): [https://www.schneier.com/crypto-gram/archives/1999/0215.html](https://www.schneier.com/crypto-gram/archives/1999/0215.html)

