
How secure is Linux's random number generator? - hpaavola
http://lists.randombit.net/pipermail/cryptography/2013-July/004728.html
======
nullc
The annoying thing is that the Linux RNG is really limiting without something
like RdRand.

It used to be that most drivers contributed to the randomness pool, so it
seldom ran short. It used to be that you could configure the size of the pool,
so if you were running short you could make it larger. But then it was
discovered that the pool resizing had a locally exploitable vulnerability so
it was removed, leaving it always at the smallest value; and it was realized
that many driver sources weren't very random and/or were externally
controllable so most were removed.

The end effect is that much server hardware only gets about 50-100 bits per
second added to a pool of just 4096 bits, and /dev/random is constantly
running out, leading to weird performance problems (like ssh connections
taking a long time). This creates a desperate push to replace /dev/random
with something like RdRand, when RdRand could otherwise just be another
untrusted contributor if the rest of the system around /dev/random were sane.

~~~
raverbashing
Apparently, for some "security experts" it's damned if you do, damned if you
don't

If you don't use RdRand, then you have few sources of "true" randomness;
hence, your RNG is predictable and manipulable, you're an idiot, and a
5-year-old can break your crypto.

If you use RdRand, then "blah blah blah this is opaque"; hence, your RNG is
predictable and manipulable, you're an idiot, and a 5-year-old can break your
crypto.

Perfect solutions exist only in labs and my impression is that most of these
"experts" make things less secure.

~~~
nullc
Meh. Simply making the default pool larger would go a long way towards moving
systems out of a desperate situation. With that done there would be a lot less
reason to short circuit it and go RdRand only.

No one is concerned about RdRand as a contributing source; with other genuine
sources of randomness mixed in, RdRand isn't likely to be a back door.
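The "untrusted contributor" point can be made concrete: XOR-mixing any source with one independent, uniformly random source yields a uniform output, so a backdoored contributor can't make the result worse. A minimal sketch over 3-bit values (the kernel's real mixing function is far more elaborate; this only illustrates the principle):

```python
from collections import Counter

def mix(trusted: int, untrusted: int) -> int:
    """XOR-combine the outputs of two generators."""
    return trusted ^ untrusted

# For ANY fixed adversarial output b, XOR with an independent uniform
# value u is a bijection, so the mixed result stays uniform: the
# adversary gains nothing by controlling one contributor.
for b in range(8):  # every possible 3-bit "backdoored" output
    counts = Counter(mix(u, b) for u in range(8))  # u uniform over 3 bits
    assert all(counts[v] == 1 for v in range(8))   # still uniform
```

This is why mixing RdRand in is harmless as long as at least one other independent source is genuinely random; the danger is only in using it as the sole source.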

~~~
tytso
Making the pool larger isn't sufficient for embedded systems that don't have a
lot of sources of entropy in the first place. Especially since very often the
most critical secrets (such as the RSA keys for the certificates used by
network printers, for example) are generated when the embedded system is first
installed, where even if you have a larger pool, there isn't any opportunity
to fill it with the extremely limited amount of entropy available to said
device.

~~~
raverbashing
Yes, this is _very_ bad in embedded systems

As in your example, the only source of entropy a network printer has is
network data, which is easy to manipulate and may even be absent entirely. So
there's no good way to generate keys, for example.

In some cases hardware sources are a must. Yes, in the end you'll need to
trust them.

------
olympus
Just because something is closed source doesn't mean it's insecure. RdRand
meets various standards for RNGs and the dieharder tests don't show anything
of concern. While you can't be 100 percent sure of the reliability of RdRand
because you can't audit it, I feel safe trusting it for all but the most
critical of applications. Here's a blog post describing testing RdRand with
dieharder: [http://smackerelofopinion.blogspot.com/2012/10/intel-
rdrand-...](http://smackerelofopinion.blogspot.com/2012/10/intel-rdrand-
instruction-revisited.html)

~~~
oellegaard
You are right that closed source doesn't mean it's insecure; on the other
hand, with open source it is at least possible to verify security
independently. With new scandals coming up every week these days about hidden
backdoors in security software, I trust open source more than ever before.

~~~
tptacek
Ironically, it's particularly vis-à-vis cryptographic random number generation
where we can most easily show open source cryptography failing its users;
Debian fatally broke the OpenSSL CSPRNG so badly that attackers could remotely
brute force SSH keys.

~~~
tankenmate
Whereas with closed source you would _almost never_ know. Crypto is very hard
to do properly, but at least with open source you have the possibility of
independent third party analysis.

~~~
lmm
Wasn't the Debian vulnerability discovered because someone noticed that two
different servers had the same key? That would have gone down exactly the
same way with closed source.

------
dave1010uk
For those who aren't aware, the security of a random number generator is very
important:
[http://en.wikipedia.org/wiki/Random_number_generator_attack](http://en.wikipedia.org/wiki/Random_number_generator_attack)

~~~
surement
A personal favourite:
[https://news.ycombinator.com/item?id=639976](https://news.ycombinator.com/item?id=639976)

~~~
homeomorphic
That was a wonderful read. Thank you.

------
sejje
Am I the only guy who can't figure out how to navigate mailing lists archives?

These things are internet hell.

~~~
jevinskie
I agree. Try searching for the title on gmane.org

~~~
ReidZB
Here's the list on gmane:
[http://news.gmane.org/gmane.comp.security.cryptography.rando...](http://news.gmane.org/gmane.comp.security.cryptography.randombit)

Not sure how to link a particular article in that view. The 'direct link'
sends you to an article-only page. But the message by the OP appears as the
third top-level thread in that view.

~~~
secure
The link is
[http://thread.gmane.org/gmane.comp.security.cryptography.ran...](http://thread.gmane.org/gmane.comp.security.cryptography.randombit/4689)

You get to that by clicking on the subject in the bottom frame.

------
guns
And here is the mailing list thread that the author refers to:

[https://lkml.org/lkml/2011/7/29/366](https://lkml.org/lkml/2011/7/29/366)

~~~
semenko
There was a lot more follow-up later, see e.g.
[https://lkml.org/lkml/2012/7/5/422](https://lkml.org/lkml/2012/7/5/422)

The important commit here is:

[http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.g...](http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=c2557a303ab6712bb6e09447df828c557c710ac9)

Excerpted:

 _Change get_random_bytes() to not use the HW RNG, even if it is avaiable.

The reason for this is that the hw random number generator is fast (if it is
present), but it requires that we trust the hardware manufacturer to have not
put in a back door. (For example, an increasing counter encrypted by an AES
key known to the NSA.)

It's unlikely that Intel (for example) was paid off by the US Government to do
this, but it's impossible for them to prove otherwise \--- especially since
Bull Mountain is documented to use AES as a whitener. Hence, the output of an
evil, trojan-horse version of RDRAND is statistically indistinguishable from
an RDRAND implemented to the specifications claimed by Intel. Short of using a
tunnelling electronic microscope to reverse engineer an Ivy Bridge chip and
disassembling and analyzing the CPU microcode, there's no way for us to tell
for sure._
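The attack the commit message describes is easy to model: a keyed PRF applied to an increasing counter is statistically indistinguishable from true randomness, yet fully predictable to the keyholder. A toy sketch (using HMAC-SHA256 as a stand-in for the AES whitener so it runs with only the standard library; the key and class names are invented for illustration):

```python
import hmac, hashlib

SECRET_KEY = b"known-only-to-the-adversary"  # hypothetical backdoor key

class EvilRDRAND:
    """Toy model of the commit's scenario: an increasing counter
    run through a PRF keyed by the adversary. (The commit mentions
    AES; HMAC-SHA256 stands in here.)"""
    def __init__(self):
        self.counter = 0
    def rdrand64(self) -> bytes:
        out = hmac.new(SECRET_KEY, self.counter.to_bytes(8, "big"),
                       hashlib.sha256).digest()[:8]
        self.counter += 1
        return out

rng = EvilRDRAND()
outputs = [rng.rdrand64() for _ in range(4)]

# The keyholder reproduces every "random" value exactly...
predicted = [hmac.new(SECRET_KEY, i.to_bytes(8, "big"),
                      hashlib.sha256).digest()[:8] for i in range(4)]
assert outputs == predicted
# ...while to anyone without the key the stream is PRF output, so
# black-box statistical tests (dieharder included) can't catch it.
```

This is the precise sense in which "passes dieharder" says nothing about the absence of a backdoor.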

------
acqq
The best approach to have is IMHO here:

[http://en.wikipedia.org/wiki//dev/random](http://en.wikipedia.org/wiki//dev/random)

 _Gutterman, Pinkas, & Reinman in March 2006 published a detailed
cryptographic analysis of the Linux random number generator[5] in which they
describe several weaknesses. Perhaps the most severe issue they report is with
embedded or Live CD systems such as routers and diskless clients, for which
the bootup state is predictable and the available supply of entropy from the
environment may be limited. For a system with non-volatile memory, they
recommend saving some state from the RNG at shutdown so that it can be
included in the RNG state on the next reboot. In the case of a router for
which network traffic represents the primary available source of entropy, they
note that saving state across reboots "would require potential attackers to
either eavesdrop on all network traffic" from when the router is first put
into service, or obtain direct access to the router's internal state. This
issue, they note, is particularly critical in the case of a wireless router
whose network traffic can be captured from a distance, and which may be using
the RNG to generate keys for data encryption._

This shouldn't be a religious problem but an engineering one. If you manage to
keep some state between reboots and use it after the next reboot, you're
making it hard enough for anybody who doesn't have physical access to that
state. Then you can also use RdRand to mix into the output of the stream
based on your state, along with other sources of entropy if you have them. If
RdRand turns out to be suspicious, you're at least much better off than with
only hard-coded state.

Does anybody know if some kind of the state described here is used now?

~~~
caf
Yes - for example this is done by /etc/init.d/urandom on Debian and Ubuntu
systems.

~~~
acqq
My question was about /dev/random. The main problem RdRand solves is
quantity: obtaining a lot of random bits per second. Even if those bits were
produced in a way where somebody knows possible weaknesses, by mixing them
with something cryptographically strong where we control the seed we'd
preserve quite a high throughput. I know that there is /dev/urandom, which
can often be good enough, but I also know that too many applications in fact
prefer to use /dev/random, so making /dev/random robust makes sense.

I see Ted Ts'o commented too, and as I understand it, having RdRand is still
much better than having platforms without it. There's a lot more to care
about than whether RdRand is "perfect", and once you have something like
RdRand you can use it safely enough, compared to not having anything.

~~~
caf
It applies to /dev/random too - the same write() implementation is used
kernel-side for both devices so it doesn't matter which one you write to.

The seed that is saved at shutdown and reloaded at startup will alter the
internal state of the /dev/random pool, but it won't add to the entropy
estimate (which makes sense). This means that the output will be more robust,
but it could still block waiting for "real" entropy.
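Those semantics can be captured in a toy model (illustrative only; the kernel's real pool and accounting are more involved, and the `credit` path corresponds to the root-only RNDADDENTROPY ioctl):

```python
import hashlib

class ToyPool:
    """Toy model of the write-vs-credit distinction: writes to
    /dev/random or /dev/urandom mix into the pool state but do NOT
    credit the entropy estimate; only a privileged ioctl does that."""
    def __init__(self):
        self.state = b"\x00" * 32
        self.entropy_estimate = 0
    def write(self, data: bytes):
        # like write(2) on either device: mix, but credit nothing
        self.state = hashlib.sha256(self.state + data).digest()
    def credit(self, bits: int):
        # like RNDADDENTROPY (root only): mix is assumed done, add credit
        self.entropy_estimate += bits

pool = ToyPool()
before = pool.state
pool.write(b"seed saved at last shutdown")
assert pool.state != before        # internal state altered by the seed
assert pool.entropy_estimate == 0  # no credit: /dev/random may still block
```

So reloading a saved seed makes the output harder to predict, but by itself never unblocks a starved /dev/random, exactly as described above.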

------
denrober
Prior to Edward Snowden's whistleblowing, I think you could perceive the
maintainer as paranoid for leaving the project (see linked thread); now,
however, I think you can't discount what cooperation, if any, technology
companies have been providing to the NSA.

~~~
dfc
I am not sure I agree that before Snowden this could have been perceived as
paranoid. As part of the discussion on the crypto list Ben Laurie brings up an
important point:

 _" But what's the argument for _not_ mixing their probably-not-backdoored RNG
with other entropy?"_[1]

Does your answer to this really change that much "pre-Snowden"?

[1]
[http://lists.randombit.net/pipermail/cryptography/2013-July/...](http://lists.randombit.net/pipermail/cryptography/2013-July/004745.html)

------
tptacek
So this is logic that more or less rules out all hardware encryption,
including HSMs, right?

~~~
vilda
No. In fact it's a matter of trust.

You can trust Skype that calls are encrypted and cannot be eavesdropped, you
can trust Verizon that your cellphone metadata are not passed to government
automatically, and you can trust Intel that their rnd is not backdoored.

Or you don't.

~~~
tptacek
Help me understand how someone who believes rdrand might be backdoored could
trust any HSM?

~~~
chiph
You can't, if you're that serious/paranoid about it.

It's possible that the HSM maker wasn't approached by the NSA and is secure,
but there are very few of them in the US, so the chances of the NSA having
missed one are very low. Plus, without an STM to inspect the silicon and
reverse-engineer it, how would you know?

So what if you buy one made outside the US? Say, China. Well, there's the
obvious possibility that the Chinese authorities have backdoored the silicon.
But my guess is that the Chinese maker just cloned one of the US vendors,
including the portions inserted by the NSA...

------
benmmurphy
This is my favourite conspiracy theory: that the CPUs are backdoored. Just
assign a bunch of registers some special values and execute a specific
instruction, and the CPU will drop all memory protection. Take something like
Google's NaCl or a JavaScript JIT, where you have enough control over the
registers, and you have a permanent browser exploit.

~~~
marshray
The best backdoors are indistinguishable from dumb bugs when they're
discovered.

They'd look something like Debian's OpenSSL bug. But I believe that was _not_
an intentional backdoor.

------
tytso
There are several different ways in which randomness is used in the kernel.
One general class of randomness is things like randomizing the sequence
numbers and port numbers of new network connections. If you can predict the
result of this randomness, it becomes easier to carry out attacks such as
hijacking a TCP connection. (Note that if the active attacker controls the
path between the source and the destination, they'll be able to do this
regardless of the strength of the RNG; it just makes things easier if they
don't have 100% control of the routing.)

Another class of randomness is that which is used to randomize the layout of
shared libraries, stacks, etc. --- address space layout randomization (ASLR).
If someone is able to guess the randomness used by ASLR, then they will be
able to more easily exploit stack overrun attacks, since they won't need to
guess where the stack is, and where various bits of executable segments might
end up in the address space's layout.

Another case of randomness is to create crypto keys; either long-term keys
such as RSA/DSA keys, or symmetric session keys. If someone screws this up,
that's when the "bad guy" (in this case, people are worried about the NSA
being the bad guy) can get access to encrypted information.

It is only the first two use cases where we use RDRAND without doing any
further post-processing. These are cases where the failure of the RNG is not
catastrophic, and/or performance is extremely critical.

We do not use RDRAND without first mixing it with other bits of randomness
gathered in the system for anything that is emitted via /dev/random or
/dev/urandom, because we know that this is used for session keys and for long-
term RSA/DSA keys.

The bigger problem, and it's one that we worry about a huge amount, is the
embedded ARM use cases which do not have RDRAND, and for which there is
precious little randomness available when the system is first initialized.
And did I mention that long-term secrets such as SSH and X.509 keys tend to
be generated in things like printers and embedded/mobile devices when they
are first unwrapped and plugged in, which is exactly when the amount of
entropy gathered by the entropy pool is usually close to zero? What we
desperately need to do is to require that all such devices have a hardware
random number generator --- but the problem is that there are product managers
who are trying to shave fractions of a penny off of the BOM cost, and those
folks are clueless about the difference between cost and value as far as high-
quality random number generators are concerned.

What if the RNG has been compromised by the NSA? Well, that's where you need
to mix in other sources of randomness into the entropy pool. The password used
by the user when he or she first logs into an android device, for example.
Screen digitizer input from the user while they are first going through the
setup process. In the case of a consumer grade wireless router, it could sniff
the network for a while and use packet inter-arrival times and mix that into
the entropy pool. Yes, someone who is on the home network at that time will
know those numbers, but hopefully someone who is in a position to spy on
those numbers isn't also going to have access, at the same time, to the
super-secret NSA key used to gimmick the RDRAND instruction (assuming it is
gimmicked, which is effectively impossible for us to prove or disprove). But
then again, your wireless router isn't going to have access to unencrypted
plaintext which is critical --- if you're sending anything out over your
wireless network without encrypting it first, I would hope that you would
consider it completely bare and exposed!
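The packet-timing idea can be sketched as hash-based mixing over inter-arrival deltas (a simplification; the kernel's actual mixing function is different, and the function name here is made up):

```python
import hashlib

def mix_timings_into_pool(pool, arrival_times_ns):
    """Fold packet inter-arrival deltas into a hash-chained pool.
    An observer who captured the traffic knows the deltas; one who
    didn't faces a pool state they cannot reconstruct."""
    for prev, cur in zip(arrival_times_ns, arrival_times_ns[1:]):
        delta = (cur - prev).to_bytes(8, "big", signed=True)
        pool = hashlib.sha256(pool + delta).digest()
    return pool

pool = mix_timings_into_pool(b"\x00" * 32, [100, 250, 900, 1337])
assert pool != b"\x00" * 32  # state now depends on the observed timings
```

The security argument is exactly the one above: each additional independent input can only add to what an attacker must have observed, never subtract.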

If you are super paranoid, you'll need to find a hardware random generator
which you've built yourself --- and hopefully you are competent enough to
actually build a real HWRNG, and not something which is sampling 60 Hz hum (or
50 Hz hum if you are in Europe :-), and mix that into the entropy pool as
well. In that case, even if the Intel RDRAND is compromised six ways from
Sunday, the NSA won't have access to the output from the HWRNG --- and if it
turns out you were incompetent and your HWRNG is bogus, at least RDRAND is
also getting mixed into the entropy pool.

And if I were in China, I'd use a hardware chip built in China for the RNG,
and combine that with an Intel chip. That way even if the HWRNG chip is
compromised by the MSS, and even if RDRAND is compromised by the NSA, the
combination is hopefully stronger since presumably (hopefully!) it's unlikely
that the MSS and the NSA are collaborating with each other at that deep a
level. Ultimately, of course, if you don't trust Intel, you don't trust the
silicon fab, etc., then you'll have to build your own computer from scratch,
write your own compiler from scratch, etc.

(MIT CS undergrads used to have all of that knowledge, starting with building
a computer out of TTL chips and how to build a Scheme interpreter from machine
code, etc. But not any more, alas. Now they learn Python and it's assumed that
it's impossible to understand the entire software stack, let alone the entire
hardware stack, so you don't even try to teach it. But that's another
rant....)

~~~
gngeal
 _If you are super paranoid, you'll need to find a hardware random generator
which you've built yourself --- and hopefully you are competent enough to
actually build a real HWRNG, and not something which is sampling 60 Hz hum (or
50 Hz hum if you are in Europe :-), and mix that into the entropy pool as
well._

What do you propose as a low cost solution? I've seen some interesting
suggestions, such as having a small fish bowl or tube or tank, having air
pumped into the bottom, and sampling the patterns of bubbles with some CV
solution. Sounds geeky, would make for a nice decoration on one's table, but
building it seems like quite a job.

~~~
rational_indian
I think a zener diode based solution will suffice in a pinch
[http://en.wikipedia.org/wiki/Noise_generator](http://en.wikipedia.org/wiki/Noise_generator)

~~~
gngeal
Hmm. Haven't thought of that. Comparatively simple, yet also compact at the
same time. Thanks, nice!

~~~
rational_indian
Solid state FTW! :)

------
DanBC
What's the suggested attack here?

That Intel is cooperating with the TLAs and providing a weak on-chip random
number generator? Or a random number generator that can be made to be weak? Or
what?

And how credible is the risk when that information is used to seed a pool of
entropy, rather than being used raw?

------
teawithcarl
A longer excerpt from the same email list:

[http://cryptome.org/2013/07/intel-bed-
nsa.htm](http://cryptome.org/2013/07/intel-bed-nsa.htm)

------
marshray
[https://lkml.org/lkml/2011/7/31/139](https://lkml.org/lkml/2011/7/31/139)
_Since there was a minor amount of confusion I want to clarify: RDRAND
architecturally has weaker security guarantees than the documented interface
for /dev/random, so we can't just replace all users of extract_entropy() with
RDRAND._

I still don't get it.

~~~
wtallis
What don't you get? RDRAND is an interface to a non-blocking PRNG backed by a
HWRNG. You can't directly get the output of the HWRNG. Intel uses the PRNG to
condition the output of the HWRNG, but it will still give you numbers if the
HWRNG is having trouble (HWRNG errors can be detected, but the RDRAND
instruction itself doesn't trap on HWRNG failures). If you trust Intel's PRNG
sufficiently, then you can use RDRAND directly for /dev/urandom, but it takes
a lot more trust to use it for /dev/random.
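That non-blocking interface can be modeled in miniature: the instruction reports success via a flag (the carry flag, for real RDRAND) instead of blocking, and Intel's guidance is a bounded retry loop. A toy sketch (the class and its failure model are invented for illustration):

```python
import os, random

class ToyDRBG:
    """Toy model of RDRAND's documented interface: a fast DRBG
    conditioning a hardware source, which signals transient
    underflow via a flag rather than blocking."""
    def __init__(self, failure_rate=0.3):
        self.failure_rate = failure_rate
    def rdrand(self):
        if random.random() < self.failure_rate:
            return False, 0  # CF=0: no data available this time
        return True, int.from_bytes(os.urandom(8), "big")  # CF=1: valid

def rdrand_retry(rng, retries=10):
    """Intel's recommended usage pattern: retry a bounded number
    of times, then treat persistent failure as a hard error."""
    for _ in range(retries):
        ok, value = rng.rdrand()
        if ok:
            return value
    raise RuntimeError("RDRAND persistently failed")
```

Usage: `rdrand_retry(ToyDRBG())` returns a 64-bit value or raises after ten failed pulls, mirroring the polling loop in Intel's sample code.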

~~~
marshray
It's documented to have the potential to not return random data, so in that
sense it's blocking, like Linux's /dev/random. Sample code shows some kind of
polling loop, IIRC.

------
cambecc
So... taking this line of reasoning to its logical conclusion, if you don't
trust RDRAND, then you should also not trust _any_ of the hardware the OS runs
on. I imagine there would be much easier ways for Intel to implement backdoors
to the system than through the non-deterministic random number generator.

------
pronoiac
What does it take to reverse engineer the silicon? I thought I'd seen a
project for automating it, but I can't find it.

~~~
nullc
Even reversing the silicon won't likely help— and, uhh. Reversing a state of
the art CPU is not do-at-home stuff.

The reason it won't help is that the design is _explicitly_ microcoded. E.g.
RDRAND triggers running loadable microcode which is supposed to read the real
RNG and AES it. Maybe there is an unrelated "bug" that allows that microcode
to be corrupted after some particular instruction sequence happens. Your
investigation would just turn up everything looking normal.

~~~
pronoiac
It looks like the microcode is _also_ encrypted. But perhaps _that encryption_
could be reverse engineered from silicon? The Silicon Zoo tutorial noted that
Pentium I-era chips were "easily viewable" [1], probably with optical
microscopes. So perhaps some parts of some newer Intel processors can be done
at home. So, the "plan of attack" (ha!):

* decap an Intel CPU and scan it

* decode the microcode encryption

* figure out how the hardware RNG works with the microcode (it's AES? ok.)

* and then analyze the system of microcode and hardware for robustness and security.

Yeah, this is hand-wavey and probably _incredibly_ implausible. But it seems
like an interesting and challenging project or three.

[1] [http://siliconzoo.org/tutorial.html](http://siliconzoo.org/tutorial.html)

------
Qantourisc
Lame answer, I know, but: recompile the kernel (or patch it) to drop this
crappy Intel HW support then? And IIRC the Linux pseudo-random generator was
quite good. The only problem is exhausting the entropy pool.

~~~
__alexs
I believe you can add 'nordrand' to your boot flags to turn off the kernel's
usage of it.

------
bobbyi_settv
Further down the thread:

> Not to mention, Intel have been in bed with the NSA for the longest time.

> Secret areas on the chip, pop instructions, microcode and all that ...

What does "pop instructions" refer to here?

~~~
nitrogen
AIUI the story goes like this: for a long time the NSA required all CPU
vendors to provide a "popcount" instruction (to count the number of one bits
in a register) in any hardware contract. The NSA was buying a lot of Intel
processors, but Intel CPUs lacked a documented popcount instruction until
very recently. So there was speculation that an undocumented opcode
functioned as a popcount instruction in older Intel CPUs (perhaps after
modifying the CPU microcode), and from there people speculate that there may
be other undocumented instructions and CPU features. Or so the story goes.

------
jMyles
Wow. So not very?

------
ivanbrussik
ugh had no idea intel was built into the core

i distrust

