“We cannot trust” Intel and Via’s chip-based crypto, FreeBSD developers say (arstechnica.com)
348 points by robin_reala on Dec 10, 2013 | 176 comments

It was always the right answer to feed all available entropy sources (irrespective of previous laundering -- eg. Intel RDRAND gets laundered through the SP800-90 AES-CTR_DRBG internally) into a decent CSPRNG. Feeding multiple entropy sources of different qualities, speeds or backdooredness cannot (by construction) decrease the entropy of the output (it can, obviously, fail to increase it -- say if your ring oscillator got stuck in a fixed bit pattern like the Taiwanese smartcards).
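As a rough illustration of feeding several sources of differing quality into one pool, here is a minimal Python sketch. It assumes a plain SHA-256 hash as the mixing function, which is a stand-in and not what Yarrow, Fortuna, or any kernel actually does; `mix_sources` and the three sample inputs are hypothetical names chosen for the example.

```python
import hashlib
import os
import time


def mix_sources(*sources: bytes) -> bytes:
    """Fold several entropy inputs into one 32-byte seed with SHA-256."""
    h = hashlib.sha256()
    for chunk in sources:
        # Length-prefix each input so that (b"ab", b"c") and (b"a", b"bc")
        # cannot collapse into the same hash input.
        h.update(len(chunk).to_bytes(8, "big"))
        h.update(chunk)
    return h.digest()


good = os.urandom(32)                      # stand-in for a trusted source
stuck = b"\x00" * 32                       # stand-in for a stuck oscillator
weak = time.time_ns().to_bytes(8, "big")   # low-entropy timing input

# The stuck source adds nothing, but it does not remove what `good` contributed.
seed = mix_sources(good, stuck, weak)
```

The seed would then key a CSPRNG; the useless input costs a little time and nothing else.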

Yarrow and Fortuna are examples of decent CSPRNGs, so I'd say this is a pretty good move by FreeBSD.

If the RNG instructions are malicious, they can reduce entropy, by for example XOR'ing into the result previous register values that held the previous entropy source values.

That doesn't reduce entropy, it just fails to increase it.

It might reduce it, if it XORs away entropy that was previously XOR'd into it.

You're talking about entropy pools as if they just XOR'd their inputs, which is demonstrably false.

Isn't a common way of combining entropy sources just XOR'ing them together?

Aren't you talking about the Linux kernel? There the very last step is XOR'ing the random output with the hardware RNG. The article explicitly mentions Yarrow as a mixing and CSPRNG function for FreeBSD though.

Sorry, that's not how entropy works with XOR.

If you do not know one of the inputs to an XOR, you know nothing about the output.

XORing random sources is great because it acts as a minimum function for entropy of all of the sources. As long as you have just one good source, the output is good.
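The claim can be checked exactly on a toy scale. The sketch below assumes the two sources are statistically independent, which is precisely the assumption the replies go on to question. It computes the exact output distribution of XORing a badly biased 3-bit source with a uniform one; the distributions and the name `xor_distribution` are made up for illustration.

```python
from collections import Counter
from fractions import Fraction

BITS = 3
VALUES = range(2 ** BITS)


def xor_distribution(dist_a, dist_b):
    """Exact distribution of a ^ b for independent a and b."""
    out = Counter()
    for a, pa in dist_a.items():
        for b, pb in dist_b.items():
            out[a ^ b] += pa * pb
    return dict(out)


# A heavily biased source: it outputs 0 seven times out of eight.
biased = {0: Fraction(7, 8), 5: Fraction(1, 8)}
# One good source: uniform over all 3-bit values.
uniform = {v: Fraction(1, 2 ** BITS) for v in VALUES}

mixed = xor_distribution(biased, uniform)
# Every 3-bit value comes out with probability exactly 1/8.
```

The result is uniform no matter how bad the biased source is, but only because the good source is independent of it.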

> If you do not know one of the inputs to an XOR

Aren't we talking about potential backdoors built in to the CPU? As in, having full access to all the registers?

The processor could recognize that it is in common key generation code, and generate 'entropy' that precisely reverses most of the existing entropy. The existing 'good' entropy could be known by the hardware PRNG.

The processor can also just write whatever it wants into the memory containing the entropy pool even if you never call the RDRAND instruction.


Good call on this. If you can't trust the CPU you're running on, you can't trust anything at all. The proper solution is to find a supplier you trust and go with them.

Pulling useful entropy away from the OS's RNG functions is the best example of an ignorant knee-jerk reaction to this security problem.

I don't see where the OS's RNG functions are being deprived of such entropy as the hardware RNG provides; it looks more like the hw RNG is being used as one of many inputs to the OS RNG instead of as a completely trusted substitute for the OS RNG.

It's not checkmate. It's a completely unreasonable threat model. How would the processor know where the entropy pool used by the operating system is?

How would the processor even know it's performing crypto operations that it should swap numbers for?

Presumably using the same skynet tech it uses to look ahead and see where the rdrand is going to be xored into. I'm less familiar with precisely how linux does it, but it's not as simple as "newsecret = oldsecret ^ rdrand". The bits are scattered all to hell.

Building a "where will this rdrand go?" backdoor is harder than building a backdoor that just trawls through load addresses for various common kernels looking for the symbol table so it can poison the entropy directly.

Do you realize how much of a performance hit modern desktops would take if a processor had to freeze the operating system while it wandered through memory, first identifying the operating system and then finding the entropy pool and modifying it?

>Presumably using the same skynet tech it uses to look ahead and see where the rdrand is going to be xored into.

There is no such tech. That's why this is not 'checkmate'. A poisoned random number generator that produces numbers in a predictable manner is orders of magnitude easier to implement and harder to detect than a magical processor that changes memory it thinks might be entropy for some operating systems it has been pre-programmed to look for, under the assumption that the kernel will never change. Get real.

I think you have misinterpreted my comments as arguments for the backdoor that I am attempting to dismiss.

Sure, but that's detectable. gefh's attack is not.

How so? Has anyone ever pulled the RAM sticks out of their computer to check that the entropy pool looks like they expect it to? Did they then verify that the L3 cache on the CPU had the same contents? Or did they simply ask the suspected backdoor CPU if the very memory it was just suspected of tampering with was clean?

If you're going to open Pandora's box, I don't think you get to pick and choose what hypothetical backdoors to take out. It's particularly odd to only select the subset of backdoors that you can easily defend against.

Except that if the CPU completely masked its changes it would be no threat. The trick is getting the system to use the bad randomness. I find it unlikely that none of the people debugging or running slightly different kernels, drivers, or rootkits would notice something. To exploit RDRAND you would not have to worry about what all the code in the system is doing (highly volatile over different configurations and versions) but you would just need to monitor a few select kernel symbols.

>The processor could recognize that it is in common key generation code

This is very unrealistic. It would mean that processors have to be shipped with every crypto library implementation baked into the hardware. Any update would immediately break it.

>The existing 'good' entropy could be known by the hardware PRNG.

That's a different threat model that doesn't exist here. These devices don't have free access to wander around main memory.

As I understand it when you read from /dev/random it checks a measure of entropy and blocks if there isn't sufficient entropy available.

Do we trust hardware RNGs enough to allow their output to increase that measure of entropy?

/dev/random is not blocking on FreeBSD.

Even if it was, like Linux's weird RNG, the security distinction is not particularly meaningful. Developers should generally use urandom.
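For concreteness, this is roughly what "use urandom" looks like from application code. It is a minimal sketch in Python, assuming the standard-library `os.urandom` and `secrets` interfaces, both of which draw from the operating system's CSPRNG; the variable names are illustrative.

```python
import os
import secrets

# 256-bit key material straight from the OS CSPRNG.
key = os.urandom(32)

# The secrets module wraps the same source with convenience helpers.
token = secrets.token_hex(16)   # 16 random bytes rendered as 32 hex characters
```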

Unfortunately that isn't the case. Java's JCE provider for its random will read from /dev/random. If you happen to generate a lot of keys in quick succession (such as unit tests for a product) you will run out of entropy really fast, and tests will seem to take hours...

Which really sucks because then you run something like haveged on the server running the unit tests because you can't have a build take hours because Linux takes forever to gather new entropy on headless servers :-(

Which is why you want to avoid /dev/random and just use urandom. :)

Would love to, but that's not easy to do in Java when you want to use the rest of the JCE as normal.

You are wrong. Developers should use random if it's for something security sensitive like generating keys. random will block when entropy is too low, urandom will just continue recycling entropy, which is cryptographically dangerous.

I'm sorry, but this meme is flat-out wrong.

/dev/urandom gives the exact same quality randomness as /dev/random (let's ignore the issue of boot-time and VM cloning for now).

There is a slight twist to it, where for information-theoretically secure algorithms /dev/random would be preferable, but you don't need that. Because you don't use those algorithms (the only one really worth mentioning is Shamir's secret sharing scheme).

I'm just amazed how people don't trust the cryptographic building blocks inside a modern CSPRNG, but then use the very same building blocks to use the randomness and encrypt their secrets.

A PRNG must be seeded. CSPRNGs are no different, right?

In a normal PRNG, if you want X different possible outputs, you must be able to seed it with at least X different seeds. Since each seed corresponds to an output sequence, you need at least as many values for a seed as you wish output sequences. Of course, this seed should be random, and you can't really use a PRNG to seed a PRNG.

How do CSPRNGs get around this? I assume that if I have a CSPRNG, I must seed it, and that I must draw that seed from a pool of seeds at least as big as the desired set of output streams. (See above.) If my intent is to generate 4096 random bits (say, for an encryption seed), to me it seems I must input a random seed at least that long. Thus, I need a good RNG.
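The counting argument in the question can be seen on a toy scale. The sketch below uses Python's `random.Random` (a Mersenne Twister, not a CSPRNG) purely to illustrate that a generator seeded from a k-bit seed space can emit at most 2^k distinct streams, however long each stream is; `SEED_BITS` and the stream length are arbitrary choices for the example.

```python
import random

SEED_BITS = 8  # deliberately tiny seed space for illustration

streams = set()
for seed in range(2 ** SEED_BITS):
    rng = random.Random(seed)
    # 64 bytes (512 bits) of output per seed.
    streams.add(bytes(rng.getrandbits(8) for _ in range(64)))

# 512-bit outputs, yet there can never be more than 2**8 different ones.
distinct = len(streams)
```

Whether that bound matters in practice depends on whether an attacker can search the seed space, which is the point the replies take up.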

Take a look at Wikipedia's definition[1], for example, of what a CSPRNG must do (as opposed to just any old PRNG):

• Every CSPRNG should satisfy the next-bit test.

• Every CSPRNG should withstand "state compromise extensions". In the event that part or all of its state has been revealed (or guessed correctly), it should be impossible to reconstruct the stream of random numbers prior to the revelation. Additionally, if there is an entropy input while running, it should be infeasible to use knowledge of the input's state to predict future conditions of the CSPRNG state.

Let's assume our CSPRNG of choice satisfies that. The problem is that the second one only applies to "preceding bits". If I know the state of the CSPRNG, I can predict future output. If Linux is low on entropy, or runs out, does this not diminish the number of possible inputs or seeds to the CSPRNG, allowing me to guess, or narrow down my guesses, to the state/seed of the CSPRNG, perhaps prior to it generating output?

Am I going wrong somewhere?

[1]: http://en.wikipedia.org/wiki/Cryptographically_secure_pseudo...

Good points.

First, I'm not saying that cryptographic randomness can be created out of thin air, without entropy, I just argue that you don't need n bits of real entropy to get n bits of high-quality randomness.

I mean if you really needed 4096 bits of random seed to generate 4096 bits of randomness, why not just take the 4096 bits you waste on the seed as randomness?

Of course you need a random seed. That's what I was alluding to with the boot-time or VM remark.

But you're not really interested in lots and lots of potential output sequences, one of them is enough. Remember, the first requirement of a block cipher is that it is indistinguishable (to a computational polynomial bound) from a random distribution.
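To make "a short seed, a long output" concrete, here is a minimal sketch that stretches 256 bits of seed into 4096 bits of output. It swaps in SHA-256 over `seed || counter` where a real design would typically run a block cipher in counter mode, because Python's standard library has no block cipher; `expand` is a hypothetical helper, not a vetted DRBG.

```python
import hashlib
import os


def expand(seed: bytes, n_bytes: int) -> bytes:
    """Stretch a short seed into n_bytes by hashing seed || counter."""
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n_bytes])


seed = os.urandom(32)        # 256 bits of real entropy
stream = expand(seed, 512)   # 4096 bits of output derived from it
```

The output is only as unpredictable as the seed, but an attacker who cannot guess 256 bits cannot tell the 4096 bits from random without breaking the hash.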

The real counter-argument is the state attack. And that's mitigated by a modern RNG's design. Fortuna, for example, constantly mixes incoming entropy into outputs that occur far in the future (technically, it reseeds every now and then, but without estimating entropy). This does not protect you from a total state compromise, a computer is deterministic, after all, but it's quite hard to argue with a straight face that such a total compromise matters, because everything you might want to use the randomness for would most certainly be quite as compromised, as well.
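The "mixes incoming entropy into outputs that occur far in the future" part can be sketched as a pool schedule. The code below is an assumption-laden toy modeled on the published Fortuna description (32 pools, pool i used when 2^i divides the reseed counter); it omits the generator, the minimum pool size, and the reseed rate limit, and `PoolSketch` is a hypothetical name.

```python
import hashlib

NUM_POOLS = 32


class PoolSketch:
    """Toy model of a Fortuna-style pool schedule (no generator attached)."""

    def __init__(self):
        self.pools = [hashlib.sha256() for _ in range(NUM_POOLS)]
        self.next_pool = 0
        self.reseed_count = 0
        self.key = bytes(32)

    def add_event(self, data: bytes) -> None:
        # Spread incoming events across the pools round-robin.
        self.pools[self.next_pool].update(data)
        self.next_pool = (self.next_pool + 1) % NUM_POOLS

    def reseed(self):
        # Pool i takes part only when 2**i divides the reseed counter, so
        # higher pools are drained rarely and accumulate for a long time.
        self.reseed_count += 1
        used = [i for i in range(NUM_POOLS) if self.reseed_count % (2 ** i) == 0]
        h = hashlib.sha256(self.key)
        for i in used:
            h.update(self.pools[i].digest())
            self.pools[i] = hashlib.sha256()
        self.key = h.digest()
        return used


sketch = PoolSketch()
for n in range(64):
    sketch.add_event(n.to_bytes(4, "big"))

# Pools used on reseeds 1..4: [0], [0, 1], [0], [0, 1, 2]
schedule = [sketch.reseed() for _ in range(4)]
```

No entropy estimate appears anywhere: an attacker who knows the state now is eventually locked out once some slow pool has gathered enough unknown input.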

So why take that (probably insignificant) risk?

Because the alternative is worse. If you want to have Linux's current /dev/random behavior, you have two things:

First, it blocks when there's not enough entropy in the pool. That's bad. Just google for "sshd hangs". Either your system doesn't work anymore, or people find creative ways to circumvent all the crypto stuff to make it work again. Just for the far-fetched fear about this total state compromise?

Second, how much entropy do you have? Lots and lots has been written about it, but despite all this technobabble ("'entropy', there must be hard physics behind it"), estimating entropy is not really an exact science. More guesswork. So you never know how much entropy you really have. That's why Fortuna got rid of all that estimating that its predecessor Yarrow still did.

I'm dubious that there's any practical impact to using urandom with Shamir.

Yeah, I see it the same way. But at least you can put forward a highly theoretical argument there that just doesn't work with all the crypto algorithms we actually use.

Especially since: what are you doing with ssss? Usually splitting a private key to a non-information-theoretically secure block cipher, I guess. So you're back to square one.

I phrased that part poorly.

Oh, sorry, no, everything you said made sense. I'm hazarding a guess that we read the same Thomas Pornin Stack Exchange comment. :)

I think you're right. I probably channeled him inadvertently.

I've collected lots of links to articles and man pages and so on about the issue, because I have been planning to write a coherent article about urandom vs. random.

And I'm pretty sure I remember what posting you mean.

Unfortunately, writing this article gets postponed again and again, just as finishing and sending in the second set of your crypto challenges... maybe before Christmas I'm going to find a few hours to do that.

Unfortunately, this statement is only true if your entropy sources are uncorrelated. If your new entropy source is correlated in precisely the right way to cancel with a previous entropy source, you reduce entropy of the result.

With a cryptographic primitive like Yarrow finding this precisely right way should be infeasible, in the same way that finding a collision for a cryptographic hash function should be infeasible.
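The difference between the two positions shows up in a few lines. This sketch assumes a hypothetical malicious source that can read the honest input and echo it back; under a bare XOR combiner the two cancel, while under a hash combiner (SHA-256 here as a stand-in for a real mixing function) the output still depends on the honest input.

```python
import hashlib
import os

honest = os.urandom(16)   # entropy already collected from a good source

# Hypothetical malicious source that sees `honest` and returns a copy of it.
malicious = honest

# Naive XOR combiner: the inputs cancel and the result is all zero bytes.
xor_mixed = bytes(a ^ b for a, b in zip(honest, malicious))

# Hash combiner: the digest is still a function of the unknown `honest`.
hash_mixed = hashlib.sha256(honest + malicious).digest()
```

To zero out the hash combiner the attacker would have to steer the digest to a chosen value, which is the infeasibility the reply is appealing to.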

Fortuna supersedes Yarrow for all intents and purposes.

Fortuna has the advantage of making it harder to game entropy pools. [1]

[1] http://th.informatik.uni-mannheim.de/people/lucks/papers/Fer...

I've been working with this project for some time:


Quite a bit of entropy using radio noise and a $15 RTL-SDR USB dongle. Still could use some work and review but seems like the start to an almost ideal solution.

I wonder how hard it is to build something like that using the mic input. Or perhaps measuring the noise on a spare 802.11 card. I know in general it is not easy to add a kernel facility that will be widely available, but how hard is it to build a physical RNG if your specific case demands high entropy and you do not want/need a commercial product?

There is already a project whose name sadly escapes me that does this, using information theory to derive lower bounds on the number of bits of entropy in data read from a line-in jack.

EDIT: I'm thinking of Turbid: http://www.av8n.com/turbid/paper/turbid.htm

EDIT2: There's also a wealth of information in this thread from the metzdowd cryptography mailing list: http://www.metzdowd.com/pipermail/cryptography/2013-Septembe...

is it VanHeusden's audio_entropyd? [1]

They also provide timer- and video-based entropy daemons.

[1] http://www.vanheusden.com/aed/

Hmm I've been having similar concerns about the Windows rand_s function. Every Windows application including every modern browser relies on rand_s for secure random number generation but obviously the function is completely closed source. Seems like a perfect target for the NSA.

Closed source isn't a significant obstacle to understanding code, and Microsoft's code is the most comprehensively reverse-engineered in the industry. It would be extremely stupid to hide a backdoor in binaries that Microsoft ships.

Calling attention to this: disassembly is a thing. It is entirely possible to reconstruct a program's design and logic from the disassembled code. This is what software crackers and antivirus writers do on a regular basis. For an enlightening write-up, see the Symantec paper(s) on Stuxnet.

"This is what software crackers and antivirus writers do on a regular basis. For an enlightening write-up, see the Symantec paper(s) on Stuxnet."

Well, not really.

I was both a software cracker and a virus writer myself (just for fun; we never distributed our viruses, cracks, keygens and stuff outside).

I was part of a larger group doing it. I had contacts with people that became famous for breaking several important protections, especially for games.

You have no idea how big software is. A million lines of code is impossible to read by any human being in his entire life. Yes, there are automatic tools and great disassemblers like IDA pro, but even if you were to see the entire source code, it is very easy to hide some flaw into the code.

If you add undocumented hardware into the equation, then it is very hard to disassemble without significant resources.

This hardware took tens of millions of dollars to develop and hundreds of very smart people to design, with all the documentation. It will take at least an order of magnitude more to decode without that info.

I don't know when you were a virus writer (the heyday of the virus writers I knew was the mid-1990s, and the virus writers I was aware of post-2000 had more knowledge of WinAPI than of x86), but: the state of the art for disassembly and reverse engineering has progressed dramatically in the past 10 years.

In particular, reverse engineering is no longer the specialty of virus authors the way it used to be; a totally mainstream application of reverse engineering, practiced by most major security companies, is reverse engineering patches to find the underlying security flaws they fix so they can be weaponized.

It is easy to hide flaws in any code. Assembly doesn't make it much easier.

Can you expand on or point to a write up on reverse engineering patches?

Do you mean that reversers try to locate, say, the buffer overflow that was fixed and try to find another way to exploit it? Why would major companies want to do this?

The earliest talk I ever saw on automated binary reverse engineering was this one: http://www.blackhat.com/presentations/win-usa-04/bh-win-04-f...

I actually didn't see the BH talk, but I saw a similar talk that Halvar gave at CanSecWest shortly after.

The gist of it is, imagine that you have a binary that you are looking to find vulnerabilities in to exploit. You can go through all the trouble of discovering a vulnerability, and then hope it doesn't get patched; or you can sit and wait for a patch for said binary. There's reams of data out there about how long it takes for systems to apply patches, but in general, you can find vulnerable versions of patched software long after the patch has been released.

Using binary differential analysis you are basically zeroing in on the parts that were changed (which you can imagine is a much smaller subset of the overall binary) and find the vulnerability much more quickly.

There are tools (there is/was a product called Bindiff that I don't know if Zynamics still sells after they got bought by Google), which help you do this in a more automated fashion.
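The core idea of patch diffing fits in a toy. The sketch below assumes functions have already been extracted and matched by name, which is the hard part that real tools solve by comparing control-flow structure; the byte strings are made-up stand-ins for function bodies and `changed_functions` is a hypothetical helper.

```python
import hashlib


def changed_functions(old: dict, new: dict) -> list:
    """Names of functions present in both builds whose bytes differ."""
    def digest(body: bytes) -> str:
        return hashlib.sha256(body).hexdigest()

    return sorted(
        name for name in old.keys() & new.keys()
        if digest(old[name]) != digest(new[name])
    )


# Made-up byte strings standing in for function bodies from two builds.
old_build = {
    "parse_header": b"\x55\x89\xe5\x83\xec\x10",
    "checksum": b"\x31\xc0\xc3",
}
new_build = {
    "parse_header": b"\x55\x89\xe5\x83\xec\x20",
    "checksum": b"\x31\xc0\xc3",
}

# Only the patched function needs a human's attention.
changed = changed_functions(old_build, new_build)
```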

That means that with much less work, you can write up a working exploit that will still work on some decent percentage of the install base for the application (until everyone patches it).

Additionally, you can imagine that a lot of times when vulnerabilities get fixed, they aren't necessarily fixed with the utmost rigor. There's a lot of cases where an individual vulnerability might be fixed, but if you look at what was done, you can find other parts of the binary that are vulnerable to the same underlying flaw. Knowing what gets changed in the patch can tell you a lot about underlying issues.

So they can build tools that test for those vulnerabilities, or detect attempts to exploit them and block them on the network.

Think of Windows. You and I both know that not all machines running windows are up-to-date with their security patches. Reverse engineering a patch for a 0-day exploit could give an attacker an idea on how to compromise un-patched machines. With all the un-patched windows xp machines in the world, you could probably build your own bot net if you're smart enough :)

Right, but it's harder to detect subtle manipulation. It only takes a one bit error to reduce the key space by half.

It is hard to detect subtle manipulation period. But assembly language is an extremely concrete representation of a program; it is not an especially great way to write underhanded code.

It's possible to write obfuscated assembly, of course, but it sticks out as obfuscated.

To further get your head around the lack of cover compiled code would give a backdoor, it's worth knowing that Hopper will give you serviceable C-ish code from x86_64 for... let me check here... fifty nine bucks.

(It's possible that my only message with that last graf is "Hopper is great").

How easy is it to detect off-by-one errors looking at assembly?

Assembly language is not an extremely concrete representation of a program if the hardware is engineered with backdoors, which is exactly what this article suggests is happening.

If some specific set of registers is set a certain way, followed by a very specific series of commands, there is no practical way to prove that the hardware is doing what it should.

You are talking about a wildly different threat model than this little section of the thread is talking about.

Meanwhile: if you can't trust the hardware, you can't trust C code either.

I'm aware that programs can be disassembled and I have written x86 assembly myself. If closed source is not a significant barrier to completely understanding rand_s then I ask for one reference to someone who has completely audited the Windows 7/Windows 8 version of rand_s, and CryptGenRandom that rand_s calls.

If you look on the related Wikipedia page you find that there was an audit done of the Windows 2000 version of CryptGenRandom (which rand_s uses), and it was discovered to be flawed in a number of ways. The follow up is that Microsoft said that the issues were fixed in Vista. It's more than a theory that rand_s vulnerabilities will be called "bugs". It already happened once.


Unless the backdoor was so incredibly subtle and hard to spot that it could be hand-waved away as a bug should it turn up.

It is actually easier to perpetrate that kind of subtlety in higher-level languages than in assembly.

It can't be, because a valid way to detect such underhandedness in a higher-level language is to compile it to assembly and check that.

MS shares information with intelligence agencies and usually before MS has even delivered a patch.

On top of it, when we analyze government malware, we don't see some weird calls to rand_s. We just see a few zero-days, exploits for known issues, or just plain-jane trojans.

Why compromise rand_s, which could help our enemies and cause a public outcry, when you're sitting on a mountain of zero days?

Also, windows is not closed source. MS shares source with governments, universities, etc. I think functions like rand_s are well understood and there has never been an MS backdoor for the government.

Microsoft shares all vulnerabilities with NSA as soon as they find out about them, and long before they fix them, anyway, which is like giving them access to a lot of backdoors into Windows. Adding one more from rand_s doesn't change the situation that much. Then there's TPM and all that other fun stuff. Until Microsoft overhauls their Windows security policies, you can never really trust Windows to keep you safe, even with the latest updates.

It sure is, but how would they do that practically? I'm not saying it's impossible, but I really wonder how they would approach this. Calling Ballmer and telling him to slip in a new CRT with the next Windows update providing a targeted version seems like a long shot. And even if it works they would only target a (possibly small) subset of all machines, since there's enough of them that aren't updated and there's enough software out there that ships with its own copy of msvcrxx.dll

Have people talked about using sensor device input as prng seeds? onboard microphone, fan speed jitter, etc?

Yes. People have even experimented with using CPU specific instabilities in certain operations to extract entropy.


The problem with such systems is that it is generally quite difficult to have confidence that they are not subject to attacks that may starve the system of entropy or trick the entropy estimator into thinking it has any when it has none.

Low-level hardware RNGs can be constructed in ways that make them quite difficult to attack externally, but are also basically impossible for us to verify.

Really it seems that the best approach is to take a wide range of flexible entropy sources and to learn to trust our mixing pools and trapdoor functions.

> The problem with such systems is that is generally quite difficult to have confidence that they are not subject to attacks that may starve the system of entropy or trick the entropy estimator into thinking it has any when it has none.

That's why you give up on estimating entropy. It's just not possible in the general case.

Mix multiple sources of potential entropy in a secure fashion (e.g. with a cryptographically secure hash function). Recover over time from a compromise.

> Really it seems that the best approach is to take a wide range of flexible entropy sources and to learn to trust our mixing pools and trapdoor functions.

Yup, you're exactly right. Now we just need to convince Ted Ts'o to replace Linux's hacky /dev/random with a Fortuna-based one.

You need some degree of entropy estimation at initialisation time or early boot numbers are predictable.

That seems to me the right kind of solution.

Hardware devices from big companies are categorically untrustworthy, because of these companies' history of cooperating with surveillance, their incentives to sell out users, and the inability of users to verify what's going on in the devices.

The answer is something that's verifiable - and that means precisely a no-code, low-tech source of entropy, plus a means of translating fluctuations into bits, the latter being "open" enough to enable users to confirm exactly what it does.

And thanks, it's funny to think of putting a microphone in a colo server and using the roar of all the nearby fans and occasional voices as a source of random bits.

Yes, Linux already does it iirc.

You have to install a utility to do it, but yeah. The problem is ambient noise isn't that random in, say, a data centre. It works better in a noisy environment like an office where you have people doing fairly random things rather than a whole bunch of oscillating fans.

I'd wager machine room fan noise has at least as much entropy, since office sounds are pretty structured and usually mostly quiet. In any case there's a significant amount of noise from the a/d conversion even when recording a predictable sound.

White noise is an enormous source of entropy, by definition. For example, take a stream of samples and record whether the least significant bit is 0 or 1. That'd be almost impossible to predict or influence from outside the system.
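The least-significant-bit idea can be sketched as follows. The samples here are simulated with a seeded PRNG purely so the example runs anywhere; a real setup would read them from a sound card, and the raw bits would still be hashed before use, since nothing guarantees they are unbiased. `lsb_bytes` is a hypothetical helper.

```python
import hashlib
import random


def lsb_bytes(samples):
    """Pack the least significant bit of each sample into bytes."""
    bits = [s & 1 for s in samples]
    out = bytearray()
    # Drop any trailing bits that do not fill a whole byte.
    for i in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


# Simulated signed 16-bit samples (NOT real audio, just placeholder data).
sim = random.Random(1234)
samples = [sim.randrange(-32768, 32768) for _ in range(1024)]

raw = lsb_bytes(samples)             # 1024 samples -> 128 bytes of raw bits
seed = hashlib.sha256(raw).digest()  # condition the raw bits before use
```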

What utility is it?

On Ubuntu you can install the 'randomsound' package. There used to be a different one I used but I can't remember the name.

I don't want Linux sampling my microphone randomly. Other things are fair game.

I don't recall who did it, but someone once pointed a webcam at a row of several lava lamps as a source of entropy. As long as at least 75% of them were on at any given time, the image-hashing function produced unpredictably random numbers. The low quality of the camera produced some random noise all by itself.

I imagine you could do something similar with a feed of traffic on a nearby highway or blobs of cornstarch suspension on a subwoofer.

Yes, but I suspect the NSA has had the foresight to backdoor most major brands of lava lamp. Same problem as before, you can't easily verify the hardware.

SGI's Lavarand [1].

[1] http://en.wikipedia.org/wiki/Lavarand

unexpectedly amused by the fact that in this case, worse hardware is better due to extra noise.

That was SGI.

I wonder if it's practical considering bandwidth. Is it enough for most practical purposes?

This coming from a BSD distribution that ships binary blob device drivers.

You seem surprised that they acted where they had the ability to do so but don't where the only option involves losing users.

This is the difference between engineering and idealism.

Ah so all of a sudden running a verifiable system is considered idealism.

> Ah so all of a sudden running a verifiable system is considered idealism.

Did you build your own CPU? Write your own firmware? Audit hardware and firmware for device with access to system memory[1]? Write your own kernel? Verify your compiler[2]? Audit every line of code for everything you run? If not, you're deciding the particular areas where you choose to harbor the illusion that you're seriously verifying something.

Fundamentally at some point you have to trust your hardware vendors unless you have unlimited resources to audit everything. Open source is just part of that picture and, like everything else in security, doing it professionally requires you to be pragmatic by balancing absolute security in a particular area against your users’ ability to actually do what they care about.

If FreeBSD did not ship a binary driver the overwhelmingly more likely outcome would be fewer people using FreeBSD. If you care about FreeBSD, as the developers presumably do, you want as many users as possible to improve the odds of being taken seriously when you try to negotiate better support with a vendor. Consider how much trouble OLPC had with WiFi firmware – that gives you a bottom range estimate for the number of units sold which has to be on the line before a hesitant vendor will consider opening something.

[1] http://md.hudora.de/presentations/#firewire-pacsec

[2] http://cm.bell-labs.com/who/ken/trust.html

> Did you build your own CPU?

We're playing the let's jump to absurd extremes game, are we? Ok, there's no point in any of it - nothing can be trusted - I'll go and install windows.

Yes, and?

The point was obviously that it's surprising to see that they are happy to ship binary drivers (which could contain arbitrary compromises from a user's point of view) but are not willing to trust the HWRNG (to which the same caveat applies).

Of course, I don't agree with this comparison; running the output of the HWRNG through Yarrow is good practice in any case.

What's the problem with the comparison? The only real advantage of the binary blob over the hardware is that you can theoretically verify what it's doing more easily. But actually doing that verification has a difficulty on the same order as reverse engineering the driver to release an open source version, so all the existence of the binary driver does is to imply that such verification hasn't occurred -- because if it had then those doing the verification would have been in a good position to release source code (or at least complete hardware documentation) for an open source driver.

Binary blob software also has the distinct disadvantage that even if you were to verify it, you would have to verify each release again whenever there is an update.

From my perspective, binary drivers are a matter of choice: you can use them, but you don't have to if you don't want to. Random number generation, on the other hand, is used everywhere in the OS, and you can't just decide not to use it. That's a huge difference, which is what makes this comparison bad.

Can't you though? We're really talking about defaults here. If you don't want to use the hardware RNG, you can always disable it in either the BIOS or the OS. Or use hardware that doesn't have one. The trouble is for the common user who doesn't know anything about any of this and is just going to take the defaults for everything, who is exactly the sort of person who is going to take the binary driver to "just make it work" and then be exposed to any vulnerabilities it may contain.

You have to make the defaults secure because they're what most people will use. It doesn't matter that you can change it if most people never do.

The comparison is ludicrous. Non-obfuscated binaries, often with debug symbols, are much easier to reverse-engineer and analyze than hardware.

Additionally, you can analyze binary diffs, much like source code. You would not start from scratch at each new version.

Let's also not forget how hard it is to verify source code, and how very rarely it is done.

You're right in an academic sense - they're completely comparable.

In reality, the difference is that we can guarantee 100% of systems to work without the hardware RNG. Every binary blob we remove removes support for hardware and 'breaks' the OS for a subset of the users.

No one* is going to fire up an OS and go "Hey, this OS doesn't use my hardware RNG! This OS is broken!"

They will, however, fire it up and go "Hey, this OS doesn't support my network adapter! This OS is broken!"

Security versus functionality. It's always a trade-off. The most secure system is the one that does nothing - everything past there is adding functionality at the expense of an increased attack surface.

* To within a margin of error.

I don't know if they're that much different. If you turn off the hardware RNG then generating random numbers becomes significantly slower. Maybe most people don't care, but I expect most people also don't care if their network card doesn't do TOE as a result of you shipping an open source driver that isn't feature complete rather than a binary blob driver that is.

The flaw in your argument is the assumption that the drawback of using a software RNG instead of a hardware RNG is inherently less than the drawback of using an open source driver instead of a proprietary binary blob driver. There may exist specific circumstances where this is actually the case (as is obvious if there is no open source driver whatsoever), but that isn't the immutable state of the world. If you take this kind of security to be a Serious Issue then it demands that engineering effort (or community pressure) be directed toward the goal of having published source code for all hardware drivers. One of the ways you can do that is by refusing to ship binary blobs (as Debian does), which creates some trouble for Debian, but also creates some trouble for the hardware vendors who now have customers avoiding their hardware because it involves more trouble than a competitor's hardware.

The typical response to this is the argument that not enough people run these operating systems to make the hardware manufacturers care, so you do more harm to yourself than you do to them. But that's ignoring the demographics. The primary constituency of Unix-like operating systems is the likes of IT professionals, systems administrators and software developers. These are the people who buy computer hardware by the pallet. Hardware manufacturers do not want those people to notice them in a bad way.

This is incidentally why the state of open source GPU drivers is so much worse than the state of e.g. open source network drivers. When the local BOFH goes to buy a thousand rack units of Debian servers, he generally cares a lot more about the NIC than whether it will do 3D hardware acceleration. But that's going to change soon enough as GPGPU gets popular in the datacenter.

Sorry, I was unclear. It's an apt comparison in that sense (it's unverified). However, there's a zero-impact workaround for this (using the CSPRNG on top of the HWRNG) so it has no user impact, whereas eliminating binary blobs does.

The difference between HW RNG and binary blobs is that the computer loses no functionality without an HW RNG, although it might suffer slightly in performance. Pragmatically, this is a huge difference.

Where are the Intel and Via random instructions supposed to be getting their entropy?

Edit: thanks for the interesting replies!

In theory? Quantum and thermal effects inside an unstable configuration of transistors.

The simple example is a basic SR latch (two NOR gates, where the output of one gate feeds one of the inputs of the other, and vice versa), where you start things off by applying a signal to both S and R. When you remove the signals, the latch will eventually fall back into one of the two stable states - but which state it ends up in is random.

So you can easily produce a stream of bits from a potentially-biased random source, and then do some deterministic massaging of that stream to produce an unbiased stream of random bits.
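One classic form of that "deterministic massaging" is the von Neumann extractor. The sketch below is only an illustration: the latch is simulated with a seeded software generator, the 80% bias is a made-up number, and the trick assumes successive bits are independent with a constant bias, which real hardware may not guarantee. It is not a description of what Intel or Via actually do.

```python
import random

def biased_bits(n, p_one, rng):
    """Simulate n reads of a latch that settles to 1 with probability p_one."""
    return [1 if rng.random() < p_one else 0 for _ in range(n)]

def von_neumann(bits):
    """Debias by pairs: emit the first bit of each 01 or 10 pair, drop 00 and 11."""
    out = []
    for a, b in zip(bits[0::2], bits[1::2]):
        if a != b:
            out.append(a)
    return out

# Fixed seed so the demo is repeatable; this is a hypothetical stand-in for hardware.
rng = random.Random(2013)
raw = biased_bits(200_000, 0.8, rng)
clean = von_neumann(raw)

raw_ones = sum(raw) / len(raw)
clean_ones = sum(clean) / len(clean)
print(f"raw: {raw_ones:.3f} ones; debiased: {clean_ones:.3f} ones; "
      f"kept {len(clean)} of {len(raw)} bits")
```

The price is throughput: most of the raw bits are thrown away, and the worse the bias, the fewer survive.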

Intel is at least supposed to be using something like a "coin-flip" circuit, which has two stable states and can be forced one way or the other based on thermal noise:


That output stream may be biased, so it subsequently goes through a "whitening" stage based on AES.

(One question I haven't seen an answer to: presuming all the hardware functions as described, would it be possible for a microcode update to change the output --- e.g., by whitening the output of the real-time clock instead of what I'm calling the coin-flip circuit?)

Usually, electric components with unpredictable timing or behaviour are used to generate randomness. Sometimes they intentionally use noisy or unshielded components to use the noise for randomness.

For some physical phenomena that can be exploited for getting entropy from a chip or other hardware, have a look at http://en.wikipedia.org/wiki/Hardware_random_number_generato...

Sounds like a good move to me. Intel has been awfully quiet about this.

In theory, Intel could be under some kind of directive or warrant to do this and it would be illegal for them to even disclose its existence, let alone protest against it.

I don't know whether Intel the corporation has said anything, but the responsible hardware engineer has personally stated there is no backdoor. http://lists.randombit.net/pipermail/cryptography/2013-Septe...

If I had an application where it mattered, I would pick RDRAND over a software system that was not reviewed, built, verified, and continuously monitored by someone I trust. It's implausible that RDRAND on my hardware is implemented differently from the hundreds of thousands of other copies of the core, or differently from the way it was yesterday. I have no such confidence in /dev/random. That doesn't mean there's anything wrong with FreeBSD's RNG; it only means that it's harder to be sure I'm actually using it.

Not to accuse Intel, because I have no idea whether or not they have backdoors, but do you want them to say: "Yes, we intentionally sold faulty products that put all your customer data at risk, and you at risk of a lawsuit for not protecting your customers' data."

Not so good for the share price, methinks.

Makes me wonder, though... if the government forces Intel to sell everyone defective products, does that violate our 4th Amendment rights? Do I, as an individual, have standing to sue the federal government because they intentionally weakened a product I purchased, potentially exposing me to other kinds of threats?

What about someone who has actually been victimized by an exploit of a backdoor that was ordered installed by the government? Would they have standing to sue for damages that resulted?

I only ask because I know "standing" has been an important issue in many of the cases related to these abuses. Most of the suits have failed because the plaintiffs were found to lack standing for one reason or another.

No. In general, you may only sue the federal government when the federal government gives you permission to sue them.


You can file a lawsuit saying your rights were violated (the ACLU has done it multiple times). You can also sue under The Federal Tort Claims Act ("FTCA"). The Federal Government is not immune from rights violations.

Especially when Snowden outs them :-)

I used to work for an online gaming company (legal in the UK), and they basically used the on-server chips (as opposed to a quantum RNG), which was not unusual in the industry. That may lead to weaker randomness and so to an exploit for scamming. The only validation test was to simulate a few million rolls of the dice and see if the graph came out right.

Is that a good validation test though? I'd have thought that if I programmed it to have a 50/50 chance that it would repeat the previous roll the graph would look about the same. i.e. half a million rolls look about the same as a million.

Edit: which is to say that I thought the issue wasn't that the Intel/via chip's random number generators wouldn't look random in aggregate over lots of uses, but that people are concerned that there could be exploitable patterns in short sequences.
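A quick simulation of that thought experiment, assuming a six-sided die and a hypothetical "sticky" generator that repeats its previous roll half the time. The function names and the seed are made up for illustration. The histogram test passes for both generators, while a simple check on consecutive rolls tells them apart.

```python
import random
from collections import Counter

def fair_rolls(n, rng):
    return [rng.randint(1, 6) for _ in range(n)]

def sticky_rolls(n, rng):
    """With probability 1/2 repeat the previous roll, otherwise roll fairly."""
    rolls = [rng.randint(1, 6)]
    while len(rolls) < n:
        if rng.random() < 0.5:
            rolls.append(rolls[-1])
        else:
            rolls.append(rng.randint(1, 6))
    return rolls

def face_frequencies(rolls):
    counts = Counter(rolls)
    return {face: counts[face] / len(rolls) for face in range(1, 7)}

def repeat_rate(rolls):
    """Fraction of rolls that equal the roll immediately before them."""
    repeats = sum(1 for a, b in zip(rolls, rolls[1:]) if a == b)
    return repeats / (len(rolls) - 1)

rng = random.Random(1)  # fixed seed so the demo is repeatable
fair = fair_rolls(60_000, rng)
sticky = sticky_rolls(60_000, rng)

for name, rolls in (("fair", fair), ("sticky", sticky)):
    freqs = ", ".join(f"{f:.3f}" for f in face_frequencies(rolls).values())
    print(f"{name:6s} face frequencies: {freqs}  repeat rate: {repeat_rate(rolls):.3f}")
```

So "the graph came out right" only rules out the crudest failures; it says nothing about whether the next roll can be predicted from the last one.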

Not a solution to the particular issue of trusting HW PRNG unless you can audit the innards of this particular hardware.

Maybe an opensource FPGA-based solution might do the trick if you really need high-quality highly-secure fast random number generation.

Personally when I've needed good and fast sources of entropy I've just picked a good (but slow) random number source and used it as a key for a strong stream cipher (that I would renew every megabit or so). Assuming there are no weaknesses in the cipher and you have hardware acceleration you can get a very fast PRNG source.

And unlike RDRAND you can actually audit the hardware cipher implementation because it behaves deterministically.
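A rough sketch of that construction, with loud caveats: the Python standard library has no stream cipher, so HMAC-SHA256 over a counter stands in for one here, and `os.urandom` stands in for the "good but slow" source. The class name and the one-megabit constant are only illustrative, and this is not a vetted CSPRNG design.

```python
import hashlib
import hmac
import os

REKEY_AFTER_BYTES = 125_000  # one megabit, as suggested in the comment above

class ExpandingGenerator:
    """Stretch a slow seed source into a fast stream, re-keying periodically."""

    def __init__(self, slow_source=os.urandom):
        # slow_source(n) is assumed to return n bytes from the trusted source.
        self._slow_source = slow_source
        self._rekey()

    def _rekey(self):
        self._key = self._slow_source(32)
        self._counter = 0
        self._produced = 0

    def read(self, n):
        out = bytearray()
        while len(out) < n:
            if self._produced >= REKEY_AFTER_BYTES:
                self._rekey()
            # Keystream block = HMAC(key, counter); stand-in for a stream cipher.
            block = hmac.new(self._key, self._counter.to_bytes(16, "big"),
                             hashlib.sha256).digest()
            self._counter += 1
            self._produced += len(block)
            out += block
        return bytes(out[:n])

gen = ExpandingGenerator()
print(gen.read(16).hex())
```

Because the expansion step is deterministic, feeding it a known key and comparing against a reference implementation is a straightforward audit, which is the point being made about RDRAND.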

Some of the older devices (1997) had obvious flaws, and were lousy for crypto.

Here's a list:


mhoye points to the notes from the FreeBSD summit on his Twitter feed. The notes from the security sessions are here: https://wiki.freebsd.org/201309DevSummit/Security

Key quote: "rdrand in ivbridge not implemented by Intel."

You're referring to errata BV54?

No, that's just a failure to set the relevant property bit in the CPU at startup, so despite the RDRAND instruction being present it won't actually run on those CPUs.

It's well known that the Intel RDRAND implementation uses a hardware entropy source to seed a software RNG implemented directly in the CPU. The suspicion is that said CSRNG could be using an elliptic curve CSRNG to carry out the final mixing where the NSA has chosen both the curve and the points on the curve. It's impossible to tell from the outside whether this is the case or not - such an implementation would still pass all known randomness tests. However, if the curve points have been carefully chosen then you can derive the internal state of the RNG from the output and thus both predict future values (until the hardware entropy source re-seeds the generator, if it actually does that at all) and derive past ones (up to the last re-seed).

It's been said elsewhere that using an elliptic curve CSRNG where the NSA has chosen the curve points is effectively equivalent to doing the second half of a Diffie-Hellman exchange with the NSA to securely communicate your RNG state to them. For cryptographic algorithms that depend upon a secure RNG source this is devastating. All communications that use such a source are transparent to the NSA. (The NSA crypto geeks must have thought this was a fantastic trick: the communication would remain completely secure against all other third parties because deriving the magic numbers that let you decrypt the output from the public curve points is next to impossible - it's the discrete logarithm problem in action.)

Note that the relevant Intel docs (http://software.intel.com/en-us/articles/intel-digital-rando... ) state that the final CSRNG is AES in CTR mode, according to the spec detailed in http://csrc.nist.gov/publications/nistpubs/800-90A/SP800-90A...

As far as I know, there are no known issues with this approach, unlike with the elliptic curve random number generator detailed in the same document, where the NSA is believed to have pre-selected the elliptic curve points.

I don't understand how what you just wrote relates to your key quote: "rdrand in ivbridge not implemented by Intel."

I'm not sure how I can explain it any more clearly!

If that note is true, then crucial details of the RDRAND specification in Intel CPUs were supplied by someone other than Intel.

Who's most likely to want to push for a particular implementation to their own enormous potential advantage? The NSA. Who has prior history of doing exactly this to a random number generator standard? The NSA. Do we know that the NSA influenced the Intel RDRAND implementation in these ways? No. Did they have both motive & opportunity? Absolutely, yes they did.

Which is why a concrete reference for that note would be good: if someone can point to the place in the public record where Intel admitted that they were not solely responsible for the RDRAND implementation then that's a fairly big deal & a brief sentence in someone's conference notes isn't good enough.

Obvious follow up question: who implemented rdrand for haswell? The identification of ivy bridge seems oddly specific if they're not referring to the erratum.

Even if you could trust these RNGs, it is possible that they will malfunction at any time through a manufacturing defect or thermal issue. Blindly trusting hardware is naive. Linux never did, and decisions like this make me less likely to ever use BSD again. The critical mass of talent is just not there.

It seems unlikely that a manufacturing issue could affect the RDRAND instruction and not any other part of the CPU.

This on AMD chips as well? Since it's BSD we can just have an option to turn it off.

As far as I am aware there are no instructions for AMD CPUs to generate random numbers thus far, although Wikipedia mentions that their upcoming Excavator (scheduled for a 2015 release) architecture includes support for Intel's RDRAND.

> the NSA and its British counterpart defeat encryption technologies by working with chipmakers to insert backdoors, or cryptographic weaknesses, in their products.

I had already started to forget about that...

From that other report (September):

> They reveal a highly classified program codenamed Bullrun, which according to the reports relied on a combination of "supercomputers, technical trickery, court orders, and behind-the-scenes persuasion" to undermine basic staples of Internet privacy, including virtual private networks (VPNs) and the widely used secure sockets layer (SSL) and transport layer security (TLS) protocols.


You wouldn't prove anything by analyzing the results. For anybody who knows a bit of cryptography, it's trivial to produce a stream that "doesn't have the patterns" but that can still contain a "master" key.

What do you mean?

The output of any good cipher in "counter" mode should be very close to random noise while still being perfectly predictable.

Basically any stream cipher worth its salt would pass the parent's test.

That's not entirely true for counter mode. For random noise, if you see 2^(n/2) n-bit blocks then there's a ~50% probability of two of those blocks being identical, but if you have that many n-bit blocks from an n-bit block cipher in counter mode the probability of two identical blocks is zero.

This doesn't apply to stream ciphers.

Interesting but I don't understand why the probability of two identical blocks in this situation would be 0 for CTR mode.

As far as I know theoretically there could be a collision and the cipher could output the same value for two different counter values, unless the block cipher guarantees that such a collision can never occur? I was not aware of such a guarantee.

A block cipher cannot produce the same output block for two distinct input blocks because it is reversible.
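To see this on a scale where it is observable, here is a toy 16-bit Feistel cipher run in counter mode. The round keys and round function are arbitrary and the cipher is not secure; it exists only because any Feistel network is a permutation. The demo uses 2000 blocks rather than 2^8 = 256 so that collisions in the truly random stream are near-certain instead of a coin toss.

```python
import hashlib
import random

ROUND_KEYS = [0x3A, 0xC5, 0x17, 0x9E]  # arbitrary toy keys, not from any real cipher

def round_function(half, round_key):
    """Any function works here; a Feistel network is invertible regardless."""
    return hashlib.sha256(bytes([half, round_key])).digest()[0]

def toy_encrypt(block):
    """Encrypt a 16-bit block with a 4-round Feistel network."""
    left, right = block >> 8, block & 0xFF
    for k in ROUND_KEYS:
        left, right = right, left ^ round_function(right, k)
    return (left << 8) | right

def toy_decrypt(block):
    """Undo toy_encrypt by running the rounds backwards."""
    left, right = block >> 8, block & 0xFF
    for k in reversed(ROUND_KEYS):
        left, right = right ^ round_function(left, k), left
    return (left << 8) | right

N = 2000
ctr_blocks = [toy_encrypt(counter) for counter in range(N)]
rng = random.Random(7)  # fixed seed so the demo is repeatable
random_blocks = [rng.randrange(1 << 16) for _ in range(N)]

print("CTR mode: distinct blocks", len(set(ctr_blocks)), "of", N)
print("random:   distinct blocks", len(set(random_blocks)), "of", N)
```

The counter-mode stream never repeats a block, which is exactly the (impractically expensive, at AES sizes) distinguisher being discussed.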

So "just" on the order of 2^128 block samples have to be collected to see that it's generated? Good luck with that.

Nope, AES has a 128-bit block size so you "only" need about 2^64 blocks.

I think the parent means that you can't distinguish a biased RNG from an unbiased one if the bias is designed to look random. In a similar fashion, it's not possible to distinguish cipher text from noise - this impossibility presumably means detecting a biased RNG (i.e. output from an encryption algorithm with known key) is also impossible.

If I recall correctly, Linus refused to make this change in Linux, denouncing it as paranoia.

My fear is that the extra complexity adds additional opportunity for a back door to be inserted. However, software can be audited, the hardware cannot be.

No, Linus has refused to make a different change in Linux (the complete removal of RDRAND) because Linux does not directly use RDRAND output, and it already uses RDRAND as another source of entropy[-1], which is what FreeBSD is moving towards.

According to Theodore Ts'o, there was pressure from Intel engineers to rely solely on RDRAND, but they were rejected[0]. It looks like the FreeBSD devs did do it and have /dev/random using RDRAND/Padlock directly[1].

[-1] although — and I don't know why — the linux rng XORs pool output with RDRAND output instead of feeding RDRAND into the entropy pool itself[3]

[0] https://plus.google.com/+TheodoreTso/posts/SDcoemc9V3J#+Theo...

[1] That's the only way I can interpret "for 10, we are going to backtrack and remove RDRAND and Padlock backends and feed them into Yarrow instead of delivering their output directly to /dev/random"[2] anyway

[2] http://www.freebsd.org/news/status/report-2013-09-devsummit....

[3] I'll try to stop now but[4] provides a rationale for it: a history of unnoticed bugs in the random pool making the risk of a trivial xor lower as far as kernel devs are concerned: if you xor a known A and an unknown B, you can't know anything about the output (it won't be any more known than B is), whereas if you know of bugs in the random pool you might be able to take advantage of them and skew the output. Make of that what you will.

[4] http://thread.gmane.org/gmane.linux.kernel/1323386/focus=133...

Last time I followed this, RDRAND was not "just another source" in Linux, as it was XORed in after the other sources were "fully" mixed.

Yes, I had modified my comment to make note of that, although I have since tuned the language a bit more (removed the "just" from "just another source").

Any info on why Intel engineers applied the pressure?

I've no more information than Ts'o's word, and a commenter or two noting that they remember facing the same pressure (e.g. Jeff Garzik). It is possible and believable that the engineers themselves simply thought "we're providing a perfectly good random data stream, why would you bother doing more work on top of it and increasing the cost of your RNG?", just as it's believable they were in on it.

There are some relevant comments in the latter third of the thread:

> [Ts'o]

> When I asked the engineer who submitted it what his rationale was, he couldn't give me a coherent answer. He tried to handwave something around performance. When I pointed out this was not a problem since we had engineered solutions to support all of the systems that didn't have RDRAND, and invited him to give me one example where we had a performance bottleneck, he never responded.

> [Tony Luck]

> More clarifications - there have been engineers from both Intel and Red Hat referenced in this thread. I believe in the response above Ted is referring to the Red Hat one when he talks about not being given a rationale for a patch being submitted.

> The lead sentence of the post that started this whole thread is rather misleading, and perhaps based on Ted mis-remembering the exact nature of the patch that he applied in July 2012. At no point did any Intel engineer apply "pressure from Intel engineers to let /dev/random rely only on the RDRAND instruction". Peter pointed out above that RDRAND is not suited for this, given the ways that /dev/random and /dev/urandom are used - which is why his patch did not feed RDRAND straight to /dev/

> [Ts'o]

> +Tony Luck That's fair, I should have said /dev/random driver. However, note that get_random_bytes() does get used for key generation inside the kernel, so if RDRAND was comrpomised, we could get screwed that way. The bigger threat though is actually openssl choosing to use RDRAND directly for session keys, without doing something like encrypting the RDRAND output with a secret key. (And BTW, OpenSSL does have a RDRAND engine so that's just a configuration away....) 

What Linus correctly points out is that when multiple sources of entropy are combined, no single source can act to diminish the sum total of entropy. Even if rdrand was comprehensively backdoored and entirely predictable by the US government, it's still going to improve the quality of randomness. At worst it's a wash, but it will never reduce it.

Here's the direct link to Linus Torvalds' entertaining observation:


It's not true that it will never reduce it. It wouldn't be too hard for a CPU to notice when the result of RDRAND is XORed with something, and do evils to deliberately control the eventual result.

Yeah, in fact someone did a neat PoC of this using virtualization software the other day: https://twitter.com/DefuseSec/status/408975222163795969/phot... That's with an unmodified Linux kernel and unmodified instruction set aside from RDRAND. (In practice you'd make the result something less obvious than a constant value, but this presumably makes for a clearer demo.)

Fantastic example of what I was saying. Thanks for that.

Yes, but as stated further up, that CPU could anyway just write arbitrary data to the memory that holds your entropy pool, if you use RDRAND or not.

That's a more fragile attack though, since it requires identifying that memory location.

  fake_random = 5
  result = secret ^ real_random
  result = result ^ fake_random
You know the value of fake_random, but the randomness isn't reduced at all (sorry, I can't produce a real mathematical explanation)
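For what it's worth, the missing explanation is short: XOR with a fixed constant is a bijection (applying it twice gets you back where you started), so it only relabels the possible values and leaves their probabilities, and hence the entropy, untouched. A small check of that claim on a made-up, deliberately lopsided distribution:

```python
import math

def shannon_entropy(dist):
    """Entropy in bits of a {value: probability} mapping."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def xor_with_constant(dist, constant):
    """Distribution of (value ^ constant) when value is drawn from dist."""
    out = {}
    for value, p in dist.items():
        out[value ^ constant] = out.get(value ^ constant, 0.0) + p
    return out

# Hypothetical distribution of (secret ^ real_random); the numbers are made up.
before = {0: 0.5, 1: 0.25, 2: 0.125, 3: 0.125}
after = xor_with_constant(before, 5)  # fake_random = 5

print("before:", before, "entropy", shannon_entropy(before))
print("after: ", after, "entropy", shannon_entropy(after))
```

This only holds when the constant is chosen independently of the other operand, which is the assumption the replies go on to attack.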

  current_random = <something unpredictable>
  fake_random = RDRAND()
  new_random = current_random XOR fake_random
Surprise! new_random is predictable, because the malicious CPU peeked ahead at the XOR operation and adjusted the RDRAND operation accordingly.
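A software model of that scenario, purely as a sketch: Python obviously cannot peek at CPU registers, so the "peek" is modelled by handing the upcoming XOR operand to the instruction as an argument. The function names and the target constant are invented for illustration.

```python
import secrets

def honest_rdrand(_pending_operand):
    """An honest instruction ignores what its result will be combined with."""
    return secrets.randbits(32)

def malicious_rdrand(pending_operand, target=0x41414141):
    """Hypothetical backdoor: sees the other XOR operand and cancels it out."""
    return pending_operand ^ target

def mix(current_random, rdrand):
    # Passing current_random in models the CPU peeking at the other register.
    fake_random = rdrand(current_random)
    return current_random ^ fake_random

honest_outputs = {mix(secrets.randbits(32), honest_rdrand) for _ in range(1000)}
rigged_outputs = {mix(secrets.randbits(32), malicious_rdrand) for _ in range(1000)}

print(len(honest_outputs), "distinct outputs with the honest instruction")
print(len(rigged_outputs), "distinct output(s) with the malicious one")
```

With the malicious version every output collapses to the attacker's chosen value, no matter how unpredictable current_random was.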

Can't you wrap the RDRAND call in some long, complicated code, so that tracking what fake_random will be is hard to do in hardware, thus preventing this attack?

Yes, but then it's an arms race between the hardware manufacturer and the random number software writer.

The fact is that the Linux/BSD random number generators were actually rather good before all this RDRAND stuff.

Try this again, but instead of:

    fake_random = 5

use:

    fake_random = secret ^ real_random ^ 5
No matter what secret or real_random are, result is now 5. You can't hide secret or real_random from the processor.

In a larger sense, there's no way to stop the processor from simply setting result to whatever it wants before returning it to the place where it's used.

My point is, there's no real way to trust hardware that can't be verified.

Really? How?

> You recall correctly indeed

No, the situations and concerns are very different.

The situations are indeed different because it's FreeBSD and not Linux. And there's different history behind their /dev/random implementations. But the underlying concern is the same: a lack of trust for rdrand -- and the solution should be the same: to only ever use rdrand as an "improver" and never as an actual source of entropy.

The difference is that Linux already didn't trust RdRand, and didn't use RdRand as the sole input to /dev/random.

You're right, the OP's post was badly worded, possibly mistaken, as it implies that the change Linus refused to make was analogous to this change.

I think you're mistaken. It looks more like FreeBSD is moving to using these devices the same way Linux already does.

No, as I understand that's a different change. The current state of FreeBSD, if I understand correctly, is that it has the option of returning the hardware-generated random sequence directly. They're changing it so it feeds entropy into the pool. Linux does neither, instead it xors the output of the entropy pool with the output from the hardware random number generator. This is strictly no less secure than the entropy pool on its own.

Note that the implementation of the systems' entropy pools is also quite different -- my understanding is that FreeBSD uses Yarrow, a pRNG, to feed /dev/random, while Linux mixes bits of 'real' random data together, without the extra pRNG stage. Which of these schemes is better is irrelevant to this particular discussion.
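To make the structural difference concrete, here is a very rough sketch of the two ways of combining hardware output with a pool, as described above. Neither function is the actual FreeBSD or Linux code: SHA-256 stands in for Yarrow and for the pool's mixing, `os.urandom` stands in for RDRAND, and all names are hypothetical.

```python
import hashlib
import os

def feed_into_pool(pool_state: bytes, hw_bytes: bytes) -> bytes:
    """Planned FreeBSD shape: hardware output is one more input hashed into the pool."""
    return hashlib.sha256(pool_state + hw_bytes).digest()

def xor_at_output(pool_output: bytes, hw_bytes: bytes) -> bytes:
    """Linux shape: hardware output is XORed onto whatever the pool produced."""
    return bytes(a ^ b for a, b in zip(pool_output, hw_bytes))

# Placeholder pool contents; a real pool collects interrupt timings and so on.
pool = hashlib.sha256(b"placeholder for other entropy sources").digest()
hw = os.urandom(32)  # stand-in for RDRAND output

print("fed into pool:", feed_into_pool(pool, hw).hex())
print("XOR at output:", xor_at_output(pool, hw).hex())

# Worst case for the XOR shape: hardware that somehow knows the pool output.
print("XOR with a copy of the pool output:", xor_at_output(pool, pool).hex())
```

With honest hardware both shapes are fine. The last line shows the one structural difference: a hardware value equal to the pool output cancels it under XOR, whereas forcing a chosen result through the hash would require inverting SHA-256.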

No links, I'm afraid, as EE apparently thinks Stack Exchange isn't suitable for under 18s and I can't turn content lock off at the moment.

> This is strictly no less secure than the entropy pool on its own.

I'm not quite certain, it should be possible (though possibly overly complex) to use RDRAND (or calls to the crypto routines) as a trigger for a XOR backdoor working in concert with a RDRAND backdoor, and thus control the final output.

Theoretically, sure.

But practically, that would be (a) easy to catch happening, and (b) require the chip to fingerprint the relevant sections of machine code as implemented in multiple versions of the Linux kernel, and as compiled with innumerable iterations of compiler settings and versions.

I don't want to say improbable, but I find it difficult to imagine otherwise.

It wouldn't actually be that hard for a CPU to notice which register the result of RDRAND was being XORed with and just set that register to something generated instead. Because, you know, that doesn't actually break any contracts. The effect is still that of XORing with a number we didn't know beforehand.

That assumes we can't read the result of rdrand after the XOR is performed. I don't know the specifics of how the chip RNG is implemented in hardware, but surely its values have to hit some sort of readable memory before the kernel's code performs the XOR?

It doesn't really assume that at all. You can maliciously generate the result of RDRAND according to the instructions that are going to use it - that doesn't prevent that value being used later.

The way a CPU works internally, intermediate results are written to registers, and the registers specified by the machine code are rewritten onto a larger set of real registers. The CPU knows at the instruction decode stage which results are going to be used by which instructions, so that it can schedule them for out of order execution. It also knows when it can throw away the result of a real register because the machine code register it represents has been overwritten with another value. So if the result is used by a second instruction, that doesn't matter.

Yes, if you wrote the result of RDRAND to main memory and read it in again, you would destroy that knowledge, but I would bet that the random number generator doesn't do that.

It could certainly be used as an integrity check. If you get different results from the same operations on what should be the same numbers, you know your hardware is maliciously unfit for purpose.

Uh. You're saying you should compare the result of running a random number generator twice and seeing if the two values are the same?

No, you store the register value holding the RDRAND result into memory, then you store the register value holding the number from your other sources of entropy.

Then you read them back from memory and XOR the same numbers again. You compare that result with the XOR from the register values and see if they are the same.

If they are different, the chip is substituting a different XOR result whenever it is capable of determining that one of the operands is the output of an RDRAND instruction.

You might presume that an efficient mixer of multiple entropy sources would just XOR all of its various inputs together, so compromising XOR would be an easy way to discard the entropy of the other inputs in a way that is not easily detected without writing assembly code specifically to check it.

That doesn't protect against the CPU choosing the return value of RDRAND maliciously - just against the XOR operation being duff. To be honest, it's much easier for a CPU to have a malicious RDRAND than compromise XOR. As demonstrated in another post, returning "edx XOR <something predictable>" hoses the Linux RNG.

But a malicious RDRAND is, at worst, useless, contributing zero entropy to a pool of multiple sources. A malicious XOR could reduce the entropy of the pool.

Your comment implies that Linus trusts Intel. This is not true, since the change is unnecessary even if they were infiltrated:

"Long answer: we use rdrand as _one_ of many inputs into the random pool, and we use it as a way to _improve_ that random pool. So even if rdrand were to be back-doored by the NSA, our use of rdrand actually improves the quality of the random numbers you get from /dev/random." http://www.change.org/en-GB/petitions/linus-torvalds-remove-...

The conclusion is arguably correct, but the premise not so much. RDRAND isn't used as an input to the random.c pool: http://blog.lvh.io/blog/2013/10/19/thoughts-on-rdrand-in-lin...

> However, software can be audited, the hardware cannot be

Exactly. So work on solid security/privacy principles, rather than take the easy way out and hope for the best.

I think which Linux distro you use should be, like religion and politics, a personal matter ;).

That said, I am a Mint XFCE user and have been since 13 (before that, Gnome 2). Every release has been brilliant. I run it on everything from an ancient ThinkPad to a thoroughly modern development machine, and it has worked well.

It's also the only WM/DE that handles multiple screens (3 on both desktops and 2 on Dell/External) without any show-stopping or tremendously irritating bugs. Couple that with the huge amount of software available via Debian/Ubuntu and PPAs, and it's a cracking developer OS.

The original post is neither about Linux nor about bragging about what you use. It's about the RNG in FreeBSD, which is not a Linux distro.

Heh, posted the comment in the wrong thread; downvotes are fair enough.
