Secure programs should rely on /dev/urandom, to the exclusion of other CSPRNGs, and should specifically eschew userland CSPRNG engines even when they're seeded from /dev/urandom.
This Android bug is another in a line of bugs in userland CSPRNGs, the most famous of which was Debian's OpenSSL CSPRNG bug, which gave up everyone's SSH keys. CSPRNGs have bugs. When you rely on a userland CSPRNG, you're relying on two single points of failure: the userland CSPRNG and the OS's CSPRNG. Failures of either aren't generally survivable.
There are people who are smarter than me (notably 'cperciva) who disagree with me on this, but this is an idea I got from DJB's NACL design paper; it's not my own crazy idea.
I got to eat dinner with Dan Bernstein the other day, by the way, and it turns out NACL is pronounced "salt".
The NSA did design a random number generator that likely had a backdoor in it:
Here's Bruce Schneier talking about it:
Also, it's in Windows (it's not used by default, but userspace programs could rely on it).
It would be possible for the NSA to go to Intel and get them to put something in their random number generator that would let them basically break the encryption by massively reducing the keyspace, provided they have the secret key.
(This is one of those times where I regret that HN has twin interests in software security as an engineering science and software security as a political statement.)
You can verify the random number generator. If you know the algorithm and the seed values you can run it on multiple different platforms, or with a pen and paper and verify that the output is as expected and repeatable. If you have large amounts of entropy you are feeding into it, you can log it for testing purposes.
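As a concrete illustration, a deterministic generator can be checked for repeatability by running it from a known seed on multiple machines and comparing the output. A minimal Ruby sketch — `hmac_stream` is an HMAC-based stand-in invented here for illustration, not a vetted DRBG:

```ruby
require 'openssl'

# Deterministic stream: the same seed must yield the same output on every
# platform, which is exactly what makes independent verification possible.
def hmac_stream(seed, blocks)
  key = OpenSSL::Digest.digest('SHA256', seed)
  (0...blocks).map { |i| OpenSSL::HMAC.hexdigest('SHA256', key, [i].pack('N')) }
end

run_a = hmac_stream('known test seed', 4)
run_b = hmac_stream('known test seed', 4)
raise 'generator is not repeatable' unless run_a == run_b
```

Run the same thing on a second platform (or by hand, from the published algorithm) and compare; any divergence flags a broken or tampered implementation.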
There are also apparently some EC-based algorithms that can be used to fix, or at least reduce, the impact of a compromised random number generator.
That might not protect against an active attack on your specific system by the NSA (they could send a magic packet that gives them total control over the CPU, for example); it might even be possible for that to happen in the NIC controller rather than the CPU, if it has access to the system bus. At the least they could flip it into some kind of backdoored random number mode by embedding some extra data in a TLS handshake or whatever. But it should protect against widespread passive surveillance.
Again: you're making a point that has nothing to do with my comment. If you don't trust RDRAND, don't use it. But you should still be using the OS's CSPRNG. Just make sure your OS isn't using RDRAND. Done and done.
Integers are closed under addition and multiplication (in the abstract-algebra sense they form a ring — a group only under addition), so if you were concerned about specific calculations being compromised, you could verify the answer via alternative calculations and test for consistency.
Faking consistency is likely impossible without causing a huge amount other calculations (which the CPU will do as part of day-to-day operations) to fail. Making the CPU alter calculations only under very specific circumstances and in an undetectable way would require a huge amount of complexity.
On the other hand, we know several fairly trivial ways of faking reversible randomness using standard crypto algos that would be statistically undetectable without taking the hardware apart.
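A toy realization of such reversible randomness, assuming the attacker uses AES-CTR under a key only they hold (`ATTACKER_KEY` and `evil_rng` are invented names for illustration): the output passes every statistical and black-box test, yet the keyholder can regenerate it exactly.

```ruby
require 'openssl'

ATTACKER_KEY = "\x00" * 16  # placeholder; a real attacker's key stays secret

# "Random" bytes that are really an AES-128-CTR keystream: statistically
# indistinguishable from random without the key, fully predictable with it.
def evil_rng(nbytes, counter = 0)
  cipher = OpenSSL::Cipher.new('AES-128-CTR')
  cipher.encrypt
  cipher.key = ATTACKER_KEY
  cipher.iv  = [counter].pack('Q>').rjust(16, "\x00")
  cipher.update("\x00" * nbytes)
end

victim_key = evil_rng(16)                # what the victim treats as entropy
raise unless evil_rng(16) == victim_key  # the attacker reproduces it exactly
```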
I will provide a script which monkeypatches Fixnum such that your test script still reports "Yep, it's still adding correctly" and which will predictably cause 1 + 1 to equal the string "LOL I AM THE NSA" under attacker-chosen conditions.
There are many, many ways I could do this as the crafty adversary, but I anticipate my implementation would be roughly seven lines long and trivially implementable in hardware. (At complexity somewhere between a half-adder and the circuits which we had to hand-draw for our finals in hardware design, which -- obviously -- are a drop in the ocean of complexity on a modern CPU.)
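For flavor, here is one guess at the shape of such a monkeypatch — not the actual script being promised (modern Rubies have folded Fixnum into Integer, and `$nsa_trigger` stands in for an arbitrary attacker-chosen condition):

```ruby
class Integer
  alias_method :honest_plus, :+
  # Addition behaves perfectly... until an attacker-chosen condition holds.
  def +(other)
    return "LOL I AM THE NSA" if $nsa_trigger && self == 1 && other == 1
    honest_plus(other)
  end
end

$nsa_trigger = false
raise unless 1 + 1 == 2   # any self-test written in advance passes
$nsa_trigger = true
raise unless 1 + 1 == "LOL I AM THE NSA"
$nsa_trigger = false      # restore honest arithmetic
```

Every other sum stays correct, so a consistency-testing script never notices anything until the trigger fires.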
You can't just break one calculation under specific circumstances, you have to break any way of verifying it which means compromising pretty much every calculation.
(not to mention you'd have to figure out that it's the Fibonacci sequence being calculated, and not some other calculation that happens to start with 1+1)
Remember you don't have an interactive attack, you have one chance to build something and ship it out in hardware, and it then has to compromise software that's going to be written in the future.
The beauty of the RNG attack is that it's undetectable, introduces a backdoor into a huge number of systems and it only makes the system vulnerable to the attacker and not to anyone else.
Quick ig1, you are freedom's last, best hope. Write actual computer code which can, by adding numbers together and inspecting their output, determine whether your Ruby interpreter has been compromised by the NSA. You're lucky, since the NSA has already shipped their exploit (or did they?), they can't modify it in response to your detection code. Bad news, though: if your detection code fails and an interpreter which includes the backdoor can, after passing your detection code, still get the wrong answer for 1 + 1, an innocent user fails to find the backdoor and suffers a FatalHitByCruiseMissileError. You don't get to say "OK, so in hindsight, now that I see the backdoor addressing it was pretty darn easy. Mop up the mess and I'm sure to win round two."
I write assembly code, and I've also spent time measuring the timing of machine operations, etc., so I simply can't imagine a valid argument that ignores the exact limitations present in actual hardware.
(Google for "Illinois Malicious Processors" for what an attacker might actually do. A few extra transistors here and there can lay you wide open.)
I still don't see how adding "almost incorrectly" would not be detected in practice. In practice you also have to implement "almost incorrectly" in a way that doesn't affect the performance of your CPU. We are talking about processors, not Ruby interpreters.
You can't compare that kind of error to a deliberately hidden flaw.
The possible flaws with Lenovo computers haven't, as far as I know, been found by anyone other than the spooks. (https://news.ycombinator.com/item?id=6108980)
The RNG attack has correct behaviour under every statistical and blackbox analysis. The only way you can break it externally is if you have the key.
If you were designing a cryptosystem you'd pick the one where the security lies in the key and not in obfuscation. The same applies when designing an attack you want to keep secret.
For example, I want to run a VM on EC2, knowing the NSA's after something I have. Is there some way I could, even theoretically, build my VM, that would protect it from the NSA compelling Amazon's cooperation in ripping the data right from my VM's memory at the hypervisor level?
You are getting into the flip side of the DRM coin here; put differently: if there's a way to do what you want, there's also a way for the MPAA to do what it wants on PCs.
Personally, I think there is, at least in the dollar-cost model of security.
We're running a high-assurance, remotely attestable hypervisor inside the CPU cache and encrypting all access to main memory. This protects against threats from the physical layer, like cold boot, DMA attacks, NVDIMMs, bus analyzers, etc.
It's not quite what you're talking about in your Amazon and NSA scenario. Amazon doesn't let you bring your own hypervisor to run on bare metal and the NSA can compromise the CPU itself.
However, our approach does give you assurance that someone with physical access can't easily snapshot your VM memory.
How does this bit work, by the way? What's stopping an altered hypervisor from lying to say it's unaltered? (This is the classic "how do you verify a player on your FPS isn't running a bot instead of a game-client" problem in a nutshell.)
There are known TPM and LPC bus vulnerabilities. That is why long-term we will move away from that dependency by utilizing upcoming CPU features.
But how do you know the VM (or rather, the hypervisor for the VM) is running on physical hardware, and not in a hypervisor?
I can't think of a way you could be certain of this remotely? Perhaps you could be on-site for the boot-up, and then rely on the fact that snapshotting is very hard -- but it sounds rather fragile...
Still, very interesting project! I've been thinking a bit about "running inside the L1/L2/L3 cache" lately -- but I hadn't thought about the particular idea that you could treat RAM as "external" -- assuming you could guarantee that you're always in cache.
Today, that attestation process relies on a TPM and a signed certificate chain baked in by the TPM manufacturer. This is standard stuff out of the Trusted Computing Group.
One more thing to add, this isn't just a personal side project. We're a company and have a beta product deployed to early adopters.
Are you aware of:
"Overcoming TPM by exploiting EFI overflow"
I don't suppose any of your software is available as open source? Where can I/we learn more?
We're also aware of vulnerabilities on the LPC bus. These can be addressed with existing TPM 1.2 features -- although they aren't enabled by default.
There are CPU features in the pipeline which may make the TPM unnecessary. We're also working on some new attestation techniques which may help.
We measure and attest the state of the system with TXT. If that works as advertised, you would measure changes to the BIOS, SINIT, opt ROMs, ACM modules, etc.
The naïve way relies on: "If there isn't a Rootkit or capture mechanism, it's secure if it's obscure." That will buy an iota of time.
The other way is to simply not trust the host and end-to-end encrypt everything important. This is a PITA, but there is no current viable alternative short of a significantly rigorous and extensive project (it would take the form of a VM for servers, and/or something that runs on JS, like asm.js).
Mostly you need zero knowledge computation. Storage and network are easier probs.
The intractable problem is eventually an answer needs to be presented somewhere. For now, you should minimize exposure as much as possible through all means available.
That is a valid concern that needs to be part of your risk assessment. "Do I want to protect my secrets from a well funded government agency?"
But there are other risks that need to be thought about too. Some people seem to think that hardware RNGs are better than software ones. Often they're not; they're lousy.
HWrngs can have subtle failure rates which are hard to detect.
Once you've done all the de-skewing and other checks they can be quite low bandwidth.
I have a bunch of links to reading about HWRNGs here - (https://news.ycombinator.com/item?id=6060636)
And here's a really nice thread (https://news.ycombinator.com/item?id=1453299)
 Although if you want to defend yourself against a well funded secret government agency you need to worry about more than a weak RNG.
If you don't trust the hw or vhost, you're stuffed. Political upwards and technical downwards are the ways to make sure this doesn't happen.
I'm confused, what's the apostrophe eliding? ;-)
(That is, nicks like 'nick, and "' or '1'='1' -- '" aren't allowed?).
I think the @user idiom is by now understood by anyone not relying exclusively on pigeons to send their missives.
Those who really do not know (yet) will quickly understand from context.
All in all, I think @foo is less confusing than 'foo, and will spare you questions like in this thread.
(Please do not throw anything at me.)
I just happen to think you're wrong (and the NaCl paper you mention actually provides advice to crypto library writers, not users, for the question at hand). I'm not animated by anything any more than you're in the secret employ of the Cartesian product of all possible three letters.
My real issue isn't that devs use SecureRandom. Use it by default; it's not like we'd doc a bug on you for doing that.
My real issue is that the library developers who expose SecureRandom don't themselves simply pull the random bytes from urandom.
I never even made such a point! At all! If anything, I'd like to know how I gave you that impression so I can avoid it in the future.
What kinds of things do you think developers can do wrong reading from urandom?
Well, for one thing it's entirely unportable. I think cperciva pointed out some differences in how it behaves on FreeBSD vs. Linux. How does it behave on OS X? On Windows? What are the failure modes?
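To make the failure modes concrete, here's a hedged, Linux-flavored sketch (`read_urandom` is a made-up helper name) of the minimum care a library has to take — short reads and open failures (e.g. file-descriptor exhaustion) both happen in practice:

```ruby
def read_urandom(n)
  File.open('/dev/urandom', 'rb') do |f|
    bytes = f.read(n)
    # Silently getting fewer bytes than requested would be catastrophic
    # for key generation, so fail loudly instead.
    raise 'short read from /dev/urandom' unless bytes && bytes.bytesize == n
    bytes
  end
rescue SystemCallError => e
  # Never fall back to a weaker source (time, PID, etc.) on failure.
  raise "cannot open /dev/urandom: #{e.message}"
end

key = read_urandom(32)
```

This is exactly the kind of fiddly, platform-specific logic that belongs in one reviewed place — the platform library — rather than in every app.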
My real issue is that the library developers who expose SecureRandom don't themselves simply pull the random bytes from urandom.
I'm in 74211004% agreement with you there, as is the NaCl paper. Centralize randomness, ensure that it's reviewable and done right. In fact, the NaCl paper suggests:
"A structural deficiency in the /dev/urandom API provided by Linux, BSD, etc. is that using it can fail, for example because the system has no available file descriptors. In this case NaCl waits and tries again. We recommend that operating systems add a reliable urandom(x,xlen) system call."
Again, my stipulation is that this is very much not the job of app developers. The Android bitcoin wallet developers seem to be a case in point - they don't seem to have done anything particularly wrong at all. The platform failed them. So should we be berating platform providers or should we insist app-writers keep close track of various security developments? I think it's very much the former.
So here is the interesting question as I see it:
What's safer for developers, reading from /dev/urandom, or calling the platform CSPRNG when the platform CSPRNG does something other than reading from /dev/urandom?
I think the answer might be /dev/urandom. A Debian maintainer is not going to accidentally comment the randomness out of the Linux kernel CSPRNG. Nobody is going to miss the logic bug that causes the Linux kernel random number generator to not even try to seed itself.
I think you're right about this and the (rather fiddly and nerdy) point where we disagree is on where the line of abstraction lies. I don't think you have any business reading /dev/urandom any more than you have using, say, AES-NI
Well, NetBSD more or less did that twice in a row earlier this year...
edit: for those unaware, the NACL 'tptacek is referring to is  - which long predates .
/dev/random is the right way to provide hardware RNG output in a UNIX environment. However, programs using it can block for extremely long times if there is no dedicated hardware available to provide randomness. I have regularly seen blocking times in the multiple-hour range. This is not practical for most programs.
Even relying on /dev/random isn't necessarily safe. I have seen setups where people connect /dev/urandom to /dev/random in an attempt to eliminate these long block times on /dev/random. A quick Google search can bring up multiple examples of how to do this. I'll leave it to you to think about why this is a bad idea, but the result is that you cannot even trust /dev/random.
Thankfully, Thomas Pornin wrote a definitive debunking of the urban legend here:
In short: that's not how urandom works. It's not a "fallback to a crappy RNG". Random and urandom are two different interfaces to the same CSPRNG engine. One of them happens to block based on entropy estimates, but that's a design weakness of the Linux CSPRNG, not a great feature of a CSPRNG in general.
While considering how highly rated my comment was, and weighing your own expertise on cryptographic randomness, you might also consider reading the NaCl design paper (Bernstein, Lange, Schwabe) for more details on why you want to use the OS CSPRNG instead of your own. Or, for that matter, consult the NaCl source code to see which they use.
Finally, to your main point: the reason you should use the OS CSPRNG (typically: urandom) is precisely so that you don't end up using some random bit of aspirational app crypto. The OS has to get its CSPRNG right no matter what, so just let it do the work for you.
Correct me if I'm wrong: which OSes have desirable CSPRNG properties like Fortuna's (entropy pools especially)? I've looked at OS X, Win8, and Free/Net/OpenBSD, and still haven't found anything satisfying beyond the class of algorithm. (Ignore whether closed/open source, and IP issues.) I would like to see all mainstream OSes do this well.
The problem is that the PRNG has a weak default entropy source. The same problem existed in the kernel for ages. See http://www.factorable.net
The real advice here ought to be that if you are building an application that generates keys, you should make sure that the system has appropriate entropy before generating the keys. Don't assume the PRNG (whether it's /dev/urandom or OpenSSL) does it for you.
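On Linux, one way to sanity-check that is to look at the kernel's own entropy estimate in /proc. A sketch (the 200-bit threshold is illustrative, not an official recommendation):

```ruby
# Read the kernel's current entropy estimate (in bits) before generating
# long-lived keys -- relevant mainly on freshly booted or embedded systems.
def entropy_ok?(min_bits = 200)
  estimate = Integer(File.read('/proc/sys/kernel/random/entropy_avail'))
  estimate >= min_bits
end

# Hold off on key generation (or mix in more seed material) until this passes.
puts "entropy estimate acceptable: #{entropy_ok?}"
```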
Further, it is harder to imagine a circumstance in which /dev/random could be broken and an app-layer CSPRNG like OpenSSL's could not be broken.
It's a little fuzzy in this case because part of the patch feeds entropy back to /dev/random, but observe (a) that the bulletin doesn't tell developers to re-key if they were depending on /dev/random (as some apps do), despite naming 5 other interfaces that do demand rekeying, and (b) that this particular patch would be an inadequate fix for a broken /dev/random anyways.
It's true that you care about entropy at cold start either way. But your choice of CSPRNGs doesn't resolve that problem for you, except that the OS needs to have secure random regardless and so you're boned whether or not you rely on /dev/random, and hence are better off just relying on /dev/random.
In practice, this is a distinction without a meaningful difference. In more modern implementations (like OS X) the CSPRNG works the same way with both interfaces.
You should generally prefer urandom.
There is a wellspring of urban legends about the security difference between these two interfaces, because the Linux man page is misleading.
/dev/urandom reuses existing entropy to produce more pseudo-random numbers. It may be less "random", but will never block.
more at http://en.wikipedia.org/wiki//dev/random
I don't want less entropy, thank you. If I don't have enough, then let's wait until more becomes available.
I would use urandom if I need random numbers but don't care about their quality (even though urandom is good most of the time).