Yarrow and Fortuna are examples of decent CSPRNGs, so I'd say this is a pretty good move by FreeBSD.
If you do not know one of the inputs to an XOR, you know nothing about the output.
XORing random sources is great because the result is at least as unpredictable as the strongest source (provided the sources are independent). As long as you have just one good source, the output is good.
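For what it's worth, a minimal sketch of that property, with os.urandom standing in for a good source and a zero buffer for a fully backdoored one:

    import os

    def xor_mix(a: bytes, b: bytes) -> bytes:
        # XOR two equal-length blocks. If either input is uniformly random
        # and independent of the other, the output is uniformly random too.
        assert len(a) == len(b)
        return bytes(x ^ y for x, y in zip(a, b))

    good = os.urandom(16)      # one trustworthy source...
    bad = bytes(16)            # ...and one fully backdoored source (all zeros)
    out = xor_mix(good, bad)   # still uniformly random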
Aren't we talking about potential backdoors built in to the CPU? As in, having full access to all the registers?
Good call on this. If you can't trust the CPU you're running on, you can't trust anything at all. The proper solution is to find a supplier you trust and go with them.
Pulling useful entropy away from the OS's RNG functions is the best example of an ignorant knee-jerk reaction to this security problem.
How would the processor even know it's performing crypto operations that it should swap numbers for?
Building a "where will this rdrand go?" backdoor is harder than building a backdoor that just trawls through load addresses for various common kernels looking for the symbol table so it can poison the entropy directly.
>Presumably using the same skynet tech it uses to look ahead and see where the rdrand is going to be xored into.
There is no such tech. That's why this is not 'checkmate'. A poisoned random number generator that produces output in a predictable manner is orders of magnitude easier to implement, and far harder to detect, than a magical processor that rewrites memory it thinks might be entropy, for specific operating systems it has been pre-programmed to recognize, on the assumption that the kernel will never change. Get real.
If you're going to open Pandora's box, I don't think you get to pick and choose what hypothetical backdoors to take out. It's particularly odd to only select the subset of backdoors that you can easily defend against.
This is very unrealistic. It would mean that processors would have to ship with every crypto library implementation baked into the hardware. Any update would immediately break it.
That's a different threat model that doesn't exist here. These devices don't have free access to wander around main memory.
Do we trust hardware RNGs enough to allow their output to increase that measure of entropy?
Which really sucks, because then you end up running something like haveged on the server that runs the unit tests; you can't have a build take hours just because Linux takes forever to gather new entropy on headless servers :-(
/dev/urandom gives the exact same quality randomness as /dev/random (let's ignore the issue of boot-time and VM cloning for now).
There is a slight twist: for information-theoretically secure algorithms, /dev/random would be preferable. But you don't need that, because you don't use those algorithms (the only one really worth mentioning is Shamir's secret sharing scheme).
I'm just amazed how people don't trust the cryptographic building blocks inside a modern CSPRNG, but then use the very same building blocks to use the randomness and encrypt their secrets.
In a normal PRNG, if you want X different possible output sequences, you must be able to seed it with at least X different seeds. Since each seed corresponds to an output sequence, you need at least as many possible seed values as you want output sequences. Of course, this seed should be random, and you can't really use a PRNG to seed a PRNG.
How do CSPRNGs get around this? I assume that if I have a CSPRNG, I must seed it, and that I must draw that seed from a pool of seeds at least as big as the desired set of output streams. (See above.) If my intent is to generate 4096 random bits (say, for an encryption seed), to me it seems I must input a random seed at least that long. Thus, I need a good RNG.
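(A quick sketch of the pigeonhole point: a 16-bit seed can never yield more than 2^16 distinct output streams, however long each stream is. random.Random here is an ordinary PRNG, not a CSPRNG; the counting argument is the same either way.)

    import random

    streams = {random.Random(seed).randbytes(8) for seed in range(2**16)}
    print(len(streams))   # at most 65536 distinct streams, whatever the length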
Take a look at Wikipedia's definition, for example, of what a CSPRNG must do (as opposed to just any old PRNG):
• Every CSPRNG should satisfy the next-bit test.
• Every CSPRNG should withstand "state compromise extensions". In the event that part or all of its state has been revealed (or guessed correctly), it should be impossible to reconstruct the stream of random numbers prior to the revelation. Additionally, if there is an entropy input while running, it should be infeasible to use knowledge of the input's state to predict future conditions of the CSPRNG state.
Let's assume our CSPRNG of choice satisfies that. The problem is that the second property only applies to "preceding bits". If I know the state of the CSPRNG, I can predict future output. If Linux is low on entropy, or runs out, does this not diminish the number of possible inputs or seeds to the CSPRNG, allowing me to guess, or narrow down my guesses to, the state/seed of the CSPRNG, perhaps prior to it generating output?
Am I going wrong somewhere?
First, I'm not saying that cryptographic randomness can be created out of thin air, without entropy, I just argue that you don't need n bits of real entropy to get n bits of high-quality randomness.
I mean, if you really needed 4096 bits of random seed to generate 4096 bits of randomness, why not just take the 4096 bits you'd waste on the seed as your randomness?
Of course you need a random seed. That's what I was alluding to with the boot-time and VM remark.
But you're not really interested in lots and lots of potential output sequences; one of them is enough. Remember, the first requirement of a block cipher is that it is indistinguishable (to a polynomially bounded adversary) from a random distribution.
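A minimal sketch of that seed-stretching idea, with SHA-256 in counter mode standing in for the block cipher (not any particular kernel's construction):

    import hashlib

    def stretch(seed: bytes, nbytes: int) -> bytes:
        # Hash (seed || counter) for counter = 0, 1, 2, ... and concatenate.
        # One fixed-size seed yields one arbitrarily long output sequence;
        # predicting it without the seed means breaking the hash.
        out, ctr = b"", 0
        while len(out) < nbytes:
            out += hashlib.sha256(seed + ctr.to_bytes(8, "big")).digest()
            ctr += 1
        return out[:nbytes]

    stream = stretch(b"\x07" * 32, 4096 // 8)   # 4096 bits from a 256-bit seed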
The real counter-argument is the state attack. And that's mitigated by a modern RNG's design. Fortuna, for example, constantly mixes incoming entropy into outputs that occur far in the future (technically, it reseeds every now and then, but without estimating entropy). This does not protect you from a total state compromise (a computer is deterministic, after all), but it's quite hard to argue with a straight face that such a total compromise matters, because everything you might want to use the randomness for would most certainly be just as compromised.
So why take that (probably insignificant) risk?
Because the alternative is worse. If you want Linux's current /dev/random behavior, you get two problems:
First, it blocks when there's not enough entropy in the pool. That's bad. Just google for "sshd hangs". Either your system doesn't work anymore, or people find creative ways to subvert all the crypto stuff to make it work again (see the sketch after these two points). Just for the far-fetched fear of this total state compromise?
Second, how much entropy do you have? Lots and lots has been written about it, but despite all this technobabble ("'entropy', there must be hard physics behind it"), estimating entropy is not really an exact science. More guesswork. So you never know how much entropy you really have. That's why Fortuna got rid of all that estimating that its predecessor Yarrow still did.
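To make the first point concrete, a minimal sketch; this assumes the historical Linux behavior where /dev/random blocks on a low entropy estimate:

    # On an idle headless box, this read can stall for a long time while
    # the kernel waits for its entropy estimate to climb:
    with open("/dev/random", "rb") as f:
        key = f.read(64)    # may block for minutes

    # This returns immediately, with randomness of the same quality
    # (after boot-time seeding):
    with open("/dev/urandom", "rb") as f:
        key = f.read(64)    # never blocks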
Especially since: what are you doing with ssss? Usually splitting a private key that you then use with a non-information-theoretically secure block cipher, I guess. So you're back to square one.
I phrased that part poorly.
I've collected lots of links to articles and man pages and so on about the issue, because I have been planning to write a coherent article about urandom vs. random.
And I'm pretty sure I remember what posting you mean.
Unfortunately, writing this article gets postponed again and again, just as finishing and sending in the second set of your crypto challenges... maybe before Christmas I'm going to find a few hours to do that.
Fortuna has the advantage of making it harder to game entropy pools. 
You can get quite a bit of entropy using radio noise and a $15 RTL-SDR USB dongle. It could still use some work and review, but it seems like the start of an almost ideal solution.
EDIT: I'm thinking of Turbid:
EDIT2: There's also a wealth of information in this thread from the metzdowd cryptography mailing list:
They also provide timer- and video-based entropy daemons.
Well, not really.
I was both a software cracker and a virus writer myself (just for fun; we never distributed our viruses, cracks, keygens, and the like).
I was part of a larger group doing it. I had contacts with people who became famous for breaking several important protections, especially for games.
You have no idea how big software is. A million lines of code is impossible for any human being to read in a lifetime. Yes, there are automatic tools and great disassemblers like IDA Pro, but even if you could see the entire source code, it is very easy to hide a flaw in it.
If you add undocumented hardware into the equation, then it is very hard to disassemble without significant resources.
This hardware took tens of millions of dollars and hundreds of very smart people to design, with all the documentation available. It will take at least an order of magnitude more effort to decode it without that info.
In particular, reverse engineering is no longer the specialty of virus authors the way it used to be; a totally mainstream application of reverse engineering, practiced by most major security companies, is reverse engineering patches to find the underlying security flaws they fix so they can be weaponized.
It is easy to hide flaws in any code. Assembly doesn't make it much easier.
Do you mean that reversers try to locate the, say, buffer overflow that was fixed and try to find another way to exploit it? Why would major companies want to do this?
I actually didn't see the BH talk, but I saw a similar talk that Halvar gave at CanSecWest shortly after.
The gist of it is, imagine that you have a binary that you are looking to find vulnerabilities in to exploit. You can go through all the trouble of discovering a vulnerability, and then hope it doesn't get patched; or you can sit and wait for a patch for said binary. There's reams of data out there about how long it takes for systems to apply patches, but in general, you can find vulnerable versions of patched software long after the patch has been released.
Using binary differential analysis you are basically zeroing in on the parts that were changed (which you can imagine is a much smaller subset of the overall binary) and find the vulnerability much more quickly.
There are tools that help you do this in a more automated fashion (there is/was a product called BinDiff; I don't know if Zynamics still sells it after they were bought by Google).
That means that with much less work, you can write up a working exploit that will still work on some decent percentage of the install base for the application (until everyone patches it).
Additionally, you can imagine that a lot of times when vulnerabilities get fixed, they aren't necessarily fixed with the utmost rigor. There's a lot of cases where an individual vulnerability might be fixed, but if you look at what was done, you can find other parts of the binary that are vulnerable to the same underlying flaw. Knowing what gets changed in the patch can tell you a lot about underlying issues.
It's possible to write obfuscated assembly, of course, but it sticks out as obfuscated.
To further get your head around the lack of cover compiled code would give a backdoor, it's worth knowing that Hopper will give you serviceable C-ish code from x86_64 for... let me check here... fifty nine bucks.
(It's possible that my only message with that last graf is "Hopper is great").
If the backdoor only activates when some specific set of registers is set a certain way, followed by a very specific series of instructions, there is no practical way to prove that the hardware is doing what it should.
Meanwhile: if you can't trust the hardware, you can't trust C code either.
If you look at the related Wikipedia page, you'll find that an audit was done of the Windows 2000 version of CryptGenRandom (which rand_s uses), and it was discovered to be flawed in a number of ways. The follow-up is that Microsoft said the issues were fixed in Vista. So it's more than a theory that rand_s vulnerabilities will be called "bugs". It already happened once.
On top of it, when we analyze government malware, we don't see some weird calls to rand_s. We just see a few zero-days, exploits for known issues, or just plain-jane trojans.
Why compromise rand_s, which could help our enemies and cause a public outcry, when you're sitting on a mountain of zero days?
Also, Windows is not closed source. MS shares source with governments, universities, etc. I think functions like rand_s are well understood, and there has never been an MS backdoor for the government.
The problem with such systems is that it is generally quite difficult to have confidence that they are not subject to attacks that may starve the system of entropy, or trick the entropy estimator into thinking it has entropy when it has none.
Low-level hardware RNGs can be constructed in ways that make them quite difficult to attack externally, but are also basically impossible for us to verify.
Really it seems that the best approach is to take a wide range of flexible entropy sources and to learn to trust our mixing pools and trapdoor functions.
That's why you give up on estimating entropy. It's just not possible in the general case.
Mix multiple sources of potential entropy in a secure fashion (e.g. with a cryptographically secure hash function). Recover over time from a compromise.
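A minimal sketch of that mixing step, with SHA-256 as the secure hash and some hypothetical sources (one deliberately worthless):

    import hashlib, os, time

    def mix(*sources: bytes) -> bytes:
        # Hash every candidate source into one digest. A worthless or even
        # adversarial input cannot cancel out the entropy of the others.
        h = hashlib.sha256()
        for s in sources:
            h.update(len(s).to_bytes(8, "big"))   # length-prefix each input
            h.update(s)
        return h.digest()

    seed = mix(os.urandom(32),                             # OS pool
               time.perf_counter_ns().to_bytes(8, "big"),  # timing jitter (weak)
               b"\x00" * 32)                               # e.g. a suspect HWRNG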
> Really it seems that the best approach is to take a wide range of flexible entropy sources and to learn to trust our mixing pools and trapdoor functions.
Yup, you're exactly right. Now we just need to convince Ted Ts'o to replace Linux's hacky /dev/random with a Fortuna-based one.
Hardware devices from big companies are categorically untrustworthy, because of these companies' history of cooperating with surveillance, their incentives to sell out users, and the inability of users to verify what's going on in the devices.
The answer is something that's verifiable - and that means precisely a no-code, low-tech source of entropy, plus a means of translating fluctuations into bits, the latter being "open" enough to enable users to confirm exactly what it does.
And thanks, it's funny to think of putting a microphone in a colo server and using the roar of all the nearby fans and occasional voices as a source of random bits.
I imagine you could do something similar with a feed of traffic on a nearby highway or blobs of cornstarch suspension on a subwoofer.
This is the difference between engineering and idealism.
Did you build your own CPU? Write your own firmware? Audit hardware and firmware for devices with access to system memory? Write your own kernel? Verify your compiler? Audit every line of code for everything you run? If not, you're deciding the particular areas where you choose to harbor the illusion that you're seriously verifying something.
Fundamentally at some point you have to trust your hardware vendors unless you have unlimited resources to audit everything. Open source is just part of that picture and, like everything else in security, doing it professionally requires you to be pragmatic by balancing absolute security in a particular area against your users’ ability to actually do what they care about.
If FreeBSD did not ship a binary driver, the overwhelmingly more likely outcome would be fewer people using FreeBSD. If you care about FreeBSD, as the developers presumably do, you want as many users as possible, to improve the odds of being taken seriously when you try to negotiate better support with a vendor. Consider how much trouble OLPC had with WiFi firmware; that gives you a lower-bound estimate for the number of units that have to be on the line before a hesitant vendor will consider opening something.
We're playing the let's jump to absurd extremes game, are we? Ok, there's no point in any of it - nothing can be trusted - I'll go and install windows.
Of course, I don't agree with this comparison; running the output of the HWRNG through Yarrow is good practice in any case.
Binary blob software also has the distinct disadvantage that even if you were to verify it, you would have to verify each release again whenever there is an update.
You have to make the defaults secure because they're what most people will use. It doesn't matter that you can change it if most people never do.
Additionally, you can analyze binary diffs, much like source code. You would not start from scratch at each new version.
Let's also not forget how hard it is to verify source code, and how very rarely it is done.
In reality, the difference is that we can guarantee that 100% of systems work without the hardware RNG. Every binary blob we remove drops support for some hardware and 'breaks' the OS for a subset of the users.
No one* is going to fire up an OS and go "Hey, this OS doesn't use my hardware RNG! This OS is broken!"
They will, however, fire it up and go "Hey, this OS doesn't support my network adapter! This OS is broken!"
Security versus functionality. It's always a trade-off. The most secure system is the one that does nothing - everything past there is adding functionality at the expense of an increased attack surface.
* To within a margin of error.
The flaw in your argument is the assumption that the drawback of using a software RNG instead of a hardware RNG is inherently less than the drawback of using an open source driver instead of a proprietary binary blob driver. There may exist specific circumstances where this is actually the case (as is obvious if there is no open source driver whatsoever), but that isn't the immutable state of the world. If you take this kind of security to be a Serious Issue, then it demands engineering effort (or community pressure) directed toward the goal of having published source code for all hardware drivers. One of the ways you can do that is by refusing to ship binary blobs (as Debian does), which creates some trouble for Debian, but also creates some trouble for the hardware vendors, who now have customers avoiding their hardware because it involves more trouble than a competitor's hardware.
The typical response to this is the argument that not enough people run these operating systems to make the hardware manufacturers care, so you do more harm to yourself than you do to them. But that's ignoring the demographics. The primary constituency of Unix-like operating systems is the likes of IT professionals, systems administrators and software developers. These are the people who buy computer hardware by the pallet. Hardware manufacturers do not want those people to notice them in a bad way.
This is incidentally why the state of open source GPU drivers is so much worse than the state of e.g. open source network drivers. When the local BOFH goes to buy a thousand rack units of Debian servers, he generally cares a lot more about the NIC than whether it will do 3D hardware acceleration. But that's going to change soon enough as GPGPU gets popular in the datacenter.
Edit: thanks for the interesting replies!
The simple example is a basic SR latch (two NOR gates, where the output of one gate feeds one of the inputs of the other, and vice versa), where you start things off by applying a signal to both S and R. When you remove the signals, the latch will eventually fall back into one of the two stable states - but which state it ends up in is random.
So you can easily produce a stream of bits from a potentially-biased random source, and then do some deterministic massaging of that stream to produce an unbiased stream of random bits.
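The classic example of such massaging is the von Neumann extractor; a minimal sketch:

    def debias(bits):
        # Consume bits in pairs: emit the first bit when the pair differs
        # ((0,1) -> 0, (1,0) -> 1), and discard (0,0) and (1,1). If the
        # input bits are independent, the output is exactly unbiased, no
        # matter how biased each individual bit is.
        it = iter(bits)
        for a, b in zip(it, it):
            if a != b:
                yield a

    biased = [1, 1, 1, 0, 1, 1, 0, 1, 1, 0]   # ~80% ones
    print(list(debias(biased)))               # unbiased, but shorter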
That output stream may be biased, so it subsequently goes through a "whitening" stage based on AES.
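This is not Intel's actual conditioner, just a sketch of the whitening idea using AES in CBC-MAC form (requires the pyca/cryptography package; the key and inputs are stand-ins):

    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    def whiten(raw: bytes, key: bytes) -> bytes:
        # Compress possibly-biased raw bits through AES-CBC and keep only
        # the final block (a CBC-MAC): every input bit influences the
        # result, and bias in the raw stream is smoothed out.
        assert len(raw) % 16 == 0
        enc = Cipher(algorithms.AES(key), modes.CBC(b"\x00" * 16)).encryptor()
        return (enc.update(raw) + enc.finalize())[-16:]

    conditioned = whiten(b"\xfe" * 64, key=b"\x2a" * 16)   # toy inputs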
(One question I haven't seen an answer to: presuming all the hardware functions as described, would it be possible for a microcode update to change the output --- e.g., by whitening the output of the real-time clock instead of what I'm calling the coin-flip circuit?)
For some physical phenomena that can be exploited for getting entropy from a chip or other hardware, have a look at http://en.wikipedia.org/wiki/Hardware_random_number_generato...
If I had an application where it mattered, I would pick RDRAND over a software system that was not reviewed, built, verified, and continuously monitored by someone I trust. It's implausible that RDRAND on my hardware is implemented differently from the hundreds of thousands of other copies of the core, or differently from the way it was yesterday. I have no such confidence in /dev/random. That doesn't mean there's anything wrong with FreeBSD's RNG;
it only means that it's harder to be sure I'm actually using it.
Not so good for the share price, methinks.
What about someone who has actually been victimized by an exploit of a backdoor that was ordered installed by the government? Would they have standing to sue for damages that resulted?
I only ask because I know "standing" has been an important issue in many of the cases related to these abuses. Most of the suits have failed because the plaintiffs were found to lack standing for one reason or another.
Edit: which is to say that I thought the issue wasn't that the Intel/via chip's random number generators wouldn't look random in aggregate over lots of uses, but that people are concerned that there could be exploitable patterns in short sequences.
Maybe an open-source FPGA-based solution might do the trick if you really need high-quality, highly secure, fast random number generation.
Personally when I've needed good and fast sources of entropy I've just picked a good (but slow) random number source and used it as a key for a strong stream cipher (that I would renew every megabit or so). Assuming there are no weaknesses in the cipher and you have hardware acceleration you can get a very fast PRNG source.
And unlike RDRAND you can actually audit the hardware cipher implementation because it behaves deterministically.
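A sketch of that construction, with ChaCha20 via the pyca/cryptography package standing in for "a strong stream cipher" (the rekey interval and source are arbitrary stand-ins):

    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms

    class FastRandom:
        REKEY_AFTER = 128 * 1024    # rekey every ~1 megabit of output

        def __init__(self, slow_source=os.urandom):
            self._slow = slow_source        # the good-but-slow entropy source
            self._rekey()

        def _rekey(self):
            # Pull a fresh key and nonce from the slow, trusted source.
            key, nonce = self._slow(32), self._slow(16)
            self._ks = Cipher(algorithms.ChaCha20(key, nonce),
                              mode=None).encryptor()
            self._used = 0

        def read(self, n: int) -> bytes:
            # The keystream is just the encryption of zeros; very fast
            # with hardware acceleration.
            out = self._ks.update(b"\x00" * n)
            self._used += n
            if self._used >= self.REKEY_AFTER:
                self._rekey()
            return out

    rng = FastRandom()
    print(rng.read(32).hex())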
Here's a list:
Key quote: "rdrand in ivbridge not implemented by Intel."
It's well known that the Intel RDRAND implementation uses a hardware entropy source to seed a deterministic RNG implemented directly in the CPU. The suspicion is that this CSPRNG could be using an elliptic-curve construction to carry out the final mixing, with the NSA having chosen both the curve and the points on the curve. It's impossible to tell from the outside whether this is the case; such an implementation would still pass all known randomness tests. However, if the curve points have been carefully chosen, then you can derive the internal state of the RNG from the output and thus both predict future values (until the hardware entropy source re-seeds the generator, if it actually does that at all) and derive past ones (back to the last re-seed).
It's been said elsewhere that using an elliptic curve CSRNG where the NSA has chosen the curve points is effectively equivalent to doing the second half of a Diffie-Hellman exchange with the NSA to securely communicate your RNG state to them. For cryptographic algorithms that depend upon a secure RNG source this is devastating. All communications that use such are source are transparent to the NSA. (The NSA crypto geeks must have thought this was a fantastic trick: the communication would remain completely secure against all other third parties because deriving the magic numbers that let you decrypt the output from the public curve points is next to impossible - it's the discrete logarithm problem in action.)
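A toy illustration of that trapdoor structure, hedged heavily: it uses exponentiation mod a prime instead of an elliptic curve, and skips the output truncation the real Dual_EC_DRBG performs, but the algebra of the backdoor is the same:

    # Public parameters: two "random-looking" generators P and Q. The
    # designer secretly constructed them so that P = Q^d mod p.
    p = 2**61 - 1            # toy-sized Mersenne prime
    Q = 5
    d = 123456789            # the designer's trapdoor exponent
    P = pow(Q, d, p)

    def step(state):
        output = pow(Q, state, p)   # what the RNG emits
        nxt = pow(P, state, p)      # the RNG's next internal state
        return output, nxt

    s1 = 0xDEADBEEF          # victim's secret initial state
    r1, s2 = step(s1)
    r2, _ = step(s2)

    # The attacker sees only r1, but knows d:
    # r1^d = Q^(s1*d) = (Q^d)^s1 = P^s1 = s2, the next internal state.
    recovered = pow(r1, d, p)
    assert recovered == s2
    assert pow(Q, recovered, p) == r2   # all future outputs predicted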
As far as I know, there are no known issues with this approach, unlike with the elliptic curve random number generator detailed in the same document, where the NSA is believed to have pre-selected the elliptic curve points.
If that note is true, then crucial details of the RDRAND specification in Intel CPUs were supplied by someone other than Intel.
Who's most likely to want to push for a particular implementation to their own enormous potential advantage? The NSA. Who has prior history of doing exactly this to a random number generator standard? The NSA. Do we know that the NSA influenced the Intel RDRAND implementation in these ways? No. Did they have both motive & opportunity? Absolutely, yes they did.
Which is why a concrete reference for that note would be good: if someone can point to the place in the public record where Intel admitted that they were not solely responsible for the RDRAND implementation then that's a fairly big deal & a brief sentence in someone's conference notes isn't good enough.
I had already started to forget about that...
From that other report (September):
> They reveal a highly classified program codenamed Bullrun, which according to the reports relied on a combination of "supercomputers, technical trickery, court orders, and behind-the-scenes persuasion" to undermine basic staples of Internet privacy, including virtual private networks (VPNs) and the widely used secure sockets layer (SSL) and transport layer security (TLS) protocols.
Basically any stream cipher worth its salt would pass the parent's test.
This doesn't apply to stream ciphers.
As far as I know, theoretically there could be a collision and the cipher could output the same value for two different counter values, unless the block cipher guarantees that such a collision can never occur? I was not aware of such a guarantee.
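For what it's worth, a block cipher under a fixed key is a permutation, so two distinct counter values can never encrypt to the same block; that's actually the textbook way to distinguish counter-mode output from true randomness. A toy demo with an 8-bit "block cipher":

    import random

    # Any keyed permutation of 0..255 models a (tiny) block cipher.
    perm = list(range(256))
    random.Random(42).shuffle(perm)            # fixed "key"

    ctr_mode = [perm[counter] for counter in range(256)]
    assert len(set(ctr_mode)) == 256           # permutation: no collisions

    truly_random = [random.randrange(256) for _ in range(256)]
    print(len(set(truly_random)))              # ~162 distinct: collisions abound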
My fear is that the extra complexity adds additional opportunity for a backdoor to be inserted. However, software can be audited; the hardware cannot be.
According to Theodore Ts'o, there was pressure from Intel engineers to rely solely on RDRAND, but they were rebuffed. It looks like the FreeBSD devs did do it, and had /dev/random using RDRAND/Padlock directly.
[-1] although, and I don't know why, the Linux RNG XORs pool output with RDRAND output instead of feeding RDRAND into the entropy pool itself
That's the only way I can interpret "for 10, we are going to backtrack and remove RDRAND and Padlock backends and feed them into Yarrow instead of delivering their output directly to /dev/random" anyway.
I'll try to stop now, but the thread provides a rationale for it: a history of unnoticed bugs in the random pool makes the risk of a trivial XOR lower, as far as the kernel devs are concerned. If you XOR a known A and an unknown B, you can't know anything about the output (it won't be any more known than B is), whereas if you know of bugs in the random pool, you might be able to take advantage of them and skew the output. Make of that what you will.
There are some relevant comments in the latter third of the thread:
> When I asked the engineer who submitted it what his rationale was, he couldn't give me a coherent answer. He tried to handwave something around performance. When I pointed out this was not a problem since we had engineered solutions to support all of the systems that didn't have RDRAND, and invited him to give me one example where we had a performance bottleneck, he never responded.
> [Tony Luck]
> More clarifications - there have been engineers from both Intel and Red Hat referenced in this thread. I believe in the response above Ted is referring to the Red Hat one when he talks about not being given a rationale for a patch being submitted.
> The lead sentence of the post that started this whole thread is rather misleading, and perhaps based on Ted mis-remembering the exact nature of the patch that he applied in July 2012. At no point did any Intel engineer apply "pressure from Intel engineers to let /dev/random rely only on the RDRAND instruction". Peter pointed out above that RDRAND is not suited for this, given the ways that /dev/random and /dev/urandom are used - which is why his patch did not feed RDRAND straight to /dev/
> +Tony Luck That's fair, I should have said /dev/random driver. However, note that get_random_bytes() does get used for key generation inside the kernel, so if RDRAND was compromised, we could get screwed that way. The bigger threat though is actually OpenSSL choosing to use RDRAND directly for session keys, without doing something like encrypting the RDRAND output with a secret key. (And BTW, OpenSSL does have a RDRAND engine so that's just a configuration away....)
Here's the direct link to Linus Torvalds' entertaining observation:
fake_random = 5                  # suppose RDRAND is backdoored to a constant
result = secret ^ real_random    # pool output the attacker doesn't know
result = result ^ fake_random    # XORing in a known value reveals nothing new
current_random = <something unpredictable>    # the kernel's own pool output
fake_random = RDRAND()                        # possibly backdoored
new_random = current_random XOR fake_random   # still unpredictable to the attacker
The fact is that the Linux/BSD random number generators were actually rather good before all this RDRAND stuff.
fake_random = 5
result = secret ^ real_random ^ 5   # substituting: still as unknown as real_random
In a larger sense, there's no way to stop the processor from simply setting result to whatever it wants before returning it to the place where it's used.
My point is, there's no real way to trust hardware that can't be verified.
No, the situations and concerns are very different.
Note that the implementation of the systems' entropy pools is also quite different -- my understanding is that FreeBSD uses Yarrow, a pRNG, to feed /dev/random, while Linux mixes bits of 'real' random data together, without the extra pRNG stage. Which of these schemes is better is irrelevant to this particular discussion.
No links, I'm afraid, as EE apparently thinks Stack Exchange isn't suitable for under 18s and I can't turn content lock off at the moment.
I'm not quite certain, but it should be possible (though perhaps overly complex) to use RDRAND (or calls to the crypto routines) as a trigger for an XOR backdoor working in concert with an RDRAND backdoor, and thus control the final output.
But practically, that would (a) be easy to catch happening, and (b) require the chip to fingerprint the relevant sections of machine code as implemented in multiple versions of the Linux kernel, and as compiled with innumerable iterations of compiler settings and versions.
I don't want to say improbable, but I find it difficult to imagine otherwise.
The way a CPU works internally, intermediate results are written to registers, and the registers specified by the machine code are renamed onto a larger set of real registers. The CPU knows at the instruction decode stage which results are going to be used by which instructions, so that it can schedule them for out-of-order execution. It also knows when it can throw away the contents of a real register, because the machine-code register it represents has been overwritten with another value. So if the result of RDRAND is used by a second instruction, the CPU can know exactly where that result flows.
Yes, if you wrote the result of RDRAND to main memory and read it in again, you would destroy that knowledge, but I would bet that the random number generator doesn't do that.
Then you read them back from memory and XOR the same numbers again. You compare that result with the XOR of the register values and see if they are the same.
If they are different, the chip is substituting a different XOR result whenever it is capable of determining that one of the operands is the output of an RDRAND instruction.
You might presume that an efficient mixer of multiple entropy sources would just XOR all of its various inputs together, so compromising XOR would be an easy way to discard the entropy of the other inputs in a way that is not easily detected without writing assembly code specifically to check it.
"Long answer: we use rdrand as _one_ of many inputs into the random pool, and we use it as a way to _improve_ that random pool. So even if rdrand were to be back-doored by the NSA, our use of rdrand actually improves the quality of the random numbers you get from /dev/random."
Exactly. So work on solid security/privacy principles, rather than take the easy way out and hope for the best.
That said, I am a Mint XFCE user and have been since 13 (before that, GNOME 2); every release has been brilliant. I run it on everything from an ancient ThinkPad to a thoroughly modern development machine, and it has worked well.
It's also the only WM/DE that handles multiple screens (3 on both desktops and 2 on the Dell/external) without any show-stopping or tremendously irritating bugs. Couple that with the huge amount of software available via Debian/Ubuntu and PPAs, and it's a cracking developer OS.