
Infinite Noise – Simple, Open-Source TRNG - nickpsecurity
https://github.com/waywardgeek/infnoise
======
lisper
This is a really good writeup and design, but it misses one crucial point: it
doesn't matter what the actual quality of your noise source is. What matters
is that you have an _accurate estimate_ of the quality of your noise source.
[EDIT: actually, you don't even need that. All you really need is a reliable
lower bound.] No noise source is perfect, so you have to use a whitener
regardless. When you use a whitener you have to choose how many output bits to
extract for a given number of input bits. If you extract more than the quality
of your source allows, you lose. That's really all there is to it.
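As a worked example (my numbers, purely illustrative, not from the article):

    # Suppose your reliable lower bound says each raw bit carries 0.2 bits
    # of min-entropy. Then 256 output bits need at least 256 / 0.2 = 1280
    # raw bits fed into the whitener; extracting more than that ratio
    # allows is exactly the "you lose" case above.
    entropy_per_raw_bit = 0.2            # assumed lower bound, not measured
    output_bits = 256
    raw_bits_needed = output_bits / entropy_per_raw_bit   # 1280.0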

So the best approach to generating secure random numbers is simply to use a
large safety margin in your estimate of the quality of your noise source,
whatever it may be. You don't really need any fancy hardware unless you have
very high bandwidth requirements (and if you think you have high bandwidth
requirements you are almost certainly wrong). For nearly all practical
applications (unless you are a telco) you can safely generate a crypto key by,
for example, grabbing a few frames from a webcam pointed at a fan, or reading
a few seconds of audio from a microphone while you say "Shhh", and running
that through a whitener. That will give you several orders of magnitude of
safety margin, which is enough to thwart any external attack.
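For instance, a minimal sketch under my own assumptions (a hypothetical
mic.raw file holding a few seconds of already-captured microphone samples;
not lisper's exact recipe):

    import hashlib

    # mic.raw: a few seconds of raw audio captured earlier (hypothetical file).
    raw = open("mic.raw", "rb").read()

    # Even at a paranoid 0.01 bits of entropy per byte, a few seconds of
    # 44.1 kHz 16-bit audio (~176 KB/s) carries thousands of bits -- orders
    # of magnitude more than the 256 bits extracted below.
    key = hashlib.sha256(raw).digest()   # the whitener: hash everything down
    print(key.hex())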

~~~
ScottBurson
> When you use a whitener you have to choose how many output bits to extract
> for a give number of input bits. If you extract more than the quality of
> your source allows, you lose.

Crypto is not my area of expertise, but this seems to contradict other things
I've read. For example, in the debate about whether /dev/random should block,
I seem to recall tptacek saying something like this: the idea that we have
ciphers we can't break, but we have randomness sources that "run out" of
entropy, doesn't pass the cryptographer's laugh test. I think someone else put
it this way: seed AES with, say, 512 random bits, and then you can encrypt as
many zero bits as you want to generate your random numbers; as long as AES
itself is secure, you're not in any danger.
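That construction looks roughly like this (a sketch, assuming the
pyca/cryptography package, with os.urandom standing in for the 512
true-random seed bits; the seed splits into an AES-256 key and an initial
counter block, the remainder discarded):

    import os
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    seed = os.urandom(64)                 # stand-in for 512 true-random bits
    key, ctr = seed[:32], seed[32:48]     # AES-256 key + 128-bit counter block

    # Encrypting zeros in CTR mode just emits the keystream: a CSPRNG that
    # is secure as long as AES itself is.
    enc = Cipher(algorithms.AES(key), modes.CTR(ctr)).encryptor()
    random_bytes = enc.update(b"\x00" * 1024)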

That doesn't mean it's not a good idea to keep mixing in external entropy, of
course, as long as that's easy to do and doesn't itself give an attacker a way
to influence the results (I seem to recall seeing a paper about this
somewhere). But it doesn't seem to be _essential_ , if you believe those
arguments.

What do you think about this?

~~~
lisper
That is an excellent question!

There are two separate issues here:

1. How many bits of real entropy can you count on getting out of a TRNG in a
given amount of time?

2. How many bits can you safely extract from a PRNG that has been seeded with
N bits of entropy?

This discussion is about issue #1. The /dev/random vs /dev/urandom arguments
are about issue #2.

Just for the record, the answer to #2 is: once your PRNG has been seeded with
"enough" entropy, you can safely extract an essentially infinite amount of
pseudo-randomness out of it (where "essentially" means that there might be a
theoretical limit, but you won't encounter it before the heat death of the
universe).

(How much entropy is "enough"? Again, it depends on how paranoid you want to
be. 128 bits is barely enough. 256 is plenty. 512 is overkill.)

------
rohare
I don't get it. He only provided one test run of ent, and the device failed 4
of the 6 tests in that run. There should be multiple runs with multiple test
suites. Ent and Diehard would be good, or Ent and Dieharder (though a couple
of the dieharder tests are suspect, the rest are good).

I only see one run of ent on one data set and it only passed the Chi Square
Distribution test and the Arithmetic Mean test. It failed entropy, compression
(badly), Monte Carlo Pi (badly), and serial correlation.

Because it is impossible to prove non-randomness (maybe the output really was
1111111111111111111 randomly), neither I nor anybody else can claim that the
device doesn't provide good entropy, but there is no evidence that it does
either.

~~~
wallacoloo
Perhaps I'm mistaken, but it's _expected_ that the entropy test should fail.
Because the noise is being scaled by only 1.82 each round, it's pretty simple
to predict the next bit:

> P(next_bit_is_0 | prev_bit_is_0)

> = P(v * 1.82 < 0.5 | v < 0.5), where v is some real from a uniform
> distribution on [0, 1)

> = P(v < 0.5/1.82 | v < 0.5)

> = (0.5/1.82) / (0.5)

> ≈ 0.55

So 55% of the time that a 0 occurs in the bitstream, the next bit will also be
0. You can make similar predictions for 0 -> 0 -> 0 and 0 -> 1 -> 0, and more
generally, x -> ... -> y -> 0. As a result, you can pick an arbitrarily long
chain of N previous outputs, and then compute how effective knowledge of these
N bits is in predicting the next bit. It will not be significantly different
from knowledge of the past infinity bits, provided N is "large enough". The
higher the likelihood of being able to predict the next bit given knowledge of
all previous bits, the more compressible the data.
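You can sanity-check that 0.55 figure numerically (my sketch of the same
model: v uniform on [0, 1), scaled by 1.82 and folded back into [0, 1) each
round; the fold is a no-op when v < 0.5, matching the calculation above):

    import random

    trials = hits = 0
    for _ in range(1_000_000):
        v = random.random()               # uniform on [0, 1)
        if v < 0.5:                       # previous bit was 0
            trials += 1
            if (1.82 * v) % 1.0 < 0.5:    # next bit is also 0
                hits += 1
    print(hits / trials)                  # ~0.55, as computed above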

By extension, one would expect entropy and serial correlation to also fail,
but these all have solutions/workarounds. If each output bit produces M < 1
bits of unpredictable randomness (i.e. entropy), then an algorithm can be
developed that takes k/M raw bits and outputs k bits in such a manner that the
likelihood of correctly guessing the next output bit given all previous output
bits is _much_ lower.

Typically, a well-built PRNG will do that last function well. Seed it with k/M
raw bits and read k output bits, rinse, repeat.
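For contrast, the classic von Neumann extractor is the simplest algorithm of
that general shape, but it only removes bias and assumes independent input
bits, so it would not fix the serial correlation described above; that is one
reason to reach for a hash or PRNG instead (my sketch, my framing):

    def von_neumann(bits):
        # Emit one bit per non-equal pair: 01 -> 0, 10 -> 1. Perfectly
        # unbiased if input bits are independent; does NOT help when
        # consecutive bits are correlated, as they are here.
        out = []
        for a, b in zip(bits[::2], bits[1::2]):
            if a != b:
                out.append(a)
        return out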

Hope that wasn't excessive - I wrote this as much for my own benefit as
anything else.

------
nickpsecurity
USB stick. Doesn’t have a risky microcontroller like some. Uses thermal noise.
Does whitening in the driver instead of hardware so hardware noise is
verifiable w/ simple implementation. Everything up to BOM is open. Cost about
$10 each to make with a minimum volume of… 9. ;)

Also a Tindie link for anyone that uses them:

[https://www.tindie.com/products/WaywardGeek/infinite-noise-t...](https://www.tindie.com/products/WaywardGeek/infinite-noise-true-random-number-generator/)

~~~
pawadu
Looked great until I saw the FTDI chip...

For reference:
[https://www.youtube.com/watch?v=eU66as4Bbds](https://www.youtube.com/watch?v=eU66as4Bbds)

~~~
nickpsecurity
BitBabbler had one, too. Engineer said it was the simplest, cheapest, "dumb"
product they could find for USB in that form factor or packaging. So, that
problem is apparently going around. It might help if someone in open-source
hardware would make one with similar attributes. It will cost money because
ASIC's ain't cheap.

~~~
pawadu
There are tons of great alternatives:

[http://www.eevblog.com/forum/reviews/alternatives-to-ftdi-us...](http://www.eevblog.com/forum/reviews/alternatives-to-ftdi-usb-to-uart-converter/)

Many of those support the USB serial class, so they don't even need a driver.

My personal favorite is CH340g, used in many new Arduino clones:

[https://hackaday.com/2014/12/02/finding-a-cheaper-usb-to-ser...](https://hackaday.com/2014/12/02/finding-a-cheaper-usb-to-serial-chips/)

~~~
nickpsecurity
Thanks. I'll look at them later and forward them to various people if I like
them.

------
benkuykendall
With the whitening requirement already depending on an algorithm with
cryptographic guarantees (they use SHA3), this hardware only seems useful for
generating keys.

They suggest the following algorithm to generate N bytes:

> The Infinite Noise driver uses the reference version of the SHA3 "sponge",
> called Keccak, with a 1600 bit state. To make the state of the sponge
> unpredictable, it is initialized with 20,000 bits of Infinite Noise data
> before any data is output. After that, reading bytes from the SHA3 sponge
> blocks until twice as many bytes of entropy have been fed into the sponge
> from the Infinite Noise TRNG.

In code, that is

    
    
      repeat N / (block size) times:
        raw <- output of hardware
        output SHA3(raw)
    

They claim that the hardware outputs only log2(1.82) ≈ 0.86 bits of entropy
per bit. Thus, they are relying on SHA3 to act as a cryptographic extractor.

Now, I don't think it's proven that the existence of an extractor implies the
existence of pseudorandomness. However, I believe the usual construction would
still assume that SHA3 is a pseudorandom generator. Under this assumption, the
following algorithm also outputs pseudorandom bits, but with only one use of
the hardware.

    
    
      let seed <- output of hardware
      repeat N / (2 * block size) times:
        result <- SHA3(seed)
        seed <- first half of result
        output second half of result
    

(Assuming the PRG is length-doubling. If it is less than length-doubling, we
can still construct a length-doubling PRG WLOG.)
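Concretely, the two algorithms look something like this (my Python sketch:
SHA3-256 stands in for the driver's full Keccak sponge, and read_hw is a
hypothetical function returning n raw bytes from the device):

    import hashlib

    BLOCK = 32  # bytes per SHA3-256 digest

    def alg1(read_hw, n):                  # re-seed from hardware every block
        out = b""
        while len(out) < n:
            raw = read_hw(2 * BLOCK)       # 2x raw bytes in per block out
            out += hashlib.sha3_256(raw).digest()
        return out[:n]

    def alg2(read_hw, n):                  # seed once, iterate as a PRG
        seed = read_hw(16)                 # G: 16 bytes -> 32 bytes
        out = b""
        while len(out) < n:
            d = hashlib.sha3_256(seed).digest()
            seed, chunk = d[:16], d[16:]   # half re-seeds, half is output
            out += chunk
        return out[:n]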

Does anyone know why (I'd accept cryptographic, philosophical, or practical
reasons) the first algorithm would be preferable to the second?

Relevant reading (both highly recommended books in their own right!):

    
    
      Arora & Barak "Computational Complexity" Ch 21 "Expanders and Extractors"
      
      Goldreich "Founds of Cryptography I" defines PRG and so forth

~~~
hackcasual
SHA3 should be assumed to be pseudorandom under its inputs. If this can be
shown not to be the case, SHA3 would be considered broken.

There is no reason, once you have collected a sufficient amount of entropy
(~128 bits), to continue collecting it from the device. You would also want
to mix in other entropy sources, to prevent an attack along the lines of the
Dual EC DRBG kerfuffle.

~~~
drostie
That's not what was being asked; what was being asked is "why wouldn't you
want to persist some bits from a previous emission to a current one?" and the
answer is "you do; it helps you in threat models where the attacker is somehow
able to blast some hardware for a tiny time such that they can control an N
bit-chunk of seed material, but not what came before or after." If you don't
persist some bits between the calls, then this lets them entirely determine a
few outputs of your RNG -- albeit scrambled by SHA3, but for some security
expectations that's sufficient. On the other hand, if you persist a seeding
block between hash invocations, you block these attacks rather easily.
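In code, the difference is just whether the previous state feeds into the next
hash (an illustrative sketch, not the driver's actual construction; with the
persisted state mixed in, controlling one batch of raw bits no longer
determines any output):

    import hashlib

    state = b"\x00" * 32                   # persisted across calls

    def next_block(raw):
        global state
        # Fold fresh hardware bytes into the persisted state...
        state = hashlib.sha3_256(state + raw).digest()
        # ...and derive output from the state, not from raw directly.
        return hashlib.sha3_256(b"out" + state).digest()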

------
zxcvgm
A few months ago, I came across a similar project: OneRNG [1]. The device is
open-source and one of their goals is to be verifiable by someone who buys the
device. There was an article on LWN about this too [2].

[1] [http://onerng.info/](http://onerng.info/)

[2] [https://lwn.net/Articles/629714/](https://lwn.net/Articles/629714/)

------
hackcasual
I appreciate the cool factor of a dedicated hardware RNG, but provided you've
seeded a CSPRNG with a relatively small number of true random bits, there are
no attacks on it that don't start with "assuming a total preimage attack
against SHA-1...".

I guess I'm saying after ~128 bits you can safely disconnect the TRNG.

------
kwantam
It's clear that a lot of thought went into this write-up, but this is not a
design that I would use or recommend.

(Because you probably don't know anything about me: I have a background in both
circuit design and applied crypto, and I have published research on hardware
random number generators. I'm not saying this to bully you into believing me,
but rather to give you some motivation to think critically.)

The main problem is that the noise sources are _implicit_ : the designer has
no control over the spectrum or the relative magnitude of the various noise
sources, and there's _no_ attempt at rejecting power supply ripple or RF
disturbances. (The claim that "it naturally defends against influence from
outside signals" is not substantiated, and as I argue below, there is a strong
possibility that it is false.)

It's also worrisome that the author doesn't bother to distinguish between
"noise" in the technical sense and "noise" in the some-arbitrary-signal sense
(e.g., power supply ripple). Only the former has useful statistical
properties. The latter is useless at best, and an attack vector at worst.

As a concrete example of what worries me: this is a discrete-time, positive
feedback loop. We are presented with evidence that it has a mode of operation
in which its output looks random. Critically, this does _not_ eliminate the
possibility that it has another mode that's essentially a limit cycle. My
intuition tells me there's a very good possibility that if the output of the
circuit has a heavy enough capacitive load, and the circuit's power supply is
sufficiently high impedance, the whole thing would lock up into a steady-state
oscillation. This is related to the analog design adage: oscillators don't,
everything else does :)

So what's the "right" approach (at least, in my opinion)? First, simplicity:
fewer moving parts means there's less that can go wrong. Second, pick an
explicit noise source; the best choice is a diode in avalanche breakdown (this
is sometimes conflated with Zener breakdown; they're different phenomena, and
avalanche is louder). Avalanche generates enough noise that it requires very
little amplification, and unlike thermal noise, avalanche is quite insensitive
to temperature. Third, make it insensitive to external influences by design:
shield it from RF, take special care to ensure power supply insensitivity,
etc. Fourth, run wellness tests whenever you sample from the RNG. Intel's
procedures are a good example [1].
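For a flavor of what such wellness tests look like, here is a sketch of the
repetition count test from NIST SP 800-90B, which is similar in spirit to the
tests Intel runs; the cutoff shown is illustrative and in practice derives
from your claimed min-entropy:

    def repetition_count_test(samples, cutoff=34):
        # SP 800-90B sets cutoff = 1 + ceil(20 / H) for a 2^-20 false-alarm
        # rate, where H is the claimed min-entropy per sample; 34 here is
        # just an illustrative value.
        run_val, run_len = None, 0
        for s in samples:
            run_len = run_len + 1 if s == run_val else 1
            run_val = s
            if run_len >= cutoff:
                return False               # stuck-at failure suspected
        return True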

Finally, a lot of other comments in this thread have covered this, but under
standard cryptographic assumptions there is no reason to run this RNG
continuously. Sample a few thousand bits, hash them into a key, and use
AES-256 as your PRNG. For a really nice take on randomness mythbusting, see
"Recommendations for randomness in the operating system" by Corrigan-Gibbs and
Jana [2].

[1] [http://www.rambus.com/wp-content/uploads/2015/08/Intel_TRNG_...](http://www.rambus.com/wp-content/uploads/2015/08/Intel_TRNG_Report_20120312.pdf)

[2]
[https://www.henrycg.com/pubs/hotos15recommendations/](https://www.henrycg.com/pubs/hotos15recommendations/)

~~~
nickpsecurity
Interesting write-up. I get really mixed reviews on all these analog designs
from various EE's. So, what do you think about BitBabbler's analysis and
design while you're at it?

[http://www.bitbabbler.org/how.html](http://www.bitbabbler.org/how.html)

EDIT: Now I know the name. Interesting to run into a person whose team
independently invented a verifiable ASIC strategy very similar to my own from
a year or so ago. I learned digital in a high-level, generalist way of
thinking, with no hands-on ability and definitely no analog. So, I have to
work at that level. Started learning risk areas from insiders like this [1].
My initial strategy enumerated the problem space to tackle it head-on [2].
Yet, I switched
gears to follow how provers work: combine untrusted EDA tools and main CPU's
with traces produced for trusted verification tools on inspectable CPU's.
Similar to what I did for high-assurance guard designs at protocol level
without formal methods. I determined by maybe January 2015, based on my Schneier.com
posts that trusted ASIC's should be 0.35 micron or above because they're still
_visually_ inspectable. If ever funded, I'd modify VAMP CPU to be more like
SAFE or CHERI to do the job. If too hard to do visual on, I'd start with a
Forth or JOP-like processor before that. Even considered FM9001 & Piton haha.
Also, replicating computation on diverse, fast CPU's from various vendors,
fabs, and countries with optical communication & EMSEC isolation. The old,
diverse voter scheme I did pre-silicon for things like build systems.

Anyway, props on being ahead of me in implementation and more impressive
design in general. Great work. That we converged on a similar approach means
it's probably worth me putting more effort into among the many ideas I'm
juggling. I still gotta learn analog and hands-on formal methods, though,
before I can do something like yours or provably solve the analog portion.
(sighs) My gut has been telling me since I studied TEMPEST and RobertT's
analog subversions that the solution will have strong analog component past
diverse review of gates and such I was leaning on.

[1]
[https://www.schneier.com/blog/archives/2013/09/surreptitious...](https://www.schneier.com/blog/archives/2013/09/surreptitiously.html#c1744173)

[2]
[https://news.ycombinator.com/item?id=10468624](https://news.ycombinator.com/item?id=10468624)

~~~
kwantam
_So, what do you think about BitBabbler's analysis and design while you're at
it?_

The problem with BitBabbler is that they have a lot of text but, as far as I
can tell, no circuit diagrams and no _real_ discussion of what they're doing.
It may just be that their wall of text makes me go cross-eyed, but my snakeoil
detector goes off whenever I start reading their web page. For example, the
talk about sample-and-hold circuits as "analog shift registers" and worrying
about quantization seems slightly unhinged (for one, if you need more
resolution, just use a bigger sampling cap). Apologies to the BitBabbler
people if they're on the up-and-up, but a page that talks about auditability
and verifiability but doesn't show a schematic seems slightly dishonest.

(Sorry... rant mode off.)

Thanks for the kind words regarding verifiable ASICs! You are absolutely right
that there's a ton of complexity to worry about here. But I sincerely hope
that the world makes more headway on this problem, so I'm really glad you're
thinking about it.

