
Show HN: True-Random – Generate “truly” random numbers - jaybosamiya
https://github.com/jaybosamiya/true-random
======
balls187
> Since the number of times it would be able to flip the bit changes due to
> random fluctuations in time due to context switching of processes, this
> generates an arguably truly random bit (I would love to see a PoC that shows
> otherwise, however).

This wouldn't be truly random. It's just using the timing jitter introduced by
context switching as a source of entropy, similar to many other pRNGs that use
system entropy, mouse movements, etc., to seed the pRNG.

A pRNG is 'p' because if you know the conditions used, you can
deterministically predict the outcome. The difficulty of recreating those
conditions has nothing to do with being truly random.
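To make the 'p' concrete, here's a minimal sketch using Python's stdlib
generator: given the same seed (the "conditions"), the entire output stream is
reproducible.

```python
import random

# Two generators seeded with the same value produce identical streams:
# anyone who knows the seed can predict every output.
a = random.Random(42)
b = random.Random(42)

stream_a = [a.getrandbits(32) for _ in range(5)]
stream_b = [b.getrandbits(32) for _ in range(5)]

print(stream_a == stream_b)  # True: deterministic given the seed
```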

------
jsonninja
'fuck, not again' \- said the cryptographer. The title of the project is
potentially very misleading.

I know most of you take this stuff seriously in your code and rely on the
well-known cryptographically secure random number generators:
[https://en.wikipedia.org/wiki/Cryptographically_secure_pseud...](https://en.wikipedia.org/wiki/Cryptographically_secure_pseudorandom_number_generator)

~~~
geofft
CSPRNGs require random seeds. They're the obvious right thing when you can
generate 256 (or whatever your security level is) bits that are truly
unpredictable, independent, and of equal probability 0 and 1, but when
generating significantly more is hard. If you can't generate 256 truly
unpredictable bits, the CSPRNG doesn't help you much. And if you can generate
an unbounded number of them, it's not clear that there's a downside to just
using the TRNG data (but there's certainly the argument that it's better-
studied and more robust to use a CSPRNG anyway).
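To illustrate the seed-expansion point (a toy sketch, not a vetted DRBG;
`csprng_stream` is a made-up name): hash the seed together with a counter to
stretch 256 unpredictable bits into an arbitrarily long stream.

```python
import hashlib
import os

def csprng_stream(seed: bytes, nbytes: int) -> bytes:
    """Toy counter-mode generator: expand a 256-bit seed by hashing
    seed || counter. Illustrative only; use a vetted DRBG in practice."""
    out = b""
    counter = 0
    while len(out) < nbytes:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:nbytes]

seed = os.urandom(32)  # stand-in for 256 truly unpredictable bits
stream = csprng_stream(seed, 64)
```

The whole construction is only as unpredictable as those 32 seed bytes, which
is the point above: the hard part is the seed, not the expansion.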

Whether this particular method of random bit generation is in fact sound is a
different question entirely (and worth asking of _all_ ways to generate those
seeds, including the Linux kernel's entropy divination code).

------
DanielStraight
It seems a bit ambitious to call this true random without any analysis of
randomness quality or predictability.

I find it very unlikely this will be shown to be better than existing RNG
solutions.

It's clever, but clever in the way that sleep sort is clever, at least until
proven to be of actual benefit.

~~~
kazinator
It's expensive; it spins for a millisecond to obtain one bit.

Suppose this is running on a quiescent system with no interrupts or other
tasks running. I can see it degenerating into deterministic behavior. Suppose
the RTC counter and the CPU clock are from the same master clock, and the code
is executing cleanly to the clock from on-chip caches.

Ultimately, the question is: is there really one bit of entropy from each call
to get_bit(), and under what conditions?

~~~
dragontamer
It spins for a millisecond for a chance at half a bit, actually. It takes two
clock calls to get a chance at one bit, and only if the bit pattern changed
during that period.
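For reference, the scheme under discussion looks roughly like this (a sketch
reconstructed from the thread's description, not the project's actual code;
the interval and names are illustrative):

```python
import time

def get_bit(interval_ns: int = 1_000_000) -> int:
    """Flip a bit as fast as possible for ~1 ms and keep the parity.
    Scheduling jitter changes how many flips fit in the interval,
    which is what's supposed to make the result unpredictable."""
    bit = 0
    deadline = time.monotonic_ns() + interval_ns
    while time.monotonic_ns() < deadline:
        bit ^= 1
    return bit

sample = [get_bit() for _ in range(8)]  # eight raw (possibly biased) bits
```

On a quiescent system the flip count per interval could be nearly constant,
which is exactly the degenerate case raised above.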

~~~
infogulch
It _can_ create a whole bit per spin, but my guess is that the output was
predictable enough for the author to notice, so they added a single round of
the most basic random scrubber. 'Hey it looks random now, ship it!'

~~~
dragontamer
From that perspective, I think we can use the methodology in this code as an
input into SHA3 / Keccak or Skein (a SHA3 finalist).

I dunno Keccak, so I'll talk from a Skein perspective. If every millisecond
you added the "nanoseconds" field with the Skein cryptofunction salt =
mix(nanoseconds + salt + current seed + other Skein stuff), it'd be pretty
darn random. (Apparently Keccak can 'sponge up' entropy somehow, but I don't
know the mechanism by which it does so.)

The output would at least be a crypto-secure PRNG. All that would need to be
proven after that is whether or not enough entropy was being gathered from the
real-time clock.
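A sketch of that mixing idea using stdlib SHA-3 (Skein isn't in hashlib, so
this swaps in SHA3-256; the loop count and variable names are my own): fold
each clock reading into a running state, then use the state as a CSPRNG seed.

```python
import hashlib
import time

state = b"\x00" * 32
for _ in range(1000):
    # absorb the current nanosecond reading into the running state
    nanos = time.monotonic_ns().to_bytes(8, "big")
    state = hashlib.sha3_256(state + nanos).digest()

seed = state  # 256-bit value to hand to a proper CSPRNG
```

As noted above, this only helps if the clock readings actually carry entropy;
hashing a predictable sequence just produces a predictable seed.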

------
geofft
This looks to be the same technique as Dan Kaminsky's DakaRand, including the
debiasing:
[https://dankaminsky.com/2012/08/15/dakarand/](https://dankaminsky.com/2012/08/15/dakarand/)

See also Kaminsky's implementation of the same approach in pure JS:
[https://gist.github.com/PaulCapestany/6148566](https://gist.github.com/PaulCapestany/6148566)

and Ryan Finnie's implementation in Perl:
[http://www.finnie.org/2012/08/14/twuewand-2-0-released/](http://www.finnie.org/2012/08/14/twuewand-2-0-released/)

The big concern I have is how reliable this is on virtual machines. Just about
all physical machines I'd want to use have high-quality, trustworthy-within-
my-threat-model (i.e., "if the NSA wanted to attack my silicon, there's easier
silicon for them to attack") hardware random number generators, and all
physical machines I'd want to use pick up sufficient randomness from the
kernel's entropy magic thing. But virtual machines often don't have access to
the hardware RNG, and they don't have access to enough other hardware to
populate entropy. It seems like this technique would be particularly risky
there ... although I don't think anyone's published an attack on DakaRand yet,
so maybe it's fine!

~~~
Natanael_L
The answer for VMs is guest OS drivers that tap into the host's system RNG
(with support for doing so in the VM software).

~~~
nickpsecurity
Exactly. Having a root, high-quality RNG whose output can be fed into other
systems directly or used to seed a CRNG is a classic design pattern of mine.
It can be used for dedicated, simple machines too predictable for a TRNG (e.g.
RTOSes), virtual machines, and probably other things I haven't thought of.

------
caffeinewriter
I'm really hesitant to even consider this without it being run through the
Diehard tests[0], since, from my understanding, "true randomness" should be
cryptographically secure if it's going to be used in a CSPRNG.

[0]:
[https://en.wikipedia.org/wiki/Diehard_tests](https://en.wikipedia.org/wiki/Diehard_tests)

~~~
Freaky
rng-tools includes a command for testing random number generator output. I
tried this with a FreeBSD port from
[https://github.com/waitman/rngtest](https://github.com/waitman/rngtest)

Results aren't great:

    
    
        rngtest: bits received from input: 300032
        rngtest: FIPS 140-2 successes: 0
        rngtest: FIPS 140-2 failures: 15
        rngtest: FIPS 140-2(2001-10-10) Monobit: 15
        rngtest: FIPS 140-2(2001-10-10) Poker: 15
        rngtest: FIPS 140-2(2001-10-10) Runs: 15
        rngtest: FIPS 140-2(2001-10-10) Long run: 0
        rngtest: FIPS 140-2(2001-10-10) Continuous run: 0
        rngtest: input channel speed: (min=236.690; avg=245.396; max=251.742)bits/s

~~~
Freaky
And with more data:

After nearly 8 hours:

    
    
        rngtest: bits received from input: 6762248
        rngtest: FIPS 140-2 successes: 156
        rngtest: FIPS 140-2 failures: 182
        rngtest: FIPS 140-2(2001-10-10) Monobit: 182
        rngtest: FIPS 140-2(2001-10-10) Poker: 174
        rngtest: FIPS 140-2(2001-10-10) Runs: 160
        rngtest: FIPS 140-2(2001-10-10) Long run: 0
        rngtest: FIPS 140-2(2001-10-10) Continuous run: 0
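For context, the monobit test that dominates those failure counts is very
simple: in each 20,000-bit block, the number of ones must fall strictly
between 9,725 and 10,275. A sketch:

```python
import os

def monobit_ok(block: bytes) -> bool:
    """FIPS 140-2 monobit test on one 20,000-bit (2,500-byte) block."""
    assert len(block) == 2500
    ones = sum(bin(byte).count("1") for byte in block)
    return 9725 < ones < 10275

print(monobit_ok(os.urandom(2500)))  # a decent source should nearly always pass
print(monobit_ok(b"\x00" * 2500))    # a constant stream fails
```

Failing this test repeatedly means the raw output has a gross 0/1 imbalance,
which matches the char(0)/char(255) clustering reported further down.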

~~~
caffeinewriter
Yikes.

------
gjmulhol
This is almost certainly not more random than this:
[http://www.fourmilab.ch/hotbits/](http://www.fourmilab.ch/hotbits/)

It is a cool idea.

~~~
saganus
Or than

[https://www.random.org/](https://www.random.org/)

Which even includes real-time statistics on the quality of the numbers.

~~~
woliveirajr
Or

[https://qrng.anu.edu.au/](https://qrng.anu.edu.au/)

Which takes its numbers from random fluctuations in quantum processes.

------
jamesbowman
The debiasing idea is due to von Neumann himself:

[https://en.wikipedia.org/wiki/Fair_coin](https://en.wikipedia.org/wiki/Fair_coin)

------
valarauca1
How does checking in `get_fair_bit()` that the next bit isn't the same as the
current one help fairness?

~~~
micaeked
This is a well-known, simple method for getting an unbiased bit from a series
of (possibly biased) coin flips.

Stated another way, the algorithm for getting an unbiased bit from a biased
coin is:

1\. Flip the coin twice.

2\. If it's both heads or both tails, discard result and go back to 1.

3\. If it's heads then tails, result = 0

4\. If it's tails then heads, result = 1
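The steps above can be sketched directly (with `biased_coin` as a stand-in
source; labeling heads/tails as 1/0 is equivalent up to renaming):

```python
import random

def biased_coin() -> int:
    # stand-in for any biased bit source: 1 with probability 0.9
    return 1 if random.random() < 0.9 else 0

def von_neumann_bit(flip=biased_coin) -> int:
    """Flip twice; discard equal pairs; keep the first bit of an
    unequal pair. Output is unbiased if the flips are i.i.d."""
    while True:
        a, b = flip(), flip()
        if a != b:
            return a

bits = [von_neumann_bit() for _ in range(2000)]
# despite the 90/10 source, ones and zeros come out near 50/50
```

The i.i.d. assumption is exactly what the next comment attacks: the proof of
unbiasedness needs P(01) = P(10), which holds only if the bias is the same for
both flips of a pair.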

~~~
infogulch
This assumes the bias can't change over time. This works for a coin because
you're using the same coin for all flips, you just don't know the bias. This
falls apart in this library because in the computer the source _can_ be
affected differently over time -- an attacker can effectively switch out your
coin on every flip, rendering this whole process useless.

~~~
dakami
I'm not convinced Von Neumann debiasing is vulnerable like you think. You're
dropping runs (00's and 11's) so you don't care if it's 90% 0 or 90% 1, you
care if there's an even number of a run or an odd number of a run.

In other words:

0000000000000001 000000001 001

...are all interchangeable. So biases can change all day.

There are _other_ issues I've seen in real world data, but not this one.

~~~
infogulch
So I have the ability to arbitrarily change the bias on every flip, and you
don't think debiasing is affected?

How about this: for the next string of generated bits, for every even bit the
bias is 90% towards 1, and every odd bit the bias is 90% towards 0.

raw: 0101010101010111010101010001010101010101

'debiased': 1111111111111111

~~~
dakami
Ah, there you go. Repro code:

    #!/usr/bin/python
    from random import random

    count = 0

    def get_bit():
        global count
        count += 1
        if (count % 2) == 0:
            return random() < 0.9
        else:
            return False

    def get_fair_bit():
        while 1:
            a = get_bit()
            b = get_bit()
            if a != b:
                return a

    while 1:
        print(get_fair_bit())

I'm wondering how often this one shows up in the field -- my interest here
comes from the ~1% failure rate of RSA keys in the field, because
manufacturers actually will not install hardware RNGs or externally inject
key material. Field vulnerability trumps religion. It looks like you need a
really stateful vuln -- something akin to "a 1 is always followed by a 0 due
to power drain" wouldn't be enough, because the number of 0's is still
acceptably unbounded.

I know there are debiasers that look at statistics across a group. Wonder what
else is out there.

------
Grue3
What if get_bit happens to be deterministic on a particular machine? Then
get_fair_bit would be stuck in an infinite loop. This can potentially happen
when the CPU is so overloaded that executing the bit-flipping instruction takes
longer than a millisecond.

------
coldcode
Could you not build a "truly" random number generator using a quantum
computer?

------
spamfilter247
Wouldn't an example of a truly random number generator be to open up a pseudo
random webpage (say something like
www.engadget.com/page/<pseudo_random_number>) and do a modulo count of the
number of whitespace separated words/tokens?

------
anishathalye
This post reminded me of this other project:
[https://github.com/dasmithii/RCRand](https://github.com/dasmithii/RCRand) (a
similarly silly but fun race condition based RNG)

------
wyager
The only thing in the world that might be classified as truly random is
wavefunction collapse under observation, and even then we're not sure.

~~~
cyphar
Or radiation.

------
TickleSteve
This is _not_ random.... it is deterministic.

Typical round-robin time slices are on the order of 10ms, so you would get a
significant number of non-context-switched 'random' numbers. Combined with
analysis using the cycle counter (namely, a counter running at the clock
speed of your processor), that would yield a _very_ non-random value.

------
szc
The periodic interrupt frequency of the system this is run on will have an
impact on the numbers produced.

------
highCs
You have to ship the hardware with it.

------
arno1
ugly tests

1\. ("true"-random) 10 iterations; 32 random bytes | Result Avg Entropy: 4.82

    
    
      $ echo $(echo -n "("; for i in $(seq 1 10); do echo -n $(./generate_constant_stream |head -c 32 |ent |head -1 |awk '{print $3}')"+"; done; echo -n "0)/10") |bc -l
      4.81945500000000000000
    

2\. ("true"-random) 100 iterations; 32 random bytes | Result Avg Entropy: 4.37

    
    
      $ echo $(echo -n "("; for i in $(seq 1 100); do echo -n $(./generate_constant_stream |head -c 32 |ent |head -1 |awk '{print $3}')"+"; done; echo -n "0)/100") |bc -l
      4.37395291000000000000
    

3\. ("true"-random) 200 iterations; 32 random bytes | Result Avg Entropy: 4.35

    
    
      $ echo $(echo -n "("; for i in $(seq 1 200); do echo -n $(./generate_constant_stream |head -c 32 |ent |head -1 |awk '{print $3}')"+"; done; echo -n "0)/200") |bc -l
      4.34563333000000000000
    

1\. (openssl) 10 iterations; 32 random bytes | Result Avg Entropy: 4.88

    
    
      $ echo $(echo -n "("; for i in $(seq 1 10); do echo -n $(openssl rand 32 |ent |head -1 |awk '{print $3}')"+"; done; echo -n "0)/10") |bc -l
      4.88125000000000000000
    

2\. (openssl) 100 iterations; 32 random bytes | Result Avg Entropy: 4.87

    
    
      $ echo $(echo -n "("; for i in $(seq 1 100); do echo -n $(openssl rand 32 |ent |head -1 |awk '{print $3}')"+"; done; echo -n "0)/100") |bc -l
      4.87404420000000000000
    

3\. (openssl) 200 iterations; 32 random bytes | Result Avg Entropy: 4.88

    
    
      $ echo $(echo -n "("; for i in $(seq 1 200); do echo -n $(openssl rand 32 |ent |head -1 |awk '{print $3}')"+"; done; echo -n "0)/200") |bc -l
      4.87885575000000000000
    

1\. (/dev/urandom) 10 iterations; 32 random bytes | Result Avg Entropy: 4.82

    
    
      $ echo $(echo -n "("; for i in $(seq 1 10); do echo -n $(head -c32 < /dev/urandom |ent |head -1 |awk '{print $3}')"+"; done; echo -n "0)/10") |bc -l
      4.82264100000000000000
    

2\. (/dev/urandom) 100 iterations; 32 random bytes | Result Avg Entropy: 4.89

    
    
      $ echo $(echo -n "("; for i in $(seq 1 100); do echo -n $(head -c32 < /dev/urandom |ent |head -1 |awk '{print $3}')"+"; done; echo -n "0)/100") |bc -l
      4.88655640000000000000
    

3\. (/dev/urandom) 200 iterations; 32 random bytes | Result Avg Entropy: 4.88

    
    
      $ echo $(echo -n "("; for i in $(seq 1 200); do echo -n $(head -c32 < /dev/urandom |ent |head -1 |awk '{print $3}')"+"; done; echo -n "0)/200") |bc -l
      4.88061280000000000000

~~~
gberger
Can you explain what these results mean?

~~~
arno1
The values presented are "N bits per character": 8 is the maximum, 0 is the
minimum, and lower is worse. The ugly test results show that the
"true"-random tool does not perform better than openssl or /dev/urandom. What
makes it even worse is that the entropy decreases with the number of
iterations. These are simple bash tests; you can try them on your own,
playing with the values. Though I suggest understanding some theory first:

\- randomness
[https://en.wikipedia.org/wiki/Randomness](https://en.wikipedia.org/wiki/Randomness)

\- entropy
[https://en.wikipedia.org/wiki/Entropy_(computing)](https://en.wikipedia.org/wiki/Entropy_\(computing\))

\- ent [http://www.fourmilab.ch/random/](http://www.fourmilab.ch/random/)
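For what it's worth, the "bits per character" number ent prints is just the
Shannon entropy of the byte histogram; the same calculation in a few lines (my
own sketch):

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (8.0 = looks uniform)."""
    n = len(data)
    counts = Counter(data)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

print(shannon_entropy(b"\x00" * 100))      # 0.0 (no uncertainty)
print(shannon_entropy(bytes(range(256))))  # 8.0 (every byte once)
```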

~~~
arno1
I think it is even worse than I thought...

I've asked the "true"-random tool to give me 3000 of the really "true" numbers
(as it claims) and out of 3000 it has thrown to me ~9% of char(0) and ~9% of
char(255) values (see the fractions below), whilst the others <0.01% per char
between char(1-254).

    
    
      ./generate_constant_stream |head -c3000 |ent -c
      Value Char Occurrences Fraction
      0              309   0.103000
      1               33   0.011000
      ...
      ...
      253   �           28   0.009333
      254   �           29   0.009667
      255   �          312   0.104000
      Total:          3000   1.000000
    
      Entropy = 6.861709 bits per byte.
    
    

_Update_

Filtering out the char(0)/char(255) bytes still didn't outperform
/dev/urandom, and it ran terribly slowly (~5 mins on an i7) while using 100%
of a CPU core:

    
    
      $ ./generate_constant_stream | cat - | sed -u 's/\x00//g;s/\xff//g' |head -c3000 |ent -c
      Entropy = 7.737683 bits per byte.
    
      $ cat /dev/urandom | sed -u 's/\x00//g;s/\xff//g' |head -c3000 |ent -c
      Entropy = 7.924493 bits per byte.

------
egoegoego
Ego has infected this thread. We can't just say that this project is
interesting. We have to talk about "expensive" computation. Why? What does
"ambition" have to do with computer science?

~~~
dang
We detached this subthread from
[https://news.ycombinator.com/item?id=11669106](https://news.ycombinator.com/item?id=11669106)
and marked it off-topic.

