How fast is this instruction? And how do I access it from a C program on Linux?

agolliver · on June 9, 2013

I used RDRAND to seed an SSE Fractal Flame generator (a stochastic system that needs 4 random numbers per thread per loop iteration, sometimes more depending on the variations used). RDRAND has a maximum throughput of something like 500MB/s, and it takes approximately 150 clocks per invocation. I was never able to hit that performance wall with my fractal program. My fractal program is also significantly faster than other CPU implementations, though that is probably due to the intrinsic vectorization more than the random number source. I also got significantly better looking fractals than what I would get with most of the PRNGs that I tried.

For more, check the "performance" section of this article:

http://software.intel.com/en-us/articles/intel-digital-rando...

Note: if you need more than 500MB/s you can use RDRAND (or RDSEED in Broadwell, when it comes out) to seed a PRNG. I was doing this at first, but the performance of the system didn't improve enough for it to be worth the added complexity.
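A minimal sketch of the "seed a PRNG from RDRAND" idea, not the code from my fractal program. The PRNG here is a plain xorshift64 picked only for brevity, and the names `xorshift64_t`, `xorshift64_seed_hw`, `xorshift64_next`, `hw_random64` and `cpu_has_rdrand` are made up for this example. It assumes GCC or Clang; on anything other than x86-64 the hardware path is stubbed out and seeding reports failure.

```c
#include <stdint.h>

#if defined(__x86_64__)
#include <cpuid.h>
#include <immintrin.h>

/* Returns 1 if CPUID reports RDRAND support (leaf 1, ECX bit 30). */
static int cpu_has_rdrand(void) {
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return 0;
    return (int)((ecx >> 30) & 1u);
}

/* target("rdrnd") lets the intrinsic compile without passing -mrdrnd.
 * The retry limit of 10 is an assumption borrowed from Intel's guidance. */
__attribute__((target("rdrnd")))
static int hw_random64(uint64_t *out) {
    unsigned long long v;
    for (int tries = 0; tries < 10; tries++) {
        if (_rdrand64_step(&v)) { *out = (uint64_t)v; return 1; }
    }
    return 0;
}
#else
/* Stubs so the sketch still compiles on non-x86-64 targets. */
static int cpu_has_rdrand(void) { return 0; }
static int hw_random64(uint64_t *out) { (void)out; return 0; }
#endif

/* Hypothetical generator state: a single nonzero 64-bit word. */
typedef struct { uint64_t s; } xorshift64_t;

/* Seed the generator from RDRAND. Returns 1 on success, 0 if RDRAND is
 * unavailable or keeps failing. xorshift state must never be zero. */
static int xorshift64_seed_hw(xorshift64_t *g) {
    uint64_t seed;
    if (!cpu_has_rdrand()) return 0;
    do {
        if (!hw_random64(&seed)) return 0;
    } while (seed == 0);
    g->s = seed;
    return 1;
}

/* One xorshift64 step (shift constants 13, 7, 17). */
static uint64_t xorshift64_next(xorshift64_t *g) {
    uint64_t x = g->s;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    return g->s = x;
}
```

The idea is to call `xorshift64_seed_hw` once per thread and then pull numbers from `xorshift64_next` in the hot loop, so the hardware instruction is only hit at startup (or whenever you choose to reseed).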

And if you want to access it in Linux from C you can either:

* Use inline assembly. Just be sure to check the carry flag (CF) after calling RDRAND: if RDRAND fails (for example, because you're exceeding the 500MB/s), CF is cleared rather than set, so you have to keep calling RDRAND until CF is set.

* Use intrinsics (easier, via immintrin.h). Here's how I did it in my program (bug reports welcome, I'm only a freshman in college who had lots of free time and a fascination with fractals):

https://github.com/aarongolliver/FractalFlameMicahTaylorEdit...

http://software.intel.com/sites/products/documentation/studi...
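To make the two options concrete, here is a rough sketch of both, separate from the code linked above. It assumes GCC or Clang on x86-64. All function names (`rdrand_supported`, `rdrand64_asm`, `rdrand64_intrin`, `rdrand64_retry`) are invented for the example, and the bounded retry count is a choice left to the caller rather than anything the instruction requires.

```c
#include <stdint.h>

#if defined(__x86_64__)
#include <cpuid.h>
#include <immintrin.h>

/* Returns 1 if CPUID reports RDRAND support (leaf 1, ECX bit 30). */
static int rdrand_supported(void) {
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return 0;
    return (int)((ecx >> 30) & 1u);
}

/* Inline-assembly route: RDRAND sets CF=1 when the destination register
 * holds a valid value and CF=0 otherwise. SETC copies CF into `ok`. */
static int rdrand64_asm(uint64_t *out) {
    uint64_t v;
    unsigned char ok;
    __asm__ volatile("rdrand %0\n\tsetc %1"
                     : "=r"(v), "=qm"(ok)
                     :
                     : "cc");
    if (ok) *out = v;
    return ok;
}

/* Intrinsic route: _rdrand64_step returns 1 on success, 0 on failure.
 * target("rdrnd") avoids needing -mrdrnd on the command line. */
__attribute__((target("rdrnd")))
static int rdrand64_intrin(uint64_t *out) {
    unsigned long long v;
    int ok = _rdrand64_step(&v);
    if (ok) *out = (uint64_t)v;
    return ok;
}
#else
/* Stubs so the sketch still compiles on non-x86-64 targets. */
static int rdrand_supported(void) { return 0; }
static int rdrand64_asm(uint64_t *out) { (void)out; return 0; }
static int rdrand64_intrin(uint64_t *out) { (void)out; return 0; }
#endif

/* Keep calling RDRAND until it succeeds, but give up after max_tries so a
 * broken or missing instruction cannot hang the caller. Returns 1 on
 * success, 0 otherwise. */
static int rdrand64_retry(uint64_t *out, int max_tries) {
    if (!rdrand_supported()) return 0;
    for (int i = 0; i < max_tries; i++) {
        if (rdrand64_asm(out)) return 1;
    }
    return 0;
}
```

Something like `rdrand64_retry(&x, 10)` is the call I'd expect most programs to use. Swapping `rdrand64_asm` for `rdrand64_intrin` inside the loop should behave the same, since both just report the carry flag.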

gus_massa · on June 10, 2013

Do you have any screenshots or samples of the fractals? Just curious.

arete · on June 10, 2013

RDRAND can do > 500MB/s when invoked by 8 threads running in parallel: http://software.intel.com/en-us/articles/intel-digital-rando... The theoretical maximum is 800MB/s.

I wrote an x86-64 asm impl as part of my lightweight Java crypto library (https://github.com/wg/crypto). It would be easy to drop into any C program: https://github.com/wg/crypto/blob/master/src/main/asm/rdrand...

Intel released an open source library too, though in my tests my asm impl was faster ;-) http://software.intel.com/en-us/tags/20757

ape4 · on June 9, 2013

In Linux you can use /dev/random and /dev/urandom
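For example, reading a few bytes from /dev/urandom needs nothing beyond ordinary file I/O. This is only a sketch: the helper name `urandom_bytes` is made up, and it uses /dev/urandom rather than /dev/random so the read does not block.

```c
#include <stddef.h>
#include <stdio.h>

/* Fill buf with len bytes read from /dev/urandom.
 * Returns 0 on success, -1 if the device cannot be opened or the read
 * comes up short. */
static int urandom_bytes(void *buf, size_t len) {
    FILE *f = fopen("/dev/urandom", "rb");
    if (f == NULL) return -1;
    size_t got = fread(buf, 1, len, f);
    fclose(f);
    return got == len ? 0 : -1;
}
```

A caller would do something like `unsigned char key[32]; if (urandom_bytes(key, sizeof key) != 0) { /* handle error */ }`.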

Aardwolf · on June 9, 2013

Do those use that CPU instruction if it is available? That would make the blocking /dev/random a lot faster, wouldn't it?

diroussel · on June 10, 2013

Depends on your kernel. But with the right kernel, yes.