The linked overview on Efficiently Generating a Number in a Range by O'Neill (of PCG fame) is insightful. It gives some more background and compares several approaches (including this, Lemire's).
It seems based on her testing that a very simple bit mask & rejection technique ("Apple's method", as Apple implemented it in their modification of arc4random_uniform) is the fastest among the unbiased techniques, particularly for large N, while for mixed N Lemire's technique seems competitive or better. She suggests a further optimisation, which now clearly "wins" except for large N.
Cool stuff - I find random number generation fascinating.
If I'm reading this correctly, it is very close to what Python has been doing for over a decade, so I am not sure that using a "rejection method" counts as "new tech". Here are the relevant snippets.
Code for shuffle(x):
for i in reversed(range(1, len(x))):
j = randbelow(i+1)
x[i], x[j] = x[j], x[i]
Code for randbelow(n):
k = n.bit_length()
r = getrandbits(k)
while r >= n:
r = getrandbits(k)
return r
Rejection isn't new - what's new is using a variation of his "use the high order bits of the 64x64 multiply by a range to get a random number efficiently" trick (which he published two or three years ago) to substantially improve the efficiency of a rejection sampling method.
The efficient random number in a range trick is really cool. I've used it in a couple of data structures that we've designed - it's substantially faster than using modulus, or even using pre-computed division by reciprocal multiplication.
It seems based on her testing that a very simple bit mask & rejection technique ("Apple's method", as Apple implemented it in their modification of arc4random_uniform) is the fastest among the unbiased techniques, particularly for large N, while for mixed N Lemire's technique seems competitive or better. She suggests a further optimisation, which now clearly "wins" except for large N.
Cool stuff - I find random number generation fascinating.
http://www.pcg-random.org/posts/bounded-rands.html