If I'm reading this correctly, it is very close to what Python has been doing for over a decade, so I am not sure that using a "rejection method" counts as "new tech". Here are the relevant snippets.
Code for shuffle(x):
for i in reversed(range(1, len(x))):
j = randbelow(i+1)
x[i], x[j] = x[j], x[i]
Code for randbelow(n):
k = n.bit_length()
r = getrandbits(k)
while r >= n:
r = getrandbits(k)
return r
Rejection isn't new - what's new is using a variation of his "use the high order bits of the 64x64 multiply by a range to get a random number efficiently" trick (which he published two or three years ago) to substantially improve the efficiency of a rejection sampling method.
The efficient random number in a range trick is really cool. I've used it in a couple of data structures that we've designed - it's substantially faster than using modulus, or even using pre-computed division by reciprocal multiplication.
Code for shuffle(x):
Code for randbelow(n): Code for getrandbits(k):