

Linear Congruential Generators - nkurz
http://www.eternallyconfuzzled.com/tuts/algorithms/jsw_tut_rand.aspx

======
sjolsen
> However, because the implementation of rand is unknown, it is impossible to
> tell whether or not using modulus to shrink the range will result in an
> acceptably random sequence. The reason that this might not be the case is
> due to the very definition of linear congruential generators.

Using the modulus operation to transform the range of std::rand is
problematic, but it isn't because of any attribute of the implementation of
std::rand or of linear congruential generators.

One problem with using modulus to scale a random distribution is that it fails
to preserve uniformity. For example, consider a function 'rand150' whose
result is uniformly distributed on [0, 150); that is, each number in the range
[0, 150) has the same probability (1/150) of being returned by any given call
to rand150. Now, if you use modulus to scale to [0, 100):

    x = rand150() % 100;

there are 100 possible values for x. For any value greater than or equal to
50, there is exactly one result of rand150 which could produce that value;
therefore, the probability of x being that value is 1/150. But for values
less than 50, there are _two_ results of rand150 which could produce that
value, giving it a probability of 2/150. Since half of the possible values of
x have a probability of 1/150 each and the other half have a probability of
2/150 each (you can check that this adds up to a probability of 1), the
distribution is very much nonuniform. This result is not limited to small
ranges: the bias appears whenever the size of the source range is not an
integer multiple of the size of the target range, although the distribution
approaches uniformity as the target range becomes small relative to the
source range.
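
To see the skew concretely, here's a minimal sketch that simulates the
hypothetical rand150 by enumerating its 150 equally likely outcomes:

    #include <cstdio>
    
    int main(void) {
        // Tally x = r % 100 over one full pass of [0, 150); each r is
        // equally likely, so the counts are proportional to the true
        // probabilities.
        int count[100] = {0};
        for (int r = 0; r < 150; ++r)
            ++count[r % 100];
        // Values below 50 are hit twice (by r and r + 100), the rest once.
        printf("P(x = 25) = %d/150\n", count[25]);  // prints 2/150
        printf("P(x = 75) = %d/150\n", count[75]);  // prints 1/150
        return 0;
    }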

The other problem is that you can't make the range bigger. Well, you
actually can, but some of your "x" values will have a probability of zero,
which probably isn't what you want. Considering rand150 again, if you try to
scale to [0, 300) by doing:

    x = rand150() % 300;

how often do you expect to get 250?

> The common solution is to use the constant value RAND_MAX, defined with
> every standard library, to use the high-order bits of a random number
> through division rather than low-order bits with the remainder of division

I'm not sure if there's a typo in the code, but if RAND_MAX is, say, 200, then

    rand() / (RAND_MAX / 100 + 1)

gives you values on [0, 66]. Probably not what you want.
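
A quick enumeration (again using the hypothetical RAND_MAX of 200 from
above) confirms the squashed range:

    #include <cstdio>
    
    int main(void) {
        const int kRandMax = 200;  // hypothetical RAND_MAX from above
        int max_seen = 0;
        // Enumerate every possible rand() result, 0 through RAND_MAX.
        for (int r = 0; r <= kRandMax; ++r) {
            int x = r / (kRandMax / 100 + 1);  // the divisor is 3
            if (x > max_seen)
                max_seen = x;
        }
        printf("largest value produced: %d\n", max_seen);  // 66, not 99
        return 0;
    }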

> An alternative to dividing RAND_MAX by N is to force the expression to
> floating-point and multiply RAND_MAX by N

Mathematically, this would be a sensible way of scaling _continuous_ uniform
distributions, or at least it would be if floating-point numbers weren't so
imprecise. When dealing with discrete distributions, though, this approach
still has problems. For example, what if rand gives values on [0, 30) and I
want values on [0, 20)? The scaled value is x = floor(rand * 20 / 30). Let's
see what each value of rand maps to, assuming perfect real arithmetic:

    rand  x
    0     floor(0)   = 0
    1     floor(2/3) = 0
    2     floor(4/3) = 1
    3     floor(6/3) = 2
    4     floor(8/3) = 2

Unfortunately, 1 will only be generated half as often as 0 or 2.
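
Extending the table over the whole input range makes the pattern visible;
this sketch tallies the mapping for all thirty inputs:

    #include <cmath>
    #include <cstdio>
    
    int main(void) {
        // x = floor(r * 20 / 30) for each equally likely r in [0, 30).
        int count[20] = {0};
        for (int r = 0; r < 30; ++r)
            ++count[(int)std::floor(r * 20.0 / 30.0)];
        // Even outputs get two preimages, odd outputs only one.
        printf("count[0] = %d, count[1] = %d, count[2] = %d\n",
               count[0], count[1], count[2]);  // prints 2, 1, 2
        return 0;
    }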

> Yet another alternative is to utilize the uniform_deviate function provided
> not too long ago

Mathematically, this is equivalent to the above. The floating-point
arithmetic may differ very slightly, but if anything it will make the result
worse.

~~~
nkurz
You make many fine points.

Paul Hsieh goes into some ways to properly correct for the modulus effect here
(as well as debunking commonly suggested ways that don't work):
[http://www.azillionmonkeys.com/qed/random.html](http://www.azillionmonkeys.com/qed/random.html)

Alternatively, especially if you have specifically chosen an LCG because you
need the periodicity, you can come up with appropriate constants to generate a
perfectly distributed set of integers matching the period of your choosing:
[http://mathaa.epfl.ch/cours/PMMI2001/interactive/rng4.htm](http://mathaa.epfl.ch/cours/PMMI2001/interactive/rng4.htm)

This second approach not only solves the modulus issue, but can be very useful
for things such as parallelizing the shuffling of an array. Your shuffle won't
be perfectly random (as in, some sequences will occur more often than others),
but it will be properly distributed in the sense that each index will only be
used once.
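
As a rough sketch of that second approach (the constants here are mine,
chosen to satisfy the Hull-Dobell full-period conditions: c coprime to m,
and a - 1 divisible by every prime factor of m, and by 4 when m is), a
full-period LCG visits each value in [0, m) exactly once per period:

    #include <cstdio>
    
    int main(void) {
        // m = 16, a = 5, c = 3: gcd(3, 16) = 1, and a - 1 = 4 is
        // divisible by 2 (the only prime factor of 16) and by 4,
        // so the generator has full period 16.
        const unsigned m = 16, a = 5, c = 3;
        unsigned x = 0;
        for (unsigned i = 0; i < m; ++i) {
            printf("%u ", x);  // each index in [0, 16) printed once
            x = (a * x + c) % m;
        }
        printf("\n");
        return 0;
    }

Since each state appears exactly once before the sequence repeats, the
sequence can stand in for a permutation of array indices.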

~~~
sjolsen
My actual preferred method is to use std::uniform_int_distribution, but in
lieu of that, I prefer an adapter that builds values in a target range
bitwise from values in a source range.

The basic idea is to drop off the top of the input range (redrawing any
value that lands there) so that what remains is a power of two wide; this
makes the bits of the input values independent (i.e., each bit has equal
probability of being 0 or 1). You gather enough bits this way to span at
least the target range, then again drop off and redraw any values that meet
or exceed the range's bound (see the sketch below).
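
A minimal sketch of such an adapter, assuming the source generator is
exposed as a function returning values uniform on [0, src_bound); the names
and interface here are mine, for illustration:

    #include <cstdint>
    #include <cstdio>
    
    // Build a value uniform on [0, bound) from a source uniform on
    // [0, src_bound), following the drop-off scheme described above.
    uint32_t adapt(uint32_t (*src)(), uint32_t src_bound, uint32_t bound) {
        // Largest power of two not exceeding src_bound; source results
        // at or above it are redrawn, so every remaining bit is unbiased.
        uint32_t pow2 = 1;
        while (pow2 * 2 <= src_bound)
            pow2 *= 2;
    
        for (;;) {
            // Gather enough unbiased bits to span at least [0, bound).
            uint32_t value = 0, span = 1;
            while (span < bound) {
                uint32_t r;
                do { r = src(); } while (r >= pow2);  // first drop-off
                value = value * pow2 + r;  // append log2(pow2) bits
                span *= pow2;              // assume no overflow here
            }
            if (value < bound)  // second drop-off: redraw if out of range
                return value;
        }
    }
    
    // Toy stand-in for a [0, 150) source; a real one would be random.
    // Stepping by 101 (coprime to 150) just cycles through all residues.
    static uint32_t counter = 0;
    uint32_t src150(void) { return counter = (counter + 101) % 150; }
    
    int main(void) {
        for (int i = 0; i < 10; ++i)
            printf("%u ", adapt(src150, 150, 100));
        printf("\n");
        return 0;
    }

With src_bound = 150 and bound = 100, each draw is redrawn until it falls
below 128 and the composite is rejected if it lands in [100, 128); both
rejections restrict a uniform distribution to a subset, which is exactly the
conditioning argument below.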

The drop-off method preserves uniformity because a variable uniformly
distributed over a set, conditioned on landing in a subset, is uniformly
distributed over that subset. Breaking down and building up the binary
representation works because uniformity is preserved by the Cartesian
product (i.e., for independent random variables X and Y, X is uniform in A
and Y is uniform in B iff (X,Y) is uniform in A×B) and the binary
representation of an integer is just {0,1}^n for some n.

The advantage of this method is that you can adapt any input range to any
output range without introducing any nonuniformity, cycles, or other "flaws"
in randomness (though it will certainly preserve them, if not amplify them).

Another interesting thing to note about this algorithm is that you don't
technically have to use binary or even a fixed base; given X in A and Y in B,
where A and B are subsets of the natural numbers and |B| ≥ 2, (X,Y) ≅ X *
(sup(B)+1) + Y. You can, if I'm not mistaken, use this property to derive the
first "more reasonable answer" in the first link you gave from the algorithm
above, assuming a smaller target range than input range. And if you look
closely at "X * (sup(B)+1) + Y," you'll see a relationship with quotients and
remainders and, I think, the LCG itself.
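
A tiny worked example of that property (the numbers are mine): with
A = {0, 1, 2} and B = {0, ..., 7}, so sup(B) + 1 = 8, the map enumerates
[0, 24) exactly once:

    #include <cstdio>
    
    int main(void) {
        // (x, y) -> x * 8 + y is a bijection from {0,1,2} x {0,...,7}
        // onto [0, 24), so independent uniform x and y give a uniform
        // combined value; quotient and remainder by 8 invert the map.
        for (int x = 0; x < 3; ++x)
            for (int y = 0; y < 8; ++y)
                printf("%d ", x * 8 + y);  // prints 0..23, each once
        printf("\n");
        return 0;
    }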

This is quite interesting stuff.

