
A call to PHP's mt_rand generates only odd numbers - ComputerGuru
http://3v4l.org/dMbat
======
Stormcaller
Because the max value should be "mt_getrandmax()" instead of "PHP_INT_MAX"; it
just gets a 32-bit number and then scales it up.

see: [http://php.net/manual/en/function.mt-rand.php](http://php.net/manual/en/function.mt-rand.php)

Under caution:

The distribution of mt_rand() return values is biased towards even numbers on
64-bit builds of PHP when max is beyond 2^32. This is because if max is
greater than the value returned by mt_getrandmax(), the output of the random
number generator must be scaled up.

edit: this post went from 5 points to 1, which I don't care about (in ~500 days
I posted less than 10 times and I have ~35 points), but who downvotes
documentation, seriously? -_-

~~~
_asummers
While documented, that is surprising behavior. If it takes in an int,
shouldn't it be able to take in PHP_INT_MAX? And shouldn't it yell at you
instead of just silently going about its day?

~~~
mikeash
The PHP approach seems to be that any crazy behavior is acceptable as long as
it's documented.

~~~
sarciszewski
That's not a "crazy behavior", it's a limitation of the algorithm they use.
mt_rand() only has approx. 2^32 possible seeds; why would you expect it to
support ranges larger than 2^32?

The best thing to do is to not use mt_rand().

~~~
pilif
"Fixing" broken data silently is tremendously bad behaviour in general because
you can't possibly know whether the caller knew that they were providing
invalid data and whether the caller is going to be happy with your fix.

If you expect input parameters in some range and you get values out of that
range, then you blow up. You don't silently truncate anything and you
certainly don't reduce the entropy of your random number generator by nearly
50%.
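
As a rough illustration of the "blow up instead of silently fixing" approach, here is a minimal Python sketch. The names `checked_rand` and `GENERATOR_MAX` are hypothetical, and `random.randrange` merely stands in for a generator with a 31-bit output; this is not how PHP implements anything.

```python
import random

# Assumed width of the underlying generator (2^31 - 1, as reported
# for mt_getrandmax() elsewhere in this thread).
GENERATOR_MAX = 2**31 - 1


def checked_rand(lo, hi):
    """Return an integer in [lo, hi], refusing ranges the generator cannot cover."""
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    if hi - lo > GENERATOR_MAX:
        # Fail loudly rather than scaling a narrow draw onto a wider range.
        raise ValueError("requested range is wider than the generator's output")
    # random.randrange is only a stand-in for the real generator here.
    return lo + random.randrange(hi - lo + 1)
```

With this shape, `checked_rand(1, 6)` returns a value from 1 to 6, while `checked_rand(1, 2**63 - 1)` raises `ValueError` instead of returning degraded output.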

~~~
FranOntanaya
This was my main doubt about PHP 7's scalar types, which will autocast values
into the desired types. I don't expect the caller to know better than the
callee what the method's boundaries are, and once cast you may not be able
to see if the input was garbage. But yeah, when something breaks the method's
author can wash their hands.

------
_yy
Quoting a Reddit comment
([https://www.reddit.com/r/lolphp/comments/3eaw98/mt_rand1_php...](https://www.reddit.com/r/lolphp/comments/3eaw98/mt_rand1_php_int_max_only_generates_odd_numbers/)):

_The problem is way worse than you think. Check out what this looks like when
printed in hexadecimal: [http://3v4l.org/XVTgS](http://3v4l.org/XVTgS)_

_Basically, what is going on is that PHP_INT_MAX is 2^63 - 1. mt_getrandmax()
is 2^31 - 1. The way mt_rand() makes a random number when the limit is too
large is that it makes a random number in the range [0,2^(31)), then it scales
it to be a number in the range [0,MAX-MIN), and finally adds MIN._

_So in your case, it scales everything by 2^32 and adds 1. Which is why the
numbers are *extremely non-random*. [See my other comment in this thread for
a more detailed explanation and some more test scripts that prove this is what
is happening.](https://www.reddit.com/r/lolphp/comments/3eaw98/mt_rand1_php_int_max_only_generates_odd_numbers/ctdhxha)_
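
The scaling described in the quoted comment can be simulated outside PHP. The Python sketch below is an approximation under stated assumptions: `scaled_rand` is a hypothetical name, `random.getrandbits(31)` stands in for the raw 31-bit draw, and the arithmetic follows the comment's description (scale in floating point, then add MIN) rather than PHP's actual source.

```python
import random

MT_RAND_MAX = 2**31 - 1   # mt_getrandmax() as given in the quoted comment
PHP_INT_MAX = 2**63 - 1   # PHP_INT_MAX on a 64-bit build


def scaled_rand(n, lo, hi, tmax=MT_RAND_MAX):
    """Scale a raw draw n in [0, tmax] onto [lo, hi] using double arithmetic."""
    # For lo=1, hi=PHP_INT_MAX this span rounds to exactly 2**63 as a double.
    span = float(hi) - lo + 1.0
    return lo + int(span * (n / (tmax + 1.0)))


raw = [random.getrandbits(31) for _ in range(1000)]
outputs = [scaled_rand(n, 1, PHP_INT_MAX) for n in raw]

# Every output is n * 2**32 + 1, so it is odd and its low 32 bits are always 1.
all_odd = all(v % 2 == 1 for v in outputs)
low_words = {v & 0xFFFFFFFF for v in outputs}

print(all_odd, low_words)
print([hex(v) for v in outputs[:3]])
```

Under these assumptions only the upper 31 bits of each result carry any randomness, which matches the hexadecimal pattern the comment points to.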

------
sarciszewski
And that's another reason why it's a good thing that PHP 7 (coming out soon!)
has a new function called random_int() which provides an unbiased distribution
of integers powered by a CSPRNG (yes, it uses urandom, in case anyone asks).

My employer is leading the effort to expose a compatible interface in PHP 5
applications so developers can add one line to their composer.json file and
start writing code for PHP 7. It's MIT licensed and should be nearing its 1.0.0
release soon.

[https://github.com/paragonie/random_compat](https://github.com/paragonie/random_compat)

~~~
rurban
Using /dev/urandom is not a good thing. It's only needed to get secure random
numbers (CSPRNGs) very slowly. You'll drain it so that the apps which really
need it will be out of seeds. To get random numbers fast, you need to use the
good bits of a PRG based on an LCG.

Everybody should know by now that the mersenne twister is not only bad, but
also slow.

Everybody should know by now that the first bits are good, and the last bits
horrible. That's why you should not use modulo %, but rshift or better use
Melissa E. O'Neill's PCG, which uses the first good bits to improve the latter
worse bits. [http://www.pcg-random.org/](http://www.pcg-random.org/)

~~~
tptacek
There is in practical terms no such thing as "draining a secure random number
generator".

Most RNGs are at bottom stream ciphers. The stream cipher problem of
stretching a small key into a very large keystream is fundamental to
cryptography, and RNGs are an instance of that problem in the most favorable
setting: no message boundaries and no coordinated state.

You don't in practice worry about AES-CTR "running out of key", and so you
shouldn't worry about urandom "running out of entropy".
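
To make the "stretching a small key into a very large keystream" idea concrete, here is a toy Python sketch that hashes a key together with a counter. It is an illustration only: `keystream` is a hypothetical helper, this is not how urandom or AES-CTR is implemented, and it should not be used for real cryptography.

```python
import hashlib


def keystream(key: bytes, n_bytes: int) -> bytes:
    """Expand a short key into n_bytes of output by hashing key || counter."""
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        # Each counter value yields a fresh 32-byte block from the same key.
        block = hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:n_bytes])


key = bytes(32)                        # placeholder key, all zero bytes
stream = keystream(key, 1_000_000)     # 32 bytes of key, a megabyte of output
print(len(stream))
```

The key is never "used up": asking for more output just advances the counter, which is the sense in which a keyed generator does not run dry.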

It's understandable why so many people believe this: the Linux urandom man
page invents a whole parallel universe in which this is not only a live issue,
but one with applications to specific kinds of cryptography! Until we find the
appropriate incantations to shut down that particular portal to the world of
the Elder Things, it's best just not to look at it.

~~~
cpach
Still no updates on this bug report:
[https://bugzilla.kernel.org/show_bug.cgi?id=71211](https://bugzilla.kernel.org/show_bug.cgi?id=71211)

[tumbleweeds]

------
beefhash
Using a loop of that length (for ($i=0;$i<10000;$i++)) on a site allowing
people to execute code, and then linking it to HN effectively amounts to a
do-it-yourself DDoS. I don't think I wanna be the host of that site right now.

~~~
Buge
Shouldn't they cache the result?

~~~
dimino
A lesson they're learning right now, I suppose.

~~~
mojuba
To find out later that not everything can be cached, e.g. time(), rand(),
file_get_contents()... to find out further that no sys or lib call that
contains I/O can be cached. And then damn it, why bother caching at all?

------
dantillberg
I believe it's also generally an anti-pattern to do things like "num = rand()
% max" or "num = rand() & bit_mask" where rand() returns an integer from a
pseudo-random number generator, right?

PHP may not do a very good job at ensuring an even distribution throughout the
space of possible integers, but for PRNGs in general (especially the quick &
dirty ones), the _worst_ place to grab bits from is the least-significant
bits.

(my source is that I hung out with a copy of Numerical Recipes in college;
Numerical Recipes has a nice chapter for learning about PRNGs, along with
example code for a number of implementations)
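
The weakness of the low bits is easy to see in a linear congruential generator with a power-of-two modulus. The Python sketch below is an illustration under assumptions: `lcg_stream` is a hypothetical helper, and the multiplier and increment are the values commonly attributed to the Numerical Recipes "quick and dirty" generator. It says nothing about how PHP's generators behave.

```python
A = 1664525        # multiplier (odd)
C = 1013904223     # increment (odd)
M = 2**32          # power-of-two modulus


def lcg_stream(seed, count):
    """Return `count` successive 32-bit states of the LCG."""
    state = seed % M
    values = []
    for _ in range(count):
        state = (A * state + C) % M
        values.append(state)
    return values


values = lcg_stream(12345, 16)
low_bits = [v & 1 for v in values]     # least-significant bit of each state
high_bits = [v >> 31 for v in values]  # most-significant bit of each state

print(low_bits)
print(high_bits)
```

Because both constants are odd and the modulus is even, the lowest bit simply flips on every step, so `rand() & 1` from such a generator alternates regardless of the seed.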

~~~
sarciszewski
Yes, don't use % even with a CSPRNG!

[https://stackoverflow.com/a/31374501/2224584](https://stackoverflow.com/a/31374501/2224584)
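
For a sense of where the bias from `%` comes from, and one common way around it, here is a small Python sketch. The helper names are hypothetical, the 8-bit source width is chosen only to keep the numbers readable, and rejection sampling is shown as a widely used remedy rather than as the method any particular library uses.

```python
import secrets
from collections import Counter


def biased_counts(source_size, n):
    """Count how often each residue appears when every source value is taken mod n."""
    return Counter(x % n for x in range(source_size))


def uniform_below(n, bits=8):
    """Draw uniformly from [0, n) by rejecting draws from the uneven tail."""
    if not 0 < n <= 2**bits:
        raise ValueError("n must be in (0, 2**bits]")
    limit = (2**bits // n) * n   # largest multiple of n that fits in the source range
    while True:
        x = secrets.randbits(bits)
        if x < limit:            # keep only draws below the largest multiple of n
            return x % n


counts = biased_counts(256, 100)
print(counts[0], counts[99])   # residues 0..55 occur 3 times, 56..99 only twice
print(uniform_below(100))
```

With 256 source values and `n = 100`, the residues 0 through 55 are each 50% more likely than 56 through 99; discarding draws of 200 and above removes that skew at the cost of an occasional retry.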

~~~
tptacek
It's true that's an anti-pattern, but it's not a particularly exploitable one
in most of the places that really need CSPRNGs.

~~~
paragon_init
Agreed. It's a red flag for "expect more exploitable issues to be found around
the corner" and can result in biased distributions, but it by itself does not
break an RNG.

------
MrDosu
This is why you read the documentation. Don't deduce what a method does purely
on its name and your common sense, you are probably wrong...

~~~
TelmoMenezes
Yes. In fact I propose that giving methods pronounceable names is an
anti-pattern. Method names should be random strings of characters. Then you don't
fall into the temptation of not reading the documentation and assuming things.
In fact, make them long enough so that you can't memorize them.

someObj.dbrdCfj34uW31U289u(x)

This avoids a lot of frustration. Imagine if this was called someObj.sqrt(x).
People could jump to the conclusion that this takes the square root of a
value, when in reality it only does so on weekdays. On weekends it returns the
CPU temperature.

~~~
TOGoS
I suspect you are being facetious, but I've for a long time thought that this
was actually an okay idea, even for systems less screwed up than PHP.

When I need to write a function with lots of complex interactions, I mash the
keyboard rather than try to come up with a name that's misleadingly simple.
That way you have to read the documentation.

------
deckiedan
It's because there simply _are_ more odd numbers than even numbers...

(j/k!)

~~~
raverbashing
I think if you consider Z rather than N the opposite is true

Proof:

For every number k there exists -k which has the same parity as k (that is, if
k is even, -k is even as well)

For every k there is (k + 1) with opposite parity (same thing with -k and -(k
+ 1))

Now, if we count the numbers, from 1 to infinity, the number of even and odd
numbers is the same

from -1 to -infinity the number of even and odd numbers is the same as well

And you get an extra zero, an even number

~~~
dragonwriter
Intuitions based on assuming that you can treat "infinity" the way you would a
finite number are often, as in this case, wrong.

The "infinity" referenced here, aleph-null, is provably both the cardinality
of the set of even integers, and the set of odd integers. (And also the set of
integers, and any other countably infinite set.)

So, unintuitive as it might be when thinking about finite subsets of the
integers, there are not only as many positive integers as negative integers,
but as many of either as there are _integers_.

(There are, however, _bigger_ infinite sets; there are, for instance, more
reals than integers.)

See
[https://en.wikipedia.org/wiki/Aleph_number](https://en.wikipedia.org/wiki/Aleph_number)

~~~
mikeash
Some fun examples about infinity:

There are as many prime numbers as there are integers.

There are as many fractions between 10 and 11 as there are integers.

There are _more_ real numbers between 41.00001 and 41.0002 than there are
integers from -infinity to infinity. Even though both are "infinite."

------
InfiniteEntropy
I see your problem, PHP.

