In this case, the question is:
> If a number generator is uniformly distributed, why might that not be 'random enough'?
If we rephrase this as:
> If a number generator is uniformly distributed, why might that be 'too predictable'?
This makes the flaw obvious: there are many ways to choose numbers uniformly which are completely predictable. For example, choose the smallest value first, then choose a value which is furthest from any previously-chosen value (in, say, lexicographic gray-code order).
The point here is that 'uniformity' is not the property we care about; it is a consequence of the property we care about (unpredictability). If a distribution were non-uniform, we would be better able to predict it (by biasing our predictions to match the distribution), hence the least-predictable distribution is necessarily a uniform one.
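A concrete illustration (my own, not from the post): the base-2 van der Corput sequence is provably equidistributed on [0, 1), yet every term is a deterministic function of its index — uniform, and perfectly predictable.

```python
def van_der_corput(i, base=2):
    """i-th term of the van der Corput sequence: reflect the digits of i
    around the radix point. Equidistributed, yet fully predictable."""
    x, denom = 0.0, 1.0
    while i:
        i, digit = divmod(i, base)
        denom *= base
        x += digit / denom
    return x

seq = [van_der_corput(i) for i in range(8)]
# seq == [0.0, 0.5, 0.25, 0.75, 0.125, 0.625, 0.375, 0.875]
```

Each new term lands in the largest remaining gap, which is exactly the "furthest from any previously-chosen value" strategy described above.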
Other times this comes up:
> Are quantum measurements 'truly random'?
That's untestable, but we can test whether they're predictable or not.
> This encryption algorithm requires a source of random bits.
It would work just as well with a source of bits that are merely unpredictable.
> This strategy can only be defeated by a random opponent.
It can also be defeated by an opponent which you can't predict.
And so on.
Simply put: given the first k bits of a random stream, can you predict the (k+1)th bit (more than 50% of the time)?
A generator that passes this test will pass any statistical randomness test, but the converse is not necessarily true. For example, Mersenne Twister is a good generator from a statistical perspective, but it's actually quite easy to recover its internal state by observing a small amount of output. (Around 20k bits, if I recall correctly.)
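For Python's `random` module (which uses MT19937) this is a standard exercise: invert the output tempering on 624 consecutive 32-bit outputs (624 × 32 = 19,968 bits, the "20k" above) and load the result back as generator state. A sketch:

```python
import random

def untemper(y):
    """Invert MT19937's output tempering, recovering one raw state word
    from one 32-bit output."""
    y ^= y >> 18
    y ^= (y << 15) & 0xEFC60000
    x = y
    for _ in range(5):                 # invert y ^= (y << 7) & 0x9D2C5680
        x = y ^ ((x << 7) & 0x9D2C5680)
    y = x & 0xFFFFFFFF
    x = y
    for _ in range(3):                 # invert y ^= y >> 11
        x = y ^ (x >> 11)
    return x & 0xFFFFFFFF

# Observe 624 consecutive outputs from the victim generator...
rng = random.Random(1234)
outputs = [rng.getrandbits(32) for _ in range(624)]

# ...untemper them into the raw state, and load it into a clone.
state = [untemper(v) for v in outputs]
clone = random.Random()
clone.setstate((3, tuple(state + [624]), None))

predicted = [clone.getrandbits(32) for _ in range(10)]
actual = [rng.getrandbits(32) for _ in range(10)]
# predicted == actual: the stream is statistically fine but fully predictable
```

Statistically the output is excellent; predictively it fails completely, which is exactly the distinction the comment is drawing.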
This definition isn't sufficient. Suppose you have a random stream, and one predictor which asserts the next bit is "1" and another predictor which says that next bit is "0".
As k increases, there's a nearly 100% chance that one of the two predictors will be correct more than 50% of the time.
Even if you pick a single predictor, say the all-1s predictor, there's an almost 50% chance that, for a given random stream and k, it will have better than 50% predictive ability.
Just because Guildenstern's coin is heads 92 times in a row doesn't mean that it's not random. Only that it's very unlikely to be random.
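That near-50% figure is easy to check with a quick simulation (illustrative, using an odd stream length so ties can't occur):

```python
import random

def all_ones_hit_rate(k=101, trials=10000):
    """Fraction of random k-bit streams on which the constant all-1s
    predictor is right more than 50% of the time."""
    wins = 0
    for _ in range(trials):
        ones = sum(random.getrandbits(1) for _ in range(k))
        if ones > k // 2:          # predictor beat 50% on this stream
            wins += 1
    return wins / trials
```

By symmetry of the binomial distribution the true value is exactly 0.5 for odd k, so the simulation hovers around it.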
Speaking of word replacements, "random" does not mean "uniformly distributed". An unfair coin toss is still random.
Oftentimes it's also just not what is actually required: frequently you want uniform spacing with some jitter, not points drawn from a uniform distribution.
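To illustrate the difference (parameters are my own, illustrative choices): a jittered grid keeps every point inside its own cell, "evenly spread", while i.i.d. uniform draws clump and leave voids.

```python
import random

def jittered(n, jitter=0.3):
    """n points on [0, 1): evenly spaced cell centers, each nudged by a
    little noise. Points stay evenly spread, one per cell."""
    step = 1.0 / n
    return [(i + 0.5) * step + random.uniform(-jitter, jitter) * step
            for i in range(n)]

def iid_uniform(n):
    """n points drawn independently from a uniform distribution; these
    clump far more than the jittered grid does."""
    return sorted(random.random() for _ in range(n))
```

Jittered grids are what you typically want for sampling patterns (e.g. stratified sampling in graphics); true uniform draws are what you want when modeling genuinely independent events.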
> put on hold as primarily opinion-based by Xavierjazz, Olli, Excellll, HackToHell, Tog 3 hours ago
>
> Many good questions generate some degree of opinion based on expert experience, but answers to this question will tend to be almost entirely based on opinions, rather than facts, references, or specific expertise.
>
> If this question can be reworded to fit the rules in the help center, please edit the question.
The question mixes up uniform distribution and randomness, but the answers are factual. There's little to no opinion involved here.
The problem with Stack Exchange is that it conflates heavy site usage with domain experience and maturity.
Avidity for imaginary internet points and maturity are at best orthogonal.
I don't really have any hard data on this; it's an opinion based on regularly seeing extremely good answers from high-rep people, and not being able to recall poor answers from high-rep users being voted highly when they aren't actually useful.
Still, I find the whole stack exchange system to be incredibly useful, including the original post.
(Edit: And it looks like you do, too, since you point out the information is factual.)
It's probably not fixable psychologically, we're just seeing patterns everywhere. This also happens with physical dice by the way. If a player rolled two or more abysmal results in a row it's common to say "hey, you should really swap out these dice".
For example, if 20 is rolled then the probability of 20 being rolled again is halved (so it has a 1/40 chance) while the probability of its inverse is doubled (so probability of 1 is 1/10). This dramatically reduces the chance of the three 20s in a row scenario and would very quickly regress to the mean, just like people expect. Added flair for affecting the probabilities of all possible values subtly (so that after rolling a 20, a 2 becomes more likely and a 19 less likely).
I'm sure I can't be the only person to think of this.
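A sketch of such a d20 (the falloff function and constants are my own guesses, not the commenter's): after each roll, the rolled face's weight is halved, its mirror (21 minus the roll) doubled, with a smooth blend for the faces in between.

```python
import random

def make_adaptive_d20():
    """An adaptive d20 per the scheme above (constants are illustrative):
    after rolling `face`, that face's weight is halved, its mirror
    (21 - face) is doubled, and faces in between are scaled smoothly."""
    weights = [1.0] * 20

    def roll():
        face = random.choices(range(1, 21), weights=weights)[0]
        s = (face - 10.5) / 9.5          # roll's position: -1 (low) .. +1 (high)
        for i in range(20):
            p = (i + 1 - 10.5) / 9.5     # this face's position on the same axis
            weights[i] *= 2 ** (-s * p)  # 1/2 at the rolled face, 2 at its mirror
        total = sum(weights)
        weights[:] = [w * 20 / total for w in weights]  # keep weights bounded
        return face

    return roll

roll = make_adaptive_d20()
rolls = [roll() for _ in range(10000)]
```

The halve-at-the-face, double-at-the-mirror behavior matches the comment exactly; the smooth blend in between is one arbitrary choice among many.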
Say the player loses a 75% roll, then attempts another identical 75% roll. The game will actually give you better odds for the second roll. I think it's silly -- the game basically lies to you about your odds on the second attempt.
Sid Meier talked about it at length in this video: http://www.youtube.com/watch?v=bY7aRJE-oOY. It starts around 18:25.
> Seeing your five hit attack miss five times at 5% miss
> chance per attack will induce true frustration.
It is. iTunes would play a song, and randomly select the next one (it would do random play, rather than shuffle).
Nowadays it generates a shuffled playlist and goes along, and people whine because it doesn't reshuffle automatically (you have to disable and enable shuffle to reshuffle).
Another option is to use random sampling, and to re-sample after you're done with the current set (possibly excluding any song from the previous sample).
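A minimal sketch of the shuffled-playlist approach: shuffle the whole set, play it through, re-shuffle, and make sure the new batch doesn't open with the song that just finished.

```python
import random

def shuffle_cycles(songs):
    """Yield songs in shuffled batches: no repeats within a batch, and a
    new batch never starts with the song that just played."""
    last = None
    while True:
        batch = songs[:]
        random.shuffle(batch)
        if batch[0] == last and len(batch) > 1:
            # avoid the same song twice in a row across the batch boundary
            batch[0], batch[-1] = batch[-1], batch[0]
        for song in batch:
            yield song
            last = song
```

This is the "shuffle" behavior people expect, as opposed to "random play" where any song (including the one just heard) can come up next.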
I know there exist "dice" packs of cards, which contain the correct distribution of rolls. If you shuffle a few of those together, you'll probably get a nicer game.
For me, it makes the game feel less "solvable" by basic analysis. Sure, statistically certain observations will hold. For the game you are in, though, you have to be ready to adjust your strategy based on what has happened. (Clearly not in anticipation of future rolls, but more based on the resources you have managed to get, not the ones you wanted.)
Of course, I haven't played at all in a long while.
The solution we settled on was block randomization. But that's probably not what you'd want for a human game as they could potentially figure out the end of each block (and what value remains). But it worked wonderfully for us.
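A minimal sketch of block randomization, assuming a die-roll use case like the ones above: each block is one shuffled copy of the full outcome set, so overall counts come out exactly even while order within a block stays random.

```python
import random

def block_randomized(values, blocks):
    """Concatenate `blocks` shuffled copies of `values`: exact overall
    counts per value, random order within each block."""
    out = []
    for _ in range(blocks):
        block = list(values)
        random.shuffle(block)
        out.extend(block)
    return out
```

The weakness mentioned above is visible here: near the end of a block, an observer who has been counting knows exactly which values remain.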
It's why we see stars in the sky, which are more or less distributed randomly in the star field, yet they don't appear random: they appear to form constellations.
It's basically impossible to correct this bad intuition. People educated in the statistics and mathematics of random numbers have to consciously learn what real randomness looks like and override their intuition.
But try as you might, the night sky will still form constellations. And dice rolls will still, frustratingly and confusingly, fall into streaks more often than you'd expect.
Split your class into two. Give half A dice and a piece of paper; give half B just a piece of paper. Half B plays with imaginary dice, half A with real dice. Everyone records 20 games on a piece of paper. You hand in the papers and the demonstrator (or maybe a new one who hasn't been in the class, for extra theatrics) looks at the papers. He will accurately place them into separate piles.
The trick is that fake randomness and real randomness are different enough that they are distinguishable with a high level of accuracy. Real dice will roll the same number in a row far more often than fake dice. The more games you play, the more accurately they can be distinguished.
You can look at the formulas to figure out the odds of rolling the same number 4 times in 100 rolls, but it doesn't drive home the message that your intuition is broken until you see intuition being beaten consistently and reliably by a better tool. It's fun too.
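The demonstrator's trick can be sketched with a longest-run statistic (the threshold here is illustrative): people faking d6 rolls rarely write the same number three or more times in a row, while a real sequence of 100 rolls usually contains such a streak.

```python
def longest_run(rolls):
    """Length of the longest streak of identical consecutive values."""
    best = cur = 1
    for a, b in zip(rolls, rolls[1:]):
        cur = cur + 1 if a == b else 1
        best = max(best, cur)
    return best

def looks_real(rolls, threshold=3):
    """Crude classifier: real dice tend to produce longer streaks than
    humans inventing 'random-looking' numbers."""
    return longest_run(rolls) >= threshold
```

With 100 rolls of a fair d6 there are 98 chances at a triple, each with probability 1/36, so a real sheet very likely contains one; fake sheets usually don't.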
Your players are thinking: "Dice rolls are equiprobable, so I should roll evenly-distributed results." That only starts being true after a great number of rolls.
Random distributions will be uneven and present clumps, strands and voids. For the same reason, during WWII, British intelligence thought that the Germans were bombing specific targets in London, while they were just dropping bombs at random.
Source recently posted on HN: http://www.wired.com/wiredscience/2012/12/what-does-randomne...
My point was that it's not possible to convince certain people with logic. They see random numbers and something happens to their brains, maybe it's religion. They'll always see oracles or at the very least shady manipulators behind random data.
Yet...we still swapped the dice. :)
I wonder if visualizing the dice roll could help. If you don't have ethical objections to it you could also A/B test biased dice (say, biased away from 1 and 20) for complaints (give a different contact email address to players who get biased dice) or instead show the players the distribution of the last 100 rolls.
First, the roller is absolutely anonymous. There are basically three major functions the site performs: just rolling some dice according to user input, an API taking HTTP requests answering in JSON, and a chat room feature that has a dice roller bot. It's not possible to identify individual players except maybe in the chat rooms (though they're not really authenticated there either).
I do sometimes get requests to look into chat logs of a particular room over allegations that one player somehow tampered with the results. This happens about once a month. Invariably, the accused user isn't even that lucky, they almost always have bog standard average results. It's just that the other players think this person is unreasonably lucky and they keep this belief up even after I tell them about the data (which they could have calculated themselves by looking at their own chat logs).
Second, the core value proposition of the site is its ability to render roleplaying dice codes directly. That means you don't select your dice from a drop-down - you enter the code. A simple example would be "3d6+5" for damage or something, but there are lots of different codes for different rules - some of them quite convoluted. It's not feasible to tinker with them in any meaningful way because players use the site in so many different ways.
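Rendering a dice code means parsing and rolling it; a toy parser for just the basic "NdS+M" form (real rollers handle far more) might look like:

```python
import random
import re

def roll_code(code):
    """Parse and roll a simple 'NdS+M' dice code such as '3d6+5'.
    Only this basic form is handled; real sites support many more."""
    m = re.fullmatch(r"(\d+)d(\d+)([+-]\d+)?", code.strip())
    if not m:
        raise ValueError(f"unsupported dice code: {code}")
    n, sides = int(m.group(1)), int(m.group(2))
    mod = int(m.group(3) or 0)
    return sum(random.randint(1, sides) for _ in range(n)) + mod
```

"3d6+5" then rolls three six-sided dice and adds five, giving a result between 8 and 23.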
This is similar to that question about biased coins and throwing 10 heads in sequence, which was discussed here some time ago.
Anyone using the same PRNG can look at the output of yours and try to put their PRNG in the same state. If they succeed, the output of theirs will match yours - now and in the future - unless you re-seed.
Two problems can occur here:
1) you seed with something they can predict. e.g. seconds since 1970 (or microseconds since 1970). If they have a reasonable sample of numbers from your system, they can try lots of different seeds and see if they can find the one which gives the same output as you.
2) PRNGs have "internal state", which is a bunch of numbers they mix together. Some PRNGs have the property that if you can observe enough numbers in a row from the PRNG, you can turn them back into the internal state locally, and then you can do the same thing as if you knew the seed (predict future numbers).
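Problem 1 can be sketched concretely (a hypothetical game that draws d100 rolls from a generator seeded with whole seconds since the epoch): just replay every plausible seed until the outputs line up.

```python
import random
import time

def crack_time_seed(observed, window=3600):
    """Try every 'seconds since epoch' value from the last hour as a
    seed; return the one whose first outputs match what we observed."""
    now = int(time.time())
    for seed in range(now - window, now + 1):
        candidate = random.Random(seed)
        if [candidate.randint(1, 100) for _ in range(len(observed))] == observed:
            return seed
    return None
```

Once the seed is found, every past and future "random" number from that generator is known.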
One of my favorite examples of this ever: Once in Las Vegas a keno machine mistakenly used a fixed seed. Meaning once someone figured this out, he could show up when the game started for the day and predict what the machine would do with 100% accuracy.
They found a flaw in a REAL poker site because they were using a pseudo random number generator (and a stupid algorithm) and were able to know the order of the cards being used in the game!
>The RST exploit itself requires five cards from the deck to be known. Based on the five known cards, our program searches through the few hundred thousand possible shuffles and deduces which one is a perfect match. In the case of Texas Hold'em poker, this means our program takes as input the two cards that the cheating player is dealt, plus the first three community cards that are dealt face up (the flop). These five cards are known after the first of four rounds of betting and are enough for us to determine (in real time, during play) the exact shuffle. Figure 5 shows the GUI we slapped on our exploit. The "Site Parameters" box in the upper left is used to synchronize the clocks. The "Game Parameters" box in the upper right is used to enter the five cards and initiate the search. Figure 5 is a screen shot taken after all cards have been determined by our program. We know who holds what cards, what the rest of the flop looks like, and who is going to win in advance.
Bad example. Because the author specified "some dice", we can assume more than one die, in which case some totals have a greater chance to appear than others in a series of fair "random" throws.
It's a bad sign that the author of a piece about randomness isn't aware of the systematic behavior of his chosen example, a behavior biased in favor of certain outcomes.
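The classic case is the sum of two fair dice, which is anything but uniform:

```python
from collections import Counter
from itertools import product

# Tally the 36 equally likely (a, b) outcomes of two fair dice by sum:
sums = Counter(a + b for a, b in product(range(1, 7), repeat=2))
# 7 can be made six ways (1+6, 2+5, ..., 6+1); 2 and 12 only one way each.
```

So a 7 is six times as likely as a 2 or a 12, even though each individual die is perfectly uniform.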
Not at all. The fact that one cannot predict the next number in a random sequence is irrelevant to the fact that, in the long term, that number has a predictable relationship with other numbers in the pool of possibilities.
For a random generator of the digits 0-9, can I predict the next number in the sequence? No. Can I say what the proportion of, say, 7s will be, within a large list of outcomes? Yes, and the larger the list, the more reliable the prediction.
If your position had merit, quantum theory, probabilistic on a small scale, would be seen as violating cause/effect relationships, a rather important part of physical theory. But, because of the mathematics of quantum probability, individually unpredictable atoms become very predictable macroscopic objects.
If you can somehow control, predict or influence the seed you can predict or influence rolls.
For example if the game uses JUST the system time as a seed, I can shift the system time back to a specific position and run the application again to get the same random number generation.
If you know the algorithm, for example you're playing an open source card game, and you can determine the time (offset) on the server or the other players computer you could cheat by calculating their random variables.
Now it really isn't an issue in games, but it's incredibly important in security. Randomness makes up a huge component of encryption.
That's why you have applications that require you to move the mouse around a bit (hopefully randomly); they also take in a whole bunch of other entropy fed in by the system.
In Unix-like OS's you have /dev/random and /dev/urandom. /dev/random requires a certain amount of entropy and environmental noise, and it blocks on reads until it's satisfied with the output.
/dev/urandom does not block, but it gives pseudo random output. For the purposes of security, /dev/urandom should NEVER be used. However neither are truly 'random'.
Before you listen to anyone "explaining" to you that /dev/random is good for cryptographic secrets and /dev/urandom is good for everything else, consider that Adam Langley owns Golang's crypto code and Golang's crypto/random interface just pulls from /dev/urandom; Daniel Bernstein, Tanja Lange, and Peter Schwabe --- all cryptographers, with a pretty serious Unix pedigree --- wrote NaCl, an extraordinarily well-regarded crypto library, and NaCl's CSPRNG... wait for it... just pulls from /dev/urandom.
I've had to painstakingly explain to certain people why, as an example, erasing a HDD with /dev/urandom is all right, and why their program that simulates some random input should use /dev/urandom.
But no, they babble about true randomness and then complain when they get a paltry few hundred bytes per second.
Even if you have a server running casino games, just use /dev/urandom. It requires a total compromise of the server to get the internal state out of that, and in that case it's easier to just replace urandom with /dev/zero.
What does this mean in practice? If I give you ten 10MB files, five of them created with urandom and five of them created using a hardware RNG, there is practically no chance that you could tell which came from which, barring knowledge of the urandom entropy pool.
But it is true that a HW RNG could be useful just to keep CPU load down. By now we suspect that those are backdoored by the NSA, so you should still use /dev/random as the source for private keys. So a HW RNG is useful just to keep the load down, not to avoid any attacks.
This is so important that I'll have to repeat it again: the only difference between urandom and random is that urandom is theoretically susceptible to an attack which could allow prediction of the output values if the attacker knows the internal entropy pool state. The statistical properties of both are the same.
Many new Intel chips do (though off the top of my head I can't tell you how easy to access it is).
The SoC used in Raspberry Pi units has one that can be read at over half a Mbit per second via /dev/hwrng (once the relevant module is loaded), so you can either use it directly or (better for portability) keep reading from /dev/urandom and use rngtools to feed its entropy into the kernel's pool as needed.
In both the above cases there is a trust issue, as exactly how the RNGs work is not publicly documented, but unless you are extremely paranoid (by necessity or "issues"!) I would consider them decent sources of entropy.
You should always use /dev/urandom on Linux for cryptographic use, unless you know exactly why you need /dev/random. Hint: you don't.
(On FreeBSD it doesn't matter because it's all the same)
I've been meaning to write up a coherent summary of that issue with all kinds of sources forever now, and I should really get around to just doing it, because this comes up every few weeks on HN.
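In practice "just use urandom" means using whatever interface your platform gives you to the kernel CSPRNG; in Python, for example (one way among several):

```python
import os
import secrets

key = os.urandom(32)           # 32 bytes straight from the kernel CSPRNG
token = secrets.token_hex(16)  # same underlying source, convenience API
```

Both calls are non-blocking and suitable for keys, tokens, and nonces, which is exactly the point of the comments above.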
/dev/urandom, on the other hand, is a CSPRNG that is regularly re-seeded from /dev/random. The purpose is to stretch the available entropy into more (pseudo-)random numbers. Since it's a CSPRNG the sequence is deterministic but impractical to predict; re-seeding helps prevent others from observing enough output to reverse-engineer the seed.
So the advice is generally to only use /dev/random when you're implementing something like /dev/urandom, and for SSL keys and the like you just use /dev/urandom.
/dev/random and /dev/urandom are two interfaces to essentially the same CSPRNG.
On Linux, unlike FreeBSD (where there is no difference between the two), there are two differences between random and urandom:
(1) random will block when a kernel entropy estimator indicates that too many bytes have been drawn from the CSPRNG and not enough entropy bytes have been added to it
(2) urandom and random pull from separate output pools (both of which receive the same entropy input, and both of which are managed the same way).
It is not true that /dev/urandom is a PRNG and /dev/random is not; it is not true that /dev/random is a CSPRNG and /dev/urandom is just a PRNG. They are the same thing, with an extremely silly policy difference separating them.