
How Fair Is My D20? (2015) - jandeboevrie
http://www.markfickett.com/stuff/artPage.php?id=389
======
eterm
As an approximation, we can treat the count for each face as binomial and
look at where the cut-off for rejection lies at p=0.05.

If you're looking at a single face,

With 2000 rolls, you would expect a range of 81-120. (0.0405- 0.0600)

At 3000 rolls, that "narrows"* to 127-174 (0.0423 - 0.0580)

At just 100 rolls, anywhere between 1 and 10 occurrences looks fair.
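
A rough sketch of how ranges like these can be reproduced, assuming a normal approximation to the binomial with z = 1.96 for the two-sided p=0.05 cut-off. The function name `fair_face_range` is made up for illustration, and rounding may differ by one from the figures quoted above.

```python
import math

def fair_face_range(rolls, faces=20, z=1.96):
    """Approximate range of counts for one face of a fair die.

    Uses the normal approximation to the binomial: mean n*p,
    standard deviation sqrt(n*p*(1-p)).
    """
    p = 1 / faces
    mean = rolls * p
    sd = math.sqrt(rolls * p * (1 - p))
    return mean - z * sd, mean + z * sd

for n in (100, 2000, 3000, 8300):
    lo, hi = fair_face_range(n)
    # Print both the absolute range and the range as a proportion of n.
    print(f"{n} rolls: {lo:.1f} to {hi:.1f} ({lo / n:.4f} to {hi / n:.4f})")
```

The absolute width grows with the number of rolls while the width as a proportion shrinks.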

However, it would be better to do a proper chi-squared test; this was just
illustrative. So let's do that.

Let's take the Chessex opaque purple, which looks like it has 8300 rolls. By
our approximations above, we expect to see counts between 377 and 454 per face.

In our actual data we appear to have a minimum of 313 with face13, and a
maximum of 531 with face17.

So let's do a Pearson's chi-squared test. Rolling 8300 times we expect 415 for
each face.

For each face, we take the difference between the observed count and the
expected value, square it, divide by the expected value, and then sum over
all faces.

This gives a value of 155.3108. We then have to compare this to the
chi-squared distribution with 19 degrees of freedom. (There are 19 degrees of
freedom because we have 20 faces, so after we have 19 results, the 20th must
be fixed by being 8300 minus the sum of the first 19 faces).

Digging out our statistical tables (you DO have statistical tables right?),
and looking up 19 degrees of freedom, we can see 155 far exceeds even the
critical value at the p=0.01 level.

So we can conclude that the Chessex opaque purple die rolled here is biased.
(Or the die-roller is).
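
The test described above can be sketched as follows. The counts are hypothetical: only the 313 for face 13 and the 531 for face 17 come from this comment, and the other 18 faces are filled in near 415 so the total is 8300, which means the statistic will not match 155.3108. The critical values are the standard table entries for 19 degrees of freedom.

```python
# Standard-table critical values of the chi-squared distribution, 19 d.o.f.
CRITICAL_19_DF = {0.05: 30.144, 0.01: 36.191}

def pearson_chi_squared(observed):
    """Sum of (observed - expected)^2 / expected, assuming a fair die."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

# Hypothetical counts for illustration only (see lead-in).
counts = [414] * 20
for i in range(4):
    counts[i] = 415      # nudge four faces so the total comes to 8300
counts[12] = 313         # face 13, from the comment above
counts[16] = 531         # face 17, from the comment above

statistic = pearson_chi_squared(counts)
print(f"total rolls: {sum(counts)}, chi-squared: {statistic:.2f}")
for p, critical in CRITICAL_19_DF.items():
    verdict = "reject fairness" if statistic > critical else "cannot reject"
    print(f"p={p}: critical value {critical}, {verdict}")
```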

* It narrows in proportion to the total; the absolute margin is wider. This
is something that people often forget when dealing with the law of large
numbers.

~~~
appleiigs
Python3

    import random, statistics

    num_of_sims = 10000
    num_of_rolls = 3000
    results = []

    for s in range(num_of_sims):
        sum_of_rolls = 0

        # roll 20 sided dice
        for r in range(num_of_rolls):
            sum_of_rolls += random.randint(1,20)

        # keep track of the average value
        results.append(sum_of_rolls / num_of_rolls)

    print("ave: ", statistics.mean(results)) # which is 10.49915
    print("stdev: ", statistics.stdev(results)) # which is 0.10492

Now where did I put my statistical tables?

------
lmilcin
I have worked, about 15 years ago, on a contract for a company that amongst
other things produced automated roulette tables.

They had a large hall packed with those that would constantly spin up, drop
the ball and record results. There was a mathematician employed to check the
series for any statistical anomalies. I had a few chats with the guy, and I
was quite impressed by the level of detail and care taken to make sure the
wheels are perfectly symmetrical and fair.

Most of the cost of the machine goes into making the wheel symmetrical. I
remember something like 20 or 30 thousand dollars spent on the single piece of
metal that becomes the wheel.

~~~
aidenn0
It's interesting they put so much work into that because roulette tables have
a ~5% house edge, so even a fairly large bias wouldn't cost the house.

~~~
DuskStar
Until that 5% edge turns out to be a 7% edge (thanks to the bias) and the
relevant regulators come down on you like a ton of bricks for cheating your
players.

~~~
aidenn0
Any fixed predictability can only decrease the house edge. The players make
the bets, not the house. Obviously if you could vary the predictability you
could cheat people by selectively avoiding payouts for large bets, but that's
separate.
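
A small sketch of that argument, assuming an American double-zero wheel with 38 pockets and the usual 35-to-1 payout on a single number (neither is stated in this thread). The 1-in-34 bias is a made-up figure for illustration.

```python
def straight_up_ev(p_win, payout=35):
    """Expected player return per unit staked on a single number."""
    return p_win * payout - (1 - p_win)

# Fair wheel: every pocket has probability 1/38.
fair = straight_up_ev(1 / 38)
# Hypothetical bias: one pocket comes up 1 time in 34.
biased = straight_up_ev(1 / 34)

print(f"fair wheel, any number: {fair:+.4f}")
print(f"biased wheel, favoured pocket: {biased:+.4f}")
```

A player who knows about the favoured pocket bets there; one who doesn't is spread across all pockets, so a fixed bias doesn't help the house.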

------
appleiigs
The mechanical rolling and computer vision is impressive, but wish he had
noted if the results were statistically significant. Just a final touch to the
article. At 100-150 rolls per face, my gut is saying no. I guess I'm going to
have to whip up a couple loops in python after work to satisfy my curiosity.

~~~
jcranmer
The usual gut reaction for statistical significance is that n*p should be 20,
or 20 rolls per face. 100-150 should be statistically significant.

~~~
eterm
You can't assign significance to a sample size alone.

A sample size of 10 can be significant (if it were 10 1's for example), and a
sample size of 100,000 can be not significant, for example if you were to roll
4,953 1's.
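
Putting rough numbers on those two examples, assuming a fair d20 as the null hypothesis and a normal approximation for the large sample (the helper name `z_score` is just for illustration):

```python
import math

def z_score(count, rolls, faces=20):
    """Standardised deviation of one face's count from its fair expectation."""
    p = 1 / faces
    return (count - rolls * p) / math.sqrt(rolls * p * (1 - p))

# Ten 1's in ten rolls: exact probability under a fair d20.
p_ten_ones = (1 / 20) ** 10

# 4,953 1's in 100,000 rolls: expected count is 5,000.
z = z_score(4953, 100_000)

print(f"P(ten 1's in a row) = {p_ten_ones:.3g}")
print(f"z for 4,953 in 100,000 = {z:.2f} (|z| < 1.96 is not significant at p=0.05)")
```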

~~~
dekhn
I believe the OP meant "statistical power".

------
11thEarlOfMar
Makes me wonder the extent to which the rolling surface would affect the
outcome. A harder surface would cause more bouncing. Would more bouncing
increase or decrease the relative fairness?

~~~
delecti
One common test of a die's fairness is floating it in salt-water, which allows
any irregularity in the weight distribution to show itself. From that,
intuitively I would assume that zero resistance (as much bouncing as
possible/needed) would result in the unfairness making itself most apparent.
Meanwhile, zero bouncing would cause the die to stick where it lands, making
any unfairness the result of how it was thrown, and it's easier to cause a lot
of randomness in that regard.

That said, I'm not sure that's generalizable to more realistic amounts of
bouncing ("only a bit of bouncing" as opposed to "none at all", and "kinda a
lot of bouncing" as opposed to "effectively zero resistance").

~~~
grawprog
I've never actually heard of the salt water test. It makes sense thinking
about it. I got a bag of dice from my dad, from his d&d days, from when he was
younger than me. There's a huge mix of some really weird dice in there.
They're fairly worn, some of the numbers aren't visible any more, and their
original quality ranges wildly. I wouldn't mind dropping them all in a bucket
of salt water to see now. I've drawn my own conclusions about them over the
years of using them, it'd be interesting to see how well they match up.

Also, I wonder how my dad's homemade d20 would rank. It's made out of a
double d10 (a d20 with 0-9 twice, 0's used as 10 and 20) with half the numbers
coloured red with a pen (11-20 is red). It was likely just done randomly.

~~~
delecti
If you're going to do the salt water test, only use a glass of water, not a
bucket. It takes quite a bit of salt to make the density of the salt water
higher than most dice.

------
zimpenfish
For the graphs showing "normalised frequencies", I'd drop the "based at zero"
bars and draw them from 1 instead. You'd lose a lot of orange ink that's
distracting and show the deviation much more clearly.

Something like [https://rjp.is/dice-graph.png](https://rjp.is/dice-graph.png)
(best I can do with Preview, sorry!)

------
amiheines
This reminds me of an old post with a dice rolling machine for a gaming site
capable of producing 1.3 million dice rolls per day,
[http://gamesbyemail.com/News/DiceOMatic](http://gamesbyemail.com/News/DiceOMatic)

~~~
badfrog
> Currently, GamesByEmail.com uses some 80,000+ die rolls for play in games
> like Backgammon, Gambit (a RISK clone), W.W.II (an Axis & Allies clone) and
> others. To generate the die rolls, I have used Math.random, Random.org and
> other sources, but have always received numerous complaints that the dice
> are not random enough.

That's a super cool machine, but the justification seems a bit silly. The
randomization library in any programming language should be good enough for
casual games, and random.org uses atmospheric noise which is almost certainly
more random than dice:
[https://www.random.org/history/](https://www.random.org/history/)

~~~
aidenn0
This is to make players happy, not to provide better randomness. Many players
will not trust any random data coming from a computer.

~~~
Baeocystin
As a player, I'd be amused by using radioactivity-generated randomness!

[https://www.fourmilab.ch/hotbits/](https://www.fourmilab.ch/hotbits/)

------
sd_mikey
I would love to see this experiment done with the d20 from The Dice Lab. They
make their d20 dice with "ideally-balanced vertex sums while retaining the
opposite-side numbering convention". They explain why this makes for a bit
fairer dice here.
[https://mathartfun.com/thedicelab.com/BalancedStdPoly.html](https://mathartfun.com/thedicelab.com/BalancedStdPoly.html)

~~~
thaumasiotes
> They explain why this makes for a bit fairer dice here.

Well... they explain why the opposite-side numbering convention makes for
fairer dice. They don't say anything at all about balanced vertex sums other
than "we believe it's important".

Just compare the reasoning:

> If a die is unintentionally oblate (slightly flattened on opposing sides),
> the flatter regions are more likely to turn up when the die is tossed. If
> these two opposite numbers were 19 and 20 for example, then the die would on
> average roll high, since these two numbers would come up too often. Having
> the two numbers add to 21 avoids any such bias in the average number rolled.
> For this reason, the opposite-side numbering convention improves fairness.

vs

> Equally important in our opinion is balancing of the vertex sums.

This does not give me confidence that the vertex sums matter.

~~~
no_identd
The paper describing these dice might help then:

[http://www2.oberlin.edu/math/faculty/bosch/nbd.pdf](http://www2.oberlin.edu/math/faculty/bosch/nbd.pdf)
Bosch, Robert; Fathauer, Robert; Segerman, Henry - Numerically Balanced Dice
(2016)

(btw., here's the about page for The Dice Lab:
[https://www.mathartfun.com/thedicelab.com/DiceDesign.html](https://www.mathartfun.com/thedicelab.com/DiceDesign.html))

More assorted stuff on fair dice:

On fair but irregular polyhedral dice:

[http://statweb.stanford.edu/~cgates/PERSI/papers/fairdice.pd...](http://statweb.stanford.edu/~cgates/PERSI/papers/fairdice.pdf)
Diaconis, Persi; Keller, Joseph B. - Fair Dice (1989)

[https://mathoverflow.net/questions/46684/fair-but-irregular-...](https://mathoverflow.net/questions/46684/fair-but-irregular-polyhedral-dice)

[https://pp.bme.hu/ar/article/download/7607/6570/](https://pp.bme.hu/ar/article/download/7607/6570/)
- Várkonyi, Péter L. - The Secret of Gambling with Irregular Dice: Estimating
the Face Statistics of Polyhedra (2014)

[https://web.archive.org/web/20110925191300/http://blog.eqnet...](https://web.archive.org/web/20110925191300/http://blog.eqnets.com/2009/08/24/dynamical-bias-in-the-dice-roll/)

Note: The link to
[http://maths.dur.ac.uk:80/~dma0cvj/mathphys/supplements/supp...](http://maths.dur.ac.uk:80/~dma0cvj/mathphys/supplements/supplement2/supplement2.html)
in the comments there is broken and the archived version doesn't have the
images, but the PDF copy survives in the archive. (One can find the other PDFs
from that page here, by filtering:
[https://web.archive.org/web/*/http://www.maths.dur.ac.uk/~dm...](https://web.archive.org/web/*/http://www.maths.dur.ac.uk/~dma0cvj/mathphys/*))

"Polyisohedral" dice:

* [http://loki3.com/poly/polyisohedra.html](http://loki3.com/poly/polyisohedra.html)
* [http://loki3.com/poly/fair-dice.html](http://loki3.com/poly/fair-dice.html)

[http://www.mathpuzzle.com/MAA/37-Fair%20Dice/mathgames_05_16...](http://www.mathpuzzle.com/MAA/37-Fair%20Dice/mathgames_05_16_05.html)

[https://web.archive.org/web/20050602020925/http://www.geocit...](https://web.archive.org/web/20050602020925/http://www.geocities.com/dicephysics/)

[https://savevsdragon.blogspot.com/2011/11/brief-history-of-p...](https://savevsdragon.blogspot.com/2011/11/brief-history-of-polyhedral-dice.html)

[https://journals.aps.org/pre/abstract/10.1103/PhysRevE.78.03...](https://journals.aps.org/pre/abstract/10.1103/PhysRevE.78.036207)
How random is dice tossing?

------
Yhippa
Interesting. I went to the most recent Prerelease event for Magic: the
Gathering and my first opponent told me to not use the D20 that came with the
kit and to use his individual D6s. He said they had known biases. Maybe he was
on to something.

~~~
aceld111
As an MtG player, I really don't get why we can't use spin-down dice for the
high roll at the beginning of the game.

~~~
danielvinson
I'm a former competitive MtG player. My playgroup found that with a bit of
practice and a specific technique, we could roll 17+ on any spindown dice
almost every time (try rolling the die with it starting in your hand with the
20 facing upwards with a sideways motion). While technically you can use any
method you choose to randomly determine the start of a game, these dice have
been proven to not be "random" thus cannot be used in a tournament. It doesn't
matter at all in casual play, but in a tournament, you're just giving your
opponent a chance to play first for free.

------
nickthegreek
There is also the saltwater float test for lighter plastic dice.

[https://www.youtube.com/watch?v=VI3N4Qg-JZM](https://www.youtube.com/watch?v=VI3N4Qg-JZM)

Anyone know of an easy way to do metal dice?

~~~
tedmcory77
Have you ever seen numbers run on float tested dice that test as biased?

Metal dice are less likely to have balance issues on the individual dice as
long as the mold is balanced (because they are solid and heavy).

Source - I own a dice company.

~~~
nickthegreek
I have not, but that sounds like a great kinda thing that a dice company
should post! My d&d players are _kinda_ dice obsessed. What is your company?
My buddy and I run a small business making dice towers/trays on a laser.

~~~
tedmcory77
Yea, it's something that we want to do at scale (rollers, dice) via machine
vision so all the "well actually" people that pop out of the woodwork with
edge cases can be addressed ahead of time.

Hahaha, not ready to dox myself yet but you can probably figure it out with a
little digging.

------
nullc
Von Neumann's algorithm for simulating a fair coin from a coin which is
arbitrarily biased (but with independent throws) has moral analogs for any
number of faces. By using more state you can also extract more data per throw.
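
A minimal sketch of the simplest such analog, assuming independent throws of the same die. This is the plain pairwise version, not the table-based or higher-state variants mentioned here: it yields at most one bit per pair of rolls and discards ties. The bias weights in the demo are invented.

```python
import random

def von_neumann_bits(rolls):
    """Turn independent rolls of a (possibly biased) die into unbiased bits.

    Rolls are taken in non-overlapping pairs (a, b). Since P(a, b) equals
    P(b, a) for independent throws of the same die, a > b and a < b are
    equally likely; tied pairs are discarded.
    """
    bits = []
    for a, b in zip(rolls[0::2], rolls[1::2]):
        if a != b:
            bits.append(1 if a > b else 0)
    return bits

# Demo with a made-up biased d20 where 20 is five times as likely as any other face.
rng = random.Random(0)
weights = [1] * 19 + [5]
rolls = rng.choices(range(1, 21), weights=weights, k=20000)
bits = von_neumann_bits(rolls)
print(f"{len(bits)} bits from {len(rolls)} rolls, fraction of 1s: {sum(bits) / len(bits):.3f}")
```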

Pieter Wuille created a table based implementation for a d6:
[https://gist.github.com/sipa/1913cf8aae565ddad0d1de7f2e9f7f3...](https://gist.github.com/sipa/1913cf8aae565ddad0d1de7f2e9f7f31)
though it isn't terribly efficient -- extracting only 2.722 bits per 4 rolls,
instead of the 10.34 which is theoretically possible -- because the table
becomes fairly large fairly fast. d20 would be even worse.

[https://gist.github.com/sipa/1621c40775007f1c27a39b608a765b1...](https://gist.github.com/sipa/1621c40775007f1c27a39b608a765b16)
is a similar implementation for two sides that extracts more than half the
theoretical entropy.

If your application wanted uniform d20 throws a table that converted
independent but biased d20 throws into unbiased d20 throws would be possible
(though like above it would be inefficient unless it were very big.)

------
aidenn0
Interesting that this confirms other findings that the gamescience dice must
have the sprue-stub removed for fair rolling. This is intuitively true, but
contrary to Gamescience's marketing.

------
Fomite
Related, on d6's: [http://variancehammer.com/2015/07/31/musing-on-custom-dice/](http://variancehammer.com/2015/07/31/musing-on-custom-dice/)

~~~
teej
Here’s a fascinating visualization of 6000 simulated D6 rolls -
[https://www.reddit.com/r/dataisbeautiful/comments/au25nq/sim...](https://www.reddit.com/r/dataisbeautiful/comments/au25nq/simulating_6000_die_rolls_visualization_created/?st=JSM726PF&sh=a4988eeb)

------
woliveirajr
Should add [2015] to the title

~~~
markfickett
Hah yes, as the author I was surprised to see this pop up again! But it's a
pleasant surprise for sure.

~~~
jandeboevrie
It was on Lobste.rs first and I thought people here might like it as well.

------
En_gr_Student
I wonder about auto-correlation and repeated sequences of runs. If it isn't
totally random, then it is partly not random. What is the nature of the non-
randomness?

------
aidenn0
It would be interesting to take some of the unfair dice and sand down the
faces until they are equal diameters. That's a big time commitment though.

------
enthdegree
Would have liked to have seen a bootstrap estimate of the die's empirical
entropy to see how much it is lacking against uniform.
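
Roughly what I have in mind, as a sketch: a plug-in entropy estimate with a percentile bootstrap interval, compared against log2(20) for a uniform d20. The rolls below are simulated from an invented bias rather than taken from the article's data, and the plug-in estimator is known to be biased slightly low.

```python
import math
import random
from collections import Counter

def entropy_bits(rolls):
    """Plug-in (empirical) Shannon entropy of the observed faces, in bits."""
    n = len(rolls)
    return -sum((c / n) * math.log2(c / n) for c in Counter(rolls).values())

def bootstrap_entropy(rolls, resamples=200, seed=0):
    """Percentile 95% interval for the entropy, resampling rolls with replacement."""
    rng = random.Random(seed)
    n = len(rolls)
    estimates = sorted(entropy_bits(rng.choices(rolls, k=n)) for _ in range(resamples))
    return estimates[int(0.025 * resamples)], estimates[int(0.975 * resamples) - 1]

# Simulated rolls from a hypothetical die where face 17 is over-represented.
rng = random.Random(1)
weights = [1.0] * 20
weights[16] = 1.5
rolls = rng.choices(range(1, 21), weights=weights, k=2000)

low, high = bootstrap_entropy(rolls)
print(f"uniform d20: {math.log2(20):.4f} bits")
print(f"empirical: {entropy_bits(rolls):.4f} bits, bootstrap 95%: {low:.4f} to {high:.4f}")
```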

~~~
markfickett
If you want to do that analysis, I'd be very interested to see it / hear how
you do it. All the data is in the GitHub repo, for example
[https://github.com/markfickett/dicehistogram/blob/master/dat...](https://github.com/markfickett/dicehistogram/blob/master/data/151023d20chessexgeminicoppersteel/labels.csv)
lists the rolls for one test.

I ran bootstrapped subsampling to estimate confidence intervals (see comment
above), not sure how similar that is to what you're asking about.

------
purplezooey
That's pretty neat. Wonder if he wore a hole in the wood veneered desk.

------
Calashle0202
Very nice work! I am not sure what method of analysis would be the most
appropriate, but the sequence matrix deserves a closer look. You mention the
possibility of one side being preferentially followed by another one, but you
only investigate the simple case of the two sides being equal (preference for
the diagonal). However, the correlation could be less obvious, such as one
side being followed by its opposite, or by an adjacent one.
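
A sketch of one such check, assuming the rolls are available as an ordered list of face values and that opposite faces sum to 21 (the numbering convention quoted elsewhere in this thread). The function names are made up, and the adjacent-face case would need the die's actual face layout, which isn't included here.

```python
def transition_counts(rolls, faces=20):
    """matrix[i][j] = number of times face i+1 was immediately followed by face j+1."""
    matrix = [[0] * faces for _ in range(faces)]
    for prev, nxt in zip(rolls, rolls[1:]):
        matrix[prev - 1][nxt - 1] += 1
    return matrix

def opposite_follow_rate(rolls, faces=20):
    """Fraction of consecutive pairs where a face is followed by its opposite.

    Assumes opposite faces sum to faces + 1. For a fair die with independent
    rolls this should be close to 1 / faces.
    """
    pairs = list(zip(rolls, rolls[1:]))
    hits = sum(1 for prev, nxt in pairs if prev + nxt == faces + 1)
    return hits / len(pairs)

# Tiny made-up sequence, just to show the call shape.
sample = [1, 20, 1, 5, 16, 3]
print(opposite_follow_rate(sample))
print(transition_counts(sample)[0][19])  # times a 1 was followed by a 20
```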

