
How to turn a biased coin into a Fair Coin - aristus
http://carlos.bueno.org/2011/10/fair-coin.html
======
blahedo
What I particularly like here is that the title of the article, and thus the
title of the HN post, fully conveyed the puzzle, so I was able to think about
it for a moment before reading the article, solve it, and then confirm that I
had the correct answer. Always a happy moment. :)

------
tyree732
This works and is very intuitive, but loses quite a bit of the coin flip's
entropy in the process. There are more efficient algorithms, such as
<http://www.eecs.umich.edu/~qstout/abs/AnnProb84.html> , which better use the
entropy generated by the coin flip.

~~~
imurray
One solution I heard has a basic idea that's quick to explain, although the
details are messy. It might be interesting to get some intution:

Flip the coin N times and count the number of heads, K. You got one of the C =
{N \choose K} = N!/((N-K)!K!) ways of getting K heads, all of which have the
same probability, no matter what the bias of the coin is. If you consider an
ordering of the C choices, then you have an integer from 0 to C-1 drawn
uniformly at random. You could represent that integer using binary and extract
about log_2(C) random bits.

For large N, the number of bits extracted is nearly the optimal NH bits, where
H is the binary entropy of the coin. To count how far through a list of
combinations I am for large N, I might construct a lattice and, I'm guessing,
start reproducing works referenced by the paper that the parent linked to.
Whereas the paper itself seems to have, from my quick skim, the definitive
answer. Nice.

As tucked away in another comment, related is:
<http://en.wikipedia.org/wiki/Randomness_extractor>

------
bretthopper
If anyone is wondering, the interactive calculator is done using Tangle
(<http://worrydream.com/Tangle/>) made by Bret Victor (most recently of
<http://worrydream.com/#!/LadderOfAbstraction> fame).

Good to see more examples of it in the wild.

------
bdhe
Interestingly, the Euro, when introduced back in the early 2000s was
considered biased and even led to a statistician testing it out.

<http://www.newscientist.com/article/dn1748>

~~~
imurray
The statement about spinning is too weak. You _have_ to spin rather than flip
to try to reveal bias:
[http://www.stat.columbia.edu/~gelman/research/published/dice...](http://www.stat.columbia.edu/~gelman/research/published/diceRev2.pdf)

As for the statistical significance, here's another view of how to do it:
[http://www.inference.phy.cam.ac.uk/mackay/abstracts/euro.htm...](http://www.inference.phy.cam.ac.uk/mackay/abstracts/euro.html)

------
grot
This is a pretty well known problem. (First problem the professor in my
undergrad algorithms class ever gave to us)

 __ __ __ _SPOILER_ __ __ __ __

p = probability of heads

1 - p = probability of tails.

flip the coin twice:

head head = p^2

head tail/tail head = 2p(1-p)

tail tail = (1 - p)^2

Notice the two in the middle there. We can just say head tail = Heads and Tail
Heads = Tails. If we get HH or PP, just do it again.

Ok, now call the above procedure, our "randomization algorithm"

we can calculate the expected running time -- our algorithm succeeds with
probability 2p(1-p). The number of trials we need to run our algorithm for is
a geometric random variable. Thus, the expected running time is 1/(2p(1-p)).

~~~
mikhael
thanks for writing this. the formula used in the webpage, (HH+TT+HT)/TH,
appears to be off by one. i calculated it myself from the series, i.e. sum(n=0
to inf, 2(n+1)2pq(p^2+q^2)^n), and through the algebra i work it out to the
same as yours (doubled).

i'm not sure how the author's formula was derived, but i get the same result
calculating the series for a slightly different (but wrong) algorithm: just
keep flipping until HT or TH appears (not flipping by pairs). this series is
sum(n=1 to inf, (n+1)(qp^n + pq^n)), i.e. sum(n=0 to inf, (n+1)(qp^n + pq^n))
- (p+q). the series works out to 1/(p(1-p)), but that last term would account
for the one-off difference.

~~~
aristus
Good point. I am going off of the original algo by von Neumann, which throws
away both flips if you get HH or TT. There are many others, and I don't see
any reason against your optimization, if the flips are truly independent.

<http://www.fas.harvard.edu/~libcs124/CS/coinflip3.pdf>

~~~
mikhael
yeah, indeed -- but i think the optimization is actually broken. if you are
just waiting for an HT or a TH in a sequence of flips, then you will just wind
up with H..HT or T..TH, so that the very first flip decides the outcome. this
is how i first (much too hastily) read the algorithm, but, somewhat
surprisingly, found that the expectation matched exactly.

------
swaraj
Cool article, I got asked this question during my FB interviews and had
luckily seen something exactly like this before.

~~~
aristus
If you did, then it was very likely me interviewing you, and you should have
said that you had seen it before. #busted :D

~~~
hugh3
Well, this is why you shouldn't be using semi-well-known problems as interview
questions.

~~~
aristus
A fair cop. :) I used to ask it as a follow-up to a much harder, less "ah-ha"
question, to see how candidates approach probability.

------
0x12
If I set the slider to 50% then it still wants me to flip 3 times. At 50% one
flip would do. There should be at least an exception for that situation, I
can't imagine I'm being alone in testing the edge cases first.

~~~
aristus
50/50 is a special case. You don't need to use the von Neumann extractor. This
algo is for sources of randomness that you suspect have a bias.

But if you did use this algo on a 50/50 coin, half the time you'll get an HH
or TT and have to start over. The other half of the time you'll get an HT or
TH and return. So, on average, you'll need to do 3 flips for each execution of
the algo.

~~~
0x12
Sure, I get that.

But that's why the bias can never be a (discrete) input, it is an emergent
property of some unknown piece of matter, not something you set with a slider
in steps of 1%.

As the coin gets more biased you'll need to flip more often to get a HT or TH
pair with the amount of bias showing up as the frequency of the heads and
tails over a longer run.

But the interesting bits are right around the middle where you can take a coin
that is nearly perfect and use it in a perfect way, even when you can't even
find out in finite time if the coin is perfect or not (and I would expect no
real life coin to be perfect, but I would expect them to be so good as to be
undetectable in a reasonable amount of time, 50/50 to me is 'perfection', I
can't tell the left 50 from the one on the right).

The 'edge case', a perfect coin does not need three flips, it just needs one.
If that's not the intended outcome you could either remove that 50/50 point or
handle it correctly.

One of the more elegant ways of doing that would be to go from 49.9 to 50.1 or
so, or an even smaller difference from the true center.

~~~
DanielStraight
I think you are missing the trick here. Since HT and TH are equally likely,
regardless the bias of the coin, as soon as you get one or the other, you can
count it as a fair flip. All other combinations (HH and TT) are just ignored.
The number of flips shown by the form is just about how patient you'll have to
be to get a fair flip (HT or TH).

It doesn't matter how often you get TH and HT, only that they have the same
frequency. Given then the probability of TH and HT are the same, if you only
consider flips resulting in TH or HT, you automatically have a fair coin, no
matter how bad (or good) the coin is.

------
vsl2
If you like these types of puzzles, the biggest online collection I've found
is at:

<http://www.ocf.berkeley.edu/~wwu/riddles/hard.shtml>

------
eridius
This article claims this algorithm is very likely happening inside my computer
right now, but it doesn't elaborate. Why would my computer be performing this
algorithm?

~~~
aantthony
From <http://en.wikipedia.org/wiki/Randomness_extractor>:

Randomness extractors are used widely in cryptographic applications, whereby a
cryptographic hash function is applied to a high-entropy, but non-uniform
source, such as disk drive timing information or keyboard delays, to yield a
uniformly random result.

Is that what he is talking about?

------
jhamburger
Maybe I'm crazy and it's the end of the day and my reading comprehension is
shot, but I honestly can't even understand what the question is. Can somebody
explain?

~~~
prof_hobart
A fair coin would be one that is equally likely to land on either heads or
tails when tossed - on any given toss, you have exactly a 50% chance of
getting either heads or tails.

A biased coin, on the other hand, has a bias towards one side or another -
e.g. you could have a coin that lands on heads 75% of the time.

The question is - if you've got a biased coin, how do you get it to behave
like the fair coin?

It's probably easier to imagine a 10 sided die. You've got one with 7 sides
painted red and 3 painted blue, so you've got a 70% chance of rolling red on
any given roll. But you and your friend want to be able to use this die to
decide who's round it is next - guess the right colour and it's the other
guy's round. If you were to base this on a single role, you'd obviously both
want to pick red. So what you want is to be able to use this die to create two
outcomes which are both equally likely to happen.

Does that help?

------
jasny
I would say both parties flip the coin. The one that throws heads wins. If
they both throw the same, they need to throw again. Easy and effective :)

------
dhruvbird
This sounds simpler - if it works (can anyone confirm it?)

* We maintain a fair coin abstraction that contains an unfair coin. * Along with the unfair coin, we also maintain a counter 'C'. * 'C' starts from 0, counting up 1 every time the coin is flipped. * If C % 2 == 0 then we report the value of the unfair coin's toss. Otherwise, we reverse the value of the unfair coin's toss and report it.

------
Steko
Related: I vaguely recall hearing of a p2p game client where dice rolls were
needed and to secure against cheating they polled each client and took the
modulo of the sum of responses (iirc). You get unbiased outcomes as long as
one player doesn't cheat.

------
ch0wn
One of the few sites, where I don't need to use my iReader chrome extension to
read the text. I love the font and the hyphenation using the sweet-justice JS
script seems to work out great.

~~~
biot
The hyphenation seems a bit off. The word mathematical is split as "mat-
hematical" which I'm quite sure is incorrect.

------
dennyabraham
i'm not sure if i understand the problem. couldn't you just take the inverted
result of every other flip?

~~~
wnight
No. With a large bias you'd have hundreds of one result for one of the other.

