

On the evidence of a single coin toss - alexkay
http://blog.moertel.com/articles/2010/12/07/on-the-evidence-of-a-single-coin-toss

======
blahedo
There are at least three different lines of inquiry here:

\- Hypothesis testing. If the [null] hypothesis is that p(heads) is 1, you
can't prove this, only disprove it. So: "doesn't sway". Not very interesting,
but there it is.

\- Simple Bayesian. The probability of his claim given that it comes up heads,
p(C|H), is the prior of his claim, p(C), times p(H|C), divided by p(H). Well,
p(H|C) is 1 (that _is_ the claim), and p(H), if I fudge things a little bit,
is about 1/2, so p(C|H) should be about double p(C)---assuming p(C) is very
low to start with.[0][2]

\- Complex Bayesian. There's a hidden probability in the simple case, because
p(C) encompasses both my belief in coins generally and my belief about Tom's
truthtelling. So really I have p(C) "p that the claim is true" but
also p(S) "p that Tom stated the claim to me". Thus also p(S|C) "p that if the
claim were true, Tom would state this claim to me" and p(C|S) "p of the claim
being true given that Tom stated it to me"; but also the highly relevant
p(S|not C) "p that if the claim were NOT true, Tom would state this claim
to me ANYWAY" and a few other variants. When you start doing Bayesian analysis
with more than two variables you nearly always need to account for both p(A|B)
and p(A|not B) for at least some of the cases, even where you could sometimes
fudge this in the simpler problems.
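The simple Bayesian case can be sketched numerically; the prior p(C) = 1e-6 below is just an illustrative "very low" value:

```python
# Simple Bayesian update for the claim C ("always heads") after one head H,
# taking p(H|C) = 1 and p(H|not C) ~ 1/2 as in the text above.

def p_claim_given_heads(p_c, p_h_given_not_c=0.5):
    p_h = 1.0 * p_c + p_h_given_not_c * (1 - p_c)    # total probability of H
    return 1.0 * p_c / p_h                           # Bayes' theorem

p_c = 1e-6                       # an illustrative "very low" prior
print(p_claim_given_heads(p_c))  # ~2e-6: roughly double the prior
```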

So this brings us to a formulation of the original question as: what is the
relationship between p(C|S,H) and p(C|S)? The former expands as
p(H|C,S)p(C,S)/(p(C,S,H) + p(not C,S,H)), and then as
p(H|C,S)p(C,S)/(p(H|C,S)p(C,S) + p(H|not C,S)p(not C,S)); if I take
p(H|C,S) as 1 (given) and p(H|not C,S) as 1/2 (approximate), I'm left with
p(C,S)/(p(C,S) + 0.5p(not C,S)). For the prior quantity p(C|S), a similar set
of rewrites gives me p(C,S)/(p(C,S) + p(not C,S)). Now I'm in the home
stretch, but I'm not done.

Here we have to break down p(C,S) and p(not C,S). For p(C,S) we can use
p(C)p(S|C), which is "very small" times "near 1", assuming Tom would be really
likely to state that claim if it were true (wouldn't _you_ want to show off
your magic coin?). The other one's more interesting. We rewrite p(not C,S) as
p(not C)p(S|not C), which is "near 1" times "is Tom just messing with me?".

A _crucial_ part of this analysis, missing in the hypothesis-test version and
in the simpler Bayesian model but "obvious" to anyone who approaches it from a
more intuitive standpoint, is that it matters a _lot_ whether you think Tom
might be lying in the first place, and whether he's the sort who would state a
claim like this just to get a reaction or whatever. In
the case where you basically trust Tom ("he wouldn't say that unless he at
least thought it to be true") then the terms of p(C,S) + p(not C,S) might be
of comparable magnitude, and multiplying the second of them by 1/2 will have a
noticeable effect. But if you think Tom likely to state a claim like this,
even if false, just for effect (or any other reason), then p(C,S) + p(not C,S)
is _hugely_ dominated by that second term, which would be many orders of
magnitude larger than the first, and so multiplying that second term by 1/2 is
still going to leave it orders of magnitude larger, and the overall
probability—even with the extra evidence—remains negligible.
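The two scenarios above can be sketched with illustrative numbers; everything here is an assumption plugged into the three-variable model (p_c is the prior on the claim C, and the two p_s_* values are the chances Tom states the claim S when it is true or false):

```python
# A sketch of the three-variable model above, with illustrative numbers.
# p_c: prior that the claim C is true; p_s_given_c / p_s_given_not_c:
# probability that Tom states the claim S when it is true / false.

def posterior_given_statement(p_c, p_s_given_c, p_s_given_not_c, heads=False):
    joint_c = p_c * p_s_given_c                  # p(C, S)
    joint_not_c = (1 - p_c) * p_s_given_not_c    # p(not C, S)
    if heads:
        joint_not_c *= 0.5                       # p(H | not C, S) ~ 1/2
    return joint_c / (joint_c + joint_not_c)     # p(C | S) or p(C | S, H)

# "Basically trust Tom": he states false claims one time in a million.
print(posterior_given_statement(1e-6, 0.99, 1e-6, heads=False))  # ~0.50
print(posterior_given_statement(1e-6, 0.99, 1e-6, heads=True))   # ~0.66
# "Tom just messing with me": he'd state it half the time regardless.
print(posterior_given_statement(1e-6, 0.99, 0.5, heads=True))    # ~4e-6
```

With a trusted Tom the two joint terms are comparable, so the heads observation visibly moves the posterior; with a prankster Tom the p(not C, S) term dominates by orders of magnitude and the posterior stays negligible, exactly as argued above.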

[0] This clearly breaks if p(C) is higher than 1/2, because twice that is more
than 1. If we assume that the prior p(H) is a distribution over coins, centred
on the fair ones and with a long tail going out to near-certainty at both
ends, the claim "this coin is an always-heads coin"[1] is removing a chunk of
that distribution in the H direction, meaning that p(H|not C) is actually
slightly, very slightly, greater than 1/2. This is the "fudge" I refer to
above that lets me put the p(H) as 1/2. Clearly if my prior p(C) is higher
than "very small" this would be inconsistent with the prior p(H) I've
described.

[1] I'm further assuming that "always" means "reallllllly close to always",
because otherwise the claim is trivially false and the problem isn't very
interesting.

[2] Note that this is not actually a "naive Bayesian" approach---that's a
technical term that means something more complicated.

~~~
blahedo
A little dense, sorry; I reposted on my blog with slightly better formatting
(and at least breaking out some of the math onto separate lines):

<http://www.blahedo.org/blog/archives/001081.html>

------
jules
Say you have a prior probability distribution P(p) expressing how likely you
think it is that the coin is one that comes up heads with probability p. Your
distribution P(p) will probably have a huge peak around p=0.5, but you can
choose any prior belief. So P(p) is your opinion about the coin prior to
seeing the experiment. Now we can apply Bayes' theorem to compute your
opinion P'(p) about the coin after seeing the experiment:

    
    
        P'(p) = P(p | H)
              = P(H | p)*P(p)/P(H)
              = p*P(p)/integral(p*P(p)dp)
              = p/E(p) * P(p)
    

Your belief that the coin has probability p is skewed by a factor of p/E(p).

Here's an example of a graph of P(p) that shows how your belief about the coin
is skewed after seeing a heads:

<http://dl.dropbox.com/u/388822/coin.png>

The first graph is an example of a prior belief about the coin, the second
graph is the belief that this person should have after seeing the experiment.

So the answer to the question is:

    
    
        P'(1) = 1/E(p) * P(1) = P(1)/E(p)
    

i.e. your new probability that this is a coin that always comes up heads is
your old probability divided by your expected value of the probability of
coming up heads.

For example, if your prior belief is unbiased, then E(p)=0.5, and P'(1) =
2*P(1).
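This can be checked numerically; the sketch below uses a uniform ("unbiased") prior on a midpoint grid, purely for illustration:

```python
# Numerical check of P'(p) = p/E(p) * P(p) on a midpoint grid over (0, 1).
# The prior here is uniform ("unbiased"), so E(p) = 0.5 and P'(1) = 2*P(1).

N = 100_000
dp = 1.0 / N
grid = [(i + 0.5) * dp for i in range(N)]   # midpoints of N equal cells
prior = [1.0] * N                           # uniform density, integrates to 1

e_p = sum(pi * w * dp for pi, w in zip(grid, prior))      # E(p), ~0.5
posterior = [pi / e_p * w for pi, w in zip(grid, prior)]  # update after H

mass = sum(w * dp for w in posterior)       # posterior still integrates to ~1
print(e_p, posterior[-1] / prior[-1])       # ~0.5 and ~2.0
```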

~~~
Jabbles
Your prior belief that the coin is "special" should be extremely small, since
you've examined it and can see no reason for it to come up heads all the time.
After flipping, your belief will be larger, but still small.

~~~
jules
Any prior belief is valid. That's the point of this: you separate the
mathematical reasoning from the subjective assumptions (beliefs). For example
it might be the case that Tom has demonstrated several of those special coins
before, and in that case your opinion would probably be that there is a good
chance that this one is special too. The nice thing about the math is that we
can encapsulate these assumptions in the prior probability distribution P(p).

BTW I used that distribution in the plots because it was easiest to come up
with, and somewhat realistic, and it shows the skewing well. Feel free to plug
in your own beliefs.

------
vibragiel
I think this problem has no satisfactory answer. We are asked to operate with
quantifiable information —probability of special and normal coins coming up
heads— and unquantifiable information —my "belief" in a coin being a special
coin, which may, or may not, depend on my level of knowledge of several
factors, like the plausibility of the existence of special coins, their
prevalence, my competence identifying them...

EDIT: ...my friend being or not a usual liar, his sleight-of-hand skills, my
level of rational analysis and critical thinking, me being on drugs, me
dreaming, me actually experiencing the Matrix...

We are trying to quantify the unquantifiable: my "belief" in something, a
psychological phenomenon which depends on millions of rational and irrational
factors.

~~~
gjm11
... and it turns out that there is a mathematical formalism (namely,
probability theory) that allows you to reason on the basis of all this
information, provided only that you make your best attempt to quantify the
allegedly unquantifiable. Which is in fact perfectly quantifiable; it's just
hard to give a good justification for whatever particular quantifying you do.

My reasoning, for what it's worth, goes as follows.

It's surely impossible to make a coin that literally always comes up heads no
matter how you toss it. (Imagine that you contrive to send it vertically into
the air with no spin at all. It will come down the same way as it went up.) So
let's charitably interpret my hypothetical friend's claim as something like
"if you toss this in a normal sort of way, it'll come up heads at least 99% of
the time". That seems like it might be achievable with something that looks
like a normal coin. Now, how likely is that given only that (1) it looks like
a normal coin and (2) my friend has told me what he has?

Well, now it's time to pluck numbers from the air, in an attempt to quantify
my beliefs about biased coins and my friend's trustworthiness and so forth. I
suppose I think there's about a 50% chance that it's actually possible to make
a coin that behaves as described. If so, it probably isn't particularly hard
to figure out how to do so, in which case the makers of magic tricks and
practical jokes and suchlike will surely do it and sell the fake coins at a
reasonable price. Except that in most countries it's a serious offence to sell
fake currency, so really convincing fake coins would have to come from abroad
and be shipped in surreptitiously. Let's say a 10% chance, given that such
coins are possible and not too expensive to make, that it's commercially
worthwhile for someone to do that.

Now, my friend is purely hypothetical, so who knows what his character and
interests might be? Let's suppose he's known to be interested in this sort of
thing. Then perhaps he's about equally likely to procure such a coin, if it
exists, as to play a joke on me by claiming to have such a coin and just
hoping it comes up heads. (Of course if no such coins exist then he'll
certainly do the latter.)

So, my prior probability for the coin's being fake in the appropriate way is
about 1/20.

Now Pr(coin is fake|heads) / Pr(coin is normal|heads) = Pr(coin is fake) /
Pr(coin is normal) times Pr(heads|coin is fake) / Pr(heads | coin is normal).
That's 1/20 times 0.99/0.5, or about 1/10.

So, before the coin toss I'd have given roughly 20:1 odds against the coin's
being fake; after it, roughly 10:1.

(Note: I have deliberately done the calculations only very roughly. That's
usually good enough in practice, and all my prior probabilities are crude
estimates anyway.)
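The rough odds arithmetic above can be written out; all the inputs are gjm11's own stated estimates (a 1/20 prior probability of a fake coin, heads probabilities of 0.99 and 0.5):

```python
# gjm11's rough calculation in odds form. Inputs are the estimates stated
# above: prior probability of a fake ~ 1/20, p(heads|fake) = 0.99,
# p(heads|normal) = 0.5.

p_fake = 1 / 20
prior_odds = p_fake / (1 - p_fake)    # ~1:19, i.e. roughly 1:20
likelihood_ratio = 0.99 / 0.5         # Bayes factor from one observed head
posterior_odds = prior_odds * likelihood_ratio
print(posterior_odds)                 # ~0.10, i.e. roughly 10:1 against
```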

Of course, if (e.g.) my friend starts offering suitable bets on the outcome of
the next few tosses, that will itself be evidence that the coin really is
fake. So if you happen to have such a coin, you won't be able to cheat me out
of as much money as you might hope :-).

------
proemeth
This is a Bayesian problem. Writing pri for the a priori probability given to
a "special coin", the posterior after one head is

P = 2 * pri / (1+pri)

(which is greater than the probability before tossing)
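As a quick sanity check (illustrative Python), this closed form matches a direct Bayes computation with p(H|special) = 1 and p(H|normal) = 1/2:

```python
# proemeth's closed form vs. a direct Bayes computation; the two agree
# algebraically: pri / (pri + (1 - pri)/2) = 2*pri / (1 + pri).

def closed_form(pri):
    return 2 * pri / (1 + pri)

def direct_bayes(pri):
    return pri / (pri + (1 - pri) * 0.5)

for pri in (0.001, 0.1, 0.5, 0.9):
    assert abs(closed_form(pri) - direct_bayes(pri)) < 1e-12
print("formulas agree")
```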

------
kondro
Everyone here seems to be so much smarter than me at this math stuff.

I think I'll stick to writing accounting software.

~~~
rsaarelm
Probability theory looks to be one of the harder areas in math to get anywhere
with by just being smart, as opposed to studying it. Plenty of professional
mathematicians, including Paul Erdős, who was pretty smart about math stuff,
got the Monty Hall problem wrong when they first encountered it, for example.

------
SeanDav
Well, it will have an effect. He could be telling the truth or lying, and this
has to be taken into account. The question is how much weighting you give to
his statement. If you are inclined to believe his statement, then the first
heads must increase your confidence, for a given weighting.

