On the evidence of a single coin toss (moertel.com)
17 points by alexkay on Dec 7, 2010 | 14 comments



Say you have a prior probability distribution P(p) over p, the probability that the coin comes up heads. Your distribution P(p) will probably have a huge peak around p=0.5, but you can choose any prior belief. So P(p) is your opinion about the coin before seeing the experiment. Now we can apply Bayes' theorem to compute your opinion P'(p) about the coin after seeing the experiment (a single toss that comes up heads, H):

    P'(p) = P(p | H)
          = P(H | p)*P(p) / P(H)
          = p*P(p) / integral(p*P(p) dp)
          = p/E(p) * P(p)
Your belief that the coin comes up heads with probability p is skewed by a factor of p/E(p).

Here's an example of a graph of P(p) that shows how your belief about the coin is skewed after seeing a heads:

http://dl.dropbox.com/u/388822/coin.png

The first graph is an example of a prior belief about the coin, the second graph is the belief that this person should have after seeing the experiment.

So the answer to the question is:

    P'(1) = 1/E(p) * P(1) = P(1)/E(p)
i.e. your new probability that this is a coin that always comes up heads is your old probability divided by your expected value of the probability of coming up heads.

For example, if your prior belief is unbiased, so that E(p) = 0.5, then P'(1) = 2*P(1).
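For concreteness, here is a minimal Python sketch of this update (not part of the original comment): it discretizes an assumed example prior P(p) on a grid, applies P'(p) proportional to p*P(p), and checks that this matches the closed form p/E(p) * P(p). The bump-plus-lump prior is just an illustrative choice.

    import numpy as np

    # Grid of candidate heads-probabilities p; the prior is treated as a
    # discrete distribution over these grid points.
    p = np.linspace(0.0, 1.0, 1001)

    # Assumed prior: a broad bump centred on p = 0.5 plus a small lump of
    # mass at p = 1 (the "always comes up heads" hypothesis).
    prior = np.exp(-((p - 0.5) ** 2) / (2 * 0.05 ** 2))
    prior[-1] = 0.001 * prior.sum()
    prior /= prior.sum()

    # Bayes update after observing one head: P'(p) is proportional to p * P(p).
    posterior = p * prior
    posterior /= posterior.sum()

    # The same update via the closed form P'(p) = (p / E(p)) * P(p).
    E_p = np.sum(p * prior)
    closed_form = (p / E_p) * prior

    print(np.allclose(posterior, closed_form))   # True
    print(posterior[-1] / prior[-1], 1.0 / E_p)  # both are 1/E(p), about 2 here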


Your prior belief that the coin is "special" should be extremely small, since you've examined it and can see no reason for it to come up heads all the time. After flipping, your belief will be larger, but still small.


Any prior belief is valid. That's the point of this: you separate the mathematical reasoning from the subjective assumptions (beliefs). For example it might be the case that Tom has demonstrated several of those special coins before, and in that case your opinion would probably be that there is a good chance that this one is special too. The nice thing about the math is that we can encapsulate these assumptions in the prior probability distribution P(p).

BTW, I used that distribution in the plots because it was the easiest to come up with, it's somewhat realistic, and it shows the skewing well. Feel free to plug in your own beliefs.


I think this problem has no satisfactory answer. We are asked to operate with quantifiable information (the probability of special and normal coins coming up heads) and unquantifiable information: my "belief" in the coin being a special coin, which may or may not depend on my level of knowledge of several factors, like the plausibility of the existence of special coins, their prevalence, my competence at identifying them...

EDIT: ...my friend being a habitual liar or not, his sleight-of-hand skills, my level of rational analysis and critical thinking, me being on drugs, me dreaming, me actually experiencing the Matrix...

We are trying to quantify the unquantifiable: my "belief" in something, a psychological phenomenon which depends on millions of rational and irrational factors.


... and it turns out that there is a mathematical formalism (namely, probability theory) that allows you to reason on the basis of all this information, provided only that you make your best attempt to quantify the allegedly unquantifiable. Which is in fact perfectly quantifiable; it's just hard to give a good justification for whatever particular quantifying you do.

My reasoning, for what it's worth, goes as follows.

It's surely impossible to make a coin that literally always comes up heads no matter how you toss it. (Imagine that you contrive to send it vertically into the air with no spin at all. It will come down the same way as it went up.) So let's charitably interpret my hypothetical friend's claim as something like "if you toss this in a normal sort of way, it'll come up heads at least 99% of the time". That seems like it might be achievable with something that looks like a normal coin. Now, how likely is that given only that (1) it looks like a normal coin and (2) my friend has told me what he has?

Well, now it's time to pluck numbers from the air, in an attempt to quantify my beliefs about biased coins and my friend's trustworthiness and so forth. I suppose I think there's about a 50% chance that it's actually possible to make a coin that behaves as described. If so, it probably isn't particularly hard to figure out how to do so, in which case the makers of magic tricks and practical jokes and suchlike will surely do it and sell the fake coins at a reasonable price. Except that in most countries it's a serious offence to sell fake currency, so really convincing fake coins would have to come from abroad and be shipped in surreptitiously. Let's say a 10% chance, given that such coins are possible and not too expensive to make, that it's commercially worthwhile for someone to do that.

Now, my friend is purely hypothetical, so who knows what his character and interests might be? Let's suppose he's known to be interested in this sort of thing. Then perhaps he's about equally likely to procure such a coin, if it exists, as to play a joke on me by claiming to have such a coin and just hoping it comes up heads. (Of course if no such coins exist then he'll certainly do the latter.)

So, my prior probability for the coin's being fake in the appropriate way is about 1/20.

Now Pr(coin is fake|heads) / Pr(coin is normal|heads) = Pr(coin is fake) / Pr(coin is normal) times Pr(heads|coin is fake) / Pr(heads | coin is normal). That's 1/20 times 0.99/0.5, or about 1/10.

So, before the coin toss I'd have given roughly 20:1 odds against the coin's being fake; after it, roughly 10:1.

(Note: I have deliberately done the calculations only very roughly. That's usually good enough in practice, and all my prior probabilities are crude estimates anyway.)
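For what it's worth, here is the odds-form arithmetic as a few lines of Python; the 1/20 prior odds and the 0.99 and 0.5 likelihoods are just the rough estimates plucked above, nothing more.

    # Odds-form Bayes with the rough numbers above. At these small values the
    # difference between a probability of 1/20 and odds of 1:20 is ignored,
    # as in the comment itself.
    prior_odds_fake = 1 / 20          # Pr(fake) / Pr(normal), rough prior
    likelihood_ratio = 0.99 / 0.5     # Pr(heads | fake) / Pr(heads | normal)

    posterior_odds_fake = prior_odds_fake * likelihood_ratio
    print(posterior_odds_fake)        # ~0.099, i.e. roughly 10:1 against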

Of course, if (e.g.) my friend starts offering suitable bets on the outcome of the next few tosses, that will itself be evidence that the coin really is fake. So if you happen to have such a coin, you won't be able to cheat me out of as much money as you might hope :-).


This is a Bayesian problem, depending on the a priori probability pri given to a "special coin". By Bayes' rule, after seeing one head,

    P = P(special | heads) = pri * 1 / (pri * 1 + (1 - pri) * 0.5) = 2 * pri / (1 + pri)

(which is greater than pri, the probability before tossing).
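As a quick sanity check (not in the original comment), a few lines of Python comparing that closed form with a direct application of Bayes' rule, assuming the special coin always lands heads and a normal coin lands heads half the time:

    def posterior_special(pri):
        # Bayes' rule for the two-hypothesis case: the special coin gives
        # heads with probability 1, a normal coin with probability 1/2.
        return pri * 1.0 / (pri * 1.0 + (1 - pri) * 0.5)

    for pri in (0.01, 0.1, 0.5):
        print(posterior_special(pri), 2 * pri / (1 + pri))  # the two agree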


Well, it will have an effect. He could be telling the truth or lying, and this has to be taken into account. The question is how much weight you give to his statement. If you are inclined to believe his statement, then the first heads must increase your confidence, for a given weighting.


Everyone here seems to be so much smarter than me at this math stuff.

I think I'll stick to writing accounting software.


Probability theory looks to be one of the harder areas in math to get anywhere with by just being smart, as opposed to studying it. Plenty of professional mathematicians, including Paul Erdős, who was pretty smart about math stuff, got the Monty Hall problem wrong when they first encountered it, for example.


There are at least three different lines of inquiry here:

- Hypothesis testing. If the [null] hypothesis is that p(heads) is 1, you can't prove this, only disprove it. So: "doesn't sway". Not very interesting, but there it is.

- Simple Bayesian. The probability of his claim given that it comes up heads, p(C|H), is the prior of his claim, p(C), times p(H|C), divided by p(H). Well, p(H|C) is 1 (that is the claim), and p(H), if I fudge things a little bit, is about 1/2, so p(C|H) should be about double p(C)---assuming p(C) is very low to start with.[0][2]

- Complex Bayesian. There's a hidden probability in the simple case, because p(C) encompasses both my belief in coins generally and also my belief about Tom's truthtelling. So really I have p(C) "p that the claim is true" but also p(S) "p that Tom stated the claim to me". Thus also p(S|C) "p that if the claim were true, Tom would state this claim to me" and p(C|S) "p of the claim being true given that Tom stated it to me"; but also the highly relevant p(S|not C) "p that if the claim were NOT true, Tom would state this claim to me ANYWAY" and a few other variants. When you start doing Bayesian analysis with more than two variables you nearly always need to account for both p(A|B) and p(A|not B) for at least some of the cases, even where you could sometimes fudge this in the simpler problems.

So this brings us to a formulation of the original question as: what is the relationship between p(C|S,H) and p(C|S)? The former can be rewritten as

    p(H|C,S)p(C,S) / (p(C,S,H) + p(not C,S,H))

and then as

    p(H|C,S)p(C,S) / (p(H|C,S)p(C,S) + p(H|not C,S)p(not C,S))

and if I take p(H|C,S) as 1 (given) and p(H|not C,S) as 1/2 (approximate), I'm left with

    p(C,S) / (p(C,S) + 0.5 p(not C,S))

For the prior quantity p(C|S), a similar set of rewrites gives me

    p(C,S) / (p(C,S) + p(not C,S))

Now I'm in the home stretch, but I'm not done.

Here we have to break down p(C,S) and p(not C,S). For p(C,S) we can use p(C)p(S|C), which is "very small" times "near 1", assuming Tom would be really likely to state that claim if it were true (wouldn't you want to show off your magic coin?). The other one's more interesting. We rewrite p(not C,S) as p(not C)p(S|not C), which is "near 1" times "is Tom just messing with me?".

A crucial part of this analysis, which is missing in the hypothesis-test version and in the simpler Bayesian model, but "obvious" to anyone who approaches it from a more intuitive standpoint, is that it matters a lot whether you think Tom might be lying in the first place, and whether he's the sort who would state a claim like this just to get a reaction or whatever. In the case where you basically trust Tom ("he wouldn't say that unless he at least thought it to be true"), the terms of p(C,S) + p(not C,S) might be of comparable magnitude, and multiplying the second of them by 1/2 will have a noticeable effect. But if you think Tom likely to state a claim like this, even if false, just for effect (or any other reason), then p(C,S) + p(not C,S) is hugely dominated by that second term, which would be many orders of magnitude larger than the first; multiplying that second term by 1/2 still leaves it orders of magnitude larger, and the overall probability, even with the extra evidence, remains negligible.
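To make that concrete, here is a small Python sketch of the three-variable model; every number in it is an assumed placeholder, chosen only to illustrate how a large p(S|not C) swamps the evidence of one head.

    def claim_given_statement(p_C, p_S_given_C, p_S_given_notC, p_H_given_notC=0.5):
        # Joint weights for the two cases consistent with "Tom stated the claim".
        w_C = p_C * p_S_given_C                  # p(C, S)
        w_notC = (1 - p_C) * p_S_given_notC      # p(not C, S)

        before = w_C / (w_C + w_notC)                  # p(C | S)
        after = w_C / (w_C + p_H_given_notC * w_notC)  # p(C | S, H), taking p(H | C, S) = 1
        return before, after

    # Basically trusting Tom: the two weights are of comparable magnitude,
    # so halving the second one has a noticeable effect.
    print(claim_given_statement(1e-4, 0.9, 2e-4))   # ~0.31 before, ~0.47 after
    # Tom who likes messing with people: p(S | not C) dominates.
    print(claim_given_statement(1e-4, 0.9, 0.5))    # both negligible, ~2e-4 and ~4e-4

The exact placeholder values don't matter; the structure of the update is the point.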

[0] This clearly breaks if p(C) is higher than 1/2, because twice that is more than 1. If we assume that the prior p(H) is a distribution over coins, centred on the fair ones and with a long tail going out to near-certainty at both ends, the claim "this coin is an always-heads coin"[1] is removing a chunk of that distribution in the H direction, meaning that p(H|not C) is actually slightly, very slightly, greater than 1/2. This is the "fudge" I refer to above that lets me put the p(H) as 1/2. Clearly if my prior p(C) is higher than "very small" this would be inconsistent with the prior p(H) I've described.

[1] I'm further assuming that "always" means "reallllllly close to always", because otherwise the claim is trivially false and the problem isn't very interesting.

[2] Note that this is not actually a "naive Bayesian" approach---that's a technical term that means something more complicated.


A little dense, sorry; I reposted on my blog with slightly better formatting (and at least breaking out some of the math onto separate lines):

http://www.blahedo.org/blog/archives/001081.html


Nice walkthrough of Bayesian probability, then Bayesian epistemology. All that's missing is a link to http://yudkowsky.net/rational/technical for those who'd like further reading on the subject.


This answer makes the most sense to me, though I don't understand it.

Some quantity x, which is your trust in Tom, plus some quantity y, which is the probability of the coin, multiplied by some factor of increasing confidence.


For a different approach: from a scientific/logical perspective the result has no bearing on the truth of his claim. The only possibility which we can remove from consideration is the claim that "The coin will always land tails."



