
A Truth in the Law of Small Numbers - houseofshards
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2627354
======
dzdt
I don't get this yet. Their lead paragraph says:

> Jack takes a coin from his pocket and decides that he will flip it 4 times
> in a row, writing down the outcome of each flip on a scrap of paper. After
> he is done flipping, he will look at the flips that immediately followed an
> outcome of heads, and compute the relative frequency of heads on those
> flips. Because the coin is fair, Jack of course expects this empirical
> probability of heads to be equal to the true probability of flipping a
> heads: 0.5. Shockingly, Jack is wrong. If he were to sample one million fair
> coins and flip each coin 4 times, observing the conditional relative
> frequency for each coin, on average the relative frequency would be
> approximately 0.4.

If I try to work this out, I write down the 16 possibilities for four coin
flips:

    
    
        TTTT, TTTH, TTHT, TTHH,
        THTT, THTH, THHT, THHH,
        HTTT, HTTH, HTHT, HTHH,
        HHTT, HHTH, HHHT, HHHH
    

I count 24 instances where H occurs before the end of the sequence, 12 of
which are followed by H and 12 of which are followed by T. So I get the
expected 0.5 outcome.

The authors do some other calculation, and I don't understand what they are
thinking. Can someone explain?
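
A minimal sketch of one way to reconcile the two numbers, by brute-force enumeration. It assumes (my reading of the quoted paragraph, not something confirmed in this thread) that the authors compute a relative frequency separately for each 4-flip sequence, drop sequences that have no H in the first three flips, and then average those frequencies. The function names are made up for illustration.

```python
from fractions import Fraction
from itertools import product


def followers(seq):
    """Return the flips that immediately follow an H in seq."""
    return [b for a, b in zip(seq, seq[1:]) if a == "H"]


def pooled_and_averaged(n):
    """Compare pooling all data points with averaging per-sequence frequencies."""
    pooled_heads = pooled_total = 0
    per_sequence = []
    for seq in product("HT", repeat=n):
        after_h = followers(seq)
        if not after_h:
            # No H before the last flip: this sequence yields no data.
            continue
        heads = after_h.count("H")
        pooled_heads += heads
        pooled_total += len(after_h)
        per_sequence.append(Fraction(heads, len(after_h)))
    pooled = Fraction(pooled_heads, pooled_total)
    averaged = sum(per_sequence) / len(per_sequence)
    return pooled, averaged, pooled_total


pooled, averaged, total = pooled_and_averaged(4)
print(total, pooled, averaged, float(averaged))
```

Pooling reproduces the 24 instances and the 0.5 above; averaging per sequence gives 17/42, about 0.405. The gap comes from giving a sequence with one data point the same weight as a sequence with three.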

~~~
iron0012
Here's an explanation that addresses this apparent inconsistency:
http://andrewgelman.com/2015/09/30/hot-hand-explanation-again/

~~~
mcnamaratw
The linked explanation seems to be that if you do the probability wrong in a
certain way, you come up with something below 50%.

Here's one way to reproduce the 40% number they get in the paper. Take a
sequence of four flips. Consider five cases:

1\. 0 heads. Probability of head following a head = 0

2\. 1 head. Probability of head following a head = 0

3\. 2 heads. Probability of head following a head = 1/3

4\. 3 heads. Probability of head following a head = 2/3

5\. 4 heads. Probability of head following a head = 1

Now if those five cases were equally likely, then what would be the expected
probability of a head following a head?

Answer: (0 + 0 + 1/3 + 2/3 + 1)/5 = 0.4

Is this what they assume gamblers are using for 'empirical probability'? I
can't tell.
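
The arithmetic for the five cases can be checked with exact fractions. This sketch takes the five per-case values as given and, purely for comparison, also weights them by how likely each head count is for a fair coin. Neither number is claimed to be the paper's actual calculation; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# Values taken as given from the five cases above, indexed by number of heads.
p_given_heads = [Fraction(0), Fraction(0), Fraction(1, 3), Fraction(2, 3), Fraction(1)]

# Equal weighting of the five cases, as assumed in the comment.
equal_weight = sum(p_given_heads) / len(p_given_heads)

# Binomial weighting: a fair coin gives k heads in 4 flips with probability C(4, k) / 16.
binomial_weight = sum(comb(4, k) * p for k, p in enumerate(p_given_heads)) / 2**4

print(equal_weight)     # 2/5
print(binomial_weight)  # 17/48
```

The 0.4 depends on treating the five cases as equally likely, which they are not for a fair coin (2 heads is six times as likely as 0 heads).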

------
mcnamaratw
Does anyone understand their argument clearly?

For three flips I get this sample space:

TTT: no data

TTH: no data

THT: one data point, HT

THH: one data point, HH

HTT: one data point, HT

HTH: one data point, HT

HHT: two data points, HH and HT

HHH: two data points, HH and HH

Total data points by result: HT 4, HH 4

For four flips it's the same deal: after an H I'm equally likely to get
another H or a T. Of course.

What am I missing? I don't quite understand their definition of 'empirical
probability.'
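
As a sketch of what 'empirical probability' might mean here, the three-flip sample space can be tabulated with a relative frequency per sequence. This assumes the frequency is computed within each sequence and that sequences with no data are left out of the average; that is an interpretation, not a definition taken from the paper.

```python
from fractions import Fraction
from itertools import product

rows = []
for flips in product("TH", repeat=3):
    seq = "".join(flips)
    # Data points: each pair (H, next flip) where the H is not the last flip.
    points = [seq[i:i + 2] for i in range(len(seq) - 1) if seq[i] == "H"]
    freq = Fraction(points.count("HH"), len(points)) if points else None
    rows.append((seq, points, freq))
    print(seq, points, freq)

pooled_hh = sum(points.count("HH") for _, points, _ in rows)
pooled_all = sum(len(points) for _, points, _ in rows)
freqs = [freq for _, _, freq in rows if freq is not None]
mean_freq = sum(freqs) / len(freqs)

print("pooled:", Fraction(pooled_hh, pooled_all))
print("mean of per-sequence frequencies:", mean_freq)
```

Pooled, the data points split 4 HH to 4 HT as counted above. Averaged per sequence, the six sequences with data give 5/12, which is below 1/2.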

------
notsurenot
It is clear that there is a selection bias. Suppose there are 3 heads in a
row, HHH, and we want to estimate the probability that the next flip is H.
But suppose we do it this way: if the outcome is H, we don't count it as a
success; instead we start another estimate, taking HHHH as the new initial
condition. Because the successes are not counted correctly, this way of
estimating probabilities is a biased estimator of the real probability.
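
The bias can also be checked by simulation. This is a rough sketch of the procedure in the paragraph quoted at the top of the thread (many fair coins, 4 flips each, one relative frequency per coin), not of the streak-of-three setup described in this comment. The coin count is reduced from one million to keep the run quick, and the seed and variable names are arbitrary choices.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
n_coins, n_flips = 200_000, 4

freqs = []
for _ in range(n_coins):
    flips = [rng.random() < 0.5 for _ in range(n_flips)]  # True means heads
    after_heads = [nxt for cur, nxt in zip(flips, flips[1:]) if cur]
    if after_heads:  # coins with no heads in the first three flips give no data
        freqs.append(sum(after_heads) / len(after_heads))

mean_freq = sum(freqs) / len(freqs)
print(round(mean_freq, 3))
```

The average of the per-coin frequencies should land near 0.40 rather than 0.5, even though every individual flip is fair.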

~~~
LittlePeter
It is not THAT clear, which is why multiple other papers, as the article
says, made mistakes.

