
When intuition and math probably look wrong (2010) - Tomte
https://www.sciencenews.org/article/when-intuition-and-math-probably-look-wrong
======
Xcelerate
How you state the problem has a lot to do with the resulting probabilities.
Many times the answer is ambiguous until you definitively identify the space
of states that a state can be drawn from.

Bertrand's paradox is an example of this issue:
[https://en.wikipedia.org/wiki/Bertrand_paradox_(probability)](https://en.wikipedia.org/wiki/Bertrand_paradox_\(probability\))

------
tel
This example along with all the similar ones drives home the basic idea that
probability is tied to its model and its model is often best phrased as a
generative story. If we agree on the generative story we'll agree on the model
and then your intuition probably works just fine. In situations where the
story is obscured—in this case, only the end results are mentioned so that the
story is ambiguous—then you can have ambiguous answers.

------
gpawl
After a long writeup about the treachery of ambiguous problem statements, the
article ends with ... a misleading ambiguous statement:

> the ukulele-playing and dancing ambitions would affect the probabilities
> about the sex of his sibling.

I suspect that this whole article would have been much better if it was written
by one of the interviewees instead of the journalist. Sigh.

------
tylerhou
The best intuition I've heard for the two-child problem is as follows.

Suppose you have two doors with two children behind them. The 1/2 chance of a
boy version works like this: Say you randomly open one of the two doors and
there is a boy. That doesn't give you any information about the other child, so
you say that the probability of the other one being a boy is 1/2.

1/3: This time beforehand you know that there is at least one boy. Then you
open a door and it happens to be a boy. That tells you something about the
other door, which alters its probability to 1/3.
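The two readings can be checked by brute-force enumeration. This is only a sketch: it assumes boys and girls are equally likely and independent, and the function names are made up for illustration. The first story opens a random door and happens to see a boy; the second only conditions on "at least one boy" without looking at a particular door.

```python
from fractions import Fraction
from itertools import product

# All equally likely (door 1, door 2) families: BB, BG, GB, GG.
FAMILIES = list(product("BG", repeat=2))

def random_door_story():
    """Open one door uniformly at random and see a boy; P(other child is a boy)."""
    seen_boy = Fraction(0)
    both_boys = Fraction(0)
    for family in FAMILIES:
        for opened in (0, 1):
            weight = Fraction(1, 2 * len(FAMILIES))
            if family[opened] == "B":
                seen_boy += weight
                if family[1 - opened] == "B":
                    both_boys += weight
    return both_boys / seen_boy

def at_least_one_boy_story():
    """Told only that at least one child is a boy; P(both are boys)."""
    kept = [f for f in FAMILIES if "B" in f]
    return Fraction(sum(f == ("B", "B") for f in kept), len(kept))

print(random_door_story())       # 1/2
print(at_least_one_boy_story())  # 1/3
```

Same families, different story about how the boy came to be mentioned, different answer.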

~~~
notahacker
Interesting you mentioned doors, since my first thought on reading the article
was of the Monty Hall Problem, specifically _is this guy trying to parody the
Monty Hall problem by coming up with purposely obtuse interpretations of
purposely vague questions to complicate what would normally be considered a
trivial conditional probability question?_

But the Monty Hall problem came later. This just feels like the probability
equivalent of Zeno's Paradox, except revolving around arbitrarily shifting
unstated assumptions about selection effects rather than arbitrarily reducing
distances travelled.

------
kpil
Stupid question perhaps, but why not:

    
    
      boy           girl
      boy(notgirl)  boy
      boy           boy(notgirl) <-reversed order
      girl          boy
    
    ?

~~~
acqq
You were probably annoyed before you'd read the whole article, like I was at
that moment. But it goes on:

"Not so fast, says probabilist Yuval Peres of Microsoft Research. That naïve
answer of 1/2? In real life, he says, that will usually be the most reasonable
one."

"If I specifically selected him because he was a boy born on Tuesday (and if I
would have kept quiet had neither of my children qualified), then the 13/27
probability is correct. But if I randomly chose one of my two children to
describe and then reported the child’s sex and birthday, and he just happened
to be a boy born on Tuesday," "the probability that the other child will be a
boy will indeed be 1/2."
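Both numbers in the quote can be reproduced by enumerating all 196 (sex, weekday) families. A rough sketch, assuming sex and weekday are uniform and independent; the weekday index 2 for Tuesday and the function names are arbitrary choices of mine, not from the article.

```python
from fractions import Fraction
from itertools import product

# A child is (sex, weekday); weekday 2 stands for Tuesday (an arbitrary label).
CHILDREN = list(product("BG", range(7)))
FAMILIES = list(product(CHILDREN, repeat=2))  # 196 equally likely families
TUESDAY_BOY = ("B", 2)

def selected_story():
    """Parent speaks up only if at least one child is a Tuesday boy."""
    kept = [f for f in FAMILIES if TUESDAY_BOY in f]
    two_boys = [f for f in kept if f[0][0] == "B" and f[1][0] == "B"]
    return Fraction(len(two_boys), len(kept))

def random_child_story():
    """Parent describes a randomly chosen child, who happens to be a Tuesday boy."""
    described = Fraction(0)
    other_is_boy = Fraction(0)
    for family in FAMILIES:
        for pick in (0, 1):
            weight = Fraction(1, 2 * len(FAMILIES))
            if family[pick] == TUESDAY_BOY:
                described += weight
                if family[1 - pick][0] == "B":
                    other_is_boy += weight
    return other_is_boy / described

print(selected_story())      # 13/27
print(random_child_story())  # 1/2
```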

~~~
kpil
Well I got annoyed a little bit further down when they started to coalesce
events arbitrarily, and stopped reading.... I probably don't understand why
it's meaningful or even valid to do that though...

On the other hand, I spent a rather sad year reading statistics because I
thought it was interesting, and got so bored that I got a job as a developer
instead, so I am rather confident that it's not for me to truly understand...

------
natosaichek
I'm not sure I totally understand, but maybe someone can confirm this for me.
If we add more "extraneous" information, it seems like it pushes the
probability closer to (the naive answer of) 1/2. If we add lots of
extraneous information, does it get really close? What if we did something
like this:

I have two children, one of whom is a black-haired, blue-eyed son with an owl-
shaped birthmark on his right leg born on a Tuesday in Argentina during an
eclipse while a flock of 231 seagulls circled clockwise overhead. What is the
probability that I have two boys?

Am I just muddying the water, or is the probability vanishingly close to 1/2?

~~~
08-15
> is the probability vanishingly close to 1/2?

It is. Here's how it clicked for me:

Forget about Tuesday; someone tells you "I have two kids, at least one of
whom is a boy who is _special_ ". Now you get these cases for the two
children: Bg, gB, Bb, bB, BB (where g is a girl, b is a boy, B is a _special_
boy). Without the last case (two _special_ boys), everything is symmetrical,
and the probability that the second child is a boy is 1/2. The less likely it
is that the guy has two _special_ boys, the closer the answer is to 1/2.

This also means that the everyday answer is actually 1/3, because parents
always think that all of their children are _special_ , so the problem reverts
to the simple Two Children Problem with the equally likely cases BG, GB, and
BB (yes, the girls are _special_ , too).
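To see the slide from 1/3 toward 1/2 numerically, here is a small sketch. It assumes each child is a boy with probability 1/2 and each boy is independently _special_ with probability `p_special` (a name I made up); it uses the same B/b/g labels as above.

```python
from fractions import Fraction

def p_two_boys(p_special):
    """P(two boys | at least one special boy); pass an int or Fraction."""
    p = Fraction(p_special)
    # Per child: special boy (B), ordinary boy (b), girl (g).
    child = {"B": p / 2, "b": (1 - p) / 2, "g": Fraction(1, 2)}
    at_least_one_special = Fraction(0)
    two_boys = Fraction(0)
    for first, w1 in child.items():
        for second, w2 in child.items():
            if "B" in (first, second):
                at_least_one_special += w1 * w2
                if "g" not in (first, second):
                    two_boys += w1 * w2
    return two_boys / at_least_one_special

print(p_two_boys(1))                  # 1/3, every boy is special
print(p_two_boys(Fraction(1, 7)))     # 13/27, "born on a Tuesday"
print(p_two_boys(Fraction(1, 1000)))  # 1999/3999, just under 1/2
```

Under these assumptions the enumeration simplifies to (2 - p)/(4 - p), which tends to 1/2 as p goes to 0.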

------
tunesmith
Peter Norvig has an IPython (Jupyter) notebook that explores this same puzzle:
[http://nbviewer.jupyter.org/url/norvig.com/ipython/Probabili...](http://nbviewer.jupyter.org/url/norvig.com/ipython/ProbabilityParadox.ipynb)

------
seycombi
Argument by Gary Foshee

"My solution was based on set theory. Look at the entire set of all families
with two children. Then look at a subset: those with two boys. Then look at
another subset: those with a boy born on Tuesday. If you look at it that way,
then 13/27 is the correct answer."

~~~
kgwgk
13/27 is the correct probability if we got the answer "I have two children,
one of whom is a son born on a Tuesday" by dialing random numbers from the
phone book and asking "could you confirm if you have two children, one of whom
is a son born on Tuesday?" Which might or might not be equivalent to the
actual data generation process.

------
kutkloon7
It always helps me to picture a probability space (as in, an actual physical
space, where size correlates to probability).

Assuming that children are always a boy or a girl, and are equally likely to
be a boy or a girl, the 'complete space' of families with two children would
be distributed as ABBC where A means two boys, B means a boy and a girl, and C
means two girls. By excluding C from the space, ABB remains. So indeed, the
probability would be 1/3 that both children would be boys.

But now I have made another assumption, namely that the family is picked
uniformly at random from all families with two children with at least one boy.

As you can see, there are quite a lot of assumptions. One convention seems to
be that when you can't tell for sure how something is distributed, it is
uniformly distributed.

For example, when a person blindly grabs a ball from a box with one red and
one blue ball (and no other balls), the probability is not always 50% that he
grabs the red ball. The red ball might be a bowling ball and the blue one
might be the size of a marble.

While in this example it is reasonable to assume a uniform distribution, you
can point out similar, but more subtle assumptions in most questions about
probability.

A more complicated example: Weatherman A predicts the weather right 70% of the
time. Weatherman B predicts the weather right 60% of the time. Weatherman A
predicts rain tomorrow, weatherman B predicts dry weather. What is the
probability that it will rain tomorrow?

The 'right' answer is 14/23. In order to arrive at this answer, you need to
assume that the predictions are statistically independent (which is an
unrealistic assumption). Indeed, it is easy to sketch a situation in which the
probability is different: it is always dry, weatherman A predicts rain 30% of
the time, and weatherman B predicts rain 40% of the time. This is consistent
with the question, and the probability that there will be rain tomorrow is
obviously 0%.

This lack of rigor always bothers me.

~~~
mgraczyk
Your second example has completely different assumptions than your first. To
get 14/23, you don't have to assume that the predictions are independent, only
that they are independent given future weather.

Let W be 1 if it will rain tomorrow and 0 if it will not rain, with 0.5
probability either way. Let A be weatherman A's prediction and let B be
weatherman B's prediction. We have

    
    
      P(W = 0) = P(W = 1) = 0.5
      P(A = w | W = w) = 0.7 # Weatherman A is right 70% of the time
      P(B = w | W = w) = 0.6 # Weatherman B is right 60% of the time
    
      P(A, B | W) = P(A | W) P(B | W)
    

Now we want to know "Weatherman A predicts rain tomorrow, weatherman B
predicts dry weather. What is the probability that it will rain tomorrow?"

That is, what is P(W=1 | A=1, B=0)?

    
    
        P(W=1 | A=1, B=0) = P(A=1, B=0 | W=1) P(W=1) / P(A=1, B=0)
        = P(A=1 | W=1) P(B=0 | W=1) P(W=1) / (sum_w P(A=1, B=0|w)P(w))
        = 0.7*0.4*0.5 / (P(A=1|w=0)P(B=0|w=0)P(w=0) + P(A=1|w=1)P(B=0|w=1)P(w=1))
        = 14/23
    

If you don't like the assumption that the predictions are statistically
independent (herding, etc), then you just have to come up with the conditional
joint P(A,B | W). That wouldn't be difficult given a small amount of data
since W is binary. You would put a dirichlet prior on the distribution
(basically just a beta distribution with an additional dimension) and
essentially just count the times each triple (a, b, w) happens.

Still, the problem isn't a lack of rigor, it's a lack of clarity in stated
assumptions.

To be specific, you didn't state the assumption that P(w)=0.5, which you used
to compute 14/23.
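A quick sketch of how much the answer leans on that prior. It assumes the model above (A and B conditionally independent given W, each equally accurate on rainy and dry days); `p_rain` and its parameter names are just illustrative.

```python
from fractions import Fraction

def p_rain(prior_rain, acc_a=Fraction(7, 10), acc_b=Fraction(6, 10)):
    """P(W=1 | A=1, B=0) under conditional independence given W."""
    prior = Fraction(prior_rain)
    rain = acc_a * (1 - acc_b) * prior       # A right, B wrong, and it rains
    dry = (1 - acc_a) * acc_b * (1 - prior)  # A wrong, B right, and it stays dry
    return rain / (rain + dry)

print(p_rain(Fraction(1, 2)))  # 14/23
print(p_rain(Fraction(1, 5)))  # 7/25
print(p_rain(0))               # 0, the "it is always dry" scenario
```

So the likelihoods alone don't pin down the answer; 14/23 only appears at P(W=1) = 0.5.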

~~~
kutkloon7
I don't understand your point. I was arguing that the way the problem is posed
seems to imply a unique solution. I was showing the problem was ill-posed, and
that many probability problems have similar subtle or less subtle hidden
assumptions. Here, this assumption is P(A, B | W) = P(A | W) P(B | W). I don't
think you even need P(W = 0) = P(W = 1) = 0.5.

These assumptions usually seem quite natural to make, but can be very
unrealistic (why would the predictions of weather men be independent? I would
bet they are not in reality). This is a very bad thing to teach students. It is
always very important to know which assumptions you are making, and if
textbooks do this wrong, it will be nearly impossible for students to get this
right.

I would think of a student who struggles with this problem as a better
mathematician than the student who just uses P(A, B | W) = P(A | W) P(B | W)
'because the formula is in his textbook', but the second student is more
likely to get rewarded (especially in American education, since the USA seems
to be especially fond of textbooks which give a ready-made recipe for every
problem that a student is supposed to solve).

~~~
mgraczyk
> I don't think you even need P(W = 0) = P(W = 1) = 0.5.

You do.

In the article, the prior of interest is the prior gender of a child, which
can be safely assumed to be 0.5. Similarly, independence of children's genders
is a very good approximation as well. It wasn't necessary for the article to
state these assumptions because they are obvious common knowledge.

------
seycombi
Best advice (in general not just this particular problem) I heard was by Joe
Blitzstein (Harvard): Practice, practice, practice.

Much of statistics/probability is about pattern recognition, and developing
pattern recognition requires lots of practice.

I enjoyed and learned a lot from his Harvard course Statistics 110:
Probability

Video Lectures
[http://projects.iq.harvard.edu/stat110/youtube](http://projects.iq.harvard.edu/stat110/youtube)

