Coding Horror: Finishing The Game (codinghorror.com)
52 points by Anon84 on Jan 1, 2009 | 39 comments



The question is ill posed, because the answer is highly dependent on how the couple told you that they have one girl.

If they tell you "we have two kids", you ask them if at least one of them is a girl and they answer "yes", then indeed the probability of the other kid being a boy is 2/3.

If they tell you "we have two kids, and one of them is a girl", then by all rules of rational discourse the probability for the other kid being a boy is 1.

Things start to get interesting if the talking member of the couple is a mathematician (so the rules of ordinary discourse do not apply) and he says "We have two kids, and at least one of them is a girl". Now you might say that this is equivalent to the first case. But you could also go Bayesian and look at the possible conversations with couples containing at least one mathematician that have two kids. Say you have 100 couples. On average, 25 of these will have two boys and tell you "We have two kids, and at least one of them is a boy", another 25 will have two girls and tell you "We have two kids, and at least one of them is a girl". The remaining 50 couples will have one boy and one girl. Assuming no gender bias on behalf of the speaking mathematician (ha!), 25 of these will tell you "We have two kids, and at least one of them is a boy", while the remaining 25 will say "We have two kids, and at least one of them is a girl." So out of the 50 couples that tell you "We have two kids, and at least one of them is a girl," 25 will have two girls, and 25 will have a boy and a girl. So in this case, the correct answer is 1/2.

(This is essentially the argument from the intro on this page that was linked from the discussion on coding horror: http://www.overcomingbias.com/2008/10/my-bayesian-enl.html )
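The 100-couple count above can be checked by exact enumeration. This is only a sketch under an assumed speaker model: the parent picks one of the two children uniformly at random and announces that child's sex. The function name and the model are illustrative, not something the linked post specifies.

```python
from fractions import Fraction
from itertools import product

def p_mixed_given_says_girl():
    """Exact P(one boy, one girl | speaker says 'at least one is a girl').

    Assumed speaker model (hypothetical): the parent picks one of their two
    children uniformly at random and announces that child's sex.
    """
    says_girl = Fraction(0)
    says_girl_and_mixed = Fraction(0)
    for family in product("BG", repeat=2):   # BB, BG, GB, GG, each with prob 1/4
        p_family = Fraction(1, 4)
        for mentioned in family:             # each child is the one mentioned w.p. 1/2
            if mentioned == "G":
                p = p_family * Fraction(1, 2)
                says_girl += p
                if set(family) == {"B", "G"}:
                    says_girl_and_mixed += p
    return says_girl_and_mixed / says_girl

print(p_mixed_given_says_girl())  # 1/2
```

Under this model half of all couples say "girl", and half of those have a boy as well, which matches the 25-out-of-50 count.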



I think that people are debating between 3 choices: (a) 100% chance that there is a boy and girl, (b) 2/3 chance that it is a boy and a girl, or (c) 50% chance that it is a boy and a girl.

It depends very much on the exact wording of the problem.

As many people in the comments on Jeff's post said, if the person says "one of my kids is a girl," then common sense makes it sound like ONLY one kid is a girl and the other must be a boy (case a).

If the question is worded precisely like question #2 in this wikipedia article: http://en.wikipedia.org/wiki/Boy_or_Girl#Second_question (where you have a population of families and you randomly pick a family which has a girl), then you use the Bayesian approach to conclude that there is a 66% chance that the other kid is a boy (case b).

Finally, I think that for case c, if you are just walking around and you're not specifically looking for a family with a girl, and you strike up a conversation and you hear that the person randomly decides to talk about his daughter, then I think the odds are 50% that this person's other kid is a boy. This is because you did not require that you choose a family with a specific gender (like in the wikipedia #2 scenario). First you picked a person to talk to, and then this person just happened to mention he had a daughter. Basically, we did not require that the person has a girl (case b), we only learned that he has a girl (case c).



Sorry Jeff, I understand the problem, but the language of the setup was wrong. Since a PERSON told you they have one girl, the GB and BG are equivalent and collapse to one case instead of two in the way that normal people talk.

If you had said that a mathematician or a statistician said they have one girl, that would be a different story.


I don't think you do understand the problem. The thing you're taking issue with - GB vs BG - is vitally important to the statistics, and is the thing that "normal" people get wrong when they see this kind of problem for the first time.

The only real objection to the problem as it's posed is that it's pretty unlikely that anyone would say that one of their children was a girl when both of their children are girls. For a normal conversation, the ambiguity wouldn't be present; "two children" combined with "one is a girl" means the chances of one boy and one girl are 100%.

It's a bit like the old joke about "which month of the year has 28 days?" - answer being, they all have 28 days, just some have more.

However, in other less contrived situations, this is important. People risk discounting permutations, and only considering combinations, when the permutations are necessary to get a correct view of the odds.


His point is that a person telling you they have one girl would be accepted as referring to a particular child. Thus, BG and GB merge and the odds are 50% of a boy and a girl.


Pedantically, they don't merge, you just rule out GB or BG. So, initial conditions:

    1) X: G, Y: G
    2) X: G, Y: B
    3) X: B, Y: G
    4) X: B, Y: B
If you are told, "at least one is a girl," as in the posted question, you can only rule out case (4). If you are told more specifically that X is a girl, you can rule out (3) and (4), giving the 50-50 chance.

Which still seems a little weird to me, that knowing which is a girl, regardless of which one you know about, changes the chances.
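One way to see why the two statements differ is to count the four cases listed above directly. A minimal sketch, assuming all four (X, Y) cases are equally likely; the helper `p_mixed` is a made-up name for illustration.

```python
from fractions import Fraction
from itertools import product

# The four equally likely (X, Y) cases, in the order listed above.
CASES = list(product("GB", repeat=2))   # GG, GB, BG, BB

def p_mixed(condition):
    """P(one boy and one girl | condition), by counting equally likely cases."""
    kept = [c for c in CASES if condition(c)]
    mixed = [c for c in kept if set(c) == {"B", "G"}]
    return Fraction(len(mixed), len(kept))

at_least_one_girl = p_mixed(lambda c: "G" in c)   # rules out only case 4
x_is_a_girl = p_mixed(lambda c: c[0] == "G")      # rules out cases 3 and 4
print(at_least_one_girl, x_is_a_girl)  # 2/3 1/2
```

Naming a specific child removes one of the two mixed cases, while "at least one" keeps both of them, which is where the change in the odds comes from.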


Exactly - the difference is between "at least one is a girl" and pointing out a specific child as a girl.


And this is flat out incorrect.


I'm not sure what you mean by equivalent, but the relevant point is that a family is twice as likely to have one boy and one girl as to have two girls.


The only collapse I see is that it's possible that "I have one girl" would imply ONLY one girl.

BG and GB are possibilities that often collapse in the minds of those thinking about this problem, but that is exactly why the problem is paradoxical, and why a certain group of people not only get it wrong but also can't let go of their wrong answer.


Another possible collapse would be to say "I have two children, and the oldest [or youngest] is a girl". That would disambiguate GB & BG, by effectively stating I have G? [or ?G]. Then the odds would be 50%.


Wrong, wrong, WRONG. You DO NOT understand the problem, if you think that this means the probability of the other child being a boy is 1/2. GB and BG are equivalent linguistically, but not mathematically; they do not represent identical possibilities in the space of all conceptual representations of a family of two.

Here, I'll prove it for you:

    ;; Have a child, with an equal probability of the child being a boy or a
    ;; girl.
    (defun make-child ()
      (if (eql (random 2) 0)
          'boy
          'girl))

    ;; Make a family of n members, with each child having an equal probability
    ;; of being either a boy or a girl.
    (defun make-family (n)
      (do ((family nil)
           (i 0 (+ i 1)))
          ((>= i n) family)
        (push (make-child) family)))

    ;; As in the story, meet someone at a party who informs us that (1) she or
    ;; he has two children, and (2) that at least one of these children is a
    ;; girl.  E.g.,
    ;;
    ;; Me: "Hi there, what's your name?"
    ;; Cute lady: "Jennifer"
    ;; Me: "Nice to meet you, Jennifer, I'm Mark.  What brings you here?"
    ;; Jennifer: "Well my husband is out of town on a business trip, so I
    ;;            wanted to come here and catch up with some old friends of
    ;;            mine.  Fortunately I was able to get a baby sitter for my
    ;;            two kids on such short notice.  One of the kids, Meg, has
    ;;            to be up early in the morning for dance practice and..."
    ;; Me: "Husband?  Damn, all the good ones are taken."
    ;; Jennifer: "What?"
    ;; Me: "Nothing.  Hey, hang on while I work out the probability that your
    ;;      other child is a boy, based on the information that you've already
    ;;      given me."
    ;; Jennifer: "You're weird.  Have a good evening."
    ;;
    ;; This function works by calling (make-family 2) repeatedly until we get a
    ;; family that meets both of these criteria, then returns a representation
    ;; of said family.  This is a precise analogy for the story: just as the
    ;; fictional person at this party, here we take the space of all possible
    ;; conceptual representations of a family of two, then discard any of these
    ;; that *does not* include at least one female child.  THIS IS THE ONLY
    ;; REASONABLE WAY TO HANDLE THE INFORMATION THAT JENNIFER RELATES IN THE
    ;; ABOVE EXCHANGE.
    (defun meet-at-party ()
      (do ((family (make-family 2) (make-family 2)))
          ((find 'girl family) family)))

    ;; Perform the (meet-at-party) scenario num-total times, and then return
    ;; the fraction of those times in which the family contained a boy.
    (defun run-test (num-total)
      (let ((num-with-boys 0))
        (dotimes (i num-total (float (/ num-with-boys num-total)))
          (if (find 'boy (meet-at-party)) (incf num-with-boys)))))
Just call (run-test 1000000) or something. The result is approximately 2/3.


seano nailed the way I interpreted the (poorly-phrased) version of it:

"Both of my kids are driving me crazy! Just yesterday I had to pick one of them up from the police station--I grounded her for a month!" - given just the information in your quote, the odds are 50% of a boy and a girl.

http://news.ycombinator.com/item?id=416555


Jeff didn't phrase the question carefully enough. In English if you say, "I have two children, one is a girl" that CANNOT mean both are girls. If both were girls you would never say that. Saying you have 1 girl implies that you have 1 boy. Or maybe 1 girl and one hermaphrodite.

100% was the right answer.

It's easy to get people to argue when you give them an almost-ambiguous word problem; they're not arguing about the math, they're arguing about the mapping from ambiguous words to math.


I don't agree; Jeff was not giving a quote. Instead it is just the relevant information abstracted from whatever the person said. By choosing the quote you did, you have added more information to the problem (at least when reading it with conversational English).

I think this would be a better quote of what the person might have said: "Both of my kids are driving me crazy! Just yesterday I had to pick one of them up from the police station--I grounded her for a month!" Pulling out the information corresponding to gender and family size would give only the information given in Jeff's post.

When applying math to the real world, you have to pull out the important information and deal with just that information. But here you are doing the opposite--trying to find a real world situation that applies to the math problem. In my opinion, your example does not quite apply.

(I don't know how the probabilities change when you account for hermaphrodites, but if it changes significantly enough so that approximately 66% is a bad answer, I would find that very interesting!)


"Both of my kids are driving me crazy! Just yesterday I had to pick one of them up from the police station--I grounded her for a month!" - given just the information in your quote, the odds are 50% of a boy and a girl.


Can you give me your reasoning? I think the only conclusions you can get from that quote is that the person has two children, and at least one of those is a girl. Do you disagree with that?

If I am correct about that, then it matches the conditions discussed in the article and the answer would be 2/3 for a boy and a girl.


The difference is that you have specified that the child picked up from the police station is a girl, thus only the sex of the other child is unknown. This other child is either a boy or a girl, presumably with a 50/50 chance either way, resulting in a 50% chance of one being a boy and one being a girl. Concretely, using a capital letter to denote the sex of the child picked up from the station, there are only two possible permutations, Gg and Gb, each of equal probability, and 50% of those are girl and boy (Gb). On the other hand, if we only know that at least one is a girl, we have the permutations gg, gb, bg - resulting in the 66% chance.


How much do you need to know about the person to know which child is which?

I don't quite understand it, but apparently anything which can be used to distinguish the children will do.

Possibilities with two children:

    Gg, Bg, Gb, Bb
If one of them has a distinguishing mark, they have an apostrophe: (in jail, has red hair, or born first)

    G'g, B'g, G'b, B'b, Gg', Bg', Gb', Bb'
Then note that the marked one is a girl:

    G'g, G'b, Gg', Bg'
So, there is a 50% chance that the children are a boy and a girl. Only if there is no way to distinguish them, do you get the 66% behaviour, where the set is:

    Gg, Bg, Gb


No, that is not it. With G'g and Gg' you are repeating the same permutation in your set above!

It makes no difference if they have distinguishing marks or not. It matters if you are told that a particular child is a girl (50%) or if you are only told that at least one child is a girl (66%).


How can you be told information about a particular child if you have no way to distinguish them?

I partially understand your point about G'g vs Gg' now: if having a prime is the only way to distinguish the children, then G and g must be indistinguishable, so G = g.


Think about it this way: I take two coins out of my pocket and hold them inside my hand so that neither of us has seen them. I show you the coin in my left hand and you see that it is tails, what are the odds of the coin in the other hand being heads? 50%. This is the chance of a head/tail combo in this case.

I now put the coins back into my pocket, shuffle them about, and again take them out inside my hands. This time I look inside both my hands, not letting you see, and tell you (truthfully) that at least one is tails. Given that information, you can deduce three mutually exclusive possibilities each of equal probability - both are tails, only the coin in my right hand is tails or only the coin in my left hand is tails. Hence we have the odds in this situation of 2/3 for a head/tail combo.

It is easy to see that the first situation is akin to knowing that a particular child is female, whilst the second is akin to knowing that at least one of the children is female. Also, in either case it does not matter if the coins are distinguishable - one could be a euro and the other a pound.
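The two coin scenarios can also be compared with a quick simulation. This is a sketch, assuming fair and independent coins; the function name, trial count, and seed are arbitrary choices made for the example.

```python
import random

def coin_experiment(trials=200_000, seed=0):
    """Estimate P(one head, one tail) under the two ways of learning about a tail."""
    rng = random.Random(seed)
    shown_left = shown_left_mixed = 0
    told_any = told_any_mixed = 0
    for _ in range(trials):
        left, right = rng.choice("HT"), rng.choice("HT")
        mixed = left != right
        if left == "T":              # scenario 1: the left coin is shown and is tails
            shown_left += 1
            shown_left_mixed += mixed
        if "T" in (left, right):     # scenario 2: told "at least one is tails"
            told_any += 1
            told_any_mixed += mixed
    return shown_left_mixed / shown_left, told_any_mixed / told_any

first, second = coin_experiment()
print(round(first, 2), round(second, 2))
```

The first estimate should land near 1/2 and the second near 2/3, with small run-to-run noise if the seed is changed.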


No, I don't think the label "picked up from a police station" affects the problem the way you are thinking. What I believe you are not accounting for is that there are twice as many mixed families in the population as there are pure girl families.

Take 1000 two-child families (so as to intuitively ignore fluctuations). Families 1-250 are girl-girl, 251-500 had a girl then a boy, and 501-750 had a boy then a girl. Any of those 750 families could have made the quote in my post, yet there are 500 families with boy-girl, and 250 with girl-girl.

Or another example that I think speaks more directly to your post: you see a woman with a T-shirt reading "Proud Mother of Two" next to a girl who is obviously her daughter. What is the probability of her other child being a boy? Again, since there are twice as many mixed families as pure girl families, the odds are 2/3.


"Or another example that I think speaks more directly to your post: you see a woman with a T-shirt reading "Proud Mother of Two" next to a girl who is obviously her daughter. What is the probability of her other child being a boy? Again, since there are twice as many mixed families as pure girl families, the odds are 2/3."

The odds are 50%, do you seriously think different?

Your mistake is that in families 1-250 the girl next to the mother could be either daughter.

In families 1-750 there are 1000 daughters, the daughter standing next to the mother is equally likely to be any of those and half have brothers, half have sisters.
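The daughter count can be written out explicitly. A small sketch using the family counts from the 1000-family example, and assuming (as the comment above does) that the girl seen next to her mother is equally likely to be any of the daughters.

```python
from fractions import Fraction

# Family counts from the 1000-family example (BB families omitted:
# they have no daughter who could be standing next to the mother).
families = {("G", "G"): 250, ("G", "B"): 250, ("B", "G"): 250}

daughters_total = 0
daughters_with_brother = 0
for kids, count in families.items():
    girls = kids.count("G")
    daughters_total += girls * count
    if "B" in kids:
        daughters_with_brother += girls * count

print(daughters_total, Fraction(daughters_with_brother, daughters_total))  # 1000 1/2
```

Counting daughters instead of families gives each girl-girl family double weight, which is what brings the answer back to 1/2.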


Sorry; you are definitely right about that. I was so focused on the mom with the T-shirt that I did not even think about being able to count the states with the girl, but yes, the answer is obviously 50%.


Saying "A person has two children, at least one of which is a girl. What is the probability that both of the children are girls?" would more clearly explain the problem.


This kind of problem comes up all the time in a poker tournament. Each person has a different number of chips in front of them, and when they reach the final table, since so much money is at stake, they want to make a deal. How do you determine a fair deal when each person has a different number of chips and prizes are radically different for first, second, and third place? Fun problem. Answer: the independent chip model.

Gambling to the rescue
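For readers who have not seen it, here is a rough sketch of how the independent chip model is usually computed (the Malmuth-Harville formulation). It assumes a player's chance of taking the next open place equals their share of the chips held by players not yet placed. The stacks and the 50/30/20 payout are made-up numbers, and `icm_equity` is an illustrative name.

```python
from fractions import Fraction

def icm_equity(stacks, prizes):
    """Expected prize per player under the Malmuth-Harville chip model.

    Assumption: a player's chance of taking the next open place is their
    share of the chips held by players who have not yet been placed.
    """
    equity = [Fraction(0)] * len(stacks)

    def place(remaining, place_idx, prob):
        # Stop once every paid place is assigned or nobody is left.
        if place_idx >= len(prizes) or not remaining:
            return
        total = sum(stacks[i] for i in remaining)
        for i in remaining:
            p = prob * Fraction(stacks[i], total)
            equity[i] += p * prizes[place_idx]
            place([j for j in remaining if j != i], place_idx + 1, p)

    place(list(range(len(stacks))), 0, Fraction(1))
    return equity

# Hypothetical three-handed final table: chip counts and a 50/30/20 payout.
print(icm_equity([5000, 3000, 2000], [50, 30, 20]))
```

The recursion enumerates every finishing order, so it is only practical for a handful of players. The equities always add up to the prize pool, and the chip leader's share comes out smaller than their share of the chips.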


It is very clearly mirrored in backgammon theory, where tournament players memorize "match equity" tables and use them to govern their use of the doubling cube.


This can be used as a warm-up question for another fun problem (which a friend of mine was asked during a phone screen for a position at DE Shaw): What if instead of telling you that they have two children at least one of them a girl, someone told you that they have two children at least one of them named, say, Linda? The assumption is that there are no boys named Linda, that the likelihood for a girl to be named Linda is not affected by having an older brother or sister, etc. It's not a trick question, nor a sociological problem requiring any knowledge about names, it's a purely mathematical puzzle. And, as you may have guessed, the solution is not the same as in the previous case (with two children at least one of them girl).


someone told you that they have two children at least one of them named, say, Linda

You would need a special case to determine if you were talking to Johnny Cash's father...


I'm not sure what you mean. Is the question still the same? Then why wouldn't the result be 2/3rds in either case?


It is not the same question, they only seem that way. The information given is different. The answer in the second case depends on the popularity of the name Linda.
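To make the dependence on the name's popularity concrete, here is a sketch under the puzzle's stated assumptions plus one simplification: each child is independently a boy (1/2), a girl named Linda (p/2), or a girl with another name, and nothing stops both girls from being named Linda. The parameter `p` and the function name are illustrative.

```python
from fractions import Fraction

def p_other_is_boy_given_linda(p):
    """P(the other child is a boy | at least one child is named Linda).

    p is the assumed probability that a girl is named Linda. Each child is
    independently a boy (1/2), a girl named Linda (p/2), or another girl.
    Simplification: nothing stops both girls from being named Linda.
    """
    p = Fraction(p)
    linda = p / 2                                   # chance a given child is a Linda
    at_least_one_linda = 1 - (1 - linda) ** 2
    one_linda_one_boy = 2 * linda * Fraction(1, 2)  # Linda plus a boy, either order
    return one_linda_one_boy / at_least_one_linda

for p in (Fraction(1), Fraction(1, 10), Fraction(1, 1000)):
    print(p, p_other_is_boy_given_linda(p))
```

With p = 1 every girl is a Linda and the answer is the familiar 2/3. As the name gets rarer the answer approaches 1/2, because a rare name effectively singles out one particular child.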


# The hard part: THIS IS AN 'A POSTERIORI' PROBLEM.

# The fact has ALREADY happened.

# BG/GB being DIFFERENT is NOT THE TWIST.

# Whether they are or not it will amount to 50% total.

# TWIST: BB is not possible.

  def boy_girl_problem():
    from random import choice

    families = {}
    for i in range(1000):
      families[i] = []
      families[i].append(choice(['BOY', 'GIRL']))
      families[i].append(choice(['BOY', 'GIRL']))
  
    # Build lists (not filter objects) so that len() also works on Python 3.
    had_both_sex = [f for f in families.values() if set(f) == set(['BOY', 'GIRL'])]
    had_two_boys = [f for f in families.values() if 'GIRL' not in f]

    return len(had_both_sex) / float(len(families) - len(had_two_boys))
# average(boy_girl_problem) == 2/3


# Here's a less crude rewrite

# It reflects conditional probability better

  def boy_girl_problem():
    from random import choice

    families = {}
    for i in range(1000):
      families[i] = []
      families[i].append(choice(['BOY', 'GIRL']))
      families[i].append(choice(['BOY', 'GIRL']))
  
    # Build lists (not filter objects) so that len() also works on Python 3.
    had_both_sex = [f for f in families.values() if set(f) == set(['BOY', 'GIRL'])]
    had_one_girl = [f for f in families.values() if 'GIRL' in f]

    return len(had_both_sex) / float(len(had_one_girl))


Interesting theory. I'd never thought through every possibility, especially in this context. Since we are talking about a person, I was thinking about human psychology or behavior when I first read that article.


URGH! I just lost :(


Aw come on... I was just following Rule#3: http://en.wikipedia.org/wiki/The_Game_(mind_game)



