The question is wrong 50 points by peter123 on Jan 3, 2009 | 54 comments

 In the correct version of this story, the mathematician says "I have two children", and you ask, "Is at least one a boy?", and she answers "Yes". Then the probability is 1/3 that they are both boys.
See here for why I think this problem confuses people: http://www.overcomingbias.com/2008/03/mind-probabilit.html
 If I told someone that in real life and they replied by asking, "Is at least one a boy?", I would probably back away very, very slowly.
The main point is that, given the way Atwood asked the question, the correct answer is 50%, which wasn't the answer he intended to discuss and indicates he himself didn't fully understand the subject he was writing about. It wouldn't be the first time either.
 Atwood doesn't seem wrong or particularly unclear to me. If you "met someone who told you they had two children, and one of them is a girl", then presumably we should imagine this person saying: "I have two children and one of them is a girl," and not "I have two children, X and Y, and X is a girl." Obviously in the first case, we don't know if it's X or Y that's the girl, so the set of possible worlds is [(X-G & Y-G), (X-G & Y-B), (X-B & Y-G)], so we get the 2/3 answer. But maybe other people don't share my linguistic intuitions...
 If I met someone who told me "I have two children, and one of them is a girl," I would be pretty sure they have one girl and one boy. If they had two girls, they would have said "I have two children, and both of them are girls." So the English statement has some implicit information. On the other hand, if I were given a riddle with that statement, I would know the implicit information may be intentionally misleading. In this case, as Paul Buchheit says, the statement does not have enough information.
 Writing about something is a way to understand it better. Why people write about things they fully comprehend is beyond me ;)
 Yes, the question must be very carefully worded (which is kind of my point, if I have one).
In order to get the "unintuitive" outcome, there must be some element of selection (much like the Monty Hall problem has). In your formulation of the question, the GG possibility has already been eliminated because the mathematician answers "yes".
 I just realized, even your "correct" version, although more precise, is still underspecified and can't be answered. :)
Can you spot the problem?
 You know what's really scary? I saw this out of context, on news.yc/newcomments, and I didn't realize till I came to "pairs of GG, GB, BG, and BB" that it was a fictional setup.
 This Bruce Schneier story about someone successfully bringing the components of gunpowder past the TSA is relevant here, AND to the recent thread about how to make it in medieval Europe: http://www.schneier.com/blog/archives/2008/12/gunpowder_is_o...
 I can't. What is it?
 The author apparently doesn't understand where the confusion in the original question (or its variations) comes from.
One way of phrasing it is to ask, "If you have an unrelated man and woman and they both have two children (one of which is a boy), where the oldest child of the man is a boy -- what are the odds that they both have two boys?"
(There are any number of variations on this: what are the odds of the man having two boys? What are the odds of the woman having two boys? What are the odds of the man but not the woman having two boys? It's all the same problem, stated differently.)
This 'loads' the question by implying (superficially, and incorrectly) that there might be a difference between the chances of a man and a woman having two boys.
Next up is a bit of probability theory. In the case of the woman, no order is stated, so the chances of her two children have no connection -- the events are unrelated. The man, however, has as a first child a boy (which eliminates the possibility of this being a girl).
This is another variation on the Monty Hall problem, http://en.wikipedia.org/wiki/Monty_hall_problem
Read up on some historical background on this here, http://www.marilynvossavant.com/articles/gameshow.html
And as for the overall birthrate of men vs women or the possibilities of having twins/triplets/etc (and their male/female ratio) ... well, that's really out of the scope of a fairly trivial question statement such as this.
[edit for non-trivial detail]
 No, you're wrong. The thing is that there are two kinds of confusion that arise from Atwood's problem. The first kind of confusion comes from not understanding how probabilities work, which you discuss. The second kind -- which is what Paul Buchheit is talking about -- comes from noticing that the statement "I have two children, and one of them is a girl" can be parsed in two different ways. It can be parsed as, "I have two children, and at least one of them is a girl," or as "I have two children, and the gender of one of them is #{my_first_child.gender}." Despite what many commenters in this thread are saying, these are not the same.
That's the real problem here: the English language is inexact. The words that Atwood used to describe the scenario actually describe at least two mathematically distinct scenarios.
The Monty Hall problem suffers from this fact, too, but not as badly -- because both interpretations yield the same conclusion, namely, that you should switch doors. It's just that under one interpretation, you get a car 1/2 of the time, and under the other you get it 2/3 of the time. Also, under the interpretation that yields a car 1/2 the time, it's logically implied that the host is willing to open a door and reveal a car -- which most people use to rule out that interpretation, if only subconsciously.
 Maybe this is because English is my first language - but I cannot imagine how "I have two children, and one of them is a girl" could mean: "I have two children, and the gender of one of them is #{my_first_child.gender}." - if someone were to convey that meaning I would expect him to say: "I have two children, and the gender of my first child is girl". But frankly, as someone already pointed out, normally the sentence "I have two children, and one of them is a girl" would implicitly mean that the other child is a boy.
 It's the difference between the situation where a child is chosen and then the gender announced, and the situation where a girl is found, and then the gender announced. If you started out by asking "Is one of them a girl?", and only continued if the answer was yes, then the probability changes.
 Actually, even under the "correct" interpretation of the host's actions in the Monty Hall problem (where he knows where the car is and will never open that door, and the participant knows this), there's still an analogy, an even closer one, to this ambiguity.
Here, if you ask the question in a certain way, the choice of the person, when they have the choice, becomes important (which gender to announce in case of GB/BG). Likewise in Monty Hall, if you ask the question in one common way of asking it, the choice of the host becomes important. The host can choose which door to open if they're both goats. If, in that case, the host picks randomly, the probability of a win by switching remains 2/3. But suppose you picked door 1, and the host will always prefer door 3 when he can, for whatever reason, and you know this. Then given the information that he opened door 2 or door 3, the probability of winning by switching is 100% or 50%, respectively. So if you ask the Monty Hall question this way: "I picked a door and the host opened another, what's my probability of winning now?", to get 2/3 the question should include the information that the host selects randomly between doors 2 and 3 when the car is behind door 1. Admittedly that's a bit pedantic, but there you go.
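 The biased-host variant described above can be checked with a short simulation. This is only a sketch under stated assumptions: the player always picks door 1 and always switches, the host never opens the player's door or the car's door, and the function name `play` and its parameters are made up for illustration.

```python
import random

def play(prefer_door_3, trials=100_000, seed=0):
    """Return {opened_door: fraction of games won by switching}.

    Assumes the player always picks door 1 and always switches.
    """
    rng = random.Random(seed)
    tallies = {2: [0, 0], 3: [0, 0]}  # opened door -> [switch wins, games]
    for _ in range(trials):
        car = rng.choice([1, 2, 3])
        # The host may only open a door that is neither door 1 nor the car.
        options = [door for door in (2, 3) if door != car]
        if len(options) == 2:
            # Both remaining doors hide goats: the host has a free choice.
            opened = 3 if prefer_door_3 else rng.choice(options)
        else:
            opened = options[0]
        remaining = 5 - opened  # the other door among 2 and 3
        tallies[opened][1] += 1
        if remaining == car:
            tallies[opened][0] += 1
    return {door: wins / games for door, (wins, games) in tallies.items()}

print("host prefers door 3:", play(prefer_door_3=True))
print("host picks randomly:", play(prefer_door_3=False))
```

With a host who prefers door 3, seeing door 2 opened means the car must be behind door 3, so switching always wins in that branch, while the door-3 branch should come out near 1/2. With a random host, both branches should come out near 2/3.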
 "I have two children, and the gender of one of them is #{my_first_child.gender}." Are they not the same because of FIRST_child (which would be true), or is the FIRST just an accident here? I for one don't see how the original statement could be interpreted as describing the second algorithm. We clearly get the information that there is a girl among the children, and nothing is said about it being the first child.
Basically we have a situation where we see a person and we know he has two children, at least one of them being a girl. If it is impossible to derive correct probabilities from that information, then probability theory is useless. There is no algorithm, it is just a situation.
 That's probably the clearest explanation I've seen so far on this thread. Thanks.
 In retrospect using "#{my_first_child.gender}" was a pretty big mistake. I should have put "#{one_of_my_children.gender}". For some reason I thought the former would actually be less confusing than the latter.
 Manolis's comments on Paul's post summed it up pretty well for me.
What feels counterintuitive is that announcing the gender of one child seems to increase the chances of the other child being of a different gender, but it's actually the opposite.
"Family with 2 children" -> chance of having at least one boy: 75%
"but I have at least one girl" -> chance of having at least one boy: 66%
"but the first one is a girl" -> chance of having at least one boy: 50%
"but they're both girls" -> chance of having at least one boy: 0%
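 The four percentages above can be reproduced by plain counting. A minimal sketch, assuming boys and girls are equally likely and births are independent; the helper `p_boy` and the `table` labels are illustrative names, not anything from the thread.

```python
from fractions import Fraction
from itertools import product

# All (older, younger) pairs, assumed equally likely.
families = list(product("BG", repeat=2))

def p_boy(condition):
    """P(at least one boy | condition), counting equally likely families."""
    kept = [family for family in families if condition(family)]
    return Fraction(sum("B" in family for family in kept), len(kept))

table = {
    "family with 2 children": p_boy(lambda f: True),
    "at least one girl": p_boy(lambda f: "G" in f),
    "the first one is a girl": p_boy(lambda f: f[0] == "G"),
    "both are girls": p_boy(lambda f: f == ("G", "G")),
}

for label, probability in table.items():
    print(f"{label}: {probability}")
```

The values come out as 3/4, 2/3, 1/2 and 0, matching the list. Note that each condition here is a filter on families; it says nothing about how the parent chose what to announce, which is the point the rest of the thread argues about.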
 Isn't the probability still 2/3 even in the proposed alternative? The fact that we "arbitrarily announce the gender of one of the children" provides new information for calculating the probability, which is ignored if you assume that the original 50/50 distribution is unchanged by the announcement.
In an extreme example, consider this algorithm, analogous to the one in the article:
1. Choose a random parent that has exactly two children
2. Announce the gender of _both_ children
3. Ask about the odds that the parent has both a boy and a girl
In this situation, it would be ridiculous to say that the odds calculated at step 3 are still 50/50. Because of step 2, it is either certainly true (100%) or not true (0%) that the parent had both a boy and a girl. Granted, on average with repeated trials, half the time the step-3 value will be 100% and the other half 0%. But still, the step 2 information changes the value at step 3.
 No.
You've gained no useful information. The person who announces the gender of one child can say one of two things: "this child is a girl" or "this child is a boy." However, these two statements are symmetric in their effect on P(one child is a girl and one is a boy).
There's a huge difference between announcing the gender of one child and announcing the gender of both children. If the announcer announces both children's genders, he can say that "both are boys," that "both are girls," or that "one is a boy and one is a girl." But the last statement is not symmetric to the first two in how it affects P(one child is a boy and one is a girl).
 It's true, as a commenter on this article mentioned, that successive children are (more-or-less) independent events, but that doesn't mean what that commenter (or our author) apparently thinks. If you specify that the older child is a girl, then the chances of the younger child being a girl are 1/2. If you don't specify which child is a girl, only that one is, then it's 1/3. This and the Monty Hall problem are just examples of the same limited-information situation.
 Nope. Announcing the gender of one child does not magically alter the gender of the other child. It's like a print statement in code. If you don't believe me, try writing some code to simulate this. I will bet you an arbitrary amount of money that I'm right :)
Here's another way of looking at it: By your logic, if I announce that one of the children is a girl, then the other child only has a 1/3 chance of also being a girl. Likewise, if I announce that one of the children is a boy, then the other child only has a 1/3 chance of being a boy. Therefore, by your logic, the act of my arbitrarily announcing the gender of one of the children increases the probability that the other child is of the opposite gender from 1/2 (what it was before I spoke) to 2/3, regardless of whether I said it was a girl or boy. Hopefully you can see why this is not correct.
 "Announcing the gender of one child does not magically alter the gender of the other child."
Nor is that what I said. :)
Assuming you meant "chance of being a boy" where you said "chance of being a girl", I agree that announcing the gender of one of the two children increases the probability of the other child being the opposite gender from 1/2 to 2/3.
The reason this is so is that some of the original probability was sunk in a case you've now eliminated: the case where there were two of the gender you didn't announce.
[Edit: Jeff already wrote some code, and while I haven't bothered to review it, I assume the lack of outcry about it indicates its correctness. Care to share where he got that wrong?]
 So if I ask all of my friends who have two children to "tell me the gender of one of their children", then you think that after they answer the question, there is a 2/3 chance they have both a boy and a girl? (but before answering the question the probability was 1/2) Doesn't that seem a little absurd to you?
If Jeff wrote code that yielded 2/3, then he was implementing my "algorithm 1" (which has selection), not the second algorithm (which does not do any selection).
I just updated my post with an explanation that may clear things up for you.
 I thought it might be instructive to implement both algorithms side-by-side in the same loop:
```
import random

# Tallies for each algorithm. Below, 1 means girl and 0 means boy.
jeff_stats = {'boy_and_girl': 0, 'two_girls': 0}
paul_stats = {'boy_and_girl': 0, 'other': 0}

for i in range(100000):
    children = [random.randint(0, 1), random.randint(0, 1)]

    # Jeff's algorithm: only parents with at least one girl are counted
    if children[0] or children[1]:
        print('I have a girl')
        if children[0] and children[1]:
            jeff_stats['two_girls'] += 1
        else:
            jeff_stats['boy_and_girl'] += 1
    else:
        print("Oops, I'm a liar - I have no girls, so don't count me")

    # Paul's algorithm: every parent announces one child's gender
    print('I have a ' + ('girl' if children[0] else 'boy'))
    if children[0] != children[1]:
        paul_stats['boy_and_girl'] += 1
    else:
        paul_stats['other'] += 1

print("Jeff stats: " + str(jeff_stats))
print("Paul stats: " + str(paul_stats))
```
Output (ignoring the actual printouts for each person):
```
Jeff stats: {'boy_and_girl': 50027, 'two_girls': 24949}
Paul stats: {'boy_and_girl': 50027, 'other': 49973}
```
I liked tromino's explanation best: it's really an ambiguity in the English language. Under your 2nd algorithm, 1/4 of parents can't truthfully announce "I have a girl" - but in ordinary conversation, this doesn't matter, because they'd just truthfully say "I have a boy" and we don't make a distinction. It's only when we explicitly filter out parents who don't have a girl that we change the probabilities.
 The question "what is the gender of one of your children" gives you some information about the gender of both of their children. It's enough information to eliminate the case where both children are the gender opposite the one they mentioned.
It can still be the case that the chances of there being a boy and a girl are 1/2, while the chances of there being a boy after you find out about a girl are 1/3, purely because you don't know whether you found out about an older or younger girl. Probability is just a function of our ignorance (at least in this case; I'm not qualified to have an opinion about whether it always is).
[Edit: Hm. Looking again, it seems Jeff Atwood's code is for a different problem he was also discussing, so never mind. After considering your update, I think you might be right.]
[Further edit follows:]
Here's some python that essentially shows you're correct, unless I flubbed it:
```
import random

# Tally the gender of the *other* child, split by what the announced child was.
firstpickwasboy = {'boy': 0, 'girl': 0}
firstpickwasgirl = {'boy': 0, 'girl': 0}
choices = ['boy', 'girl']

for i in range(100000):
    children = [random.choice(choices), random.choice(choices)]
    # Pick which of the two children gets announced, at random
    whichchild = random.randint(0, 1)
    otherchild = 1
    if whichchild:
        otherchild = 0
    if children[whichchild] == 'boy':
        firstpickwasboy[children[otherchild]] += 1
    else:
        firstpickwasgirl[children[otherchild]] += 1
```
```
>>> firstpickwasboy
{'boy': 25132, 'girl': 25116}
>>> firstpickwasgirl
{'boy': 24679, 'girl': 25073}
```
 I'm willing to bet $5 that the probability (according to the current statistical science corpus) is still 2/3 (even if you're right that announcing the child doesn't alter the sex of the other). See my comment in the main thread.
 I'll take that bet. Write the code, but be careful to implement my second algorithm exactly.
 Ok, this thread is confusing. Actually, your second algorithm wasn't the point of my argument. You're right that according to your second formulation, the probability is 1/2, of course.
My point is that there is nothing wrong with Jeff's question, and I believed that you made a distinction with the Monty Hall version. But your second algorithm isn't the same as Jeff's question, my mistake.
It is very clear that my argument specifically addresses the question the way Jeff has formulated it, so you can even forget the part about "your second formulation"; what I meant was the question of Jeff and why there is nothing wrong with it.
In statistics, it is commonly accepted that we don't know what we don't state (for an obvious practical purpose). Therefore, when we say: "One of them is a girl", it is implied that we don't know which one because we didn't specify it.
Therefore there is nothing wrong with Jeff's question (and the probability is 2/3), at least for anyone familiar with basic statistics and probability conventions (and I suspect you are); anything else is just a quibble. I assumed your argument wasn't about nitpicking this kind of convention, but it turns out I was wrong. I guess nobody wins this one :)
 This hinges on whether we decided ahead of time that we would only consider cases in which there is a girl, something which I didn't see earlier. :)
As mentioned in another thread started by Eliezer, it's the difference between "What is the sex of one of your children?" and "Is (at least) one of your children a girl?". For the second question, the results skew to 1/3 and 2/3 because we're discarding the cases where the answer is 'no'.
The light went on for me when Paul pointed out in his update that getting a random answer for "What is the sex of one of your children?" eliminates (an unknown) one of BG and GB.
 Basically you're right, and Paul is right too. This is just a matter of convention; it depends on what you want to hear in the question. If Jeff had said "one of the children is a girl", but added "but we don't know which one of the two it is", this post would never have existed and we wouldn't have argued so much over nothing.
In the world of mathematical conventions you learn in school, the question is understood and you're right. In the "normal" world where you can quibble with language because there is no specific agreement, Paul is right and the question should be more precise. In both cases the conclusion remains the same: what a waste of time.
 > If you don't specify which child is a girl, only that one is, then it's 1/3
You mean, if you only specify that at least one child is a girl, then it's 1/3.
Actually you're still underspecified, because you haven't said what you do if the parent has no girls.
 You are right about the question ("Announcing the gender of one child does not magically alter the gender of the other child"), but wrong about the statistics: even in your second formulation the probability is 2/3, not 1/2. This problem, the way Jeff has formulated it, has nothing to do with the Monty Hall problem. But still, the end probability is the same: 2/3 (that's the only common point). Tell me if there is a mistake:
The question is: what are the odds that the person has a boy and a girl, if we already know that one of the children is a girl? (If we agree on the question, then we must agree on the following probabilities.)
Possibilities are:
1/ boy / boy
2/ boy / girl
3/ girl / boy
4/ girl / girl
Since one of them is a girl, we must remove possibility number 1. That leaves us with 3 possibilities, and 2 of them have a boy. Probability: 2/3
(edited for correction, we search the probability of having a boy, not a girl :p)
 Ordering is irrelevant here. Options 2 and 3 are identical.
 This is precisely the first subtlety you encounter when you learn probabilities: they are not identical. Paul is right that the formulation of the question doesn't refer to the Monty Hall problem at all; this isn't the same algorithm. But in this case, the probability turns out to be the same. That's the real confusion in Jeff's article.
I mean, it's not even my own deduction, it's what you are taught when you learn probabilities. It's a basic and core paradigm, and I'll dig it up from Wikipedia if somebody still doubts it :)
 See my post above. You can take ordering into account if you'd like, but the result is the same, given the parameters of the problem.
 Options 2 and 3 each still have the same probability weight as each of options 1 and 4.
 No, they don't.
Again, ordering is irrelevant in this problem. We want to know only the probability that there will be a boy/girl pair, not the probability that the boy/girl pair was born in a particular order.
But the same result can be obtained when taking ordering into account -- the key observation is that if you subdivide options 2 and 3 to account for sibling ordering, then you also must subdivide options 1 and 4 to account for sibling order, resulting in 6 total possibilities, of which 2 are M/F sibling pairs, 2 are M/M pairs, and 2 are F/F pairs:
1) M/m
2) m/M
3) M/F
4) F/M
5) F/f
6) f/F
Given the knowledge that one sibling is female, you then exclude 2 of the 6 possibilities (the m/M and M/m pairs), to obtain 2/4 = 50% probability that the pair of siblings is of mixed sex.
The mistake you're making is that you're including ordering on the mixed-sex pairs, but not including ordering on the same-sex pairs.
 If you include m and f in the universe of possibilities, then you must add the following combinations as well:
7) m/f
8) f/m
You are not allowed to arbitrarily skip some of the combinations of your universe (which is now [M, F, m, f]). Probability: 2/3 =]
 Don't be silly -- do you think I invented a couple of new sexes by adding lower-case letters? You're just getting thrown by the notation. I could have written the options as:
1) M/M
2) M/M
3) M/F
4) F/M
5) F/F
6) F/F
but I thought that was confusing, so I introduced a symbol to more clearly illustrate the differences between the ordering of the same-sex options.
 Rather than insulting me, take a moment to think about the consequences of what you're saying: you're arguing that by announcing the sex of one child in a pair, the probability of the other child's sex being a particular value changes to 2/3. Does that make sense to you? Really?
Again, this has nothing to do with symbols or notation. There are two sexes, two symbols: M, F. The undergrad probability 101 mistake you're making is that options:
MF, FM
take order into account, while options:
MM, FF
do not. This is incorrect. If you take order into account for the mixed-sex case, you must take order into account for the same-sex cases. MM and FF encompass four options with ordering, not two.
 I'm sorry if I sounded offensive, and I indeed was; I was tired when I wrote my last comment and my words didn't reflect my thoughts. I'd like to elucidate this problem once and for all, and I really do believe we can both agree on a conclusion.
I'd like you to notice that your new table of probabilities implies that I have a chance of 1/6 to guess the gender setup of a family of 2 children. I don't think that makes sense to you either.
Let's make the experiment a bit clearer:
- We gather a number of families who have 2 children.
- For each family, we announce the gender of one of the children, but we don't know which one.
- We are then asked to guess the sex of the other child.
At this point, you believe that the probability of guessing right is 1/2, and that the 2/3 probability doesn't make any sense. My claim is that you fall into the Monty Hall problem trap, which is very counterintuitive and doesn't seem to make sense at first.
But here is some clarification of the problem:
- What we are really asked is to guess the _gender_ setup of the family. So we need to establish the universe of possible family setups before answering. What are they?
Even if we don't care about the order, we must acknowledge that there are 2 children in the family, so there must be a first child and a second child.
Setup 1: both children are boys: M/M
Setup 2: both children are girls: F/F
Setup 3: the first child is a boy, and the second child is a girl: M/F
Setup 4: the first child is a girl, and the second child is a boy: F/M
Why does order matter in setups 3 and 4? Because M does not equal F, while M=M and F=F. We investigate not the individual itself, but the property of the individual (in this case, the gender). Therefore, M/F is not equal to F/M, and in the real world there must be a first and a second child.
If you are asked to write down all the possible setups of a family of two in the real world, you would write the same table. You'd say that:
Some families have 2 boys = 1 setup
Some families have 2 girls = 1 setup
Some families have one girl and one boy = 2 possible setups (1st one is a girl OR a boy).
Your argument of F/M = M/F implies that all families have _either_ one of the 2 setups, every first child is a boy, or a girl. But it doesn't work like this in real life. That is why order matters.
Conclusion: if we agree that there are 4 possible setups in a family of 2 children, then we have a probability of 1/4 to guess the correct setup of the family given NO information. But if we are informed of the gender of one of the children, then one solution of the setup is removed (F/F or M/M), and we have a chance of 2/3 to guess right IF we choose the opposite gender (see Monty Hall problem).
And if we are informed of the gender of one specific child (1st one or 2nd one), then it leaves us with only 2 solutions! And here, the probability becomes 1/2.
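 The disagreement in this subthread is about how much weight each of the four setups keeps after the announcement. A minimal Bayes-rule sketch, assuming equally likely setups and two hypothetical announcement rules (the function names below are invented for illustration): in the first the parent reports a uniformly random child, in the second the parent only confirms that there is at least one girl.

```python
from fractions import Fraction

# Prior over (first child, second child); assumed equally likely.
PRIOR = {family: Fraction(1, 4) for family in ("BB", "BG", "GB", "GG")}

def says_girl_random_child(family):
    """P(announce 'girl') when the parent reports a uniformly random child."""
    return Fraction(family.count("G"), 2)

def says_girl_if_any(family):
    """P(answer 'yes') to the question 'is at least one of them a girl?'."""
    return Fraction(1 if "G" in family else 0)

def p_mixed_given_girl(likelihood):
    """Posterior P(one boy and one girl | 'girl' was announced)."""
    weights = {family: PRIOR[family] * likelihood(family) for family in PRIOR}
    total = sum(weights.values())
    return (weights["BG"] + weights["GB"]) / total

print("random child announced:", p_mixed_given_girl(says_girl_random_child))
print("at-least-one-girl filter:", p_mixed_given_girl(says_girl_if_any))
```

Under the random-child rule, a mixed family says "girl" only half the time, so its weight is halved and the posterior is 1/2. Under the filter rule every family with a girl keeps full weight and the posterior is 2/3. The four setups are the same in both cases; only the announcement rule differs.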
 It's all a question about the correct precondition definition. I think people who have heard about conditional probability know that the correct answer is "2/3". Most people will just imagine the first precondition "they have two children" and forget the second one.
What about going in the opposite direction and simply asking "What is the probability that a mother has a boy and a girl?" I think many people would answer "1/2", too, simply adding the precondition "she has 2 children" ;)
The origin for this is - I think - an implicit context people are thinking of without real awareness of it. You think quicker with implicit context.
 If Jeff Atwood's argument were right, couldn't it then be applied to roulette as well?
If you play two games for example, there are 4 different orders of black/red possible (neglecting 0/00):
b,b
b,r
r,b
r,r
Each has the same probability of 1/4. If you place $1 in the first round on black, your chance to win is 1/2. If it is black (or red), only 3 of the original 4 options are left and two will lead to red...
 If the FIRST round is black, only two of the four possibilities are left, one of which leads to red.
Similarly, if Atwood's problem stated that the FIRST child is a girl then the probability that the second is a boy is 1/2.
The problem, however, states that one of the children is a girl -- it could be the first or the second. In this case the probability that the other child is a boy is 2/3.
 No, you've reintroduced ordering.
 randallsquared is correct. You are misunderstanding sequences of related events.
Running the roulette wheel twice is a sequence of unrelated events. The (ideal) roulette wheel is totally random. The results of run #1 are unrelated to the results of run #2.
However, in this case of the man and woman both having two children, one of which is a boy, but the man having as an oldest child a boy, we are talking about two different scenarios. The events are suddenly related. If you want it spelled out, the matrices are as follows.
The childbirth options are:
BB BG GB GG
However, for the man it's been stated that his oldest child is a boy, so it becomes:
BB BG XX XX
For the woman (who has as a condition 'at least one boy, but not necessarily the eldest') it is:
BB BG GB XX
(Where 'B' is Boy, 'G' is Girl and 'X' is a non-option.)
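 The matrices above translate directly into a counting exercise. A small sketch, assuming the four birth orders are equally likely and that the woman's condition acts as a pure filter ("at least one boy"); the names `p_two_boys`, `man`, `woman` and `both` are illustrative only.

```python
from fractions import Fraction

# (older, younger) outcomes, assumed equally likely.
OUTCOMES = ["BB", "BG", "GB", "GG"]

def p_two_boys(keep):
    """P(two boys) among the outcomes that satisfy the stated condition."""
    kept = [outcome for outcome in OUTCOMES if keep(outcome)]
    return Fraction(kept.count("BB"), len(kept))

man = p_two_boys(lambda o: o[0] == "B")   # his oldest child is a boy
woman = p_two_boys(lambda o: "B" in o)    # she has at least one boy
both = man * woman                        # the two families are unrelated

print("man:", man, "woman:", woman, "both:", both)
```

The man's row leaves BB and BG, giving 1/2; the woman's row leaves BB, BG and GB, giving 1/3; since the families are unrelated, the chance that both have two boys is the product, 1/6.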
 In Jeff's post the question was posed to show how bad humans are at probabilities, and it got the point across very well. I know that I fell right into the trap...
This post takes what was meant to be an enlightening and refreshing take on how bad we as humans are at probabilities and turns it into a numbers-duel.
No, the original question doesn't clearly state the definition of who is asked, it doesn't take into account that slightly more boys than girls are born, and it doesn't answer the question about whether this all happened in a cult where all boys are killed at birth. It was a simple metaphor. And it worked...
Here is a TED talk about the subject of how we are fooled by probabilities: http://www.ted.com/index.php/talks/peter_donnelly_shows_how_...
 Really, a (not the, but a) problem is that almost everybody is right. We are certainly fooled by probability. But also, if someone volunteers that one out of their two children is a girl, then it's effectively-100% that the other is a boy, in real life.
Slightly amending a quote in a similar context from "Principles of Economics, Explained" (http://www.youtube.com/watch?v=VVp8UGjECt4), nobody says "I have a girl. I have another girl. I have another girl."
One way of interpreting this perennial debate is as proof of just how fuzzy English can be. People aren't just arguing about the answer, they argue about the question too.
 > Really, a problem is that almost everybody is right.
Yes - this is why I pointed out that in Jeff's post this riddle was used as a metaphor. Otherwise we could spend all our time bickering about the correct spelling of "colour" or other trivialities instead of addressing the interesting questions.
The interesting question in Jeff's post was the poor human understanding of probabilities.
 "But also, if someone volunteers that one out of their two children is a girl, then it's effectively-100% that the other is a boy, in real life."
Well, yes, because saying one is implies that the other isn't, in speech. But we could talk about sets of two of anything that has a 50% probability.
