"If Alice tosses a coin until she sees a head followed by a tail, and Bob tosses a coin until he sees two heads in a row, then on average, Alice will require four tosses while Bob will require six tosses (try this at home!), even though head-tail and head-head have an equal chance of appearing after two coin tosses."How does this work?

 > Intuitively, first, both have to get a head. After that, if Alice "fails" by getting a head, then she still needs only one tail. Her first head doesn't get "reset" by failing her second try. But after getting a head, if Bob fails by getting a tail then he does get reset -- he has to start all over.
 Explain the difference?Are you saying that HH causes an early failure for HT , instead of a potentially longer success HHHT? If so , it is poorly worded, to be ambiguous about how to count failures. (In the 4-6 variation, there are no failures)
 I'm saying that if you just keep flipping coins (So, no Bob/Alice situation), and stop whenever you hit HH or HT, that the average number of flips to get to HH is 3, and the average number of flips to get to HT is 3.But, if you are focussing on a particular scenario that you will flip coins until you get to HT, the average number of flips will be 4, and if you flip coins until you get to HH, the average number of flips will be 6.I just find that really hard to grasp intuitively.
 "and stop whenever you hit HH or HT" is equivalent to say "whenever you hit H, flip one more then stop". Average to hit H is obviously 2, plus one is 3, so you are right, but IMO it's not that counter-intuitive.
 Do I understand your first scenario correctly, that if for example you flip TTHH, you stop there, count that as "took four flips to get to HH," count nothing for HT, and then start over?Seems like you're just taking a biased sample, which cancels out the differences. To take an extreme example, imagine one candidate is HHHHHHHHHH and the other candidate is any other sequence of ten flips. In the "try until you get either one" scenario, the average number of flips for either one will be 10. Testing them independently, the average number of flips for the second one will be slightly over 10, and for HHHHHHHHHH it'll be huge.
 I don't think that's true. I think that the average number of flips before _stopping_ might be 3, but it doesn't become more likely to get HT simply because you're also looking for HH (or vice versa).
 It turns out that the average number of flips to get HH and HT is 3 - someone else on the thread described it well - on average, the number of flips to get to a "H" will be 2, and then you will always stop on the next flip - 50% at flip 3 with a HT, and 50% at flip 3 with a HH. Explained that way, it makes sense that the average number of flips is 3.But, I still find it strange that if you are flipping with one particular scenario in mind, HT or HH, that the average number of flips goes from 3 to 4 or 6, even if I can reason it out with a bit of thinking.
 `````` flip coins if HT or HH stop `````` vs`````` flip coins if HT stop flip coins if HH stop `````` two different scenarios
 A similar property was recently taken advantage of to reduce the time needed to brute-force older garage door openers; roughly speaking openers look for a particular base-3 string, but don't require a start/stop sequence, so you can try N length M permutations with much less than N*M symbols transmitted.
 That's a De Bruijn sequence. See https://en.wikipedia.org/wiki/De_Bruijn_sequence . A example applied to door codes is at http://stefangeens.com/2001-2013/2004/10/the-de-bruijn-code/ . A search for "De Bruijn garage door" finds a couple of examples of what you mentioned.
 Thanks for the name, with it I was able to find the article I was thinking of: http://samy.pl/opensesame/
 Ah, thanks for that. So it's not a property of the coin toss itself, but the fact that a failure at the second step in the series only resets to before step 2, instead of before step 1.
 That was going to bother me all day. Thank you.
 I thought the article was saying Alice must get a head followed immediately by a tail. If that's not the case, then it makes total sense, but it would seem the article is a bit vague about that.Actually, the details in the article say:>even though head-tail and head-head have an equal chance of appearing after two coin tosses.That implies that the tail is expected immediately after the head for Alice's goal.
 Correct, but the point is if Alice doesn't get a tail, that means she got a head, so she is still in the same position as she was before the flip, only needing a single tail to complete the sequence. If Bob gets a head, then a tail, he now needs two consecutive heads to complete his sequence.
 The only way she can get a tail which isn't immediately after a head is to never get a head in the first place.
 Yeah, thought so - it's specifically heads->tails, rather than either heads->tails or tails->heads being okay. So the probabilities don't seem remotely counterintuitive, just crappily communicated IMO.
 I'm not sure your explanation fits with the probabilities. Pointing out that you're only looking for heads->tails and not tails->heads seems to suggest that the HT pattern should be less likely than the HH, but it's actually more likely (for reasons explained well by others).
 I don't have time to put numbers on it, but consider the following with 3 tosses only. There is 8 different enumerations:* HHH * THH * HTH * TTH * HHT * THT * HTT * TTTAlice is looking for HT, so she will succeed in HTH, HHT, THT, HTT, that is 4 out of 8 possible outcomes. Bob on the other hand is looking for HH, that is only in HHH, THH, HHT, 3 out of 8 possible outcomes. So while HH and HT are equal in probability when you consider 2 coin flips, the combination of HT happens more often than HH. This is the case with 3 coin flips - there is no guarantee it translate to the same with more coin flips, but that is my bet.
 Amusingly, the question "which substring occurs earlier on average" is different from the question "which substring is more likely to occur before the other". In fact the second question sometimes has a circular answer! For example, THH typically (with >50% probability) occurs before HHT, which typically occurs before HTT, which typically occurs before TTH, which typically occurs before THH.Also the question "which substring occurs earlier on average" is intimately connected with algorithms for substring search. For example, if you want to check that a string doesn't contain HHH, you need to look at every third character, but for THH that's not enough.Fascinating stuff.
 The key "insight" that helped me intuit this is that "HH" can self overlap, while "HT" can't.
 I came across this scenario while interviewing at a HFT firm about a year ago - caught me off guard :)
 If Alice fails, and gets a second head, she only has to get one toss right, and hasn't completely reset her sequence.For Bob, as soon as he sees a tail, his sequence is completely reset and he now has to get two tosses right.I'm not sure how to calculate averages, though.
 So if Alice fails she is halfway through her sequence. But if Bob fails he needs to start over.That makes sense.
 Uh. I totally forgot everything about statistics and probabilities (and I'm too lazy to remember those, so I won't check numbers 4 and 6), but I think the core idea how it works is that Bob's option just has lower chances. Say, we toss a coin up to 3 times.Possible outcomes are: TTT, TTH, THT, THH, HTT, HTH, HHT, HHHAlice successes are: HT, THT, and HHT, but Bob has less options: HH and THH. That's why he needs more tosses on average.
 Yes but why is Bob's harder? Answer without enumerating the full list of possible flip sequences
 If they both fail to get what they need on their second tosses, it's possible for Alice to win on the next flip, but Bob needs two flips to win!If Alice has failed to win on her second toss, her flip sequence was "HH", so she can win on the next toss, by flipping a T!If Bob has failed to win on his second toss, his flip sequence was "HT", so he needs two more flips to win!
 An easy way to understand it is by thinking about bunching. Since you're only flipping until you hit the first matching sequence, on average you'll hit the more evenly distributed sequence more quickly than the bunched sequence.Multiple heads in a row are more bunched than transition sequences because, for example, a sequence of three heads in a row will include two sequences with two heads in a row. You can't do that with a transition sequence--it takes at least four tosses to get two identical transition sequences.
 Your explanation actually confused me until I realized we can't conflate the two different players' throws.We basically have four cases for Bob:- HH: terminates for Bob.- HT: Bob restarts the sequence.- TT: Bob restarts the sequence.- TH: This is a continuation. It degenerates into the answer to a single throw. Otherwise its just a recursion.Then we have another four cases for Alice:- HH: This is a continuation. It degenerates into the answer to a single throw or else a recursion.- HT: terminates for Alice.- TT: Alice restarts the sequence.- TH: This is a continuation. It degenerates into the answer to a single throw. Otherwise its just a recursion.So with that understanding it is a bit more clear how we get this result:Alice has one win and two opportunities for a win and one for a restart. Bob has one win and one opportunity for a win and two restarts.That was a bit confusing. I wonder how the problem could be worded to ensure that people got the answer correctly every time?
 The most intuitive way to answer this question is that you're not comparing individual coin flips but pairs of adjacent coin flips in a longer sequence of ones. These pairs are no longer independent trials: if your first two coin flips are TH, then getting an HH if you look at the second two is much more likely than getting a TH.
 Alice: first toss: head. Second toss: head. She can take that second toss as the begining of the new sequence and if she gets tail on the third toss she is done. In Bob's case, if he gets tail on the second toss, that toss no longer counts and he must get head for the new streak to begin.
 Because if Alice fails to get H->T, then she has to be back at the first step, H, ready to flip again for H->TIf Bob starts on H and gets T, he needs to continue flipping until he gets back to H.
 Starting from scratch, they first need to get a head. This takes 2 tosses on average (1 with 50% probability, 2 with 25% probability, 3 with 12.5% probability, etc.). At that point, both Alice and Bob have 50% chance of getting the target sequence with one additional toss.In the case of failure, Alice still has 50% chance of success in each subsequent toss. On average she will need two additional tosses to get a tail and the answer is 2+2=4.In the case of failure, Bob has to start again. If we call the answer x, we can write x=2+0.5 1+0.5 (1+x) and solving the equation we get x=6.
 Don't know if this is exactly why, but the situation certainly reminds me of this: https://en.wikipedia.org/wiki/Penney%27s_game
 A great and very entertaining explanation is in Peter Donnelly's TED talk: "How juries are fooled by statistics".
 If Alice fails after two tosses, she has a high probability of ending with a head, so that the next toss is more likely to be successful.