See fivethirtyeight.com for much better simulations - 10,000 run each day based on weighted national and state polls. Plus their graphic design is excellent in being both very clean and informative.
I just downloaded Randall's code and modified it to match my analysis, which is: there are only 3 states that matter (FL, OH, CO) and if McCain wins all 3 then McCain wins but if Obama wins any one of them then Obama wins.
Here's what I got: Obama 94.6%, McCain 5.4%
Assuming my analysis is right, am I doing anything wrong mathematically?
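For what it's worth, under that "any one of three" rule the arithmetic is just one minus the product of McCain's per-state chances, as long as the states are treated as independent. Here is a minimal Python sketch of that calculation. The per-state numbers are invented for illustration; they are not taken from Randall's code or from any poll. The second function shows how far the answer can fall if the three states move together perfectly.

```python
from math import prod

def obama_prob_independent(mccain_probs):
    """P(Obama wins at least one state) if the states are independent."""
    return 1.0 - prod(mccain_probs)

def obama_prob_comonotone(mccain_probs):
    """Same quantity if the states are maximally positively dependent.

    In that case P(McCain wins all) equals his smallest per-state probability,
    which is the largest value the joint probability can take.
    """
    return 1.0 - min(mccain_probs)

# Hypothetical McCain win probabilities for FL, OH, CO (assumed, not real data).
mccain = [0.40, 0.40, 0.35]
print(f"{obama_prob_independent(mccain):.3f}")  # 0.944
print(f"{obama_prob_comonotone(mccain):.3f}")   # 0.650
```

So the mechanics are fine; the open question is the independence assumption, which decides where between those two numbers the real answer sits.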
But it's not quite that dependent either. There are stages of dependencies. Some states hang together more tightly (Northeast, West Coast, and Deep South) than others (Midwest and Southeast). And gradual shifts can be detected through polling and pinned to real world events (conventions, crises, etc). The campaigns use game theory to move resources accordingly (see McCain now in Minnesota versus pulling out of Michigan). Obama has a very strong firewall.
That's where, I guess, 538.com has taught me so much. The polls are a constant flow of changing opinions. But if your models are sensitive and well-designed you can see the slow wave shifts developing. It's not predictive, but history helps too. For instance, no candidate has ever been this far up, with this little time left, and lost.
No. It means he's ahead in so many states, that he'd pretty much have to do something big (and it's going to take more than a little faux pas) to fuck it up, and there's about a 10% chance of that.
Well, I'm certainly not arguing that these odds are correct overall. I figure there's a 10% chance we'll find out that either candidate once murdered a hooker at a party of Ted Kennedy's.
I see now what is meant though by Eliezer's comment, I think I missed it at first. I'll have to think about the effect on the outcome though. Luckily I have lots of friends with math PhDs.
I'm curious how he modeled it. Do actual results follow a bell curve with prior polling as the median and some well-known standard deviation? If so, it would explain why a 10% lead in the polls is considered nearly insurmountable in a given state. I know that if someone is polling at 60-40 in terms of voters, their odds of winning the state are much higher. I've heard 90% thrown out, but don't have any hard data.
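To make the bell-curve idea concrete, here is a rough Python sketch. It assumes the actual margin is normally distributed around the polled margin with some standard deviation sigma. Both that assumption and the value of sigma below are mine for illustration; I don't know how the xkcd simulation or any real forecaster models it.

```python
from math import erf, sqrt

def win_probability(poll_margin, sigma):
    """P(actual margin > 0) when actual margin ~ Normal(poll_margin, sigma).

    poll_margin and sigma are in percentage points; sigma is an assumed
    polling-to-result error, not a measured one.
    """
    return 0.5 * (1.0 + erf(poll_margin / (sigma * sqrt(2.0))))

# Hypothetical: a 10-point lead with an assumed 8-point standard deviation.
print(round(win_probability(10.0, 8.0), 3))
# A tied poll gives a coin flip regardless of sigma.
print(win_probability(0.0, 8.0))  # 0.5
```

Under this model the win probability depends only on the ratio of the lead to sigma, so whether a 10-point lead is "nearly insurmountable" comes down entirely to how small sigma really is.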
Perhaps a better Monte Carlo would use polling data and that.
Because the xkcd odds are computed based on the completely bogus assumption that Obama and McCain win or lose states independently.
When we get to election day, the states become pretty independent; but until that day, there are very strong correlations between the results in different states.
Thank you. I actually did this same simulation (I have the Python code if you want it), and got about 80% for Obama -- this was a couple of weeks back.
I sent it around and people rightly pointed out that the states are definitely not independent. The most interesting thing you can do with this is figure out what the implied dependence is.
First, early voting has already started in some states. Second, there is always a correlation between how states vote, because the results all depend on how the candidates have performed.
Suppose there are three states, Obama needs two of them, and each has an 80% probability of voting for Obama.
If they are independent, the odds of an Obama win are .8 * .8 * .8 + 3 * .8 * .8 * .2 = .896
However, suppose the states are maximally DEPENDENT-- they always vote the same way. Then, the odds of an Obama win are exactly .8
The effect will be much stronger with more states, hence the gigantic difference between XKCD's result and Intrade's odds on an Obama vs. McCain win. In the limit of an infinite number of states, 51% odds in each plus independence would give Obama a 100% chance of winning!
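The toy calculation above is easy to check and extend in Python. This is a sketch under a simplifying assumption that is not true of the real electoral college: every state counts as one equal vote, and a simple majority of states wins.

```python
from math import comb

def majority_prob(n_states, p):
    """P(winning more than half of n independent, equally weighted states),
    where each state is won with probability p (exact binomial sum)."""
    need = n_states // 2 + 1
    return sum(
        comb(n_states, k) * p**k * (1.0 - p) ** (n_states - k)
        for k in range(need, n_states + 1)
    )

# Three states at 80% each: matches the hand calculation of .896.
print(round(majority_prob(3, 0.8), 3))  # 0.896

# With only 51% per state, independence still pushes the overall
# probability upward as the number of states grows.
for n in (3, 51, 501):
    print(n, majority_prob(n, 0.51))
```

In the fully dependent case there is nothing to compute: the answer is just p, however many states there are.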
Things are much more complex, but this is basically the point cperciva was making. It is almost certainly true that the states are not independent. For example, the odds in Ohio are factoring in the chance of some gigantic gaffe or a terrorist strike or whatever, that would change the vote everywhere.
I really hope Munroe's right, and that Obama wins, but I think that 91% is too high. An October surprise could drive a large move, which would make the states very highly correlated. This is much like what often occurs in the stock market. Individual stocks might be only mildly correlated (0.1 - 0.4) in general, but in a crash (or rally) the correlations converge upon 1. Elections are full of last-minute "crashes" and "rallies".
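One way to see how much a shared national swing matters is a small Monte Carlo with a common shock. This is only a sketch with made-up parameters (11 equally weighted states, a 4-point lead in each, 5 points of total uncertainty), not anyone's actual model. The parameter rho is the assumed fraction of each state's variance that comes from the national shock, which is also the correlation between any two states' margins.

```python
import random

def simulate(n_states=11, lead=4.0, sigma=5.0, rho=0.0, trials=20000, seed=0):
    """Estimate P(winning a majority of states) with a shared national shock.

    Each state's margin = lead + national shock + local noise, scaled so the
    total standard deviation per state stays at sigma for any rho in [0, 1].
    All defaults are hypothetical.
    """
    rng = random.Random(seed)
    national_sd = sigma * rho ** 0.5
    local_sd = sigma * (1.0 - rho) ** 0.5
    wins = 0
    for _ in range(trials):
        shock = rng.gauss(0.0, national_sd)  # hits every state at once
        states_won = sum(
            1
            for _state in range(n_states)
            if lead + shock + rng.gauss(0.0, local_sd) > 0
        )
        if states_won > n_states // 2:
            wins += 1
    return wins / trials

print("independent:", simulate(rho=0.0))
print("highly correlated:", simulate(rho=0.9))
```

The per-state win probability is identical in both runs; only the correlation changes, and that alone is enough to pull the overall number well down from the independent case.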
My market on an Obama win is 70-74. I'm including the very serious possibility of election tampering favoring the Republicans, as many allege occurred in 2004, as well as (alas) the Bradley effect. It's difficult to place probabilities on unprecedented, infrequent events, and this is an unprecedented presidential election, not only because Obama is black (that's only a small component) but because he's an entirely different kind of political candidate than we've seen in the past, and because we're frankly in a very different country in 2008 than we were in 2000 or '04.
Well - the 68% is not really a percentage: it's the price of a contract which will pay 100 to the owner if Obama wins. It's the market's consensus of what the probability is, so yeah it's a percentage.
Given that the state-by-state percentages are also contracts, it seems that somebody with a little bit of time on their hands, and some programming smarts, could make a little money doing some arbitrage between the state contracts and the overall contract.
Betting sites like Intrade are lagging indicators with respect to the polls. Bettors adjust their bets whenever new polls are released, but it doesn't generally work the other way around (people responding to pollsters don't do so on the basis of what they saw on Intrade).
That sounds like a valid point, but bear in mind that a lot of research has been done on the IEM (the Iowa Electronic Markets, more credible than Intrade), and they are more predictive than polls.
I disagree. I think prediction markets are better because people are risking money. Of course they adjust based on sentiment and other external factors, but I do think it's superior to traditional polling that has significant sampling and measurement problems.
As others have pointed out, this is silly because the states aren't independent. There's no way Obama is going to lose OH but win IN, for example, although probabilistically speaking there's some chance.
538 models the dependence along demographic lines, so that demographically similar states, districts, and counties are assumed to vote similarly.
This is interesting because it points out yet another opportunity. If you believe 538, you should pretty much be buying Obama to win on Intrade and shorting him on the swing states.
It would hurt my brain too much to do so, but you could probably figure out a betting strategy with a small or moderate assured return.
Munroe links to them at the bottom.