
Simulating the World Cup Knockout Stage - nreece
http://blog.wolfram.com/2010/06/24/simulating-the-world-cup-knockout-stage/
======
jarek
This still depends on rankings. What we've seen in this tournament (and many
before it) is that rankings are a pretty poor estimate of actual performance
-- or at the very least, there are some outlier results which have perhaps
gained notoriety due to their outlier nature.

One problem I would suggest is that rankings are primarily derived from past
performance. I'm not sure about the Elo ranking, but the official FIFA one is
based on results from the last four years (eight years before 2006!) -- in
highest-level football that is way too long.

Perhaps another problem is that there are simply too few games in the
tournament for this to work reliably. Given 50 games, a team's win-draw-lose
count might be estimated pretty reliably, because a string of three bad games
can be offset by a string of three above-expectations games. With only 3 games
in the group round, funny things happen to probability.
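
The sample-size point above can be made concrete with a rough sketch. It
assumes, purely for illustration, that each game is an independent
win/not-win trial with a fixed win probability (draws lumped in with losses),
which real football certainly violates. The function name and the 0.5 win
probability are made up for this example.

```python
import math


def win_rate_std_error(p: float, games: int) -> float:
    """Standard error of an observed win rate over `games` games.

    Assumes a plain binomial model with independent games; this is an
    illustrative assumption, not anything taken from the article.
    """
    return math.sqrt(p * (1.0 - p) / games)


if __name__ == "__main__":
    # Compare a 3-game group stage with a 50-game sample.
    for n in (3, 50):
        print(f"{n:>2} games: +/- {win_rate_std_error(0.5, n):.3f}")
```

Under these assumptions the uncertainty on a three-game record is roughly
four times that of a 50-game record, which is the "funny things" in question.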

See: Paraguay with Elo ranking 22, Slovakia with Elo ranking 48, New Zealand
with Elo ranking 60, Italy with Elo ranking 7, France with Elo ranking 12.

~~~
_delirium
The folks over at fivethirtyeight.com (usually a political-statistics blog)
have a somewhat more complex simulation model that adds information about
teams' average goal-scoring / goal-conceding, an estimate of home-field or
home-region advantage, etc. It also uses ESPN's continually updated rankings
instead of FIFA's stale ones. Still subject to a lot of the same problems,
especially the small-sample-sizes issue, but it's at least an improvement. See
the sidebar at: <http://www.fivethirtyeight.com/>

------
alecco
There are just too many variables to properly predict a single World Cup
match. Things you might not think of right away can completely change the
outcome: weather, bad referees (even good referees making just one bad call),
combined strategies, bad substitutions, to name just a few.

It's part of the appeal of watching football: you can't be too sure about
what's going to happen, and very often even recent historical data doesn't
match the outcome.

Perhaps the lesson is that modeling highly unpredictable events shouldn't be
taken lightly.

------
demallien
Of course, there was a paper that came out about a year ago that basically
said that of all the major sports, football was statistically the least
predictable. This means that you need to change the formula used by the Elo
rankings so that it is much flatter than the equation that Wolfram has chosen
here. Once that has been done, the error bars on the predictions just
skyrocket.
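
As a rough sketch of what "flatter" could mean: the conventional Elo
expected-score curve is a logistic function of the rating difference, with a
scale constant of 400 in its usual chess form. Whether the Wolfram post uses
exactly this form is an assumption here, and the `scale` parameter name is
hypothetical. A larger scale gives a flatter curve, so the same rating gap
translates into a weaker favourite.

```python
def expected_score(rating_diff: float, scale: float = 400.0) -> float:
    """Conventional Elo expected score for the higher-rated side.

    `rating_diff` is (own rating - opponent rating). `scale` is the
    flattening knob: 400 is the usual chess value, and larger values are
    an illustrative assumption for a less predictable sport.
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / scale))


if __name__ == "__main__":
    # Same 200-point gap, evaluated on a standard and a flatter curve.
    for s in (400.0, 800.0):
        print(f"scale={s:.0f}: expected score {expected_score(200, s):.3f}")
```

With a 200-point gap, the standard curve gives the favourite about 0.76,
while doubling the scale drops that to about 0.64.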

To give a concrete example from this World Cup: Germany thrashed Australia in
their first match, 4-0. Germany then went on to be beaten by Serbia in a
closely fought match, 1-0, and then Serbia was dominated by Australia in the
last round of group matches, 2-1. Other groups have had the world champions,
Italy, and the runners-up, France, knocked out of the competition in the group
stage, yet none of the current models predicted that either. That makes their
value quite dubious, in my opinion.

~~~
smackfu
It's interesting that football doesn't use series of games to try to cancel
out some of the randomness. It works pretty well for other sports. If two
teams go to seven games, you know they are pretty well matched.
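
A minimal sketch of why a series damps randomness, assuming independent games
with a fixed per-game win probability and no draws (both simplifications; the
function name is made up for this example):

```python
from math import comb


def series_win_prob(p: float, wins_needed: int = 4) -> float:
    """Chance of winning a first-to-`wins_needed` series.

    `p` is the per-game win probability. The sum runs over how many games
    the eventual winner loses before clinching the series.
    """
    q = 1.0 - p
    return sum(
        comb(wins_needed - 1 + losses, losses) * p**wins_needed * q**losses
        for losses in range(wins_needed)
    )


if __name__ == "__main__":
    # Single game versus best-of-seven for a team that wins 60% of games.
    print(f"single game:   {series_win_prob(0.6, 1):.3f}")
    print(f"best of seven: {series_win_prob(0.6, 4):.3f}")
```

Under these assumptions, a team that wins 60% of individual games wins a
best-of-seven about 71% of the time, so the better side advances more often
than it would from a single match.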

It also seems like the group stage doesn't do a good enough job of canceling
out random chance. Too many ties mean that too many of the groups came down to
who won the final games of the group.

~~~
yalurker
Keep in mind, ties are sometimes strategic. Many games this World Cup have
seen teams playing very conservatively/defensively because they were
intentionally hoping for a tie.

Side note: the Champions League and Europa League (where the top teams from
each European country's pro league compete against each other) use a two-game
system in their knockout rounds, one home and one away. The winner is decided
by the cumulative score of the two games.

------
ry0ohki
The graphics are either misleading or wrong: the match-ups for the winners are
incorrect. The winner of the USA v Ghana game plays the winner of the South
Korea v Uruguay game, but this simulation has Ghana playing Argentina in the
next round???

~~~
tharris7
> (based on the current top two teams in each group)

It must be using standings from a day or two ago.

~~~
ry0ohki
The winner of Group A was always going to play the runner-up of Group B
_shrug_

------
greyman
The World Cup is very difficult, if not impossible, to predict. For example, I
was watching Slovakia-Italy yesterday, and around the 95th minute Italy had a
good chance to equalize at 3:3. It came down to whether the player would hit
the ball correctly; I don't know how something like that could be predicted.

Moreover, no one expected that Slovakia would play better than in their
previous matches and that Italy would play worse than in theirs. One possible
factor (for Slovakia) could be that the coach and players were angry at the
Slovak media, which had all blamed them mercilessly for a poor performance in
the previous two matches, and that, I think, had more influence on the outcome
than any previous statistics about those two teams. They wanted to show that
the media were wrong, and that added extra motivation.

Where I think this math approach can work is in the leagues, where the same
teams play against each other regularly.

