

World Cup Follow-Up: Update of Winning Probabilities and Betting Results - etienneb
http://blog.wolfram.com/2014/06/26/world-cup-follow-up-update-of-winning-probabilities-and-betting-results/

======
chollida1
Love or hate Wolfram, the one thing that I find they do extremely well is make
their language work well with consuming and working with a wide variety of
data sets.

Or put another way they don't just focus on making a general purpose
programming language or even a math focused language, they actually work to
make the language interoperate well with real-world data sets.

In my opinion they've passed Matlab and R in terms of being able to take a raw
data set and quickly ask and answer questions about it, and as someone who does
this for a living I'm very happy Mathematica exists, although R still wins in
terms of cost :)

~~~
jkldotio
I haven't used it but I have found most of these demos to look very staged and
there is too much magic going on. Isn't the whole point of this to blur the
lines between language and cloud so you are eternally dependent on their
system and have to pay them when you need to seriously scale or deploy
something?

~~~
tokipin
it isn't magic, but simply the fact that the language is "dynamically typed"
(or more accurately, that the whole system works symbolically). the "magic" is
due to its UI compositing system using the same exact semantics as the rest of
the language (or rather, it _is_ the same language). this makes graphics-
making and UI-making super easy.

have a look at the Mathematica stack exchange[1] site to see more code. there
isn't any contrivance there.

[1]
[http://mathematica.stackexchange.com/questions/tagged/graphi...](http://mathematica.stackexchange.com/questions/tagged/graphics?sort=votes)

------
sspiff
I'm not sure I understand the data this model presents. I focused on Belgium
(being Belgian) and their next competitor (the US).

Even though Belgium comes out slightly ahead in the "chance of victory" graph,
and in the "most probable game tree" they are picked as the more likely
winner, the lower graphs (chance to reach, chance of knock-out) show the US as
having a higher chance of making it to the quarter finals than Belgium.

Isn't this inconsistent?

~~~
etienneb
Indeed you are right, the figure showing the "chance to reach, chance to
knock-out" was accidentally an outdated one (from a test simulation with a
lower number of trials). The real figure will be updated soon. They are quite
close though, good job noticing that! Update: the figure has now been updated.

~~~
sspiff
I figured as much, seeing as the chance to reach the group of 16 was below 1.0
for all teams. Looking forward to seeing the new charts!

------
thathonkey
Comparing this to 538's (quite different) Analysis [1]:

#1

* 538 - Brazil - 36.0%

* WOLF - Brazil - 32.2%

#2

* 538 - Argentina - 17.0%

* WOLF - Netherlands - 23.5%

#3

* 538 - Germany - 12.0%

* WOLF - Germany - 21.6%

This paints a pretty interesting picture about trying to use statistics and
math to predict sporting tournaments like this. Seems like nobody has been
very successful at this yet. Obviously it gets "easier" as the tournament
progresses, though.

* [1] [http://fivethirtyeight.com/interactives/world-cup/](http://fivethirtyeight.com/interactives/world-cup/)

~~~
tragomaskhalos
This is interesting but not surprising: the fact that scoring in (association)
football is a relatively rare event means that it is less predictable at the
individual match level than other team sports such as rugby, American football
and basketball that feature more frequent scoring events; i.e. there is a
greater likelihood of the underdog snatching a surprise win.
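
To put rough numbers on that intuition, here is a minimal sketch. It assumes each side's score is an independent Poisson count and that the favorite's scoring rate is 1.5 times the underdog's in both a low-scoring and a high-scoring sport. Both the model and the rates are illustrative assumptions, not anything taken from the article.

```python
import math

def poisson_pmf(k, mean):
    """P(score == k) for a Poisson-distributed score; computed in log space to avoid overflow."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))

def upset_probability(fav_mean, dog_mean, max_score=150):
    """P(underdog strictly outscores favorite), scores being independent Poisson counts."""
    total = 0.0
    fav_below = 0.0  # running P(favorite scores fewer than d)
    for d in range(max_score):
        total += poisson_pmf(d, dog_mean) * fav_below
        fav_below += poisson_pmf(d, fav_mean)
    return total

# Same 1.5:1 ratio of scoring rates, very different scoring volume (made-up rates).
low_scoring = upset_probability(1.5, 1.0)     # football-like
high_scoring = upset_probability(15.0, 10.0)  # many more scoring events
print(low_scoring, high_scoring)
```

Under these assumptions the underdog wins outright more often in the low-scoring setting, because fewer scoring events leave more room for chance. It says nothing about how evenly matched the teams are in the first place.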

~~~
kapkapkap
That is absolutely incorrect. Heavy underdogs in the group stages of this
World Cup frequently had moneylines of +1300 and higher (e.g. Netherlands vs
Australia). There is NEVER an NFL game in a typical season where the
moneyline (i.e. without point spread) on the underdog will pay out that high. So
there is a much SMALLER likelihood of the underdog snatching a surprise win.
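
For anyone not used to American odds, a moneyline maps to an implied win probability roughly like this. This is a minimal sketch: the function name is made up for illustration, and the bookmaker's margin is not removed.

```python
def implied_probability(moneyline: int) -> float:
    """Convert an American moneyline to the implied win probability (margin included)."""
    if moneyline > 0:
        # Underdog: a 100 stake wins `moneyline`.
        return 100 / (moneyline + 100)
    # Favorite: a stake of |moneyline| wins 100.
    return -moneyline / (-moneyline + 100)

print(round(implied_probability(1300), 4))  # 0.0714
```

So +1300 corresponds to an implied chance of about 7% before accounting for the margin.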

~~~
streptomycin
Many pro sports leagues like the NFL are designed to have parity amongst the
teams, to make it more exciting. There is no way to ensure parity in national
teams. If there was an international American football competition, it would
have even less parity than international basketball competitions (where the US
is extremely heavily favored in nearly every game).

------
ccozan
I think you need to factor some other datapoints too:

\- probable weather - some teams play differently depending on conditions

\- yellow and red cards - some major player could miss a match. If team A is
missing a defender, it might be more vulnerable and thus may lose a
match.

and so on.

~~~
willis77
I find comments like this unconstructive, and they always surface when people
present forecasts. It's easy to play the shoulda-woulda-coulda game with data.
"You should account for the breaking strengths of each team's players' ACL
tendons. It could affect the outcome if they get injured!" Okay, so what? Did
you do it? Does the data exist? Do you have any quantitative evidence of a
statistically significant link between the weather and team performance? If
you do, do you also have the evidence to show that the game forecast is not
rate limited by the error of the weather forecast?

I'm not saying you have to write a thesis to suggest new data sources, but I
am saying that "you need to do XYZ" is not a productive piece of advice
without the evidence to back it.

~~~
badusername
Though I agree with you on the weather bit, there is already a history of
yellow and red cards from the tournament, which could have huge
implications for the tournament ahead. (E.g., Luis Suarez getting booted is
one.)

~~~
darthgoogle
It would be useful to have a database of football punishments so teams and
fans could see whether a player has been dealt with fairly or not. Perhaps one
already exists?

------
replicant
You know what would be interesting: performing this analysis on all past World
Cups, looking at the group results and computing the possible outcomes of the
brackets. And then computing some statistical indicator like the p-value (sorry,
I am very weak in statistics).

------
introex
All is well until your main player bites an opponent and gets banned.

~~~
pfortuny
Or until said player is not turned away from the match, which would have
changed at least 15 minutes' worth of it.

------
wsxcde
These statistical analyses never work well for football and this one is
another example of that happening.

For example, except for Spain in South Africa, no European country has ever
won the world cup outside of Europe. Why is this? A large part of this is due
to the weather. And unsurprisingly, South African weather in the southern
winter isn't terribly different from what the Europeans are used to. This stat
alone should've predicted a lot of the European "upsets" and should trigger
your suspicions about 2 out of the top 3 favorites being European.

And even if you were unconvinced about this prior to the tournament, having
watched teams struggling in Manaus should've convinced you that these
statistical analyses ignore important information that is obvious to even
semi-casual fans.

~~~
jsnell
I am still unconvinced. The data set is incredibly small, incredibly noisy,
and after the South African tournament only convincing if you cherry pick the
right results.

The last time a European team didn't make it to the finals was in 1950. If
climate really did matter that much, you wouldn't expect that. In addition to
2010 that you chose to ignore, the US World Cup final went to penalties
and the 1986 Mexico final was tied 2-2 between Argentina and West Germany
until something like 10 minutes before the end. (And a non-European team was
in the finals largely thanks to the most famous refereeing error in the
history of the sport).

That evidence looks really weak.

------
r41nbowcrash
>Can we at least conclude that our probabilities outperform bookmakers?

More like outperform the market?

Not sure if more precise probabilities outweigh the risk of ruin, if there's
an unbalanced amount of money backing each option.

~~~
cwyers
It's very, very hard to make a model that outperforms bookmakers in the long
run. If you find some sort of inefficiency in the betting market, maybe. But
it turns out that most often the betting market is inefficient at the points
where casual bettors outnumber the "sharps," which tends to be the points at
which the bookmakers have the highest juice on their odds, so you're not
likely to come out ahead anyway.

~~~
izyda
do you happen to know where one could find data on this? Not just bookmakers'
odds, but volume of bets? This would make for interesting research

~~~
phillc73
Betfair has an API. This can be used to return the odds and volumes matched
for all sports on their Exchange platform.

[https://developer.betfair.com/](https://developer.betfair.com/)

------
toolslive
It's not the complete picture, but more expensive players tend to be better
than cheaper players. So more expensive teams tend to be better than cheaper
teams. I see this is completely ignored.

Also team Elo has issues, as not a lot of games are played, former
achievements weigh in too heavily.

Nonetheless, this is a really nice effort. Let's see how well it works.

------
Huggernaut
Can someone explain the Belgium vs USA Tree/Graphs for me. The most likely
tree shows Belgium beating the USA but in the graphs the USA has a higher
chance of getting to the Round of 16?

~~~
TheCoelacanth
It looks like they switched the graphs for Belgium and the US, since most
other sources[1][2] give Belgium a slight edge and even their own most likely
outcome graph shows Belgium as the favorite.

[1] [http://fivethirtyeight.com/interactives/world-cup/](http://fivethirtyeight.com/interactives/world-cup/)
[2] [http://rogerkaufmann.ch/dsaINTe_r.htm](http://rogerkaufmann.ch/dsaINTe_r.htm)

------
nly
This article makes the rather optimistic assumption that bookie prices
efficiently and purely represent probabilistic models. This isn't quite the
truth, as many bookies can and do take a position against the weight of money
(punters often bet disproportionately and according to some psychology).
There's also the juice to consider, particularly for underdogs, where value is
quickly eroded.
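
As a rough illustration of the juice: the reciprocals of a bookie's decimal odds sum to more than 1, and the excess is the margin. The sketch below strips it by simple proportional scaling, which is only one of several possible conventions. The odds are made up, and the function name is hypothetical.

```python
def remove_overround(decimal_odds):
    """Return (normalized probabilities, bookmaker margin) from decimal odds.

    Proportional scaling is an assumption here; it ignores any bias the
    bookmaker applies unevenly to favorites and underdogs.
    """
    raw = [1.0 / o for o in decimal_odds]
    total = sum(raw)  # exceeds 1 whenever a margin is built in
    return [p / total for p in raw], total - 1.0

# Hypothetical win/draw/loss prices for a single match.
fair, margin = remove_overround([1.5, 4.0, 7.0])
print(fair, margin)
```

Comparing a model's probabilities against the raw reciprocals rather than the normalized ones overstates how likely the bookie thinks each outcome is.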

------
imarg
It would be interesting to see if the predictions would be different if the
official FIFA ranking[1] was used instead of Elo.

[1]
[http://www.fifa.com/worldranking/rankingtable/](http://www.fifa.com/worldranking/rankingtable/)

------
talles
> There have been some surprises: from 10 of our favorite teams, 3 have been
> eliminated (Portugal, England, and, most surprisingly, Spain)

So Italy, which has won the World Cup 4 times, wasn't one of the 10 favorites??

~~~
TheCoelacanth
I hardly think winning in 1934, 1938 or 1982 has any impact on present day
results. Even claiming that the 2006 win has much bearing is a stretch since
they have a different head coach and only 4 of the same players.

~~~
rbonvall
Then England should not be a favorite either.

~~~
TheCoelacanth
The favorites are based on the team's recent performance, not on World Cup
victories. For instance, the Netherlands have never won a World Cup, but their
recent performance prior to this World Cup is undeniably better than either
England or Italy, so they are ranked higher.

------
IpV8
Their bracket prediction just picks the higher ranked team every time...

~~~
skj
yes, and?

~~~
TallGuyShort
I'm not sure exactly what they're referring to, but they may mean that in a
series of Monte Carlo simulations, the higher ranked team doesn't win every
time, they win with a higher probability. Which means that (probably) more of
the outcomes result in them winning, but (probably) not all of them. The
article did say they were doing Monte Carlo simulations but I don't see where
they're not, except for when they define the most likely bracket (in which
case choosing the higher-ranked team every time is correct).
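
A minimal sketch of that idea, not the article's actual code: it treats the standard Elo expected score as a knockout win probability (so draws and penalties are ignored), uses made-up ratings for four hypothetical teams, and counts how often each one wins the bracket.

```python
import random
from collections import Counter

def elo_win_probability(rating_a, rating_b):
    """Standard Elo expected score for A against B, used here as a win probability."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def play_knockout(teams, ratings, rng):
    """Play one single-elimination bracket; `teams` is in bracket order (power of two)."""
    while len(teams) > 1:
        winners = []
        for a, b in zip(teams[0::2], teams[1::2]):
            p = elo_win_probability(ratings[a], ratings[b])
            winners.append(a if rng.random() < p else b)
        teams = winners
    return teams[0]

def title_odds(teams, ratings, trials=20000, seed=0):
    """Estimate each team's chance of winning the bracket by repeated simulation."""
    rng = random.Random(seed)
    wins = Counter(play_knockout(teams, ratings, rng) for _ in range(trials))
    return {t: wins[t] / trials for t in teams}

# Made-up ratings for four hypothetical teams.
ratings = {"A": 2100, "B": 1900, "C": 2000, "D": 1800}
print(title_odds(["A", "B", "C", "D"], ratings))
```

The top-rated team wins the most simulated brackets but far from all of them, which is the difference between the probability charts and the single "most likely" bracket.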

------
spacefight
This guy is doing the same calculations with slightly different predictions
than Wolfram

[http://rogerkaufmann.ch/dsaINTe_r.htm](http://rogerkaufmann.ch/dsaINTe_r.htm)

~~~
talles
Brazil with 33.1% chance of winning and Germany with 13.2%?

No way.

~~~
pfortuny
The problem is that they meet (if at all) before the final, so the
probabilities of Germany go down very fast if you assume they will lose
against Brazil more probably.
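
Roughly speaking, a title chance is the product of the per-round win chances, so one hard round drags the whole thing down. A quick sketch with made-up round probabilities (and treating the rounds as independent, which is a simplification):

```python
import math

def title_probability(round_win_probs):
    """Chance of winning every round, assuming the rounds are independent."""
    return math.prod(round_win_probs)

# Hypothetical paths: identical except for a tough semi-final opponent.
easy_path = title_probability([0.7, 0.6, 0.55, 0.5])
hard_path = title_probability([0.7, 0.6, 0.35, 0.5])
print(easy_path, hard_path)
```

Changing only the semi-final from 0.55 to 0.35 cuts the overall title chance by the same proportion.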

------
niccolop
Costa Rica knocked out both England and Italy, and yet it places below USA?
Nonetheless, it's an interesting approach.

~~~
peterevans
I would guess that world ranking plays into the algorithm a bit. USA entered
the World Cup ranked 13th in the world, while Costa Rica were placed 32nd
(from their Elo Ratings ([http://eloratings.net/](http://eloratings.net/))).
Both teams, if they make it into the quarter finals, would probably face
buzzsaws (Netherlands for CRC and Argentina for USA).

------
sirkneeland
The US has a 1.2% chance of winning.

"So you're saying there's a chance..."

