

What Real-Time Gambling Data Reveals About Sports - lil_tee
http://www.gambletron2000.com/about

======
jonnathanson
ESPN Magazine recently had a "conspiracy theories" issue, in which it explored
(among other things) the long-held, popular theory that basketball is fixed.
College basketball, in particular. IIRC, the preconditions for a fixed game
tended to be:

\- Non-tournament, regular season play (b/c not as many bettors and media
would be paying attention)

\- The favorite team is favored by 11 or more points

\- The favorite team is dominated by one or two very strong players

If one player controls his team's play, and he's favored by 11+ points, he has
the incentive, the ability, and the margin to shave points without risking
losing the game. With a smaller point spread, on the other hand, it's too
risky. For reasons I can't recall, an 11-point spread was the magic number. It
provided just enough cushion to cover shaving, without jeopardizing the
nominal win.

When analysts looked at the history of games that met these criteria, they
found consistently abnormal distributions of outcomes in favor of the winning
team, but just south of the spread. They estimated that about 3-4% of games in
the study sample are quite likely to have been fixed.

At any rate, it would be interesting to see bigger data sets plied for this
sort of thing.
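
As a rough illustration of the check described above, here is a minimal sketch that counts how often a heavy favorite wins without covering. The function names, the `(spread, favorite_margin)` input format and the sample rows in the usage note are assumptions made for illustration, not the method or data from the ESPN analysis.

```python
def classify(spread, favorite_margin):
    """Classify one game from the favorite's point of view.

    spread: points the favorite was favored by (positive number).
    favorite_margin: favorite's final score minus the underdog's.
    """
    if favorite_margin <= 0:
        return "lost"
    if favorite_margin > spread:
        return "covered"
    if favorite_margin == spread:
        return "push"
    return "won_no_cover"  # won the game, but landed south of the spread


def no_cover_win_rate(games, min_spread=11):
    """Share of games with spread >= min_spread where the favorite won but did not cover."""
    outcomes = [classify(s, m) for s, m in games if s >= min_spread]
    if not outcomes:
        return None
    return outcomes.count("won_no_cover") / len(outcomes)
```

Calling `no_cover_win_rate(games)` on a list of `(spread, favorite_margin)` pairs gives one number per threshold. It only becomes suggestive when compared against the same rate for smaller spreads or tournament games.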

~~~
T-hawk
An alternative explanation is simply that a lead of 11 points tends to dwindle
because the good players relax at less than maximum effort while on the court
when the win is in hand. That would be statistically indistinguishable from
intentional point shaving. (In fact, it IS emergent point shaving, just
without any sinister motive.) This also explains the postseason discrepancy,
that players wouldn't let up in a tournament game.

The correct resolution would be for the pregame betting line to account for
that tendency, taking into account the favorite's likelihood to coast during
garbage time.

~~~
jonnathanson
It's certainly plausible, but in theory, the point spread is supposed to
account for that effect (among all other natural factors involved in play). As
you suggest in your second paragraph.

Intellectually, of course, I tend to assume the Occam's Razor solution (i.e.,
natural variance and other effects over deliberate fixing) in most cases. I'm
not one for conspiracy theories as a first line of thinking. And I think a
bigger sample size is needed to begin with.

Another thing to keep in mind is that -- at least in theory -- players aren't
supposed to be cognizant of a point spread when playing. If we wanted to
detect _deliberate_ point shaving, we'd need to look at instances where the
player behaved as if he was very aware of the spread and was actively trying
to manage it. For example, he actively takes possession of the ball and runs
out the clock, misses easy shots, passes into heavy coverage, and then
reverses course to correct when/if he goes too far. Basically, the behavior of
someone who's _trying hard in either direction_ , as opposed to someone who
just tries hard to secure the win and then lets up.

------
gd1
"On the other hand, all the leagues have significantly lower average hotness
in the first half compared to the second half, so maybe it’s not just the NBA
that has a boring first half problem."

This seems to be a misconception. It stands to reason that the chart will grow
jumpier as you near the end of the game. In options lingo, the implied
volatility will be more stable the further you are from expiry (the end of the
game). Theta decay and all that. The market is basically giving you an
integrated forecast from each point in time until the final buzzer, and as
that window shrinks you expect the odds to be jumping around more.

~~~
dllthomas
Actually, I'm not sure whether it's a misconception or not. What you say about
implied volatility is true, but if your excitement is proportional to the effect
on the odds of the outcome then that is still what you're measuring. I have no
idea whether that is actually the relationship, though (but hey, you could
measure it!).

~~~
gd1
Heh, true. I guess what I was saying is that if you freeze a basketball score
at 20-18 in the second quarter, and at that point the odds are 52-48 in favour
of the team leading by 2, and then neither team scores for the rest of the match,
then the odds will asymptotically approach 100-0 as you approach the final
buzzer. Even though no one is scoring. By the variance measure used here, the
second half will appear more exciting, even though nobody scores. Reminiscent
of football (soccer) perhaps. The fact that first halves have less variance is
just evidence that they are further from conclusion, not necessarily that they
are less exciting, but then maybe the two are linked.
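
The frozen-score argument can be sketched numerically. This is a minimal sketch under a stated assumption: the remaining scoring margin is treated as normally distributed with variance proportional to the time left. That model, the `sigma_per_sqrt_min` value and the sum-of-squared-moves measure are illustrative assumptions, not the site's actual formula.

```python
import math


def win_prob(lead, minutes_left, sigma_per_sqrt_min=1.5):
    """P(leading team wins), assuming future margin ~ Normal(0, sigma^2 * minutes_left)."""
    if minutes_left <= 0:
        return 1.0 if lead > 0 else (0.5 if lead == 0 else 0.0)
    z = lead / (sigma_per_sqrt_min * math.sqrt(minutes_left))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def squared_moves(probs):
    """Sum of squared changes between consecutive probability samples."""
    return sum((b - a) ** 2 for a, b in zip(probs, probs[1:]))


# 48-minute game, one sample per minute, lead frozen at 2 points throughout.
path = [win_prob(2, 48 - minute) for minute in range(49)]
first_half = squared_moves(path[:25])   # minutes 0..24
second_half = squared_moves(path[24:])  # minutes 24..48

print(f"first half:  {first_half:.6f}")
print(f"second half: {second_half:.6f}")
```

Under these assumptions the second half registers more movement than the first even though nobody scores, simply because the probability has to reach 100% by the buzzer.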

~~~
dllthomas
I'd really like to do that actual study - monitor people watching the games,
see how heart rate and perspiration match up with odds-based measures of
"excitement".

------
fidotron
It strikes me as a shame that people can be addicted to gambling and we see it
as a moral problem. For example, gambling data around elections seems highly
likely to be more useful than opinion polls are. After all, this is simply
another application of the ideas behind how we expect markets to function.

Gambling is one of the few ways you can incentivise someone to be honest with
you about their opinion, and for that reason I think it's actually a mass of
untapped potential.

~~~
normloman
I don't care if it makes people honest. If it's that easy to lose control and
bet away your life savings, we shouldn't encourage it.

~~~
theorique
The same kind of arguments are used to argue against the Second Amendment: "I
don't care if guns make people safer in their homes. If it's that easy to lose
control and shoot your spouse, we shouldn't encourage it."

The fact that immoral men exist and occasionally do terrible things should not
bias us against the vast majority of good men for whom such acts are
unthinkable.

~~~
rmc
Comparing gambling to guns-in-USA is not a great example, because many other
countries do not recognise the right of the people to guns.

~~~
theorique
Not every country loves freedom ;)

------
xpose2000
This is pretty interesting and fun data. The article talks a lot about what
games are exciting and it bases this on wild fluctuations in a team's
chances of winning. This makes perfect sense, but there is more to it than
that.

If there are two teams playing each other and the score is 43-36 with plenty
of ups and downs along the way, is it an exciting game? Sure sounds like it.
What if those two teams are the Browns and Dolphins playing in a meaningless
game in December with two backup quarterbacks? Is that game still exciting?

These things are hard to quantify because the algorithm needs to put things
into context that it may not be able to understand.

~~~
aet
What data exactly is this site using?

~~~
malbs
at a guess - Betfair.

------
kohanz
Link to the site being explained:
[http://www.gambletron2000.com/](http://www.gambletron2000.com/)

and the non-RapGenius about URL:
[http://www.gambletron2000.com/about](http://www.gambletron2000.com/about)

This is unbelievably cool. I am blown away.

------
sp332
Blatant blogspam? It's a single paragraph that links to
[http://news.rapgenius.com/Atodd-what-real-time-gambling-data...](http://news.rapgenius.com/Atodd-what-real-time-gambling-data-reveals-about-sports-introducing-gambletron-2000-annotated)

Edit: I guess it's cross-promotion, which is fine just not what I expected.

~~~
nightpool
There's a script on the page that embeds the Rap Genius article with
annotations (a new feature they just debuted), which I'm assuming you're not
seeing because you have javascript turned off. The content is on Rap Genius so
anyone can add annotations to it. Notice that the linked website
(gambletron2000.com) is also a side project of the Rap Genius Engineering
team.

All of that aside, I'm not really a hundred percent sure you can call a
website's "about" page blogspam...

~~~
sp332
Ah, I feel silly now! After the first paragraph (which is all that loads),
there's a friendly link that says "Read this on NewsGenius." So I really
thought that's all there was to the page. I'm using RequestPolicy so it
blocked the content from the rapgenius domain from loading, but usually I'm
better at noticing when that happens.

------
gd1
How do these betting sites handle real-time events? I believe there was a case
recently where a man at the Australian Open tennis in Melbourne was arrested
for transmitting point information outside the stadium before it could be
broadcast on television (there is always broadcast delay), which obviously
could give you a big advantage. I think European football ones go into a vol
auction (pause betting) when a goal is scored? But you could have a guy
sitting in the stadium, wired up with a buzzer to press when a striker goes
one on one with a goalie, or a penalty is awarded, and then just go all-in on
the market? The whole thing is a can of worms.

~~~
alexdias
They handle real-time events exactly as your latter example stated: an
external (licensed) company provides live data, and orders that were placed
during a dangerous situation (e.g. one-on-one) are subsequently voided if a
goal is scored.

~~~
gd1
Why did you edit away your claim that it is to the benefit of the customers?
Being a market-maker and being able to void any trades that occur during
volatile periods sounds like a dream to me. No need to worry about adverse
selection eh? Just printing cash with fat bid-ask spreads and if anyone dares
to get the jump on you, they get voided. I wish I could do that in the real
markets.

~~~
BSousa
I can't speak for all companies, but some companies I worked with honour
those bets (placed just a second before the goal). They try to find a pattern
and ban the users after they find them doing this (and it is getting harder
since live feeds are faster and faster) but I never saw a company voiding a
bet because of this.

------
gomox
Great stuff. Very entertaining read.

This said, I am a bit skeptical about the assessment of "game hotness". Of
course games that are tied or close near the end exhibit significant agitation
at that stage (and "boring 1st halves") from a betting standpoint.

This might sound obvious, but great games are not just about the outcome.
Think of something like soccer, where few points are scored in a given match.
It would be very interesting to see what the data looks like for those, as
there are fewer data points.

~~~
jebus989
He computes "hotness" for soccer (football, to me) matches too:
[http://www.gambletron2000.com/?sport=epl%2Cchampions](http://www.gambletron2000.com/?sport=epl%2Cchampions)

As you'd imagine, each goal brings about a massive spike in implied win/loss
probability.

~~~
nicolsc
The hotness factor is only based on the outcome of the ongoing game. It should
also consider the live impact on classification/awards.

Have a look at yesterday's CL game between Paris & Leverkusen:
[http://www.gambletron2000.com/events/2276/paris-st-g-v-lever...](http://www.gambletron2000.com/events/2276/paris-st-g-v-leverkusen),
which was the second leg of a two-legged tie, Paris having won the first leg
4-0, which meant a 99.9% chance of qualifying. It has a _mildly hot_ 762
score, where it should have been between 0 and 10. As the game was almost
meaningless, the fact that Leverkusen scored first before finally losing the
game didn't bring any excitement.

~~~
iamwithnail
Yeah, but that score is based on _in match_ betting, rather than "to qualify",
no? So what I find most interesting there is that the odds continued to rise
for PSG after the goal, even though people (like me) put money on when they
went behind (better odds, and cheers for the £20). The crucial thing that's at
issue with this system is that it's not just the weight of money that's at
work here, but also the bookies adjusting the odds to ensure their book is
still green, and to encourage/discourage people putting further weight on
particular outcomes.

Tl;dr - the method is really interesting, but imperfect because it's not a
pure market, I think. Betfair odds or similar would be interesting, but their
API is horrendous.

------
polemic
_" Maybe there’s a slight tendency for teams in the 10-20% range to win at a
slightly higher rate than expected (and consequently teams in the 80-90% range
to lose more than expected), but the difference is pretty small, and given the
number of observations and parameters, it would not be surprising if this
deviation occurred completely randomly."_

This is a well-known phenomenon [1] that manifests in almost all prediction
markets. People tend to overestimate the likelihood of likely events, and
underestimate the chance of rare events. If you're patient (and,
importantly, trading fees are low enough) then it is usually possible to
profit from these "sure thing" positions over many events.

1\. Just one of the many links you find in google on the subject:
[http://journal.sjdm.org/9729b/jdm9729b.html](http://journal.sjdm.org/9729b/jdm9729b.html)
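
A calibration check of the kind the quoted passage describes can be sketched as follows. This is a minimal sketch, assuming you already have `(implied_probability, won)` pairs from some market. The function name, the bucket scheme and the input format are illustrative assumptions, not the article's code.

```python
from collections import defaultdict


def calibration_table(observations, n_buckets=10):
    """Compare market-implied probabilities with observed win rates.

    observations: iterable of (implied_probability, won) pairs.
    Returns {bucket_index: (count, mean_implied, observed_win_rate)}, where
    bucket i covers implied probabilities in [i / n_buckets, (i + 1) / n_buckets).
    """
    buckets = defaultdict(list)
    for p, won in observations:
        idx = min(int(p * n_buckets), n_buckets - 1)  # p == 1.0 goes in the top bucket
        buckets[idx].append((p, 1 if won else 0))

    table = {}
    for idx in sorted(buckets):
        rows = buckets[idx]
        n = len(rows)
        mean_implied = sum(p for p, _ in rows) / n
        win_rate = sum(w for _, w in rows) / n
        table[idx] = (n, mean_implied, win_rate)
    return table
```

If bucket 1 (the 10-20% range) shows an observed win rate persistently above its mean implied probability across many events, that is the deviation being discussed. With few observations per bucket the gap can easily be noise.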

------
prl315
It will be interesting to compare the data and graphs presented here to
historic win probability charts provided by Advanced NFL Stats[1] for the NFL
and Fangraphs[2] for MLB. See how Vegas stacks up against models based on
historical data.

[1] [http://live.advancednflstats.com/](http://live.advancednflstats.com/) [2]
[http://www.fangraphs.com/wins.aspx?date=2013-10-30&team=Red%...](http://www.fangraphs.com/wins.aspx?date=2013-10-30&team=Red%20Sox&dh=0&season=2013)

------
Yhippa
I think the Recap functionality might be understated. If they can add more
color data about event times, actual scores, player names, then I think they
could be on to something in automating recaps of games.

~~~
kohanz
Shameless plug for my very work-in-progress (read: lots of broken links) side-
project: [http://recappd.com](http://recappd.com)

Any tips for taking the information there and turning it into a natural
language sounding recap would be appreciated.

Also, one of the leaders in this field:
[http://automatedinsights.com/](http://automatedinsights.com/)

------
Dirlewanger
Pretty cool data, but doesn't reveal much insight into people's
gambling behaviors regarding sports. People are prone to a game's intangible
momentum, who knew?

------
splike
I wonder where they get their data. I feel that the community could come up
with a better ranking algorithm than the square distance formula that they
give.

~~~
kylebrown
I'm wondering the same. They mention TradeSports, which is Intrade's
prediction market for sports, but it hasn't launched yet afaik (I've been on
their e-mail list since before Intrade shut down in March 2013, and I haven't
received anything).

EDIT: nm. the reference to TradeSports was to 2007 (before it shut down in
2008).
[http://en.wikipedia.org/wiki/TradeSports](http://en.wikipedia.org/wiki/TradeSports)

------
cschmidt
I wonder how much better the prediction market does compared to a predictive
model based on the score and time remaining. Given data on the games, it would
be pretty easy to develop a model on the P(winning|scores,time_remaining).
Would that do as well as humans in aggregate? Obviously it doesn't take into
account momentum, how well the teams are playing, etc.
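
One simple version of such a model is an empirical lookup table. This is a minimal sketch, assuming historical snapshots of `(lead, minutes_remaining, home_won)` are available. The function names, the 5-minute bucketing and the add-one smoothing are illustrative choices, not something the site describes.

```python
from collections import defaultdict


def fit_win_table(snapshots, minute_bucket=5):
    """Estimate P(home wins | lead, time bucket) from historical snapshots.

    snapshots: iterable of (lead, minutes_remaining, home_won),
    where lead = home score - away score at that moment.
    """
    counts = defaultdict(lambda: [0, 0])  # key -> [home wins, total snapshots]
    for lead, minutes_remaining, home_won in snapshots:
        key = (lead, int(minutes_remaining // minute_bucket))
        counts[key][0] += 1 if home_won else 0
        counts[key][1] += 1
    # Add-one (Laplace) smoothing so sparse cells are never exactly 0 or 1.
    return {key: (wins + 1) / (total + 2) for key, (wins, total) in counts.items()}


def predict(table, lead, minutes_remaining, minute_bucket=5):
    """Look up the estimate; fall back to 0.5 for states never seen in the data."""
    key = (lead, int(minutes_remaining // minute_bucket))
    return table.get(key, 0.5)
```

Comparing `predict(...)` with the market's implied probability at the same moments would show how much the bettors add beyond score and clock.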

------
gomox
Another thought: Some indication of a "trade volume" equivalent might be
useful as well (and probably a good input for a "hotness" score, as more
popular games will probably get more bets). I don't know if that information
is available though.

~~~
malbs
it definitely is available with betfair - it allows you to filter a huge
amount of garbage data because people are always setting 1.01 / 1000.0
positions, and if you don't exclude them they skew actual implied
probabilities.
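
The filtering described above might look like this. It is a minimal sketch, assuming each quote is a `(decimal_odds, size)` pair. The function names, the `min_size` threshold and the size-weighted average are illustrative assumptions and do not use any real Betfair API call.

```python
MIN_ODDS, MAX_ODDS = 1.01, 1000.0  # the placeholder extremes mentioned above


def implied_probability(decimal_odds):
    """Decimal odds of 2.0 imply a 50% chance, 4.0 imply 25%, and so on."""
    return 1.0 / decimal_odds


def usable_quotes(quotes, min_size=10.0):
    """Drop quotes sitting at the odds extremes or backed by too little money."""
    return [(odds, size) for odds, size in quotes
            if MIN_ODDS < odds < MAX_ODDS and size >= min_size]


def size_weighted_probability(quotes, min_size=10.0):
    """Average implied probability of the remaining quotes, weighted by size."""
    kept = usable_quotes(quotes, min_size)
    total = sum(size for _, size in kept)
    if total == 0:
        return None  # nothing trustworthy left after filtering
    return sum(implied_probability(odds) * size for odds, size in kept) / total
```

Without the filter, a large resting order at 1.01 would drag the average toward 99% regardless of where the market is actually trading.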

------
FollowSteph3
I wonder how this applies to other markets and domains. Could it also be used
to find irregularities and cheating?

