
Finding the Tennis Suspects - aroman
https://medium.com/@rkaplan/finding-the-tennis-suspects-c2d9f198c33d#.g93vsuejk
======
dates
Great work! As someone who never does data exploration or uses Python, I
appreciated how annotated your iPython notebook was. It was fun to read
through. (link [https://github.com/BuzzFeedNews/2016-01-tennis-betting-
analy...](https://github.com/BuzzFeedNews/2016-01-tennis-betting-
analysis/blob/master/notebooks/tennis-analysis.ipynb) )

~~~
guava
That's the iPython notebook published by BuzzFeed showing their procedure.

Here's the link for the relevant notebook for this work:
[https://github.com/rkaplan/deanonymizing-tennis-
suspects/blo...](https://github.com/rkaplan/deanonymizing-tennis-
suspects/blob/master/notebooks/tennis-analysis.ipynb)

~~~
dates
oops thanks!

------
te
I have no problem believing that there is match-fixing in tennis, especially
among the non-elite ranks. However, it is still an extraordinary claim, and as
such, requires extraordinary evidence. They have examined a dataset that
includes match losses by 1,509 unique players. Among others, they have accused
a fellow whose match results, by their own estimation, are 4.7% likely to have
occurred by chance. If you're going to be involved in throwing around
accusations of this magnitude, I'd like a little more rigor in the statistics.
I don't know how they sleep at night.
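
A back-of-the-envelope sketch of the multiple-comparisons worry, using only the two figures quoted above (1,509 players, 4.7%). It assumes, purely for illustration, that the 4.7% can be read as a per-player p-value and that players are tested independently; neither is confirmed by the article, and the function names are made up for this example.

```python
# Rough sketch: how many players would clear a 4.7% bar by chance alone?
# Assumption: 4.7% is treated as a per-player p-value threshold and the
# 1,509 players are treated as independent tests. Both are simplifications.

def expected_false_positives(n_players: int, threshold: float) -> float:
    """Expected number of innocent players flagged at the given threshold."""
    return n_players * threshold


def prob_at_least_one(n_players: int, threshold: float) -> float:
    """Chance that at least one innocent player is flagged."""
    return 1.0 - (1.0 - threshold) ** n_players


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate near alpha."""
    return alpha / n_tests


if __name__ == "__main__":
    n, p = 1509, 0.047
    print(f"expected false positives: {expected_false_positives(n, p):.1f}")
    print(f"P(at least one):          {prob_at_least_one(n, p):.6f}")
    print(f"Bonferroni threshold:     {bonferroni_threshold(0.05, n):.2e}")
```

Under those assumptions, roughly 71 of the 1,509 players would clear a 4.7% bar with no fixing at all, and a Bonferroni-style correction would demand a per-player threshold of about 0.003%.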

~~~
jib
Yep. "Assuming the initial odds are right, these guys may have cheated".
That's one hell of an assumption. The odds moving is one of the strongest
indicators of the initial odds not being right. Market efficiency etc.

~~~
phillc73
Is the market truly efficient? Value betting is all about finding when it's
not.

~~~
SolarNet
Iff NP=P.

------
Infernal
I find this to be a useful example of the dangers of so-called anonymized data
(namely, the danger that it isn't).

I'd be curious to know how much fuzzing of the percentages published by
Buzzfeed would've been sufficient to prevent this sort of obvious matching.
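
One way to put a number on that question, as a sketch only: for each published figure, count how many players in the public data fall within the fuzzing tolerance. If exactly one candidate remains, the fuzzing was not enough. The player names and percentages below are invented for illustration and are not taken from either notebook.

```python
# Sketch: measure how many public candidates are consistent with one
# published (possibly fuzzed) percentage. All data here is hypothetical.

def candidates_within(published: float, public: dict[str, float],
                      tolerance: float) -> list[str]:
    """Return players whose public value lies within +/- tolerance."""
    return [name for name, value in public.items()
            if abs(value - published) <= tolerance]


# Hypothetical per-player percentages recomputed from public odds data.
public_values = {
    "player_a": 12.5,
    "player_b": 13.1,
    "player_c": 20.0,
    "player_d": 12.9,
}

if __name__ == "__main__":
    for tol in (0.0, 0.5, 1.0):
        matches = candidates_within(12.5, public_values, tol)
        print(f"tolerance {tol}: {len(matches)} candidate(s) {matches}")
```

With exact values the published figure pins down a single player; the match only becomes ambiguous once the tolerance exceeds the gap to the nearest other player.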

------
ch4s3
Something feels a bit ethically questionable about publishing the names like
this.

~~~
chris11
It also seems really stupid. I would not be surprised if he got threatened
with lawsuits from the named players.

~~~
achamayou
On what grounds? They're not making any claims, they're just noting the fact
that interesting and unlikely things happened to these guys' odds (public
data) during matches.

They've gone through a reasonable degree of trouble to make sure their numbers
are completely right too.

~~~
gohrt
The original authors published accurate data. The Medium author accused named
individuals of cheating.

~~~
achamayou
They're not. Read it again: in fact, they start with a disclaimer explicitly
stating they are not commenting on the allegations, only de-anonymising the
data.

------
joaoqalves
It looks like someone is going to deal with a lawsuit. Good work, but if I were
one of those players, I'd be very mad.

~~~
s_q_b
Not very likely in American courts given the public figure libel standard.[0]

Essentially, for a public figure (e.g. a politician) to win a libel suit, they
have to prove that the defendant acted with _actual malice._

Actual malice in this case is defined as either actual knowledge that the
statements were false, or reckless disregard for the truth.

A reckless disregard for the truth is more than mere negligence: the person
making the statement must have actual and demonstrable doubt as to the truth
or falsity of the statements made. [1]

It is very difficult to meet that standard.

[0]
[https://en.wikipedia.org/wiki/Public_figure](https://en.wikipedia.org/wiki/Public_figure)

[1] _New York Times Co. v. Sullivan_ (1964)

------
bitL
Tennis is a puzzling sport - you can be up 5:0 and yet lose most of the
following games as well as the match itself. Anyone that plays tennis knows
about this "roller-coaster" (especially visible in WTA). "Unusual" is quite
usual in tennis, honestly. You should take it into account when you do
stats. Obviously, people try to exploit this fact to make believable
predetermined matches, but it's really difficult to tell who is doing it, and who
is simply more "unstable".

------
Sealy
Very clever. I just hope Buzzfeed don't chase you down and force you to take
the list down!

------
dwd
Seems choking is also now cheating.

Tennis is a confidence game: miss a few big shots and you stop going for them.
It's all downhill from there.

We need to see the phone records.

------
kyberias
What is the true motivation for doing this? (publishing the names, that is)

