
The unreasonable effectiveness of Soccermatics? (2017) - mpweiher
https://www.interaliamag.org/articles/david-sumpter-unreasonable-effectiveness-soccermatics/
======
dpleuler
Funny to stumble across this on Hacker News.

I'm the analyst mentioned in the article. Happy to answer any questions about
our weird little industry.

~~~
r2d2-c3po
From following various soccer blogs, it seems that defensive stats aren't as
polished and explored as offensive ones. I'm curious what stats beyond things
such as tackles/interceptions are looked at. For example, Maldini has the
reputation of being one of the top defenders ever, but also is known for his
quote "If I have to tackle, then I have already made a mistake" (paraphrased).
His tackling stats seem to support that style of play in that he made fewer
challenges than most. Perhaps he made tons of interceptions, and in that
sense, never had to tackle? I'm unable to find a good source of his playing
stats.

Anyways, what sort of things do you look at for defensive players? When I look
at something like WhoScored's statistical team of the season, it includes
players such as Mustafi, who generally has a negative reputation for his play.
I suspect he rates so highly because WhoScored's rating metric overvalues the
offensive contributions of defenders, whereas pundits are more likely to look
at a defender's defensive contributions. Are there any 'advanced stats' for
defenders, beyond the basic measured stats of challenges, interceptions, etc.,
that you and/or the industry look at?

[https://www.whoscored.com/Statistics](https://www.whoscored.com/Statistics)

~~~
dpleuler
Defensive metrics are very difficult for a couple of reasons.

The first problem is the data. The soccer viewing public is largely familiar
with event-level data, typically provided by (my previous employer) Opta.
They've done a great job normalizing soccer statistics on the cultural level,
but the information they've collected at scale isn't that useful for creating
good defensive metrics.

Other companies have sensed an opportunity here and have started providing
more detailed data around things such as defensive pressure. Suddenly, you can
contextualize each offensive event with the level of defensive pressure
applied to it. I think this will be a game changer, but we're in early days
there.

Other companies provide player-tracking solutions that give you real-time
position of all players on the field. This is great because you have a
"complete" picture of the game, but it requires a lot of work to build more
sophisticated spatial/geometric models.
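
As a rough illustration of the simplest kind of spatial model you could build on tracking data, here is a sketch that scores the pressure on the ball in a single frame. The frame layout, the `pressure_on_ball` name and the 3-metre radius are all assumptions made up for this example, not any provider's actual format or definition.

```python
import math

def pressure_on_ball(ball_xy, defenders_xy, radius=3.0):
    """Return (distance to nearest defender, number of defenders within radius).

    Coordinates are assumed to be in metres on a flat pitch; the radius is an
    arbitrary illustrative threshold, not an industry standard.
    """
    distances = [math.dist(ball_xy, d) for d in defenders_xy]
    nearest = min(distances) if distances else math.inf
    close = sum(1 for d in distances if d <= radius)
    return nearest, close

# Hypothetical single tracking frame: ball position plus three defenders.
frame = {
    "ball": (50.0, 30.0),
    "defenders": [(52.0, 30.0), (60.0, 35.0), (50.0, 33.0)],
}
nearest, close = pressure_on_ball(frame["ball"], frame["defenders"])
```

Real models need much more than this (player velocity, body orientation, passing lanes), which is where the work comes in.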

There's also the "Howard Effect", coined largely as a basketball term, but
it's similar to the Maldini example you provided. Some defenders are so good
that they don't have to be "active" defenders. That's something which is
really difficult to adequately control for.

~~~
r2d2-c3po
Thanks! Also, what are some sources for stats that would be available for free
online? I've looked at some 538 stuff, some Statsbomb stuff, whoscored, and
football-data.

~~~
dpleuler
Legally, there isn't much. Most of those sites are powered by Opta data, which
is exclusively for-purchase.

Statsbomb is a little different. They've started collecting their own data and
are offering some free data from various women's leagues.

~~~
r2d2-c3po
Thanks! Appreciate your comments

~~~
thom
This repo should grow each week, and eventually catch up to the real world in
terms of NWSL data:

[https://github.com/statsbomb/open-data](https://github.com/statsbomb/open-data)

------
patrickk
For those interested in this sort of thing, 'packing' and 'impect' are also
interesting to read about. They were developed in Germany but haven't yet
spread into the mainstream the way more traditional stats like possession, pass
completion %, or meters run have. Those traditional stats very often do not
explain why team A or B actually won the game, so two new statistical measures
were developed that explain it better.

Packing is the measurement of how many opposition players are beaten by a pass
(or dribble or other move).

Impect is the number of deep-lying defenders beaten, which is obviously more
valuable than beating high-pressing strikers.

The key insight is that the number of opponents beaten, particularly
deeper-lying defenders, is the measurement needed to identify who is expected
to win (and therefore to assess the value of passing players, or defenders).
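
To make the two counts concrete, here is a minimal sketch under a strong simplification: positions are reduced to one coordinate along the pitch, and an opponent counts as beaten if they were goal-side of the ball before the pass and are behind it afterwards. The `Opponent` type, the `is_defender` flag and the function names are hypothetical, and the real metrics are more nuanced than this.

```python
from typing import List, NamedTuple

class Opponent(NamedTuple):
    x: float           # position along the pitch, increasing toward the goal being attacked
    is_defender: bool  # hypothetical role flag, assumed to come from lineup data

def bypassed(start_x: float, end_x: float, opponents: List[Opponent]) -> List[Opponent]:
    """Opponents who were ahead of the ball before the pass and are behind it after."""
    return [o for o in opponents if start_x < o.x < end_x]

def packing(start_x: float, end_x: float, opponents: List[Opponent]) -> int:
    """Number of opponents taken out of the game by the pass (or dribble)."""
    return len(bypassed(start_x, end_x, opponents))

def defenders_beaten(start_x: float, end_x: float, opponents: List[Opponent]) -> int:
    """Same count restricted to defenders, in the spirit of the second measure."""
    return sum(1 for o in bypassed(start_x, end_x, opponents) if o.is_defender)

# Made-up opposition shape: one striker, two midfielders, two defenders.
opposition = [
    Opponent(40.0, False),
    Opponent(55.0, False),
    Opponent(62.0, False),
    Opponent(75.0, True),
    Opponent(80.0, True),
]
forward_pass = (50.0, 78.0)   # start and end x of a completed pass
backward_pass = (50.0, 45.0)
```

With these made-up numbers the forward pass bypasses two midfielders and one defender, while the backward pass bypasses nobody.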

[http://bundesligafanatic.com/20160610/impect-packing-the-fut...](http://bundesligafanatic.com/20160610/impect-packing-the-future-of-football-analytics-is-here/)

~~~
m_myers
I'm not that familiar with soccer stats, but this section needs a reply:

> A “shutdown” cornerback like Richard Sherman can be a star in the NFL thanks
> to interceptions, broken up plays and tackles. “Lock down NBA defenders”
> like Bruce Bowen and Dennis Rodman can prove their worth with steals, blocks
> and rebounds. Football has goals and assists, that’s it.

Just as in soccer, those "counting stats" are not great measures of defensive
ability. The recently retired cornerback Darrelle Revis was regarded as the
best defensive player in the NFL from about 2009-2011. Yet he did not rack up
interceptions or pass breakups -- in fact, in 2010, he had 0 interceptions and
only 9 pass breakups over 13 games. Why? Because the receivers he was covering
were never open, so quarterbacks rarely attempted to pass in his direction.

Similarly, steals, blocks, and rebounds are only a vague indicator of
defensive ability; it's never a bad thing to get a steal or a block, but if
you routinely leave your man to try to poke the ball away from someone else's
man, you're likely hurting your team overall. The NBA has been working on
developing better stats, including deflections (you get your hand on a pass
but don't necessarily come away with the steal) and shots defended (you're
within a short distance of a shooter). But Daryl Morey, general manager of
the Houston Rockets and a well-known stats nerd, has said in a Reddit AMA that
no publicly available defensive statistic is useful.

Part of this is because all three sports in question are team sports, which
means a great deal depends on the defensive scheme. You may not block shots,
but is it your job to block shots or to stop the ballhandler from getting near
the basket? You don't get tackles, but is it your job to get tackles or to
funnel the ballcarrier right into the linebackers?

It's extremely difficult for any outsider to determine the defensive
effectiveness of a player. The only thing we can offer is guesswork.

~~~
QuercusMax
Sounds very similar to the problem of accurately assessing performance of
software developers working as a team.

Sure, you have people who write a ton of code and implement a lot of features,
and fix a lot of bugs, who clearly are contributing a lot. But you also have
people who, through other means (code reviews, refactoring and other
code-health work, etc.), ensure that a project is maintainable and sustainable.

How do you measure the value of 100 bugs that never made it to production
because of high quality code-reviews? Or those 5 high-value features which
were a snap to implement because somebody took the time to clean up all the
cruft from Mr Rockstar Bro who made a gigantic mess?

------
vanderZwan
> _The fact that non-mathematicians can produce sharp and correct criticism of
> an applied mathematician’s work shows us that the success of mathematics
> does not rest on its abstract beauty. The ability of researchers not versed
> in the subtleties of mathematics to help develop models contradicts Wigner’s
> ‘unreasonable effectiveness’ argument. It tells us that people not trained
> in mathematics are also able to give deep insights to the subject.
> Mathematics is not there to be discovered, it is part of the patterns of
> reasoning in all of our brains._

The key insight. It kind of reminds me of how physicists like Robbert
Dijkgraaf (I think) have argued that physics and other fields are now
contributing to mathematics through their own outlandish needs for weird
maths.

~~~
lucideer
I found this article a bit odd; in my mind, some parts seemed to contradict
others in strange paradoxical ways.

For example, from your quoted section:

> _Mathematics is not there to be discovered, it is part of the patterns of
> reasoning in all of our brains._

Isn't that the very definition of elegance? He later says, talking about A.J.
Ayer's definition of "non-sense":

> _Wigner freely admits that his idea about maths comes from a feeling that
> can’t be verified by known scientific methods_

Elegance, while there is often a wide collective understanding of what it
means in mathematics, is essentially a subjective, aesthetic property; an
intuitive one. Is this not exactly what:

(a) Wigner means in that quote,

(b) A.J. Ayer means in his definition, AND

(c) the author refers to as "patterns of reasoning in all of our brains"?

~~~
goldenkey
We like compression. That's our main requirement. We may not like hard coded
constants in our physical theories but that's only because we'd prefer a single
constant that uncompresses to all six. Or a single law about laws that
uncompresses to the entire theory. Constructor theory [1]

We also like symmetry but symmetry is just a way to identify attack surfaces
for compression.

[1] [https://www.edge.org/conversation/david_deutsch-constructor-...](https://www.edge.org/conversation/david_deutsch-constructor-theory)

------
severine
Is this an article or a book teaser?

(DDG !libgen soccermatics)

Ah, looks very good!

~~~
thom
There are two editions of Soccermatics, and his new book Outnumbered.

------
codethief
Related research on basketball:
[https://www.physics.umass.edu/events/2015-09-09-statistics-b...](https://www.physics.umass.edu/events/2015-09-09-statistics-basketball-scoring-and-lead-changes)
(arXiv: [https://arxiv.org/abs/1503.03509](https://arxiv.org/abs/1503.03509))

I had the pleasure of attending the mentioned colloquium by Sid Redner back in
2015 and I was absolutely blown away by the effectiveness of a simple random-
walk model.
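
For anyone who wants to play with the idea, here is a toy version, not the model from the talk or the paper: I assume every scoring event is worth one point and goes to either team with equal probability, so the lead is an unbiased random walk. The function name and the numbers of scoring events and games are arbitrary choices for illustration.

```python
import random

def simulate_game(n_scores, rng):
    """Simulate the lead as an unbiased random walk.

    Returns (final_lead, lead_changes). A lead change is counted whenever the
    team in front differs from the team that was last in front; ties do not
    count as changes on their own.
    """
    lead = 0
    last_leader = 0  # +1 or -1 for the team that last held a lead, 0 before anyone leads
    lead_changes = 0
    for _ in range(n_scores):
        lead += rng.choice((1, -1))
        if lead != 0:
            leader = 1 if lead > 0 else -1
            if last_leader != 0 and leader != last_leader:
                lead_changes += 1
            last_leader = leader
    return lead, lead_changes

rng = random.Random(0)  # fixed seed so repeated runs give the same games
results = [simulate_game(100, rng) for _ in range(1000)]
mean_lead_changes = sum(changes for _, changes in results) / len(results)
```

Comparing the distribution of `lead_changes` from a simulation like this against real game logs is the kind of check the talk was about, though the real analysis handles point values and scoring rates more carefully.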

------
everdev
> The ability of researchers not versed in the subtleties of mathematics to
> help develop models contradicts Wigner’s ‘unreasonable effectiveness’
> argument. It tells us that people not trained in mathematics are also able
> to give deep insights to the subject.

This would seem to be consistent with the notion of "common sense" that it's
possible even for an average person to be able to ask meaningful questions and
cast doubt on subjects they're not specifically trained in.

------
syastrov
Newton developed the mathematics of calculus to solve physics problems. That
goes against Wigner’s hypothesis. I believe that it is difficult to pinpoint
where inspiration comes from. However, mathematical rigor can often help. You
could also argue reductionistically that, since we all live in a physical
universe, we are always inspired by something physical.

------
brootstrap
interesting read. some of the passing diagrams and such are quite interesting.
arsenal fan fyi, COYG and FOYS (ef off) spuds fans. with wenger gone arsenal
will overtake =)

from a blog post, shows passing networks and has some vector fields overlaid
on the soccer pitch.
[https://www.fourfourtwo.com/features/soccermatics-how-mesut-...](https://www.fourfourtwo.com/features/soccermatics-how-mesut-ozil-so-good-and-why-wenger-relies-ramsey)

------
IshKebab
"The unreasonable effectiveness" is the new "considered harmful". Be more
creative!

~~~
alanbernstein
You should write an article called 'The unreasonable effectiveness of "The
unreasonable effectiveness" considered harmful'

