

The evolution of chess: Game lengths and outcomes - rhiever
http://www.randalolson.com/2014/05/24/a-data-driven-exploration-of-the-evolution-of-chess-match-lengths-and-outcomes/

======
dfan
This is all pretty standard knowledge in chess circles, but it's always nice
to see someone taking an interest in analyzing chess stats. I was a bit thrown
by the misuse of chess terminology, though.

In chess, a "match" is a series of games between two players. The correct term
for an individual game is a "game".

"Plays per person" is known as simply "moves". If each player has made 37
moves, that's 74 half-moves, or 74 ply (in the context of lookahead in the
game tree by a computer).

Most analysts, when studying performance, assign scores of 1 and 0 to White
and Black when White wins, 0 and 1 when Black wins, and 1/2 point to each when
there's a draw. It is unusual not to count draws when analyzing performance.
Counting them would decrease White's apparent advantage. For example, I just
checked all the 2013 games from Chessbase's Megabase and White scored 53.4%
with draws counted as a half point for each side.

By the way, one of the reasons that draws have decreased in the last 30 years
is probably the advent of faster time controls. I also suspect one reason that
games have gotten longer is the death of adjournments.

~~~
rhiever
I revised the wording in the post to reflect your wording suggestions. Thanks!
The visualizations are a pain to replace, but I'll get the terminology right
on those in future posts.

To me, it doesn't make sense to assign a draw as 1/2 to Black and White.
Technically, neither side won, so why not just throw that game out when
looking at which side wins more often? That's what I do in this analysis,
although I show the full breakdown in the final area plot.

~~~
monkeypizza
Consider this hypothetical situation:

Take a game in which, for evenly matched players in a match of 1000 games, 9
are wins for White, 1 is a win for Black, and 990 are draws. Would it really be
reasonable to say that White has a massive 90% advantage? I don't think so:
in that situation White has only a tiny average advantage per game, so
winning such a match would mostly depend on real skill differences.
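
To make the arithmetic in this hypothetical concrete, here is a minimal Python
sketch comparing the two scoring conventions discussed in this thread. The
function names are made up for illustration, and the numbers are the
hypothetical ones above, not real data.

```python
from fractions import Fraction

def white_score_with_draws(white_wins, black_wins, draws):
    """White's score when a draw counts as half a point for each side."""
    games = white_wins + black_wins + draws
    return Fraction(2 * white_wins + draws, 2 * games)

def white_score_decisive_only(white_wins, black_wins):
    """White's share of decisive games, with draws thrown out."""
    return Fraction(white_wins, white_wins + black_wins)

# Hypothetical match: 9 White wins, 1 Black win, 990 draws.
with_draws = white_score_with_draws(9, 1, 990)
decisive_only = white_score_decisive_only(9, 1)
print(float(with_draws))     # 0.504
print(float(decisive_only))  # 0.9
```

Counting draws gives White 50.4%, while dropping them gives 90%, which is the
gap the hypothetical is meant to expose.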

------
jrpt
Translation of terms: Play -> ply (or half-move) Match -> game

The data doesn't support the conclusion that chess is becoming more defensive.
It could be that time controls or new rule changes have changed the play. For
instance, one proposed rule change is to not allow draws before move 50, which
would definitely impact the number of moves.

~~~
rhiever
> The data doesn't support the conclusion that chess is becoming more
> defensive.

That's a good point. I weakened the wording to make it clear that I'm
speculating.

~~~
jrpt
Relatedly, if you want to analyze PGNs, I put together some Python code that
can analyze PGNs in combination with a chess engine. It can also draw chess
positions, if you find interesting positions you want to programmatically
draw:

[https://github.com/pickhardt/chessbox](https://github.com/pickhardt/chessbox)

I'm kind of hoping someone will do a thorough analysis of the types of moves
chess grandmasters make during a game. For instance, here is a chart from a GM
book that categorized moves into four categories:
[http://imgur.com/BMOElXN](http://imgur.com/BMOElXN)

~~~
rhiever
Nice! I've been needing a tool like this for these chess analyses. Do you have
docs showing how to use the scripts?

~~~
jrpt
Not really, it was mostly made for my own use then thrown on Github in case
someone wanted it. It works for most games, but isn't guaranteed to be able to
parse all games. I added example code to the readme.

------
jcrites
> Note: In all of the following plots, the white line is the mean and the
> shaded blue area is the 95% confidence interval [for the mean of moves-per-
> game]

Is confidence interval the right statistical concept to use when
characterizing this data set? It seems like it would be more meaningful to
examine how various percentiles of "moves per game" change over time rather
than plotting a confidence interval around the mean.

In particular, the mean moves-per-game that he displayed is (I assume) the
actual mean of his data set. It's not an estimate of the mean of some larger
population of games, if I'm understanding the article correctly. So what does
it mean to display a confidence interval for a parameter that's known rather
than estimated?

It seems more interesting to treat the data set as the population and analyze
how it changes over time. For example, I'd be interested to see a graph of
5th, 50th, and 95th percentile of match length. I notice that the confidence
interval is shrinking around his mean -- is that really reflecting that
percentiles are converging on the mean because of changes in play style? Or
have I misunderstood and the data set is being treated as a random sample of
some larger population of chess games? (In that case, since there are more
games in his data set in later years, it is unsurprising for the confidence
interval to shrink. Whereas if the percentiles of match length are converging
on the mean, then I think that's pretty interesting.)
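
As a rough sketch of the percentile view suggested here, the following Python
computes the 5th, 50th, and 95th percentiles of moves per game for each year.
The `moves_by_year` data and the function name are invented for illustration;
this is not the article's data set or code.

```python
import statistics

def length_percentiles(lengths):
    """Return the 5th, 50th and 95th percentiles of a list of game lengths."""
    # n=20 yields 19 cut points at 5%, 10%, ..., 95%; "inclusive" treats
    # the list as the whole population rather than a sample.
    cuts = statistics.quantiles(lengths, n=20, method="inclusive")
    return cuts[0], cuts[9], cuts[18]

# Made-up moves-per-game data keyed by year, for illustration only.
moves_by_year = {
    1900: [22, 25, 31, 38, 40, 44, 57, 61],
    2000: [30, 33, 36, 39, 41, 42, 45, 48],
}
for year, lengths in sorted(moves_by_year.items()):
    print(year, length_percentiles(lengths))
```

If the 5th and 95th percentiles move toward the median over time, that would
point to a real change in the spread of game lengths rather than just a larger
number of recorded games.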

~~~
rhiever
> In particular, the mean moves-per-game that he displayed is simply the
> actual mean of the data set. It's not an estimate of that mean, if I'm
> understanding the article correctly. So what does it mean to display a
> confidence interval for a parameter that's known rather than estimated?

Technically, I don't have the "true" mean because this data set isn't a set of
_all games ever_. As such, I'm treating the games I have as a sample of all
games ever, and providing an estimate of how confident I am in the mean I'm
reporting.

> I notice that the confidence interval is shrinking around his mean -- is that
> because there are more recorded games

Exactly! Law of large numbers.

~~~
jsweojtj
I have to admit that I (also?) was expecting the shaded region to give
something like the standard deviation about the mean, not the confidence in
the mean.
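
The difference between the two can be sketched in a few lines of Python. This
assumes a normal-approximation interval (mean ± 1.96 standard errors), which
may not be what the article used, and the "game lengths" are randomly
generated rather than real.

```python
import math
import random
import statistics

def mean_ci_half_width(sample, z=1.96):
    """Half-width of a normal-approximation 95% CI for the mean."""
    return z * statistics.stdev(sample) / math.sqrt(len(sample))

random.seed(0)
# Made-up game lengths drawn from one fixed distribution.
population = [random.gauss(40, 15) for _ in range(100_000)]

for n in (100, 10_000):
    sample = population[:n]
    spread = statistics.stdev(sample)         # stays roughly the same as n grows
    half_width = mean_ci_half_width(sample)   # shrinks like 1 / sqrt(n)
    print(n, round(spread, 1), round(half_width, 2))
```

The standard deviation describes how much individual games vary, while the
interval around the mean narrows simply because more games were recorded.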

------
primelens
If moves from all these games are available I am wondering if ngram type
analysis could be used to tease out the emergence of new opening strategies
etc. Given enough games, someone could build a markov-chain "Kasparov
simulator" or something as well... :-)
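
A minimal sketch of what such an n-gram count could look like in Python,
assuming each game is already available as a list of moves in algebraic
notation. The games, function names, and the 10-ply opening cutoff are all
made up for illustration.

```python
from collections import Counter

def ngrams(moves, n):
    """All runs of n consecutive moves in a move list."""
    return [tuple(moves[i:i + n]) for i in range(len(moves) - n + 1)]

def count_opening_ngrams(games, n=2, opening_plies=10):
    """Count n-grams over the first `opening_plies` half-moves of each game."""
    counts = Counter()
    for moves in games:
        counts.update(ngrams(moves[:opening_plies], n))
    return counts

# Made-up games, each a list of half-moves.
games = [
    ["e4", "e5", "Nf3", "Nc6", "Bb5"],
    ["e4", "e5", "Nf3", "Nc6", "Bc4"],
    ["d4", "d5", "c4", "e6", "Nc3"],
]
counts = count_opening_ngrams(games, n=2)
print(counts[("e4", "e5")])  # 2
```

Running the same count separately per decade would show when a given sequence
starts to appear, and the bigram counts are also the raw material for a
first-order Markov chain over moves.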

~~~
rhiever
I'm actually looking at specific moves right now! Hope to have something to
report on by the end of the weekend.

------
JacobAldridge
One of my friends (kerno) and I went through a phase when we were living in
London of playing a LOT of chess against each other. What became apparent was
that he was excellently skilled at openings; my experience was much more honed
finding a kill in the end game. While the statistics are now lost, it became
obvious to me at some point that if I could extend the game length beyond
15-20 moves then I was almost guaranteed to win.

I will also put on the record that kerno once 'pantsed' me - creating a
checkmate in about 8-10 moves wherein I failed to capture a single piece of
his. It was an epic highlight among a series of long games (one went to 57
moves) where I was triumphant.

------
toolslive
Just a remark on terminology: isn't this 'game' length instead of 'match'
length, and 'move' instead of 'play'? A match refers to a series of consecutive
games between the same 2 players (e.g. a world championship match).

------
dfan
By the way, I would be interested to see a graph of the average rating of the
players by year, as well as what the other graphs look like when controlled
for rating. I wouldn't be surprised if the explosion of the data set in recent
times is partially due to many more games between lesser players being
recorded, and this could certainly have an impact on the stats.

~~~
rhiever
Here's a distribution of Elo scores in the data set:
[http://www.randalolson.com/wp-content/uploads/chess-elo-scor...](http://www.randalolson.com/wp-content/uploads/chess-elo-score-dist-plot.png)

This data set is from chess tournaments, so it's predominantly games with
skilled players.

~~~
dfan
What I was wondering specifically was how the average rating varies by year.
My hypothesis was that it has gone down over time as it has become easier for
games between less skilled players to get into databases.
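
Something like the following Python would be enough to check that, assuming
each game record carries a year and both players' ratings (with `None` where a
rating is missing). The record layout and the sample values are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def mean_rating_by_year(games):
    """Map year -> mean of both players' ratings, skipping unrated players."""
    ratings = defaultdict(list)
    for year, white_elo, black_elo in games:
        for elo in (white_elo, black_elo):
            if elo is not None:
                ratings[year].append(elo)
    return {year: mean(values) for year, values in sorted(ratings.items())}

# Made-up records: (year, White's rating, Black's rating).
games = [(1975, 2500, 2400), (1975, 2600, None), (2010, 2300, 2100)]
by_year = mean_rating_by_year(games)
print(by_year)  # {1975: 2500, 2010: 2200}
```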

~~~
rhiever
Here's a quickly made chart to show that:
[http://i.imgur.com/lHbVzXM.png](http://i.imgur.com/lHbVzXM.png)

(I don't have enough games w/ Elos pre-1960 to show a reliable mean.)

~~~
dfan
Thanks. FIDE didn't implement ratings until 1970 anyway. (I'm not sure what
the ratings you have for the 1960s actually are - perhaps USCF ratings.)

------
Brajeshwar
I'm just a casual Chess Player. Can someone point me to the new rules of Chess
mentioned multiple times in the comments?

------
bane
I wouldn't necessarily assume that games would be shorter because of better
efficiency, but that the central optimization has been piece preservation and
high level players have become very good at it. This creates the defensive
play style that is seen in the modern game.

------
jds375
If you wanna interactively explore chess games recorded over the past few
centuries, check them out here:

[http://database.chessbase.com/js/apps/onlinedb/](http://database.chessbase.com/js/apps/onlinedb/)

------
gregschlom
> With the advent of computers in the mid-1900s [...]

Slightly off-topic but... wow! I didn't see this coming. It feels like the
advent of the computer age is officially part of History now.

------
jdale27
What accounts for the big decline in match length between 1950 and 1970?

~~~
jdoliner
I have a guess. In 1952 FIDE reworked the rule about draws by agreement
removing a clause which prevented players from agreeing to a draw before move
30. This is something that isn't mentioned in the article and I think really
should be touched upon if the author wants to claim that longer chess matches
imply chess has become more defensive. A lot of games are draws by agreement
(hopefully the article could also tell us how many of those there are) and so
a longer average game could just mean that players are waiting longer to agree
to a draw.

Edit: Citation
[http://en.wikipedia.org/wiki/Draw_by_agreement](http://en.wikipedia.org/wiki/Draw_by_agreement)

~~~
Isofarro
This is a good point and find, but it is incomplete; it needs the context too.

The missing factor is that FIDE took control of the World Championship title
in 1946, and implemented a regular cycle of Zonal tournaments, Interzonal
tournaments, candidates tournaments to determine a challenger to the existing
World Champion starting in 1949/1950. (The candidates tournament was changed
to candidates matches after 1963 - see Bobby Fischer's complaint "The Russians
have fixed World Chess":
[http://sportsillustrated.cnn.com/vault/article/magazine/MAG1...](http://sportsillustrated.cnn.com/vault/article/magazine/MAG1074080/)
)

Before the war, major chess tournaments were sponsored by wealthy chess
patrons. FIDE represented amateur chess players; I think their crowning
achievement was their nomination of Max Euwe (who described himself as an
amateur; he was a full-time teacher) as Alekhine's challenger in 1935 (Euwe
went on to beat Alekhine and become the 5th World Champion). FIDE had no
hold over top-level chess. (Edit: FIDE also suggested Bogolyubov as Alekhine's
challenger, twice. In all cases FIDE could not force Alekhine to accept, but
Alekhine did accept these challenges - possibly out of financial necessity)

FIDE's World Championship cycles from 1949 onwards greatly increased the
number of chess tournaments titled players could participate in, globally.
The Zonal and Interzonal tournaments in particular brought together thousands
of chess players regularly in - at the start - 3 year cycles.

So immediately, the number of tournaments involving titled players that
adhered to FIDE rules jumped up substantially, and since the World Chess
Championship cycle became the main top-tier events, other chess events
naturally standardised on FIDE rules, and gave rise to hundreds of FIDE-rules
tournaments.

So yes, the removing of the short-draw avoidance rule, in the context of a
massive and new qualification cycle for the World Championship title gave rise
to the side effect of shorter games.

Before the war, in patron sponsored top-level chess tournaments, each
tournament had its own set of playing rules. They were generally the same, a
modification here and there based on previous tournament experiences of
sponsors and influential players. There wasn't a consistent set of rules
(until FIDE in 1946). For instance, in the Nottingham 1936 tournament book by
Alekhine, he writes that they dropped the no-short-draw rules because it was
becoming more evident players were circumventing that rule anyway, so it
wasn't stopping non-competitive games from happening.

And the World Championship matches were predominantly decided by the current
title holder, and mostly about whether it was financially worthwhile to the
title holder to risk his title on a match with a sponsored challenger.

In turn, I wonder whether the tailing off of that in 1970 was the introduction
of the Elo rating system and the Fischer-effect leading to the
professionalisation, financial viability and the commercialisation of chess?

Maybe a rating system infused more competitiveness into tournaments, more
professional application from the players.

Also, the tailing off of game length happened right when Fischer returned to
the qualification cycle, where his spectacular series of results culminated in
winning the World Championship in 1972 against Spassky in Reykjavik, starting
the fracturing of the Russian hegemony of World chess.

Fischer also pushed for chess professionalism from a commercial/financial
point of view. More money started to flow into chess because of Fischer and
his financial demands / quality expectations. Perhaps that started the
conversion of chess from a state-run Russian-owned speciality to
commercial/sponsored prestigious tournaments (e.g. Montreal 1979, San Antonio
1972, Milan 1975). And in conjunction with a rating system, compelled players
to be more motivated towards playing for the full point, to improve their
international standings, to improve the quality of tournament invitations,
leading to more financial stability.

Somewhere between the Russian dominance of the World Chess Championship, up to
Spassky losing to Fischer in 1972, and the multi-million dollar
Karpov-Kasparov matches from 1985 onwards, elite chess became a financially enriching
career path for more and more players. 1970 may have predicated that, and the
gradual rise of the length of games a side-effect of money being pumped into
chess and initially the Fischer-effect.

------
unknownian
New players to chess are introduced to classical and Romantic games from the
1800s, and while I'm nowhere near very good, it seems harmful to show off
these games as quality when they are often based on large inaccuracies. That
being said, when I follow current top 10 player matches the players are so
focused on playing the right move to maintain or develop an advantage that
there is less sense of excitement or sharpness than in the games I've seen
from 40-50 years ago. Perhaps that's why the famous players from then will be
immortalized more deeply than prodigies like Caruana. Still, the advent of
computers really changed the game, perhaps for worse.

~~~
S4M
I wouldn't advise beginners to study today's elite games, because the opening
preparation is incredibly deep and what you see is only the tip of the
iceberg, meaning that it's impossible to really understand what's going on
without deeply studying the opening theory, which is not what you'd want a
beginner to do.

------
aurelianito
Does anyone know about a similar analysis of Go? There are records of games
dating back hundreds of years too.

~~~
rhiever
I just found this research paper on Go:
[http://biorxiv.org/content/early/2014/05/19/005223](http://biorxiv.org/content/early/2014/05/19/005223)

