
Chess Move Compression - nreece
https://triplehappy.wordpress.com/2015/10/26/chess-move-compression/
======
mjn
Although I'm also not very familiar with the history of such schemes, I do
recognize the (not quite) 12-bit one, which I believe was the first scheme
proposed, by Claude Shannon in his 1950 article "Programming a Computer for
Playing Chess" [1]. Perhaps because it's naturally the first scheme anyone
would come up with, but nonetheless, here's what he had to say about it:

> A move (apart from castling and pawn promotion) can be specified by giving
> the original and final squares occupied by the moved piece. Each of these
> squares is a choice from 64, thus 6 binary digits each is sufficient, a
> total of 12 for the move. Thus the initial move P-K4 would be represented by
> 1, 4; 3, 4. To represent pawn promotion a set of three binary digits can
> be added specifying the piece that the pawn becomes. Castling is described
> by the king move (this being the only way the king can move two squares).
> Thus, a move is represented by (a, b, c) where a and b are squares and c
> specifies a piece in case of promotion.
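
A minimal sketch of the 12-bit packing in that quote, assuming squares are numbered rank * 8 + file with a1 = 0 (which matches the 1, 4; 3, 4 example for P-K4). The function names are made up for illustration, and the optional 3 promotion bits are left out:

```python
FILES = "abcdefgh"

def square_index(name):
    """Map algebraic 'e2' to 0..63 with a1 = 0, b1 = 1, ..., h8 = 63."""
    return (int(name[1]) - 1) * 8 + FILES.index(name[0])

def square_name(index):
    """Inverse of square_index."""
    return FILES[index % 8] + str(index // 8 + 1)

def pack_move(src, dst):
    """Pack source and destination squares into 12 bits (6 bits each)."""
    return (square_index(src) << 6) | square_index(dst)

def unpack_move(code):
    """Recover long algebraic form such as 'e2e4' from a 12-bit code."""
    return square_name(code >> 6) + square_name(code & 0x3F)

print(pack_move("e2", "e4"))   # 796
print(unpack_move(796))        # e2e4
```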

I'm actually slightly surprised Shannon didn't propose a more compact scheme
using the lower entropy of legal chess moves, but I guess his purpose in this
article was more to do a ballpark estimate of the feasibility of computer
chess playing in general.

[1]
[http://www.pi.infn.it/~carosi/chess/shannon.txt](http://www.pi.infn.it/~carosi/chess/shannon.txt)

------
bo1024
Very cool. I suspect that much better compression is possible in principle
(not that I'd want to implement it) using an openings book or game database
and an engine. The idea would be to first record the opening played in the
game and the move number at which the game deviates. A lot of work would need
to go into figuring out the optimal opening-book size. Then, use a
deterministic chess engine with predefined parameters at each move, and record
which move number on its suggested list was played (e.g. the top move, second
move, third move, etc) with a fallback to manually encode the move if none of
the top 8 or so moves are played.
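
A rough sketch of that rank-with-fallback idea, under some assumptions of mine: the engine's ranked list is simply passed in (no real engine API is used here), a flag bit separates the two cases, and the fallback is a plain 12-bit source/destination code.

```python
def encode_by_rank(ranked_moves, played, raw_code, top_n=8):
    """Return the bits for one move as a string of '0'/'1'.

    ranked_moves: the engine's suggestions, best first (hypothetical input).
    raw_code:     12-bit fallback code for the move actually played.
    """
    top = ranked_moves[:top_n]
    if played in top:
        # Flag bit 0, then a 3-bit rank (enough for top_n <= 8).
        return "0" + format(top.index(played), "03b")
    # Flag bit 1, then the plain 12-bit code.
    return "1" + format(raw_code, "012b")

print(encode_by_rank(["e2e4", "d2d4", "g1f3"], "d2d4", 0))   # 0001
```

With these widths a predicted move costs 4 bits and a miss costs 13, so the scheme only pays off if the engine's list usually contains the move.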

A more sophisticated version would use arithmetic coding, with the predictions
of the next move initially coming from an opening book / game database, then
coming from the engine. The idea being that most games you want to compress
are at a high enough level that the engine gives good predictions ... perhaps
one could even tune the engine's parameters for better results. But again,
like I said, it doesn't sound like fun to code.
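
To put a number on the arithmetic coding version: an ideal arithmetic coder spends about -log2(p) bits on a move the model gave probability p. The sketch below just adds those costs up; the probabilities are invented for illustration, not measured from any engine or database.

```python
import math

def ideal_bits(probabilities):
    """Total cost in bits if each played move had the given model probability."""
    return sum(-math.log2(p) for p in probabilities)

# Three moves the model rated at 50%, 25% and 12.5% likely.
print(ideal_bits([0.5, 0.25, 0.125]))   # 6.0
```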

A separate comment: I wonder if the time efficiency issues mentioned are
really that severe, since the problem is so small/finite?

~~~
billforsternz
Article author here: I do mention the idea of using a chess engine to improve
compression. I propose a simple scheme and calculate/estimate my scheme would
compress moves in a reasonably played game to about 3.9 bits each on average.
I am sure it's possible to do better, but I suspect you'd hit diminishing
returns before you get close to 3 bits.

I really should have included something about adding an opening book as I
thought about that quite a lot. My conclusion was that an opening book is not
going to be a really dramatic win. A simple scheme might encode say 64K
opening sequences using 2 bytes, and save an average of perhaps 10 (half)
moves. So a saving of 10*4 - 16 = 24 bits, spread over an average of 80 (half)
moves. So about 0.3 bits per move.
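
The same back-of-envelope estimate as a few lines of Python, using the round figures above (4 bits per half move is a rounding of the 3.9 estimate; the variable names are only for illustration):

```python
book_entries = 64 * 1024                        # 64K opening sequences
index_bits = (book_entries - 1).bit_length()    # 16 bits, i.e. 2 bytes
half_moves_replaced = 10                        # guessed average book depth
bits_per_half_move = 4                          # rounded from about 3.9
game_length = 80                                # average half moves per game

saving = half_moves_replaced * bits_per_half_move - index_bits
per_move = saving / game_length

print(saving)     # 24
print(per_move)   # 0.3
```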

You might question my estimate of 10 half moves max, but it's an educated
guess. One thing I've discovered whilst working on my chess database is that
the standard tabiya positions are reached by huge numbers of different
transposition possibilities. See my blog post at

[https://triplehappy.wordpress.com/2013/11/13/statistics-and-transpositions/](https://triplehappy.wordpress.com/2013/11/13/statistics-and-transpositions/)

This means that the standard canonical way of reaching a well known position
doesn't serve as a good proxy for the start of all the games that include that
position.

~~~
howeman
In the "Checkers is solved" paper, they state "The complete 10-piece databases
contain 39 trillion positions (Table 1). They are compressed into 237
gigabytes, an average of 154 positions per byte!"

Do you have any idea how this would be done? It seems crazy.
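
For scale, a quick check of what that ratio means per position. This uses the rounded 39 trillion figure and assumes "gigabytes" means 2**30 bytes, so it only roughly reproduces the quoted 154:

```python
positions = 39e12                 # rounded figure from the quote
size_bytes = 237 * 2**30          # ASSUMPTION: binary gigabytes

positions_per_byte = positions / size_bytes
bits_per_position = 8 / positions_per_byte

print(round(positions_per_byte))      # 153
print(round(bits_per_position, 3))    # 0.052
```

So each position costs only about a twentieth of a bit on average.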

~~~
mappu
Probably just standard compression techniques - the linked article is mostly
discussing a standalone move, but there are a lot more options available when
working with large strings of text (dictionaries, BWT, arithmetic coding,
...).

The downside is a lack of individual byte accessing without a lot of
surrounding decompression work, but it'd be appropriate for stream processing.

In fact the best compressed size is probably found by reducing some of the
clever tricks in the article in order to expose more structure to a general
compressor. Similar to running `precomp` or `antiX` before solid-packing
multiple already-compressed files.

------
slm_HN
Chess move compression was an interesting topic back when the games were
stored on 360k floppy disks. Nowadays every master chess game ever played in
the history of chess fits easily on one DVD, uncompressed.

So it's not clear what the point of compressing the moves is, especially since
at some points the article is concerned about size and at others about speed.
If it's just an intellectual exercise then consider the following scheme:

Generate the legal moves for a position, then sort them. However, don't sort
them using a naive method like alphabetical order. Instead, sort them in order
of likelihood of being played. For example, moves that capture the last moved
piece go at the top of the list. So after 1.e4 d5, the first move in the list
would be exd5, capturing the last moved piece, and exd5 can be encoded in 1 bit.
Now imagine a 40 move game where every move played was the first one on the
sorted list. This takes 80 bits to store the entire game. Of course, moves
farther down the list take more bits to encode.

This is similar to one of the schemes in the article, but the article gets
hung up on fixing bit sizes rather than just using the exact number of bits
required for each move, which results in variable bit lengths for each move.

This is, more or less, the scheme Chessbase first used for their data files
almost 30 years ago.
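
A sketch of one variable-length code that behaves as described (1 bit for the top move, more bits further down the list). It uses Elias gamma coding on the move's rank; that choice is my assumption for illustration, not a claim about what Chessbase actually did.

```python
def elias_gamma(rank):
    """Prefix-free code for a 0-based rank in the sorted move list."""
    binary = format(rank + 1, "b")
    # Leading zeros announce how many bits follow the first 1.
    return "0" * (len(binary) - 1) + binary

for rank in range(4):
    print(rank, elias_gamma(rank))
# 0 1
# 1 010
# 2 011
# 3 00100

# 40 moves each side, all of them the top-ranked move:
print(sum(len(elias_gamma(0)) for _ in range(80)))   # 80
```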

~~~
steveridout
> Nowadays every master chess game ever played in the history of chess fits
> easily on one DVD, uncompressed.

If someone created a mobile app containing this database, I would certainly
appreciate those multiple gigabytes of data being compressed.

------
abecedarius
That 8-bit code is clever.

At
[http://stackoverflow.com/questions/1831386/programmer-puzzle-encoding-a-chess-board-state-throughout-a-game/1838023#1838023](http://stackoverflow.com/questions/1831386/programmer-puzzle-encoding-a-chess-board-state-throughout-a-game/1838023#1838023)
I suggested
combining the sort-the-moves-by-their-evaluation scheme with arithmetic coding
according to statistics from a database of games -- how often do people choose
the move the engine picks as best? Etc. (It's always easy to propose work for
someone else.) (The third paragraph of that answer is irrelevant to real
games, which always start from the same configuration.)

------
bjornorn
Related: 100TB of all the possible 7-piece chess board states and their
solutions
[http://tb7.chessok.com/about-site](http://tb7.chessok.com/about-site)

------
fabriceleal
I've been doing some spare-time chess programming on a GUI myself (non-intensive
on-off work for 1 year so far) and I decided to stay with the naive method (12
bits, straightforward src and dest squares) for compressing moves. It's
obviously easier and faster to implement, I'm not hostage to whatever wacky bugs
I might have introduced, and it's straightforward to parse and to format as long
algebraic form (e2e4), which is the format UCI engines (Stockfish, for instance)
like to receive, and to store to a file.

Having said that, I might take a look into this 8 bit format :)

------
level3
Very interesting! But I don't quite see why you need to bother with tracking
pieces and swapping them. Can't you just go by the convention that pawn 0 is
the first pawn you find when scanning across from a1 to h8, pawn 1 is the
second pawn, etc.? Similarly for the knights, bishops, and rooks? (obviously
still using the fallback when necessary)

That would eliminate the need for computing swaps while still producing the
same move code for a given move in a given position.
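
Roughly what I have in mind, as a sketch. The board here is a hypothetical dict from square index (a1 = 0 ... h8 = 63) to a piece letter, uppercase for white and lowercase for black; the real program surely stores things differently.

```python
def piece_ordinal(board, square):
    """Number of identical pieces found before `square` when scanning a1..h8."""
    piece = board[square]
    return sum(1 for sq in range(square) if board.get(sq) == piece)

# White pawns on a2, b2, e2 and a black pawn on e7.
board = {8: "P", 9: "P", 12: "P", 52: "p"}
print(piece_ordinal(board, 12))   # 2, so the e2 pawn is "pawn 2"
print(piece_ordinal(board, 52))   # 0
```

Note that the loop walks over every square before the piece, once per move encoded.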

~~~
billforsternz
This would work but it requires a scan of the whole board for pawn (and knight
and rook) moves. I wanted (and eventually got - after a lot of mistakes along
the way) a system which would use negligible CPU cycles for almost all moves.

~~~
level3
That makes sense. I'm probably underestimating the amount of cycles saved by
your swap method.

------
meta_AU
I'd suggest a slight change. Code all the pawns into 2 'pieces' and have a
special piece for promoting. The 16 moves for the special piece can be an
index from a list of possible promotions which is easier to generate and
canonicalise than the total move list. This frees up 5 pieces for promoted
queens. You could special case the king into the spare entropy in the rooks
and bishops to save one more piece.

~~~
billforsternz
I don't think that would work (but possibly I misunderstand!). The system
cannot cope with (up to, if all 8 pawns stood on the 7th rank) 8 more pieces!
The fact that the queen needs 1 more 'piece' already requires a certain amount
of ingenuity to work around.

------
breakingcups
"An amusing point is that some moves really would require zero bits – this
happens when there is only one legal move in the position, there’s no need to
store anything at all in that case."

What if a player decides to resign before making that move?

~~~
billforsternz
You would need to indicate the number of moves in the game at the start.
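
A small sketch of why the count is enough, assuming each move is stored as a fixed-width index into the legal move list and the game starts with a 10-bit move count (both widths are just assumptions for the example):

```python
def index_width(num_legal):
    """Bits needed to pick one of num_legal moves; 0 when the move is forced."""
    return (num_legal - 1).bit_length()

def encode_game(choices, count_bits=10):
    """choices: list of (index_played, num_legal_moves), one per half move."""
    bits = format(len(choices), "0{}b".format(count_bits))
    for index, num_legal in choices:
        width = index_width(num_legal)
        if width:
            bits += format(index, "0{}b".format(width))
    return bits

# One move from 20 options, then a forced reply (zero bits for the move itself).
print(encode_game([(3, 20), (0, 1)]))   # 000000001000011
# Same game, but the player resigned instead of making the forced reply.
print(encode_game([(3, 20)]))           # 000000000100011
```

The two games differ only in the header, which is what tells the decoder whether the forced move was actually played.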

------
PepeGomez
Why does every move have to occupy the same number of bits? That looks
inefficient to me. Why not make the movement of pawns occupy fewer and the
queen more bits?

~~~
billforsternz
In the worst case pawns can be quite demanding - because of underpromotion -
there can be up to 12 moves available to a pawn - more than a knight or a
king. Of course it is possible to design a scheme where moves take a variable
number of bits - I discuss some promising methods in the article. But you will
always need 8 bits for some moves (information theory says so) and the beauty
of my "sweet spot" scheme is that it is simple and quick, whilst still offering
reasonable compression.
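
The counting behind those two numbers, as a sketch. The figure of 218 legal moves is the commonly cited record for a single position; it is not from the article, so treat it as an assumption here.

```python
# A pawn on the 7th rank: push or capture to either side, each with 4 promotions.
pawn_destinations = 3
promotion_choices = 4      # queen, rook, bishop, knight
max_pawn_moves = pawn_destinations * promotion_choices

max_knight_moves = 8       # a knight attacks at most 8 squares

# ASSUMPTION: 218 is the commonly cited maximum number of legal moves.
max_legal_moves = 218
bits_for_worst_case = (max_legal_moves - 1).bit_length()

print(max_pawn_moves)        # 12
print(bits_for_worst_case)   # 8
```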

------
mherrmann
Interesting. What are the practical implications of using 8 vs 12 bits? Does
using fewer bits allow chess engines to compute more moves ahead?

~~~
billforsternz
No, chess engines use the simple highly performant "native" move
representations. My 8 bit scheme approaches the performance of native
representation, but is only (potentially) useful for other types of
applications, particularly chess databases.

------
hartror
Love the disclaimer at the start.

~~~
billforsternz
Article author here: Hope you are not being sarcastic/ironic. I thought the
disclaimer was necessary and complete :-)

~~~
leni536
I think you are much more modest than necessary, not a bad thing though.

