> A move (apart from castling and pawn promotion) can be specified by giving the original and final squares occupied by the moved piece. Each of these squares is a choice from 64, thus 6 binary digits each is sufficient, a total of 12 for the move. Thus the initial move P-K4 would be represented
> by 1, 4; 3, 4. To represent pawn promotion a set of three binary digits can be added specifying the piece that the pawn becomes. Castling is described by the king move (this being the only way the king can move two squares). Thus, a move is represented by (a, b, c) where a and b are squares and c specifies a piece in case of promotion.
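Shannon's fixed-width scheme is easy to sketch: 6 bits for each square plus 3 bits for the promotion piece, 15 bits in all. A minimal Python version (the square numbering, bit layout, and promotion-piece codes here are my own assumptions, not Shannon's):

```python
def encode_move(from_file, from_rank, to_file, to_rank, promo=0):
    """Pack a move into 15 bits: two 6-bit squares plus 3 promotion bits.
    promo: 0 = no promotion, 1..4 = some fixed ordering of N, B, R, Q."""
    frm = from_rank * 8 + from_file        # square index 0..63
    to = to_rank * 8 + to_file             # square index 0..63
    return (frm << 9) | (to << 3) | promo

def decode_move(code):
    """Unpack the 15-bit code back into (from_file, from_rank, to_file, to_rank, promo)."""
    frm, to, promo = code >> 9, (code >> 3) & 0x3F, code & 0x7
    return divmod(frm, 8)[::-1] + divmod(to, 8)[::-1] + (promo,)
```

With this layout, 1.e4 (the e-pawn from e2 to e4, i.e. file 4, rank 1 to file 4, rank 3) round-trips through a single small integer.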
I'm actually slightly surprised Shannon didn't propose a more compact scheme using the lower entropy of legal chess moves, but I guess his purpose in this article was more to do a ballpark estimate of the feasibility of computer chess playing in general.
A more sophisticated version would use arithmetic coding, with the predictions of the next move initially coming from an opening book / game database, then coming from the engine. The idea being that most games you want to compress are at a high enough level that the engine gives good predictions ... perhaps one could even tune the engine's parameters for better results. But again, like I said, it doesn't sound like fun to code.
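As a rough illustration of why good predictions help: an ideal arithmetic coder spends about -log2(p) bits on a move to which the predictor assigned probability p, so a predictor that gives the played move probability 0.5 on average costs about 1 bit per move. A small sketch of that cost calculation (the probabilities below are made-up examples, not real engine output):

```python
import math

def arithmetic_coded_bits(probs):
    """Ideal arithmetic-coding cost in bits for a game, given the
    probability the predictor assigned to each move actually played.
    Real arithmetic coders come within a fraction of a bit of this total."""
    return sum(-math.log2(p) for p in probs)

# Hypothetical per-move probabilities from a book/engine predictor:
cost = arithmetic_coded_bits([0.5, 0.25, 0.8, 0.1])
```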
A separate comment: I wonder if the time efficiency issues mentioned are really that severe? Since the problem is so small/finite.
I really should have included something about adding an opening book as I thought about that quite a lot. My conclusion was that an opening book is not going to be a really dramatic win. A simple scheme might encode say 64K opening sequences using 2 bytes, and save an average of perhaps 10 (half) moves. So a saving of 10*4 - 16 = 24 bits, spread over an average of 80 (half) moves. So about 0.3 bits per move.
You might question my estimate of 10 half moves max, but it's an educated guess. One thing I've discovered whilst working on my chess database is that the standard tabiya positions are reached by huge numbers of different transposition possibilities. See my blog post at
This means that the standard canonical way of reaching a well known position doesn't serve as a good proxy for the start of all the games that include that position.
Do you have any idea how this would be done? It seems crazy.
The downside is that you lose random access to individual bytes without a lot of surrounding decompression work, but it'd be appropriate for stream processing.
In fact the best compressed size is probably found by reducing some of the clever tricks in the article in order to expose more structure to a general compressor. Similar to running `precomp` or `antiX` before solid-packing multiple already-compressed files.
Just noticed this. The beauty of my "sweet spot" scheme is that processing these 8 bit moves is almost as quick as processing a straightforward normal move representation (32 bits in my code because I store extra information to make moves "undo-able" - although 16 bits is reasonable).

This is important for some of the things I want to do. For example, I am experimenting with the idea of implementing a query to search a database of games for a position by literally playing through every game looking for the position (actually the hash of the position). The standard approach is (I think) to index all the positions in the database - which swells the database a lot. If you do a little math you'll see that you need the move processing to be lightning fast for this to be responsive in a multi-million game database. Move processing involving generating move lists in every position would leave the user waiting minutes (maybe hours!).
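The usual way to get the position hash cheaply while replaying moves is Zobrist hashing, where each (piece, square) pair gets a fixed random key and making a move is just a couple of XORs. A minimal sketch (the piece/square numbering is arbitrary, and this ignores castling rights, en passant, and side-to-move keys that a real implementation would also XOR in):

```python
import random

random.seed(42)  # fixed seed so the keys are reproducible across runs
# One random 64-bit key per (piece type, square): 12 piece types, 64 squares.
ZOBRIST = [[random.getrandbits(64) for _ in range(64)] for _ in range(12)]

def toggle(h, piece, square):
    """XOR a piece on or off a square; the same call adds or removes it."""
    return h ^ ZOBRIST[piece][square]

def apply_move(h, piece, frm, to):
    """Incrementally update the hash for a quiet move: remove the piece
    from its origin square, add it on the destination square."""
    return toggle(toggle(h, piece, frm), piece, to)
```

Because XOR is commutative, two move orders reaching the same position give the same hash, which is exactly what makes hash-based search robust against the transposition problem mentioned elsewhere in the thread.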
So it's not clear what the point of compressing the moves is, especially since the article is concerned at some points with size and at others with speed. If it's just an intellectual exercise then consider the following scheme:
Generate the legal moves for a position, then sort them. However don't sort them using a naive method like alphabetical order. Instead sort them in order of likeliness of being played. For example moves that capture the last moved piece are at the top of the list. So for example 1.e4 d5, now the first move in the list would be exd5, capturing the last moved piece. So the move exd5 can be encoded in 1 bit. Now imagine a 40 move game where every move played was the first one on the sorted list. This takes 80 bits to store the entire game. Of course moves farther down the list take more bits to encode.
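One simple prefix-free realization of this idea is a unary code over the index in the sorted list, so the likeliest move costs a single bit and move k costs k+1 bits. A minimal sketch of just the coding step (move generation and likelihood sorting are assumed to happen elsewhere):

```python
def encode_rank(rank):
    """Unary code: rank 0 -> '1', rank 1 -> '01', rank 2 -> '001', ...
    The most likely move in the sorted list costs a single bit."""
    return "0" * rank + "1"

def decode_ranks(bits):
    """Recover the sequence of ranks from a unary-coded bit string."""
    ranks, run = [], 0
    for b in bits:
        if b == "1":
            ranks.append(run)
            run = 0
        else:
            run += 1
    return ranks

# A game where the played moves were 1st, 1st, 2nd, 4th in the sorted list:
game_bits = "".join(encode_rank(r) for r in [0, 0, 1, 3])
```

The decoder must regenerate and re-sort the legal moves at each position, exactly as the encoder did, so that rank k means the same move on both sides.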
This is similar to one of the schemes in the article, but the article gets hung up on fixed bit sizes rather than just using the exact number of bits each move requires, which results in variable-length codes.
This is, more or less, the scheme Chessbase first used for their data files almost 30 years ago.
If someone created a mobile app containing this database, I would certainly appreciate those multiple gigabytes of data being compressed.
1st move in the list takes one bit (great)
2nd move in the list takes two bits (good)
3rd move in the list takes three bits (okay)
4th move in the list takes four bits (average)
5th move in the list takes five bits (worse than average)
There are many positions with lots of plausible moves, and in such positions your scheme could use many bits. I would estimate your scheme would take around 4 bits per move on average, much like the similar scheme I describe.
You are correct that a good modern database of 5 million plus games stored uncompressed occupies about 5 gigabytes or 1 DVD. Clearly compression is very useful for online distribution, which is what people expect these days. Chessbase compression reduces the 5 gigs to 0.5 gigs, about 10:1.
The scheme I describe is a nice compromise between performance and maximum compression. In my database program I achieve much faster position search than Chessbase. I am trying to achieve similar results without massive position indexes, and my performant compressed move scheme is a key ingredient in my work.
Edit: But as I state in the intro to my article, I am just an amateur playing around - I am not expecting to 'beat' Chessbase. I'll settle for having some innovative aspects to my program.
At http://stackoverflow.com/questions/1831386/programmer-puzzle... I suggested combining the sort-the-moves-by-their-evaluation scheme with arithmetic coding according to statistics from a database of games -- how often do people choose the move the engine picks as best? Etc. (It's always easy to propose work for someone else.) (The third paragraph of that answer is irrelevant to real games, which always start from the same configuration.)
Having said that, I might take a look into this 8 bit format :)
That would eliminate the need for computing swaps while still producing the same move code for a given move in a given position.
What if a player decides to resign before making that move?