
Programming Puzzles, Chess Positions and Huffman Coding - fogus
http://www.cforcoding.com/2009/12/programming-puzzles-chess-positions-and.html
======
lmkg
Arithmetic encoding is usually more space-efficient than Huffman encoding, but
it's not used frequently because it has much higher computational complexity
for very small gain. However, if you are looking for the most space-efficient
representation, it's worth taking a look at.

<http://en.wikipedia.org/wiki/Arithmetic_encoding>

Arithmetic compression differs from Huffman encoding because it encodes the
entire dataset as a whole, rather than symbol-by-symbol. At worst, it will
compress about as well as Huffman does, and it compresses far better on
skewed distributions (such as one symbol having a 75% chance). I think the
commonly cited statistic is about 5% better on average, for some definition
of 'average.'
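To put a number on the 75% case, here is a rough sketch. It assumes a two-symbol alphabet with independent symbols, which is my assumption and not something stated above. Huffman has to spend a whole bit per symbol there, while the entropy (the bound arithmetic coding approaches) is lower.

```python
import math

def entropy_bits(probs):
    """Shannon entropy in bits per symbol for a memoryless source."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Assumed example: two symbols with probabilities 75% and 25%.
probs = [0.75, 0.25]
entropy = entropy_bits(probs)

# With only two symbols, a Huffman code assigns each one a 1-bit codeword.
huffman_bits = 1.0

# Fraction of extra bits Huffman spends compared with the entropy bound.
overhead = huffman_bits / entropy - 1

print(f"entropy: {entropy:.3f} bits/symbol, Huffman overhead: {overhead:.1%}")
```

The gap shrinks as the probabilities get closer to powers of 1/2, which is where Huffman is already optimal.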

For the curious: arithmetic encoding works by taking all possible sequences
and chopping the interval [0, 1) into pieces, one per sequence, where the
size of each piece is the probability of the corresponding sequence. The
encoding of a sequence is then the binary expansion of any number that falls
within the corresponding interval.
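A toy sketch of that interval idea, using exact fractions so rounding is not an issue. The function names, the two-symbol alphabet and its probabilities are all made up for illustration, and the decoder is assumed to know the sequence length in advance. Real arithmetic coders work incrementally with fixed-precision integers instead.

```python
import math
from fractions import Fraction

def interval_for(sequence, probs):
    """Narrow [0, 1) down to the piece assigned to `sequence`."""
    low, width = Fraction(0), Fraction(1)
    for sym in sequence:
        # Total probability of the symbols ordered before `sym`.
        cum = Fraction(0)
        for s in sorted(probs):
            if s == sym:
                break
            cum += probs[s]
        low += width * cum
        width *= probs[sym]
    return low, low + width

def shortest_code(low, high):
    """Shortest bit string b such that binary 0.b lies in [low, high)."""
    bits = 1
    while True:
        k = math.ceil(low * 2 ** bits)
        if Fraction(k, 2 ** bits) < high:
            return format(k, "0{}b".format(bits))
        bits += 1

def decode(code, length, probs):
    """Recover `length` symbols from the number encoded by `code`."""
    x = Fraction(int(code, 2), 2 ** len(code))
    out = []
    for _ in range(length):
        cum = Fraction(0)
        for s in sorted(probs):
            if x < cum + probs[s]:
                out.append(s)
                # Rescale x so the chosen piece becomes [0, 1) again.
                x = (x - cum) / probs[s]
                break
            cum += probs[s]
    return "".join(out)

# Assumed example alphabet: "a" with 75% chance, "b" with 25%.
probs = {"a": Fraction(3, 4), "b": Fraction(1, 4)}
low, high = interval_for("aaab", probs)
code = shortest_code(low, high)
print(low, high, code, decode(code, 4, probs))
```

For "aaab" the interval is [81/256, 108/256) and the code comes out as 3 bits, where a one-bit-per-symbol Huffman code would need 4.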

------
jah
Semi-related - for those who are interested in writing their own chess
programs - the de facto standard method of representing positions (in a human
readable form) is to use Forsyth-Edwards Notation (FEN):
<http://en.wikipedia.org/wiki/Forsyth-Edwards_Notation>

The starting board position in FEN is:
rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
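For anyone who wants to play with it, a minimal sketch of reading such a string into an 8x8 grid. The function name and the returned dictionary layout are my own choices, and it only checks the row lengths, not whether the position is legal.

```python
def parse_fen(fen):
    """Split a FEN string into its six fields and expand the board."""
    placement, side, castling, en_passant, halfmove, fullmove = fen.split()
    board = []
    for rank in placement.split("/"):  # ranks are listed from 8 down to 1
        row = []
        for ch in rank:
            if ch.isdigit():
                row.extend(["."] * int(ch))  # a digit is a run of empty squares
            else:
                row.append(ch)  # uppercase = white piece, lowercase = black
        if len(row) != 8:
            raise ValueError("bad rank in FEN: " + rank)
        board.append(row)
    if len(board) != 8:
        raise ValueError("FEN must describe 8 ranks")
    return {
        "board": board,
        "side_to_move": side,
        "castling": castling,
        "en_passant": en_passant,
        "halfmove_clock": int(halfmove),
        "fullmove_number": int(fullmove),
    }

start = parse_fen("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1")
for row in start["board"]:
    print("".join(row))
```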

------
tudorachim
I'm surprised <http://en.wikipedia.org/wiki/Zobrist_hashing> wasn't mentioned
at all in the article.

~~~
cletus
To be honest with you, I'd never heard of Zobrist hashing, but then again I've
never tried to write a chess engine or any other board game program where it
would be appropriate, for decision trees and such.

I'll stress that the idea of this post isn't to implement a Chess engine but
merely to examine an admittedly artificial problem (much like Code Golf
results in code you usually wouldn't actually use for anything real because
it's typically unreadable even if clever).

Various hashing techniques could speed up comparisons of positions but they
don't allow you to store positions.
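To illustrate that distinction, a rough sketch of Zobrist-style hashing. The board representation (a dict from square index to piece letter), the fixed seed and the function names are assumptions for this example, and it leaves out side to move, castling rights and en passant. The 64-bit result is cheap to compare and to update after a move, but the position cannot be rebuilt from it.

```python
import random

PIECES = "PNBRQKpnbrqk"

# One random 64-bit key per (piece, square); a fixed seed keeps runs repeatable.
rng = random.Random(2009)
ZOBRIST = {(p, sq): rng.getrandbits(64) for p in PIECES for sq in range(64)}

def zobrist_hash(board):
    """XOR together the keys of every occupied square."""
    h = 0
    for sq, piece in board.items():
        h ^= ZOBRIST[(piece, sq)]
    return h

def move_piece(h, piece, src, dst):
    """Update a hash for a non-capturing move without rescanning the board."""
    return h ^ ZOBRIST[(piece, src)] ^ ZOBRIST[(piece, dst)]

# Hypothetical position: squares numbered 0..63, a1 = 0, h8 = 63.
before = {4: "K", 60: "k", 12: "P"}
after = {4: "K", 60: "k", 28: "P"}  # the pawn moved from e2 to e4

h = zobrist_hash(before)
print(move_piece(h, "P", 12, 28) == zobrist_hash(after))
```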

------
wglb
A nice article about how to encode the state of a chess game at any point,
which recognizes that Morse code is a form of, and in some small way
anticipates, Huffman encoding.

------
BearOfNH
_En passant_ only needs 3 bits to encode the file ("column") containing the
eligible pawn. The rank ("row") is inferred from the context -- whose move it
is. (W=5, B=4)

3 bits always suffice to indicate either (a) a pawn _en prise_ (as it were) or
(b) no en passant possible, because there's no enemy pawn on the proper square
for that file. (You can't have enemy pawns on the proper rank in all 8 files
and at the same time have one eligible for en passant capture, since you've
got to have a pawn of the side to move on an adjacent file. So in the usual
case of no en passant you just pick a file with no enemy pawn on the proper
square or, alternatively, a file whose enemy pawn is not adjacent to a pawn
of the side to move. This is always possible.)
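A small sketch of how that 3-bit scheme could work. The representation is invented for the example: `row` is the 8 squares of the rank where a capturable pawn would stand, with "E" for an enemy pawn, "F" for a pawn of the side to move and "." for anything else. It judges eligibility from pawn placement alone, as described above.

```python
def capturable(row, f):
    """True if an enemy pawn on file f has a friendly pawn beside it."""
    if row[f] != "E":
        return False
    return any(0 <= g < 8 and row[g] == "F" for g in (f - 1, f + 1))

def encode_ep(row, ep_file):
    """Return a value 0..7; ep_file is the double-stepped pawn's file or None."""
    if ep_file is not None and capturable(row, ep_file):
        return ep_file
    # No en passant: store any file on which a capture is impossible.
    for f in range(8):
        if not capturable(row, f):
            return f
    # All 8 squares would have to hold enemy pawns AND neighbour a friendly one.
    raise AssertionError("unreachable")

def decode_ep(row, value):
    """Return the en passant file, or None if the stored file cannot be one."""
    return value if capturable(row, value) else None

row = "..EF...."  # enemy pawn on the c-file, friendly pawn on the d-file
print(decode_ep(row, encode_ep(row, 2)))     # pawn on c just double-stepped
print(decode_ep(row, encode_ep(row, None)))  # same pawns, no en passant right
```

The second call stores file 0, which holds no enemy pawn, so the decoder reads it as "no en passant" even though the c-file pawn looks capturable.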

