
Reading Chess (1990) - slbenfica
http://fermatslibrary.com/s/reading-chess
======
cmontella
I took a couple courses by Baird at Lehigh before he retired. He used to teach
"Pattern Recognition" as it was called before "Machine Learning". His approach
was decidedly old school, but really that was the whole point; a lot of the
trends in computing are cyclic, and if you focus on the fundamentals you have
context for where new trends fit in.

One of his biggest take-aways in building systems like the one in the linked
paper was the need for layered understanding. For example, to understand the
meaning of a passage of scanned text, you can't just look at each individual
letter of the text. There is ambiguity in characters like l and I or 0 and o
and O. To remove that ambiguity, you bring in more context by looking at a
character in a word. Then a word within a sentence. Then a sentence within a
passage. This is how they were able to build highly accurate systems without
massive amounts of training data.

Now that I'm on a roll recollecting the teachings of Henry Baird I will add
two more:

1) There is a difference between an algorithm and a heuristic that is often
overlooked these days: algorithms are provably correct. If you can't prove
it's correct, then it's a heuristic. By this measure, there are very few
algorithms out there, because there are very few proofs of correctness. Most
likely, anything you've ever called an algorithm that you've written yourself
is actually a heuristic.

2) Because of the above, Baird was very fond of KNN classifiers, because they
are one of the few classifiers that have any sort of provable guarantees --
namely that with infinite data, the error rate is guaranteed to be no worse
than 2x the Bayes error.

------
Nzen
tl;dr web rendered case study of a computer vision system by K Thompson and H
Baird. They used ocr plus word level heuristics to digitize an encyclopedia of
chess games/moves. By modeling each game during a second page of parsing, they
were able to constrain the possible interpretations of questionable
characters/words to legal moves. They claim this reduced their error rate 30
fold.

