
Monte Carlo instead of Alpha-Beta for chess programs? - S4M
https://en.chessbase.com/post/monte-carlo-instead-of-alpha-beta
======
dragontamer
Monte-Carlo seems more akin to how humans play games.

Alpha-Beta pruning is the classical computer science algorithm: easier to
understand, describe, and analyze... but very inhuman. It's an exhaustive
search that examines every line of play out to a fixed depth.
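For contrast, here's a minimal alpha-beta sketch in Python. The `children` and `evaluate` callbacks are hypothetical stand-ins for move generation and a static evaluator, not any real engine's API:

```python
def alphabeta(pos, depth, alpha, beta, maximizing, children, evaluate):
    """Exhaustive fixed-depth minimax with alpha-beta cutoffs.
    `children(pos)` yields successor positions; `evaluate(pos)` returns
    a static score (both are hypothetical callbacks for this sketch)."""
    moves = children(pos)
    if depth == 0 or not moves:
        return evaluate(pos)
    if maximizing:
        best = float('-inf')
        for child in moves:
            best = max(best, alphabeta(child, depth - 1, alpha, beta,
                                       False, children, evaluate))
            alpha = max(alpha, best)
            if alpha >= beta:
                break          # prune: the opponent won't allow this line
        return best
    best = float('inf')
    for child in moves:
        best = min(best, alphabeta(child, depth - 1, alpha, beta,
                                   True, children, evaluate))
        beta = min(beta, best)
        if alpha >= beta:
            break              # prune from the minimizer's side
    return best
```

For example, on the tiny tree `[[3, 5], [2, 9]]` (internal nodes as lists, leaves as scores) the maximizer gets 3: the opponent answers the left branch with 3 and the right branch with 2.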

Monte-Carlo has a lot of variations. Classically, the original MCTS algorithms
for Go would play all the way to the end of a game before searching other
parts of the tree. In effect, it's a depth-first search: you play a line until
you reach a conclusion (potentially a "bad" conclusion). The reasoning is
simple: on average, a lot of "samples" of playthroughs will lead to a better
estimate of the win/loss probability. Think of it as a random sample of the
win/loss potential of a move.
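That random-sampling idea can be sketched in a few lines of Python. This uses a made-up toy game (players alternately add 1 or 2 to a running total; whoever reaches exactly 10 wins), not chess, just to keep the rollout loop visible:

```python
import random

# Toy game, an assumption for illustration (not chess): players alternately
# add 1 or 2 to a running total; whoever reaches exactly TARGET wins.
TARGET = 10

def random_rollout(total, player_to_move):
    """Play uniformly random moves to the end of the game; return the winner."""
    while True:
        # pick 1 or 2 at random, but never overshoot the target
        move = random.choice([1, 2]) if total + 2 <= TARGET else TARGET - total
        total += move
        if total == TARGET:
            return player_to_move
        player_to_move = 1 - player_to_move

def rollout_winrate(total, player, n_samples=5000):
    """Classic MCTS leaf evaluation: average many random playthroughs."""
    wins = sum(random_rollout(total, player) == player
               for _ in range(n_samples))
    return wins / n_samples
```

From a total of 9 the player to move always wins (the winrate is exactly 1.0); from 8 a random player wins about half the time, and the sample average converges to that probability as `n_samples` grows.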

EDIT: Cleaned up the above paragraph.

The MCTS implemented by AlphaZero is different, however. It uses a neural net
to find "interesting" positions to guide the search, and the same neural net
also evaluates the position (who is winning or losing). It seems like a very
good way to have a single neural network perform double-duty (aside from the
two output heads): most of the input layers / early hidden layers are
"recycled" between the "explore" function and the "evaluation" function.
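A rough sketch of that double-duty structure, with made-up layer sizes and random untrained weights in plain Python (this is not AlphaZero's actual architecture, just the shared-trunk / two-head shape):

```python
import math, random

random.seed(0)
# Hypothetical sizes, just to make the sketch concrete.
N_FEATURES, HIDDEN, N_MOVES = 8, 16, 4

def mat(rows, cols):
    return [[random.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]

W_trunk = mat(N_FEATURES, HIDDEN)   # shared layers, "recycled" by both heads
W_policy = mat(HIDDEN, N_MOVES)     # policy head: which moves look interesting
W_value = mat(HIDDEN, 1)            # value head: who is winning, one scalar

def matvec(v, W):
    """v @ W for a plain list vector and list-of-rows matrix."""
    return [sum(vi * wij for vi, wij in zip(v, col)) for col in zip(*W)]

def evaluate(position):
    """One forward pass yields both a move distribution and a value."""
    h = [math.tanh(x) for x in matvec(position, W_trunk)]  # shared computation
    logits = matvec(h, W_policy)
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    policy = [e / s for e in exps]                 # softmax over moves
    value = math.tanh(matvec(h, W_value)[0])       # scalar in (-1, 1)
    return policy, value
```

The point of the shape: one trunk pass, two cheap heads, so "explore" and "evaluate" share almost all of the computation.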

EDIT: I don't know how AlphaZero gets its "exploration" network trained up,
however. It's possible that the original randomized MCTS algorithm (searching
randomly... which, by the way, performs far better than you'd expect...) is
used to "bootstrap" the exploration weights.

So MCTS just naturally works very well with neural nets.

~~~
java-man
> Monte-Carlo seems more akin to how humans play games.

I don't think random choice is how humans play chess, or games in general. I
would say it's pattern memory: one learns to play by building a hierarchy of
patterns that lead to a prediction of the outcome, thus making it possible to
select the outcomes that lead to success.

~~~
dragontamer
Hmmm, you're right in that humans don't play randomly. But Monte-Carlo in this
instance meant "MCTS", and not "random play". I apologize for the imprecision.

MCTS is very methodical despite the name, especially in the AlphaZero
implementation, where there's no random play at all during inference / during
games (!!). So where did the "Monte Carlo" name come from?

The "original" MCTS algorithm used random play to estimate the strength of a
position. But the "real" theory behind MCTS is the math of multi-armed bandit
problems. Given a large number of slot machines (in Chess or Go, the positions
are conceptually the "slot machines" in a casino), the goal of MCTS is to find
the slot machine that gives the highest probability of winning.
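The standard bandit-style selection rule here is UCB1, which the original UCT formulation of MCTS is built on. A minimal sketch, where each candidate move is a `(wins, visits)` pair:

```python
import math

def ucb1_select(nodes, exploration=1.4):
    """Pick the child to explore next, UCB1-style: exploit high observed
    win rates, but add a bonus for rarely-visited moves.
    `nodes` is a list of (wins, visits) pairs."""
    total_visits = sum(v for _, v in nodes)
    def score(node):
        wins, visits = node
        if visits == 0:
            return float('inf')       # always try untried moves first
        return (wins / visits +
                exploration * math.sqrt(math.log(total_visits) / visits))
    return max(range(len(nodes)), key=lambda i: score(nodes[i]))
```

With `[(5, 10), (9, 10)]` it exploits the 90% arm; add an unvisited `(0, 0)` arm and it explores that first. The exploration bonus shrinks as a move accumulates visits, which is exactly the slot-machine tradeoff described above.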

In effect, classic MCTS rolls out a position all the way to the final result:
win, loss, or draw. Not so in AlphaZero, but this is how MCTS was originally
designed. AlphaZero shortcuts this process with a neural network that
estimates the endgame result.

But still: the math, and the precise order in which the search tree is
explored, is very heavily grounded in multi-armed bandit theory and random
samples. Even if the algorithm in practice doesn't use randomization, all the
math and understanding of the search tree is derived from randomized ideals.

I think MCTS represents well the thinking of the typical human expert. Every
line is instinctively read all the way to the end of the game. Experts study
the pawn-king-rook endgame not necessarily because they expect to see pawn-
king-rook endgames... but because that endgame is of huge importance to
middle-game pawn development.

Learning to see how middle-game (and even early-game) moves develop into
endgame is pretty much what being an expert is about. And MCTS best emulates
that kind of thinking.

True, chess is very tactical. But humans are generally bad at exhaustive
searches and tactics. Humans win with positional play, and with better
understanding of endgame positions.

----------

So rest assured, when I say "Monte Carlo", I'm not talking about a typical
Monte Carlo algorithm. I'm talking about Monte Carlo Tree Search, which has a
very, very different theoretical basis from Alpha-Beta pruning.

~~~
java-man
thank you very much for the explanation!

------
EventH-
Why does this article emphasize Komodo MCTS and completely ignore LeelaZero
which has had much more exciting and interesting results as far as
'alternative method' engines go? This is especially strange given that Leela
operates similarly to AlphaZero which the article strongly praises.

~~~
dmurray
The "article" is an ad for Komodo. That's also why it includes 'ten times
faster progress than the "normal" Komodo version'.

To be fair, Komodo MCTS is an exciting and new development, and relevant
because it can run on a typical desktop machine without a GPU, but you can't
expect this piece to be intellectually honest or rigorous.

------
LukeWalsh
I would love to use AlphaZero. The article makes it sound like Komodo (which
uses similar techniques to AlphaZero) could also beat Stockfish (since
AlphaZero beat Stockfish). But Stockfish still beats Komodo and all other
engines according to the Computer Chess Championship, which uses equal and
limited compute for all entrants.

The 5/2 blitz Computer Chess Championship is currently live (with both
Stockfish and Komodo competing): [https://www.chess.com/computer-chess-championship](https://www.chess.com/computer-chess-championship)

~~~
dmurray
You can run LeelaZero, which is weaker than AlphaZero but neck and neck with
Stockfish. It lost 50.5-49.5 in the recent TCEC championship [0], which sounds
like it is what you are referring to. (It historically has used equal
hardware, but that has been complicated by the emergence of GPU engines).

What I'd like is a shim that acts as a UCI engine but actually relays moves
to/from LeelaZero on a remote machine. Chessbase offers something like this,
but as a paid service on a proprietary protocol.

[0][https://en.wikipedia.org/wiki/TCEC_Season_14](https://en.wikipedia.org/wiki/TCEC_Season_14)

~~~
LukeWalsh
Yeah, TCEC is one. I also think the chess.com Computer Chess Championship
(CCC) is interesting (Stockfish is winning there as well); the CCC uses more
advanced hardware. Both are live now and interesting to watch. I posted the
CCC link above; TCEC is here:
[https://tcec.chessdom.com/](https://tcec.chessdom.com/)

A server-based championship would also be interesting but pretty degenerate. I
guess that's why computers playing more complex real-time games is becoming
the bleeding edge.

TCEC Hardware
--------------
CPU rig:
- CPUs: 2 x Intel Xeon E5-2699 v4 @ 2.8 GHz
- Cores: 44 physical
- Motherboard: Supermicro X10DRL-i
- RAM: 64 GB DDR4 ECC
- SSD: Crucial CT250M500 240 GB
- Chassis: Supermicro
- OS: Windows Server 2012 R2

GPU rig:
- GPUs: 1 x 2080 Ti + 1 x 2080
- CPU: Quad Core i5 2600k
- RAM: 16 GB DDR3-2133
- SSD: Samsung 840 Pro 256 GB

CCC Hardware
--------------
CPU rig:
- CPUs: 2 x Intel Xeon Platinum 8168 @ 2.70 GHz, 33 MB L3
- Cores: 48 physical (96 logical)
- RAM: 256 GB DDR4-2666 ECC Registered RDIMM
- SSD: 2 x Crucial MX300 (1 TB) in RAID1
- OS: Windows Server 2016

GPU rig:
- GPUs: 4 x Tesla V100 (64 GB GPU memory)
- CPU: Intel Xeon E5-2686 v4 @ 2.30 GHz
- Cores: 16 physical (32 virtual)
- RAM: 256 GB

