
Alphazero crushes Stockfish in 1,000 game match - stillsut
https://www.chess.com/news/view/updated-alphazero-crushes-stockfish-in-new-1-000-game-match
======
AndyNemmity
"CCC fans will be pleased to see that some of the new AlphaZero games include
"fawn pawns," the CCC-chat nickname for lone advanced pawns that cramp an
opponent's position."

The "fawn pawn" naming comes from fans of kingscrusher on youtube who analyzes
Leela Chess Zero games.
[https://www.youtube.com/user/kingscrusher](https://www.youtube.com/user/kingscrusher)

His accent makes "thorn pawn" sound like "fawn pawn" when he says it, and so
the name stuck.

Here is a link to shirts he sells with "fawn pawn" on them.
[https://teespring.com/fawn-pawn?tsmac=store&tsmic=kingscrush...](https://teespring.com/fawn-pawn?tsmac=store&tsmic=kingscrusher#pid=211&cid=5291&sid=front)

------
sinuhe69
I feel compelled to repeat the previous criticisms of the games:

- Stockfish was never designed for nor tested on so many cores. Running on 44
cores may degrade the performance of Stockfish.

- Stockfish was designed to start a game with an opening book. In games with
an opening book, Stockfish won significantly more games, too.

- Stockfish 8 was not a particularly good implementation. Stockfish 9 or 10
would have been a better choice.

Nevertheless, the performance of AlphaZero was impressive; in particular, the
positional knowledge it has acquired is second to none. In all existing chess
engines, positional knowledge is under-represented through simple heuristics.
Acquiring positional knowledge has been a dream of chess programmers for many
generations. The dream was to create an engine which plays a more human-like
style of chess. AlphaZero has realized this dream and even goes beyond it:
extending human knowledge of chess.

I believe the most intriguing question right now is why AlphaZero stopped
improving after 9 hours of training. Is it due to an inherent property of chess
or due to the limits of ANNs? If it's the latter, how can we break through and
create a new generation of engines that can surpass even AlphaZero?

~~~
forgot-my-pw
Perhaps at the highest level of chess, almost all games will be a draw. So
there's not much progress left to be made other than playing a variant that
has fewer draws, like Shogi?

------
dragontamer
I'm glad that they took a lot of criticisms to heart this time.

The opening book and Syzygy Tablebases were enabled, so we're seeing Stockfish
go at full power here. The only last bit of problem is that Stockfish's
scaling isn't very good. But there's not much that the admins of the test can
do about that.

This test seems fair IMO.

------
maxander
Do we know what hardware each was using? Aside from time, a criticism of the
previous AlphaZero/Stockfish match was that AlphaZero was using a tremendous
amount of TPU power while Stockfish was running on, essentially, an average
laptop.

~~~
SethTro
They used the same config as TCEC.

"Stockfish was configured according to its 2016 TCEC world championship
superfinal settings: 44 threads on 44 cores (two 2.2GHz Intel Xeon Broadwell
CPUs with 22 cores), a hash size of 32GB, syzygy endgame tablebases, at 3 hour
time controls with 15 additional seconds per move"

~~~
marcinzm
For comparison, AlphaZero used 4 TPUs and the same number of cores. So more
computing power but not absurdly so.

~~~
why_only_15
And it's literally true that the TPU has more computing power than a CPU in
terms of e.g. flops but part of the reason why NNs are so powerful is that
their operations can effectively be parallelized whereas Stockfish is
optimized for CPUs that can efficiently branch and do smaller, more flexible
operations.

------
foobaw
How did AlphaZero lose 6 games? Do we have an analysis of those? I'd love to
see what happened.

~~~
SethTro
They are available in the appendix but here's one of them:
[https://lichess.org/IVW0T4Jd](https://lichess.org/IVW0T4Jd)

~~~
im3w1l
Kinda strange stockfish won it since it looked like stockfish pushed for draw
via repetition multiple times with alphazero declining.

~~~
dmurray
AlphaZero seems to hate repetitions, whether in won, drawn, or lost positions. I
don't understand why, given that it isn't explicitly programmed to have
"contempt" like most engines.

~~~
david-gpu
Most chess engines have "contempt"? What does it do and what purpose does it
serve?

~~~
detaro
It influences how the engine values situations that could lead to draws. If
playing against an (assumed) weaker opponent, it's likely good to avoid draws
when there's probably a good chance to win. In reverse, trying to lock in a
draw is better than likely losing later.

Since repetition is one way to get a draw, an engine with positive contempt
(it assumes the opponent to be weaker than itself) will score repetitive moves
lower and is more likely to pick something else.
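
A rough sketch of the idea, not how any particular engine implements it: the
function name, the move names, and the centipawn numbers below are all made up
for illustration. Moves that lead to a repetition are scored as a draw, and
contempt shifts what a draw is "worth" to the engine.

```python
from typing import Dict, Set


def choose_move(scores: Dict[str, int], drawing_moves: Set[str], contempt: int) -> str:
    """Pick the best move after applying a contempt adjustment.

    scores        -- hypothetical search scores in centipawns, from the
                     engine's point of view (higher is better for the engine)
    drawing_moves -- moves that would repeat the position (i.e. lead to a draw)
    contempt      -- positive: opponent assumed weaker, so a draw counts as
                     slightly bad; negative: opponent assumed stronger, so a
                     draw counts as slightly good
    """
    adjusted = {}
    for move, score in scores.items():
        if move in drawing_moves:
            # A plain draw would score 0; contempt moves it away from 0.
            adjusted[move] = -contempt
        else:
            adjusted[move] = score
    return max(adjusted, key=adjusted.get)


# Slightly worse position (-10) vs. repeating the position for a draw.
scores = {"Qe2": -10, "Qd1": 0}
repeats = {"Qd1"}

print(choose_move(scores, repeats, contempt=0))    # takes the repetition
print(choose_move(scores, repeats, contempt=20))   # avoids the repetition
```

With zero contempt the draw (0) beats the slightly worse alternative (-10).
With a contempt of 20 the draw is valued at -20, so the engine plays on.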

------
mindgam3
Previous discussion from a submission by a DeepMind engineer here:
[https://news.ycombinator.com/item?id=18620978](https://news.ycombinator.com/item?id=18620978)

------
sytelus
More detailed article:
[https://deepmind.com/blog/alphazero-shedding-new-light-grand...](https://deepmind.com/blog/alphazero-shedding-new-light-grand-games-chess-shogi-and-go/)

Paper:
[http://science.sciencemag.org/content/362/6419/1140/tab-pdf](http://science.sciencemag.org/content/362/6419/1140/tab-pdf)

------
Mtinie
> According to DeepMind, AlphaZero uses a Monte Carlo tree search, and
> examines about 60,000 positions per second, compared to 60 million for
> Stockfish.

The previous statement was talking about how much faster and more efficient
AlphaZero is, but the interpretation I pick up from that sentence is the
opposite. Is this a “golf score” situation where lower is better?

~~~
bjterry
When Stockfish evaluates a position, it explores moves to a greater depth (a
greater number of plays ahead), and it estimates the value of the final board
arrangements it can reach using a relatively simple heuristic. AlphaZero
evaluates the different potential moves using the neural network, which guides
the search with a very complex heuristic that implicitly encodes a tremendous
amount of depth from the prior games the model was trained on. Similar to the
way an image recognition model
takes in a whole image and says "this is an image of a goat," AlphaZero takes
in a whole board and says "this is a winning board."

~~~
dragontamer
> Similar to the way an image recognition model takes in a whole image and
> says "this is an image of a goat," AlphaZero takes in a whole board and says
> "this is a winning board."

IIRC, AlphaZero has two outputs from the neural network. You described the
first output.

The 2nd output was absolutely critical to it growing in strength. In effect,
this 2nd output value is the difference between AlphaGo and AlphaZero. The 2nd
output value guides the monte-carlo tree search.

Naive MCTS looks at board positions randomly. AlphaZero's MCTS looks at board
positions the neural network deems "interesting". In effect, the neural
network both guides the search (output #2), and evaluates the position (output
#1).

MCTS chooses a position based off of the "interesting factor", as well as "how
much that position has been evaluated". Ex: if "Knight to c3" has been
evaluated 1-million times, MCTS will try to look at other positions. But if
the neural network says that "Knight to c3 is really, really interesting",
MCTS will still favor looking at that position, more so than other positions.

Etc. etc. down the hierarchy of moves.
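
A minimal sketch of that selection step, assuming the PUCT-style rule
described in the AlphaGo Zero / AlphaZero papers. The dictionary layout, the
function names, and the exploration constant `c_puct=1.5` are my own
assumptions for illustration, not DeepMind's code.

```python
import math


def puct_score(q_value, prior, parent_visits, child_visits, c_puct=1.5):
    """Average value so far plus an exploration bonus.

    The bonus grows with the network's prior ("how interesting") and shrinks
    as the move accumulates visits ("how much it has been evaluated").
    """
    bonus = c_puct * prior * math.sqrt(parent_visits) / (1 + child_visits)
    return q_value + bonus


def select_move(children, c_puct=1.5):
    """children maps move -> {"prior": float, "visits": int, "value_sum": float}."""
    parent_visits = sum(child["visits"] for child in children.values())

    def score(move):
        child = children[move]
        # Unvisited moves have no average value yet; treat it as 0.
        q = child["value_sum"] / child["visits"] if child["visits"] else 0.0
        return puct_score(q, child["prior"], parent_visits, child["visits"], c_puct)

    return max(children, key=score)


# Same prior, same average value: the less-visited move gets looked at next.
print(select_move({
    "Nc3": {"prior": 0.5, "visits": 50, "value_sum": 0.0},
    "e4":  {"prior": 0.5, "visits": 5,  "value_sum": 0.0},
}))

# A strong prior keeps the search on Nc3 even though it already has visits.
print(select_move({
    "Nc3": {"prior": 0.9,  "visits": 10, "value_sum": 5.0},
    "a3":  {"prior": 0.02, "visits": 0,  "value_sum": 0.0},
}))
```

The same rule is applied again at each child node, which is the "down the
hierarchy of moves" part.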

------
clks
"Crushes _an old version_ of Stockfish." The current Stockfish 10 is said to
have a >100 Elo advantage over Stockfish 8.
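
For a sense of scale, here is the standard Elo expected-score formula as a
small sketch. The function name is my own, and the 100-point gap is just the
figure quoted above, not a measured result.

```python
def expected_score(elo_diff):
    """Expected score per game (win = 1, draw = 0.5, loss = 0) for the
    stronger side, given its rating advantage in Elo points."""
    return 1.0 / (1.0 + 10 ** (-elo_diff / 400.0))


print(round(expected_score(100), 2))  # about 0.64
print(expected_score(0))              # 0.5 for equally rated players
```

So a 100 Elo edge corresponds to scoring roughly 64% of the available points
in a head-to-head match.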

~~~
tim333
It says they had almost the same results against Stockfish 9. That said, it
would be interesting to see a fairer competition, with Stockfish set up by an
opposing team and maybe the same budget for cloud compute. Given that they run
on different hardware, that might be a way to do it.

------
rurban
Still, Houdini is again leading on the CCCC live championship table, with
Stockfish at #2, and the open-source AlphaZero clone lc0 at #3.

[https://www.chess.com/computer-chess-championship](https://www.chess.com/computer-chess-championship)

------
wslh
Will AlphaZero be available to more chess players to play? It would be
interesting to find a blind spot in this engine in a format where humans could
use their brains and more tools trying to beat it. Or is it really unbeatable?

~~~
remify
Chess AI has been "unbeatable" for a long time now.

------
chrisMyzel
What would happen if you let AlphaZero compete against AlphaZero?

~~~
nkurz
Personally, in that case I'd probably bet on AlphaZero to win. (that was a
joke)

It might be an interesting question what the appropriate bet would be for win
vs draw in this case, and how this would change with greater training.
Presumably the more you train both sides the more likely they are to draw?

Also interesting would be to quantify the effect a small hardware handicap
has, and how this trades off with training. Is more training always better
than more hardware? Vice versa?

~~~
forgot-my-pw
> Personally, in that case I'd probably bet on AlphaZero to win. (that was a
> joke)

Then I would bet on a draw.

~~~
nkurz
Unless you think they are "perfectly" matched, in which case you might bet on
White to win because of a hypothesized first player advantage. I think it's
still an open theoretical question as to what would happen with two equally
matched perfect players:
[https://en.wikipedia.org/wiki/First-move_advantage_in_chess](https://en.wikipedia.org/wiki/First-move_advantage_in_chess)

------
nisuni
Shouldn’t we take into account the computing power used to train Alphazero?

I feel the comparison is a bit unfair...

~~~
FartyMcFarter
Stockfish's evaluation/search heuristics are also tuned with a lot of CPU
power:

[http://tests.stockfishchess.org/tests](http://tests.stockfishchess.org/tests)

Plus it has all of the knowledge from past human/computer chess research,
experimentation and tuning that's been done in other chess engines since the
70s helping it.

~~~
nisuni
That’s a misleading comparison.

Stockfish would still be an extremely strong engine even without the training.
Alphazero couldn’t even move a single piece without having been trained
extensively.

