
Google's AlphaZero Beats Stockfish In 100-Game Match - tambourine_man
https://www.chess.com/news/view/google-s-alphazero-destroys-stockfish-in-100-game-match
======
teilo
>This would be akin to a robot being given access to thousands of metal bits
and parts, but no knowledge of a combustion engine, then it experiments
numerous times with every combination possible until it builds a Ferrari.

No, not even close. First, AlphaZero started with the rules of chess, chess
pieces, and a chess board. Second, the possible moves are several orders of
magnitude fewer than the steps needed to build a working car out of parts.

Closer would be: Here's a car. Here are all the tuneable parameters. Make it
as fast or as efficient as possible. But that would still be inordinately more
complex than grokking chess.

~~~
JustAnotherPat
>Second, the possible moves are several orders of magnitude fewer than the
steps needed to build a working car out of parts.

Isn't the number of chess moves, for practical purposes, near infinite? A car
can be made only in so many ways, but a chess game can be won under scenarios
that outnumber the atoms in the universe.

~~~
allenz
Chess has 32 pieces, each in one of at most 65 positions (64 squares or
captured), so the state space is bounded by roughly 65^32. A car has far more
than 32 parts, each with far more than 65 possible positions.

There are at most a few hundred legal moves in any chess position (around 35
on average), and only a handful of rules. The possible actions and rules for
building cars are literally all of physics and engineering.
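
A quick back-of-the-envelope check of those bounds, as a minimal sketch (crude
upper bounds only; most placements are illegal, but the comparison survives):

    # Each of 32 pieces sits on one of 64 squares or is captured: 65 options.
    state_space = 65 ** 32
    atoms_in_universe = 10 ** 80      # commonly cited estimate
    shannon_games = 10 ** 120         # Shannon's estimate of possible chess games

    print(len(str(state_space)) - 1)          # 58, i.e. state space ~ 10^58
    print(shannon_games > atoms_in_universe)  # True: games outnumber atoms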

~~~
JustAnotherPat
I didn't read it that way. You're essentially reading it as the monkeys will
eventually write Shakespeare scenario.

The way I read it was, you have a robot and it:

1. knows it has to build a Ferrari (implying it knows what it looks like, at
least)

2. knows nothing about building, parts, etc.

3. can engage in some sort of learning behavior

Once a robot knows it can only connect two parts by screwing them together, it
will only try to screw them together, and it might even transfer that behavior
over to other parts.

The possible scenarios become much more limited as the robot gets smarter.

~~~
allenz
You're probably imagining an assembly line, where one step follows another in
a logical progression and there doesn't seem to be much of a branching factor.
However, each step along that line is the product of many deep optimizations.
It is more complicated than you think; thousands of people work on it.
Consider just the choice of a screw, probably one of the simplest steps in
building a car. These 24 pages only scratch the surface of the choices
available:
[http://www.fastecindustrial.com/images/MediaCatFastener.pdf](http://www.fastecindustrial.com/images/MediaCatFastener.pdf)

------
seanwilson
I'm surprised how slow the press has been to pick this up. This seems like an
amazing step forward to me. AlphaZero played only against itself as training
and it beat one of the best chess AIs in the world that has been finely tuned
with decades worth of human knowledge.

Now that Go and Chess have been effectively mastered by AI...what's next? Are
there any other interesting complete information games remaining? What's the
next milestone for incomplete information games?

~~~
317070
> Are there any other interesting complete information games?

How about axioms of logic as legal moves, and asking it to go from a set of
axioms to open mathematical problems?

Or chemical procedures and components as moves, and asking it to tackle a
disease with a known molecular structure?

It is not as straightforward as I make it sound, but these are complete
information problems.

~~~
rhaps0dy
>How about axioms of logic as legal moves, and asking it to go from a set of
axioms to open mathematical problems?

>Or chemical procedures and components as moves, and asking it to tackle a
disease with a known molecular structure?

>It is not as straightforward as I make it sound, but these are complete
information problems.

However, these are not adversarial games, and self-play only works for
adversarial games. Also, each mathematical problem is different, and likewise
each molecular structure. Additionally, each needs to be solved only once, so
there is no "gradient of skill" that we know how to climb.

~~~
fmap
Theorem proving in intuitionistic logic is a two-player game and maps
perfectly to the kind of Monte-Carlo Tree Search that's employed here. Except
that it is far more difficult than Chess/Go/etc., since the branching factor
is essentially unbounded.

~~~
seanwilson
Can you elaborate? How could you score how close you are to a solution and
what would the second player be doing?

~~~
fmap
Consider a formula made up of only conjunctions and disjunctions and
true/false. The first player tries to prove the formula and gets to move at
every disjunction and is allowed to select which side of the disjunction to
prove. The second player tries to prevent the first player from finding a
proof and gets to move at every conjunction, selecting a side of the
conjunction to descend to. The final state here is an atomic proposition which
is either true or false and determines which player won. You derive a value
function from that in the same way as you do for Go or Chess.

You can extend this idea to full first-order intuitionistic logic and probably
also to higher-order logics, as well as many different modal logics. There are
also formulations of classical logic as a single player game, but that doesn't
seem to be very useful here.
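
To make the propositional core of that concrete, here is a minimal sketch (my
own illustration, not fmap's code): formulas are nested tuples, the prover
moves at disjunctions, the refuter at conjunctions, and the prover has a
winning strategy exactly when the formula is true:

    # Formulas: boolean atoms, ("or", a, b), or ("and", a, b).
    def prover_wins(f):
        if isinstance(f, bool):
            return f              # terminal: a true atom means the prover won
        op, left, right = f
        if op == "or":            # prover's move: pick either disjunct
            return prover_wins(left) or prover_wins(right)
        else:                     # refuter's move: pick the harder conjunct
            return prover_wins(left) and prover_wins(right)

    # (True or False) and (False or True): the prover can always win.
    print(prover_wins(("and", ("or", True, False), ("or", False, True))))  # True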

------
asimpletune
I have to say it’s a completely inappropriate comparison to make since
Stockfish was forbidden from using its opening book. I would like to see the
results when both are at their best.

~~~
nilkn
Another way of looking at it is that it would be inappropriate to give
Stockfish an opening book but not AlphaZero, and I'm guessing the latter
doesn't have one since that goes against its whole premise.

With that said, whether it's fair or not, I am curious to see how Stockfish
with its opening book would compare to AlphaZero in its current state.

~~~
Certhas
Edit: I just learned that things have changed quite a bit since I last looked
into computer chess deeply, and Stockfish does not actually use a book. So
they did play the default config. The below is thus completely off.

--

These engines are built for results, not as technical demonstrators. You are
testing the engine in a scenario that it was not built to cover. Opening books
are not optional add-ons for these engines: because a book is assumed, the
heuristics are not tuned for the early game, and no work is put into
optimizing the evaluation in that phase.

If they could have beaten Stockfish in its default configuration (using a
book), they wouldn't have artificially weakened it, right?

------
king07828
Looks like a residual network (ResNet) feeding a Monte Carlo tree search
(MCTS) solves the strategy optimization problem.
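
For readers unfamiliar with the pairing, here is a heavily simplified sketch
of the idea: the network proposes move priors and position values, and MCTS
folds them into its selection rule (the PUCT formula). Everything below is
illustrative, not AlphaZero's actual code; net is a hypothetical stub for the
policy/value network, and the game is a toy Nim variant instead of chess.

    import math

    class Nim:
        """Toy game: take 1-3 stones; whoever takes the last stone wins."""
        def __init__(self, stones=7):
            self.stones = stones
        def moves(self):
            return [m for m in (1, 2, 3) if m <= self.stones]
        def play(self, m):
            return Nim(self.stones - m)
        def terminal(self):
            return self.stones == 0

    def net(state):
        """Stub for the policy/value network: uniform priors, neutral value."""
        ms = state.moves()
        return {m: 1.0 / len(ms) for m in ms}, 0.0

    class Node:
        def __init__(self, state, prior):
            self.state, self.prior = state, prior
            self.children, self.visits, self.value_sum = {}, 0, 0.0
        def q(self):  # mean value, from the perspective of the player to move
            return self.value_sum / self.visits if self.visits else 0.0

    def mcts(root_state, simulations=2000, c_puct=1.5):
        root = Node(root_state, 1.0)
        for _ in range(simulations):
            node, path = root, [root]
            # Selection: descend by the PUCT rule until an unexpanded node.
            while node.children:
                parent = node
                node = max(parent.children.values(),
                           key=lambda ch: -ch.q() + c_puct * ch.prior
                           * math.sqrt(parent.visits) / (1 + ch.visits))
                path.append(node)
            # Expansion/evaluation: the network replaces random rollouts.
            if node.state.terminal():
                value = -1.0  # no moves left: the player to move has lost
            else:
                priors, value = net(node.state)
                for m, p in priors.items():
                    node.children[m] = Node(node.state.play(m), p)
            # Backup: propagate the value, flipping sign at each ply.
            for n in reversed(path):
                n.visits += 1
                n.value_sum += value
                value = -value
        return max(root.children, key=lambda m: root.children[m].visits)

    print(mcts(Nim(7)))  # converges on 3, leaving the opponent a losing pile of 4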

A critique is that the game model (rules, pieces, movements, legal moves) is
still bespoke and painstakingly created by a human being. One next step would
be an algorithm that develops the game model as well as the strategy and
the I/O translation. E.g., use Atari 2600 Video Chess frame grabs as the input
and the Atari controller as the output. After experimentation the algorithm
creates everything: the game model (chess, checkers, shogi, go), the strategy
for the game, and the I/O processing needed to effect the strategy with the
available inputs and outputs.

~~~
afpx
Would what you're asking for even be possible for a human? For instance, if I
plop my child down in front of a game and tell them to figure it out without
telling them the rules, would they be able to figure out the complete set of
rules even by trial and error?

~~~
nemo1618
Actually, this is a game itself. I used to play it with friends growing up --
we called it "Mao." It's a card game where you are penalized for breaking the
rules (at least one person must know the rules in advance). You are allowed to
pause the game at any time and guess what a rule is, and the players who know
the rules will confirm or deny your guess. You'd be surprised at how quickly
people can pick up rules like "only face cards may be played after a 7" or "if
an ace is played, the next person is skipped."

------
erikpukinskis
Question for the AGI believers out there:

We have this state-of-the-art AI which can turn the screws and home in on
some underlying reality about “how to win at Chess”, a formal game. Great.

How does this then extend into the social domain, where AGI would be
operating? Like, how does AlphaZero optimize for “how to slow Climate Change”?

I can’t even fathom how it would even understand climate change without an
army of scientists publishing new work for it to consume. And then, on top of
that, it would need to understand how its adversary, Putin, will try to
optimize for the opposite: “ensure global warming to open up our shipping
routes and arable land”.

It just seems like a non-starter to me. Saying you could win at chess, so
winning at geopolitics is just a scaling problem, is like saying I can drink a
bowl of miso soup, so drinking the ocean is just a scaling problem.

It would seem to me that intelligence at the highest levels isn’t constrained
by foreknowledge; it’s constrained by the consequences of past decisions made
during the (inevitably ongoing) interactive learning phase.

~~~
RivieraKid
> Like, how does AlphaZero optimize for “how to slow Climate Change”?

It doesn't; AlphaZero is a domain-specific AI. That's like asking: "Like, how
does a differential equation solver slow climate change?". AlphaZero can learn
to play a narrow class of games given the rules; that's all it does.

But I agree with your sentiment obviously. We're _n_ technological
breakthroughs away from AGI, where _n_ is unknown. We have approximately zero
idea how to move from DNNs to AGI - or weather DNNs are the right approach.

~~~
epaga
Yeah, seems like weather DNNs might be the right approach for solving Climate
Change.

------
scott00
Can any computer chess/stockfish experts comment on the choice of 1 GB for
hash size? I have no chess or computer chess domain expertise whatsoever, but
it strikes me as a suspiciously low setting for a memory-related parameter on
what was probably at least a 32 core machine. It makes me wonder if they
simply dialed down the hash size until they got a nice convincing win.

Update: took a look at the settings used for TCEC. Looks like they used 16 GB
in season 7, 64 GB in season 8, 32 GB in season 9, and 16 GB in season 10. Two
observations: (1) interesting that they've decreased hash sizes in recent
years; (2) definitely seems like 1 GB is not reflective of how an engine would
be configured for TCEC.

~~~
shmageggy
It does seem low, but I doubt the effect on performance would be huge.
Certainly not enough to affect the major claims of the paper.

For example, on my 4-core machine I just loaded up two instances of Stockfish,
one with a 32 MB hash and one with 512 MB, and assigned 2 cores to each. I
loaded up a few random middlegame positions, and after analyzing for 1 minute,
the evaluations and main lines were generally the same (within the margin of
error for repeated runs of the same engine). When analyzing Kasparov's
immortal game, it was a toss-up which engine would find the famous rook sac
first.

1 GB is probably suboptimal on the hardware they used, but the difference is
probably minimal.
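
If you want to repeat that experiment, it looks roughly like this with the
python-chess library (a sketch assuming a local Stockfish binary named
"stockfish" on your PATH; the position and settings are just examples):

    import chess
    import chess.engine

    # Reach a quiet Ruy Lopez middlegame position to analyze.
    board = chess.Board()
    for san in ["e4", "e5", "Nf3", "Nc6", "Bb5", "a6",
                "Ba4", "Nf6", "O-O", "Be7", "Re1", "b5", "Bb3", "d6"]:
        board.push_san(san)

    # Analyze the same position with a small and a large hash table.
    for hash_mb in (32, 512):
        engine = chess.engine.SimpleEngine.popen_uci("stockfish")
        engine.configure({"Hash": hash_mb, "Threads": 2})
        info = engine.analyse(board, chess.engine.Limit(time=60))
        print(hash_mb, "MB:", info["score"], info.get("pv", [])[:5])
        engine.quit()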

------
forgot-my-pw
Some in the chess community seem to still be in the denial phase.

The Go/Baduk community went through a similar progression early last year.

- Jan 2016 (AlphaGo beats Fan Hui): Fan Hui is only a 2p and European
champion; he's nowhere near the top

- Mar 2016 (AlphaGo beats Lee Sedol): AlphaGo still lost 1 game. The #1 ranked
player can probably still beat it

- Jan 2017 (AlphaGo Master beats 60 pros): OK, AlphaGo is strong, but those
are only online games with short time controls.

- May 2017 (AlphaGo Master beats #1 ranked Ke Jie): ...

- Oct 2017 (AlphaGo Zero beats AlphaGo Master): OK, probably nothing we have
right now can beat it.

~~~
RivieraKid
Denial of what? That one chess engine dethroned the currently best one? This
happens every second year. And it doesn't make sense why the chess community
would be in denial; did you mean the chess engine community?

~~~
joefkelley
The difference is this is an entirely new paradigm of chess engine. So it's
not just Stockfish getting an edge over Houdini that year or whatever.

There is plenty of "they didn't give Stockfish enough memory, the hardware
isn't comparable, the time controls were too short, etc." in the chess
community, so there is certainly some denial. But nobody thinks any human
would stand a chance.

But I agree it's nowhere near the Go denial since the chess community gave up
on humans remaining competitive with computers long ago.

~~~
RivieraKid
I see this as scepticism, not denial. Denial implies that you don't want
something to be true, which is perhaps the case with engine developers, not
the general chess community.

~~~
forgot-my-pw
I think you're right. Scepticism is more correct.

------
no_gravity
AlphaZero and Stockfish did not run on the same hardware.

So it's not clear if the algorithm is better or the algorithm was just run
faster.

~~~
grondilu
Still, when you look at the games, it's hard not to think something genuinely
new has happened. Several grandmasters have expressed amazement at the style
of play. AlphaZero won some games in a romantic style reminiscent of old
champions. It's definitely not the kind of play we've become accustomed to
from chess engines: AlphaZero seems to rely on a deep strategic understanding
of piece placement and dynamic potential.

It's difficult to attribute some of these wins to a simple superiority in
computing power.

~~~
no_gravity
Imagine a computer with a simple brute force algorithm but unlimited computing
power. It would win 100:0 against AZ.

Would the grandmasters look at the games and say "Yes, it won every time, but
its style is rather clumsy"?

~~~
joefkelley
Hard to say, since we don't know what such perfect play would look like. But I
expect it would look quite boring, yes.

It's important to state, though, that calling an engine's play "romantic" is
not a value judgement. Stockfish and other engines play in a way that gives a
clear, concrete advantage after X moves, whereas AlphaZero relatively prefers
moves that are "creative" or "beautiful" in the sense that there's less of a
clear, definite benefit, but instead some sort of slight positional advantage,
more attacking chances, or something similar. It's harder to prove that a move
was good, but it just "feels" good. In that sense humans enjoy it, either
because it's surprising or because it's closer to how we're able to think.

As an example, there is one game I saw analyzed where AlphaZero gradually
gives up more and more material with little clear compensation, yet its
position gets better and better, until Stockfish is completely out of good
moves. It then turns the corner, snatches up more material than it gave up,
and converts into a winning endgame. Such an imbalanced style of play is much
more interesting to watch than the previous computer slogs of slowly jockeying
for slight advantages until one side can grab a pawn without giving up much
and build into a win from there.

------
partycoder
Here you can see how Stockfish improvements are tested:

[http://tests.stockfishchess.org/tests](http://tests.stockfishchess.org/tests)

Some code or weight is changed, then they have it play to see if the change
leads to better performance.

It's exhaustive, somewhat manual work. DeepMind's bot, on the other hand, is
fully automated; they just have it running day and night, improving itself on
a large hardware configuration.

------
sanxiyn
Yesterday's discussion here:
[https://news.ycombinator.com/item?id=15858197](https://news.ycombinator.com/item?id=15858197)

------
cyberferret
Amazing. I assume AlphaZero knew the basic moves of the pieces, but had to
figure out defensive moves etc. against the other computer 'on the fly'? Are
those learning games included in the statistics (which include ZERO losses)?
If so, it is a remarkable learning engine.

Shades of WOPR. "This is a game nobody can win..."

~~~
317070
Yes, it got the set of all legal moves, of which it had to choose one. It was
trained for 8 hours, during which it only played against itself, using only
the knowledge of which moves are legal. After self-playing for 8 hours, it
played 100 games against Stockfish. In those 8 hours, it was apparently able
to infer all the human knowledge of chess acquired over centuries, and to
surpass it.

And when the legal moves are replaced with the moves of other games (like Go,
for which it was originally written, or Shogi), it did exactly the same thing:
redevelop millennia of human knowledge and go beyond, all within 8 hours of
compute.

Makes you wonder what will happen when instead of the rules of chess, you put
in the axioms of logic and natural numbers. And give it 8 months of compute.

~~~
soVeryTired
Amazing achievement. Still, "eight hours" is a bit misleading. That's eight
hours on thousands of parallel TPUs.

~~~
wereHamster
More interesting would be how many games it has played in that time.

~~~
andrioni
44 million, according to their paper, and they used 5000 TPUs, which are
collectively capable of 4.6×10^17 operations per second.

(The operations the TPU can run are far simpler than what supercomputers can
do, but just for the sake of comparison, the current top supercomputer in the
world can do 1.25×10^17 floating point operations per second)
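
For scale, those figures work out as follows (a back-of-the-envelope sketch
taking the numbers above at face value; ~92 teraops is the commonly cited
per-chip rating for a first-generation TPU):

    games = 44_000_000
    seconds = 8 * 3600                  # the 8 hours of self-play cited upthread
    print(games / seconds)              # ~1530 games generated per second

    tpus, ops_per_tpu = 5_000, 92e12    # 5000 chips x ~92 teraops each
    print(tpus * ops_per_tpu)           # 4.6e17 ops/s, matching the comment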

------
jtraffic
> Nielsen is eager to see what other disciplines will be refined or mastered
> by this type of learning

> of course it goes so much further

> The ramifications for such an inventive way of learning are of course not
> limited to games.

> But obviously the implications are wonderful far beyond chess and other
> games. The ability of a machine to replicate and surpass centuries of human
> knowledge in complex closed systems is a world-changing tool.

Okay then. Let's go beyond games already!

------
rothron
While interesting, it's comparing 5000+ Tensor Processing Units against 64 CPU
threads. I suspect this isn't a fair comparison by watts spent.

~~~
bazzargh
No: 5000 TPUs were used for training, but the 100-game matches used just 4
TPUs (bottom of p. 4 in the paper).

------
otaviokz
>This would be akin to a robot being given access to thousands of metal bits
and parts, but no knowledge of a combustion engine, then it experiments
numerous times with every combination possible until it builds a Ferrari.
That's all in less time than it takes to watch the "Lord of the Rings"
trilogy. The program had four hours to play itself many, many times, thereby
becoming its own teacher.

This absurd comparison would raise my eyebrows coming from an English tabloid.

Having said that, I looked at the author's profile and was appalled to learn
he's a chess prodigy. Then I also saw that he's a chess journalist. Apparently
he became much more a journalist than a chess master...

------
romaniv
What I'm reading: an existing chess engine that runs on much poorer hardware
and for some weird reason was deprived of its usual initialization data still
achieved a 72% draw rate against a ridiculously hyped "deep" neural
network/MCTS algorithm.

It's interesting that AlphaZero was finally applied to a different game,
though. I wonder what architectural changes they had to make. I've read that
pure MCTS isn't that good at playing Chess. How true is that?

~~~
qeternity
AlphaGo and AlphaGo Zero are different from AlphaZero.

------
Symmetry
I'd be curious how its strength relative to Stockfish might change as the
amount of time per move is varied.

~~~
imrehg
There's a graph of that in the Arxiv paper submitted yesterday:
[https://news.ycombinator.com/item?id=15858197](https://news.ycombinator.com/item?id=15858197)
(page 7)

From the look of it, for move times under 0.2 s Stockfish is stronger;
anything above that, AlphaZero is stronger.

------
margorczynski
Personally, I was always interested in whether it's possible now to use these
techniques (NNs, DL, etc.) to infer theorems on their own. Because if the
difference between them and a human were as big as we see in these expert
systems (trained to do only one thing), it could provide amazing results.

------
CGamesPlay
Did DeepMind publish anything about this? Is this literally a straightforward
plug of the AlphaGo Zero techniques into a chessboard with no novelty? Don't
get me wrong, I'm impressed, I'm just looking for a more primary source.

~~~
wrsh07
The arxiv paper is here:
[https://arxiv.org/abs/1712.01815](https://arxiv.org/abs/1712.01815)

------
ouid
I think the real measure of an AI's success in this field is the absence of
pathological boards like this one.

[https://lichess.org/analysis/8/p7/kpn5/qrpPb3/rpP2b2/pP4Q1/P...](https://lichess.org/analysis/8/p7/kpn5/qrpPb3/rpP2b2/pP4Q1/P3K2b/8_w_-_-)

Is it still easy to find positions that alphazero totally misunderstands?

~~~
tialaramex
Such positions are not interesting to Alpha unless it might run into them. If
Alpha would never choose moves that lead to this position, it needn't have any
insight into them.

If a Hold'em AI would never choose to bet 7-2 offsuit, it doesn't need to have
an opinion on what to do when that bet is raised by the opponent.

~~~
ouid
Sorry, I missed this response. The goal, to me, for a chess-playing AI is not
that it be very effective at the game. We have already shown that simple
algorithms exist which are better than humans. The novelty presented by
AlphaZero is the generalization of positional evaluation with deep learning
structures. If you present AlphaZero with this board and ask it to learn how
to play starting from this position, what does it discover? Does that
translate to other "similar" pathological boards?

------
skarist
This is of course all very, very impressive, but it would be great to see
more details. We are told AZ started with only the basic rules. What was
included in the "basic rules"? How were they codified? The engine looks at
80,000 positions per second, so obviously it has some evaluation function.
What is the position evaluation function? Presumably it was codified in some
way at the beginning, and then improved during the training period? It would
be very interesting to see the first 100 or so games the engine played
against itself.

------
megaman821
Does anyone know how well a system like AlphaZero could be applied to a field
like materials science? It would seem that you could make a scoring function
for how well a material meets the desired criteria.

~~~
grizzles
Pretty well. There are several active research groups working with AI/ML for
materials science research. Since you asked about AlphaZero, here is an
article directly about the DeepMind guys working on this problem:
[https://qz.com/1110469/if-deepmind-is-going-to-find-the-next-miracle-material-experts-dont-know-how-theyll-pull-it-off/](https://qz.com/1110469/if-deepmind-is-going-to-find-the-next-miracle-material-experts-dont-know-how-theyll-pull-it-off/)

------
ken47
If AZ is truly superior to Stockfish, why was Stockfish given an amount of RAM
that could only be considered standard for the 1990s?

------
logicallee
Completely dishonest title and first 1,000 words of the write-up. That's when
you get to the words:

> pointed out that Stockfish's methodology requires it to have an openings
> book for optimal performance.

I went from amazed, utter shock like "What!! No way. This is unreal. This is
absolutely unreal. What? What? What?" to a total feeling that I've been
reading 1,000 words of fake news.

I feel cheated by this write-up and flagged it for this reason. They need to
mention it in the first couple of words, not after selling that it has deduced
1600 years of human chess knowledge in 4 hours.

------
grandalf
It's interesting to consider what it means when the AI can succeed without
using brute force.

Suppose at every turn there are n possible future states of the game based on
the rules. To avoid "brute force" the AI must be able to ignore many of those
states as irrelevant. In effect, the AI is learning what to pay attention to,
not just considering what might happen, thereby conserving computational
resources.
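
A toy way to see the payoff of that selective attention (purely illustrative
numbers, not from the paper): if a learned policy lets the search keep only
the top k of n moves at each ply, the tree shrinks geometrically with depth:

    n, k, depth = 30, 3, 6
    brute_force_nodes = n ** depth  # consider every legal move at every ply
    pruned_nodes = k ** depth       # attend only to the top-k policy moves
    print(brute_force_nodes)                  # 729000000
    print(pruned_nodes)                       # 729
    print(brute_force_nodes // pruned_nodes)  # a 1,000,000x reduction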

Chess and Go are interesting for two nearly opposite reasons: 1) because they
are _too large_ for humans to consider the reasoning obvious, and 2) because
the input to the reasoning is simply a small (and easily perceived by humans)
grid of rule-constrained pieces.

But when you think of AI in an information-theoretic way, such that given
representative training data the system (if large enough) will always "learn"
perfectly, it's not really all that remarkable. It's just a different
computational way of doing the same transformation from input states to moves.
Given a problem (chess, go, etc.), the researchers must simply learn what
network structure and training regimen will do the job with the least
computational cost.

To see why this is relevant, consider a deep learning model that could
continually generate successive digits of pi (or primes) without having the
concept baked in already. Would the result be computationally cheaper than a
highly optimized brute-force algorithm? No, because what it would "learn"
would be something already known by humans. Perfect chess is simply a function
from input states to moves whose definition humans do not already know. Most
humans do know the definition of this function for the game of tic-tac-toe by
the time they reach middle school.

I'd argue that while this is useful, it's ultimately not hard. Comparing it
with Stockfish mainly demonstrates how _chess is hard for humans_ to reason
about, and hence hard for humans to write non-brute-force algorithms to solve.

Thus, I think this is an example of "weak AI", even though humans associate
chess with exceptional human cognition. Chess data contains no noise, so the
algorithm is dealing only with signals of varying degrees of utility.

I'm looking forward to AI that can be useful in the midst of lots of noise:
AI that analyzes people's body language to predict interesting things about
them, analyzes speech in real time for deception, roulette wheels for biases,
and office environments for emotional toxicity.

Chess is interesting because we can't introspect to understand what makes
humans good at chess (other than practice). Many human insights and intuitions
are similarly opaque, yet the data is noisy enough that it will take
significantly better AI to do anything that truly seems superhuman.

------
reader5000
Number of humans that could put together a Ferrari from parts: ~10000?

Number of humans that can beat Stockfish: 0

~~~
albertgoeswoof
Number of cats that can meow: ~1000000000000?

Number of cats that can beat Stockfish: 0

~~~
bluecalm
And therefore meowing is simpler than beating Stockfish, which was the
parent's point.

------
rurban
But now Houdini and the rest, please. Stockfish doesn't play interestingly,
and has a lower Elo than Houdini. Though I guess that new search will beat
Houdini too.

------
cerealbad
Seems like the real untapped goldmine here is to carefully observe millions of
workers, notice when they are working, why, and when they are slacking off,
and enforce penalties and rewards relative to their peak performance to
motivate increased productivity. Measure average key presses, response times,
eye engagement, fidgeting and body motion, facial expressions. Do you really
need robots if you can train the human network to be more robot-like? The 21st
century assembly line is so delicious; think of the possibilities.

Going to need stimulants to work at 100% or your company AI will cut your pay.

------
blondie9x
Four hours of learning on a high-speed computer is what, millions upon
millions of games? Thousands of human lifetimes lived out in four hours. The
"four hours to learn" thing is fake news and distorts the reality of how many
games and trials the machine actually went through.

~~~
kazagistar
I don't think so. It seems to me that the whole point of AI is how many
real-world human years it saves by mastering and executing things fast. The
fact that it can learn this well at all, after any amount of time, makes it
impressive, and the fact that it learns quickly makes it all the more useful.

------
nickjj
Another really impressive feat is OpenAI's bot (the Elon Musk-backed lab),
which defeated a number of world-class Dota 2 players in 1v1 matches.

Dota 2 is a real-time strategy video game. The number of decisions you need to
make in this game is mind-boggling. It takes most people many months, if not
longer, just to get to the point where you're not clueless.

A recap video is at:
[https://www.youtube.com/watch?v=jAu1ZsTCA64](https://www.youtube.com/watch?v=jAu1ZsTCA64)

~~~
erk__
On the other hand, people quickly figured out how to beat it. And it could
only play one hero.

~~~
nickjj
But it still beat a bunch of top players.

I've played Dota-like games in the past (at a medium-high level). The skill
gap in this game is remarkably huge.

It's the type of game where a new human player would get absolutely destroyed,
with a 0% chance of ever winning, even in a minimal scenario where only one
hero is involved.

