
AlphaGo beats Ke Jie again to wrap up three-part match - iandanforth
https://www.theverge.com/2017/5/25/15689462/alphago-ke-jie-game-2-result-google-deepmind-china
======
strin
Not surprising.

However, I am not buying the claim that the algorithm is going to transform
many other domains:

" The system could help optimize power grids, he says, or streamline shipping
routes, or refine scientific research."

by Wired [https://www.wired.com/2017/05/googles-alphago-levels-
board-g...](https://www.wired.com/2017/05/googles-alphago-levels-board-games-
power-grids/)

The main concern is data efficiency. AlphaGo essentially baked good moves
into its value and policy networks by playing millions of times. In reality, unless
you have a really good simulator, deep reinforcement learning is almost never
applicable.

~~~
WilliamDhalgren
> AlphaGo essentially baked good moves into its value and policy networks by
> playing millions of times.

I don't think that's a very good description of how AlphaGo was trained at
all; you're essentially saying it merely overfits the training set, yet it
clearly generalizes rather well to unseen board positions and still evaluates
them successfully. No machine learning system would be found useful if all it
could do were merely memorize the training data.

Regarding the use of deep reinforcement learning: for one, the role of
reinforcement learning in the first version of AlphaGo, the one described in
the Nature paper, was rather limited and a small part of its training; it just
turned a ~3d KGS policy network into a ~5d KGS bot, and was used to generate a
training set for the value network. If we had enough recorded human games to
train the value network directly, that would be an unnecessary step anyhow. And
you could create such a training set without reinforcement learning, since
there are pure Monte Carlo bots stronger than 5d KGS - but that would be far
more computationally expensive.

But it's still not really true that there aren't obvious applications of deep
reinforcement learning - indeed, robotics is one promising application, and
that seems rather relevant. This paper initially demonstrated an impressive
improvement in manipulation tasks, and you can probably follow its numerous
citations for newer work:
[http://arxiv.org/abs/1504.00702](http://arxiv.org/abs/1504.00702)

I do agree that this exact architecture in AlphaGo probably doesn't have
applications beyond teaching us how to play go better; it seems too
specialized. I believe they mean it in only the vaguest possible sense: that
the kind of deep learning algorithms demonstrating incredible performance in
AlphaGo have diverse applications. But this should not come as a surprise to
anyone even loosely following what people have done with deep learning over
the past couple of years.

~~~
hajile
Go works precisely because it is a small closed system. An interesting match
(from an AI perspective) would be a pro playing AlphaGo on an unusual board
(e.g., one in the shape of a cat). The pro would take everything he knows about
the game and apply it to the odd situation. AlphaGo is so specifically tuned
that it cannot even handle any case except 19x19 (and maybe 9x9). Another
interesting question would be small rules changes, like "you may not play on
any star points or any point directly touching them until turn 30".

Go has deep strategy, but it is very well defined in terms of what can and
cannot be done, and those rules are not particularly complex. Power grids, in
contrast, are far more complex. There are thousands of rules, but also many
thousands more unwritten assumptions and case-by-case judgments. A final
issue is that unsolved and unrecognized problems exist as well.

The last AI winter (deep learning is just the latest rebrand) came from
researchers overstating their accomplishments and making promises about
general intelligence that could not be kept. Any claim about anything that
requires general intelligence in the near future is undoubtedly overpromising.

~~~
sangnoir
> Alphago is so specifically tuned that it cannot even handle any case except
> 19x19 (and maybe 9x9).

Do you have any sources to back this assertion? It sounds unintuitive, as I
know object recognition systems are usually trained on small images but
generalize well to arbitrary image sizes. What you are describing sounds like
overfitting.

~~~
hajile
The paper itself repeatedly says that all 48 layers of the policy network are
19x19 matrices. To make the point though, they initially train AlphaGo using
actual games. After a hundred thousand or so training games, it's finally
ready to start playing and learning. There are fewer than a couple dozen
recorded games on larger boards.

If you haven't played go very much, you may think that "it's just a bigger
board". 19x19 is commonly used because it has an even balance of edge and
center influence (in reality, edge influence seems to be slightly higher).
On a 13x13 board, corner plays have overwhelming influence in the center. At
9x9, there is basically no center strategy at all. Normal strategies of
starting in the corners and expanding influence toward the center don't work
as effectively on larger boards (the larger the board, the more this becomes
true).

This is a very different issue from image recognition, in that strategy
doesn't scale the way images do.

------
partycoder
I did not see AlphaGo's invasion of the top left corner coming.

The black group consisting of 3 stones (B13, C12, E12) did not seem alive at
first, but it turned out to not only live but completely reduce white's
corner.

~~~
partycoder
Deepmind published a summary video for the game here:
[https://www.youtube.com/watch?v=5fOmbqjH7zI](https://www.youtube.com/watch?v=5fOmbqjH7zI)

------
RcouF1uZ4gsC
I wonder what would be the result if the rules of the game were tweaked
slightly (in an arbitrary fashion) just before a game, then the match went
ahead as before. My guess would be Ke Jie would be fine and would adapt fairly
quickly and AlphaGo would look pretty bad.

I think the AI/human crossover boundaries are at meta boundaries. As we get
more meta, humans do better than AI. AI has been eating up the lower meta
levels for some time now. It has been a very long time since a human could do
division faster than a computer.

The question is how far does this meta staircase ascend, and if/when will AI
climb all the way to the top?

~~~
grenoire
What sorts of modifications could even be made to Go without turning it into
an imbalanced mess?

~~~
londons_explore
Every move, you flip 6 coins. If all 6 come up as heads, you swap seats and
colors with your opponent.

This means that an advantageous board position is much less important, and it
will be in both players' favour to keep the board position very nearly evenly
matched.
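A back-of-envelope sketch of how often the swap would actually fire (the ~150-move game length is an assumed typical figure, not part of the proposed rule):

```python
from fractions import Fraction

# Chance that all 6 coins come up heads on a single move.
p_swap = Fraction(1, 2) ** 6                # 1/64 per move

# Expected number of seat swaps over an assumed ~150-move game.
moves = 150
expected_swaps = moves * p_swap             # 150/64, a bit over 2 swaps

# Chance that at least one swap happens at some point in the game.
p_any_swap = 1 - (1 - p_swap) ** moves      # roughly 0.9

print(p_swap, float(expected_swaps), float(p_any_swap))
```

So under this rule, most games would see at least one swap, and on average more than two.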

~~~
Someone
Not always. A player who thinks he will lose a game that's tied going into the
last 8-ish moves will have an incentive to create a highly unbalanced game.
That way, that player gets a 50% chance of winning.

~~~
wapz
Given that there's only one flip left and you need all 6 heads, he has a 1/2^6
or 1/64 chance to win.

In go, the last 8 moves are almost never going to make the difference (in the
last 20-40 moves of a pro game that doesn't end in resignation, there are
almost no mistakes).

------
tosh
Naive question: is something like AlphaGo also better at chess compared to
previous approaches to chess? Can someone elaborate?

~~~
Recursing
Two approaches used by the previous version of AlphaGo (MCTS and a NN for
position evaluation) were tried in computer chess in the past with relatively
poor results, while in recent years they were already giving good results for
Go.

The current version might be using some different techniques that could be
more useful, though. They should release some details in the next few weeks.

~~~
wapz
Can't they effectively "solve" chess with the power of AlphaGo? Someone posted
on the last thread that the AI plays one game of go every 2 seconds against
itself to train (that's ~150 moves/second). Wouldn't chess be solved for all
combinations in a matter of days?

~~~
BoiledCabbage
Short answer is no. One estimate says there are 10^120 possible chess games.
I'm not doing the math now, but I believe that is not physically possible
within our lifetime using known physics. An "if you had started calculating at
the beginning of the universe, you wouldn't have even tackled a sliver of the
work by now" type of problem.

[http://www.popsci.com/science/article/2010-12/fyi-how-
many-d...](http://www.popsci.com/science/article/2010-12/fyi-how-many-
different-ways-can-chess-game-unfold)
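To make that concrete, a rough order-of-magnitude check (the 10^120 figure is the linked estimate; the evaluation rate is a deliberately generous assumption):

```python
total_games = 10 ** 120        # rough estimate from the linked article
age_of_universe_s = 4.3e17     # ~13.8 billion years, in seconds
rate = 1e18                    # assume 10^18 games evaluated per second

evaluated = rate * age_of_universe_s    # ~4.3e35 games since the Big Bang
fraction = evaluated / total_games
print(f"fraction explored: {fraction:.0e}")   # → fraction explored: 4e-85
```

Even at that absurd rate, running since the Big Bang covers about 10^-85 of the game tree.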

~~~
wapz
What I mean by solving "every" move is that the AI would be able to determine
which trees are a "losing" state and not continue down them, removing well
over 99% (or 99.999%, I don't know) of the possible games. I guess maybe
"solve" is not the correct word, but something like 99% confidence in the best
move or the best 3 moves.
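What this describes (discarding subtrees already known to be losing) is essentially classical alpha-beta pruning, which strong chess engines already rely on; AlphaGo itself uses MCTS instead. A minimal sketch on a hypothetical toy tree, where leaves are position scores:

```python
def alphabeta(node, alpha, beta, maximizing, visited):
    """Minimax with alpha-beta pruning over a toy tree.

    Leaves are ints (position scores); internal nodes are lists of
    children. `visited` records which leaves were actually evaluated,
    to show that pruning skips part of the tree.
    """
    if isinstance(node, int):
        visited.append(node)
        return node
    best = float("-inf") if maximizing else float("inf")
    for child in node:
        value = alphabeta(child, alpha, beta, not maximizing, visited)
        if maximizing:
            best = max(best, value)
            alpha = max(alpha, best)
        else:
            best = min(best, value)
            beta = min(beta, best)
        if beta <= alpha:   # the rest of this node's children can't matter
            break
    return best

visited = []
tree = [[3, 5], [2, 9]]    # root is the maximizing player
result = alphabeta(tree, float("-inf"), float("inf"), True, visited)
# result is 3, and the 9 leaf is never evaluated: once the second
# subtree yields a 2, the minimizer can already hold it below the
# 3 secured in the first subtree, so the search cuts off.
```

Pruning cuts the work dramatically, but even a perfectly pruned search still leaves far too many lines to exhaust chess.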

------
romaniv
The official tournament page says it was only the second game. Why does the
article suggest it was the third?

[https://events.google.com/alphago2017/](https://events.google.com/alphago2017/)

~~~
skybrian
If you win two games of a three game match, the last game doesn't matter.

------
devy
If AlphaGo had a dan rating, it would be rated at 100 dan by now.

~~~
partycoder
The problem with professional dan ranks is that they're capped at 9p, and each
association has different rules for awarding that rank. Because of this there
is a lot of variance among 9p.

For instance, Fernando Aguilar 6d (aguilar1 and aguilar on KGS), an amateur
dan from Argentina, has defeated Hasegawa Sunao 9p as well as Yo Kagen 9p. In
theory this should not have happened.

If the cap didn't exist, I think AlphaGo may be around 13p, based on the fact
that Lee Sedol and Ke Jie may be able to give reverse komi to many top 9p
players, and that AlphaGo may be about 3 stones stronger than both. Park
Yeong-hun is less often mentioned, but he may be playing at a higher level
than Lee Sedol.

~~~
wapz
I don't know Fernando Aguilar, but is the reason he's 6D that amateur ranks
are capped before becoming professional? IIRC when I played in the SF bay area
(probably all of the USA under the AGA), you couldn't get higher than 6D (7D?)
without becoming a pro. I think in a tournament lineup you would see something
like Mr. A 6D (7.2), where 7.2 was his estimated rating, but it was a long
time ago.

~~~
partycoder
Not entirely sure about caps at the amateur level. At least go servers don't
have such a cap, and there's a certain mapping between KGS ranks and offline
amateur ranks. The highest attainable rank on go servers is 9d; aguilar is 6d
there.

[https://www.gokgs.com/graphPage.jsp?user=aguilar1](https://www.gokgs.com/graphPage.jsp?user=aguilar1)

An informal mapping among ranking systems can be seen here:

[http://senseis.xmp.net/?RankWorldwideComparison](http://senseis.xmp.net/?RankWorldwideComparison)

Servers usually manually override ranks for professional players...

Going back to my point: a 9p should be consistently stronger than a 6 dan
amateur, but these counterexamples challenge that.

------
IMTDb
When Ke Jie says: "I also thought I was very close to winning the game in the
middle but maybe that’s not what AlphaGo was thinking", do we know what
AlphaGo was thinking?

~~~
Ajedi32
From what I've heard, AlphaGo does maintain an estimate of its odds of winning
at any given point in the game. Not sure if anyone from DeepMind has released
that data from game 2, though.

~~~
arcanus
Pretty early on, Demis tweeted that AlphaGo believed its opponent had been
playing a perfect game up to that point:

[https://twitter.com/demishassabis/status/867584056095002624](https://twitter.com/demishassabis/status/867584056095002624)

------
kgdinesh
Well, we are all fucked aren't we?

~~~
wavesandwind
As in AI taking over the world? No, since AlphaGo is optimized to do just one
thing.

~~~
ggreer
The underlying technology is far more general than you think. In addition
to playing Go, it has already been used to play video games[1], diagnose
diseases[2], and make Google's datacenters more efficient.[3]

1\. [https://deepmind.com/blog/enabling-continual-learning-in-
neu...](https://deepmind.com/blog/enabling-continual-learning-in-neural-
networks/)

2\. [https://deepmind.com/applied/deepmind-
health/](https://deepmind.com/applied/deepmind-health/)

3\. [https://deepmind.com/blog/deepmind-ai-reduces-google-data-
ce...](https://deepmind.com/blog/deepmind-ai-reduces-google-data-centre-
cooling-bill-40/)

~~~
yorwba
In each case with lots of manual tweaking to get the best performance. Deep
learning is not magic; you need to find the right kind of architecture for
your problem, and that requires prior knowledge. Humans will still be in the
loop for at least the next decade or so.

~~~
Filligree
> Humans will still be in the loop for at least the next decade or so.

Reassuring. Really.

------
hacker_9
I was impressed when this happened last year. Funny how all the AI
fear-mongering was the same back then. But a year later, the result is just
driving a car poorly around a GTA map. Oh, and AlphaGo is still winning Go
matches. Scary indeed.

