
DarkForest: Deep learning engine for playing Go from Facebook research - adamnemecek
https://github.com/facebookresearch/darkforestGo
======
andrewljohnson
The name alludes to a book by Cixin Liu, the second in The Three Body Problem
trilogy:
[https://www.amazon.com/dp/B00IQO403K/](https://www.amazon.com/dp/B00IQO403K/)

This is perhaps the best of the hundred or so sci-fi novels I've read in the
past few years. These novels will make you question the wisdom of sending
traceable radio signals into space. Stay quiet; it's a dangerous Dark Forest
out there.

~~~
lesdeuxmagots
I am a bit baffled that people think so highly of it. I found Three-Body
highly flawed, but at least it moved at a good pace; I could barely get
through the first few chapters of Dark Forest. The characters lacked any
depth, and the writing and structure were very poor as well.

Unfortunately, just having a fascinating premise and a good story is not
enough for me. I'd almost rather it were an essay, a thought experiment, than
a book.

~~~
trungaczne
I'm in the same boat; I didn't enjoy the writing at all and could not progress
very far.

------
boto3
Why don't Facebook, Google, and potentially MS, Apple, etc., set up a yearly
tournament for the world title of 'Champion of Go'? This would be similar to
F1 races run by the car manufacturers.

~~~
gjm11
Taking your question at face value (why don't they?): Because right now, so
far as I can tell, Google is so far ahead that there would be little interest
in the contest.

Perhaps Facebook will start to catch up, or Microsoft will suddenly appear
with something startling, but for the moment the only real competition for
AlphaGo comes from top human players. (Perhaps not even them? I don't think
anyone really knows.)

~~~
duaneb
> Google is so far ahead that there would be little interest in the contest.

Sounds like EXACTLY the motivation a contest would help to foster.

~~~
mtgx
I think he's saying that, right now, the other companies would expect to be
_crushed_ in such a competition, so why wouldn't they spare themselves the
embarrassment?

And because we're doing silly memes, to accept the challenge, Microsoft and
Facebook would probably have to go into the competition with this kind of
attitude:
[https://www.youtube.com/watch?v=9ZYg4ZbcOPQ](https://www.youtube.com/watch?v=9ZYg4ZbcOPQ)

------
andreyk
I've read a lot about AlphaGo, so I'll make a go at offering a quick
explanation of how (as far as I understand it) DarkForest compares to AlphaGo
in terms of implementation.

Both do the cool thing of combining Monte Carlo Tree Search (which used to be
state of the art, and is basically a really smart brute-force tree search)
with Deep Learning, and there is not much more to DarkForest than that, so we
can describe that first. The Deep Learning portion involves training a deep
policy Convolutional Neural Net, meaning a neural net that takes in a Go
position (actually, both take in Go-specific features about the position) and
outputs the best moves. Datasets of human moves are used to train this, so
it's pretty easy. This policy neural net is used to guide the Monte Carlo
Tree Search, which just means using the net's move predictions to play out a
bunch of games into the future to evaluate the best move. In classic MCTS you
play out to the end of the game and estimate the 'value' of a move by the
ratio of wins it gets at the bottom of the tree, but the estimation of value
is more complicated for both systems. In DarkForest they 'Use PUCT and
virtual loss. Remove win rate noise'. The move with the highest estimated
value based on the tree search (guided by the neural net) is chosen as the
next move.
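
To make that selection rule concrete, here is a minimal sketch of PUCT-style
child selection (my own illustration in Python, not actual DarkForest code;
the `Node` class and the exploration constant are assumptions):

```python
import math

class Node:
    """One tree-search node for a candidate move."""
    def __init__(self, prior):
        self.prior = prior      # P(s,a): the policy net's probability for this move
        self.visits = 0         # N(s,a): how often the search has tried this move
        self.value_sum = 0.0    # sum of backed-up playout results
        self.children = []

def puct_select(node, c_puct=1.5):
    """Pick the child maximizing Q + U: exploit moves that have done well
    in playouts (Q), but keep exploring moves the policy net likes that
    are still under-visited (U)."""
    total = sum(c.visits for c in node.children) or 1
    def score(c):
        q = c.value_sum / c.visits if c.visits else 0.0
        u = c_puct * c.prior * math.sqrt(total) / (1 + c.visits)
        return q + u
    return max(node.children, key=score)
```

Virtual loss, which the DarkForest notes also mention, is the related trick of
temporarily counting an in-flight simulation as a loss so that parallel search
threads don't all pile onto the same branch.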

AlphaGo does a bunch in addition to this which is hard to sum up, but
basically:

1. They train a fast and a slow (but better) policy net, using the same
dataset. Both of these are used to guide the tree search. The slow-but-better
net is used less often than the fast net, so a lot of games can be played out
in the tree search.

2. The slow-but-better policy net is improved through reinforcement learning
- making the system play itself and learn from that rather than just from the
human-moves dataset.

3. Since there is a policy network that can choose moves, one can reasonably
assume we can also get a 'value' network to evaluate how good a position is.
This is also done, and the value network is used as part of an equation to
compute the value of positions in the tree search (there is also a much
simpler estimate from fast rollouts, and the final value used in the tree
search is a weighted combination of the two).

4. All this is run in an absurdly, radically distributed manner with an
insane amount of compute power - the 'non-distributed' version of AlphaGo
uses 40 search threads running on 48 CPUs, with 8 GPUs for neural net
computations done in parallel, and the 'distributed' version uses more than a
thousand CPUs and close to 200 GPUs. I don't even know if that's up to date;
those numbers are at least a few months old.
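
The leaf evaluation in point 3 reduces to one line. Here's a sketch of the
mixing rule from the AlphaGo paper (the function name is mine; the paper
reports a mixing weight of 0.5 between the value network's estimate and the
outcome of a fast rollout):

```python
def leaf_value(value_net_estimate, rollout_result, lam=0.5):
    """Blend the value network's estimate of a leaf position with the
    outcome of a fast rollout played from it; lam = 0.5 weighs them
    equally, as reported in the AlphaGo paper."""
    return (1 - lam) * value_net_estimate + lam * rollout_result
```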

To sum up: DarkForest combines a policy network trained by supervised
learning with Monte Carlo Tree Search for move selection. AlphaGo uses a
small/fast policy network trained by supervised learning AND a slow-but-better
policy network trained by both supervised and reinforcement learning to guide
Monte Carlo Tree Search, and computes the value of leaf positions with a
combination of fast rollouts and a 'value' network derived from the slow
policy network. AlphaGo is absurdly, radically distributed (which helps a
lot). Oh yeah, it should be noted that the engineering for AlphaGo involved
dozens of people (two dozen are listed on the paper), whereas DarkForest
seems to have been mainly developed by two people.

~~~
joe_the_user
Thanks for the summary.

I suppose the question one could then ask is "will AlphaGo's approach wind up
being emulated over time, or is it going to be something like a cul-de-sac?"

How many single algorithmic challenges are worth expending this much effort
on? Could AlphaGo's approach be applied to other such problems? Will
increasing processor speed just make all this effort moot? Is AlphaGo
something like Deep Blue (the custom computer that beat Kasparov and then was
dismantled rather than being developed further)?

~~~
andreyk
These are all precisely the right questions to ask about this, I think.

My take is that the approaches of AlphaGo are more applicable to other
problems than Deep Blue's were, but not by much. Rigid rules make tree search
and reinforcement learning easily applicable to Go, but not so much to many
real-life problems. I made a small diagram to illustrate this point
([http://www.andreykurenkov.com/writing/images/2016-4-15-a-brief-history-of-game-ai/31-venn.png](http://www.andreykurenkov.com/writing/images/2016-4-15-a-brief-history-of-game-ai/31-venn.png))
as part of a series of posts about Game AI
([http://www.andreykurenkov.com/writing/a-brief-history-of-game-ai](http://www.andreykurenkov.com/writing/a-brief-history-of-game-ai)).

Still, supervised learning followed by reinforcement learning, training
multiple models of varying complexity from the same dataset, and combining
tree search with learned models as they did are all useful general ideas.
Hybrid methods as a whole will become increasingly common, I think (no doubt
self-driving cars are already very complicated hybrid systems).

------
gooseus
Curious whether the name DarkForest is based on the Three-Body Problem
sequel, The Dark Forest, and/or whether the dark forest theory outlined in
the book somehow relates to any aspect of their learning/decision algorithm.

I would outline the theory here, but it's kind of a book spoiler. So be wary
if you have interest in reading it eventually and want to know what I'm
referring to now.

~~~
bcoates
This sort of strategy game AI in general works on a Dark Forest-ish theory,
recursively evaluating what it would do if it were in the opponent's shoes to
predict the reaction it will get to each possible move.

Since Go is zero-sum the only reasonable strategy is to be maximally malicious
and assume the other guy is too.
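
That recursive "opponent's shoes" evaluation is the classic minimax idea
(here in negamax form). A toy sketch, using a take-1-or-2-stones game rather
than Go, since Go's tree is far too big to search exhaustively - which is
exactly why these engines add MCTS and neural nets on top:

```python
def negamax(state, depth, evaluate, moves, apply_move):
    """Score a position for the player to move, assuming the opponent
    replies with whatever is worst for us (zero-sum, 'maximally malicious')."""
    legal = moves(state)
    if depth == 0 or not legal:
        return evaluate(state)
    # Our best outcome is the negation of the opponent's best reply.
    return max(-negamax(apply_move(state, m), depth - 1,
                        evaluate, moves, apply_move)
               for m in legal)

# Toy game: n stones on the table, take 1 or 2 per turn, taking the last wins.
take = lambda n, m: n - m
legal_takes = lambda n: [m for m in (1, 2) if m <= n]
score = lambda n: -1 if n == 0 else 0  # no stones left: player to move has lost
```

With these definitions, `negamax(3, 10, score, legal_takes, take)` comes out
to -1: three stones is a lost position for the player to move, because
whatever they take, the opponent can take the rest.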

------
olimashi
An interesting code name for this project - slightly ominous given the novel
of the same name?

------
tedmiston
Can anyone comment on why the author used Lua?

I'm curious if it's for Torch, or perhaps there's something more fundamental
about why it's a good language for AI that I don't know yet.

~~~
kmiroslav
Mostly because that's what Torch uses. Lua itself is extremely fast for a
scripting language, but while that characteristic is vital in video games
(where Lua is used quite a bit), it's pretty irrelevant for machine learning,
since the core of these engines is usually written in C/C++ for speed and the
scripting language is just used as an entry point that makes the API and data
entry easy.

Still, I think Lua is ultimately a liability for Torch, and it's not helped
by the fact that TensorFlow is not only a superior competitor (in my opinion)
but is also based on Python (which is vastly more popular than Lua).

Despite all their investment in Torch, it wouldn't surprise me if Facebook
eventually transitions to TensorFlow, because that's probably what it will
take for them to compete effectively against Google on the machine learning
front.

~~~
cjbprime
Though note that AlphaGo used Torch too, so it can't be the source of the
weakness of Facebook's Go engine.

[https://m.facebook.com/story.php?story_fbid=10153442884182143&id=722677142](https://m.facebook.com/story.php?story_fbid=10153442884182143&id=722677142)

------
yazriel
Can anyone comment on the CPU requirements for such a system (either this one
or AlphaGo)? How many cumulative CPU/GPU hours are required to get to this
point?

------
umutisik
I would be very interested in finding out the key factors that make AlphaGo
stronger than DarkForest.

~~~
lukeHeuer
Purely from a hardware standpoint, AlphaGo runs on TPUs, and Google asserts
these ASICs offer "an order of magnitude better-optimized performance per
watt for machine learning."[0] Facebook doesn't seem to have an equivalent,
afaik.

[0] [https://cloudplatform.googleblog.com/2016/05/Google-supercharges-machine-learning-tasks-with-custom-chip.html](https://cloudplatform.googleblog.com/2016/05/Google-supercharges-machine-learning-tasks-with-custom-chip.html)

~~~
visarga
That doesn't make TPUs better at achieving performance, just cheaper. My bet
is on the size of the team. FB only put in 2 people, or so it seems, while
Google invested 10x more man-time.

~~~
lukeHeuer
It most certainly does mean they are better at achieving performance. The
real headline of the optimization bit is that they are capable of more ops
per second, not just that they use less energy and are therefore cheaper. For
a more open/mainstream analog with freely available metrics, look at the
performance difference between GPUs and ASICs in Bitcoin mining.

------
amaks
I'm curious when their engine will be good enough to start playing against
leading human Go players.

~~~
spot
It's plenty good enough to beat everyone up to very strong amateurs, but it's
not at professional level yet.

------
andreygrehov
DarkForest vs. AlphaGo - is there a possible scenario in which DarkForest
learns from AlphaGo?

~~~
aab0
If AlphaGo ever gets released, sure. DarkForest can learn from corpuses of AG
self-play games.

------
joe_the_user
Does anyone have an idea how strong this is in terms of stones?

~~~
cjbprime
AlphaGo could probably give it 9 stones and still win, 9p->5d. (Stones go
non-linear around the pro ranks, so we can't just count the rank difference
the way we do as amateurs.)

~~~
joe_the_user
OK, I see the reference now.

It's not even the strongest bot on KGS.

It seems like an indication that either they haven't incorporated the
advances of AlphaGo, or that AlphaGo succeeded through the investment of a
huge amount of tuning time and processing power rather than through specific
advances.

------
owaislone
So when are we gonna pit it against Google Brain?

------
jolux
>We hope that releasing the source code and pre-trained models are beneficial
to the community.

Translation: Google beat us and there's no point keeping this private anymore.
;)

~~~
semisight
Playing them against each other would be a pretty cool way for the two
companies' AI teams to compete though.

~~~
oh_sigh
That would be equivalent to LeBron James going head to head with a disabled
5-year-old.

~~~
dfan
No, it would be equivalent to LeBron James going head to head with a talented
amateur basketball player. Of course he would still win every time, but it
would at least be a game of basketball.

------
ns0xai
Is someone working to port it to TensorFlow?

