Teaching Deep Convolutional Neural Networks to Play Go [pdf] (semanticscholar.org)
Whoops, I've just noticed that this is a repost: https://news.ycombinator.com/item?id=8753347

The first comment of that thread predicted AlphaGo! :O

Well, so does this paper. Many people, myself included, expected CNN+MCTS to come next, because it was the obvious step. What made AlphaGo so important was doing it well and getting it all the way to superhuman quality (along with reinforcement learning/self-play).

Biggest fundamental difference: this paper has no policy network learned via reinforcement learning. AlphaGo = MCTS + Value network + (Policy network). I think that piece is pretty important and is what allowed AlphaGo to improve so much with self-play.
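
To make the "MCTS + value network + policy network" combination a bit more concrete, here is a minimal sketch of how the search might pick a move at one tree node: the policy network supplies a prior for each move, and the averaged evaluations supply Q. This is only an illustration of a PUCT-style selection rule, not AlphaGo's actual code; the names `Edge`, `select_move` and the default `c_puct` value are my own assumptions.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass
class Edge:
    """Statistics for one candidate move at a search-tree node (hypothetical layout)."""
    prior: float             # P(s, a): move probability from the policy network
    visits: int = 0          # N(s, a): how often the search has tried this move
    value_sum: float = 0.0   # sum of leaf evaluations backed up through this move

    @property
    def q(self) -> float:
        # Mean evaluation so far; 0.0 for a move that has not been visited yet.
        return self.value_sum / self.visits if self.visits else 0.0


def select_move(edges: Dict[str, Edge], c_puct: float = 1.0) -> str:
    """Return the move maximising Q + U, where U favours high-prior, rarely visited moves."""
    total_visits = sum(e.visits for e in edges.values())

    def score(e: Edge) -> float:
        # Exploration bonus shrinks as the move's own visit count grows.
        u = c_puct * e.prior * math.sqrt(total_visits) / (1 + e.visits)
        return e.q + u

    return max(edges, key=lambda move: score(edges[move]))
```

With these assumptions, an unvisited move with a reasonable prior can outrank a well-explored one, which is how the policy network steers the search before the value estimates have settled.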

Apparently, just the policy network of AlphaGo plays at a decent level (around 1 kyu to 1 dan, IIRC).

AlphaGo started with the policy network and later built the rest around that.

