I don't get it. If reasoning is not an option how does deep learning beat the bo...

sherjilozair · on July 17, 2017

Memorisation + small amounts of generalization.

erikb · on July 18, 2017

Unlikely. If it's mostly memorisation it couldn't learn from playing itself.

And what you describe is how AI beats humans at chess. The problem is that it is a quite inhuman way to play, whereas AlphaGo plays in a quite human-like way.

sherjilozair · on July 18, 2017

1. Imagine infinite compute capability. Exhaustively play all possible games, and use that to figure out the best move in any state. This is essentially what AlphaGo did, but using translation invariance to reduce the search space.

2. There is no contradiction here. We just have to accept that human-like play can emerge from memorization.
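To make point 1 concrete, here is a minimal sketch of "exhaustively play all possible games" on a toy game. The game (a pile of stones, take 1 to 3 per turn, whoever takes the last stone wins) and the names `value` and `best_move` are assumptions chosen for illustration; Go is far too large for this, which is the whole point of reducing the search space.

```python
from functools import lru_cache

MOVES = (1, 2, 3)  # toy rule (assumed): remove 1-3 stones per turn

@lru_cache(maxsize=None)
def value(stones):
    """+1 if the player to move wins with perfect play, -1 otherwise."""
    if stones == 0:
        # The previous player took the last stone, so the player to move has lost.
        return -1
    # Try every legal move; the opponent's value is the negation of ours.
    return max(-value(stones - m) for m in MOVES if m <= stones)

def best_move(stones):
    """Pick the move that leaves the opponent in the worst position."""
    legal = [m for m in MOVES if m <= stones]
    return max(legal, key=lambda m: -value(stones - m))

if __name__ == "__main__":
    for s in range(1, 9):
        print(s, value(s), best_move(s))
```

The cache is the "memorization" part: every state's value is computed once and then simply looked up.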

erikb · on July 18, 2017

Can you explain what, from your point of view, the difference is between AlphaGo and a chess AI? To me it sounds like one should have evolved from the other if it were that simple.

sherjilozair · on July 19, 2017

Yes. In chess, it's relatively easy to judge how good a board position is, so people have been successful by hand-engineering the board position evaluator (also called the value function in RL lingo), and then just doing tree search to take the action which improves the board position the most. In Go, evaluating a board position is much more difficult, and it's not possible to approximate the value function with hand-engineered code. Thus, AlphaGo approximates the value of an arbitrary board position by simulating the game from that position until a win or loss. This doesn't really require neural networks. You could also do the same with table lookup. What neural networks offer here is some translation-invariant generalization, and the capability to compress the table into fewer parameters by identifying common input features. It's possible to achieve AlphaGo performance by just having a BIG table of state-values and using some kernel to do nearest neighbour search (as done by DeepMind here: https://arxiv.org/abs/1606.04460)
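A rough sketch of the "simulate till win/lose and store the result in a table" idea, with no neural network involved. The toy game (take 1 to 3 stones, last stone wins) and the names `rollout`, `estimate_values` and `greedy_move` are assumptions for illustration only. This is not AlphaGo's actual pipeline, and uniformly random rollouts estimate the value under random play, not under perfect play.

```python
import random

MOVES = (1, 2, 3)  # toy rule (assumed): remove 1-3 stones per turn

def legal_moves(stones):
    return [m for m in MOVES if m <= stones]

def rollout(stones, rng):
    """Play uniformly random moves to the end of the game.

    Returns +1 if the player to move at the start takes the last stone, else -1.
    """
    player = 0
    while True:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return 1 if player == 0 else -1
        player = 1 - player

def estimate_values(max_stones, n_rollouts=2000, seed=0):
    """Build a plain lookup table: state -> average rollout outcome."""
    rng = random.Random(seed)
    return {
        s: sum(rollout(s, rng) for _ in range(n_rollouts)) / n_rollouts
        for s in range(1, max_stones + 1)
    }

def greedy_move(stones, table):
    """Choose the move that leaves the opponent in the lowest-valued state.

    An empty pile is scored -1.0 for the opponent, since we just took the last stone.
    """
    return min(legal_moves(stones), key=lambda m: table.get(stones - m, -1.0))

if __name__ == "__main__":
    table = estimate_values(10)
    for s, v in table.items():
        print(s, round(v, 3), greedy_move(s, table))
```

The table here has one entry per state, which is only feasible because the toy game is tiny; a function approximator or a kernel-based nearest neighbour lookup would stand in for the table when most states are never visited.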