"Here we introduce a new approach to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves... Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0."
This matches something I've previously named in private conversation as a warning sign - sharply above-trend performance at Go from a neural algorithm. What this indicates is not that deep learning in particular is going to be the Game Over algorithm. Rather, the background variables are looking more like "Human neural intelligence is not that complicated and current algorithms are touching on keystone, foundational aspects of it." What's alarming is not this particular breakthrough, but what it implies about the general background settings of the computational universe.
Go is a game that is very computationally difficult for traditional chess-style techniques. Human masters learn to play Go very intuitively, because the human cortical algorithm turns out to generalize well. If deep learning can do something similar, plus (a previous real sign) have a single network architecture learn to play loads of different old computer games, that may indicate we're starting to get into the range of "neural algorithms that generalize well, the way that the human cortical algorithm generalizes well".
A number of commenters are talking about how the human professional who was beaten ranks well below the world champion. This is entirely missing the point. Beating the best human is an entirely arbitrary threshold, which is why Deep Blue vs. Kasparov wasn't a great sign per se. There's probably nothing computationally distinguished about the very best human versus a very good human - the world champion isn't using a basically different algorithm. What matters is the discontinuous jump, how it was done, and the absolute level of human-style competence achieved.
This result also supports the view that "Everything always stays on a smooth exponential trend; you don't get discontinuous competence boosts from new algorithmic insights" is false, even for the non-recursive case - but that was already obvious from my perspective. Evidence that's more easily interpreted by a wider set of eyes is always helpful, I guess.
I hope that everyone in 2010 who tried to eyeball the AI alignment problem, and concluded with their own eyeballs that we had until 2050 to start really worrying about it, enjoyed their use of whatever resources they decided not to devote to the problem at that time.
""neural algorithms that generalize well, the way that the human cortical algorithm generalizes well"."
I think we have already been seeing this for years with image recognition, speech recognition, and other pattern recognition problems. As with those problems, playing Go is one of those things you can easily get heaps of data for and formulate as a nice supervised learning task. The task is still spotting patterns in raw data with learned features.
However, the current deep learning methods don't (seem to) generalize well to all that our brains do - above all, learning to do many different things online from small amounts of data. I have not seen any research into large-scale heterogeneous unsupervised or semi-supervised learning from small batches of input - these big neural nets are still used within larger engineered systems to accomplish single specific tasks that require tons of data and computing power. Plus, the approach here still uses Monte Carlo tree search in a way that is fairly specific to game playing - not general reasoning.
Clearly this is another demonstration that deep learning can be used to accomplish some very hard AI tasks. But I don't think this result justifies believing that current approaches will scale to 'real' AI (though perhaps a simple variation or extension will).
The difference isn't easy to describe, but one aspect of it is that a single extra stone can change a Go position's value much more than a single pixel changes an image classification.
The problem changes dramatically when the AI is supposed to take arbitrary input from the world. Then the AI needs to determine what input to collect, and the path length connecting its decisions to its reward grows enormously.
I still agree with your take though: there's an important milestone here.
A CNN can still distinguish extremely subtle differences between various animal breeds, exceeding human performance on such tasks. Why was that advance not a warning sign? The rotational-translational invariance prior of the convolutional neural network probably helps because, by default, local changes to a pattern can massively change the output value without that subtle change having to be trained separately at every translation. Also, AlphaGo does a tree search all the way to the game's end, which can probably easily detect such dramatic effects of single extra stones. Reality is likely much too unconstrained to be able to efficiently simulate such things.
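For concreteness, here is a minimal pure-Python sketch (my own illustration, not from the thread) of the property the convolutional prior actually provides: translation equivariance - shifting the input shifts the feature map identically, so a pattern learned at one location is detected at every location. (Plain convolutions are equivariant to translation but not to rotation.)

```python
def conv2d(img, kernel):
    """'Valid' 2D cross-correlation: the core operation of a convolutional layer."""
    kh, kw = len(kernel), len(kernel[0])
    H, W = len(img), len(img[0])
    return [[sum(img[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(W - kw + 1)]
            for i in range(H - kh + 1)]

# A fixed 3x3 Laplacian-style kernel, standing in for a learned filter.
laplacian = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]

def dot_image(row, col, size=8):
    """A size x size image that is zero except for one bright pixel."""
    return [[1.0 if (i, j) == (row, col) else 0.0 for j in range(size)]
            for i in range(size)]

out = conv2d(dot_image(3, 2), laplacian)
out_shifted = conv2d(dot_image(3, 4), laplacian)   # same dot, moved right by 2

# Equivariance: the feature map moved right by 2 as well.
shift_right = [[row[j - 2] if j >= 2 else 0.0 for j in range(len(row))]
               for row in out]
print(shift_right == out_shifted)  # True
```

The same filter response just moves with the stimulus; nothing about the pattern at the new location had to be learned again, which is the "prior" the comment is appealing to.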
Of the algorithms used in this work, which would be touching on foundational aspects of human intelligence, in your view?
Thinking about the variant of MCTS used in this work, for example, it's not clear to me that tree search, no matter how clever, touches much on human cognition - at least not significantly more than Deep Blue did.
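For readers who haven't looked at the paper, the flavor of search in question can be sketched in a few dozen lines. This is a toy illustration under stated assumptions, not AlphaGo's implementation: PUCT-style MCTS on a trivial counting game, with a uniform prior standing in for the policy network and a random playout standing in for the value network.

```python
import math
import random

random.seed(0)

# Toy game: players alternate adding 1 or 2 to a running total;
# whoever lands exactly on TARGET wins. A state is (total, player_to_move).
TARGET = 10

def legal_moves(state):
    total, _ = state
    return [m for m in (1, 2) if total + m <= TARGET]

def apply_move(state, move):
    total, player = state
    return (total + move, 1 - player)

class Node:
    def __init__(self, state, prior):
        self.state = state
        self.prior = prior      # P(s, a): uniform here, a stand-in for the policy net
        self.visits = 0
        self.value_sum = 0.0    # from the perspective of the player who moved into this node
        self.children = {}      # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def puct(parent, child, c=1.5):
    # PUCT selection: Q(s,a) + c * P(s,a) * sqrt(N(s)) / (1 + N(s,a))
    return child.q() + c * child.prior * math.sqrt(parent.visits) / (1 + child.visits)

def rollout_value(state):
    # Stand-in for the value network: random playout, scored for the
    # player to move in `state` (+1 win, -1 loss).
    player = state[1]
    while legal_moves(state):
        state = apply_move(state, random.choice(legal_moves(state)))
    winner = 1 - state[1]       # no moves left: the player who just moved hit TARGET
    return 1.0 if winner == player else -1.0

def mcts(root_state, n_simulations=3000):
    root = Node(root_state, prior=1.0)
    for _ in range(n_simulations):
        node, path = root, [root]
        # 1. Selection: descend by PUCT until reaching a leaf.
        while node.children:
            parent = node
            _, node = max(parent.children.items(),
                          key=lambda mc: puct(parent, mc[1]))
            path.append(node)
        # 2. Expansion: add children with uniform priors.
        moves = legal_moves(node.state)
        for m in moves:
            node.children[m] = Node(apply_move(node.state, m), 1.0 / len(moves))
        # 3. Evaluation and 4. negamax backup along the path.
        v = rollout_value(node.state)   # for the player to move at the leaf
        for n in reversed(path):
            n.visits += 1
            n.value_sum += -v           # flip sign: each level is the other player
            v = -v
    # Play the most-visited root move, as AlphaGo does.
    return max(root.children.items(), key=lambda mc: mc[1].visits)[0]

# Under perfect play the first player wins by steering totals to 1, 4, 7, 10.
print(mcts((0, 0)))
```

The point of the sketch is that everything cognition-like lives in the two stand-ins (prior and evaluator); the search scaffolding around them is the same kind of enumeration Deep Blue relied on.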
On the other hand, the idea of bootstrapping a network with huge numbers of expert interactions before 'graduating' it to more complex training and architectural enhancement might turn out to be an important part of the Game Over algorithm, as you call it - even if it doesn't much resemble how human beings learn.
It's quite a low-key affair (no GPU clusters and so on) but what I think is remarkable is that it described a fairly complex cognitive (neural) architecture that was able to mimic certain kinds of child-level cognition with TINY amounts of training data. Instead, human supervisors guided the evolution of its cognitive strategy in a white-box kind of fashion to encourage it to answer questions as children did.
In many respects (like training data volume, layer depth, and so on) it couldn't be further from the current deep learning trends, and yet it seemed much more along the lines of what I imagine an actual AGI would be doing, especially one we hope to control.
A key point is that the central executive - the core that controls the flow of data between slave systems (like short-term and long-term memory) - is itself a trainable neural network, which learns to generate the "mental actions" directing that flow rather than relying on fixed rules. This allows the system to generalise.
(Some of the training itself doesn't seem to be that similar to training a human child, though. It's instead tailored to the system's architecture.)
I'm excited to see how much richer this and similar systems will become, as researchers improve the neural architecture and processing efficiency. Will we see truly human-like language and reasoning systems sooner than expected?
1. I don't claim everything always stays on a smooth exponential trend, but that things do so on a large scale as small variations average out. It doesn't surprise me that someone managed to get above-trend performance at Go. However, I predict this will not lead to above-trend GDP.
2. I don't predict we will get human-level AI in 2050. I predict we will probably never reach a tech level that would make human-level AI a possibility. The more people start fretting about the 'AI alignment problem' (which we would in any case no more be able to solve today than Leonardo da Vinci could design a fail-safe nuclear reactor), the lower the probability that we ever reach that tech level. Conversely, though a small thing, this news of continued incremental progress makes me a tiny fraction of a percent more optimistic.
And life seems to be quite resilient and eerily altruistic at times.
We still haven't wiped this planet clear despite having enough nukes to do so. Some people put their military careers and the security of their countries at risk refusing to launch nukes despite snafus higher in the command chain.
And then you have all those westerners whose day jobs became so meaningless, detached from their environment, and sometimes outright hostile to other humans that, even if they don't commit suicide, they willingly stop breeding and openly talk about replacing themselves with more down-to-earth folks who have their priorities right: food, children, food for children, and maybe then some little AI R&D, although who cares about that if you can have more children.
Maybe the selfish gene is doing quite well.
I think something like the Culture represents the best case for humanity's long-term future, as this is almost certainly going to include AIs of far greater power than us bags of meat.