edit: For those saying it's still a long way to beat the strongest player - we are playing Lee Sedol, probably the strongest Go player, in March: http://deepmind.com/alpha-go.html.
That site also has a link to the paper, scroll down to "Read about AlphaGo here".
This is a great advancement, and will be even more so if you can beat Lee Sedol.
There are two interesting areas in computer game playing that I have not seen much research on. I'm curious whether your group or anyone you know has looked into either of these.
1. How to play well at a level below full strength. In chess, for instance, it is no fun for most humans to play Stockfish or Komodo (the two strongest chess programs), because those programs will completely stomp them. It is most fun for a human to play someone around their own level. Most chess programs intended for playing (as opposed to just for analysis) let you set what level you want to play, but the results feel unnatural.
What I mean by unnatural is that when a human who is around, say, a 1800 USCF rating asks a program to play at such a level, what typically happens is that the program plays most moves as if it were a super-GM, with a few terrible moves tossed in. A real 1800 will be more steady. He won't make any super-GM moves, but also won't make a lot of terrible moves.
2. How to explain to humans why a move is good. When I have Stockfish analyze a chess game, it can tell me that one move is better than another, but I often cannot figure out why. For instance, suppose it says that I should trade a knight for an enemy bishop. A human GM who tells me that is the right move would be able to tell me that it is because that leaves me with the two bishops, and tell me that because of the pawn structure we have in this specific game they will be very strong, and knights not so much. The GM could put it all in terms of various strategic considerations and goals, and from him I could learn things that let me figure out in the future such moves on my own.
All Stockfish tells me is that it looked freaking far into the future and that if it makes any other move the opponent can force it into lines that won't come out as well. That gives me no insight into why that is better. With a lot of experimenting, trying out alternative lines and ideas against Stockfish, you can sometimes tease out the strategic considerations and features of the position that make it the right move.
For your second question, of course, a grandmaster can only tell you why they think they made the move. It may be a postrationalization for the fact that they feel, based on deep learning they have acquired over years of practice, that that move is the right one to make.
It doesn't seem too difficult to have an AI let you know which other strong moves it rejected, and to dig into its forecasts for how those moves play out compared to the chosen move to tell you why it makes particular scenarios more likely. But that would just be postrationalization too...
I think this is wrong. This idea comes from when humans first tried to program computers to do what we wanted them to do. We failed, and it turned out to be really hard. A grandmaster couldn't explain the algorithm he used exactly. But that doesn't mean he couldn't give any useful information at all.
Think of it like describing a tiger. I have no idea how to describe complicated edge detection algorithms and exact shapes. But I can say something like "an animal with orange stripes", and that would be useful information to another human.
Likewise a grandmaster can explain that pawns are valuable in situations like this, or point to a certain position and say not to do that, etc. To a computer that information is useless, but to a human it's extremely useful. We already have some pattern recognition and intelligence; we just need some pointers, not an exact description of the algorithm.
"Likewise a grandmaster can explain that pawns are valuable in situations like this, or that point to a certain position and say don't do that, etc. To a computer that information is useless, but to a human that's extremely useful. "
I think the only reason this is true is that humans share the same low-level (or intermediate, in this case) features in their models of the world, and a common language to share them.
Artificial neural networks' understanding has evolved along another path, and they probably have a different organization between the different levels, but that doesn't make the fundamental mechanism different.
I play competitive chess, and I assure you most moves are made because of players proving in their minds that a move is objectively good.
The reasons for why the player may think the move is objectively good can vary, but they are almost always linked to rational considerations. E.g. that the move increases their piece count, their control of center squares, their attacking opportunities, or is tactically necessary due to the position.
My point being that when grandmaster players play chess, they are just as interested in finding objectively right moves as a computer is. Unless it's speed chess it's rarely a "I suspect this might be good" sort of thing.
(That said, many grandmasters do avoid lines of play they consider too dangerous. World Champion Magnus Carlsen's "nettlesomeness" - his willingness to force games into difficult positions - has been one explanation for why he beats other Grandmasters.)
If the move's objectively good, there would be no variation in moves between players. Since there is variation, I assume different players apply different heuristics for 'good'. And whether the move increases their piece count is a fine justification, but why are you privileging increasing your piece count at this point in this game against this opponent? At some point the answer becomes 'because I learned to'.
Well, almost every chess-playing program uses piece counts to evaluate the quality of chess positions, because barring an amazing tactical combination (which can usually be computationally eliminated past 5 moves) or a crushing positional advantage, a loss of pieces will mean victory for the person with more pieces.
I would argue you see far more pattern recognition at play in chess than you do heuristics. Heuristics are more common at lower levels of play.
When grandmasters rely on pattern recognition, they are using their vast repertoire of remembered positions as a way to identify opportunities of play. It's not that they think the move looks right; it's that they have played a lot of tactical puzzles, and because of this pattern recognition, they are now capable of identifying decisive attacks that can then be objectively calculated within the brain and seen to lead to checkmate or a piece advantage.
They don't make the move because of the pattern or heuristic. They make the move because the pattern allowed them to see the objective advantage in making that move.
------
As for your point about a move being objectively good: unless you completely solve the game of chess, there will not always be a single move in every situation that is objectively the best. In many games (and you will see this in computer analysis), 2 or 3 moves will hold high promise, while others will hold less. From an objective standpoint all three of these moves could be objectively better than all the others, but it could be hard to justify that one is necessarily better than another.
The reason for this is partly because between two objectively 'equal' moves, there may be a rational reason for me to justify one over the other based on personal considerations (e.g. because I am familiar with the opening, because I played and analyzed many games similar to this line, because I can play this end game well, etc.) Decisions based on those considerations are not what I would call heuristics, because they are based on objective reasons even if heuristics may have contributed to their formation within the mind.
"Well almost every computer playing chess algorithm uses piece counts to evaluate the quality of chess positions"
This is quite wrong. They use a score that material is only one (although a major) factor of.
"because barring an amazing tactical combination (which can usually be computationally eliminated past 5 moves) or a crushing positional advantage, a loss of pieces will mean the victory of the person with more pieces."
Again, this simply isn't true. For one thing, talk of "piece counts" and even "increasing piece counts", rather than material, is very odd coming from a serious chessplayer. Aside from that, time, space, piece mobility and coordination, king safety, pawn structure, including passed pawns, how far pawns are advanced, and numerous other factors play a role. All of these can provide counterplay against a material advantage ... it need not be "crushing", merely adequate. And tactical combinations need not be "amazing", merely adequate. And whether these factors are adequate requires more than 5 moves of lookahead because chess playing programs are only able to do static analysis and have no "grasp" of positions. All of which adds up to the need for move tree scores to be made up of far more than "piece counts".
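To put that in code, here is a toy static evaluation, a rough sketch using the python-chess library (the terms and weights are purely illustrative, nowhere near what a real engine uses), in which material is just one weighted factor among several:

    import chess

    PIECE_VALUES = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
                    chess.ROOK: 5, chess.QUEEN: 9}

    def material(board: chess.Board) -> int:
        # Material balance from White's point of view.
        return sum(v * (len(board.pieces(p, chess.WHITE)) - len(board.pieces(p, chess.BLACK)))
                   for p, v in PIECE_VALUES.items())

    def pawn_shield(board: chess.Board, color: bool) -> int:
        # Crude king-safety proxy: friendly pawns on squares adjacent to the king.
        king = board.king(color)
        return sum(1 for sq in board.attacks(king)
                   if board.piece_at(sq) == chess.Piece(chess.PAWN, color))

    def evaluate(board: chess.Board) -> float:
        mobility = board.legal_moves.count() * (1 if board.turn == chess.WHITE else -1)
        king_safety = pawn_shield(board, chess.WHITE) - pawn_shield(board, chess.BLACK)
        return 1.0 * material(board) + 0.05 * mobility + 0.2 * king_safety

    print(evaluate(chess.Board()))  # slightly positive: White has the move at the start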
You're right that material is the correct term. I was trying to use language appropriate for someone thinking about programming a chess machine.
I perhaps resorted to hyperbole in my original description for the sake of emphasis. You are correct that at higher levels of play, positional considerations matter far more than material considerations. The advantage does not need to be amazing, but adequate. However, as the material deficit accumulates, the positional advantage required to justify it will increasingly move into the realm of "amazing" and "crushing".
You are right that objectively calculating the positional strength of a position is very difficult to do without immense brute forcing, and likely needs more than 5 moves of lookahead. When I said that, I was really referring quite strictly to tactical combinations, where the vast majority of tactical mistakes can be caught quickly.
> If the move's objectively good, there would be no variation in moves between players.
If we could solve chess, this most likely would be true, just as it's true for tic-tac-toe, which anyone can solve in mere minutes once they realize that symmetry allows for only 3 distinct opening moves (corner, middle and edge) and games should always end in a draw unless someone makes a silly mistake.
Granted, there are lots of paths to draw that one might take, but the objectively strongest move is to take a corner, then the opposite corner, which requires the opponent to either try to force a draw or lose, whereas it's not hard to use weak moves to hand either player a victory, even though the game of tic-tac-toe can always be forced into a draw with skilled play.
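If anyone wants to check the draw claim directly, here is a small self-contained minimax sketch; with best play from both sides it reports a draw from the empty board:

    from functools import lru_cache

    LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
             (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

    def winner(board):
        for a, b, c in LINES:
            if board[a] != "." and board[a] == board[b] == board[c]:
                return board[a]
        return None

    @lru_cache(maxsize=None)
    def solve(board, player):
        # +1 means X can force a win, -1 means O can, 0 means a draw.
        w = winner(board)
        if w:
            return 1 if w == "X" else -1
        if "." not in board:
            return 0
        nxt = "O" if player == "X" else "X"
        results = [solve(board[:i] + player + board[i + 1:], nxt)
                   for i, c in enumerate(board) if c == "."]
        return max(results) if player == "X" else min(results)

    print(solve("." * 9, "X"))  # 0: perfect play from the empty board is a draw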
If I had to speculate at a high level why the Sicilian opening is so popular for black in professional play, it would be ultimately because the Sicilian allows black to obfuscate white's board symmetry, which creates opportunity for counterplay against white's fundamental advantage of having the initiative.
I will say, though, as someone who devoted some serious time to trying to become a master, that opening theory completely changes as you get to the master level and beyond.
In tournaments I would play a solid but relatively obscure opening as black that worked very well as a safe opening to guard against highly tactical book play, but when I really analyzed the entire line going out past 12-15 moves with a grandmaster, I learned that with careful play there actually was a way to gain a slight edge for white with it -- enough to make the opening uninteresting to most grandmasters. It would play well against masters, but not against a top GM who would know how to play out the line correctly.
Very true. And even in long, professional play it's not uncommon to see GMs play highly tactical but unsound openings if they think the other player doesn't know how to beat them. E.g. I saw Nakamura play the King's Gambit against somebody sub-2000 in a professional tournament once (not blitz, a full regular timed game).
It's clear that you don't play chess. Anyone who does understands from experience why "increasing your piece count" (which is a backwards and inaccurate way to put it) is the most important and reliable path to victory ... of course it's not always the right thing, but other things being equal, winning material is advantageous. Asking why gaining material advantage is "privileged" is like asking why a weightlifter "privileges" gaining strength, or why a general "privileges" winning battles or destroying supply lines. It's not "because they learned to", it's because "duh, that's obvious".
And the claim that there would be no variation in moves between players if moves were objectively good is absurd nonsense. Just because not everyone plays the best move, that doesn't mean it's not the best move. Of course different players apply different heuristics -- some players are better than others. But in the vast majority of positions, all grandmasters will, given enough time for analysis, agree on the best move or a small number of equally good best moves. When there are multiple best moves, different grandmasters will choose different ones depending on their style, familiarity, opponents, and objectives (tournament players play differently when all they need is a draw than when they need to win).
Your previous comments, about "postrationalization", are also nonsense. Certainly GMs play intuitively in blitz games, but when taking their time they can always say why a move is better -- and they do just that in postgame analyses, many of which can be seen online. The explanations are given in terms of the major strategic factors of time, space, and material, or other factors such as pawn structure and piece coordination, or in terms of tactical maneuvers that achieve advantages in those factors ... or that result in checkmate (which can be viewed as infinite material gain, and many chess playing programs model it as such).
But chessplaying programs aren't goal driven. They evaluate such factors when they statically analyze a position, but they evaluate millions of positions and compare the evaluations and bubble these evaluations up the game tree, resulting in a single score. That score does not and cannot indicate why the final choice is better than others. Thus
"It doesn't seem too difficult to have an AI let you know which other strong moves it rejected, and to dig into its forecasts for how those moves play out compared to the chosen move to tell you why it makes particular scenarios more likely."
is just facile nonsense grounded in ignorance ... of course it can let you know which other strong moves were rejected, but it cannot even begin to tell you why.
"But that would just be postrationalization too... "
You keep using that word, in completely wrong fashion. The computer's analysis is entirely done before it makes the move, so there's nothing "post" about it. And it makes moves for reasons, not "rationalizations". Perhaps some day there will be AIs that have some need to justify their behavior, but the notion does not apply to transparent, mechanistic decision making algorithms.
It's interesting that Lee's style is similar to Carlsen's: Lee Sedol's dominating international performances were famous for shockingly aggressive risk-taking and fighting, contrary to the relatively risk-averse styles of most modern Go professionals.
I'm not sure they're that similar. The whole professional Go world had to change to a more aggressive fighting style in order to dethrone Lee Changho, whereas Carlsen in a sense had to do the opposite to consistently take on the best GMs -- he was very aggressive when he was younger, now he is "nettlesome" which isn't quite the same thing.
Knowing nothing about this particular pocket of the world: how does one live as a "Go professional"? Who pays for this? I don't imagine this is very attractive for sponsors, or do I underestimate how popular this is in Asia?
They are viewed the same as sports professionals in China/Japan/Korea. Go news shares the sports front page, and players are paid well. In China the national Go association is managed under the sports council, with dorms and national squads under the association.
As it's a game more popular amongst older demographics, there tend to be a lot of wealthy patrons and supporters (individuals and companies) who sponsor tournaments and teams. One of the highest-paying competitions is the Ing Cup with a prize of $400,000. Japan has nearly 10 major year-long tournaments every year, totaling over $2mil in prizes, many are sponsored by major newspapers.[1] China has domestic year-long leagues, where city teams each have their own sponsors. All the games I mentioned here pay a match fee whether players win or lose.
So yes, it is a popular game in Asia, however less so for the younger demographic and is unfortunately in decline. Most people just don't have the attention span, interest or time these days. :(
But not as popular as Chinese chess (xiangqi) as a game people actually play, though. Go might be more popular than xiangqi as a spectator sport; I don't know.
I was responding to a comment specifically about China.
Actually Janggi, the Korean variant of Chinese chess (yes, there are some rule differences, but it's recognizably the same game - like Chess without castling and en passant) is very popular, though according to Wikipedia currently less so than Go.
People have learned to communicate heuristics, which is very useful for beginning players. A grandmaster may not be able to communicate the nuances of the game to low-level players, but they do benefit from working in groups, which suggests they can share reasons for a given move, not just propose moves and have others independently evaluate them.
So, while people can't map out the networks of neurons that actually made a given choice, we can effectively communicate reasons for a given move.
Perhaps postrationalization is still useful? Could empathy and mirror neurons help transfer some of that deep learning? Two weeks later the student faces a similar situation, and they get the same gut feeling their teacher did, and that helps them play better than if their teacher hadn't postrationalized?
Absolutely! My point is that human intelligence doesn't actually have any deep insight into how it makes decisions, so we shouldn't be that disappointed that an AI doesn't, either. Humans can postrationalize - explore how they think they make decisions - but they can't tell you how they actually decided. Doing the same for an AI is interesting, but I don't think it's a necessary component of making good decisions in the first place to be able to explain why you made it.
On the other hand, not all actions are decisions. There are plenty of actions we would classify as decisions which are applications of rules instead. There is a clear rational path in the application of rules. To clarify terminology: for me, decisions are actions in response to computationally intractable challenges, e.g. will that couch fit in my living room (when I am in a store, where size calculations are hard in an unfamiliar context), etc. This could mean that your action is my decision if we are not equally capable of calculating based on rules.
Although, when looking at the Deep Dream images that come as a by product (more or less) of image recognition AI, I get the impression that there ARE ways of communicating things about what a computer is "thinking" when trying to solve problems in areas where humans are traditionally better.
Both points are excellent. I think the second one is more important immediately. If you look at the "expert systems" literature, specifically older expert systems based on logic programming, there usually was an "explanation component" in the general architecture.
However, I think this area has been under-researched, even though it is obviously important. It would enable very strong learning environments and man/computer hybrid systems. I think there's very direct relevance to safety/security-critical systems, and there's some literature on operators not understanding what is going on in complex systems and how that can be fatal (think power plants and the like).
For #2, is Stockfish implicitly discovering things that a human might explicitly recognize and articulate (e.g. that the pawn structure has an outsized impact on the value of certain pieces)?
If so, could it be just as easily programmed to answer those questions as it evaluates moves? That is, it seems the information is there to form those answers, but it's not the question the AI has been asked.
Historically there has been a back-and-forth with chess engines....
Early engines tried to be really "smart", but consequently couldn't analyze very deeply, as they spent a lot of time on each position. Newer engines mostly churn through really deep analysis, going many layers deep, but make comparatively simplistic analyses.
"For instance, it wasn't asked to evaluate the pawn structure and provide that analysis as an output, but it certainly could be programmed to do so."
This quite misses the point. These programs do that as a matter of course, for individual positions. But choosing a move is the result of evaluating many millions of positions and comparing the scores through tree pruning. The program cannot tell you that it chose one move over another because the chosen move is generally better for the pawn structure than the move not chosen, because it doesn't have that information and cannot obtain it.
It should be possible. E.g. I've seen people train a neural network or similar to classify images and then "run it backwards" to get e.g. the image that the network thinks is most "panda".
"If so, could it be just as easily programmed to answer those questions as it evaluates moves?"
No. A chess playing program's move score is a value obtained from treewise comparisons of the static evaluations of millions and millions of positions.
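A stripped-down sketch of that process (using python-chess for move generation; the static evaluation here is material-only, standing in for whatever a real engine computes). By the time the recursion returns, everything the leaf evaluation measured has been folded into one number per root move, which is exactly why the program can't say which factor decided the choice:

    import chess

    VALUES = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
              chess.ROOK: 5, chess.QUEEN: 9}

    def evaluate(board: chess.Board) -> int:
        # Stand-in static evaluation (material only), from the side to move's view.
        score = sum(v * (len(board.pieces(p, chess.WHITE)) - len(board.pieces(p, chess.BLACK)))
                    for p, v in VALUES.items())
        return score if board.turn == chess.WHITE else -score

    def negamax(board: chess.Board, depth: int) -> int:
        # Leaf evaluations bubble up the tree; only a single number survives.
        if depth == 0 or board.is_game_over():
            return evaluate(board)
        best = -10**9
        for move in board.legal_moves:
            board.push(move)
            best = max(best, -negamax(board, depth - 1))
            board.pop()
        return best

    board = chess.Board()
    scores = {}
    for move in board.legal_moves:
        board.push(move)
        scores[move.uci()] = -negamax(board, 2)
        board.pop()
    best_move = max(scores, key=scores.get)
    print(best_move, scores[best_move])  # one move, one number, no explanation attached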
This reminds me of something I read a long time ago about the Heroes of Might and Magic game. At some point the AI was so good it wasn't fun to play against, and it had to be dumbed down.
This is a frequent pitfall in video game AI - people go into it thinking the AI's goal is to win, then learn the hard way that the AI's goal should be to lose in an interesting way.
Nobody says that the red koopa in Super Mario Bros. has bad AI.
I can't remember, but I do think it was an interview with Jon Van Caneghem... Either in a book of game design, or magazine. I have to find it.
Similarly, a long time ago I read about how AoE (Age of Empires) had a lot of computers play against each other, then gathered statistics on which units actually got used. The idea was to rebalance the units so that all were almost equally used (in terms of computer-vs-computer play, at least).
I think these two articles were both in the same book, so I'll have to dig.
HOMM3 is also my favourite game, along with Disciples. I'm a big turn-based strategy fan :)
It just means analyzing more than one move at a time. Any engine supporting the UCI protocol should be able to do it. Like Stockfish, which is free.[1] So you don't need to create a new engine to implement this feature. It might be possible to implement it with a bash script. Certainly with JS.[2]
Stockfish does have a "skill level" setting, and it's not terrible at faking club-level play (if you have the Play Magnus mobile app, it's just Stockfish at different skill levels). However, as of 2013 the implementation is much more primitive than what I'm suggesting here.[3]
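A rough sketch of the kind of thing I mean, assuming the python-chess bindings and a local Stockfish binary on the PATH (the temperature and other numbers are made up and would need real calibration against rating data): ask for several candidate moves via MultiPV, then sample among them so the "mistakes" are frequent and small rather than rare and catastrophic.

    import math
    import random
    import chess
    import chess.engine

    def humanish_move(engine, board, temperature=80.0, candidates=8, depth=14):
        # Get the engine's top candidate moves and their centipawn scores.
        infos = engine.analyse(board, chess.engine.Limit(depth=depth), multipv=candidates)
        moves = [info["pv"][0] for info in infos]
        cps = [info["score"].relative.score(mate_score=100000) for info in infos]
        # Softmax over scores (stabilised against overflow): lower temperature
        # means closer to best play, higher means weaker but smoothly so.
        best = max(cps)
        weights = [math.exp((cp - best) / temperature) for cp in cps]
        return random.choices(moves, weights=weights, k=1)[0]

    engine = chess.engine.SimpleEngine.popen_uci("stockfish")  # path assumed
    board = chess.Board()
    print(humanish_move(engine, board))
    engine.quit()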
To be clear, even though it's incredibly obvious, I've never seen this idea anywhere else. It first occurred to me after reading the original Guid & Bratko paper on intrinsic ratings in 2006. Happy to continue this offlist if you want to work on it. My e-mail is in my profile.
I actually understood what you meant, I just think it's funny when we use "just" right before saying something that seems complex (even if it isn't actually all that complex).
Having said that, I only added the link to that comic because I didn't want to just write a comment saying "thanks for the links", and I'm only replying again, because I'm hoping it continues the pattern where you keep replying to my replies with super interesting links!
You can successfully make a computer program play at a 2000 FIDE level (say), in that its win/loss/draw results will be consistent with that of a 2000 FIDE human. IPR is a good way of doing this in a quantitative way.
The interesting problem is to make the computer play like a 2000-rated person, not just as well as a 2000-rated person. I'm a big fan of Regan's work, but I don't think IPR on its own is sufficient to make the computer play idiomatic suboptimal moves.
Shredder claimed to have human-like play at lower levels, so I gave that a try. It works surprisingly well at my level, making plausible mistakes in a fairly consistent manner. When I was playing against it I was in the 1200-1500 range, so I don't know how well it does at higher levels. Also, it had a setting where it would rate you and auto-adjust itself for the next game.
It made playing against a program a lot nicer than other chess programs I had tried.
> 1. How to play well at a level below full strength. In chess, for instance, it is no fun for most humans to play Stockfish or Komodo (the two strongest chess programs), because those programs will completely stomp them. It is most fun for a human to play someone around their own level. Most chess programs intended for playing (as opposed to just for analysis) let you set what level you want to play, but the results feel unnatural.
The astronauts in 2001 play chess with the computer, and it sometimes loses on purpose to make the game more interesting for them.
This is actually a problem I've given a decent amount of thought to (although not necessarily reaching a good conclusion), and I think these problems are actually related and not impossible for this simple case. It comes down to which parts of the analysis, and at what depth, brought a best move into view. Was it bad when it was sorted for 8 ply but good at 16? Maybe that won't "tell" a person why a move was good, but it gives a lot of tools to help try to understand it (which can be exceedingly difficult right now if a line is not part of the principal variation, but ultimately affects the evaluation via an existing "refutation"). But I think the other "difficulty" is that 1800 players play badly in lots of different ways, 2200s play badly in lots of different ways, and even grandmasters play badly in lots of different ways, but very strong chess engines play badly in only a few, sometimes limited, ways.
It's a bit of a game design problem too, since you may want to optimize for how "fun" the AI is to play against. There are patterns of behavior that can be equivalently challenging, but greatly varying in terms of how interesting or enjoyable they are to play against.
E.g. there are various chess bots that can be assigned personality dimensions like aggressiveness, novelty, etc.
A general AI will likely be LESS able to explain why a move is good, for exactly the reason mentioned above (post-rationalization of a massive statistical computation).
No, a truly general AI would play the way we do ... based on goal seeking. Current chess playing programs give moves a single score based on comparing and tree pruning millions of position evaluations, so they cannot possibly articulate what went into that score.
For point #2, the current state of AI allows only for "because you're more likely to win with this move." Today's AI can't reason like a human mind does, it just simulates thousands of scenarios and analyzes which are more likely to be successful, with a very primitive understanding as to why.
When playing, they have a strategy, which they could explain to other go players. They don't just recognize patterns or do brute-force look-ahead. The same is true for good chess players.
There's typically an extra layer (or more) with humans. "Because this puts your Bishop in wasabi which makes it harder for your opponent to extract his Kinglet, making it more likely to win."
Wouldn't it be possible to compare the "top" move with the "runner up" move, compare outcome percentages, and declare whether there is a small or large deviation? Or comparing the "top" move with any other possible move? Or is that too much calculation?
Well, you can make chess engines give you a numeric evaluation for several possible moves. These are typically tuned so that the value of a pawn is around 1 point. A grandmaster 1 or 2 points ahead can routinely convert to a won game, assuming he doesn't blunder.
So if the best move has an evaluation of +0.05 and the second best has -0.02 , the difference is probably a very subtle positional improvement (and the first move may not in fact be better; chess programs aren't perfect). If the best is +3.12 and the second is -0.02, and you can't see why, there's a concrete material tactic you're missing (or, less likely, a major, obvious positional devastation).
But, it can't tell you what you're missing, just the magnitude of the difference.
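It's easy to get the magnitude out of any UCI engine, though. A minimal sketch with python-chess and a local Stockfish binary (path assumed):

    import chess
    import chess.engine

    engine = chess.engine.SimpleEngine.popen_uci("stockfish")  # path assumed
    board = chess.Board()  # or whatever position you're analysing

    # Ask for the two best lines and report how far apart they are (in pawns).
    infos = engine.analyse(board, chess.engine.Limit(depth=18), multipv=2)
    best = infos[0]["score"].relative.score(mate_score=100000) / 100
    second = infos[1]["score"].relative.score(mate_score=100000) / 100
    print(f"best {infos[0]['pv'][0]} {best:+.2f}, "
          f"runner-up {infos[1]['pv'][0]} {second:+.2f}, gap {best - second:.2f}")
    engine.quit()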
Seems like a pretty thin line between these conceptions of "understanding". If the AI is programmed to "understand" the rules and objectives of the game, and it uses that information to assess the best moves for various scenarios, then how does that materially differ from human understanding?
It's strange that almost no one commenting on this has the faintest idea how chess programs work. Chess programs score moves by scoring and comparing, recursively, many millions of positions.
Actually, I wrote one as a senior project years ago. It worked exactly that way, except that the number of positions was smaller, owing to the available computing power.
Concept was the same as you describe. No rocket science. I purposely read nothing beforehand, because I wanted to devise an approach and this seemed the most obvious.
Of course, nowadays that is not the only technique in use.
In any case, I'm not sure why you think it impossible to add any additional analysis to the program as it repeatedly scores millions of positions.
To summarize, I believe what they do is roughly this: First, they take a large collection of Go moves from expert players and learn a mapping from position to moves (a policy) using a convolutional neural network that simply takes the 19 x 19 board as input. Then they refine a copy of this mapping using reinforcement learning by letting the program play against other instances of the same program: For that they additionally train a mapping from the position to the probability that it will result in winning the game (the value of that state). With these two networks they navigate through state-space: First they produce a couple of learned expert moves given the current state of the board with the first neural network. Then they check the values of these moves and branch out over the best ones (among other heuristics). When some termination criterion is met, they pick the first move of the best branch and then it's the other player's turn.
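A toy sketch of how the two networks get combined (the "networks" below are random stand-ins over a tiny 3x3 board, just to show the shape of the loop; the real system runs a full Monte Carlo tree search with rollouts and visit counts, not this one-shot ranking):

    import random

    def legal_moves(state):
        # State is just a frozenset of occupied points on a tiny 3x3 "board".
        return [p for p in range(9) if p not in state]

    def policy_net(state):
        # Stand-in for the policy network: a prior probability per candidate move.
        return {m: random.random() for m in legal_moves(state)}

    def value_net(state):
        # Stand-in for the value network: estimated probability of winning from here.
        return random.random()

    def choose_move(state, width=5):
        # Let the policy propose a handful of candidates, score the resulting
        # positions with the value net, and play the best-scoring one.
        priors = policy_net(state)
        candidates = sorted(priors, key=priors.get, reverse=True)[:width]
        return max(candidates, key=lambda m: value_net(state | {m}))

    print(choose_move(frozenset()))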
they also train a mapping from the board state to the probability that a particular move will result in winning the game (the value of a particular move).
How is this calculated?
When some termination criterion is met
Were these criteria learned automatically, or coded/tweaked manually?
1. The value network is trained with gradient descent to minimize the difference between predicted outcome of a certain board position and the final outcome of the game. Actually they use the refined policy network for this training; but the original policy turns out to perform better during simulation (they conjecture it is because it contains more creative moves which are kind of averaged out in the refined one). I'm wondering why the value network can be better trained with the refined policy network.
2. They just run a certain number of simulations, i.e. they compute n different branches all the way to the end of the game with various heuristics.
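On point 1, a tiny sketch of the regression objective: each sampled position's predicted value v(s) is pushed toward the game's final outcome z in {-1, +1} by gradient descent on (v(s) - z)^2. The "network" here is just a linear layer with a tanh, a stand-in for the deep net in the paper:

    import numpy as np

    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.01, size=19 * 19)   # toy linear "value network" weights

    def value(board_vec):
        # Predicted outcome in (-1, 1) for a flattened 19x19 board encoding.
        return np.tanh(board_vec @ W)

    def sgd_step(board_vec, outcome, lr=0.01):
        # One gradient-descent step on (v(s) - z)^2, z being the final game result.
        global W
        v = value(board_vec)
        grad = 2.0 * (v - outcome) * (1.0 - v ** 2) * board_vec  # chain rule through tanh
        W -= lr * grad
        return float((v - outcome) ** 2)

    # One (position, final outcome) pair as it might be sampled from a self-play game.
    position = rng.integers(-1, 2, size=19 * 19).astype(float)  # -1 white, 0 empty, +1 black
    print(sgd_step(position, outcome=+1.0))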
This was the question which originally led me to lose faith in deep learning for solving go.
Existing research throws a bunch of professional games at a DCNN and trains it to predict the next move.
It generally does quite well but fails hilariously when you give it a situation which never comes up in pro games. Go involves lots of implicit threats which are rarely carried out. These networks learn to make the threats but, lacking training data, are incapable of following up.
The first step of creating AlphaGo worked the same way (and actually was worse at predicting the next move than current state of the art), but Deep Mind then took that base network and retrained it. Instead of playing the move a pro would play it now plays the move most likely to result in a win.
For pros, this is the same move. But for AlphaGo, in this completely different MCTS environment, they are quite different. Deep Mind then played the engine against older versions of itself and used reinforcement learning to make the network as accurate as possible.
They effectively used the human data to bootstrap a better player. The paper used a lot of other cool techniques and optimizations, but I think this one might be the coolest.
> How can a human ever get better than their teacher?
By learning from other teachers, and by applying original thought. Also, due to innately superior intelligence. If your IQ is 140, and that of the teacher is 105, you will eventually outstrip the teacher.
I concluded that the all time no. 1 master Go Seigen's secret is 1. learn from all masters; 2. keep inventing/innovating. Most experts do 1 well, and are pretty much stuck there. Few are good at 2. I doubt if computers can invent/innovate.
I would have thought (he says casually) that some kind of genetic algorithm of introducing random moves and evaluating outcomes for success would be entirely possible, no?
It's because they have a much larger stack size than a human brain (which does not have a stack at all, but just various kinds of short term memories). An expert Go player can realistically maybe consider 2-3 moves into the future and can have a rough idea about what will happen in the coming 10 moves, while this method does tree search all the way to the end of the game on multiple alternative paths for each move.
Not true. Professional Go players read out 20+ moves consistently. Go Seigen's nemesis Kitani Minoru regularly read out 30-40 moves.
As an AGA amateur 4 dan I read 10 moves pretty regularly, and that's including variations.
And if the sequence includes joseki (known optimal sequences of 15-20+ moves), then pros will read even deeper...
Yes, the latter number was perhaps too conservative; no doubt deeper predictions are easily possible, but I doubt even expert players consider many alternative paths in the search tree. They might recognize overall strategies which reach many moves into the future, but extensive consideration of what will happen in the upcoming moves is probably constrained to only a few steps; at least relative to the number and depth of paths that AlphaGo considers.
I think a key missing component to crowd success on real expert knowledge (as opposed to trivia) is captured by the concept of prediction markets. (https://en.wikipedia.org/wiki/Prediction_market) The experts who are correct will make more money than the incorrect ones and eventually drive them out of the market for some particular area.
That's no counterpoint because the World team (of which I was a member) was made up of boobs on the internet, not players of Kasparov's strength, which was the premise of the question you responded to.
The easy thing about combining AI systems is that they don't argue. They don't try to change the opinion of the other experts. They don't try to argue with the entity that combines all opinions, every AI expert gets to say his opinion once.
With humans on the other hand, there will always be some discussion. And some human experts may be better at persuading other human experts or the combining entity.
I think it would be an interesting thing to try after they beat the number 1 player. Gather the top 10 (human) Go players and let them play as a team against AlphaGo.
This is nonsense. Combining AI systems requires a mechanism to combine their evaluations. The most effective way would be a feedback system, where each system uses evaluations from other systems as input to possibly modify its own evaluation, with the goal being consensus. This is simply a formalization of argumentation -- which can be rational; it doesn't have to be based on personal benefit. And generalized AI systems may well some day have personal motivations, as has been discussed at length.
The key part is that they basically just play out all the possible permutations, and the next permutations, and so on, get a probability of winning for each path, and take the best. It is indeed a very artificial way to be intelligent.
Hey Inufu. I just replayed the games and have to say that the first game the bot shows very high quality plays.
The next two games, it seems like Fan Hui did not perform as well as in the first (as opposed to the computer being clearly better than him). Were the games played in a row?
Regardless, I'm looking forward to the games with Lee Sedol. I studied in his school in Korea, and personally know how hard it is to get to that level.
My assessment is that the bot from those games will NOT beat Lee Sedol. So train it hard for March :)
You can see that in the first game the bot played really solidly without risk-taking and didn't want to lose even a few stones. You could say that it played very conservatively but solidly. It won by a tiny margin (2.5) so Fan Hui probably concluded that the bot would win every game by a similar small margin if he didn't change the style of play. I'm sure that in the first game Fan Hui was sounding the bot out for strengths and weaknesses, seeing if it knew all the tesujis and josekis and whatnot.
So from then on you see Fan Hui trying to mix it up and play more aggressively and what is very interesting is that he got outplayed in every game, even to the point of losing a big group and resigning.
So - if you play conservatively and tentatively and solidly it'll beat you by a sliver, if you try to out-think it it'll nail you. At least at the 2dan pro level.
I'd be hesitant to call a Lee Sedol victory ahead of time. We know that in chess Kasparov beat IBM's bot initially, but then IBM tweaked it and within a couple of years the bot was too strong. Even though Go is much harder than chess, I predict that if Google lose this time, and don't lose by much, they'll win the time after that.
The games were all played on separate days. As Fan Hui mentioned in the video, he changed his strategy after the first game to fight more, so that may explain why it seems his performance changed.
This bot has a very flexible style. It is at ease both in calm point-grabbing positions (first game) and large-scale dogfights (see the second one), where humans used to crush computers.
Lee Sedol is so strong at fighting, this is gonna be a great match between tough opponents.
Japanese 9-dan pros and former Japanese cup holders who played against CrazyStone beat it less than 80% of the time [1], while AlphaGo's win rate against it is 80%, according to a comment below by inufu, an engineer at Google DeepMind.
If transitivity applies then AlphaGo is likely stronger than the average of those former Japanese champions, including Norimoto Yoda, who is currently ranked 187th (about 300 Elo points below Lee Sedol and 300 above Fan Hui) [2].
There's a saying in Go circles that there is a substantial gap in playing style or intuition between pros and even top-level amateurs. Whether that is true or not, AlphaGo has definitely crossed the threshold to pro-level play in Go.
By March 2016, Google DeepMind will have improved AlphaGo at least somewhat, through self-play and perhaps more processing power.
The game with Lee Sedol will be an interesting one to watch!
Just to clarify your comment so people aren't confused by the apparently low winning percentages, those are all 4-stone handicap games. (It's still an apples-to-apples comparison.)
I think the 80% win rate against CrazyStone is for the single-machine version. The distributed version won 100% of the time against CrazyStone and 80% of the time against the single-machine version.
2 dan pro is a much stronger strength than 2 dan amateur. The difference between a 2 dan pro and a 9 dan pro is usually just one stone handicap whereas the difference between a 2 dan amateur and a 9 dan amateur would be around 7 stones.
You need to keep in mind that professional dan levels aren't purely based on Elo or some other objective measure.
You'll find lots of Korean pros who got awarded their 1p for going abroad and teaching. 8p and 9p in Japan, in my recollection, are reserved for those winning one of the major tournaments and defending a title there.
Wow, this is stunning. You guys beat a professional 2-dan player. That happened a lot sooner than expected. There's some kind of exponential evolution going on with AI these days.
Is AlphaGo being made available to the public? I'm a mediocre player, but I'd like to try a few games against it. Current synthetic players don't quite play the same as humans do, and it's a bit jarring. I wonder if the Google AI is more human-like in its style.
Interesting that DeepMind was using Google Cloud for compute. I imagine that the MCTS expansion can become massive. Any chance DeepMind may publish some of the internals: how many instances were used, how computation was distributed, any packages or frameworks used, etc.?
And congrats on achieving this impressive AI milestone!
Thanks, cfcef! Implementation details are in the "Methods" section. I have started experimenting with small GPU ML cloud jobs and the costs do add up. I wanted to get a sense of what a large job looked like, and indeed, AlphaGo is gargantuan: 50 GPUs with an approximate training time of one month for the policy/value network. So, a Google-R&D-size budget would be a prerequisite ;)
In the recent 5-game match between Ke Jie and Lee Sedol a few weeks ago, the fifth game was decided in Ke Jie's favour by less than a single point. It literally would have come out differently if they'd used a subtly different (commonly used) scoring ruleset. It's not at all clear who's stronger at their peak.
Keep in mind that professionals are counting throughout the game, and are playing to win, not to win by a lot. So a 0.5 point victory may simply mean the victor was confident in their position and chose not to take unnecessary risks.
This is an impressive achievement. However there are many subtleties involved when humans play against computers. I think only time can tell how big a breakthrough this really is.
It is telling that AlphaGo only won 3:2 in the informal games. As a computer doesn't know the difference between formal and informal, this seems to indicate that AlphaGo isn't truly above Fan Hui in strength. Also, the formal games were played with fast game rules, which may be particularly advantageous to computers. Unlike chess, Go accumulates pieces on the board throughout the game. Towards the end there are many stones on the board and it is easy for a human to err, while the search space (possible moves) actually gets smaller for the computer, and there is no doubt that the computer has stronger book-keeping capabilities. So to fairly evaluate human vs computer we may need time rules different from those used in human vs human games.
The paper does not disclose whether the trained program displays true understanding of the game rules. Humans don't just use pattern recognition; they also use logic to evaluate the game state. While this could be addressed by the search part of the algorithm, the paper doesn't appear to give any indication of whether this was studied. For example, the board position strictly speaking does not determine the game state due to ko rules (so the premise of the paper that there is a valuation function v(s), where s is the board position, that determines the game outcome is incorrect). It would be particularly interesting to see how the algorithm fares when there are multiple kos going on at the same time. Also it would be interesting to see how well the algorithm understands long-range phenomena such as ladders and life and death. With a million-dollar challenge in the plan it is understandable the Google team may not want to disclose weaknesses of the algorithm, but in the long run we will get to know how robust it really is.
From my experience playing against conv nets, I would say that if you treat your computer opponent as a human it would be like playing against the hive mind of a group of experts with infallible memory, and that is not to your advantage. So one would be better off trying "cheat" moves that human experts do not use on each other and seeing how well the computer generalizes. Without search, and with neural nets alone, it is clear that computers do not generalize that well. So it would be interesting to see how well search and neural nets work together and whether someone can find the algorithm's weak spots.
>a computer doesn't know the difference between formal and informal this seems to indicate that Alpha isn't truly above Fan Hui in strength. Also the formal games were played with fast game rules,
I'm not sure if I'm misunderstanding you, or you're misunderstanding the situation, but the informal games had faster time controls than the formal ones: the formal games had one hour of main time, while the informal games just had byo-yomi (3 periods of 30s: if you take more than 30 seconds for a move, you "use up" a period).
"so the premise of the paper that there is a valuation function v(s), where s is the board position, that determines game outcome is incorrect"
No, it isn't, any more than the claim that there's a position evaluation for chess is incorrect because of castling and capture en passant. A "position" isn't just where the pieces are on the board, but includes all relevant state information.
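A hypothetical sketch of what "position" can mean in that sense: the state s handed to a value function v(s) can carry whose turn it is, the current ko point, and past boards for superko, just as a chess position carries castling and en passant rights.

    from dataclasses import dataclass
    from typing import FrozenSet, Optional, Tuple

    @dataclass(frozen=True)
    class GoState:
        # Hypothetical state object: more than just where the stones sit.
        stones: Tuple[int, ...]                              # 361 entries: 0 empty, 1 black, 2 white
        to_play: int                                         # 1 (black) or 2 (white)
        ko_point: Optional[int] = None                       # point made illegal by a simple ko
        previous: FrozenSet[Tuple[int, ...]] = frozenset()   # earlier boards, for superko rules

    def v(s: GoState) -> float:
        # A value function defined over GoState sees the ko information too.
        return 0.0  # stand-in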
I believe that Lee Sedol would sweep all the games if he were to play Fan Hui under the same conditions, which makes the wait until the official match tantalizing. It's fair to say that AlphaGo has mastered Go, but there is a very large difference between a professional who has moved to the west and professional players who are competing at the highest level in regular matches. It's fair to represent Fan Hui as a master of the game, but misleading to represent him as equivalent to currently competing professionals. It is great that we'll get to see a match up against a player who is unquestionably one of the best of all time.
As a non-expert, may I ask (as the term does not appear in the paper): How valuable is the Shannon number in order to evaluate "complexity" in your context?
Since both numbers are out of the realm of brute-forcing, the bigger achievement is because of the more fluid and strategic nature of Go compared to chess. Chess is more rigid than Go, and playing Go employs more 'human' intelligence than chess.
Quoting from the OP paper:
"During the match against Fan Hui, AlphaGo evaluated thousands of times fewer positions than Deep Blue did in its chess match against Kasparov; compensating by selecting those positions more intelligently, using the policy network, and evaluating them more precisely, using the value network—an approach that is perhaps closer to how humans play. Furthermore, while Deep Blue relied on a handcrafted evaluation function, the neural networks of AlphaGo are trained directly from gameplay purely through general-purpose supervised and reinforcement learning methods."
"Go is exemplary in many ways of the difficulties faced by artificial intelligence: a challenging decision-making task, an intractable search space, and an optimal solution so complex it appears infeasible to directly approximate using a policy or value function. The previous major breakthrough in computer Go, the introduction of MCTS, led to corresponding advances in many other domains; for example, general game-playing, classical planning, partially observed planning, scheduling, and constraint satisfaction. By combining tree search with policy and value networks, AlphaGo has finally reached a professional level in Go, providing hope that human-level performance can now be achieved in other seemingly intractable artificial intelligence domains."
I will admit to not following AI at all for about 20 years, so perhaps this is old hat now, but having separate policy networks and value networks is quite ingenious. I wonder how successful this would be at natural language generation. It reminds me of Krashen's theories of language acquisition where there is a "monitor" that gives you fuzzy matches on whether your sentences are correct or not. One of these days I'll have to read their paper.
For language generation, AFAIK there is no good model that follows this architecture. For image generation, Generative Adversarial Networks are strong contenders. See for instance:
Was it easy to convince the players to have a match with AlphaGo? Or was there some reluctance, especially now that losing is becoming more of a possibility at even strength?
I don't know about Go professionals, but this might be the last time a human can win against computers. (Or the first time computers will win all the time.) It's a strange honour in either case.
When can we hope to see some form that will allow the public to play against this even if it is pay to play for each game or a weaker PC version? I hope the system does not end up being put away like Deep Blue was.
Also, what kind of hardware results in this level of play and how is hardware correlated with strength here?
I'd love to put it on a public go server and will try to convince people :)
However, this will have to wait until after the match in March, that's our number 1 priority at the moment.
There are graphs in the paper showing how it scales with more hardware.
Deep Blue was only innovative in that it was specialized hardware for this type of search. The algorithms it used were well-established, and as there was no way to play it as a piece of hardware without great expense, there wasn't really a reason to keep it around.
Chess engines you can run today, for free, on your own laptop, are far and away better than Deep Blue (and any human), and I believe still don't reach Deep Blue's raw speed.
I'm curious: is there a chess league for software? And if yes, how much better (in Elo) are they already than humans, when run on commodity server hardware?
I can't find the claimed Elo for Jonny (the current champ), but Junior (the previous champ) is listed at 3200+. For reference, Magnus' top rating, the highest Elo rating ever achieved by a human, is 2882.
There is a lot of politics in chess programming but the bottom line is that Komodo is currently the strongest program followed by Stockfish (which is distributed under GPL).
In the paper you estimate that distributed AlphaGo has an Elo rating around 3200 (if I read the plot correctly). According to goratings.org, Fan Hui is rated 2900 and Lee Sedol is rated 3515. Doesn't that mean you still have work to do before beating Lee Sedol?
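A quick back-of-the-envelope with the standard Elo expectation formula (assuming those ratings are on a comparable scale):

    def expected_score(r_a: float, r_b: float) -> float:
        # Standard Elo expectation for player A against player B.
        return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

    print(expected_score(3200, 3515))  # ~0.14: roughly one game in seven at those ratings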
The ancient Chinese game of Go is one of the last games where the best human players can still beat the best artificial intelligence players. Last year, the Facebook AI Research team started creating an AI that can learn to play Go.
Scientists have been trying to teach computers to win at Go for 20 years. We're getting close, and in the past six months we've built an AI that can make moves in as fast as 0.1 seconds and still be as good as previous systems that took years to build.
Speaking of which, what would a robot war look like? I imagine a large portion of the effort would be to hack/otherwise persuade the enemy robots to switch sides.
Here's what the US Army Research Lab thinks warfare will look like in 2050. Reading just the Contents is a pretty good TL;DR but basically augmented humans, micro targeting, lots of misinformation, and so much information that decisions have to be automated to the point that humans can only operate "on the loop" instead of "in the loop".
Politics. There are obvious strategies (bomb everything, if war still not over, build bigger bombs and GOTO 10), but that sort of demotivates humans and they stop the war.
It'll be interesting (as in the old fake Chinese curse interesting) to know what happens when less democratic power structures wage war. They are still constrained by economics (trade creates value, if they bomb the shit out of someone they might find themselves cut off from trade, see embargoes and sanctions on Russia) and the possibility of an internal struggle (civil war) is always there.
It will always come back to threatening human lives. You can always smash machines against each other, but ultimately it's pointless until the humans themselves are threatened with their lives.
I suspect it wouldn't be difficult to program a computer to beat humans at League, but no one has put real effort into it because of anti-cheat and the fact that it would be looked at more as "hacking" than an intellectual challenge.
With over 10 000 hours of Dota [1] under my belt I am fairly certain that even with perfect mechanical skills [2] a strong human player will still beat the A.I.
Even at the very top pro level, matches are constantly being won & lost purely based on the initial hero drafting phase. Calculating the optimal draft is way more difficult than Go. It's probably not an exaggeration to say that the search space scale difference from dota draft to Go is about the same as from Go to tic-tac-toe. Because it's not only about the 100+ different heroes grouped into combinations, but also every possible game that can happen then with those combinations.
Then once the game starts, the A.I. may be able to respond to actions extremely quickly, with perfect precision. But what should the responses be? This is not a simple thing to answer and humans keep taking completely different approaches as our understanding of the game keeps evolving.
Also, the very best dota bots can currently only beat absolute beginners who don't understand the game at all yet. It takes a few hundred games of practice for a human to go beyond the best A.I. currently available.
--
[1] A game similar to League of Legends
[2] Directly reading from memory and then directly calling gameplay functions, basically a perfect hack, would get you what is known as "mechanical skills". For example being able to last hit.
I understand what you're saying, but I think you fail to realize that a good A.I. is almost the same thing as a better human brain.
You can just train the A.I. by replaying thousands of professional matches, and then let it train by playing billions of matches extremely quickly. It can even play both sides at the same time, and try out millions of different strategies against each other. It doesn't even matter if there's a limitation on how fast it can click and press keys, since every single action will be perfectly optimal.
Not only that, but you don't even need to write a single line of code to tell it about the rules of Dota. Before you start the training, it doesn't even have to know what each key does, or what happens when you move the mouse. Neural networks are capable of learning all of this from scratch, basically by trial and error.
This is not your typical dota bot. Bots are not A.I.s, so this is a whole different ballpark.
I understand what you're describing and I believe that this will be possible in the distant future. I just haven't seen any evidence of this being even close to possible with today's technology.
I think the primary problem is the search space size. I've seen this type of learning work on simple 8 bit games, and it seems we may finally be at the stage to handle Go. However Dota has many orders of magnitude more different possible moves at any given situation. The total search space grows incredibly fast after every move.
Thus, I do think neural nets can eventually learn how to play well, it's just that there's simply not enough memory or processing power to achieve any success right now.
My point was that the actual game is what you need to compare - the choice of heroes is superficial compared to the complexity of the game itself, in either case. "Calculating the optimal draft is way more difficult than Go." is simply massively wrong.
The game field in a computer game will be quantized - even if it uses floating-point arithmetic that's still quantization. Conversely a Go board can be scaled up or down without losing the feel of the game. If the grids were the same resolution, then at any given point in time you have many more choices in Go because you can play literally anywhere.
"Calculating the optimal draft is way more difficult than Go." is correct because you can't calculate the optimal draft without calculating all the possible games that can happen. Every hero is unique, you can't really preserve anything from the game calculations of another hero.
There's a question of what the AI is doing - makes more sense with an FPS:
1. The AI is a program running on the computer, so the input is the state of the game. This is basically just an aimbot, and would be trivial to do - just make it wander around randomly and headshot everything instantly.
2. The AI is looking at the screen like a human player, and has to parse the screen data (we could even let them have direct pixel input, not a camera). This would be much harder.
For a turn-based game like Go or Chess, the distinction is vague because the CV required to parse a board is fairly trivial and orthogonal to the problem of strategy.
I understand that training this thing requires massive amount of computation, so has to be done on a massive cluster to do in reasonable time. Once they have it trained, though, what are the computational requirements like? Would it be feasible to run it on an ordinary PC?
Speaking of Go programs and computation requirements, one of the best performance hacks ever was done by Dave Fotland, the author of the program "Many Faces of Go", which was one of the top computer go programs from the '80s through at least around 2010.
He donated code from MFoG to SPEC, which incorporated it into the SPECInt benchmark. So, the better a given processor was at running Fotland's go code, the better it scored on SPECInt. Since SPEC was one of the most widely reported benchmarks, Intel and AMD and the other processor makers put a lot of effort into making their processors run these benchmarks as fast as they could.
Net result, Fotland had the processor makers working hard to make their processors run his code faster!
Hiya. Reporter here. On the press briefing call Hassabis said that the single node version won 494 out of 495 games against an array of closed- and open-source Go programs. The distributed version was used in the match against the human and will be used in March. The distributed AlphaGo used in the October match ran on about 170 GPUs and 1200 CPUs, he said.
Pachi isn't in the same league as AlphaGo. Pachi can't win at all against AlphaGo without handicap. With a 4 stone handicap Pachi only wins 1% of the time against AlphaGo.
Fan Hui is 2p, so very skilled, but the ranking system goes up to 9p. To give a sense of how large a gap that is, there is and has only ever been one Westerner to achieve that rank, Michael Redmond. The article states they plan to face off against Lee Sedol 9p, and if they beat him in a no-handicap game, that will be as impressive as Deep Blue against Kasparov.
You'll want to watch this year's Computer Go UEC Cup[0] in March, with Zen and CrazyStone being the typical victors. (CrazyStone in particular has sustained a 6d rating on KGS, a popular Go server, but lately has been at 5d. In the past it has beaten a 9p with a 4 stone handicap, which is impressive, but that handicap is huge and it hasn't yet won with 3 stones.) Of interest this year is that Facebook is competing, and AFAIK they seem to take a similar approach as Google by training the AI using deep learning techniques and then strengthening it further with MCTS. In their public disclosures they claim to beat Pachi pretty often, which puts their bot around 4d-6d; it'll be interesting to see how it fares against Zen and CrazyStone in the Cup and if it wins against a 9p.
This is not correct. In the professional ranks 1p is often no weaker than 9p because often the young players start out at 1p and are very strong because selection pressure is much greater. The 9p rank in Japan also came partly out of just playing a lot for a lot of years so that accomplishment was not as great as it seems. In any case, active professional players for the most part are not dramatically stronger than one another. The range for the most part is about 2 stones, maybe 3.
Even though Fan Hui is not an active professional in the traditional sense this is an absolutely huge accomplishment and leap in playing ability by computers. BTW, Michael Redmond is not particularly strong by professional standards.
(Edit: The correlation between strength and rank now, as noted below, is due to promotions more often coming from achievements: If you win at X you get promoted to 7p immediately, if you win at Y you get promoted to 9p. You cannot win a big tournament and keep your low rank. Here is an example of someone who went from 3p to 9p in one match: https://en.wikipedia.org/wiki/Fan_Tingyu. But professionals cannot give each other 6 stone handicaps when one is 9p and the other is 3p)
This is not quite correct either. 1p is of course generally much stronger than an arbitrary amateur dan (even some (many? most?) 9d amateurs), and you can advance up the pro ranks quickly by winning certain games, but you can see from the histograms that, while there's a clump of 9ps everywhere but China, there's still a distribution. http://senseis.xmp.net/?ProfessionalRankHistograms Another problem is you don't lose your 9p rank once you earn it. It would be nice if there were an international Elo system tracking all the 9ps of various countries to rank them properly... Maybe someone's tried to calculate rankings independently? Still, I think it's pretty uncontroversial that someone who's been a lower-rank pro for longer than a few years is going to be significantly weaker than their higher-rank peers.
There are two major attempts at international ratings today. Dr. Bae Taeil does ratings for Korea, and Remi Coulom produces ratings independently, based on the database at go4go.net (http://goratings.org is the site).
Taeil's method seems unusual, though it may well be justified. I think he's using a relatively complete database of games, but I don't know for sure. Coulom has a very well regarded mathematical model, but we know that there are some gaps in the database, which a) may skew international comparisons, and b) may result in inaccurate ratings for players with few games in the database (but those players are usually not top players in the world).
See my comment below: there are very few 1p players near the top, unsurprisingly.
I feel like this is mostly misleading. Look at a list of top players (goratings.org). The top fifty is mostly players 5p and up. There are a few 3p or 4p Chinese players running around, and apparently even 1 1p (Li Qincheng) but by and large, there is a relation.
Yes. He's a case where you'd make exactly the right assumption if just told his rank.
Though I should warn: since he's a low-level pro who moved to Europe, this database has very little data on him. His European results indicate that he's stronger than Pavol Lisy, Alexander Dinerstein, and Mateusz Surma, who all rate higher than him here.
I don't think it's overblown at all. They make it clear he is the European champion and never imply he is the best player in Go, just obviously very good. Looks like he has won the last 3 European championships in a row (https://en.wikipedia.org/wiki/European_Go_Championship).
Overall I think it was a well balanced article highlighting an impressive achievement, not sure why you felt the need to diminish it as not significant?
The version of AlphaGo used in the paper beats CrazyStone about 99% of the time. Even if we give CrazyStone 4 stones handicap, AlphaGo still beats it 80% of the time.
Very nice. How come you guys didn't enter the UEC Cup? (Edit: rereading this it kind of sounds snarky, not meant that way. Really impressive if you can kick CrazyStone's teeth in...)
From the paper, no that's with "single machine AlphaGo" which is 48 CPUs and 8 GPUs. Distributed AlphaGo beats Single Machine AlphaGo 77% of the time. (Distributed AlphaGo being 1202 CPUs and 176 GPUs.)
It's worth clarifying your use of the "p" rating scale alongside the "dan" rating scale. Essentially, there's an "amateur dan" scale which is what's measured on public servers like KGS. Under this scale essentially all professional Go players rank at the maximum rank, 9 dan.
In parallel there's the professional ranking system, which also uses the title "dan" but is bestowed by the professional Go associations of each country. These rankings are symbolic rather than quantified, although higher professional dan ranks generally do correspond to higher skill.
So, a "European 2d" is a professional rank which may or may not have a good translation into a quantitative scale like ELO or "KGS dan" (I actually don't know). Generally, my understanding is that European professional rankings lag behind Asian ones as well by some amount.
You're generally accurate, except that Fan Hui is a Chinese 2p from the Chinese Pro Association who has played in Europe for several years. He's still generally a bit stronger than the homegrown European professionals.
> In the past it has beaten a 9p with a 4 stone handicap, which is impressive, but that handicap is huge and it hasn't yet won with 3 stones
4 stones is a huge handicap? 4 stones is a huge handicap if he was playing against a 6p maybe. But with 4 stones handicap anything near 2-4p would still have a hard time.
Related question: Has anyone ever experimented with taking a really strong engine for one of these games and then simulating the sort of mistakes that humans make (overlooking certain positions, looking ahead fewer steps, etc) to try and simulate different skill levels of opponents?
I don't play Go, but I dabbled a little bit in Chess, and it seems like the engines with "difficulty settings" don't really make the sort of mistakes that humans make. They can, of course, just over-prune the look-ahead tree, but that's not how the human brain works.
I recall in ~1990 I read a paper which described exactly this (attempt to make the same mistakes as humans). The argument for doing so in the paper was, IIRC that computers beating humans at chess was just a matter of time, but no longer interesting for AI research.
It seems quite likely that the book that paper was in was Computers, Chess, and Cognition: https://www.springer.com/us/book/9781461390824
(revised contributions from the WCCC 1989 Workshop New Directions in Game-Tree Search, May 29-30, 1989)...and was probably the one by John McCarthy (yes, that McCarthy)
Edited to add: Found it! The paper I recall reading was 'Artificial Stupidity' by William Hartson, in Advances in Computer Chess 4 (1986) https://chessprogramming.wikispaces.com/Advances+in+Computer... ... perhaps nothing concrete came of this though - Hartson was a player and author, not an AI researcher.
Yeah, IIRC most engines just blunder every once in a while depending on the difficulty setting, rather than playing consistently weaker moves, or something like that.
it's on my nebulous "someday" todo list to add this sort of thing to quackle [one of the two current leading-edge scrabble programs, elise being the other one]. ideas include simming opponent play with a reduced vocabulary, attempting plausible phonies, and adapting the search tree to account for how well opponent's previous moves matched the simulated best move for them. on the other side of the coin, that would also allow quackle to have realistic skill levels (right now the only skill adjustment is essentially "stop thinking after a certain time and just play the best move found till then")
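One generic way to get the "consistently weaker but never absurd" behavior discussed above, rather than super-GM moves with occasional blunders, is to sample moves from a softmax over the engine's own evaluations, with a temperature controlling strength. A minimal sketch, not any real engine's API; the move names and scores are made up:

    import math
    import random

    def pick_move(evaluations, temperature=0.25):
        """Sample a move with probability proportional to exp(score / temperature).

        evaluations: dict mapping move -> engine score (higher is better).
        A small temperature approaches best-move play; a larger one plays weaker
        overall while still preferring reasonable moves over outright blunders.
        """
        moves = list(evaluations)
        best = max(evaluations.values())
        # Subtract the max before exponentiating for numerical stability.
        weights = [math.exp((evaluations[m] - best) / temperature) for m in moves]
        return random.choices(moves, weights=weights, k=1)[0]

    # Hypothetical scores (in pawns) for four candidate moves:
    print(pick_move({"Nf3": 0.30, "e4": 0.28, "h4": -0.90, "Qh5": -1.50}))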
AlphaGo played Fan Hui using 1202 CPUs and 176 GPUs.
Its strength was assessed to be 3140 ELO on the scale used by www.goratings.org/ (BTW, this is different from the European ELO system)
That would put it at #279 in the world. Fan Hui is number 633 at 2916. AlphaGo's next opponent will be Lee Sedol, #5 at 3515.
The single computer version of AlphaGo is estimated to be 2890 ELO, which would be #679 in the world. It's closer to what we might be playing against on our laptops and phones, but that's still 48 CPUs and 8 GPUs.
Doing some arithmetic on that, the increase in ELO score with the number of processors seems to roughly match the 60 points per doubling of computer speed suggested in Wikipedia's chess article[1]. As Sedol is 375 points ahead, that would suggest they'd need about 80 times as many processors to equal him with current software. Maybe they'll improve the software. Or crank up a lot of machines.
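For what it's worth, here is that arithmetic spelled out; all the figures are the rough estimates quoted in this thread, and the 60-points-per-doubling number is the chess rule of thumb, not anything measured for AlphaGo:

    alphago_distributed = 3140  # estimated ELO of distributed AlphaGo (goratings.org scale)
    lee_sedol = 3515            # Lee Sedol on the same scale
    elo_per_doubling = 60       # chess rule of thumb for a doubling of compute

    gap = lee_sedol - alphago_distributed  # 375 points
    doublings = gap / elo_per_doubling     # 6.25 doublings
    print(2 ** doublings)                  # ~76, i.e. roughly 80x the hardware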
This is a wonderful achievement! I wrote and sold a Go playing program for the Apple II in the late 1970s, and I have had the privilege of playing the women's world champion and the champion of South Korea -- so I have some experience to base my praise on.
I never thought that I would see a professional level AI Go player in my lifetime. I am taking the singularity more seriously.
I'd used minimax before but not heard of Monte Carlo tree search. It sounds interesting.
From [1]:
> Although it has been proven that the evaluation of moves in MCTS converges to the minimax evaluation, the basic version of MCTS can converge to it after enormous time. Besides this disadvantage (partially cancelled by the improvements described below), MCTS has some advantages compared to alpha–beta pruning and similar algorithms.
> Unlike them, MCTS works without an explicit evaluation function. It is enough to implement game mechanics, i.e. the generating of allowed moves in a given position and the game-end conditions. Thanks to this, MCTS can be applied in games without a developed theory or even in general game playing.
> The game tree in MCTS grows asymmetrically: the method concentrates on searching its more promising parts. Thanks to this, it achieves better results than classical algorithms in games with a high branching factor.
> Moreover, MCTS can be interrupted at any time, yielding the move it considers the most promising.
Aside from the tree search perspective, MCTS can be thought of as an extension of upper confidence bound approaches for multi armed bandit problems. This stuff is all about exploration vs exploitation tradeoffs - which is something quite different to Minimax / alpha-beta search - those prune branches when they logically prove that evaluating the branches cannot improve the decision of which move to make.
A multi armed bandit problem offers you one decision to pull one of n levers, so it's like a wide shallow game tree where the root has n leaves.
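To make the bandit connection concrete, here is a minimal sketch of the UCB1-style selection rule at the heart of UCT: each child node is scored by its average simulation reward plus an exploration bonus, and the highest-scoring child is the one whose subtree gets explored next. This is a generic illustration, not AlphaGo's exact formula:

    import math

    class Node:
        def __init__(self):
            self.visits = 0          # simulations that passed through this node
            self.total_reward = 0.0  # sum of simulation outcomes from this node

    def ucb1_score(child, parent_visits, c=1.4):
        """Average reward plus an exploration bonus that shrinks as the
        child gets visited more often (classic UCB1, as used in UCT)."""
        if child.visits == 0:
            return float("inf")      # always try unvisited children first
        exploit = child.total_reward / child.visits
        explore = c * math.sqrt(math.log(parent_visits) / child.visits)
        return exploit + explore

    def select_child(parent, children):
        return max(children, key=lambda ch: ucb1_score(ch, parent.visits))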
I wrote a MCTS backgammon move recommendation engine/AI for a class in college. It was a lot of fun.
It was easy to ramp up to many many cores.
I have no idea how to play the game well, never learned, but because the win/lose rules are so clear and using this method, no need to build in game/move theory. Still got great results compared to "advanced" players' games.
I wrote a MCTS Ultimate tic-tac-toe engine[0] over the last little while, and likewise don't know how to play the game well. One thing I have been mulling over but haven't really explored is training an NN on the game tree produced by long searches and somehow extracting strategies a la DeepDream. I don't have much experience with NNs though, so no idea if it's even possible or what the extracts would look like.
If anyone would find this interesting work to collaborate on feel free to contact me. The MCTS part is done.
When reading such news I always get the feeling that the game is stacked in favor of the computer. Not in the sense of rules, but in the sense of media coverage and interpreting the results. It seems like a lot of people in AI and IT are pretty desperate for validation of their notions about human (un)intelligence.
Watch 'Game Over: Kasparov and the Machine' for a good description of what I mean. And yes, I bet many people here hate that movie. That's exactly what I'm talking about.
There are a lot of relatively simple chess programs that can beat most amateur players. Personally, I cannot beat the Go program I have on my cellphone. (I am pretty horrible at the game.) None of that seems newsworthy. And yet after 5 wins against a pretty strong (but not world's best) Go player, and it's suddenly a "breakthrough" that changes everything.
If some new chess player had beaten Deep Blue 4 to 2, would we have reverted the perceived status of AI to what it was before that match with Kasparov?
I'm not saying this advance in AI is unimpressive. I just don't see a fair evaluation of what it means. There is too much one-directional hype.
> If some new chess player had beaten Deep Blue 4 to 2, would we have reverted the perceived status of AI to what it was before that match with Kasparov?
Not saying I disagree with much of your comment, but I think the answer to that question is actually, "yes". As a middling chess player, I think it would have been impossible for "a new chess player" to have beaten Deep Blue 4 to 2, and these days, it requires near-best-in-the-world prowess to pick up any games at all off the best chess computers, which I think proves that it wasn't hype-y or unfair to claim that the machines have beaten us at this.
Hype is hype, whether it turns out to be justified or not. Not all people can beat even a trivial Go program.
You may be missing my larger point. It is not about whether humans can beat the best chess or Go computers. It is about a very strong bias in favor of algorithms in general. The dynamics around human vs computer chess matches simply demonstrated this bias in an easily perceivable form. Who cared about Kasparov's later victories and draws against other state of the art programs? They received very little coverage.
Chess is just a game. The important thing to consider is what will happen when algorithms start to compete with people in more complex and ambiguous areas.
I understand your larger point, but that's just how news works: you report on the new stuff, not the old stuff. Humans beating computers at chess was old news when Deep Blue happened, so it wasn't worth reporting when that happened. Nobody talks about computers beating humans at chess anymore, because that's old news now. It would be (really big) news again if a human started beating the best chess computers again.
It is old news for humans to beat computers at Go, but the reverse is still new news.
I agree with you that people who infer an impending AI takeover of the world from computer success in specific games are just being silly.
Well, I think that the hype is one-directional because the default assumption is that machines will lose to humans in tasks that involve reasoning. We know that there's no computer that can learn to put away the dishes after watching a person do it, so there's nothing to talk about and nothing to hype.
It's a milestone in the progress in AI. It's been nearly 20 years since Deep Blue beat Kasparov and an obvious next goal was Go. They're not quite there yet but it looks like next year.
There was a statement in the movie that went something like, "After Deep Blue was announced the winner, IBM's stock jumped 15%" I took a look at the historical data Yahoo(?) had for that date and there was no spike at all for IBM.
They frame it as some huge setup by IBM, but to me, it just seemed like Kasparov got psyched out and thrown off his game.
Not just in your opinion. The search space is exponentially larger in Go because you can place a stone on any unoccupied intersection.
"The search space for Go's game tree is both wider and deeper than that of chess. It has been estimated to be as big as 10^170 compared to 10^50 for chess, making the normal brute-force game tree search algorithms much less effective. "
To provide a little more perspective. The estimated number of atoms in the observable universe is about 4*10^80. So we have no hope of ever being capable of "solving" go by mapping out every possible game state AKA "brute forcing" the game.
Personally I think it's not nearly as impressive as state-space comparisons would have you believe. The initial difficulty in cracking Go was because we were using standard game AI techniques that work on games with a much smaller state space. Once we developed methods specifically for Go, we started making progress in leaps and bounds. But this doesn't represent fundamental progress in the AI of games, but rather a recognition of using the wrong tool for the job.
The difference between Chess and Go is critical here. I don't know if there's any formal analysis of this sort, but the difference seems to be that in Go multiple paths to the same board state are very similar in evaluation. And so sets of moves can be evaluated as a batch or stochastically. I think it was an algorithm that exploited this property that saw one of the first significant jumps in computer Go strength. Contrast this with chess, where different paths to a given board state vary so widely in evaluation that you must evaluate them all. I suspect it is this property that allows Google's technique to be effective in evaluating positions. But I don't see it extending to games that aren't similar to Go in this regard, like chess.
> in Go multiple paths to the same board state are very similar in evaluation
Not at all, different paths to a board state also have widely varying evaluations in Go. Joseki is a great example of this, especially since these sequences have been thoroughly analyzed. In most joseki playing any move out of order will leave weakness and a skilled opponent will not just continue to play the pattern out of sequence.
You're right of course. The branching factor of Go really only shows how much more difficult it is to blindly brute force than chess. It's worth saying that neither game is close to being solved completely (as in mapping every possible game state). Creative techniques must be used in both games to cull out large chunks of the search space.
I definitely see how the more simplistic movement rules (and lack of variety of piece types) in Go would make it easier to cull off large swathes of the search space. Despite this my gut tells me that the branching factor is still the best benchmark we have (and easiest to convey) for comparing the computational difficulty of solving a board game with an adversarial search algorithm.
"Some of the factors responsible for the game tree's enormity are shown below. (A comparison to chess is shown in brackets)
the size of the board : 19x19 (8x8)
the average number of moves per game is around 300 (80)
the average branching factor at each turn is around 235 (35)
players can pass"
It's been 10 years since I did the math for this in my AI class, but a simplified example would be to compare the start of each game. A Go player has 19^2, or 361, possible moves. Then on the second turn there are 360 possible moves (since one space has been taken up by the first move), so after two moves there are 361*360, or 129,960, possible Go board configurations.
A chess player is more restricted by the smaller size of the board and the way pieces are permitted to move, so there are a total of 16 possible pawn moves on the first turn plus 4 possible knight moves, leaving 20 total moves available. On the second turn the second player is similarly restricted to 20 possible moves, bringing the total number of chess board configurations after two moves to 400. Once more room opens up on the chess board the branching factor does increase a bit, but it never comes close to the hundreds of potential moves at any step during a Go game.
This "branching factor" is the key to the size of a search space. In chess it gets messy to calculate due to the variety of piece types and their allowable movements, but it's easy to see that at any given board configuration there are fewer possible moves. To calculate the search space you multiply the number of potential moves at each step until every board configuration has been accounted for.
I think being difficult "for computers" is a misleading concept. Go is more difficult for our current algorithms than chess, at least in part because we've focused on chess for much longer.
Sooner or later we may discover some new algorithm (this deep-neural-net-based approach may be it) which will reduce Go to be as tractable as chess, or even more so.
It was more difficult for programmers to build an AI that reaches top human level, and to get to that level you have to throw much more hardware at it.
Until it isn't. Once humans are handily beaten in both games, it's kinda hard to tell which game is harder (I guess you could compare the amount of computations needed to beat the top human player, but that does not let you distinguish if it's harder for computers or conversely easier for humans.)
This did not happen overnight. The big computer companies have been building AI labs for years, decades in the case of Microsoft. They've been revisiting Go for the past few years too. One of Marvin's early students advises Google's R&D. These companies have raided every major university. If Marvin was still alert in his final years, he would have known something about this.
Don't think many have said it here, so just fyi, to me this does not seem like a big 'breakthrough' in AI so much as a good demonstration of smartly combining existing current ideas. Quoting from a developer of a Monte Carlo tree search based Go bot (http://www.sciencemag.org/news/2016/01/huge-leap-forward-com...):
"Coulom agrees, but he notes that there isn't one key new invention that makes the whole program work. "It's more like a great engineering achievement," he says. "The way they put all the pieces together is really innovative.""
Not to disparage the result, but just saying it seems inline with the progress in recent years of using deep neural nets for reinforcement learning with games.
>Don't think many have said it here, so just fyi, to me this does not seem like a big 'breakthrough' in AI so much as a good demonstration of smartly combining existing current ideas.
Well yes. But good engineering is still impressive.
People occasionally ask me about signs that the remaining timeline might be short. It's very easy for nonprofessionals to take too much alarm too easily. Deep Blue beating Kasparov at chess was not such a sign. Robotic cars are not such a sign.
This is.
"Here we introduce a new approach to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves... Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0."
This matches something I've previously named in private conversation as a warning sign - sharply above-trend performance at Go from a neural algorithm. What this indicates is not that deep learning in particular is going to be the Game Over algorithm. Rather, the background variables are looking more like "Human neural intelligence is not that complicated and current algorithms are touching on keystone, foundational aspects of it." What's alarming is not this particular breakthrough, but what it implies about the general background settings of the computational universe.
Go is a game that is very computationally difficult for traditional chess-style techniques. Human masters learn to play Go very intuitively, because the human cortical algorithm turns out to generalize well. If deep learning can do something similar, plus (a previous real sign) have a single network architecture learn to play loads of different old computer games, that may indicate we're starting to get into the range of "neural algorithms that generalize well, the way that the human cortical algorithm generalizes well".
A number of commenters are talking about how the human professional who was beaten is well below the world champion. This is entirely missing the point. Beating the best human is an entirely arbitrary threshold, which is why Deep Blue vs. Kasparov wasn't a great sign per se. There's probably nothing computationally distinguished about the very best human versus a very good human - the world champion isn't using a basically different algorithm. What matters is the discontinuous jump, how it was done, and the absolute level of human-style competence achieved.
This result also supports that "Everything always stays on a smooth exponential trend, you don't get discontinuous competence boosts from new algorithmic insights" is false even for the non-recursive case, but that was already obvious from my perspective. Evidence that's more easily interpreted by a wider set of eyes is always helpful, I guess.
I hope that everyone in 2010 who tried to eyeball the AI alignment problem, and concluded with their own eyeballs that we had until 2050 to start really worrying about it, enjoyed their use of whatever resources they decided not to devote to the problem at that time.
""neural algorithms that generalize well, the way that the human cortical algorithm generalizes well"."
I think we have already been seeing this for years with image recognition, speech recognition, and other pattern recognition problems. As with those problems, playing Go is one of those things you can easily get heaps of data for and formulate it as a nice supervised learning task. The task is still spotting patterns on raw data with learned features.
However, the current deep learning methods don't (seem to) generalize well to all that our brains do - most of all, learning to do many different things online from small amounts of data. I have not seen any research into large scale heterogeneous unsupervised or semi-supervised learning with small batches of input - these big neural nets are still used within larger engineered systems to accomplish single specific tasks that require tons of data and computing power. Plus, the approach here still uses Monte Carlo search in a way that is fairly specific to game playing - not general reasoning.
Clearly this is another demonstration Deep Learning can be used to accomplish some very hard AI tasks. But I don't think this result merits thinking the current approaches will scale to 'real' AI (though perhaps a simple variation or extension will).
It seems to me that images and sounds are 'alike' in a way that doesn't (on its obvious face) expand to include Atari game strategies and evaluating Go positions. In which case generalizing across the latter gap is more impressive than a single algorithm working well for both images and sounds.
The difference isn't easy to describe, but one such difference would be that a single extra stone can change a Go position value much more than a single pixel changes an image classification.
I think his point is that it's very easy to create a lossless input representation of the Go board, and the ultimate loss function is obvious. We're then left with a large sequential prediction task. Previous learning algorithms were stumped by the non-linearities, but this is exactly the situation where deep learning shines.
The problem changes dramatically when the AI is supposed to take arbitrary input from the world. Then the AI needs to determine what input to collect, and the path length connecting its decisions to its reward grows enormously.
I still agree with your take though: there's an important milestone here.
> The difference isn't easy to describe, but one such difference would be that a single extra stone can change a Go position value much more than a single pixel changes an image classification.
A CNN can still distinguish extremely subtle differences between various animal breeds, exceeding human performance in such tasks. Why was that advance not a warning sign? The rotational-translational invariance prior of the convolutional neural network probably helps because, by default, local changes of the patterns can massively change the output value without the need to train that subtle change for all translations. Also, AlphaGo does a tree search all the way to the game's end, which can probably easily detect such dramatic changes from single extra stones. Reality is likely much too unconstrained to be able to efficiently simulate such things.
> "Human neural intelligence is not that complicated and current algorithms are touching on keystone, foundational aspects of it."
Of the algorithms used in this work, which would be touching on foundational aspects of human intelligence, in your view?
Thinking about the variant of MCTS used in this work, for example, it's not clear to me that tree search, no matter how clever, touches much on human cognition, at least significantly more than deep blue did.
On the other hand, the idea of bootstrapping a network with huge numbers of expert interactions before 'graduating' it to more complex training and architectural enhancement might turn out to be an important part of the Game Over algorithm, as you call it. Even if it doesn't much resemble how human beings learn.
The MCTS part of the algorithm per se doesn't seem super-important AFAIK from my brief read of the paper, except insofar as it shows that the 'neural' part of the algorithm was meshing well with a broadly consequentialist idiom, which is important. Another way of looking at it is that the neural algorithm was able to 'steer' the MCTS part well, and it's the neural steering which is important.
It's quite a low-key affair (no GPU clusters and so on) but what I think is remarkable is that it described a fairly complex cognitive (neural) architecture that was able to mimic certain kinds of child-level cognition with TINY amounts of training data. Instead, human supervisors guided the evolution of its cognitive strategy in a white-box kind of fashion to encourage it to answer questions as children did.
In many respects (like training data volume, layer depth, and so on) it couldn't be further from the current deep learning trends, and yet it seemed much more along the lines of what I imagine an actual AGI would be doing, especially one we hope to control.
Great share, this is quite impressive. The system doesn't require much training input to go from blank slate to conversation-capable (1587 input sentences -> 521 output sentences; see Appendix S1 for examples). This high learning efficiency might imply that it resembles language-processing architecture in humans. At the low level it's neurons, but organised and connected in a specific, planned way.
A key point is that the central executive (the core, which controls the flow of data between slave systems) is a trainable neural network itself, which learns to generate the "mental actions" that control the flow of data between slave systems (like short-term and long-term memory), rather than rely on fixed rules to control the flow. This allows the system to generalise.
(Some of the training itself doesn't seem to be that similar to training a human child, though. It's instead tailored to the system's architecture.)
I'm excited to see how much richer this and similar systems will become, as researchers improve the neural architecture and processing efficiency. Will we see truly human-like language and reasoning systems sooner than expected?
As one who has made arguments that sound similar to, but are importantly different from, the ones you criticize, I will clarify:
1. I don't claim everything always stays on a smooth exponential trend, but that things do so on a large scale as small variations average out. It doesn't surprise me that someone managed to get above-trend performance at Go. However, I predict this will not lead to above-trend GDP.
2. I don't predict we will get human-level AI in 2050. I predict we will probably never reach a tech level that would make human-level AI a possibility. The more people start fretting about the 'AI alignment problem' (which we would in any case no more be able to solve today than Leonardo da Vinci could design a fail-safe nuclear reactor), the lower the probability that we ever reach that tech level. Conversely, though a small thing, this news of continued incremental progress makes me a tiny fraction of a percent more optimistic.
Update: For the record, I've now bet my $667 against a friend's $1000 that AlphaGo does not beat Sedol in March. Mentioned to help distinguish between the concepts of long-term problems and short-term problems.
Maybe because "the AI alignment problem" is a classic academic inside baseball term for what would probably resonate with a larger audience better as "the Skynet apocalypse".
Skynet fights with lumbering robots and doesn't seem all that smart. If you want a picture to symbolize what we're worried about, don't imagine a picture of a Terminator robot with glowing red eyes; imagine a picture of the Milky Way with a 30,000-lightyear-diameter sphere gapped out of it, centered on Earth's former position.
Don't worry, it'll take at least 15k years to blast a hole that big, unless the AI is really smart.
And life seems to be quite resilient and eerily altruistic at times.
We still haven't wiped this planet clear despite having enough nukes to do so. Some people put their military careers and the security of their countries at risk refusing to launch nukes despite snafus higher in the command chain.
And then you have all those westerners whose day jobs became so meaningless, detached from their environment and sometimes outright hostile to other humans that even if they don't commit suicide, they willingly stop breeding and openly talk about replacing themselves with more down-to-earth folks who have their priorities right: food, children, food for children, and maybe then some little AI R&D, although who cares about that if you can have more children.
On the positive side we might end up with something as fundamentally pleasant as the Culture (who are supposed to be over 10,000 years more advanced than us):
I think something like the Culture represents the best case for humanity's long-term future, as this is almost certainly going to include AIs of far greater powers than us bags of meat.
If anyone wants to play a game of go, regardless if you are a seasoned amateur or a complete beginner, please send me a message on http://online-go.com (username mongorians)!
If you have no prior experience I would be happy to teach you the basics -- it is truly an incredible game.
Computers have calculated faster than humans since forever and that doesn't seem to bother the human psyche. If you read the paper you will find very specialized and custom-tuned algorithms carefully devised by humans that take advantage of computers' calculation speed, so it's really the achievement of a different group of humans (computer specialists instead of Go players); why should that bother people?
Go/weiqi is actually a computation-heavy game even for human players. Top players routinely evaluate many moves and count positions. This can be stressful and error prone during time-control periods when they have very little time to make moves. Top players routinely lose games that they are leading under time pressure. This is to the advantage of computers, as they are reliable and actually need to search less during endgames. Unlike chess, Go accumulates pieces on the board as the game goes on. It would be more interesting to have humans play computers with more generous time rules.
Humans still devised and implemented the algorithms those computers are using. I'd count that as a win for the meat side.
Then there are things like http://www.sscaitournament.com/ going on. The bots there are still quite bad compared to competent human players. There is still hope :)
Consider that there are people who are smarter than you. There are people that are better looking than you. There are people who are better at sports than you. For a lot of people in the world, there are people who are better at all of those things than they are. Does it bother them?
Obviously it does for some people, but in the end you are who you are. Whether it is someone else, or even something else, being better is not necessarily as important as you might imagine.
I remember seeing a label on a box of very high quality glassware. It said something like, "The very slight imperfections in this glassware are proof of its hand made nature". We are free to value anything we want -- even something that is objectively inferior.
It doesn't need to be a competition. The better computers get, the more we can use them to do for us. If we can let computers do everything, then that leaves us free to do anything. Besides, maybe they'll get really good at comforting our hurt psyche, like the machine intelligences in Banks' Culture novels.
What any individual, or even large group of human minds, can achieve is a very tiny fraction of what can be achieved given enough time and scale. And if that isn't nihilistic enough, it probably means absolutely nothing given even more time.
So, no, I think that fear is a silly byproduct of our own biases with or without the possibility of superior intelligence.
I'm sure when the AI overlords really take over, they'll give you a little box for you to excel in :)
This comment seems dismissive. I will never be good at chess or go, let alone the best at either. That in itself doesn't bother me. I'm fine with that. But to have your entire species demoted to second place? That has to have a negative psychological effect, and not just on the individuals with raging egos.
i don't think it has to be a raging ego. Everyone's like this to a degree: we're trained to think this way from birth. Why can't one enjoy their virtues without comparing them to others? i study mathematics for fun; i was half as good as the kids who ended up going to grad school in my class; only a few of them will end up professors. The fact that professors of mathematics completely outclass me does not impact my enjoyment of my own mathematical ability at all. In fact, i am thrilled when i get to talk to an expert in the fields i'm interested in.
Being second place in a cooperative system is not bad. People don't mind that. Being second place in a competitive system is bad. People are scared that machine intelligence will be in competition with them; i think it's better to just reframe the relationship.
If the bot is trained through human play and trains against itself, would it not only be as good as the best human? This doesn't seem like computerized intelligence, this seems like advanced human intelligence. If you get the thing to play hundreds of games against 9p dans, it will probably eventually beat them, yes. But the theoretical limit to skill in that game is still probably a big margin above 9p. How will we make it hit that mark?
Wow! As a casual (and very bad) go player, this is astounding!
Winning 5 games in a row at even strength against a 2-dan professional really is huge. This is not something that I was under the impression would be technically achievable for many years.
Does anyone know how strong Fan Hui is? I know Google ranks him as professional 2 dan, but the rank may not mean the same thing as it does in China/Korea. I know he's the European Champion, but I've heard there is a large gap between the Asian professionals and their counterparts elsewhere. The ranking in general is also very skewed because it depends on the number of tournaments people win and which specific tournaments. It's a lot like tennis, where some tournaments have a rating of 1000, while others only have 500.
So the thing is, there really are only Asian pros. (Having said that, here is a list[1] of Western pros). It's only in the last few years that the American Go Association could start certifying[2] Go players as pro. And to the best of my knowledge the European Go Association still does not have this power. Go, amazing game though it is, was relatively unknown outside CJK (China, Japan, Korea) until post-WW2. But globalisation has meant that the rest of the world is finally hearing about this beautiful game. So it migrates to the US via Japan (it's called iGo^ in Japan, which is why it is called Go in the West) and to Europe via Russia and the US. The US is a bit in advance of Europe but not by a huge amount. Arguably Eastern Europe is stronger than Western, as in Chess.
So basically Fan Hui[3] is 2p Chinese rank, which makes him pretty solidly 2p. He's the three-time European champion, which should give you an idea of the strength of players in Europe. A top European player like Alexander Dinerchtein, who is roughly the same strength as Fan Hui, has an official 3p rank, Korean I believe. So Fan Hui is some ways off the top 9p players in CJK, but he was convincingly beaten by AlphaGo, so I'd be hesitant to try to infer its rank from this one performance …
EGF and AGA have only just in the last couple of years started giving professional rankings, so most professionals in Europe and the US still have their professional rankings from China, Japan, or Korea, rather than European and American associations.
He is 2p in China, so his rank is actually Chinese. In Europe (EGF) he plays as 7d. An oddball rank would be 2-3p in China, 3-5 or so in Japan, and maybe 2 in Korea. But I'm just guessing; I'm too far removed to properly compare, and the only gauge is when he plays other "non-EU pros" in the EGC.
Forecasters will get scored, so you can see who's right and who's not. Hopefully we'll get some good technical debate on there too (disclosure: I'm affiliated with the GJ Open site).
For people interested in the use of neural nets to evaluate board positions, it might be worthwhile reading up on TD-Gammon, which this work builds off of.
If this trounces the leading human this spring, I'll be really excited if the AI experts start working on physical games. The RoboCup (robot soccer) hasn't had nearly the improvements that more logical games have (e.g. chess and Go). I hope that combining game playing with physical movement, vision, and communication is the next frontier.
Very impressive work. When the neural nets for go paper came out a year ago I played around and implemented it (but never got good results), and was very inspired by using neural networks for Go. As an amateur fan of the game, I was particularly intrigued by the neural network's more human-like understanding of the game (if you look at earlier layers, you see patterns that make sense to a Go player).
That said, Fan Hui, who Google just beat, is a 2p player, which while incredibly strong is still far away from Lee Sedol or Gu Li. But given the pace of NN progress in recent years, and the incredible resources Google has been devoting to it, I wouldn't give them more than a decade or two.
So the team beat Fan Hui, a 2 dan professional player. To put that in perspective, the highest ranked professionals are 9 dan, corresponding to a score many hundreds of points higher in ELO. Still a long way to go to topple the world champion!
I'm not sure if that's always the best way of looking at it since I think most professional dan titles are accomplishment based instead of a measurement of comparative skill.
That said, Europe is not the top of the world in Go playing, so while it's notable that Fan Hui represents professional play above beating 9ds on KGS, it may still be a long slog before it can approach Chinese, Korean, or Japanese professionals.
I don't see how it should affect the perception of his skill. Presumably AlphaGo could beat any player ranked at his skill level or lower, in which case AlphaGo beating him says no more about his skill level than his ranking does.
If Google really wants the March matchup with Lee Sedol to be fair, it should make available a number of AlphaGo's training games for Sedol to evaluate. Otherwise, we have a similar situation to Deep Blue vs Kasparov, where Deep Blue had access to hundreds of Kasparov's games but Kasparov did not have access to Deep Blue's past games to evaluate. This is a major disadvantage for the human player.
This is as good a place as any to ask this. Do Hacker News readers think that computers will eventually have a "mind" and "consciousness"? Or will it just be simulated?
Will what be simulated, is the question. When we speak about consciousness, we are referring to something internal which we intuitively know about ourselves, but cannot directly observe in others. When we have a good artificial consciousness, the external behavior will be convincing. Then we can ask: is the visible behavior just some simulation (like a mindless script being followed) or is there an internal reality which is the same as our consciousness.
Answers to the question will probably come from the convergence of two fronts: understanding better what the brain does, and correlating that with an understanding of what is going on in the machine. That is, maybe it will be shown that the kinds of states and state changes in the two are very similar (or can be mapped to each other in some way). In other words, we externalize the structures and states as much as possible and identify an equivalence. Then we can proclaim that what is going on in the machine isn't just some elaborate script; the states are actually human like: the external behavior is underpinned by apparently the same stuff that makes people tick.
We will also simply be convinced on an emotional level, due to the machines leading complex lives, demonstrating traits such as shame, regret and self-loathing, as well as joy that looks genuine. There will be depressed AI's that require therapy (possibly from humans), and ones that choose to terminate themselves. People who are not computer scientists or philosophers will be convinced that the machines are conscious, by the ways in which their lives intertwine with those of the machines and the relationships they form; only a dwindling group of skeptics will remain, even long after there is no space on the field where the goal-posts can be moved any farther.
It's at least plausible that we will one day expose external behavior that we're unable to distinguish from humans'. So you're right, it all comes down to what's going on in the machine.
What's not clear to me is whether, in order for us to ascribe to AIs the moral value of consciousness, there has to be an isomorphism between the digital computer's computations that generate their external human-like behavior and our brain's biological computations that generate our external human-like behavior.
For one, even if there is such a mapping, there may be difficulty in expressing it in ways our minds can understand. (Side note: would we have to consult with the seemingly-sentient, newly created AIs to act as proof assistants to decide whether we humans should ascribe moral value to them?) So the inability of humans to confirm the existence of such a mapping isn't really that indicative of anything.
For two, more fundamentally, suppose that parts of the AI consciousness algorithm do map to parts of the human consciousness algorithm, but other parts don't. How do we figure out which are "key" to the One True Consciousness? AFAICT even in theory the only way to figure that out would be for an individual human to sign up for exotic (and impossible) surgeries to radically modify how their brain works to emulate the parts of digital computations that don't map to biological ones.
Honestly, it seems almost bigoted to me to say there's only one way for consciousness to exist, and that's our own. Biological consciousness supremacy.
If this is indeed the way it does play out, and what you have said all sounds very plausible, it's going to raise all sorts of difficult questions for future generations.
Who will decide how many of these AI's can be created? Will AI's ultimately be able to create their own offspring? We already have limited resources, this will only increase the demand :-) I see trouble ahead... as well huge opportunities for advancing our understanding of the Universe.
Many of these questions have been explored in AI fiction for many decades. Even right down to the titles, like the book "Do Androids Dream of Electric Sheep?", which became "Blade Runner" for the movie version. Or "I, Robot" by Isaac Asimov.
The Chinese room thought experiment is simply annoying because it doesn't even get the questions right. It basically asks if the hardware becomes conscious/understanding when running software - instead of asking the real question - whether the software can gain the understanding/consciousness. The man in the room is just a hardware component replacing a computer processor. But the only place to look for intelligence in this experiment should be in the rule system, not in the processor executing those rules, as those rule systems decide if the communication makes sense or not. To make a real conversation the rule system needs flexibility to handle dynamic input, thereby creating a dynamic flow between the inside and outside of the room, but that flow is about symbol manipulation, not about the symbol manipulator. I think no one in AI ever even argued that the hardware part of a computer would gain understanding.
The only good part about this thought experiment is that it makes the hard question of consciousness - how can it arise from physical phenomena - somewhat more obvious. But it's still the same question as for normal human minds and neither an argument for software systems becoming conscious nor against it.
Of course, another question is: will the AI's consider us humans to have the equivalent of their consciousness? Or will they think "These humans don't really possess consciousness, they just exhibit a crude biological simulation of it".
(BTW: This is not just meant as a joke, nor just as instigating fear of the singularity - turning a question on its head can sometimes lead to interesting insights).
So you don't believe there is any distinction between being "alive" and running software? Providing of course the software is complex enough to display human traits such as thought, emotions, creativity and so forth?
Just to be clear, I hold no solid viewpoint on this, I'm just interested as to what other folk here think.
"alive", life has deferent meaning (and defined requirements e.g self-reproduction) than "consciousness", "sentience" or "mind".
Unless you believe in a mystical soul component (I do not), there is nothing else "there" but machine. A very complex, biological machine. But there is no reason to believe that sufficiently powerful software/hardware cannot replicate the functionality of that machine exactly. Then what possible difference could there be between the two duplicate machines?
But we will have systems that exhibit many aspects of "thought" and sentience long before we are able to code up Homo sapiens sapiens.
I'm not sure it's just about computational power. Currently we mainly deal with digital computers that process information in a very discrete way. This is a nice article with regards to the human brain being analogue or digital-
I think making that distinction would be similar to saying that a computer is not running software because it was built inside minecraft.
Although, it's possible that creating something that does what humans do requires the same materials humans are made of (but seems unlikely, or at least I think other materials will allow a close approximation).
(a) it doesn't matter because these are just names for things that we can't even define succinctly for humans and (b) they're the same thing (simulated or not simulated, what's the difference??)
I've got a very clear model for what consciousness is.
Let's start by looking at how brains work for something that is much easier to reason about - vision. Light hits the eye, and it activates a set of neurons that are wired up to the light receptors. There's further out neurons that are set up to catch certain higher-level patterns, like edge detection or objects headed directly toward the eye. Deeper and deeper these connections go, until you get something that you subjectively experience as "seeing". It's at this point that optical illusions work - flat images on paper that are designed to make the higher level visual pattern recognizers fire. At a very high level, how parts of the brain work is by activating based off the sensory data represented by the activation of other parts of the brain.
Just like detecting edges in your visual field was important in the ancestral environment, so was answering questions like "what did you do on the hunt?" with a good story. Just like there's a portion of the brain that constructs edges from lower-level visual sensation, there's a portion of the brain that constructs narratives from lots of different kinds of sensations. Furthermore, this narrative module has access to remembered narratives (since consistent stories are better ones) as another form of sense data.
That, in short, is what consciousness is - the brain making sensible stories out of the sense data it encounters, and making those stories available to the brain (including the story-making module, which is why you can make stories about the qualitative experience of story-making).
Although this might be a sensible way of looking at how the brain processes information, it doesn't address the hard problem of why there is a subjective experience.
Err, it does if you mean what I think you mean by "subjective experience". There's subjective experience because the brain constructs a narrative about the sensory data it is getting (and that narrative is also part of the information it is processing). This is what subjective experience is. Why it exists is because people needed to tell stories in the ancestral environment.
It doesn't explain why that particular representation is privileged as being experienced, as opposed to all of the other representations in the brain.
Simply declaring 'that's what subjective experience is' is not an explanation.
Another way of putting this is that you haven't explained why it's necessary that an autobiographical message passing system needs to experience itself, and what the threshold is for qualifying as such a system.
It seems obvious to me - it's privileged because it's what's used to tell stories about your sensory data. It's very easy to identify with that part of yourself, but that's not the only way to go about things. One of the more common other ways is usually called "flow", where the sports-playing / problem-solving part of the mind is basically the entire thing that you're experiencing.
In other words, that representation is privileged because the environment often demands story-telling from you. In situations where it doesn't, you get very different experiences (like flow states, or long solo wilderness hikes).
So you seem to have just contradicted yourself. You said before that the storytelling experience is consciousness. Now you are saying there are other kinds of subjective experience that are also consciousness. Why are we conscious of them?
Eventually I'm pretty sure this will be an important question, hundreds or thousands of years into the future. It will matter one day. Will these artificial life forms have the same rights as humans? If there is no difference between simulated and human consciousness, then the answer is yes, they should have the same rights.
If you're going to apply a word to a place where it previously hasn't been meaningful to apply the word, it's worth clarifying what meaning that word now has.
For instance, is Android a Linux distro? Depends why you're using the phrase "Linux distro". For some applications of that phrase, yes; for some applications no. Saying that it is, or that it isn't, doesn't tell you anything more, and doesn't help you answer those questions; you still have to answer them, so you might as well answer them directly. For instance, if you're packaging some software for a bunch of Linux distros, you probably don't want to count Android (or at least, handle it very differently from Debian or Fedora). If you're tracking fixes to kernel CVEs, you probably do want to count Android.
Or, more mathematically, is 0^0 = 0 or 1? It depends on how you ended up with 0^0 in the first place: if through 0^y as y approaches 0, then 0, but if through x^0 as x approaches 0, then 1. Saying "It's 1" or "It's undefined" is satisfying to the part of the human brain that likes clean answers, but it doesn't inform us.
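To make the two readings concrete, here's a quick check with sympy (assuming it's available); the only point is that the answer depends on which expression you take the limit of:

    import sympy as sp

    x, y = sp.symbols('x y', positive=True)

    # approaching 0^0 along 0^y: the limit is 0
    print(sp.limit(0**y, y, 0, dir='+'))   # -> 0
    # approaching 0^0 along x^0: the limit is 1
    print(sp.limit(x**0, x, 0, dir='+'))   # -> 1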
Similarly, let's not ask "what does 'consciousness' really mean", but "where do we use the term 'consciousness'." Should AIs be able to drive a car unsupervised? Should a sufficiently advanced AI have civil rights, such as the right to life? Should an AI be entrusted with political office? Or with the vote, and if so, what representation is fair? Can I give an AI ownership of a company? For the religious among us, do AIs have the ability to have human-like morality? Those are all questions worth answering, but saying "Yes, this computer has a conscious mind" or "No, it doesn't" is not really going to help those questions get answered.
The Chinese room experiment is sort of not useful in that, while "the ability to learn Chinese" is a thing we expect of consciousnesses, it's not really generalizable. Nothing else in our society really depends on people's ability to learn a language; those few things that do, generally don't require the learner to be a person (e.g., you want something translated and you give it to a company).
Every time there's a "breakthrough" in AI, this question crops up. Consciousness is poorly understood and there's no good reason to think it's a product of intelligence.
In this case, all that's happened is that a game played by people can now be played well by a computer program which is designed specifically to recognize patterns. (A Go position is just a 19x19 array of trits.) Computer programs reached human-level performance decades ago at chequers/draughts, and later at chess and backgammon. Those programs did not lead to a general AI.
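For the curious, that "19x19 array of trits" really is all there is to a position; a minimal sketch in numpy (the two example stones are purely illustrative):

    import numpy as np

    EMPTY, BLACK, WHITE = 0, 1, 2              # the three possible values ("trits")
    board = np.zeros((19, 19), dtype=np.int8)  # empty 19x19 board
    board[3, 3] = BLACK                        # a couple of hypothetical stones
    board[15, 15] = WHITE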
Yes, it will be possible unless you specifically construct your definition of consciousness to make it impossible. However, having a mind is unnecessary (and inefficient) for the most intelligent possible machines. Imagine extremely advanced future branches of statistics, which have yet-to-be-invented but still entirely statistical methods for discovering state spaces and causal rules, all with formal proofs of optimality. Imagine that, even if executing these methods is impossible in physical hardware, future computer science is capable of proving optimal sampling methods for a given problem and level of available computing resources. A machine that executes these algorithms is smarter than us, but does not have a mind.
Will we construct machines with minds despite that? Since "to see if we can" is usually sufficient motivation for humans, I expect yes.
Biologically-inspired computing techniques will probably lead to computers with minds, and I can see these computers eventually becoming smarter than humans. Will we make such machines before we invent enough of future-statistics to make a mindless machine that's smarter than us? I don't know, that's an interesting question.
It will of course be simulated. Without getting too metaphysical, the question of whether there is anything beyond computation is at this point largely spiritual. There are also reasonable arguments that all of reality as we know it is one large simulation. http://www.simulation-argument.com/simulation.html
Do hackernews readers think that computers will eventually have a "mind" and "consciousness" ? Or will it just be simulated?
Personally? Yeah, I do think computers will eventually achieve "consciousness". I don't think of consciousness as being anything magical or terribly special to the point that a computer couldn't have it. I generally think it's just a question of a sort of recursion where the brain is "hearing" its own internal processing as it works. When we ask questions or run simulations (day-dream) and we "see" and "hear" those things, I suspect it's just some sort of feedback loop between internal modules of the brain.
I'd go so far as to suggest that some computer programs may already be conscious, and we just don't have any good way to measure/test that fact.
Indeed, it's perhaps one of the most fundamental questions in life, at least in my view. I only wish I could somehow peer into the distant future to see how things pan out.
It's likely that eventually the boundaries between humans and AI will blur. I guess at some point there will be cognitive augmentation, but imagine the debate that will generate. Look how we view physical enhancement by drugs today in sport.
I'm wondering if strong players will learn to beat AlphaGo by playing unconventional openings, so as to confuse the neural net, and get the upper hand before the MCTS can take over in the middle game.
This could be good for the game of Go in general as top players are forced to find new ways to play.
Just speculating as I am not a particularly strong player myself.
But, but, but, ..., we should ask where did the AI software come from? Three guesses, the first two don't count, and the third is, may I have the envelope please,
[drum roll],
and the winner is, humans! So, it's not that AI beat a human. Instead, a human with a machine beat a human without one!
I worked on a Go player using neural networks in college 15 years ago. Very interesting project and fun. Mine wasn't very good because we didn't have a lot of training. But it would make legal moves, and it was exciting to see when it made a "good" move (i.e. a capture or block).
What makes humans so much more intelligent that they can learn to be this good at the game while playing only up to 1000 games per year, and be just as good as something that can train on a million games per day?
I want computers that can learn on small data instead of big data, like humans do.
I'd be really interested to see DeepMind tackle a strategy game with a more fluid state space like Starcraft. It may be much more difficult to plot out decisions when they're not bucketed into discrete turns, and you can potentially move everything at once.
The APM advantage mostly means that bots can have perfect micro-management of units (within reason; a bunch of goliaths or dragoons in a chokepoint simply cannot really be managed ;)). This translates to slightly higher income in the beginning [1] (and thus certain strategies can be a bit faster). For ground armies in open fields this can mean a large advantage, although humans are quite good at this as well.
APM doesn't really help in creatively coming up with a good strategy to counter a certain build. Just today I watched a game between two Protoss bots, one of which went carriers, the other zealot/reaver. That's ... not really an optimal decision.
Tactical manoeuvres like flanking or attacking at different places at the same time are also mostly absent for now (although APM can help there, of course).
It should be possible to cap the APM of an AI if there is a desire to make it play more like a human, but I would find an uncapped version interesting to see what tactics work well when control is not an issue.
Now the question is, how applicable are the approaches developed for AlphaGo to other tree search problems? Beating a professional Go player with fewer evaluations than it took Deep Blue to beat Kasparov is certainly very exciting.
DeepMind does just about everything in Torch, which is Lua. I'm not sure about Torch's distributed support, though, so the use of 50 GPUs and DistBelief may mean another language?
I can only wish the source would be released so that all the researchers working on AI could also improve their algorithms. I understand that Google wants to keep its edge, but it also slows down the whole field.
Accusing Google of slowing down the field is laughable. Google has opened up a ton of AI stuff, in every category imaginable, from Inception (pre-trained model), Street View House Numbers (giant dataset), gemmlowp (infrastructure for fast matrix multiplies), deep dream (implementation of a specific deep learning technique), all the way to TensorFlow (complete production-ready neural net framework with examples and a free graduate-level university course to go with it). And of course they're publishing open access papers for everything in addition to the code releases. Furthermore, they're also employing core contributors who continue to maintain important libraries like Ceres Solver, Eigen, and Keras, just to name a few.
Google has even open sourced DeepMind's previous most impressive achievement, the Atari player. Based on that track record it's probably only a matter of time before the Go player is released. I'd expect it not long after the upcoming match, if they win.
My summary (may be wrong): they create a convolutional neural network with 13 layers to select moves (given a game position, it outputs a probability distribution over all legal moves, trying to assign higher probabilities to better moves). They train the network on databases of expert matches, save a copy of the trained network as 'SL', then train it further by playing it against randomly selected previous iterations of itself. Then they use the history of the move-selecting network playing against itself to generate a new training set consisting of 30 million game positions and the outcome of that game, with each of the 30 million positions coming from a separate game. They use this training set to train a new convolutional neural network (with 13 layers again, I think) to appraise the value of a board position (given a board position, it outputs a single scalar that attempts to predict the game outcome from that board position).
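For readers who think in code, a rough sketch of what those two networks look like. This is PyTorch rather than the Torch/Lua DeepMind uses, and the input planes, filter counts and head sizes here are illustrative guesses, not the paper's exact architecture or feature set:

    import torch
    import torch.nn as nn

    # Illustrative sizes only; not the actual AlphaGo architecture.
    class PolicyNet(nn.Module):
        def __init__(self, in_planes=48, filters=192, n_layers=13):
            super().__init__()
            layers = [nn.Conv2d(in_planes, filters, 5, padding=2), nn.ReLU()]
            for _ in range(n_layers - 2):
                layers += [nn.Conv2d(filters, filters, 3, padding=1), nn.ReLU()]
            layers += [nn.Conv2d(filters, 1, 1)]        # one logit per board point
            self.body = nn.Sequential(*layers)

        def forward(self, x):                           # x: (batch, in_planes, 19, 19)
            logits = self.body(x).flatten(1)            # (batch, 361)
            return torch.softmax(logits, dim=1)         # move distribution (legality masking omitted)

    class ValueNet(nn.Module):
        def __init__(self, in_planes=48, filters=192):
            super().__init__()
            self.body = nn.Sequential(nn.Conv2d(in_planes, filters, 5, padding=2), nn.ReLU(),
                                      nn.Conv2d(filters, 1, 1), nn.ReLU())
            self.head = nn.Sequential(nn.Linear(19 * 19, 256), nn.ReLU(),
                                      nn.Linear(256, 1), nn.Tanh())  # scalar outcome prediction

        def forward(self, x):
            return self.head(self.body(x).flatten(1))

The SL copy would be trained with cross-entropy against expert moves and the value network with a regression loss against game outcomes, per the summary above.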
They also train ANOTHER move-predicting classifier called the 'fast rollout' policy; the reason for another one is that the fast rollout policy is supposed to be very fast to run, unlike the neural nets. The fast rollout policy is a linear softmax of small pattern features (move matches one or more response features, move saves stone(s) from capture, move is 8-connected to previous move, move matches nakade patterns at captured stone, move matches 12-point diamond pattern near previous move, move matches 3x3 pattern around candidate move). When a feature is "move matches some pattern", I don't understand if they mean that "matches any pattern" is the feature, or if each possible pattern is its own feature; I suspect the latter, even though that's a zillion features to compute. The feature weights of the fast rollout classifier are trained on a database of expert games.
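A linear softmax over sparse binary pattern features costs almost nothing per move, which is the whole point. A minimal sketch of the idea (the feature extraction is stubbed out and all names are placeholders, not anything from the paper):

    import numpy as np

    def rollout_move(candidate_moves, feature_indices_per_move, weights, rng):
        # Each move's score is the dot product of its sparse binary feature
        # vector (the indices of the patterns that fired) with the learned weights.
        scores = np.array([weights[idx].sum() for idx in feature_indices_per_move])
        probs = np.exp(scores - scores.max())      # softmax over the candidate moves
        probs /= probs.sum()
        return candidate_moves[rng.choice(len(candidate_moves), p=probs)]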
Now they use three of those classifiers: the 'SL' neural network (the saved network that tried to learn which move an expert would have made, before further training against itself), the board-position-value-predicting network, and the 'fast rollout' policy.
The next part, the Monte Carlo Tree Search combined with the neural networks, is kinda complicated and I don't fully understand it, so the following is likely to be wrong. The idea of Monte Carlo Tree Search is to estimate the value of a board position by simulating all or part of a game in which both players in the simulation are running as their policy a classifier without lookahead (e.g. within the rollout simulation, neither player does any lookahead at each step); this simulation is (eventually) done many times and the results are averaged together. Each time the Monte Carlo simulation is done, the policy is updated.
In order to take one turn in the real game, the program does zillions of iterations; in each iteration, it simulates a game-within-a-game:
It simulates a game where the players use the current policy, which is represented as a tree of game states whose root is the current actual game state, whose edges are potential moves, and whose nodes or edges are labeled with the current policy's estimated values for game states (plus a factor encouraging exploration of unexplored or underexplored board states).
When the simulation has visited the parent of a 'leaf node' (a game state which has not yet been analyzed but which is a child of a node which is not a leaf node) more than some threshold number of times, the leaf node is added to a queue for an asynchronous process to 'expand the leaf node' (analyze it) (the visit-count-before-expansion threshold is adaptively adjusted to keep the queue short). This process estimates the value of the leaf node via a linear combination of (a) the board-position-value-predicting network's output and (b) the outcome of running a simulation of the rest of the game (a game within a game within a game) with both players using the 'fast rollout' policy. Then, the SL neural network is used to give initial estimates of the value of each move from that board position (because you only have to run SL once to get an estimate for all possible moves from that board position, whereas it would take a long time to recurse into each of the many possible successor board positions and run the board-position-value-predicting network for each of these).
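In code, that leaf evaluation is just a weighted blend of the two estimates; the 0.5 mixing weight below is what I believe the paper uses, so treat it as an assumption:

    def evaluate_leaf(value_net_estimate, rollout_outcome, mix=0.5):
        # (1 - mix) * value-network prediction + mix * fast-rollout playout result
        return (1 - mix) * value_net_estimate + mix * rollout_outcome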
Because the expansion of a leaf node (including running SL) is asynchronous, in the meantime the node is 'expanded' and a 'tree policy' is used to give a quick estimate of the value of each possible move from the leaf node's board state. The tree policy is like the fast rollout policy but with a few more features (move allows stones to be captured, Manhattan distance to the two previous moves, move matches 12-point diamond pattern centered around candidate move).
At the end of each iteration, the action values of all (non-leaf) nodes visited are updated, and a 'visit count' for each of these nodes is updated.
At the end of all of these iterations, the program actually plays the move that had the maximum visit count in the Monte Carlo tree search ("this is less sensitive to outliers than maximizing action-value").
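Putting the preceding few paragraphs together, here's a heavily simplified sketch of that loop (selection with policy-network priors plus an exploration bonus, expansion, leaf evaluation, backup, then playing the most-visited move). It ignores the asynchronous expansion queue, the tree policy, and the sign-flipping needed for alternating players, and the callbacks (policy_fn, evaluate_fn, play_fn) are placeholders rather than anything from the paper:

    import math

    class Node:
        def __init__(self, prior):
            self.prior = prior          # P(s, a): prior from the SL policy network
            self.visits = 0             # N(s, a)
            self.value_sum = 0.0        # W(s, a)
            self.children = {}          # move -> Node

        def q(self):                    # Q(s, a): mean action value
            return self.value_sum / self.visits if self.visits else 0.0

    def select_child(node, c_puct=5.0):
        # Pick the child maximising Q plus a bonus favouring high-prior,
        # rarely visited moves.
        total_visits = sum(c.visits for c in node.children.values())
        def score(item):
            _, child = item
            u = c_puct * child.prior * math.sqrt(total_visits) / (1 + child.visits)
            return child.q() + u
        return max(node.children.items(), key=score)

    def search(root_state, root, iterations, policy_fn, evaluate_fn, play_fn):
        for _ in range(iterations):
            node, state, path = root, root_state, [root]
            while node.children:                           # 1. selection
                move, node = select_child(node)
                state = play_fn(state, move)
                path.append(node)
            for move, p in policy_fn(state).items():       # 2. expansion with priors
                node.children[move] = Node(p)
            value = evaluate_fn(state)                     # 3. leaf evaluation (e.g. evaluate_leaf above)
            for n in path:                                 # 4. backup
                n.visits += 1
                n.value_sum += value
        # Play the most-visited move, as described in the parent comment.
        return max(root.children.items(), key=lambda kv: kv[1].visits)[0]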
Some more details:
During Monte Carlo tree search, they also use a heuristic called 'last good reply', which is somewhat similar to caching.
The move-predicting networks are for the most part just fed the board state as input, but they also get a computed feature: "the outcome of a ladder search".
Because Go is symmetric with respect to rotations of the board, the move-predicting networks are wrapped by a procedure that either randomly selects a rotation or runs them for all rotations and averages the results (depending on whether or not the network is being used for Monte Carlo tree search).
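A sketch of what that wrapper might look like in the averaging case (using only the four rotations mentioned here; the real implementation may also handle reflections):

    import numpy as np

    def averaged_policy(board, policy_fn):
        # Run the move-prediction net on each rotation of the board, rotate
        # each output back to the original orientation, and average.
        probs = np.zeros((19, 19))
        for k in range(4):
            out = policy_fn(np.rot90(board, k))   # (19, 19) move probabilities
            probs += np.rot90(out, -k)            # undo the rotation before averaging
        return probs / 4.0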
The last human surveys the burnt husk of his dying planet, suppressing pangs of awe as the gleaming computronium ships - no more puzzles left to solve - turn and flicker out of space-time in search of new challenges elsewhere.
"Clever, but nothing to do with general intelligence," he mumbles under his breath as the flames consume him.
Brute force statistical inference over millions of examples is an advance, but it is an advance that falls within the same paradigm of machine learning that we've had for decades. IMO, it will really take a paradigm shift to move into general intelligence.
Obviously accomplishments like this are interesting for their own sake. But the really interesting question for me is, is there anything here, that is new, that can be used outside of tree-based games of complete information? (A much weaker question, is there anything new here that can be used in other games?)
Let's take the ability to match the best humans at various tasks. If we put doing long multiplication better than the best human at "1", and passing the Turing Test at "10", where is a Go playing engine? I'd say less than "2", but still higher than chess.
Much higher (but still short of general intelligence, obviously) would be beating the best humans at language translation, proving math theorems, or answering factual questions phrased in natural language. Then, if and when computers can write novels which you'd prefer to read over the best human author, now we're talking.
So I think, we don't need to worry about our humanity here yet, and we're not remotely close. Over less than a century we've gone from outperforming humans in arithmetic to outperforming us in things that are a little less like arithmetic but still discrete games modeled by trees and precise searches.
> Much higher (but still short of general intelligence, obviously) would be beating the best humans at language translation, proving math theorems, or answering factual questions phrased in natural language. Then, if and when computers can write novels which you'd prefer to read over the best human author, now we're talking.
There's a lot of similarity. Take math theorem proving, which is an active area of machine learning; the algorithms at the moment tend to look something like this: create a tree of possible theorems, then traverse down the tree towards the target theorem by evaluating each branch by the probability that it leads towards the target. Because of the exponential explosion, it's very important to select good theorems to expand at each level of the tree, and the ML algorithms can learn characteristics of good intermediate theorems by training on the existing corpora of machine-checkable proofs, which gives them much higher success rates in reaching target theorems. So right now the best theorem provers can already reprove a lot of existing proofs. But they were using stuff like SVMs last I saw, and were much less sophisticated than a deep Q-learner.
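A minimal sketch of that style of guided search, best-first expansion ordered by a learned score (all the callbacks here, score_fn, expand_fn and is_goal, are placeholders, not any particular prover's API):

    import heapq
    import itertools

    def guided_search(axioms, score_fn, expand_fn, is_goal, budget=10000):
        # Best-first search: always expand the statement the learned model
        # thinks is most likely to lead towards the target theorem.
        counter = itertools.count()   # tie-breaker so the heap never compares statements
        frontier = [(-score_fn(t), next(counter), t) for t in axioms]
        heapq.heapify(frontier)
        seen = set(axioms)
        while frontier and budget > 0:
            budget -= 1
            _, _, statement = heapq.heappop(frontier)
            if is_goal(statement):
                return statement
            for derived in expand_fn(statement):   # apply inference rules
                if derived not in seen:
                    seen.add(derived)
                    heapq.heappush(frontier, (-score_fn(derived), next(counter), derived))
        return None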
Okay, I misread that piece; what I read was that the software is ranked as amateur dan.
My point stands: if your title says "beats champion" and you don't qualify "champion", that's major link bait. The European champion is only 2p, whereas the best players are all ranked 9p.
Video from Nature: https://www.youtube.com/watch?v=g-dKXOlsf98&feature=youtu.be
Video from us at DeepMind: https://www.youtube.com/watch?v=SUbqykXVx0A
If you want to view the sgfs in a browser, they are in my blog: http://www.furidamu.org/blog/2016/01/26/mastering-the-game-o...