Video from Nature: https://www.youtube.com/watch?v=g-dKXOlsf98&feature=youtu.be
Video from us at DeepMind: https://www.youtube.com/watch?v=SUbqykXVx0A
edit: For those saying it's still a long way to beat the strongest player - we are playing Lee Sedol, probably the strongest Go player, in March: http://deepmind.com/alpha-go.html.
That site also has a link to the paper, scroll down to "Read about AlphaGo here".
If you want to view the sgfs in a browser, they are in my blog: http://www.furidamu.org/blog/2016/01/26/mastering-the-game-o...
There are two interesting areas in computer game playing that I have not seem much research on. I'm curious if your group or anyone you know have looked into either of these.
1. How to play well at a level below full strength. In chess, for instance, it is no fun for most humans to play Stockfish or Komodo (the two strongest chess programs), because those programs will completely stomp them. It is most fun for a human to play someone around their own level. Most chess programs intended for playing (as opposed to just for analysis) let you set what level you want to play, but the results feel unnatural.
What I mean by unnatural is that when a human who is around, say, a 1800 USCF rating asks a program to play at such a level, what typically happens is that the program plays most moves as if it were a super-GM, with a few terrible moves tossed in. A real 1800 will be more steady. He won't make any super-GM moves, but also won't make a lot of terrible moves.
2. How to explain to humans why a move is good. When I have Stockfish analyze a chess game, it can tell me that one move is better than another, but I often cannot figure out why. For instance, suppose it says that I should trade a knight for an enemy bishop. A human GM who tells me that is the right move would be able to tell me that it is because that leaves me with the two bishops, and tell me that because of the pawn structure we have in this specific game they will be very strong, and knights not so much. The GM could put it all in terms of various strategic considerations and goals, and from him I could learn things that let me figure out in the future such moves on my own.
All Stockfish tells me is that it looked freaking far into the future and it makes any other move the opponent can force it into lines that won't come out as well. That gives me no insight into why that is better. With a lot of experimenting, trying out alternative lines and ideas against Stockfish, you can sometimes tease out what the strategic considerations and features of the position are that make it so that is the right move.
It doesn't seem too difficult to have an AI let you know which other strong moves it rejected, and to dig into its forecasts for how those moves play out compared to the chosen move to tell you why it makes particular scenarios more likely. But that would just be postrationalization too...
Think of it like describing a tiger. I have no idea how to describe complicated edge detection algorithms and exact shapes. But I can say something like "an animal with orange stripes", and that would be useful information to another human.
Likewise a grandmaster can explain that pawns are valuable in situations like this, or that point to a certain position and say don't do that, etc. To a computer that information is useless, but to a human that's extremely useful. We already have some pattern recognition and intelligence, we just need some pointers. Not an exact description of the algorithm.
I think that the only reason this is true is because, humans, shared the same low (or intermediate in this case) level features of their models of the world and a common language to share them.
Artificial neural networks understanding have evolved along other path, and they probable have another different organization between the different levels, but that doesn't make the fundamental mechanism different.
The reasons for why the player may think the move is objectively good can vary, but they are almost always linked to rational considerations. E.g. that the move increases their piece count, their control of center squares, their attacking opportunities, or is tactically necessary due to the position.
My point being that when grandmaster players play chess, they are just as interested in finding objectively right moves as a computer is. Unless it's speed chess it's rarely a "I suspect this might be good" sort of thing.
(That said, many grandmasters do avoid lines of play they consider too dangerous. World Champion Magnus Carlsen's "nettlesomeness" - his willingness to force games into difficult positions - has been one explanation for why he beats other Grandmasters.)
I would argue you see far more pattern recognition at play in chess than you do of heuristics. Heuristics is more common at lower levels of play.
When Grandmaster's rely on pattern recognition, they are using their vast repertoire of remembered positions as a way to identify opportunities of play. It's not that they think the move looks right, it's that they played a lot of tactical puzzles, and because of this pattern recognition, they are now capable of identifying decisive attacks that can then be objectively calculated within the brain to be seen as leading to checkmate or a piece advantage.
They don't make the move because of the pattern or heuristic. They make the move because the pattern allowed them to see the objective advantage in making that move.
As for your point about a move being objectively good: Unless you completely solve the game of chess, there will never be always one move in every situation that's objectively the best. In many games (and you will see this in computer analysis), 2 or 3 moves will hold high promise, while others will hold less. From an objective standpoint all these three moves could be objectively better than all others, but it could be hard to justify that one is necessarily better than another.
The reason for this is partly because between two objectively 'equal' moves, there may be a rational reason for me to justify one over the other based on personal considerations (e.g. because I am familiar with the opening, because I played and analyzed many games similar to this line, because I can play this end game well, etc.) Decisions based on those considerations are not what I would call heuristics, because they are based on objective reasons even if heuristics may have contributed to their formation within the mind.
This is quite wrong. They use a score that material is only one (although a major) factor of.
"because barring an amazing tactical combination (which can usually be computationally eliminated past 5 moves) or a crushing positional advantage, a loss of pieces will mean the victory of the person with more pieces."
Again, this simply isn't true. For one thing, talk of "piece counts" and even "increasing piece counts", rather than material, is very odd coming from a serious chessplayer. Aside from that, time, space, piece mobility and coordination, king safety, pawn structure, including passed pawns, how far pawns are advanced, and numerous other factors play a role. All of these can provide counterplay against a material advantage ... it need not be "crushing", merely adequate. And tactical combinations need not be "amazing", merely adequate. And whether these factors are adequate requires more than 5 moves of lookahead because chess playing programs are only able to do static analysis and have no "grasp" of positions. All of which adds up to the need for move tree scores to be made up of far more than "piece counts".
I perhaps resorted to hyperbole in my original description for the sack of emphasis. You are correct that at higher levels of play, positional considerations matter far more than material considerations. The advantage does not need to be amazing, but adequate. However, as material begins to accumulate the advantage one must have in position in order to justify the loss will increasingly require a position that moves into the realm of "amazing" and "crushing".
You are right that objectively calculating the positional strength of a position is very difficult to do without immense brute forcing, and likely needs more than 5 moves ahead of insight. When I said that I was really referring quite strictly to tactical combinations where the vast majority of tactical mistakes can be caught quickly.
If we could solve chess, this most likely would be true, just as it's true for tic-tac-toe, which anyone can solve in mere minutes once they realize that symmetry allows for only 3 distinct opening moves (corner, middle and edge) and games should always end in a draw unless someone makes a silly mistake.
Granted, there are lots of paths to draw that one might take, but the objectively strongest move is to take a corner, then the opposite corner, which requires the opponent to either try to force a draw or lose, whereas it's not hard to use weak moves to hand either player a victory, even though the game of tic-tac-toe can always be forced into a draw with skilled play.
This is the essence of chess openings. A choreographed set of moves that is "good for both parts"
And of course there's a fundamental imbalance between white and black, so you can't have both sides making the same move
But from a point on, there's no fundamental answer on what a good move is, barring analyzing the tree to a great depth.
If I had to speculate at a high level why the Sicilian opening is so popular for black in professional play, it would be ultimately because the Sicilian allows black to obfuscate white's board symmetry, which creates opportunity for counterplay against white's fundamental advantage of having the initiative.
I will say though that as someone who devoted some serious time into trying to become a master, that opening theory completely changes as you get to the master level and beyond.
In tournaments I would play a solid but relatively obscure opening as black that worked very well as a safe opening to guard against highly tactical book play, but when I really analyzed the entire line going out past 12-15 moves with a grandmaster, I learned that with careful play there actually was a way to gain a slight edge for white with it -- enough to make the opening uninteresting to most grandmasters. It would play well against masters, but not against a top GM who would know how to play out the line correctly.
And the claim that there would be no variation in moves between players if moves were objectively good is absurd nonsense. Just because not everyone plays the best move, that doesn't mean it's not the best move. Of course different players apply different heuristics -- some players are better than others. But in the vast majority of positions, all grandmasters will, given enough time for analysis, agree on the best move or a small number of equally good best moves. When there are multiple best moves, different grandmasters will choose different ones depending on their style, familiarity, opponents, and objectives (tournament players play differently when all they need is a draw than when they need to win).
Your previous comments, about "postrationalization", are also nonsense. Certainly GMs play intuitively in blitz games, but when taking their time they can always say why a move is better -- and they do just that in postgame analyses, many of which can be seen online. The explanations are given in terms of the major strategic factors of time, space, and material, or other factors such as pawn structure and piece coordination, or in terms of tactical maneuvers that achieve advantages in those factors ... or that result in checkmate (which can be viewed as infinite material gain, and many chess playing programs model it as such).
But chessplaying programs aren't goal driven. They evaluate such factors when they statically analyze a position, but they evaluate millions of positions and compare the evaluations and bubble these evaluations up the game tree, resulting in a single score. That score does not and cannot indicate why the final choice is better than others. Thus
"It doesn't seem too difficult to have an AI let you know which other strong moves it rejected, and to dig into its forecasts for how those moves play out compared to the chosen move to tell you why it makes particular scenarios more likely."
is just facile nonsense grounded in ignorance ... of course it can let you know which other strong moves were rejected, but it cannot even begin to tell you why.
"But that would just be postrationalization too... "
You keep using that word, in completely wrong fashion. The computer's analysis is entirely done before it makes the move, so there's nothing "post" about it. And it makes moves for reasons, not "rationalizations". Perhaps some day there will be AIs that have some need to justify their behavior, but the notion does not apply to transparent, mechanistic decision making algorithms.
With me knowing nothing about this particular pocket f the world, how does one live as a "Go professional"? Who pays for this? I don't imagine this is very attractive for sponsors, or do I underestimate how popular this is in Asia?
As it's a game more popular amongst older demographics, there tend to be a lot of wealthy patrons and supporters (individuals and companies) who sponsor tournaments and teams. One of the highest-paying competitions is the Ing Cup with a prize of $400,000. Japan has nearly 10 major year-long tournaments every year, totaling over $2mil in prizes, many are sponsored by major newspapers. China has domestic year-long leagues, where city teams each have their own sponsors. All the games I mentioned here pay a match fee whether players win or lose.
So yes, it is a popular game in Asia, however less so for the younger demographic and is unfortunately in decline. Most people just don't have the attention span, interest or time these days. :(
The big tournaments all have their own huge sponsors, often media conglomerates.
Actually Janggi, the Korean variant of Chinese chess (yes, there are some rule differences, but it's recognizably the same game - like Chess without castling and en passant) is very popular, though according to Wikipedia currently less so than Go.
People have learned to communicate heuristics which is very useful for beginning players. A grand master may not be able to communicate nuances of the game to low level players, but they do benefit from working in groups which suggests they can share reasons for a given move not just propose moves and have other independently evaluate them.
So, while people can't map out the networks of neurons that actually made a given choice, we can effectively communicate reasons for a given move.
However I think this area has been under-researched but is obviously important. It would enable very strong learning environments and man/computer hybrid systems. I think there's very direct relevance in safety/security critical systems and there's some literature in operators not understanding what is going on in complex systems and how that can be fatal (think power plants and the like)
If so, could it be just as easily programmed to answer those questions as it evaluates moves? That is, it seems the information is there to form those answers, but it's not the question the AI has been asked.
Historically there has been a back-and-forth with chess engines....
Early engines tried to really "smart", but consequently couldn't really analyze very deeply, as they spent a lot of time on each position. Newer engines mostly churn through really deep analysis, going many layers deep, but are making comparatively simplistic analyses.
For instance, it wasn't asked to evaluate the pawn structure and provide that analysis as an output, but it certainly could be programmed to do so.
This quite misses the point. These programs do that as a matter of course, for individual positions. But choosing a move is the result of evaluating many millions of positions and comparing the scores through tree pruning. The program cannot tell you that it chose one move over another because the chosen move is generally better for the pawn structure than the move not chosen, because it doesn't have that information and cannot obtain it.
Or, maybe you missed mine.
No. A chess playing program's move score is a value obtained from treewise comparisons of the static evaluations of millions and millions of positions.
Nobody says that the red koopa in Super Mario Bros. has bad AI.
Similarly long time ago, read about how AOE (Age of Empires) used a lot of computers to play against each other, then do statistic which units actually got used. The idea was to rebalance the units such that all are almost equally used (well in terms of computer-ai play).
I think these two articles were both in the same book, so I'll have to dig.
HOMM3 is also my favourite game, along with Disciples. I'm a big turn-based strategy fan :)
> You just run multi-PV analysis and play moves to keep the intrinsic performance rating at the desired level.
I thought that "just" was funny – it reminds me of this comic.
Stockfish does have a "skill level" setting, and it's not terrible at faking club-level play (if you have the Play Magnus mobile app, it's just Stockfish at different skill levels). However, as of 2013 the implementation is much more primitive than what I'm suggesting here.
To be clear, even though it's incredibly obvious, I've never seen this idea anywhere else. It first occurred to me after reading the original Guid & Bratko paper on intrinsic ratings in 2006. Happy to continue this offlist if you want to work on it. My e-mail is in my profile.
Having said that, I only added the link to that comic because I didn't want to just write a comment saying "thanks for the links", and I'm only replying again, because I'm hoping it continues the pattern where you keep replying to my replies with super interesting links!
The interesting problem is to make the computer play like a 2000-rated person, not just as well as a 2000-rated person. I'm a big fan of Regan's work, but I don't think IPR on its own is sufficient to make the computer play idiomatic suboptimal moves.
It made playing against a program a lot nicer than other chess programs I had tried.
The astronauts in 2001 play chess with the computer, and it sometimes loses on purpose to make the game more interesting for them.
I.e. there are various chess bots that can be assigned personality dimensions like aggressiveness, novelty, etc.
Or, more likely, with no understanding at all.
Is that not exactly how a human decides which move to play?
When playing, they have a strategy, which they could explain to other go players. They don't just recognize patterns or do brute-force look-ahead. The same is true for good chess players.
So if the best move has an evaluation of +0.05 and the second best has -0.02 , the difference is probably a very subtle positional improvement (and the first move may not in fact be better; chess programs aren't perfect). If the best is +3.12 and the second is -0.02, and you can't see why, there's a concrete material tactic you're missing (or, less likely, a major, obvious positional devastation).
But, it can't tell you what you're missing, just the magnitude of the difference.
Concept was the same as you describe. No rocket science. I purposely read nothing beforehand, because I wanted to devise an approach and this seemed the most obvious.
Of course, nowadays that is not the only technique in use.
In any case, I'm not sure why you think it impossible to add any additional analysis to the program as it repeatedly scores millions of positions.
To summarize, I believe what they do is roughly this: First, they take a large collection of Go moves from expert players and learn a mapping from position to moves (a policy) using a convolutional neural network that simply takes the 19 x 19 board as input. Then they refine a copy of this mapping using reinforcement learning by letting the program play against other instances of the same program: For that they additionally train a mapping from the position to a probability of how how likely it will result in winning the game (the value of that state). With these two networks they navigate through state-space: First they produce a couple of learned expert moves given the current state of the board with the first neural network. Then they check the values of these moves and branch out over the best ones (among other heuristics). When some termination criterion is met, they pick the first move of the best branch and then it's the other player's turn.
How is this calculated?
When some termination criterion is met
Were these criterion learned automatically, or coded/tweaked manually?
2. They just run a certain number of simulations, i.e. they compute n different branches all the way to the end of the game with various heuristics.
Existing research throws a bunch of professional games at a DCNN and trains it to predict the next move.
It generally does quite well but fails hilariously when you give it a situation which never comes up in pro games. Go involves lots of implicit threats which are rarely carried out. These networks learn to make the threats but, lacking training data, are incapable of following up.
The first step of creating AlphaGo worked the same way (and actually was worse at predicting the next move than current state of the art), but Deep Mind then took that base network and retrained it. Instead of playing the move a pro would play it now plays the move most likely to result in a win.
For pros, this is the same move. But for AlphaGo, in this completely different MCTS environment, they are quite different. Deep Mind then played the engine against older versions of itself and used reinforcement learning to make the network as accurate as possible.
They effectively used the human data to bootstrap a better player. The paper used a lot of other cool techniques and optimizations, but I think this one might be the coolest.
In this case though they play and optimize against themselves
By learning from other teachers, and by applying original thought. Also, due to innately superior intelligence. If your IQ is 140, and that of the teacher is 105, you will eventually outstrip the teacher.
As an AGAAmateur 4 dan I read 10 moves pretty regularly, that's including variations.
And if the sequence includes joseki (known optimal sequences of 15-20+ moves), then pros will read even deeper...
No it doesn't. You seem quite happy to just make stuff up that you know nothing about, like "2-3 moves into the future".
I think a key missing component to crowd success on real expert knowledge (as opposed to trivia) is captured by the concept of prediction markets. (https://en.wikipedia.org/wiki/Prediction_market) The experts who are correct will make more money than the incorrect ones and eventually drive them out of the market for some particular area.
With humans on the other hand, there will always be some discussion. And some human experts may be better at persuading other human experts or the combining entity.
I think it would be an interesting thing to try after they beat the number 1 player. Gather the top 10 (human) Go players and let them play as a team against AlphaGo.
The next two games, it seems like Fan Hui did not perform as well as the first (as opposed to the computer being clearly better than him). Where the games played in a row?
Regardless, I'm looking forward to the games with Lee Sedol. I studied in his school in Korea, and personally know how hard it is to get to that level.
My assessment is that the bot from those games will NOT beat Lee Sedol. So train it hard for march :)
You can see that in the first game the bot played really solidly without risk-taking and didn't want to lose even a few stones. You could say that it played very conservatively but solidly. It won by a tiny margin (2.5) so Fan Hui probably concluded that the bot would win every game by a similar small margin if he didn't change the style of play. I'm sure that in the first game Fan Hui was sounding the bot out for strengths and weaknesses, seeing if it knew all the tesujis and josekis and whatnot.
So from then on you see Fan Hui trying to mix it up and play more aggressively and what is very interesting is that he got outplayed in every game, even to the point of losing a big group and resigning.
So - if you play conservatively and tentatively and solidly it'll beat you by a sliver, if you try to out-think it it'll nail you. At least at the 2dan pro level.
I'd be hesitant for calling a Lee Sedol victory ahead of time. We know that in chess Kasparov beat IBM's bot initially but then IBM tweaked and within a couple of years the bot was too strong. Even though go is much harder than chess I predict that if Google lose this time and if they don't lose by much they'll win the time after that.
So, yes, you can cheat in a perfect information game.
If transivity applies then AlphaGo is likely stronger than the average of those former Japanese champions, including Norimoto Yoda, who is currently ranked at 187th (about 300 Elo rating below Lee Sedol and 300 above Fan Hui) .
There's a saying in Go circles that there is a substantial gap in playing style or intuition between pros and even top-level amateurs. Whether that is true or not, AlphaGo has definitely crossed the threshold to pro-level play in Go.
By March 2016, Google DeepMind would have improved AlphaGo somewhat at least through self-playing and perhaps more processing power.
The game with Lee Sedol will be an interesting one to watch!
You'll find lots of Korean pros who got awarded their 1p for going abroad and teaching. 8p and 9p in Japan, in my recollection, are reserved for those winning one of the major tournaments and defending a title there.
Is AlphaGo being made available to the public? I'm a mediocre player, but I'd like to try a few games against it. Current synthetic players don't quite play the same as humans do, and it's a bit jarring. I wonder if the Google AI is more human-like in its style.
Anyway, here's the news from the p.o.v. of AGA:
AlphaGo: using machine learning to master the ancient game of Go
Interesting that DeepMind was using Google Cloud for compute. I imagine that the MCTS expansion can become massive. Any chance DeepMind may publish some of the internals about how many instances were used, how computation was distributed, any packages or frameworks used, etc.
And congrats on achieving this impressive AI milestone!
The computation requirements for training and running are given in the paper in both the body and the appendix details.
The difference is 100 Elo, which is substantial.
Edit: Ke Jie has 700 Elo on Fan Hui. About the same as the gap between Magnus Carlsen and the strongest player at my local chess club.
It is telling that AlphaGo only won 3:2 for the informal games. As a computer doesn't know the difference between formal and informal this seems to indicate that Alpha isn't truly above Fan Hui in strength. Also the formal games were played with fast game rules, which may be particularly advantageous to computers. Unlike chess go accumulates pieces on board throughout the game. Towards the end there are many stones on board and it is easy for human to err while the search space (possible moves) actually gets smaller for the computer and there is no doubt that computer has stronger book keeping capabilities. So to fairly evaluate human vs computer we may need new time rules different from human vs human games.
The paper does not disclose whether the trained program displays true understanding of game rules. Humans don't just use pattern recognition, they also use logic to evaluate game state. While this could be addressed by the search part of the algorithm the paper doesn't appear to give any indication on whether this was studied. For example, the board position strictly speaking does not determine the game state due to ko rules (so the premise of the paper that there is a valuation function v(s), where s is the board position, that determines game outcome is incorrect). It would be particularly interesting to see how the algorithm fares when there are multiple kos going on at the same time. Also it would be interesting to see how well the algorithm understands long range phenomenons such as ladder and liveness. With a million dollar challenge in the plan it is understandable the Google team may not want to disclose weaknesses of the algorithm but in the long run we will get to know how robust it really is.
From my experience playing against conv nets I would say if you treat your computer opponent as a human it would be like playing against the hive mind of a group of experts with infallible memory and it is not to your advantage. So one would be better off trying "cheat" moves that human experts do not use on each other and see how well the computer generalizes. Without search and with neural nets alone it is clear that computers do not generalize that well. So it would be interesting to see how well search and neural nets work together and if someone could find the algorithm's weak spots.
I'm not sure if I'm misunderstanding you, or you're misunderstanding the situation, but the informal games had faster time controls than the formal ones: the formal games had one hour of main time, while the informal games just had byo-yomi (3 periods of 30s: if you take more than 30 seconds for a move, you "use up" a period).
No, it isn't, any more than the claim that there's a position evaluation for chess is incorrect because of castling and capture en passant. A "position" isn't just where the pieces are on the board, but includes all relevant state information.
As a non-expert, may I ask (as the term does not appear in the paper): How valuable is the Shannon number in order to evaluate "complexity" in your context?
Quoting from the OP paper:
"During the match against Fan Hui, AlphaGo evaluated thousands of times fewer positions than Deep Blue did in its chess match against Kasparov; compensating by selecting those positions more intelligently, using the policy network, and evaluating them more precisely, using the value network—an approach that is perhaps closer to how humans play. Furthermore, while Deep Blue relied on a handcrafted evaluation function, the neural networks of AlphaGo are trained directly from gameplay purely through general-purpose supervised and reinforcement learning methods."
"Go is exemplary in many ways of the difficulties faced by artificial intelligence: a challenging decision-making task, an intractable search space, and an optimal solution so complex it appears infeasible to directly approximate using a policy or value function. The previous major breakthrough in computer Go, the introduction of MCTS, led to corresponding advances in many other domains; for example, general game-playing, classical planning, partially observed planning, scheduling, and constraint satisfaction. By combining tree search with policy and value networks, AlphaGo has finally reached a professional level in Go, providing hope that human-level performance can now be achieved in other seemingly intractable artificial intelligence domains."
Also, what kind of hardware results in this level of play and how is hardware correlated with strength here?
There are graphs in the paper showing how it scales with more hardware.
Chess engines you can run today, for free, on your own laptop, are far and away better than Deep Blue (and any human), and I believe still don't reach Deep Blue's raw speed.
I can't find the claimed ELO for Jonny (current champ) but Junior (previous champ) is listed at 3200+, Magnus' top rating, the highest ELO rating ever, is 2882 for reference
There is a lot of politics in chess programming but the bottom line is that Komodo is currently the strongest program followed by Stockfish (which is distributed under GPL).
So what? The Stockfish app on your phone is stronger than Deep Blue was.
Also, will AlphaGo have access to Lee's previous games? Will Lee have access to AlphaGo's games for preparation?
is there any way to access the nature paper without a paywall(for academic, non-commerical use) ?
A few hours ago, Zuckerberg said:
The ancient Chinese game of Go is one of the last games where the best human players can still beat the best artificial intelligence players. Last year, the Facebook AI Research team started creating an AI that can learn to play Go.
Scientists have been trying to teach computers to win at Go for 20 years. We're getting close, and in the past six months we've built an AI that can make moves in as fast as 0.1 seconds and still be as good as previous systems that took years to build.
It'll be interesting (as in the old fake Chinese curse interesting) to know what happens when less democratic power structures wage war. They are still constrained by economics (trade creates value, if they bomb the shit out of someone they might find themselves cut off from trade, see embargoes and sanctions on Russia) and the possibility of an internal struggle (civil war) is always there.
I submit - League of Legends, any team video game.
Even at the very top pro level, matches are constantly being won & lost purely based on the initial hero drafting phase. Calculating the optimal draft is way more difficult than Go. It's probably not an exaggeration to say that the search space scale difference from dota draft to Go is about the same as from Go to tic-tac-toe. Because it's not only about the 100+ different heroes grouped into combinations, but also every possible game that can happen then with those combinations.
Then once the game starts, the A.I. may be able to respond to actions extremely quickly, with perfect precision. But what should the responses be? This is not a simple thing to answer and humans keep taking completely different approaches as our understanding of the game keeps evolving.
Also, the very best dota bots can currently only beat absolute beginners who don't understand the game at all yet. It takes a few hundred games of practice for a human to go beyond the best A.I. currently available.
 A game similar to League of Legends
 Directly reading from memory and then directly calling gameplay functions, basically a perfect hack, would get you what is known as "mechanical skills". For example being able to last hit.
You can just train the A.I. by replaying thousands of professional matches, and then let it train by playing billions of matches extremely quickly. It can even play both sides at the same time, and try out millions of different strategies against each other. It doesn't even matter if there's a limitation on how fast it can click and press keys, since every single action will be perfectly optimal.
Not only that, but you don't even need to write a single line of code to tell it about the rules of Dota. Before you start the training, it doesn't even have to know what each key does, or what happens when you move the mouse. Neural networks are capable of learning all of this from scratch, basically by trial and error.
This is not your typical dota bot. Bots are not A.I.s, so this is a whole different ballpark.
I think the primary problem is the search space size. I've seen this type of learning work on simple 8 bit games, and it seems we may finally be at the stage to handle Go. However Dota has many orders of magnitude more different possible moves at any given situation. The total search space grows incredibly fast after every move.
Thus, I do think neural nets can eventually learn how to play well, it's just that there's simply not enough memory or processing power to achieve any success right now.
The game field in a computer game will be quantized - even if it uses floating-point arithmetic that's still quantization. Conversely a go board can be scaled up or down without losing the feel of the game. If the grids were the same resolution then a given point in time you have many more choices in go because you can play literally anywhere.
1. The AI is a program running on the computer, so the input is the state of the game. This is basically just an aimbot, and would be trivial to do - just make it wander around randomly and headshot everything instantly.
2. The AI is looking at the screen like a human player, and has to parse the screen data (we could even let them have direct pixel input, not a camera). This would be much harder.
For a turn-based game like Go or Chess, the distinction is vague because the CV required to parse a board is fairly trivial and orthogonal to the problem of strategy.
Have you ever played Diplomacy?
Speaking of Go programs and computation requirements, one of the best performance hacks ever was done by Dave Fotland, the author of the program "Many Faces of Go", which was one of the top computer go programs from the '80s through at least around 2010.
He donated code from MFoG to SPEC, which incorporated it into the SPECInt benchmark. So, the better a given processor was at running Fotland's go code, the better it scored on SPECInt. Since SPEC was one of the most widely reported benchmarks, Intel and AMD and the other processor makers put a lot of effort into making their processors run these benchmarks as fast as they code.
Net result, Fotland had the processor makers working hard to make their processors run his code faster!
Facebook Paper - PDF
Pachi Original Paper - PDF
Fan Hui is 2p, so very skilled, but the ranking system goes up to 9p. To give a sense of how large a gap that is, there is and has only ever been one Westerner to achieve that rank, Michael Redmond. The article states they plan to face off against Lee Sedol 9p, and if they beat him in a no-handicap game, that will be as impressive as Deep Blue against Kasparov.
You'll want to watch this year's Computer Go UEC Cup in March, with Zen and CrazyStone being the typical victors. (CrazyStone in particular has sustained a 6d rating on KGS, a popular Go server, but lately has been at 5d. In the past it has beaten a 9p with a 4 stone handicap, which is impressive, but that handicap is huge and it hasn't yet won with 3 stones.) Of interest this year is that Facebook is competing, and AFAIK they seem to take a similar approach as Google by training the AI using deep learning techniques and then strengthening it further with MCTS. In their public disclosures they claim to beat Pachi pretty often, which puts their bot around 4d-6d, it'll be interesting to see how it fairs against Zen and CrazyStone in the Cup and if it wins against a 9p.
Even though Fan Hui is not an active professional in the traditional sense this is an absolutely huge accomplishment and leap in playing ability by computers. BTW, Michael Redmond is not particularly strong by professional standards.
(Edit: The correlation between strength and rank now as noted below is due to promotions more often coming from acheivements: If you win at X you get promoted to 7p immediately, if you win at Y you get promoted to 9p. You cannot win a big tournament and keep your low rank. Here is an example of someone who went from 3p to 9p in one match: https://en.wikipedia.org/wiki/Fan_Tingyu. But professionals cannot give each other 6 stone handicaps when one is 9p and the other is 3p)
Taeil's method seems unusual, though it may well be justified . I think he's using a relatively complete database of games, but I don't know for sure. Coulom has a very well regarded mathematical model, but we know that there are some gaps in the database, which a) may skew international comparisons, and b) may result in inaccurate ratings for players with few games in the database (but those players are usually not top players in the world).
See my comment below: there are very few 1p players near the top, unsurprisingly.
Though I should warn: as a low level pro who moved to Europe, this database has very little data on him. His European results indicate that he's stronger than Pavol Lisy, Alexander Dinerstein, and Mateusz Surma, who all rate higher than him here.
Overall I think it was a well balanced article highlighting an impressive achievement, not sure why you felt the need to diminish it as not significant?
In parallel there's the professional ranking system which also uses the title "dan" but is bestowed by the professional Go associations of every country. These rankings are symbolic instead of quantified, although generally higher professional dan ranks cannot be anything but corresponding to higher skill.
So, a "European 2d" is a professional rank which may or may not have a good translation into a quantitative scale like ELO or "KGS dan" (I actually don't know). Generally, my understanding is that European professional rankings lag behind Asian ones as well by some amount.
4 stones is a huge handicap? 4 stones is a huge handicap if he was playing against a 6p maybe. But with 4 stones handicap anything near 2-4p would still have a hard time.
I don't play Go, but I dabbled a little bit in Chess, and it seems like the engines with "difficulty settings" don't really make the sort of mistakes that humans make. They can, of course, just over-prune the look-ahead tree, but that's not how the human brain works.
It seems quite likely that the book that paper was in was Computers, Chess, and Cognition https://www.springer.com/us/book/9781461390824
(revised contributions from the WCCC 1989 Workshop New Directions in Game-Tree Search, May 29-30, 1989)...and was probably the one by John McCarthy (yes, that McCarthy)
Edited to add: Found it! The paper I recall reading was 'Artifical Stupidity' by William Hartson, in Advances in Computer Chess 4 (1986) https://chessprogramming.wikispaces.com/Advances+in+Computer... ... perhaps nothing concrete came of this though - Hartson was a player and author, not an AI researcher.
Most of them are more "play like God then hang their queen".
The engines now are so strong I can't beat them with Queen and Rook odds so even that doesn't really help.
AlphaGo played Fan Hui using 1202 CPUs and 176 GPUs.
It's strenght was assessed to be 3140 ELO on the scale used by www.goratings.org/ (BTW, this is different from the European ELO system)
That would put it at #279 in the world. Fan Hui is number 633 at 2916. AlphaGo's next opponent will be Lee Sedol, #5 at 3515.
The single computer version of AlphaGo is estimated to be 2890 ELO, which would be #679 in the world. It's is closer to what we might be playing against on our laptops and phones but still 48 CPUs and 8 GPUs.
I never thought that I would see a professional level AI Go player in my lifetime. I am taking the singularity more seriously.
> Although it has been proven that the evaluation of moves in MCTS converges to the minimax evaluation, the basic version of MCTS can converge to it after enormous time. Besides this disadvantage (partially cancelled by the improvements described below), MCTS has some advantages compared to alpha–beta pruning and similar algorithms.
Unlike them, MCTS works without an explicit evaluation function. It is enough to implement game mechanics, i.e. the generating of allowed moves in a given position and the game-end conditions. Thanks to this, MCTS can be applied in games without a developed theory or even in general game playing.
> The game tree in MCTS grows asymmetrically: the method concentrates on searching its more promising parts. Thanks to this, it achieves better results than classical algorithms in games with a high branching factor.
> Moreover, MCTS can be interrupted at any time, yielding the move it considers the most promising.
A multi armed bandit problem offers you one decision to pull one of n levers, so it's like a wide shallow game tree where the root has n leaves.
It was easy to ramp up to many many cores.
I have no idea how to play the game well, never learned, but because the win/lose rules are so clear and using this method, no need to build in game/move theory. Still got great results compared to "advanced" players' games.
If anyone would find this interesting work to collaborate on feel free to contact me. The MCTS part is done.
Watch 'Game Over: Kasparov and the Machine' for a good description of what I mean. And yes, I bet many people here hate that movie. That's exactly what I'm talking about.
There are a lot of relatively simple chess programs that can beat most amateur players. Personally, I cannot beat the Go program I have on my cellphone. (I am pretty horrible at the game.) None of that seems newsworthy. And yet after 5 wins against a pretty strong (but not world's best) Go player, and it's suddenly a "breakthrough" that changes everything.
If some new chess player have beat Deep Blue 4 to 2, would we revert the perceived status of AI to what it was before that match with Kasparov?
I'm not saying this advance in AI is unimpressive. I just don't see a fair evaluation of what it means. There is too much one-directional hype.
Not saying I disagree with much of your comment, but I think the answer to that question is actually, "yes". As a middling chess player, I think it would have been impossible for "a new chess player" to have beaten Deep Blue 4 to 2, and these days, it requires near-best-in-the-world prowess to pick up any games at all off the best chess computers, which I think proves that it wasn't hype-y or unfair to claim that the machines have beaten us at this.
You may be missing my larger point. It is not about whether humans can beat the best chess or Go computers. It is about a very strong bias in favor of algorithms in general. The dynamics around human vs computer chess matches simply demonstrated this bias in an easily percievable form. Who cared about Kasparov's later victories and draws against other state of the art programs? They received very little coverage.
Chess is just a game. The important thing to consider is what will happen when algorithms start to competre with people in more complex and ambiguous areas.
It is old news for humans to beat computers at Go, but the reverse is still new news.
I agree with you that people who infer an impending AI takeover of the world from computer success in specific games are just being silly.
They frame it as some huge setup by IBM, but to me, it just seemed like Kasparov got psyched out and thrown off his game.
We don't know how big it is, because they beat Fan 5-0, so that only places a lower bound on how good the bot is.
A few neural nets, a lot of MCTS, and a little touch of ingenuity.
"The search space for Go's game tree is both wider and deeper than that of chess. It has been estimated to be as big as 10^170 compared to 10^50 for chess, making the normal brute-force game tree search algorithms much less effective. "
To provide a little more perspective. The estimated number of atoms in the observable universe is about 4*10^80. So we have no hope of ever being capable of "solving" go by mapping out every possible game state AKA "brute forcing" the game.
The difference between Chess and Go is critical here. I don't know if there's any formal analysis of this sort, but the difference seems to be that in Go multiple paths to the same board state are very similar in evaluation. And so sets of moves can be evaluated as a batch or stochastically. I think it was an algorithm that exploited this property that saw one of the first significant jumps in computer Go strength. Contrast this with chess where different paths to a given board state vary so widely in evaluation that you must evaluate them all. I suspect it is this property that allows Google's technique to be effective in evaluation positions. But I don't see it extending to games that aren't similar to Go in this regard, like chess.
Not at all, different paths to a board state also have widely varying evaluations in Go. Joseki is a great example of this, especially since these sequences have been thoroughly analyzed. In most joseki playing any move out of order will leave weakness and a skilled opponent will not just continue to play the pattern out of sequence.
I definitely see how the more simplistic movement rules (and lack of variety of piece types) in Go would make it easier to cull off large swathes of the search space. Despite this my gut tells me that the branching factor is still the best benchmark we have (and easiest to convey) for comparing the computational difficulty of solving a board game with an adversarial search algorithm.
"Some of the factors responsible for the game tree's enormity are shown below. (A comparison to chess is shown in brackets)
the size of the board : 19x19 (8x8)
the average number of moves per game is around 300 (80)
the average branching factor at each turn is around 235 (35)
players can pass"
It's been 10 years since I did the math for this in my AI class but a simplified example would be to compare the start of each game. a go player has 19^2 or 361 possible moves. then on the second turn there are 360 possible moves (since one space has been taken up by the first move) so after two moves there are 361*360 or 129,960 possible go board configurations.
a chess player is more restricted by the smaller size of the board and the way pieces are permitted to move. so there are a total of 16 possible pawn moves on the first turn plus 4 possible knight moves leaving 20 total moves available. On the second turn the second player is similarly resticted to 20 possible moves causing the total number of chess board configurations after two moves to be 400. Once more room opens up on the chess board the branching factor does increase a bit but it never comes close to the hundreds of potential moves at any step during a go game.
this "branching factor" is the key to the size of a search space. in chess it gets messy to calculate due to the variety of piece types and their allowable movements but it's easy to see that at any given board configuration there are fewer possible moves. to calculate the search space you multiply the number of potential moves at each step until every board configuration has been accounted for.
Sooner or later we may discover some new algorithm (this deep-neural-net-based approach may be it) which will reduce Go to be as tractable as chess, or even more so.