
I don't know much about Go, but I thought a comment about AlphaGo was interesting: it played in a way to marginally beat the opponent, which is different from how most masters play, which is to clearly beat the opponent by as wide a margin as possible.

Is this accurate? Does MasterP also use this style? Are there humans that can play this way?

(I'm asking you because you seem to know what you are talking about here.)




That's not true. Go is a game where the safety of the lead matters more than its size at all times, and that's taken into account strategically all the time.

What AlphaGo showed, however, is that once in the lead it made mistakes: small mistakes that didn't jeopardize the game but shrank the lead. If all roads lead to Rome, it doesn't matter which is shorter. Humans, however, always look for the best move once safety is assured.

What was terribly cruel to see as a Go player was the computer playing poorly that early on: it showed that it already knew it was going to win.


Modern chess engines take a lot of effort to hide this problem. For example, an engine might know from a tablebase that it can win a king+rook vs. king+pawn endgame, and then throw away its second rook to reach such a position.

Humans using the engine don't like this, however, so the authors build in heuristics to make it play more "humanly". Things like reporting scores for positions, rather than a "chance of winning", are also largely for the sake of the users.
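To make that score-vs-win-probability point concrete, here's a rough sketch of the kind of logistic mapping chess tools use to present an internal win probability as a familiar pawn-unit score. The scale constant below is made up for illustration, not taken from any particular engine:

```python
import math

# Hedged sketch: engines that internally reason in win probability often
# show users a pawn-unit score via a logistic mapping. The scale of 400
# is an illustrative convention, not any engine's actual value.
def winprob_to_pawns(p, scale=400):
    """Invert a logistic win-probability model back to a score in pawns."""
    p = min(max(p, 1e-9), 1 - 1e-9)   # clamp to avoid log(0)
    return scale * math.log10(p / (1 - p)) / 100

print(winprob_to_pawns(0.5))  # dead-even position -> 0.0
print(winprob_to_pawns(0.9))  # clearly winning -> about +3.8 pawns
```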


Humans play for a large lead because they don't have enough memory/power to accurately estimate the value of their positions, so they play for a buffer. AlphaGo has higher confidence in its valuation, so it can play closer: ~85% confidence of winning by 5 stones (with room for error) vs. a 99% chance of winning by 2 stones.
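As a toy illustration of that tradeoff, using only the hypothetical numbers from this comment:

```python
# Illustrative numbers only: compare a margin-maximizing choice with a
# win-probability-maximizing one.
def expected_margin(p_win, margin):
    """Expected final margin, counting a loss as zero for simplicity."""
    return p_win * margin

aggressive = (0.85, 5)  # ~85% confidence of winning by 5 stones
safe = (0.99, 2)        # 99% chance of winning by 2 stones

# A human playing for a buffer effectively maximizes expected margin:
assert expected_margin(*aggressive) > expected_margin(*safe)
# A pure win-rate maximizer ignores margin and takes the safe line:
assert safe[0] > aggressive[0]
```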


I believe that accuracy/"self-confidence" is part of it. However, I think it's also the case that AlphaGo runs a Monte Carlo tree search on top of the neural net, so it sometimes plays more conservatively than it needs to because it overweights obscure possibilities ("defending here is not necessary, but by doing so I prevent some number of playouts where I play a dumb move and lose, and I can still win even if I defend").

Humans do the same thing, playing conservatively in a situation where they're far enough ahead. The difference is that a human sometimes looks at a move and says "This move works, and gains points. There is no risk." For a bot using MCTS, everything is a probability.
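A minimal sketch of that last point: a playout-based evaluator only ever produces an estimated win rate, so even a move a human would call risk-free comes back as a probability. The `simulate` functions here are stand-ins for real playouts:

```python
import random

# Stand-in playout evaluator: a real MCTS bot would finish the game with
# (semi-)random moves; here `simulate` is any function returning a win/loss.
def playout_winrate(simulate, n=1000, seed=0):
    """Score a move as the fraction of random playouts it wins."""
    rng = random.Random(seed)
    return sum(simulate(rng) for _ in range(n)) / n

# A move a human would call "this works, there is no risk" still only
# comes back as an estimated probability (here 1.0):
certain = playout_winrate(lambda rng: True)

# An unnecessary defensive move that prunes the rare playouts where the
# bot plays a dumb move against itself and loses:
defensive = playout_winrate(lambda rng: rng.random() > 0.01)

print(certain, defensive)
```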


Another factor is that many micro-endgame sequences simply have a "correct answer" that loses the fewest points. Any human who has played Go for more than a few months knows these sequences, and if they choose to answer a certain move, they will prefer the answer which is "always correct" over another move which would also win the game. This naturally leads to the winning player preserving their margin even when they could throw it away, while the machine has no such bias and will just as happily throw away the margin as preserve it.


I think this is somewhat incorrect. The creators of AlphaGo made it clear that their system does not take the opponent into account at all; it just answers the question "What is the strongest move right now?" and plays that move. In other words, it has no mental model of the opponent.

However, you are correct insofar as it doesn't care about winning by large margins: it prefers winning by a smaller margin if it can achieve that with a higher win probability.


A perhaps more accurate way to say it is that AlphaGo models its opponent as a copy of itself.


So do human players, with rare exceptions.


Playing trap moves is not that rare, and it shows that you expect the opponent not to know a complicated variation.


Strong players don't play "trap moves".


Who said anything about strong players?


> AlphaGo


Human players are not AlphaGo.


The context of this thread is AlphaGo. Why would we bother discussing what bad human players do when the topic is clearly about play at the highest levels?


The highest-level players play special variations against each other, hoping that the other player doesn't know them. They CLEARLY play against what they think the other player knows, not just what they themselves know (since they have studied the variation on purpose while preparing).


As a semi-pro player, I can assure you that's not how you play Go.


Interesting. From chess it seems that, for important matches, players will deeply study each other's games and try to get the other player into positions that they may be less used to playing and less comfortable with.

Is this not done in go?


It's more important to smooth out your own weaknesses than to look for your opponent's. A weakness in your style of play is going to lose you far more games than your ability to find some weakness in another player will win you.

To become a pro you have to go through insane levels of competition. You need to be strong, not to find a weakness in each of the hundreds of players you will face, just to have a shot at becoming the lowest level of professional.


That's just an appeal to authority. Lee Sedol's move against AlphaGo in game 4 could be considered a trap, because it actually didn't work, but it was complicated enough to trick AlphaGo.


Appeal to authority is only deductively invalid, not inductively invalid.


But it's not even a good authority: strong amateurs (semi-pros) have many opponents (amateur tournaments are played in a single day or weekend, with many matches), while top players prepare specifically for one opponent in tournament finals (each match played on a different day).


You never study your upcoming opponent's previous games?


As a pro you study everyone's games as a means of getting all the latest information possible, but there isn't really much you can do as a top pro to tailor your play to your opponent's particular style: the era when that could bear fruit ended decades ago.


For example, the US Congress has a lot of matches in a day, so you don't even know who you're going to face. Only top players study the games of their opponents, because they have time to prepare for the finals of big tournaments.


All competitive-game-playing AIs ask, "What will my opponent play in response to this move?" It's possible for an AI to evaluate a move based solely on the resulting board position, but it wouldn't be very good. Pretty much all AIs play many turns out to see if the move is any good. In the case of AlphaGo and Monte Carlo tree search, they actually play to the end of the game many times. To do this, they must of course play moves for each player.
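A toy illustration of playing many games to the end, using a trivial take-1-or-2 pile game instead of Go (all names below are invented for the sketch). The point is that both sides are played by the same policy, i.e., the search effectively models the opponent as another copy of itself:

```python
import random

# Toy stand-in for Go: players alternately take 1 or 2 stones from a pile;
# whoever takes the last stone wins. Purely illustrative, not AlphaGo code.
def rollout(pile, to_move, rng):
    """Finish the game with the SAME random policy for both players: the
    search has no model of this particular opponent, only of itself."""
    while pile > 0:
        pile -= rng.choice([1, 2] if pile >= 2 else [1])
        to_move = 1 - to_move
    return 1 - to_move  # the player who just emptied the pile won

def move_winrate(pile, move, n=2000, seed=1):
    """Score a candidate move purely by playing many games to the end."""
    rng = random.Random(seed)
    return sum(rollout(pile - move, 1, rng) == 0 for _ in range(n)) / n

# From a pile of 4, taking 1 leaves the opponent a losing position under
# perfect play; random playouts already prefer it to taking 2:
print(move_winrate(4, 1), move_winrate(4, 2))
```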


Ah, but I think the key here is that it doesn't ask "how will this player respond" but "how would a player respond".

No matter how I've played against it to get to where we are, it'll play the same from that point on. It won't identify me as a risky player from my pattern, nor will it try to classify me as "unpredictable" in some way. It'll play each move as though it had just sat down at an already-in-progress game between two random opponents.

> It's possible for an AI to evaluate a move based solely on the resulting board position, but it wouldn't be very good. Pretty much all AIs play many turns out to see if the move is any good.

I would strongly argue that these are identical situations. Playing out scenarios in your head but taking into account no history is the same as "evaluating a move based solely on the board position".


All players play to marginally beat their opponent. Playing too aggressively when you're in the lead is "losing a won game" because it gives opportunities for your opponent to create turnarounds.

At the end of a well-played game, the losing player will find that they perhaps get more points than they expected, but each time the leader wins a tradeoff by a slight margin, the position on the board becomes much more solid and fixed than it otherwise would be. A 10-point lead with large variance consolidates toward an unfaltering half-point lead.

Indeed, commentary on Master(P)'s games seemed to suggest this was exactly how its endgames went.

Early in the game, confidence in winning is usually correlated almost exactly with point margin... but at the same time, the score in the early parts of the game is very hard to estimate. It has repeatedly been noted that AlphaGo plays a very "influence-oriented" game, meaning that it eschews confidently holding many points in favor of having lots of "power" on the board, which will later translate into points.

So, all together I'd say that AlphaGo plays very well here and doesn't have too much of an obvious computational bias.

The one true oddity is that once AlphaGo's internal win probability reaches 100%, it starts playing idiotic moves. The reason is simple: it's just searching for the moves with the highest expected win rate, and at that point nothing it can do is bad. It won't lose points or anything; it just plays moves that are obviously pointless.
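A tiny sketch of that failure mode: once every candidate move reports a saturated win rate, an argmax over win rates has nothing left to distinguish them. The moves and values below are invented for illustration:

```python
# Invented moves and values: once the search saturates, every candidate
# reports the same win rate, and argmax has nothing left to prefer.
win_rate = {
    "capture the dead group": 1.0,
    "solid connection": 1.0,
    "first-line nonsense": 1.0,  # the "idiotic" move
}

best = max(win_rate, key=win_rate.get)  # ties broken arbitrarily
# Any of these is "best"; none jeopardizes the win, so the engine may
# happily pick a move a human would call pointless.
assert win_rate[best] == 1.0
print(best)
```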



