Not very technical, not about Go tactics either, but it's just a very well-done movie about the people involved.
(I wouldn't worry about the criticism from the know-nothings below; I doubt a single one has ever had the slightest involvement in making a film so they're just ignorant loudmouths)
If a person likes soap operas, they could enjoy the movie.
Calling a movie bad doesn't diminish the original event. It just criticizes the movie itself.
> Only on HN will you see a recounting of a massive achievement of humankind dismissed offhandedly like this.
I'm not dismissing an achievement of humankind. I'm dismissing the PR piece they put out about it.
Do you struggle to see the distinction?
Sure, I’d like it if they discussed the algorithm and the code but you need to entertain a regular audience.
I started showing people ChatGPT when it first came out, they shrugged, they didn't get it. Most people still don't get how important generative AI is and will be. Eventually, they'll have that moment too.
I have seen ChatGPT. Until they fix the error rate, I see it as a novelty. A toy you can’t count on nor offload responsibility to.
And technically an opinion is not a hard fact and us humans have plenty of opinions.
See https://arxiv.org/abs/2211.00241 and https://goattack.far.ai/
The best Go programs have a flaw that allows a good, but not championship-level, human to defeat them by creating a group that encircles another group. This apparently confuses the AI's method of counting "liberties", which determine whether a group lives or dies.
Some appear to dismiss this as just a "trick", but it seems to me to point to a more fundamental deficiency in the architecture or training method.
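For anyone unfamiliar with the term: "liberties" are the empty points adjacent to a connected chain of stones. Here's a minimal sketch (my own illustration, not code from the paper) of how an engine might count them with a flood fill; the adversarial attack exploits positions where the *network's learned judgment* of group status goes wrong, not a routine like this:

```python
def liberties(board, start, size=19):
    """Count liberties of the chain containing `start`.
    board: dict mapping (row, col) -> 'B' or 'W'; absent keys are empty.
    """
    color = board[start]
    chain, frontier, libs = set(), [start], set()
    while frontier:
        r, c = frontier.pop()
        if (r, c) in chain:
            continue
        chain.add((r, c))
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < size and 0 <= nc < size):
                continue
            stone = board.get((nr, nc))
            if stone is None:
                libs.add((nr, nc))       # empty neighbour = a liberty
            elif stone == color:
                frontier.append((nr, nc))  # same colour = part of the chain
    return len(libs)

# A lone stone in the middle of an empty board has 4 liberties.
print(liberties({(3, 3): 'B'}, (3, 3)))  # -> 4
```

A chain with zero liberties is captured; the cyclic-group positions in the paper set up large encircling chains whose life-and-death status the policy/value network misjudges.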
They achieved a <10% win rate against other engines https://goattack.far.ai/transfer#contents, so the strategy is not that generic. Edit: actually it was 66% against another bot, https://goattack.far.ai/human-evaluation#human_vs_lz4096 but they had to bring the visits down to 4096, which I assume means that at a "normal" visit count the bot would still win.
Still, that paper is extremely interesting, consistently triggering suicidal behaviour in a super-human bot.
A human can, technically, beat a superhuman Go AI. But the process is clearly playing the opponent rather than the game: the moves are obviously weak. Humans aren't winning by playing good moves; the challenge being posed to the AIs isn't intimidating at all, and they will defend against it sooner or later.
The beauty is in a computer saying "fuck you, I'm going to win, this isn't a poetry slam, all I need to do is beat you by a single point" and demolishing the best opponent humanity had to offer.
—Lee Sedol (9d)
1. The human has to play perfectly.
2. The human has to play perfectly and quickly.
The premise being that an AI with less time to calculate its moves could result in an advantage to the human.
Similar to what is being called hallucination in the LLM space.
The human's move seemed incredibly improbable to AlphaGo, and it ended up giving the human the upper hand.
Lines in Go are counted from the edge of the board. Here's a visual of the 3rd line for example: https://senseis.xmp.net/?ThirdLine
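Since lines are just distance from the nearest edge, the counting rule can be sketched in a couple of lines (my own illustration, using 0-indexed coordinates):

```python
# A point's "line" is its distance from the nearest board edge, counted from 1.
def line(row, col, size=19):
    return min(row, col, size - 1 - row, size - 1 - col) + 1

print(line(0, 5))   # a point on the edge -> 1st line
print(line(2, 10))  # -> 3rd line
print(line(9, 9))   # tengen (the center point) -> 10th line
```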
The 1st line is uninteresting. The point of Go is to surround territory. You cannot surround any territory on the 1st line. Players try to avoid playing on the first line until the end game.
The 2nd line is called the "line of defeat". It really only "catches" 1 point of territory (the point on the first line). If players take turns playing on the 2nd and 3rd lines next to each other, with the 2nd line player taking 1 point of territory, and the 3rd line player taking no territory but outward influence, it is considered a great victory for the 3rd line player because center influence is generally counted as worth 2 points per stone of influence. This is a loose count, because it's not actually any real points, but generally accepted as reasonable. Here's a visual: https://senseis.xmp.net/?TheSecondLineIsTheRouteToDefeat
Side note: "Influence" is the term used to describe how stones facing toward the center affect the flow of the game. They don't give direct points, but a skilled player can use their influence throughout the game to control the direction of the game and thus gain points in the future.
The 3rd line is the "line of territory". Each stone here gets about 2 points of territory. Players are usually happy to be able to make moves along the 3rd line, especially if they can do so while doing something else, or while maintaining control of play.
The 4th line is the "line of influence". Similarly to the 3rd line, players are often happy to be able to play moves along the 4th line because stones on the 4th line will be advantageous throughout the game. While plays on the 3rd line often don't give influence (because their influence is easily countered by 4th line plays by the opponent), players are happy with the territory they provide. Similarly plays on the 4th line don't give territory (the territory can somewhat easily be scooped up by plays along the 3rd and 2nd line), but players are usually happy with the influence the 4th line provides.
Thus the 3rd and 4th lines are the most common lines for play. 3rd line for when a player wants territory, 4th line for when a player wants influence.
The 5th line is very much approaching the center of the board. https://senseis.xmp.net/?FifthLine . While it gives similarish influence to the 4th line, it is even easier to scoop out territory from under it. Usually players avoid playing on the 5th line unless there's a specific reason such as strengthening a position or pressuring an opponent. It's not an unplayable move to play on the 5th line in general, and some players experimented with playing more on the 5th line, but it's not considered as valuable as the 3rd and 4th lines.
A shoulder hit is a tactical move where a player pushes their opponent from behind. Usually it turns into a move where both players end up trading moves along 2 different lines. https://senseis.xmp.net/?ShoulderHit
As such, shoulder hits have historically been very common on the 4th line. This happens when a player has a stone on the 3rd line and their opponent plays an attack move on the 4th line diagonal to it. Often both players will take turns from there strengthening their position along the 3rd and 4th lines. The 3rd line player takes territory and the 4th line player takes influence. This is often considered a fair trade.
But AlphaGo played a shoulder hit on the 5th line. This looks like a rookie mistake because it forces the opponent to take territory on the 4th line. If both players take turns building from there, the 5th line player gets "2 points" of influence while the 4th line player gets "3 points" of territory... for every stone played on these lines! This is the kind of move beginners are commonly warned against: "do not shoulder hit on the 5th line". It is a mistake. Most people just learn not to consider it.
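Using the loose per-stone values above, the arithmetic of these exchanges can be made explicit (a sketch with the thread's rough numbers, not a real evaluation):

```python
# Net points for "my" side after both players push for `stones` moves,
# given the rough per-stone values discussed above.
def trade(stones, my_value, their_value):
    return stones * (my_value - their_value)

# 3rd line (2 pts influence) vs 2nd line (1 pt territory):
# the 3rd-line player nets +1 per stone.
print(trade(10, 2, 1))   # -> 10

# 5th line (2 pts influence) vs 4th line (3 pts territory):
# the 5th-line shoulder hitter nets -1 per stone.
print(trade(10, 2, 3))   # -> -10
```

This is why the move reads as a beginner's error: by this naive count it loses a point for every exchange, and AlphaGo played it anyway.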
I hope this helps :)
That didn't age well
I think that's the huge flaw in all of these ML systems. They don't build fundamental understanding. We're brute forcing it in a way, but perhaps we're losing something in the long tail.
I think the ability to learn by self-play (essentially in a closed room, without external training data) is where the line between "fundamental understanding" and "regurgitating information" lies for these AIs.
Not saying that computers will think or not, just saying we have new challenges before the Turing test.
Same with drawing images and understanding language: this is not solved yet.
This is like showing an answer on an exam but failing to explain how you got it. I doubt you can get away with this.
> But humans failed to find an algorithmic solution for Go.
sure we have algorithmic solutions for go, they're just not very good.
> All they could do is to throw a lot of data and get a bunch of coefficients without discovering underlying rules.
that's not completely true either, the special thing about ~alphago~ alphazero* was that it learned by playing itself instead of learning from a pre-recorded catalog of human games (which is the reason for its - for humans - peculiar playstyle).
now i'm not sure how you're arguing a neural network trained to play go doesn't understand the "underlying rules" of the game. to the contrary, it doesn't understand ANYTHING BUT the underlying rules.
explaining why you did something isn't always easy for a human either. most times they couldn't say anything more concrete than "well it's obviously the best move according to my experience" without just making stuff up.
*edit: mixed up alphago and alphazero
In other words, the algorithm is very long for a relatively reduced programming language.
We don't understand the explanation, though it is correct. Not sure if the problem here is with the capabilities of the examinee or with the examiner.
The explanation is perfectly sensical, just too complex for humans to understand as the model scales up.
The thing you're looking for -- a reductive explanation of the weights of an ANN that's easy to fit in your head -- does not exist. If it were simple enough to satisfy your demands, it wouldn't work at all.
Meanwhile, domains like stock markets involve things like partial future prediction, which means all possible outcomes are not calculable in finite time; hence they use things like ML/AI.
Sonny: Uh... yes?
Oh how the tables have turned.
Sonny: "...oh god"
That's the neat thing about exponential curves, you always feel like you're at the fun part of them.
The singularity is defined as a moment in time when the A.I. improves itself to such a degree that humanity can no longer keep up.
Throwing more compute power vs a single opponent is not the same thing.
How would this computer fare against the top 10 best players collaborating(or even top 90-100)? I would bet it would lose big time.
It is unlikely, also, that a committee of players would be significantly better than a single master, due to lack of coherence -- but that's an interesting idea! I wonder if a committee of the top 100 Go players playing a game by vote could beat someone in the top 10 more than 20-0 or something; I doubt it -- it might even go the other way (the single player might win the series).
I don't think this counts as the real "start of the singularity", because AlphaZero was not capable of altering its own algorithm, only adjusting its weights.
Something more akin to being in the long march toward general AI.
As a personal note, the whole issue of LLMs' capacity for intelligence, beauty, humanity, morality, logic, etc. was softened in my mind and heart by witnessing with rapt attention this epochal shift in computing.
I had held Go up as a paragon of human brilliance and beauty -- to see that standard fall was a complex process of grief and discovery for me, which I feel has better prepared me for understanding and appreciating the emergence of LLMs
Winning, I think, is secondary to this. It's a useful measure of how one has progressed in that transformation, but I think the lessons and principles from Go that I can apply in guiding my day-to-day life are more valuable.
> How would this computer fare against the top 10 best players collaborating (or even top 90-100)? I would bet it would lose big time.
The top 10 chess grandmasters in the world working together could not beat the best (or even "mediocre") chess engine. Not even close. They're practically playing different games.
Against the AlphaGo of 2015? They might win, but probably not (I think you're overestimating how much collaboration would help). Against today's AlphaGo/KataGo/FineArt/etc there's literally zero chance, even with a two stone handicap. Same goes for 100 GMs playing collaborative chess against Stockfish.
(that said, I agree calling this the singularity is overkill)
I think even taking that into account, RL has only reached a tiny fraction of its potential. We have focused so much on supervised and unsupervised learning for so many years, and then been wowed by LLMs, that we have only seen RL start to impact industries in self-driving/flying vehicles, and we forget about all the other potential.
The thing about RL that people don't seem to understand is that it is mathematically proven to find the optimal control policy.
In the context of go, that means the only way it can be beaten is through variance (or "luck", if you prefer). As there is no dice or random element in go, the top players in the World basically have to be optimal in every move to get a draw. And then again, and again, and again.
And that's the best they can do if the RL algorithm has stopped learning - it's found an optimal strategy, and it can't be beaten, only matched.
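The convergence claim can be illustrated on a toy problem: for a known, finite MDP, value iteration provably converges to the optimal value function. Here's a minimal sketch on a hypothetical 3-state MDP (nothing to do with Go; all numbers are made up for illustration):

```python
GAMMA = 0.9
# transitions[state][action] = list of (probability, next_state, reward).
# State 2 is terminal.
transitions = {
    0: {"safe":  [(1.0, 1, 0.0)],
        "risky": [(0.5, 2, 2.0), (0.5, 0, -1.0)]},
    1: {"safe":  [(1.0, 2, 1.0)],
        "risky": [(1.0, 2, 0.0)]},
}

def value_iteration(eps=1e-10):
    """Repeatedly apply the Bellman optimality backup until values stop moving."""
    V = {0: 0.0, 1: 0.0, 2: 0.0}
    while True:
        delta = 0.0
        for s, acts in transitions.items():
            best = max(sum(p * (r + GAMMA * V[ns]) for p, ns, r in outs)
                       for outs in acts.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            return V

V = value_iteration()
print(round(V[1], 6))  # "safe" from state 1 is worth exactly 1.0
print(round(V[0], 6))  # fixed point of the "risky" backup: 10/11 ~= 0.909091
```

The important caveat (which the "mathematically proven" claim glosses over) is that these guarantees hold for tabular settings with known or sufficiently explored dynamics; deep RL on Go-sized state spaces relies on function approximation, where the clean convergence theorems no longer directly apply.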
Think about all the optimisation and control problems out there that could benefit from this. And yet still we seem to think it's like supervised/unsupervised learning and only "accurate" to ~90+%, and so it doesn't get the attention it deserves.
Or perhaps I'm a dreamer and an optimist and you're right.
I would happily take the other side of that bet though. At even money, I have all the EV, I'm confident of it.
KataGo can give handicap stones to pros and win. It's as much better than pros as pros are better than unserious amateurs.
It's not even a matter of compute power; KataGo is very good with 0 playouts.
I believe in Go that is no longer the case, however. A human provides no additional value to a Go engine.
That's why you shouldn't bet money on things you don't know about...