The video linked here focuses on AlphaZero's ability to make long-term, optimal piece sacrifices against Stockfish, which is largely incapable of understanding these kinds of moves because of the material bias hard-coded into its evaluation function, causing AlphaZero to consistently outmaneuver and ultimately handily defeat StockFish in almost every single game it plays against it.
* Re: "ultimately handily defeat StockFish in almost every single game it plays against it": actually, the vast majority of the games were draws.
* Regarding "Stockfish is too materialistic": if you watch the last TCEC Premier Division and Superfinal, you'll find games where Stockfish offered to sacrifice a lot of material, Lc0 accepted the sacrifice, and went on to lose much later in the game.
* It's unknown (except perhaps to people at DeepMind) whether AlphaZero is still stronger than Stockfish. Stockfish has gained a lot of Elo since the published match, and would now beat the Stockfish version AlphaZero played against in more games than AlphaZero did.
It's been having an interesting time in various competitions, and much like in the match you point out, it makes some decidedly more "human" moves: little stabs to try to get out of a drawish position that Stockfish won't attempt due to its hard-coded biases. Here's an interesting game from TCEC 14: https://www.youtube.com/watch?v=rhI4DKGSjtk Ultimately it came second to Stockfish, but that was its highest placing so far, and they keep refining and improving it all the time.
Stockfish is currently leading, but all of its games against LCZero so far have been draws, and the winner is decided by the "Superfinal": a series of 100 games between the top two engines from the round-robin stage. Stockfish won TCEC 14, but with both engines improved since then, it's still uncertain who will win.
But an open one is better than a closed one, since it allows other experiments, such as Facebook's Go engine, DarkGo, MiniGo, etc. I'm not sure there is any similar open neural-net chess engine.
The one I remember most was that AlphaZero was running on a supercomputer, while its opponents didn't get that luxury. This is said to be why AlphaZero didn't fare as well when put on hardware similar to the other AIs'.
The other criticism I remember was from Stockfish's developers, who said the settings used for Stockfish were suboptimal; most notably, it was given a fixed one-minute thinking time per move. This is apparently not how Stockfish works, because, like a human player, it also decides which moves are worth spending its thinking time on.
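To illustrate the time-management point, here is a toy sketch (this is not Stockfish's actual algorithm; the function and its parameters are invented for illustration): rather than a fixed per-move budget, an engine typically spends a slice of its remaining clock and stretches it on moves it judges critical.

```python
# Toy time allocator, NOT Stockfish's real time-management code.
# Instead of a fixed 60 s per move, budget a slice of the remaining
# clock and spend extra on moves flagged as critical.

def allocate_time(remaining_s: float, moves_to_go: int = 30,
                  critical: bool = False) -> float:
    """Seconds to spend on the current move."""
    base = remaining_s / max(moves_to_go, 1)  # even split of the clock
    return base * (2.0 if critical else 1.0)  # double up on critical moves
```

With 30 minutes left and 30 moves to go, this yields 60 seconds on a quiet move but 120 on a critical one; a fixed one-minute budget removes exactly this flexibility.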
So while AlphaZero may very well be better, there's sadly no real way to confirm it from these matches. The matches were still helpful to the chess community, though. Definitely still worth checking out.
I think what you might be thinking of (some articles reported this erroneously when the games/paper were first publicly released) is that training an AlphaZero model requires a considerable amount of self-play, which takes up a lot of compute cycles.
It’s important to make the distinction between training and inference time here, though. By analogy, Stockfish has been "trained" for more than a decade, with humans iteratively refining it and manually creating its features and strategies. The primary achievement of AlphaZero is that it was able to train itself completely independently of humans and come up with its own features and strategies entirely on its own.
At inference time, too, we see a large benefit for AlphaZero against Stockfish. Even though both agents had exactly the same hardware and time allotted, AlphaZero is far more efficient at exploring its search space thanks to the value/pruning function it learned by itself through training. Stockfish also has its own evaluation and pruning functions, but because of the fundamental nature of expert systems, it's very difficult to hand-craft generalized features that address every possible situation. That's the other benefit of training a model to learn through self-play rather than manually engineering the features ourselves. This is why AlphaZero performed far better than Stockfish given the same time, and it's one of the reasons it was deemed more "human-like": as humans, we don't brute-force possible moves, but rather immediately discount the large majority of plays and focus on a very small subset we sense to be optimal, through a sort of ingrained "intuition" built up over the countless games we've played.
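The contrast between exhaustive search and a learned, selective one can be sketched in a few lines. Everything below is a toy illustration: the tiny game tree, the "policy" scores, and the function names are all invented, and neither engine actually works on a dict like this; the point is only the difference in nodes examined.

```python
# Toy contrast: exhaustive negamax vs a search that only expands the
# top-k moves ranked by a (here, hand-invented) policy score.

def minimax(node, depth, tree, values):
    """Exhaustive search: visits every child. Returns (value, nodes visited)."""
    children = tree.get(node, [])
    if depth == 0 or not children:
        return values.get(node, 0), 1
    best, visited = None, 1
    for child in children:
        v, n = minimax(child, depth - 1, tree, values)
        visited += n
        score = -v  # negamax: what is good for the opponent is bad for us
        if best is None or score > best:
            best = score
    return best, visited

def guided(node, depth, tree, values, policy, k=1):
    """Selective search: expands only the k children the policy ranks highest."""
    children = tree.get(node, [])
    if depth == 0 or not children:
        return values.get(node, 0), 1
    ranked = sorted(children, key=lambda c: policy.get(c, 0), reverse=True)[:k]
    best, visited = None, 1
    for child in ranked:
        v, n = guided(child, depth - 1, tree, values, policy, k)
        visited += n
        score = -v
        if best is None or score > best:
            best = score
    return best, visited
```

On a tiny 3-branch, depth-2 tree, the exhaustive search visits 10 nodes where the guided one visits 3; with realistic branching factors that gap compounds exponentially, which is the efficiency the comment describes.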
I hope this short summary helped clarify things, and if you’re interested, I definitely recommend reading the paper itself for more in-depth information! You can find it freely available here: https://deepmind.com/documents/260/alphazero_preprint.pdf
It should be noted that newer versions of Stockfish are generally slightly stronger than neural-net-based approaches (Leela), though perhaps this is a temporary state of affairs.
Many of AlphaZero's games were also notable because it managed to achieve very clean strategic positions in a way that seems longer-term and requires more multitasking than a human game (AZ's bishops, for example). I bet I could score (a little) better than 50% shown just a mid-game board position, no moves.
So it is not 100% by any means. And AlphaZero did seem slightly more "human" than Stockfish, and its games were more entertaining as a result, though still recognisably superhuman.
But I would say the excitement in human chess is not just a set of moves, but the meta: what the players have played before, how fast they play, their use of time, where their mistakes happen, how riskily they need to play to make up points. This is often most apparent in short time controls, in long game series such as the world championship, or in tournaments.
Like watching a good pitching duel in October baseball. A random number generator and pitching machine could conceivably statistically play better, but would miss the point.
My enjoyment of chess is not simply a scalar function of chess moves.
Games played by computers (with the exception of NN-based programs) have a generally-discernible feel to them that I believe most grandmasters can spot a mile away.
Stockfish frequently sacs pawns and minor pieces for a space advantage.
Modern games have multiple character builds where the only way to play is to cheese (every character that must kite has a cheese mechanic; I’m not gonna chase someone until I die. Fuck that. I’ll hide and try to pick ’em off).
Watching one AI cheese another doesn’t make me happy. It makes me sad for what I thought we’d have now.
Halo 2, I think: they pre-computed the strategic value of every spot on the map, and the AI would use the terrain. That's the first and last time I've heard of someone doing something like that.
In particular, there was one that played a bunch of chess engines against each other and came up with a metric better than Elo for the case where players' skill isn't changing.
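For reference (the comment doesn't name its alternative metric, so this only shows the baseline being improved on): the standard Elo model predicts an expected score from the rating difference and nudges ratings toward observed results after each game.

```python
# Standard Elo model: expected score is a logistic function of the
# rating difference; ratings move toward observed results by a K-factor.

def elo_expected(rating_a: float, rating_b: float) -> float:
    """Expected score (win=1, draw=0.5, loss=0) for player A vs player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

def elo_update(rating: float, expected: float, actual: float,
               k: float = 20.0) -> float:
    """New rating after one game, given expected and actual scores."""
    return rating + k * (actual - expected)
```

A 400-point favourite is expected to score about 10/11 (roughly 0.91), and equal ratings give exactly 0.5; the incremental update is what assumes skill may drift over time, which a fixed-skill engine pool doesn't need.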
...but I was more impressed by the Atari 2600 chess game which had castling and en passant. Having 256 bytes seems like a good excuse to leave out such nonsense.
Also, a previous HN discussion may have some analysis: https://news.ycombinator.com/item?id=9151552