The list that's returned still contains mainly tactical surprises, where Stockfish inaccurately evaluated the position at the end of depth 5. What I'm trying to say is that some moves in a position aren't tactically surprising (a piece sacrifice, a crazy attacking move, etc.) but are positionally surprising (a long maneuver to get a piece to a certain square that I didn't think of). These positionally surprising moves aren't captured by this methodology because they don't involve large fluctuations in evaluation as the depth changes.
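For concreteness, here's a minimal sketch of the depth-fluctuation methodology I'm describing: a move counts as a tactical surprise when its shallow evaluation differs sharply from its deep one. The per-depth evaluations (in centipawns) are assumed to come from an engine like Stockfish elsewhere; the move names, depths, and threshold below are all just illustrative.

```python
def surprise_score(evals_by_depth, shallow_depth=5):
    """Difference between the deepest available evaluation and the shallow one."""
    shallow = evals_by_depth[shallow_depth]
    deep = evals_by_depth[max(evals_by_depth)]
    return deep - shallow

def tactical_surprises(candidates, threshold=150, shallow_depth=5):
    """Return moves whose evaluation jumps by more than `threshold`
    centipawns between the shallow and deep searches."""
    return [
        move for move, evals in candidates.items()
        if abs(surprise_score(evals, shallow_depth)) >= threshold
    ]

# Hypothetical per-move evaluations keyed by search depth:
candidates = {
    "Qxh7+": {5: -300, 20: 450},  # looks losing at depth 5, winning at depth 20
    "Nf3":   {5: 20,   20: 35},   # quiet move, stable evaluation
}
print(tactical_surprises(candidates))  # ['Qxh7+']
```

A quiet positional maneuver evaluates roughly the same at every depth, so this filter never flags it, which is exactly the gap I mean.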
As to your second point, an issue with how computer chess affects the modern scene is that playing the "best" move in any given position isn't representative of how humans play. Humans carry out plans and evaluate positions to the best of their ability, but the heuristics and procedures they use aren't the same as a computer's. For example, Karjakin didn't prepare for his match against Carlsen last month by playing a bunch of games against Stockfish. Rather, he probably analyzed Carlsen's past games and opening choices to come up with a strategy.
I do think you can come up with a way to prepare against individually known opponents by identifying weaknesses programmatically. You can model a human's approach to playing chess as a distribution of parameters (material, king safety, pawn structure, etc.) that take in the current position and return the best move. You also have Stockfish's evaluation, which returns the "best" move. With this, it's possible that you could build a neural network that learns to play very similarly to a certain player by using their past games as a training set and comparing the chosen move to Stockfish's move. The network could learn to mimic the heuristics that the human individual uses to make decisions, and playing against this new AI would be great practice for preparing against specific opponents.
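As a toy illustration of the training idea: learn feature weights that make a linear scorer pick the same move the player actually chose. The features (material, king safety, etc.), the feature values, and the training positions below are all made up; a real version would extract them from the player's game database and use an actual neural network rather than this perceptron-style sketch.

```python
def score(weights, features):
    return sum(w * f for w, f in zip(weights, features))

def train(positions, n_features, lr=0.1, epochs=50):
    """Perceptron-style update: whenever the scorer prefers some move over
    the one the player actually chose, nudge the weights toward the
    chosen move's features and away from the wrongly preferred move's."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for moves, chosen in positions:  # moves maps move -> feature vector
            predicted = max(moves, key=lambda m: score(weights, moves[m]))
            if predicted != chosen:
                for i in range(n_features):
                    weights[i] += lr * (moves[chosen][i] - moves[predicted][i])
    return weights

# Hypothetical data: this "player" consistently prefers the move with the
# better king-safety feature (index 1) over raw material gain (index 0).
positions = [
    ({"a": [1.0, 0.0], "b": [0.0, 1.0]}, "b"),
    ({"c": [0.8, 0.1], "d": [0.2, 0.9]}, "d"),
]
w = train(positions, n_features=2)
# the learned weights should now rank king safety above raw material
```

Playing against the fitted model is then just scoring the legal moves with the learned weights instead of the engine's evaluation.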
I'm not sure I follow your point about tactical vs. positional surprise. Surely the ultimate goal of the positional surprise is the same as that of the tactical surprise: you get an advantage at the end of an expected series of moves. Otherwise what's the point of getting into a surprising position that's not better than the conventional one?
My question is, is there any difference here that can't be solved by, say, upping the ply-number?
On humanlike chess-AI: have an adversarial network that works to classify human vs machine players, and optimize for humanness * strength-of-play in the AI?
The difference is that the positional sacrifice is less tangible. A space advantage, a tempo advantage, more mobile pieces, improved cohesion/coordination of pieces (Kasparov was legendary for taking this last kind of advantage and turning it into a lethal attack). It's a dynamic advantage rather than a static/permanent advantage, which also means there's a risk of that advantage dissipating as the game drags on.
These advantages aren't the kind where you can sit back and let the game play out confident of winning. It's a deliberate unbalancing of the equilibrium of the position, and one where this temporary dynamic advantage needs to be used to create a longer-lasting and static advantage.
Would it be fair to say you are trying to optimize for future positions where you aren't sure you will win, but the positions resemble certain archetypal positions or share certain features that are advantageous (i.e., have a high probability of transforming into conventionally advantageous situations)?
I'm sure the chess AIs are full of this sort of knowledge internally, though, encoded in their search and evaluation optimizations. Perhaps the issue is to translate it into a human-usable format.
Indeed, chess engines do have heuristics to include positional advantage in their evaluation of a board, so they "know" in some way that a doubled pawn is disadvantageous or that development of pieces or attacking central squares is beneficial, much as humans know these things.
I've never heard experts discuss this, but I bet it's true that human beings still succeed in appreciating many of these benefits at a higher level of abstraction than machines do. An argument for this is that computers needed an extremely large advantage in explicit search depth to be able to beat human grandmasters. So the humans had other kinds of advantages going for them and most likely still do. One of those advantages that seems plausible is more sophisticated evaluation of why a position is strong or weak, without explicit game tree searches.
I looked at the Stockfish code very briefly during TCEC and it looks like a number of the evaluation heuristics that are not based on material (captures) are manually coded based on human reasoning about chess positions. But if I understood correctly, they are also running machine learning with huge numbers of simulated games in order to empirically reweight these heuristics, so if a particular heuristic turns out to help win games, it can be assessed as more valid/higher priority.
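The reweighting loop you describe can be caricatured in a few lines: perturb one evaluation weight at a time and keep the change only if the measured win rate improves. Stockfish's actual tuning (SPSA over huge numbers of engine games) is far more sophisticated; the deterministic `fake_match` stand-in below just makes the idea concrete and is entirely made up.

```python
import random

def tune(weights, match_win_rate, step=0.5, rounds=100):
    """Greedy reweighting: try a small random perturbation of one weight
    and keep it only when the measured win rate goes up."""
    best = match_win_rate(weights)
    for _ in range(rounds):
        i = random.randrange(len(weights))
        trial = list(weights)
        trial[i] += random.choice([-step, step])
        rate = match_win_rate(trial)
        if rate > best:  # keep only strict improvements
            weights, best = trial, rate
    return weights

# Deterministic stand-in for "play many games and measure the win rate":
# wins peak when the (hypothetical) mobility weight at index 1 equals 2.0,
# and the weight at index 0 turns out not to matter at all.
def fake_match(weights):
    return 1.0 / (1.0 + abs(weights[1] - 2.0))

random.seed(0)
tuned = tune([1.0, 0.0], fake_match)
# tuned[1] has climbed from 0.0 toward 2.0; tuned[0] is left untouched
```

The empirical effect is the one you describe: a heuristic that actually helps win games ends up with a higher weight, regardless of how plausible its hand-coded rationale was.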
You could imagine that there are some things that human players know tacitly or explicitly that Stockfish or other engines still have no representation of at all, and they might contribute quite a bit to the humans' strength.
Perhaps the positional sacrifice can be identified by similar means. The most superficial measurement of a position is the material left on the board. So when you compare the superficial measurement to a deeper positional measurement and they are divergent, then we have something positional.
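That divergence test is easy to sketch: count material straight off the board (the superficial measurement) and compare it to a deeper evaluation, which would come from an engine but is just an input number here. The position, piece values in pawns, and threshold are all illustrative.

```python
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}

def material_balance(fen_board):
    """White-minus-black material, in pawns, from the board field of a
    FEN string (uppercase = white pieces, lowercase = black)."""
    balance = 0
    for ch in fen_board:
        if ch.lower() in PIECE_VALUES:
            value = PIECE_VALUES[ch.lower()]
            balance += value if ch.isupper() else -value
    return balance

def is_positional_sacrifice(fen_board, deep_eval, threshold=2.0):
    """True when the deep evaluation (in pawns, white's perspective)
    diverges sharply from the raw material count."""
    return abs(deep_eval - material_balance(fen_board)) >= threshold

# Hypothetical position: white has two rooks against queen and rook
# (down 4 pawns of material), but a deep evaluation of +0.5 says the
# positional compensation is real.
board = "3q1rk1/ppp2ppp/8/8/8/8/PPP2PPP/3R1RK1"
print(material_balance(board))                 # -4
print(is_positional_sacrifice(board, 0.5))     # True
```

The same comparison run over a whole game would flag the move where material and evaluation first diverge, which is roughly where the positional sacrifice happened.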
I think one of Kasparov's games against Karpov in the New York portion of one of their World Championship matches involved Kasparov sacrificing a queen for positional compensation on the black side of a King's Indian. It would be interesting to see what this project thinks of that game.