You can run Stockfish single threaded in a deterministic manner by specifying nodes searched instead of time, so in principle it is possible to set some kind of bounty for beating Stockfish X at Y nodes per move from the start position, but I haven't seen anyone willing to actually do so.
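For anyone curious, a minimal sketch of what that looks like via python-chess (this assumes python-chess is installed and a "stockfish" binary is on PATH; the node budget below is a placeholder, not a suggested value):

    import chess
    import chess.engine

    NODES_PER_MOVE = 1_000_000  # the "Y nodes per move" in the hypothetical bounty

    engine = chess.engine.SimpleEngine.popen_uci("stockfish")
    engine.configure({"Threads": 1})  # single thread keeps the node-limited search reproducible
    board = chess.Board()             # the start position
    # Same position + same node budget -> same move, so a challenger's games can be verified.
    result = engine.play(board, chess.engine.Limit(nodes=NODES_PER_MOVE))
    print(result.move)
    engine.quit()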
Surely it is apparent to you that the first few moves are not independently chosen by the engine, but rather intentionally chosen by the TCEC bookmakers to create a position on the edge between a draw and a decisive result.
Yes, engines would almost certainly never play 2. f4. That's a different question from whether chess is solved, for which the question of interest would be "given optimal play after 1. e4 e5 2. f4, is the result a win for one side or a draw?"
It's also almost certainly the case, though I don't know why you would do it, that Stockfish given the black pieces and extensive pondering time would be meaningfully better than Stockfish playing under a per-move time cap. Most games are going to be draws, so in practice it would take a while to determine this.
I'm of the view that the actual answer for chess is "It's a draw with optimal play."
However, it is true that Elo gain on "balanced books" has stalled somewhat since Stockfish 16 in 2023, which is also reflected on the CCRL rating lists.
IMO AlphaZero's result was partly a demonstration that throwing more compute at the problem also works. Stockfish 10 running on 4x as many CPUs would beat Stockfish 8 by a larger margin than AlphaZero did. To this day, nobody has determined what a "fair" GPU to CPU comparison is.
Engines like Stockfish might have over 100 "search parameters" that need to be tuned; to the best of my knowledge SPSA is preferred because its computational cost per iteration typically does not depend on the number of parameters.
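To make the cost point concrete, here is a generic SPSA sketch (a toy illustration of the algorithm, not Stockfish's actual tuning pipeline): each iteration needs exactly two evaluations of the objective, e.g. two batches of engine games, no matter how many parameters are perturbed at once.

    import numpy as np

    def spsa(loss, theta0, iterations=2000, a=0.02, c=0.1, alpha=0.602, gamma=0.101):
        theta = np.array(theta0, dtype=float)
        evaluations = 0
        for k in range(1, iterations + 1):
            ak = a / k ** alpha  # decaying step size
            ck = c / k ** gamma  # decaying perturbation size
            delta = np.random.choice([-1.0, 1.0], size=theta.shape)  # perturb all parameters at once
            loss_plus = loss(theta + ck * delta)   # evaluation 1
            loss_minus = loss(theta - ck * delta)  # evaluation 2
            evaluations += 2
            g_hat = (loss_plus - loss_minus) / (2.0 * ck * delta)  # gradient estimate
            theta -= ak * g_hat
        return theta, evaluations

    # Toy usage: 100 "search parameters" with a quadratic standing in for measured Elo loss.
    target = np.random.randn(100)
    loss = lambda t: float(np.sum((t - target) ** 2))
    tuned, n_evals = spsa(loss, np.zeros(100))
    print(loss(np.zeros(100)), loss(tuned), n_evals)  # n_evals is 4000 regardless of dimension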
Or, if attempting to use SPSA to, say, perform a final post-training tune of the last layers of a neural network, this could be thousands of parameters or more.
The concern about the dimensionality of the search space is real, especially once things cross over into the hundreds -- Bayesian optimization would certainly not be useful for post-training the way the blog post talks about using SPSA.
That being said, it still seems possible to me that using a different black-box optimization technique for a fairly constrained set of related magic numbers (say, fewer than 50) might lead to some real performance improvements in these systems; it could be worth reaching out to the lc0 or Stockfish development communities.
Leela's policy network alone is around 2600 Elo, or around the level of a strong grandmaster.
Note that Go is different from chess since there are no draws, so skill difference is greatly magnified.
Elo is always a relative scale (expected score depends only on the Elo difference), so multiplying a rating doesn't really make sense anyway.
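For reference, the standard Elo expected-score formula only ever sees the rating difference, which is why a "multiple" of an Elo rating isn't a meaningful quantity:

    def expected_score(rating_a: float, rating_b: float) -> float:
        # Standard Elo expectation for player A against player B.
        return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

    # The same 200-point gap gives the same expectation at any absolute level:
    print(expected_score(2800, 2600))  # ~0.76
    print(expected_score(1200, 1000))  # ~0.76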