The Top Chess Engine Championship (TCEC) is a computer chess tournament. The goal is to provide viewers with a live broadcast of long time control, high-quality chess, played strictly between computer chess engines created by different programmers. One Season is divided into several Stages and lasts about 3-4 months. The winner of the Season becomes the TCEC Grand Champion.
They are currently playing the lowest league, the Qualification League. There will be several leagues culminating in a Superfinal: a 100-game match between the two best engines.
It will be exciting to see how the conventional tree search, neural network, and NNUE engines compare this year.
The SC2 bots don't compete in real time. Many of them can't, and most of those that can are clearly hampered when playing against a human (necessarily real time) versus a machine (not real time). Instead you just watch the matches after the game is finished, as if they'd happened in real time.
Why not just have a fixed pace unrelated to how long the moves "really" took to calculate, with a (no longer real time) clock shown to indicate how the machines used their allotted time?
> As in, real time to the computation? Why though.
Why not? What do SC2 bots (or their abilities) have to do with chess bots?
> Why not just have a fixed pace unrelated to how long the moves "really" took to calculate, with a (no longer real time) clock shown to indicate how the machines used their allotted time?
You can't show moves that haven't been played yet, so either your pace is much too slow, or you have to wait until the game is finished before displaying it. And it still wouldn't make sense to display the moves at a fixed pace when the pace (and time pressure) is in fact part of the game (see also the recent human world championship, where many blunders were played just before the 40th move because of time pressure).
> You can't show moves that haven't been played yet, so either your pace is much too slow, or you have to wait until the game is finished before displaying it.
This makes sense for human players, but the Chess Engines aren't human, so their matches can proceed in parallel. Whereupon a human audience isn't actually watching them in real time anyway. AIUI, the engines are not learning during tournament play, so unlike a human (who may discover an opponent's weaknesses during play over the course of a tournament) they only get updates at specific points between games.
"That's not how humans do it" is a pretty weak excuse even if not for the fact that TCEC has a completely inhuman design. Humans don't start from positions chosen more or less arbitrarily to reduce the number of draws, whereas TCEC does.
> Whereupon a human audience isn't actually watching them in real time anyway.
Uhm, yes they are.
TCEC can't afford to run more games in parallel. And even if they could, they'd rather use all that processing power on one game instead. The point is to produce the highest quality chess possible.
Why would you delay showing the moves as they happen when you can just... not do that? You can always go through the games at your own pace afterwards anyway.
> This makes sense for human players, but the Chess Engines aren't human, so their matches can proceed in parallel.
And if they wanted to, they could show multiple matches in parallel. After all, that's what happens in a tournament such as the recent Tata Steel tournament. I assume the issue is that they want to maximize resources for the engines, and for that you can run at most a few engines (and therefore matches) in parallel.
If I read things correctly it's all running on a single server with "only" 96GiB of RAM per engine (https://wiki.chessdom.org/TCEC_Season_Further_information#TC...); running more matches in parallel would be detrimental to the quality of the chess played.
I am not sure how the SC2 bots work, but chess engines always play in 'real time'. They already play at a decent level with 100 ms thinking time, and longer time controls mean higher quality moves.
Most chess players play through chess games after the fact at their own pace, clicking one button to move ahead or another to move backwards. We've had this ability for hundreds of years: at one time using printed pages and wooden chessboards instead of computers, but the principle is the same.
But if you're really interested in the TCEC as a competitive sporting event, you might prefer to watch the drama in real time. How many people record the Superbowl and play it back at 100% speed the next day?
Is "100% speed" here the speed the actual Superbowl happens at, or the speed the game would proceed at if, in fact the goal was to play football not sell advertisements?
It's a sixty-minute football game, so to me 100% is sixty minutes, whereas when broadcast it's a four-hour TV show.
Sports/eSports are more fun to watch live, because it's going on right now and you're finding out the results the same time as everyone else in the world is.
As for the SC2 bots, it sounds like they aren't very good. Chess bots, on the other hand, are, and these ones are playing a standard classical time control that is often used by human players. Time management is a super important part of this (indeed there are often separate neural networks used to determine how much time should be spent thinking about a given move). The game inherently takes place live over a given window of time, and so long as that's true, why not stream it live too?
Well, whether the SC2 bots are "good" depends on how you squint at the problem.
The framework being used was developed for Google DeepMind's AlphaStar, which is a learning AI approach although obviously very different from their approach to chess.
But today the framework is used by rules-based bots largely competing against each other. This means that unlike AlphaStar, which set out specifically to beat excellent human players with a human-like approach, the amateur bots are entirely focused on winning against other bots by any means necessary. The most successful tend to end up in sprawling multi-theatre conflict: maximum army size and a half dozen or more small skirmishes happening at once, making it hard for a human observer to be sure who is ahead until suddenly there's a decisive outcome. That wouldn't be compatible with AlphaStar's mission at all; human players obviously can't fight these battles with success.
Their most obvious defect is that they don't resign. A competent human player resigns hopeless positions in SC2, quickly recognising that they have lost, but most bots will stay in the game until destroyed, which would be considered very rude from a human. They can be indecisive, attacking, pulling back, then attacking again within seconds, and they are much more easily thrown off by unexpected situations than a human is - but overall they're a match for a good human player unless that player has prepared specifically to exploit a known weakness of a particular bot (e.g. there are bots that do not understand why an enemy Nydus Worm in your base can't be allowed to complete... since that basically never happens).
Seeing chess bots play as fast as possible isn't all that interesting - play gets better the longer they have to think and the more resources they have to use. They could potentially do what you're suggesting, but unless they buy a bunch of new/better hardware the play would be worse because the engines would get less real time to calculate their positions, which defeats the point. And even if they did get better hardware, they could keep doing what they're currently doing and get better games due to the increased hardware.
Basically, you're asking why they're not playing blitz instead of classical - it's just fundamentally a different format.
Are there niche versions of this? There are communities that try to write code that can run on old hardware, within certain space constraints, etc.
From a coding perspective, I think it could get really interesting to see people try to make super small engines. I've seen a couple and am blown away that a pretty small program w/o an ML model can destroy me in chess.
It'd be cool to see how good super simple programs could get.
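For a sense of how little it takes, here is a minimal sketch of the classic recipe: material counting plus a fixed-depth negamax search. It leans on the python-chess library for move generation (an assumption of convenience; a genuinely tiny engine would implement the board itself):

    # Deliberately tiny engine: material-only evaluation + fixed-depth negamax.
    import chess

    PIECE_VALUES = {chess.PAWN: 100, chess.KNIGHT: 320, chess.BISHOP: 330,
                    chess.ROOK: 500, chess.QUEEN: 900, chess.KING: 0}

    def evaluate(board):
        # Material balance from the side to move's point of view.
        score = 0
        for piece_type, value in PIECE_VALUES.items():
            score += value * len(board.pieces(piece_type, board.turn))
            score -= value * len(board.pieces(piece_type, not board.turn))
        return score

    def negamax(board, depth):
        if board.is_checkmate():
            return -99999            # side to move is mated
        if board.is_stalemate():
            return 0                 # draw
        if depth == 0:
            return evaluate(board)
        best = -10**9
        for move in board.legal_moves:
            board.push(move)
            best = max(best, -negamax(board, depth - 1))
            board.pop()
        return best

    def best_move(board, depth=3):
        best, best_score = None, -10**9
        for move in board.legal_moves:
            board.push(move)
            score = -negamax(board, depth - 1)
            board.pop()
            if score > best_score:
                best, best_score = move, score
        return best

    print(best_move(chess.Board(), depth=2))  # slow but legal chess from ~30 lines

Even something this naive punishes one-move blunders; the small engines that destroy people add alpha-beta pruning and piece-square tables on top of essentially this skeleton.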
> I think it could get really interesting to see people try to make super small engines.
I wonder what's the minimal hardware that can outplay any human under standard time controls?
We know supercomputers can outplay any human; nowadays any smartphone can. But what about a 286? An 8080?
I like that. I've actually generated competitive [1] genetic programs for Corewar [2]. One was only 5 instructions long, but it proves that in theory, given enough time, a computer program could be generated that plays chess better.
In a constrained environment like a microcontroller, the smallness of the space of all possible programs makes it faster to find a good one. There is one catch, though: it may fail in unexpected ways.
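To make the idea concrete, here is a toy sketch of that kind of search: hill-climbing over fixed-length programs in a made-up instruction set with an invented fitness function. Everything here is illustrative; Corewar actually uses Redcode, and real tournaments score programs by fighting them:

    import random

    INSTRUCTIONS = ["ADD", "SUB", "JMP"]  # invented opcodes, not Redcode

    def random_program(length=5):
        return [(random.choice(INSTRUCTIONS), random.randint(-4, 4))
                for _ in range(length)]

    def mutate(program):
        # Replace one instruction at random.
        prog = list(program)
        i = random.randrange(len(prog))
        prog[i] = (random.choice(INSTRUCTIONS), random.randint(-4, 4))
        return prog

    def fitness(program):
        # Stand-in objective; a real system would score programs by
        # fighting them (Corewar) or playing games (chess).
        return sum(arg for op, arg in program if op == "ADD")

    best = random_program()
    for _ in range(1000):
        candidate = mutate(best)
        if fitness(candidate) >= fitness(best):
            best = candidate
    print(best)

With only 5 instructions and a handful of opcodes the search space is small enough that even this crude loop finds strong candidates quickly, which is the point being made about constrained environments.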
I'd also like to see something of a chess Turing test, and, while it would be very hard to quantify, a competition for the most human-like low- or mid-level chess engine. On Chess.com there's a plethora of bots of different ratings, but outside a few gimmicks (e.g. "Nelson" always gets his queen out early) they're all the same, only differing in how frequently they blunder in between perfect moves.
Other than the board, that page is completely opaque to me. There's a whole bunch of what are apparently technical stats, but I can't even infer enough to know whether big numbers are good or bad.
This would be a whole lot more engaging with more explanation, at least hover tips over all the fields and so forth.
Yes, you are right, it is not a very good UX. For me the most interesting part is observing the evaluations of the position from the perspective of the two engines; see the charts that track them move by move. It is often the case that one engine sees something the other doesn't, for example that it is completely busted. I imagine the internal dialogue: I'm fine, I'm fine, am I in trouble... oh $^#^ I'm losing now. This was especially true in the early days of neural chess engines, which saw ideas well beyond the event horizon of traditional engines. Most people watching these tournaments on Twitch are chess engine developers themselves, which is why interfaces like these are fine for them.
The top line "eval" number is who is winning - positive means white is winning and negative means black is winning. "TC" is the time control - right now they are playing at 30 minutes + 5 seconds per move (for each side). Most of the other numbers are diagnostics about the chess engine's calculations.
Generally, their audience is "people who play chess" - and those people can be assumed to know this. If you don't play chess, watching a broadcast on Twitch or YouTube will be more enjoyable.
This website is niche enough that I think they are better off optimizing for what in-the-know users want than dumbing it down for people passing by.
I recognize that is not really what you are asking for though, and I agree that they could do better with tooltips and the like.
Leela Chess Zero is the open-source successor of AlphaZero, and has been the second-strongest engine (except for a few times when it bested Stockfish) for the last few years.
And the reason for that is that they aren't actively developing AlphaZero anymore. It's not worth it to them to spend their limited resources doing so when others in the chess community are perfectly willing to. DeepMind has instead been spending their efforts on solving protein folding, which I think we can all agree is a better use of their limited resources.
The limited resources I'm referring to are people. It's a relatively small organization, and focusing people on chess comes at the opportunity cost of not focusing them on something else.
Just because your situation is even more limited doesn't mean that they don't also have limitations. That's just gatekeeping.
I appreciate that nobody has unlimited resources but, for example, the latest DeepMind paper posted on HN (on AlphaCode) had a couple dozen authors. By comparison, I do all my work myself, with the help of my advisor, as do his other PhD students. And all for a PhD student's stipend, when DeepMind researchers are presumably paid at least at post-doc levels. There's no comparison between the resources that I and DeepMind can throw at a problem.
I mean, please, give me DeepMind's limitations. I'd be so happy!
Yeah, Leela even won once, in 2019. Since Stockfish moved to a hybrid approach (conventional evaluation + neural network) it has been unstoppable, though.
How does Chess engine timing work? Do chess engines strategically use up their time depending on the complexity or the criticality of the position?
Do both chess engines have equal machine power, cores, IOPS, CPU model, RAM, etc?
They're showing a game with an eval bar. How does it evaluate the position, don't you need a chess engine to evaluate and provide a score in the first place? Perhaps we're seeing two chess engines play against each other with a third one evaluating for the viewers? Is it the average evaluation between the two engines?
> Do chess engines strategically use up their time depending on the complexity or the criticality of the position?
Yes, this has been something engines have started to optimize. Some engines that don't do this well can be beaten by "flagging" (i.e. making pointless moves rapidly) once they run low on time.
> Do both chess engines have equal machine power, cores, IOPS, CPU model, RAM, etc?
Last I remember, the goal was to give engines roughly equal computational capacity. But it would be tailored to the engine, e.g. AlphaZero getting something with more GPU(s) and Stockfish getting (at the time) more CPU(s) - at least for the final matches.
> They're showing a game with an eval bar
For me, it shows the current evaluation of both engines (numerically), as well as a graph with up to four engine evaluations (the two contestants + up to two "commentators").
To some extent it depends on the settings of the tournament, because engines can work in the background and analyse the position while it is not their turn. This is called pondering.
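In UCI terms the GUI sends "go ponder" and later either "ponderhit" (the predicted move was played) or "stop". With the python-chess library (an assumption of convenience here, with Stockfish as an example engine) you can see the effect like this:

    import chess
    import chess.engine

    engine = chess.engine.SimpleEngine.popen_uci("/usr/bin/stockfish")
    board = chess.Board()
    # ponder=True tells the engine to keep thinking about the expected
    # reply after returning its move, i.e. on the opponent's time.
    result = engine.play(board, chess.engine.Limit(time=1.0), ponder=True)
    board.push(result.move)
    engine.quit()

Whether that thinking-on-the-opponent's-time is allowed is exactly the tournament setting mentioned above.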
Time management is based on heuristics built into the engine. Some positions are more dynamic than others, and engines have rules to recognise when a position's evaluation has settled down.
As for the hardware: if two CPU engines are playing, they have exactly the same resources. The problem arises when one engine is GPU-based and the other CPU-based. In these situations balancing the compute power is hard, but even then they normally have the same time allocated.
For time management, the rough approach is to allocate a fixed % of the remaining time and increase or reduce it, mostly depending on the stability of the evaluation with increasing search depth.
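A bare-bones sketch of that heuristic (all constants invented for illustration; real engines tune this endlessly):

    # Bare-bones time allocation: take a slice of the remaining clock,
    # then scale it by how unstable the evaluation has been across the
    # last few search depths.
    def allocate_time(remaining_ms, increment_ms, recent_evals,
                      base_fraction=0.04):
        target = remaining_ms * base_fraction + increment_ms
        if len(recent_evals) >= 2:
            swing = max(recent_evals) - min(recent_evals)  # centipawns
            if swing > 50:
                target *= 1.5   # eval unstable: think longer
            elif swing < 10:
                target *= 0.7   # eval settled: move faster
        # Never commit more than a chunk of what is left on the clock.
        return min(target, remaining_ms * 0.25)

    # e.g. 10 minutes left, 5 s increment, eval drifting between depths:
    print(allocate_time(600_000, 5_000, [20, 35, 80]))  # -> 43500.0 ms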
Games are played on the same computer, with only one engine analyzing at a time.
The graph shows each engine's evaluation of the position for each move in the game. The red line is from a 3rd engine "observing" the game.