The creator admits it early on -- it's measuring rarity based on the specific notation everyone uses, which greatly influences the classification of rarity.
Fundamentally, all chess moves are a piece moving from a source square to a destination square, including:
- castling as a king move with a distance greater than 1
- pawn moves to the 8th or 1st rank with the additional datum of a new piece
- en passant is the same as a regular pawn capture, it just requires the victim pawn to have moved two squares previously.
Algebraic notation also has an arbitrary and reasonable amount of extraneous detail despite dropping the source location if it's unambiguous.
For example, the capture (x), check (+) and checkmate (#) symbols are all unnecessary given that the previous state of the board is always known. With en passant it's also unnecessary to have a special symbol indicating an en passant capture, and indeed there isn't one.
I was initially hoping to get some insight on e.g. which pairs of squares had the fewest moves for a given piece etc.
That being said, I thoroughly enjoyed the video. It was beautifully illustrated and explained everything clearly.
"I was initially hoping to get some insight on e.g. which pairs of squares had the fewest moves for a given piece etc."
This may not be quite what you were asking for, but it's close, and has the advantage that I can link it right now. Tom7's Elo World chess video has where pieces start and end up, and their survival rates, as a chart: https://youtu.be/DpXy041BIlA?si=Zdh6Rh6mekatp2-q&t=815
He did find a move that occurred exactly once, and showed the specific game it occurred in. He also showed many moves that occurred zero times across every single game played on lichess.org.
So, depending on your definition either could reasonably qualify. Which you pick as the rarest is simply an arbitrary definition.
You could consider different notions, but you run into the issue of defining what is unambiguous. I.e., you could say e2 to e4 is unambiguous for a given game state, but that would imply game state must be included in the definition of a move. Defining what the minimum game state is for an unambiguous move would be a video of its own.
What occurred once is not what chess programmers would call a “move”, but rather what chess players would call a “move”.
His definition of a move is one ply of algebraic notation. From a chess programming perspective algebraic notation is just a data format and doesn’t have any greater significance.
In programming terms a move is a data structure that allows you to derive one position from another according to the rules of chess.
In Stockfish a move is a 14 bit number, the first 5 bits are the destination square, the next 5 bits are the origin square, the next 2 bits are the promotion piece, and the last 2 bits are the move type (normal, promotion, en passant or castle)
I was wondering on how Stockfish encodes the destination and origin square in 5 bits each. I think they don't, at least the code on github uses 6 bits each, which actually gives you 64 possible values, so works out fine:
Comparing games with that notation alone would make moving your queen e2-e3 in game A the same as moving a pawn e2-e3 in game B.
I don’t think anyone would actually agree those are the same move.
What Stockfish considers a move includes the full board state; it simply doesn’t need to pass that information around internally. That removes the ambiguity, but it means there’s a great number of moves that have only occurred once.
Also, it’s 6bit + 6bit + 2bit + 2bit = 16 bits which isn’t an arbitrary number. There’s no need to actually encode that something is a promotion because it can be inferred from the board state, but there’s zero cost to pass an extra bit around so it’s included anyway.
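To make the 6 + 6 + 2 + 2 layout concrete, here is a rough sketch of packing a move into 16 bits. The field order follows the description upthread (destination in the low bits, then origin, then promotion piece, then move type), but the constant values and function names are my own guesses for illustration, not copied from the Stockfish source.

```python
from typing import Tuple

# Illustrative constants; the real engine's values may differ.
NORMAL, PROMOTION, EN_PASSANT, CASTLING = 0, 1, 2, 3   # move type (2 bits)
KNIGHT, BISHOP, ROOK, QUEEN = 0, 1, 2, 3               # promotion piece (2 bits)


def square(name: str) -> int:
    """Map 'a1'..'h8' to 0..63 (a1 = 0, h8 = 63). Six bits cover all 64."""
    file = ord(name[0]) - ord("a")
    rank = int(name[1]) - 1
    return rank * 8 + file


def encode(origin: str, dest: str, move_type: int = NORMAL, promo: int = KNIGHT) -> int:
    """Pack a move: bits 0-5 dest, 6-11 origin, 12-13 promo, 14-15 type."""
    return (move_type << 14) | (promo << 12) | (square(origin) << 6) | square(dest)


def decode(move: int) -> Tuple[int, int, int, int]:
    """Unpack to (origin, dest, promo, move_type)."""
    dest = move & 0x3F
    origin = (move >> 6) & 0x3F
    promo = (move >> 12) & 0x3
    move_type = (move >> 14) & 0x3
    return origin, dest, promo, move_type


# The same integer comes out whether a queen or a pawn goes e2-e3:
# the piece is not stored, it is looked up on the board.
print(encode("e2", "e3"))
print(decode(encode("e7", "e8", PROMOTION, QUEEN)))
```

Note that nothing in the packed value says which piece is moving, which is exactly the ambiguity being argued about here.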
Notating checks is not even redundant; it can disambiguate which piece is to move without additional information (e.g. Rac1 and Rhc1; only one of them might give a discovered check, so Rc1+ could then be an unambiguous notation where the check is not redundant). The PGN spec is clear that SAN disambiguates legal moves and not pieces (if moving one of those rooks would put yourself in check, you should not disambiguate when you move the other one), but I don't know whether it considers the check part of the move for those purposes.
I see what you mean obviously, but neither of those moves could possibly give a discovered check, right? If the rook starts in the corner of the board, nothing can hide behind it or attack from behind it.
> Neither the appearance nor the absence of either a check or checkmating indicator is used for disambiguation purposes. This means that if two (or more) pieces of the same type can move to the same square the differences in checking status of the moves does not allieviate the need for the standard rank and file disabiguation described above. (Note that a difference in checking status for the above may occur only in the case of a discovered check.)
Presumably there are a mind-bogglingly-huge number of unique board state transitions. It's virtually impossible for the same game of chess to be played twice, except for silly scholar's-mate type games. Almost every single game in a chess database will have many unique board state transitions.
I bet it's way less than you think - orders of magnitude.
What could happen versus what does happen are entirely different.
Doing some algebraic permutations computation here would be like claiming a 50,000 letter English document has 27^50,000 possibilities.
I mean no, there's words, and they only go in certain orders and there's all these rules.
Here's another approach: humans are pretty lousy when remembering large amounts of anything so let's say there's in practice, only been 100,000 unique games played over and over. Without the help of a computer or careful tabulation, I'm pretty sure no human would realize it because no human can remember 100,000 unique games.
Anyways, it's worth digging into the data to see what the variation really is. I bet the 90th percentile is embarrassingly small, with a long tail that's far shorter than most think.
edit: so I actually took 7.7 million games from https://www.ficsgames.org/download.html and did some basic processing on them. These are people ostensibly with ELOs over 2000 which is pretty decent just to see if I'll eat crow on this one.
Going in, I was expecting a uniqueness level to be something like 50-70%. Actual percentage of unique games over 7.7 million? 98.7%.
Alright fine.
Although I could try to do 1 billion games, I expected the distribution to be readily visible around 7 million.
Now as an artifact of the data, I made the games as compact as possible, potentially leading to ambiguity maybe. So a game might look like so
Given this we can just run uniq with incrementing numbers and find out how things increase. I'm doing this on a pretty old laptop (3rd gen intel) so excuse me for cutting things a bit short
number of characters / unique entries / percentage duplicates
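For anyone who wants to reproduce the prefix counting without sort/uniq, here is a minimal sketch of the same idea. The game strings are made-up toy data (not from the FICS dump), the function name is hypothetical, and "percentage duplicates" is my assumption of what that column means: the share of games whose prefix had already been seen.

```python
def prefix_stats(games, lengths):
    """For each prefix length n, return (n, unique prefixes, % duplicates).

    Games shorter than n are kept whole, as `cut -c1-n` would do.
    """
    rows = []
    for n in lengths:
        prefixes = [g[:n] for g in games]
        if not prefixes:
            continue
        unique = len(set(prefixes))
        dup_pct = 100.0 * (len(prefixes) - unique) / len(prefixes)
        rows.append((n, unique, dup_pct))
    return rows


# Toy stand-ins for compact game strings (moves run together, no numbering).
toy_games = [
    "e4e5Nf3Nc6",
    "e4e5Nf3Nf6",
    "e4c5Nf3d6",
    "d4d5c4e6",
]

for n, unique, dup_pct in prefix_stats(toy_games, [2, 4, 10]):
    print(f"{n:>3} chars  {unique:>3} unique  {dup_pct:5.1f}% duplicates")
```

On real data you would stream the file rather than hold every prefix in memory, but the counting is the same.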
It would be interesting to see later, when I crunch larger numbers on a more capable machine, if these distributions generally hold. Of course it won't, it's not possible. But I'm wondering if it's greater than what the Shannon limit would predict.
An ancillary analysis would be to compare it to the possible legal permutations at a character count although this would of course require a board and rule model.
I would expect those percentages to decrease as the length increases and perhaps such a function can give more predictive heft to the actual "language" of chess in practice
It's also worth noting that unique string != board state.
Proof: Both black and white could move the left rook pawn as their first move and right rook pawn as the second move.
Now reset the board and do right rook first and left rook second. Same board state, different game string.
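A quick way to convince yourself of the transposition argument, as a toy sketch: the board is just a dict of squares, moves are (from, to) pairs, and there is no legality checking at all. I use single-step rook pawn moves (a3/a6 and h3/h6), which is my choice for the example; the names here are made up for illustration.

```python
def start_position():
    """Standard starting position as {square: piece}, e.g. {'e1': 'WK'}."""
    board = {}
    back_rank = "RNBQKBNR"
    for i, f in enumerate("abcdefgh"):
        board[f + "1"] = "W" + back_rank[i]
        board[f + "2"] = "WP"
        board[f + "7"] = "BP"
        board[f + "8"] = "B" + back_rank[i]
    return board


def play(moves):
    """Apply (from, to) moves blindly; no captures or rules are modelled."""
    board = start_position()
    for src, dst in moves:
        board[dst] = board.pop(src)
    return board


# Left rook pawns first, then right rook pawns...
game_a = [("a2", "a3"), ("a7", "a6"), ("h2", "h3"), ("h7", "h6")]
# ...versus right rook pawns first, then left.
game_b = [("h2", "h3"), ("h7", "h6"), ("a2", "a3"), ("a7", "a6")]

print(game_a != game_b)              # the game strings differ
print(play(game_a) == play(game_b))  # the resulting positions match
```

A real comparison would also need side to move, castling rights and en passant state, which is why you want an actual engine or PGN library for this.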
In practice the number of unique board states can only be smaller than the number of unique move strings, but given how far off I was on my first assessment ... I wonder if we're talking about another < 2% hit.
All of this is dependent on an actual engine that can process the notation. There's apparently lots of options for pgn.
I'd also like to develop a heatmap based on statistical analysis. I'd imagine this would not only be way less than equally distributed but there'd be no way to slice the data to make it appear equally distributed
It would be fun to have a chess variant where en passant applied to every move:
- you play bishop a3×e7 taking my queen
- I reply with bishop a7×c5, taking your bishop en passant, getting my queen back (your bishop got taken before it reached my queen)
- you reply with knight a4×b6, taking my bishop while it’s en route to intercept your bishop that took my queen. You get back your bishop, it does end up on e7, and I do lose my queen again
- I reply, taking your knight while it moves through a5. Your bishop dies again, I get back my queen.
- etc.
For knight’s moves, I think you’d have to either make a hard rule as to what square they move over, or let the player say how they moved on every move. Also, two pieces could be taken in one move (a piece on the target square and the knight that just hopped over it)
Standard chess already has some of this in the rules for castling. There, you aren’t allowed to move your king through a position where it would be attacked by an enemy piece. That’s like saying it can be taken en passant.
In what sense is the pawn not all the way there? It occupies the square, prevents any other piece from occupying the square, can deliver check or checkmate from the square, and can be captured on the square.
The OP refers to the fact that “en passant” is French for “in passing”, so the move sort-of refers to the idea that the pawn takes the other pawn while it is passing through the third or sixth rank, as if the capture starts while the previous move is still in progress.
Also, the pawn can’t deliver checkmate, can it, if it can be taken en passant? It probably is possible to construct a position where taking en passant would bring the king into check in another way, but in those cases, the en passant move isn’t possible.
> Also, the pawn can’t deliver checkmate, can it, if it can be taken en passant? It probably is possible to construct a position where taking en passant would bring the king into check in another way, but in those cases, the en passant move isn’t possible.
I believe I've managed to construct a situation where this is the case. The key is that the pawn that would be able to take en passant is being pinned (e.g. by a queen or rook) with the king directly behind it, such that the pawn cannot perform any captures. Then, you just need to make sure all of the squares adjacent to the king are threatened, and finally actually put the king into check via a pawn advancing two squares.
Technically, the c4 pawn cannot be taken en passant (i.e. this is an illegal move) because it would expose the black king to a different check. But I think this is in the spirit of your question.
It's because pawns used to be able to move only one square. En passant was created when they were allowed to move two squares: it sort of pretends that the pawn only moved one square, which is why you can only do it immediately after the two-square pawn move, capturing on the square where the pawn "should" be.
In the sense that a pawn that's in the perfect position can strike while it is "passing", but if that doesn't happen then it finishes the move and it's too late, the opportunity is gone forever.
Clearly there's some Heisenberg uncertainty principle where the pawn occupies both the third (or sixth) and fourth (or fifth) rank, in a kind of superposition that only an opponent pawn situated in a certain position would be able to observe.
I think the logic is based on pragmatism. A different piece has a chance of capturing the pawn later, but a pawn would never be able to since it can't go backwards.
This video is really about the rarest Standard Algebraic Notation (SAN) corner cases, which is more than a little different from rarest moves. But the author basically acknowledges this, and 'rarest moves' is so hard to really define anyway. So kudos; it's otherwise a great video.
I'm about equally impressed with the statistical analysis and the video construction and presentation. I can imagine tackling the former, but not the latter. I did notice a few mistakes in the video presentation though (eg a Bd4# that he presented as Be4#). I imagine at some point he just thought "I've polished this thing enough" and stopped.
I bet it becomes unique far, far less often than most people think.
Computing the number of permutations is thoroughly unconvincing.
For instance, there's 20 possible first moves, and of those probably only 2 are played 95% of the time. You can certainly compute what the rarest opening is, and the rarest response that's actually played, or the rarest response to the most common opening.
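As a rough sketch of that kind of tally: the 20 first moves are the 16 pawn moves (each pawn one or two squares) plus 4 knight moves, and counting how often each one actually shows up is a few lines. The game list below is toy data I made up and the function names are hypothetical; plug in real games to get real shares.

```python
from collections import Counter


def legal_first_moves():
    """The 20 legal first moves for White, in SAN."""
    pawn_moves = [f + r for f in "abcdefgh" for r in "34"]   # 16 pawn moves
    knight_moves = ["Na3", "Nc3", "Nf3", "Nh3"]              # 4 knight moves
    return pawn_moves + knight_moves


def first_move_shares(games):
    """Share of games opening with each legal first move (0.0 if never played)."""
    counts = Counter(g[0] for g in games if g)
    total = sum(counts.values())
    if total == 0:
        return {m: 0.0 for m in legal_first_moves()}
    return {m: counts.get(m, 0) / total for m in legal_first_moves()}


# Toy data: each game is a list of SAN moves.
toy_games = [
    ["e4", "e5"],
    ["e4", "c5"],
    ["d4", "d5"],
    ["Nf3", "d5"],
]

shares = first_move_shares(toy_games)
never_played = [m for m, s in shares.items() if s == 0.0]
print(len(legal_first_moves()), "possible first moves")
print(len(never_played), "of them never played in this sample")
```

The same Counter approach extends to (first move, reply) pairs if you want the rarest response to the most common opening.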
Interesting, but I initially expected this to be about the unusual opening employed to victory by Magnus Carlsen against Kacper Piorun on May 7, 2024[1] (1. a4 e5 2. Ra3).
It's even more interesting because an unknown IRL Chess.com player named Viih_Sou (since revealed to be Brandon Jacobson[2][3]) used this opening to defeat Daniel Naroditsky on May 2, 2024[4] only to be subsequently banned for violating the Fair Play Policy[5].
Why would that be the rarest chess move? It's a pretty common way to try to start a chess game, so much so that kids have to be taught not to do it because it's flat-out terrible.
I just want to say that the format of this video is beautiful and easy to follow along. A topic that could easily be boring and dry is presented in a way that keeps the viewer interested.
The problem is that there must be hundreds of ways to describe "a move", from the complete board state before and after to the distance that the piece moved (moving a bishop and a queen 2 squares diagonally is the same move).
So while I agree that the notation is pretty arbitrary and puts lots of emphasis on how implicitly a move can be recorded, I don't think it is fundamentally worse than any other definition. Yes, personally I would have picked something more directly tied to the game than whether the notation requires more disambiguation, but I don't think it really makes the video any less interesting or the result any less useful (probably no use either way).
Agreed. Just the notion of double disambiguation is meaningless to a chess player who doesn’t use the notation. This is fundamentally not about chess, but about one particular way that certain people use to write down chess moves.
Another count against statistically analysing SAN strings to death is some of the rules of that format. There's a lot of weight given to disambiguation in this video. In SAN you must not (if you're following the rules, some software including ChessBase doesn't follow the rules) disambiguate if one of the pieces is pinned to the king. So 1.d4 e6 2.c4 Bb4+ 3.Nbd2 Nf6 4.Nf3. Not 4.Ngf3.
This muddies the waters I think, these doubly disambiguated moves he lovingly isolated would be recorded differently if a marginally involved piece was pinned.
More significantly, there's no SAN notation for stalemate, or en-passant (or many other things of course) of great chess significance, perhaps more worthy of analysis.
The video was still a great technical achievement, and very entertaining for a chess nerd like myself.
You’re welcome to scan the game archives to determine the actual game state and find the case where someone had three Bishops on the same color, positioned in three corners of a square, moved one to create a discovered checkmate, but the move was notated without disambiguation because the two other bishops were pinned.
The problem with using a dataset consisting of all games on lichess.org is that most/all instances of these moves are almost certainly from people who are trying them out in a noncompetitive game just to see what happens. In fact he himself likely polluted the data further just to make this video, maybe even enough to change the answer.
There needs to be a minimum bar for the data to be meaningful, e.g. by restricting to players above a certain rating threshold, or considering tournament games only.
I don't think including "noncompetitive" games is an issue. For a game with so many possible states, it only makes sense to ask about what moves have been played at all, and not the context that these moves were played in.
Plus, restricting the dataset introduces more biases and ambiguities. What exact ELO should be "good enough" for consideration? Why not a point higher or lower? Should they have accounted for time control too, because people in speed chess play worse and can get into weird situations they otherwise wouldn't have been in?
He stated that he tried using master tournament games but the dataset was way too small.
But yes, the single example game he showed was indeed a result of the winner playing very silly moves and the loser allowing it rather than resigning.
In bullet games at Lichess it is not that uncommon to play on lost positions to try to either flag the opponent or to offload as many own pieces as possible to seek a stalemate with the frenzy. Conversely, the winning side then tries to delay the win by promoting a bunch of unneeded pieces and sort of demonstrating who is really in charge. It’s even fun.
This is delightful. I think that the hardest part (that, honestly, he absolutely nailed) is defining "rare" and "move" -- not only did he come up with reasonably satisfactory definitions, but he also was able to walk us through his discovery process.
The rest is a fun programming problem (which he largely glossed over), and it's clear he put a massive amount of care into the video. Thank you!
Kinda misunderstands algebraic notation: You are by no means required to use the minimal unambiguous notation. In fact, the canonical form is something like "Ng1-f3". "Nf3" is a context-dependent abbreviation of that.
Dig up an old chess book if you don't believe me. You'll find books that only use the long form.
Notation has changed over the years. Even twenty years ago you could pick up (admittedly dated) books using descriptive notation (e.g. P-K4). It's pretty clear what is the accepted standard for notation today though.
Algebraic notation hasn't changed notably in the last (at least) 50 years. The abbreviated form that most people use today was also the form that most people used 50 years ago. (A change to indicate draw offers is the only one I can remember.)
The point is that there are multiple ways of writing the same move, and none of them are more correct than the other. Basing your definition of a move on the precise text that the computer happened to spit out when the games were retrieved, that's just being lazy.
OP misses a couple important things when he's discussing notation: when a pawn gets promoted, you have to specify what it gets promoted to, and also if the move results in stalemate you note that. Stalemates are much rarer than checkmates, and underpromotions are very rare also, so I imagine the stalemates analogous to the checkmates he considers must be rarer still, if they ever happen at all.
If you define a "move" as a piece moving from one square to another, which I think is the correct definition, then the rarest move is probably an illegal move. Like black king jumping from b4 to h8 or something. Or a piece moving to an already occupied square of the same color. In over the board chess illegal moves happen all the time in bullet chess, but not online of course.
Chess historian Tim Krabbé ran an analysis with a different notation, using an extended algebraic system (start and end square + kind of captured piece, if any). Here are his results: https://timkr.home.xs4all.nl/chess2/diary_6.htm (entry 105).
His site is a trove of information for people who like chess trivia.
The creator of the video uses standard chess notation, which encodes whether a move
- is a capture
- is a check
- is a checkmate
- needs a disambiguation
- and other things
It is arguable how much the rest of the board is part of a move (see the current top comment), but let's say moves are different if their standard chess notation is different.
Then it turns out that bishop double disambiguation moves are very rare. They require two underpromotions to bishops on the same square color to begin with. And those moves also being captures and/or checkmates is even rarer. A lot of them have never happened on lichess.
That's about the content. Now about the format.
The video features expressive hands all along, which makes it stimulating to watch. If you are into presentation, go and check it out.
Fun video. An interesting follow-up would be to ‘disambiguate’ every move, so that a move is a single piece’s movement and reflects less of the state of the game.
It's a 17 minute video.. that's quite the round up. Just saying because for me the difference from 15 minutes to 30 minutes tends to go from: "yeah I'll check this out" to "boy this is an investment".
Interesting, I'd assumed it was going to be a pawn promotion to something esoteric like a knight that didn't create a check, but that's a couple layers deeper.
On a scale of billions of games, your situation would be fairly common. The author of the video got to the conclusion by stacking multiple insanely rare occurrences - the player would need to underpromote to a bishop (the rarest promotion) and then capture and checkmate while these bishops are placed in a specific configuration. The author gets into weird territory fast when fishing for the strangest game out of billions.
Did he include under-promotion captures resulting in zugzwang? If you’re not including zugzwang then you shouldn’t include checks and mates. The problem becomes a lot simpler.
Parent is referring to stalemate. Underpromotion to a bishop to achieve stalemate was my guess for rarest, since you can always stalemate with a queen just as well as a bishop, so there's no reason to underpromote if your goal is a draw on the turn in which the promotion occurs.
This video is excellent. It’s one of those videos that will become a classic of YouTube in the future. The kind that’s recommended to millions of people