Titled Tuesday Cheating (dorianquelle.github.io)
203 points by surprisetalk 6 months ago | 125 comments



Hikaru talking about the cheating allegation against him: https://www.youtube.com/watch?v=Zm2R--_fBj0

I'll add that he live streamed the tournament he was accused of cheating in. If you've ever watched how fast he moves and quite often pre-moves & talks through his strategy as he goes, it'd be very surprising to see that he cheated (and I don't even know how he'd go about doing so - he's faster than the engines!).


I don't think he was cheating, if just because he was playing a series of players far, far lower than his standard level, and he proves that level in tournaments that have far superior anti-cheating requirements.

However, could he be cheating, if he wanted to, in ways undetectable by the livestreaming? Absolutely! At the GM level, just having the evaluation visible somewhere is more than sufficient to gain 100 points of ELO: You know when your opponent blunders, and if you are winning. That can be set up trivially somewhere out of view. He doesn't need to be told one move, ever.

See, for instance, Carlsen-Caruana WCC, game 6. Ended in a draw. But there was a mate in 30 once, that neither player saw. The moment Fabiano was told of the mate in 30, after the game, he started blurting out the right line... because if there's mate in 30, there was only one way it could possibly go, but it was too hard to calculate. Mate in 30 says 'yes, the really crazy line that you saw works out!' The same can happen in blitz.

As for Hikaru being faster than the engine... it's a crazy claim that he wouldn't believe himself. At any and all time controls, the engine wins, all the time. The calculations from the previous move already included every move your opponent does... It's just that the lame browser engines aren't set up to remember the line from move to move, so they recompute. See what happens with any engine in the cloud, using aggressive parallelism, the kind a super GM uses to study.

There's minimal cheat prevention in Titled Tuesday. Even when streaming, cheating is trivially easy to do, by anyone. But this is a place where Kramnik is overplaying his hand. There are probably many players making money in Titled Tuesday by cheating: their performance is so very far from their over-the-board performance that it's highly suspicious. But the top dozen super-GMs, for whom Titled Tuesday money is unimportant? The statistical differences vs their regular play would have to be far bigger than what Kramnik shows for them.


> It's just that the lame browser engines aren't set up to remember the line from move to move, so they recompute.

I didn't realise this. Appreciate the correction.


I agree with your analysis.

> That can be set up trivially somewhere out of view.

And that's the reason headphones are banned in tournaments.

It's a problem for streamers though, since they need headphones to avoid audio feedback.


What audio feedback?

Do chess streamers need audio from the computer?


The audio cues that sound when the opponent moves a piece are really important, more so if you are distracted reading chat as streamers do.


They're likely capturing their screen with sound. If they allow the audio output of the game to be captured by the microphone, then the game sounds will play twice. Chess doesn't have very many sounds, but it does have some, e.g. when a piece moves.


They often listen to music.


They are streaming.


> At the GM level, just having the evaluation visible somewhere is more than sufficient to gain 100 points of ELO.

I suspect it could be even more, but I wonder if anyone has tested this at different levels? It would be cool to have a graph that shows eval-bar at 1600 ELO equals 70 points, eval-bar at 2800 ELO equals 200 points, etc.
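For a sense of scale on what a 70, 100 or 200 point gain would mean, here is a minimal sketch of the standard Elo expected-score formula. The function name and the list of gaps are just for illustration, and since chess.com and Lichess use Glicko variants, treat it as a rough guide rather than what either site computes.

```python
def expected_score(rating_diff: float) -> float:
    """Standard Elo expected score for a player rated `rating_diff` points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


# Illustrative gaps only: how much a given rating edge is worth per game.
for diff in (0, 70, 100, 200, 400):
    print(f"+{diff:>3} points -> expected score {expected_score(diff):.3f}")
```

So a 100-point edge is worth an expected score of about 0.64 per game against an otherwise equal opponent, and a 200-point edge about 0.76.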


Do I think he cheats? No.

Do I think it's possible to cheat at his level? It would be an art to cheat in rapid and not get in trouble at the rate he plays. But sure, I'll bite, why not.

You can see that Hikaru has the ability to calculate and anticipate standard play far above his opponents; he calls out their moves before they make them. He clearly calculates both sides: the human line for himself and for his opponents.

You can also see the tilt in chess vs say LoL in almost real time. You can begin to spot fatigue in an hour-long video or a multi-hour stretch: you can see Hikaru get tired and change the way he plays, slow down, and go for positions that are easier to play, because he knows himself very well.

Hikaru taking 2900s to town on chess.com simply looks like he is playing above the field at that pace. In a real tournament, with 90+ minutes, 2500-2600s give both Magnus and Hikaru trouble because there is so much more time to calculate. I think the reason Hikaru is so strong is pattern recognition at this speed, instinct, and time spent playing, not necessarily strength in classical chess (e.g. he has never been world champ).


> Hikaru talking about the cheating allegation against him: https://www.youtube.com/watch?v=Zm2R--_fBj0

He seems very amused by the accusation in that video.

Regarding his video to me this sounds like if you were to ask Djokovic to play 46 tennis matches versus players ranked between 5 and 400 in the world, could he win 45 out of 46 matches and draw one game (not possible in tennis but hey)?

I mean: Djokovic literally won 41 matches in a row, including several matches against opponents from the top 5.

From the opponents' list, it looks like there was no Magnus Carlsen and not any top 5 players in the world?

Is it that surprising that a world champion crushes competition like that?

I don't know but I'm not that surprised.

P.S: I'm making a tennis analogy because Hikaru likes and plays tennis


> this sounds like if you were to ask Djokovic to play 46 tennis matches versus players ranked between 5 and 400 in the world, could he win 45 out of 46 matches and draw one game (not possible in tennis but hey)?

Not quite, it's more like watching Djokovic's match record until he gets a really good streak and then calling that particular run out as unlikely. It being specifically 46 matches is just because that's the particular length of data that Kramnik cherry picked to do his analysis on.
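A rough way to see how much the cherry-picking matters is to simulate it. This is only a sketch under made-up assumptions: a player who avoids losing any given game with probability 0.93, a record of 3000 games, and independent games. None of these numbers come from the article or from Kramnik's data.

```python
import random


def longest_unbeaten_run(results):
    """Length of the longest run of consecutive non-losses (True = did not lose)."""
    best = current = 0
    for ok in results:
        current = current + 1 if ok else 0
        best = max(best, current)
    return best


def simulate(n_games=3000, p_not_lose=0.93, window=46, trials=500, seed=0):
    """Return (P[fixed window is unbeaten], P[some unbeaten run of that length exists])."""
    rng = random.Random(seed)
    fixed_hits = anywhere_hits = 0
    for _ in range(trials):
        games = [rng.random() < p_not_lose for _ in range(n_games)]
        # Window chosen in advance: only the first `window` games count.
        fixed_hits += all(games[:window])
        # Window chosen after the fact: any unbeaten run of that length counts.
        anywhere_hits += longest_unbeaten_run(games) >= window
    return fixed_hits / trials, anywhere_hits / trials


if __name__ == "__main__":
    fixed, anywhere = simulate()
    print(f"pre-chosen 46-game window unbeaten: {fixed:.3f}")
    print(f"some 46-game unbeaten run exists:   {anywhere:.3f}")
```

Under these assumptions a window picked in advance is unbeaten only about 3.5% of the time (0.93^46), while an unbeaten run of that length showing up somewhere in the record is close to certain. That gap is why picking the streak after looking at the data makes it seem far more unlikely than it is.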


Unlike a chess game, a tennis match can't be lost with a single "blunder," unless it's a serious injury ... so I'm not sure it's a good analogy, mathematically speaking.

We should ask Marc Esserman.


I'm going to need a statistical analysis of Hikaru's Tennis matches before I can assume anything. Surely he's cheating at tennis if he's cheating at Chess....


The chances Djokovic isn't cheating are very low though.


Kramnik has always intentionally trolled. This is how he compensates his frustration when he loses.

The weird thing is, Kramnik actually cheated himself, when he became world champion. He accepted a match for the title with Kasparov, though he lost the match for the challenger against Shirov.

This accusation against Hikaru is not funny anymore. Noobs in the Lichess chat start calling Hikaru a cheater. This is not ok. Being a cheater is a stigma in the chess community.

Kramnik should apologize, or get declared as a persona non grata for a while in the chess community.


Live streaming doesn't automatically mean you can't cheat. There is software that reads the chesscom/lichess board in real-time, calculates a bunch of lines using a chess engine and overlays the evaluation of the next best moves over the chesscom board. You can then choose the move that you feel most comfortable with. You never need to look away from the board or switch focus from the browser window. This software doesn't need to be always running. You can start it with a hotkey on demand. The state of the technology is such that you can easily cheat in bullet games while streaming at the same time.


He is not faster than engines.


Maybe not usually, but here's an example when he was faster than stockfish [0]

[0] https://www.youtube.com/watch?v=vWghHUcjPZU


chess.com browser based engine is quite bad and default depth is low (and this is from 2 years ago).


What is happening there? Hikaru doesn't explain.


It said white had the better position and then later in the same position it said black had the better position.


He is if you normalize to 70W of power utilization, which is more than what the brain uses :)


He is sometimes faster than engines. Engines are stronger than humans at slow time controls, but there are multiple GMs that can beat Stockfish at ultrabullet due to low depth.


It is true that chess engines are not usually designed to play 15-30s games; however, those ultrabullet example games are against the browser-based versions, which are much weaker and slower than a dedicated engine process.


He was wearing headphones during the stream, right? At his level, it's enough to get a 'BEEP' sound 1-2 times during the game to increase your performance by 200 ELO.


That's at least one "at the end of the day" a minute.


Before anyone gets into how much cheating there is, notice the conclusion

> I am convinced that there is no evidence that cheating increased after the chess boom in 2020, compared to the offline baseline. I am also convinced that there is no evidence of pervasive and consistent computer assisted cheating in titled tuesday tournaments.

However, re "Since then, a former World Champion published an hour long video, and accused Hikaru Nakamura of cheating." see https://www.chess.com/blog/VladimirKramnik/informational

Lichess discussion: https://lichess.org/forum/general-chess-discussion/kramnik-i...


Kramnik has gone off the deep end.

The "statistics" he is using make zero sense and whenever someone tried to counter him with actual mathematics he would just delete the comments.


I respect Kramnik a lot both as a player and as a commenter on current top chess. I thought his research on chess with the castling rules removed was fascinating, for instance.

I honestly watched his entire Youtube rant, and by the time it got to his actual data, I was so mentally drained from trying to understand the long, poorly structured 45 minute rant that preceded it, I was unable to even parse his figures and tables.

I think his focus on games that can vs can't lead to higher earnings is interesting. But his inability to communicate his ideas clearly and in good faith doesn't paint a picture of someone who understands what he's doing.


Your second paragraph made me LOL - thanks! Similar experience here, listening to the C-Squared interview.

> I think his focus on games that can vs can't lead to higher earnings is interesting.

The root problem, or at least one of them, is that he is cherry-picking his results. He may not even understand that he is doing it. (Caruana also pointed this out in the C-Squared interview.)

Rather than defining some search universe with reasonable parameters, and applying some standard filter across all of that space, Kramnik first notices some "interesting" result and expands out from there. This is a recipe for self-delusion and/or fraud.


Yes and furthermore anyone who attempts to offer a rebuttal to his argument has their post summarily deleted. A number of working statisticians have shown his arguments to be baseless yet he continues to stick his fingers in his ears and cry “Nya Nya Nya!”

I have no idea why he’s doing this!


Kind of embarrassing that he doesn't have the self-awareness especially after the false accusations that he cheated during his WC match and that whole scandal years ago.


His statistics are simple and make a lot of sense: when playing decisive games for money online some players have a high performance increase, while others have a high performance decrease. Anyone who attacks him chooses to ignore these statistics, including the author of this blogpost.


Shameless plug: If you're looking for a similar way to compare your online rating with the FIDE rating, take a look at my project: https://www.chessmonitor.com/

Like the author, I also use regression, but extended to non-title players. More information: https://www.chessmonitor.com/blog/2023-elo-calculation


Having a quick look at ChessMonitor, it looks like much of the analysis tools mirror what can be found on Lichess (unsure about Chess.com). Is the value providing the aggregate analysis between Lichess and Chess.com? Or is there functionality past what can be found on the respective sites?


Chess.com Insights are similar to what my project is doing. However, Lichess has nothing similar. They do have something called "Chess Insights" but it's not really comparable IMHO.

That said, there are also some unique features like "Openings" which allows you to quickly identify your best/worst openings.


Where have I seen this UI before? :)


Doesn’t make much sense for Hikaru to cheat. He competes over the board where it’s much harder to cheat, and he makes most of his money through streaming, not through competition results.


The size of his streaming audience is largely predicated on his perceived skill though. So, he may hypothetically feel pressure to cheat in order to impress his audience.


I'm not sure that translates very linearly. Most of the chess YouTube "stars" are not even GMs. Most of the viewers are hardly good players, so what you can learn from an IM is astronomical. Charisma and being good at teaching are much better skills than a 1000 ELO difference imo. To your point, beating Magnus might drag a lot of people to his stream.


> Most of the chess YouTube "stars" are not even GMs.

But they have tutorials, put tons of effort into entertaining content, etc.

Hikaru doesn't.

People watch Hikaru because they want to watch a top 5 player.


They watch Hikaru because he is fun while he plays and wins.

Offline, he is #4 in slow chess and #1 in Blitz.

He's not facing strong challenges on Titled Tuesday, and (like all top players) he is so good that there simply aren't enough players and data to make the empirical Elo scale work for assessing his win probabilities.


That’s right, the most popular content creator on Youtube is an IM and the most popular game analysis channel is by an amateur


That’s nothing new. Many of the books supposedly by GMs are ghostwritten by non-titled players.


But his appeal as a streamer is due to his strength.

The stronger he is, the more his name is listed as the winner, the more viewers he has.


Wondering if Hikaru is using an engine is a bit like wondering if John Carmack gets his insights from ChatGPT.


Regarding the list of noteworthy upsets, it's worth noting that, while I didn't go through all of the games, the top three results were all due to the stronger player making easy-to-spot blunders and the lower rated (but still strong) player converting.


What's going on in the picture? Has the illustrator never seen a chessboard?


The illustrator is DALL-E.


He should be fired


Or asked to generate the rules for this variant where you use mix-and-match pieces from several different sets on a 7x7 board.


Why do so many chess experts, practitioners and commenters, with varying degrees of familiarity with the chess world, misspell Elo?


Because it is so often capitalized, lowercasing it looks like a mistake, it looks exactly like an acronym, Elo was not nearly as famous for anything else, 'ELO' exists as a real acronym for many other things (eg. Electric Light Orchestra), and you can even easily define pseudo-definitions for it as a backronym - 'expected logistic odds', just off the top of my head.

I host a copy of his book and I still struggle every time to not type 'ELO' instead.


Yeah, I think we've been flooded by three-letter acronyms for so long that we see them everywhere. It doesn't happen with four letters: you wouldn't write VOLT or WATT.


Specifically a 2 syllable 3 letter word that doesn't fit a common English pattern (as it's not an English name).


Most likely reason is because it doesn't matter, "Elo" stopped being a person long ago, similar to many named streets, cities, states, etc. throughout the world

ELO is just as accurate as Elo when referring to the rating system.


How many of those cities and states became acronyms?


Probably quite a few, I refer to some cities as their acronyms only such as NY, DC, ATL, SF, etc. - admittedly they still have their old name, but ELO and Elo still have their name as well


And why do they use Elo to refer to rating on chess.com and lichess? [They use Glicko/Glicko2].


The author has a short paragraph at the end about Glicko.


Yes, he knows his stuff.


They aren't misspelling, they are capitalizing on the name.


He's talking about Vladimir Kramnik and this video BTW: https://youtu.be/_jk_r8FeREM

Saved you the work of the digging I had to do as someone that's not in the loop


This article is smoke and mirrors. A lot of shiny graphs, but he chooses to completely ignore Kramnik's main statistical result: when playing Titled Tuesday games which decide your cash prize, some players have an unusually high performance increase, while others have a performance decrease, including young world blitz champion Alireza Firouzja, who grew up playing online. This statistic is reproducible from open data; the most difficult part is to separate out the games that decide players' money prizes, since for that you'll have to look up the tournament table after each round manually.


I'm not sure I fully understand why the MAE is that big there (~0.35). Is it the difference between the theoretical expected score (based on the players' rating difference) and the actual score, averaged over many games? If the score is in the range 0-1, then MAE=0.35 seems huge to me. I get that there is some deviation from the theoretical prediction, but if I understood correctly it happens for larger rating differences, and I'd guess most games are played between players of similar strength.
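One possible explanation, assuming the error is taken per game (I don't know how the article actually computes it): single results are 0, 0.5 or 1, so even a perfectly calibrated prediction is far from each individual result. A small sketch, where the function name and the draw rates are my own assumptions:

```python
def per_game_mae(expected: float, draw_rate: float = 0.0) -> float:
    """Mean |result - expected| when the prediction is perfectly calibrated.

    Results are 1 (win), 0.5 (draw) or 0 (loss). The win probability is chosen
    so that the average result equals `expected` for the given draw rate.
    """
    p_win = expected - 0.5 * draw_rate
    p_loss = 1.0 - draw_rate - p_win
    if p_win < 0 or p_loss < 0:
        raise ValueError("draw_rate is too high for this expected score")
    return (p_win * abs(1.0 - expected)
            + draw_rate * abs(0.5 - expected)
            + p_loss * abs(0.0 - expected))


print(per_game_mae(0.5))        # equal players, no draws
print(per_game_mae(0.5, 0.3))   # equal players, 30% draws
print(per_game_mae(0.76))       # roughly a 200-point favourite, no draws
```

For two equal players with no draws the per-game MAE is 0.5, and with 30% draws it is 0.35, even though the prediction is exactly right on average. If the article instead averages scores within rating-difference buckets before comparing, this explanation would not apply.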


I am pretty sure everyone is cheating against me online; every time I get a player it is obviously a 2600+ player speed running online crushing my dream to reach a rating above 500.


On a related note, isn't this one place where so-called trusted execution and Web integrity would legitimately be of use? With a completely trusted path of execution coupled with browser integrity, a cheater would theoretically be forced to use an entirely separate device or source for moves, like a second screen, handheld, beads, whatever... which would need moves to be input in real time, making it quite a bit harder and requiring outside assistance.


"entirely separate device"

Like ... an iPhone with a camera that views the monitor, recognizes the board and suggests the best move?


It's the old philosophical impossibility of DRM to ever succeed. As long as the consumer is able to see the media, or the player able to play the game, it can be recorded given enough time and resources, same for cheating.


Interesting.

He concludes that cheating in Titled Tuesday must be virtually non-existent.

Many elite GMs (even those more level-headed than Kramnik) would strongly disagree.


Would it be possible to get some probability of cheating per player based on how often they play the best moves according to engines in online and live matches?


> based on how often they play the best moves according to engines in online and live matches

One thing that Magnus mentioned about a year ago (I think in relation to his oblique insinuation that Hans Niemann cheated) is that you don't have to cheat throughout the game.

He posited that if he checked an engine only once per game (at some critical juncture), he would be unbeatable (by any other non-cheating super GM).

If you cheat like that, it would be considerably more difficult to detect.

Caruana recently suggested that ~50% of players had cheated in cash online tournaments. He wasn't particularly clear about what he meant by "50%" but I suspect he thinks that many lesser GMs have done what Magnus suggested: they occasionally "dip their toe" into the cheating realm by checking at some critical point against a more formidable opponent.

They don't even need to do this every game. And, human nature being what it is, I bet they can convince themselves that it's not really cheating (e.g., "I was just double checking my favorite move....")


> He posited that if he checked an engine only once per game (at some critical juncture), he would be unbeatable (by any other non-cheating super GM).

Not even to that extent.

They just need to be told which single move mid-game to think harder on, and they'll often find it.

So all they need is a metaphoric tap on the shoulder.


> They just need to be told which single move mid-game to think harder on, and they'll often find it.

That's right! I think Carlsen even said something to that effect.

"This is a critical move." is all he might need to know....


You need something more sophisticated, otherwise it's easy to confuse high skill and/or well memorised lines with cheating. And a good cheater will be smart enough to either use engine assistance only in difficult spots and/or frequently pick the second/third best moves from the engine.


I think the parent's idea is that over-the-board is harder to cheat; therefore you could look at the difference between OTB and online play.


Correct. My idea is actually to allow some grace number of moves that players can play by the book (i.e. openings). After that, run the analysis I mentioned. I agree that playing OTB is more stressful and that is difficult to model in such an approach. I was hoping to at least get some signal to filter out the most obvious cheaters and so focus on the remainder that should be less obvious.
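A minimal sketch of that idea, with assumptions stated up front: the engine's top choice for each position is taken as already computed by some engine (no particular engine or API is implied), moves are plain strings, and the 10-move grace period is an arbitrary placeholder. As noted upthread, this would at best flag crude cases, since strong players match the engine often and a careful cheater would not.

```python
import math
from typing import Sequence, Tuple

Game = Tuple[Sequence[str], Sequence[str]]  # (moves played, engine top choices)


def engine_match_rate(played: Sequence[str], engine_best: Sequence[str],
                      grace_moves: int = 10) -> float:
    """Fraction of moves after the grace period that equal the engine's top choice."""
    if len(played) != len(engine_best):
        raise ValueError("need one engine suggestion per played move")
    pairs = list(zip(played, engine_best))[grace_moves:]
    if not pairs:
        return float("nan")  # game ended inside the grace period
    return sum(p == e for p, e in pairs) / len(pairs)


def mean_match_rate(games: Sequence[Game], grace_moves: int = 10) -> float:
    """Average match rate over games, ignoring games with no scored moves."""
    rates = [engine_match_rate(p, e, grace_moves) for p, e in games]
    rates = [r for r in rates if not math.isnan(r)]
    return sum(rates) / len(rates) if rates else float("nan")


def online_minus_otb(online: Sequence[Game], otb: Sequence[Game],
                     grace_moves: int = 10) -> float:
    """Gap between a player's online and over-the-board match rates."""
    return mean_match_rate(online, grace_moves) - mean_match_rate(otb, grace_moves)
```

A large positive gap for one player would only be a reason to look closer, not a probability of cheating; turning it into one would need a baseline for how much honest players' rates vary between online and OTB.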


But otb is more stressful.


Where exactly is "Zürich, Switherland"?


Most probably in Germany, Austria or Switzerland where the QWERTZ keyboard is widely used and the Z and Y are swapped (leading to an easy one key slip between the H and the Z).


Nice find though I personally always slip left or right, not up or down.


It struck me as an interesting puzzle; was this a curious locale spelling variation, a typo, a speech to text artifact, etc?

The above is a best guess and the first time I'd come across that keyboard layout.


[flagged]


What's wrong with it?


The somewhat obvious typo of "analysis" repeating in the article?


Hikaru has done multiple videos on why it’s total nonsense, pointing to statisticians from multiple universities that detail the problems in his thinking.

In short, he tries to pretend Elo doesn’t exist and there’s no way a top 10 GM could repeatedly defeat FMs and IMs rated 400 points lower.


Ok but that's not what was posted. The OP looks reasonable to me.


Fair enough, I think I've been confusing the content of this post with the discourse that led to it.


Hmm it's a little odd - he says there isn't persistent cheating, but that's in a world where chess.com bans people all the time for cheating.

So what he's saying is the chess.com anti cheat techniques are effective (for now), at least with respect to this specific tournament.


> he says there isn't persistent cheating

...in Titled Tuesday, i.e. when there is a real-life name and chess career on the line.


No, chess cheating is rampant. But it's highly unlikely the professionals are doing it.


Controversial opinion, but what is the point of becoming a pro chess player in a world of computers? Any novice can beat you when assisted by an engine. The same can't be said of other professions. A novice can't design a better bridge than a civil engineer just by using cots software.


What's the point of being a professional runner, when motorbikes exist?

People can get enjoyment from doing (and watching) people doing things, even if machines can do it better.


Arguably nothing.

Amateur running seems to make more sense.


Status-seeking is the reason folks focus on professional running, or professional chess, or any of these very hardcore, very difficult endeavors that have no practical purpose.

Someone once commented that men make extraordinary efforts to impress women, yet in general have very little understanding of what women actually like. This explains jacked-up trucks, professional chess, professional running, etc

EDIT: I'm talking about professional-level running or chess, not the hobbies of running or chess


Surely nobody becomes a professional chess player because they think it will impress women.


I think you dismiss the thrill of being really good at something.

We humans get a lot of satisfaction of improving and competing in all sorts of activities.

The world is so big that for any activity you'll find a good deal of people obsessed with that activity, and making it their life's work to perfect it.


I feel like for most competitive activities like this, who you really want to impress is other people who do that same activity. (I play competitive scrabble, for which there is approximately zero recognition in the "outside world", but having your fellow scrabble players recognise your ability counts for a lot.)


And yet so many men refuse to believe that women wear fancy dress and makeup for each other, not for men.


Is there no room for enjoyment of the activity? Does every endeavour have to be couched in the language of self-consciousness?


So too, the appreciation of excellence


> Someone once commented that men make extraordinary efforts to impress women,

While that is undoubtedly true, I suspect no one becomes a professional chess player for the ladies.

They do it because they like it, they're good at it, and they (if good enough) can make a living with it.


>Someone once commented that men make extraordinary efforts to impress women

You are seriously suggesting that people go pro in X/Y/Z mostly because they want to impress girls? Oo


It is pretty well known that all civilization was just an effort to impress the opposite sex (and sometimes the same sex), as explained in this educational film about the dangers of sex robots [1].

[1] https://www.youtube.com/watch?v=IrrADTN-dvg


Because it is fun.


> A novice can't design a better bridge than a civil engineer just by using cots software

...although a novice can beat a civil engineer at bridge design games. :-)

See Real Civil Engineer's YouTube channel [1], where a real civil engineer plays various engineering and puzzle games, including a lot of Poly Bridge 2 and 3. In Poly Bridge after finishing a level he generally looks at the leaderboard to see how the people that beat him did. There's usually a bunch that used cheesy methods, but there are also often several where they actually just did a better design. Especially on levels where hydraulics are used to move parts of the bridge.

[1] https://www.youtube.com/@RealCivilEngineerGaming


> Any novice can beat you when assisted by an engine.

A novice wouldn't be assisted by the engine but rather be an engine-chessboard interface with no real input as far as the game goes. I think that's a really weird take. You wouldn't claim there's no point in 5000m race when any person with a car easily beats you.


I can't carry the car everywhere in my pocket though.

Running ability is useful in real life outside of races.

Very high chess skill is never useful in any environment outside of the artificially restricted game environment.


The analogy holds. Deep thinking is useful in real life. "Very high" running skill is not useful in any environment outside of artificially restricted controlled races.


Chess moves won’t let you get out of a building on fire quickly.


Nor will the difference between being decently fit and a world class runner.


In 50-60 years from now you'll almost certainly be dead, if you're unfortunate it might be much earlier. And along the way everybody and everything you've ever known will also die. Literally nothing anybody does has any grand point or meaning.

All life comes down to is fulfillment. And chess, for many, is one of the most absurdly fulfilling and satisfying things that there is. And what makes the game so special is that as you become better at it, it becomes even more fulfilling and satisfying. And there's no real limit to how much you can improve. And the cost of it is practically $0. And so all a pro is, is somebody who takes their fulfillment of the game to an arguably natural extreme.


Like all professional competitions: because there's an interested audience with money. And humans still have the money, so we demand to see human competitors we empathize with.


What's the point of becoming a champion runner in a world of automobiles?

Any novice can beat you when in a car.

---

EDIT: With all due respect, your comment sounds exactly that silly.


It would be more like "what's the point of becoming a champion runner if a theoretical drug existed that made anyone instantly into a champion runner and was very difficult to detect?"


Cheating, except when done by an extremely strong player to begin with, tends to be utterly trivial to detect. For one reason among many more, a normal cheater will not really understand how difficult or easy some moves are and will tend to play a super-human move, requiring an exceptional depth of calculation, at the same pace which he plays a trivial recapture.

Cheating in chess is maybe something closer to steroids in athletics. If you take an average Joe and gear him up with all the steroids in the world, he's not suddenly going to just hulk out and become a world class athlete, contrary to what many think. But if you take a world class athlete and gear him up, he is going to be able to use that to gain a substantial and unfair advantage over his clean peers.


Yes, except that cheating is not that rampant.

Maybe for Titled Tuesday IDK, but not for most competitive chess.


Love of the game. Computers can make chess more interesting, not less! I love working with them.


When AGI finally comes around are you going to just stop doing anything? After all, what's the point when a machine can do it better?


This is my fear. That everything becomes pointless.

I think the solution is human augmentation to stay relevant.

Or we just have silly things like weight classes for each achievement.


games are fun.


A month ago Fabiano Caruana also said the exact same thing, but without naming names.

[0] https://www.youtube.com/shorts/aD1mga-Oj5k


He's not talking about Hikaru.


> without naming names



