A bit on scaling chess.com's database (unstructed.tech)
246 points by ikonic89 11 months ago | 90 comments

Having not played in years, I went to chess.com (like everyone else, I guess) after watching The Queen's Gambit, which they mention was an even bigger peak than lockdown. I was impressed by the site: you can play immediately without registering, and the user interface is nice and intuitive.

It's very refreshing.

https://lichess.org/ is also very good. You can play without registering, all features are available to everyone for free, and the site is open source. It's a good clean website without ads or tracking.

(I used to be bitter about chess.com, thinking they're just cashing in on their domain name and charging for features because they can. And that may be, I don't know. But I have seen them organize some good chess events, so they might not be all bad. I think they pay streamers to use their site though, a practice I'm not too happy about.)

I was bitter towards chess.com after they sent me my password through the forgot password process

That was a decade ago though. Maybe they've improved

They certainly have. Now not only are their user accounts secure, they are on the bleeding edge of detecting and kicking off cheaters.

How do they make money?

Premium features.

You can pay to get an extra move or a take-back (only joking).

Actually, thinking about it, there probably is a cheap chess-clone app on the mobile market that does that.

There is! "Chess Online"

Oh no. You shattered my remaining sliver of hope in the mobile app market.

If you think that's bad, check out this GDC Talk: https://www.youtube.com/watch?v=E8Lhqri8tZk

They procedurally generated 1,500+ different iterations on top of the same Unity-based slots shovelware using noun/adjective pairs. The general success they enjoyed is as sad as it is hilarious. For the record, the most enticing descriptor categories were: "3D", "Luxury", "Casino/Gambling", and "Sexy".

Stop it! This is soul crushing... and funny.

By charging their users money.

They have a premium membership that gives a few extra features like game analysis.

I’m quite happy that they are able to make money off of a game like chess. I don’t see a problem with “cashing in on their domain name.” They provide value to people, and people are willing to pay.

You generally have to pay to be a member of a chess club for example. For a fraction of the price, you can play with anyone in the world. Seems valuable to me.

Back in 2014 the lichess app would send your username/password in clear text :) will have to see if that's still the case...

Paying people to play chess can't be a bad thing. I prefer chess.com to lichess specifically because I support chess players by using chess.com. Streamers, tournaments, article writers...

What's so bad about influencer marketing? It is the norm these days. Inevitably competition will do it if you don't.

As an online chess player I find it a bit shady.

Lichess.org has a better UX and feature set for free than a premium chess.com subscription, but many of my friends watch streamers --> sign up to chess.com and get a worse experience.

Very subjective opinion about the UX.

I much prefer chess.com's UX. They allow multiple premoves, whereas lichess only allows one. Chess.com also has much better cheat detection and overall much better handling of players who engage in bad behavior (abandoning lost games): it segregates repeat offenders into a pool of their own so they don't affect well-behaved players. And lichess has a very quirky and unnecessary 300-move limit, which one can argue is more than enough for regular chess, but for speed chess is just a nuisance.

I've played speed chess for years, and I have never hit 300 moves in a game. The only way this will ever happen is if both sides are intentionally trying to avoid 3 move repetition.

And in terms of premoving, chess.com spends 0.1 seconds per premove whereas on lichess it's instantaneous, so I've found games tend to last longer on lichess move wise.

Sounds like valid reasons. I think we have different preferences - I like the lichess aesthetic, unlimited free puzzles, engine evaluations and opening explorer and lack of freemium popups.

Hope you enjoy your games, and your preferences seem reasonable too!

Also, lichess has a good number of variants. For example, it's very easy to get a game of Crazyhouse on lichess, whereas can you even play Crazyhouse on Chess.com?

How does lichess handle abandoned games? About half the time I don’t even get to claim the early win and I can see the other player is already playing another game.

If a player leaves your game, lichess gives like a 20 second timer and then lets you claim victory. I believe there are also "offer draw" options, but I'm not sure.

Oh, is it not a thing on chess.com? IME lichess's abandoned-game detector is not super reliable, but I guess it's better than nothing.

Chess.com has a 60-second countdown after a player leaves a match (at least in 10 min mode, haven't played others that much). Also, in daily chess there is a time limit for every move. Failing to move on time results in automatic resignation.

You can claim victory or a draw in that situation.

Marketing is antithetical to a free market, because it makes consumers prefer brands with the better marketing to brands with better value at a given price point.

This is why advertising is so heavily regulated - though not even close to regulated enough.

Marketing via 'influencers' is even worse than other forms of marketing because it is one of the least regulated spaces, especially when graphed against its relative effectiveness.

I bit the bullet and got a subscription. The game analysis feature is really good (I guess you can just run that client-side in theory? I don't know), and the lesson content feels worth it for me. And stuff like 4-player chess is fun.

The main homepage is a huge mess though...

> The game analysis feature is really good (I guess you can just run that client-side in theory? I don't know)

Indeed, this is how Lichess (optionally) does it (for free).

They also offer server-side analysis (provided by volunteers through fishnet): https://github.com/niklasf/fishnet

I think you mean .org and yeah that seems alright and the open-sourcing is interesting (resisting trolling on node ;).

It's all about SEO, chess.com is the first result on Google and DDG for "chess".

Thanks for the correction: https://lichess.org/

You can find a list of other chess websites at


Chess24 has Jan Gustafsson and Peter Svidler as commentators. Amazingly good English banter from a German and a Russian. In general, the individual personalities of nations come through really well in chess commentary, since commentators are from all over the place, are quite sharp, and everyone is talking about the same thing in usually the same language (English). I didn’t even know Germans and Russians had a sense of humor...

ICC is an old (in Internet terms) established website.

I wish more sites would let you begin using the service before trying to hoover up user accounts. I understand why it's not always possible, but it sure is nice.

Their live commentary is surprisingly entertaining as well.

And they have a bunch of chess variants to mix things up.

All around impressive site.

It's absolutely astounding that you can go on Twitch and see Grandmasters playing speed chess at pretty much any time of day.

I agree with other posters who've said that while Queen's Gambit may have sparked interest, it's streamers -- professional chess players or not -- who are keeping interest high.

I'm a big fan of chess, and it's awesome to see so many of my friends suddenly become interested in the game!

And it's Phish's "Dinner and a Rematch" that annihilated their infrastructure on 12/31/20 and is no doubt bringing about these projects.

Would've been nice to see something about this in the article

It was a catastrophe. Not surprised they gloss over it. Plus Phish is more polarizing than Trump so it's dangerous to associate with your brand. Songs are longer than chess matches.

> One of things that has really helped our DB scale is the fact that we’ve moved a lot of sorting, merging, filtering from the DB into the code itself.

This goes against everything I have learned until now. Maybe an experienced DBA can explain when moving stuff out of the database is helpful? I mean, sorting, merging and filtering are the core disciplines of an RDBMS, and many, many man-hours went into them. How and when can a custom version be better?

If you have 1m clients making DB requests and are hitting performance constraints you could do any of:

  1. Add more RAM, CPU, etc. to the DB host
  2. Create DB read replicas for higher read volume
  3. Shard the DB (reads and writes)
  4. Offload logic to stateless clients, which you can easily scale horizontally

So option 4 is reasonable if you don't want to or can't do the other options.

I would need to know a little more about those 1M requests before I could say whether any of your points are valid. If the 1M are solely reads, you can do real-time transient offload onto Redis without calling it a cache and without breaking the bank.

True. I would rephrase option 2 as "cache and/or replicate your query results somewhere".

It's very application-specific. If I'm building a message board/forum software where I'm paginating tens of thousands to millions of rows, I'm probably going to do my sorting in the DB. If I'm working with a smaller data set like a user's music playlist or favorite movies, I may opt to bring that data in unsorted from the DB and do everything else in app code. Even with small result sets, databases often over-allocate memory for sorting (sort buffers are usually pre-allocated) and even though it's often tunable, a lot of concurrent sorts on small data sets can waste memory.
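
For what it's worth, the small-data-set case can be sketched like this. It's only an illustration: in-memory SQLite stands in for the real database, and the `playlist_tracks` table, its columns, and `top_tracks` are all made-up names.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the production database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE playlist_tracks (user_id INTEGER, title TEXT, plays INTEGER)"
)
conn.execute("CREATE INDEX idx_playlist_user ON playlist_tracks (user_id)")
conn.executemany(
    "INSERT INTO playlist_tracks VALUES (?, ?, ?)",
    [(1, "Song A", 12), (1, "Song B", 40), (1, "Song C", 7), (2, "Song D", 99)],
)


def top_tracks(user_id, limit=10):
    """Fetch one user's tracks unsorted, then sort them in application code."""
    # No ORDER BY: the database only performs the indexed lookup by user_id.
    rows = conn.execute(
        "SELECT title, plays FROM playlist_tracks WHERE user_id = ?",
        (user_id,),
    ).fetchall()
    # Sorting a handful of rows here is cheap and runs on the app server.
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:limit]


print(top_tracks(1))  # [('Song B', 40), ('Song A', 12), ('Song C', 7)]
```

For the paginated forum case this would be the wrong trade, since you'd have to pull every row just to show one page.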

I can break it down quickly, here:

- Scaling web servers is much easier. Queries are (mostly, on a single host) executed one after another, so if one is slow, you suddenly have a queue of queries waiting to be executed. The goal is to execute them as quickly as possible.
- 2 separate queries, both using perfect indexes, will be much faster (insanely faster on this scale) than 1 query with a join. So, we just join them in code.
- Sorting is often a problem, since in MySQL only one index can be used per query.
- No foreign keys increases insert/update query speeds, and decreases server load.
- Etc, etc.

And thanks for bringing this up, I've added a disclaimer that these are not to be taken for granted, and they work for us, on our scale.

I did plan a whole new article about this. With all the benchmarks we've gathered over the years.
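
A rough sketch of what "join them in code" could look like, for readers who haven't seen the pattern. This is not chess.com's actual code: in-memory SQLite stands in for MySQL, and the `users`/`games` schema and the `games_with_usernames` helper are hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; the schema is made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE games (id INTEGER PRIMARY KEY, white_id INTEGER, result TEXT);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO games VALUES (10, 1, '1-0'), (11, 2, '0-1'), (12, 1, '1/2-1/2');
""")


def games_with_usernames(game_ids):
    """Two primary-key lookups, joined in application memory instead of SQL."""
    if not game_ids:
        return []
    # Query 1: fetch the games by primary key.
    marks = ",".join("?" * len(game_ids))
    games = conn.execute(
        f"SELECT id, white_id, result FROM games WHERE id IN ({marks})",
        list(game_ids),
    ).fetchall()
    if not games:
        return []
    # Query 2: fetch only the users those games reference, again by primary key.
    user_ids = sorted({white_id for _, white_id, _ in games})
    marks = ",".join("?" * len(user_ids))
    users = dict(
        conn.execute(
            f"SELECT id, username FROM users WHERE id IN ({marks})", user_ids
        ).fetchall()
    )
    # The "join" is a dictionary lookup per row.
    return [(game_id, users[white_id], result) for game_id, white_id, result in games]


print(sorted(games_with_usernames([10, 11])))
```

Whether this beats a real JOIN depends on the data and the scale, as the disclaimer above says. The sketch only shows the shape of the code, not a benchmark.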

It is easier to scale up application servers than DBs. You want the DB to spend its CPU cycles on its real value, which is ACID.

A single DB machine can easily become a performance bottleneck.

It doesn't matter if it sorts 1000x faster than client code, when there aren't cycles left to perform this (in theory) very performant sorting.

This might not be your actual problem, but I've seen it hit a completely oblivious dev team before :)

They didn't say it's done more efficiently, they said it helped their DB scale.

(It's probably cheaper/easier to scale their application servers as well.)

I'd certainly be tempted to move anything I could to the client...the end user's web browser. I imagine a fair amount of the load is from activity that never generates any revenue.

> No foreign keys, many things were done in the code itself (filtering, sorting, etc, to make sure the DB only ever uses the most efficient indexes)

This normally is terrible advice. You have to be sure that you really need it. Too often I see developers do this needlessly.

Yeah, I can agree with that. You really need to know what you're doing. We have benchmarks for most of such "micro-optimizations", though now we can know how things are going to behave before we even make them, just from the experience.

Definitely not something I'd suggest on an average traffic website.

They mention the huge bump in traffic from Netflix's "The Queen's Gambit". However, it is worth looking at Google Trends for searches for both the show and chess since they wrote the blog post. Initially there was a correlation between interest in the show and the game; however, as interest in the show waned, interest in the game has remained steadily high.

You are 100% correct. We expected to see a drop in activity after interest in the show dropped, but we're still hitting records day after day. Crazy.

My theory is online streaming is more of a factor than the show

yep. From what I've seen: it's because a bunch of chess Twitch and Youtube people have been building up communities + meme-scapes around the game. QG added a bunch of new people who found the online community and got into it -- especially because with the pandemic people have been wandering the internet searching (unintentionally) for new hobbies to get invested in.

Oh, wow. I didn't think about how it is now also a spectator sport like esports.

The bongcloud attack changed everything

sounds like a joke but it's really true.

edit: and all of r/anarchychess

Chess.com has deals with most of the prominent streamers, too

> it was in a fairly decent state for a database that started its life over 12 years ago. Not many unused/redundant indexes, the ones that were there were mostly good.

Ah, to work on a service whose requirements have not changed in 1500 years :P.

J/k (kinda), enjoyed the read thanks!

If anyone is interested in online chess, https://lichess.org/ is a really interesting project. It is not-for-profit, avoiding freemium upselling and instead relying on user donations.

Here is a great talk by the founder


And here is the open-sourced code.


As a lichess player myself I have to say I find it a bit tone deaf and annoying that it's peddled in every context chess.com is mentioned.

Under that perspective I'm perhaps guilty of mentioning lichess unnecessarily, but I do think the site is an interesting phenomenon (i.e. providing cutting-edge technology, at no cost to users, without advertising or tracking).

Raising awareness of alternatives is good, but yep I think it's also fair to be critical of the way that awareness is generated.

I think it’s entirely warranted, given that lichess does not have the budget to sponsor big name chess streamers. The only thing that keeps lichess going is the user base.

I have a chess.com account and wouldn’t have heard about lichess if not for this post.

I think it's an interesting phenomenon and worth spreading some info about, both because of the technical aspects and because I prefer the creator's philosophy.

I do the same in person with friends, and if they ignore me, or try it and prefer chess.com then not too much harm done:).

It’s because lichess is pretty significantly better than chess.com while being free. Creates that extra motivation to make that post.

Very enjoyable talk, thanks for sharing. Funny (amusing) to hear that lichess' competitors would pay professionals not to play on lichess. Perhaps unsurprising in the era of influencers?

Yes, Thibault Duplessis is a very funny guy and has made something great without capturing any personal value!

> Funny (amusing) to hear that lichess' competitors would pay professionals not to play on lichess

That is true, nevertheless, many of the same Grandmasters still play using anonymous accounts. For obvious reasons, we should not point out which accounts these are.

This was a great solution and execution under pressure. You might consider TiDB in the future when you have some breathing room to consider a new operational model. It is MySQL compatible, open source, and can scale writes horizontally as traffic increases by just adding additional nodes. Disclaimer: I work at PingCAP, the creator of TiDB.

I was in the Phish vs. phans chess.com match on NYE. I was wondering how this is handled. Fascinating! Thanks for sharing.

I was also there, and when it wasn't working at first I assumed the DB was totally overloaded. Glad you got it working in the end!

The team had to selectively throttle access to that one game and change things around to get through the overload.

I'm trying to pun "paged" with "Phish" and it doesn't work.

But ops was definitely on deck, on the holiday.

It's worse than that. They bailed on voting and had a human watching the chat for audience moves.

Glad you liked it :)

As a minor note, the auto-increment thing can often be handled by having one of the machines only generate even primary keys and the other only generate odd.

Do keep an eye out for id exhaustion if you try that though, since you just halved your headroom when running 100% on one or the other.
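
To make the even/odd idea and the headroom warning concrete, here is a toy model in Python. In MySQL itself this is typically set up with the `auto_increment_increment` and `auto_increment_offset` system variables; the `id_sequence` and `ids_available` helpers below are just illustrative names, and the unsigned 32-bit key is an assumption.

```python
from itertools import count, islice


def id_sequence(offset, increment=2):
    """Yield offset, offset + increment, ...: one writer's share of the ID space."""
    return count(start=offset, step=increment)


def ids_available(max_id, offset, increment=2):
    """How many IDs one writer can hand out before it would exceed max_id."""
    if offset > max_id:
        return 0
    return (max_id - offset) // increment + 1


odd_writer = id_sequence(offset=1)
even_writer = id_sequence(offset=2)
print(list(islice(odd_writer, 3)))   # [1, 3, 5]
print(list(islice(even_writer, 3)))  # [2, 4, 6]

# Assumption: an unsigned 32-bit primary key column.
INT_UNSIGNED_MAX = 2**32 - 1
print(ids_available(INT_UNSIGNED_MAX, offset=1, increment=1))  # 4294967295
print(ids_available(INT_UNSIGNED_MAX, offset=1, increment=2))  # 2147483648
```

So if all writes land on a single machine, that machine runs out after roughly half as many inserts as it would with a plain increment of 1.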

There are "new generation" databases, some of them MySQL 5.6 compatible (e.g. Vitess) with the ability to massively scale in terms of read/write QPS (queries per second).

In database related podcasts you will hear that QPS is considered to be a solved problem nowadays.

We did test Vitess a couple of years ago, just decided not to go with it. There were a lot of reasons for that, maybe a topic for another post.

Vitess is just one example, there are others. And it depends, some teams might be able to do some work to get along with MySQL, like chess.com. As in most cases, there is no silver bullet here.

ikonic89: how large is the database? The charts showed disk utilisation % but would be interested to know raw sizes.

Sorry for the late reply :) Looks like we have between 20T and 25T of data, combined across all MySQL databases. And somewhere in the region between 60B and 80B records. Edit: this is just for data that's used all the time. Almost nothing is dead, just sitting around.

Their business has grown a lot due to The Queen's Gambit. It's pretty cool when an outsized force grows your business like this!

Nice engineering blog post. Thanks for sharing.

The first mistake is using MySQL.

Saying that in response to a 12 year old database they've built an empire off makes you look silly, not them.

not really

Fair. I'm not sure why I replied, honestly, and with such hostility. It would have cost me nothing to just scroll on by. Sorry mate.

I think it is unfortunate that single node/non scalable databases are still being used, which then leads people to have to do gymnastics to mitigate their limitations (no write scaling, etc). If they used something like Spanner, they would probably not have needed to do any of this and could instead focus on their core business.

I'd be interested in what the estimated monthly bill for them would be using Cloud Spanner.
