> Today, we serve 2.3 million QPS at peak. 2M of those queries are reads and 300K are writes.
Damn, even for a service as popular as Slack, that’s significantly more than I expected. Slack has ~12-13 mil DAUs, right? I assume at peak time of day, maybe 2-3 million actively using the product at the same time? If that’s a fair assumption, that’s roughly 1 MySQL query per second, per active customer - seems like a fair bit? I wonder if they do polling (instead of websockets)?
At my work we have roughly 1 order of magnitude fewer DAUs, but roughly 2 orders of magnitude fewer QPS. And we also have chat areas of the product.
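For anyone who wants to sanity-check the back-of-the-envelope math above, here is a rough sketch. The 2.3M peak QPS is from the quoted post; the concurrent-user counts and the 12.5M DAU figure are just the guesses made in this thread, not numbers Slack has published for this purpose, and the function name is made up for illustration.

```python
# Back-of-the-envelope check of the figures discussed in this thread.
PEAK_QPS = 2_300_000      # from the quoted post
ASSUMED_DAU = 12_500_000  # assumption: midpoint of the "~12-13 mil" guess


def queries_per_user(qps: float, users: float) -> float:
    """Average queries per second attributable to each user."""
    return qps / users


# Assumption: 2-3 million users active at the same time at peak.
for concurrent in (2_000_000, 3_000_000):
    rate = queries_per_user(PEAK_QPS, concurrent)
    print(f"{concurrent:>9,} concurrent users: {rate:.2f} queries/sec each")

# "1 order of magnitude fewer DAUs, 2 orders of magnitude fewer QPS"
# works out to roughly a tenth of the per-DAU query rate.
slack_per_dau = queries_per_user(PEAK_QPS, ASSUMED_DAU)
other_per_dau = queries_per_user(PEAK_QPS / 100, ASSUMED_DAU / 10)
print(f"per-DAU rate relative to Slack: {other_per_dau / slack_per_dau:.2f}")
```

So the "roughly 1 query per second per active user" estimate holds anywhere in that 2-3 million range, and the comparison service would be doing about 10x fewer queries per DAU.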
It is a pretty high QPS. I work on a chat service that is quite popular (even more popular than Slack) and looked at our QPS to our persistent data stores. It is lower than Slack's by roughly an order of magnitude when you normalize for DAU. Really wonder what they're doing over there.
Whatever the service, it isn't the platform that Slack is. I would guess that a huge percentage of their db traffic is from apps and custom integrations.
As of September 2019, Slack reported 12 million DAUs [0]. Of course, that is pre-pandemic, and the only figures that Slack has provided regarding demand in 2020 are these tweets [1] from Slack CEO Stewart Butterfield back in March. In those tweets, he mentions that Slack was serving 12.5M simultaneously connected users.
May want to edit your post - I believe you meant 300K QPS, not 300 QPS :) But yeah, that’s very impressive to handle that much load without sharding! Though it is a cluster of 12 instances, each with 96 CPUs and 614 GB RAM - those are some seriously beefy machines.
Slack are getting 300K WRITES per second, though. Which does mean you really have no choice but to shard - read replicas are nice for scaling reads, but do nothing for writes, and that’s A LOT of writes.
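To make the replicas-vs-shards point concrete, here is a toy model. It assumes writes spread evenly across shards and that every replica has to apply every write its primary receives; the shard and replica counts are made-up illustration values, not Slack's actual topology.

```python
# Toy model: write load per node under read replicas vs. sharding.
PEAK_WRITES = 300_000  # writes/sec, from the quoted post


def writes_per_node_with_replicas(total_writes: float, replicas: int) -> float:
    """Each replica replays every write, so adding replicas does not help."""
    return total_writes


def writes_per_node_with_shards(total_writes: float, shards: int) -> float:
    """Assumes writes are evenly distributed across shards."""
    return total_writes / shards


for n in (1, 10, 100):  # hypothetical node counts
    replicated = writes_per_node_with_replicas(PEAK_WRITES, n)
    sharded = writes_per_node_with_shards(PEAK_WRITES, n)
    print(f"{n:>3} nodes: replicas {replicated:>9,.0f}/s, shards {sharded:>9,.0f}/s")
```

Real workloads have hot shards, so an even split is the best case, but the shape of the argument is the same.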