Hacker News

3 million queries/second across 16k nodes seems pretty heavy on redundancy?

I was going to say, that's absolutely nothing. They state 2.1K clusters and 16K nodes; if you divide those, assuming even distribution, you get 7.6 instances/cluster. Round down because they probably rounded up for the article, so 1 primary and 6 replicas per cluster. That's still only ~1400 QPS / cluster, which isn't much at all.

I'd be interested to hear if my assumptions are wrong, or if their schema and/or queries make this more intense than it seems.


> assuming even distribution

I don't work for Uber, but this is almost certainly the assumption that is wrong. I doubt there is just a single workload duplicated 2.1K times. Additionally, different regions likely have different load.


That's roughly 200 QPS per node (3M / 16K ≈ 188), assuming perfect load balancing.

2100 clusters, 16k nodes, and data is replicated across every node "within a cluster" with nodes placed in different data centers/regions.

That doesn't sound unreasonable, on average. But I suspect the distribution is likely pretty uneven.
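The back-of-envelope math in this thread can be sketched in a few lines. This only restates the figures quoted upthread (3M QPS, 2.1K clusters, 16K nodes) and, as both commenters note, assumes a perfectly even distribution that is almost certainly wrong in practice:

```python
# Figures quoted upthread; assumes perfectly even distribution.
total_qps = 3_000_000
clusters = 2_100
nodes = 16_000

nodes_per_cluster = nodes / clusters    # ~7.6 -> ~1 primary + 6 replicas
qps_per_cluster = total_qps / clusters  # ~1,429, the "~1400 QPS / cluster"
qps_per_node = total_qps / nodes        # 187.5, the "roughly 200 qps per node"

print(f"{nodes_per_cluster:.1f} nodes/cluster")
print(f"{qps_per_cluster:.0f} QPS/cluster")
print(f"{qps_per_node:.1f} QPS/node")
```

None of these per-unit numbers are large for a relational database, which is the thread's point: the fleet size is driven by replication and isolation across clusters, not by raw query volume.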

