Hacker News

3 million queries/second across 16k nodes seems pretty heavy on redundancy?

I was going to say, that's absolutely nothing. They state 2.1K clusters and 16K nodes; if you divide those, assuming even distribution, you get 7.6 instances/cluster. Round down because they probably rounded up for the article, so 1 primary and 6 replicas per cluster. That's still only ~1400 QPS / cluster, which isn't much at all.

I'd be interested to hear if my assumptions are wrong, or if their schema and/or queries make this more intense than it seems.


> assuming even distribution

I don't work for Uber, but this is almost certainly the assumption that is wrong. I doubt there is just a single workload duplicated 2.1K times. Additionally, different regions likely have different load.


That's roughly 200 QPS per node (3M / 16K ≈ 188), assuming perfect load balancing.

2100 clusters, 16k nodes, and data is replicated across every node "within a cluster" with nodes placed in different data centers/regions.

That doesn't sound unreasonable, on average. But I suspect the distribution is likely pretty uneven.
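The back-of-envelope math in this thread can be sketched in a few lines. This only restates the figures quoted upthread (3M QPS, 2.1K clusters, 16K nodes) and, as both commenters note, assumes a perfectly even distribution that is almost certainly wrong in practice:

```python
# Figures quoted upthread; assumes perfectly even distribution.
total_qps = 3_000_000
clusters = 2_100
nodes = 16_000

nodes_per_cluster = nodes / clusters    # ~7.6 -> ~1 primary + 6 replicas
qps_per_cluster = total_qps / clusters  # ~1,429, the "~1400 QPS / cluster"
qps_per_node = total_qps / nodes        # 187.5, the "roughly 200 qps per node"

print(f"{nodes_per_cluster:.1f} nodes/cluster")
print(f"{qps_per_cluster:.0f} QPS/cluster")
print(f"{qps_per_node:.1f} QPS/node")
```

None of these per-unit numbers are large for a relational database, which is the thread's point: the fleet size is driven by replication and isolation across clusters, not by raw query volume.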

