A deep dive on load balancing algorithms

jedberg · on Nov 17, 2021

I'm disappointed that this doesn't include HRW (Highest Random Weight or Rendezvous Caching). It has the advantages of consistent hashing without the disadvantage, and doesn't require a central coordinator like consistent hashing does.

skyde · on Nov 17, 2021

does it handle overloaded service similar to "Bounded Consistent Hashing"?

jedberg · on Nov 18, 2021

Yeah, it would. If the server returns a 503 to the caller when it's overloaded, the caller would move on to the next host, and if the server were taken out of service, then the call would move on to the next host.

The main difference is that only the calls for that host would move, and they would be distributed across multiple other hosts, instead of shifting the keyspace around, depending on what you are using for your HRW calculation to pick the server list.

recursive · on Nov 17, 2021

I'm not familiar with any of these, but this point stuck out to me.

Two random choice says "Max load of any server is O(log log number_of_servers)."

If work accomplished is proportional to load, then the total work done by the entire system is O(number_of_servers * log log number_of_servers). It seems very suspicious, magical even, that the total work is more than linear with the number of servers. Free energy discovered?

scottlamb · on Nov 17, 2021

> If work accomplished is proportional to load, then the total work done by the entire system is O(number_of_servers * log log number_of_servers). It seems very suspicious, magical even, that the total work is more than linear with the number of servers. Free energy discovered?

No free energy for at least a couple reasons:

* This is O(...) meaning (roughly) "bounded above by", not Theta(...) meaning "bounded above and below by".

* This is max load, not average load. Even if it were theta, it wouldn't follow from the max load on a server being Theta(log log number_of_servers) that the total load is Theta(number_of_servers * log log number_of_servers).

rp1 · on Nov 18, 2021

I imagine the reason is that the max load on a single server is O(log log number of servers), but it’s not possible for all the servers to be at max load.

For instance, imagine load balancing via random assignment. The theoretical max load of a server is receiving every request, but if one server receives more requests, then the other servers receive less.