But if you have an efficient system where each server can handle 10K+ concurrent requests, then the variance introduced by the random algorithm becomes insignificant. Also, it doesn't matter that some requests take much longer than others; if you're dealing with a lot of users per server, the slow requests will be distributed evenly too and average out.
I implemented the balls-and-bins algorithm and I found that when dealing with few buckets and a large number of balls, the variance in load between buckets, relative to the average load, became smaller. I even added randomness to the 'weight' of each request (to account for slow vs fast requests) and the distribution was still even. The more requests each server can handle, the more even the distribution is with the random algorithm.
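For anyone who wants to try the experiment described above, here is a minimal sketch of that kind of simulation. It is not the original code: the function names, the uniform 0.5 to 1.5 weight range, and the server and request counts are all assumptions picked for illustration.

```python
import random
import statistics

def simulate(num_servers, num_requests, weighted=False, seed=0):
    """Assign each request to a uniformly random server; return per-server load."""
    rng = random.Random(seed)
    loads = [0.0] * num_servers
    for _ in range(num_requests):
        # Assumed cost model: a random weight stands in for slow vs fast requests.
        cost = rng.uniform(0.5, 1.5) if weighted else 1.0
        loads[rng.randrange(num_servers)] += cost
    return loads

def relative_spread(loads):
    """Standard deviation of per-server load divided by the mean load."""
    return statistics.pstdev(loads) / statistics.mean(loads)

if __name__ == "__main__":
    servers = 10
    for per_server in (10, 100, 10_000):
        loads = simulate(servers, servers * per_server, weighted=True)
        print(per_server, round(relative_spread(loads), 4))
```

The number to watch is the relative spread: it should shrink as the number of requests per server grows, with or without the random weights.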
Also note the speaker is a CTO of a CDN (fast.ly), I am guessing he has experience with large concurrent requests as well :)
It may be less with HTTPS, depending on how efficient the CPU is at cryptographic algorithms and how many cores it has.
Fastly is a CDN. I assume they're talking about the performance of their CDN servers, not the clients' web servers that will actually process the request, which is indeed a much slower operation.
Also, it'd have to be a server with good async support like Node.js, Go, Haskell, Scala, Tornado, nginx...
Considering that the speaker is CTO at a CDN company (which has to deal with a lot of different kinds of back-ends that are outside of their control), it makes sense that they would need to use an algorithm which can handle all possible scenarios. They can't force their customers to use bigger servers and more efficient systems.
Isn't that all a load balancer is supposed to do? I certainly don't want my load balancers performing computations or logic. I want it to pass that work off to another server.
To be clear, you can do a lot better with a better pipe, smart caching, compression, etc. But people often have horribly unrealistic estimates about how much traffic their servers can handle because they don't take bandwidth into account, and load balancers are no exception.
Of course, when you break it down by individual web request, most responses are still below 800KB, but you shouldn't load plan for the average case. And clearly even the average case is well above 12KB, especially for a CDN (which is responsible for serving the image, video, and large script content). I'm also pretty confident the page I linked already includes compression (which decreases size, but can increase time quite a bit; many people expect software load balancers to be using the absolute fastest compression available, but that's often not the case in my experience).
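To put the bandwidth point in numbers, here is a rough back-of-the-envelope sketch. The 1 Gbit/s link speed is an assumption for illustration (it is not a figure from this thread), KB is taken as 1000 bytes, and protocol overhead is ignored, so the results are upper bounds at best.

```python
def max_responses_per_second(link_bits_per_second, response_bytes):
    """Upper bound on responses per second imposed by link bandwidth alone."""
    return link_bits_per_second / (response_bytes * 8)

LINK_BPS = 1_000_000_000  # assumed 1 Gbit/s link, not a number from the thread

for label, size_bytes in (("12KB", 12_000), ("800KB", 800_000)):
    print(label, max_responses_per_second(LINK_BPS, size_bytes))
```

Under those assumptions, 800KB responses cap out at roughly 156 per second on that link, no matter how many concurrent connections the server itself can hold open.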