Crypto hardware acceleration is a commodity. I believe Broadcom was selling the chips for <$5 over 5 years ago. To say they don't scale horizontally doesn't make sense either: if they're integrated over PCIe, you can get a lot of crypto processing into a single chassis. (Disclosure: I used to make crypto accelerators and load balancers.)
There surely are reasons not to do SSL at the load balancer, but "the load balancer will melt down" isn't one of them.
You'll usually run out of entropy before CPU usage becomes relevant with SSL processing; I've seen old versions of Apache hang with little or no entropy left while handling SSL connections. I recommend some sort of hardware RNG, or a poor man's software version such as http://www.issihosts.com/haveged/
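If you suspect entropy starvation, a quick check on a Linux box (assuming /proc is mounted) is to watch the kernel's pool while handshakes are coming in:

    # bits of entropy currently in the kernel pool
    cat /proc/sys/kernel/random/entropy_avail
    # watch it live; if it sits near zero, anything
    # reading /dev/random will stall
    watch -n1 cat /proc/sys/kernel/random/entropy_avail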
This comment really opened my eyes -- thank you! I am still a bit confused though. I think I was running out of entropy, but it seems nginx should use /dev/urandom, which supposedly doesn't block -- it just becomes less random when entropy runs out. So is nginx set up to block when entropy is depleted, and is that why this happens?
Very cool post - not only is it informative, but you've taken the "oh-so-rare" extra step of actually coding up a solution to what you're talking about, rather than taking the easier approach of just telling others what they're doing wrong without offering any practical alternative. Kudos!
One downside of this approach (without some funky iptables/networking-fu) is that you lose the source IP from the original request. Adding headers like X-Forwarded-For only works after the request has been decrypted, so all the traffic will appear to come from the load balancer, which can present its own issues.
Hardware or software? There are some hacks with TPROXY/HAProxy I've seen that will do transparent proxying, but the setup seems like more trouble than it's worth.
IPVS is built into the Linux kernel, and HA projects like keepalived have ipvsadm integration. TPROXY works fine, and has been in the kernel since 2.6.30. In most load-balancing cases, losing the remote IP address isn't that big a deal (you have to deal with NAT too), and a full proxy like haproxy has its benefits.
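For reference, a minimal IPVS setup for passing port 443 straight through looks roughly like this (VIP and backend addresses are made up; -g is direct routing, which keeps the original client IP on the packets):

    # virtual service on the VIP, round-robin scheduling
    ipvsadm -A -t 192.0.2.10:443 -s rr
    # two real servers in direct-routing (gatewaying) mode
    ipvsadm -a -t 192.0.2.10:443 -r 10.0.0.11:443 -g
    ipvsadm -a -t 192.0.2.10:443 -r 10.0.0.12:443 -g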
Why does session affinity not solve the problem of session caching? The author says it's "a whole other world of pain and suffering" but doesn't explain why.
I think that was a bit hyperbolic, but affinity does have drawbacks. Balancing the requests is much easier without affinity, and you don't run the risk of the load ever getting severely skewed. Maintenance is also easier, because you don't have to "drain" the load from a server; you just drop it out of service, and the requests can go on to something else.
The latter point isn't that big a deal if the only reason for affinity is SSL session caching, because you could yank the server even if it has active sessions, and the clients would simply re-establish with the next backend.
I often load-balance SSL using session affinity, and would also like to know whether the author has encountered other issues, or just hasn't looked at haproxy's capabilities in a while.
I guess that it's not easy to implement session affinity at the SSL level. You cannot access information like cookies or other HTTP headers (since they are inside the SSL payload that you're trying to handle). So you have to use the little information you do have: source IP address and port, and session data.
The idea would be to make sure that a given client is always sent to the same SSL handler.
We could imagine having two layers of load balancers:
- first layer would use the source IP address and/or session data to determine which second-layer server the connection should be forwarded to;
- second layer would receive the connection and do the proper SSL handling.
I believe that this would work, but it seems that it would require a custom "half-implementation" of SSL on the first layer of load balancers. I don't know if there is any provision for that in OpenSSL or GNUTLS. Also, since there are already hooks to do session caching in most SSL-enabled servers, using those hooks to plug in a memcached backend seems to be less "disruptive" (read "easier to understand, implement and debug").
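For what it's worth, a recent haproxy can approximate that first layer in pure TCP mode without any SSL implementation at all, by fishing the session ID out of the ClientHello/ServerHello bytes and sticking on it. Roughly (an untested sketch adapted from the usual haproxy examples; addresses are made up):

    backend ssl_servers
        mode tcp
        balance roundrobin
        # remember which backend served a given SSL session ID
        stick-table type binary len 32 size 30k expire 30m
        acl clienthello req_ssl_hello_type 1
        acl serverhello rep_ssl_hello_type 2
        tcp-request inspect-delay 5s
        tcp-request content accept if clienthello
        tcp-response content accept if serverhello
        # the session ID is the length-prefixed field at offset 43 of the hello
        stick on payload_lv(43,1) if clienthello
        stick store-response payload_lv(43,1) if serverhello
        server web1 10.0.0.11:443
        server web2 10.0.0.12:443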
You can balance SSL traffic at the TCP level, using the client's IP to set affinity (see the sketch below). The only real catch is if you have a large base of clients behind a single NAT, but in most cases traffic will balance out pretty well.
Sometimes though, you don't want affinity at all. If you don't care what backend server takes the request, you can balance the load more efficiently, and more easily rotate servers in and out of service.
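A minimal haproxy sketch of that source-IP affinity setup, for comparison (hypothetical addresses):

    listen ssl_passthrough
        bind 0.0.0.0:443
        mode tcp
        option tcplog
        # hash the client IP so a given client keeps hitting the same backend
        balance source
        server web1 10.0.0.11:443 check
        server web2 10.0.0.12:443 check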
Matt, this is awesome. Nginx is becoming the standard for front-end load balancing at many high-traffic sites, and this helps. We only have one nginx front-end load balancer (even though we do over 4000 HTTP req/s), but we'll be migrating to a cluster soon, so I'll give this a whirl.
If you have any visitors at all, you are doing them a huge disservice by not having SSL session caching.
You would use external SSL caching like this if you have more than one SSL termination point (typically a webserver like nginx/Apache) behind a load-balancer.
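And if you only have a single termination point, nginx's stock shared-memory cache needs no external daemon at all -- something like:

    # per-host cache shared between worker processes;
    # roughly 4000 sessions fit per megabyte
    ssl_session_cache    shared:SSL:10m;
    ssl_session_timeout  10m;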