When we moved to a new hosting facility ~5 years ago, I went with just a single public IP as well. It's worked fairly smoothly. We use a load balancer to decide the real destination and forward the traffic on to the actual servers handling the requests. This has worked flawlessly and I'd definitely go that route again in the future.
I always wonder just how much a single load balancer can do. Even if it is just forwarding requests/replies, at something like hundreds of thousands of queries per second, how does it work?
First of all, you’ll be amazed at what hardware load balancers can do.
Secondly, if you want, you can still have multiple load balancers listen on a single IP; it's a bit trickier to pull that off with HTTP load balancing than with plain network load balancing, but not impossible.
Unimog is key to making the deployment described in the paper work.
Another way of thinking about this is that we have a single virtual load balancer (backed by all the physical machines in the DC) sitting in front of services that transform/terminate/filter/etc customer traffic.
Thanks! I am more curious about the physical aspect of it. How do you get multiple switches to share an IP? What determines which switch forwards the packets? Like, simply, what wires connect to what?
Your switches run a routing protocol (almost always BGP) and your load balancers all announce the same IP to them. This results in a routing table with many next hops for the LB IP, and traffic is distributed across them using ECMP, typically with some resilient hashing scheme so that a given flow will always go to the same physical LB, even if there are topology changes due to LB or switch failures.
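If it helps to see the resilient-hashing part concretely, here's a minimal Python sketch of the fixed-bucket scheme many switches use for ECMP; the load balancer names, addresses, and bucket count are all made up for illustration:

    import hashlib

    # Hypothetical next hops: the load balancers all announcing 203.0.113.10 over BGP.
    NEXT_HOPS = ["lb1", "lb2", "lb3", "lb4"]
    NUM_BUCKETS = 256  # fixed-size bucket table, the usual basis for resilient ECMP hashing

    def build_buckets(next_hops, num_buckets=NUM_BUCKETS):
        # Fill the bucket table round-robin; each bucket points at one next hop.
        return [next_hops[i % len(next_hops)] for i in range(num_buckets)]

    def flow_hash(src_ip, src_port, dst_ip, dst_port, proto="tcp"):
        # Hash the 5-tuple so every packet of a flow lands in the same bucket.
        key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}/{proto}".encode()
        return int.from_bytes(hashlib.sha256(key).digest()[:8], "big")

    def pick_next_hop(buckets, *flow):
        return buckets[flow_hash(*flow) % len(buckets)]

    def remove_next_hop(buckets, failed, remaining):
        # Resilient rehash: only the failed hop's buckets get reassigned, so
        # flows pinned to surviving load balancers keep their next hop.
        return [remaining[i % len(remaining)] if hop == failed else hop
                for i, hop in enumerate(buckets)]

    buckets = build_buckets(NEXT_HOPS)
    flow = ("198.51.100.7", 51234, "203.0.113.10", 443)
    print(pick_next_hop(buckets, *flow))   # every packet of this flow gets the same LB

    # lb3 fails: only the buckets that pointed at lb3 move; our flow is
    # undisturbed unless it happened to hash to lb3.
    survivors = [h for h in NEXT_HOPS if h != "lb3"]
    buckets = remove_next_hop(buckets, "lb3", survivors)
    print(pick_next_hop(buckets, *flow))

Contrast that with naive modulo-N hashing, where removing one next hop reshuffles almost every bucket and breaks most in-flight connections.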
At the edges, your switches or border routers announce the IP to all your peers and transit ISPs.
Yes, this. The beauty of the Unimog setup we run in production is we are resilient to ECMP rehash events — each server acting as a load balancer has access to state (backed by Consul) that tells it which flow belongs to which application server.
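To make that concrete, here's a rough Python sketch of the idea only, not Unimog's actual code: the Consul key layout and helper names are invented, and a real data plane would cache this state locally rather than query Consul per packet.

    # Each machine acting as an LB consults the same shared flow-state store,
    # so even if an ECMP rehash delivers a live flow to a different LB, it
    # still forwards the packets to the app server that owns that connection.
    import consul  # python-consul client, assumed available

    c = consul.Consul()  # talks to the local Consul agent on this LB

    def flow_key(src_ip, src_port, dst_ip, dst_port):
        return f"flows/{src_ip}:{src_port}-{dst_ip}:{dst_port}"

    def forward(src_ip, src_port, dst_ip, dst_port, choose_backend):
        key = flow_key(src_ip, src_port, dst_ip, dst_port)
        _, entry = c.kv.get(key)
        if entry is not None:
            # Existing connection: honour whichever app server already owns it,
            # regardless of which LB the switches hashed this packet to.
            return entry["Value"].decode()
        # New connection: pick an app server and record the assignment so any
        # other LB makes the same decision for later packets of this flow.
        backend = choose_backend()
        c.kv.put(key, backend)
        return backend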