Massive disclaimer: I work on NLB.
That's do-able and all, but I kind of didn't hate the old paradigm of having an extra layer there.
If you are going to look at it, attempt time - ~04:50 UTC, remote address from 184.108.40.206/20 network
tcp4 0 0 192.168.100.101.49615 220.127.116.11.80 ESTABLISHED
tcp4 0 0 192.168.100.101.49614 18.104.22.168.80 ESTABLISHED
tcp4 0 0 192.168.100.101.49613 22.214.171.124.80 ESTABLISHED
tcp4 0 0 192.168.100.101.49612 126.96.36.199.80 ESTABLISHED
tcp4 0 0 192.168.100.101.49611 188.8.131.52.80 ESTABLISHED
tcp4 0 0 192.168.100.101.49610 184.108.40.206.80 ESTABLISHED
Each of those will be routed to the same target, so it's up to your browser to decide which to use for what.
Check the 'Type' property mentioned in the CloudFormation templates reference in public documentation http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuid...
But that pricing model:
Bandwidth – 1 GB per LCU.
New Connections – 800 per LCU.
Active Connections – 100,000 per LCU.
An LCU is a new metric for determining how you pay for a Network Load Balancer. An LCU defines the maximum resource consumed in any one of the dimensions (new connections/flows, active connections/flows, and bandwidth) the Network Load Balancer processes your traffic.
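To make the "maximum in any one of the dimensions" wording concrete, here is a minimal sketch of how the LCU count could be computed from the three per-LCU limits quoted above. The function name and the time units (new connections per second, bandwidth per hour) are my assumptions, not something the quoted text spells out.

```python
# Per-LCU limits as quoted in the pricing text above.
# Assumed units: new connections per second, bandwidth in GB per hour.
NEW_CONNS_PER_LCU = 800
ACTIVE_CONNS_PER_LCU = 100_000
GB_PER_LCU = 1.0


def lcus_consumed(new_conns_per_sec, active_conns, gb_per_hour):
    """Return the LCUs consumed: the largest of the three per-dimension ratios."""
    return max(
        new_conns_per_sec / NEW_CONNS_PER_LCU,
        active_conns / ACTIVE_CONNS_PER_LCU,
        gb_per_hour / GB_PER_LCU,
    )


# Hypothetical workload: bandwidth is the dominant dimension here.
print(lcus_consumed(new_conns_per_sec=400, active_conns=20_000, gb_per_hour=3))
```

Only the largest dimension is billed, so a workload that is heavy on bandwidth but light on connections pays for bandwidth alone.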
A large number of services are not even featured on the calculator as an option (e.g. Lambda)
# ab -n 400 http://nlb-34dc3b430638dc3e.elb.us-west-2.amazonaws.com/
Time per request: 108.779 [ms] (mean, across all concurrent requests)
# ab -n 400 <public server via ELB>
# ab -n 400 <public server via ALB>
# (for reference) ab -n 400 https://www.google.com/
# (for reference) ab -n 400 https://sandbox-api.uber.com/health/
Time per request: 88.400 [ms] (mean, across all concurrent requests)
Time per request: 415.859 [ms] (mean, across all concurrent requests)
$ ab -n 400 https://www.google.com/
Time per request: 168.438 [ms] (mean, across all concurrent requests)
Unfortunately, it lacks a very significant existing feature of ELB: SSL/TLS termination. It's very convenient to manage the certs in AWS without having to deploy them to dedicated EC2 instances.
If you expect an instantaneous load of more than about 5Gbit/sec, in those situations we work directly with customers via AWS Support. We really try to understand the load, make sure that the right mechanisms are in place. At that scale, our internal DDOS mitigation systems also come into play. (It's not a constraint of NLB).
The load test in the blog post was done with an NLB, and was done with no pre-provisioning or pre-warming and allowed us to get to 3M RPS and 30Gbit/sec, which is when we exhausted the capacity of our test backends.
ALBs start out with less capacity, and are constrained more by requests than bandwidth. I don't have a precise number because it depends on how many rules you have configured and which TLS ciphers are negotiated by your clients, but the numbers are high enough that customers routinely use ALBs to handle real-world spiky workloads, including supporting Super Bowl ads and flash sales.
Each ALB can scale into the tens of gigabits/sec before needing to shard. ALB also has a neat trick up its sleeve: if you add backends, we scale up, even if there's no traffic. We assume the backends are there to handle expected load. So in that case it has "more" capacity than the backends behind it. That goes a long way to avoiding some of the scaling issues that impacted ELB early in its history.
If you have a workload that you're worried about, feel free to reach out to me and we'll be happy to work with you. colm _AT_ amazon.com.
This is painful, though. I don't know about Sep 2017, but in 2016 upscaling your ELB involved answering a block of 21 or so questions in a list only given to you after you engage with support, which had some pretty esoteric items on it. It was decidedly un-AWS-ey.
First things first, with NLB our experience is that pre-warming is never necessary. Each NLB starts out with a huge volume of capacity, beyond the needs of even the largest systems we support, and each NLB can theoretically scale to terabits of traffic.
Our first big improvement for ALB was that the basic scalability and performance of ALB is at a point that in all but a very small number of cases (think of some of the busiest services in the world), customers don't need to do anything. This is the pay-off from a lot of hard work focused on low-level performance.
Our second big improvement, for both ALB and Classic ELBs, was a mix of more generous capacity buffers and proactive, responsive scaling systems. Together, these mean that we can race ahead of our customers' load requirements.
Another item that's helped is that for the truly big scaling cases, which is DDOS preparedness, we now have the AWS Shield service to manage that process in consultation with the customer. That's useful if your needs are more nuanced and custom than the DDOS protection that is included with ELB by default. This gets into things such as how your application is configured to handle the load.
With all of these improvements, ALB does not require pre-warming for the vast majority of real-world workloads. However, after years of pre-warming as a thing, we have customers who have incorporated it into their operational workflows, or who rely on it for extra peace of mind. We do want to continue to support that for our customers.
We try really really hard to preserve connections and the system is designed to keep connections healthy even for months and years.
"Yes, while you should see performance increases on ALB in general, for major traffic surges we still recommend a pre-warm"
> Beginning at 1.5 million requests per second, they quickly turned the dial all the way up, reaching over 3 million requests per second and 30 Gbps of aggregate bandwidth before maxing out their test resources.
In this case, it seems like the one-size-fits-all ELB has been replaced by ALB for those who use containers, want L7 LB, and don't need insanity-scale, and by NLB for those who want massive scale, a dumb pipe and/or need consistent IPs. They could have tried to build these features into ELB, but they didn't; they deliberately created new nomenclature to get rid of baggage.
Also see: SimpleDB -> DynamoDB. EC2 Classic -> VPCs.
The blog post says there's one static ip per zone. I suppose www.mydomain should have multiple A records each pointing to an elastic ip in a zone. What happens when one zone entirely fails? Does it need a DNS change at this point? Or does the NLB have a different IP with which it can do BGP failover?
- One record per zone (which maps to the EIP for that zone)
- A top-level record that includes all active zones (these are all zones you have registered targets in, IIRC)
The latter record is health checked, so if an AZ goes down, it'll stop advertising it automatically (there will be latency of course, so you'll have some clients connecting to a dead IP, but if we're talking unplanned AZ failure, that's sort of expected).
That said, this does mean you probably shouldn't advertise the IPs directly if you can avoid it, yes.
(disclaimer: we evaluated NLB during their beta, so some of this information might be slightly outdated / inaccurate)
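A rough sketch of the two record types described above, assuming one EIP per zone and a health-checked top-level record. The zone names, the documentation-range IPs, and the function name are all made up for illustration; this is not how NLB implements it internally.

```python
def top_level_record(zone_eips, healthy_zones):
    """Return the IPs a health-checked top-level record would advertise.

    zone_eips maps zone name -> that zone's EIP (the per-zone record).
    Zones that fail their health check are left out of the answer.
    """
    return [ip for zone, ip in sorted(zone_eips.items()) if zone in healthy_zones]


# Hypothetical per-zone records (IPs from the 203.0.113.0/24 documentation range).
zone_eips = {
    "us-west-2a": "203.0.113.10",
    "us-west-2b": "203.0.113.20",
    "us-west-2c": "203.0.113.30",
}

# All zones healthy: every EIP is advertised.
print(top_level_record(zone_eips, {"us-west-2a", "us-west-2b", "us-west-2c"}))

# us-west-2b fails: its EIP drops out of the top-level record.
print(top_level_record(zone_eips, {"us-west-2a", "us-west-2c"}))
```

Clients that resolved the top-level name before the failure keep the dead IP until their cached answer expires, which is the latency mentioned above.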
I thought one of the advantages of multiple zones is that zonal failover can happen with "zero" downtime (this seems to be the case with Amazon RDS).
We do also withdraw an IP from DNS if it fails; when we measure it, we see that over 99% of clients and resolvers do honor TTLs and the change is effected very quickly. We've been using this same process for www.amazon.com for a long time.
Contrast to an alternative like BGP anycast, where it can take minutes for an update to propagate as BGP peers share it with each other in sequence.
"Because the underlying IP address of a DB instance can change after a failover, caching the DNS data for an extended time can lead to connection failures if your application tries to connect to an IP address that no longer is in service."
I was aiming to use static IPs (for client firewall rules), and simplify networking configuration, so what I ended up with is an auto-scaling group of HAProxy systems that run a script every couple of minutes to assign themselves an elastic IP from a provided list. Route 53 is configured with health checks to only return the IP(s) that are working.
The HAProxy instances also continuously read their target auto-scaling groups to update backend config, and handle SSL termination, running the Let's Encrypt client as well. Most services are routed by host name, but a couple of older ones are path-based and there are some 301 redirects.
I think NLB could replace the elastic IP and route53 part of this setup, but I'd still need to do SSL, routing, and backends. It's too bad, because my setup is one that could be used nearly anywhere that has more than one public-facing service, but there's not much built-in to help - I had to write quite a few scripts to get everything I needed.
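For anyone curious about the self-assignment step, the selection logic could look roughly like this. It is a sketch and not the actual script: the function name is hypothetical, and the input is assumed to have the shape of the `Addresses` list returned by the EC2 DescribeAddresses API (entries carry `PublicIp`, `AllocationId`, and an `AssociationId` only while attached).

```python
def pick_free_eip(allowed_ips, addresses):
    """Return the AllocationId of the first allowed EIP with no association, else None.

    allowed_ips: the provided list of elastic IPs the proxies may claim.
    addresses:   list of dicts shaped like DescribeAddresses "Addresses" entries.
    """
    allowed = set(allowed_ips)
    for addr in addresses:
        # An entry without "AssociationId" is not attached to any instance.
        if addr.get("PublicIp") in allowed and "AssociationId" not in addr:
            return addr.get("AllocationId")
    return None


# Hypothetical data: the first EIP is taken, the second is free.
addresses = [
    {"PublicIp": "203.0.113.10", "AllocationId": "eipalloc-aaa", "AssociationId": "eipassoc-1"},
    {"PublicIp": "203.0.113.20", "AllocationId": "eipalloc-bbb"},
]
print(pick_free_eip(["203.0.113.10", "203.0.113.20"], addresses))
```

The instance would then associate the returned allocation with itself; two instances racing for the same free IP is something a real script has to tolerate, which this sketch ignores.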
Still waiting on that fix, AWS.
You can see this happening by looking at the "Active Connection Count" graphs from your ALB, and adding or removing an instance from an ASG.
At 30+ Gbps and over 20k RPS, removing one instance can cause absolute chaos.
From your description it may be that you have long lived connections that build up over time, at a rate that targets can easily handle, but that the re-connect spikes associated with a target failure/withdrawal are too intense. This is a challenge I've seen with web sockets: imagine building up 100,000 mostly-idle web sockets slowly over time, even a modest pair of backends can handle this. But then a backend fails, and 50,000 connections come storming in at once!
Another scenario is adding an "idle" target to a busy workload, but it not being able to handle the increased rate of new connections it will get. Software that relies on caching (including things like internal object caches) often can handle a slow ramp-up, but not a sudden rush.
We're currently experimenting with algorithms that allow customers to more slowly ramp-up the incoming rate of connections in these kinds of scenarios.
Anyway, those are guesses, so I may be wrong about your case, but hopefully the information is still useful to others reading.
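The two scenarios above can be put into numbers with a small sketch. Both functions are illustrative only: the first assumes connections are spread evenly across backends, and the second is a plain linear ramp that I picked for simplicity, not the algorithm being experimented with.

```python
def reconnect_storm(total_connections, backends, failed=1):
    """Connections that reconnect at once when `failed` backends drop out,
    assuming connections were spread evenly across all backends."""
    return total_connections * failed // backends


def ramp_weight(seconds_since_added, ramp_seconds):
    """Fraction of a full share of new connections a freshly added target
    receives under a linear ramp-up (0.0 at registration, 1.0 once warmed up)."""
    if ramp_seconds <= 0:
        return 1.0
    return min(1.0, max(0.0, seconds_since_added / ramp_seconds))


# The web socket example: 100,000 idle sockets on a pair of backends, one fails.
print(reconnect_storm(100_000, backends=2))

# A new target halfway through a hypothetical 60-second ramp.
print(ramp_weight(30, ramp_seconds=60))
```

The point of the ramp is that a cold cache sees a fraction of its eventual connection rate at first, instead of the full rate the moment the target registers.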
Also, did I read it wrong, or is this actually cheaper than ALB?
Awesome! Can't wait to dig in more!
edit: Okay, it's 800 new connections per second, per the ELB pricing page, under "LCU details". The cost for 80k connections in an hour is effectively constrained by the bandwidth, e.g. if there's very low bandwidth it's $0.006/hour or $4.32/month.
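The arithmetic behind those figures, as a quick sketch. It reads $0.006 as the price of one LCU-hour and uses a 30-day month, which is what makes $4.32 come out; both are my reading of the numbers above rather than anything official.

```python
PRICE_PER_LCU_HOUR = 0.006      # USD, figure quoted above
HOURS_PER_MONTH = 24 * 30       # assumed 30-day month
NEW_CONNS_PER_SEC_PER_LCU = 800


def monthly_cost(lcus, price=PRICE_PER_LCU_HOUR, hours=HOURS_PER_MONTH):
    """Monthly LCU charge in USD for a constant LCU consumption."""
    return lcus * price * hours


def new_conn_lcus(conns_per_hour):
    """LCUs consumed on the new-connections dimension alone."""
    return conns_per_hour / 3600 / NEW_CONNS_PER_SEC_PER_LCU


print(round(monthly_cost(1.0), 2))      # one full LCU all month
print(round(new_conn_lcus(80_000), 4))  # 80k connections spread over an hour
```

80k connections per hour is only about 22 per second, a small fraction of one LCU on that dimension, which is why bandwidth ends up being the number that matters.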
When you bring in a new server to a busy workload, it gets all of the new connections. That can put a lot of pressure on that host - and if the application relies on caching, which most do in some form, it can really make performance terrible and even trigger cascading failures.
Another problem is that if a single host is broken or misconfigured, throwing 500 errors, it is often also the fastest and lowest CPU box, because that isn't very expensive. It can suck in and blackhole all of the traffic.
Based on how these issues work out at scale, we've moved beyond simple load based load balancing (I know that sounds counter-intuitive) and into algorithms that try to achieve a better balance for a wider range of scenarios.
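The blackhole effect is easy to reproduce in a toy simulation. The sketch below routes each request to the backend with the fewest in-flight requests, a simple load-based policy chosen for illustration (not a description of what ELB did or does), with one "broken" backend that answers in 1 ms while the healthy ones take 100 ms. All numbers are made up.

```python
import heapq


def simulate_least_outstanding(service_ms, n_requests, arrival_interval_ms):
    """Send each arriving request to the backend with the fewest in-flight
    requests (ties go to the lowest index). Returns requests per backend."""
    in_flight = [[] for _ in service_ms]  # per-backend heap of completion times
    received = [0] * len(service_ms)
    for i in range(n_requests):
        now = i * arrival_interval_ms
        # Retire requests that have completed by now.
        for queue in in_flight:
            while queue and queue[0] <= now:
                heapq.heappop(queue)
        target = min(range(len(service_ms)), key=lambda b: (len(in_flight[b]), b))
        heapq.heappush(in_flight[target], now + service_ms[target])
        received[target] += 1
    return received


# Two healthy backends (100 ms) and one broken backend that fails fast (1 ms).
counts = simulate_least_outstanding([100, 100, 1], n_requests=1000, arrival_interval_ms=10)
print(counts)  # the last entry is the fast-failing backend
```

Because the broken backend is almost always idle, it looks like the least loaded target and takes the large majority of the requests, even though every one of them fails.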
Why would one reinvent the wheel when Linux already lets you do that?
In that case it would be C since it's implemented in the Linux kernel.
This appears to be Layer 4 load balancing; IPVS is more Layer 3.
So the person who was voted down for mentioning HAProxy might not be too far off. It could be implemented through HAProxy with TPROXY enabled in the kernel. Then just make sure that the default gateway configured on the targets routes back to the load balancer, or is the load balancer.
No, IPVS is L4 load balancing. L3 load balancing would be a routing protocol plus ECMP.
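To illustrate the ECMP half of that, here is a minimal sketch of per-flow next-hop selection: hash the 5-tuple and pick one of the equal-cost paths. Real routers use their own hardware hash functions; SHA-256 is only a stand-in here, and the function name and addresses are hypothetical.

```python
import hashlib


def ecmp_next_hop(src_ip, src_port, dst_ip, dst_port, proto, next_hops):
    """Pick a next hop for a flow by hashing its 5-tuple.

    Every packet of the same flow hashes to the same next hop, so a TCP
    connection stays on one path without any per-connection state.
    """
    key = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}-{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return next_hops[int.from_bytes(digest[:8], "big") % len(next_hops)]


# Hypothetical equal-cost next hops (documentation-range addresses).
hops = ["198.51.100.1", "198.51.100.2", "198.51.100.3"]
print(ecmp_next_hop("192.0.2.7", 49615, "203.0.113.10", 80, "tcp", hops))
```

The router never looks past the packet headers and keeps no connection table, which is the distinction from an L4 balancer like IPVS that tracks connections.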
This is heavily used by Facebook for the load balancers on their racks.