You can use direct server return to manipulate the Ethernet frames so that packets don't travel back through the load balancer on the way to the parent switch.
That generally requires config on the serving hosts, which wasn't mentioned in the setup. I think I saw a reference to adding hosts with a different port number than the service port as well. For people in EC2-VPC (not Classic), all their traffic is going through an Amazon NAT anyway; perhaps this new service is setting up translations there. (Note all the references to VPC, and never a mention of EC2-Classic.)
Do you have a support POC? I'd reach out to them as they should be able to provide you with a roadmap update. If you're not sure who that is, you can reach out to me at kozlowck at amazon.com.
Is there any intent to add TLS termination? That’s a dealbreaker for us switching from the classic load balancer.
Otherwise this looks really awesome, thanks!
I don't think they can add TLS termination because of the way it's implemented. NLB runs at Layer 4, the transport layer, where TCP/UDP operate. TLS technically runs on top of the transport layer.
That’s kind of the answer I was expecting, just hoping it wasn’t the case. From the marketing material they really want you to move, but not having a solution to offload TLS makes it impossible for us. And it worries me to see the CLB getting effectively deprecated without an alternative.
I'm hopeful AWS will follow this up with ACM supporting SSL certs on instances, so you can run a LetsEncrypt equivalent on each instance, providing end-to-end TLS encryption.
That threw me for a bit of a loop as well. This means that responsibility for ACL whitelisting at the edge is now moved from the actual edge to the security groups on the actual servers responding to requests, right?
That's do-able and all, but I kind of didn't hate the old paradigm of having an extra layer there.
One way to think of NLB is that it's an Elastic IP address that happens to go to multiple instances or containers, instead of just one. Everything else stays the same.
Yeah, it's easy to use it like that for now. I hope they update it later on though. Seems like a missing feature in their otherwise nice firewall rules setup.
The demo page states "Your browser may keep a connection open for a few seconds and re-use it for a reloaded request. If it does, you'll get the same target", but when I attempted to abuse the power of F5, I alternated between ice cream and bumblebee.
If you are going to look at it, attempt time - ~04:50 UTC, remote address from 88.119.128.0/20 network
Browsers typically use a few connections to load a page so that it can load faster. Each of those connections has a different source port, and thus may route to a different target.
In Colm's demo, it depends on which connection your browser uses when requesting the CSS that decorates the object.
In my Chrome on Mac I see 6 TCP connections to the demo NLB
tcp4 0 0 192.168.100.101.49615 54.69.111.179.80 ESTABLISHED
tcp4 0 0 192.168.100.101.49614 54.69.111.179.80 ESTABLISHED
tcp4 0 0 192.168.100.101.49613 54.69.111.179.80 ESTABLISHED
tcp4 0 0 192.168.100.101.49612 54.69.111.179.80 ESTABLISHED
tcp4 0 0 192.168.100.101.49611 54.69.111.179.80 ESTABLISHED
tcp4 0 0 192.168.100.101.49610 54.69.111.179.80 ESTABLISHED
Each of those will be routed to the same target, so it's up to your browser to decide which to use for what.
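To make the per-connection behavior concrete, here's a toy sketch of hash-based flow routing in Python. This is purely illustrative; AWS hasn't published NLB's actual algorithm, and the target names and addresses below are made up.

    import hashlib

    # Hypothetical pool of backend targets behind the load balancer.
    TARGETS = ["ice-cream-10.0.1.12", "bumblebee-10.0.2.34"]

    def pick_target(src_ip, src_port, dst_ip, dst_port, proto="tcp"):
        """Pick a backend by hashing the connection's 5-tuple.

        Every packet of a TCP connection shares the same 5-tuple, so the
        whole connection lands on one target; a new connection from a
        different source port may land on a different one.
        """
        flow = f"{proto}:{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
        digest = int(hashlib.sha256(flow).hexdigest(), 16)
        return TARGETS[digest % len(TARGETS)]

    # The six browser connections from the netstat output above: same client
    # IP, different ephemeral source ports, so they can map to different targets.
    for port in range(49610, 49616):
        print(port, "->", pick_target("192.168.100.101", port, "54.69.111.179", 80))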
CloudFormation, CodeDeploy, and ECS all support NLB today :) I used the console to create the simple demo though, so I don't have a template to recreate it. Sorry!
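For anyone who would rather script it than click through the console, something along these lines should work with boto3 (an untested sketch; the subnet, VPC, and Elastic IP allocation IDs are placeholders):

    import boto3

    # Placeholder IDs -- substitute your own subnet, VPC, and Elastic IP allocation.
    SUBNET_ID = "subnet-0123456789abcdef0"
    VPC_ID = "vpc-0123456789abcdef0"
    EIP_ALLOCATION_ID = "eipalloc-0123456789abcdef0"

    elbv2 = boto3.client("elbv2", region_name="us-west-2")

    # Create the Network Load Balancer with a static Elastic IP in one AZ.
    nlb = elbv2.create_load_balancer(
        Name="demo-nlb",
        Type="network",
        Scheme="internet-facing",
        SubnetMappings=[{"SubnetId": SUBNET_ID, "AllocationId": EIP_ALLOCATION_ID}],
    )
    nlb_arn = nlb["LoadBalancers"][0]["LoadBalancerArn"]

    # TCP target group on port 80 -- NLB listeners and health checks are L4.
    tg = elbv2.create_target_group(
        Name="demo-targets", Protocol="TCP", Port=80, VpcId=VPC_ID
    )
    tg_arn = tg["TargetGroups"][0]["TargetGroupArn"]

    # TCP listener that forwards straight to the target group.
    elbv2.create_listener(
        LoadBalancerArn=nlb_arn,
        Protocol="TCP",
        Port=80,
        DefaultActions=[{"Type": "forward", "TargetGroupArn": tg_arn}],
    )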
It looks like the deep linking to the LCU page doesn't work (you have to click the tab for Network Load Balancer), so here's what an LCU is from that page:
---
An LCU is a new metric for determining how you pay for a Network Load Balancer. An LCU measures the maximum resource consumed across the dimensions (new connections/flows, active connections/flows, and bandwidth) on which the Network Load Balancer processes your traffic.
Seems to remarkably decrease latency (380ms -> 109ms). Running some tests:
# ab -n 400 http://nlb-34dc3b430638dc3e.elb.us-west-2.amazonaws.com/
Time per request: 108.779 [ms] (mean, across all concurrent requests)
# ab -n 400 <public server via ELB>
Time per request: 381.933 [ms]
# ab -n 400 <public server via ALB>
Time per request: 380.632 [ms]
# (for reference) ab -n 400 https://www.google.com/
Time per request: 190.536 [ms]
# (for reference) ab -n 400 https://sandbox-api.uber.com/health/
Time per request: 107.680 [ms]
If you're willing to terminate SSL yourself, this looks like it could be a solid improvement.
Static IP, source IP, and zonality are game changing.
Unfortunately, it lacks a very significant existing feature of ELB: SSL/TLS termination. It's very convenient to manage the certs in AWS without having to deploy them to dedicated EC2 instances.
I expect the health-checking happens independently of the routing. The routing will just act upon a list of routes which is modified independently by the health check.
This was discussed a couple of times recently, but answers seemed contradictory. ELB requires pre-warming if you expect sudden high load. But do ALB and NLB?
Each NLB starts out with several gigabits of capacity per availability zone, and it scales horizontally from there (theoretically to terabits). That's more capacity than many of the busiest websites and web services in the world need.
If you expect an instantaneous load of more than about 5Gbit/sec, in those situations we work directly with customers via AWS Support. We really try to understand the load and make sure that the right mechanisms are in place. At that scale, our internal DDOS mitigation systems also come into play. (It's not a constraint of NLB.)
The load test in the blog post was done with an NLB, and was done with no pre-provisioning or pre-warming and allowed us to get to 3M RPS and 30Gbit/sec, which is when we exhausted the capacity of our test backends.
ALBs start out with less capacity, and are constrained more by requests than bandwidth. I don't have a precise number because it depends on how many rules you have configured and which TLS ciphers are negotiated by your clients, but the numbers are high enough that customers routinely use ALBs to handle real-world spiky workloads, including supporting Super Bowl ads and flash sales.
Each ALB can scale into the tens of gigabits/sec before needing to shard. ALB also has a neat trick up its sleeve: if you add backends, we scale up, even if there's no traffic. We assume the backends are there to handle expected load. So in that case it has "more" capacity than the backends behind it. That goes a long way to avoiding some of the scaling issues that impacted ELB early in its history.
If you have a workload that you're worried about, feel free to reach out to me and we'll be happy to work with you. colm _AT_ amazon.com.
> [upgrading ELB], in those situations we work directly with customers via AWS Support.
This is painful, though. I don't know about Sep 2017, but in 2016 upscaling your ELB involved answering a block of 21 or so questions in a list only given to you after you engage with support, which had some pretty esoteric items on it. It was decidedly un-AWS-ey.
It was definitely un-AWS-ey, and there's an almost visceral pain response on our faces when we don't have a self-service API for something. This is improving all of the time and I think is already much better, with some big specific improvements ... I'll do my best to share what I can here.
First things first, with NLB our experience is that pre-warming is never necessary. Each NLB starts out with a huge volume of capacity, beyond the needs of even the largest systems we support, and each NLB can theoretically scale to terabits of traffic.
Our first big improvement for ALB was that the basic scalability and performance of ALB is at a point that in all but a very small number of cases (think of some of the busiest services in the world), customers don't need to do anything. This is the pay-off from a lot of hard work focused on low-level performance.
Our second big improvement, for both ALB and Classic ELBs, was a mix of more generous capacity buffers and pro-active, responsive scaling systems. Together, these mean that we can race ahead of our customers' load requirements.
Another item that's helped is that for the truly big scaling cases, which are mostly about DDOS preparedness, we now have the AWS Shield service to manage that process in consultation with the customer. That's useful if your needs are more nuanced and custom than the DDOS protection that is included with ELB by default. This gets into things such as how your application is configured to handle the load.
With all of these improvements, ALB does not require pre-warming for the vast majority of real-world workloads. However, after years of pre-warming as a thing, we have customers who have incorporated it into their operational workflows, or who rely on it for extra peace of mind. We do want to continue to support that for our customers.
Yes, all packets from a TCP connection will keep going to the backend that was chosen when the connection first came in. This is preserved even if other backends are added to or removed from the set of eligible targets. It's also preserved if the backend itself becomes unhealthy but the connection isn't impacted. For example, if the backend application stops listening for new connections, existing ones will continue to work.
We try really really hard to preserve connections and the system is designed to keep connections healthy even for months and years.
Not sure about ALB, but from the linked blog post for NLB:
> Beginning at 1.5 million requests per second, they quickly turned the dial all the way up, reaching over 3 million requests per second and 30 Gbps of aggregate bandwidth before maxing out their test resources.
Seems like this limitation has been around for a long time (since 2009). Curious to know how everyone has been using ELB. To me, ELB seems like an unfinished product. It must be painful to first predict heavy load on your application and then notify AWS well in advance.
It's almost like AWS released what they thought was the best product at the time, regretted some of their decisions and then launched other product(s) to replace it....
In this case, it seems like the one-size-fits-all ELB has been replaced by ALB for those using containers, who want L7 LB, and don't need insanity-scale, and by NLB for those who want massive scale, a dumb pipe, and/or consistent IPs. They could have tried to build these features into ELB, but they didn't; they deliberately created new nomenclature to get rid of baggage.
Also see: SimpleDB -> DynamoDB. EC2 Classic -> VPCs.
The blog post says there's one static IP per zone. I suppose www.mydomain should have multiple A records, each pointing to an Elastic IP in a zone. What happens when one zone entirely fails? Does it need a DNS change at this point? Or does the NLB have a different IP with which it can do BGP failover?
AWS provides you with a number of DNS records for each NLB:
- One record per zone (which maps to the EIP for that zone)
- A top-level record that includes all active zones (these are all zones you have registered targets in, IIRC)
The latter record is health checked, so if an AZ goes down, it'll stop advertising it automatically (there will be latency of course, so you'll have some clients connecting to a dead IP, but if we're talking unplanned AZ failure, that's sort of expected).
That said, this does mean you probably shouldn't advertise the IPs directly if you can avoid it, yes.
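If you want to see the multiple-A-record behavior for yourself, resolving the top-level name returns one address per active zone. A quick Python check (the NLB name here is made up):

    import socket

    # Hypothetical NLB DNS name -- substitute your own.
    NLB_DNS_NAME = "my-nlb-1234567890abcdef.elb.us-west-2.amazonaws.com"

    # getaddrinfo returns every A record the resolver hands back, i.e. one
    # static IP per availability zone that currently has healthy targets.
    addresses = {info[4][0] for info in socket.getaddrinfo(NLB_DNS_NAME, 80, socket.AF_INET)}
    for ip in sorted(addresses):
        print(ip)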
(disclaimer: we evaluated NLB during their beta, so some of this information might be slightly outdated / inaccurate)
The default answer includes multiple A records, so if clients can't reach one of the IPs, they try another. There's no need for anything to propagate for that to kick in, it's just ordinary client retry behavior.
We do also withdraw an IP from DNS if it fails; when we measure it, we see that over 99% of clients and resolvers do honor TTLs and the change is effected very quickly. We've been using this same process for www.amazon.com for a long time.
Contrast to an alternative like BGP anycast, where it can take minutes for an update to propagate as BGP peers share it with each other in sequence.
RDS failover still uses DNS and you still need to be aware of client TTLs:
"Because the underlying IP address of a DB instance can change after a failover, caching the DNS data for an extended time can lead to connection failures if your application tries to connect to an IP address that no longer is in service."
I assume they intend for you to use Route53 on top of this. You could use a combination of geolocation routing and failovers to set it up so that by default people are routed to their nearest region, but if that region is currently offline send them somewhere else instead.
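A rough sketch of what that could look like with boto3 (the hosted zone ID, health check IDs, and addresses are placeholders, and a real setup might use alias records pointing at the NLBs instead of raw A records):

    import boto3

    route53 = boto3.client("route53")

    # Placeholders -- substitute your hosted zone and Route 53 health check IDs.
    HOSTED_ZONE_ID = "Z0123456789EXAMPLE"

    def geo_record(set_id, continent, ips, health_check_id):
        """A geolocation A record that Route 53 serves only while its health check passes."""
        return {
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "www.example.com",
                "Type": "A",
                "SetIdentifier": set_id,
                "GeoLocation": {"ContinentCode": continent},
                "TTL": 60,
                "HealthCheckId": health_check_id,
                "ResourceRecords": [{"Value": ip} for ip in ips],
            },
        }

    # North American clients get the us-west-2 NLB's EIPs, European clients the
    # eu-west-1 ones; a catch-all default-location record would normally be added too.
    route53.change_resource_record_sets(
        HostedZoneId=HOSTED_ZONE_ID,
        ChangeBatch={"Changes": [
            geo_record("us-west-2", "NA", ["203.0.113.10", "203.0.113.11"],
                       "11111111-1111-1111-1111-111111111111"),
            geo_record("eu-west-1", "EU", ["198.51.100.10", "198.51.100.11"],
                       "22222222-2222-2222-2222-222222222222"),
        ]},
    )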
I have just finished setting up a new front-end for a few services (we are just about to start migrating production systems to it).
I was aiming to use static IPs (for client firewall rules), and simplify networking configuration, so what I ended up with is an auto-scaling group of HAProxy systems that run a script every couple of minutes to assign themselves an elastic IP from a provided list. Route 53 is configured with health checks to only return the IP(s) that are working.
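The self-assignment script boils down to something like this (a rough boto3 sketch; the allocation IDs are placeholders and the instance ID lookup via instance metadata is just one way to do it):

    import boto3
    import urllib.request

    # The pool of Elastic IP allocation IDs handed to the auto-scaling group (placeholders).
    EIP_ALLOCATION_IDS = ["eipalloc-aaaa", "eipalloc-bbbb", "eipalloc-cccc"]

    ec2 = boto3.client("ec2", region_name="us-west-2")

    # Ask the instance metadata service which instance this is.
    instance_id = urllib.request.urlopen(
        "http://169.254.169.254/latest/meta-data/instance-id", timeout=2
    ).read().decode()

    # Look up which EIPs in the pool are still unattached.
    addresses = ec2.describe_addresses(AllocationIds=EIP_ALLOCATION_IDS)["Addresses"]
    already_mine = any(a.get("InstanceId") == instance_id for a in addresses)
    free = [a for a in addresses if "InstanceId" not in a]

    # Grab the first free EIP unless this instance already holds one.
    if not already_mine and free:
        ec2.associate_address(AllocationId=free[0]["AllocationId"], InstanceId=instance_id)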
The HAProxy instances also continuously read their target auto-scaling groups to update backend config and do SSL termination, also running the Let's Encrypt client. Most services are routed by host name, but a couple of older ones are path-based, and there are some 301 redirects.
I think NLB could replace the elastic IP and route53 part of this setup, but I'd still need to do SSL, routing, and backends. It's too bad, because my setup is one that could be used nearly anywhere that has more than one public-facing service, but there's not much built-in to help - I had to write quite a few scripts to get everything I needed.
No chance that I'll jump into another new load balancer product from Amazon any time soon. ALB has significant deficiencies that AWS don't warn you about, and you only find them at tens of thousands of RPS.
Sure. Whenever a "config change" (note: this includes adding or removing targets in a target group, e.g. autoscaling) happens on an ALB, the ALB drops all active connections and re-establishes them at once. At high load, this obviously causes significant load spikes on any underlying service.
You can see this happening by looking at the "Active Connection Count" graphs from your ALB, and adding or removing an instance from an ASG.
At 30+ Gbps and over 20k RPS, removing one instance can cause absolute chaos.
Wow that sounds awful - but thankfully this isn't typical. I'm going to go digging for a case-id/issue and see what's going on myself (please e-mail the case if you have one). Re-configurations are routine and graceful.
From your description it may be that you have long lived connections that build up over time, at a rate that targets can easily handle, but that the re-connect spikes associated with a target failure/withdrawal are too intense. This is a challenge I've seen with web sockets: imagine building up 100,000 mostly-idle web sockets slowly over time, even a modest pair of backends can handle this. But then a backend fails, and 50,000 connections come storming in at once!
Another scenario is adding an "idle" target to a busy workload, but it not being able to handle the increased rate of new connections it will get. Software that relies on caching (including things like internal object caches) often can handle a slow ramp-up, but not a sudden rush.
We're currently experimenting with algorithms that allow customers to more slowly ramp-up the incoming rate of connections in these kinds of scenarios.
Anyway, those are guesses, so I may be wrong about your case, but hopefully the information is still useful to others reading.
This is perfect. Need something like this to load balance A records for dynamic domains off apex. We can most likely use the static IP address perfectly for this.
Also, did I read it wrong, or is this actually cheaper than ALB?
I don't understand the pricing model. 800 new connections per hour, for $0.006? Isn't that extremely expensive? 80,000 connections for $0.60 in an hour is $432 per month for not a whole lot of traffic.
edit: Okay, it's 800 new connections per second, per the ELB pricing page, under "LCU details". The cost for 80k connections in an hour is effectively constrained by the bandwidth; e.g. if there's very low bandwidth it's $0.006/hour, or $4.32/month.
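Here's a rough worked example of the max-of-dimensions math as I understand it (the per-LCU dimension sizes below are what I read off the pricing page, so treat them as assumptions and double-check):

    # Per-LCU dimension sizes as I read them off the NLB pricing page (assumed):
    NEW_CONN_PER_SEC_PER_LCU = 800
    ACTIVE_CONN_PER_LCU = 100_000
    GB_PER_HOUR_PER_LCU = 1.0
    PRICE_PER_LCU_HOUR = 0.006  # us-east-1 at launch

    def lcu_cost_per_hour(new_conn_per_sec, active_conn, gb_per_hour):
        """You're billed for whichever dimension consumes the most LCUs."""
        lcus = max(
            new_conn_per_sec / NEW_CONN_PER_SEC_PER_LCU,
            active_conn / ACTIVE_CONN_PER_LCU,
            gb_per_hour / GB_PER_HOUR_PER_LCU,
        )
        return lcus * PRICE_PER_LCU_HOUR

    # 80,000 new connections spread over an hour is only ~22/sec, a tiny
    # fraction of an LCU on that dimension, so in this example the bandwidth
    # dimension is the one that actually drives the charge.
    print(lcu_cost_per_hour(new_conn_per_sec=80_000 / 3600,
                            active_conn=5_000,
                            gb_per_hour=1.0))  # -> 0.006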
I was just wondering if this is something purely developed inside Amazon or if it's backed by an ADC like NetScaler or F5. Does anyone know any details? I'm assuming that the Classic Load Balancer is some third-party or old framework and this is something Amazon developed internally.
I would assume that it's something developed internally at Amazon. Networking inside of AWS isn't standard fare, and I doubt something like NetScaler or F5 products could be used. Generally speaking, they aren't using TCP/IP behind the curtain to move packets between nodes. AWS has even created their own routing hardware/software because no other company could do what they need at the scale that they need. See this video for more information: https://www.youtube.com/watch?v=St3SE4LWhKo
AWS & Amazon use a LOT of routers from one of these vendors, although they would probably try and avoid baking them into a public-facing product like this.
in case anyone is confused about this sub-thread:
"""
Application Load Balancers provide native support for HTTP/2 with HTTPS listeners. You can send up to 128 requests in parallel using one HTTP/2 connection. The load balancer converts these to individual HTTP/1.1 requests and distributes them across the healthy targets in the target group using the round robin routing algorithm. Because HTTP/2 uses front-end connections more efficiently, you might notice fewer connections between clients and the load balancer. Note that you can't use the server-push feature of HTTP/2.
"""
( http://docs.aws.amazon.com/elasticloadbalancing/latest/appli... )
Yes it is. For the client, multiplexing multiple requests across a single connection and only needing to negotiate TLS once can have huge benefits, depending on the page/site. You are after all building your app/site for your end users, right?
Feature request : Please allow weighted load balancing i.e. ability to distribute traffic in a user specified ratio (weights) to different sized instances.
For now here's a work-around that I use: create multiple listeners/ports on the larger instances and add them as targets. Containers are a great approach here too; load up the bigger instances with more containers and register each container as a target.
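In boto3 terms, the workaround looks roughly like this (an untested sketch; the target group ARN, instance IDs, and ports are placeholders):

    import boto3

    elbv2 = boto3.client("elbv2", region_name="us-west-2")

    # Placeholders -- the target group should use the default "instance" target
    # type so the same instance can be registered on several ports.
    TARGET_GROUP_ARN = "arn:aws:elasticloadbalancing:us-west-2:123456789012:targetgroup/demo/abc123"
    SMALL_INSTANCE = "i-0aaaaaaaaaaaaaaaa"
    BIG_INSTANCE = "i-0bbbbbbbbbbbbbbbb"

    # The big instance listens on three ports, so it is registered three times
    # and should receive roughly three times the share of new flows.
    elbv2.register_targets(
        TargetGroupArn=TARGET_GROUP_ARN,
        Targets=[
            {"Id": SMALL_INSTANCE, "Port": 8080},
            {"Id": BIG_INSTANCE, "Port": 8080},
            {"Id": BIG_INSTANCE, "Port": 8081},
            {"Id": BIG_INSTANCE, "Port": 8082},
        ],
    )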
This is what ELB Classic did for a long time - but we're experimenting with new algorithms. The problem with this approach is that it's not cache friendly.
When you bring in a new server to a busy workload, it gets all of the new connections. That can put a lot of pressure on that host - and if the application relies on caching, which most do in some form, it can really make performance terrible and even trigger cascading failures.
Another problem is that if a single host is broken or misconfigured and throwing 500 errors, it is often also the fastest and lowest-CPU box, because returning an error isn't very expensive. It can suck in and blackhole all of the traffic.
Based on how these issues work out at scale, we've moved beyond simple load-based load balancing (I know that sounds counter-intuitive) and into algorithms that try to achieve a better balance for a wider range of scenarios.
I should have read the blog instead of skimming before responding.
This appears to be Layer 4 load balancing; IPVS is more of a Layer 3 thing.
So the person who was voted down for mentioning HAProxy might not be too far off. It could be implemented through HAProxy with TPROXY enabled in the kernel[1]. Then just make sure that the default gateway configured on the targets routes back through the load balancer, or is the load balancer itself.
Ah, my bad, even the wiki link I posted said it is Layer 4. I guess both solutions could be utilized; perhaps it was IPVS, since it would be more performant than HAProxy.
It preserves source ip, so that suggests something like a kernel module, or like Intel's DPDK. Either (practically) rules out anything but C, doesn't it?
Whether you're better off depends on what your workload goals are.
If you want path- or host-name-based routing, the Application Load Balancer may be a better fit, and it natively supports WebSockets.
If your goal is long-lived sessions (weeks and months, not minutes and hours), the Network Load Balancer is probably a better fit.
ALBs are great for WS and can term SSL for WSS. Just use "HTTP" and "HTTPS" protos on the target groups and it'll work. It's a bit confusing but it works.
Massive disclaimer: I work on NLB.
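Regarding the ALB WebSocket tip above, here is a hedged boto3 sketch of that setup (the ALB ARN, VPC ID, and ACM certificate ARN are placeholders): an HTTPS listener terminates TLS so clients can connect with wss://, and forwards to a plain HTTP target group.

    import boto3

    elbv2 = boto3.client("elbv2", region_name="us-west-2")

    # Placeholders -- substitute your own ALB, VPC, and ACM certificate.
    ALB_ARN = "arn:aws:elasticloadbalancing:us-west-2:123456789012:loadbalancer/app/demo/abc123"
    VPC_ID = "vpc-0123456789abcdef0"
    CERT_ARN = "arn:aws:acm:us-west-2:123456789012:certificate/11112222"

    # Plain HTTP target group -- the ALB talks HTTP to the backends and
    # handles the WebSocket upgrade transparently.
    tg = elbv2.create_target_group(
        Name="ws-targets", Protocol="HTTP", Port=8080, VpcId=VPC_ID
    )

    # HTTPS listener terminates TLS, so clients connect with wss://.
    elbv2.create_listener(
        LoadBalancerArn=ALB_ARN,
        Protocol="HTTPS",
        Port=443,
        Certificates=[{"CertificateArn": CERT_ARN}],
        DefaultActions=[{
            "Type": "forward",
            "TargetGroupArn": tg["TargetGroups"][0]["TargetGroupArn"],
        }],
    )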