
New AWS UDP Load Balancing for Network Load Balancer - bryanh
https://aws.amazon.com/blogs/aws/new-udp-load-balancing-for-network-load-balancer/
======
wikibob
This is a Big Deal because it enables support for QUIC, which is now being
standardized as HTTP/3.

To work around the TCP head-of-line blocking problem (among others), QUIC uses
UDP.

QUIC does some incredible patching over legacy decisions in the TCP and IP
stack to make things faster, more reliable especially on mobile networks, and
more secure.

Here’s a great summary from Fastly on what QUIC means for the Internet:
[https://www.fastly.com/blog/why-fastly-loves-quic-http3](https://www.fastly.com/blog/why-fastly-loves-quic-http3)

------
mark242
This is big for making services which rely on DNS much easier to roll out in a
container environment (ECS, EKS, etc). Traditionally we've had to create
custom AMI images, use CloudFormation to keep them running with EIPs, and then
have those EIPs be part of runtime configuration for our services.

~~~
not_kurt_godel
One "downside" of AWS is we've rolled a lot of custom solutions like this, at
significant time/expense, only to have them be made obsolete by eventual
native feature support. So we get left with a mixture of legacy systems using
the custom solution and newer ones using native support and it makes things
more complicated. It's actually a good problem to have in many ways, and
basically unavoidable in many circumstances, but an interesting dynamic
nonetheless. Reminds me of interstellar wait calculation[0] - do we defer
dependent features until there's native support, or forge ahead knowing
there's a likelihood of being 'overtaken'?

[0] [https://en.m.wikipedia.org/wiki/Interstellar_travel#Wait_calculation](https://en.m.wikipedia.org/wiki/Interstellar_travel#Wait_calculation)

~~~
joemag
Another way to look at it: customers like you, who build custom workarounds
for a problem, influence our decision that a particular problem is important
enough to be solved.

~~~
not_kurt_godel
Yup, and that's overwhelmingly a good thing! The one thing I will say is that
AWS does tend to lean on this attitude a bit too much, IMO, with a tendency to
ignore common sense about what people will inevitably need, thus causing the
kind of thrash I described when it could have been avoided. It is erring on
the right side of delivering vs waiting generally, but the balance could stand
to be fine tuned.

------
Jedd
Related - has anyone done much with UDP load balancing on prem?

We're starting to hit performance and HA walls with ingesting Netflows from
edge routers - you can only nominate one target, and using Elasticsearch /
Logstash there are some hard limits.

Would AWS be appropriating nginx under the hood here?

~~~
nullwasamistake
Lots of people use IPVS, but the more efficient modes don't work on AWS.
That's generally why most shops that need a LOT of traffic use a cloud
provider for regular servers and their own servers in a colo for the heavy
stuff.

With how Amazon likes to use OSS in their services, I'm pretty sure their UDP
load balancer is in fact just using IPVS.

~~~
Jedd
Interesting, thanks. Hadn't considered this option before, and will do some
more exploring, though I note on the IPVS page they say:

"For scheduling UDP datagrams, IPVS load balancer records UDP datagram
scheduling with configurable timeout, and the default UDP timeout is 300
seconds. Before UDP connection timeouts, all UDP datagrams from the same
socket (protocol, ip address and port) will be directed to the same server."

I'm hopeful / confident that affinity can be fully de-tuned here, as we're
looking at around 5-10k UDP Netflows per second from a given router that need
to be distributed to a set of receivers.

~~~
nullwasamistake
I may be wrong, but I think you can tell IPVS to schedule using only the
tuple hash in Direct Return mode, which means no stored state for connection
tracking.

Edit: that doesn't appear to be true, but it uses its own "lightweight"
connection tracking table, so you can unload the conntrack modules from the
kernel.

Realistically IPVS can probably route 40 gigabits of traffic per instance.
Combine that with DNS round robin, and maybe even multi-homing at the front,
and you could handle basically anything.
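If IPVS does turn out to fit, the per-flow affinity quoted above can be relaxed at the ipvsadm level. A hedged sketch (the VIP, real-server addresses, and port are made-up placeholders; one-packet scheduling needs a reasonably recent kernel):

```shell
# Define a UDP virtual service on the VIP with round-robin scheduling.
# --ops (one-packet scheduling) makes IPVS pick a server per datagram
# instead of per flow: no affinity and no per-flow state is kept.
ipvsadm -A -u 192.0.2.10:2055 -s rr --ops

# Add the Netflow collectors as real servers in direct-return mode (-g).
ipvsadm -a -u 192.0.2.10:2055 -r 10.0.0.11:2055 -g
ipvsadm -a -u 192.0.2.10:2055 -r 10.0.0.12:2055 -g

# Alternatively, keep flow-based scheduling but shrink the default 300 s
# UDP timeout (arguments are tcp, tcpfin, udp; 0 leaves a value unchanged).
ipvsadm --set 0 0 30
```

One-packet scheduling trades the connection-tracking table for a hash per datagram, which is usually the right trade for stateless telemetry like Netflow.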

------
stock_toaster
Nice! I wonder if this is a preparatory step for future QUIC/HTTP/3 support?

------
johannnishant
That's great! Any idea what Load balancing algorithm this would use?

We have a need for some stickiness in the load balancer (for example: UDP
Packets from a source must be routed to the same instance, at least for a
short while)

~~~
mcpherrinm
It's documented as:

> For UDP traffic, the load balancer selects a target using a flow hash
> algorithm based on the protocol, source IP address, source port, destination
> IP address, and destination port. A UDP flow has the same source and
> destination, so it is consistently routed to a single target throughout its
> lifetime. Different UDP flows have different sources, so they can be routed
> to different targets.

From the NLB docs at
[https://docs.aws.amazon.com/elasticloadbalancing/latest/network/introduction.html](https://docs.aws.amazon.com/elasticloadbalancing/latest/network/introduction.html)
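The quoted behavior is easy to picture as code. A toy sketch in Python (the real NLB hash is internal to AWS; the function name, hash choice, and target list here are purely illustrative):

```python
import hashlib

def pick_target(targets, proto, src_ip, src_port, dst_ip, dst_port):
    """Hash the 5-tuple and map it onto the target list.

    Every datagram of a given UDP flow carries the same 5-tuple, so it
    always hashes to the same target; a different flow (e.g. a new
    source port) can land on a different one.
    """
    key = f"{proto}|{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(targets)
    return targets[index]

targets = ["10.0.1.10", "10.0.1.11", "10.0.1.12"]

# Same flow -> same target, every time (the stickiness asked about above).
a = pick_target(targets, "udp", "198.51.100.7", 40000, "203.0.113.1", 53)
b = pick_target(targets, "udp", "198.51.100.7", 40000, "203.0.113.1", 53)
assert a == b

# A different source port is a different flow and may pick another target.
c = pick_target(targets, "udp", "198.51.100.7", 40001, "203.0.113.1", 53)
```

Note the stickiness is only as durable as the flow hash inputs: if a client's NAT rebinds its source port, it becomes a new flow and may land elsewhere.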

------
zedpm
This is great news, and something I’ve been requesting for years. I manage an
IoT backend based on CoAP, which is typically UDP-based. I’ve looked at Nginx
support for UDP, but a managed load balancer is much more appealing.

~~~
celim307
Same story here; getting NGINX to help, even on the highest support tiers,
was a PITA too.

------
api
Apparently if the target is the instance ID this can preserve public source IP
and port. That can be a big deal for e.g. bootstrap nodes for P2P networks.

------
jedisct1
Can be nice for games, QUIC and DNSCrypt.

------
dnautics
Curious: How does one generally load balance udp? Drop packets? Slow them
down?

~~~
bboreham
It means taking a set of packets sent to one address and spreading them across
multiple servers to share out the load.

~~~
dnautics
oh, geez. Thank you. Somehow I was thinking about throttling, not load
balancing.

------
matsur
A plug for our (Cloudflare's) product — we support managed load balancing for
UDP as well.

\- [https://blog.cloudflare.com/spectrum-for-udp-ddos-protection-and-firewalling-for-unreliable-protocols/](https://blog.cloudflare.com/spectrum-for-udp-ddos-protection-and-firewalling-for-unreliable-protocols/)

\- [https://blog.cloudflare.com/introducing-spectrum-with-load-balancing/](https://blog.cloudflare.com/introducing-spectrum-with-load-balancing/)

~~~
jbarham
Looks cool but if the product is only available for "Enterprise" customers and
the pricing is "Request Quote" that means it's expensive. At least the AWS
pricing is published.

------
TrueDuality
Sweet. Now add support for multiple ports on a single service[1] and this load
balancer might actually become useful.

[1]: [https://github.com/aws/containers-roadmap/issues/104](https://github.com/aws/containers-roadmap/issues/104)

~~~
encoderer
ALB !== NLB

~~~
cameroncooper
With NLB targeting EC2 you can only specify one port per target group. To
achieve multiple ports going to a single instance (or autoscaling group), you
need one listener and target group per port.
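That per-port workaround looks roughly like this with the AWS CLI (names, the VPC id, ports, and the ARN variables are placeholders, not values from this thread):

```shell
# One target group per UDP port, all registered with the same instances.
aws elbv2 create-target-group --name ingest-udp-5000 \
    --protocol UDP --port 5000 --vpc-id vpc-0123abcd --target-type instance
aws elbv2 create-target-group --name ingest-udp-5001 \
    --protocol UDP --port 5001 --vpc-id vpc-0123abcd --target-type instance

# One listener per port on the same NLB, each forwarding to its target group.
aws elbv2 create-listener --load-balancer-arn "$NLB_ARN" \
    --protocol UDP --port 5000 \
    --default-actions Type=forward,TargetGroupArn="$TG_5000_ARN"
aws elbv2 create-listener --load-balancer-arn "$NLB_ARN" \
    --protocol UDP --port 5001 \
    --default-actions Type=forward,TargetGroupArn="$TG_5001_ARN"
```

It works, but it multiplies resources per port, which is exactly the pain the linked roadmap issue is about.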

