
AWS’s Elastic Load Balancer is a Strangler - jeffrey-sherman
https://shermanonsoftware.com/2020/03/04/amazons-elastic-load-balancer-is-a-strangler/
======
reilly3000
For what it's worth, ALB is a fantastic product. As a full L7 load balancer it
can stand in the place of nginx, provide OIDC auth, replace API gateway
especially for high volume Lambdas, and has lots of tunable logic for running
diverse auto-scaled workloads. AWS is good about not breaking APIs and
contracts, so the "strangler" strategy is really about accommodating existing
customers. I'm not going to say Classic Load Balancers are a dog of a product
or anything, its just that they pre-date VPC, and the web stack has moved on
since then.

Having ALB be such a robust, turn-key product does tempt one into giving it a
lot of responsibilities, which definitely lends itself towards lock-in. I guess
it's the same with marriage: there are always tradeoffs.

I do wish it was better integrated with ECS + AppMesh + API Gateway, so it
could be like a managed GetAmbassador.io. Envoy proxies are a hell of a good
idea, and I think are going to be something like the React of the backend.
While one can dream about a grand unified request router for the cloud, ALB
continues to innovate with things like canary deployments and awesome routing
rules, all while continuing to 'just work' at web scale. Is there anything else
you can throw up to 50K RPS at and not have to think about it that much?
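To make the "strangler via ALB" idea concrete, here's a hypothetical sketch of building a path-based listener rule that peels one migrated route off to a new target group while the listener's default action keeps serving the legacy app. The ARNs and paths are placeholders, and the commented boto3 call is an untested assumption about how you'd wire it up.

```python
# Sketch: parameters for an ALB listener rule that forwards one migrated
# path to a new target group; everything else falls through to the
# listener's default (legacy) action. ARNs here are illustrative.

def strangler_rule(listener_arn, path_pattern, new_target_group_arn, priority):
    """Return kwargs suitable for boto3's elbv2 create_rule call."""
    return {
        "ListenerArn": listener_arn,
        "Priority": priority,  # lower numbers are evaluated first
        "Conditions": [
            {"Field": "path-pattern", "Values": [path_pattern]},
        ],
        "Actions": [
            {"Type": "forward", "TargetGroupArn": new_target_group_arn},
        ],
    }

# Usage (needs real ARNs and AWS credentials; untested sketch):
#   import boto3
#   elbv2 = boto3.client("elbv2")
#   elbv2.create_rule(**strangler_rule(listener_arn, "/api/v2/*", new_tg, 10))
```

As each route gets reimplemented, you add another rule; when the legacy app handles nothing, you retire the default action.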

~~~
philliphaydon
One thing is that ALB to Lambda restrictions suck.

Lambdas have a 6mb response size limit. API Gateway has a 10mb limit. But if
you go ALB to Lambda and your response is over 1mb, it will fail, because the
ALB team decided to introduce an arbitrary 1mb limit, didn't document it as a
limitation, and only mentioned it in their troubleshooting notes.

So if you need to return a small PDF or something from a lambda via ALB. Good
luck.

~~~
jedberg
FYI for anyone reading, the expected architecture is to return the URL to an
S3 object that holds your larger-than-1mb response.

The reason they do this is to keep the ALB running efficiently, by not having
to architect for arbitrarily large responses.
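As an illustration of that workaround, here's a hypothetical sketch of a Lambda handler that uploads the oversized payload to S3 and answers with a redirect to it. The bucket/key names are made up, and the upload/presign callables are injected so the logic is shown without assuming AWS credentials; the commented boto3 wiring is an untested sketch.

```python
# Sketch of the S3 workaround: instead of pushing a large body through
# the ALB, store it in S3 and redirect the client to a pre-signed URL.

def large_response_handler(body_bytes, bucket, key, upload, presign):
    """Store the oversized payload and return an ALB-shaped redirect.

    upload(bucket, key, data) and presign(bucket, key) -> url are passed
    in so this stays testable without AWS.
    """
    upload(bucket, key, body_bytes)
    url = presign(bucket, key)
    # ALB Lambda targets return a dict of this shape; 303 tells the
    # client to GET the pre-signed location instead.
    return {
        "statusCode": 303,
        "headers": {"Location": url},
        "body": "",
        "isBase64Encoded": False,
    }

# With boto3 (untested sketch):
#   s3 = boto3.client("s3")
#   upload = lambda b, k, d: s3.put_object(Bucket=b, Key=k, Body=d)
#   presign = lambda b, k: s3.generate_presigned_url(
#       "get_object", Params={"Bucket": b, "Key": k}, ExpiresIn=300)
```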

~~~
philliphaydon
The problem I have is that the limit is less than the limit imposed by Lambda
itself, which in turn is less than the limit imposed by API Gateway.

API Gateway: 10mb

Lambda: 6mb

ALB: 1mb

Yet if your target is an ec2 instance, there's no limit on the load balancer.
So IMO the limit for a lambda target should be the limit imposed by Lambda
itself. 6mb.

~~~
sudhirj
While I'm not disputing the limit numbers or whatever hardship they might
cause, it's worth noting that this is a basic limitation of the original
Lambda model and maybe FaaS in general. The capability comes from running a
giant pseudo-infinite mesh of isolated execution environments that load your
code and execute on demand, while having to buffer both the request and
response to make sure clients are protected from the details. This buffering
means the size of the buffer will always be limited; the team managing it
might make the buffers bigger based on experience, but it's not a solved
problem.

ALB to containers or servers is a different beast - here the entire request
and response need not be buffered at all (there might still be a very small
buffer, mostly negligible), so streaming responses, websockets etc become
possible.

We use lambda to resize images, so we do push against these limits a bit, but
it's a fair tradeoff for the advantages - no worries about CPU throttling from
too many requests, no waiting for servers to start for spiky loads etc.

~~~
inopinatus
Lambda was not designed for request/response. It’s an event driven service.
Wrapping API gateway around it is an architectural blunder, and leads to folks
like the GP wondering why their use case is a shitty fit.

~~~
mlthoughts2018
“function as a service” absolutely does need to support request/response as a
primary use case.

------
checker
And so are nginx, HAProxy, API Gateway, and any other layer 7 load
balancer/proxy.

~~~
joana035
And people were doing that even before AWS existed.

Edit: you can do that with iptables

~~~
checker
I didn't know that it's possible with just iptables. This sounds pretty
useful. Thanks!

------
ufmace
I haven't had to use the Strangler technique, but I think I would disagree
with this post. It is possible to use ALBs in this way, but I think it goes
against the point a little. I thought the point was that it should be a little
bit annoying to maintain all of the pass-through routes in the replacement
codebase, to ensure that there is motivation to actually do the replacement,
instead of just leaving it in place indefinitely.

Plus, leaving the pass-through in the codebase lets you do a much wider
variety of things. You could run both the original and your replacement, and
check the responses against each other. You could do that, or do replacements,
at however tiny of a part you can imagine. You could choose which cases to
handle where according to any algorithm you can dream up.
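The "run both and check the responses against each other" idea can be sketched as a simple pass-through. This is an illustrative, made-up function, not anything from the post: the backends are plain callables standing in for HTTP calls, the client always gets the legacy answer, and any divergence (or crash in the rewrite) is reported for later inspection.

```python
# Sketch of a comparing pass-through: call the legacy backend and the
# replacement, serve the legacy answer, report any divergence.

def shadow_call(request, legacy, replacement, report_mismatch):
    legacy_resp = legacy(request)
    try:
        new_resp = replacement(request)
        if new_resp != legacy_resp:
            report_mismatch(request, legacy_resp, new_resp)
    except Exception as exc:  # the rewrite must never break production
        report_mismatch(request, legacy_resp, exc)
    return legacy_resp  # clients always get the proven answer
```

Once a route stops producing mismatches, you flip it to serve the replacement's answer instead, which is exactly the kind of per-route decision logic an external load balancer can't express.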

------
vorpalhex
I've always called this technique "proxying", since most of the software that
does this is referred to as "a proxy".

------
philsnow
Small nit: doesn't this assume HTTP ELBs? I didn't think you could slice
traffic in any useful way if it's raw TCP.

~~~
travbrack
Right. Without reading HTTP headers, you can only differentiate traffic by IP
and/or port.

~~~
zbentley
In general this is true. But routers/loadbalancers exist for many TCP
protocols (even stateful ones) as well. gRPC is probably the most popular such
protocol at present.

~~~
travbrack
I think you mean application protocols. There's only one TCP protocol. In the
example of gRPC you'd have gRPC > http/2 > TCP > IP > ethernet

Yes, load balancers can make decisions based on application layer protocols
like gRPC, but not in TCP mode, which is what the parent comment was asking
about. That's because that mode only looks at the TCP and IP headers, which
don't contain any information about the application protocol.

An example of a time you might use this mode is if you're terminating TLS on
the web servers, so you can't read the encrypted http headers when they hit
the load balancer.

------
_nalply
Sounds really good. I didn't know that some people call this a Strangler.

I have only one small caveat if your workplace is full of politics: the
pressure to provide a replacement for the legacy system disappears, and the
strangler might transmute into the safety net of the legacy system.

Now you have two problems instead of one: the strangler AND the legacy system.

~~~
YawningAngel
I think this is a good thing. Moving from having no choice but to replace an
entire system to being able to work on it incrementally makes an otherwise
impossible problem tractable. If you end up only replacing the problematic
parts of a legacy system and retaining the rest I think that's fine.

~~~
closeparen
The legacy has to be _really bad_ for an abandoned mid-flight migration to be
any better. Usually the intermediate state is at least a little worse (but
worth it because you eventually get through). Strangler pattern only helps
bound the damage.

------
luord
This is true for pretty much any load balancer or even any reverse proxy, I
think; but nevertheless, it's true and a good way to think about stranglers.

------
Thaxll
Yes, it's called A/B testing, weighted load balancing, etc.

------
crankylinuxuser
My biggest gripe is that the ALBs do not handle SSL decryption and re-
encryption in a safe way.

It's not my opinion that it's unsafe; it's FedRAMP's. I'll just leave it at
that.

