I don't think this is particularly good advice. He makes some valid points, but doesn't really consider the costs of the alternatives that he is suggesting.
Some counterpoints:
1. Micro-lambdas
How many routes does your application have: 3, 30?
Trying to maintain 30 different codebases and deployments would be a nightmare.
Package size.
Package size is unlikely to be much different by adding a little extra code, especially if the code for each route uses the same set of libraries (which will make up the bulk of the package size).
Least privilege.
If it makes sense to limit a route's privileges, you can simply deploy the same lambda code separately, with a different role.
Upgrading.
On the contrary, upgrading a lambda that uses a single codebase is much simpler than having to deploy multiple systems and deal with the issues of interoperability and compatibility between versions.
Reusing code.
Also much, much simpler with a monolithic codebase: the code is right there, ready to use.
Testing.
Testing a monolithic app is very straightforward: you don't need to set up a complex integration test environment, and all the functionality can be tested locally.
4. Lambdas calling lambdas
I tend to agree with this one, at least for synchronous calls. But you can also call another lambda asynchronously; in fact, this is something you tend to find yourself doing if you go down the path of breaking down a monolithic lambda.
5. Synchronous waiting.
Yes, from a cost perspective, you should minimize the amount of time your lambda is spending waiting on external IO.
However, from an application complexity perspective, you should make things as simple as possible. Building overly complex event-based lambda flows can make it really difficult to debug and maintain your application.
I use serverless to orchestrate my lambdas. It helps with the tradeoff between micro-lambdas and a monolith. I get a single codebase where I can reuse code, write tests and deploy all at once, and serverless packages it up into individual lambdas. You can also manage other resources there in a single serverless file, including IAM permissions, databases, or any other AWS service. So your entire app is in that config file.
Sometimes it bites you in the ass though and you get cryptic errors when trying to update the app in CloudFront, but overall I think it's worth it.
This is my approach and it worked really well for a side(ish)-project. I liked having each endpoint be its own file while the code could still share common 1st and 3rd party libs. I was also able to easily add some top-level checks/handling by adding a wrapper function that could handle certain types of errors and inject stuff like DataDog (though I will not be using them going forward, way too expensive: even with 0 traffic they were causing almost $75 worth of CloudWatch requests a month, on top of their costs).
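For anyone wondering what such a wrapper function can look like, here is a minimal Python sketch. It is not the commenter's actual code: the decorator name `with_error_handling`, the `NotFoundError` class, and the `get_user` endpoint are all hypothetical, and the response shape assumes an API Gateway style proxy integration.

```python
import functools
import json


class NotFoundError(Exception):
    """Hypothetical application error that should surface as an HTTP 404."""


def with_error_handling(handler):
    """Wrap an endpoint handler so known errors become clean HTTP responses."""

    @functools.wraps(handler)
    def wrapper(event, context):
        try:
            body = handler(event, context)
            return {"statusCode": 200, "body": json.dumps(body)}
        except NotFoundError as exc:
            return {"statusCode": 404, "body": json.dumps({"error": str(exc)})}
        except Exception:
            # Unknown failure: do not leak internals to the caller.
            return {"statusCode": 500, "body": json.dumps({"error": "internal error"})}

    return wrapper


@with_error_handling
def get_user(event, context):
    """Hypothetical endpoint living in its own file, sharing the wrapper."""
    user_id = event["pathParameters"]["id"]
    if user_id != "42":
        raise NotFoundError(f"user {user_id} not found")
    return {"id": user_id, "name": "example"}
```

Each endpoint file only needs the decorator; metrics or tracing hooks could be injected at the same place in the wrapper.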
You can solve a lot of these with multi-module projects, at least in Kotlin/Java.
You have a single codebase, with modules for each route, and modules for shared infra/domain code.
Maven builds a JAR for each route with only the resources that route needs, and you deploy a single lambda for each JAR - but - you deploy them all together as part of the same build pipeline.
Then you have non-monolithic lambdas that can easily be performance tuned and privileged individually, but a single codebase that represents a whole 'microservice' application in one place, with shared code, that can be upgraded and kept consistent.
> Package size is unlikely to be much different by adding a little extra code,
I was surprised to learn that Google Cloud Functions deploy a 1.5 GB Docker image. After that I was like fuck it, I don’t need to think about saving a few megabytes of payload here and there.
Presumably though that involves image layers and an image-layer-aware filesystem so that they don’t actually have to have more than one copy of the common image per execution machine, and just have to send around the customer’s overlay diff? Maybe they don’t bother and just pass the whole thing around, but this is a solved problem in the wider world for distributing a large number of large OCI images that are substantially similar.
>doesn't really consider the costs of the alternatives
I loathe blog post "anti-patterns" for this reason. Our job as engineers is largely about managing trade-offs. And trade-offs don't come pre-packaged in a blog of dos and don'ts; they're specific to the constraints at hand.
Just about all of these "anti-patterns" are actually freaking awesome when what they offer (and do not offer!) lines up with what you need.
1. Monoliths are awesome when that's all you need. Simple to write, maintain, and deploy.
2. Orchestration lambdas are A-OK when it's all you need and your workload fits comfortably in Lambda's constraints. I swear, the 'cloud' is rotting people's brains. There's a complexity threshold where Step Functions start to pay off. It's fairly high imo. The complexities in testing and debugging it introduces are huge compared to just dropping a break point in your control script.
3. This one is sane
4. Calling this an anti-pattern is again wrong imo. Goodness of fit depends on context. Lambdas calling other lambdas can squeeze a helluva lot of burst CPU performance for not a lot of money. There are a lot of workloads for which this makes sense.
5. This one is stepping over dollars to pick up pennies. I agree with your take entirely. Whatever minuscule amount is saved on idle CPU time will be dwarfed by the additional dev time spent writing, deploying, instrumenting, and debugging. Classic Solutions Architect wankery.
Your counter-points to (1) don't seem any different to the monolith vs. microservice debate. Many of your critiques are not so much problems as they are trade-offs. And they've been talked about ad nauseam.
The whole stack of waiterio.com runs on AWS Lambda.
We do use several monolith lambdas and it works well.
I believe the reason why Lambda has seen so little adoption is that cloud architects suggest breaking existing services into one Lambda function per endpoint.
This causes an explosion of complexity, and code that runs one way in the cloud but can't easily be reproduced locally on a developer machine.
My advice is the opposite:
DO USE MONOLITHIC LAMBDAS.
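To make the idea concrete, here is a minimal sketch of a monolithic lambda: one handler that dispatches to plain functions through a route table. It is written in Python for illustration (the stack described above is Node.js), the `/orders` routes and function names are hypothetical, and the event fields `httpMethod`, `path` and `body` assume an API Gateway REST proxy event.

```python
import json


def list_orders(event):
    """Hypothetical route: return an empty order list."""
    return {"orders": []}


def create_order(event):
    """Hypothetical route: echo back the posted order."""
    payload = json.loads(event.get("body") or "{}")
    return {"created": payload}


# One table maps (method, path) to a plain function in the same codebase.
ROUTES = {
    ("GET", "/orders"): list_orders,
    ("POST", "/orders"): create_order,
}


def handler(event, context):
    """Single Lambda entry point for every route of the service."""
    route = (event.get("httpMethod"), event.get("path"))
    fn = ROUTES.get(route)
    if fn is None:
        return {"statusCode": 404, "body": json.dumps({"error": "no such route"})}
    return {"statusCode": 200, "body": json.dumps(fn(event))}
```

Because `handler` is an ordinary function, it can be called on a developer machine with a hand-written event dictionary, with no cloud deployment involved.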
CDK (and infra as code in general) is amazing, especially for use cases like this where one would be responsible for deploying and creating resources around 10s of lambdas for one 'service'.
I'm eager to chat with you about CDK since it's kind of new and I don't know anyone else using it.
Feel free to write me at giorgio DOT zamparelli AT gmail DOT com to chat about CDK, Lambda, serverless, Infrastructure as Code.
You might want to edit that comment and speak towards cold start times, dependency management, your deployment strategy (e.g. are you using a docker image lambda or zip, etc) and what you're paying for provisioned concurrency to keep those big lambdas warm and available. That's all nuance you're missing from promoting monolith lambdas, and those kinds of things will get the non-architect-minded in hot water.
We use Node.js and the cold start times are negligible.
Java, C# and Docker-based lambdas have problems with cold start times but we do not suffer from them.
We also use JAMstack and our HTML pages are statically generated in advance with the CloudFront CDN in front.
We use Lambda for the backend REST JSON APIs.
Hard disagree with the reasons given for "2. Orchestrator Lambda." While step functions are very useful and have a variety of uses, they're also VASTLY more expensive to run (the cost continues to go down but last I checked, 2.5x more expensive). Orchestrator lambdas are extremely useful, and given care in code, architecture, and separation of concerns they can be very light. The author's point on this seems more like "I haven't seen this done elegantly" or "I tried this and my code became unmanageable," and doesn't provide solid evidence for an argument against them. The argument made could apply to any unit of infrastructure that contains code.
I believe "4. Lambda functions calling Lambda functions" speaks to what I believe are the author's shortcomings for #2, with the caveat that "Cost" is definitely a consideration, as waiting for a waterfall of lambda calls will incur more cost. Call-it-and-forget-it is the more cost-efficient method for lambda-to-lambda. I trend more towards pub-sub for this scenario if the call stack is more than 2 lambdas (e.g. lambda calling one lambda, end of stack). Curiously, the author doesn't mention SNS, which would suggest some inexperience in this theater. SQS can be used if the payload processing explicitly calls for the capabilities of SQS, but bare SNS is much more suited to light pub-sub duty.
Overall, this seems like a rather reductionist article written for the SEO keywords rather than content. Anyone happening across this should deep dive "the why" behind each of the claims before taking it at face value. And I wish the article stated that at the very top.
This appears to be more or less a straight up copy-paste from AWS documentation (https://docs.aws.amazon.com/lambda/latest/operatorguide/anti...). I'm glad to see the HN discussion because I was literally looking at those docs yesterday. Less glad about personal blogs "lifting and shifting" content and passing it off as their own.
The “duration of invocation” payment model (discussed in #4) is ultimately what turned me off of lambda for anything that involves an external API. You could have a lambda function running smoothly, and then one day get a surprisingly high bill because a downstream service that used to respond in 100ms encountered issues and started taking 3000ms to respond so you’re suddenly paying ~30x per invocation through no fault of your own.
Are you actually likely to have a "high bill" on Lambda but also be "surprised" that all your requests are 30x slower?
Presumably anything that can generate a "high bill" also has some reasonable level of on-call, alerting, SLAs etc ... such that if your Lambda invocations were all taking 30x longer people would notice.
I really did get a “surprisingly high bill” because of it. It wasn't high in the absolute sense, but it was something like 10x my usual monthly cost in a single day, and would have been higher if I didn't get an alert.
The reason I went for serverless over a server in the first place was so that I wouldn't have to wake up in the middle of the night because of server issues, so alerting is not a satisfying solution. Especially where the problem is more an artifact of how the service is billed than how much it consumes actual server resources.
You're making a tradeoff. You no longer have to alarm for things like your server failing due to disk failure or other stuff that Lambda abstracts away from you, but you do have to alarm on things that will increase cost. Constant cost (if you aren't dynamically scaling) and more headache vs. variable cost and less headache.
It would be a trade-off I'd consider if alternatives didn't exist, but they do. Someone down-thread mentioned Cloudflare Workers, which have a sensible pricing model. These days I use Google Cloud Run, which gives me everything I wanted from Lambda but has a model where you pay by CPU time rather than invocation time, and allows a CPU to handle multiple requests.
Any external API that could reasonably take a long time to return (because of the work it's doing) should have an asynchronous API. I.e. you submit the request to start the work and can later do another request to check if it's complete and get any results, rather than waiting on a live request.
Good practice for using external APIs is to NOT use any default HTTP client settings and always provide your own timeouts to what you consider reasonable for connections and responses, as well as using a context system with deadlines so you can time out any requests that are taking more than a reasonable time to complete. That turns the surprise long, expensive requests you described into nothing more than short errors (which hopefully you'll pick up on after a while, as long as you've got your alerting system set up right).
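As a rough illustration of the explicit-timeout-plus-deadline idea, here is a Python sketch using only the standard library. The helper names and the `cap_s` / `margin_s` defaults are assumptions chosen for the example; `get_remaining_time_in_millis()` is the method the Lambda Python runtime exposes on the context object.

```python
import urllib.request


def request_timeout_s(context, cap_s=2.0, margin_s=0.5):
    """Pick a per-request timeout from the time this invocation has left.

    cap_s is the longest we are ever willing to wait (an assumed value);
    margin_s is kept back so the function can still return an error cleanly.
    """
    remaining_s = context.get_remaining_time_in_millis() / 1000.0
    return max(0.0, min(cap_s, remaining_s - margin_s))


def fetch(url, context):
    """Call an external API with an explicit timeout, never the client default."""
    timeout = request_timeout_s(context)
    if timeout <= 0:
        raise TimeoutError("not enough time left in this invocation")
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()
```

A slow downstream service then produces a short, billable-for-two-seconds error instead of holding the invocation open until the Lambda timeout.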
> Any external API that could reasonably take a long time to return should have an asynchronous API.
I agree, but that doesn’t solve the problem here. The remote API was asynchronous, I was just waiting for it to ack my instruction. Because of issues out of my control (network congestion, maybe?) the time to get a 200 OK from the server shot up.
> always provide your own timeouts to what you consider reasonable for connections and responses
Agreed here as well, and I was providing my own timeout. The problem is that (cost-wise) it’s fine for 1% of requests to hit a 5-second timeout, but gets expensive when 100% of requests do. And lowering the timeout means that during normal times, requests that have latency in the tail of the distribution but ultimately go through would fail, which is undesirable.
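A quick back-of-the-envelope in Python shows why. The 5-second timeout and the 1% figure come from the comment above; the 100 ms normal latency is borrowed from earlier in the thread and is an assumption here, as is the simplification that every non-timed-out request takes exactly the normal latency.

```python
def mean_billed_seconds(normal_s, timeout_s, timeout_fraction):
    """Average billed wait per invocation when a fraction of calls hit the timeout."""
    return (1 - timeout_fraction) * normal_s + timeout_fraction * timeout_s


healthy = mean_billed_seconds(normal_s=0.1, timeout_s=5.0, timeout_fraction=0.01)
outage = mean_billed_seconds(normal_s=0.1, timeout_s=5.0, timeout_fraction=1.0)

# How much more each invocation costs, on average, during the outage.
ratio = outage / healthy
print(f"healthy: {healthy:.3f}s, outage: {outage:.3f}s, ratio: {ratio:.1f}x")
```

Under these assumptions the timeout barely moves the average on a good day, but caps the bad day at roughly 30x rather than preventing it.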
I don't think that's what they meant by asynchronous api. More likely something along the lines of "schedule a job - 1 fast request" "call me back at my endpoint when the job is finished" (or variations, the gist is you are not stuck waiting for the server response)
How can you schedule a job within 100ms if it takes 3000ms to establish a connection with the server running the other api due to network congestion? Maybe you can send the request in UDP, but you'll get tons of situations where the secondary api never ran.
Right, but even in schedule a job - 1 fast request, there's still some waiting for the server to ack that it got the message. That's the request that became slow. I don't see how there's any getting around that, async API or not.
One selling point is the “free” horizontal scaling possibility. Contrived example: you’ve a parallelizable workload but the work arrives sporadically. 600 pieces of 10 minutes work arrive. In theory, instead of waiting 100 hours to process the batch on a single instance, lambda will fire up 600 instances, completing the batch in 10 minutes. And 1 cloud CPU for 100 hours costs the same as 600 for 10 minutes.
In practice, you spend so much time writing lambda-specific code, changing an obvious workflow to avoid DB access becoming a bottleneck, or debugging why published events did not trigger the correct function, that you wonder whether throwing cpu/memory/disk resources at a single instance tuned for the workload with dedicated local SSD storage might have been a better option, especially as tasks around logging, persistent storage, debugging, profiling, error handling and getting stack traces are so much easier.
Yeah, I don't really see the appeal, either. I thought part of the point of something like this was that you no longer had to manage or think about the environment you're deploying to in the same level of detail.
But clearly you do, except now the resources, permissions, versioning, and costs you have to think about are all specific to various parts of AWS, which are locked in, probably more expensive, and probably less familiar than their counterparts on operating systems or in containers.
Seems like a lot of work for a kind of scalability that leaves you with little insight into how it works, idk
The expense depends on your use-case. I think that some SaaS products would be cheaper to host on Lambda. We have short bursts of high load, which led to quite a bit of waste with provisioning of traditional application servers and database.
After moving from traditional servers to Lambda, we had lower hosting cost. After switching, it is easier to deploy new back-end features. It is easier to provision a sandbox for development.
In our case, the cost savings were significant. More important, we have operational improvements: safer delivery of new features, comprehensive metrics, and comprehensive alerting.
Of course, there are many ways of achieving these goals. Kubernetes has great ways of doing the same with less vendor lock-in, but requiring more knowledge and care with the system components.
Vendor lock-in is a non-issue. Prices have been dropping across the board for basically all cloud services across all providers as long as cloud has been a thing. While the hosting itself might be 5x more expensive, you can generally run a large scale app on serverless infra with 20% of the SRE/platform headcount, and for nearly every engineering company the savings in salary costs and focus on product instead of platform will pay for themselves very quickly and become a huge asset in the long run.
From a skills perspective, the vendor lock-in still bothers me on a personal level.
What's interesting about learning to push the right buttons on one company's black box, especially when I know it's likely powered by or equivalent to familiar F/OSS?
It’s not about what is interesting; it’s about what is productive. Kubernetes is interesting but I would not launch a project in it at a scale smaller than say 20M USD/year hosting costs, because you’ll probably be able to make a higher quality project faster if you use hosted/3rd party solutions.
Security, ability to outsource pretty much everything ops-related besides cloud resource provisioning and deployments to the cloud provider and developer focus on business logic.
I don't know a better dev environment than a (possibly scaled down) personal replica of production environment in the cloud. With proper tooling (e.g. Serverless Stack or SAM) you can achieve very fast code updates so the old argument of slow feedback due to having to deploy changes to the cloud on each iteration is getting less and less true as well.
With more traditional models, just keeping your OS, any container images, web server and other middleware secure and up to date is pretty expensive if you want to do it properly.
Going all-in on serverless might not make much sense for a large software product company but when building bespoke business software it allows small teams to do wonderful things very cost-effectively.
I get the impression there is a lot of impedance mismatch between the ability to host functions in the cloud, and the tooling and ways of thinking required to make this work.
Like, suddenly you can't just debug your code; debugging is magic and requires special tools. And you can't just use a recursive solution: recursion might literally cost you per call.
Not that the shift is bad, but just that our tools and thinking haven't quite caught up to it.
I get the same mismatch in a tiny way when using Google Colab to teach Python. It's very easy to get started and it solves lots of "local install" and versioning issues, but the keyboard shortcuts aren't quite there yet, and debugging might be scary.
I don't have the same experience. I run a web service hosted entirely on lambda with some static pages in S3.
Your lambda application should be runnable locally. Doing otherwise is an implementation choice. Some errors related to underlying OS considerations are harder to catch unless you spin up a Lambda container...but, that's just it. Spin it up locally and do your debugging. Not magical at all.
Recursion in a Lambda application is MORE powerful than recursion locally. With lambda, you have as much horizontal scale as you can pay for. Spin off 10,000 threads if you want. You're upset you have to pay for it? You always pay for it with hardware and electrical bills and time if not directly to AWS. My application uses a recursive lambda strategy as its core design.
What's magical is that all of your logs exist cleanly in CloudWatch and retention policies are easy.
Being able to run the containers locally helps, but I have hit a myriad of issues which occur in the lambda runtime, but not when running the service locally.
With EC2 or ECS backed by EC2 there is much less impedance mismatch between local dev environments and prod environments, which results in fewer surprises.
This might be a bit of #1 and a bit of something else: the involvement of compiled languages. I see teams who normally develop in Java attempt to port that entire codebase into a Lambda function and hit massive cold start times. Their solution, instead of splitting up the codebase (we need the reusability... but it's too much work to turn these parts into libraries...), is to start playing with provisioned concurrency. I think there's a mind shift required with Lambdas, which isn't immediately obvious. Anyone else encounter this and choose the easy way out - recommending they use Node/Python instead?
Provisioned concurrency isn't worth it in my experience. You can just use, say, a CloudWatch rule to ping your desired number of instances every 5-odd minutes and keep them warm. 288 pings per day for 100 instances at 100ms per instance/ping is 2,880 seconds of execution time, or 5,760 GB-seconds at 2GB RAM (1 full vCPU and a bit). At $0.0000166667 per GB-second, that's $35 a year. You could cut that in half with smarter pings than e.g. [1], but it's not worth the engineering cycles at this price.
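The arithmetic above can be reproduced in a few lines of Python. All inputs are the figures quoted in the comment; the sketch covers compute duration only and leaves out per-request charges and any free tier.

```python
# Inputs taken from the comment above.
PINGS_PER_DAY = 24 * 60 // 5          # one ping every 5 minutes -> 288
INSTANCES = 100
SECONDS_PER_PING = 0.1                # 100 ms of billed time per instance per ping
MEMORY_GB = 2
PRICE_PER_GB_SECOND = 0.0000166667    # USD, as quoted

seconds_per_day = PINGS_PER_DAY * INSTANCES * SECONDS_PER_PING
gb_seconds_per_day = seconds_per_day * MEMORY_GB
cost_per_year = gb_seconds_per_day * PRICE_PER_GB_SECOND * 365

print(f"{seconds_per_day:.0f} s/day, {gb_seconds_per_day:.0f} GB-s/day, "
      f"${cost_per_year:.2f}/year")
```

Note that the 2,880 seconds and 5,760 GB-seconds are per-day figures; the $35 is what they add up to over a year.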
I understand your point, but I disagree with the characterisation of 'compiled versus interpreted' (which I've seen elsewhere in blog posts too).
Java, C#, etc. are also interpreted languages; they just-so-happen to have a separate compile-to-bytecode step (unlike e.g. Python, which compiles to bytecode when a file is first imported); and their interpreters just-so-happen to have very slow startup times.
"Compiled languages" can have very fast startup times, if we avoid languages like Java (which sacrifice a lot of startup time in favour of steady-state throughput; which is obviously inappropriate for a Lambda). If we do compile down to a native (x86_64 Linux) binary, it requires a custom runtime (e.g. if using Haskell, Rust, etc.).
> and their interpreters just-so-happen to have very slow startup times.
I was surprised to find out that Common Lisp (SBCL) is so incredibly fast to start up that it can run faster than Rust even starting from source. And Lisp is the archetypical dynamic language! Made me question my entire career working with JVM-based languages.
Many Lisp systems like SBCL basically restore the main memory (heap) with all data, caches and native code. Then they call a start function. No need to load/link/compile/initialize code at start.
> if we avoid languages like Java (which sacrifice a lot of startup time in favour of steady-state throughput
Could you expand on that? I understand that both the JVM and Node are JIT compilers, but I don't understand why the JVM usually takes longer to start and what are the trade-offs involved here compared to Node.
This is not the type of answer you want, but it is the answer: Java/JS startup times are different largely for product reasons.
V8 is highly optimized for fast startups because that's necessary in loading web pages.
JVM startup time is generally not an issue anywhere but Lambdas. Also, JVM apps run for longer and so they want to take advantage of performance data to optimize over time.
If Oracle was desperate to make JVM start fast as a matter of life-or-death for an important revenue stream, then JVM would start quickly.
What the trade-offs would be I don't know, but keep in mind there might not have to be any, at least materially. Products are made to do certain things and not others; if something was never a requirement and nobody ever cared to think about it in material terms, it's unlikely to happen.
FYI one simple tradeoff is that the JVM might not compile code until it has run for a bit and can be optimized, whereas V8 might immediately compile some things. With the latter you get 'pretty fast right away' but miss out on the runtime characteristics that make it optimally fast; with the former you pay a little bit of 'learning time' up front and then get better optimization.
Oracle has a Java AOT compiler as part of the GraalVM. They call the feature native images [1]. It's not quite the same thing as configuring functionality in the JVM to see what speeds things up, but I think it'd give a good idea of what would need to be dropped for faster start-up time. The biggest one is dynamic class loading.
AOT is actually often slower than the JVM because it doesn't have runtime profiling. It is possible to use AOT with runtime profiling (i.e. run your app in some real world manner, capture the data and then AOT with the runtime data).
But the point here is that JVM (and V8 actually) can do more than simple compiling by understanding the nature in which the program runs and therefore be quite fast often making up for the fact they are VMs.
The JVM doesn't; it starts in tens of milliseconds. But many popular Java frameworks take seconds to start (I believe mostly because they make heavy use of reflection and classpath scanning). You can write fast-starting Java apps, but you have to stray away from the beaten path to do it.
The main differences are probably the compilers involved, which tend to be layered based on how often a routine gets called (starting interpreted, then using a fast-but-dumb JIT compiler, then a slow-but-smart JIT compiler).
Another big difference is that Java does lots of special handling for classes, e.g. hooking up 'reflection' machinery, allowing dynamic 'class loaders', etc., which happens during an application's startup.
In contrast, Javascript 'classes' are basically just functions (constructors); and Javascript functions are values just like anything else; hence there's less of an up-front cost to JS code. Python works in pretty much the same way from a language level (but its CPython interpreter works very differently to V8).
We do use monolithic lambdas written in Node.js and do not have cold start times problems.
The problem with cold starts is more related to Java than to monolithic lambdas.
> Anyone else encounter this and choose the easy way out - recommending they use Node/Python instead?
I feel your comment fails to take into account the main reason teams stick with Java in AWS Lambdas: code and dependency reuse, and consequently turnaround time.
Arguing that nodejs and Golang are the ideal runtime in problems involving peeling tasks out of a monolith and into AWS Lambdas is a complete waste of time. Your goal is to offload processing tasks out of your service asap without wasting time developing and testing code written from scratch. Thus, you just create a project for your lambda, pick up your Java code that works and is well tested, offload the battle-hardened code and its dependencies to the lambda package, add a handler to the lambda, add a lambda client to the monolith, sprinkle integ tests, and you're done: you have a production service. You're running a long-lived background task which is invoked only from time to time, and you don't really care if it takes 1 or 2 minutes to run.
Additionally, do you really want to force your team to manage two distinct tech stacks just because you want to offload a process out of a service? What will that cost you?
The article has some good points, but just one thing about monolith lambdas.
I'm sure this is not an unpopular opinion anymore (not sure if it ever was) but:
1. If you build something new, starting with a monolith lambda is OK. Your customers care about a secure, user-friendly, performant product; if a monolith lambda gets you there, great!
2. Your code can be in a monorepo but you can still have multiple lambdas (running the same code but each with its own least-privilege IAM policy and its own SLA / reserved concurrency / memory settings).
3. As soon as you have something that is indeed used by more than one service / team and that can have its own persistence / authentication / authorization (e.g. something you would "delegate" to an API/SaaS if one existed out there), this is when you extract a microservice.
If you start with designing your microservice architecture for your side project / MVP or even seed stage startup, I applaud you, as you are succeeding in places many failed.
You grow into complexity, not start with it on day one.
I'm sure many agree with this sentiment but I keep seeing people getting an allergic reaction seeing any hint of mono* in any project.
Anti-pattern #1 is funny considering several people have said this is exactly how several AWS services are implemented - one Lambda function. Your deployments are much simpler. We built an application using an approach like he described and once you get to 30+ functions, it becomes a hassle.
> use AWS Step Functions to orchestrate these workflows using a versionable, JSON-defined state machine.
I think a better approach would be to use a proper workflow engine [0]. Using a standard modelling language like BPMN and the ability to monitor, audit, and optimize make this a better option than an Amazon-specific approach.
OK, I can't let this slide. I have to call this out. 99% of this is a blatant word-for-word copy of existing AWS documentation. It's not even re-written in his own words, just straight up copied. Every bullet point is there. Every picture is a copy. Looking at Basim Hennawi's other posts on his blog, he seems to have a repeating pattern in his other posts of ripping directly from other site documentation and passing it off as his own.
While point #1 is of course valid, it is most useful for new code, or code with significant resources behind it to optimise it for lambda.
For people with an established codebase, not wanting to go all in on lambda/serverless, or wanting to use existing frameworks, the trade-off might be worth it.
A variant of #1 that I've encountered has a single "service" with multiple entry-point functions, deployed as several lambdas - one per entry point - so each lambda contains a copy of the whole codebase.
Neither of the suggestions for avoiding nested Lambdas seems very satisfactory - the situation described there is one which would make me consider if Lambdas were actually a good fit for my problem.
The #4 is interesting, and it makes me wonder, if the last action of a function is to call another function, will it wait for the termination? Would this be fair to call this a tail-call, and say that the lambda runtime may or may not implement TCO for its functions?
It is possible to use the InvocationType "Event" [1] when invoking a Lambda. Then the API returns immediately, but you will never know if there was an error during the execution of the Lambda.
The author would probably argue for using Step Functions in this case, but using "Event" can also work fine.
Lambda + SNS + CloudWatch (with alarms) is a successful pattern for invoking other lambdas very async and still being notified of problems. Swap in SQS if you're in need of message retry.
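A minimal sketch of such a fire-and-forget call in Python. It assumes `client` is a boto3 Lambda client (for example `boto3.client("lambda")`); the helper name `invoke_async` is hypothetical, and so is any function name passed to it.

```python
import json


def invoke_async(client, function_name, payload):
    """Queue an asynchronous invocation and return without waiting for the result.

    With InvocationType="Event" the Invoke API answers with HTTP status 202
    as soon as the event is accepted; errors raised later inside the target
    function are not reported back to this caller.
    """
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="Event",
        Payload=json.dumps(payload).encode("utf-8"),
    )
    return response["StatusCode"] == 202
```

Passing the client in as a parameter keeps the helper testable locally with a stub in place of the real AWS client.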
On a slightly off-topic note, I really wish big companies would stop co-opting technical terms for their product names. I clicked through happily expecting an article about lambda as in anonymous functions or lambda calculus in general. I know, language evolves and yada yada, but when one actor purposefully forces a new ambiguous use of a word, that is not normal language evolution, that is language vandalism. If you want to use an existing word for your product name, at least pick something from another field, so there is no ambiguity in a specific context (e.g. in a computer forum the word "mouse" unambiguously means the input device, and the word "apache" certainly means the web server or parent org).
That they decided to call a DAG tool "AWS step functions" always puzzled me. I get it now -- like "step by step" -- but didn't a single person do a web search for "step function" beforehand to find it already has a totally separate meaning in mathematics and scientific applications?
You've put one of my biggest frustrations into words very well. Someone trying to learn about serverless architecture is immediately going to run into the serverless framework which will often have nothing to do with what they're trying to accomplish.
Or trying to learn about the CLI (Command Line Interface or Common Language Infrastructure)
A group is trying to propose a standardized language for querying graph databases, à la SQL. They decided to call it Graph Query Language. This has nothing to do with the GQL spec, which itself has nothing to do with graph databases.
The biggest obstacles in this field really feel self-erected sometimes.
"Graph" as used in "Graph Query Language" and "Graph Database" is synonymous.
The bigger problem comes from naive practitioners using a word and insisting to novices it has one exclusive meaning when it plainly doesn't.
"Lambda" is not a protected mathematical term. It's a Greek letter (and 500 other things besides)! The fact that 'lambda calculus' is a recognizable concept is itself a consequence of the language evolution you're complaining about.