Behind the scenes, AWS Lambda (bschaatsbergen.com)
429 points by garblegarble 14 days ago | hide | past | favorite | 84 comments

If you're interested in Firecracker, I wrote a summary of the original paper here: https://www.micahlerner.com/2021/06/17/firecracker-lightweig...

Any idea how much it has diverged from crosvm?

Quite a lot. Initially, a lot of the changes were removing things from crosvm, but adding features like snapshots, and factoring things out into RustVMM, has made them diverge a lot more.

There's some data in the paper about how similar they were then, too.

Great article @mlerner

This is a great article - I really appreciate when people take the time to assemble details from a bunch of different sources (Firecracker paper, re:Invent talks) and turn them into a useful overview like this.

Clearly Bruno got a lot of the details right: Jeff Barr tweeted a link to this a few weeks ago: https://twitter.com/jeffbarr/status/1404512248152825857

A couple of days ago, I tried to find out how AWS operates RDS behind the scenes. Since it is a managed stateful service, I was wondering whether it runs in a traditional VM-based way or in a fully containerized environment. Unfortunately, a simple search only turns up customer-facing resources.

This is a good paper that talks about Aurora and provides some insight into how RDS operates: https://www.allthingsdistributed.com/files/p1041-verbitski.p...

It’s nice that AWS builds their own higher level abstractions on the same primitives outside developers use. Feels like they eat their own dogfood much more than Google where they bypass GCP and instead utilize underlying Borg primitives for many services.

Which gcp services run directly on borg? My understanding is at least bigtable, cloud sql and other dbs are within “hidden” VMs. I think loadbalancers and storage are exceptions but same is true for aws (except the classic elb probably)

Bigtable, Firestore and Spanner run directly on Borg.

Cloud SQL V2 runs in hidden VMs.

Are those not internal services that pre-date GCP that are exposed externally _through_ GCP?

An AWS engineer talks about this here: https://twitter.com/rakyll/status/1415170934609121285?s=20

Other thing AWS has going in its favor is the rest of Amazon uses AWS versus at Google, none of their flagship services run on GCP (though I think YouTube is moving, which is good).

The rest of amazon doesn’t run on aws in its entirety afaik. Also not sure it’s necessarily a good thing either.

This is a new thing really - it used to be that you'd use a different system that's in many ways better integrated with how the rest of development works but far worse in terms of UX and capacity planning, etc. Now many of the tools are basically frankenstein transformation from the old way into the Amazon-specific way AWS is used via the multi-account pattern.

Nice. I wonder if the stateful merits provided and marketed by container orchestrators (e.g. K8S) are something they will consider in the future?

To build a new service at Amazon, the general path of least resistance these days is to use Lambda. If not Lambda, then ECS. If not ECS (or if it requires bare metal) then EC2.

Yeah, except for the lambda time limit...

Given most things are frontended by API Gateway anyway this isn’t too big of a concern since there’s already an effective 30 second timeout there. For ETL like stuff you can use Stepfunctions too.
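
A rough sketch of how the Step Functions route tends to look for ETL-like work, assuming the job can be split into items: the handler does as much as it can, stops before the timeout, and returns a cursor so the next invocation can resume. `get_remaining_time_in_millis()` is the real method on the Python Lambda context object; the `items`, `cursor` and `done` field names, the `process` function and the 5 second safety margin are made up for illustration.

```python
SAFETY_MARGIN_MS = 5_000  # assumed headroom to return cleanly before the timeout


def process(item):
    """Hypothetical unit of work; replace with the real ETL step."""
    return item * 2


def handler(event, context):
    """Process items until time runs low, then return a cursor to resume from.

    `event["items"]` and `event["cursor"]` are illustrative field names,
    not part of any AWS contract.
    """
    items = event["items"]
    cursor = event.get("cursor", 0)
    results = []
    while cursor < len(items):
        # The Lambda context reports how long is left before the timeout.
        if context.get_remaining_time_in_millis() < SAFETY_MARGIN_MS:
            break
        results.append(process(items[cursor]))
        cursor += 1
    return {"cursor": cursor, "done": cursor >= len(items), "results": results}
```

A state machine could then loop on the returned `done` flag, feeding `cursor` back in until the work is finished, so no single invocation has to fit the whole job into one time limit.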

It's still sad, because then you need a lambda for just these one or two things that might run long (or maybe even forever) and now you have to mix lambda and e.g. ECS.

That's fair. If you do need a long-running connection, e.g. for most WebSocket tasks (there is the WebSocket/API Gateway hook, but that's less useful), WebRTC, raw sockets, etc., there is no choice other than ECS or something similar.

Based on how they bill it, it looks like it's running on VMs

Agreed. AWS RDS instance types are just EC2 instance types prefixed with "db." and you're choosing either single-AZ or multi-AZ deployments so presumably AWS is just spinning up 1 to 3 EC2 instances with some preconfigured software on them.

From what I know there is a secret sauce beyond a mere AMI and a control plane, based on some EBS volumes magic. I may be mixing things up with Aurora though.

This is very, very safe to assume. There is probably a hundred engineers' worth of "secret sauce" for an entire managed DB product line.

There were some comments in the early days that the Multi-AZ magic for classic RDS was just drbd on top of EBS.

Aurora is a completely different approach where the RDBMS code is modified to directly interface with EBS instead of going through a traditional OS filesystem layer.

Amazon published a paper describing how Aurora works:


One other thing I learned here is that lambda@edge is not actually run on the edge at all. It is forwarded to the nearest datacenter to execute. Not enough capacity in edges to spin up entire VMs for everything, even with Firecracker.

They do have Cloudfront Functions now for real Edge compute: https://aws.amazon.com/blogs/aws/introducing-cloudfront-func...

Which are barely "compute", but very nice for redirect-rewrite-add-headers type stuff. Like mruby in h2o, lua/js in various nginx modules and so on.

Their JS engine seems quite odd, they selectively have very modern features but not let/const o_0

Great write up. Besides the technical parts, AWS Lambda probably created a ton of new businesses/ startups that otherwise would have been hard or at least expensive to get going.

This is great! Awesome writeup with the kind of details that are sometimes opaque and hard to find documentation for. I recently deployed a NextJS app using Serverless framework (and serverless-nextjs), so Lambda@Edge... looking forward to playing more with compute at the CDN edge in general (e.g. fly.io). Amazing how easy it is, especially as someone who came into webdev in 1998.

Considering your long experience, didn't you feel like we lost a lot post-PHP? I also stepped out of the PHP world into JS, and never understood why there isn't any apache2-modnodejs... And to me, the serverless JS movement seems to be just that, but with a lot of unnecessary baggage.

We surely picked up some baggage. Much of it vendor specific. But also we jettison some? You stick a lambda into API gateway and you're on the internet. No servers. No linux setup. No apache conf.

I'd encourage you to dive in for 20 hours in pure curiosity mode and see what you find.

It's great for a lot of use cases... But unfortunately there are several important points that prevent using this combo as a silver bullet.

API Gateway is limited to 29 seconds of execution; if you need anything longer you will need an EC2 instance (or ECS or Fargate) to act as a web server and call the Lambda (up to 15 minutes). CloudFront is also not an option for this common use case because it's limited to 180 seconds.

Lambda is a proprietary solution that only works for people on AWS. Linux is open, and I just need to put it on a box. How do I install the AWS Lambda stack on a standard Linux box?

You can't.

Are you actually asking this question?

Or are you pretending to ask a question because you think the fact that AWS Lambda runs on AWS is some huge gotcha that I never imagined and no one would ever tolerate?

I explicitly note vendor-specific baggage. AWS revenue is over 45 billion annually and half of customers use lambda.

My point is that the industry really hasn't moved on from the old LAMP stack if it's been replaced by a single company. When it truly comes down to it, the day-to-day tools are not ours if they aren't open.

And if deploying a lambda function on my own hardware is vastly more complex, then the tools haven't really changed, they just got outsourced.

There are a bunch of semi-standards like Serverless Framework and Knative, but nothing concrete.

> And if deploying a lambda function on my own hardware is vastly more complex, then the tools haven't really changed, they just got outsourced.

Well... Yeah?

I came into web development in 97. Back then almost everything was on shared servers, so pretty much the same thing.

I agree the tools got outsourced.

I also agree there are big chunks of LAMP under the covers of running an AWS Lambda. So, in that sense, we haven't "moved on" from the old LAMP stack.

I also agree the tools are "not ours" if they aren't open. They do useful things. It's a tradeoff.

Perhaps a different road to the same problem but on K8s it’s really simple to install a FaaS (e.g., Kubeless and others) provider and get the same benefits. Different metrics sure, but the approach to getting deployable runtimes is the same (if using Serverless framework for example).

Of course then you need a cluster which may be back at AWS as EKS (or not) but at least it’s more open from that perspective.

Lambda gets you back to the one-request-per-process model that made PHP so easy to reason about and kept performance flat. With normally deployed JavaScript and single-process concurrency, callbacks could all complete at the same time and all block waiting to get CPU time to complete the request.
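
To put numbers on that, here is a back-of-the-envelope model (not a benchmark). It assumes every request needs the same amount of CPU time to finish, that all callbacks become runnable at the same instant, and that in the per-process case each request gets its own CPU share. The function names are made up.

```python
def single_process_latencies(n_requests, cpu_ms):
    """All callbacks become runnable at once; one thread runs them back to back."""
    return [cpu_ms * (i + 1) for i in range(n_requests)]


def process_per_request_latencies(n_requests, cpu_ms):
    """Each request has its own process and (by assumption) its own CPU share."""
    return [cpu_ms] * n_requests


if __name__ == "__main__":
    print(single_process_latencies(4, 10))       # [10, 20, 30, 40]
    print(process_per_request_latencies(4, 10))  # [10, 10, 10, 10]
```

Under those assumptions the last request in the shared process waits for everyone ahead of it, while the isolated model stays flat regardless of how many requests arrive together.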

I've been doing software for maybe 25 years. We gained a lot of control and ownership, thanks to FLOSS, 15-20 years ago and then lost a lot.

Really cool post!

From the architecture, it's not really clear to me why Lambdas have the 15 min limitation. It seems to me AWS could use the same infrastructure to make a product that competes with Google Cloud Run. Maybe it's a business thing?

I can't think of any reason outside of product positioning.

A lot of the novelty of Lambda is its identity as a function: small units of execution run on-demand. A Lambda that can run perpetually is made redundant by EC2, and the opinionated time limit informs a lot of design.

It may be product positioning, but Lambda really stems from AWS's desire to do something about the dismal utilisation ratio of their most expensive bill item: Servers [0].

I speculate, 1min or 15mins workloads are optimum to schedule and run uncorrelated workloads. Any more, and it may diminish returns?

[0] https://youtu.be/dInADzgCI-s?t=524 (James Hamilton, 2013)

I loved using spot instances to manage scaling at a startup I worked at; it saved a lot of money compared with using these services they provide.

I find myself favouring Serverless more as it continues to mature, and generally have fewer complaints.

Btw, you'd like AWS Batch: It is a hassle-free, zero-code way to run batch / uncritical workloads on Spots. https://aws.amazon.com/ec2/spot/use-case/batch/

> A Lambda that can run perpetually is made redundant by EC2

Is only conceptually true outside of "EC2 Classic", because (to the best of my knowledge) every other EC2 launches into a VPC, even if it's the default one for the account per region, and even then into the default security group (and one must specify the IDs). That may sound like "yeah, yeah" but is a level of moving parts that Lambda doesn't require a consumer to dive into unless they want to control its networking settings

I would think removing the time limit on Lambda would be like printing money, since I bet the per-second price for Lambda is greater than EC2's.

Lambda does provide a level of convenience via abstraction that EC2 doesn't: just provide inline code, an S3 hosted zip file or, recently, an ECR image and it's off and running.

I doubt this is a differentiator for most medium to large sized customers though. Making a wrapper for invoking uploaded code is trivial and if done on EC2 doesn't come with the baggage of Lambda (cold starts, higher cost, more challenging logging and debugging, lack of operational visibility, etc.)

This service exists, it's called AWS Fargate [0].

[0]: https://read.iopipe.com/how-far-out-is-aws-fargate-a2409d2f9...

Fargate isn't a competitor to Cloud Run (I wish it was) because it doesn't scale to zero in between requests and scale back up again when new traffic arrives.

Check out AWS App Runner. It may do exactly what you’re looking for.

"Easily pause and resume your App Runner applications using the console, CLI, or API. You’re only billed when the service is running."

That doesn't sound like the automated scale-to-zero I get from Cloud Run.

It does scale to zero CPU when your application isn’t serving requests. See the pricing model at https://aws.amazon.com/apprunner/pricing/ for more details. It does not scale to zero memory, however, because customers have told us that cold-start latency has been their biggest pain point with Lambda functions. App Runner containers can respond to requests in milliseconds as a result.

I have the opposite use case.

Most of the stuff my company is running is made up of data pipelines and machine learning pipelines. So we have a lot of infrequent jobs that don't really care about latency.

That sounds like a job for Fargate or EC2 instances that are managed by EC2 Capacity Providers on ECS.

This isn't true.

Fargate scales in minutes, not seconds. And it never scales to zero.


Makes sense!

I wish Fargate was easier to use and had a scale to 0 feature.

If App Runner ends up supporting private deployments then we can have a true Cloud Run competitor.

> I wish Fargate was easier to use and had a scale to 0 feature.

Fargate can be scaled to zero. Also, have you tried the CLI? [0]

[0]: https://github.com/aws/copilot-cli

When I say "scale to zero" I mean like Cloud Run or AWS Lambda: I define it as the service automatically scaling to zero (and hence costing nothing to run) in between requests, but automatically starting up again when a new request comes in - so the request still gets served, it just suffers from a few seconds of cold-start time.

I'm pretty sure Fargate doesn't offer this. It sounds like you're talking about the ability to manually (or automatically through scripting) turn off your Fargate containers, then manually turn them back on again - but not in a way that an incoming request still gets served even though the container wasn't running when the request first arrived.
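
The definition above can be written down as a toy model. This is only a sketch of the behaviour being described, not how Cloud Run or Lambda are implemented: the class name, the idle timeout and the cold-start figure are all invented, and request duration is ignored.

```python
class ScaleToZeroService:
    """Toy model: the instance is stopped after `idle_timeout_s` without traffic."""

    def __init__(self, idle_timeout_s, cold_start_s):
        self.idle_timeout_s = idle_timeout_s
        self.cold_start_s = cold_start_s
        self.last_request_at = None  # None means no instance has been started yet

    def handle(self, now_s):
        """Return the extra latency (seconds) for a request arriving at `now_s`."""
        cold = (
            self.last_request_at is None
            or now_s - self.last_request_at > self.idle_timeout_s
        )
        self.last_request_at = now_s
        # The request is always served; a cold one just pays the start-up delay.
        return self.cold_start_s if cold else 0.0
```

The point of the model is the last line: nothing has to be turned back on by hand, the arriving request itself triggers the start and only pays for it in latency.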

Curious, have you used Cloudflare Workers Unbound to see how it compares with Cloud Run in terms of pricing and cold starts?

I haven't yet - my projects (all based around https://datasette.io/) need full Python support, and it looks like Cloudflare Workers still only work with JavaScript or stuff-that-compiles-to-JavaScript. I don't think I can get Datasette working via the Python-to-JavaScript route, it has too many C dependencies (like SQLite).

Lots of interesting work is being done in this area (I'm currently doing research around serverless). Cold start times still remain a pretty large issue (125ms start-up for a VM is still quite large), but some interesting papers are trying to attack this through strategies like snapshotting!


Also predicting function calls to properly schedule and reduce cold start latency


These 125ms are only the startup time of the MVM and don't include additional latency introduced by optimizing the code package and the involvement of the placement service.

You can also avoid the cold start penalties entirely, if you're willing to pay extra for provisioned concurrency [1].

[1]: https://docs.aws.amazon.com/lambda/latest/dg/configuration-c...
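
For reference, a minimal sketch of setting this with boto3, assuming a function named `my-function` with an alias `live` (both hypothetical). `put_provisioned_concurrency_config` is the boto3 Lambda client call for this; the helper names here are made up, and the builder is kept separate from the network call so it can be checked without credentials.

```python
def provisioned_concurrency_request(function_name, qualifier, executions):
    """Build the keyword arguments for Lambda's PutProvisionedConcurrencyConfig call."""
    if executions < 1:
        raise ValueError("executions must be at least 1")
    return {
        "FunctionName": function_name,
        "Qualifier": qualifier,  # a published version or alias, not $LATEST
        "ProvisionedConcurrentExecutions": executions,
    }


def apply(request):
    """Send the request; needs boto3 and AWS credentials, so it is not run here."""
    import boto3  # imported lazily so the builder above works without it

    return boto3.client("lambda").put_provisioned_concurrency_config(**request)
```

Usage would be something like `apply(provisioned_concurrency_request("my-function", "live", 5))`, after which that many execution environments are kept initialized and billed whether or not they serve traffic.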

This seems to be a solution that comes at the cost of the consumer which is fine if they want to pay for it, and seems to be an option provided for more latency sensitive applications.

Obviously, one could eliminate the cold start issue in general by just constantly paying for a running EC2 instance.

But cold start is still an issue for the provider as cold start is a cost to them, even reducing cold start of internal runtimes would be a massive benefit (For example pre-warmed JITs). Better cold start times means better bin packing for their services, and overall less cost to everyone.

Good to know you enjoyed the read!

nice writeup for how the magic really works. lambdas rock!

Is this write up correct? How do they know that? I don't see any references on info source except a talk at re:invent.

Both the re:Invent talks by Marc Brooker (lead developer on the AWS Lambda team), which I mentioned in the footnotes, and the official documentation that's out there will provide you with a lot of information.

There's a decent references section at the bottom, and having watched the talks and briefly scanning the Firecracker paper referenced, they do back up the writer.

That's a footnote section, and to me the listing is only partially related to the text, i.e. the write-up contains a lot more detail on multiple components.

Thanks though for backing up this write up. That's +1 for confidence.

When I was at re:Invent 2019 I joined some chalk talks which weren't recorded (or not published). Some of the hosts shared a lot of details about their internal infrastructure.

i keep seeing talk about fast firecracker boot times... but as far as i can tell firecracker is something that talks to the KVM apis and does monitoring...

wouldn't fast boot times be a result of kvm and the structure of the VM being booted, or does this boot time metric include scheduling (on one or more firecracker hosts) and delivery of the image to the runner host for the VM?

Great article. Is there an according one about Google Cloud Functions?

Fantastic paper. So I've been playing with the java and python runtimes and it's absolutely stunning how much better python is on execution and start up time.

Also how does an event actually get to the lambda handler? Because they can come from all kind of sources.

Judging from five minutes of browsing the go runtime sources, they poll some lambda API:


The response is apparently posted to another API endpoint.

Googling for the environment variable actually specifying the API endpoint, I found https://docs.aws.amazon.com/lambda/latest/dg/runtimes-custom... which seems to spell out some of the details.

Of course that just moves the question to "where does the runtime API get the event from?" but at that point they could be doing all kinds of things, I guess.
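
The poll-then-post loop described above can be sketched in a few lines, going by the custom runtime docs: GET the next invocation, read the request ID from the `Lambda-Runtime-Aws-Request-Id` header, then POST the result to the matching response path. The `handle` function, the `max_events` cut-off and the injectable `urlopen` parameter are additions for illustration, and error reporting is left out entirely.

```python
import json
import os
import urllib.request

API_VERSION = "2018-06-01"


def handle(event):
    """Hypothetical handler; a real runtime would load the user's function."""
    return {"echo": event}


def run(api_host, handler, max_events=None, urlopen=urllib.request.urlopen):
    """Poll for invocations and post each result back to the runtime API."""
    base = f"http://{api_host}/{API_VERSION}/runtime/invocation"
    served = 0
    while max_events is None or served < max_events:
        # This GET blocks until the service has an event for this sandbox.
        with urlopen(f"{base}/next") as resp:
            request_id = resp.headers["Lambda-Runtime-Aws-Request-Id"]
            event = json.loads(resp.read())
        body = json.dumps(handler(event)).encode()
        post = urllib.request.Request(
            f"{base}/{request_id}/response", data=body, method="POST"
        )
        with urlopen(post):
            pass
        served += 1
    return served


# Only start polling when actually running inside a Lambda sandbox.
if __name__ == "__main__" and "AWS_LAMBDA_RUNTIME_API" in os.environ:
    run(os.environ["AWS_LAMBDA_RUNTIME_API"], handle)
```

Note that this loop only asks for the next event after it has posted the response for the current one, which is how the sketch is written and not a claim about what the service would do if a runtime asked earlier.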

Now I'm curious to see what happens if you ask for more events before returning a response for the first event you got. Can you actually process events in parallel?

I believe they fire up an http server, based on how their local executor behaves, and then do "servlet-y" (or WSGi-y) dispatch into the entry point method

Pretty sure it's just gRPC calls, and firecracker passes events along to its firecracker-containerd services.

My guess is java is slow as it works really well when the JIT can optimize your code. Longer running functions will very likely outperform python.

I'd love to see a similar writeup of Cloudflare Workers.


It’s not that odd if you consider it is the 11th letter of the Greek alphabet.

Can we agree to leave COVID out of this discussion?

The Y-combinator is a function in Lambda calculus.
