There's some data in the paper about how similar they were then, too.
Clearly Bruno got a lot of the details right. Jeff Barr tweeted a link to this a few weeks ago: https://twitter.com/jeffbarr/status/1404512248152825857
It’s nice that AWS builds its own higher-level abstractions on the same primitives outside developers use. It feels like they eat their own dogfood much more than Google, which bypasses GCP and instead uses the underlying Borg primitives for many of its services.
Cloud SQL V2 runs in hidden VMs.
Aurora is a completely different approach, where the RDBMS code is modified to interface directly with a purpose-built distributed storage service (not EBS) instead of going through a traditional OS filesystem layer.
Their JS engine seems quite odd: they selectively support very modern features, but not let/const o_0
I'd encourage you to dive in for 20 hours in pure curiosity mode and see what you find.
API Gateway is limited to 29 seconds of execution; if you need anything longer, you'll need an EC2 instance (or ECS or Fargate) to act as a webserver and call the Lambda (which can run up to 15 min). CloudFront is also not an option for this common use case, because it's limited to 180 seconds.
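One common workaround for calls that blow past the 29 s window is to invoke the function asynchronously and pick the result up elsewhere. A minimal sketch with boto3 (the function name and payload are hypothetical; the timeout constants are the limits mentioned above):

```python
import json

API_GATEWAY_TIMEOUT_S = 29      # hard API Gateway integration limit
CLOUDFRONT_TIMEOUT_S = 180      # CloudFront origin-response ceiling
LAMBDA_TIMEOUT_S = 15 * 60      # maximum Lambda execution time

def needs_async(expected_duration_s: float) -> bool:
    """True when the call can't finish inside API Gateway's 29 s window."""
    return expected_duration_s > API_GATEWAY_TIMEOUT_S

def invoke(function_name: str, payload: dict, expected_duration_s: float):
    """Invoke synchronously when the caller can wait, otherwise fire-and-forget."""
    import boto3  # assumed available; the call itself needs AWS credentials
    client = boto3.client("lambda")
    invocation_type = "Event" if needs_async(expected_duration_s) else "RequestResponse"
    return client.invoke(
        FunctionName=function_name,
        InvocationType=invocation_type,
        Payload=json.dumps(payload).encode(),
    )
```

With `InvocationType="Event"` the HTTP call returns immediately; the function's result then has to be written somewhere (S3, DynamoDB, etc.) and polled for separately.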
Are you actually asking this question?
Or are you pretending to ask a question because you think the fact that AWS Lambda runs on AWS is some huge gotcha that I never imagined and no one would ever tolerate?
I explicitly noted the vendor-specific baggage. AWS revenue is over $45 billion annually, and half of its customers use Lambda.
And if deploying a lambda function on my own hardware is vastly more complex, then the tools haven't really changed; they've just been outsourced.
There are a bunch of semi-standards like Serverless Framework and Knative, but nothing concrete.
I came into web development in '97. Back then almost everything was on shared servers, so it was pretty much the same thing.
I also agree there are big chunks of LAMP under the covers of running an AWS Lambda. So, in that sense, we haven't "moved on" from the old LAMP stack.
I also agree the tools are "not ours" if they aren't open. They do useful things. It's a tradeoff.
Of course then you need a cluster which may be back at AWS as EKS (or not) but at least it’s more open from that perspective.
From the architecture, it's not really clear to me why Lambdas have the 15-minute limitation. It seems to me AWS could use the same infrastructure to make a product that competes with Google Cloud Run. Maybe it's a business thing?
A lot of the novelty of Lambda is its identity as a function: small units of execution run on-demand. A Lambda that can run perpetually is made redundant by EC2, and the opinionated time limit informs a lot of design.
I speculate that workloads of 1 to 15 minutes are optimal for scheduling and running uncorrelated workloads. Any longer, and the returns may diminish?
 https://youtu.be/dInADzgCI-s?t=524 (James Hamilton, 2013)
Btw, you'd like AWS Batch: it's a hassle-free, zero-code way to run batch / non-critical workloads on Spot instances. https://aws.amazon.com/ec2/spot/use-case/batch/
That's only conceptually true outside of "EC2 Classic", because (to the best of my knowledge) every other EC2 instance launches into a VPC, even if it's the default one for the account per region, and even then into the default security group (and one must specify the IDs). That may sound like "yeah, yeah", but it's a level of moving parts that Lambda doesn't require a consumer to dive into unless they want to control its networking settings.
I would think removing the time limit on Lambda would be like printing money, since I bet the per-second price of Lambda is higher than EC2's.
I doubt this is a difference marker for most medium to large sized customers though. Making a wrapper for invoking uploaded code is trivial, and if done on EC2 it doesn't come with the baggage of Lambda (cold starts, higher cost, more challenging logging and debugging, lack of operational visibility, etc.).
That doesn't sound like the automated scale-to-zero I get from Cloud Run.
Most of the stuff my company is running is made up of data pipelines and machine learning pipelines. So we have a lot of infrequent jobs that don't really care about latency.
Fargate scales in minutes, not seconds. And it never scales to zero.
I wish Fargate were easier to use and had a scale-to-zero feature.
If App Runner ends up supporting private deployments then we can have a true Cloud Run competitor.
Fargate can be scaled to zero. Also, have you tried the CLI? 
I'm pretty sure Fargate doesn't offer this. It sounds like you're talking about the ability to manually (or automatically through scripting) turn off your Fargate containers, then manually turn them back on again - but not in a way that an incoming request still gets served even though the container wasn't running when the request first arrived.
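For what it's worth, the manual version of "scale to zero" being described is just setting the ECS service's desired count; a hedged sketch (the cluster and service names, and the idle heuristic, are made up):

```python
def desired_count(idle: bool, baseline: int = 1) -> int:
    """0 when the service is idle, otherwise the normal baseline count."""
    return 0 if idle else baseline

def set_fargate_count(cluster: str, service: str, count: int):
    """Scale an ECS/Fargate service up or down by updating its desiredCount."""
    import boto3  # assumed available; requires AWS credentials to actually run
    ecs = boto3.client("ecs")
    return ecs.update_service(cluster=cluster, service=service, desiredCount=count)
```

Unlike Cloud Run, nothing here holds an incoming request open while a task starts: a request arriving while the count is 0 simply fails.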
Also predicting function calls to properly schedule and reduce cold start latency
You can also avoid the cold start penalties entirely if you're willing to pay extra for provisioned concurrency.
Obviously, one could eliminate the cold start issue in general by just constantly paying for a running EC2 instance.
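Provisioned concurrency is configured per function version or alias; a minimal sketch (the function name and alias in any real call would be your own):

```python
def keep_warm(function_name: str, alias: str, instances: int):
    """Pre-initialize `instances` execution environments so invocations
    routed to this alias skip the cold start entirely."""
    if instances < 1:
        raise ValueError("provisioned concurrency must be at least 1")
    import boto3  # assumed available; the call requires AWS credentials
    client = boto3.client("lambda")
    return client.put_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=alias,
        ProvisionedConcurrentExecutions=instances,
    )
```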
But cold start is still an issue for the provider, since cold starts are a cost to them; even reducing the cold start of internal runtimes would be a massive benefit (for example, pre-warmed JITs). Better cold start times mean better bin packing for their services, and lower cost overall for everyone.
Thanks though for backing up this write up. That's +1 for confidence.
Wouldn't fast boot times be a result of KVM and the structure of the VM being booted? Or does this boot-time metric include scheduling (on one or more Firecracker hosts) and delivery of the image to the runner host for the VM?
Also, how does an event actually get to the Lambda handler? They can come from all kinds of sources.
The response is apparently posted to another API endpoint.
Googling for the environment variable actually specifying the API endpoint, I found https://docs.aws.amazon.com/lambda/latest/dg/runtimes-custom... which seems to spell out some of the details.
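Those docs describe a simple HTTP loop: long-poll `/invocation/next`, then POST the result back to `/invocation/{id}/response`. A minimal custom-runtime sketch (the handler is a placeholder; this only does anything inside a real Lambda sandbox):

```python
import json
import os
import urllib.request

API_VERSION = "2018-06-01"

def next_event_url(api_host: str) -> str:
    """Endpoint the runtime long-polls for the next invocation event."""
    return f"http://{api_host}/{API_VERSION}/runtime/invocation/next"

def response_url(api_host: str, request_id: str) -> str:
    """Per-invocation endpoint the runtime POSTs its result to."""
    return f"http://{api_host}/{API_VERSION}/runtime/invocation/{request_id}/response"

def run_loop(handler):
    """Minimal custom-runtime event loop."""
    api = os.environ["AWS_LAMBDA_RUNTIME_API"]  # host:port injected by Lambda
    while True:
        # Blocks until Lambda has an event for this execution environment.
        with urllib.request.urlopen(next_event_url(api)) as resp:
            request_id = resp.headers["Lambda-Runtime-Aws-Request-Id"]
            event = json.loads(resp.read())
        result = json.dumps(handler(event))
        # Posting the response tells Lambda this invocation is finished.
        urllib.request.urlopen(
            urllib.request.Request(
                response_url(api, request_id),
                data=result.encode(),
                method="POST",
            )
        )
```

Since the runtime only asks for the next event after posting a response, a single execution environment processes events strictly one at a time.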
Of course that just moves the question to "where does the runtime API get the event from?" but at that point they could be doing all kinds of things, I guess.
Now I'm curious to see what happens if you ask for more events before returning a response for the first event you got. Can you actually process events in parallel?
My guess is that Java is slow here because it only works really well once the JIT can optimize your code. Longer-running functions will very likely outperform Python.