Things will be going great and then there's the odd two-second delay. I guess it's spinning up a new server or container to run the Lambda in.
Whereas with your own (or well, Amazon's) machines you can scale up before hitting the limits and not need long pauses.
Maybe one day they'll fix that.
I'm curious what's going to happen when everyone else does this too. It goes without saying this isn't the intended use for the price they've set, and it also seems apparent that well over 75% of customers will likely choose to make this performance optimization, either before going to production or after complaints about bad latency.
Also, it's a bit scary that even with keeping a single server warm, you still pay the cold-start penalty on subsequent scale-ups. AFAIK, no cloud provider has claimed to have 'solved' this (yet more secrecy in how the platform is managed).
"you still pay the cold startup penalty on subsequent scale-ups"
That's probably why they don't care.
You just keep one warm AND you pay for it.
If you don't have parallel requests this is a good thing for you, but everyone else doesn't gain much from having only one hot instance.
On the other hand, you can probably get around this with UI tricks when facing an end user. Native apps are installed anyway, and web apps will be delivered via S3/CloudFront etc.
After all this is just keeping it "loaded in memory".
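The "keep one warm" trick usually looks something like this: a scheduled rule (e.g. a CloudWatch Events rule every few minutes) invokes the function with a marker payload, and the handler returns early on those pings so the container stays loaded in memory. A minimal sketch, where the `keep-warm` marker field is an assumption of mine, not any fixed AWS API:

```python
# Sketch of the keep-warm trick: a scheduled rule pings the function
# periodically with a marker payload; the handler short-circuits on it
# so the ping stays cheap while keeping the container resident.

WARMUP_MARKER = "keep-warm"  # hypothetical marker we put in the scheduled event

def handler(event, context=None):
    # Short-circuit scheduled pings: the container stays loaded in memory,
    # so the next real request skips the cold start.
    if event.get("source") == WARMUP_MARKER:
        return {"warmed": True}

    # ... real work for genuine requests goes here ...
    return {"statusCode": 200, "body": "handled"}
```

Note this is exactly the limitation discussed above: the ping keeps one container warm, so any concurrent request beyond that still cold-starts a fresh one.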
I just watched a tutorial where someone described methods to make cold starts faster, and one major point was: get the code size down, because on a cold start the target machine needs to download the code before it can run it.
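A related trick to the code-size point is deferring heavy imports until first use, so the container's initialization does less work; here's a rough sketch, where the "heavy" module is a stand-in (I'm using `json` just so the snippet runs) for whatever large dependency you actually ship:

```python
# Defer a heavy import until the first real request: the cost is paid
# once per container rather than during the cold-start init phase,
# and never at all on code paths that don't need the dependency.

_heavy = None  # cached module, loaded lazily on first use

def _get_heavy():
    global _heavy
    if _heavy is None:
        import json as heavy_module  # stand-in for a large dependency
        _heavy = heavy_module
    return _heavy

def handler(event, context=None):
    lib = _get_heavy()  # import cost paid once per container, not per call
    return lib.dumps({"ok": True})
```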
My team needed to do some data transformations, and setting up S3 put-event-triggered Lambda jobs that pulled the data and transformed it for our data warehouse was so easy to implement it was ridiculous.
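The rough shape of such a put-triggered job: the S3 event carries the bucket and key of the new object, and the handler fetches, transforms, and forwards it. A sketch under stated assumptions: the boto3 fetch is commented out so the snippet stays self-contained, `_body` is a hypothetical test hook standing in for that fetch, and `uppercase_transform` is a placeholder for the real warehouse transformation:

```python
# Sketch of an S3 put-event-triggered transform. The event's Records
# list (bucket name, object key) follows the S3 notification layout;
# the actual object fetch is stubbed out to keep this self-contained.

def uppercase_transform(record: str) -> str:
    # placeholder for the real data-warehouse transformation
    return record.upper()

def handler(event, context=None):
    results = []
    for rec in event.get("Records", []):
        bucket = rec["s3"]["bucket"]["name"]
        key = rec["s3"]["object"]["key"]
        # In real code, fetch the new object:
        # body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read().decode()
        body = rec.get("_body", "")  # hypothetical test hook replacing the S3 fetch
        results.append((bucket, key, uppercase_transform(body)))
    return results
```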