- Chris, Serverless@AWS
Cloudflare Workers has the right pricing model. They only charge for CPU time and not wall time. They also do not charge for bandwidth.
> Lots of sub 100ms workloads...
AWS Lambda (or Lambda@Edge), as it stands, is 10x more expensive for sub-50ms workloads (Workers does allow up to 100ms at the 99.9th percentile) that can fit in 128MB of RAM.
I don't know the infra costs of operating lambda, but my guess is that it's far from CPU-dominated.
I would not be surprised if the Cloudflare pricing model is making a trade-off where CPU-bound workloads pay for more of the infra than the rest. It's a valid trade-off to make as a business offering, and it might be feasible given the mixture of workloads. Whether it's the right way is debatable. Whether this model can be tanked by an army of actors taking advantage of the CPU-insensitive pricing remains to be seen; perhaps it's an acceptable risk, since you can observe it and protect against it.
Workers aren't the same as Lambdas; they're super slim JS environments. At a 50ms max runtime most browsers won't even start, let alone fetch and process a page.
Then there are more specialized browser-testing providers like LambdaTest.com and BrowserStack.com
Care to expand on that? What exactly do you mean by "things are way more different"?
It's a quick Google: 128MB max memory, 6 concurrent outgoing connections max, 1MB code size limit. The use case here is a subset of what AWS Lambda can handle. The supported languages also differ (only things that have a JS/wasm conversion work on Cloudflare Workers).
I haven't looked deeply, so please correct me if I'm wrong, but I understand there are also restrictions on the built-in APIs available and the npm packages supported for Node.js.
I would assume some of the above contributes to the price difference.
1 - https://developers.cloudflare.com/workers/runtime-apis/web-s...
For Workers, 50ms is all CPU time. That is definitely not the case with Lambda, which may even charge you for the time it takes to set up the runtime to run the code, plus time spent doing network I/O, plus bandwidth, RAM, vCPUs, and what not.
It's like going to a restaurant that uses bottled water instead of tap water, and they don't provide an answer as to what the benefits of bottled water are.
On Lambda you get a full environment inside an OS container. On Cloudflare you get a WASM process.
The Lambda model is more compatible, which can be a real benefit.
Besides the fact that Cloudflare's part of the Bandwidth Alliance with GCP and other infrastructure providers from which AWS is conspicuously absent, Cloudflare's also slowly but surely building a portfolio of cloud services.
Cloudflare's alliance with other infrastructure providers means Cloudflare's platform isn't really limited to "API" workloads. And that's discounting the fact that Cloudflare recently announced Workers Unbound for workloads that need to run longer (up to 30 minutes), though then they do charge for bandwidth.
Otherwise it's just "AWS said, Cloudflare said"
Not sure about this; most use cases of Lambda use other resources and don't exist in a vacuum. Comparisons should be made using complete systems, not only parts.
Is there fine print on this? Can I put 100TB / mo through their caching servers at the lowest $20 price tier?
Any word on whether we'll also see this change on L@E billing?
-Harlan, the ex-intern
If this also covers Lambda@Edge, it will save us quite a bit of money.
Someone from AWS once commented to me that "if you're ever having to manage a pool rather than letting us manage it, that's a gap in our services".
Thanks! It's been a fun/interesting 4 years in this space :)
Edit: You're spending like 80ms on cold-start of your lambda function, plus network overhead. If you can spare that, you can likely spare the half a millisecond for the 999,900 cycles you're complaining about.
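For scale, a quick back-of-envelope (the ~2 GHz clock is an assumption on my part, not a published Lambda spec):

```python
# Back-of-envelope: how long ~1M cycles take on an assumed ~2 GHz core.
cycles = 999_900
clock_hz = 2e9  # assumed clock rate; actual Lambda vCPU clocks vary
print(f"{cycles / clock_hz * 1e3:.2f} ms")  # -> 0.50 ms
```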
I've never used Lambda, but any time I have a function that I need to run in response to some event or periodically (that's what Lambda is, right?), it's set up in a background worker specifically because it's long and slow, as anything fast can be done synchronously without the overhead.
For longer tasks, spinning up an EC2 or Beanstalk instance is probably the way to go.
As for what to use it for, we used it in our application (deployed to Netlify which uses Lambda under the hood) where lambdas operated like a 'proxy' to various 3rd party API suppliers (Commercetools, Adyen, some age verification service), and those too would use a lambda function to ping back at us (e.g. when payment was confirmed). Worked pretty well, although in retrospect I would've preferred a 'normal', monolithic server to do the same thing.
For some people.
Those cost savings are made up somewhere else.
Ultimately, Amazon is not a loss leader.
It matters close to 0.
You might only drop videos in once a week, but when you do you want to run some code against them. There are plenty of distributed workflow reasons to run long running Lambdas infrequently rather than spinning up and down an EC2 instance.
Not trying to nitpick anything; just curious what was meant by "underpowered". Seems like there's still a breadth of compute-intensive use cases that are more appropriate for lambda--e.g., cost is more sensitive than latency and I have too low a volume of requests for a dedicated EC2 instance to make economic sense. This has been where I've spent most of my career, but no doubt there are many use cases where this doesn't hold.
(ie, if people now use 1.5x as many Lambdas, they can lower costs by 1/3, and everyone wins).
People think about pricing as a zero sum game. In reality, there are very few things in the world that are zero sum games.
This is really, really exciting!
We use very little serverless at the moment, because the three clouds we need to deploy to have infuriating differences between their execution and deployment environments, e.g. how they manage dependencies, the runtimes, and how you describe and deploy each function.
Compare that, at least, to K8s, where the containers you build run just fine wherever you put them.
Docker containers can be up to 10GB; traditional Lambdas are still limited to 250MB.
All else being equal, faster services are cheaper to run. Faster services can service more requests per compute/memory resource, which means you don't have to buy as many servers/containers/whatever. This is particularly important if you're being billed by the ms, which is the context we're talking about here.
Amazon is not going to change software development per se, but at least at some of their customers' sites, calculations will be done on how many engineering hours can be allocated for an n% reduction in runtime. So, if you live in an Amazon universe, this is a real "game changer". Bystanders may chuckle ;)
I'm interested to hear what people think about https://www.infracost.io/docs/usage_based_resources - longer term we could extend that to fetch average_request_duration from cloudwatch or datadog.
Very tired of this: `Duration: 58.62 ms Billed Duration: 100 ms`
Very happy about this: `Duration: 48.74 ms Billed Duration: 49 ms`
1. Python & JS
2. Go
3. C# & Java
I couldn't find any data on Rust.
The understanding at the time was that the Python & JS runtimes are built-in, so the interpreter is "already running". Go is the fastest of the compiled languages, but it just can't beat the built-in runtimes. C# and Java were poorest, as they're spinning up a larger runtime that's more optimized for long-running throughput.
Of course, benchmarks like this only go so far. Use as a starting point for your own evaluation; not as an end-all-be-all.
Laughs in data science.
> a 5 second boot time for a longer-lived service is acceptable.
Not every application can tolerate the occasional 5-second-long request. Just because Python can cold boot "hello world" 3 seconds faster than Go doesn't mean that's going to hold in the real world.
Using data science tooling in a lambda seems iffy, especially ones that are not production ready. And good luck getting such libraries in go.
Python cold booting an interpreter 3 seconds faster than Go is a big deal, especially if your target execution time is <50ms and you've got a large volume of invocations, and are not being silly and importing ridiculously heavy dependencies into a lambda for no reason other than to make a strange point about Python being unsuitable for something nobody should be doing.
Lambdas cold-start during requests. So the unlucky request that triggers a cold start eats that cold start.
> Using data science tooling in a lambda seems iffy, especially ones that are not production ready.
Nonsense, there are a lot of lambdas that just load, transform, and shovel data between services using pandas or what have you. Anyway, don't get hung up on data science; it was just an example, but there are packages across the ecosystem that behave poorly at startup (usually it's not any individual package taking 1-2s but rather a whole bunch of them scattered across your dependency tree that take 100+ms).
> And good luck getting such libraries in go.
Go doesn't have all of the specialty libraries that Python has, but it has enough for the use case I described above.
> Python cold booting an interpreter 3 seconds faster than Go is a big deal, especially if your target execution time is <50ms and you've got a large volume of invocations
According to https://mikhail.io/serverless/coldstarts/aws/languages/, Go takes ~450ms on average to cold start which is still up a bit from Python's ~250ms. To your point, if you're just slinging boto calls (and a lot of lambdas do just this!) and you care a lot about latency, then Python is the right tool for the job.
> not being silly and importing ridiculously heavy dependencies into a lambda for no reason other than to make a strange point about Python being unsuitable for something nobody should be doing.
My rule of thumb (based on some profiling) is that for every 30MB of Python dependency, the equivalent Go binary grows by 1MB. Moreover, it all gets loaded at once (as opposed to resolving each unique import to a location on disk, then parsing, compiling, and finally loading it). Lastly, Go programs are more likely to be "lazy"--that is, they only run the things they need in the main() part of the program, whereas Python packages are much more likely to do file or network I/O to initialize clients that may or may not be used by the program.
The way I'm using Lambda, I build the Lambda image beforehand with the Python packages already installed, so the only time cost is the Lambda spinning up itself.
If you ran e.g. "pip install -r requirements.txt" inside the lambda, then yes it would take time to install the packages.
Meanwhile in Go, dependencies are baked into the executable so there is no resolving of dependencies, and the analog to "module level code" (i.e., package init() functions) is discouraged and thus much less common, and where it occurs it doesn't do as much work compared to the average Python package.
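To illustrate the Python side of that module-level initialization cost, here's a rough sketch of the usual mitigation (boto3/S3 is just an example of a heavier dependency, not something from the thread): defer the import and client creation until an invocation actually needs it, so the cold-start path stays lean.

```python
import json

_s3 = None

def _get_s3():
    # Lazily import and construct the client so module import (the cold-start
    # path) stays cheap; only the first invocation that needs S3 pays the cost.
    global _s3
    if _s3 is None:
        import boto3
        _s3 = boto3.client("s3")
    return _s3

def handler(event, context):
    if not event.get("needs_s3"):
        return {"statusCode": 200, "body": "no S3 work needed"}
    names = [b["Name"] for b in _get_s3().list_buckets()["Buckets"]]
    return {"statusCode": 200, "body": json.dumps(names)}
```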
Edit: Bizarre. Seems like Go on lambda does actually have slower cold start than JS or Python. I wonder if its just that the binary is likely larger than the equivalent JS source code? https://levelup.gitconnected.com/aws-lambda-cold-start-langu...
I hadn't expected it either, but it loads node faster. Perhaps via some VM trick?
A more realistic benchmark would be parsing a 1KB protobuf blob and printing some random key from it.
(this would require importing a non-stdlib parser)
Without knowing how it's implemented, my guess is that they're conserving python/v8 processes, so that they're not cold-starting the interpreter on each lambda execution.
You can't do the same thing for a Go binary, so they have to invoke a binary, which might involve running some scans against it first.
This leads to some pretty counterintuitive conclusions! If you want minimal latency (in Lambda!!), you really should be using JS/Python, I guess.
OK, maybe you could. Go has a runtime after all, although it's compiled into the binary! I have never heard of anybody doing this, but I'd love to read something about it. :)
Dependencies for the dynamic languages matter A LOT! Take a look at what it'll cost you for requiring the AWS SDK in Node.js, for your cold starts https://theburningmonk.com/2019/03/just-how-expensive-is-the...
Personal benchmarks put Rust as the best-performing language I've tried to run on AWS Lambda so far.
Most of the time in Lambda is usually spent waiting for IO, which is slow in any language. If you’re using Lambda for heavy computation, that’s not a great choice.
On the other hand, I've heard legends about sub-10ms Rust Lambdas.
I'm not questioning that 5 is a tenth of 50. I'm questioning the Rust speed :D
It's nice to see this drop, though I'm sure Amazon does it due to competition as well.
Fat profit margins attract competition. This is what happened when the Oracle/Unix combo was chewed up by Microsoft's Windows/SQL Server from the bottom, and then Linux/MySQL started chewing up Microsoft from their bottom. It's a dog-eat-dog world.
- Jeff Bezos
You get customers by being nice to them. Being nice to customers means competitive pricing, high quality support, good documentation, easy integration, etc. It's all driving towards the same goal.
And then you provide a competitor or startup another opportunity.
It's still great to see.
AWS doesn't really ever have a reason to raise prices; it doesn't have to lowball them to attract customers in the first place.
It’s an interesting model because apps either optimize for or happen to fall into “loopholes” where some customers end up getting more value than others or may turn into a financial liability at scale.
For example, think about authentication... charging per auth will mean that some use cases will be nearly free, as some external users may only sign in once per quarter. But charging a flat rate has the opposite effect. You have to design the service and tweak the metrics and rates to make it work.
Maybe because they know Cognito is horrible? ;)
OTOH, I have never seen DynamoDB prices decreasing.
I'm ignorant about AWS Lambda, but how do you know their ms measurement is accurate? Is there any way to verify this?
- Chris - Serverless@AWS
Most workloads ALSO hold memory (which is a key constraint) over the entire wall clock time, and the delays and impacts/costs of hibernating out the memory and then bringing it back so you can just be charged for CPU time may not make sense.
Maybe even accounting for the strong and weak nuclear forces (not sarcasm... we are engineering at the quantum level now; soon it will be a part of a business metric. Instead of 'equipment' being servers, it might be the domain-space-time used).
This won't save me a ton of money, but at least I won't have to guess at whether bumping up to the next CPU+MEM tier will get me from 101ms to 99ms.
A feature I'd really like next is secrets as environment variables like ECS.
Retrieving SecretsManager secrets and SSM Secure Parameters in application code is messy and provides significant friction for developers on my team.
I'm asking for Lambda to have externally supplied secrets
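For context, the friction today looks roughly like this (a sketch; the env var names are placeholders I made up): every function has to fetch and cache its own secrets in code, instead of having them injected as environment variables the way ECS does.

```python
import os
import boto3

# Placeholders: today only the *references* arrive as env vars,
# and the function has to resolve them itself at runtime.
SECRET_ID = os.environ["MY_SECRET_ARN"]
PARAM_NAME = os.environ["MY_PARAM_NAME"]

_cache = {}

def _secret():
    # Fetch from Secrets Manager once per container and cache it.
    if "secret" not in _cache:
        sm = boto3.client("secretsmanager")
        _cache["secret"] = sm.get_secret_value(SecretId=SECRET_ID)["SecretString"]
    return _cache["secret"]

def _param():
    # Same dance for an SSM SecureString parameter.
    if "param" not in _cache:
        ssm = boto3.client("ssm")
        resp = ssm.get_parameter(Name=PARAM_NAME, WithDecryption=True)
        _cache["param"] = resp["Parameter"]["Value"]
    return _cache["param"]

def handler(event, context):
    # Every Lambda that needs these repeats this boilerplate (plus IAM policies
    # for secretsmanager:GetSecretValue and ssm:GetParameter).
    return {"configured": bool(_secret()) and bool(_param())}
```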
Being billed by the millisecond does not mean that you should give a pricing per millisecond.
I prefer Digital Ocean or Heroku's approach of billing by the second but giving the price per month. How on earth is `$0.0000000021 per millisec` better than `$5/month, billed by the millisecond`? If I know that my workload will be about 20% of a dedicated CPU, I know that I'll end up paying about $1 per month.
If I know my function takes around 35ms, and I will probably invoke it 5,000 times per day, then I can calculate my monthly cost: 0.0000000021 $/ms * 35 ms * 5,000 * 30 = 0.011 $/month.
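The same arithmetic as a quick sanity check (duration cost only; the separate per-request charge and the free tier are ignored here):

```python
price_per_ms = 0.0000000021   # 128 MB tier, USD per ms of duration
duration_ms = 35
invocations = 5_000 * 30      # per month
print(f"${price_per_ms * duration_ms * invocations:.3f}/month")  # -> $0.011/month
```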
AWS usually shows a neat example of a use case and what the billing would be on their pricing pages.
No function runs forever.
Also if you want to reboot/repurpose the server, 15 minutes is max wait time.
And finally, you don't want people running long jobs here when you have a solution there.
Maybe you could let users indicate the operation will take a long time... but if the user knows the operation is long running in advance, why not just guide them to a more suitable system?
A 15-minute max runtime simplifies resource management and avoids abuse. If you have long-running jobs, that workload should go to something like AWS EKS/Batch/SageMaker.
That being said, things can change if more and more people require long-running capacity from Lambda (though I am skeptical of that, as Lambda abstracts the underlying hardware away and is supposedly flexible to the requirements).
Chris Munns - Lead of Dev Advocacy for Serverless@AWS
But being real, we hear you on this one. I can't comment on API Gateway's roadmap here, but this is something both teams are aware of. It is the way it is today for a valid reason. But this is definitely something we hear pretty often.
What I care about is:
* Scale to 0, and automatic scaling up without configuring it
* Automatic patching of the OS
* Fault isolation
Lambda gives me that. So each one runs for 15 minutes, processing all data in an SQS queue.
I do wonder if Fargate would be cheaper per millisecond? Dunno.
If the data doesn't have to be processed sequentially, an option is to configure the AWS Lambda function to get invoked for new data in the SQS queue, so you don't have to care about manually fetching data from SQS at all.
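A minimal sketch of that setup (the Records/body shape is the standard SQS event; `process` is a placeholder for your own logic):

```python
import json

def handler(event, context):
    # With an SQS event source mapping, Lambda polls the queue for you and
    # invokes this handler with a batch of records; no manual receive/delete.
    for record in event["Records"]:
        process(json.loads(record["body"]))

def process(message):
    # Placeholder for the real per-message work.
    print(message)
```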
Someone mentioned step functions above. All of our steps run on spots. We also have some tasks like you have that read off queues and do processing, which also all run on spots.
One example we've been wrestling with is a merge operation. Usually it's merging about 1000 records which completes in a few seconds. But every once in a while someone kicks off a job that tries to merge 1,000,000 records and it times out.
We want the benefits of serverless (scale down to zero, up to infinity at the drop of a hat) but these edge cases mean we're having to evaluate other options.
An hour or two would be a good start; then it'd cover 99.9% of requests. With a few hours we could add more nines :)
Was this recent? Should I take a look at this again? My issue with Fargate as recently as a year ago was that running the same workload on ECS (if you can use your cluster nodes efficiently) was half the cost (even without reserved instances).
It's best suited for jobs that can be broken down into tons of small individual computations, or for responding directly to HTTP requests. If you can fit your pipeline / application into that model it's usually beneficial: instant scaling, retries, reliability, etc. Mixed with other concepts like SQS you can build pretty powerful things without having to pay when there's no load.
Lambdas are good at batch jobs where you might need to kick off a few of them but not have a dedicated system for it. I've used them to automate manual customer support tasks that come in sporadically.
It's essentially a way to save some money by avoiding long http requests by buffering requests and sending a callback when the result is complete.
(Google does per-100ms billing and has 6 pre-defined memory sizes to pick from.)