
AWS Introducing Provisioned Concurrency for Lambda Functions - marvinpinto
https://aws.amazon.com/blogs/aws/new-provisioned-concurrency-for-lambda-functions/
======
reilly3000
With Fargate Savings Plans and Spot pricing, the cost of running workloads on
Fargate is getting substantially cheaper, and, with the exception of extremely
bursty workloads, it is much more consistently performant than Lambda. The
cost of provisioning Lambda capacity, on top of paying for the compute time on
that capacity, makes Fargate even more appealing for high-volume workloads.

The new Lambda pricing page ("Example 2") shows the cost of a 100M
invocation/month workload with provisioned capacity at $542/month. For that
same cost you could run ~61 Fargate instances (0.25 vCPU, 0.5 GB RAM) 24/7, or
~160 instances with Spot. For context, I have run a simple NodeJS workload on
both Lambda and Fargate, and was able to handle 100M events/mo with just 3
instances.
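Those instance counts can be sanity-checked with a quick calculation (a sketch
assuming us-east-1 on-demand Fargate rates of roughly $0.04048 per vCPU-hour
and $0.004445 per GB-hour, and a 730-hour month; regional rates and month
length shift the result a little):

```python
# Rough sanity check of the Fargate comparison above.
# Rates below are assumed us-east-1 on-demand figures, not from the thread.
VCPU_HR = 0.04048   # $/vCPU-hour (assumed)
GB_HR = 0.004445    # $/GB-hour (assumed)
HOURS = 730         # average hours in a month

# Smallest Fargate task size: 0.25 vCPU, 0.5 GB RAM
task_monthly = (0.25 * VCPU_HR + 0.5 * GB_HR) * HOURS
tasks_for_542 = 542 / task_monthly

print(f"${task_monthly:.2f}/task/month, ~{tasks_for_542:.0f} tasks for $542")
```

This lands near the ~61 figure quoted above; the exact count depends on the
region and billing month used.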

Serverless developers take note: it's time to learn Docker and how to write a
task-definition.json.
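For reference, a minimal task-definition.json for the smallest Fargate size
mentioned above (0.25 vCPU, 0.5 GB) looks roughly like this; the family name,
image, port, and role ARN are placeholders, not values from the thread:

```json
{
  "family": "my-app",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "256",
  "memory": "512",
  "executionRoleArn": "arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
  "containerDefinitions": [
    {
      "name": "my-app",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest",
      "portMappings": [{ "containerPort": 3000 }],
      "essential": true
    }
  ]
}
```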

~~~
munns
Dollar-to-dollar comparisons are one way to compare these two technologies,
but they leave a lot uncovered. The application programming model varies
greatly (socket/port vs. event). There's also a lot more that Lambda brings to
the table in terms of monitoring, logging, etc. that you'd otherwise need to
do work to enable yourself.

Fargate is a great product, but it doesn't completely remove _all_ operational
work to the degree that Lambda does.

~~~
reilly3000
Agreed. For the vast majority of cases, a Lambda function is easier to ship
and maintain, and most likely dramatically cheaper. I really only think the
value of Fargate compared to Lambda kicks in at around 5M+ invocations/month.
YMMV based on workflow and workload.

~~~
efi-lumigo
You are comparing only the costs of running them; what about the cost of the
developers who build/debug/troubleshoot the container? As someone who runs
workloads on both Lambda and Fargate, it's way harder to make things tick on
Fargate.

------
etaioinshrdlu
This feels like a step backwards to me, never mind how necessary it may be.
The magic was paying only for what you use on super bursty workloads.

Now this is like throwing your hands up and saying the users' bursts are too
big for AWS.

~~~
dillondoyle
Honest question, not being sarcastic: if this cold start latency is so
important, why choose functions over Elastic Beanstalk or another auto-scaling
type of system?

The answer could help us try something new. We currently use large Google App
Engine 'apps' after failing to get functions to scale quickly enough (and
hitting limits). We have SUPER bursty traffic that needs to scale up to
hundreds of instances very fast.

~~~
matteuan
> if this cold start latency is so important why choose function over elastic
> beanstalk or other auto-scaling type system?

Pricing! It depends on the application, but there are some use cases where
Lambda is way cheaper, so if we can also partially solve cold start latency,
why not?

~~~
piva00
Way cheaper given a particular workload and access pattern*, Lambda is not
magical and its pricing is tricky to figure out until you have something
running.

------
scottndecker
AWS 2006: "Run your workloads on our EC2 instances in the cloud 24/7."

AWS 2014: "Run your workloads on serverless so you don't have to deal with
those pesky EC2 instances 24/7 anymore."

AWS 2019: "Click a checkbox and you can have your serverless workloads get
dedicated EC2 instances 24/7!"

~~~
cle
That's pretty reductive, it's more like

"Click a checkbox and we'll run your code for you, take care of OS security
updates, compliance requirements, autoscaling, load balancing, AZ resiliency,
getting logs of your box, restarting unhealthy processes, ..."

------
munns
Hey all, I lead developer advocacy for serverless at AWS and have been part of
this product launch since we started thinking about it (quite some time ago, I
should say). I'm running around re:Invent this week, but will try to pop in
and answer any questions I can.

Provisioned Concurrency (PC) is an interesting feature for us, as we've gotten
so much feedback over the years about the pain point of the service overhead
leading up to your code execution (the cold start). With PC we basically end
up removing most of that service overhead by pre-spinning up execution
environments.

This feature is really for folks with interactive, highly latency-sensitive
workloads. It will bring any overhead from our side down to sub-100ms.
Realistically, not every workload needs this, so don't feel like you _need_ it
to have well-performing functions. There are still a lot of things you need to
do in your code, as well as knobs like memory, that impact function
performance.

\- Chris Munns -
[https://twitter.com/chrismunns](https://twitter.com/chrismunns)

~~~
scarface74
The cold start from using lambda has a number of causes

1\. the time to initialize the VM

2\. the time to create an ENI if you are connecting to a VPC[1](until the NAT
alternative rolls out globally)

3\. the time to initialize your language runtime (Java seems to be the worst,
scripting languages the best)

4\. any program initialization done outside of your handler that runs once per
cold start of your lambda runtime.

A fully “warm” instance avoids all four when run.
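Cause 4 is the one developers control directly: anything at module scope runs
once per cold start, while the handler body runs on every invocation. A
minimal sketch (the names and config values here are illustrative):

```python
# Sketch of cause 4: module-level code runs once per execution
# environment (i.e. per cold start); the handler runs per invocation.
import time

# Runs once per cold start -- typically loading config, opening DB
# connections, etc. The values below are stand-ins for illustration.
INIT_STARTED = time.time()
CONFIG = {"table": "orders"}  # pretend this was fetched remotely

def handler(event, context):
    # Runs on every invocation; reuses the module-level state above.
    return {"config": CONFIG, "init_age_s": time.time() - INIT_STARTED}
```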

Is my understanding correct that a “provisioned” runtime that isn’t “warm”
will only avoid the first two?

What state is a “provisioned” instance in?

[1] I refuse to use the colloquial but incorrect statement that the lambda is
“running inside your VPC”.

~~~
munns
FWIW, the new networking for VPC is completely rolled out for all public
regions now (#2). (And thank you, it's called "attached to a VPC", not "in" :) )

This covers you straight through #4.

Now, it's possible that your execution environment could be sitting for some
time waiting for an action, so pre-handler DB connections and things like that
might need to be tweaked in this model.

Thanks, \- munns

~~~
scarface74
So I had to convert three lambda APIs using proxy integration to Fargate,
mostly because of the 6MB request/response limit, but the cold starts caused
us to make a rule that we weren't going to convert our EC2-hosted APIs to
lambda. We were going to host them on Fargate.

But since the APIs that I moved over to Fargate are now automatically being
deployed to both lambda and Fargate with separate URLs, we can A/B test both
and see if we will move to lambda in cases where the request/response limit
isn’t a problem.

Btw, I didn't think using a NAT instead of an ENI had rolled out completely. I
tried to delete a stack recently and it still took a while to "clean up"
resources, which I thought was caused by deleting the ENI. I'll be on the
lookout for it next time I need to delete a stack.

------
nexuist
I am a huge fan of serverless, and AWS as well.

I also find it deeply ironic that their solution to cold starts is to keep the
function running 24/7...

Could I include openssh and Apache in my Lambda instance? Maybe run a
Minecraft server? :P

~~~
karavelov
Your function is frozen if there is no active invoke in progress. So no, an
ssh or Minecraft server will not work unless you make them communicate over
Lambda invocations.

~~~
WatchDog
Even if it were not frozen, the idle "Provisioned Concurrency" price is still
more than paying for an active on-demand ec2 instance.

------
leovingi
Am I misunderstanding something here? Based on the AWS calculations on the
Lambda pricing page, a single 256 MB Lambda would incur a cost of $2.7902232
per month, using "provisionedConcurrency: 1". Pushing it to 3008 MB, to get
access to more processing power, makes that go up to $32.78 per month (EU
London region). Compare that to the standard way of warming it up by hitting
the endpoint once every 5 minutes, which comes out to 8,640 calls per month
and costs next to nothing.
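Those figures can be reproduced with the published per-GB-second Provisioned
Concurrency rate (a sketch assuming roughly $0.0000041667/GB-second and a
31-day month; the exact regional rate may differ slightly):

```python
# Reproducing the Provisioned Concurrency figures above.
PC_RATE = 0.0000041667      # $/GB-second (assumed published rate)
SECONDS = 31 * 24 * 3600    # one 31-day month

def pc_monthly_cost(memory_mb, concurrency=1):
    """Cost of keeping `concurrency` environments warm for a month."""
    return (memory_mb / 1024) * concurrency * SECONDS * PC_RATE

print(round(pc_monthly_cost(256), 2))   # 2.79
print(round(pc_monthly_cost(3008), 2))  # 32.78
```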

Unless I am terribly mistaken, it doesn't seem like letting AWS handle this,
rather than doing it in code (warmup plugin, cron job, etc.), is worth the
cost.

~~~
scarface74
Pinging the lambda to keep it warm doesn’t do the same thing.

When you do that, it only keeps one instance warm. If you have 10 concurrent
requests, even if one is warm, the other 9 requests will still experience a
cold start.

The only way around this is to send a request that holds the connection open
long enough to make sure concurrent requests start a new lambda instance.
While you are keeping the request open, that lambda instance isn’t available
for a real call.

If the entire purpose of lambda is to make things easier, once you start down
the Rube Goldberg path of trying to keep enough instances warm, it kind of
defeats the purpose. Just spend the money and the time to set up an
autoscaling group of the smallest instances of EC2 or use Fargate if you don’t
want the cold start times.
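The concurrency point above can be illustrated with a toy model (purely
illustrative; the real placement logic is internal to Lambda):

```python
# Toy model of Lambda instance reuse: each concurrent request needs
# its own execution environment, so one warm instance can absorb only
# one of N simultaneous requests.
def cold_starts(warm_instances, concurrent_requests):
    """Requests beyond the warm pool each trigger a cold start."""
    return max(0, concurrent_requests - warm_instances)

# One instance kept warm by a 5-minute ping, 10 concurrent requests:
print(cold_starts(warm_instances=1, concurrent_requests=10))  # 9
```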

------
jugg1es
As a seasoned AWS developer, I love this feature. However, I wonder how the
increasing complexity of AWS affects new devs as they try to grok the offered
services. AWS typically does a pretty good job hiding advanced features from
beginners, but I wonder how long they can do that.

------
soamv
Lambda has always been the most expensive compute you can buy on AWS -- you
could think of that as the premium for being the most "elastic". So this
feature is about giving away some of that elasticity for (a) performance
predictability and (b) a bit of total cost savings. Note that you can still
happily "burst" into exactly as much concurrency as you could before, you'll
just have cold starts.

People used to write cron jobs to keep their functions warm, which besides
being ugly didn't even work well -- you could at best keep one instance warm
with infrequent pinging, i.e. a provisioned concurrency of 1. So this feature
addresses that use case in a much more systematic way.

There's some precedent for features like this -- provisioned IOPS and reserved
instances come to mind. In both those cases you trade off elasticity and get
some predictability in return (performance in one case, cost in the other).

~~~
WatchDog
I doubt there are that many people that want a provisioned concurrency of
greater than one.

If you have a reliable base-load of a few requests a second and you don't have
some constraint that forces you to use lambda, you are going to get much
better value running your application on ecs/ec2.

------
hn_throwaway_99
This is a big deal. Cold starts were always the huge Achilles' heel of using
lambdas for interactive APIs. Kudos for this.

------
peterkelly
They really went out of their way to avoid using the word "server" in that
article.

I've always hated the term "serverless", but its usage in this context is even
more ridiculous.

------
tybit
So excited for this, between this and the removal of VPC cold start issues
recently, avoiding Lambda for APIs because of latency seems to be a thing of
the past.

------
gcatalfamo
Sorry for the stupid question, I genuinely want to know: how does this differ
from firing up your function with an additional call every, idk, 5 mins?
Wouldn’t it be cheaper and easier?

~~~
tybit
This works beyond a scale of 1; e.g., if you want to be ready for 100
concurrent invocations, I imagine it's quite difficult to do that with pings.

Also presumably more reliable. With the 5 minute ping the underlying container
will be reprovisioned every few hours. At which point it’s a race to see
whether the next ping comes before the real user request to swallow the cold
start.

~~~
momania
And what if I fire 500 concurrent keep-alives (or maybe a bit more, for
overhead)? Then the 500 Lambda instances will stay alive for a couple of
minutes again, but I'm still only charged for the 500 calls, not the
GB-seconds for the rest of the time the lambdas are alive, right?

~~~
mactunes
How do you ensure that the 500 concurrent keep-alives land on 500 different
Lambda instances? I.e., requests 220 and onwards might hit instances that were
already warmed up by requests 0-219. (I just made those numbers up, of course.)

~~~
momania
That's why I mentioned 'maybe a bit of overhead'. You can also have the keep-
alive handler take a second or two extra to complete, so the lambda stays
blocked for that time and the burst calls get spread out more. It's not going
to be a precise solution, but still, paying for 2-3 seconds every minute
instead of the whole minute is a lot cheaper.

~~~
momania
Ok, I quickly did the math for 500 lambdas (1 GB):

Provisioned: 500 * 86400 (sec/day) * 0.000004167 ~= $180

Keep-alive: 500 * 1440 (min/day) * 2 (sec runtime) * 10 (100ms units/sec) *
0.000016667 (100ms price) ~= $240

Invocation costs on the provisioned ones would be a lot cheaper too: roughly
$45 provisioned vs. $80 for 50M calls.

So depending on the demands, provisioning can be more performant, and cheaper
at the same time.
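For anyone checking the arithmetic, the two daily figures work out as claimed
(using the same assumed 1 GB rates quoted in the comment above):

```python
# Verifying the back-of-the-envelope keep-alive vs. provisioned math.
# Rates are the ones assumed in the comment, for a 1 GB function.
provisioned_per_day = 500 * 86400 * 0.000004167
# 500 keep-alives per minute, each held ~2 s = 20 billed 100 ms units:
keepalive_per_day = 500 * 1440 * 2 * 10 * 0.000016667

print(round(provisioned_per_day))  # 180
print(round(keepalive_per_day))    # 240
```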

------
alexellisuk
This is relatively easy to do with OpenFaaS and Knative on Kubernetes. If
we're paying for idle, why not take a look at EKS on Fargate?

[https://www.openfaas.com](https://www.openfaas.com)

------
macintux
Request for anyone on the Lambda team who happens to read this: your API
doesn’t appear to offer a way to retrieve the “last modified by” user when
grabbing function metadata.

Very unlike other AWS APIs and very annoying.

~~~
msftie
Which API specifically are you referring to (GetFunctionConfiguration?), and
which APIs are you comparing it against?

~~~
macintux
There are other oddities too. For example, to get a layer name from a lambda
definition, the simplest/most robust process I can define is:

1\. Retrieve all layers with list_layers and index them by ARN

2\. Retrieve all function metadata

3\. For each function metadata item, extract all layer version ARNs

4\. For each layer version ARN, call get_layer_version_by_arn

5\. Extract the layer ARN from that result

6\. Use that layer ARN to retrieve the name from the data retrieved in step 1

~~~
msftie
Layer Name and Layer Arn (the Layer Version Arn without a version suffix) are
interchangeable in APIs that require a Layer Name parameter. I understand that
you're trying to extract the "LayerName" field returned in the API response in
ListLayers, but you can do it more concisely.

If you just need the Layer Arn to call other APIs:

a) You could eliminate steps 1 and 6, and use the Layer Arn value from 5 to
call APIs that require a layer name.

b) Alternatively, you could chop the version number off the Layer Version Arn
string(s) yourself in step 4, and skip the GetLayerVersionByArn call in step 5
altogether.

If you explicitly require the name, not the Layer Arn:

c) You could parse it right out of the Layer Version Arn yourself.

When it comes to doing your own string manipulation for (b) or (c), there are
many ways to skin a cat ... but you could use regex (the pattern is in the API
documentation), or split on colons and index the second to last element in the
array.
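Options (b) and (c) boil down to a few lines of string handling; a sketch (the
example ARN below is made up for illustration):

```python
# Deriving the Layer Arn and Layer Name from a Layer Version Arn by
# splitting on colons, per the suggestion above.
def parse_layer_version_arn(arn):
    """Return (layer_arn, layer_name) from a layer version ARN."""
    parts = arn.split(":")
    layer_arn = ":".join(parts[:-1])  # chop off the version suffix
    layer_name = parts[-2]            # second-to-last element
    return layer_arn, layer_name

# Example (fabricated account ID and layer name):
arn = "arn:aws:lambda:us-east-1:123456789012:layer:my-layer:42"
layer_arn, name = parse_layer_version_arn(arn)
print(layer_arn)  # arn:aws:lambda:us-east-1:123456789012:layer:my-layer
print(name)       # my-layer
```

As the follow-up comment notes, this treats the ARN format as stable, which is
the trade-off being debated.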

Is it useful to return the Layer Name in more APIs? What is your use-case?

~~~
macintux
Sorry, been away from this for a while. This is for a dashboard (we have many
accounts and not all stakeholders have permissions across all accounts, so we
need ways to manage the information outside the web console).

I am reluctant to treat ARNs as anything but opaque blobs; I do so when
necessary, but I know their format has changed in the past, and the lack of a
central resource to track breaking changes across AWS means that we would
likely not know that the ARN format is changing until our tools break.

I appreciate the ideas, however, and I'll look more closely at my flow to see
what shortcuts I can live with.

------
stunt
I think this is a really good feature with many use cases. I also anticipate
that many developers who shouldn't use Lambda are going to use it because of
provisioned concurrency.

------
ac360
Provisioned Concurrency is now supported in the Serverless Framework -
[https://github.com/serverless/serverless](https://github.com/serverless/serverless)

------
NewLogic
I'm still frustrated that Lambda can't have alias-specific environment
variables. Aren't aliases supposed to be used for staging function versions
through a release pipeline?

~~~
bni
Alias is a super weird feature of AWS Lambda, IMHO. We set up separate Lambdas
for dev, test, and prod instead, like the Serverless Framework does.

------
k__
At least if you build APIs, you can use VTL and avoid Lambda and its cold
starts completely.

~~~
pumanoir
Which service uses VTL (is it AppSync)?

~~~
k__
AppSync and API Gateway

------
choukri060
Ok

------
tkyjonathan
I am not even sure that the developers around me know how to do concurrency
since moving to microservices.

