
A shared file system for your Lambda functions - petercooper
https://aws.amazon.com/blogs/aws/new-a-shared-file-system-for-your-lambda-functions/
======
VWWHFSfQ
I'm curious: what do people actually use Lambda for?

I tried Lambda for a use-case that I had in 2018:

We published Polls and Predictions to people watching the 2018 World Cup. We
set the vote callback URL to a function on AWS Lambda.

It failed spectacularly during our load-testing because the ramp-up period was
far too slow. We needed to go from 0 to 100,000 incoming requests/second in
about 20 seconds.

We had to switch to an Nginx/Lua/Redis endpoint because Lambda was just
completely unusable. It would have cost us $27,000/month to pre-provision
10,000 concurrent executions...

What is it that people actually use Lambda for?

~~~
redis_mlc
AWS Lambda usage falls into 2 categories:

1) AWS administration-related actions (required for non-trivial mgmt.)

2) end-user actions (optional).

AWS throttles and limits everything, so expecting 100,000 requests in 20
seconds on a default AWS account is unreasonable.

The Cloud doesn't mean unlimited capacity.

Also, I haven't seen anybody version control lambda code, even in compliance
environments, so something to investigate.

~~~
philliphaydon
> Also, I haven't seen anybody version control lambda code, even in compliance
> environments, so something to investigate.

Huh, how is it any different from building and deploying to anything else?

I store the code in git, build it in TeamCity, and deploy it to S3 using
Octopus Deploy, which updates the Lambda to point to the newly versioned zip.

~~~
WatchDog
It's different because they put that code editor in the lambda console, and
some people actually use it.

I've built a lot of lambda apps and never used it, pretty much everything I do
requires dependencies.

~~~
philliphaydon
> and some people actually use it

Sure, I used a lambda to turn on/off a build server during particular hours. I
wrote it directly into the console.
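
A hedged sketch of that kind of console-written Lambda (the instance ID and hours below are made up, and boto3 is imported inside the handler so the snippet stays importable without AWS credentials):

```python
# Hypothetical scheduled Lambda that keeps a build server up only during
# business hours. The instance ID and schedule are placeholders.

def should_run(hour_utc, start=8, stop=18):
    """True while the build server should be up (hours are UTC)."""
    return start <= hour_utc < stop

def handler(event, context):
    # Deferred imports: boto3 is only needed when this actually runs on Lambda.
    from datetime import datetime, timezone
    import boto3

    ec2 = boto3.client("ec2")
    ids = ["i-0123456789abcdef0"]  # placeholder build-server instance ID
    if should_run(datetime.now(timezone.utc).hour):
        ec2.start_instances(InstanceIds=ids)
    else:
        ec2.stop_instances(InstanceIds=ids)
```

Wired to a CloudWatch Events / EventBridge schedule, that's the whole app.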

For actual applications tho, I don't use the console.

Just because the console exists doesn't mean people are flat out not
versioning their code.

------
kleebeesh
I've found EFS enticing in theory but painfully slow and riddled with issues
in practice. In the past I've tried it thinking "it's basically an EBS volume
I can mount on > 1 EC2 instance," only to find terrible read performance and
misc. low-level NFS errors.

~~~
tidepod12
Dunno your exact requirements or when you last tried it, but they did boost
EFS's read speed (they claim by 400% [1]) as of this April, so it might be
worth looking into again if you're still trying to find a solution.

1: [https://aws.amazon.com/about-aws/whats-new/2020/04/amazon-
el...](https://aws.amazon.com/about-aws/whats-new/2020/04/amazon-elastic-file-
system-announces-increase-in-read-operations-for-general-purpose-file-
systems/)

~~~
spullara
That is great to know, thanks. It was pretty unusable the last time I tried
it.

------
scandox
Serverless is a bit like Stone Soup [1]. This I guess is the point at which
the Tramp says: "Now if you just add a few onions it really helps the
flavour..."

[1]
[https://en.m.wikipedia.org/wiki/Stone_Soup](https://en.m.wikipedia.org/wiki/Stone_Soup)

------
actionowl
I could see this being very useful.

I recall Joyent's solution to this (similar) problem, where you have an object
stored somewhere (e.g. S3) and you want to use that object in a container, but
you have to copy it over HTTP or something to do any work on it, and the object
could be very large.

With Joyent's Manta[1] you would spin up a container right where an object is
stored (instead of bringing the objects to the container via NFS). It also has
map-reduce support.

[1] [https://apidocs.joyent.com/manta/jobs-
reference.html](https://apidocs.joyent.com/manta/jobs-reference.html)

~~~
mlosapio
Sort of. The better analogy would be spinning up compute localized to the S3
object, which would be pretty interesting.

This feature they did release deserves little fanfare.

~~~
hedora
Ummm. Linux’s NFS client includes a kernel page cache.

You can just mmap or read the file without doing anything else. That is zero
or one memcpy overhead.
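
As a rough illustration of that zero-to-one-memcpy path (a local temp file stands in here for a file on the mount, e.g. a hypothetical /mnt/efs/data.bin, so the sketch runs anywhere):

```python
# Sketch: reading a file with mmap, as you would on an EFS/NFS mount.
# The kernel's NFS client fills the page cache; the mapping is a view of it.
import mmap
import os
import tempfile

def read_mapped(path):
    """Map the file into memory and return its bytes (one memcpy out)."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return bytes(m)

# Demo against a local file; on EFS you'd pass e.g. "/mnt/efs/data.bin".
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"hello from the page cache")
tmp.close()
data = read_mapped(tmp.name)
os.unlink(tmp.name)
```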

S3 clients have to copy the data over the network, assemble the tcp packets,
decrypt and checksum for ssl, and then memcpy the result. That’s at a minimum.
They may be doing other work, like verifying the s3 checksum, or allocating
memory to store the object.

They have to do that once per lambda process, again, at a minimum. They might
do it once per lambda invocation.

I wonder how amazon bills DRAM if multiple lambdas mmap the same thing read
only.

~~~
inshadows
Ummm, I'm pretty sure that before data from a remote NFS server makes it into
the kernel cache, it too has to be copied over the network, assembled from TCP
packets, possibly decrypted (krb5p) and integrity-verified (krb5i) with NFS
over Kerberos (otherwise you would have no confidentiality/integrity), and
moved into newly allocated memory. Sure, once it is in the kernel cache and
the data is not modified, there may be just "Is this handle still up to date?"
remote calls, but you could achieve a similar cache with object storage.

------
jupp0r
This is a horrible idea. This gives lambda functions shared mutable state to
interfere with each other, with very brittle semantics compared to most
databases (even terrible ones).

~~~
scarface74
I need this specifically as a part of a state machine. Most of the steps
involve a Lambda loading and unloading csv data between S3, Redshift, and
Aurora where no local storage is needed. The last step where we had to
download the files locally and compress multiple files together was done
manually via a script because they were greater than 512 MB.

We were just about to put the script in Fargate (Serverless Docker) and run a
ECS task as part of the state machine. Now we don’t have to.

~~~
hansef
This - if you have to fetch data from or output data to outside of the AWS
ecosystem, the 512 MB /tmp limit pushes you into the additional (relative)
complexity of having to run on Fargate pretty quickly. Just had to deal with
this for a content ingest job involving pulling a couple GB of data from an
FTP server, processing it and pushing it into an RDS database on an hourly
basis. Would have been super simple if the file was on S3 already.

~~~
jupp0r
Where do you need a distributed file system in this use case? Sounds like all
you need is some local scratch space?

~~~
scarface74
Lambda only gives you 512 MB of local “scratch space”. If it had to provision
and deprovision gigs of space at each invocation, it would probably cause
longer start-up and shutdown times.

------
shoo
I worked with EFS (but not Lambda) in 2017-2018 when migrating an app to AWS -
an app which included a bunch of random application code that assumed it could
read from or write to a network file system. Having EFS as a migration target
to replace on-prem CIFS was relatively pleasant, as it removed the need to
rewrite a bunch of the application code. S3 would have been a reasonable
replacement, but that would have required weeks or months of rewrites to hunt
down filesystem calls and rework them to use the simpler object store API.

One thing that tripped us up at the time was EFS not supporting encryption in
transit, but this was fixed in early 2018 when EFS began supporting stunnel to
wrap the underlying NFS connection in TLS.
[https://docs.aws.amazon.com/efs/latest/ug/encryption-in-
tran...](https://docs.aws.amazon.com/efs/latest/ug/encryption-in-transit.html)

It reads as if this Lambda-integrated EFS works out of the box with encryption
in transit.

~~~
twoodfin
I’m genuinely curious about the factors that made it OK to have an unencrypted
protocol into 2018. Is the AWS infrastructure already encrypting at another
layer? Nobody worries about attacks on data that’s only transiting AWS?

~~~
staticassertion
AWS has stated that internal traffic, at least within a VPC, is "tamper
proof". I don't recall any specific details about that, whether it is the case
across regions, accounts, etc.

~~~
shoo
Back when I was working on this (for a large org), they regarded anything in
AWS as fundamentally lower trust than their own on-premises corporate
networks, so there was a security requirement to do encryption in transit for
all TCP connections -- this was for an application within the VPC with no
public traffic.

~~~
staticassertion
It's gonna depend on your threat model ultimately. Hard to imagine running on
AWS if you don't trust them though.

~~~
twoodfin
It's not so much about trusting AWS not to pry into your data. More that the
attack surface of your data as presented to AWS' other—potentially
hostile—customers is (presumably) significantly reduced by not having the
bytes flying around AWS' infrastructure in the clear.

To put it mildly, Amazon has a lot more folks a lot smarter than me thinking
about these tradeoffs, so I assume they had good reason to think it was fine
the first time around. I'm just surprised that the state-of-the-art for cloud-
hosted services doesn't presume building on TLS from day 1, so I'd love to
know what those reasons are.

------
matteuan
Does anybody have a price comparison between this and storing stuff in an S3
bucket and loading it every time?

------
saurik
OMG THIS IS SO AMAZING I HAVE BEEN WANTING THIS FOR AN ENTIRE YEAR NOW. (I've
been using Lambda to do massively distributed compile jobs, but had reached
the throughput limits I could achieve with distcc-like techniques doing local
preprocessing, and so was looking at doing limited synchronization of my
codebase to S3 to then either link against the compiler or, for other tools
and to let me use the gold standard compiler I want, do C runtime injection to
make it so that when files are opened I pull them from S3... but that entire
process sucked and doesn't really solve the general purpose problem: this
does; this lets me trivially do the moral equivalent of make -j1000 and have
all of the random sub-jobs get executed in lambda functions and have the
compile complete nearly instantaneously. I can even have those jobs just
directly share state and do "exactly what you'd expect" with respect to the
inter-dependency stuff <- which, like, is a tradeoff, but one that fits well
with how most projects are already designed when using make... I'm so pumped
to go back and work on that project again.)

------
boulos
Disclosure: I work on Google Cloud.

It wasn't obvious to me if this is somehow mounting EFS over something other
than NFS (the post never says the words NFS). When people ask "Should I use
Lambda / Google Cloud Functions / Cloud Run against my NFS server?", my
response isn't "How would you set that up?" but "Be careful. Cleaning up NFS
locks held by clients that have gone away is fairly painful, and you have none
of the mechanisms to make sure it exits properly".

Alternatively, you can mount without locking, and then you get one of the
comments downthread about "and now you've given functions shared mutable state
but with bad primitives".

tl;dr: Cool! ... But, how does this handle NFS locking?

~~~
geertj
(PM-T in the EFS team)

This is NFS and locking is fully supported (the blog includes an example).
Because EFS implements the NFS 4.0/4.1 protocols, locks are lease-based and
there isn't a need to clean them up. In the unlikely event of a client crash
where the client still held locks, they will automatically expire once the
associated lease expires.
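
From the client side that looks like ordinary POSIX locking; a minimal sketch (a local temp file stands in for a file on the mount, e.g. a made-up /mnt/efs/app.lock):

```python
# Sketch: advisory locking as a Lambda would use it over NFSv4 on EFS.
# Locks are lease-based on the server; on a client crash the server drops
# the lock once the lease expires, so there's nothing to clean up manually.
import fcntl
import os
import tempfile

def with_exclusive_lock(path, fn):
    """Run fn() while holding an exclusive advisory lock on path."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)  # blocks until the lock is granted
        return fn()
    finally:
        fcntl.flock(fd, fcntl.LOCK_UN)
        os.close(fd)

# Demo against a local temp path; on Lambda you'd lock a file on the mount.
lock_path = os.path.join(tempfile.mkdtemp(), "app.lock")
result = with_exclusive_lock(lock_path, lambda: "critical section done")
```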

------
ralusek
Pretty sweet.

Is anybody using Lambda to run huge MapReduce jobs? Do people still use
Hadoop?

Doesn't this basically just let you have something like HDFS for running large
distributed computations with some shared state, without having to reach for
S3 or redis?

~~~
ignoramous
I know at least one team at Amazon that runs map-reduce-style jobs with
Lambda, but now that Athena supports user-defined functions, I'd personally be
inclined to use it instead of EFS or S3 + Lambda.

------
philsnow
A sharp knife to hand people -- because EFS is just NFS, it relies on NFS for
security / isolation. Everything that can mount a given volume needs to agree
on which unix users are which, and you need to make sure to completely lock
down root access; otherwise you can't enforce any kind of data isolation.

If your use case can deal with one EFS volume per isolation boundary, you can
use IAM to control who can mount what volume, which might be easier to reason
about.

Cloud-y DLP tools don't know about EFS.

~~~
geertj
(PM-T on the EFS team)

The EFS/Lambda integration uses EFS Access Points, which allow you to enforce
a specific POSIX identity and directory for NFS operations. You can also use
IAM policies to require that specific IAM roles/users use a specific access
point.
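
A hedged sketch of what that looks like with boto3 (the file-system ID, UID/GID, and path below are placeholders; the request shape follows EFS's `create_access_point` API):

```python
# Sketch: an EFS Access Point that forces a single POSIX identity and scopes
# clients to one subtree. All concrete values here are hypothetical.

def access_point_params(fs_id, uid, gid, path):
    """Build the create_access_point request for one user + one directory."""
    return {
        "FileSystemId": fs_id,
        # Every NFS operation through this access point acts as this uid/gid.
        "PosixUser": {"Uid": uid, "Gid": gid},
        # Clients see only this subtree; it's created on first use.
        "RootDirectory": {
            "Path": path,
            "CreationInfo": {
                "OwnerUid": uid,
                "OwnerGid": gid,
                "Permissions": "750",
            },
        },
    }

params = access_point_params("fs-12345678", 1001, 1001, "/lambda")

def create_access_point(params):
    import boto3  # deferred: only needed when actually calling AWS
    return boto3.client("efs").create_access_point(**params)
```

An IAM policy can then require that a given role only mounts via this access point.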

~~~
philsnow
Ah, didn't know that part, that's very good to learn. Thanks!

edit: ah, EFS access points are general and you can mount them from EC2
instances? MUCH better.

------
anderspitman
Very cool functionality, but so much complexity to set up, connect, and manage
all the various services.

~~~
lukehoban
There is certainly some additional complexity here over the basic Lambda
serverless setup. But I think the console-driven configuration as outlined in
these posts often makes it harder to see what the core concepts really are.

With infrastructure-as-code tools, this can be a little clearer. At Pulumi we
wrote a post earlier today on configuring the infrastructure needed to use EFS
with Lambda, and it boils down to just a few concepts and a couple dozen lines
of infrastructure code.

[https://www.pulumi.com/blog/aws-lambda-efs/](https://www.pulumi.com/blog/aws-
lambda-efs/)

Some of the complexity here also comes from the fact that EFS is a general
purpose managed NFS service, instead of a fully-abstracted Lambda-specific
feature. That does add a little additional up-front complexity, but means you
can use EFS across all sorts of different compute in AWS - not just Lambda.

------
hedora
Ooh. I wonder if EFS is compatible with SQLite’s NFS mode.

More seriously, this is huge. Unix pipes over shared nfs has always been my
big data platform of choice (since before the cloud, or even google map
reduce). Things finally came full circle.
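
A minimal sketch of SQLite-on-a-shared-mount (the mount path is hypothetical, and SQLite's own documentation cautions about databases on NFS, so treat this as "can work", not "will work"; a temp path stands in here):

```python
# Sketch: open a SQLite database on a path that, on Lambda, would live on the
# EFS mount (e.g. the made-up /mnt/efs/shared.db). SQLite's POSIX locking
# would ride on EFS's NFSv4 lock support.
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "shared.db")  # stand-in for the mount
con = sqlite3.connect(db_path)
con.execute("CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v TEXT)")
con.execute("INSERT INTO kv VALUES (?, ?)", ("greeting", "hello"))
con.commit()
row = con.execute("SELECT v FROM kv WHERE k = ?", ("greeting",)).fetchone()
con.close()
```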

------
Thorentis
Cool, so we're starting to curve more sharply around the full circle we'll
eventually complete.

So now lambda functions can mount persistent block storage.

Next up: allow your lambda functions to run for longer

Then: allow multiple lambda functions to execute concurrently, and
indefinitely, as a group, while having block storage mounted

And finally: use your EC2 instances as lambda functions

~~~
staticassertion
I agree that we will see all of those, except the last one. It's no surprise
that we're bending around, that's normal. But when we get back to the "feature
parity with the past" things will still look very different. We're talking
about fully managed systems here that you can compose together to get parity
with what you have today with unmanaged systems.

I also don't think the time limit on lambdas will extend much further than it
is today. Not sure on that one.

~~~
Thorentis
I'm imagining the full circle looking like web frameworks "compiling" their
controller methods down into lambda functions. e.g. you deploy your Django
application to AWS Lambda and it automatically fires a lambda function when a
view method is hit via a route. This is already happening where people are
deploying static sites backed by lambda function APIs.

~~~
donavanm
> e.g. you deploy your Django application to AWS Lambda and it automatically
> fires a lambda function when a view method is hit via a route.

Is that a bad thing in your opinion? Examples like that were very much a Day 1
use case, or at least intent, for a lot of AWS Lambda. When we were working on
CloudFront Lambda@Edge it was explicitly the intent to move the compute out of
the traditional data center or EC2 instance. We wanted to enable customers to
run their entire application in this fully managed 'serverless' aspect where
AWS can optimize along multiple axes on their behalf.

Internally Amazon is a land of APIs, RPC, and federated services/business
logic. Getting to the point where each API action, or internal method, would
be hosted as a Lambda was entirely a goal.

Source: Principal at AWS. Not in this space directly anymore, but spent time
working on CloudFront & Lambda@Edge.

------
abrookewood
What is the advantage of using this over S3? Is it just the speed & latency
difference between S3 and EFS?

~~~
KaiserPro
You have the ability to seek() without downloading first.

~~~
nickcw
You can seek with S3 using the Range header. The latencies are poor though!
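
To make the contrast concrete, a sketch of both paths (the bucket/key are placeholders, and a local temp file stands in for a file on EFS):

```python
# Sketch: fetching a byte range. On EFS it's seek()+read(); on S3 it's a
# ranged GET. The S3 call is shown for shape only.
import os
import tempfile

def read_range_file(path, offset, length):
    """EFS/NFS style: seek into the file; no download of the prefix."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)

def read_range_s3(bucket, key, offset, length):
    """S3 style: the same bytes via an HTTP Range header (higher latency)."""
    import boto3  # deferred; only needed against real S3
    rng = f"bytes={offset}-{offset + length - 1}"
    obj = boto3.client("s3").get_object(Bucket=bucket, Key=key, Range=rng)
    return obj["Body"].read()

# Demo of the file path against local data.
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"0123456789abcdef")
tmp.close()
chunk = read_range_file(tmp.name, 10, 3)
os.unlink(tmp.name)
```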

------
Konohamaru
This is misdirection by naming, like how University of Phoenix is similar to
Arizona State University, Phoenix. "Lambda functions" sounds like anonymous
functions, but is actually referring to a proprietary interface to AWS
Lambda™. They named it this way so that readers confuse AWS Lambda™ for
programming lambdas.

~~~
QuinnyPig
One of AWS’s core competencies is being bad at naming things.

~~~
KaiserPro
with useless logos.

------
tyingq
Curious if it adds significantly to cold start times.

~~~
digianarchist
They've already reduced cold starts significantly [0] for Lambdas in a VPC; I
can't see them tossing all that good work for this.

[0] - [https://aws.amazon.com/blogs/compute/announcing-improved-
vpc...](https://aws.amazon.com/blogs/compute/announcing-improved-vpc-
networking-for-aws-lambda-functions/)

------
whalesalad
This is going to be a really interesting way to benchmark EFS performance.

------
black_puppydog
Can we please change the title? I was really afraid that AWS might have lost
it and called the new file system "new", which must be the least practical
name for anything from an SEO standpoint.

~~~
tanilama
Amazon New would be the worst name for an AWS product; since they announce
TONS of products every year, nothing is really new.

~~~
booi
Amazon New, a full-featured and elastic system for releasing new products.

------
zelphirkalt
"lambda functions" not "lambda expressions" - It is funny (or not) how Amazon
redefines how a term is used by naming a product. The words actually make no
sense, but people do not care and say it anyway: "lambda function".

~~~
arcturus17
I just glanced over the Wikipedia article for lambda calculus and there is at
least one mention of “lambda functions”, right in the intro.

What’s wrong with “lambda functions”?

~~~
zelphirkalt
To me it sounds like you are saying twice that you have a procedure or lambda.

A lambda ("expression" is often not even said, instead simply "this lambda
here ...") in many programming languages is already an anonymous procedure and
often an anonymous function. So saying "lambda function" makes it sound like
"procedure function" or "function function" or "lambda lambda".

