
Firecracker – Lightweight Virtualization for Serverless Computing - leef
https://aws.amazon.com/blogs/aws/firecracker-lightweight-virtualization-for-serverless-computing/
======
sudhirj
What this allows, and I'm hoping a full-fledged service will be announced
on Thursday or Friday, is running containers as Lambdas. I.e. if your
application starts fast enough, you can just have a container start and run
as a request comes in. It can also shut down when it's done running.

This allows things like per-second billing for container runs, serverless
containers (there's no container running 24/7, only when there's traffic),
etc.

~~~
discodave
> containers as Lambdas.

How similar is AWS Fargate to what you're describing?

~~~
sudhirj
With Fargate I need to run one or more containers 24/7, which is useless and wasteful when there's no traffic.

With Fargate-Lambda crossover I wouldn't be running anything 24/7, and it
would be a lot less resource intensive than one Lambda-Container per request
as well.

Google's App Engine got this right when it first launched, but to make it
work Google had to demand that apps be written for its sandbox (like AWS
Lambda), because of which the model isn't as general-purpose. Firecracker
would allow regular containers to be used this way, making a Firecracker
service the first to let general-purpose servers be started and stopped
(all the way down to zero) based on incoming traffic.

~~~
sciurus
I think with the App Engine Standard generation 2 runtimes you don't have to
write to their sandbox anymore. It still has to be one of their supported
languages though, instead of any arbitrary server.

[https://cloud.google.com/blog/products/gcp/introducing-app-engine-second-generation-runtimes-and-python-3-7](https://cloud.google.com/blog/products/gcp/introducing-app-engine-second-generation-runtimes-and-python-3-7)

~~~
thesandlord
Serverless Container support is coming to GCP:
[https://services.google.com/fb/forms/serverlesscontainers/](https://services.google.com/fb/forms/serverlesscontainers/)

------
blasdel
There's a GitHub Pages FAQ describing why it was made and how it fits with
other solutions: [https://firecracker-microvm.github.io/](https://firecracker-microvm.github.io/)

and a high-level design document about how it works:
[https://github.com/firecracker-microvm/firecracker/blob/master/docs/design.md](https://github.com/firecracker-microvm/firecracker/blob/master/docs/design.md)

~~~
espeed
Interesting name choice. When I clicked on the link and saw the name and
design, my first thought was, "Is this a Firebase knockoff...?" [1] ... and
then I scrolled to the bottom to see the copyright and saw this project is by
Amazon Web Services.

[1] [https://firebase.google.com](https://firebase.google.com)

------
krat0sprakhar
> Firecracker was built in a minimalist fashion. We started with crosvm and
> set up a minimal device model in order to reduce overhead and to enable
> secure multi-tenancy. Firecracker is written in Rust, a modern programming
> language that guarantees thread safety and prevents many types of buffer
> overrun errors that can lead to security vulnerabilities.

This is awesome! Really excited to try this out!

------
talawahtech
This is huge! It basically removes the VM as the security boundary for
something like Fargate [1]. This should lead to a significant reduction in
pricing, since Fargate will no longer need to over-provision in the
background the way it did when full VMs were used even for tiny Fargate
launch types.

It should hopefully eliminate the cost disparity between using Fargate and
running your own instances. It should also mean much faster scale-out, since
your containers don't need to wait on an entire VM to boot!

It will be interesting to see what kind of collaboration they get on the
project. This is a big test of AWS's stewardship of an open source project.
It seems to compete directly with Kata Containers [2], so it will be
interesting to see which solution is deemed technically superior.

[1] [https://aws.amazon.com/fargate/](https://aws.amazon.com/fargate/)

[2] [https://katacontainers.io/](https://katacontainers.io/)

~~~
Aissen
Indeed, this seems very similar to kata+runv+kvmtool(lkvm). I'm curious why
they don't provide a comparison. Here's what I gathered:

- it seems to boot faster (how?)

- it does not provide a pluggable container runtime (yet)

- a single tool/binary does both the VMM and the API server, in a single
language.

Can anyone else chime in?

~~~
coder543
> I'm curious why they don't provide a comparison

They do, if you read the FAQs: [https://firecracker-microvm.github.io/#faq](https://firecracker-microvm.github.io/#faq)

~~~
Aissen
I did, and it does not answer my question, because they only address the
runv+qemu use case, not the runv+kvmtool one:

 _Kata Containers is an OCI-compliant container runtime that executes
containers within QEMU based virtual machines_

------
kraemate
Clear Containers (now Kata Containers) did this more than three years ago,
with similar performance numbers (sub-200 ms boot times). It is frustrating,
but not surprising, to see the same regurgitated solution receive this much
excitement. The Firecracker documentation also doesn't mention the
similarity with prior work. Oh well.

[Not affiliated with Intel in any way---just a long-time proponent of the
clear containers approach.]

~~~
talawahtech
The FAQs on the Firecracker website[1] specifically address the difference
between Firecracker and Kata Containers. The main thrust is that they
decided not to use QEMU and instead chose a much more minimal, "cloud-
native" approach that deliberately abandons certain features in order to
gain greater security, efficiency, and agility going forward. They also
decided to implement it in Rust.

Based on the responses I have seen from non-Amazon employees with
experience in this space[2][3][4], it looks like their approach is solid.

It should also be noted that one of the main architects of Firecracker was
formerly the project lead for QEMU[5][6].

1. [https://firecracker-microvm.github.io/#faq](https://firecracker-microvm.github.io/#faq)

2. [https://twitter.com/bcantrill/status/1067326416121868288](https://twitter.com/bcantrill/status/1067326416121868288)

3. [https://twitter.com/jessfraz/status/1067286831287418881](https://twitter.com/jessfraz/status/1067286831287418881)

4. [https://twitter.com/kelseyhightower/status/1067294780948832258](https://twitter.com/kelseyhightower/status/1067294780948832258)

5. [https://twitter.com/jessfraz/status/1067282499938721792](https://twitter.com/jessfraz/status/1067282499938721792)

6. [https://twitter.com/anliguori/status/1067293131366785024](https://twitter.com/anliguori/status/1067293131366785024)

~~~
kraemate
OK, I had missed the Kata Containers blurb in the FAQ, thanks for pointing
it out. In fact the tweets make my point: we are all so blinded by shiny new
releases that we forget their highly incremental nature.

~~~
talawahdotnet
Sure, there are going to be some people who are excited just because
something seems new or because it is written in Rust, but jessfraz and
bcantrill certainly don't fall into those categories. They have a lot of
experience with operating systems, VMs, and containerization, and I don't
get the impression that they are easily impressed by shiny things. Note that
they all work for, or have worked for, AWS competitors (Google/MS/Joyent).

I think what is impressive about Firecracker is that they have chosen to
reuse a lot of the right things (Linux/KVM/Rust) while also taking a new
approach and rethinking important assumptions (no BIOS, no pass-through, no
legacy support, minimal device support).

In my opinion the Firecracker FAQs give sufficient mention to parallel
projects and tools they have built on, like Kata Containers, QEMU, and
crosvm. The developers certainly seem open to collaboration with those
communities.

AWS doesn't have much of a track record in terms of leading open source
projects, so some skepticism is understandable, but I think what we have
seen so far is a very good start.

~~~
steveklabnik
These days, I would expect bcantrill to be excited by something written in
Rust :)

~~~
bcantrill
Hey now -- I'm not quite _that_ easily impressed! ;) This is a problem domain
that I have suffered in[1] -- and we have recently moved from KVM to bhyve[2]
for several of the same reasons that motivated Firecracker. Not that it hurt
that it was in Rust, of course... ;)

[1]
[https://www.youtube.com/watch?v=cwAfJywzk8o](https://www.youtube.com/watch?v=cwAfJywzk8o)

[2]
[http://bhyvecon.org/bhyvecon2018-Gwydir.pdf](http://bhyvecon.org/bhyvecon2018-Gwydir.pdf)

~~~
steveklabnik
Ha! I wasn't trying to imply that it would _only_ take Rust, for sure. :)

I am excited that everyone seems very excited.

------
xaduha
> microVMs, which provide enhanced security and workload isolation over
> traditional VMs, while enabling the speed and resource efficiency of
> containers.

Reminds me of rkt + kvm stage 1
[https://github.com/rkt/rkt/blob/master/Documentation/running-kvm-stage1.md](https://github.com/rkt/rkt/blob/master/Documentation/running-kvm-stage1.md)

Too bad it didn't take off.

------
tlrobinson
This looks great, I’m just wondering what Amazon’s motivation for open
sourcing it is. It seems like some pretty critical secret sauce for making
services like Lambda and Fargate both secure and efficient.

~~~
sciurus
Google recently open-sourced gVisor, which, although implemented
differently, solves a similar problem. Possibly Amazon wants to encourage
other vendors to build integrations with Firecracker rather than gVisor.

[https://cloud.google.com/blog/products/gcp/open-sourcing-gvisor-a-sandboxed-container-runtime](https://cloud.google.com/blog/products/gcp/open-sourcing-gvisor-a-sandboxed-container-runtime)

~~~
toddh
Also, Cloudflare announced Workers, which use isolates.

------
xrd
My big question is: is this only exciting for people doing Lambda at
massive scale?

QEMU is exciting technology and has paved the way for all kinds of
interesting layers. So creating a slimmed-down improvement that really makes
it faster and provides a new Lambda-ish execution context is great.

I'm sure Amazon cares about that. I'm sure people doing millions of Lambda
calls a day care about that.

But, if I'm an entrepreneur thinking about building something entirely new, is
there something I'm missing about this that would make me want to consider it?

Lambda and Firebase Functions are exciting partially because they break
services into easy-to-deploy chunks. And, perhaps more importantly, into
things that are easy to reason about.

But that's not the big deal: the integration with storage, events, and
everything else in AWS (or Firebase) is what really makes it shine. It's all
about the integration.

When I read this documentation, I'm left wondering whether I want to write
something that uses the REST API to manage thousands of microVMs. That
seems like extra work that Amazon should do, not me.
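
To be concrete about what managing microVMs over the REST API involves: per
the getting-started docs, each VM is driven through a handful of REST calls
over its own Unix socket. A rough sketch in Python (the socket path and the
kernel/rootfs filenames are the sample values from the docs, not anything
you'd use in production):

    import json
    import socket

    # Path passed to the firecracker binary via --api-sock.
    SOCK = "/tmp/firecracker.socket"

    def api_put(path, body):
        """PUT a JSON body to Firecracker's API over its Unix socket."""
        payload = json.dumps(body).encode()
        head = ("PUT %s HTTP/1.1\r\nHost: localhost\r\n"
                "Content-Type: application/json\r\n"
                "Content-Length: %d\r\n\r\n" % (path, len(payload))).encode()
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
            s.connect(SOCK)
            s.sendall(head + payload)
            print(s.recv(4096).decode())  # expect "HTTP/1.1 204 No Content"

    # Configure a kernel and a root drive, then boot -- one VM per socket.
    api_put("/boot-source", {
        "kernel_image_path": "hello-vmlinux.bin",
        "boot_args": "console=ttyS0 reboot=k panic=1 pci=off",
    })
    api_put("/drives/rootfs", {
        "drive_id": "rootfs",
        "path_on_host": "hello-rootfs.ext4",
        "is_root_device": True,
        "is_read_only": False,
    })
    api_put("/actions", {"action_type": "InstanceStart"})

That's fine for one VM; it's doing this thousands of times, plus placement
and networking, that I'd rather Amazon handled.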

Am I missing something important here? Surely Amazon will integrate this
solution soon and connect it to all the fun pieces of AWS, but the fact
that they didn't consider or mention it here makes me think it's something
I should not consider right now.

------
Tehnix
I really hope this helps with the cold start times on Lambda. We're
currently looking heavily into moving our API from Lambda to EKS, but if
this improves cold start times, I think we'll wait and see how it ends up
looking in practice.

~~~
sudhirj
Most cold start time problems on Lambda I've seen are VPC-related: public
network Lambdas start in milliseconds, with the main lag being the userspace
code startup time.

Starting a Lambda inside a VPC involves attaching a high-security network
adapter individually to each running process, which is likely what takes so
long. I assume AWS is working on that, though; they've claimed some
speedups unofficially.

If your security model allows, try running your Lambdas off-VPC.

~~~
Tehnix
The VPC startup times are insane, so we quickly moved our Lambdas out of
that, accepting the trade-off.

Our normal cold starts are in the 1-2 second range, and app initialization
comes on top of that. Too high for an API facing users :/

~~~
cddotdotslash
We got around this with a bit of a hack - use a CloudWatch event to trigger a
dummy invocation of your function every five minutes. This keeps the container
"hot" and reduces the start time (and is negligible cost-wise). This won't fix
the cold starts when the function scales up, but it does reduce latency for
99% of our API requests.
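
The handler-side half of the hack is trivial; something like this sketch
(scheduled CloudWatch Events invocations arrive with source "aws.events",
which is what we key off; the response values are arbitrary):

    def handler(event, context):
        # Bail out early on the scheduled "ping" so the dummy invocation
        # costs almost nothing and just keeps the container resident.
        if event.get("source") == "aws.events":
            return {"warmed": True}

        # ... normal request handling goes here ...
        return {"statusCode": 200, "body": "hello"}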

~~~
vdfs
Isn't using a server better in this case? Or does Lambda have some benefits
in this setup?

~~~
cddotdotslash
It's still insanely cheap. You could have millions of executions per month
and only pay $0.50. But if you needed to, it could scale up to billions of
invocations nearly instantly, something a standard server would have trouble
doing as easily as Lambda does.
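
Rough numbers, using Lambda's list prices ($0.20 per million requests plus
$0.0000166667 per GB-second of compute) and assuming a 128 MB function that
runs for 100 ms per invocation:

    # Back-of-the-envelope cost for 1M invocations/month (free tier ignored).
    invocations = 1_000_000
    duration_s = 0.1    # 100 ms per invocation (assumed)
    memory_gb = 0.125   # 128 MB function (assumed)

    request_cost = invocations / 1e6 * 0.20
    compute_cost = invocations * duration_s * memory_gb * 0.0000166667
    print("$%.2f/month" % (request_cost + compute_cost))  # ~$0.41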

~~~
markonen
Then again, you are doing the hacks you describe because it is _not_ scaling
up nearly instantly. The cold start delays are not only an issue when
scaling from zero to one; they hit you whenever you scale capacity up.

~~~
aspyker
This. I keep hearing about hacks where people run a ping to keep a single
instance warm. But that doesn't cover periodic changes in capacity needs,
nor spikes. I would think that to avoid cold starts altogether you'd need a
pinger that sent exactly the difference between peak load and current load.
I would love to hear if anyone is keeping Lambdas warm at more than n=1
capacity.
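
Something like this hypothetical pinger is what I have in mind: fire n dummy
invocations concurrently, so Lambda can't serve them all from one container
(the function name and payload are made up, and the handler would need to
sleep briefly on pings so the invocations actually overlap):

    import json
    from concurrent.futures import ThreadPoolExecutor

    import boto3  # AWS SDK for Python

    lam = boto3.client("lambda")
    N = 10  # target warm capacity

    def ping(_):
        # Overlapping invocations necessarily land on distinct containers.
        return lam.invoke(
            FunctionName="my-api-function",        # hypothetical
            Payload=json.dumps({"warmer": True}),  # handler returns early
        )

    with ThreadPoolExecutor(max_workers=N) as pool:
        list(pool.map(ping, range(N)))

Even then, N has to track the gap between current and peak load, which is
exactly the part nobody seems to have automated.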

~~~
Tehnix
>if anyone is keeping Lambdas warm at more than n=1 capacity

There are various ways to do it, but I feel that it's a very suboptimal
solution, and it still won't guarantee that no cold starts happen.

I've personally come to the conclusion that Lambda is very nice for anything
non-latency-sensitive. We are still using it to great effect for e.g.
processing incoming IoT data samples, which can vary quite a lot, but that
all happens in the backend, and nobody will care if it's 1-2 seconds
delayed.

------
colemickens
The crosvm and Rust angles have me intrigued. I'd been hoping for something
like this ever since I saw the first hints of Rust showing up in ChromeOS
via crosvm.

A compare/contrast with Kata Containers would also be interesting. Their
architectures look similar. (Kata Containers [1] being another solution for
running containers in KVM-isolated VMs, that has working integrations with
Kubernetes and containerd already. Not affiliated, but I'm tinkering with it
in a current project, though I'm also now keen to get `firecracker` working as
well.)

Obviously, if nothing else, QEMU vs crosvm is a big difference, and probably
a significant one, since my understanding is that Google also chose to
eschew QEMU for Google Cloud.

[1]: [https://katacontainers.io/](https://katacontainers.io/)

~~~
aliguori
Kata Containers is a lot of infrastructure for running containers and it uses
QEMU to run the actual VMs. Firecracker just replaces the QEMU part and we're
eager to work with folks like the Kata community.

I love QEMU, it's an amazing project, but it does a ton and it's very oriented
towards running disk images and full operating systems. We wanted to explore
something really focused on serverless. So far, I'm really happy with the
results and I hope others find it interesting too.

~~~
zaxcellent
We felt the same way about QEMU before we started crosvm. Glad to see you all
found some use out of it.

------
tatoalo
It's QEMU without all the legacy stuff, and they also open sourced it.
Interesting.

~~~
jetzzz
QEMU can do much more than this.

~~~
sitkack
Which is exactly the problem.

------
mcrute
More discussion here:
[https://news.ycombinator.com/item?id=18539532](https://news.ycombinator.com/item?id=18539532)

------
sudhirj
@zackbloom, @kentonv hint hint. Isn't this roughly the same memory footprint
as a Worker? CONTAINERS ON ALL THE CLOUDFLARE THINGS!

~~~
zackbloom
Heh. Truthfully, what I'm most excited about right now is being able to
start a worker in less time than it takes to make an internet request. When
you can do that you get magical autoscaling, and it becomes just as cheap to
run something in hundreds of places as in one. As long as you have to invest
~100ms of CPU to get one of these VMs running, I'm not sure it will have
quite the same economics.

~~~
sudhirj
Yeah, jokes aside, I simply don't think it makes sense to run full processes
on the edge. Not yet, anyway.

Script isolates make a lot of sense with current hardware limitations, but
full processes at the edge are coming sooner or later.

~~~
zackbloom
That would make me a little sad. I'm not excited about the idea that we
figured out the ideal way for a program to be encapsulated in 1965 and it will
never change.

------
whalesalad
I'm very excited to play with this technology, in the same way I love
playing with Elixir/Erlang and userland concurrency models. I also love the
_idea_ of Docker (and use it daily) but dislike the ergonomics. My first
thought, particularly given the emphasis on oversubscription, is: how does
scheduling on the host kernel work?

------
mark212
Still seems much slower than the model used by Cloudflare for what they call
"Workers."[1] A recent blog post from a few weeks back was the subject of
considerable discussion here[2], and Cloudflare seems to me to be doing much
the same thing as Firecracker, but faster, because there's less overhead.
But maybe I'm missing something.

[1] [https://blog.cloudflare.com/cloud-computing-without-containers/](https://blog.cloudflare.com/cloud-computing-without-containers/)

[2]
[https://news.ycombinator.com/item?id=18415708](https://news.ycombinator.com/item?id=18415708)

~~~
tlrobinson
> But maybe I'm missing something.

From the "Disadvantages" section of your first link:

"No technology is magical, every transition comes with disadvantages. An
Isolate-based system can’t run arbitrary compiled code. Process-level
isolation allows your Lambda to spin up any binary it might need. In an
Isolate universe you have to either write your code in Javascript (we use a
lot of TypeScript), or a language which targets WebAssembly like Go or Rust."

"If you can’t recompile your processes, you can’t run them in an Isolate. This
might mean Isolate-based Serverless is only for newer, more modern,
applications in the immediate future. It also might mean legacy applications
get only their most latency-sensitive components moved into an Isolate
initially. The community may also find new and better ways to transpile
existing applications into WebAssembly, rendering the issue moot."

------
solatic
"Process Jail – The Firecracker process is jailed using cgroups and seccomp
BPF, and has access to a small, tightly controlled list of system calls."

So basically, a gVisor alternative?

~~~
ec109685
gVisor doesn't use KVM:

"Machine-level virtualization, such as KVM and Xen, exposes virtualized
hardware to a guest kernel via a Virtual Machine Monitor (VMM). This
virtualized hardware is generally enlightened (paravirtualized) and additional
mechanisms can be used to improve the visibility between the guest and host
(e.g. balloon drivers, paravirtualized spinlocks). Running containers in
distinct virtual machines can provide great isolation, compatibility and
performance (though nested virtualization may bring challenges in this area),
but for containers it often requires additional proxies and agents, and may
require a larger resource footprint and slower start-up times."

~~~
solatic
Yeah, but one of the main ways gVisor provides security is by intercepting
system calls and strictly limiting which calls can be made. Firecracker may
use KVM instead of running entirely in usermode, but as far as most of us
are concerned, that's an implementation detail. The pertinent question is
whether the price of security is a restricted set of system calls, which
would mean Firecracker can't run arbitrary containers, just as gVisor
doesn't guarantee it can run arbitrary code (which may require filtered
system calls).

~~~
ec109685
That's not true. Your guest application has access to all Linux system calls
in the guest VM.

You can see the security model here: [https://github.com/firecracker-microvm/firecracker/blob/master/docs/design.md](https://github.com/firecracker-microvm/firecracker/blob/master/docs/design.md)

The Firecracker process itself is limited in the system calls it can make,
but KVM lets the guest Linux kernel expose the full set of system calls to
end-user applications.

------
sdart
Does this provide any multi-host cluster management capabilities?

------
polskibus
Does it support Windows?

~~~
chupasaurus
It's KVM-based, so no, it doesn't.

~~~
perbu
KVM supports Windows just fine, which is why you can run Windows on GCP and
OpenStack. And Firecracker seems to support enough of a machine to boot
Windows, as long as the Windows instance has support for virtio disk devices
and a virtio NIC.

However, it seems they boot in a slightly unconventional way: they take an
ELF64 binary and execute it. This works for Linux, and likely for some other
operating systems that can produce ELF64 binaries. Windows supports legacy
x86 boot and UEFI, but likely not ELF64 "direct boot".

So if you could get Windows into an ELF64 binary and have it run without a
GPU, you could have it boot. So, likely not. But the reason isn't KVM.

------
testbotlo2
Can someone explain to me how this works? Is it an orchestration service for
containers, like Kubernetes, or is it something different?

------
nunez
I am extremely excited by this. I wonder if this can be used to provision
JIT Kubernetes workers.

------
polskibus
How does this compare to containers?

~~~
perbu
Containers share the host OS kernel and some services, which is why a Linux
host can only run Linux containers. This is a virtual machine monitor, so it
deals in full virtual machines.

Firecracker can likely run other operating systems, such as IncludeOS. You
can't run those in containers.

