Firecracker: Secure and fast microVMs for serverless computing (firecracker-microvm.github.io)
425 points by based2 on March 7, 2020 | 103 comments

Firecracker is great. We use it to run fleets of fast-booting VMs at https://fly.io.

It’s really the best OSS to come out of Amazon.

I played around with fly.io for a bit, it seems pretty interesting. It works pretty well too, I went through the setup for the DoH proxy and the latency I get is very similar to Cloudflare itself, so that's pretty awesome.

It seems that the autoscaling limits are only defined in the fly.toml with the soft and hard limits? It might be useful to make this easily visible under flyctl scale. Also if I delete the fly.toml, can I regenerate it easily?

As a sidenote, I was looking around for more information on the platform and looking at old hn posts. I know the company pivoted a couple of times, but all the old articles are 404ing because the blog url changed.

That's nice to read! Thanks.

We do need to clean up our old blog posts and links. We created a lot of content at various times. This content is not always relevant anymore.

As for your fly.toml question, you can get the config with `flyctl config save -a your-app`. It'll create a fly.toml with the latest config we know about.

Concurrency limits are still being worked on. They should definitely be visible in more places. The only way to know about them right now is from the fly.toml, which is not ideal.

I dabbled with an idea similar to Fly.io's Heroku supercharging functionality (Turboku?).

One issue I encountered is that the app in question does not benefit from full-page caching. Even if we deployed our app through Fly.io, we'd still have our databases hosted somewhere else. How does Fly.io solve this, or how could we solve this?

When I dabbled with this idea, I thought about deploying DB read-only replicas around the world. There would be some replication lag, but for the app in question, that would not be problematic. Writes would still be affected by the added latency, but this would not be that problematic as the vast majority of the queries are reads.

You've pretty much nailed the problem. The ?good? news though is that Heroku is really slow, so just running Firecracker VMs on real hardware, doing edge TLS, and adding http2 + brotli is a huge win.

When people use https://fly.io/heroku, we launch VMs in the same region their Heroku app is in so there's no latency hit to the DB. Weirdly, latency between a Fly app and a DB on AWS in the same region is sometimes even better than AWS cross-zone latency.

We _also_ give apps a special global Redis cache (https://fly.io/docs/redis). This is sometimes enough to make a full stack app multi-regional. Usually people cut way down on DB queries when they use their framework's caching abilities, which can make it pretty nice to run a Rails app + cache in, say, San Jose while the DB is in Virginia.

I know of a couple of devs running Elixir apps on Fly that leave a data service in the region where their DB is and basically rpc to it from other regions, which seems to work well.

Read replicas are a good idea, we'd actually like to try that out at some point. It seems pretty doable to put something like pgbounce/pgpool in front of a read replica and let it handle routing write transactions properly.

> The ?good? news though is that Heroku is really slow

Did you do any measurements and if so, on which dyno types? We found that using Performance-M dynos gives us a rather large performance boost. Performance-M dynos are also more stable because they run on dedicated hardware. They're expensive, but we don't run any apps in production without them.

One thing that worked really well for us is to just put Cloudflare or Cloudfront in front of our app. As I mentioned, we don't do any full-page caching. We cache pretty much everything else, but pages themselves have zero caching (business requirement). I believe Cloudflare and Cloudfront also do edge TLS.

> Read replicas are a good idea, we'd actually like to try that out at some point. It seems pretty doable to put something like pgbounce/pgpool in front of a read replica and let it handle routing write transactions properly.

This is going to be tricky. We weren't able to set up replication from Heroku Postgres databases to hosts outside of Heroku. Another thing to keep in mind is that it might be better to let the app decide what is a read query and what is a write query. We have some parts of the app that need to read directly from the master, so we let the app handle it. The app receives two database URIs, both pointing to pgbouncer.
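
A minimal sketch of the app-level routing described above, assuming the app is handed two URIs that both point at pgbouncer. The class name, flag names, and URIs are hypothetical and only illustrate the idea of letting the app, rather than a proxy, pick the target.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseRouter:
    """Picks a connection URI per query; the app, not a proxy, decides."""

    primary_uri: str  # hypothetical: pgbouncer pool in front of the primary
    replica_uri: str  # hypothetical: pgbouncer pool in front of a read replica

    def uri_for(self, *, write: bool = False, needs_fresh_read: bool = False) -> str:
        # Writes, and reads that cannot tolerate replication lag,
        # go to the primary. Everything else goes to the replica.
        if write or needs_fresh_read:
            return self.primary_uri
        return self.replica_uri


# Placeholder URIs for illustration only.
router = DatabaseRouter(
    primary_uri="postgres://app@pgbouncer.internal:6432/primary",
    replica_uri="postgres://app@pgbouncer.internal:6432/replica",
)
```

The `needs_fresh_read` flag covers the "some parts of the app need to read directly from the master" case without making every read pay the cross-region latency.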

Cloudflare and Fly are both reverse-proxy CDN services that handle caching and TLS at the edge. They also both support running arbitrary logic at the edge. Cloudflare has Workers (javascript web workers API) with their custom KV key/value persistent data layer. Fly started similar but now supports containers running anything and has a Redis non-persistent cache layer.

If all you're doing is caching some endpoints and TLS termination then either will work. Cloudflare has a bigger network with robust security capabilities, Fly has more flexibility in application logic you can run.

Data has gravity and having a globally distributed database layer is something companies have spent millions on. Usually the solution is to cache as much as possible in each region first, then look at doing database replicas, and eventually multi-regional active/active database scale-outs.

> Did you do any measurements and if so, on which dyno types? We found that using Performance-M dynos gives us a rather large performance boost. Performance-M dynos are also more stable because they run on dedicated hardware. They're expensive, but we don't run any apps in production without them.

We did some measurements, but mostly focusing on the network bits (which you largely solved with CloudFlare): https://fly.io/blog/turboku/

I was surprised at how much faster things seemed on our VMs vs Heroku's Dynos, to be honest. We only compared Standard dynos, but we should be even better price vs performance compared to the performance dynos since we run our own physical servers. A performance-m dyno on Heroku costs about the same as 8 cpus on fly.

It's totally self serving, but if you feel like playing around with the Fly stuff I'd love to know how it compares.

> This is going to be tricky. We weren't able to set up replication from Heroku Postgres databases to hosts outside of Heroku. Another thing to keep in mind is that it might be better to let the app decide what is a read query and what is a write query. We have some parts of the app that need to read directly from the master, so we let the app handle it. The app receives two database URIs, both pointing to pgbouncer.

This is why I think in-memory caching is such a good option. Usually if I'm building an app, I'll add a caching layer before a DB replica. Write-through caching seems to fit my mental processes better. :D
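
For readers unfamiliar with the term, here is a rough sketch of write-through caching. Plain dicts stand in for the database and for an in-memory cache such as Redis; the class and key names are made up for illustration.

```python
class WriteThroughCache:
    """Every write goes to the backing store first, then to the cache."""

    def __init__(self, store: dict):
        self._store = store  # stand-in for the real database
        self._cache = {}     # stand-in for an in-memory cache

    def set(self, key, value):
        self._store[key] = value  # write to the source of truth first
        self._cache[key] = value  # then update the cache so reads stay warm

    def get(self, key):
        if key in self._cache:
            return self._cache[key]
        value = self._store[key]  # cache miss: read the store (KeyError if absent)
        self._cache[key] = value
        return value


db = {"user:1": "alice"}
cache = WriteThroughCache(db)
cache.set("user:2", "bob")
```

Because the cache is updated on every write, reads in a remote region can usually be served without a round trip to the database, at the cost of slightly slower writes.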

I tried adding Cloudfront in front of an app hosted on Heroku with page caching off (Vary by Cookie) and it increased latency to 3X, never could figure out why. Would like to do it but seems like way too much of a trade off.

Shameless plug, https://fly.io/heroku gives you a lot of the benefit of Cloudfront without adding a layer. It's like running Heroku with a modern router.

--edit-- I confused Cloudfront and CloudFlare yet again. :)

Yeah, I’m gonna try this out. I’m writing a book authoring platform and I want to let people give access to their books on their own domains. Is there an API to add custom domains with LE certs to my app? And is there a limit on how many domains I can add?

There is indeed a certificate API! We’re putting up a guide for it this week, I can send you the draft if you’d like. The CLI commands for managing certs are here: https://fly.io/docs/flyctl/certs/

There’s no limit to domains.

Hi Kurt,

What are usual cold-start times you see with firecracker?

What other VMMs or Unikernels did you consider before settling on firecracker?

Was the firecracker documentation good enough or did you have to go digging through emails or code to figure out certain things?

What was the hardest part of using firecracker in production?


Hello again!

Cold starts depend a lot on what people actually deploy. They're really fast for an optimized Go binary, really slow for most Node apps. We were playing with Deno + OSv just today and got an app to boot and accept an HTTP request in about 20ms. That assumes you have the root fs all built and ready to go, though. Pulling down images and prepping everything is a bit of a bottleneck for that.

We looked at gvisor pretty hard but preferred more traditional virtualization. We didn't look much at other virtualization options, Firecracker was really good from day one.

The docs were pretty good. We ended up having to build a bunch, though, probably just because of the nature of our product. We built a custom init (in Rust), a Docker image to root fs builder, and a Nomad task driver (both in Go). The init includes an RPC mechanism so we can communicate with the VM.

Firecracker was pretty easy, building the scaffolding to use it was a little harder, but the vast majority of our time is spent on the dev UX and proxy/routing layer.

Your tech stack is really fantastic and cutting-edge.

Core product in Rust, Firecracker Micro-VMs, Nomad instead of k8s (never used it myself but see the strong value in it and think it makes sense + deserves more attention), and experimentation with Deno (huge fan).

I wish I could clone myself and do some work for you guys just to soak up that knowledge.

Even after reading the comments and looking through the site I'm still not sure what fly.io is from a developer perspective. Is it a drop in replacement for heroku? How different is it from cloud run or cloud functions?

It's closest to Cloud Run. We run your containers and scale across regions to minimize your users' latency.

You can't quite use it as a drop in replacement for Heroku since we don't have a Postgres offering. You can use fly to replace the web dynos in a Heroku app for faster performance, though.

Thanks for clarifying! Does it scale down to zero? If I have a very small hobby app that may only have a few users a day would it make sense to throw it on to fly.io?

It would make sense! We give free credits specifically for side/hobby apps (in theory, you can run ~3 microscopic VMs pretty much full time with this): https://fly.io/docs/pricing/#free-for-side-projects

We don't scale to 0 because the cold start experience for most apps is brutal. In the future we may be able to suspend VMs and wake them up quickly, or even migrate them to regions with idle servers.

What do you use for orchestration?

Nomad + our own firecracker task driver. There's a promising open source task driver for Firecracker as well (ours does a ton that's specific to our networking setup): https://nomadproject.io/docs/drivers/external/firecracker-ta...

What's OSS in this context?

Open source software.

Oh FOSS. Got it.

Do people still try to claim that source available is open source? I go by the OSI definition.

That was weird, I'm not sure why that comment got flagged.

I'm not sure what you're asking.

I really was confused about what "OSS" meant w/o the "F" in front.

I played with Weave Ignite the other day which is a Docker-like CLI for Firecracker. Sure there were some rough edges but the overall experience was pretty good. If you are familiar with the Docker CLI you will be able to get some virtual machines up and running very quickly.

Two questions in case someone from Weave tunes into the discussion:

I got the impression that VMs needed an SSH server to be accessible. Is this correct and if so will it be possible to implement something similar to docker exec so that I won't need an SSH server on every VM?

> At the moment ignite and ignited need root privileges on the host to operate due to certain operations (e.g. mount). This will change in the future.

Is there a timetable and could you perhaps elaborate a bit as to why it currently requires root? (I don't know anything about virtual machine internals so this isn't a passive-aggressive question from my side. It's genuine curiosity.)

Containers are provided by host kernel cgroups and namespaces, so the kernel can implement the attach (exec) operation, which is practically just running a new process (e.g. bash) in a cgroup (container).

Virtual machines are provided by software or hardware emulation and run a separate guest OS with its own kernel. There is no standard way for a host to run an arbitrary process inside the guest OS and interact with its stdio, because the host simply is not aware of what exactly you run inside.

The solution is to have an agreed connectivity standard on both the guest and the host. The guest can provide an SSH server, telnet server, serial terminal, IRC bot, or some other kind of control capability. Then of course the host needs tooling too, e.g. an SSH client.

Are there any real alternatives to SSH and/or sftp? E.g. a mutual TLS authenticated HTTP server...


Why not use a virtual terminal? That seems like a pretty standard machine interface to use if the machine is virtualized.

>Is there a timetable and could you perhaps elaborate a bit as to why it currently requires root? (I don't know anything about virtual machine internals so this isn't a passive-aggressive question from my side. It's genuine curiosity.)

Not from Weave, but I might have an idea as I've played with Firecracker a bit. When you start up a Firecracker VM, you need to provide it with a rootfs drive, which is a file containing the root file system to be used for the VM. Ignite uses OCI images, so I guess they are doing something similar to [1] in code. The `mount` part requires sudo, so that would be my guess as to why you need root.

[1] https://github.com/firecracker-microvm/firecracker/blob/mast...

I’ve personally never needed to ssh into my “cattle” even when I could (EC2 instances in an autoscaling group, Docker containers run with Fargate) and I haven’t missed it when I couldn’t (lambda). For Fargate/Lambda all console output goes to CloudWatch, for EC2 tasks (legacy Windows), we use Serilog with a CloudWatch sink.

But more generically, if you have to log into your cattle for troubleshooting, you probably need a better logging infrastructure.

Then again, if you are referring to how to initially install software, wouldn’t you usually just create an image for it to run?

It's for initial installation.

I use it to figure out what the image and orchestration settings should be. Packaging an application for containers and container orchestration takes me many, many, attempts to get right. Personally I'm unable to divine the correct combination of settings by reading documentation alone so I try something, enter the container, and look at the outcome.

It's nice to have ptrace once in a while.

The only time I can think of where copious logging wasn’t good enough for “remote debugging” is when I was writing C code trying to figure out why I was killing the call stack or overwriting memory.

Having metrics and logging is a requirement to find out that there is a problem, but it doesn't help much to find out what the problem is.

It depends on the granularity of logging. What can local debugging tell you that copious logging at the “debug” level can’t?

At least crosvm which this is based on offers serial device emulation.

> I got the impression that VMs needed an SSH server to be accessible. Is this correct and if so will it be possible to implement something similar to docker exec so that I won't need an SSH server on every VM?

If you want the capability to exec processes from the host into the VM, I think either Docker API or Kube API is the thing for that, as I understand it. If you could kernel exec processes directly into a VM, then it would not be isolated from the host, this seems almost tautological.

You can arrange for process execution another way than SSH, Docker, or Kube API but regardless of what shape it takes, it will still be an entry point in similar fashion to any of these, as the MicroVM or VM runs its own kernel on KVM and does not talk to the host in this way. Perhaps someone knows more about KVM and can clue me in further if there is more here than meets the eye and maybe what you said is possible.

If you don't need the isolation of a proper VM and were only looking for a roughly VM-shaped system that you CAN "kernel exec" or use nsenter to get processes into, you should look at Footloose[1].

I'm suggesting what you are looking for is actually a container that looks more like a VM or bare-metal machine from the perspective of inits and with the vantage point of the processes running inside.

By default, Footloose nodes run SSH and systemd and may appear to work similarly to Ignite VMs, but they are Docker containers that may or may not run in privileged mode.

So, if it suits you, then you could start up Footloose "VMs" as I still call them, strip SSH from them, then nsenter or Docker exec into them as you desire, or run Kubernetes on them and use the Kube API including exec.

That is actually a lead-in to the next project, known as Firekube[2], kind of a mashup of all these technologies plus one more (wksctl[3]). Firekube integrates both Ignite and Firecracker as well, so you can use it similarly on Linux (where KVM support is available for Ignite) or macOS, where Footloose runs container-VMs instead; both behave alike. This suite of projects, all put together, is a very slick and well-integrated package IMHO. It is probably comparable in functionality to Minikube, but with GitOps baked right in.

Disclosure: I am not working for Weaveworks, but we are good friends.

[1]: https://github.com/weaveworks/footloose

[2]: https://github.com/weaveworks/wks-quickstart-firekube/

[3]: https://github.com/weaveworks/wksctl

i think using an emulated serial connection or similar driver interface could be part of a solution to this.

You can replace SSH requirement with virtio-vsock (also supported by Qemu)


"Firecracker provides a rate limiter built into every microVM. This enables optimized sharing of network and storage resources, even across thousands of microVMs."

Probably the most interesting feature.
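
For context, rate limiters of this kind are commonly built on token buckets. The sketch below is a generic token bucket, not Firecracker's implementation; the parameter names `size` and `refill_time` are chosen for illustration, and time is passed in explicitly so the behaviour is easy to follow.

```python
class TokenBucket:
    """Generic token bucket: `size` tokens refill fully every `refill_time` seconds."""

    def __init__(self, size: int, refill_time: float, now: float = 0.0):
        self.size = size
        self.refill_time = refill_time
        self.tokens = float(size)  # start with a full bucket
        self.last = now

    def try_consume(self, amount: int, now: float) -> bool:
        # Refill in proportion to elapsed time, capped at the bucket size.
        elapsed = now - self.last
        self.tokens = min(self.size, self.tokens + elapsed * self.size / self.refill_time)
        self.last = now
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False  # caller should throttle or queue the I/O


bucket = TokenBucket(size=100, refill_time=1.0)
```

One bucket per device and per direction (for example, bytes for bandwidth and a separate count for operations) lets a host cap each microVM independently while still allowing short bursts.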

This is enabled by disabling anything that depends on pinned kernel memory.

You get a VM with user space networking and a one button keyboard. This allows the kernel to aggressively swap out unused resources and ultimately increase the total concurrency supported.

One of the things you learn about multitasking is that in theory cooperative multitasking is the most efficient, but the least reliable. It’s cheaper (for the human) to use hard and fast rules that trade nasty surprises for vague disappointment.

This sounds like a hybrid system. The intermediary is cooperative, the client code is oblivious. I’m curious to see how this plays out over the long haul.

This runs Amazon lambda, so presumably well enough?

You don’t think 12 years from now we’ll be laughing about how lame Lambda is?

Coding practices change like the seasons. Every year is a little different, maybe a little better than the last, some things feel brand new, but the general patterns hold.

I'm sure we'll laugh about most things we are currently using and gasp when we need to revisit them in legacy systems. :)

Can I run a single process in a Firecracker microVM, or do I need to bring a full distro?

Seems like you can, but probably shouldn't since you'd want to run services like chrony to keep time inside the VM.

I'm curious about restarts and snapshotting: would it be feasible to reset a microVM for each incoming request?

Snapshot / suspend isn't doable yet, but it sure would be nice.

We've booted some apps in Firecracker in <20ms, so if you build the app properly you can absolutely do a new VM per request or TCP connection even.

Yeah, this is what AWS does....

No it doesn't. It pre-inits the Firecracker VM's before the first request: https://www.usenix.org/system/files/nsdi20-paper-agache.pdf

I meant the part about one request per VM.

Might consider a pre-emptive approach that always keeps a few warmed up and ready to receive a request. The determination on how many to keep warm could be a function of traffic over the last X period of time.
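
A rough sketch of that sizing idea, assuming a sliding window of request timestamps and a known boot time per VM. The class, its parameters, and the formula (recent request rate times boot time) are illustrative assumptions, not how any particular platform does it.

```python
import math
from collections import deque


class WarmPoolSizer:
    """Suggests how many pre-booted VMs to keep, based on recent request rate."""

    def __init__(self, window_seconds: float, boot_seconds: float, minimum: int = 1):
        self.window = window_seconds
        self.boot = boot_seconds  # hypothetical: time to boot one replacement VM
        self.minimum = minimum
        self.arrivals = deque()   # timestamps of recent requests

    def record(self, now: float) -> None:
        self.arrivals.append(now)

    def target(self, now: float) -> int:
        # Drop requests that have fallen out of the sliding window.
        while self.arrivals and self.arrivals[0] <= now - self.window:
            self.arrivals.popleft()
        rate = len(self.arrivals) / self.window  # requests per second
        # Keep enough warm VMs to absorb arrivals while replacements boot.
        return max(self.minimum, math.ceil(rate * self.boot))
```

With a 10 second window and a 2 second boot time, 5 requests per second would suggest keeping 10 VMs warm, and the target decays back to the minimum once traffic stops.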

Wasn't fastcgi essentially about starting new processes ahead of time and then doling them out as new requests came in? Worked quite well for bursty traffic, and for sustained traffic was about the same. Feels like a similar strategy could be apropos here.

And if I'd read farther down the chain, someone said that Amazon already does this.

I'm happy to see AWS is giving back to the open source community.

if you had to guess.. what are your thoughts on why this was open sourced? everybody i listen to seems a little confused by the "why" of this specific action

Note Firecracker does not work on AWS cloud instances, apart from bare metal instances.


>Still no nested virtualization on aws

;( It should work on some azure instances and gce instances when you enable nested virtualization.

Does anyone know if any integration with Kubernetes is on the cards? Something like an operator with CRDs?

Or is this intended to be separate, a competitor?

There are a few ways to use k8s + firecracker, weave has one: https://www.weave.works/oss/firekube/

This is more comparable to runc.

Firecracker can be used in kubernetes through containerd+kata containers integration. (Containerd from docker is a key tech behind both docker and K8s, kata containers from Intel enables containerd to run vms)

I don't think the experience is great yet, but it's on the way.

> Firecracker: OSS virtualization techno, creating and managing secure, multi-tena

This title is a bit awkwardly worded; in particular, "techno" in American English is a genre of music, not an abbreviation for "technology".

We've reverted it now. Submitted title was "Firecracker: OSS virtualization techno, creating and managing secure, multi-tena".

Submitters: please don't do that—this is in the site guidelines: https://news.ycombinator.com/newsguidelines.html. If a title is misleading or baity, please rewrite it, but please also make it good English.

My skills are more on the developer side, this seems more like a devops tool, right?

As such, I'm never really sure if this is something worth playing with. I've played with docker and k8s, and generally understand how those tools help me, but I'm unsure about how firecracker would help unless I'm building a PaaS like fly.io.

It would be really cool if someone wrote a blog post which compared running a service on the different comparable options. Developers have the Todo apps written in dozens of frameworks to compare against. Is a similar type of exploration not feasible here?

Docker is used for microservices; is Firecracker designed for serverless applications? What's the key difference between Firecracker and Docker? Do the two overlap?

Amazon uses it mainly for Fargate and Lambda (from what I've read). Docker is a container technology (shared kernel), while Firecracker is an actual VM manager so it provides better isolation. It is more comparable with QEMU.

I would love to see a high-level overview and compare/contrast between the different container and virtualization technologies out there, for those of us who have a good understanding of operating systems and hardware but haven't been keeping up with the plethora of new technologies.

From what I understand, even QEMU can work in different modes, either emulating hardware or a system call interface. So I'm not sure which of those modes you are referring to.

QEMU cannot emulate a system call interface with KVM, only with just-in-time compilation. In hardware emulation mode however it can provide multiple hardware models, including one that is rather similar to Firecracker.

Why does FaaS (that's what Lambda is, right?) need more full blown virtualization? I thought you could maybe get away with even lighter separation than Docker?

Docker isn’t really designed to be a security boundary, so if you’re colocating containers from different customers (e.g. in Fargate), you need to separate them with a real security boundary like a VM. The same thing is true for lambdas: a lambda is just an archive and the code in the archive needs to run somewhere where one customer cannot intercept another customer’s data.

To add on, AWS has never run Lambdas for different accounts on the same VM. Before Firecracker, they would run multiple Lambdas for the same account on the same VM. Now with Firecracker, they can run each lambda in its own VM.

I know very little about the actual technology, but I feel like the blurb on their front page explains this succinctly:

“ Firecracker enables you to deploy workloads in lightweight virtual machines, called microVMs, which provide enhanced security and workload isolation over traditional VMs, while enabling the speed and resource efficiency of containers”

Security of a VM + efficiency of a container.

AWS needs to separate one customer’s lambda executions from another’s. When you deploy Docker, isolation is guaranteed either because you do it on an underlying dedicated VM, or you give the Docker image to something like Fargate.


It wasn't a terrible question, but your question is answered in the first couple sentences of the main article. I do notice a trend where comments that suggest that the commenter didn't read the article get downvoted. I think in general I'm fine with that, except it's sometimes hard to notice when someone has read it, but just didn't understand.

> Intel processors are supported for production workloads. Support for AMD and Arm processors is in developer preview.

Firecracker looks very promising from a server-side technology standpoint, but the importance of support for AMD and RISC-V platforms can't be stressed enough.

Amazon better find a way of supporting AMD processors since Intel's CPU bugs are being brought into the sunlight and exploited in all directions by security researchers which have cataclysmic implications for users and server providers these days. This is demonstrated by a ridiculous Intel vulnerability which rendered Apple's FileVault encryption facilities completely useless which is absolutely unacceptable to Apple. There are many other CPU vulnerabilities waiting to be found and it could be the next Meltdown-like candidate.

The sooner the move to AMD or RISC-V open technologies, the better for developers and users.

(I work for AWS, but not on the Firecracker team. Opinions are my own and not of the company.)

Firecracker is open source. We welcome community contributions to bring the technology to additional CPU architectures.

Also, as other commenters have noted, AMD support is already in-tree. It's just in Developer Preview, which indicates the relative level of maturity.

AWS does not have any RISC-V processors in its EC2 offering portfolio, but if customers are demanding them, we'd love to hear from you - please reach out to your account team.

Regarding the Apple FileVault bit, are you referring to the recently disclosed Intel vulnerability ( https://www.theregister.co.uk/2020/03/05/unfixable_intel_csm... )? Did FileVault ever use this Intel feature (even on non-T2 Macs)? It seems that there is some contention on Twitter whether FileVault is affected.

Besides that... “completely useless?” Is that right? The only way you make disk encryption totally useless is by breaking the crypto. This allows an evil maid attack at best... a very different vector for most people than “completely useless”.

I am actually surprised by the absence of AMD support for a project born at AWS. AWS has been offering AMD Epyc EC2 instances for quite a while[0].

Missing Arm support is also surprising but less so, as Arm market penetration is obviously lower than x64.


AMD processors work fine with Firecracker, support is relatively recent though so they’re being conservative about calling it production ready.

the vast majority of EC2 is intel. It's not surprising that AMD support is coming after intel support

Intel has been an absolute security shit-show for over 3 years now.

The shift is underway but it takes time.

Meltdown affects everything all the way back to 2nd gen (2010); it's been bad for a decade, we just didn't know.

Meltdown goes much further back than that, potentially all the way to the Pentium Pro (1995)

It was also independently implemented in POWER and ARM A75

What are some use cases?

Would it not be better to make docker vm aware? The tooling would not have to radically change.

... and all the koolaid drinking Kubernetes fans are now going to refer to Kubernetes as legacy and start creating a new cottage industry of Firecracker migrations. Slightly cynical yes. But also a bit true.

Will microVMs replace containers in short time?

i'd also like to know this..

I believe it is still using virtio. Therefore, disk read/writes are not good enough compared to other virtualization tech such as qemu

I am not sure if this is true. Well, you are not really making a statement that could be easily interpreted. Virtio is specifically created to address performance issues with virtualization. Are you unhappy about the implementation quality or you generally believe that para-virtualization is not the way to go about improving virtualization overhead?

One prime example when with virtio it was possible to get native performance _after_ minimum configuration tuning:


More details on virtio:


qemu setups often use virtio, afaik virtio was defined by qemu devs, so your statement doesn't really make sense without more details.

Also, "not good enough" kind of needs a definition of a workload and what's "good enough" for it.

It's using virtio-mmio which is less efficient than virtio-pci. But I/O was not a focus of Firecracker, for example it doesn't scale very well because it doesn't do concurrent I/O operations.

I thought virtio was higher performance; why would that make it "not good enough"?

hello guys,

I'm sorry that my first comment was not informative enough. Now, I'm referring to the paper that I read a couple of days ago [0]. If you look at figures 8-9, FC has some issues with IO.

[0]: https://blog.acolyer.org/2020/03/02/firecracker/

> disk read/writes are not good enough

This is an odd thing to say; surely that depends on the application? I doubt most FaaS users are disk IO-bound?

Which of qemu's many disk technologies are you proposing as an alternative?
