Really excited about this and rust-vmm, which lets you build custom Virtual Machine Monitors. The reason a solution like Firecracker can achieve such fast startup times and a small memory footprint is that it exposes only a selected subset of features. This limits the capabilities of the running container, which is of course the point, but it is not general purpose enough for all workloads. For instance, I want to use Firecracker but require host file system sharing. rust-vmm looks like it is trying to solve this problem by providing a collection of Rust crates that let users build their own VMM with exactly the features they need. It's build-a-bear for VMMs :)
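To give a flavor of what those crates look like, here's a rough sketch of the classic "tiny KVM guest" built from rust-vmm pieces (the kvm-ioctls and kvm-bindings crates, plus libc for mmap). Adapted from their docs from memory, x86_64 only, error handling elided, so treat it as illustrative rather than copy-paste ready:

    // A minimal "VMM": create a VM, give it some memory, run a few bytes
    // of real-mode code on one vCPU, and handle the resulting exits.
    use kvm_bindings::kvm_userspace_memory_region;
    use kvm_ioctls::{Kvm, VcpuExit};

    fn main() {
        let mem_size = 0x4000;
        let guest_addr: u64 = 0x1000;
        // mov dx, 0x3f8; out dx, al; hlt
        let code: &[u8] = &[0xba, 0xf8, 0x03, 0xee, 0xf4];

        let kvm = Kvm::new().unwrap();
        let vm = kvm.create_vm().unwrap();

        // Back the guest's "RAM" with an anonymous mapping and register it.
        let host_mem = unsafe {
            libc::mmap(std::ptr::null_mut(), mem_size,
                       libc::PROT_READ | libc::PROT_WRITE,
                       libc::MAP_ANONYMOUS | libc::MAP_SHARED, -1, 0)
        } as *mut u8;
        let region = kvm_userspace_memory_region {
            slot: 0,
            guest_phys_addr: guest_addr,
            memory_size: mem_size as u64,
            userspace_addr: host_mem as u64,
            flags: 0,
        };
        unsafe { vm.set_user_memory_region(region).unwrap() };
        unsafe {
            std::slice::from_raw_parts_mut(host_mem, mem_size)[..code.len()]
                .copy_from_slice(code);
        }

        // One vCPU, started in real mode at the code we just loaded.
        let mut vcpu = vm.create_vcpu(0).unwrap();
        let mut sregs = vcpu.get_sregs().unwrap();
        sregs.cs.base = 0;
        sregs.cs.selector = 0;
        vcpu.set_sregs(&sregs).unwrap();
        let mut regs = vcpu.get_regs().unwrap();
        regs.rip = guest_addr;
        regs.rflags = 2;
        vcpu.set_regs(&regs).unwrap();

        // The VMM's job from here on is just handling VM exits. Everything
        // else (virtio devices, etc.) is more crates stacked on this loop.
        loop {
            match vcpu.run().unwrap() {
                VcpuExit::IoOut(port, data) => println!("out {:#x}: {:?}", port, data),
                VcpuExit::Hlt => break,
                exit => panic!("unhandled exit: {:?}", exit),
            }
        }
    }

The point being: a full device model like Firecracker's is "just" a much bigger exit handler plus virtio crates, and you pick which ones to compile in.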
Nearly everybody says they want a simple and minimal solution in this space, but everybody means different things when they say that. For example, in Lambda we do everything at the block level, and intentionally don't want to share filesystems (one reason is that it exposes the host kernel's complex FS code to the guest). But you want filesystems, and that's cool.
Instead of having a box that does everything, with the associated size and attack surface, rust-vmm moves that feature binding to build time. It's slightly less convenient, but much more powerful.
(I'm the guy in the video on the linked page, but this is just my opinion, nothing official)
Yep! A project I use with Kubernetes is Cloud Hypervisor[0] (with kata-containers[1]). It's a rust-vmm-based VMM that supports virtio-fs (host FS sharing) while still being leaner than QEMU.
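If anyone wants to try the FS sharing standalone (outside kata), the rough shape is: run virtiofsd against the host directory, then point cloud-hypervisor's --fs option at its socket. Something like this, with paths made up and flags from memory, so double-check against the current docs:

    # host: serve /srv/share over a vhost-user socket
    virtiofsd --socket-path=/tmp/virtiofs.sock --shared-dir=/srv/share &

    # host: boot the guest; --fs requires shared memory backing
    cloud-hypervisor \
        --kernel vmlinux \
        --cmdline "console=hvc0 root=/dev/vda rw" \
        --disk path=rootfs.img \
        --cpus boot=2 --memory size=1G,shared=on \
        --fs tag=share,socket=/tmp/virtiofs.sock

    # guest: mount the share by its tag
    mount -t virtiofs share /mnt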
Firecracker is absolutely awesome. Looking at the dismal state of sandboxing APIs in the major operating systems, and the advent of KVM on Linux, WHP on Windows (including Windows Home!), and HVF on Mac, I wonder if the future of securely running applications might be to just throw every app in a VM, similar to how Qubes OS works. Technologies like Looking Glass[0] can even enable 3D acceleration.
While Firecracker is faster, QEMU does actually have a microvm[0] machine type which boots pretty dang quickly. QEMU also has the advantage of user-mode networking, which lets guests reach the internet without any admin privileges. Unfortunately the network speeds are fairly slow. I'm hopeful that this can be made much faster moving forward.
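For anyone who hasn't tried it, a microvm boot with the user-mode (slirp) networking looks roughly like this; flag set adapted from QEMU's microvm docs and untested here, with vmlinux/rootfs.img as stand-ins for your own kernel and image:

    qemu-system-x86_64 -M microvm \
        -enable-kvm -cpu host -m 512m -nodefaults -no-user-config -nographic \
        -kernel vmlinux -append "console=hvc0 root=/dev/vda rw" \
        -drive id=root,file=rootfs.img,format=raw,if=none \
        -device virtio-blk-device,drive=root \
        -netdev user,id=net0 -device virtio-net-device,netdev=net0 \
        -chardev stdio,id=con0 -device virtio-serial-device \
        -device virtconsole,chardev=con0

The -netdev user part is the unprivileged bit: the whole network stack runs in user space (slirp), which is also exactly why it's slow compared to a tap device.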
QEMU also works on Windows, which is huge. But I haven't been able to get microvms to work on Windows hosts yet.
> Looking at the dismal state of sandboxing APIs in the major operating systems, and the advent of KVM on Linux, WHP on Windows (including Windows Home!), and HVF on Mac, I wonder if the future of securely running applications might be to just throw every app in a VM
Yes, absolutely. It's clear that on the server side the tradeoffs strongly favor (micro)VMs over kernel-level or language-runtime-level multi-tenancy for any non-trivial application. The paper goes into more detail on why that is, but the short version is that the tradeoffs are easier if we put the shared API at the "hardware" level rather than on top of the kernel. Those kernel/"hardware" interactions are vastly simpler than userspace/kernel interactions, and also much less of a moving target.
I'm not a client-side expert, although I have played with Qubes. The tradeoffs may be different there.
> While Firecracker is faster, QEMU does actually have a microvm[0] machine which boots pretty dang quickly.
Yeah, QEMU is great. QEMU's ambitions are broader than Firecracker's, and that leads to the teams making different tradeoffs over time. Intentionally staying minimal and simple is nice if you can get it, and in the context we use Firecracker at AWS we can.
There's also a "rising tide lifts all boats" angle here. QEMU, or any of the rust-vmm hypervisors, succeeding is good for everybody in this space.
Thanks for your work on Firecracker. One big remaining hurdle is for virtualization to be enabled by default on users' systems. WHP is fortunately easy to enable, but still an extra step that users probably won't understand the implications of. KVM seems more hit-or-miss. Some distros allow read/write to /dev/kvm by default. Others require jumping through udev hoops or similar. If virtualization is disabled in the BIOS/UEFI you have a much harder problem to solve.
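Concretely, on the distros that don't set it up, the usual incantation is a udev rule plus group membership, something along these lines (file name and group vary by distro):

    # /etc/udev/rules.d/99-kvm.rules
    KERNEL=="kvm", GROUP="kvm", MODE="0660"

    # then add yourself to the group and log in again
    sudo usermod -aG kvm $USER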
Are there security implications to enabling this stuff by default, or do vendors disable it just out of habit? Do you think there's any chance of virtualization being enabled by default becoming the norm?
There are similarities, sure. But unless I'm misunderstanding what you mean, running a desktop application in a fully 3D-accelerated VM that's completely sandboxed from the rest of your system isn't anything that's been widely done before that I know of.
It's a niche use-case, but one of the things I've been curious about is nested virtualization using Firecracker as the outside VMM and KVM as the inner VMM (with QEMU being the sole userspace process running in the Firecracker VM). This would make it possible to use Firecracker for whole-system virtualization, and run Windows.
KVM-in-Firecracker is apparently technically possible and just works thanks to Linux' support of nested virtualization, but celebratory noises to that effect elicited a slightly freaked out response: https://github.com/firecracker-microvm/firecracker/issues/17...
That issue only really covered the security implications of multiple KVM (QEMU) instances inside a given Firecracker instance though, which I can totally see as (strictly, pedantically) problematic. But if QEMU is literally the only thing running (besides an only-what's-necessary /init) in a given Firecracker instance, that sounds to me like I'd get the best of all worlds (Firecracker's awesome security plus QEMU's full hardware emulation) with verifiably no downsides, right?
The only ramifications I can see are if the hardware acceleration involved in nested virtualization itself represents a security vulnerability. Even then, a practical attack would presumably hit Firecracker-inside-Firecracker use cases just as hard, on the assumption that it would still need to first break out of QEMU and then out of Firecracker, as opposed to *waves hands* something that punches all the way from the twice-VMM'd guest out to the host because of a hardware fault.
Such hardware issues are arguably unfixable though (insert standard paragraph about microcode updates here), so theoretically (from a software boundary correctness perspective) I should be good to go right?
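(If anyone wants to poke at this: the prerequisite on the L0 host is that the KVM module has nesting enabled, which it already is by default on many newer kernels. On Intel, roughly:

    cat /sys/module/kvm_intel/parameters/nested   # Y or 1 = enabled
    sudo modprobe -r kvm_intel && sudo modprobe kvm_intel nested=1
    # AMD: same dance with kvm_amd

Whether the L1 hypervisor then actually exposes VMX/SVM to its guest is a separate question, which is part of what that Firecracker issue is about.)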
AFAIK Amazon as a company doesn't open source much of its software. I was surprised that they would open source such a key component of their Lambda offering.
As someone here already mentioned, it certainly helped Fly.io steal some of their customers. (Although Fly has no serverless offering; it uses Firecracker for a Docker-ish deployment model.) And last time I checked there were a few more startups using it.
Can anyone (maybe Amazon folks here) shed some light on the rationale behind it: why they open sourced it, and what they could possibly gain by it?