
VM Escape: QEMU Case Study - 0xFFC
http://phrack.org/papers/vm-escape-qemu-case-study.html
======
throw2016
I think there is too much cruft in QEMU, and a simple install can pull in
hundreds of dependencies. For instance, if you just want to run x86_64 VMs,
there is no point in getting 20 different platforms.

For the standard run-of-the-mill VM there is also rarely any need for the
multitude of devices. The more stuff you put in, the more surface area there
is for exploits. The QEMU install candidate needs to be seriously redesigned
and pruned down by package maintainers.

Intel Clear Linux tried running VMs without qemu altogether with Kvmtool but
are now using 'Qemu lite'. But there are no packages, instructions or even a
readme on what exactly they have done to make 'Qemu lite'. Kvmtool is one of
those packages with zero documentation.

Libvirt is needlessly convoluted, and its usage complexity has led to an
inferior solution like VirtualBox gaining traction on Linux, in spite of KVM
being in the kernel. For those looking for a simpler way to use KVM, I suggest
trying something like AQEMU or even just resorting to the command line. I
found that far more lightweight and reasonable than libvirt's heavy-handed way
of running a simple qemu-kvm command line and adding an XML mess to it.

~~~
kashyapc
Please don't make suggestions like "libvirt is too convoluted, just resort to
[QEMU] command-line" -- FWIW, I'm saying this as an _every_ day QEMU
command-line user -- without having the complete picture. Yes, there _are_
some valid scenarios where people who know what they're doing _can_ (and they
do) directly use the QEMU command-line. But for the vast majority, libvirt
makes life far simpler, because of the following.

Daniel Berrangé (lead maintainer of libvirt) articulates it much more
succinctly than I could; allow me to quote him:

"It is a common attitude among people who look at QEMU and see a deceptively
simple command-line syntax for launching a VM and don't realize that you'll
enter a world of hurt when you go beyond the initial simple command-line
syntax.

This[1] was written in 2011, but is pretty valid and probably more besides:

    
    
      https://www.berrange.com/posts/2011/06/07/what-benefits-does-libvirt-offer-to-developers-targetting-qemukvm/
    

Recently, we had a very nice example of benefit of libvirt to security when
the VENOM bug came out. Anyone using libvirt automatically had anti-VENOM
available thanks to SELinux/AppArmor which would prevent exploitation in most
cases."

More on it (from his very detailed response on this thread[+], specifically
from the 4th paragraph on):

"Also note that this kind of bug [VENOM] in QEMU device emulation is the
poster child example for the benefit of having sVirt (either SELinux or
AppArmor backends) enabled on your compute hosts. With sVirt, QEMU is
restricted to only access resources that have been explicitly assigned to it.
This makes it very difficult (likely/hopefully impossible[1]) for a
compromised QEMU to be used to break out to compromise the host as a whole,
likewise protect against compromising other QEMU processes on the same host.
The common Linux distros like RHEL, Fedora, Debian, Ubuntu, etc all have sVirt
feature available and enabled by default and OpenStack doesn't do anything to
prevent it from working. Hopefully no one is actively disabling it themselves
leaving themselves open to attack...

[1] I'll never claim anything is 100% foolproof, but it is intended to be
impossible to escape sVirt, so any such viable escape routes would themselves
be considered security bugs. "

[+] http://lists.openstack.org/pipermail/openstack-operators/2015-May/006947.html

~~~
throw2016
I think you are presuming here about the 'complete picture'. The point is not
to encourage anyone. I am sure people can make their own decisions.

We need to highlight the issues with libvirt that leave a great technology
like KVM unused. Security can be used to close down any discussion, but the
result is that many people are using VirtualBox and not libvirt.

Seccomp, SELinux or AppArmor can be used without libvirt as required, so I
don't see this as a security issue. On the contrary, the needlessly complex
XML config files likely put many users off from using it properly or using it
at all.

~~~
kashyapc
You are simply overstating the issue. I was not closing the discussion down
with security -- that line of thought didn't even occur to me.

The `virtual-box` equivalent in the KVM stack is `virt-manager` (under the
hood, it uses libvirt APIs) -- these are primarily for _desktop_
virtualization, not for server virtualization. To manage a fleet of servers in
a data center, people don't fire up a desktop application.

Can some aspects of the experience with `virt-manager` / libvirt stack be
improved? Certainly yes. But saying things like it "leaves a great technology
like kvm unused" is ridiculous hyperbole.

A clarification about _why_ it is partly a "security issue": the sVirt guest
confinement mechanism in libvirt _builds_ on top of basic SELinux / AppArmor
confinement. As in, with sVirt, each VM (the QEMU process that is managed by
libvirt) and its associated disk get a _unique_ SELinux label -- this means
that _even_ if a guest (in a fleet of 1000 VMs) is compromised, the damage is
contained to just _that_ specific guest.
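
To make the per-VM labelling concrete, here is a toy Python sketch of the
idea, not libvirt's actual code. It assumes the usual sVirt convention of
giving each guest a random pair of MCS categories (labels of the form
`svirt_t:s0:cX,cY` for the process and `svirt_image_t:s0:cX,cY` for its disk).
The names `allocate_vm` and `may_access` are made up for illustration, and the
access check is a simplification of what the kernel enforces.

```python
import random

def allocate_vm(used, rng):
    """Pick an unused pair of MCS categories and return the
    (process_label, disk_label) pair for one hypothetical guest."""
    while True:
        a, b = sorted(rng.sample(range(1024), 2))
        if (a, b) not in used:
            used.add((a, b))
            level = f"s0:c{a},c{b}"
            return (f"system_u:system_r:svirt_t:{level}",
                    f"system_u:object_r:svirt_image_t:{level}")

def categories(label):
    """Extract the set of MCS categories from a label string."""
    return frozenset(label.rsplit(":", 1)[1].split(","))

def may_access(process_label, file_label):
    """Toy MCS rule: the process must hold every category on the file."""
    return categories(file_label) <= categories(process_label)

if __name__ == "__main__":
    rng = random.Random()
    used = set()
    guest_a = allocate_vm(used, rng)
    guest_b = allocate_vm(used, rng)
    print("guest A may open its own disk:", may_access(guest_a[0], guest_a[1]))
    print("guest A may open guest B's disk:", may_access(guest_a[0], guest_b[1]))
```

Because every guest gets a distinct category pair, a compromised QEMU process
in this model can only open files carrying its own pair, which is the
containment property described above.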

~~~
throw2016
I think you are conflating issues here.

Obviously virtualbox users do not need data center level security and should
not have to deal with that complexity, and naturally choose not to, and use
virtualbox.

You have turned a simple observation about the complexity of libvirt's
configuration and the resulting traction of VirtualBox on Linux into a
discussion on data centers, thousands of VMs, and security. This seems to be
talking around the issue, so this discussion is moot.

Are you saying selinux, apparmor and seccomp cannot be used on qemu processes
without libvirt?

~~~
kashyapc
You made a sweeping generalization about libvirt's complexity, and
characterized it as a "simple observation", without any concrete pointers.

And you seem to be comparing Virtual Box with libvirt, which is equivalent to
comparing apples to aardvarks (okay, that's an exaggeration). But seriously, a
fairer comparison would be Virtual Box vs. Virt-Manager on Linux.

_Here_, I admit -- I'm not a daily Virt-Manager user (I live on the
command-line), so I can't tell you where exactly it is lacking compared to
Virtual Box.

If you've tried it and have suggestions, you might want to write to:

https://www.redhat.com/mailman/listinfo/virt-tools-list

---

About libvirt's _internal_ representation (in XML -- yes, I'm not a big fan of
it either, and I suspect nor are the current maintainers; had it been designed
today, they would likely have chosen a gentler-on-the-eye format like JSON or
some such) of the guest definition: most users _don't_ need to touch the XML
definition at all. Most tasks involved in managing VMs can be done trivially
via the Virt-Manager GUI or the command-line tool, _virsh_.

And the _regular_ libvirt configuration files are all in standard Linux
configuration file format.

> _Are you saying selinux, apparmor and seccomp cannot be used on qemu
> processes without libvirt?_

No, I'm not saying that. My argument (based on daily interactions with users
on public IRC & mailing lists) is that, for QEMU-based guests, libvirt makes
it _easier_.

~~~
throw2016
I think the issue here is that I am more interested in understanding why
VirtualBox has so much traction on Linux in spite of KVM, and you appear to be
more interested in defending libvirt.

You have taken this discussion to unrelated security and data center issues
that have nothing to do with the topic at hand. You dwell on the security of
seccomp, SELinux and AppArmor, which have nothing to do with libvirt
specifically.

I am not interested in discussing libvirt, virsh or virt-manager and its
various offshoots. And given that libvirt is what virsh and virt-manager use,
in this context it's disingenuous pedantry to make a distinction.

I am interested in understanding how kvm can be made more accessible to
virtualbox users. If that's not something that concerns you and you would
rather dwell on defending libvirt then that's not a productive discussion. If
libvirt was the solution we would not be having this discussion to start with.

------
j_s
If for some reason you are using QEMU with potentially hostile code (not
recommended!), disable all the hardware emulation you can!
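
As a rough illustration of what "disable what you can" might look like on the
command line, the Python sketch below assembles a QEMU invocation that opts
out of the default device set. `-nodefaults`, `-no-user-config`,
`-display none`, `-sandbox on` and `-enable-kvm` are existing QEMU options
(the seccomp sandbox depends on how the binary was built); the helper name,
the disk path and the memory size are placeholders.

```python
import shlex

def minimal_qemu_argv(disk_path, memory_mb=1024):
    """Build an argv for a guest with only an explicitly listed virtio disk.

    Hypothetical helper: adjust the binary name and options for your setup.
    """
    return [
        "qemu-system-x86_64",
        "-enable-kvm",            # use KVM instead of TCG emulation
        "-nodefaults",            # do not create the default devices
        "-no-user-config",        # ignore user-provided config files
        "-display", "none",       # no graphical display
        "-sandbox", "on",         # seccomp filter, if compiled in
        "-m", str(memory_mb),
        "-drive", f"file={disk_path},if=virtio,format=raw",
    ]

if __name__ == "__main__":
    argv = minimal_qemu_argv("guest.img")
    # Print a shell-safe version of the command instead of launching it.
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Treat this as a starting point only: some machine types keep built-in
controllers even with `-nodefaults`, so it is worth checking what the guest
actually sees.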

~~~
bonzini
Pretty much all cloud providers that use KVM (where "pretty much all" probably
means all except Google) are using QEMU and potentially can be running hostile
code, so the "not recommended" remark is perhaps a bit exaggerated.

~~~
0xFFC
What do they use at Google?

~~~
rwmj
Some proprietary code replacing qemu (but still using KVM). Since they still
have to emulate PC devices, this just means they have a different set of
security holes and fewer people reviewing the code.

~~~
sitkack
Given QEMU's track record, I'd wager that the goog code gets more reviewers
and more testing. QEMU is literally swiss cheese, or it emulates real swiss
cheese, poorly.

~~~
bonzini
Any proof of what you are saying, or is it just FUD?

Most QEMU CVEs are related to devices that should never be used in cloud
provider scenarios (you'll often find that they are disabled in RHEL for this
exact reason). If anything, prompt handling of vulnerabilities in those
devices is a sign of taking security seriously...

~~~
peterwwillis
QEMU is a 14-year-old codebase designed for research and cross-platform
emulation. It is not developed with a focus on ongoing security
testing/auditing.

Xen uses a stripped-down QEMU to boot unpatched guest OSes. However, even Xen
doesn't test its qemu-xen components extensively. Writing a new purpose-built
emulator (assuming you know what you're doing) is a better idea.

edit: Or use PV guests, and skip all potential QEMU flaws.

~~~
rwmj
Sorry, this is completely wrong FUD. Paolo Bonzini and I work at Red Hat and
there is constant review of the QEMU codebase, both for code quality and
security issues. Furthermore devices in RHEL are whitelisted. TCG ("cross-
platform emulation") is not involved at all when you're using KVM. A newly
written emulator would just have a new set of vulnerabilities. When you don't
know what you're talking about, please just don't.

~~~
peterwwillis
I'm glad to hear that! When did you implement fuzzing coverage? After the 2016
Qemu mailing list article I read where they're asking someone to look into
applying to Google's oss fuzzing project? And how does your whitelisted set
compare to Xen's stripped-down qemu support?

~~~
rwmj
I'm actually fuzzing qemu's block device layer as we speak.

From the latest run:

    
    
        │        run time : 11 days, 23 hrs, 12 min, 49 sec    │  cycles done : 0      │
        │   last new path : 0 days, 12 hrs, 55 min, 7 sec      │  total paths : 364    │
        │ last uniq crash : none seen yet                      │ uniq crashes : 0      │
        │  last uniq hang : 0 days, 4 hrs, 4 min, 36 sec       │   uniq hangs : 2      │
    

Re the comparison with Xen's qemu, you can grab the sources for RHEL's
qemu-kvm and qemu-kvm-rhev packages and examine the driver whitelists, patches
and ./configure line yourself.

------
legulere
With all the bugs in hardware emulation, wouldn't it make sense to emulate the
linux kernel a la bash for windows instead of running the linux kernel on
emulated hardware?

~~~
btbuilder
Sounds like you are describing containers; while not emulation neither is
virtualization. There are many more opportunities for escape dealing with
Linux containers than virtualization due to the increased complexity of the
interface.

While I'm impressed by the work Microsoft have done to support the Linux
kernel interfaces I would imagine the complexity of the effort to implement
correct behavior from Windows kernel primitives would lead to more potential
security vulnerabilities.

Another comparison might be Linux syscall support within illumos[1] which
AFAIK relies on mature Solaris Zones for isolation.

[1] https://www.slideshare.net/bcantrill/illumos-lx

~~~
danieldk
_Sounds like you are describing containers; while not emulation neither is
virtualization._

Another possibility would be User-mode Linux (UML). In contrast to containers,
it gives each 'virtual machine' its own Linux kernel, where the Linux kernel
runs as another Linux program.

------
overgryphon
This is a case of legacy code left in an important attack surface. I doubt
many people need a virtual floppy drive today.

~~~
0x0
Nonsense. I use the virtual floppy drive in VMs all the time, because I'm
virtualizing legacy systems. But I do agree it could probably be disabled (and
thus unexploitable) by default.

