
An Updated Performance Comparison of Virtual Machines and Linux Containers [pdf] - nreece
http://domino.research.ibm.com/library/cyberdig.nsf/papers/0929052195DD819C85257D2300681E7B/$File/rc25482.pdf
======
usaar333
A bit of a misleading title.

Comparing linux containers to native performance wouldn't be terribly
interesting; there is little reason to believe there is any nontrivial
overhead from using cgroups, chroot, etc.

Instead, this compares different Docker configurations to native
performance (and to virtual machines as well). The insight one can get is the
amount of I/O overhead introduced by using local NAT as well as AUFS
(it turns out both overheads are quite significant).

Finally, figure 17 is a bit interesting. It implies that there is some
inherent overhead (before NAT/AUFS) of 2% from the container; it'd be
interesting to find out what the source of that is.

/aside: putting percentage loss and absolute numbers on the same graph (figure
16) is incredibly confusing.

~~~
mtanski
I don't know if the AUFS overhead matters in many workloads, if you're using
it for deploying code that gets loaded once and you're not performing
a lot of filesystem access.

Think of many webapps, their assets are deployed on a CDN and most of the time
they are performing network IO with databases, memcache, etc...

------
sergiolp
On the linpack test, the KVM domain should have been configured to inherit
the CPU extensions from the host (or, alternatively, to cherry-pick the
extensions relevant to such a test).
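
With libvirt, either option is a small domain-XML fragment; a sketch of both
(the `SandyBridge` model and the AVX feature are illustrative choices, not
from the paper):

```xml
<!-- Inherit the full host CPU, extensions included -->
<cpu mode='host-passthrough'/>

<!-- ...or cherry-pick only the features the benchmark needs -->
<cpu mode='custom' match='exact'>
  <model>SandyBridge</model>
  <feature policy='require' name='avx'/>
</cpu>
```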

For the rest, I think an updated KVM vs. Xen vs. VMWare comparison would have
been way more interesting. If you don't need to run other OSes, process
tagging (jails/zones/containers/whateveritscalledonyouros) will always give
you better performance.

~~~
wcchandler
I would love to also see a comparison of hypervisors. Unfortunately, it
violates practically everyone's EULA. I can understand why: there are so many
different ways things can be implemented, and it's almost never a 1:1 identical
setup. That being said, I can only rely on anecdotal evidence accompanied by
confirmation bias when determining which virtual environment is best.

------
tinco
Interesting that using vETH and NAT causes around 20% extra latency and around
20% less throughput. Is this expected from these technologies?

I was expecting them to have no significant impact at all, since surely the
hardware NIC would be the bottleneck? Perhaps the kernel is doing some
memcpy'ing to achieve these features?

Then again, a 10 Gbps NIC is quite fast; perhaps these performance losses are
less significant with more reasonable 100 Mbit/1 Gbit interfaces?

~~~
zdw
NAT requires each and every packet to be rewritten with different source (or
destination, or both, depending on the flavor) IP addressing information, so
it isn't just a simple copy.
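
For reference, Docker's default bridge networking does this with a masquerade
rule along these lines (a sketch; `172.17.0.0/16` and `docker0` are the usual
defaults, but assumptions here):

```shell
# Rewrite the source address of every packet leaving the container
# subnet through a non-bridge interface; conntrack then rewrites
# the replies on the way back in, per packet.
iptables -t nat -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
```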

Drivers and sanity checking what's going in/out of the VM probably account for
the rest. Hardware may be fast but adding layers of complexity will have a
performance impact.

------
kator
Virtualization costs something, and the goal of any design should be to
balance that cost against the benefits. I've designed systems that have crazy
low-latency and high-QPS needs, and nothing beats bare metal for these systems.
That said, if you need to manage 5,000 of them, you start to debate the
advantages of having the application in a virtual environment against the cost
of buying 600 more servers.

In most situations the difference is noise against the backdrop of security,
maintenance, and the labor to manage a system versus the performance needed.

Sadly many people think virtualization and/or containers are "Free" and make
uninformed decisions that have real impacts on systems at scale later in the
system's lifetime.

In general, if you need the flexibility, or your application sits at 5% CPU
95% of the time, you'd be better off doing some form of virtualization to save
labor and maximize resource utilization of the systems you're deploying.

That said, I agree with many of the comments here: it would be nice to see more
head-to-head comparisons of hypervisors, but that's scary stuff for the vendors
involved and puts them at serious risk in the arms race to save every penny we
can in managing large farms of applications and the systems running them.

Remember, virtualization is just a technology; like anything else, it has its
benefits and brings with it resource usage and challenges.

There is no "Free Lunch".. :-)

------
techdude
Assuming that the goal is to host a single "App" on the bare metal server,
these test results are indeed interesting.

Another interesting question -- not addressed in the paper -- is how many
small Apps the container approach can support versus the virtualization
approach.

While hosting many Apps/VMs, not having to run the OS multiple times is one of
the advantages that the Docker approach provides, and it would be interesting
to quantify the performance implications.

------
rwmj
Containers don't have the same security properties as full virtualization.
Also containers can only run guests that require the same kernel as the host,
ie. only Linux on Linux. In other words, they are not direct substitutes.

~~~
j_jochem
Still, it's important to understand their behaviour with regard to performance
in order to inform trade-off decisions.

------
hbogert
The first graph is already misleading: it's the graph about linpack. It shows
that linpack's performance is halved under KVM, with the explanation: ".. unless
the system topology is faithfully carried forth into the virtualized
environment. CPU-bound programs that don't attempt such tuning will likely
have equal but equally poor performance across native, Docker, and KVM"

Seriously? Am I the only one who adds a simple parameter when launching KVM?
In production/deployment environments, higher-level libraries like Libvirt
are hopefully used and handle CPU settings automagically.

Then we have the MySQL tests: comparing qcow (KVM) filesystems against native
and AUFS. Really, at least make it a fair fight and include a test that passes
the KVM I/O through to something remotely bare-metal-ish, like an LVM volume.
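
A sketch of what that fairer KVM invocation might look like (the flags are
standard qemu options; the LVM volume path and memory size are hypothetical):

```shell
# -cpu host: inherit the host's CPU extensions (closes the linpack gap)
# -drive on a raw LVM volume via virtio with the host page cache off:
#   no qcow2 metadata in the I/O path, close to bare metal
qemu-system-x86_64 -enable-kvm -m 4096 -cpu host \
    -drive file=/dev/vg0/guest0,format=raw,if=virtio,cache=none
```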

~~~
markbnj
Why is it a fight at all? I enjoyed the paper for a few insights into the
implementation of containers, but I didn't find the performance comparisons
very relevant. As another commenter noted: it's not surprising that namespaces
and cgroups are a crapload less demanding than a full virtualization stack.

I don't much like referring to containers as a virtualization technology,
although it seems like that fight is lost. In my view they aren't the same
thing and they aren't solving the same problems, so seeing people line up
defensively around one or the other is sort of amusing.

------
zdw
A similar comparison of Solaris Zones (which, like FreeBSD Jails and similar,
existed before Docker but are comparable) to KVM and Xen:

[http://dtrace.org/blogs/brendan/2013/01/11/virtualization-pe...](http://dtrace.org/blogs/brendan/2013/01/11/virtualization-performance-zones-kvm-xen/)

Basically, you take a fairly sizable I/O performance hit whenever you try to
virtualize.

~~~
tachion
I would love to see a comparison of these Linux hypervisors and BSD ones,
like Jails and bhyve...

------
contingencies
TLDR; (1) Docker can be much slower than plain containers - particularly if
using NAT or AUFS. (2) Run VMs inside containers, not the other way around.

------
kator
> This is also a case against bare metal, since a server running one container
> per socket may actually be faster than spreading the workload across sockets
> due to the reduced cross-traffic.

I'm sorry, but this doesn't preclude a well-designed system on bare metal with
multiple sockets from being performant. As an example, I've often pinned
workers to CPUs to keep network buffers on the same core as the worker thread
for low-latency, high-throughput network applications. I would say the idea of
a container per socket might make it "easier" to isolate applications not
designed or deployed with this factor in mind, but it isn't an indicator
against bare metal.
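
The pinning trick is a one-liner on Linux using the stdlib; a minimal sketch
(the choice of CPU 0 is arbitrary here):

```python
import os

# Restrict this process (pid 0 = self) to CPU 0, so its working set
# and any NIC interrupt steered to that core stay cache-local.
os.sched_setaffinity(0, {0})

print(os.sched_getaffinity(0))  # the allowed-CPU set is now {0}
```

In C the equivalent is `sched_setaffinity(2)` with a `cpu_set_t`; either way,
each worker gets pinned before it starts touching its buffers.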

------
kasperset
Just one minor point: there is a slight typo on page 9 - MySQL, because it is
a "polular" (popular) database.

