
My VM is lighter and safer than your container - tejohnso
https://blog.acolyer.org/2017/11/02/my-vm-is-lighter-and-safer-than-your-container/
======
apeace
This is interesting, but means very little to me as a heavy user of Docker.

I don't care about boot time. I care about 1) build time, and 2) ship time. In
Docker, they are both fast for the same reason.

At the top of your Dockerfile you install all the heavy dependencies which
take forever to download and build. As you get further down the file, you run
the steps which tend to change more often. Typically the last step is
installing the code itself, which changes with every release.

Because Docker is smart enough to cache each step as a layer, you don't pay
the cost for rebuilding those dependencies each time. And yet, you get a bit-
for-bit exact copy of your filesystem each time, as if you had installed
everything from scratch. Then you can ship that image out to staging and
production after you're done locally--and it only has to ship the layers that
changed!
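
For example, a hypothetical Dockerfile sketched along those lines (the base
image, paths, and commands are made up for illustration):

    FROM python:3.6

    # Heavy, rarely-changing steps first: these are cached as layers and
    # reused on every rebuild, as long as the lines (and their inputs)
    # don't change.
    COPY requirements.txt /app/requirements.txt
    RUN pip install -r /app/requirements.txt

    # Frequently-changing step last: on a typical code-only release,
    # only this layer is rebuilt and shipped.
    COPY . /app
    CMD ["python", "/app/main.py"]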

So this article somewhat misses the point. Docker (and its ilk) is still the
best tool for fast-moving engineering teams--even if the boot time were much
worse than what they measured.

~~~
muxator
You described the layering features of Docker images: you can upgrade the
upper layers without touching the lower ones. It helps, sometimes.

The reality always seemed a bit more complex to me: once you ship, you'll
have to start maintaining your software. You will definitely want to upgrade
the lower layers (base OS? JVM? runtime of the day?). In that case the
statically-allocated cache hierarchy of Docker layers will be of little
utility.
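
A sketch of why (the tags are illustrative): a layer's cache key includes
its parent layer, so changing the base image alone invalidates every cached
step after it.

    # First build: the expensive step below is cached as a layer.
    FROM ubuntu:14.04
    RUN apt-get update && apt-get install -y build-essential

    # After bumping only the base image, the identical RUN line is a
    # cache miss, and the expensive step runs again from scratch.
    FROM ubuntu:16.04
    RUN apt-get update && apt-get install -y build-essential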

On the bit-for-bit reproducibility: I have my doubts, too. Are we sure that
all the Dockerfiles out there can be executed at arbitrary times from
arbitrary machines and always generate the same content hash?

Downloading a pre-built image from a registry does not count as
reproducibility, to me.

Obviously Docker is a tool, and as such you can use it with wildly varying
degrees of competence. I am just skeptical that using a specific tool will
magically free us from having to care about software quality.

~~~
zackify
"Downloading a pre-built image from a registry does not count as
reproducibility, to me."

Completely disagree... On my team, the lead devs build the docker images, push
them to a private repo, and everyone else pulls that exact image. Bringing up
dev environments is almost instant. If a lead dev adds a dependency that
breaks the build, everyone else is fine. They will fix the build and push it
up when ready.

~~~
muxator
I think we are using two different meanings of "reproducible".

To me, saying that a build is reproducible means that anyone is able to
independently build a bit-for-bit exact copy of an artifact given only its
description (source code + build scripts, Dockerfile, ...), more or less in
the sense of [0].

By this definition, even content-addressability is not enough.

[0] [https://reproducible-builds.org/](https://reproducible-builds.org/)

~~~
yebyen
Well, I think you disagree on the meaning of the word "artifact." GP cares
that all production deployments are identical to the development environment
that went through system tests. To ops, a reproducible build is one that has
been uploaded to the registry, because it can be ported over to the production
cluster and it will be bit-for-bit identical.

It is not necessary to reproduce the same build bit-for-bit because you have
kept a copy and distributed it.

You are not rebuilding the same thing twice because that is an inefficient use
of resources.

Nobody really cares that the timestamps are different when the builds started
at different times of day; it is evident, therefore, that you will not
produce bit-for-bit identical builds as long as your filesystem includes
these kinds of metadata and details.

If you built it again, it is to make a change, so there is no point in
insisting on bit-for-bit reproducibility in your build process (unless you are
actually trying to optimize for immutable infrastructure, in which case more
power to you, and it might be a particularly desirable trait to have. Not for
nothing!)

~~~
FooBarWidget
> Nobody really cares that the timestamps are different when the builds
> started at different times of day; it is evident, therefore, that you will
> not produce bit-for-bit identical builds as long as your filesystem
> includes these kinds of metadata and details.

Do you think different timestamps are all you need to worry about? How about
the fact that some source tarballs you depend on may have moved location? Or
that your Dockerfile contains 'apt-get upgrade' but some upgraded package
breaks your build whereas it did not before? Or that your Dockerfile curls
some binary from the Internet, but that server now uses an SSL cipher suite
that is incompatible with your OpenSSL version? All problems that I have
encountered.
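
All of those failure modes hide in Dockerfile lines like these (a sketch;
the URL and filename are made up):

    # Non-reproducible: installs whatever package versions exist at
    # build time, and an upgraded package may break the build.
    RUN apt-get update && apt-get upgrade -y

    # Non-reproducible: the tarball can move or disappear, and the
    # server's SSL configuration can drift away from your OpenSSL.
    RUN curl -O https://example.com/releases/tool-latest.tar.gz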

~~~
yebyen
Hey FooBarWidget! :D

I was not speaking about formal correctness under the definition of
reproducible builds; I was speaking of the practical implementation for a
production deployment. All of the issues you mentioned are (at least
temporarily) resolvable by keeping a mirror of the builds, so that you can
reproduce the production environment without rebuilding if something
upstream goes away.

I'm just defending the (apparently ops) person who was going by the original
dictionary definition of "reproducible" which predates the "reproducible
builds" definition of reproducible.

If my manager asks me to make sure that my environment is reproducible, I
assure you she is not talking about the formal definition that is being used
here. I'm not saying that one shouldn't care about these things, but I am
saying that many won't care about them.

If you're doing a new build and the build results in upgrading to a newer
version of a package, then that is a new build. It won't match. If you're
doing a rebuild as an exercise to see if the output matches the previous
build, then you're concerned about different things than the ops person will
be.

If my production instances are deleted by mistake, I'll be redeploying from
the backup images, I won't be rebuilding my sources from scratch unless the
backups also failed.

I agree that it is a sort of problem that people usually don't build from
scratch; it's just not one of the kinds of problems that my manager will ever
be likely to ask me to solve.

------
andrewstuart2
Containers are not about safety or lightness IMO. Rather, the reasons I prefer
them are better insight and usability. Since the same kernel is running
multiple applications, it has insight into how much RAM has been requested,
and how much CPU is being used, and can thus be more efficient in packing
processes onto nodes. If you try to build this with unikernels, you will end
up turning your hypervisor into a de facto kernel. As it is, hypervisors
simply preallocate resources and in pretty much every case must overprovision
for peak usage, which requires more unused capacity.

As for usability, I mean that you get many of the benefits of VMs (namespaces
and resource constraints), but with the added benefit of kernel
introspection.
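
That introspection is directly visible from the host. For example (the
cgroup path assumes Docker's default cgroupfs layout on a cgroup-v1 host,
and the container id is a placeholder):

    # Live per-container resource usage, straight from the shared kernel:
    $ docker stats --no-stream

    # Or read the accounting directly out of cgroupfs:
    $ cat /sys/fs/cgroup/memory/docker/<container-id>/memory.usage_in_bytes
    $ cat /sys/fs/cgroup/memory/docker/<container-id>/memory.limit_in_bytes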

~~~
craigyk
I think what you describe most people lump into the "lightweight"
characterization.

~~~
andrewstuart2
Under that definition, then, VMs would not be lighter than containers, because
they require overprovisioning and don't (by deliberate design) dynamically
communicate memory or CPU usage to the hypervisor.

~~~
inetknght
Linux supports hot-adding memory, and since Linux can do that as a guest, so
can any VM running it. Overprovisioning might be required, but only because
of software limitations, not hardware ones. I would venture a guess that it
could be solved.

~~~
andrewstuart2
Hot-adding memory is essentially the equivalent of the mmap kernel syscall,
but there's no ability for the VMs to return memory that they no longer need,
as you have with the munmap syscall. So you may start out without
overprovisioning, but eventually you will hit peak hours, redistribute your
VMs, and be overprovisioned again unless you reboot machines after peak
hours.

Even if you added to Linux the capability to compact memory, dynamically
stop using parts of it, and signal the hardware layer that they're unused,
you'd again just be reimplementing a feature that standard kernels have
provided for decades. Not to mention that this capability seems basically
coupled to virtualized hardware.

~~~
wiredfool
You can take memory from the VM. I've just tested it on KVM, dropping the
current memory allocation for a VM and re-raising it. The change shows up in
free inside the VM instantly.
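
If that test went through libvirt, it was presumably something like the
following (the domain name and sizes are illustrative); on KVM this works
via the guest's virtio balloon driver:

    # Shrink the running guest's allocation, then raise it again.
    $ virsh setmem mydomain 1048576 --live   # down to 1 GiB (size in KiB)
    $ virsh setmem mydomain 2097152 --live   # back up to 2 GiB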

~~~
andrewstuart2
Well, it sounds like that is possible, but is there any way for the VM to
indicate to the hypervisor that it has unused memory? Or, for that matter,
that it needs more? That's the critical missing piece, unless there's an API
I'm unaware of.

~~~
inetknght
I suppose this question should be posed on a linux kernel mailing list
somewhere.

------
frou_dh
My personal long bet has been to _completely_ ignore Docker etc. Seems that
has bounced between looking like plain ugly apathy and looking like
foresight.

~~~
koolba
Putting aside deployment, Docker has been a sheer joy for local development.
It provides a complete wrapper for the voodoo and tribal knowledge required to
run packaged apps.

Gone are the days of building a VM from scratch, following instructions for a
previous version of the OS, installing 10 dependencies, installing the app,
cursing when it fails halfway through, installing another missing dependency,
installing the app again, bouncing the server, and then having it all fall
apart again because the app isn't registered to run as a service at startup
(because you missed that step).

All of this was possible before Docker through Vagrantfiles and the like, but
Docker made it much cleaner and significantly less resource-intensive (i.e.,
you no longer need a VM per item).

~~~
apple4ever
This is why you do config management. Here, our Vagrant builds are exactly
like our development and production servers. It's literally the same config
running them both, with just variables to differentiate as needed.

~~~
djb_hackernews
Docker doesn't imply not having config management. In fact, for the parts of
our infrastructure that were using Puppet to build VMs, we just run Puppet as
part of `docker build` instead. We're still using config management, and our
prod, staging, and dev environments run the same exact blobs.

~~~
geerlingguy
Sometimes; and tools like Ansible Container can help you get the best of both
worlds.

But for every time I see someone using actual configuration management and
best practices for a reproducible Docker architecture, I see a hundred cases
of brittle, convoluted shell scripts or, worse, an entire application
definition in a 500+ line Dockerfile (which is neither configuration
management nor scripting; at least shell scripts are somewhat more portable
and maintainable!)

------
arca_vorago
As a senior sysadmin who has had to deal with all the crap devs throw at me,
I skipped the container craze to focus on VM security and other issues. Give
the devs some nice powerful boxen to use (many hypervisors have APIs these
days) so they can do the provisioning themselves, without all the added
complexity that extra tooling like Docker requires.

Real-world benchmarks I ran indicated the perf differences were negligible,
so devs seem to mostly like Docker because they don't have to do sysadmin
stuff... but I can do all the same things in a VM, plus much more.

A big shoutout to the people at Proxmox while I'm at it.

~~~
rconti
As a fellow sysadmin, I've inevitably found that the environment needs and
expects a number of services, and it just feels like Docker is ill-suited to
these use cases. Frequently even the devs themselves want multiple services
and can't tolerate stateless systems anyway, but at the very least we have to
deal with logging and the like.

I'm not saying it can't be done, but I've never found a developer who had a
particular desire to implement containers, and even if they did, their
concerns don't represent the entirety of what's involved in running
infrastructure.

~~~
arca_vorago
I feel like the growth in the number of devs, in general and in business, has
made them more likely to follow bandwagons and fads than they used to, which
just makes our job harder.

------
tabeth
Looks very interesting.

One thing I've been wanting is a VM that would contain a "minimized" version
of an operating system plus a single program, in addition to the program's
dependencies.

I've looked into using Docker to do something like the above, but it didn't
seem straightforward a few years ago. LightVM seems like it could be an option
in the future.

~~~
axelfontaine
This is exactly what we provide at Boxfuse
([https://boxfuse.com](https://boxfuse.com)), with the ability to run it on
VirtualBox, Hyper-V and AWS.

~~~
odammit
Hey!!! Thanks for Flyway! Great project!

------
detaro
Paper discussed here yesterday:
[https://news.ycombinator.com/item?id=15600596](https://news.ycombinator.com/item?id=15600596)

~~~
fredrb
My bad

------
mrmrcoleman
This is almost exactly how Hyper.sh works under the hood:
[http://hyper.sh](http://hyper.sh)

Disclaimer: I used to work with Hyper.sh.

~~~
_Marak_
Why are you no longer working with Hyper.sh?

~~~
mrmrcoleman
I was hired as an external contractor for a specific project, which is now
complete.

------
otterley
The perplexing thing to me about this paper is that "containers" are launched
merely through a particular type of clone(2) semantics (and clone(2)
implements fork(2) under the hood on Linux). So the process of launching a new
container should take around the same time as it takes to perform a fork. The
fact that it doesn't in their analysis is surprising.

So the question for me is, what is the paper measuring, exactly, when they
talk about Docker's relative performance? Are they including Docker API
overhead in their instrumentation? If so, that sounds like an apples-to-
oranges comparison.

~~~
fpoling
To start a container Docker does a lot of expensive setup, like preparing the
network stack, volumes, logging, etc. One can trivially see how expensive
that is just by timing the command line and dropping the expensive Docker
options. On my laptop:

    $ time docker run --rm ubuntu true

    real    0m0.641s
    user    0m0.036s
    sys     0m0.018s

    $ time docker run --rm --net none --log-driver none --read-only ubuntu true

    real    0m0.399s
    user    0m0.039s
    sys     0m0.018s

Compare that with the unshare utility, which one can use as a Docker
replacement for simple containers (the options create all of the new
namespaces that the utility supports, including user namespaces):

    $ time unshare -m -u -i -n -p -U -C -f -r /bin/true

    real    0m0.029s
    user    0m0.001s
    sys     0m0.009s

------
exabrial
This is pretty cool tech: a "minimized" OS and support libs.
------
graton
Intel has something like this:
[https://clearlinux.org/features/intel%C2%AE-clear-containers](https://clearlinux.org/features/intel%C2%AE-clear-containers)

They call them containers, but it is actually a VM.

~~~
bryanlarsen
The two initiatives are quite different, IMO.

AFAICT ClearContainers only supports running your current kernel and
virtualized hardware in their VM. That assumption gives them some massive
speedups.

LightVM supports running arbitrary kernels, and can be really fast when
running a very lightweight microkernel.

~~~
kraemate
How many custom kernels are out there?

Given the already sad state of most user-space software, do we really want a
proliferation of supposedly lightweight kernels?

~~~
convolvatron
I agree with your point about 'supposedly' lightweight, and it's not clear
how much that weight really costs the kernel's user.

But I'm pretty reluctant to agree that 'last year's Linux with a random bunch
of patches' is the ultimate solution, or even the best solution available
today.

The one really interesting thing that VMs give you is an abstract driver
model. It might not be the absolute best or highest-performance one, but if a
kernel or unikernel is designed only to run in a VM environment, on a server,
then the giant mass of buggy drivers just goes away. It may even be possible
in some circumstances to run with just a VM network device and nothing else.

I personally think there is some fruitful work to be done around the kernel
interface to admit better scheduling of I/O in highly thread-parallel
environments.

But I have to imagine there are some radically improved security
architectures for server-in-a-VM that don't involve a Linux userland,
firewall, package manager, etc.

------
thegabez
Keep your VMs, I'll stick with Docker/K8s.

------
xellisx
Does it have any ability to use USB hardware? Something I looked into in the
past was using LXC to build a multi-USB-WiFi-adapter router that connects to
multiple networks, which might be using the same RFC 1918 netblocks.
Unfortunately, LXC had issues with "bridging" USB hardware into the
container.

------
simonjgreen
If you're stuck in the middle, check out VIC (vSphere Integrated Containers)
for a best-of-both:
[http://www.youtube.com/playlist?list=PL7bmigfV0EqTsggWvOpPWYdlwVwTj2_K7](http://www.youtube.com/playlist?list=PL7bmigfV0EqTsggWvOpPWYdlwVwTj2_K7)

------
srigi
If a unikernel image boots in 4ms, it could be used to directly handle
incoming traffic in some cases. Does anybody have experience with this?

~~~
yaantc
There's a prototype Mirage-based DNS server that does that, Jitsu. See:
[http://www.skjegstad.com/blog/2015/08/17/jitsu-v02/](http://www.skjegstad.com/blog/2015/08/17/jitsu-v02/)

Quote: "Jitsu - or Just-in-Time Summoning of Unikernels - is a prototype DNS
server that can boot virtual machines on demand. When Jitsu receives a DNS
query, a virtual machine is booted automatically before the query response is
sent back to the client. If the virtual machine is a unikernel, it can boot in
milliseconds and be available as soon as the client receives the response. To
the client it will look like it was on the whole time."

~~~
anitil
Holy Moly, sometimes I forget just how fast computers can be.

