

Baseimage-docker, fat containers and “treating containers as VMs” - specto
http://blog.phusion.nl/2015/01/20/baseimage-docker-fat-containers-treating-containers-vms/

======
Gigablah
I used baseimage-docker for a while but wasn't satisfied with the image size
(around 300mb with nodejs installed), so I switched to debian as the base and
removed stuff I didn't need such as syslog and openssh. That alone brought it
down to 140mb.

However, seeing as other people have managed to distribute their applications
in images as small as 7mb (e.g. progrium/logspout), I decided to create one
based on busybox with s6 as the process supervisor, with help from this
article [1]. I'm now pretty happy with my 33mb nodejs environment :)

[1]: http://blog.tutum.co/2014/12/02/docker-and-s6-my-new-favorite-process-supervisor/

~~~
mikepurvis
I really like this idea, but it really only works for stuff you can build
statically.

I'm wondering if there's a middle ground: for example, a busybox container
which gives you python and pip, or a workflow which lets you install a deb and
all its dependencies into a container, without the container itself needing to
have all the apt machinery and other bootstrap detritus on board.

~~~
Gigablah
You can bundle busybox with opkg and use that to install python 2.7 and pip. I
tried it out myself (using progrium/busybox [1] as a base) and the image comes
to 27mb, not bad.

[1]: https://github.com/progrium/busybox

~~~
mikepurvis
Ah, that's fantastic. Thanks for the pointer!

------
herp_derpington
I don't use phusion/baseimage, but I'm not against it existing. I would,
however, like to clarify a couple of things in your two blog posts today:

1. "we are the most popular third party image on the Docker Registry". This
is true based on the # of stars, but that can be misleading when you look at
the actual number of pulls. Don't get me wrong, phusion/baseimage is popular
with about 230k pulls, but if you look around there are dozens of images with
millions of downloads, so your claims are a bit misleading.

2. phusion/baseimage inherits from the official ubuntu:14.04 image. It adds a
lot of things, and starts a number of services by default, so it is absolutely
a "fat" VM-like container. I'm not against this at all, but I will point out
that the Dockerfile Best Practices article
(http://docs.docker.com/articles/dockerfile_best-practices/) is
explicit that a container should kick off one process. If you look at how the
highly curated official repositories function, none of them run more than one
process or utilize a supervisor. Docker gives you the freedom to do whatever
you want, but calling phusion/baseimage a "correct" way to do Docker conflicts
with the official documentation. I totally get that phusion/baseimage has been
around for a long time, and provides solutions to some common problems which
may make adoption easier for some with legacy apps, but I would refrain from
claiming that your solution "gets everything right". By all means, use
whatever works, just be aware that the best practices are clearly outlined on
docs.docker.com, not the Phusion blog.

------
kstenerud
I'm almost done migrating all of my images over to baseimage. Most things I'm
running come from debian packages, and are designed to run in a full unix
environment as a service, so it's a lot easier to just make
/etc/my_init.d/00_myservice.sh, which sets up permissions for mounted volumes and
calls "service myservice", then lets the OS handle the rest.
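
The startup-script approach described above could look roughly like the following. This is only a minimal sketch, assuming baseimage-docker's my_init runs the script at container boot (its documentation names /etc/my_init.d/ as the directory). SERVICE_NAME, SERVICE_USER, DATA_DIR, DRY_RUN and the run helper are hypothetical names chosen for illustration, not taken from the thread or from baseimage-docker.

```shell
#!/bin/sh
# Hypothetical startup script, e.g. installed as 00_myservice.sh.
# All names below are placeholders, not part of baseimage-docker.
set -eu

SERVICE_NAME="${SERVICE_NAME:-myservice}"
SERVICE_USER="${SERVICE_USER:-myservice}"
DATA_DIR="${DATA_DIR:-/var/lib/myservice}"   # assumed mount point of the volume
DRY_RUN="${DRY_RUN:-1}"                      # 1 = only print; set to 0 inside the image

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Mounted volumes often arrive owned by root; hand them to the service user.
run chown -R "$SERVICE_USER:$SERVICE_USER" "$DATA_DIR"

# Let the distribution's own init script start the daemon.
run service "$SERVICE_NAME" start
```

With the default DRY_RUN=1 the script only prints the two commands, so it can be tried outside a container; inside the image it would be run with DRY_RUN=0.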

The image size doesn't bother me. Storage is cheap, and I can easily clear out
the old cruft using "docker rmi $(docker images -q -f dangling=true)". Why
should I care about 200mb in the age of multi-terabyte drives?

~~~
Gigablah
Storage is cheap and plentiful, yes. Can't always say the same for bandwidth
or transfer.

~~~
FooBarWidget
If people _really_ cared about that then VMs wouldn't be so popular, and Docker
wouldn't be either. Everybody would be using shared libraries in order to
optimize away duplication as much as possible, instead of statically linking
things in order to make deployment easier.

~~~
regularfry
Yes, because it's _totally impossible_ for the popularity of VMs and Docker to
be driven by people who don't have that constraint.

