

How I shrunk a Docker image by 98.8% – featuring fanotify - jtlebigot
https://blog.jtlebi.fr/2015/04/25/how-i-shrunk-a-docker-image-by-98-8-featuring-fanotify/

======
pmlamotte
Reminds me of the method used for the demoscene FPS game .kkrieger which was
stored in less than 100kb. They basically played through the game several
times and trimmed out any code paths that weren't used in order to get it
small enough, using a rudimentary C++ "pretty printer" they wrote that tracked
executions of code paths. They had the advantage of being able to alter their
code to only use constructs supported by their custom tool. Ultimately, this
led to some bugs/features being stripped. For example, they hadn't pressed
up in the menu during their run-through, so you could only navigate downward.

[https://fgiesen.wordpress.com/2012/04/08/metaprogramming-for...](https://fgiesen.wordpress.com/2012/04/08/metaprogramming-for-madmen/)
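
A rough Python sketch of the same idea, at function granularity only. This is not the demo group's tool (which worked on C++ source and finer-grained code paths); the `tracked` decorator and the menu functions are made up for illustration.

```python
import functools

_registered = set()  # every function we know about
_called = set()      # functions seen during the "playthrough"


def tracked(func):
    """Register func and record its name whenever it is called."""
    _registered.add(func.__name__)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        _called.add(func.__name__)
        return func(*args, **kwargs)

    return wrapper


def never_called():
    """Names of registered functions that the run never reached."""
    return sorted(_registered - _called)


@tracked
def menu_down(index, size):
    return min(index + 1, size - 1)


@tracked
def menu_up(index):
    return max(index - 1, 0)


# A playthrough that only ever moves down in the menu.
position = 0
for _ in range(3):
    position = menu_down(position, 5)

print(never_called())  # prints ['menu_up']
```

Anything reported by `never_called()` is a candidate for stripping, and exactly as in the anecdote, it is only as trustworthy as the run that produced it.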

------
neonfreon
I don't think there is any way to prove that this found all the required
files. The more paths through the code, each with its own potential file
accesses that can't be predicted without run-time information, the more
likely one will be missed in this optimization stage.

~~~
dxhdr
Variation on the halting problem? Given infinite running time and arbitrary
input, can you prove that a program will never access file X?

~~~
birdsbolt
There's no general algorithm, but you could probably prove it, if you tried
really hard for your given example. :D

~~~
seanp2k2
I'm also interested if it's possible to say so with certainty.

------
kentonv
Sandstorm.io has baked something like this into its basic packaging tool for
about a year now, except based on FUSE rather than fanotify. Really helps cut
down package sizes - many are 10-20MB despite containing all userspace
dependencies of the app.
[https://blog.sandstorm.io/news/2014-05-12-easy-port.html](https://blog.sandstorm.io/news/2014-05-12-easy-port.html)

------
derefr
The 80/20 solution here is to just find the few files that take up the largest
amount of space and are clearly pointless to your app, and remove them. The
USB hwdb, for example. Also, trimming down the timezone and locale DBs to just
the ones your app runs on (hopefully UTC and UTF-8) should help, unless your
app has to deal with data containing user-defined datetimes/charsets.
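
Finding those few large files is easy to script. A minimal sketch, assuming you have the image's root filesystem unpacked somewhere on disk; the function name and the `limit` parameter are just illustrative.

```python
import os


def largest_files(root, limit=10):
    """Return the `limit` biggest regular files under root as (size, path)."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # a symlink takes no real space of its own
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                continue  # file vanished or is unreadable; skip it
    sizes.sort(reverse=True)
    return sizes[:limit]
```

Calling `largest_files("/path/to/rootfs", limit=20)` gives a short list to review by hand, which is the whole point of the 80/20 approach: a person decides what is pointless, not a tool.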

The other interesting thing to try, if your app's problem isn't so much
library-dependencies but instead Unix shell dependencies, is to use a Busybox
base image. Apps whose runtimes are already sandboxed VMs, especially, usually
work great under Busybox: the JVM, Erlang's BEAM VM, etc.

------
rwmj
A better idea for chroots or VM images is supermin, where you copy the files
from the host filesystem.
([http://libguestfs.org/supermin.1.html](http://libguestfs.org/supermin.1.html))

------
xorcist
Isn't the point of running an application in a container, or any chrooted
environment, to only isolate the application from the rest of the operating
system?

Then why would you start out with a complete extra operating system in there?
Why not just put the application and its dependencies in there?

To strip non-dependencies from a complete operating system sounds like a very
failure prone way to accomplish almost the same thing. You really need to
execute all code paths, which is difficult to guarantee (did you really run
your application in all locales for example?).

~~~
derefr
Any Unix-ish application (i.e. one that shells out to do something at some
point) will have a package dependency tree that ends up transitively closing
over the "base"/"essential" package-set of the OS. "Dependency" has _three_
meanings, to a packaging system, even though at run-time only one of them is
relevant. There are:

1\. "run-time dependencies" — package B needs package A installed because a
binary from B actually makes use of a file from A when it runs.

2\. "install-time dependencies" — package B needs package A installed because
B is effectively a "plugin" for A. B is theoretically useless _to the OS_ ,
except when used in the context of a sane A-like environment. This usually
also implies that B, when installing itself, will run a script provided by A,
usually to register itself in a database that A owns. This doesn't at all
imply, though, that you couldn't just directly call the binary contained in
the A package for a useful effect.

3\. "asynchronous/maintenance-time dependencies" — package B needs package A
because B does something to increase the system's entropy, and is written to
assume that the system will compensate for this by having A running.

Docker images really only need type-1 dependencies, but as you dig toward the
core of a package dependency graph, you start to see a lot more of type-2 and
type-3 dependencies. If you execute a "debootstrap --variant=minbase", pretty
much everything in there is there for type-2 or type-3 reasons.
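
To make the distinction concrete, here is a small sketch that computes the transitive closure of a dependency graph while following only chosen kinds of edge. The graph, the package names, and the "run" / "install" / "maintenance" labels (standing in for types 1, 2 and 3 above) are all hypothetical; real package metadata does not record this distinction, which is exactly the problem.

```python
from collections import deque

# Hypothetical graph: package -> list of (dependency, kind).
DEPS = {
    "app": [("libfoo", "run"), ("registry-tool", "install")],
    "libfoo": [("libbar", "run")],
    "libbar": [],
    "registry-tool": [("shell-utils", "run"), ("log-rotator", "maintenance")],
    "shell-utils": [("libbar", "run")],
    "log-rotator": [("shell-utils", "run")],
}


def closure(graph, root, kinds):
    """Packages reachable from root when only edges of the given kinds count."""
    seen = {root}
    queue = deque([root])
    while queue:
        pkg = queue.popleft()
        for dep, kind in graph.get(pkg, []):
            if kind in kinds and dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


run_only = closure(DEPS, "app", {"run"})
everything = closure(DEPS, "app", {"run", "install", "maintenance"})
print(sorted(run_only))
print(sorted(everything))
```

In this toy graph a single install-time edge from `app` doubles the closure, because everything behind that edge is then pulled in through ordinary run-time edges.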

A Docker container doesn't need to be a maintainable or autonomous OS
distribution. It doesn't need grub, it doesn't need mkfs or fsck, it doesn't
need mkinitramfs or the HAL hwdb; it doesn't need localegen, or debconf, or
even apt itself. It needs to be a baked, static collection of files related to
the application's run-time needs. But there's no demand you can make of apt or
yum or even debootstrap that will spit out such a thing.

There was a project somewhat in this vein a long time ago, for embedded
systems, called "Emdebian Baked"[1]. It was a misstep, I think, because it
focused on creating variants of packages and a secondary dependency graph;
rather than being a transformation one could apply to existing packages and
the existing graph.

I've worked on and off on creating a transformation tool—effectively, a
combination of a dependency graph "patch" that contains empty virtual-packages
for many essential-package dependencies, a file filter/blacklist, and a final
package whose installation burns away the whole package-management
infrastructure from the chroot this is executing in. I haven't been happy with
any of the results yet, though. Would anyone be interested in collaborating on
such a thing as an open-source project?

[1] [http://www.emdebian.org/baked/](http://www.emdebian.org/baked/)

~~~
xorcist
I beg to differ, but we can probably compare data points until the cows come
home.

Anyhow, even a large-ish application such as Oracle or a control system
doesn't actually use ping or dd or troff, or most parts of what a modern unix-
OS is comprised of. Most things suid are usually unnecessary, which if nothing
else does decrease the attack surface.

Most web apps probably need nothing unix-ish at all. A chrooted PHP app
mounted noexec makes me sleep better than one running in a complete operating
system. And most server side Java apps re-invent everything unix anyway, from
mail processing to cron jobs, so they generally don't shell out as often as
you'd think.

So I would argue it's actually pretty common that your applications have a
limited set of dependencies. Especially compared to the hundreds of packages
in any minimal modern unix install.

~~~
derefr
I agree that it's common, but it's not common enough to make this into a
helpful property if you're trying to define a 100% solution. The reason Docker
exists at all, apart from just nsexec(1)ing static binaries, is that a lot of
things _do_ need an environment—not of other Unix binaries per se, but of
library assets like locales, charmaps, keymaps, geoip mappings, etc.—and then
these asset packages think they're there to provide assets for maintenance-
time functionality of a computer rather than to provide run-time functionality
to an app in a container, so they pull in utilities related to themselves,
which pulls in the base system.

If you can manage to get a working install of Postgres without pulling in half
of Debian, I would be surprised.

But yes, on the other hand, it's perfectly possible to package _some_ things,
like the JVM, in a sort of "spread-out in a directory but equivalent to
static-linked" fashion. The sort of things you see telling you to "unzip them
into /opt/thispkg" because they don't really follow any Unix idioms at all,
tend to be surprisingly container-friendly. They come from a world where
binaries are expected to be portable across systems with different versions of
OS libraries available, rather than a world where each app gets to ask the OS
to install whatever OS library versions it requires.

~~~
xorcist
Postgres is actually a good counter example to your point. It is a self-
contained application that doesn't shell out. It doesn't need to access any of
the things you mention, including charmaps, keymaps and geoip mappings.

I regularly run it chrooted without problems. You do need to understand your
use case, however. Things like external database utilities and backup scripts
differ in requirements. Some of them are run outside the chroot, some aren't.

It's absolutely not complicated, and if you have the faintest idea what you're
doing it's much easier to get right than the fanotify dance described above.

And a complete operating system in a chroot would sit mostly unused, and only
increase the attack surface for no reason at all. So, why?

------
errordeveloper
The exact approach described here is very extreme. It's a top-down method with
a tool. I find the tool may be of some interest, but I think a bottom-up method
would be more practical. I have done some experiments with Yocto/OpenEmbedded
and am going to put that out one day, once I have time to document it ...

------
social42
Why not just use a micro kernel container like OSv from Cloudius? Same
result with less effort.

~~~
jtlebigot
The truth is hidden in a comment: the goal was to learn the fanotify syscall
using a real-world use case. This said, when Dockerizing an application from
scratch, using an optimized base image may be a suitable option. But that's
not always the case. For instance, I often start a project from the Python
base image which contains loads of generic libraries that I will not use in a
given project but will be important for others. This is when a profiling-based
approach is interesting. You get the ease of a known environment and the
efficiency of an optimized image.
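
The pruning half of such a profiling-based approach can be sketched in a few lines of Python. This assumes the profiling run has already produced a list of accessed paths (from fanotify or any other monitor); the function name is illustrative and this is not the code from the article.

```python
import os


def unused_files(root, accessed):
    """List files under root that do not appear in the accessed set."""
    accessed = {os.path.normpath(p) for p in accessed}
    unused = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.normpath(os.path.join(dirpath, name))
            if path not in accessed:
                unused.append(path)
    return sorted(unused)
```

The result is only a list of candidates for removal. As others in the thread point out, a file missing from the profile may simply belong to a code path the profiling run never exercised.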

------
zokier
ptrace probably would have been a better solution; at least it would have
avoided the problems with links.
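
Presumably the problem with links is that a monitor which reports the resolved file does not tell you which symlinks were followed to reach it, so the links get pruned. If you know the path the program originally asked for, a sketch like the following keeps the whole chain. It only follows links in the final path component, and the function name is made up.

```python
import os


def link_chain(path, max_hops=40):
    """Return path followed by every symlink target reached from it, in order."""
    chain = [path]
    current = path
    for _ in range(max_hops):  # bounded so a symlink loop cannot hang us
        if not os.path.islink(current):
            break
        target = os.readlink(current)
        # A relative target is interpreted relative to the link's directory.
        current = os.path.normpath(
            os.path.join(os.path.dirname(current), target)
        )
        chain.append(current)
    return chain
```

Every entry in the returned list has to survive pruning, not just the last one. Symlinked directories earlier in the path would need the same treatment and are not handled here.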

------
lsllc
Here you 'go':

[http://blog.xebia.com/2014/07/04/create-the-smallest-possibl...](http://blog.xebia.com/2014/07/04/create-the-smallest-possible-docker-container/)

------
nathwill
Why not ldd?

~~~
sophacles
I've been playing with a project to do this. The first major obvious problem
is anything that uses dlopen won't necessarily get all that it needs.
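
For reference, the ldd half is simple enough. A minimal sketch that pulls library paths out of ldd-style output; the sample text and its addresses are made up for illustration, and in practice you would feed in the real output of `ldd` for your binary.

```python
def parse_ldd(output):
    """Extract absolute library paths from ldd-style output."""
    paths = set()
    for line in output.splitlines():
        parts = line.split()
        if "=>" in parts:
            idx = parts.index("=>")
            # "name => /path (addr)"; "name => not found" is skipped.
            if idx + 1 < len(parts) and parts[idx + 1].startswith("/"):
                paths.add(parts[idx + 1])
        elif parts and parts[0].startswith("/"):
            # The dynamic loader is listed by absolute path without "=>".
            paths.add(parts[0])
    return sorted(paths)


# Illustrative sample only, not captured from a real system.
SAMPLE = (
    "\tlinux-vdso.so.1 (0x00007ffc1a5f2000)\n"
    "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2b3c000000)\n"
    "\t/lib64/ld-linux-x86-64.so.2 (0x00007f2b3c400000)\n"
    "\tlibmissing.so => not found\n"
)

print(parse_ldd(SAMPLE))
```

Nothing loaded through dlopen shows up in this output, since ldd only reports what the binary declares at link time. Data files such as locales or configuration are missed for the same reason.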

~~~
errordeveloper
Yes. Grepping the code or fighting runtime errors are two complementary
approaches I can think of... Not sure if there are other methods.

------
pure_x01
Another way of achieving this:
[https://github.com/PerArneng/fortune](https://github.com/PerArneng/fortune)

------
guidob
While a standard base might be bigger, it does make it easier to cache when
you use it in most images. A lot of smaller specific images will mostly be
unique.

------
krakensden
Does CAP_SYS_ADMIN still leak out of containers? I know at some point running
with that meant you were root on the host...

~~~
justincormack
That's only to find the files, not afterwards.

------
Immortalin
Could this method be used as a reversed way of creating Unikernels?

------
SandB0x
If only there were some way to describe in a few lines of text what an image
should contain?

~~~
Animats
Do Docker images really have to contain an entire bloated Linux distro? Even
for Xen, which, as a hypervisor, provides fewer services than Docker, it's
possible to write applications which run directly under Xen.

~~~
errordeveloper
They don't have to; one can run a static binary without any problems. It's just
that most people keep throwing in a whole distro...

~~~
iso8859-1
You can't make a truly static binary with glibc, so almost no one has a
toolchain that is able to do it.

~~~
michaelmior
What do you mean by "truly static"?

~~~
iso8859-1
One that doesn't try to load dynamic objects at runtime, as glibc does if
you use certain functions.

