
Linux Container Internals - deepakkarki
http://docker-saigon.github.io/post/Docker-Internals/
======
kragniz
If you want to learn about linux container internals, I can recommend just
trying to implement one yourself in some random language for fun. I wrote a
basic runtime in python that can run docker images:

[https://github.com/kragniz/omochabako](https://github.com/kragniz/omochabako)

Quick demo:
[https://asciinema.org/a/77296?speed=2&autoplay=1](https://asciinema.org/a/77296?speed=2&autoplay=1)
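
For readers curious what "implement one yourself" boils down to: at its core
it's a handful of syscalls. Below is a minimal, illustrative sketch (function
names are mine, not from omochabako) that calls unshare(2) through ctypes;
the flag values come from <linux/sched.h>, and actually entering the
namespaces requires root:

```python
import ctypes
import os

# Namespace flag values from <linux/sched.h>
CLONE_NEWNS  = 0x00020000   # private mount table
CLONE_NEWUTS = 0x04000000   # private hostname
CLONE_NEWPID = 0x20000000   # private PID numbering

def container_flags():
    """The minimal namespace set for a toy container."""
    return CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWPID

def run_contained(argv):
    """unshare(2) into fresh namespaces, then fork+exec argv. Needs root."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(container_flags()) != 0:
        raise OSError(ctypes.get_errno(), "unshare failed (not root?)")
    pid = os.fork()          # first child after CLONE_NEWPID becomes PID 1
    if pid == 0:
        os.execvp(argv[0], argv)
    os.waitpid(pid, 0)
```

Something like `run_contained(["/bin/sh"])` under sudo gives a shell that is
PID 1 in its own namespace (visible via `ps` once /proc is remounted).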

~~~
decebalus1
that sounds pretty cool. Sorry for my ignorance (heavy Windows background) but
where does one start with something like that? Is there an RFC type of thing
for these images or runtime spec?

~~~
gens
[https://lwn.net/Articles/531114/](https://lwn.net/Articles/531114/) and
[https://www.kernel.org/doc/Documentation/cgroup-v1/](https://www.kernel.org/doc/Documentation/cgroup-v1/)
(or the newer
[https://www.kernel.org/doc/Documentation/cgroup-v2.txt](https://www.kernel.org/doc/Documentation/cgroup-v2.txt)
)

Actually all that was said (and somewhat explained) in the article in
question. Although I would recommend trying cgroups directly with the kernel.
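
"Trying cgroups directly" really is just reading and writing files, since the
kernel exposes cgroups as a filesystem. A small sketch (v2 layout; the helper
names are mine, and writing limits needs privileges):

```python
import os

def current_cgroup():
    """This process's cgroup path on the unified (v2) hierarchy, or ""."""
    try:
        with open("/proc/self/cgroup") as f:
            for line in f:
                hier_id, _, path = line.strip().split(":", 2)
                if hier_id == "0":      # "0::/path" is the v2 entry
                    return path
    except FileNotFoundError:           # not on Linux
        pass
    return ""

def set_memory_limit(cgroup_dir, limit_bytes):
    """Cap a cgroup's memory by writing its memory.max file (needs root)."""
    with open(os.path.join(cgroup_dir, "memory.max"), "w") as f:
        f.write("%d\n" % limit_bytes)
```

For example, `set_memory_limit("/sys/fs/cgroup/demo", 64 * 1024 * 1024)`
after `mkdir /sys/fs/cgroup/demo` caps that group at 64 MiB, no container
runtime involved.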

------
Demiurge
The URL is correct, 'Docker-Internals', but the title, and the further
conflation of 'Linux containers' and 'Docker', are a bit confusing. I'm a
happy lxc/lxd user, and had to stop reading because of the cognitive
dissonance.

~~~
agumonkey
I also thought it was about LXC. Sad.

~~~
skeptic2718
Why is it sad?

~~~
dozzie
Because LXC on a technical level is much simpler, doesn't make a mess of
network and filesystem setup, and is easy to understand. Docker does some
heavy magic for things to work for programmers (the ones who can't be
bothered to learn how ethernet bridging or IP routing work), so the whole
thing feels _brittle_.

In short, LXC is vastly underappreciated.

------
xorcist
OpenVZ is dated 2005 in the document, which makes FreeBSD look a bit alone
back in 2000, but that's not really accurate. OpenVZ was just a re-branding of
Virtuozzo, which was released in 2000, in an effort to upstream it.

Virtuozzo got really popular quickly in the cheap webhosting market, as a more
secure and powerful alternative to shared hosting. I worked with it a lot in
the early 2000s. I don't think they ever outgrew that market however, and when
"real" virtualization came with VMware and friends, that's where all the money
went.

It's only fair to mention where it really started in the Linux world. (Also a
bit funny to see the pendulum of tech swing back again. It's about time!)

~~~
so0k
Cool! Thanks for the info. I've added a link back to these comments to the blog
source (haven't re-generated the static html yet though)

------
siegecraft
This is from February, which is ancient in Docker years, but the container
history and references are quite useful.

~~~
so0k
yet it covers containerd ;)

------
SEJeff
Also relevant:

[https://0xax.gitbooks.io/linux-insides/content/Cgroups/cgroups1.html](https://0xax.gitbooks.io/linux-insides/content/Cgroups/cgroups1.html)

------
peterwwillis
I have some small critiques of some of the hyperbole in the article:

 _" Package managers failed us due to shared libraries version differences
causing dependency issues"_

Incorrect. The software administrators (read: The Users) failed to understand
that installing duplicate incompatible software does not work, was never
intended to happen, and shouldn't even be possible. But users are stubborn and
will force a conflict if at all possible.

Containers allow users to _side-step package management_. They don't replace
it or help it at all, because they completely ignore all the work that went
into the package. Imagine putting on tennis shoes, and then trying to put on
snow boots. Containers give users a second pair of feet.

And this is not a container innovation. Chroot environments have been
providing the exact same functionality (installing side-by-side conflicting
packaged software in a simple manner) for decades. You don't even need any
extra software to use it.
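
As a concrete illustration of that point, the entire chroot mechanism is one
syscall plus a chdir. The function below is a sketch (the name is mine);
chroot(2) requires root, and unlike containers it isolates only the
filesystem view, not PIDs or the network:

```python
import os

def enter_chroot(new_root):
    """Confine this process's filesystem view to new_root. Needs root."""
    if os.geteuid() != 0:
        return False                # chroot(2) needs CAP_SYS_CHROOT
    os.chroot(new_root)
    os.chdir("/")                   # don't leave the cwd outside the jail
    return True
```

A conflicting package set can then be installed into `new_root` with the
distro's own tooling (e.g. `debootstrap` or `dnf --installroot`), which is
exactly the "no extra software" observation above.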

 _" Docker provides a self-contained image that is exactly that same image
running on your laptop vs in the cloud while i.e. Puppet/Chef are procedural
scripts that need to rerun to converge your cluster machines. This enables
approaches also know as Immutable Infrastructure or Phoenix Deploys."_

Unless you designed your software to be immutable, it probably isn't. Software
changes as it runs, and different hardware changes software differently, so at
the best this claim is disingenuous. Different networks and systems
interacting in different locations add complications. If you tested it on your
laptop, _do not expect it to run the same in production, period._

 _" Before Docker, LXC would create a full copy of FileSystem when creating a
container. This would be slow and take up a lot of space."_

Loop and COW filesystems (Unionfs, Aufs, Overlayfs, etc) on Linux pre-date
Docker by a long time, and were used with containers and container
alternatives.
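
For instance, Docker-style copy-on-write layering is available from a stock
kernel with a single overlay mount. A sketch of the invocation (the helper
name and paths are hypothetical, and the mount itself needs root):

```python
def overlay_mount_cmd(lower, upper, work, target):
    """Build the mount(8) command line for an overlay filesystem: reads fall
    through to the read-only `lower` layer, writes land in `upper`, and
    `work` is overlayfs scratch space on the same filesystem as `upper`."""
    opts = "lowerdir=%s,upperdir=%s,workdir=%s" % (lower, upper, work)
    return ["mount", "-t", "overlay", "overlay", "-o", opts, target]
```

Running the returned command as root (e.g. via subprocess) gives layered,
shared base images with nothing but the kernel and mount(8).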

\--

I thought I'd see more about _linux container internals_, not a description
of how Docker works, but I guess the host name should have been a dead
giveaway. Don't read this if you want to know about the kernel.

~~~
geofft
> _Incorrect. The software administrators (read: The Users) failed to
> understand that installing duplicate incompatible software does not work,
> was never intended to happen, and shouldn't even be possible. But users are
> stubborn and will force a conflict if at all possible._

Why doesn't it work? By whom was it never intended to happen? Why should it
not even be possible?

I've shipped production software that - very carefully - links multiple
versions of OpenSSL _within the same process_, so it's not a matter of some
law of physics that I can't have two versions of OpenSSL on my system used by
separate binaries. It's a design choice that this is how things are going to
work. You don't need containers to pick a different design choice, yes, but
neither do you need chroots - just careful use of shared library versioning
and symbol versioning.

Containers won because containerization tools made all of this easy. Nobody
wants to piece together shell scripts to do things in chroots any more than
they want to piece together shell scripts to set LD_LIBRARY_PATHs. (And way
more commercial software actually does the latter: it side-steps package
management because vendors have no idea what libraries are on your system.)
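
The wrapper-script trick mentioned here amounts to prepending a bundled
library directory to LD_LIBRARY_PATH before launching the real binary. A
sketch (the function name and `/opt/vendor/lib`-style path are mine, for
illustration):

```python
import os
import subprocess

def run_with_bundled_libs(argv, libdir, **kwargs):
    """Launch argv with libdir prepended to LD_LIBRARY_PATH, the way vendor
    wrapper scripts point the dynamic linker at their own shared libraries."""
    env = dict(os.environ)
    env["LD_LIBRARY_PATH"] = libdir + os.pathsep + env.get("LD_LIBRARY_PATH", "")
    return subprocess.run(argv, env=env, **kwargs)
```

The child process (and anything it dlopen()s) then resolves shared libraries
from `libdir` first, without touching the system's packaged copies.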

~~~
peterwwillis
> _Why doesn't it work? By whom was it never intended to happen? Why should
> it not even be possible?_

It doesn't work because it's incompatible, and so it's complicated. If I build
A with B1, and you build C with B1.1, and the user wants both A and C, they
need both B1 and B1.1. Which is fine - IF they built B* with unique symbol
names, and built their apps against those unique symbol names, and if everyone
else in the world follows exactly the same convention. Of course, if anything
else changes (cpu architecture, features, ABI, whatever) everything may break
anyway. But in general the biggest problem is not everyone builds software the
same way.

Both the software developers and the package managers never intended for
incompatible software to be installed at the same time. The software devs
could make it handle these cases, but they usually don't, so it doesn't work.
The package managers could package their software uniquely every time, but
that would be annoying, cumbersome and not very useful for managing systems
("do i need to remove db3 before i install db4? what are all the packages
called? what's the order? what else will be affected? do i rename everything
and rebuild everything with names specific to this one library package name?"
etc).

It shouldn't be possible to install conflicting software because the package
should be built to fail to install if conflicting software exists, or remove
the conflicting software before install. But sadly there also exists the
ability to remove all these safeguards, or to install unpackaged software.

Containers are just a wrapper around existing tools, such as package managers.
They don't add functionality, they just simplify it. With Docker, you aren't
linking to multiple versions of openssl within the same process: you're
running one process in one environment with one version of openssl, unless you
intentionally get really fancy, which really isn't easy. Package managers
never failed, they simply weren't being used right.

Containers won because someone finally realized users don't care how they do
what they want, as long as they get to do it without having to know how it
actually works. Devs get to pretend they know how to deploy software or manage
systems and Ops people get less responsibility because they didn't build the
shit so they don't support it. It's a win-win, but it's still a mess, and none
of it is new or novel.

~~~
geofft
> _IF they built B* with unique symbol names, and built their apps against
> those unique symbol names_

This isn't necessary. If you're not going to load both versions into the same
process, they can overlap symbol names. This is how Linux distro version
upgrades work: the system installs libfoo2, then upgrades binaries that use
libfoo1 to versions that use libfoo2, then removes libfoo1 when nothing needs
it any more. At all times, the system is in a working state; any given binary
will load either libfoo1 or libfoo2.
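
Concretely, the soname scheme is what makes this work: each ABI-incompatible
major version is a differently named file (libc.so.6, not libc.so), and each
binary records which soname it needs, so two majors can sit side by side on
disk. A quick demonstration via ctypes (assumes a glibc-based Linux system):

```python
import ctypes

# Load a library by its versioned soname, exactly as the dynamic linker does
# when a binary's DT_NEEDED entry names it. If libc.so.5 were still installed
# alongside, old binaries would keep loading that file instead.
libc = ctypes.CDLL("libc.so.6")
libc.abs.restype = ctypes.c_int
libc.abs.argtypes = [ctypes.c_int]
print(libc.abs(-42))    # prints 42
```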

The trouble is that Linux distros tend not to want to provide more security
support for libfoo1 than they have to, so if you have software that still
requires libfoo1, the easiest approach is to use a
container/chroot/VM/whatever with an older distro release, possibly from a
different vendor, that's hopefully still under security support.

(If you do care about loading both libraries into the same process, you need
symbol versioning / two-level namespaces / direct binding / whatever your
ld.so wants to call it, which means that every reference to a dynamic symbol
specifies which dynamic library the symbol comes from. The names themselves
remain unchanged, but they're referenced by a tuple of library and name. This
works. Again, I've shipped software that would crash horribly if this didn't
work.)

> _Both the software developers and the package managers never intended for
> incompatible software to be installed at the same time._

I'm not sure that's true for software developers: I can't imagine that, say,
the OpenSSL developers do their work by replacing their system OpenSSL every
time they recompile. They already know full well how to test an OpenSSL in
~/src/openssl and keep it separate from the one in /usr/lib, without using
chroots.

It's true for package managers, but that just means that package managers are
failing at delivering a thing users want.

In particular, forcing upstream software to follow conventions and building
all software the same way, and patching things as necessary, _is the entire
job of a distro_. If two versions of a distro package conflict, that's because
the distro chose not to make them coinstallable. If only one version of a
library is available in a distro, that's because the distro chose not to make
other versions available. They might have reasons for this (e.g., security
support effort) but none of it is fundamental impossibility.

(Also, if you mean B1.1 in a semver sense, or equivalently a libb.so.1.1
sense, upstream is promising that it's backwards-compatible with B1, such that
A can dynamically use B1.1 despite being compiled against B1. If that's not
true and B1.1 is ABI-incompatible with B1, either upstream or the distro needs
to rename B1.1 to B2 / rename libb.so.1.1 to libb.so.2.)

> _Devs get to pretend they know how to deploy software or manage systems_

I submit that the only valid measure of whether you know how to deploy
software or manage systems is whether systems get deployed or systems get
managed.

~~~
peterwwillis
I'm not saying you can't run software with duplicate libraries installed. I'm
saying there is conflicting software, both on individual distros and across
distros, that is simply _not currently_ created in a way that can be installed
side by side and run without extra steps involved. Specifically conflicting
file names, but also conflicting functionality which extends beyond just
shared library conflicts. And I'm saying that Docker serves the function of
"fixing" a problem which package managers did not create.

> _I submit that the only valid measure of whether you know how to deploy
> software or manage systems is whether systems get deployed or systems get
> managed._

If you don't care at all about the result, sure.

------
mthoms
I've been struggling with fully understanding containers. This article helps
but it's a little too low level for me.

A quick question for HN'ers: If you've got a machine running say 4 docker
instances, does it help resource usage if all instances are running the same
Linux distro?

Or, since the kernel is the only thing shared between them does it even
matter?

------
MaBu
Site looks broken, because the CSS is loaded over HTTP, which browsers block
when the page itself is loaded over HTTPS.

