How did we end up with containers? (tedinski.com)
133 points by mmphosis on April 5, 2018 | 74 comments



This is mostly a good post, but he doesn’t mention one huge security disadvantage of containers: When application artifacts all include their own versions of dependencies locked at build time, it becomes very hard to update a component that has a security fix. It’s somewhat tractable if you own all the images — you “just” have to rebuild every application with the vulnerable component. But once you start accepting prebuilt images from elsewhere, you are at the mercy of the external source unless you want to fork the Dockerfiles.

It’s the same lesson that a lot of people learned about static linking, with the zlib vulnerabilities years ago.

https://www.linux.com/news/linux-kernel-netscape-affected-ma...


this is really about static vs dynamic linkage, regardless of the mechanism you use to effect run or build time binding of dependencies.

there are as always tradeoffs, but a few more arguments against dynamic:

by opening up a path (automatic updates) to a third party, you're providing a wide vector for those updates to cause new* problems (correctness, security) for your deployment, as well as fix old ones

transitive api compatibility across these upgrades is more of a statement of faith and effort of will than any real guarantee. in reality the compatibilities between the various components are piecewise, and if the graph gets big enough it can be really hard to find a compatible set of libraries.

runtime binary integration is always strictly worse than build time source integration, since there is tooling in the latter promoting consistency, and basically none in the former

since you'd really like to be able to deploy changes in your own* software with very little effort - dependency updates can ride on the same train for free.

presumably you have to have some intervention to restart your service against the new dependencies regardless. at that point it's a question of the relative complexity of a dynamic restart of an existing service vs a build and deploy. in the limit these are both buttons.

from that perspective, dynamic updates are a helpful crutch for people who don't have a good handle on their own build, test and deploy. that's pretty far from being an arguable fundamental good.


> this is really about static vs dynamic linkage, regardless of the mechanism you use to effect run or build time binding of dependencies

Are you familiar with the functional package management approach of tools like Nix (https://nixos.org/nix/) and Guix (https://www.gnu.org/software/guix/)?

They use dynamic linking but store all packages in a content-addressed format, outside the usual FHS. Packages are free to rely on separate versions of their dependencies as needed. Packages are defined in a 'source-based' format, but because of the content-addressable storage strategy, both Nix and Guix can offer transparent binary caching.
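
To make that concrete, here is a hedged illustration of what the dynamic dependencies of a Nix-built binary look like; the store paths and versions below are made up, but the point is that every library resolves into the content-addressed store rather than /usr/lib:

    # On a machine with Nix installed, inspect where a Nix-built binary's
    # shared libraries actually live. Paths/versions here are illustrative.
    ldd "$(readlink -f "$(command -v curl)")"
    #   libssl.so.1.0.2 => /nix/store/<hash>-openssl-1.0.2n/lib/libssl.so.1.0.2
    #   libc.so.6       => /nix/store/<hash>-glibc-2.26/lib/libc.so.6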

Guix folks in particular are working on verifying the integrity of binary caches, so you don't have to blindly trust them, and they've developed a system they call 'grafting' which can safely and quickly replace vulnerable packages without recompiling.

Both Nix and Guix have some container infrastructure as well. I'm not as familiar with the Guix side, but Nix has tooling for deterministically generating Docker images and systemd-nspawn containers, so you can leverage these packaging tools at the same time as existing container deployment infrastructure.

I like your analysis of 'build time' vs 'runtime' integration of dependencies, and I think you might enjoy playing with/critiquing Nix and Guix, because they address these problems in a strange and powerful way. They use dynamic linking for deduplication and small upgrades, but try very hard to integrate everything they can at build time. I'd be interested to see where you think their approach is strong and weak, if you ever get a chance to try it out.


I don't really deal with these issues directly anymore, so no, I haven't had a chance to use any tools in the Nix family. But sure, once you eliminate the ambiguity around 'which bytes', it hardly matters whether you distribute them in one chunk or several.

What makes Nix more interesting than just that is that it actually provides tools to manage the dependencies and build. As much as I think owning your whole world is important, it can really consume a lot of engineering resources just due to the state of the world. Since Nix signatures are transitive (as I understand it), it also captures the versions of the included tools, which is usually further than most organizations are willing to go. And while issues having to do with compiler versions are rare, they can really consume a lot of energy to track down.

So, without having any firsthand experience, it seems like it's as dependable as static linkage, and may be quite a bit more convenient.


I can't help but feel that twenty years from now we'll be tut-tutting because someone discovered that some old company is still running containers built in 2018 that have massive security vulnerabilities in them.

There's a certain positive aspect to the treadmill effect you get with operating system upgrades that gets blunted once you can run another version of everything in a container nobody has rebuilt in ages.


Although the counter-argument is, I've seen a whole bunch of cases where OSes weren't updated because this one old app wouldn't work on the new thing, so unmaintained legacy apps are always a problem.


True. Can't update python because we are running our internal tool Dingus 2.1 on this machine.

But in that case you might have 3 teams putting pressure on the Dingus team to upgrade their stuff. If they can just retort with "run it in a container dummy" then they have no incentive to ever fix it.


side-eyes windows xp box controlling a machine


> this is really about static vs dynamic linkage, regardless of the mechanism you use to effect run or build time binding of dependencies.

No. Distributions have been ensuring security for 20+ years by:

- using dynamic linking

- unbundling/unvendoring dependencies into dedicated packages

- having a security team that updates vulnerable packages in a timely manner

- having tools to update vulnerable packages without having to rebuild/reinstall/reconfigure the world

This is what happens when you don't do that:

https://blog.acolyer.org/2017/04/03/a-study-of-security-vuln...


>runtime binary integration is always strictly worse than build time source integration

That is true, except in the case where you aren't empowered to do the building. Then dynamic update is better than no update.


> runtime binary integration is always strictly worse than build time source integration

This is rather harsh on "plugin" architectures, COM, and OS libraries.


It may be harsh but it's true. Don't get me wrong: plugin architectures, OS libraries, etc. do make the problem much easier to solve, and 99% of the time you will be OK. It's in that 1% outlier case, when a dependency you thought was rock-solid stable needs an immediate security patch but fails in weird ways when you apply it, that it really shows up. And because it's so infrequent, you may not even have any mechanism in place to fix it quickly.


Good luck with the fork-and-rebuild strategy. It's a maddening process to track down and then vendor all the upstream Dockerfiles when you want to use a DockerHub image but don't trust DockerHub binary images.

I seriously don't get how folks are comfortable with using all these binary layers built and rebuilt from unspecified versions of other binary layers, all shipped with no cryptographic verification by default. (Transport security with SSL is good enough, yeah? sob)


We're in a situation of needing to create a very stringent SDLC + supporting tooling for use in safety-critical software development, and it's sometimes surprising how much of the traditional ecosystem (and thus convenience) we have to opt out of, because relying on its many layers of weakly verified trust would mean taking on liability without any meaningful way to measure how exposed to risk we'd be.

It's thankfully a very interesting problem to have to address, but it does mean that our time-to-market is inherently atypical compared to others.


I worked on a code signing library years ago that fed data to safety critical systems. Other than the one coworker who suggested we add a sign-off step where we manually verified the SHA-1 hashes (which caught that we had recompiled one of the libraries ourselves, oops!) I had to defend my decision to check all of the dependencies into version control and manually update them.

This class of problems has been around for ages, the noise is just getting louder every day.


It’d be a great strategy for Docker, considering what looks like an industry convergence on K8S as an orchestration platform.

Double down on the Dockerfile format and abstract away things that would otherwise be frozen in time, taking ownership of some aspects of DockerHub that are currently managed by contributors. They already do this with official language images, but there has to be more to it than expecting people to figure out the intimate details of each base image while keeping them up to date across OS and library versions.


I think they are doing something along those lines with linuxkit: https://github.com/linuxkit/linuxkit.


Removing as many layers as possible is one way to mitigate this: adding a static binary in a `FROM scratch` container, for example, where all libraries are compiled from sources that you verify as best you can.


Love this comment. And especially with compiled languages like Go and the multi-stage build feature in Dockerfiles, this makes it incredibly easy to do just that.
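
For anyone who hasn't tried it, a minimal sketch of that pattern (image tags and file names are placeholders; multi-stage builds need Docker 17.05 or newer):

    # Build a static Go binary in one stage, ship only the binary in the next.
    cat > Dockerfile <<'EOF'
    FROM golang:1.10 AS build
    WORKDIR /src
    COPY main.go .
    RUN CGO_ENABLED=0 go build -o /app main.go
    FROM scratch
    COPY --from=build /app /app
    ENTRYPOINT ["/app"]
    EOF
    docker build -t myapp .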


One thing I forgot to mention is that you are still trusting trust – the builder needs to be verified for the output to be verified.


True, but really the verification problem isn't nearly as bad as the fact that people are running perfectly valid year-old images with year-old vulnerabilities permanently ensconced.


After using https://github.com/coreos/clair to scan some official images, I was blown away by the number of known vulnerabilities in every one.

Seems like the only path forward if security is a priority is to build the base layers from scratch...


Good thing hardly anyone actually cares anywhere near as much about security as they claim to.


Why's that a disadvantage of containers versus, e.g. third-party libraries used by an application?


You can update a shared library without regard to the applications using it. You just "apt update" the vulnerable library and all applications get it. (In theory--but the other side of this argument is that occasionally that will trigger a bug.)

If it's a static library, see my zlib comment. Those have the same problem.


> You just "apt update" the vulnerable library and all applications get it.

And restart every affected service. You can't "just" run the apt update; the binaries using any shared libraries must be restarted so as to load the newer version of the library. I find this is often the harder problem to solve. (If you can just reboot the machine, do that, it's easier; otherwise you have to track down who was using the old binary and restart each one of them…)

(sorry, this is a rather critical step that I've seen missed before.)


CentOS has a great utility called needs-restarting which finds processes which have been updated and need to be restarted. I just stick it in a cron job and I have an email waiting for me in the morning if I need to do anything.


Debian and child distros also have this via needrestart (packaged under the same name) or checkrestart (in the debian-goodies package).
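
Rough usage for reference (flags from memory; check the man pages):

    # Debian/Ubuntu
    sudo checkrestart              # debian-goodies: lists processes still using replaced libs
    sudo needrestart -r l          # list services needing a restart; '-r a' restarts them
    # RHEL/CentOS (yum-utils)
    sudo needs-restarting          # list affected processes
    sudo needs-restarting -r       # just report whether a full reboot is required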


> If you can just reboot the machine, do that, it's easier; otherwise you have to track down who was using the old binary and restart each one of them

Of course it is arguable that if your system is set up so that you cannot afford to reboot a single server, then you are in pretty poor shape to begin with. When I was young, long uptimes were such a cool thing. Nowadays I see them more as a liability, or a smell that something is iffy.


Oh, I completely agree. I think what soured me is when something like OpenSSL pushes a critical bug. It affects basically everything, so you're close to guaranteed to learn what machines are snowflakes.


Yeah, I wouldn't say this is common knowledge. It's not terribly hard though; on Debian-based systems you can just use checkrestart.


as far as I know as long as a shared library is loaded you can't just restart every affected service. the os would first need to unload the shared library, thus if two services need the same library you need to stop both first and start them again.

thus making the "positive" side of shared libraries basically look very bad on paper. I think that a lot of systems are affected by that problem. I like shared libraries for a desktop os and applications where you often reboot. But for a server? Hell it's easier to upgrade golang applications/java applications (if the jvm has no bugs and since j9 even bundled) is "saner" to upgrade.

also, what a lot of people just don't get is that a sane system mostly has only one service per OS/container, so docker pull vs apt update is basically the same.


> the os would first need to unload the shared library, thus if two services need the same library you need to stop both first and start them again.

No, that's not how it works, at least on Unix systems. Shared libraries are mapped into each process's address space separately; there's no system-wide mechanism that prevents different processes from running different versions of the same library at the same time.


If you try to write a file backing an executable mapping you get ETXTBSY (or not, if it's NFS/some other broken FS). So what everyone does is unlink the old file and create a new one. Unlinking preserves open file descriptors, and the same goes for mappings.

Note that often the actual backing files are distinct anyway and only symlinks get replaced, due to versioned .so names (as in libfoo.so.2.4.5, libfoo.so.2.4@, libfoo.so.2@, libfoo.so@; you've all seen it).
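
Which is also why, after replacing a library on disk, you can still find the stragglers mapping the old, now-unlinked file. A couple of hedged one-liners:

    # Processes whose memory maps still reference deleted files (e.g. a replaced libssl):
    sudo grep -l '(deleted)' /proc/[0-9]*/maps 2>/dev/null
    # Roughly the same idea via lsof: open files whose on-disk link count is zero
    sudo lsof +L1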


Mostly scale: third-party libraries are a subset of the dependencies you see in a container.


You could solve the problem with containers: have an authentication microservice that other services use, so that authentication is handled in one place. If there is a security issue with the authentication code, updating that one container fixes the whole system.

But doing this has the same problem. You've just re-implemented DLL hell as container hell.


> This is mostly a good post, but he doesn’t mention one huge security disadvantage of containers: When application artifacts all include their own versions of dependencies locked at build time, it becomes very hard to update a component that has a security fix.

I absolutely agree. So how do we fix this?

Well, first we need to identify what's broken, and then we can talk about how to fix it. We could look at container images and see that, in the pursuit of reproducibility/determinism, they make it infeasible to make critical updates; from this we could infer that determinism is the root of our security woes, and that we have to choose between having a run-time environment that you can depend on, or being able to promptly make critical security updates.

Fortunately, that's a false dichotomy. Actually, it's even better than that: it turns out there's a third option that gives you all of the above:

1. The ability to quickly make security updates

2. A deterministic and serializable run-time environment

3. A deterministic and serializable build-time environment

Note the addition of bullet point #3. In something like Docker, you can successfully build an image from a Dockerfile today, and yet it could fail to build tomorrow. You can build a Dockerfile on your machine, but it might not build successfully on my machine. What I am claiming is that there is another approach to containers that would additionally provide determinism when you're building the container's root file system, whereas e.g. Docker only gives you determinism in the sense that a previously built image can be passed around.

So what system could provide all of the above? How could we make that work?

(If you get bored with the details, feel free to skip down to the spoiler after TL;DR)

We'd first need a deterministic build mechanism; broadly speaking, one in which we could build e.g. executables, fetch tarballs, construct file-systems, etc. To be deterministic, our build system must have certain properties:

* "normal" builds must be cut off from network access, whereas

* A special "fixed output" build type could e.g. fetch tarballs, provided that e.g. a SHA-512 hash is given beforehand

* Each build specification must specify what other build specs it depends on. Each build environment is constructed such that processes therein are only permitted to see/access files that are the resulting artifacts of those explicitly listed dependencies. Arbitrary read/write access to the host file-system is not permitted.

* The resulting build artifacts go in their own immutable prefix, something like, let's say, /store/<hash>-<name>, where <hash> is a unique key derived by hashing the build specification, and <name> is a friendly name for the build artifact. To provide a concrete example, if you were to build `gcc`, you'd have something like /store/4r5kszyy0iirc5agfah45lvz7mnnsrb4-gcc-7.3.0/{bin,include,lib,lib64,libexec,share}, which would all contain the usual suspects. Note that when we hash the build specification, that spec refers to its dependencies, which in turn have their own unique hash, and so on. As a result of this design decision, we have: support for multiple versions of the same software; updates can't break existing executables (due to, let's say, breaking ABI changes in openssl or libc or whatever); and a final point that's important enough to split out as the following bullet point:

* We can trivially construct deterministic environments: write a build spec that takes as an input all of the dependencies you wish to have in the environment, and (as the single resulting build artifact) produce a symlink tree that is the union of all the build artifacts of those aforementioned inputs. For example, you might want an environment with, I don't know, let's say gcc, python and ruby; you would end up with something like /store/m6n3y1rlz9rf06rj4spmvlkmv51angbm-container-env, which would contain bin/gcc, bin/python, bin/ruby, etc. Of course, having this environment wouldn't be much good unless you could actually run something inside of it, so that leads to the next point:

* We can take an environment (as described in the previous point) and use it as the file system for a container: just do all of the usual container/namespace stuff, and e.g. symlink the env to /usr. A byproduct of this is that rollback/forward is easy: just bump a single symlink to the appropriate env path.

* Due to our design, it's a trivial matter to serialize an environment and (re)distribute it.

* Also important: the host system can build and retain all of the container environments in the host's /store. We could then bind-mount paths into the container as necessary. As a result, when two containers depend on precisely the same software, you only pay the price of a single copy. This is unlike Docker in that Docker requires that your image's layers directly inherit from each other; if you have, say for argument, an Ubuntu based image and a CentOS based image, it doesn't matter if there are components that are exactly the same in both - you have to pay the price of two copies.

TL;DR -- TL;DR -- TL;DR --TL;DR

We now have a system in which we can construct recipes to build container environments, and those recipes can be deterministically built, serialized, distributed, and then deserialized and used as a reproducible environment. Thanks to this, it's easy to tweak an existing recipe to bump up, say, openssl when a critical CVE is announced. We've beaten the typical image-based approach to containers (e.g. as used by Docker) in terms of security risk mitigation, disk footprint, developer time (from otherwise fighting with broken build specs / Dockerfiles), etc.

Okay, so we don't actually have this system, we've just fantasized about one...

...unless something like this already exists!

Indeed, this is precisely what you'd get by using the Nix package manager (and you get extra goodness on top if you use the NixOS distribution): you write what amounts to a Dockerfile in a slightly different syntax, but then you get all of the above benefits.
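
A minimal sketch of the "environment as a single store path" idea above, using nixpkgs' buildEnv (attribute names can vary between nixpkgs releases):

    cat > container-env.nix <<'EOF'
    with import <nixpkgs> {};
    buildEnv {
      name  = "container-env";
      paths = [ gcc python3 ruby ];
    }
    EOF
    nix-build container-env.nix
    ls result/bin    # a symlink tree with gcc, python3, ruby under one store path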

https://nixos.org/nix/

https://nixos.org/


I forgot to mention:

If you've already bought into Docker, you can actually use Nix to construct your Docker images: https://nixos.org/nixpkgs/manual/#sec-pkgs-dockerTools

Some benefits:

* Nix provides reproducible docker images; Dockerfiles don't.

* Faster builds: Nix is better at (safely) caching the results of intermediate steps in your build, whereas a Dockerfile has to be conservative (if you so much as change the whitespace in a RUN command, that step, and all subsequent steps, have to be re-run).
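
A rough sketch along the lines of the example in the nixpkgs manual (attribute names may differ across nixpkgs versions):

    cat > image.nix <<'EOF'
    with import <nixpkgs> {};
    dockerTools.buildImage {
      name = "redis-minimal";
      contents = [ redis ];
      config.Cmd = [ "/bin/redis-server" ];
    }
    EOF
    nix-build image.nix && docker load < result   # result is a loadable image tarball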


Isn't this what Void Linux does too?

Previous discussion: https://news.ycombinator.com/item?id=16670337


The description at the end of the article about how package management doesn't have to be done in a way that shares all the files sounds a lot like a description of https://nixos.org/nix/


I feel like containers are (at times) the artifact of organizational dysfunction, the inability of teams to collaborate, and a lack of engineering quality.

What’s sad is that I believe this to be true for the overwhelming majority.


Mixed bag: containers applied right can solve some real problems, containers applied wrong cause some real problems.


I can't seem to find any easily accessible documentation on this, but it sounds like a container is basically a chroot jail: you copy over all the libraries and files into a subdirectory, chroot into it, and run. Am I right? And if so, what took people so long? Chroot has been around for quite a while.

The article sort of makes it sound like a lightweight version of containers would be putting everything in its own directory, rather than scattering everything in /usr/lib, /usr/bin, /usr/share, etc. Which is basically macOS' app bundles. I've never understood why other OSes didn't steal that idea. And while I hear a lot of griping about iOS and static libraries, basically what iOS is doing is creating a container, only since they call it an "application bundle" instead of "container" they get all this hate. But... Grandma Tillie (or whoever the user story for Linux on the Desktop back in The Day was named) can actually figure out how to delete her Mac and iOS apps.

Seems like iOS is kind of containers writ large.


chroot is a filesystem namespace, and containers are basically just that plus some other namespaces. Prior to containers getting really popular I was using Debian's schroot project, and prior to that I was using Solaris zones. There were other similar technologies before that, but I wasn't really technical or even alive at the time. It's not new.

Here's what's made them so popular lately, IMO: Docker improved the usability by adding the container registry and some better tooling around running a container.

Compare installing schroot plus configuring a new container plus running debootstrap plus potentially configuring lvm/btrfs/directory snapshotting to:

docker run -it ubuntu:18.04 /bin/bash


Chroot is just part of it; there are also cgroups and namespaces, which are more recent.
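
You can get surprisingly far by hand with util-linux and systemd tooling; a hedged sketch (the rootfs path is a placeholder):

    ROOTFS=/srv/rootfs   # e.g. something debootstrap'd or untarred
    # filesystem + process isolation: chroot plus a few namespaces
    sudo unshare --mount --uts --ipc --net --pid --fork chroot "$ROOTFS" /bin/sh
    # resource limits via cgroups, e.g. through systemd:
    sudo systemd-run --scope -p MemoryMax=256M -p CPUQuota=50% \
        unshare --pid --fork chroot "$ROOTFS" /bin/sh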


I like the focus on artifacts and repeatable testing.

Are full-blown containers even necessary, though? The newest version of systemd can do filesystem namespaces, and use cgroups to limit access to resources. http://0pointer.de/blog/projects/changing-roots
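
A hedged sketch of the kind of unit the linked post describes (paths are placeholders; directive availability depends on your systemd version):

    # as root:
    cat > /etc/systemd/system/myapp.service <<'EOF'
    [Service]
    # chroot-style filesystem isolation; ExecStart path is inside the new root
    RootDirectory=/srv/myapp-rootfs
    ExecStart=/bin/myapp
    PrivateTmp=yes
    NoNewPrivileges=yes
    # cgroup resource limits
    MemoryMax=512M
    CPUQuota=50%
    EOF
    systemctl daemon-reload && systemctl start myapp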


Not every Linux distro even uses systemd, to say nothing of other systems on which one may want to host containers. Remember, there's more to containerization than just Docker and its consumers and dependencies.


Which production grade distros don't use systemd?


Gentoo!

ducks


I think RHEL 6 also doesn’t use it. But then you have bigger problems :)


What do you mean by production grade?


A distro a sane person would put into a production environment. EL/Debian for example.


Void Linux


Slackware.


I have a less charitable explanation, but a container fan on our team convinced me recently to switch to fully static binaries, which did indeed simplify many things. So there is something to be said for this in an age of cheap storage.


Static linking was the norm until someone figured out that all the copies of glibc were taking up a lot of space in a time when disk space was still expensive. Today it makes much less sense than it did back then, but there is one more advantage: updating a library (which is trivial because almost all these systems are online all the time now) will automatically cause all new instances of the rest of the software on the system to use the new library without having to re-link all the binaries.


"there is one more advantage: updating a library (which is trivial because almost all these systems are online all the time now) will automatically cause all new instances of the rest of the software on the system to use the new library without having to re-link all the binaries."

In Windows this is called "DLL hell". Too often the new version will have some subtle incompatibilities breaking a lot of stuff.

I am a fan of static linking or putting DLLs into a dedicated folder so these shared libraries are actually not shared.


Windows does a good job of versioning the actual system DLLs like user32.dll, kernel32.dll, and so on.

The run-time libraries on top of this aren't part of the system. Each application brings its own and so of course hell will break loose if you mix and match.

Windows has no malloc or printf (not for public use anyway).

In the POSIX world, the user-space C run-time support is part of the platform so it is carefully versioned, just like kernel32.dll on Windows.

User space compiler run-time support isn't part of the platform on Windows; it is not that carefully versioned. Everyone is supposed to provide their own. If you have a compiler from Acme, Inc, you use the Acme C library with Acme malloc and Acme printf.

> putting DLLs into a dedicated folder

Which actually works on Windows, that's the thing.

Searching for shared libs in the same directory as the executable is very nice: like me.c finding its #include "me.h" next to it.

The Unix people didn't pay attention to their own programming language when they designed shared libs.
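
For what it's worth, ELF rpath does let you opt in to that behavior on Linux, it just isn't the default; a small sketch (me.c and me_lib.c are placeholder sources):

    # build a library and an executable that looks for its .so next to itself
    gcc -shared -fPIC -o libme.so me_lib.c
    gcc -o me me.c -L. -lme -Wl,-rpath,'$ORIGIN'
    ldd ./me   # libme.so resolves relative to the executable, wherever you copy the pair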


Most compilers used by app developers don't use this, but...

link /dump /exports c:\windows\system32\msvcrt.dll

    Dump of file c:\windows\system32\msvcrt.dll
    Section contains the following exports for msvcrt.dll

    00000000 characteristics
    57898F96 time date stamp Fri Jul 15 18:36:22 2016
        0.00 version
           1 ordinal base
        1317 number of functions
        1317 number of names

    ordinal hint RVA      name
    ....
       1160  487 00019DE0 malloc
    ....
       1183  49E 00048940 printf


Most compilers used by app developers don't use that for good reason: Microsoft says that it is an unsupported, undocumented library which is off limits to application developers. Programs that are part of Windows itself use it.

https://blogs.msdn.microsoft.com/oldnewthing/20140411-00/?p=...


Hm. DLL hell to me was, on Windows, having to upgrade a DLL only to find that some other dependency had to be downgraded, and that dependency was the original reason for starting the upgrade in the first place.

But I've seen your trick used in lots of places; in a way that trick is a predecessor to containers: you isolate a functional unit and all its dependencies from the rest of the system.


This is essentially what MS did with .NET and assemblies: they made it pretty difficult to install assemblies in the global assembly cache (GAC), and encouraged application developers to simply distribute/load application-specific assemblies in application-specific directories (\Program Files tree). This was all done with the prior experience gained from the system-wide DLL versioning issues that were finally resolved in (I believe) XP/2003.


Fortunately by then I'd left the Windows eco system behind.


Lucky you!


Disk space might be a lot cheaper but RAM is still expensive. If you have a dozen apps running that are statically linked to libc then you have a dozen copies of libc in memory. If you have a dozen apps that are dynamically linked to libc, they all get a single copy-on-write version of the same libc. This results in the library code getting shared while library data gets copied.


That's true, but there would be hundreds or even thousands of copies of that stuff on your hard drive, and with drives being a lot smaller back then that was a real problem. RAM was always expensive so that's a valid point.

There used to be such a thing as an optimizing linker that would throw out all library code that could not be reached.
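
A quick, hedged way to see the per-binary cost being described (exact sizes vary a lot by toolchain and libc):

    cat > hello.c <<'EOF'
    #include <stdio.h>
    int main(void) { puts("hello"); return 0; }
    EOF
    gcc hello.c -o hello_dyn              # small binary, needs libc.so.6 at run time
    gcc -static hello.c -o hello_static   # embeds its own copy of libc, much larger
    ls -lh hello_dyn hello_static
    ldd hello_static                      # reports "not a dynamic executable"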


> ...Static linking was the norm until someone figured out that all the copies of glibc were taking up a lot of space

Dynamic (with and without relocation) linking far predates the existence of glibc, much less Linux, by decades.


Make that libc or the X libraries or any other large chunk of code that was replicated over and over again in every linked binary. The principle remains the same, and yes, it predates Linux by a comfortable margin.

I'm trying to remember if the old mainframe I worked on had it but I'm not 100% sure.


It was core functionality of the design of Multics in the 1960s and Multics adopted it from prior art.


Interesting!

I worked with a PDP-11 running Unix in the '80s and it definitely did not have dynamic linking. All the library code was simply linked into the executables and stripped of symbols before shipping to save space.

Anybody here have VAX 11/7XX experience? I wonder how it was on other DEC products.


The security aspect is critical!

We cannot expect organizations to be ready to rebuild tons of software, in a short time, every time there's a new vulnerability in a popular library like OpenSSL.


> In short, there’s no reason why apt and dnf or whatever couldn’t work in a similar way npm or bundle do, except that Linux distributions seem to be disinterested. (And occasionally throw a specious fit about “bundling” when features like this are proposed. Or are unwilling to understand why the ability to install multiple versions of a package is necessary.)

Wouldn't this just require tighter integration with environment modules (http://modules.sourceforge.net/) or similar?
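
For reference, the environment-modules workflow being referred to looks roughly like this (the module name/version is hypothetical):

    module avail                  # list the module files the site provides
    module load openssl/1.0.2n    # prepend that build's bin/lib paths to your environment
    module list                   # show what's currently loaded
    module unload openssl/1.0.2n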


> How did we end up with containers?

1) You bought stuff. 2) Bonus!


Containers != Images



