Crafting container images without Dockerfiles (ochagavia.nl)
193 points by wofo on Feb 6, 2023 | 56 comments



I've been using Nix for this. It is great for building an image that contains exactly the dependencies you need. Often that is just my app, glibc and maybe a few data files. However if you need more it is trivial to bundle it in. For example I have a CI image that contains bash, Nix itself, some shell scripts with their dependencies and a few other commands and files that GitLab CI expects to be available.

I absolutely love the declarative nature and not needing to worry about each `RUN` step creating a layer which may bloat the image. For example, my build process validates my inline SQL queries against the database schema. It was refreshingly simple to spin up a Postgres instance inside the build step, apply migrations using a different CLI tool, then start the build, without any of these deps ending up in the final image.

The only real downside is that Nix doesn't have great support for incremental builds, so for my Rust app, building the optimized build from scratch can be slow even if you only changed a comment in a source file. But most Docker builds don't do this either (or if they do, it is often buggy, which I see as worse). Bazel does handle incremental builds, which is a notable advantage, though you trade away the ability to pull in other programs from nixpkgs.


> not needing to worry about each `RUN` step creating a layer which may bloat the image

Could someone please explain to me why exactly people avoid layers and treat them as "bloat"?

I always thought that layers are nice to have: you only need to rebuild those that have changed (typically the last ones, which handle the application, while the environment layers remain the same), and pulling image updates is handled much better because only the changed layers are pulled.

How is this "bloat"? Isn't it the opposite? Pushing images containing 80% of the same stuff feels more like bloat to me.

Am I missing something here?


It depends. There is some implicit "bloat" because setting up 100 layers and accessing files through them isn't free (though caching works quite well). However, the biggest problem with layers is that you can never delete data. Doing something like `RUN apt-get install foo`, `RUN foo generate-things`, `RUN apt-get uninstall foo` will effectively still have `foo` in the image.

It definitely depends on the use case. In many cases `RUN foo`, `RUN bar`, `RUN baz` is fine. But if you are ever creating temporary data in an image, the layer system will keep it around. This is why you often see things like `RUN apt-get update && apt-get install foo && rm -r /var/lib/apt`. You squeeze everything into a single layer so that deleting the temp files actually avoids image bloat.
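
To make that concrete, a minimal sketch (reusing the hypothetical `foo` from above):

    # Bloated: foo's files are baked into the install layer forever;
    # the uninstall only hides them in a later layer
    RUN apt-get update && apt-get install -y foo
    RUN foo generate-things
    RUN apt-get remove -y foo

    # Lean: install, use, and remove foo within a single layer
    RUN apt-get update && apt-get install -y foo \
        && foo generate-things \
        && apt-get remove -y foo \
        && rm -rf /var/lib/apt/lists/*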


It is possible with build stages and COPY --from, but yeah, not super trivial.


Definitely not trivial, but multi-stage builds are my go-to solution. Depending on the specifics of the tech you're including, it can be a lot easier than figuring out how to clean up every little build artifact within a layer: just add a second FROM line and copy precisely the pieces you need into the final image, and nothing else.

I also think it makes the build stage a lot easier to follow for people who aren't as familiar with Dockerfiles and all the quirks that come with optimizing a final image.
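
A minimal sketch of the pattern, with made-up names (Go standing in for any compiled app):

    # Build stage: toolchain, sources, and caches all stay behind here
    FROM golang:1.20 AS build
    WORKDIR /src
    COPY . .
    RUN go build -o /out/app ./cmd/app

    # Final stage: copy precisely the pieces you need, nothing else
    FROM debian:bullseye-slim
    COPY --from=build /out/app /usr/local/bin/app
    ENTRYPOINT ["/usr/local/bin/app"]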


Exactly. You need to remember to do it, and restructure your build at least slightly to do so. It isn't hard, but it's non-default and annoying.


Depends very much on the specifics of what the RUN steps are doing and their order. One issue is that just changing a file's attributes (e.g. with chmod) will often create a layer containing another full copy of that file, or possibly a layer with empty marker entries for files that are deleted. That means you have very similar content in two separate layers, which creates bloat.

The COPY command now supports performing a chmod at the same time to help with this issue. Another common trick is to have a single layer that performs an "apt update", installs software, and then deletes the contents of /var/lib/apt/lists/ so that the layer doesn't carry unnecessary apt files.

When developing scripts for running inside Docker, I'll often try to copy the script as late as possible in the Dockerfile so that the preceding layers can be reused and only a small extra layer is needed when the script changes.
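
A hypothetical Dockerfile combining those tricks (names and paths made up):

    FROM debian:bullseye-slim
    # apt metadata is created and deleted within a single layer
    RUN apt-get update \
        && apt-get install -y --no-install-recommends curl \
        && rm -rf /var/lib/apt/lists/*
    # set the mode during the copy instead of a separate RUN chmod layer;
    # the script is copied last, so editing it only rebuilds this one layer
    COPY --chmod=0755 run.sh /usr/local/bin/run.sh
    CMD ["/usr/local/bin/run.sh"]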


I tend to agree, but I think the angle they're going for is the mental load of ensuring consistency in those layers.

A very simple example of this is installing packages and clearing the generated metadata all in one chain of a single RUN.

It gets more complicated when you look at it from the 'reproducible builds' POV; subtle binary changes from using things like dates/timestamps


Rebuilds aren't limited to the layers that changed: a change also rebuilds every layer defined after it in the Dockerfile.


Because cargo cult.

Apparently it's better value to waste human time trying to debug a failed docker build with 200 commands strung together with && than to let your runtime just mount and flatten extra layers.


I suspect folks are doing what they naturally do whether it's playing factorio or playing docker... optimize


I built a service for doing this ad-hoc via image names a few years ago and it enjoys some popularity with CI & debugging use-cases: https://nixery.dev/
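
For anyone who hasn't seen it: the image name doubles as the package list, so (if I remember the syntax right) something like this drops you into an ad-hoc image containing a shell and git:

    docker run -ti nixery.dev/shell/git bash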


I've definitely used this in CI before. It is useful for base images for docker-based CI.


I put together an example that mixes Nix and Bazel a couple of years ago: https://github.com/jvolkman/bazel-nix-example

Nix is used to build a base Docker image, and Bazel builds layers on top.


I've been doing the same with Guix, though lately more so with declarative, lightweight VMs. It's nice to be able to painlessly make throwaway environments that I can easily log into via SSH.


Do you have an example or an article demonstrating this? I just recently had the desire to build systemd-nspawn images declaratively, but couldn't find much other than Dockerfiles.


Sure. Putting a simple binary in a container: https://gitlab.com/kevincox/tiles/-/blob/a2b907eab7a84989c94.... This is the trivial case where you just stick the main executable in the command string. Nix will automatically include the dependencies.

The GitLab CI example is a bit more complex. It requires some commands that are unused by the image and some config files: https://gitlab.com/kevincox/nix-ci/-/blob/efe6f4deedc50c2474...



To get Rust incremental builds, did you consider using something such as crane https://github.com/ipetkov/crane ?

And regarding OCI images, I built nix2container (https://github.com/nlewo/nix2container) to speed up image build and push times.


Someone is working on consuming nix packages inside Bazel.



My company uses Bazel's rules_docker to build our images: https://github.com/bazelbuild/rules_docker

They're pretty great and, out of the box, give you a lot of the caching and parallelism benefits mentioned in the post for free, along with determinism (which Dockerfiles don't have, because you can run arbitrary shell commands). Our backend stack is also built with Bazel, so we get a nice, tight integration for building our images that is pretty straightforward.

We've also built some nice tooling around this to automatically put our Maven dependencies into different layers using Bazel query and buildozer. Since Maven deps don't change often, we get a lot of nice caching advantages.


We use Bazel for our builds at work, and I think it works quite well, but then our Bazel guy left the company and no one else dares touch it aside from basic updates, because it's really complex and a bit of a dark art in itself.


Haha, yeah, I understand this well. I'm the Bazel guy at our company. Basic stuff is simple, but as soon as you dip into more complex stuff like the queries and buildozer, you quickly need a lot of knowledge to be productive.


Out of interest, how did you learn Bazel?

I've been trying for a while and, while I get the concepts, I'm struggling to get anywhere productive with it.


I worked at Google, then created a bunch of stuff with Bazel (e.g. most of shortwave.com is built with Bazel). I've picked things up over a few years and often study open source rule sets to learn how stuff works under the hood.


If this ends up being a cleaner/easier way to work around super expensive Rust rebuilds (given caching + deps) than https://github.com/LukeMathWalker/cargo-chef, reading this thread will have been a huge win for me (and hopefully others).

Whether introducing Bazel is easier/worth it is subjective, I guess.


I've also enjoyed using nixpkgs' dockerTools for this kind of thing.


Huge fan of Bazel, and I was using rules_docker, but these still do not work under Windows last time I checked, which is my main dev platform. I know I can probably tweak it through WSL2, so it's on my list of things to try.


Thanks for sharing! This will definitely be interesting to look at :)


Honestly Dockerfile sucks.

It's good enough to be adopted everywhere and puts a high burden on anyone trying to replace it, but Dockerfiles really are not very well designed.

And while some of their problems have been somewhat reduced with multi-stage builds or run mounts, it's still not a great design.

Now that I think about it, a lot of things about Docker are at least slightly sub-optimal, and if we limit the view to the docker CLI, podman beats it on a lot of points (but not all!!!). And it's often small things, like `--all` variants of various commands not being available, or the whole docker group == semi-admin nightmare (which can be avoided by now, but is still the default on many distros; in the past I have seen Docker use banned at some companies due to security considerations).

In the end, neither the Dockerfile nor the CLI is their product; Docker Desktop and similar are, and it shows.


The biggest thing I've run into personally (for small projects, hobbyist stuff) is:

do I need Terraform/Ansible/Kubernetes or can I just use Docker Compose?

do I need Bazel/nix or can I just use Dockerfile?

Something about not needing to install anything other than Docker/Docker Compose is kind of attractive: you move quickly, end up with good enough, get a low barrier to entry up front, take on minimal/no tradeoffs for maintaining things long term, and don't need to invest a bunch of time managing/getting other tools/pieces of software ready.


A simple fix would be an option to surround Dockerfile commands in curly braces or something, and have all changes inside get flattened to a single layer.


Although Dockerfiles have the benefit of migrating existing workloads to containers without having to update your toolchain, I definitely prefer the container-first workflow. Cloud Native Buildpacks (https://buildpacks.io/) are a CNCF incubating project but were proven at Heroku. Buildpacks support common languages, but working on a Go project I've also had a great experience with ko (https://ko.build/). Free yourself from Dockerfile!
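
For a rough idea, both tools boil down to a one-liner (the image name, builder, and import path below are placeholders):

    # Buildpacks: detect the project type and build an image, no Dockerfile
    pack build my-app --builder paketobuildpacks/builder:base

    # ko: compile a Go import path and push the image directly
    # (ko reads the target registry from the KO_DOCKER_REPO env var)
    ko build ./cmd/my-app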


I'm not really thrilled with Cloud Native Buildpacks.

They are optimized for the Heroku-style use case of being able to create a "universal builder" which will detect and build projects of various types (node, python, go, java, etc).

The problem is that the buildpacks in such universal builders are pretty much black boxes to the people trying to use them in projects, and they often end up handling simple cases but failing for more complex ones; or, if they are flexible enough to handle most projects, they end up being rather arcane.

Of course, you can create your own buildpacks for your company that do just what you need, but even then you may have some projects with really complex requirements, and for those you may need to create a special buildpack.

Lastly, buildpacks don't provide full flexibility in the output. For example, a minimal container for a program written in Go only needs a single layer with the resulting executable set as the entrypoint. You can't make that with buildpacks. In practice most "stacks" will provide some full OS environment, like an alpine or ubuntu image. On top of that is a mandatory runtime layer containing the `/cnb/lifecycle/launcher` executable. Finally, your buildpack outputs are layered on top of that.

(Admittedly there is work in progress that would allow specifying some other base image via Dockerfile, and you could specify "FROM scratch". This would mean an empty base layer, on which the launcher layer, the config layer, and the app layer sit. That's a little better, but still not as nice as a fully customizable output.)


(Self plug) I had the same thoughts as the author and made this: https://github.com/andrewbaxter/dinker . As stated in the article, if you're doing Rust, Go, or Java, all you want is to dump the binary in the image. There's no reason to do the build inside the Docker VM in that case; it's super fast, and it only uses dumb filesystem access - no daemons like docker, no weird WIP container managers like buildah, etc.

The author glossed over actually uploading the resulting image to Fly.io, AFAICT. It's not documented, but after a long session with Fly.io support it turns out that they don't actually support OCI manifests - only Docker v2 manifests, unlike most popular registries I know of; if you upload an image with an OCI manifest you get a 404 when trying to launch your machine. Skopeo has an option to switch to Docker manifests (--format, I think?); dinker produces Docker manifests by default.
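
For reference, I believe the relevant skopeo flag is `--format` on `skopeo copy`; something like this should push with a Docker v2 schema 2 manifest (the paths and names are placeholders):

    # Convert an OCI layout to a Docker v2s2 manifest while pushing
    skopeo copy --format v2s2 oci:./image-dir docker://registry.fly.io/my-app:latest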


When I saw the title I thought it was going to be about `buildah` [1][2]

Which allows you to create images using the command line to build them up step-by-step.

[1] https://buildah.io/ [2] https://github.com/containers/buildah


I told a Red Hat interviewer that I prefer Buildah over Dockerfiles and he was dumbfounded. Ironically, it's a Red Hat-created tool.


> Which allows you to create images using the command line to build them up step-by-step.

I mean.. you can also do this with the Docker CLI right?


This is one of my absolute favorite topics. Pardon me while I rant and self-promote :D

Dockerfiles are great for flexibility, and have been a critical contributor to the adoption of Docker containers. It's very easy to take a base image, add a thing to it, and publish your version.

Unfortunately Dockerfiles are also full of gotchas and opaque cargo-culted best practices to avoid them. Being an open-ended execution environment, it's basically impossible to tell even during the build what's being added to the image, which has downstream implications for anybody trying to get an SBOM from the image for example.

Instead, I contribute to a number of tools to build and manage images without Dockerfiles. Each of them is less featureful than Dockerfiles, but because they're more constrained in what they can do, you get a lot more visibility into what they're doing, since they're not able to do "whatever the user wants".

1. https://github.com/google/go-containerregistry is a Go module to interact with images in the registry, in tarballs and layouts, and in the local docker daemon. You can append layers, squash layers, modify metadata, etc. This library is used by all kinds of stuff, including buildpacks, Bazel's rules_docker, and all of the below, to build images without Docker.

2. crane is a CLI that uses the above (in the same repo) to make many of the same modifications from the command line. `crane append`, for instance, adds a layer containing some contents to an image, entirely in the registry, without even pulling the base image. (Rough usage sketches for crane, ko, and apko follow this list.)

3. ko (https://ko.build) is a tool to build Go applications into images without Dockerfiles or Docker at all. It runs `go build`, appends that binary on top of a base image, and pushes it directly to the registry. It generates an SBOM declaring what Go modules went into the app it put into the image, since that's all it can do.

4. apko (https://apko.dev) is a tool to assemble an image from pre-built apks, without Docker. It's capable of producing "distroless" images easily with config in YAML. It generates an SBOM declaring exactly what apks it put in the image, since that's all it can do.
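
Usage sketches for 2-4 (the names, registries, and configs below are placeholders; see each project's docs for the real details):

    # 2. crane: add a tarball of files as a new layer, entirely registry-side
    crane append --base alpine:3.17 --new_layer extra.tar --new_tag registry.example.com/my-app:latest

    # 3. ko: build a Go import path into an image and push it
    #    (the target registry comes from the KO_DOCKER_REPO env var)
    ko build ./cmd/my-app

    # 4. apko: assemble an image from apks declared in a YAML config
    apko build apko.yaml my-app:latest my-app.tar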

Bazel's rules_docker is another contender in the space, and GCP's distroless images use it to place Debian .debs into an image. Apko is its spiritual successor, and uses YAML instead of Bazel's own config language, which makes it a lot easier to adopt and use (IMO), with all of the same benefits.

I'm excited to see more folks realizing that Dockerfiles aren't always necessary, and can sometimes make your life harder. I'm extra excited to see more tools and tutorials digging into the details of how container images work, and preaching the gospel that they can be built and modified using existing tooling and relatively simple libraries. Excellent article!


We've been baking this functionality directly into the .NET SDK for a couple releases now: https://github.com/dotnet/sdk-container-builds

It's really nice to derive mostly-complete container images from information your build system already has available, and the speed/UX benefits are great too!
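
If I remember the invocation correctly, it's roughly a single publish call (flags here are from the .NET 7 era and may have changed since):

    # Publish the project straight to a container image, no Dockerfile needed
    dotnet publish --os linux --arch x64 /t:PublishContainer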


Systemd's mkosi is worth checking out too: https://github.com/systemd/mkosi

I don't think it generates docker/OCI images directly, but it definitely can generate a tarball of the final filesystem image contents and then crane or a similar tool could package it up into an appropriate OCI image.

For plain docker usage it's probably overkill; the main advantage is that it can build other image types, like adding a kernel and init to produce a fully bootable ISO or VM image, in addition to a container image.
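
A hedged sketch of that flow (the distro, output name, and exact flags vary by mkosi version):

    # Build a root filesystem tarball with mkosi
    mkosi --distribution debian --format tar build

    # Wrap the tarball as a single-layer image with crane (empty base image)
    crane append --new_layer image.tar --new_tag registry.example.com/my-os:latest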


Can Kubernetes pods/workloads run anything other than Docker/OCI format with ease or is that basically the standard?


It can; kubevirt is a project for running VMs (https://kubevirt.io/), and there have been more esoteric things like WASM (https://github.com/krustlet/krustlet).

Virtual kubelet likely allows you to plug in almost any kind of workload: https://github.com/virtual-kubelet/virtual-kubelet


I don't know how big of a market there is for tinkerers like myself who aren't afraid to give DigitalOcean (a VPS host) their money, but would rather subject themselves to hosting their own poor man's k8s cluster instead of paying to use theirs.

It's like a subset of tinkerers inside of a larger group of tinkerers.


For static binaries "crane append" + "crane push" eliminate the need for "docker build".

crane: https://github.com/google/go-containerregistry/blob/main/cmd...
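
A sketch of that flow, if I have the flags right (names and registry are placeholders):

    # Pack the static binary into a layer tarball
    tar -cf layer.tar my-app
    # Append it to an empty base image, writing the result to a tarball...
    crane append --new_layer layer.tar --new_tag registry.example.com/my-app:latest --output image.tar
    # ...then push it ("crane mutate" can set the entrypoint if needed)
    crane push image.tar registry.example.com/my-app:latest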


Crane also exposes a good library for doing docker operations, if you want to build your own tooling on top.


8 steps to add a layer. Now we see why Dockerfiles existed and why Docker succeeded even though it "just" recombined open-source tooling.


I use Packer to create images for all but the most simple cases; I'd never go back to all those &&'s and edge cases.


For creating images from conda/mamba environments without docker, there's also the `conda-docker` tool: https://github.com/conda-incubator/conda-docker.


I think a lot of the complexity with creating containers these days is the pushing to a registry part. Last time I looked at doing it with curl, it was a lot more involved than the tar dance. The article glosses over this, I think.

Also not sure how you build multi-arch containers.


I've been trying for a while to build multi-arch containers in CI.

I was trying to do it using `manifest-tool` [0], which seems to be mostly deprecated in favour of `docker manifest`.

If I use manifest-tool on my local machine against GCR (Google Container Registry) it seems to work, but if I use it in CI with `docker-credentials-gcr` it does not work at all... go figure.

What you would generally do is create a docker image for each of your variants (arm64, amd64, windows, linux, macos (who does this?)) and then merge them:

    manifest-tool push from-args \
        --platforms linux/amd64,linux/arm/v5,linux/arm/v7 \
        --template foo/bar-ARCHVARIANT:v1 \
        --target foo/bar:v1
[0]: https://github.com/estesp/manifest-tool


> Also not sure how you build multi-arch containers.

Not sure whether this is the preferred method or what, but I use `docker buildx`, e.g.:

    docker buildx build --tag myimage:mytag --platform linux/amd64,linux/arm64 .

This builds the image via QEMU emulation for the non-native architecture.


Yes, that's what I use currently. I meant without docker/just with tar and friends.


You can use Ansible to push self-built images to Amazon's ECR (Elastic Container Registry). Works pretty well for my deployments.


Awesome, thanks for this



