Right now docker images and Dockerfiles are joined at the hip to the Docker daemon.
It works great for local development, but for hosted systems that run on containers, it's a dire mess. I have personally slammed head-first into Docker-in-Docker quagmires on Kubernetes and Concourse. Not knowing the particular arcane rites, and having neither sufficient eye of newt nor sufficient patience to get it to work, I, like everyone else in the universe, gave up.
Not an acceptable state of affairs, given the many problems of Dockerfiles in themselves. Dockerfiles force an ugly choice. You can have ease of development or you can have fast, safe production images. But you can't really have both.
Kaniko is another step in the direction of divorcing docker images as a means of distributing bits from Dockerfiles as a means of describing docker images from Docker daemons as a means for assembling the images. All three are different and should no longer be conflated.
Disclosure: I work for Pivotal; we have a lot of stuff that does stuff with containers.
One thing I feel like more people need to know: Docker container-images are really not that hard to build "manually", without using Docker. Just because Docker itself builds images by repeatedly invoking `docker run` and then snapshotting the new layers, people think that's what their build tools need to do as well. No! You just need to have the files you want, and know the config you want, and the ability to build a tar file.
Here's a look inside an average one-layer Docker image:

$ mkdir busybox_image; cd busybox_image
$ docker pull busybox:latest
$ docker save busybox:latest | tar x
$ tree .
.
├── &lt;config sha&gt;.json
├── &lt;layer sha&gt;
│   ├── VERSION
│   ├── json
│   └── layer.tar
├── manifest.json
└── repositories

1 directory, 6 files
• `manifest.json` contains the declarations needed for the daemon to unpack the layers into its storage backend (just a listing of the layer.tar files, basically);
• the SHA-named config file specifies how to reconstruct a container from this archive, if you dumped it from a container (and I believe it's optional when constructing a "fresh" archive for `docker load`ing);
Each SHA-named layer directory contains:
• a tiny `VERSION` file and a legacy `json` metadata file;
• a `layer.tar` file, which is what you'd expect: a plain tar of that layer's filesystem, e.g.:
-rwxr-xr-x 0 0 0 1037528 16 May 2017 bin/bash
That's pretty much it. Make a directory that looks like that, tar it up, and `docker load` will accept it and turn it into something you can `docker push` to a registry. No need to have the privileges required to run docker containers (i.e. unshare(2)) in your environment. (And `docker load` and `docker push` work fine without a working Docker execution backend, IIRC.)
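To make that concrete, here's a rough, untested sketch in Python of assembling a single-layer image archive in the layout `docker save` produces; the tag, `Cmd`, and file contents below are all invented for illustration:

```python
import hashlib
import io
import json
import tarfile

def build_image_tar(rootfs_files, repo_tag):
    """Assemble a minimal single-layer image archive in the layout
    `docker save` produces, without ever touching a Docker daemon.
    rootfs_files maps paths inside the image to bytes of content."""
    # 1. The layer is just a tarball of the files you want in the image.
    layer_buf = io.BytesIO()
    with tarfile.open(fileobj=layer_buf, mode="w") as layer:
        for path, data in sorted(rootfs_files.items()):
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            layer.addfile(info, io.BytesIO(data))
    layer_bytes = layer_buf.getvalue()
    layer_id = hashlib.sha256(layer_bytes).hexdigest()

    # 2. The config names the layer by digest and carries runtime settings.
    config = {
        "architecture": "amd64",
        "os": "linux",
        "config": {"Cmd": ["/hello"]},  # made-up entrypoint
        "rootfs": {"type": "layers", "diff_ids": ["sha256:" + layer_id]},
    }
    config_bytes = json.dumps(config).encode()
    config_name = hashlib.sha256(config_bytes).hexdigest() + ".json"

    # 3. manifest.json ties the tag, the config, and the layer list together.
    manifest_bytes = json.dumps([{
        "Config": config_name,
        "RepoTags": [repo_tag],
        "Layers": [layer_id + "/layer.tar"],
    }]).encode()

    # 4. The outer archive holds the config, the layer, and manifest.json.
    out = io.BytesIO()
    with tarfile.open(fileobj=out, mode="w") as image:
        for name, data in [(config_name, config_bytes),
                           (layer_id + "/layer.tar", layer_bytes),
                           ("manifest.json", manifest_bytes)]:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            image.addfile(info, io.BytesIO(data))
    return out.getvalue()
```

Writing the result to a file should, in principle, give you something `docker load`-able; treat it as a sketch of the format rather than a battle-tested implementation.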
I wrote a few blog posts a while ago where I reimplemented `docker pull` and `docker push` in bash.
It even got up to basic image modification support.
Disclosure: I work on kaniko and lots of other things that construct docker images without docker at Google.
Not to be confused with the similarly named “Image Manifest V2, Schema 2”: https://docs.docker.com/registry/spec/manifest-v2-2/
Oh and there’s the OCI spec too: https://github.com/opencontainers/image-spec/blob/master/spe... (and its repo)
Really enjoying all these Docker-demystifying posts, I wish I’d found these months ago while puzzling over how Dockerless worked from confusing bzl files. In fact, I really wanted Kaniko just yesterday as I pored over DinD and wondered why it was so complicated to build Docker images within Kubernetes. Now all we need is a lightweight CI wrapper for K8S jobs with GitHub webhook and kubectl apply support! :)
I've sometimes described docker images as a collection of tarballs stickytaped together with a JSON manifest.
I understand that the format is simple, but I don't want to write such a tool. I want such a tool to exist and be in wide usage, so I have assurance that it will keep up with changes, receive security scrutiny and receive improvements in features and performance.
This is a question of economics. I can write software to do anything software can do; whether it makes sense to is a different question, and so far it has not made sense for me, or for many others. Given Google's clear interest in prying apart the tangle and willingness to assign fulltime engineering to it, there is a chance that we can all get out of the quagmire.
This is missing the point.
The point of the tool is to do docker builds + pushes on Kubernetes (or inside other containerized environments) securely.
If you can `docker load/push`, that means you have access to a docker daemon. If that daemon is not docker-in-docker, you have root on the machine since access to the docker.sock is trivially the same as root.
As such, to do `docker load` + `docker push` in a containerized environment reasonably securely, you need either docker-in-docker (which is probably insecure anyway, since the container still has to be privileged) or a daemonless builder like this tool.
In addition, sure you can piece together a tarball, but the point of this tool is backwards compatibility with Dockerfiles, not to be able to manually piece things together.
> If you can `docker load/push`, that means you have access to a docker daemon.
Yes†, but by manually creating a container image, you've decoupled CI from CD: you no longer need to actually have a trustworthy execution sandbox on the machine that does the `docker push`-ing, because that machine never does any `docker run`-ing. It doesn't need, itself, to be docker-in-docker. It can just be a raw VM that has the docker daemon installed (sitting beside your K8s cluster), that receives webhook requests to download these tarballs, and then `docker load`s them and `docker push`es them.
† Though, consider:
• You can talk to a Docker registry without a Docker daemon. The Docker daemon<->Docker registry protocol is just a protocol. You can write another client for it. (Or, you can just carve the registry-client library out of Docker and re-use it as a Go library in your own Go code.)
• You can parse and execute every line of a Dockerfile just as `docker build` does, without a running Docker daemon, as long as none of those lines is a RUN command. Many application container-images (as opposed to platform container-images) indeed do no RUNing. You've already got a compiled static binary from earlier in your CI pipeline; you just want it "in Docker" now. Or you don't have a build step at all; you're just "composing" a container by e.g. burning some config files and a static website into an Nginx instance. In either of these cases, you might have a Dockerfile with no RUN at all.
Combine the two considerations, and you could design and implement a `docker`-compatible executable that supports `docker build` and `docker push`, without doing anything related to containers!
(The simplest way to do this, of course, would be to just take the docker client binary—which is, handily, already the same binary as the docker daemon binary—and make it so the Docker client spawns its own Docker daemon as a thread on each invocation. Add some logic for filesystem-exclusive locking of the Docker state dir; and remove all the logic for the execution driver. Remove the libcontainer dependency altogether. And remove `RUN` as a valid `docker build` command. There: you've got a "standalone Docker client" you can run unprivileged.)
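As a toy illustration of the RUN-free subset, a parser for it fits in a few lines; the handling here is deliberately minimal and the error message is my own wording:

```python
def parse_dockerfile(text):
    """Split a Dockerfile into (INSTRUCTION, arguments) pairs,
    handling comments and line continuations, but refusing RUN,
    since executing RUN requires an actual container runtime."""
    steps = []
    continued = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.endswith("\\"):  # backslash line continuation
            continued += line[:-1] + " "
            continue
        line = continued + line
        continued = ""
        op, _, args = line.partition(" ")
        op = op.upper()
        if op == "RUN":
            raise ValueError("RUN needs a container runtime; "
                             "a daemonless builder can't execute it")
        steps.append((op, args.strip()))
    return steps
```

Feed it a no-RUN Dockerfile (say, a FROM plus some COPYs) and you get a build plan you could carry out with nothing more than tar manipulation and file copies.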
- Our team has built something exactly like what you're describing: https://github.com/GoogleCloudPlatform/distroless
Dockerfiles without RUN commands are technically more correct: they're reproducible and much easier to inspect. However, it's quite limiting for the existing corpus of Dockerfiles.
I like to think of kaniko as the (pull) + build + push decoupling of the docker monolith. Other tools, like cri-o, have implemented the complement (pull + run).
Disclaimer: I work on kaniko and some of these other tools at Google
That breaks down as soon as you need a `RUN apt-get update` and then a
`RUN apt-get install -y pkgX pkgY..pkgN`.
I could download each package beforehand, tar 'em up and use `docker save`, but I'd want the recursive dependency tree of the packages too....
The concept behind ansible-container (being able to create Docker, LXC, LXD, or any future type or flavor of container from the same Ansible playbooks you're already able to use to configure entire VMs or bare-metal machines) just feels like a much more efficient use of ops resources.
Ansible becomes portable across everything.
Also, for development machines, how do you sync things between developers? I can commit a Dockerfile change, but unless I explicitly tell Docker Compose to rebuild my images and containers, it will happily stick to the old version. I have to keep nagging our (3) developers to do this from time to time... what am I doing wrong?? Sorry if these are dumb questions, but we’re still stuck with the basics it seems.
It's not rocket science, of course. You build an image somewhere (your local machine, a CI server, anywhere), push to a registry, and when you want to run the image, you pull from the registry and run it. ("docker run" will, by default, automatically pull when you ask it to run something.)
I don't quite understand what your Compose problem is. Is the Compose file referencing images published to, say, Docker Hub? If so, the image obviously has to be built and published beforehand. However, it's also possible to run Compose against local checkouts, then run "docker-compose up --build", e.g.:
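For reference, a minimal sketch of what that local-checkout setup could look like (the service names, ports, and pinned image are invented):

```yaml
# docker-compose.yml (sketch)
version: "3"
services:
  web:
    build: .            # built from the Dockerfile in this checkout
    ports:
      - "8080:8080"
  db:
    image: postgres:10  # pinned published image, never rebuilt locally
```

With this, "docker-compose up --build" rebuilds `web` from the local Dockerfile on every run, while `db` is only ever pulled.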
There's a whole ecosystem of tools built around Docker for building, testing, deploying and orchestrating Docker applications. Kubernetes is one. If you're having issues with the Docker basics, however, I wouldn't consider any of these systems quite yet, although you should consider automating your building and testing with a CI (continuous integration) system, rather than making your devs build and test on their local machines.
As with anything, to actually use Docker in production you'll need an ops person/team that knows how to run it. That could be something as simple as a manual "docker run" or a manual "docker-compose", to something much more complex such as Kubernetes. This is the complicated part.
Let's say I update my Dockerfile and change `FROM ruby:2.3.4` to `FROM ruby:2.5.1`, then commit the Dockerfile change, merge it to master, etc.
Our developers have to remember to manually run "docker-compose up --build", or to remove their old containers and create new ones, which would get them rebuilt... I couldn't find something that would warn them if they're running off of stale images, or better, simply rebuild them automatically when the Dockerfile changes.
Part of the benefits of docker is creating a repeatable environment with all sub-components on all dev machines. Isn't it?
Maybe our devs should only pull remote images and never build them, but then wouldn't I have the same problem that docker-compose won't force or remind the developers to pull unless they explicitly tell it to? And also, isn't this detaching the development process around the Dockerfiles/builds themselves from the rest of the dev process??
Edit the code, then restart Compose, and repeat. It will build each time. If you want to save time and you have some containers that don't change, you can "pin" those containers to published images — e.g., the main app is in "./myapp", but it depends on two apps "foo:08adcef" and "bar:eed2a94", which don't get built every time. This speeds up development.
Building on every change sounds like a nightmare, though. It's more convenient to use a file-watching system such as nodemon and map the whole app to a volume. Here's a blog article about it that also shows how you'd use Compose with multiple containers that use a local Dockerfile instead of a published one: https://medium.com/lucjuggery/docker-in-development-with-nod....
In any case, thanks for your suggestions. I think it's some misconception on my part about how docker-compose should behave.
The solution I've used in the multiple companies I've started is to maintain a developer-oriented toolchain that encodes best practices. You tell the devs to clone the toolchain locally and you build in a simple self-update system so it always pulls the latest version. Then you provide a single tool (e.g. "devtool"), with subcommands, for what you want to script.
For example, "devtool run" could run the app, calling "docker-compose up --build" behind the scenes. This ensures that they'll always build every time, and never forget the flag.
If you have other common patterns that have multiple complicated steps or require "standardized" behaviour, bake them into the tool: "devtool deploy", "devtool create-site", "devtool lint", etc.
We've got tons of subcommands like this. One of the subcommands is "preflight", which performs a bunch of checks to make sure that the local development environment fulfills a bunch of checks (Docker version, Kubectl version, whether Docker Registry auth works, SSH config, etc.), and fixes issues (e.g. if the Google Cloud SDK isn't installed, it can install it). It's a good pattern that also simplifies onboarding of new developers.
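A skeletal version of that pattern, with invented subcommands and an invented tool list, might start out like:

```python
import argparse
import shutil
import subprocess

def preflight(required=("docker", "docker-compose", "kubectl")):
    """Return the required CLI tools that are missing from PATH."""
    return [tool for tool in required if shutil.which(tool) is None]

def build_parser():
    parser = argparse.ArgumentParser(prog="devtool")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("run", help="build and start the app via Compose")
    sub.add_parser("preflight", help="check the local dev environment")
    return parser

def main(argv=None):
    args = build_parser().parse_args(argv)
    if args.command == "run":
        # Always pass --build so nobody can run off a stale image.
        subprocess.run(["docker-compose", "up", "--build"], check=True)
    elif args.command == "preflight":
        missing = preflight()
        print("missing tools:", ", ".join(missing) or "none")
        return 1 if missing else 0
    return 0
```

Wire `main` up behind a single `devtool` entry point; the self-update mechanism and the richer checks (versions, registry auth, SSH config) would layer on top of this skeleton.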
- When we make a PR, we mark it as #PATCH#, #MINOR#, or #MAJOR#.
- Once all tests pass and a PR is merged, CI uses that tag to auto-bump our app version (e.g. `ui:2.39.4`, or `backend:2.104.9`) and update the Changelog. 
- CI then updates the Dockerfile, builds a new image, and pushes that new image to our private repo (as well as to our private ECR in AWS).
- CI then updates the repo that represents our cloud solution to use the newest version of the app.
- CI then deploys that solution to our testing site, so that we can run E2E testing on APIs or the UI, and verify that bugs have been fixed.
- We can then manually release the last-known-good deployment to production.
The two main keys to all of this are that our apps all have extensive tests, so we can trust that a PR is not going to break things, and that our CI handles all the inconvenient version-bumping and generation + publication of build artifacts. The best part is, we no longer have five people getting merge conflicts when we go to update versions of the app, as CI does it for us _after_ things are merged.
0: We use pr-bumper (https://github.com/ciena-blueplanet/pr-bumper), a tool written by my coworkers, for our JS apps and libraries, and a similar Python tool for our non-JS apps.
For production you want the Docker image to be built when PRs are merged to master (or whatever your flow is). Google Container Builder makes that very easy: you can set up a trigger to build an image and push it to the registry when there are changes in git (code merged to a branch, a tag pushed, etc.). Then you need to automate getting that deployed, hopefully to Kubernetes, but that is a different issue.
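For example, a Container Builder config for such a trigger is only a few lines (the image name here is a placeholder):

```yaml
# cloudbuild.yaml
steps:
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/myapp:$COMMIT_SHA', '.']
images:
- 'gcr.io/$PROJECT_ID/myapp:$COMMIT_SHA'
```

Point a build trigger at the branch, and every merge produces a freshly tagged image in the registry.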
This feels odd to me. Isn't one of the major selling points of docker development-production parity?
I’ve been using https://github.com/dminkovsky/kube-cloud-build to build images on Google Cloud Container Builder. It handles generating Cloud Container Builder build requests based on the images specified in my Kubernetes manifests, which was a big deal for me since writing build requests by hand was a total pain.
Disclaimer: I work on kaniko at Google.
Disclosure: I work on kaniko and other container things at Google.
Why is it okay now for kaniko to run as the root user?
Evidence that Docker is following the Red Hat upstream/downstream playbook:
- They only advertise three things with the name Docker: "Docker for Mac" (a free product that is not open-source), "Docker EE" (an enterprise product), and "Docker Hub" (a cloud service). Those are all downstream products, like RHEL or OpenShift.
- The whole "Moby" thing is basically their upstream brand, aka "the things not called Docker".
- They spun out tons of smaller projects like buildkit, linuxkit, containerd, runc, and seem eager to get others to use them and contribute, even competitors.
- They embraced Kubernetes as part of their downstream product, even though they famously did not invent it, and they certainly don't control it.
So I think people saying "these free open-source tools are killing Docker" are missing the point. The real competition for Docker is OpenShift vs. Docker EE; everything else is implementation details.
If you listen to the sales pitch of these two companies right now, it's an absolute tug of war. Docker focuses on independence and innovation ("we know where containers are going, and we don't force RHEL down your throat"). Red Hat focuses on maturity and upstream control ("We've been by your side for 20 years, are you going to trust us or some Silicon Valley hipster? Also we employ more Kubernetes contributors than anyone else").
That's the real battle. In my experience, on the open-source side you'll find mostly engineers from all sides collaborating peacefully and building whatever they need to get their jobs done.
That said, Furan isn't suitable for untrusted Dockerfiles (or multi-tenant environments) exactly due to the security implications of access to the Docker engine socket.
The issue I see with Kaniko is drift from upstream Moby/Docker syntax. One of the strengths with Furan is that you have the guarantee that the Docker build you perform locally is exactly what happens by the service. When you can't make this guarantee you get into weird situations where "the build works for me locally" but there's some issue when doing a remote build. That's also why we've resisted putting special build magic into Furan (like injecting metadata into the build context, for example).
If Kaniko authors are reading this: have you considered buildkit and, if not, would you be open to contributions based on it?
My understanding is that the official 'docker build' itself is based on Buildkit.
We are looking at interoperability with buildkit (and the large set of other tooling like this) through CBI: https://github.com/containerbuilding/cbi which aims to be a neutral interface on top of things like buildkit, buildah, docker, and kaniko that build images.
Disclosure: I work on kaniko and other container things at Google.
Thank you for the pointer.
FYI: Kaniko plugin for CBI is now available.