The Dockerfile format imposes a hierarchical relationship between layers. This quickly becomes very annoying, since dependencies usually form dependency graphs, not dependency trees.
Alternative tools, like nix (probably bazel too), are not bound in the same way. They can achieve fine-grained caching by mapping their dependency graph to docker layers, which is something that cannot be expressed with a Dockerfile.
> The Dockerfile format imposes a hierarchical relationship between layers. This quickly becomes very annoying, since dependencies usually form dependency graphs, not dependency trees.
Isn't a Dockerfile just a sequence of dependencies, rather than a tree?
You're right, though it becomes hierarchical once you have multiple Dockerfiles inheriting from some base image (which I did not articulate in my original comment).
The final result need not be.
You can build a bunch of things and then merge the results in a final stage without any hierarchy (this is `COPY --link` in a Dockerfile).
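A minimal sketch of what that can look like (the stage names, paths, and base images here are just placeholders):

```dockerfile
# syntax=docker/dockerfile:1
FROM golang:1.22 AS build-api
WORKDIR /src
COPY api/ .
RUN go build -o /out/api ./cmd/api

FROM node:20 AS build-ui
WORKDIR /src
COPY ui/ .
RUN npm ci && npm run build

# The final stage just assembles independent results. With --link, each COPY
# produces its own layer that does not depend on the layers before it, so
# changing one stage does not invalidate the other's copy.
FROM gcr.io/distroless/static
COPY --link --from=build-api /out/api /usr/local/bin/api
COPY --link --from=build-ui /src/dist /srv/ui
```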
With nix, reusability is very high. It is baked in at very low levels of its design. This comes with up-front complexity, but getting to these reusable layers is basically forced.
Docker is very simple and often touts reusable layers, but in practice they are not reusable unless you tackle that complexity yourself.
Making reproducible and reusable content takes effort. Other tools are not designed for that. As a result, getting to the same state requires a similar amount of complexity. Worse, with docker you can never be sure that you actually succeeded in your goal of reproducibility.
An analogy could be rust. Rust has up-front complexity, but tackling that complexity gives confidence that memory safety and concurrency primitives are done correctly. It's not that C _can't_ achieve the same runtime safety, it's just that it requires a lot more skill to do correctly; and even then, memory exploits are reported on a near-daily basis for very popular and widely used libraries.
Complex problems are complex. And sooner or later you'll need to face that complexity.
This is not how docker works.
Docker, exactly like nix, is based on a graph of content addressable dependencies.
What you are describing is chaining a bunch of commands together.
Yes, this forms a dependency chain stored in separate layers, and it is part of the cache chain.
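For illustration, a sketch of how that chain plays out in a plain Dockerfile (the packages and URL are placeholders):

```dockerfile
FROM debian:bookworm
# Each RUN step is cached against the step before it: change the line below
# and every later instruction is rebuilt, even if it did not change itself.
RUN apt-get update && apt-get install -y --no-install-recommends curl ca-certificates
RUN curl -fsSL https://example.com/tool.tar.gz | tar -xz -C /usr/local
RUN useradd -m app
```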
Nix suffers the exact same problems with reproducibility.
What it does provide is a toolchain of dependencies that are reproducible.
Docker does not provide your dependencies.
If the inputs change then so does the output.
If the output itself is not reproducible (like, say, an artifact with a build time embedded in it) then you have something that is inherently not reproducible, and two people trying to build the same exact nix package will have different results.
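A trivial sketch of that failure mode, independent of any build tool:

```sh
# Two runs of the same recipe produce different bytes if the recipe embeds
# the build time; no wrapper (nix, docker, or otherwise) can fix that for you.
echo "built at $(date +%s)" > out-a.txt
sleep 1
echo "built at $(date +%s)" > out-b.txt
cmp out-a.txt out-b.txt   # exits non-zero: the outputs differ
```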
EDIT: Fixed a sentence I apparently got distracted while writing and didn't complete (about layer caching).
Nix is not content-addressable though; the hashes are based on the derivation files, which are analogous to the lock files you would find in other package managers.
> The thing it provides is the toolchain of dependencies that are reproducible. [...] If the inputs change then so does the output. If the output itself is not reproducible (like, say an artifact with a build-time embedded in it) then you have something that is inherently not reproducible and two people trying to build the same exact nix package will have different results.
There are no guarantees they are reproducible. The only guarantee Nix gives you is that the build environment is the same, which allows you to make some claims about the system behaving the same way. But there are certainly no guarantees about artifacts being bit-for-bit identical.
But doing this is going to give you a slight headache, as most of the package repository in Nix is not checked for reproducible builds and there is no way to guarantee the hashes are actually static.
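You can spot-check a single derivation yourself, though. One way (a sketch, using `hello` as a stand-in attribute) is to build it and then rebuild with `--check`:

```sh
# Build the package once, then force a rebuild and compare against the
# existing output. Nix reports an error if the rebuilt output differs.
nix-build '<nixpkgs>' -A hello
nix-build '<nixpkgs>' -A hello --check
```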
Right, all builds are dependent on their inputs.
Your inputs determine your outputs.
If your input(s) change, then so does your output.
We are saying the same thing here. I'm just trying to point out that this is exactly how docker build works; it is more about what you are willing to put into your docker build.
I think we are talking past each other. I'm just trying to clear up a misconception on how nix works, not anything about the docker portion of what you have written.
It would seem you don't understand how either works. They are basically opposites in how they actually work.
Docker layers are completely independent from each other. A docker layer is addressed by the sha256 sum of its contents. Separately there is an image manifest, itself fetched by its sha256 sum, which states the order of the layers; at runtime those layers are stacked on top of each other.
With docker, there is no explicit dependency chain. A layer is just a tarball or some JSON. Some tooling can take advantage of this fact. The way nix builds docker images takes advantage of this.
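You can see that structure directly if you have a tool like skopeo around (a sketch; any public image works):

```sh
# Print the raw manifest for an image tag. For a multi-arch image this is the
# index; each per-arch manifest it points to lists the config blob and the
# layer blobs in order, all by sha256 digest. The layer tarballs themselves
# carry no reference to one another.
skopeo inspect --raw docker://docker.io/library/alpine:latest
```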
With nix, on the other hand, the hash in the output path is not derived from the output's contents, because the contents are irrelevant to how it was produced; it is derived from the inputs. You also cannot know in advance the content hash of something: if I run `echo foo > bar.txt`, I cannot know the sha256 sum of bar.txt until the code runs, but before the code runs I can know the hash of all the inputs that will create bar.txt.
This fundamental difference means two builds executing the same code can share the outputs, provided that the build environment is trustworthy.
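A small sketch of that property (the derivation here is just an illustration):

```sh
# Instantiate a trivial derivation; this only computes the .drv from its inputs.
drv=$(nix-instantiate --expr \
  'with import <nixpkgs> {}; runCommand "bar" {} "echo foo > $out"')

# The output path (/nix/store/<hash>-bar) is already known here, before
# anything has been built, because its hash comes from the inputs.
nix-store --query --outputs "$drv"

# Only this step actually runs the build and produces that path.
nix-store --realise "$drv"
```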
You are describing the makeup of an OCI image, which is the _output_ of a typical docker build (and also the output of the nix image builder).
While docker build can/does output OCI images, that is only an output.
How that output comes to be is not the output itself, same as the nix side the article is talking about.
> How that output comes to be is not the output itself, same as the nix side the article is talking about.
I see the confusion now. The nix image builder's OCI layers contain _only_ nix store paths, which _do_ include the inputs.
Nix store paths are guaranteed to never overlap each other, and as such the order of layering them in the docker manifest does not matter. The docker layers are just tarballs of nix store paths. Each layer in the image has no dependence on previous or future layers at all; it is just one or more nix store paths.
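As a sketch of why the ordering does not matter: the closure of a package is just a flat set of store paths, each under its own unique prefix, and the layered image builders turn those paths into layers.

```sh
# Build a package and list everything it depends on at runtime. Every entry
# is a distinct /nix/store/<hash>-name path, so layers made from them can be
# stacked in any order without colliding.
out=$(nix-build '<nixpkgs>' -A hello)
nix-store --query --requisites "$out"
```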
I'm not talking about OCI images (again, that's the _output_).
I'm talking about how they are built.
OCI images are OCI images; they get extracted the same way whether there are conflicting paths or not.
What I'm saying here through multiple different threads is, buildkit and nix build things the same way.
`docker build` is not just a Dockerfile builder, it's actually a grpc service (with services running on both the docker CLI and in the daemon).
This service is actually very generic.
It includes built-in support for Dockerfiles: that frontend just converts the Dockerfile format into what buildkit calls "LLB", which is analogous to LLVM IR.
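As a sketch of how generic it is, you can drive buildkit directly with `buildctl` and pick the frontend explicitly (this assumes a running buildkitd):

```sh
# The dockerfile.v0 frontend is the piece that turns a Dockerfile into LLB;
# buildkit itself only ever executes the LLB graph.
buildctl build \
  --frontend dockerfile.v0 \
  --local context=. \
  --local dockerfile=. \
  --output type=oci,dest=image.tar
```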
What I'm also saying is that people are comparing nix against "docker build" plus a Dockerfile that uses a package manager which isn't even provided by docker.
This is not an apples to apples comparison, and in fact you can implement nix packaging using buildkit (https://github.com/reproducible-containers/buildkit-nix).
I'm also saying that `Dockerfile` does actually support merging dependencies without being order dependent (this is `COPY --link`).
But also, you can drive buildkit operations without going through Dockerfile. You can also plug in your own format with the `syntax=<some/image>` at the top of your file.
This isn't "convert to dockerfile", its "convert to LLB", which is all the Dockerfile frontend does.
Finally, I'm saying nix isn't in and of itself some magic tool to have a reproducible build.
You still have to account for all the same things.
What it does do, at a package management level, is make it easier to not have dependencies that change automatically over time (which has its own plusses and minuses).