Docker builds are not deterministic; I don't know where you got that idea. I can't count the hours I've lost because the last guy, who left a year ago, built the image with duct tape and sed commands everywhere. The resulting image is set in stone, but so is a zip file; there's nothing special there.
Building an image using nix solves many problems: not only reproducible environments that can be tested outside a container, but also fully horizontal dependency management, where each dependency gets its own layer instead of being stacked on top of the previous one like a typical apt/npm/cargo/pip run. And I don't have to reverse engineer the world just to see which files changed in the filesystem, since everything has its place and there's a systematic BOM.
So, is this right: to make Docker reproducible, you need to either build dependencies from source (from, say, a git hash), use other package managers that are themselves reproducible, or rely on base images that are reproducible?
And that all relies on discipline.
Just like a program in a dynamically typed language can, in theory, have no type errors at run time, if you are careful enough.
Right; you could write a Dockerfile that went something like
FROM base-image@sha256:e70197813aa3b7c86586e6ecbbf0e18d2643dfc8a788aac79e8c906b9e2b0785
RUN pkg install foo=1.2.3 bar=2.3.4
RUN git clone https://some/source.git && cd source && git checkout f8b02f5809843d97553a1df02997a5896ba3c1c6
RUN gcc --reproducible-flags source/foo.c -o foo
but that's (IME) really rare; you're more likely to find `FROM debian:10` (which isn't too likely to change but is not pinned) and `RUN git clone -b v1.2.3 repo.git` (which is probably fixed but could change)...
And then there's the Dockerfiles that just `RUN git clone repo.git` and run with whatever happened to be in the latest commit at the moment...
Possible; I don't have a feel for the relative likelihoods. I think the thing nix has going for it is that you can write a nix package definition without having to hardcode anything: nix itself gives you the defaults to make e.g. compilers deterministic/reproducible, and it automates handling flake.lock so you don't have to pay attention to the pins yourself. Or put differently: you can make either one reproducible, but nix is designed to help you do that, while Docker really doesn't care.
It's actually how nix works by default. When you pull in a dependency, you are actually pulling in a full description of how to build it. And it pulls in full descriptions of how to build its dependencies and so on.
The only reason nix isn't dog slow is that it has really strong caching so it doesn't have to build everything from source.
Docker can resolve dependencies in a manner very similar to nix, via multi-stage builds. Each FROM makes one dependency available. However, you only get direct access to the content of one dependency resolved this way at a time; for the others, you have to COPY over the relevant content with --from at build time.
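A minimal sketch of what that looks like (image names and file paths here are hypothetical, just to show the shape):

```dockerfile
# Each FROM resolves one "dependency" into its own stage.
FROM tool-image:1.0 AS tool        # first dependency
FROM lib-image:2.0  AS lib         # second dependency

# The final stage only has direct access to its own base image;
# content from the other stages must be COPY-ed over explicitly.
FROM debian:12
COPY --from=tool /usr/local/bin/tool /usr/local/bin/tool
COPY --from=lib  /usr/lib/libfoo.so /usr/lib/libfoo.so
RUN /usr/local/bin/tool --help
```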
You're totally right about the underlying container image format being much more powerful than what you can leverage from a Dockerfile. That's exactly the thing that makes nix a better Docker image builder than Docker! It leverages that power to create images that properly use layers to pull in many dependencies at the same time, and in a way that they can be freely shared in a composable way across multiple different images!
A Docker FROM is essentially the equivalent of a dependency in nix... but each RUN only has access to the stuff that comes from the FROM directly above it, plus content that has been COPY-ed across (and COPY-ing destroys the ability to share data with the source of the COPY). For Docker to have similar power to nix at building Docker images, you would need to be able to union together an arbitrary number of FROM sources to create a composed filesystem.
Even with the Dockerfile format you can union those filesystems (COPY --link).
People use the Dockerfile format because it is accessible.
You can still use "docker build" with whatever format you want, or drive it completely via API where you have the full power of the system.
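For the union point specifically, a sketch of COPY --link (a BuildKit feature; requires the dockerfile 1.4+ frontend, and the paths here are illustrative):

```dockerfile
# syntax=docker/dockerfile:1.4
FROM alpine:3.19 AS build
RUN echo 'hello' > /out.txt

FROM scratch
# --link makes this copy an independent layer that does not depend on the
# exact contents of the layers beneath it, so it can be rebased and reused
# across images without invalidation.
COPY --link --from=build /out.txt /opt/app/out.txt
```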
I actually hadn't heard of COPY --link, but it's interesting because it seems to finally create a way of establishing a graph of dependencies from a Dockerfile! It doesn't sound like it's quite good enough to let you build a nix-like system, though, because it can only copy to empty directories (at least based on what the docs say). You really need the ability to e.g. union together a bunch of libraries to form a composite /lib.
I'm not sure what you mean by 'You can still use "docker build" with whatever format you want'. As far as I'm aware, "docker build" can only build Dockerfiles.
I'm also not sure what you mean when you mention gaining extra abilities to make layered images via the API. As far as I can tell, the only way to make images from the API is to either run Dockerfiles or to freeze a running container's filesystem into an image.
docker build is backed by buildkit, which is available as a grpc service ("docker build" is a grpc client/server).
Buildkit operates on "LLB", which would be equivalent to llvm IR.
Dockerfile is a frontend.
Buildkit has the Dockerfile frontend built in, but you can use your own frontend as well.
If you ever see "syntax=docker/dockerfile:1.6", as an example, this triggers buildkit to fire up a container with that image and use it as the frontend instead of the builtin Dockerfile frontend.
Docker doesn't actually care what the format is.
Alternatively, you can access the same frontend APIs from a client (technically, a frontend is just a client).
Frontends generate LLB which gets sent to the solver to execute.
OK, wow, this is interesting indeed. I didn't realize just how much of a re-do of the build engine Buildkit was, I had just thought of it as a next-gen internal build engine, running off of Dockerfiles.
Applying this information to the topic at hand:
Given what Buildkit actually does, I bet someone could create a compiler that does a decent job of transforming nix "derivations" (the underlying declarative format that the nix daemon uses to run builds) into these declarative Buildkit protobuf objects, and run nix builds on Buildkit instead of the nix daemon. To make this concrete, we would be converting from something that looks like this: https://gist.github.com/clhodapp/5d378e452d1c4993a5e35cd043d.... So basically: run "bash" with those args and environment variables, with the derivations shown below already built and their outputs made visible.
Once that exists, it should also be possible to create a frontend that consumes a list of nix "installables" (how you refer to specific concrete packages) and produces an oci image out of the nix package repository, without relying on the nix builder to actually run any of it.
If you're using Nix, that is what you are ultimately producing; it's just buried under significant amounts of boilerplate and sensible defaults. Ultimately the output of Nix evaluation (called a derivation) reads a lot like a pile of references, build instructions, and checksums.
You can also use a hammer to put a screw in the wall.
Dockerfiles, being at their core a set of instructions for producing a container image, could of course be used to make a reproducible image, although you'd have to be painfully verbose to ensure you got the exact same output. You would likely need two files: the first defining the build environment that the second actually gets built in.
Or you could use Nix that is actually intended to do this and provides the necessary framework for reproducibility.
fun fact: there actually is a class of impact driver[1] that couples longitudinal force to rotation to prevent cam-out on screws like the Phillips head when high torque is required