I keep reading about Nix and I still don't understand what it does better than Docker; all the examples in the post are trivial to do in a Dockerfile, so where is the added value?
Docker builds are deterministic and easily reproducible: you use a tagged image, that's it, it's set in stone.
The 0.01% of Dockerfiles that don't work: what does that even mean, what doesn't work?
The other thing is that buildGoModule module: now you somehow need a third-party tool to build Go in a Docker image, whereas with a Dockerfile you just use regular Go commands such as go build and you know exactly what is going on and what arguments you use to build the binary.
As for the thing about Ubuntu 18 being out of date and not findable: most orgs have a Docker image cache, especially since Docker Hub closed off access to large downloads. But more importantly, there is a reason it's not there anymore: it's not secure to use. It's like wanting to use JVM 6; you should not use something that is out of date security-wise.
Docker builds are not deterministic, I don't get where you get that idea. I can't count the hours lost because the last guy who left one year ago built the image using duct tape and sed commands everywhere. The image is set in stone, but so is a zip file, there's nothing special here.
Building an image using nix solves many problems: not only reproducible environments that can be tested outside a container, but also fully horizontal dependency management, where each dependency gets its own layer instead of being stacked one on top of another like a typical apt/npm/cargo/pip command produces. And I don't have to reverse engineer the world just to see which files changed in the filesystem, since everything has its place and a systematic BOM.
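As a rough sketch of what that looks like (purely illustrative: the image name and the myService package are placeholders, and this assumes the usual nixpkgs dockerTools functions):
pkgs.dockerTools.buildLayeredImage {
  name = "my-service";                               # placeholder image name
  contents = [ pkgs.curl pkgs.cacert myService ];    # myService is a hypothetical package
  config.Cmd = [ "${myService}/bin/my-service" ];
  # store paths (curl, cacert, the service and their closures) get packed into
  # their own layers (up to a layer limit), so they can be shared across images
}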
So is this right: to make Docker reproducible, it needs to either build dependencies from source (from, say, a git hash), use other package managers that are reproducible, or rely on base images that are reproducible?
And that all relies on discipline.
Just like how, in theory, a dynamically typed programming language can have no type errors at run time, if you are careful enough.
Right; you could write a Dockerfile that went something like
FROM base-image@sha256:e70197813aa3b7c86586e6ecbbf0e18d2643dfc8a788aac79e8c906b9e2b0785
RUN pkg install foo=1.2.3 bar=2.3.4
RUN git clone https://some/source.git && cd source && git checkout f8b02f5809843d97553a1df02997a5896ba3c1c6
RUN gcc --reproducible-flags source/foo.c -o foo
but that's (IME) really rare; you're more likely to find `FROM debian:10` (which isn't too likely to change but is not pinned) and `RUN git clone -b v1.2.3 repo.git` (which is probably fixed but could change)...
And then there's the Dockerfiles that just `RUN git clone repo.git` and run with whatever happened to be in the latest commit at the moment...
Possible; I don't have a feel for the relative likelihoods. I think the thing nix has going for it is that you can write a nix package definition without having to actually hardcode anything in, and nix itself will give you the defaults to make e.g. compilers deterministic/reproducible, and automates handling flake.lock so you don't have to actually pay attention to the pins yourself. Or put differently: you can make either one reproducible, but nix is designed to help you do that while docker really doesn't care.
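For what it's worth, the flake side of that looks roughly like this (a minimal sketch; the nixpkgs branch and the hello package are just stand-ins). The exact revision the URL resolves to is what gets recorded in flake.lock:
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-24.05";   # resolved rev gets pinned in flake.lock
  outputs = { self, nixpkgs }: {
    packages.x86_64-linux.default =
      nixpkgs.legacyPackages.x86_64-linux.hello;             # stand-in package
  };
}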
It's actually how nix works by default. When you pull in a dependency, you are actually pulling in a full description of how to build it. And it pulls in full descriptions of how to build its dependencies and so on.
The only reason nix isn't dog slow is that it has really strong caching so it doesn't have to build everything from source.
Docker can resolve dependencies in a very similar manner to nix, via multi-stage builds. Each FROM makes one dependency available. However, you can only have direct access to the content from one of the dependencies resolved this way. The other ones, you have to COPY over the relevant content --from at build time.
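To make that concrete, a sketch (image tags and paths are illustrative):
FROM golang:1.22 AS build
WORKDIR /src
COPY . .
RUN go build -o /out/app .

FROM gcr.io/distroless/static AS runtime
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]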
You're totally right about the underlying container image format being much more powerful than what you can leverage from a Dockerfile. That's exactly the thing that makes nix a better Docker image builder than Docker! It leverages that power to create images that properly use layers to pull in many dependencies at the same time, and in a way that they can be freely shared in a composable way across multiple different images!
A Docker FROM is essentially the equivalent of a dependency in nix... but each RUN only has access to the stuff that comes from the FROM directly above it plus content that has been COPY-ed across (and COPY-ing destroys the ability to share data with the source of the COPY). For Docker to have similar power to nix at building Docker images, you would need to be able to union together an arbitrary number of FROM sources to create a composed filesystem.
Even with the Dockerfile format you can union those filesystems (COPY --link).
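Roughly like this, as a sketch (the images and files are just examples; --link needs a BuildKit Dockerfile syntax that supports it, around 1.4 and later):
# syntax=docker/dockerfile:1.4
FROM alpine:3.19 AS a
FROM debian:12 AS b
FROM scratch
COPY --link --from=a /etc/alpine-release /etc/alpine-release
COPY --link --from=b /etc/debian_version /etc/debian_version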
People use the Dockerfile format because it is accessible.
You can still use "docker build" with whatever format you want, or drive it completely via API where you have the full power of the system.
I actually hadn't heard of COPY --link, but it's interesting because it seems to finally create a way of establishing a graph of dependencies from a Dockerfile! It doesn't sound like it's quite good enough to let you build a nix-like system, though, because it can only copy to empty directories (at least based on what the docs say). You really need the ability to e.g. union together a bunch of libraries to form a composite /lib.
I'm not sure what you mean by 'You can still use "docker build" with whatever format you want'. As far as I'm aware, "docker build" can only build Dockerfiles.
I'm also not sure what you mean when you mention gaining extra abilities to make layered images via the API. As far as I can tell, the only way to make images from the API is to either run Dockerfiles or to freeze a running container's filesystem into an image.
docker build is backed by buildkit, which is available as a grpc service ("docker build" is a grpc client/server).
Buildkit operates on "LLB", which is roughly equivalent to LLVM IR.
Dockerfile is a frontend.
Buildkit has the Dockerfile frontend built in, but you can use your own frontend as well.
If you ever see "syntax=docker/dockerfile:1.6", as an example, this triggers buildkit to fire up a container with that image and use it as the frontend instead of the built-in Dockerfile frontend.
Docker doesn't actually care what the format is.
Alternatively, you can access the same frontend APIs from a client (technically, a frontend is just a client).
Frontends generate LLB which gets sent to the solver to execute.
OK, wow, this is interesting indeed. I didn't realize just how much of a re-do of the build engine Buildkit was, I had just thought of it as a next-gen internal build engine, running off of Dockerfiles.
Applying this information to the topic at hand:
Given what Buildkit actually does, I bet someone could create a compiler that does a decent job transforming nix "derivations", the underlying declarative format that the nix daemon uses to run builds, into these declarative Buildkit protobuf objects and run nix builds on Buildkit instead of the nix daemon. To make this concrete, we would be converting from something that looks like this: https://gist.github.com/clhodapp/5d378e452d1c4993a5e35cd043d.... So basically, run "bash" with those args and environment variables, with the derivations shown there already built and their outputs made visible.
Once that exists, it should also be possible to create a frontend that consumes a list of nix "installables" (how you refer to specific concrete packages) and produces an OCI image out of the nix package repository, without relying on the nix builder to actually run any of it.
If you're using Nix, that is ultimately what you are producing; it's just buried under significant amounts of boilerplate and sensible defaults. The output of Nix evaluation (called a derivation) reads a lot like a pile of references, build instructions, and checksums.
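To give a flavor of that (heavily trimmed, and the store paths/hashes are placeholders), recent nix will print something roughly like this for nix derivation show nixpkgs#hello:
{
  "/nix/store/<hash>-hello-2.12.drv": {
    "builder": "/nix/store/<hash>-bash-5.2/bin/bash",
    "args": [ "-e", "/nix/store/<hash>-default-builder.sh" ],
    "inputDrvs": { "/nix/store/<hash>-gcc-13.drv": [ "out" ] },
    "env": { "src": "/nix/store/<hash>-hello-2.12.tar.gz" },
    "outputs": { "out": { "path": "/nix/store/<hash>-hello-2.12" } }
  }
}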
You can also use a hammer to put a screw in the wall.
Dockerfiles, being at their core a set of instructions for producing a container image, could of course be used to make a reproducible image, although you'd have to be painfully verbose to ensure that you got the exact same output. You would likely need two files, the first defining the build environment that the second actually gets built in.
Or you could use Nix, which is actually intended to do this and provides the necessary framework for reproducibility.
fun fact: there actually is a class of impact driver[1] that couples longitudinal force to rotation to prevent cam-out on screws like the Phillips head when high torque is required
Most Docker builds are not remotely deterministic or reproducible, as most of them pull in floating versions of their dependencies. This means that the same Dockerfile is likely to produce different results today than it did yesterday.
Aren’t Nix builds actually deterministic in that they’ll build the same each time? Docker doesn’t have that, you’re just using prebuilt images everywhere. Determinism has a computer science definition, it’s not “build once run anywhere,” it’s more like “builds the exact same binary each time.”
So, outside of the fact that a nix build disables networking (which you can actually do in a docker build, btw) how would you check all those build scripts in nix?
You don't. Those scripts will just fail, forcing you to rewrite them. This is why people trying to create new packages often complain: they need to patch up the original build for a given application so it doesn't do those things.
There are still ways a package will not be fully reproducible, for example if it uses rand() during the build; Nix doesn't patch that, but stuff like that is fortunately not common.
I'm not sure that this is a Docker problem, but you do have a point. I've used Docker from the very beginning and it always surprised me that users opted to use package managers over downloading the dependencies and then using ADD in the Dockerfile.
Using this approach you get something reproducible. Using apt-get in a Dockerfile is an antipattern.
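A sketch of that approach (the URL, version and digest are placeholders; ADD --checksum needs a reasonably recent Dockerfile syntax):
# syntax=docker/dockerfile:1.6
FROM debian:12.5
ADD --checksum=sha256:<digest-of-the-tarball> https://example.com/foo-1.2.3.tar.gz /opt/foo.tar.gz
RUN tar -xzf /opt/foo.tar.gz -C /opt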
Why? — I agree that it’s not reproducible, but so what?
We have 2-3 service updates a day from a dozen engineers working asynchronously — and we allow non-critical packages to float their versions. I’d say that successfully applies a fix/security patch/etc far, far more often than it breaks things.
Presumably we’re trying to minimize developer effort or maximize system reliability — and in my experience, having fresh packages does both.
This feels like moving the goalposts. This is on a thread which began with the statement that Docker is reproducible. Will we next be saying that, OK, it's an issue that Docker isn't reproducible, but it's doing it for a noble reason?
Regardless.... I can give a few reasons that it matters, off the top of my head:
1) Debugging: It can make debugging more difficult because you can't trace your dependencies back to the source files they came from. To make it concrete, imagine debugging a stack trace but once you trace it into the code for your dependency, the line numbers don't seem to make any sense.
2) Compliance: It's extremely difficult to audit what version of what dependency was running in what environment at what time
3) Update Reliability: If you are depending on mutable Docker tags or floating dependency installation within your Dockerfile, you may be surprised to discover that it's extremely inconsistent when dependency updates actually get picked up, as it is on the whim of Docker caching. Using a system that always does proper pinning makes it more deterministic as to when updates will roll out.
4) Large Version Drift: If you only work on a given project infrequently, you may be surprised to find that the difference between the cached versions of your mutably-referenced dependencies and the actual latest has gotten MUCH bigger than you expected. And there may be no way to make any fixes (even critical bugfixes) while staying on known-working dependencies.
Docker doesn't give you the tooling to build a package; it expects you to bring the toolchain of your choice.
Docker executes your toolchain, and does not prescribe one to you except for how it is executed.
Nix is the toolchain, which of course has its advantages.
In terms of builds and dependency management, Docker and nix actually work pretty similarly under the covers:
Both are mostly running well-controlled shell commands and hashing their outputs, while tightly controlling what's visible to what processes in terms of the filesystem. The difference is that nix is just enough better at it that it's practical to rebase the whole ecosystem on top of it (what you refer to as a "toolchain") whereas Docker is slightly too limited to do this.
Uh, I never even mentioned apt. Docker and nix are, likewise, very different. I'm not super familiar with either, but I do know docker isn't reproducible by design whereas nix is. I'm not sure nix is always deterministic, though I know docker (and apt) certainly aren't, nor are they reproducible by design.
So the thing here is docker provides the tooling to produce reproducible artifacts with graphs of content addressable inputs and outputs.
nix provides the toolchain of reproducible artifacts... and then uses that toolchain to build a graph of content addressable inputs in order to produce a content addressable output.
So yes they are very different, but not in the way you are describing.
Using nix, just like using docker, cannot guarantee a reproducible output.
Reproducible outputs are dependent on inputs.
If your inputs change (and inputs can even be a build timestamp you inject into a binary) then so does your output.
With nix, you just have to be careful not to do anything non-deterministic to get a deterministic build. With docker build, you have to specifically design a deterministic build yourself. It's easier to just not use inputs that change than to design a new build that's perfectly deterministic.
The timestamps thing is part of ensuring that archives will have the correct hash. Nix ensures that the inputs to a build (the compiler, environment, dependencies, file system) are exactly the same, the idea being that the compiler will then produce an identical output. Hashes are used throughout the process to ensure this is actually the case; they are also used to identify specific outputs.
The Nix idea is to start building with a known state of the system and list every dependency explicitly (nothing is implicit, or downloaded over net during build).
This is achieved by building inside of a chroot, with blocked network access etc. Only the dependencies that are explicitly listed in the derivation are available.
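A minimal sketch of such a derivation (the package, URL and hash are placeholders, and this assumes stdenv, fetchurl, zlib and openssl are brought into scope from nixpkgs as usual):
stdenv.mkDerivation {
  pname = "foo";                      # hypothetical package
  version = "1.2.3";
  src = fetchurl {
    url = "https://example.com/foo-1.2.3.tar.gz";
    hash = "sha256-<content hash>";   # fixed-output fetch: the one place network is allowed, verified by hash
  };
  buildInputs = [ zlib openssl ];     # only what is listed here is visible inside the build sandbox
}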
> I still don't understand what it does better than Docker
It doesn't break as you scale. If you don't need that, then keep using Docker. (Personally, for me "scale" starts at "3 PC's in the home", so I eventually switched all of them to NixOS. I don't have time to babysit these computers.)
> Docker build are deterministic and easily reproductible
No, they definitely aren't. You don't really want to go down this rabbit hole, because at the end you realize Nix is still the simplest and most mature solution.
That's the interesting bit about Dockerfiles. They _look_ deterministic, and they even are for a while, while you're looking at them as a developer. I've done a detailed writeup of how they're not deterministic at https://docs.stablebuild.com/why-stablebuild