Shouldn't this example of randomness still be pushed to the installation stage, rather than into the distribution? If Debian's binary package contains a "random" key, then we have a pretty large herd already sharing the same value.
Indeed. While working on reproducibility in Debian, I found exactly this: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=833885
Such randomization can mitigate certain kinds of attacks, especially by making particular instances of an attack behave differently against each kernel build, and thus increasing the difficulty of writing exploits that work across a broad range of kernel builds. There are about 30 different patch releases of the kernel for Debian Jessie alone; so such randomness, if applied across the different builds of a kernel within a release, across the various Debian releases, across different architectures, and also applied by other distros, can substantially increase the difficulty of writing an exploit that affects a substantial fraction of Linux systems on the net.
That helps provide a form of "herd immunity", where the herd is all deployed Linux systems on the net. It doesn't provide any real protection for a particular build; it's no harder to write an exploit that targets all systems running the exact same kernel. But it does dramatically increase the difficulty of writing a widespread worm that relies on the given exploit and can easily spread between a variety of different systems.
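As a sketch of the seed-generation half of the idea (the kernel's actual mechanism for consuming such a seed, e.g. the structure-layout randomization plugin, is more involved than this):

```python
import secrets

def make_build_seed() -> str:
    """Generate a fresh random seed for one distro build. If every
    build (per release, per architecture, per distro) uses its own
    seed, an exploit keyed to one build's layout fails on the rest."""
    return secrets.token_hex(32)

# Each member of the herd diverges (variable names are illustrative):
seed_jessie_amd64 = make_build_seed()
seed_jessie_armhf = make_build_seed()
assert seed_jessie_amd64 != seed_jessie_armhf
```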
Anyhow, that footnote was about a policy for explicit exceptions to the general policy. The general policy is that yes, builds should be reproducible, and any randomness should be generated locally. The footnote was giving a few examples of potential exceptions which may be required, as a way of demonstrating that it would be fine to have the general requirement be reproducible builds, with narrow, case-by-case exceptions where non-reproducible builds provide significant value.
And note that such exceptions may be based on how an upstream project operates. If the upstream kernel has some modules that build non-reproducibly in such a way, it may be more viable to encode an explicit exception to the policy for that case than to remove the non-reproducibility.
It's important to keep in mind that someone applying a cleanup once accidentally removed almost all entropy from OpenSSL's random number generator on Debian (they even asked for review from upstream, and failed to get it, because of the confusing naming of upstream's mailing lists). Having an excessively rigid policy could mean that Debian maintainers would be forced to remove features that do provide some kind of benefit.
Rebuild? No. Just relink (and link in a just-generated binary containing the randomness seed). At boot time. Just what OpenBSD is experimenting with.
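To illustrate the shape of that idea (this is a toy simulation, not how a real linker or OpenBSD's kernel relinking actually works): the bits you ship stay reproducible and verifiable, and the randomness is mixed in locally at boot.

```python
import hashlib
import secrets

# Stand-in for the reproducible, distro-shipped objects.
REPRODUCIBLE_OBJECTS = b"deterministic build output"

def relink_at_boot() -> bytes:
    """Simulate boot-time relinking: combine the reproducible objects
    with locally generated randomness, so the final image differs per
    machine and per boot while the shipped bits stay byte-for-byte
    verifiable."""
    local_random = secrets.token_bytes(16)
    return REPRODUCIBLE_OBJECTS + local_random

image_a = relink_at_boot()
image_b = relink_at_boot()
# The shipped part is identical and auditable...
assert image_a[:len(REPRODUCIBLE_OBJECTS)] == image_b[:len(REPRODUCIBLE_OBJECTS)]
# ...but each "boot" produces a unique final image.
assert hashlib.sha256(image_a).digest() != hashlib.sha256(image_b).digest()
```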
Some kind of blockchain-like trust verification system isn't the craziest idea I've ever pitched.
Like if 500 people can independently confirm a build produces a binary with result X then they could all share in the reward, whatever that is. Nanokarmas?
Other systems are designed to be more asymmetric in order to facilitate scale. Crypto coins would never work at all if, to verify a candidate hash, you had to spend days of mining to reproduce the work. Spending five minutes compiling a program to achieve consensus isn't a problem.
The problem is in verifying that someone actually did the work and didn't steal someone else's solution. Maybe encrypting the result you get and sending it off in escrow to a centralized verification location would work, and once a sufficient number of solutions are collected the solutions are unsealed and the results shared so everyone can see what happened and raise any objections.
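The escrow step could be as simple as a hash commitment rather than encryption. A sketch, with made-up names, not a real protocol: each verifier publishes a commitment up front and reveals the salt plus result only once enough solutions have arrived.

```python
import hashlib
import secrets

def commit(result: str) -> tuple[str, str]:
    """Commit to a build result without revealing it: publish
    H(salt || result), keep the salt secret until the reveal phase."""
    salt = secrets.token_hex(16)
    commitment = hashlib.sha256((salt + result).encode()).hexdigest()
    return commitment, salt

def reveal_ok(commitment: str, salt: str, result: str) -> bool:
    """Anyone can check that the revealed result matches what was
    committed, so a late verifier can't copy an earlier answer."""
    return hashlib.sha256((salt + result).encode()).hexdigest() == commitment

c, s = commit("sha256-of-the-built-binary")
assert reveal_ok(c, s, "sha256-of-the-built-binary")
assert not reveal_ok(c, s, "some-other-result")
```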
>Like if 500 people can independently confirm a build produces a binary with result X then they could all share in the reward, whatever that is.
In a cryptocurrency blockchain, every single node verifies every block received from another node in order to decide whether to include it in the node's local copy of the blockchain. If computing and verifying are the same operation, it's not "500 people independently verify that a source produces a given binary and the rest of the network rewards them", but "the entire network verifies that a source produces a given binary and they all equally pat each other on the back". Unless you only reward the first to verify a build; but then, if verifying someone else's block takes just as long as mining your own, no node would bother verifying other people's blocks and building off their chain rather than mining its own.
>The problem is in verifying that someone actually did the work and didn't steal someone else's solution.
Bitcoin uses the hash of the rest of the block (which includes the address for the reward of the miner doing the proof-of-work) as an input into the proof-of-work, such that the result of the proof-of-work is only valid for that input. It's not apparent to me whether there could be room to add an input like that into checking whether a source compiles to a binary. (I thought through whether you could make a Lamport-signature-like scheme involving picking specific intermediate values generated during the compilation that correspond to parts of a pre-committed series of hash pairs, but then I realized it wouldn't work, because anyone who does the build once would get all of the intermediate values and be able to create as many of these signatures as they wanted for little effort.)
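The address-binding trick can be shown with a toy proof-of-work (a gross simplification of Bitcoin's actual block-header hashing, with made-up field names):

```python
import hashlib

def mine(block_data: str, reward_address: str, difficulty: int = 4) -> int:
    """Toy proof-of-work: find a nonce such that
    sha256(block_data | reward_address | nonce) starts with
    `difficulty` hex zeros. Because the reward address is part of the
    hashed input, the nonce found is bound to that address."""
    nonce = 0
    while True:
        digest = hashlib.sha256(
            f"{block_data}|{reward_address}|{nonce}".encode()
        ).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce
        nonce += 1

nonce = mine("block payload", "alice")
alice = hashlib.sha256(f"block payload|alice|{nonce}".encode()).hexdigest()
mallory = hashlib.sha256(f"block payload|mallory|{nonce}".encode()).hexdigest()
assert alice.startswith("0000")
# Reusing Alice's nonce under Mallory's address yields an unrelated
# hash, which almost certainly misses the target: the work can't
# simply be re-attributed.
assert alice != mallory
```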
>Maybe encrypting the result you get and sending it off in escrow to a centralized verification location would work, and once a sufficient number of solutions are collected the solutions are unsealed and the results shared so everyone can see what happened and raise any objections.
Sounds like what you're looking for is some kind of web-of-trust reputation system with a trusted authority, rather than a cryptocurrency blockchain. (If you have a trusted authority, then nearly all of the design of a bitcoin-like cryptocurrency is ridiculous dead weight. You can shed nearly everything, you don't need a broadcast-everything blockchain, and you could choose to have really cool things like blind signatures for anonymous transactions.) (Though if you have a trusted authority who can afford to be running build processes, it'd be a lot simpler to just have them do all the build-verifying for you; you could do away with anything discussed in this post and just have them publish a PGP-signed, HTTPS-served page with their results.)
You don't have to do that on the server right away, though. Just give the same task to multiple random users and only verify the result when all of the users return the same hash.
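That quorum check could look something like this (names made up; assuming each user reports the hash of the binary they built):

```python
from collections import Counter

def quorum_result(reported_hashes: list[str], quorum: int):
    """Accept a build hash only once `quorum` independent users have
    reported the same value; otherwise return None, flagging the task
    for redistribution or a server-side rebuild."""
    if not reported_hashes:
        return None
    value, count = Counter(reported_hashes).most_common(1)[0]
    return value if count >= quorum else None

# Three of four agree: accepted.
assert quorum_result(["abc1", "abc1", "abc1", "dead"], quorum=3) == "abc1"
# No consensus yet: the server withholds judgment.
assert quorum_result(["abc1", "dead"], quorum=3) is None
```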
Honestly, I think it's a migration that would really be worth it. Nix (and Guix) are quite mature now. The advantages they bring to the table are massive.
The whole Debian ecosystem would become a lot more integrated and robust. Packages could be developed at their own pace, without having to keep all dependencies in sync with the whole package tree. Besides, no more dist-upgrade breaking your whole system. It would look a lot like a rolling release, but with none of its disadvantages.
It would also be possible to turn all Debian flavours into little declarative Nix blurbs. There are countless advantages.
But I also really appreciate the massive effort that Debian maintainers make, and the sheer number of those maintainers.
Combining whatever human processes Debian have in place to keep that going with Nix would be fantastic. Right now, to use Nix regularly, you really have to be willing to read a lot of Nixpkgs source code.
Edit: I should also note that I do actually currently use Nix on top of Debian for my work machine. Servers are all NixOS machines deployed with NixOps though.
1. Too little marketing. I only heard of Nix for the first time last year.
2. When you boot a virtual server, the UI usually offers you Debian, Ubuntu, and CentOS/RHEL images, sometimes SUSE/SLES or Arch. So again, less visibility; and if the hosting provider does not allow you to bring your own image, it's a pain in the ass to install anything else.
3. (Going off topic for a second: I probably would have switched to Nix(OS) already if it wouldn't entail effectively abandoning the configuration management tool that I maintain.)
The reproducibility that this post is about is for auditability ("verify that the binary originated from the claimed source"), whereas the reproducibility in Nix is for reliability ("ensure that the same package either fails or succeeds on every system in the exact same way, regardless of what the environment looks like").
Both are nice to have, but they are only tangentially related.
When you download a package from Nix's binary cache, you look it up by its input hash. The binary cache contents may differ depending on the build server's environment.
Nix does try to eliminate nondeterminism in its builds, though.
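A toy illustration of input-addressing (a gross simplification of Nix's actual store-path hashing, with made-up names): the cache key depends only on the inputs, so it cannot by itself detect that two servers produced different output bytes.

```python
import hashlib

def input_hash(src: str, deps: tuple, builder: str) -> str:
    """Hypothetical input-addressed cache key: derived from what goes
    into the build, not from the bytes that come out."""
    material = "|".join([src, *deps, builder])
    return hashlib.sha256(material.encode()).hexdigest()[:16]

# Two build servers with identical inputs share one cache key...
key_a = input_hash("hello-2.12.tar.gz", ("gcc", "glibc"), "make install")
key_b = input_hash("hello-2.12.tar.gz", ("gcc", "glibc"), "make install")
assert key_a == key_b
# ...even if their outputs embed different timestamps or paths, which
# the input hash alone cannot detect. That is why auditability needs
# bit-for-bit reproducible outputs, not just input-addressed lookup.
```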