
Bootstrapping Rust - darnir
https://gnu.org/software/guix/blog/2018/bootstrapping-rust/
======
rusbus
The "bootstrappable.org" project that OP refers to is an interesting practical
result of the infamous "Reflections on Trusting Trust."[1] If you have the
compiler's source, and the compiler is itself bootstrapped from source, then
you effectively sidestep the problem raised in the paper: someone sneaks a
boobytrapped compiler in somewhere in the process, resulting in a chain of
tainted compilers.

This is the kind of work that seems pretty thankless, but I'm glad someone is
doing it.

[1]
[https://www.archive.ece.cmu.edu/~ganger/712.fall02/papers/p7...](https://www.archive.ece.cmu.edu/~ganger/712.fall02/papers/p761-thompson.pdf)

~~~
colejohnson66
“someone sneaks in a boobytrapped compiler somewhere...”

Especially one that spits out a white supremacy message every once in a while:
[https://www.quora.com/What-is-a-coders-worst-nightmare/answe...](https://www.quora.com/What-is-a-coders-worst-nightmare/answer/Mick-Stute)

~~~
unhammer
Wow. But it sounds almost too good to be true; anyone know if it's for real?

~~~
Drdrdrq
I have no idea if it's true, but it is really not that difficult to achieve in
a setting like this, where binaries are compiled from the sources that are
available on the system itself. Even creating a compiler that poisons itself
is not that difficult. It is the idea itself which is genius, and
(unfortunately) completely doable.

------
yoklov
> There are plans to extend mrustc to support newer Rust, but it turned out to
> be difficult.

Was some feature added in Rust 1.20.0 that was particularly difficult to
implement? Or is this just a 'have to stop somewhere' situation?

~~~
steveklabnik
My understanding is that the goal was to break the bootstrap chain, and so
once that was done, there weren’t a ton of reasons to keep working on it.

mrustc is effectively written entirely by one person:
[https://github.com/thepowersgang/mrustc/graphs/contributors](https://github.com/thepowersgang/mrustc/graphs/contributors)

It's extremely impressive, but I can also understand why trying to keep up
doesn't make a ton of sense.

~~~
Twirrim
Presumably the drawback here is that rust is going to take an increasing
amount of time to bootstrap on these platforms.

You're releasing about 8 a year (excluding bug fix releases), each of which is
necessary to compile the next one, and so on down the line? That sounds like
it's going to get extremely nasty, really quickly.
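
To put rough numbers on that, here is a minimal sketch. It assumes every 1.N release must be built by exactly 1.(N-1), one build per minor version; `bootstrap_chain` and `builds_needed` are hypothetical helper names, and the version numbers are only examples.

```rust
/// Lists the rustc versions in a bootstrap chain, starting version first.
/// Assumption: each 1.N release is built by 1.(N-1), with no skipped steps.
fn bootstrap_chain(start_minor: u32, target_minor: u32) -> Vec<String> {
    (start_minor..=target_minor)
        .map(|minor| format!("1.{}", minor))
        .collect()
}

/// Number of compiler builds needed after the starting version exists.
fn builds_needed(start_minor: u32, target_minor: u32) -> u32 {
    target_minor.saturating_sub(start_minor)
}
```

Under that assumption, going from 1.19 to 1.31 means `builds_needed(19, 31)` full compiler builds, and the count grows by one with every new release.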

~~~
steveklabnik
That’s assuming that someone wants to fully bootstrap. Only a very, very small
number of people actually do this, and since this is all about trust anyway,
it all depends on your level of paranoia.

mrustc has already bootstrapped a byte-identical rustc from the mainline; that
in and of itself is good enough for many. Even Debian didn’t do a full
bootstrap from the OCaml days.

It’s all about trust. If you want to be mega paranoid, then yeah it’s gonna be
a lot of work. But it always is. This whole thing is only an issue for a
very small number of people. Those people are important! But it’s a tradeoff,
like everything.
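
For reference, "byte-identical" is a check anyone can run on two build outputs. A minimal sketch, assuming both binaries are ordinary files small enough to read into memory; `byte_identical` is a hypothetical helper, not part of mrustc or rustc:

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns Ok(true) when both files contain exactly the same bytes.
/// Reads each file fully into memory, which is fine for a sketch.
fn byte_identical(a: &Path, b: &Path) -> io::Result<bool> {
    let left = fs::read(a)?;
    let right = fs::read(b)?;
    Ok(left == right)
}
```

In practice people usually compare cryptographic hashes of the two files instead, but the idea is the same: any single differing byte counts as a mismatch.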

------
unhammer
[https://dwheeler.com/trusting-trust/](https://dwheeler.com/trusting-trust/)
is the page on Diverse Double-Compiling as a counter to the trusting trust
attack
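
As I understand the idea, it goes roughly like this: build the compiler's source with a second, trusted compiler, use the result to build the same source again, and compare that output bit-for-bit with the binary you were given. Below is a toy model of that check, not Wheeler's actual tooling. The `Binary` struct, the string "bytes", and the way a trojan is represented are all simplifying assumptions made up for illustration.

```rust
/// Toy stand-in for a compiler executable.
#[derive(Clone, Debug, PartialEq, Eq)]
struct Binary {
    /// Which code generator this binary implements (decided by its source).
    codegen: String,
    /// The "bytes" of the binary, as emitted by whatever compiled it.
    bytes: String,
    /// True if the binary re-inserts a trojan into compilers it builds.
    trojan: bool,
}

/// "Compiles" `source` with `compiler`. In this model a source is just the
/// name of the code generator it describes, and output bytes depend only on
/// the compiling binary's code generator, the source, and any trojan.
fn compile(compiler: &Binary, source: &str) -> Binary {
    let mut bytes = format!("{}({})", compiler.codegen, source);
    if compiler.trojan {
        bytes.push_str("+trojan"); // a tainted compiler taints its output
    }
    Binary {
        codegen: source.to_string(),
        bytes,
        trojan: compiler.trojan,
    }
}

/// The double-compile check: trusted compiler builds the source (stage 1),
/// stage 1 rebuilds the source (stage 2), stage 2 is compared to the suspect.
fn ddc_matches(trusted: &Binary, suspect: &Binary, source: &str) -> bool {
    let stage1 = compile(trusted, source);
    let stage2 = compile(&stage1, source);
    stage2.bytes == suspect.bytes
}
```

In the model, a clean self-built binary matches stage 2 exactly, while a trojaned one differs. The real technique additionally needs deterministic builds and an independent trusted compiler that can build the source at all.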

------
ximeng
So the bootstrap chain here is g++->mrustc->several iterations of rust (rather
than the original bootstrap chain via OCaml).

Bootstrap for g++ is presumably something like machine code->asm->c->g++.

And the overall goal is something like shortening or simplifying the chain
from machine code to rust compiler.

Ideally I suppose this would be something like machine code->proto rust->rust
compiler.

Haskell seems to have a relatively good pipeline, with clean division between
core and non-core.

[https://ghc.haskell.org/trac/ghc/wiki/Commentary](https://ghc.haskell.org/trac/ghc/wiki/Commentary)

~~~
clort
The nice(r) thing about bootstrapping GCC is that there are many alternative
implementations of C and C++ already existing.

~~~
ploxiln
And you can build GCC-8 with GCC-4.8, whereas rust seems to require at least
the previous version of rust. It would be reasonable for rust 1.19 to be able
to build the latest release for the next couple of years ... rust 1.19 is less
than 18 months old! Surely rust was an OK language 18 months ago?

~~~
hsivonen
Bootstrapping is done rarely but ongoing compiler development is done all the
time. It seems like a bad idea to optimize for bootstrapping at the cost of
not being able to use the best Rust has to offer in compiler development
today.

~~~
ploxiln
The benefit is to stable software distributions, and also to from-scratch
software stack builds in diverse environments. As a software engineer who has
been interested in reliable and understandable software systems for a couple
decades, I find the Rust trend of following the web-ecosystem and completely
abandoning software support after a couple of months to be _nuts_.

Look, I'm a reasonable guy. I don't insist on being compatible with Linux 2.4,
I don't insist on working with a compiler and system libraries from 2006, I
don't reject all bundling completely. But working with stuff from a few years
ago would be very helpful to lots of projects and efforts, and is what I've
come to expect from high quality software distributions and libraries, like
Debian stable, the linux kernel, GCC, libpng, sqlite3, etc etc.

(The linux kernel is a huge project and maybe the fastest-moving in existence,
and you can build the very latest release with GCC-4.7!)

~~~
zozbot123
> I find the Rust trend of following the web-ecosystem and completely
> abandoning software support after a couple of months to be _nuts_.

If Rust were a "traditional" project and not following web-ecosystem practices,
it would still be issuing 0.x releases though - the language itself is quite
far from true maturity (the "2018" version has _only just_ gained NLL, and
there are plenty of deeply-impacting features in the pipeline, at various
stages of development). So it's six of one, half a dozen of the other...

~~~
acqq
To save others searching, the NLL mentioned is: [https://github.com/rust-lang/rfcs/blob/master/text/2094-nll....](https://github.com/rust-lang/rfcs/blob/master/text/2094-nll.md)

“non-lexical lifetimes”

~~~
steveklabnik
For a simpler explanation, see [https://doc.rust-lang.org/edition-guide/rust-2018/ownership-...](https://doc.rust-lang.org/edition-guide/rust-2018/ownership-and-lifetimes/non-lexical-lifetimes.html) (the guide
says beta but it’s in stable now; that will be fixed soon).
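
A small illustration of what NLL changes, as a sketch rather than an excerpt from the guide. The function name is made up. With the older lexical borrow checker the `push` below is rejected, because the shared borrow `first` is considered live until the end of the enclosing scope; with NLL the borrow ends at its last use.

```rust
/// Accepted with non-lexical lifetimes; rejected by the older lexical checker.
fn push_after_last_use() -> Vec<i32> {
    let mut scores = vec![1, 2, 3];
    let first = &scores[0]; // shared borrow of `scores` starts here
    let copied = *first; // last use of `first`, so under NLL the borrow ends
    scores.push(copied + 10); // mutable borrow is allowed from here on
    scores
}
```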

------
andrewchambers
Writing mrustc in C++ seems like such a mistake.

It compiles Rust to C, so why not write it in Rust and then have it compile itself to C?

Another idea: just compile rustc to WebAssembly, then use
[https://github.com/WebAssembly/wabt/tree/master/wasm2c](https://github.com/WebAssembly/wabt/tree/master/wasm2c)
to convert it to your bootstrap source.

~~~
nine_k
> _why not write it in rust?_

Exactly to avoid having any rust compilers in the chain.

~~~
mrob
There are many Rust compilers in the bootstrap chain. The point is to make
every step in the chain human-readable. Auto-generated C is not "source code"
in the sense of "the preferred form of the work for making modifications to
it" (the GPL's definition of source code). A malicious code generator could
hide a trusting trust attack in the generated code in such a way that it would
be difficult to find. True source code is easier to audit.

