
“This change deletes the C implementations of the Go compiler and assembler” - osw
https://github.com/golang/go/commit/b986f3e3b54499e63903405c90aa6a0abe93ad7a
======
justincormack
So what is the bootstrap process going to be, other than already having a Go
compiler, I mean? Or do you need a Go cross compiler?

Maybe it matters less. You used to always assume a bootstrap from C, but that
more or less died with C++-based compilers, although you can still do a
multistage bootstrap from the last gcc before C++.

~~~
yiyus
It is explained in the design document:
[https://docs.google.com/document/d/1P3BLR31VA8cvLJLfMibSuTdw...](https://docs.google.com/document/d/1P3BLR31VA8cvLJLfMibSuTdwTuF7WWLux71CYD0eeD8/edit)

Basically, you start from the last C version, and every version is supposed to
be able to compile the next one.

~~~
cperciva
So if you want to avoid trusting trust, you need to audit not only a C
compiler and the source code for the Go compiler you plan on using, but also
every past Go compiler as well?

~~~
f2f
You can't avoid trusting trust. "Ken was here" :)

~~~
cperciva
You can. You just have to bootstrap all the way up.

~~~
fugyk
What if any intermediate version is found to be compromised by a
trusting-trust attack? Every Go maintainer would have to rebuild every version
sequentially from that version to the current one.

------
arcticbull
I still just don't understand why they insist on building their own toolchain.
It just doesn't make sense to me.

When you set out to build a programming language, what is your objective? To
create a sweet new optimizer? To create a sweet new assembler? A sweet new
intermediate representation? AST? Of course not. You set out to change the way
programmers tell computers what to do.

So why do they insist on duplicating: (1) An intermediate representation. (2)
An optimizer. (3) An assembler. (4) A linker.

And they didn't innovate in any of those areas. All those problems were solved
with LLVM (and, to some extent, GCC, though it is harder to interact with). So
why solve them again?

It's like saying you want to build a new car to get from SF to LA and starting
by building your own roads. Why would you not focus on what you bring to the
table: A cool new [compiler] front-end language. Leave turning that into bits
to someone who brings innovation to that space.

This is more of a genuine question.

~~~
dsymonds
What's your counter-proposal?

If you're building a new language, you need a new AST. You can't represent Go
source code in a C++ AST.

There _are_ alternate compilers for Go, in the form of gccgo and llgo. But
those are both very slow to build (compared to the Go tree that takes ~30s to
build the compiler, linker, assembler and standard library). And the "gc" Go
compiler runs a lot faster than gccgo (though it doesn't produce code that's
as good), and compilation speed is a big part of Go's value proposition.

~~~
arcticbull
I would never set out to build a language I wanted people to use and not build
it as a front-end for LLVM. I don't want to write an optimizer or assembler.

I don't doubt for one second that llgo takes a longer time to compile. And in
exchange for slower compile times you benefit from many PhDs' worth of
optimizations in LLVM. And every single target architecture they support.

It's easy to build something faster when it does less. I'll admit there's no
blanket right answer to that tradeoff.

~~~
dsymonds
Yes, that's why there's both gc and gccgo (llgo came later). Apart from the
rigour of having two independent compilers, they are seeking different
tradeoffs. gc is very interested in running fast, and gccgo benefits from
decades of work that have been put into gcc's various optimisations.

Does that answer your original statement that you didn't understand why we
build our own toolchain?

~~~
bsdetector
Well I still don't understand. Russ says it was for segmented stacks, but
doesn't explain why those were necessary. You say it was for compile speed,
yet gcc and llvm can crank out millions of lines of code a second at similar
optimization levels as the Go compiler. Neither of these is a convincing
explanation.

------
ngoldbaum
Wow, github doesn't handle big diffs well. Some sort of automatic pagination
would really help.

~~~
cratermoon
I would make the case that "big" diffs are a problem. Unless there's a _really_
good reason (and bad dependency management is not a good reason) then commits
should be smaller and more logically related.

~~~
rcthompson
This is a merge commit, which means the diff is going to include all the
changes on the branch being merged. Even if all the individual commits are
small, a merge diff can still be very large.

------
brandonwamboldt
Congrats to the Go team, but that link kills the browser....

~~~
davecheney
You can read the original commit on Gerrit, it's less explodey.

[https://go-review.googlesource.com/#/c/5652/](https://go-review.googlesource.com/#/c/5652/)

------
Animats
Nice. That's a step forward. Another bit of legacy code bites the dust.
Another step forward to the post-C world we need.

(If you want to compile with a different compiler as a check, there's an
LLVM-based compiler for Go.)

~~~
gillianseed
Go is also supported in GCC, as GccGo.

------
bketelsen
RSC is awesome.

------
smegel
And the boy pulled up his bootstraps and became a man.

------
Vecrios
So, if I'm understanding this correctly, they are to re-write the Go compiler
in Go, and compile it using the currently published compiler (i.e. 1.4)?

Could someone, kindly, explain how future versions would be built? Thanks!

~~~
humbledrone
My understanding is that they wrote code that translated the C code for the
original Go compiler into Go code. This translation wasn't fully general -- it
made assumptions about how the C code was written -- but it allowed the port
from C to Go to be very precise (i.e. bug for bug). So now that the Go
compiler written in Go can compile Go, that's what they'll use going forward,
and they will slowly work to make it into more idiomatic Go instead of
machine-generated Go.

So to answer your question, this new Go-written-in-Go compiler will initially
be compiled by the Go-written-in-C compiler. The output from that will be an
executable Go-written-in-Go compiler, and _that_ will be used to compile
itself in the future. I.e. Go compiler version 1.4 will be used to compile Go
version 1.5 will be used to compile Go version 1.6...
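
As a rough illustration of the chain described above, the Go sketch below walks a table of "which toolchain builds which" back to its root and lists the toolchains in the order they would have to be built. The names `builtBy` and `bootstrapChain`, and the label `cc` (standing in for a C compiler), are hypothetical and exist only for this example. The table encodes the simple "each release builds the next" reading from this comment, not an official policy.

```go
package main

import (
	"fmt"
	"strings"
)

// bootstrapChain walks builtBy backwards from target and returns the
// toolchains oldest first, i.e. in the order they would need to be built.
// builtBy maps a toolchain to the toolchain assumed to compile it; a
// toolchain with no entry is treated as the root of the chain.
func bootstrapChain(builtBy map[string]string, target string) []string {
	var chain []string
	seen := map[string]bool{} // guards against accidental cycles in the table
	for cur := target; cur != "" && !seen[cur]; cur = builtBy[cur] {
		seen[cur] = true
		chain = append(chain, cur)
	}
	// The walk collected newest first; reverse it in place.
	for i, j := 0, len(chain)-1; i < j; i, j = i+1, j-1 {
		chain[i], chain[j] = chain[j], chain[i]
	}
	return chain
}

func main() {
	// Hypothetical table: "cc" stands in for a C compiler (assumption).
	builtBy := map[string]string{
		"go1.6": "go1.5",
		"go1.5": "go1.4",
		"go1.4": "cc",
	}
	fmt.Println(strings.Join(bootstrapChain(builtBy, "go1.6"), " -> "))
	// prints: cc -> go1.4 -> go1.5 -> go1.6
}
```

Every entry in the returned chain is something you would have to trust or audit. If the 1.4 toolchain can build any later 1.x release directly, the table would map each later release straight to `go1.4`, and the chain would stay three entries long.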

Keep in mind that this is not at all unusual. The C compiler GCC has been
compiled using older versions of GCC for a long time. Having a compiler
compile itself is a sort of milestone that many languages aspire to as a way
of showing that the language is "ready."

~~~
uxp
It's generally called "self-hosting" when a compiler can compile itself[1]. It
was a pretty big deal when Clang became self-hosting[2] in 2010.

[1] [https://en.wikipedia.org/wiki/Self-hosting](https://en.wikipedia.org/wiki/Self-hosting)

[2] [http://blog.llvm.org/2010/02/clang-successfully-self-hosts.h...](http://blog.llvm.org/2010/02/clang-successfully-self-hosts.html)

~~~
Vecrios
Thank you both for your inputs.

------
tbolt
So this means the go compiler is completely written in go?

~~~
dsymonds
In source control, yes. There's not yet a stable release where that's the case
though; Go 1.5 (due later this year) will be that release.

------
joeld42
congrats gophers! That's a big step for the language.

------
davidrusu
Anyone else seeing this post as the 1st and 2nd link on the front page of HN?

~~~
WestCoastJustin
In case it gets fixed, here's what I see, to help diagnose the bug [1]. Both
posts point to
[https://github.com/golang/go/commit/b986f3e3b54499e63903405c...](https://github.com/golang/go/commit/b986f3e3b54499e63903405c90aa6a0abe93ad7a),
have the same HN item id=9097404, but different comment counts.

[1] [http://i.imgur.com/xATOXPb.png](http://i.imgur.com/xATOXPb.png)

~~~
jdoliner
This must be the new and improved "eventually consistent HN" that dang has
been talking about.

------
pjmlp
Great news!

------
gresrun
Once you go Go, you never Go back!

~~~
bsummer4
Then, clearly, the right path is to never go Go.

------
bcantrill
One does wonder if the register re-naming from their abstract (but misleading)
names to their proper machine names (e.g., from "SP" to "R13") wasn't at all a
reaction to the (in)famous polemic on the golang build chain.[1]

[1] [http://dtrace.org/blogs/wesolows/2014/12/29/golang-is-trash/](http://dtrace.org/blogs/wesolows/2014/12/29/golang-is-trash/)

~~~
rsc
SP, FP, and PC are all still there. What we did was make the conventions more
uniform across all architectures. The rules for certain corner cases for when
SP and PC were references to the virtual register and when they were
references to the real register were inconsistent. As part of having a single
assembly parser, we made the rules consistent, which meant eliminating some
forms that were accepted on only a subset of systems, or that had different
meanings on different systems.

I'm a little surprised you brought that post up to begin with. It completely
misses the point, as I explained in my comment here at the time
([https://news.ycombinator.com/item?id=8817990](https://news.ycombinator.com/item?id=8817990)).
When I wrote that response I also submitted a comment on the blog itself with
a link to the HN comment. That blog comment has not yet been published. If
you're going to keep sending around links to such an inflammatory blog post,
could you also try to get my comment there approved?

Thanks.

------
davexunit
Here we go again. _Another_ compiler that can't be bootstrapped from source
code. It's a packaging nightmare. Another magic binary to trust not to have a
Thompson virus.

~~~
chimeracoder
> Another compiler that can't be bootstrapped from source code.

It _can_ be bootstrapped from source - it just needs to be bootstrapped either
using gccgo[0], or using the 1.4 compiler (which is guaranteed to work for all
1.x compilers, not just 1.5)

> Another magic binary to trust not to have a Thompson virus.

"Reflections on Trusting Trust" gets posted on HN regularly, and it's an
interesting exercise, but you are _far_ more likely to have an exploit hiding
in plain sight in a compiler compiled from source once than you are to have
one that only appears after multiple iterated compilations.

It's a good concept for security experts and compiler developers to be aware
of, but the likelihood is incredibly small.

Also, for what it's worth, "Trusting Trust" is over three decades old, and
there have been numerous responses to it in the interim, with lots of study.
It's like saying "Your problem reduces to 3-SAT, and satisfiability is
NP-hard, so you can't solve it", throwing your hands up, and leaving it at that.
In reality, solving 3-SAT in the general case is NP-hard, but it is
well-studied enough that, in practice, solving SAT/3-SAT is actually pretty easy
most of the time. Some of these responses have even been posted elsewhere in
this thread, though they're also pretty easy to find online as well.

[0] which is written in C++ - frankly, I'd be much more concerned about a
single-compilation bug in _any_ C++ code than I'd be about a
multiple-compilation bug in Go.

~~~
nullc
[http://www.dwheeler.com/trusting-trust/](http://www.dwheeler.com/trusting-trust/)
< David A. Wheeler’s Page on Fully Countering Trusting Trust through
Diverse Double-Compiling, for an example.

Though the diversity available for a go compiler written in go isn't very
tremendous.
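
The final check in diverse double-compiling can be sketched in Go as a bit-for-bit comparison of two compiler binaries: one rebuilt from source via an independent, trusted compiler, and the one under test. This is a minimal sketch that assumes builds are reproducible. The function name `sameBinary` and the placeholder byte slices are hypothetical, and in practice the inputs would be the contents of real compiler executables.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// sameBinary reports whether two compiler binaries are bit-for-bit identical
// by comparing their SHA-256 digests.
func sameBinary(a, b []byte) bool {
	return sha256.Sum256(a) == sha256.Sum256(b)
}

func main() {
	// Placeholder contents (assumption): real use would read the two
	// executables from disk instead of using string literals.
	viaTrusted := []byte("compiler rebuilt from source via a trusted compiler")
	underTest := []byte("compiler rebuilt from source via a trusted compiler")
	tampered := []byte("compiler with something extra inserted")

	fmt.Println(sameBinary(viaTrusted, underTest)) // true
	fmt.Println(sameBinary(viaTrusted, tampered))  // false
}
```

A mismatch does not by itself prove an attack, since non-deterministic build output would also cause one. A match is only as meaningful as the independence of the trusted compiler, which is the diversity concern raised in this comment.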

