
What I would give for the developers of the Go toolchain to have spent the last decade improving GCC or LLVM instead of their own bespoke toolchain.

In many ways Go seems like an excuse for Google to fund the continued development of Plan 9. Three of the five most influential people on the Go team (Ken Thompson, Rob Pike, and Russ Cox) were heavily involved in Plan 9. And it shows. Go's toolchain is a direct descendant of the Plan 9 toolchain; in fact, the Go language is really just an evolution of the special C dialect that Plan 9 used [1]. Indeed, for a while, the Go compiler was written in this special dialect of C, and so building Go required building a C compiler (!) that could compile this custom dialect of C, and using that to compile the Go compiler [2].

By all accounts, Plan 9 was an interesting research project, and it seems well loved by those familiar with it. (I'm not personally familiar; it was well before my time.) But it never took off. What we ended up with instead is Linux, macOS, and Windows.

Go very much wants to be Plan 9. Sure, it's not a full-fledged operating system. But it's a linker, assembler, compiler, binutils, and scheduler. All it asks of the host system is memory management, networking, and filesystem support, and it will happily replace your system's DNS resolution with a pure Go version if you ask it to [3]. I wouldn't be surprised if Go ships its own TCP/IP stack someday [4].
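To make the DNS point concrete: the net package lets a program opt into the pure-Go resolver directly. A minimal sketch (the `lookupWithGoResolver` helper name is mine; `net.Resolver` and its `PreferGo` field are the real API, and the same switch is available at run time via `GODEBUG=netdns=go`):

```go
package main

import (
	"context"
	"fmt"
	"net"
)

// lookupWithGoResolver resolves host using Go's built-in DNS client
// (net.Resolver with PreferGo set), bypassing the system's resolver
// instead of calling into libc via cgo.
func lookupWithGoResolver(host string) ([]string, error) {
	r := &net.Resolver{PreferGo: true}
	return r.LookupHost(context.Background(), host)
}

func main() {
	addrs, err := lookupWithGoResolver("localhost")
	fmt.Println(addrs, err)
}
```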

This is, in my opinion, craziness. What other language ships its own assembler?! [5] To make matters worse, the assembly syntax is largely undocumented, and what is documented are the strange, unnecessary quirks, like

> Instructions, registers, and assembler directives are always in UPPER CASE to remind you that assembly programming is a fraught endeavor. (Exception: the g register renaming on ARM.)

> In the general case, the frame size is followed by an argument size, separated by a minus sign. (It's not a subtraction, just idiosyncratic syntax.)

> In Go object files and binaries, the full name of a symbol is the package path followed by a period and the symbol name: fmt.Printf or math/rand.Int. Because the assembler's parser treats period and slash as punctuation, those strings cannot be used directly as identifier names. Instead, the assembler allows the middle dot character U+00B7 and the division slash U+2215 in identifiers and rewrites them to plain period and slash.
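To make the quoted rules concrete, here is roughly what a tiny function looks like in Go's assembler (amd64, pre-register-ABI calling convention; illustrative only, assuming a hypothetical package-level declaration `func Add(a, b int64) int64`):

```asm
#include "textflag.h"

// func Add(a, b int64) int64
// The middle dot · (U+00B7) is rewritten to the period in the full
// symbol name; $0-24 means a 0-byte stack frame and 24 bytes of
// arguments (two int64 parameters plus one int64 result).
TEXT ·Add(SB), NOSPLIT, $0-24
	MOVQ a+0(FP), AX    // load first argument
	ADDQ b+8(FP), AX    // add second argument
	MOVQ AX, ret+16(FP) // store result
	RET
```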

The excuse for the custom toolchain has always been twofold, that a) LLVM is too slow, and fast compiles are one of Go's main features, and b) that the core team was too unfamiliar with GCC/LLVM, at least in the early days, and attempting to build Go on top of LLVM would have slowed the speed of innovation to a degree that Go might not exist [6].

I've always been skeptical of argument (b). After all, one of Go's creators literally won a Turing award, as this document not-so-subtly mentions. I'm quite sure they could have figured out how to build an LLVM frontend, given the desire. Rust, for example, is quite a bit more complicated than Go, and Mozilla's developers have had no trouble integrating with LLVM. I suspect the real reason was that hacking on the Plan 9 toolchain was more fun and more familiar—which is a very valid personal reason to work on something! But it doesn't mean it was the right strategic decision.

I will say that (a) is valid. I recently switched from writing Go to writing Rust, and I miss the compile times of Go desperately.

That said—and this is what I can't get past—the state of compilers would be much better off if the folks on the Go team had invested more in improving the compile and link times of LLVM or GCC. Every improvement to lld wouldn't just speed up compiles for Go; it would speed up compiles for C, C++, Swift, Rust, Fortran, Kotlin, and anything else with an LLVM frontend.

In the last year or so, the gollvm project [7] (which is exactly what you'd expect: a Go frontend for LLVM) has seen some very active development, and I'm following along excitedly. Unfortunately I still can't quite tell whether it's Than McIntosh's 20% time project or an actual staffed Google project, albeit a small-time one. (There are really only two committers, Than and Cherry Zhang.) There are so many optimizations that will likely never be added to gc, like a register-based calling convention [8] and autovectorization, that you essentially get for "free" (i.e., with a bit of plumbing from the frontend) with a mature toolchain like LLVM.

There are not many folks who have the knowledge and expertise to work on compilers and linkers these days, and those that do can command high salaries. Google is in the luxurious position of being able to afford many dozens of these people. I just wish that someone with the power to effect change at Google would realize that the priorities are backwards. gccgo/gollvm are where the primary investment should be occurring, and the gc toolchain should be a side project that makes debug builds fast... not the production compiler, where the quality of the object code is the primary objective.

[0]: https://dave.cheney.net/2013/10/15/how-does-the-go-build-com...

[1]: http://doc.cat-v.org/plan_9/programming/c_programming_in_pla...

[2]: https://docs.google.com/document/d/1P3BLR31VA8cvLJLfMibSuTdw...

[3]: https://golang.org/pkg/net/

[4]: https://github.com/google/netstack

[5]: https://golang.org/doc/asm

[6]: https://golang.org/doc/faq#What_compiler_technology_is_used_...

[7]: https://go.googlesource.com/gollvm/

[8]: https://github.com/golang/go/issues/18597




> the state of compilers would be much better off if the folks on the Go team had invested more in improving the compile and link times of LLVM or GCC

It never would have happened. How do you motivate people whose principal frustration is the state of C++ to work on a large C++ codebase?

Heterogeneity is a huge benefit to any ecosystem. Improving existing things is great, but building new things is also very important. Go would simply not exist today if it were built on LLVM or GCC.


Agreed. This argument comes up a lot, especially in open source: that resources spent on one project would have been better spent on some other project. Developer focus is not a fungible commodity. If the Golang devs hadn't been developing the Go compiler, they wouldn't necessarily have spent their efforts on LLVM; they'd just as likely have been working on some other way to make Plan 9 come about.


Let me refine my point a bit. Apologies; my original post had gotten a bit long.

I agree that enthusiasm is important! And indeed, for the Go creators, their particular leanings might have been such that they couldn't get excited about building an LLVM/GCC frontend, and adapting the Plan 9 toolchain may literally have been the only way those three could have gotten Go off the ground. As a member of the Go team, you'd certainly know better than I.

But Go is long past a personal passion project. Go is over ten years old. Go likely has over a million developers [0]. Go 1.0 has been stable for about seven years, and the first meaningful changes to the language are just now being talked about. In my opinion, it is several years past due for Google to start investing seriously in a Go toolchain based on a mature compiler stack.

I realize the audacity of this claim and I don't make it lightly. But if I had the money to spend on a team of developers, I would spend it making llvm-as and lld fast enough and stable enough to be Go's assembler and linker, and abandon the custom Plan 9 ones.

> It never would have happened. How do you motivate people whose principal frustration is the state of C++ to work on a large C++ codebase?

Well, for one, once the language gets off the ground, you can write the frontend in the new language. Rustc manages to be almost entirely Rust, for example.

> Heterogeneity is a huge benefit to any ecosystem. Improving existing things is great, but building new things is also very important.

I agree, and I think Go is an interesting contribution to the P/L landscape—essentially it proved that stripping away a good deal of complexity (generics, inheritance, etc.) results in a very useful, highly productive language. But I don't think Go's custom assembler and linker are meaningfully contributing to the ecosystem. They're useful presently in that they improve Go developers' productivity with ultra-fast builds, but they're not suitable for use by anything but Go. Improvements to Go's linker and assembler benefit only Go. Improvements to lld or gold can benefit practically everyone using a compiled language.

[0]: https://research.swtch.com/gophercount


> it is several years past due for Google to start investing seriously in a Go toolchain based on a mature compiler stack.

That's a bit presumptuous, prescribing specific implementation details based on the fact that it's "several years past due" that they replace their tech stack with one you would like to see improved.

Remember there already is a mature C compiler alternative for Go: gccgo. There's also already a first-party LLVM based Go toolchain created by Google (gollvm). Whatever hypothetical benefits you might presume would emerge from this kind of synergy already exist. But the community mostly isn't interested.

Also, Google already invests a massive amount of resources into LLVM. In fact, the principal author of LLVM and Clang works at Google. But even when he was at Apple, they were already shoveling resources into the project.


> Remember there already is a mature C compiler alternative for Go: gccgo. There's also already a first-party LLVM based Go toolchain created by Google (gollvm). Whatever hypothetical benefits you might presume would emerge from this kind of synergy already exist. But the community mostly isn't interested.

I suspect the community is uninterested because it’s hard to be interested in compilers that a) compile slower than gc, and b) produce slower code than gc. That's a worse compiler on all fronts!

For gccgo or gollvm to be useful, they need to provide some benefit. I suspect we'll see (b) fixed within a year. GCC/LLVM have far more optimizations than gc, and so it's mostly a matter of plumbing enough information from the Go frontend into the LLVM optimizer to unleash its full power.

I don't expect we'll see (a) fixed, unless something changes at Google.

> Also, Google already invests a massive amount of resources into LLVM. In fact, the principal author of LLVM and Clang works at Google. But even when he was at Apple, they were already shoveling resources into the project.

Yes, but Chris Lattner is not actively working on speeding up LLVM. His big project lately has been supporting TensorFlow via MLIR.


You agree, and then go completely sideways again by unreasonably expecting the Go team to contribute to a stack they neither claim to be experts on nor particularly like.

> Improvements to lld or gold can benefit practically everyone using a compiled language.

I do not see how a generic framework could be made as fast as a custom-built stack.


> You agree, and then go completely sideways again by unreasonably expecting the Go team to contribute to a stack they neither claim to be experts on nor particularly like.

Sorry, but I think you’re still missing my point. It’s well documented that the original creators of Go did not want to build their language on top of LLVM, and did not like C++. That’s totally fine! Of course you have to work on things you like.

But Go is so stable, and has been for more than half a decade. That means Google could spin up a brand new team, entirely separate if they must, of folks who like both LLVM and Go, to build out an LLVM frontend for Go. It’s not like Go is a fast-moving target. There is a very stable spec.

Gccgo is proof that it is possible to have a GCC frontend for Go without too much work. Ian Taylor has been maintaining gccgo for about ten years, working what seems to be half-time. Imagine what could be done if a few more folks were actively working on improving gccgo, rather than just keeping parity with gc.

> I do not see how a generic framework could be made as fast as a custom-built stack.

I don’t see why not. It’s mostly a matter of introducing ways of disabling the expensive bits when compiling/linking a simpler language like Go. Yes, it’s probably somewhat slower to develop, but everyone benefits from the work, and you also offload much of the burden of maintaining a compiler/linker for a half dozen different platforms to the LLVM team collectively.
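As one illustration of the kind of knob I mean, generic toolchains already let you trade output quality for speed at the command line (an illustrative invocation; exact flag support varies by toolchain version, and `main.o`/`app` are hypothetical names):

```shell
# Link with lld instead of the default system linker, and pass the
# linker -O0 to skip optional output optimizations. Note: -Wl,-O0 is
# a *linker* flag, unrelated to the compiler's -O0.
clang -fuse-ld=lld -Wl,-O0 -o app main.o
```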


100% agree -- enthusiasm is not fungible! Sometimes starting over is faster than incremental improvement.


> It never would have happened. How do you motivate people whose principal frustration is the state of C++ to work on a large C++ codebase?

Yeah, the creators of Go came not to praise C++ but to bury it. And to kill C++, you first need to knock out GCC and LLVM.


Given that Apple is the major sponsor of LLVM, and Apple is not exactly cash poor, I think it's reasonable to conclude that being able to throw money and people at LLVM development doesn't explain why it's still much slower than the Go compiler toolchain. In hindsight, the Go team made the right call to use their own toolchain.

As enneff alludes, Ken Thompson's antipathy towards C++ is well documented: https://bryanpendleton.blogspot.com/2009/12/coders-at-work-k...


> being able to throw money and people at LLVM development doesn't explain why it's still much slower than the Go compiler toolchain.

Right, you need to specifically throw money and people at the problem of making LLVM faster, not just at LLVM in general. Neither Swift nor Objective-C have "fast compiles" as part of their pitch. Much of the work on LLVM goes into producing the highest quality object code possible, which is a goal often at odds with compiling quickly, and part of the reason the choose-your-optimization-level flag (-O) exists, though -O0 compiles are still not fast enough.

> In hindsight the Go team made the right call to use their own toolchain.

No, we don't have the benefit of hindsight yet. We don't know what could have been if the resources that had been spent on the Go toolchain had been spent on LLVM instead.

If five highly-qualified engineers spent five years trying to speed up gollvm compiles without success, we'd have strong evidence that something about LLVM prohibits the fast compiles that are possible with the gc toolchain. But that's not the situation.


Maybe you should trust the judgement of those highly-qualified engineers, then, when they decided to go with their own toolchain, and not just assume that they did it because of Plan 9 or some other vacuous reason.


The same people who created a language full of poor design decisions, when superior solutions existed in other languages they could simply have used?


What you vaguely label as poor design decisions are actually trade-offs and they make sense to a plethora of highly educated engineers.


The people who made golang are not language designers, and/or did not research established options in other languages, by their own admission.


Ken Thompson wrote B, the direct predecessor of C. He certainly was, and is, a language designer and implementer.

https://en.wikipedia.org/wiki/B_(programming_language)


He falls under the "and/or did not research established options in other languages, by their own admission" part of my previous statement. Not to mention that neither B nor C encompasses the lessons learned since their design.


>The people who made golang are not language designers,

I bet they've designed and executed on more programming languages than you have :)


That's not an argument.


Just like when C came around, and everyone else was doing safe systems programming in Algol dialects, PL/I, Fortran extensions, BLISS, and Mesa.

I see a trend here.


Imagine what Ken Thompson could have accomplished if he hadn't made all those poor design decisions!

And yet hundreds of thousands of working programmers around the world are productively using Go while still continuing to ignore the supposedly superior solutions.


Mass adoption doesn't necessarily mean something is superior. For all the software that's written in Go, there's plenty more written in C++


Imagine how famous he would be if Bell Labs had been allowed to sell UNIX instead of giving its source code away for a symbolic price.

Hundreds of developers used Basic, Pascal, C, Modula-2, Assembly, Forth productively.

Maybe we should have kept using them, instead of coming up with programming languages that require a PhD. /s


Appeal to authority fallacy. People still use C, what's your point? There are superior options, but people are either (1) forced to use something inferior, or (2) don't know any better (especially if they drank the kool aid).


> people are either (1) forced to use something inferior, or (2) don't know any better

Here's another fallacy for you: false dilemma.


I agree with you and am nitpicking only for my own knowledge: isn’t it “false dichotomy”?


If memory serves, AMD switched to a different compiler to reduce shader compilation times, so reducing LLVM compilation time doesn't seem easy.


> If five highly-qualified engineers spent five years trying to speed up gollvm compiles without success, we'd have strong evidence that something about LLVM prohibits the fast compiles that are possible with the gc toolchain. But that's not the situation.

At some point people are going to need to put the money where their mouth is. Or they can live with the software which open source developers developed on their own or employers' dime.


I spent six weeks, unpaid, at the Recurse Center this year hacking on gccgo to see if I could improve the situation. I didn't get very far—it's hard! And I don't have much experience with compilers. So, truly, I did as much as my finances could reasonably support. (I did manage to write up a blog post about where specifically gccgo needs the most work [0].)

I did learn enough in those six weeks to feel comfortable asserting that there is incredible potential in gccgo/gollvm, and that I think Google is making a mistake by continuing to invest so heavily in the gc toolchain.

[0]: http://meltware.com/2019/01/16/gccgo-benchmarks-2019.html


If you're going to rely on personal experience in an argument, "in six weeks I learned enough to criticize many engineers with decades of experience" is not a strong argument. That's not even claiming to be a 10x engineer, that's claiming to be a 1000x engineer.

I don't intend this to sound harsh, apologies if it does. I strongly recommend you introspect on your own confidence and experience levels. I've mentored a lot of junior engineers with similar viewpoints and often there's a lot of hidden complexity they simply lack the experience to understand.

This isn't to say you should take everything anyone with more experience says at blind faith, simply that dismissing without understanding is counter to a growth mentality.


I invoked the personal experience only as evidence that I did what (little) I could to push along the project that I think has the most long term potential. I'm not criticizing without at least trying to do what I can.

In short, the reasons gccgo produces worse code than gc are the result of fairly basic optimizations that are/were missing. For example, open-coding string slicing [0] and string equality [1] is enough to close the gap in the TimeParse benchmark in the blog post above. In fact, with those changes, gccgo actually produces better code than gc, because the full power of GCC can be unleashed on the IR, and GCC has more aggressive optimizations than gc.
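To illustrate what "open-coding" means here: instead of emitting a call to a runtime helper for every string comparison, the compiler inlines the length check and byte comparison directly, which then exposes them to further optimization. A sketch of the equivalent logic in plain Go (`stringsEqual` is my illustrative name, not a real runtime function, and the real compiler also short-circuits on identical backing pointers):

```go
package main

import "fmt"

// stringsEqual is the open-coded equivalent of a == b for strings:
// compare lengths first, then compare bytes. A compiler that
// open-codes the comparison emits this logic inline instead of a
// function call, so e.g. the length check can fold away when one
// operand is a constant.
func stringsEqual(a, b string) bool {
	if len(a) != len(b) {
		return false
	}
	for i := 0; i < len(a); i++ {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(stringsEqual("gc", "gc"), stringsEqual("gc", "gccgo"))
	// prints: true false
}
```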

> This isn't to say you should take everything anyone with more experience says at blind faith, simply that dismissing without understanding is counter to a growth mentality.

Quite honestly, I think you're more guilty of dismissing without understanding than I am. It's true that I haven't been programming for decades, but I've spent two years dealing with the Go toolchain's shortcomings on a large Go project, and six weeks specifically on a passion project to improve gccgo. Of course it's possible that I'm wrong! But I've spent a while thinking about it, and often just as relevant as experience is a fresh, outside perspective. It's easy for a team to get stuck in tunnel vision or groupthink.

[0]: https://github.com/golang/gofrontend/commit/62e3a8cc0a862b0a...

[1]: https://github.com/golang/gofrontend/commit/89b442a0100286ee...


> Right, you need to specifically throw money and people at the problem of making LLVM faster

I repeat, Apple is the major sponsor of LLVM and is sitting on ~$250 billion in cash. If LLVM could be made significantly faster by simply throwing money at it, presumably this has also occurred to Apple management. I can't believe they're not doing it because it doesn't align with their marketing pitch.


Apple embraced LLVM because Apple's lawyers are scared of version 3 of the GPL and consequently of GCC, and they need something to compile Objective C.

That is different from a belief that making LLVM better is worth management's attention or a slice of Apple's budget once it proved a satisfactory replacement for GCC.


> it's still much slower than the Go compiler toolchain

The kinds of optimizations LLVM does are way beyond anything golang does. Golang doesn't even pass function parameters in registers, let alone perform the advanced optimization techniques LLVM and GCC do.


I think the parent is talking about compilation speed not runtime speed.


Optimizations affect compilation speed. golang barely does any optimizations, so it compiles quicker.


While Go has a lot of clear influence from Plan 9, I don't see the "conspiracy theory" here at all.

As far as I'm aware, one major reason why Go reinvented so much was to save time and effort. They did whatever they could in order to bootstrap fast and efficiently, and they did so by cannibalizing Plan 9's toolchain, including the compiler and assembly language. The original "gc" toolchain (with inscrutable binary names like "6g" and "6a") was written in C and came directly from Inferno. You can browse the original commit here [1]. That stuff has all been rewritten in Go.

A key attribute I and others have noticed about highly productive developers is that they tend to build an effective toolchain around themselves and bring it with them for new projects. Sometimes that stuff can become legacy baggage, but there's no denying that it's a good strategy.

Perhaps LLVM or GCC would also have been a good strategy. There are some arguments to the contrary. 11 years ago, when Go was started, LLVM wasn't nearly as mature as today. But look at the hurdles other projects like Rust have had to get over with LLVM. And a large part of Rust's compilation speed is apparently due to LLVM. So LLVM is not a magic bullet. Migrating Go today to LLVM would of course be a big, time-consuming zero-velocity project; you'd want to be really certain that the payoff would be worth the effort.

GCC is not an easy project to deal with, either. For decades, its internal intermediate representation was undocumented and intentionally obfuscated [2] to ensure FSF/GNU control over backends.

I do agree that Go has a certain bias towards a particular, idiosyncratic way of doing things, which is not always a positive.

[1] https://github.com/golang/go/commit/0cafb9ea3d

[2] http://lambda-the-ultimate.org/node/715


Thanks for the thoughtful response.

I don't mean to imply that there's a conspiracy here. What I mean is that I think the Plan 9 heritage is clouding strategic decisions around the Go toolchain. What may have been the right decision to get a new language off the ground is not necessarily the right decision once the language is widely popular and stable.

> Migrating Go today to LLVM would of course be a big, time-consuming zero-velocity project; you'd want to be really certain that the payoff would be worth the effort.

A big project, yes, but it's happening! [3] If a few engineers working on gollvm for a year or two could improve it to the point that the average Go program ran 20% faster, I'd think that would absolutely be worth it to Google.

> GCC is not an easy project to deal with, either. For decades, its internal intermediate representation was undocumented and intentionally obfuscated to ensure FSF/GNU control over backends.

Very true in general, but the Go team has been lucky in that GCC core maintainer Ian Taylor has been a member since the early days. Gccgo has been a spec-conformant Go compiler since 2012 [2], and so nearly all of the GCC-integration bits have been in place for seven years. As a result, it's far more a matter of staffing the project so that the Go frontend's inliner, garbage collector, and escape analysis can reach parity with gc, rather than dealing with GCC/FSF politics.

> You can browse the original commit here. That stuff has all been rewritten in Go.

Yeah, I'm familiar with the lineage. I don't know that I'd say that it's been rewritten in Go, though, as the toolchain and runtime were converted fairly automatically with a "c2go" tool that Russ Cox wrote [0] around the Go 1.5 release. Have you looked at the resulting code much? Some of it has been rewritten to be idiomatic Go, but a lot of it is still very clearly C code that has been automatically translated [1]. See also the fact that go/src/runtime/runtime.go, go/src/runtime/runtime1.go, and go/src/runtime/runtime2.go all exist—a consequence of the fact that runtime.go, runtime.c, and runtime.h all existed in Go 1.4 [2].

[0]: https://github.com/rsc/c2go

[1]: https://github.com/golang/go/blob/7b294cdd8df0a9523010f6ffc8...

[2]: https://blog.golang.org/gccgo-in-gcc-471

[3]: https://go.googlesource.com/gollvm/+log


I see lots of false assertions here. Here's a significant one:

> There are so many optimizations that will likely never be added to gc, like a register-based calling convention.

The calling convention was marked as undefined as a first step toward changing it [1]. The change was introduced by the very author of this post.

Also, more aggressive inlining mitigates the slow calling convention.

More generally, I think it is a huge achievement for a PL to be self-hosting. It makes its development easier, as there is only one language to know deeply (plus assembly, of course).

[1] https://github.com/golang/go/issues/27539


> Indeed, for a while, the Go compiler was written in this special dialect of C, and so building Go required building a C compiler (!) that could compile this custom dialect of C, and using that to compile the Go compiler [2].

I don't think that's true. The build process used GCC to compile the Go and C compilers. The Plan-9-derived C compiler was used for compiling those parts of the runtime that were written in C back then and were supposed to follow the conventions of Go program code.

As you can tell from Inferno (or even Plan 9 from userspace), GCC can compile the Plan 9 C dialect, given the right options.


This is a minor point, but:

> What other language ships its own assembler?!

The majority of native code compilers I have seen include their own assembler. Many of them don't have a textual input format, but they are assemblers nonetheless.


Go is supposed to be a concurrent, type-safe, memory-safe, systems programming language.

This goal is entirely at odds with an unsafe, legacy-ridden C/C++ toolchain.



