Gollvm from Google (googlesource.com)
142 points by aleksi on May 27, 2017 | 48 comments

It looks like a Google employee worked on this in his own time for several months (https://github.com/thanm/dragongo), and now the project has been "adopted" by the Google/Go team.

By "adopted" I mean that:

* the code was moved to the same git hosting infrastructure that hosts the official Go compiler and libraries owned by the Go team

* the license was changed from Apache to the same BSD license as the Go compiler

* another Google employee is contributing to the code

* the initial checkin was made by Russ Cox, who is pretty much the lead of the Go project

All of that implies that this has the blessing of the Go team.

I am somewhat confused by this. The LLVM project already has an official Go frontend that lowers to LLVM IR:

* https://llvm.org/svn/llvm-project/llgo/trunk/README.TXT


> You attempted to reach llvm.org, but the server presented a certificate signed using a weak signature algorithm (such as SHA-1). This means that the security credentials the server presented could have been forged, and the server may not be the server you expected (you may be communicating with an attacker).

Does Google think they can do a better job?

I don't really know, but I would be surprised if that is the intent of the project because:

  1. The maintainer of llgo is a Google employee (and a very talented LLVM engineer).

  2. I don't imagine there would be tremendous differences in the strategies used to generate LLVM IR between the tools.  ISTM, if there are deficiencies in llgo, then it would be better to fix them rather than creating a whole new tool.

If you take advantage of the two-hour edit window and remove the spaces in front of 1 and 2, it will be easier to read, especially on mobile:

1. The maintainer of llgo is a Google employee (and a very talented LLVM engineer).

2. I don't imagine there would be tremendous differences in the strategies used to generate LLVM IR between the tools. ISTM, if there are deficiencies in llgo, then it would be better to fix them rather than creating a whole new tool.

So question for llvm devs here: what benefit does this provide? How does it enhance go? Does it make go programs compile into more efficient binaries targeted to specific cpu architectures?

A big problem an intermediate representation solves is that you don't need one big monolithic compiler covering both front-end language parsing and back-end instruction generation. It's called the MxN problem: instead of MxN combinations of languages and architectures in a monolithic compiler, you get M+N components, where the M front ends handle language parsing etc. and the N back ends handle the conversion from the single intermediate language to the target architecture's instructions.
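The M+N idea above can be sketched in a few lines of Go. This is only a toy: the `IR` type, the language names (`lang-a`, `lang-b`) and the target names (`cpu-x`, `cpu-y`, `cpu-z`) are all made up for illustration and have nothing to do with LLVM IR or Go's own intermediate form.

```go
package main

import (
	"fmt"
	"strings"
)

// IR is a toy intermediate representation: a flat list of instructions.
// It is a stand-in for illustration only, not LLVM IR or Go's SSA form.
type IR []string

// Frontend turns source text in one language into the shared IR.
type Frontend func(src string) IR

// Backend turns the shared IR into "assembly" text for one target.
type Backend func(ir IR) string

// Hypothetical front ends: each one only knows its own source language.
var frontends = map[string]Frontend{
	"lang-a": func(src string) IR { return IR{"push " + src, "ret"} },
	"lang-b": func(src string) IR { return IR{"push " + strings.TrimSuffix(src, ";"), "ret"} },
}

// Hypothetical back ends: each one only knows its own target.
var backends = map[string]Backend{
	"cpu-x": func(ir IR) string { return "x: " + strings.Join(ir, "; ") },
	"cpu-y": func(ir IR) string { return "y: " + strings.Join(ir, " | ") },
	"cpu-z": func(ir IR) string { return "z: " + strings.ToUpper(strings.Join(ir, "; ")) },
}

// Compile pairs any front end with any back end through the shared IR.
func Compile(lang, target, src string) (string, error) {
	fe, ok := frontends[lang]
	if !ok {
		return "", fmt.Errorf("unknown language %q", lang)
	}
	be, ok := backends[target]
	if !ok {
		return "", fmt.Errorf("unknown target %q", target)
	}
	return be(fe(src)), nil
}

func main() {
	m, n := len(frontends), len(backends)
	// M+N components are written, yet M*N language/target pairs work.
	fmt.Printf("components written: %d, pipelines available: %d\n", m+n, m*n)

	out, err := Compile("lang-b", "cpu-z", "1;")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Adding a fourth target here means writing one more `Backend`, and every existing front end can use it immediately. That is the whole argument in miniature.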

Go already has its own intermediate representation to solve this problem, so this project must solve some other problem.

Better optimizations? AFAIR, the go compiler does some optimizations, but it does not bend over backwards, exactly. Maybe hooking up to llvm can help with that, if it is a goal. (Mmmh, does LLVM optimize at all, or does it just provide a framework for people trying to build optimizing compilers? I don't really know.)

Also, llvm by now reaches far more platforms than the current go compiler. I think that this is the most likely explanation.

But gccgo already exists, and provides an optimizing compiler backend and extended platform support. And the tools in LLVM's ecosystem seem largely useless to Go (no need for asan et al when you have a GC, and no need for tsan when Go already has an optional race detector). Other than the compiler itself having a permissive license, I see no urgent impetus for an LLVM backend for Go, which likely explains why it's taking so long coming.

> But gccgo already exists, and provides an optimizing compiler backend and extended platform support.

We (SUSE / openSUSE) had an incredible amount of issues with gccgo. The main problem was that the runtime wasn't updated often enough, they had some odd patches that broke the runtime, and you generally had to update the compiler to update the Go version (quite difficult in enterprise distributions).

Wouldn't you have those same problems with an LLVM-based backend? LLVM moves very quickly, and AFAICT upgrades are non-trivial.

The problem wasn't the code generation, it's that it was maintained in a way that made it very difficult to update in distributions. They also broke the stdlib in a few versions. But even if you ignore all of that, if it was supported by upstream then we would have more faith in it than using a franken-compiler. ;)

LLVM has better support for moving GC, due to Azul's efforts. That's a big difference.

Doesn't gccgo reuse the standard Go runtime, and wouldn't integrating LLVM's GC support require pretty much writing a brand-new runtime?

LLVM doesn't come with a garbage collector; it comes with hooks you can plug your GC into. So it'd require a good bit of work to integrate with LLVM's GC support, but nowhere near rewriting the whole runtime.

Go having its own intermediate representation "solves" the problem but requires go-specific backends/lowering mechanisms. LLVM is arguably a more generic place for backends to live (though in practice it does serve clang's needs best).

If you design a new processor, you generally take it upon yourself to do the work necessary for folks to use C compilers that target your processors. I'd argue that they're more likely to contribute a backend implementation to LLVM than golang.

I don't disagree, but that's a different problem than the one discussed above. :)

But go ir != llvm ir. The idea is that when you adopt the llvm ir and toolchain, you can leave the optimization passes and backends to LLVM and simply maintain a frontend.

I don't disagree, but this is a different problem than the one cited above.

well, it is - because the existing architectural backends for llvm ir won't work with go's ir.

The cited problem was abstracting over different architectures. Go's IR solves this problem already. Solving this problem has nothing to do with making LLVM IR and Go's IR interoperable. A related problem might be that Go's IR doesn't target as many platforms as LLVM's, but that's still a different problem.

Hopefully following the system ABI on AMD64 instead of passing arguments and return values on the stack, though I guess it's to be seen what happens with multiple return values.

"At the moment llvm-goparse is not capable of building the Go libraries + runtime (libgo), which makes it difficult/unwieldy to use for running actual Go programs. As an interim workaround, I've written a shim/wrapper script that allows you to use llvm-goparse in combination with an existing GCCGO installation, using gccgo for the runtime/libraries and the linking step, but llvm-goparse for any compilation."

Not sure I get it: so Gollvm (=llvm-goparse?) can be used as compiler, but not as a linker (yet?), so gccgo's linker can be used? Also, Go runtime and standard libraries can't be compiled with Gollvm? If standard libs can't be compiled, then how can I know if my app can be compiled?

I know a project named "llvm-go" was started quite long ago, and had somewhat slow (compared to gccgo) progress because of few, non-Google contributors (probably hobbyists); is this the same work? is it just still in progress, but somewhat more (how much?) advanced now?

Some additional googling shows a similarly named project (https://github.com/go-llvm/llvm), which from its readme seems absorbed by LLVM proper, and is subtitled "LLVM bindings for [Go]" (http://llvm.org/svn/llvm-project/llvm/trunk/bindings/go/READ...). Is this the same project, or something more, or something else? [EDIT:] Ok, based on the CONTRIBUTORS file, it's a totally different project, at least one question cleared. (https://go.googlesource.com/gollvm/+/master/CONTRIBUTORS)

go-llvm/llvm appears to be a set of Go bindings to LLVM, i.e., the ability to access libllvm (embed LLVM and use it programmatically) from a normal Go program. That is, you're using Go to drive LLVM, not LLVM to compile Go.

See the factorial example, where they build up some LLVM IR (in memory) for computing factorials: https://github.com/go-llvm/llvm/blob/master/examples/factori...

I'd guess the advantage of this is using LLVM's JIT at runtime.

> Is this the same project, or something more, or something else?

Compare the source code:



I think you're thinking of llgo, not go-llvm.

> define hidden i64 @foo.bar() {
> entry:
>   %"$ret0" = alloca i64
>   store i64 0, i64* %"$ret0"
>   store i64 1, i64* %"$ret0"
>   %"$ret0.ld.0" = load i64, i64* %"$ret0"
>   ret i64 %"$ret0.ld.0"
> }

Can someone knowledgeable about Go explain what's happening here? Why does it store a 0 and then a 1 in "$ret0"? Why does it allocate a single integer on the stack? (Is it because this is just intermediate code for a virtual machine?) And all of that just to return a 1.

Why zero first, that's already been explained. Why alloca from the stack: this is because there is a pass in LLVM which converts alloca locations into SSA form[1], and SSA is what many of the other optimizations in LLVM are built around. The rules around SSA mean that variables are renamed rather than modified (every assignment is final, except for phi nodes where control flow merges), so producing SSA output from an imperative language is a bit more difficult. Since LLVM comes with something that converts mutable stack allocations into SSA, it's easier to just leverage it.

The pass is called mem2reg:


[1] https://en.wikipedia.org/wiki/Static_single_assignment_form
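To give a feel for what promoting stack slots means, here is a toy Go sketch that forwards stored values to loads within a single straight-line block. It is not LLVM's implementation and the `Instr` type is made up for illustration; the real mem2reg pass also handles branches by inserting phi nodes, which this sketch ignores entirely.

```go
package main

import "fmt"

// Instr is a toy instruction in one basic block. This is a made-up
// representation for illustration, not LLVM's data structures.
type Instr struct {
	Op   string // "alloca", "store", "load", or anything else (e.g. "ret")
	Slot string // stack slot used by alloca/store/load
	Dst  string // result name defined by a load
	Val  string // value operand of store/ret
}

// Promote removes stack slots from straight-line code by forwarding the
// most recently stored value to each load. There is no control flow here,
// so no phi nodes are ever needed.
func Promote(in []Instr) []Instr {
	current := map[string]string{} // slot -> last value stored into it
	alias := map[string]string{}   // load result -> value it stands for

	// resolve replaces a load result by the value it was forwarded from.
	resolve := func(v string) string {
		if r, ok := alias[v]; ok {
			return r
		}
		return v
	}

	var out []Instr
	for _, i := range in {
		switch i.Op {
		case "alloca":
			// The slot itself disappears once all uses are forwarded.
		case "store":
			current[i.Slot] = resolve(i.Val)
		case "load":
			alias[i.Dst] = current[i.Slot]
		default:
			i.Val = resolve(i.Val)
			out = append(out, i)
		}
	}
	return out
}

func main() {
	// The function body from the thread, transcribed into the toy form.
	body := []Instr{
		{Op: "alloca", Slot: "$ret0"},
		{Op: "store", Slot: "$ret0", Val: "0"},
		{Op: "store", Slot: "$ret0", Val: "1"},
		{Op: "load", Slot: "$ret0", Dst: "$ret0.ld.0"},
		{Op: "ret", Val: "$ret0.ld.0"},
	}
	for _, i := range Promote(body) {
		fmt.Println(i.Op, i.Val)
	}
}
```

On the snippet from the thread, the alloca, both stores and the load all fall away and only a `ret` of the constant 1 is left, which is roughly what one would hope an optimized build boils the function down to.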

> Why does it store a 0

Go language semantics define that variables receive a zero initialization.[0]

> and then a 1 in "$ret0"?

The 1 is because the code explicitly stores a 1 :-).

Clearly the output code has not been through an optimization pass.

[0]: https://gobyexample.com/variables
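A small Go sketch of the zero-initialization rule. The thread never shows the Go source behind the IR, so `bar` below is only a guess based on the symbol name `@foo.bar`. The relevant detail is that the Go spec initializes all result values to their zero values on function entry, whether or not the result is named, which would explain a store of 0 before the store of 1.

```go
package main

import "fmt"

// bar is a guess at the function from the thread: it simply returns 1.
// The result has no name in the source, but it is still a result
// parameter, and results start out as the zero value of their type.
func bar() int64 {
	return 1
}

// named makes the zero initialization observable: the result n is never
// assigned, so the bare return yields its zero value.
func named() (n int64) {
	return
}

func main() {
	// Ordinary variables without an initializer get zero values too.
	var i int64
	var s string
	var p *int
	fmt.Println(i, s == "", p == nil)

	fmt.Println(bar(), named())
}
```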

but if you look at the code, there isn't a "variable" to be initialized - the function is supposed to return a constant/literal one.

This means we could write numerical code in Go and get autovectorized assembly and use GPUs via the nvptx backend. Neat!

Go is a GC language. Only a strict subset would be amenable to running on a GPU.

There's no reason GC couldn't work on GPUs. It's just that no one has bothered implementing it yet.


Yeah, and a GPU could also just happen to give you the right answer in an O(1) hashtable lookup

If you noticed, I did say numerical algorithms, for most of which it's usually easy to allocate memory in advance.
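As a rough illustration of allocating in advance, here is a minimal Go sketch of an in-place vector update. The kernel name `axpy` and the sizes are arbitrary choices for the example, and nothing here says how Gollvm would actually compile or vectorize it. `testing.AllocsPerRun` from the standard library is used only to check that the loop itself does not allocate.

```go
package main

import (
	"fmt"
	"testing"
)

// axpy computes y[i] += a*x[i] in place. It writes into the caller's
// buffer and allocates nothing itself. It assumes len(y) >= len(x).
func axpy(a float64, x, y []float64) {
	for i := range x {
		y[i] += a * x[i]
	}
}

func main() {
	// All memory is allocated up front, before the hot loop runs.
	x := make([]float64, 1024)
	y := make([]float64, 1024)
	for i := range x {
		x[i] = float64(i)
		y[i] = 1
	}

	// Measure the average number of heap allocations per call.
	allocs := testing.AllocsPerRun(100, func() { axpy(2, x, y) })
	fmt.Println("allocations per call:", allocs)
}
```

A loop of this shape never touches the garbage collector once its buffers exist, which is the kind of code the comment above has in mind.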

I didn't mean anyone would want to do this but just that it's an interesting parallelism for a language.

I thought this project was very old and perhaps even abandoned? I wonder if I'm confusing it with a similar project, or perhaps this project was revived? Maybe I'm just mistaken...

> I wonder if I'm confusing it with a similar project

Yes, you're confusing it with this project: https://github.com/go-llvm/llvm

Aww.. they lost a chance to call it Gollum!

They did, it's just the latin spelling: GOLLVM.

Or Precious: the go llvm compiler.

Super great name!

Makes me think they came up with the name first and then decided to build it

I did wonder what virtual machine this GollVM would be...

Well, the Go low-level VM, of course!

I wonder if they will implement the scheduler/goroutines "properly" unlike gccgo with threads? That would be amazing.

That's an especially interesting possibility since LLVM now has IR support for C++ coroutines.
