TinyGo: New Go Compiler Based on LLVM (tinygo.org)
293 points by dcu on July 18, 2019 | 51 comments

> Garbage collection is currently only supported on ARM microcontrollers (Cortex-M). For this platform, a simple conservative mark-sweep collector has been implemented. Other platforms will just allocate memory without ever freeing it.

Conservative garbage collection has serious problems on 32-bit. Stuck pointers happen quite often. The usual culprit is floats, which thankfully don't usually appear in embedded code, but by no means is the problem limited to floats.

Oops, those docs need updating. The GC now works on all platforms (including RISC-V and WebAssembly!) except for AVR because the AVR backend is just too unreliable at the moment and you wouldn't want a GC in 2kB of memory anyway.

That's wonderful to hear, I just wish I had learned about it sooner. I kept looking back at TinyGo for WASM tasks, because it looked like the right combo of size footprint and runtime capabilities, then ultimately thinking "hmm but the GC issue."

I'll probably take a look later today!

GC works even in WASM? That's great! What size wasm targets does TinyGo generate compared to the regular Go compiler?

It produces much smaller binaries. My blog post on it: https://dev.to/sendilkumarn/tiny-go-to-webassembly-5168

In the (reasonably simple) stuff we've been doing so far, TinyGo-produced wasm files are 2-3% the size of mainline Go-produced wasm files, e.g.:

https://justinclift.github.io/tinygo_canvas2/ (~19.5kB)

https://justinclift.github.io/tinygo-wasm-rotating-cube/ (WebGL, ~12kB)

Can you elaborate? How are pointers to floats problematic? Or is it something else?

Conservative GC is pretty much: look at all reachable memory, whatever looks like a pointer to a range which is allocated keeps that range allocated.

Pros: you don't have to care about type information and precise stack/structure walking.

Cons: If you have range 0x8000-0x8010 allocated, and have a variable with integer value 0x8001 somewhere in the memory, it will keep that range allocated. It doesn't matter that it's an int, not a pointer. Floats and 32-bit pointers have quite a lot of accidental collisions that way.
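To make the 0x8000 example concrete, here is a minimal sketch of the "does this word look like a pointer?" test. The `block` type and the function names are made up for illustration; this is not TinyGo's collector, just the idea under the assumption that the heap is a list of address ranges.

```go
package main

import "fmt"

// block is a hypothetical allocated heap range [start, end).
type block struct {
	start, end uint32
	marked     bool
}

// looksLikePointer reports whether word falls inside b. A conservative
// collector has no type information, so this is the only test it can apply.
func looksLikePointer(word uint32, b block) bool {
	return word >= b.start && word < b.end
}

// conservativeMark scans every word in roots and marks any block that
// a word appears to point into, regardless of the word's real type.
func conservativeMark(roots []uint32, heap []block) {
	for _, w := range roots {
		for i := range heap {
			if looksLikePointer(w, heap[i]) {
				heap[i].marked = true
			}
		}
	}
}

func main() {
	heap := []block{
		{start: 0x8000, end: 0x8010},
		{start: 0x9000, end: 0x9010},
	}
	// 0x8001 is a plain integer here, not a pointer; the scan cannot tell.
	roots := []uint32{0x8001, 42}
	conservativeMark(roots, heap)
	for _, b := range heap {
		fmt.Printf("0x%04x-0x%04x kept=%v\n", b.start, b.end, b.marked)
	}
}
```

The first range stays alive purely because an unrelated integer happens to land inside it, which is the "stuck pointer" effect described above.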

Conservative GCs don't know which bytes in memory are actually pointers, so they treat every word in memory as being a pointer if it looks like one. This means if you have some other value that happens to look like a pointer — in this case a float — the GC will think it's pointing to some other memory and keep that memory around even though it isn't used.

Because 1.0 is 0x3f800000, which is often right in the middle of the heap.
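The bit pattern is easy to check with the standard library. A small sketch (the helper name `floatBits` is just for illustration):

```go
package main

import (
	"fmt"
	"math"
)

// floatBits returns the IEEE 754 single-precision bit pattern of f.
func floatBits(f float32) uint32 {
	return math.Float32bits(f)
}

func main() {
	// Common small floats all sit in the 0x3f000000-0x40000000 region,
	// which a 32-bit address space may well be using for heap memory.
	for _, f := range []float32{0.5, 1.0, 2.0} {
		fmt.Printf("%v -> 0x%08x\n", f, floatBits(f))
	}
}
```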

Also, even on 64 bit machines, conservative GCs often just do a naive scan of the stack. And of uninitialized values on the heap. This means that garbage may be treated as live. Garbage often includes pointers that should no longer be live.

I don't believe Go can have uninitialized values. Every variable has a default "zero value".

Not on the heap, but it can certainly have uninitialized variables on the stack. These cannot be created in normal Go code but the optimizer may decide to leave some values uninitialized when it determines that this is safe to do.
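For reference, the language-level guarantee being discussed looks like this. It is a sketch of what Go source code can observe; it says nothing about what the optimizer leaves in stack slots that the program never reads. The `point` type is made up for the example.

```go
package main

import "fmt"

// point is a hypothetical struct mixing value and pointer fields.
type point struct {
	x, y int
	name string
	next *point
}

// zeroPoint returns a point declared without an initializer.
func zeroPoint() point {
	var p point
	return p
}

func main() {
	var n int
	var f float64
	var s []byte
	p := zeroPoint()
	// Every declared variable starts at its type's zero value.
	fmt.Println(n, f, s == nil, p.x, p.name == "", p.next == nil)
}
```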

Also, they're targeting the BBC micro:bit, which has an nRF51822, the smaller version with only 16 kB of RAM. 16 kB! How is garbage collection going to work here for anything but the most trivial examples?

You can run MicroPython on the nRF51822. That's a GC language. I've run some fairly complex programs on it.

Without careful programming you will run out of memory, but that's more a function of assuming you have a large memory space than of GC in general.

I think the person you’re replying to means a GC that needs headroom, not a GC like Python’s which generally doesn’t.

FWIW MicroPython replaces CPython's refcount-with-cycle-breaker with a mark-and-sweep GC: https://github.com/micropython/micropython/wiki/Memory-Manag...

I have some trouble seeing the point of a Go compiler/implementation without any working GC, or with the only GC being a problematic mark+sweep implementation. Is this just a proof-of-concept effort, or do they expect to add workable GC down the line?

Personally, I welcome all comers, regardless of intent or completeness. That they offer a minimal refcounted implementation for the platform more likely to need it sooner seems to signal good intent.

I say this watching 30+ years of alternate implementations of different languages and runtimes. Even those that failed (or maybe moreso) indirectly inspired positive changes in the survivors.

While I have no insight into their plans, perhaps a non-GC toolchain could still be used for short-lived programs (such as command line tools)? You don't even need to free anything if the run is short enough; the OS can deal with it when the process exits.

>a non-GC toolchain could still be used for short-lived programs (such as command line tools)

Or missile firmware: https://groups.google.com/forum/message/raw?msg=comp.lang.ad...

You can already do this: GOGC=off [1]

[1] https://blog.cloudflare.com/go-dont-collect-my-garbage/
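With the mainline Go toolchain the same thing can be done from inside the program via `runtime/debug.SetGCPercent`. A minimal sketch, assuming no memory limit (GOMEMLIMIT) is set, since the runtime may still collect when one is reached; the wrapper names are made up for illustration:

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// disableGC turns the collector off (the in-program equivalent of
// GOGC=off) and returns the previous percentage setting.
func disableGC() int {
	return debug.SetGCPercent(-1)
}

// setGCPercent sets a new GC percentage and returns the previous one.
func setGCPercent(percent int) int {
	return debug.SetGCPercent(percent)
}

func main() {
	prev := disableGC()
	fmt.Println("previous GC percent:", prev)

	// ... short-lived work that allocates but never collects ...

	// Restore the old setting if the program keeps running.
	setGCPercent(prev)
}
```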

https://tinygo.org/usage/important-options/ Take a look at the `-gc` flag. The values have changed in the latest version (need to update the docs!) but it does provide an option to disable the GC entirely.

The conservative mark-sweep implementation was the easiest to write: I don't think there is any real GC that is simpler. In the long term, the plan is to add other GCs that are precise and can be used in a real time context. However, note that such a conservative mark-sweep GC is good enough for many use cases already (look at MicroPython!) and that other GCs will likely cause additional RAM/flash bloat.
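To show how little there is to a basic mark-sweep cycle, here is a toy version over an index-based object graph. This is only a sketch and not TinyGo's actual collector: the `object` type is hypothetical, the roots are given explicitly rather than found by scanning memory, and the recursive mark would not be a good fit for a tiny stack.

```go
package main

import "fmt"

// object is a hypothetical heap cell; refs are indices of other cells.
type object struct {
	refs   []int
	marked bool
	free   bool
}

// mark flags every live object reachable from index i.
func mark(heap []object, i int) {
	if heap[i].marked || heap[i].free {
		return
	}
	heap[i].marked = true
	for _, r := range heap[i].refs {
		mark(heap, r)
	}
}

// collect runs one mark-sweep cycle and returns how many objects it freed.
func collect(heap []object, roots []int) int {
	for _, r := range roots {
		mark(heap, r)
	}
	freed := 0
	for i := range heap {
		if !heap[i].marked && !heap[i].free {
			heap[i].free = true
			heap[i].refs = nil
			freed++
		}
		heap[i].marked = false // reset for the next cycle
	}
	return freed
}

func main() {
	heap := []object{
		{refs: []int{1}}, // 0: root, points to 1
		{},               // 1: reachable from 0
		{refs: []int{3}}, // 2: unreachable, points to 3
		{refs: []int{2}}, // 3: unreachable, forms a cycle with 2
	}
	fmt.Println("freed:", collect(heap, []int{0}))
}
```

Note that the unreachable cycle between objects 2 and 3 is reclaimed without any special handling, which plain reference counting would not manage.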

I don’t know Go, but it almost certainly lets you write non-idiomatic Go code that doesn’t use GC-allocated memory? Usually languages not designed for systems programming have these features for interop purposes (ffi, p/invoke, etc).

Although if that’s the case, I personally don’t get the draw. Typically these managed/higher-level/interpreted languages are harder to use than the embedded-friendly alternatives if you’re not using their automatic memory management. But a lot of times people just want to stick to (really weird and distant) versions of languages they already know/use (D is probably the only exception here for obvious reasons).

Depends on your definition of embedded.

How is this different than gollvm? And how does it combat the problems from the Go FAQ:

> At the beginning of the project we considered using LLVM for gc but decided it was too large and slow to meet our performance goals. More important in retrospect, starting with LLVM would have made it harder to introduce some of the ABI and related changes, such as stack management, that Go requires but are not part of the standard C setup. A new LLVM implementation is starting to come together now, however.

There are 3 LLVM-based Go compilers that I'm aware of: gollvm, llgo, and tinygo. Of those, only TinyGo reimplements the runtime that causes lots of (code size) overhead in the other compilers.

There is more to a toolchain than translating source code to machine code. I'm sure the others do that job just as well, but only TinyGo combines that with a reimplemented runtime that optimizes size over speed and allows it to be used directly on bare metal hardware: binaries of just a few kB are not uncommon.

Congratulations on this new project. It’s compelling, your website is informative and easy to read, and the mission statement is very clear. Plus, it just looks like a lot of fun. Well done!

In my opinion, this feels very much in the hallowed traditions of Tiny Basic and Tiny C, but with modern tool chains.

I disagree that this is in the tradition of projects like TinyC; I wouldn't call this project "tiny" in the same way I would call TinyC "tiny" because this project uses LLVM, which is quite large and complex.

Hehe, LLVM takes about an hour to compile on my laptop. The compiler itself is not tiny in any way. It really refers to the size of the binaries produced which is usually very small. For example, for WebAssembly, reductions to 3% of the original size are not uncommon.

3% seems common (in my testing), with 2% occasionally happening as well. ;)

I would be interested to see some more complex examples for this than the ones provided[1], specifically around interrupts and DMA, both are somewhat "concurrent" resources that in C need to be managed manually, with careful use of the volatile keyword and similar.

For instance, UART transmit looks to be implemented using blocking IO[2]. I don't know Go very well, but it would be interesting if this could be implemented as a buffered channel [3], which would provide a nice abstraction for the hardware FIFO buffer used by the UART, and allow the CPU to be doing other things. The same could also be used for the I2S support I think, which will often be used to send much more data (streaming audio) than the UART.

On closer inspection, looking at the issue tracker, it appears this is already in planning, which is great [4].

[1] https://github.com/tinygo-org/tinygo/tree/master/src/example...

[2] https://github.com/tinygo-org/tinygo/blob/515daa7d3c9af18c78...

[3] https://gobyexample.com/channel-buffering

[4] https://github.com/tinygo-org/tinygo/issues/9#issuecomment-5...
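A rough sketch of the buffered-channel idea, runnable on a desktop Go toolchain. Everything here is hypothetical: the `uart` type is a software model and not TinyGo's `machine` package API, and the drain goroutine stands in for what would be a TX-empty interrupt handler on real hardware.

```go
package main

import "fmt"

// uart is a hypothetical software model of a transmit path.
type uart struct {
	tx   chan byte     // buffered channel standing in for the TX FIFO
	done chan struct{} // closed once the drain loop has finished
	sent []byte        // bytes "put on the wire", recorded for inspection
}

// newUART creates the model with a FIFO of the given depth and starts
// the drain loop.
func newUART(fifoSize int) *uart {
	u := &uart{tx: make(chan byte, fifoSize), done: make(chan struct{})}
	go func() {
		for b := range u.tx {
			u.sent = append(u.sent, b)
		}
		close(u.done)
	}()
	return u
}

// Write queues p for transmission. It blocks only while the FIFO is full,
// so the caller is free to do other work once the bytes are queued.
func (u *uart) Write(p []byte) {
	for _, b := range p {
		u.tx <- b
	}
}

// Close flushes the queue, waits for the drain loop to exit, and returns
// everything that was transmitted.
func (u *uart) Close() []byte {
	close(u.tx)
	<-u.done
	return u.sent
}

func main() {
	u := newUART(16)
	u.Write([]byte("hello"))
	fmt.Printf("%s\n", u.Close())
}
```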

LLVM has a lot of optimizations. Does anyone have information about the performance of applications built by each compiler? I’d especially be curious about apps well tuned to reduce load on the naive GC that TinyGo uses.

I'd love to see this targeted towards PICs. Really I'd like to just see any modern language targeting low power microcontrollers allowing for optimization of code size or performance. 8-bit processors using a banked memory model are hard for languages with modern (flat) memory models to adapt to.

Zig lets you bring your own allocator. https://ziglang.org/

I don’t think LLVM (which Zig uses) can currently target PIC.

Basic and Pascal might not be modern, but they surely are alternatives to C for targeting PIC.

That should be possible: there is also experimental AVR support. Experimental because the LLVM backend is unfortunately too buggy to be used in real programs (but is improving!).

The first step to make this happen would be to add a PIC backend to LLVM.

The Go compiler on a Raspberry Pi 3 compiles applications really slowly (think minutes), but there's a workaround, which is to cross-compile on x86 (think seconds, plus an sftp transfer). I assume the primary use case is to speed up compilation and create smaller binary sizes? This could be helpful, as binary sizes for Go-compiled binaries are pretty large; my apps are typically around 2-20MB, but I understand they include all dependencies.

> i assume the primary use case is to speed up compilation and create smaller binary sizes?

LLVM is going to be much slower than the Go toolchain. It may produce smaller binaries, but mainly through dropping the bulk of the runtime on the floor. The main thing that LLVM does is (apparently) reduce the stack usage.

But it does support the CPUs that the author wants to target out of the box.

I believe LLVM toolchains also (by default) exclude debug information, which gc (go compiler) would need to be told to explicitly exclude.

You generally need to run the code through some rewriters that, as far as I recall, drop those symbols anyways -- these CPUs can't execute an elf binary directly from flash.

As a data point, TinyGo includes the debug info by default. There is a "-no-debug" option to turn off the inclusion though.

I compiled go as an x86-64 to ARM cross compiler (ubuntu to raspi). Worked great. I even confused myself a bit by running the ARM binary on the x86-64 machine (turns out binfmt will detect the binary and run it under QEMU).
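With current mainline Go toolchains, cross compilation is driven by the `GOOS` and `GOARCH` environment variables. A small sketch that reports what a binary was built for (the `target` helper is made up for illustration); building it on an x86-64 machine with `GOOS=linux GOARCH=arm go build` should give a binary that reports `linux/arm` when run on the Pi.

```go
package main

import (
	"fmt"
	"runtime"
)

// target returns the OS/architecture pair this binary was compiled for.
func target() string {
	return runtime.GOOS + "/" + runtime.GOARCH
}

func main() {
	fmt.Println("built for", target())
}
```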

I wonder if this could be adapted to bootstrap the standard Go compiler on new architectures. Big if true.

The standard Go compiler can be bootstrapped on new architectures with cross compilation. It does not need any new implementation.

Doesn't GCC, and thus the GCC Go implementation, already cover more architectures than LLVM?

Is that a problem the Go team has with their current compiler?

Maybe? They wouldn’t be able to do the current bootstrap of each Go version being built with the previous but they could go back to the first release. There’s a canonical version where Go bootstrapping starts, so it might be possible to avoid fixing unportable C in the early days.
