
I'm not sure I agree with the point regarding concurrent data structures and garbage collectors. The idea is nice, but ignores a rather serious problem in implementation: such garbage collectors are serious engineering undertakings.

To see how hard the implementation of a parallel and concurrent garbage collector can be, take a look at the Garbage-First Garbage Collection paper. Try to really understand all the moving parts in there and how they interact. Once you're done wiping your brains off the wall, realize that these are the lengths GC engineers must go to just to remove most scaling bottlenecks, regardless of your opinion of G1 itself. This is serious, hard-core engineering that absorbs several man-years of expert effort (and I emphasize the expert part) to do properly. Rust would probably have to become the size and importance of the Java world to ever get this level of attention. Despite all this, the JVM's collectors still routinely throw roadblocks in front of the mutator.

And that ignores the other tradeoffs that GCs present. E.g., most GCs need a heap of at least 3x the working set to run reasonably efficiently. That's extra memory you could otherwise be keeping in a local memory bank, or using for something else (like a disk cache, because page faults sure are crazy expensive...).

The best way to avoid this problem is to not play the game at all. The reason I became interested in Rust in the first place is that Graydon and co. wisely did not follow that pied piper. Optional thread-local garbage collectors are a much simpler problem.

As an unrelated aside, I recall Andrei Alexandrescu making the argument that GC is important for type safety, unless the type system can prove that dangling pointers are impossible.




Go's designers seem to have taken a somewhat defeatist approach here. Instead of writing a really good GC, they opted to design the runtime in such a way that more memory sharing is possible, in the hope of reducing pressure on the GC. But this came at a cost: they deliberately introduced aliasing, which all but precludes ever implementing a really good GC for Go. That's why I think Go might be good enough for current multi-core hardware, but will not scale in a many-core world (there are other aspects of Go that are major roadblocks for future scaling as well).

I remember seeing a talk by Cliff Click where he explains how a small change to the CPU can greatly assist with developing pauseless GCs (though his former employer, Azul Systems, has moved on to implement the whole thing in software, probably at some cost to performance).

Regardless of the undertaking, the question remains -- as we move into a many-core era -- whether we'll find a programming paradigm good enough to take advantage of many cores. I think that whatever the winning paradigm may be, it will require a "really good" GC.


Concurrent, scalable GC is hard, but it's made harder by the complete lack of cooperation from the hardware and OS. For example, you can dramatically simplify the design of a concurrent GC if you can snapshot the thread stacks and registers before doing the concurrent tracing. This should not be a high-latency operation on existing hardware, but on, say, OS X it can take tens of milliseconds for just a couple of hundred threads. On Azul's machines, cooperation from the hypervisor allows thousands of threads to be paused in under a millisecond. Similarly, a read barrier like the kind implemented in Azul's hardware is not very complicated. The CPU already does, conceptually, a very elaborate virtual-to-physical address translation plus a check of access protections on every memory access, cached in the TLB. Adding a bit to the TLB to indicate that a page should be protected by a read barrier would cost very little.
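To make the read-barrier point concrete: in software, a read barrier is a check on every reference load, which is exactly the work you'd want the TLB to absorb. A minimal Rust sketch, loosely in the spirit of a loaded-value barrier; the tag bit and all names here are invented, not Azul's actual scheme:

    use std::sync::atomic::{AtomicUsize, Ordering};

    const BAD_BIT: usize = 1; // low tag bit meaning "not yet processed by the GC"

    // In a real collector this would mark/relocate the target and heal the
    // reference; here we just strip the tag.
    fn slow_path(raw: usize) -> usize {
        raw & !BAD_BIT
    }

    // The common case is one test-and-branch per load, which is the overhead
    // that a TLB bit or a dedicated instruction could make essentially free.
    fn barriered_load(slot: &AtomicUsize) -> usize {
        let raw = slot.load(Ordering::Acquire);
        if raw & BAD_BIT != 0 {
            let fixed = slow_path(raw);
            // heal the slot so later loads take the fast path
            let _ = slot.compare_exchange(raw, fixed, Ordering::AcqRel, Ordering::Acquire);
            fixed
        } else {
            raw
        }
    }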

Azul's GC is basically a "really good" GC. It can handle hundreds of CPUs with hundreds of GBs of heap with tens of GBs per second of allocation rate with pause times of 10-20 milliseconds. It's basically a solved problem with the proper overall system design.


Going by what I know of Azul's systems, their pauseless GC avoids the usual rule-of-thumb of roughly one second of pause per GB of heap by attacking the hardest case head-on rather than deferring it. When running on their custom hardware (a generic 64-bit RISC plus their special sauce) it uses a dedicated instruction to cheaply implement a read barrier. According to a paper describing this system, which they've since improved, doing the same on stock (x86_64) hardware would incur a ~20% performance penalty. If your application uses a lot of memory and can't afford pauses, that's probably a price worth paying.


Their C4 collector is soft real-time: they can't guarantee hard real-time, but they can be quite sure that pauses stay very small. I truly believe in that vision. The problem is that all the parts of the system have to be updated to support it.

At the moment it runs on x86_64, but not as fast as it could, and they need a special build for every Linux distribution.

I'm not sure how well all of this works if you have very few cores and not a lot of threads. All in all, though, C4 definitely points the way to the future. Hardware developers should learn from Azul, the OS folks should work more on supporting managed runtimes, and the programming language community should do the same. Managed memory is the future, except in very, very special cases.


I don't see how Go's pointers preclude a good GC. You can't have dangling pointers, so it's not as if they're forced to use a conservative GC. Care to elaborate?


The best-known issue is interior pointers[1]:

To give the programmer this flexibility, Go must support what we call interior pointers to objects allocated in the heap. The X.buf field in the example above lives within the struct but it is legal to capture the address of this inner field, for instance to pass it to an I/O routine. In Java, as in many garbage-collected languages, it is not possible to construct an interior pointer like this, but in Go it is idiomatic. This design point affects which collection algorithms can be used, and may make them more difficult, but after careful thought we decided that it was necessary to allow interior pointers because of the benefits to the programmer and the ability to reduce pressure on the (perhaps harder to implement) collector. So far, our experience comparing similar Go and Java programs shows that use of interior pointers can have a significant effect on total arena size, latency, and collection times.

Interior pointers may pose a problem for sophisticated GCs. Interestingly, this may not be a deal-breaker (see here: https://groups.google.com/forum/#!topic/golang-dev/GvA0DaCI2...), but because Go relies on interior pointers so much, and because the language does not distinguish between interior pointers and regular object pointers, this may prove to be a huge problem in practice.

[1]: http://talks.golang.org/2012/splash.article#TOC_14.
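To make the difficulty concrete: a precise collector handed an interior pointer must first recover the start of the enclosing object before it can mark (or move) it. A rough Rust sketch of one common approach, per-page span metadata of the kind size-class allocators keep; every name here is invented:

    const PAGE_SHIFT: usize = 13; // assume 8 KiB heap pages

    struct Span {
        start: usize,    // address of the first object in the span
        obj_size: usize, // a span holds objects of a single size class
    }

    struct Heap {
        base: usize,                     // lowest heap address
        page_to_span: Vec<Option<Span>>, // one entry per heap page
    }

    impl Heap {
        // Accepts any pointer into the heap, interior or not, and returns
        // the base address of the object containing it.
        fn object_base(&self, addr: usize) -> Option<usize> {
            let page = (addr - self.base) >> PAGE_SHIFT;
            let span = self.page_to_span.get(page)?.as_ref()?;
            let offset = addr - span.start;
            Some(span.start + offset - offset % span.obj_size)
        }
    }

The lookup itself is cheap, but it has to happen for every scanned word that might be a pointer, and since Go's types don't distinguish interior pointers from object pointers, the collector can never skip it.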


Nice, thanks for the reference. It's also worth pointing out that Go's GC can't do soft real-time because of GC on the global heap. IIRC, Rust has a global heap as well, although it's less frequently used thanks to richer ownership semantics.


> As an unrelated aside, I recall Andrei Alexandrescu making the argument that GC is important for type safety, unless the type system can prove that dangling pointers are impossible.

Rust's type system can.
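A minimal illustration, where being rejected at compile time is the point:

    // Does not compile: the borrow checker refuses to let a reference
    // outlive the stack frame it points into.
    fn dangle() -> &'static i32 {
        let x = 5;
        &x // error: cannot return a reference to the local variable `x`
    }

No garbage collector is needed for that guarantee; it falls out of ownership and borrowing.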


But how do you go beyond the paradigm of linear/affine types and nested regions that Rust gives you once you remove GC? Lots of programs are difficult to write with only these features (a web browser, for example), and when you do manage it, you often end up writing a garbage collector yourself anyway. Either you add optional unrestricted allocation, or you increase the complexity of the type system so it can express more safe deallocation patterns, using dependent types or another similarly strong logic.

I really like Rust, but I wonder how it is going to get over this hurdle that previous languages like Cyclone never fully solved.


The title of this post is somewhat misleading. Just because pcwalton wants to remove GC from the language doesn't mean that GC won't be available, just that it would be transformed from a compiler mechanism into a module in the standard library. The semantics would be exactly the same, though the syntax to use it would change a bit. The goal here isn't to hate on GC, it's to ensure that normal users are capable of implementing pointer types that are exactly as "blessed" as the built-in types--and the best way to do that is by eating your own dogfood.
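To make "a module in the standard library" concrete, here is a runnable sketch of the kind of user-defined pointer type this enables, with `Rc` standing in for real tracing; the `Gc` name and API are hypothetical, not the actual proposal:

    use std::ops::Deref;
    use std::rc::Rc;

    // A stand-in: a real Gc<T> would trace, not reference-count.
    struct Gc<T>(Rc<T>);

    impl<T> Gc<T> {
        fn new(value: T) -> Gc<T> {
            Gc(Rc::new(value))
        }
    }

    // Deref is what lets a library type feel as "blessed" as a built-in.
    impl<T> Deref for Gc<T> {
        type Target = T;
        fn deref(&self) -> &T {
            &self.0
        }
    }

    fn main() {
        let x = Gc::new(42);
        println!("{}", *x); // used exactly like a built-in pointer type
    }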


pcwalton both wrote the article and the comment I was replying to. If GC isn't necessary because Rust's type system can prove that deallocations are safe, then the answer to the question of how to write programs with nontrivial memory allocation patterns shouldn't be "use an external unsafe GC".


The existence of Rust's current GC doesn't have anything to do with preventing dangling pointers. Off the top of my head, cyclical references are the only thing that you need GC for that you can't handle with unique pointers or borrowed pointers, but there may be others.
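To illustrate the cycle case with a runnable sketch, using reference counting as the usual non-GC fallback:

    use std::cell::RefCell;
    use std::rc::Rc;

    struct Node {
        next: RefCell<Option<Rc<Node>>>,
    }

    fn main() {
        let a = Rc::new(Node { next: RefCell::new(None) });
        let b = Rc::new(Node { next: RefCell::new(Some(a.clone())) });
        *a.next.borrow_mut() = Some(b.clone()); // close the cycle: a -> b -> a
        // When `a` and `b` go out of scope, each node still keeps the other
        // alive, so the counts never hit zero and the memory is never freed.
        // Nothing dangles, but reclaiming cycles is exactly what a tracing
        // GC buys you over unique/borrowed pointers and refcounts.
    }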

I would need to read Andrei's original argument to be sure, but I'm guessing it was along the lines of "without a strong type system, garbage collection is necessary to prevent dangling pointers". And Rust's type system is strong enough to prevent dangling pointers; they cannot happen unless you deliberately drop into `unsafe` blocks. This is true regardless of whether you have garbage collection built in. But that doesn't necessarily mean the type system is powerful enough to encode all the patterns that GC allows.


Strongly agreed about the GC. One thing that got me interested in both Rust and Erlang is the fact that they do not GC shared memory and instead opt for a simple per-process GC. Avoiding complicated problems where they are not necessary is part of good design.



