
Go 1.5 concurrent garbage collector pacing - crawshaw
https://docs.google.com/a/golang.org/document/d/1wmjrocXIWTr1JxU-3EQBI6BK6KgtiFArkG47XK73xIQ/edit#heading=h.q556xotjblu6
======
stcredzero
I'm very excited about this. If they can meet their goal, this will make Go
much more suitable for online "real-time" (in the colloquial technical sense)
applications like game servers. I'm currently starting on my 2nd "game server"
in golang. (To be fair, it's much more server than game as of yet.)

~~~
BinaryIdiot
More suitable, sure, but I'm not convinced you'd want to use anything but C or
C++ for a server that needs to be as "real-time" as possible. I wish Go could
use RAII or another memory scheme so they could drop GC altogether.

~~~
PopsiclePete
No thanks.

Watching you "manual memory all the way" dudes struggle with simple
concurrency as much as you do, I'd hate to throw away a GC just for that extra
teeny bit of performance.

~~~
comex
Well, the 10ms cited as desired GC latency is hardly an extra teeny bit of
performance for a game server which would like to tick at at least 60fps
(~16ms frame time), which is common for FPS; other types of games are
different of course. I suppose it could be tuned to further decrease latency
at the cost of throughput...

~~~
crawshaw
The 10ms number is an upper bound. You will only see pauses that long if you
are generating large amounts of garbage, which is relatively easy to avoid.

~~~
oofabz
The duration of a pause has more to do with the size of your heap than with
how much garbage you generate. If you generate a lot of garbage, the pauses
don't get longer, they get more frequent.
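
The "more frequent, not longer" point can be sketched with the usual GOGC rule of thumb: a new cycle is due once the heap has grown by GOGC percent over the live heap left by the previous cycle. This is a simplified model, not the runtime's actual pacer, and `heapGoal` and `cyclesPerSecond` are hypothetical names.

```go
package main

import "fmt"

// heapGoal returns the heap size at which the next collection is due,
// given the live heap after the previous cycle and a GOGC percentage.
// Simplified model: goal = live * (1 + GOGC/100).
func heapGoal(liveBytes uint64, gogc int) uint64 {
	return liveBytes + liveBytes*uint64(gogc)/100
}

// cyclesPerSecond estimates how often collections run for a steady
// allocation rate: the headroom between live heap and goal is consumed
// at allocBytesPerSec.
func cyclesPerSecond(liveBytes uint64, gogc int, allocBytesPerSec uint64) float64 {
	headroom := heapGoal(liveBytes, gogc) - liveBytes
	return float64(allocBytesPerSec) / float64(headroom)
}

func main() {
	const mb = 1 << 20
	live := uint64(100 * mb)

	// Same live heap, two different garbage rates (assumed numbers).
	for _, rate := range []uint64{50 * mb, 500 * mb} {
		fmt.Printf("alloc %d MB/s -> %.1f cycles/s\n",
			rate/mb, cyclesPerSecond(live, 100, rate))
	}
}
```

In this model the live heap, and therefore the work per cycle, stays the same while the allocation rate only changes how often a cycle starts.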

~~~
slimsag
crawshaw isn't wrong -- and he's also the guy working on Android support. Go's
GC scans pointers in the heap.

\- A 10GB heap with one single heap-allocated []byte will have short <1ms
pauses.

\- A 512MB heap with 500k heap-allocated []byte will have long pauses.
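
A rough way to try the two heap shapes yourself is sketched below. It times a forced collection with `runtime.GC()`, which blocks until the cycle completes, as a proxy for scanning work. The sizes are arbitrary assumptions, and on current Go releases marking runs concurrently, so the stop-the-world pause itself is small in both cases; the difference shows up mainly in total collection time.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// collectTime forces a full collection and returns how long the call
// took. runtime.GC blocks the caller until the collection is complete.
func collectTime() time.Duration {
	start := time.Now()
	runtime.GC()
	return time.Since(start)
}

// manySmall allocates n separate heap objects, each reachable through
// its own pointer, so the collector has n pointers to follow.
func manySmall(n, size int) [][]byte {
	out := make([][]byte, n)
	for i := range out {
		out[i] = make([]byte, size)
	}
	return out
}

func main() {
	// One large allocation that contains no pointers.
	big := make([]byte, 64<<20)
	fmt.Println("one large []byte:  ", collectTime())
	runtime.KeepAlive(big)

	// Many small allocations (sizes chosen arbitrarily).
	small := manySmall(200000, 64)
	fmt.Println("200k small []byte: ", collectTime())
	runtime.KeepAlive(small)
}
```

Results vary by machine and Go version, so treat the output as a qualitative comparison only.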

------
jalopy
Does anyone else feel like this is reinventing the JVM Garbage Collection
technology - which has at least 20 years of calendar experience and countless
man hours of engineering research in the real world?

Whatever your thoughts are on Java the language - would it not make sense to
focus efforts on the OpenJVM project and all the languages it can support?

Genuinely curious if this seemingly "reinvent the wheel" approach is going to
provide benefits.

~~~
vessenes
Speaking as someone who writes go code every day right now, I like the
language for a number of reasons, including the fact that it can be compiled
and deployed without anything like the sort of build/deploy insanity that a
modern java installation comes with.

So, I'd rather hit my head against the wall on GC problems while smart people
improve it than hit my head against the wall with a MASSIVE deploy/build
headache that definitely won't go away for a bunch of technical, political and
historical reasons.

~~~
mike_hearn
What kind of insanity? Are you talking about desktop or server apps?

Whichever you mean, you can create apps that have self contained JREs with the
javapackager tool. But most server setups don't seem to need it. If you have a
broken IT environment where your admins refuse to upgrade the JVM, that can be
an issue, but then you might find yourself being told to stick with an
obsolete version of Go in future too in such an environment.

~~~
vessenes
Hi Mike! Fancy seeing you here.

I'm not trying to start a religious conversation; so with that caveat, I'll
respond -- By insanity, I'm referring to the need to grok stuff like this
page:

[http://maven.apache.org/guides/introduction/introduction-to-...](http://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html)

Building a large java application is not something I'm an expert at, but I
have spent many hours of my life fighting with maven and ant and dependencies,
happily some time ago.

By comparison, there's "go build". Or if you need auto-cross-compiling,
there's goxc. While an IT manager might limit the version of go on a server,
it would have no impact on most go developers; I can build a self-contained
binary for linux right from my Macbook Air in under 2 seconds with a single
command. Or using goxc, I can build for all binary targets in very short
order.

To me reducing this sort of cognitive load on build/deployment is a HUGE
benefit go provides, especially to junior developers or teams with different
backgrounds.

~~~
zak_mc_kracken
You are confusing building and deploying.

Building in Java is also a one line command. Deploying is also often a one
line command (or a one click affair).

The fact that Go is natively compiled makes this kind of deployment a bit more
problematic, and it's bound to get more complicated once Go supports dynamic
loading.

I'm not trying to make a judgment call between Java and Go here, just pointing
out that saying that one of these languages is insanely more complicated to
deploy than the other doesn't make much sense.

------
Osiris
For someone that's not very familiar with garbage collectors, does this mean
that the GC will reduce or eliminate execution pauses that are common in GC'd
languages?

~~~
tomp
I only skimmed the document, but couldn't find any details about the
implementation; it deals mostly with how to pace the GC so that it doesn't run
out of memory.

I can give you a few hints, though. The title says _concurrent_ GC, which is,
unfortunately, usually a misnomer in GC literature - it usually means that
parts of the GC are concurrent with the mutator (user program), but that there
are still synchronization points where pausing the mutator is required. If a GC
is _actually_ concurrent 100% of the time, the _fully-concurrent_ or
_on-the-fly_ label (edit: and _pauseless_ as well, as _bradleyjg_ mentioned) is
used.

Second, while it's not that hard to implement a mostly-concurrent GC, it's
practically impossible to implement an efficient compacting, on-the-fly GC, as
it would require read barriers during the compaction phase, which would
significantly slow down the program. Azul makes such a GC, but they use
privileged (kernel) code to implement the read barrier (mutator threads still
need to be paused/synchronized with, but not all at the same time) (edit2:
Azul's C4 collector doesn't use a kernel read barrier, and apparently their
read barrier is much more efficient than most). I don't know if Go features a
compacting GC
(although many languages do, as that's the only thing that can prevent
unbounded heap growth over time), but if it does, it's almost certainly not
on-the-fly GC.

To summarize, I assume this will only reduce the pauses, not eliminate them.

~~~
rdtsc
> The title says concurrent GC, which is, unfortunately, usually a misnomer in
> GC literature

It is confusing. I think they have started using "pauseless" as the term now
to distinguish between the two.

> Azul makes such a GC

Anyone who is interested in this should read how that is implemented. Really
fascinating and cool.

However they are not the only ones. Erlang's GC is also pause-less and
concurrent between actors. So GC in one actor won't pause other actors on a
multi-core system.

~~~
easytiger
> Anyone who is interested in this should read how that is implemented.

How is it implemented? My understanding was that it retains 4 copies of the
entire heap in memory and continuously switches between them, or something
like that.

~~~
nickik
That is not at all how it works. Essentially it splits the whole heap up into
pages; each page has a state: empty, allocating, or evacuating. There are GC
threads that constantly relocate objects from pages that have a high amount of
unreachable objects to new pages.

In order not to have mutator threads waiting for GC threads, they use
work sharing. When the mutator wants to read from a page that is evacuating,
it will move that object itself (this is where they need the read
barrier).

This all requires a very small amount of synchronisation: there is one small
check-in point that every thread has to visit, then it's all concurrent again.
The pauses are only related to how long it takes everybody to check in, not the
size of the heap or the size of the pages.

This allows multi-100GB heaps with very small pause times.

I just wrote this from memory and it's of course more complicated, but this
might help you to think about it.
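
The page states and the "mutator moves the object itself" idea described above can be sketched as a toy model. This is single-threaded and only illustrates the description in this comment; it is not Azul's implementation, and every type and function name here is hypothetical.

```go
package main

import "fmt"

type pageState int

const (
	empty pageState = iota
	allocating
	evacuating
)

// page is a toy heap page holding a list of objects.
type page struct {
	state   pageState
	objects []*object
}

// object records its home page and, once moved, where it went.
type object struct {
	value   int
	home    *page
	forward *object // set after the object leaves an evacuating page
}

// relocate copies obj into dst and leaves a forwarding pointer behind.
func relocate(obj *object, dst *page) *object {
	moved := &object{value: obj.value, home: dst}
	dst.objects = append(dst.objects, moved)
	obj.forward = moved
	return moved
}

// read is the toy read barrier: every load goes through it. If the
// object lives on an evacuating page and nobody has moved it yet, the
// reader moves it instead of waiting for a GC thread.
func read(ref *object, dst *page) *object {
	if ref.forward != nil {
		return ref.forward
	}
	if ref.home.state == evacuating {
		return relocate(ref, dst)
	}
	return ref
}

func main() {
	old := &page{state: evacuating}
	fresh := &page{state: allocating}
	o := &object{value: 7, home: old}
	old.objects = append(old.objects, o)

	got := read(o, fresh)
	fmt.Println("moved to fresh page:", got.home == fresh)
	fmt.Println("second read follows forwarding pointer:", read(o, fresh) == got)
}
```

A real collector would need atomic operations so that a mutator and a GC thread racing to move the same object agree on a single copy; the sketch leaves that out.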

~~~
easytiger
Thanks

------
teabee89
Could someone provide a simple comparison with how other language runtimes
(e.g. Java) implement GC?

~~~
NullXorVoid
As far as I can tell, the JVM GC is (not surprisingly) significantly more
mature. The JVM has a concurrent mark-sweep algorithm that works similarly to
what's described in this document, but more importantly IMO the JVM's GC is
generational, and it appears Go's GC is not.

In the JVM, the CMS only occurs on the oldest generation which contains
objects that were in use for a while, whereas objects that are created and
discarded quickly are cleaned up in younger generations with a copying
collector, which avoids memory compaction by switching back and forth between
2 equally sized memory blocks. This means much of the time the JVM can avoid
doing a full CMS and keep the time spent in GC only a few ms per second.
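
The "switching back and forth between 2 memory blocks" idea is the classic semispace copying scheme. A minimal sketch follows, using slice indices in place of pointers; the `heap` and `object` types are invented for illustration, and a real nursery works on raw memory of fixed size rather than growable slices.

```go
package main

import "fmt"

// object is a toy heap cell: a payload plus the indices of the cells it
// references in the currently active space.
type object struct {
	value int
	refs  []int
}

// heap holds two spaces; only `from` is active between collections.
type heap struct {
	from, to []object
}

// collect copies everything reachable from roots into the other space,
// rewrites references to the new indices, and swaps the spaces.
// Unreachable objects are never touched; they vanish with the old space.
func (h *heap) collect(roots []int) []int {
	h.to = h.to[:0]
	forward := make(map[int]int) // old index -> new index

	copyObj := func(i int) int {
		if j, ok := forward[i]; ok {
			return j
		}
		j := len(h.to)
		h.to = append(h.to, h.from[i]) // refs are fixed up during the scan
		forward[i] = j
		return j
	}

	newRoots := make([]int, len(roots))
	for k, r := range roots {
		newRoots[k] = copyObj(r)
	}
	// Breadth-first scan: everything before `scan` has been fixed up.
	for scan := 0; scan < len(h.to); scan++ {
		old := h.to[scan].refs
		fixed := make([]int, len(old))
		for k, r := range old {
			fixed[k] = copyObj(r)
		}
		h.to[scan].refs = fixed
	}
	h.from, h.to = h.to, h.from[:0]
	return newRoots
}

func main() {
	h := &heap{from: []object{
		{value: 10, refs: []int{1}},
		{value: 20, refs: []int{0}},
		{value: 30},                 // garbage
		{value: 40, refs: []int{2}}, // garbage
	}}
	roots := h.collect([]int{1})
	fmt.Println("survivors:", len(h.from), "new root index:", roots[0])
}
```

The cost of a collection here is proportional to the number of survivors, not to the amount of garbage, which is why this scheme suits objects that are created and discarded quickly.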

I guess it depends on what your use case is. If you're writing low-latency
server applications that don't have loads of long-lived internal state and you
have memory to spare, the JVM's CMS GC is going to be way more efficient than
Go's. But if you're constrained for memory then avoiding copying collectors
and generational heaps is your best bet, but you'll pay for it in GC time. Of
course the JVM has other GC algorithms like G1, but I don't have much
experience with those.

~~~
schmichael
> the JVM's GC is generational, and it appears Go's GC is not.

Correct, that's slated for the 1.6 timeframe:

> While other goals may intervene, 1.6 will most likely be used to improve
> throughput by adding bump pointer allocation as well as a generational copy
> collector for nursery (new) spaces. The mature (old) space will be managed
> using our concurrent GC.

Source:
[https://docs.google.com/document/d/16Y4IsnNRCN43Mx0NZc5YXZLo...](https://docs.google.com/document/d/16Y4IsnNRCN43Mx0NZc5YXZLovrHvvLhK_h0KN8woTO4/preview?sle=true)
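
For readers unfamiliar with the term, bump pointer allocation can be sketched as follows. This is a toy arena, not the Go runtime's allocator, and the `arena` type is hypothetical.

```go
package main

import "fmt"

// arena is a fixed block of memory handed out front to back.
type arena struct {
	buf  []byte
	next int // offset of the first free byte
}

func newArena(size int) *arena {
	return &arena{buf: make([]byte, size)}
}

// alloc returns n bytes from the arena, or nil if they do not fit.
// Allocating is a bounds check plus an addition: the pointer is "bumped".
func (a *arena) alloc(n int) []byte {
	if n < 0 || a.next+n > len(a.buf) {
		return nil
	}
	p := a.buf[a.next : a.next+n : a.next+n]
	a.next += n
	return p
}

// reset frees everything at once, roughly what a nursery collection can
// do after survivors have been copied elsewhere.
func (a *arena) reset() {
	a.next = 0
}

func main() {
	a := newArena(16)
	x := a.alloc(8)
	y := a.alloc(8)
	z := a.alloc(1) // does not fit
	fmt.Println(len(x), len(y), z == nil)

	a.reset()
	fmt.Println(a.alloc(16) != nil)
}
```

Because individual objects are never freed, this style of allocation pairs naturally with a copying collector that evacuates survivors and then reclaims the whole region.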

------
atdt
(Off-topic: Figures 1 & 2 are graphically simple and tidy. Does anyone know
what software they used to generate them? Or failing that, what software is
good at generating such graphics?)

~~~
mikecb
Looks like the drawings feature in Google Docs.

------
wiineeth
So what does it mean for performance?

