

The Managed Runtime Initiative - helwr
http://lwn.net/Articles/392307/

======
hga
Be sure to skip down to the comments here: <http://lwn.net/Articles/392365/>
(or find the first one from bgallmeister).

That is where a reply to this posting is made. The comments that follow are
generally quite interesting, e.g.:

" _In general, current virtual memory implementations in almost all OSs assume
that virtual memory manipulation is a relatively rare event, and we have put
forward an algorithm and an application for rapidly-changing mappings that
makes a real difference to a vast array of applications, but it can't do so
within the limitations of current virtual memory APIs and manipulation
speeds._ "

" _[...] Our GC code is all in user space (part of the OpenJDK based code we
put up along with the kernel mods) - it just needs some very scalable and
somewhat different virtual and physical memory manipulation semantics from the
kernel._ " (<http://lwn.net/Articles/392745/>)

And:

" _The duration of the stop-the-world pause in all current JVM GC's is
generally linear to the amount of live data the heap contains (you have to
scan all that stuff and fix all the pointers to the relocated objects). This
means that the larger the heap - the larger the pause. Sun's CMS (the Mostly
Concurrent Mark Sweep -XX:+UseConcMarkSweepGC mentioned above) will delay the
compaction as long as it can and track empty spaces in free lists, but it will
eventually fall back on its compaction code and pause for about 2-4 seconds
per live gigabyte on a modern x86-64 machine. This is why JVMs are generally
not used with more than a few GB of data, except for batch apps (ones that can
accept 10s of seconds of complete pause). Since a 256GB server now costs
less than $18K, there is a ~100x and growing gap between commodity server
capacity and the ability for individual runtime to scale with acceptable
response times._ " (<http://lwn.net/Articles/392797/>)
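
To put rough numbers on that quote, here is a minimal sketch of the linear
model it describes. The 2-4 seconds per live gigabyte figure and the 256GB
server come from the quote; the class name, the sample live-data sizes, and
the 2.5GB stand-in for "a few GB" are my own assumptions for illustration.

```java
public class PauseEstimate {
    // Linear model from the quote: a compacting stop-the-world pause
    // grows in proportion to the amount of live data in the heap.
    public static double pauseSeconds(double liveGigabytes, double secondsPerLiveGigabyte) {
        return liveGigabytes * secondsPerLiveGigabyte;
    }

    // Ratio between server memory and the heap size that is practical
    // for an interactive application.
    public static double capacityGap(double serverGigabytes, double practicalHeapGigabytes) {
        return serverGigabytes / practicalHeapGigabytes;
    }

    public static void main(String[] args) {
        // Hypothetical live-data sizes, chosen only for illustration.
        double[] liveGigabytes = {2, 16, 256};
        for (double gb : liveGigabytes) {
            System.out.printf("%.0f GB live: %.0f to %.0f seconds of pause%n",
                    gb, pauseSeconds(gb, 2.0), pauseSeconds(gb, 4.0));
        }
        // Assumes "a few GB" means roughly 2.5 GB of usable heap.
        System.out.printf("Capacity gap: about %.0fx%n", capacityGap(256, 2.5));
    }
}
```

Note that a 256GB server would not normally hold 256GB of live data, so the
last row is an upper bound rather than a prediction.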

Azul has claimed they've fixed this on their own custom hardware (a generic
64-bit RISC with some extra instructions, including at least one to implement a
fine-grained memory barrier), and this release is a version of their software
based on that work.

It will be very interesting to see if it's truly practical to put all your GC
code in userspace; they made a decision to put almost all of their secret
sauces in userspace and say that's saved them many times.

------
jwr
I find the aggressive tone of Jon's post surprising and I'm troubled by it.

I write code in Clojure, deploy on the JVM using Amazon's Linux servers and GC
pauses are a very real issue for us. Azul has been working on this problem for
years and I'm quite happy that they decided to open-source a large part of
their work.

Now, before you jump on me with the obvious — I do not advocate pushing crappy
or unclear code into the kernel. But I'd much rather see a healthy discussion
than this childish "take your toys and go away, we don't like you and you're
not welcome here" attitude.

~~~
tomjen3
The Java guys have spent a fair amount of time tuning and improving garbage
collectors.

Given that you run Clojure, have you considered trying to use a concurrent
garbage collector?

~~~
jwr
We are using the Concurrent Mark-Sweep collector now, but this doesn't solve
the problem completely. You still get pauses when a full GC occurs.
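
For anyone who wants to check this on their own JVM, here is a minimal sketch
that prints what each collector has done so far, using the standard
`java.lang.management` beans. The class name `GcReport` is made up for
illustration, and the collector names printed depend on which GC the JVM was
started with. The reported time is approximate accumulated collection time,
which for a concurrent collector is not the same thing as pause time, so treat
it as a rough indicator only.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcReport {
    // Builds one line per collector: name, number of collections so far,
    // and approximate accumulated collection time in milliseconds.
    public static String report() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(String.format("%s: %d collections, %d ms total%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

Calling `report()` periodically from inside the application and comparing
successive outputs shows when the old-generation collector's count goes up.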

------
Oxryly
Seems like they have developed a good approach to the problem. Hopefully
people can smooth their ruffled feathers enough to discuss the meat of the
problem.

