
Why You Should Be Excited About Garbage Collection in Ruby 2.0 - DavidChouinard
http://patshaughnessy.net/2012/3/23/why-you-should-be-excited-about-garbage-collection-in-ruby-2-0#
======
cpr
Gosh, I hate to be that "old guy," but using bitmaps for mark & sweep GC dates
from the early 70's.

In particular, in GC implementations for Lisp and Lisp-like languages (such as
the ECL implementation we used at Harvard in the mid-70's) where CONS cells
were two eighteen-bit halfwords (CAR and CDR) fitting in a 36-bit word, there
was no place for mark bits, so you'd use a bitmap instead (as rediscovered
here).

And we used the same technique for finding the page headers (e.g., containing
the mark bitmaps) for each part of the heap, aligning to a larger power of two
so you could chop any pointer down by and'ing off the lower bits to get to the
page header.
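
A rough sketch of the masking trick described above, using Python integers as stand-in addresses. The page size, slot size, and function names are assumptions made for illustration; they are not taken from ECL or from Ruby's source.

```python
PAGE_SIZE = 1 << 14   # assumed: 16 KiB pages, each aligned to its own size
SLOT_SIZE = 8         # assumed: bytes per object slot (hypothetical)

def page_base(addr):
    """AND off the low bits to get the address of the containing page's header."""
    return addr & ~(PAGE_SIZE - 1)

def slot_index(addr):
    """Index of the object's slot within its page."""
    return (addr - page_base(addr)) // SLOT_SIZE

# One mark bitmap per page, keyed by page base address.
# The dict is a stand-in for a bitmap stored in (or pointed to by) the page header.
bitmaps = {}

def mark(addr):
    """Set the mark bit for the object at addr without touching the object itself."""
    bitmap = bitmaps.setdefault(page_base(addr), bytearray(PAGE_SIZE // SLOT_SIZE // 8))
    i = slot_index(addr)
    bitmap[i // 8] |= 1 << (i % 8)

def is_marked(addr):
    """Check the mark bit for the object at addr."""
    bitmap = bitmaps.get(page_base(addr))
    if bitmap is None:
        return False
    i = slot_index(addr)
    return bool(bitmap[i // 8] & (1 << (i % 8)))
```

Because pages are aligned to a power of two, finding the header is a single AND with no search and no per-object back pointer.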

There really ain't much new under the sun. Too bad every generation has to
rediscover all this stuff.

~~~
joblessjunkie
I don't think this is rediscovery. These things just take time to implement.
Ruby hasn't been around all that long.

~~~
true_religion
Ruby has been around since 1995. That's over 15 years.

Implementing bitmaps for mark-and-sweep is a project well within the skills of a
final year undergrad, or first year graduate student of CS.

You can say many things, but I doubt they didn't implement this because they
didn't have time, or didn't have the skills to do so. They merely prioritized
against it from day one.

------
subwindow
Not really excited. Sorry. To me this is an incremental improvement to a
method of GC that is just not going to cut it.

Rubinius uses Generational GC. There are definitely some parts of Rubinius
that are (currently) slower than MRI, but memory management stands to be one
of the places where Rubinius will demolish MRI in the long run.

JRuby/JVM uses (from what I understand) a Generational-first hybrid GC.
There's been so much work and analysis done on the JVM that there's no way in
hell any mark and sweep GC is going to compete.

Honestly if MRI is going to remain competitive it needs to completely rethink
the way it does GC.

~~~
bluemoon
JRuby seems like it has the most potential in the GC department. It seems like
most of the interesting GC papers I've seen are work that's been done on the
JVM.

~~~
jacaetevha
And (IIRC) much of the JVM's GC and VM work came directly from the Smalltalk
world and the Strongtalk guys.

~~~
Uchikoma
No. JVM GCs (JRockit, Hotspot, IBM) have developed much further than what
Smalltalk did. The G1 GC is especially nice.

I found

[http://blog.dynatrace.com/2011/05/11/how-garbage-collection-...](http://blog.dynatrace.com/2011/05/11/how-garbage-collection-differs-in-the-three-big-jvms/)

a very interesting - and understandable - article.

------
jbellis
Wake me up when they add compaction.

[http://en.wikipedia.org/wiki/Fragmentation_(computing)#Memor...](http://en.wikipedia.org/wiki/Fragmentation_\(computing\)#Memory_fragmentation)

(CPython has this weakness too.
[http://www.slideshare.net/jbellis/pycon-2012-what-java-can-l...](http://www.slideshare.net/jbellis/pycon-2012-what-java-can-learn-from-python/8))

------
pkulak
I was expecting a lot more. It's still just mark and sweep, which is old news.

------
ajasmin
How do other garbage collectors handle this?

Can I fork a Java or V8 process and take advantage of the copy-on-write
optimization?

~~~
oconnore
Most [1] modern GC implementations use generational copying collectors. A
copying GC does not need to maintain a bit for each allocated object, and
therefore will not break copy-on-write.

They are also much faster for many common loads.

[1] I suppose this is debatable. Oddly, many modern scripting languages use
_ancient_ compiler technology for no discernible reason. See also: the GIL.
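
To illustrate the point about copying collectors not needing a per-object mark bit, here is a minimal Cheney-style sketch. The heap model (a list of objects, each a list of indices of the objects it references) and the name `copy_collect` are assumptions for the example. It also uses a side table for forwarding, whereas real collectors typically overwrite the old object with a forwarding pointer.

```python
def copy_collect(from_space, roots):
    """Copy everything reachable from roots into a fresh to-space.

    from_space: list of objects; each object is a list of indices of the
                objects it references (hypothetical heap model).
    Returns (to_space, new_roots) with all references rewritten.
    """
    to_space = []
    forwarding = {}  # old index -> new index (stand-in for forwarding pointers)

    def forward(old):
        # An object is "live" simply because it has been copied; no mark bit.
        if old not in forwarding:
            forwarding[old] = len(to_space)
            to_space.append(list(from_space[old]))  # fields still hold old indices
        return forwarding[old]

    new_roots = [forward(r) for r in roots]

    scan = 0
    while scan < len(to_space):  # to-space doubles as the work queue
        obj = to_space[scan]
        for k, child in enumerate(obj):
            obj[k] = forward(child)  # rewrite each field to the new location
        scan += 1
    return to_space, new_roots
```

Anything never reached is simply left behind in from-space, including unreachable cycles, so there is no separate sweep phase.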

~~~
masklinn
> Oddly, many modern scripting languages use ancient compiler technology

Historical: most of them are 10~20 years old, "new compiler technologies" had
not trickled down much when they were created and they had no real need for
the things those provided (not to mention they have drawbacks: a GIL has better
single-threaded performance than fine-grained locking, which is important when
single-threaded is your primary workload).

Also manpower, probably, it's harder to implement a generational concurrent GC
than a refcounting or a basic M&S, and it's even harder to _retrofit_ that
into an existing codebase (hence pypy having pluggable GC backends and
defaulting to a hybrid GC)

~~~
bickfordb
Java and Haskell (GHC) have pretty good single thread performance and do not
have a GIL.

~~~
ootachi
And they both had to spend a lot of effort to get concurrent GC to work
efficiently. The biggest downside of concurrent GC is that you need read
barriers, not just write barriers as in single-threaded incremental garbage
collection.

------
riffraff
Narihiro Nakamura did a lot of hacking on the Ruby GC ("longlife" used in
Twitter's Kiji, parallel marking, lazy sweeping, bitmap marking).

It's nice to see more of it ending up in the Ruby mainline.

------
txttran
It took 3+ years to develop this technique? It seemed like the most
straightforward approach to the problem.

~~~
silentbicycle
Second most straightforward, perhaps. They previously had the bit for marking
inside objects, rather than on a separate memory page of just mark bits, which
interacts poorly with copy-on-write.
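
A small simulation of why the placement of the mark bit matters after a fork. The page size, object size, and function names are assumptions for illustration. The sketch only counts which OS pages a mark phase would write to, since every written page has to be copied in the child.

```python
PAGE_SIZE = 4096   # assumed OS page size
OBJ_SIZE = 40      # assumed object size in bytes (hypothetical)

def pages_dirtied_in_object(addrs):
    """Mark bit lives inside each object: marking writes to the object's own page."""
    return {addr // PAGE_SIZE for addr in addrs}

def pages_dirtied_bitmap(addrs, heap_start, bitmap_start):
    """Mark bits live in a separate bitmap: marking writes only to bitmap pages."""
    dirtied = set()
    for addr in addrs:
        bit = (addr - heap_start) // OBJ_SIZE   # one bit per object slot
        byte_addr = bitmap_start + bit // 8     # address of the byte holding that bit
        dirtied.add(byte_addr // PAGE_SIZE)
    return dirtied
```

With 10,000 live objects packed from address 0 and the bitmap placed at `1 << 20`, the in-object scheme dirties 98 pages while the bitmap scheme dirties one.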

------
xenophanes
So what's REE's solution and how does it differ from this approach?

~~~
FooBarWidget
REE developer here. REE's solution is almost the same. Except we don't align
heaps; we identify the containing heap using binary search instead.
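
A minimal sketch of locating the containing heap by binary search when heaps are not aligned. The `(start, size)` layout, the example addresses, and the name `find_heap` are assumptions for illustration and are not taken from REE's source.

```python
import bisect

# Assumed: heaps are (start, size) ranges, unaligned, kept sorted by start address.
heaps = [(0x1000, 0x4000), (0x9000, 0x2000), (0x20000, 0x8000)]
starts = [start for start, _ in heaps]

def find_heap(addr):
    """Return the index of the heap containing addr, or None if no heap does."""
    # Rightmost heap whose start is <= addr.
    i = bisect.bisect_right(starts, addr) - 1
    if i >= 0:
        start, size = heaps[i]
        if addr < start + size:
            return i
    return None
```

The lookup costs O(log n) in the number of heaps rather than a single mask, but it places no alignment requirement on how heaps are allocated.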

------
verelo
"Excited about Ruby" is something that I can't say I've ever been. Still...

