Hacker News new | comments | ask | show | jobs | submit login
Garbage Collection in Ruby 2.1 (tmm1.net)
163 points by tmm1 on Dec 30, 2013 | hide | past | web | favorite | 30 comments

It's interesting to compare this to how python does it (reference counting with generations): http://patshaughnessy.net/2013/10/30/generational-gc-in-pyth...

"At first glance, Ruby and Python seem to implement garbage collection very differently. Ruby uses John McCarthy’s original mark and sweep algorithm, while Python uses reference counting. But when we look more closely, we see that Python uses bits of the mark and sweep idea to handle cyclic references, and that both Ruby and Python use generational garbage collection in similar ways. Python uses three separate generations, while Ruby 2.1 uses two.

This similarity should not be a surprise. Both languages are using computer science research that was done decades ago – before either Ruby or Python were even invented. I find it fascinating that when you look “under the hood” at different programming languages, you often find the same fundamental ideas and algorithms are used by all of them. Modern programming languages owe a great deal to the ground breaking computer science research that John McCarthy and his contemporaries did back in the 1960s and 1970s."

I think Henry Baker had a paper on why generational and refcounting GCs are ultimately equivalent. Couldn't find it quickly, though, to give you a link.

I think you might be thinking of this paper: A unified theory of garbage collection [1], which shows how most modern collectors are hybrids of GC and ref counting.

[1] http://atlas.cs.virginia.edu/~weimer/2008-415/reading/bacon-...

For the record: This is a very good paper. It's extremely easy to read, even for somebody who knows nothing about the background, and the scientific result is absolutely beautiful!

I just wanted to say thank you for this comment. I'm reading the paper because of it. Can you please point me to your other favorites that are good + accessible?

I wouldn't even know where to start :-)

Haha you aren't getting off that easy. Could you please tell me, say, 3 that come to your mind?

Oh, it was David Bacon. Thanks for finding this link.

Isn't it actually kind of sad, that such important researches don't get considered until 40 or 50 years later?

I don't think such research wasn't considered, it's been used constantly in many projects since.

The reason why some platforms have "silly" choices like stop the world M&S GC, or interpreters with AST walking is, likely, that optimization wasn't a goal in the original implementation and it's hard to retrofit it while keeping compatibility.

Also interesting to compare this with how the G1 collector for the JVM handles it: http://www.infoq.com/articles/G1-One-Garbage-Collector-To-Ru...

EDIT: Better link: http://citeseerx.ist.psu.edu/viewdoc/download?doi=

So Ruby objects occupy 40 bytes each, inside pages on the eden heap.

That's pretty huge. That's several times larger than some other dynamic languages -- languages that have essentially the same features as Ruby.

Not really. Python primitives are smaller (24 bytes). Python strings are 37 + 1 byte per character. Python objects are 72 bytes. Python classes are 104 bytes. (Got these numbers playing around with sys.getsizeof in the python console.)

Primitives are a bit bigger in ruby, but they aren't really primitives anyway (everything is an object).

Python is another offender!

What languages are you referring to than? Perl and PHP are really the only others that are sort of in the same class as Ruby and Python.

Smalltalk, for one.

Perl objects are even bigger and need one indirection per access. A perl head is 32 bytes plus 48 byte body, ie. 80byte on 64bit systems.

For my perl rewrite p2 I need one single word per object on the stack. Common Lisps usually needs two words, the class pointer and the tagged value.

I really liked the visualization of the various GC's btw. Excellent work!

I have no idea of the size in other language implementations, but notice 40 bytes is on x86-64, on 32 bit it's either 20 or 24.

this is why running large ruby apps on 64 bit OSs can take up a lot more memory than 32 bit.

isn't that true of most platforms? I know the JVM has compressed OOPS to limit this issue, but I am not aware of similar things elsewhere.

To me, it's interesting that after all this time and with the prevalence of languages that rely heavily on garbage collection, garbage collectors still don't seem to scale to large heaps.

Large heaps mean long pauses when doing the "mark" phase of the generation holding most of the objects. That seems like a big problem that will increase as memory sizes increase.

Maybe the way I'm looking at it is too simplistic and there are better methods now. But even in Java, which has had plenty of time to work these issues out, I have seen issues with long GC pauses.

My impression right now is that GCs just don't scale to large heaps. To use more memory, you need to either manage memory yourself (which not a lot of modern languages allow), or increase the number of independent heaps (by using more tasks/processes).

Please enlighten me if I'm wrong here.

That seems to be the case in Oracle Java at least. With 80+ GB heaps, you can experience multi-minute stop-the-world collections when a full GC strikes. Luckily, that can be tuned to be very rare (for the usual load).

For better latency without such hiccups, the only solution I know of is Azul's Zing (http://www.azulsystems.com/zing/pgc) which really did away with larger pauses at least with our software.

What is "the usual load"?

Interesting; I'll check out the Azul one. I'm a little skeptical, but it might be an improvement. Did you see better throughput or just reduced latency?

By usual I mean what our application was tuned for.

Which, sadly, may or may not be what it encounters. There are still some usage patterns that can cause large amount of time being spent doing gc (without Zing).

Latency was greatly reduced, I cannot really say about throughput (except that it was not clearly worse at least).

The important number is the avg pause time. Here with ruby 2.1 7ms for a minor and 58ms for a major sweep for a typical ~500K heap app.

In my potion-based GC (a stack-scanning, compacting, cheney two-finger GC) the avg GC needs 3ms, but it's not concurrent (multi-threaded) yet.

All of these can be considered real-time, i.e. < 15ms. For bigger heaps the scans need to be incremental (saving and restoring GC state, which is easy with libgc or cheney).

The best java GC I found needed 150ms pause time. Good lisp's have real-time GCs.

What's this RailsApp.preload_all method mentioned in this post? Is that a github thing?

Great post btw. I am looking forward to getting some production apps onto 2.1 in the coming weeks and seeing how performance characteristics have changed in the new release.

We implement preload_all, which loops over app/{models,controllers}/ and requires everything. This method is generally called from config.ru, and happens in the unicorn master before it forks off workers.

I recently upstreamed a warmup method for Rack::Builder you can also use for this purpose: https://github.com/rack/rack/pull/617

Thanks for that link (and for submitting it!). I've done something similar in deploy scripts, to warm up a fresh instance before bringing it into rotation - even better having it able to be baked into rack like that.

That's interesting. Thanks for sharing. I'll have to check this out.

And all these are all very similar to Google's Chrome V8 and Mozilla's SpiderMonkey. Although both had incremental done already, for Ruby that is scheduled for v2.2.

Now just when will Ruby get a Method or Tracing JIT?

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact