
Ruby Garbage Collection: Still Not Ready for Production - timr
http://www.omniref.com/blog/blog/2014/03/27/ruby-garbage-collection-still-not-ready-for-production/
======
FooBarWidget
Phusion Passenger author here.

The article mentions that Unicorn's out-of-band garbage collection is
problematic because the way it works (running the GC after every request, and
requiring the normal GC to be turned off) is overkill. But the community is
working on a better solution.

In particular, Aman Gupta described this very same problem and created a gem
which improves out-of-band garbage collection by only running it when it is
actually necessary, and by not requiring one to turn off the normal GC.
Phusion Passenger (even the open source version) already integrates with this
improved out-of-band garbage collector through a configuration option. This
all is described here: [http://blog.phusion.nl/2014/01/31/phusion-passenger-
now-supp...](http://blog.phusion.nl/2014/01/31/phusion-passenger-now-supports-
the-new-ruby-2-1-out-of-band-gc/)

Just one caveat: it's still a work in progress. There's currently 1 known bug
open that needs reviewing.

Besides the GC stuff, Phusion Passenger also has a very nice feature called
passenger_max_requests. It allows you to automatically restart a process after
it has processed a given number of requests, thereby lowering its peak
memory usage. As far as I know, Unicorn and Puma don't support this (at least
not out of the box; whether there are third-party tools for this, I don't
know). And yes, this feature is in the open source version.
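By way of illustration, a Passenger-for-nginx config using this option might look like the following (a sketch; the server name, paths, and the value 1000 are arbitrary examples, not recommendations):

```nginx
server {
    listen 80;
    server_name example.com;
    root /var/www/app/public;

    passenger_enabled on;
    # Restart each application process after it has handled 1000 requests,
    # bounding the peak memory usage of long-running workers.
    passenger_max_requests 1000;
}
```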

~~~
astrodust
Based on what I've seen in NodeJS, isn't it about time that Ruby has some kind
of temporary "Buffer" class that represents data in a way intrinsically
different from String?

This would allow explicit clearing of the data in a way that would break how
String works, but in the context of Buffer it would be allowed.

If the Ruby GC isn't cutting it for you, maybe old-school memory management is
the way to go, right?

~~~
phillmv
From the article, the issue is that the Ruby GC is triggered on total number
of objects, and not total amount of used memory.

The article claims this got worse with the new generational GC algorithm,
which sought to minimize the amount of work the GC has to do during the
collection pause. By marking long-lived objects as probably OK, you end up
with fewer objects to free per GC cycle.

The problem, then, is that if you allocate some massive strings and they get
marked as "old", they might never get collected by the GC. It's not clear to
me whether the article's author sees it this way, but apparently there are
some kinds of bugs where certain classes of objects get marked "old" by
accident, and it sounds like this is exacerbated by the existing interpreter
architecture.

Work on the Ruby interpreter is weirdly silo'ed off and mostly done by
Japanese developers, so there's a significant barrier to entry for any
enterprising C developer to roll her sleeves up and get hacking.

Anyhow, my understanding is that the answer to your question is no, it would
not help outside of really specific optimizations. It depends on what this
Buffer class would do. Would it let you mark buffers as being young? Or avoid
extra malloc calls because you have special information on the size of your
strings? It kind of depends on how the generational algorithm was implemented;
at this stage my knowledge grows thin.
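The promotion behavior described above can be observed directly with GC.stat. A minimal sketch (the key names used here are the modern plural ones such as :old_objects; Ruby 2.1 itself used singular names like :old_object):

```ruby
# Allocate strings that stay referenced, then watch them get promoted to
# the old generation by surviving collections.
keep = Array.new(10_000) { "some long-lived string" * 4 }

before_old = GC.stat[:old_objects]
3.times { GC.start }   # surviving collections ages (and promotes) objects
after_old = GC.stat[:old_objects]

puts "old objects before: #{before_old}"
puts "old objects after:  #{after_old}"  # typically grows: `keep` survived
puts keep.size                           # keep the array alive to this point
```

Once promoted, those strings are skipped by minor GC passes, which is exactly why a large "old" string can linger until the next major collection.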

~~~
gary4gar

        Work on the Ruby interpreter is weirdly silo'ed off and mostly done by Japanese developers, so there's a significant barrier to entry for any enterprising C developer to roll her sleeves up and get hacking.
    

This is wrong. The Ruby developers welcome contributions in any form. They
also have various resources to get started:

    
    
        Official Contributing Guide: http://ruby-doc.org/core-2.1.1/doc/contributing_rdoc.html
        Ruby Hacking Guide: http://ruby-hacking-guide.github.io/
        Book on ruby internals: http://www.amazon.com/Ruby-Under-Microscope-Illustrated-Internals/dp/1593275277
        RubySpecs: http://rubyspec.org/
    

Further, in case you get stuck, you can post on the mailing lists; someone
will surely help you get started.

~~~
stormbrew
It's entirely possible that it's changed in the last few years, but at the
very least what he said was once very true. There has traditionally been a
very real and very painful language barrier to the ruby core team.

But, to be fair, it's a bit of a goose and gander kind of situation. People
everywhere else in the world have to deal with that kind of situation _all the
time_.

~~~
steveklabnik
It has, very significantly. There is still a Japanese-only mailing list
(ruby-dev), but the English one (ruby-core) has significantly more traffic. No
decisions are made on ruby-dev that are not also discussed on ruby-core.

In addition, it's only that mailing list that's split; the bug tracker is in
English, and the help is all in English (with one or two Japanese
translations).

It's actually never been easier to contribute to Ruby.

~~~
stormbrew
That's good to hear.

------
csfrancis
Ruby 2.1.1 introduces a new environment variable,
RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR, that helps to mitigate the heap growth
caused by the generational garbage collector added in 2.1. By setting this
variable to a value lower than the default of 2 (we are using the suggested
value of 1.3) you can indirectly force the garbage collector to perform more
major GCs, which reduces heap growth.

See [https://bugs.ruby-lang.org/issues/9607](https://bugs.ruby-
lang.org/issues/9607) for more information.
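A quick way to experiment is to run a small script with and without the variable set and compare GC counts. A sketch (assuming modern GC.stat key names; script name is arbitrary):

```ruby
# Run twice and compare major GC counts:
#   ruby gc_factor_demo.rb
#   RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR=1.3 ruby gc_factor_demo.rb
# A lower factor triggers major GCs sooner, trading CPU for a smaller heap.
factor = ENV.fetch("RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR", "2 (default)")

keep = []
100_000.times { keep << ("x" * 128) }  # long-lived objects that get promoted

puts "factor:     #{factor}"
puts "major GCs:  #{GC.stat[:major_gc_count]}"
puts "minor GCs:  #{GC.stat[:minor_gc_count]}"
puts "heap pages: #{GC.stat[:heap_allocated_pages]}"
```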

~~~
jamesharker
We're using this and found that it works well at slowing down heap growth;
however, we still end up with the same maximum memory usage as before, it just
takes longer to get there.

~~~
timr
Yes, exactly. You're changing the slope of the growth, but the growth still
happens.

------
rubiquity
> _(you are using Unicorn, right? You should be!)_

No, you should be using Puma.[0] Unicorn is a work of art and it is incredibly
simple, but with that comes a lot of waste. Puma won't fix this problem, as it
lies in MRI, but you will be able to run far fewer processes, so total memory
consumption won't be such an issue.

Ruby 2.1 comes with an asterisk. It's a lot faster but you should take some
time to tune the GC to your application's needs. Aman Gupta[1] has some
excellent posts on his blog about how to do this. On Rails apps that I have
upgraded from 2.0 to 2.1 we have seen around 25% (and up to as high as 50% in
some places) decrease in response times. The GC oddities will almost certainly
get better in Ruby 2.2 (and maybe even in minor releases of 2.1, but I doubt
it).

0 - [http://puma.io](http://puma.io)

1 - [http://tmm1.net](http://tmm1.net)

~~~
otterley
"Unicorn is a work of art and it is incredibly simple, but with that comes a
lot of waste."

What "waste" are you talking about? It's a multiprocess Rack-compliant
webserver, and not much else.

~~~
FooBarWidget
Being multi-process _is_ the "waste". Each process has its own heap, its own
copy of the AST and bytecode, etc. Even with preforking and copy-on-write, the
overhead introduced by processes is still significantly bigger than running
many threads within 1 process.

Of course, this doesn't have to be a problem for everyone. Whether threads are
advantageous depends a lot on the workload.

~~~
otterley
I think a little extra memory in exchange for a whole lot of safety and code
simplicity is a fair tradeoff. If you don't have enough memory to spawn a few
hundred subprocesses I'd suggest you tackle that first.

~~~
ikawe
The _first_ step of scaling an app is _not_ to make sure you have a server
with at least 43GB of RAM.

If we say a typical Rails process is 150MB, then consider several hundred of
them...

150MB * 300 = 43GB

So, while your point is valid, in that there is value in safety and code
simplicity, we're not talking just "a little" memory. Consider your use case,
that's all.

~~~
otterley
Your point is about economics, not scaling per se. A program that requires
lots of memory may be wasteful, but it could still very well scale (in terms
of its ability to service requests under increasing load).

Nevertheless, the memory requirements of a multiprocess server are not a
linear function of the footprint of the master process. fork(2) on modern
Unices (including Linux, since, like forever) has copy-on-write semantics for
a child's memory usage. So when you fork a new process, the memory overhead is
relatively small (only the page tables are copied, not the heap, and never the
program text or shared libraries).

Only when a child process modifies or allocates a new heap page or exec()s
another program will it incur additional memory overhead. So the additional
memory required by a child Unicorn process (assuming the parent preloads the
application server code) won't usually be anywhere near 150MB (and if there's
a memory leak it's trivial to free it up by killing the child after it
services a request).
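The copy-on-write behavior described above can be sketched in plain Ruby on a platform with fork(2) (a toy illustration; real memory accounting would need to inspect RSS/PSS, which is omitted here):

```ruby
# The parent allocates a large structure once; forked children share its
# pages copy-on-write, so merely reading it does not duplicate the heap.
big = Array.new(100_000) { "payload " * 8 }

pids = 3.times.map do
  fork do
    # Reads touch shared pages without copying them...
    total = big.sum(&:length)
    # ...only writes (e.g. big[0] << "x") would force private page copies.
    exit!(total > 0 ? 0 : 1)
  end
end

statuses = pids.map { |pid| Process.wait2(pid).last }
puts statuses.all?(&:success?)  # => true
```

This is the same reason Unicorn-style preloading works: the application code loaded before the fork stays shared until a child writes to it.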

------
sams99
This article enrages me.

I totally agree with codinghorror that a blog without comments is not a blog,
and this is a prime example. There is no way to respond to the author without
jumping through crazy hoops.

As to the issue.

1. NEVER use the OOBGC that ships with Unicorn, or the old one that ships with
Passenger. Use gctools on 2.1.1
[https://github.com/tmm1/gctools](https://github.com/tmm1/gctools) or this on
2.0: [http://samsaffron.com/archive/2013/11/22/demystifying-the-
ru...](http://samsaffron.com/archive/2013/11/22/demystifying-the-ruby-gc). If
you are disabling the GC, you are doing it wrong and creating rogue processes.

2. Expect memory doubling with Ruby 2.1.1. Not happy with that? You have two
options. Option 1: tune it down by reducing
RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR; at 1, your memory consumption will be on
par with 2.0 (the default is 2). Option 2: wait for a future release of Ruby;
this will be fixed in 2.2 and maybe patched a bit more in 2.1.2. See:
[https://bugs.ruby-lang.org/issues/9607](https://bugs.ruby-lang.org/issues/9607),
[http://vimeo.com/89491942](http://vimeo.com/89491942) and
[https://speakerdeck.com/samsaffron/why-ruby-2-dot-1-excites-me](https://speakerdeck.com/samsaffron/why-
ruby-2-dot-1-excites-me)

As for the bait-filled title: Ruby's GC is ready for production and is being
used in production in plenty of places. Just don't go expecting rainbows if
you disable it, and expect memory doubling with 2.1.1 if left untuned. You can
choose how much you want the memory to increase relative to 2.0.

~~~
timr
OK, I'll bite.

First, sorry we don't have comments yet. I'm the author, so you no longer have
to be enraged. Perhaps just annoyed.

Second, with Ruby 2.1, out of the box, this code:

    
    
       while true do 
        "a" * (1024 ** 2)
       end
    

leads to infinite process growth. There should be no need for "memory
doubling" to run this code; you're generating throwaway strings of identical
size. Similar code, in other languages, does not lead to out-of-control memory
consumption.

Also, the problem isn't caused by disabling the GC. This happens in stock Ruby
2.1. Disabling the GC (and running it once per request) is the _fix_ for the
problem.
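The disable-then-collect-per-request pattern can be sketched as a small Rack middleware (a hypothetical illustration with a made-up class name, not the actual Unicorn/Passenger implementation, which hooks in only after the response has been fully sent to the client):

```ruby
# Blunt out-of-band GC: keep the GC disabled while handling requests and
# run a full collection between them. Note this simplistic version runs the
# GC before the response body is streamed; real OOBGC hooks run after.
class PerRequestGC
  def initialize(app)
    @app = app
    GC.disable
  end

  def call(env)
    @app.call(env)
  ensure
    GC.enable
    GC.start   # one full collection per request
    GC.disable
  end
end
```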

~~~
vidarh
> There should be no need for "memory doubling" to run this code -- you're
> generating throwaway strings of identical size.

How do you know they are throwaway?

Your code involves two method calls. While it would be unlikely that someone
has overridden them, given that they are core String and Fixnum methods, it is
perfectly possible. E.g.:

    
    
      class String
        alias_method :old_mul, :*
    
        def *(right)
          $store ||= []
          $store << self
          old_mul(right)
        end
      end
    
      "foo" * 5
      "bar" * 3
    
      p $store
    

In other words, even with seemingly innocent calls like that, it takes extra
work to be able to reuse that memory without a full GC pass. It's certainly
not _impossible_, but it's not there yet.

So I agree with you in principle, but this is one of those areas where the
malleability of Ruby objects makes it tricky to optimize.

(One possible approach is to set aside a bit to indicate "has at least once
been stored somewhere where it can escape", as a poor man's escape analysis:
if your object is only ever stored in local variables that have not been
captured by a lambda, or passed as arguments, then it can't escape higher than
where it was created, and so you can take shortcuts; otherwise you'd still
need a full GC pass.)

------
Legion
> you are using Unicorn, right? You should be!

No. [http://puma.io/](http://puma.io/)

~~~
thinkbohemian
Heroku Ruby guy here. I recommend Puma, especially if your app is threadsafe.
Even if it isn't, you can still run multiple workers with only one thread
each, and you get protection against slow clients (which Unicorn doesn't
have): [https://devcenter.heroku.com/articles/deploying-rails-
applic...](https://devcenter.heroku.com/articles/deploying-rails-applications-
with-the-puma-web-server)

I also wrote this gem that tunes the number of Puma workers and resets any
workers where a memory leak is detected:
[http://github.com/schneems/puma_auto_tune](http://github.com/schneems/puma_auto_tune)

~~~
rubiquity
I admire your recent work with threads in Ruby in the Puma libraries and the
threaded in-memory queue (I was working on something similar myself; I keep
forgetting to get back to it though). Most companies I see using Heroku are
always wondering whether their bill could be lower, and most of the time they
are using preforking servers and background job processors.

I look forward to your talk at Ancient City Ruby next week!

------
noelwelsh
Why not use the JVM (e.g. JRuby)? These problems have long been solved there
(if they ever occurred to start with).

~~~
gdeglin
[http://www.isrubyfastyet.com/](http://www.isrubyfastyet.com/) seems to show
that Rails+JRuby performance remains disappointingly bad. The memory usage is
kind of nuts too, not sure why it's so high.

Edit:

This page explains the high memory usage and it seems more reasonable now:
[https://github.com/jruby/jruby/wiki/Troubleshooting-
Memory-U...](https://github.com/jruby/jruby/wiki/Troubleshooting-Memory-Use)

~~~
teacup50
These benchmarks are uselessly bad:

- The client connections are run _without concurrency_, _serially_, _on the
same machine_:
[https://github.com/brianhempel/isrubyfastyet/blob/master/run...](https://github.com/brianhempel/isrubyfastyet/blob/master/runner/benchmarks/rails/requests_per_second_benchmark.rb#L16)

- The JIT is only given _3 seconds_ to warm up:
[https://github.com/brianhempel/isrubyfastyet/blob/master/run...](https://github.com/brianhempel/isrubyfastyet/blob/master/runner/benchmarks/rails/requests_per_second_benchmark.rb#L12)
HotSpot kicks ass, but not after only 3 seconds of serially issued requests.

- By running the tests single-threaded, they're throwing away one of the
major wins of the JVM: low-overhead concurrency.

------
dclowd9901
Using ARC with Objective C has been an amazing experience, and I come from the
brainless world of Javascript GC.

Any reason why more environments don't adopt this approach? It seems entirely
efficient, reasonable and well-designed...

~~~
dunham
Python does reference counting.

The downside is that you can leak memory if you have circular references. The
programmer needs to take care to not let this happen. (This is essentially the
cause of the Javascript memory leaks in IE, although the reference counting
only occurred at the boundary between Javascript objects and browser objects.)

I believe Python handles this by occasionally running a garbage collector that
looks for such leaked objects.

Objective C offers weak pointers (pointers that "don't count" and zero
themselves out when all the other pointers to the object disappear), which can
aid a programmer in avoiding circular references.

Finally, I believe it has been shown that reference counting can be slower
than a well-implemented garbage collector, but I suspect this depends on your
specific workload.
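For what it's worth, Ruby's standard library has an analogous tool: WeakRef (from the weakref stdlib) wraps an object without keeping it alive, playing the same cycle-breaking role as weak pointers. A sketch (whether the referent is actually collected depends on GC timing):

```ruby
require "weakref"

obj  = Object.new
weak = WeakRef.new(obj)

puts weak.weakref_alive?  # => true, obj is still strongly referenced

obj = nil
GC.start
# After a collection the referent *may* be gone; calling methods through a
# dead WeakRef raises WeakRef::RefError, so guard with weakref_alive?.
puts "collected" unless weak.weakref_alive?
```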

~~~
nostrademons
Reference counting is also the reason why Python requires a GIL. Incrementing
a reference count is an opportunity for a race condition (another thread could
read the refcount in between your read and write), which means that every
reference count needs to be protected by a lock. Either you do fine-grained
locking for each object, or you add a GIL for the whole interpreter. The
former will absolutely kill performance (not only do you need to increment a
refcount with every assignment or function call, you need to take a lock). The
latter makes the whole interpreter thread-hostile.

~~~
teacup50
You're off in the weeds. A refcount can be incremented/decremented with a
simple atomic compare-and-swap, and that's exactly what most refcounting
systems do.

It's not free, but it's damn close to it.

~~~
herokusaki
If that is the case then why doesn't Python do it? I mean this as a real
question; I genuinely don't know.

~~~
teacup50
Because it was designed and written poorly, and its thread-safety issues
extend far beyond refcounting.

------
purephase
Another vote for Puma. We're using Unicorn in our production environment right
now with the unicorn-worker-killer gem, but our initial tests with Ruby 2.1
and Puma in dev/QA are going well, so we're looking to move to that setup in
the near future.

~~~
dcu
A nice thing about Unicorn is zero-downtime deployments; can you do that with
Puma?

~~~
projct
Yep. [https://www.digitalocean.com/community/articles/how-to-
set-u...](https://www.digitalocean.com/community/articles/how-to-set-up-zero-
downtime-rails-deploys-using-puma-and-foreman)

------
rubyn00bie
Oh Unicorn... a little GC tuning goes a long way, and isn't that hard... For
most apps the time spent in GC, while annoying, is far and away not the
bottleneck.

... especially since Ruby 2.0, as the GC is much, much faster.

I'll be honest, I don't think the memory bloat is that problematic if you
design the app well (business logic definitely prevents this sometimes)... but
in general you can pass most of that off to a background worker, or pre-cache
responses so you aren't bloating your app server instances/threads.

------
ptx
Couldn't this problem be largely solved the same way C# does it, with its
GC.AddMemoryPressure function? If you're using lots of unmanaged memory, you
simply inform the GC when you're allocating and freeing it, so that it can be
taken into account.

[http://msdn.microsoft.com/en-
us/library/system.gc.addmemoryp...](http://msdn.microsoft.com/en-
us/library/system.gc.addmemorypressure\(v=vs.110\).aspx)

------
sayrer
Doesn't Ruby need to break compatibility with its C API to get a good result
here?

To get a copying, generational GC as in Java, it would need to stop handing
out raw pointers.

------
x0x0
Stuff like this is why people use the JVM; at this point, it's pretty
bulletproof.

~~~
Fasebook
Gotta love those reflection api attacks and other remote procedure injections.
Also, don't forget that everything is an object.

~~~
mavroprovato
Could you provide an example of those attacks/injections?

~~~
Fasebook
Read this backwards:
[http://en.wikipedia.org/wiki/Java_EE_version_history](http://en.wikipedia.org/wiki/Java_EE_version_history)

Of course, the JVM is insecure by design; that's the whole point of a VM,
really, to run anything no matter the context, so there will always be the
same ways to escape the sandbox:
[http://seclists.org/fulldisclosure/2013/Jul/172](http://seclists.org/fulldisclosure/2013/Jul/172)
The only thing keeping it secure is legions of devs and an "application
server" propagating trust.

~~~
mavroprovato
I really do not understand what the Java EE version history page has to do
with the security of the Java runtime.

As for the other link, I think you have mixed things up a little. Those
vulnerabilities are ways to bypass the Java sandbox when Java is running in
the browser as an applet. This is really not comparable to Ruby, as Ruby
cannot run in the browser and does not even have sandbox functionality as far
as I know.

Show me a way to remotely execute code on a machine just because it is running
Java, and you will have my fullest attention :-)

------
jrochkind1
Seems like a reason to stick to ruby 2.0 and not upgrade to 2.1, no?

------
iancarroll
Ha, so Unicorn did have a bug in it.

I thought it was me.

------
jafaku
Seems like Rubyists spent too much time bashing PHP and very little time
improving their language, and now they have been left behind.

------
znowi
Whenever I read that "you should be using X", credibility of the author
rapidly diminishes for me.

