
Ruby 2.1: Out-of-Band GC - gerjomarty
http://tmm1.net/ruby21-oobgc/
======
gary4gar
It would be nice to get these patches in core as part of ruby 2.1.1 so others
don't need to patch & recompile ruby from source.

Other than that, anything that improves performance is a welcome change.

~~~
gnufied
I think Aman already proposed those changes to ruby-core -
[http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/...](http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-core/59728)

He is a member of ruby-core too, so I guess if enough people in core like
these patches, they should go right in. However, I don't see the OOBGC patch
in that list.

~~~
gnufied
I have not been reading carefully, just noticed that OOBGC stuff is part of -
[http://rubygems.org/gems/gctools](http://rubygems.org/gems/gctools)
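
For anyone wondering what "out-of-band" means in practice, here is a minimal
sketch of the idea using only core Ruby (`GC.start`, `GC.count`). It is not
the gctools API: `handle_request` and `serve_with_oob_gc` are hypothetical
names made up for illustration, and collecting after every single request is
a deliberately naive policy. A real out-of-band collector would be more
selective about when it runs.

```ruby
# Hypothetical stand-in for real request handling; it just allocates
# a batch of short-lived strings the way a web request might.
def handle_request(n)
  Array.new(1_000) { |i| "req-#{n}-#{i}" }.size
end

# Naive out-of-band loop (an assumption, not the gctools implementation):
# run a collection after each request, when no client is waiting on us.
# Returns how many GC runs happened in the gaps between requests.
def serve_with_oob_gc(request_count)
  oob_runs = 0
  request_count.times do |n|
    handle_request(n)             # the response would be sent here
    before = GC.count             # total GC runs so far in this process
    GC.start                      # collect outside the request
    oob_runs += GC.count - before
  end
  oob_runs
end

puts serve_with_oob_gc(5)
```

The point is only that the pause moves from the middle of a request to the
gap between requests; the total GC work does not go away.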

------
exDM69
Very nice to see improvement in Ruby's GC. Back in the Ruby 1.8 days, I read
the source of the GC implementation and I was less than impressed. I should
take the time to revisit that code and see what kind of improvements have
been made.

~~~
steveklabnik
In that time frame, the interpreter has been entirely re-written, and we've
gone through three or four different garbage collectors.

So yeah, it should be quite different.

------
rurounijones
MRI Ruby seems to be making a lot of progress in the implementation details
recently. Good to see.

~~~
munificent
Anecdote time!

I've been hacking on a little scripting language[1] lately. To see how its
performance compares, I run a few benchmarks[2] against other similar
(dynamically typed, bytecode compiled) languages: Lua, LuaJIT (interpreted),
Python, and Ruby.

Like others, I had internalized "Ruby is slow" through osmosis. But the
version of Ruby I happen to have on my machine is 2.0.0.

In my little benchmarks, it turns out Ruby is one of the fastest. I haven't
compared to 1.8.7, but I'm guessing that was much slower.

[1] [https://github.com/munificent/wren](https://github.com/munificent/wren)
[2]
[https://github.com/munificent/wren/tree/master/benchmark](https://github.com/munificent/wren/tree/master/benchmark)

If you want the gory details, here are the results of running them against Lua
5.2.3, LuaJIT 2.0.2, Python 2.7.5, and Ruby 2.0.0. (Yes, I should try against
Python 3. I will.):

    
    
                              score  time   relative (wren score / score)
        binary_trees - wren    3193  0.31s
        binary_trees - lua     1366  0.73s  233.72%
        binary_trees - luajit  6256  0.16s   51.04%
        binary_trees - python  1376  0.73s  231.96%
        binary_trees - ruby    3003  0.33s  106.34%
    
        fib - wren             3078  0.32s
        fib - lua              2785  0.36s  110.52%
        fib - luajit           6900  0.14s   44.62%
        fib - python           1331  0.75s  231.37%
        fib - ruby             3548  0.28s   86.77%
    
        for - wren             6080  0.16s
        for - lua             11990  0.08s   50.71%
        for - luajit          45914  0.02s   13.24%
        for - python           2825  0.35s  215.25%
        for - ruby             6595  0.15s   92.19%
    
        method_call - wren     4707  0.21s
        method_call - lua      1674  0.60s  281.17%
        method_call - luajit   4221  0.24s  111.53%
        method_call - python    767  1.30s  613.69%
        method_call - ruby     3061  0.33s  153.77%
    

As you can see, Ruby fares quite well.

~~~
mcguire
" _dynamically typed, bytecode compiled_ "

I haven't followed Ruby development, but the last time I heard, Ruby 1.8 and
prior were tree-walking interpreters: simple to implement but not fast at
all.

~~~
srd
This changed in 1.9. Ruby is now compiled into bytecode for a stack-based
virtual machine (YARV). Ruby 1.8 walked the AST directly, you're right there.
This made e.g. method calls very slow compared to the new bytecode
representation.
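
If you want to look at that bytecode yourself, MRI exposes it through
`RubyVM::InstructionSequence`. A small sketch, assuming you are on MRI 1.9 or
later (other implementations such as JRuby do not have this class);
`bytecode_for` is just a helper name invented here:

```ruby
# MRI-only: compile a snippet to YARV bytecode, then return both the
# human-readable instruction listing and the result of running it.
def bytecode_for(source)
  iseq = RubyVM::InstructionSequence.compile(source)
  [iseq.disasm, iseq.eval]
end

listing, result = bytecode_for("a = 1; b = 2; a + b")
puts listing   # exact instruction names and layout vary between Ruby versions
puts result
```

The listing shows values being pushed onto and popped off a stack, which is
the "stack machine" part.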

------
FooBarWidget
We've now added support for the Ruby 2.1 Out-of-Band GC in Phusion Passenger:
[http://blog.phusion.nl/2014/01/31/phusion-passenger-now-supp...](http://blog.phusion.nl/2014/01/31/phusion-passenger-now-supports-the-new-ruby-2-1-out-of-band-gc/)

------
kcorbitt
As someone woefully uninformed about these things, why can't GC be implemented
as a separate thread, maybe with a lower priority than the primary
interpreter? Would a separate thread not be able to count references to
objects or something?

~~~
judofyr
Quoting ko1
([https://bugs.ruby-lang.org/issues/8339#note-11](https://bugs.ruby-lang.org/issues/8339#note-11)):

    
    
        > Parallel tracing needs an assumption that "do not move (free) memory
        > area except sweeping timing". Current CRuby does.
        > For example: "ary << obj". Yes, the CRuby's memory management strategy
        > (assumption) is different from normal interpreters.

~~~
m0th87
Isn't memory only shuffled around during compaction? Why not mark/sweep in
parallel, then stop-the-world at compaction time?

~~~
Skinney
In a nutshell, tracing GCs (generational ones included) work by scanning live
data, so anything not scanned is treated as garbage. While the application is
running, it also keeps allocating new data.

If you allocate an object somewhere _after_ that area has been scanned, the
GC will treat the new object as garbage and free it while it is still in use.
Then you have memory corruption, leading to segfaults and other nice things.

While parallel mark/sweep is certainly doable (Java's concurrent collectors
do this), it's not easy to get right.
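
To make the failure mode concrete, here is a toy simulation of it. This is
not how MRI's collector is written: `ToyHeap` and `late_allocation_demo` are
made-up names, the "heap" is just a Hash of ids, and the "concurrency" is
faked by letting the program allocate between the mark and sweep steps.

```ruby
# Toy heap: each object is an id mapped to the ids it references.
class ToyHeap
  attr_reader :objects

  def initialize
    @objects = {}
    @marked = {}
  end

  def allocate(id, refs = [])
    @objects[id] = refs
  end

  # Mark phase: flag everything reachable from the roots.
  def mark(roots)
    @marked = {}
    stack = roots.dup
    until stack.empty?
      id = stack.pop
      next if @marked[id] || !@objects.key?(id)
      @marked[id] = true
      stack.concat(@objects[id])
    end
  end

  # Sweep phase: free every object that was not marked.
  def sweep
    freed = @objects.keys.reject { |id| @marked[id] }
    freed.each { |id| @objects.delete(id) }
    freed
  end
end

def late_allocation_demo
  heap = ToyHeap.new
  heap.allocate(:root, [:a])
  heap.allocate(:a)
  heap.allocate(:garbage)

  heap.mark([:root])

  # The program keeps running between mark and sweep: it allocates a new
  # object and stores a reference to it in an already-scanned object.
  heap.allocate(:late)
  heap.objects[:root] << :late

  [heap.sweep, heap]
end

freed, heap = late_allocation_demo
p freed   # => [:garbage, :late]
```

After the sweep, `:root` still references `:late`, but `:late` has been freed.
In a real VM that is a dangling pointer. Concurrent collectors avoid this with
extra bookkeeping (for example, write barriers, or treating objects allocated
during a collection cycle as live), which is a large part of why they are hard
to get right.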

