

Ruby 2.1 Garbage Collection to Address Criticism on Large Scale Deployments - DanielRibeiro
http://www.infoq.com/news/2013/09/ruby-2-1-gc-revamp

======
exDM69
You should go and read the Ruby Garbage Collector implementation. It's very
straightforward to read; the code is simple but reveals why it's so slow. I
won't say anything bad about it but it's not a stellar piece of software
engineering.

The good thing is that there's plenty of room for improvement.

~~~
ioquatix
What surprises me is that there are a lot of companies using Ruby. Why don't
they put their money where their mouth is? Hire a team of sufficiently skilled
developers and pay them to improve the GC implementation. It isn't rocket
science :)

~~~
mikekersehy
There's only so many GC engineers in the world, and the majority work in
academia, industrial research labs and places like the Oracle VM team. So some
web-app company isn't likely to hire one. Also they aren't interested in
making some toy GC for a language they don't use.

~~~
justincormack
And this is why Silicon Valley has a hiring problem.

No, you hire someone who can code C and read the GC literature. The problem
space is well understood. You are not breaking new ground, just applying known
algorithms.

~~~
dasil003
This is an absolutely ridiculous comment. Writing a GC for a dynamic language
like Ruby is not just a question of knowing C and reading some literature.
It's a massive problem to solve. This is not something you double your burn
rate for in the hopes that 2-3 years down the line you can double the
performance of your application cluster. It's simply not worth it. Startups
have immediate problems to solve or they die.

And BTW, the reason we have a hiring problem is not because we are too picky
about specific experience, it's because when it comes to software engineering
some people have it and some people don't, and the ones who don't are the ones
whose resumes are consistently out there filling up inboxes, and the ones who
do are rarely looking for a job, and when they are the Facebooks, Twitters and
Googles of the world are throwing around $200k + benefits like pocket change.

~~~
coldtea
> _This is an absolutely ridiculous comment. Writing a GC for a dynamic
> language like Ruby is not just a question of knowing C and reading some
> literature. It's a massive problem to solve._

No it's not. There are tons of good GC's around. You can even improve Ruby's
GC with 1990 technology.

(E.g., did the LuaJIT guy solve a "massive problem"? Mostly by himself?)

~~~
jamesaguilar
Some people routinely underestimate what a team of two or three smart guys and
gals can do if they get in a room together and really work on a problem.

------
BruceM
It is too bad they aren't looking at the very nice MPS:
[http://www.ravenbrook.com/project/mps](http://www.ravenbrook.com/project/mps)

------
gary4gar
The sad fact is that MRI, no matter how much effort we put in, would never be
able to match the GCs of battle-tested VMs like the JVM.

So I am wondering whether it makes sense to stop designing a VM and just focus
on the language. Perhaps something like Truffle can be used to do the
optimizations for you. Ruby on Truffle is already 4-5x faster than MRI, and the
Ruby implementation was done by an intern in just 6 months, all because of the
power of the JVM.

    
    
            https://twitter.com/headius/status/362616159897534465
            http://www.oracle.com/technetwork/java/jvmls2013wimmer-2014084.pdf
    

Something to keep in mind.

~~~
vidarh
The numbers for Ruby on Truffle are meaningless until people have had a chance
to hit it hard with all the particular oddities of Ruby. Consider that their 6
months went into hitting 45% of RubySpec. You can reach 45% of RubySpec fairly easily
if you go for the softest targets (I'm not saying that's what they've done - I
haven't checked).

[EDIT: I see they're doing some interesting things that certainly ought to
beat MRI. If I understand it correctly it seems like they are somehow
collapsing type checks for multiple operations. Of course the devil is in the
details - if they are trying to defer type or method checks, and throw away
results if the checks fail (which should be rare), that will only be safe if
any modifications that do happen do not introduce or remove side effects that
can't be "rolled back", but I might be misunderstanding their presentation]

The problem is the multitude of bizarre things that are legal Ruby. Like
people doing eval("class Fixnum; def + other; 42; end; end;"). Yes, that's
legal, and yes that means any integer arithmetic in your application is
suddenly broken. More importantly it means any optimisations based on your
beliefs about what any piece of code is meant to do, while they are most
likely right, _can_ turn out to be horribly wrong and so are problematic for a
VM or compiler, without a substantial amount of logic to detect this and
bail out from optimised code to safe fallbacks. Doing so without slowing down
the code when your guesses are right is hard because of how many ways there
are of changing the behaviour of code in Ruby.

Unless your compiler understands eval() and it is _possible_ for it to reason
about the contents of the eval string, it can make pretty much _zero_
guarantees about the state of the world after an eval() call, and so it can
make pretty much zero guarantees about the state of the world after _any_
method call that could reach such an eval() call.
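
A minimal sketch of that effect, assuming a modern Ruby (2.4 or later) where Fixnum has been folded into Integer, so the patch targets Integer#+ instead. The `add` helper and the `__orig_plus` alias are made-up names for illustration, and the original method is put back afterwards so the demo doesn't wreck the rest of the process:

```ruby
# Ordinary-looking code that a compiler would love to turn into a single
# machine add instruction.
def add(a, b)
  a + b
end

before = add(1, 2)

# Keep a handle on the real Integer#+ so it can be restored below.
# (__orig_plus is a hypothetical name used only for this demo.)
Integer.class_eval { alias_method :__orig_plus, :+ }

begin
  # The string could come from anywhere at runtime; nothing in `add` hints
  # that its meaning is about to change.
  eval("class Integer; def +(other); 42; end; end")
  after = add(1, 2)
ensure
  # Undo the damage: point :+ back at the original and drop the alias.
  Integer.class_eval do
    alias_method :+, :__orig_plus
    remove_method :__orig_plus
  end
end

restored = add(1, 2)

p before    # 3
p after     # 42
p restored  # 3
```

The body of `add` never changes, yet its result does, which is the whole problem for anything that wants to optimise it ahead of time.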

Admittedly, that's a stupid thing to do, but it's legal in Ruby, and while the
above example is extreme, you do find a lot of usage that is roughly equivalent.
E.g. autoload creates as much lack of predictability as eval. So does a
'require' or 'load' that might get triggered later in execution, for example.

The reason those are important is that they make a massive number of
optimisations far harder: You can't blindly cache method pointers, for
example, because any method call potentially invalidates them. You can't even
cache class pointers, because _they_ can change: You can return from a method
call and suddenly an object has an eigenclass. You can't inline functions
without guarding them somehow to fall back to the full method call when it
turns out some idiot _did_ redefine Fixnum#+. You can't assume seemingly
"safe" stuff like Fixnum#+(some other Fixnum) will even return an object of
the type you assume, for the same reason - someone might decide to implement a
DSL that redefines it.
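
To make the eigenclass point concrete, here is a small sketch. `Greeter` and `innocent_looking_call` are hypothetical names; the only thing it relies on is that Ruby lets you define a singleton method on any ordinary object:

```ruby
class Greeter
  def hello
    "hello"
  end
end

# Looks harmless from the call site, but it gives its argument a singleton
# method, and therefore its own singleton class (eigenclass).
def innocent_looking_call(obj)
  def obj.hello
    "hijacked"
  end
end

g = Greeter.new
other = Greeter.new

first = g.hello
innocent_looking_call(g)
second = g.hello

puts first                # hello
puts second               # hijacked
puts other.hello          # hello (other instances are untouched)
p g.singleton_methods     # [:hello]
p g.class                 # Greeter (still reports the same class)
```

So a cache keyed on "the class of `g`" that was filled before the call is stale afterwards, even though `g.class` reports the same thing.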

Frankly, it'd be fantastic to start deprecating some of the more obnoxious
things like these, and weeding out the few uses of them, but as it stands
today, a fast Ruby subset is "easy". A fast complete Ruby implementation is an
entirely different beast. A fast incomplete Ruby implementation that refuses
to support some of the most noxious corner cases would still be extremely
useful for a lot of people, though.

(in the interest of disclosure since I'm talking about another Ruby
implementation: I'm writing a series on my own slow process of writing a Ruby
compiler, though my goals are very different - mostly focused on writing about
the process)

~~~
chrisseaton
I'm the author of Ruby on Truffle.

I'll talk you through exactly how we solve the problem of redefining Fixnum,
as one example of how we've tackled these problems.

Whenever you use Fixnum#+ in one of your methods, we look up what that method
is and cache the method so we can call it quickly next time. We actually never
again check that this cache is still valid. The trick is that we sort of do
the opposite - any time you do something that could invalidate that cache, we
find the installed machine code that uses it, and delete it. If the machine
code is still running somewhere on some stack for some thread or fibre, we
jump from the machine code into an interpreted version which looks up the
method again and carries on.

So Kernel#eval makes no difference - if something that you eval ruins your
later cached method calls in the same method, that's not a problem because if
you're still running the same machine code, then you can't have redefined
Fixnum#+. If you had redefined it, you'd be back in the interpreter getting
ready to compile again with new caches.
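
A rough toy model of the "invalidate instead of check" idea, in plain Ruby. This is not how Truffle is implemented: real deoptimisation swaps running machine code for an interpreter, which can't be shown here. `Registry`, `CallSite` and `Vec` are all hypothetical names; the only real Ruby features used are the `method_added` hook and `UnboundMethod#bind`:

```ruby
# Tracks which call sites depend on which (class, method name) lookup.
module Registry
  @sites = Hash.new { |hash, key| hash[key] = [] }

  # Remember that +site+ cached the method +name+ looked up on +klass+.
  def self.depend(klass, name, site)
    @sites[[klass, name]] << site
  end

  # Called on redefinition: drop the cache of every dependent call site.
  def self.invalidate(klass, name)
    sites = @sites.delete([klass, name]) || []
    sites.each(&:invalidate!)
  end
end

class CallSite
  attr_reader :lookups

  def initialize(name)
    @name = name
    @cached = nil
    @lookups = 0
  end

  # Monomorphic for simplicity: assumes every receiver has the same class.
  def call(receiver, *args)
    unless @cached
      @lookups += 1
      @cached = receiver.class.instance_method(@name)
      Registry.depend(receiver.class, @name, self)
    end
    # Fast path: no validity check, the cache is trusted until invalidated.
    @cached.bind(receiver).call(*args)
  end

  def invalidate!
    @cached = nil
  end
end

class Vec
  attr_reader :x

  def initialize(x)
    @x = x
  end

  # Ruby calls this hook whenever an instance method is (re)defined on Vec.
  def self.method_added(name)
    Registry.invalidate(self, name)
  end

  def +(other)
    Vec.new(x + other.x)
  end
end

site = CallSite.new(:+)
a = Vec.new(1)
b = Vec.new(2)

r1 = site.call(a, b).x
r2 = site.call(a, b).x          # served from the cache, no second lookup
lookups_before = site.lookups   # 1

# Monkey-patch: the hook fires and the call site's cache is thrown away.
class Vec
  def +(_other)
    Vec.new(42)
  end
end

r3 = site.call(a, b).x          # looks the method up again, sees the patch
lookups_after = site.lookups    # 2
```

The cost of the redefinition is paid once, by whoever redefines the method, rather than on every call by code that never redefines anything.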

I'll also just point out that running RubySpec means we are successfully
running something like 5000 lines of off-the-shelf unmodified systems code,
just for the harness before we even get to the tests.

Our theory is that we can make Ruby very fast, without having to forgo any of
your favourite random dynamic monkey-patching features.

Watch the video:
[http://medianetwork.oracle.com/video/player/2623645003001](http://medianetwork.oracle.com/video/player/2623645003001)

Join us on the mailing list:
[http://mail.openjdk.java.net/mailman/listinfo/graal-dev](http://mail.openjdk.java.net/mailman/listinfo/graal-dev)

~~~
VeejayRampay
Would you mind providing some "grand order of things" hand-wavey estimate as
to when exactly the public can expect to have Fast Ruby ™ © ®?

Also, will it be the very same Ruby we all know, compatible with everything? i.e.
will it be the Christmas I envision?

~~~
chrisseaton
I'm afraid I can't - sorry. Keep an eye on the mailing list or follow me on
twitter (@ChrisGSeaton) though.

~~~
VeejayRampay
Done, thanks. And good luck to you guys.

------
mje__
Is Github still running Ruby 1.8? vmg mentioned at last year's Rubyfuza they
had "one of the largest 1.8 deployments in the world".

~~~
imbriaco
Nope, we're on 1.9. We were briefly on 2.0 (last week) but had some minor
performance regressions that we need to understand before we go back.

------
sluukkonen
For anyone interested in how it will be implemented, see ko1's presentation
slides from Euruko 2013.

[http://euruko2013.org/speakers/presentations/toward_more_eff...](http://euruko2013.org/speakers/presentations/toward_more_efficient_ruby_2_1-koichi.pdf)

------
rustc
Are the videos of the talks available? Or more details about the
implementation of the new architecture?

~~~
lucaspiller
They will be eventually on [http://www.baruco.org/](http://www.baruco.org/).
The conference has only just finished.

