
Why blocks make Ruby methods slower - chippy
https://www.omniref.com/ruby/2.2.0/symbols/Proc/yield?#annotation=4087638&line=711&hn=1
======
pjungwir
Wow, that is super interesting!

I think they buried the lede though. (Edit: not complaining! I like tech
taught via a discovery narrative like they use here.) Given this:

    def a(&block)
      yield
    end

    def b
      yield
    end

Running `a {1+1}` is 4x slower than running `b {1+1}`. I didn't even know you
_could_ yield in a method without an explicit &block parameter. I guess
I'll change my ways then, although I like having the block declared in the
method signature so I know at a glance I can pass one. Too bad!

So I understand from the article that the slowness is because of extra memory
allocation etc., but I still don't understand _why_. I mean, if both methods
behave the same, what is the difference? Is it because declaring &block means
my code in the method body can now "do stuff" with `block`, whereas in the
implicit case I don't have a reference so I can't do any funny business but
just yield? That would make sense.
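
For anyone who wants to reproduce the comparison, here is a minimal sketch using the standard `Benchmark` module. The method names and the iteration count are made up for illustration, and no particular ratio is guaranteed: the numbers depend on the interpreter and version you run it on.

```ruby
require 'benchmark'

# Explicit block parameter: the interpreter may have to build a Proc object.
def with_block_param(&block)
  yield
end

# Implicit block: the block is reachable only through yield.
def with_implicit_block
  yield
end

# Times n calls of each form and returns the elapsed seconds per form.
# (Hypothetical helper; n is an arbitrary default.)
def compare_block_styles(n = 200_000)
  {
    explicit: Benchmark.realtime { n.times { with_block_param { 1 + 1 } } },
    implicit: Benchmark.realtime { n.times { with_implicit_block { 1 + 1 } } }
  }
end
```

Calling `compare_block_styles` returns a hash of timings you can print or compare. A single run is noisy, so repeat it a few times before drawing conclusions.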

~~~
SeoxyS
Even though it's slower, I avoid the latter form as a matter of style. I think
it's super important that if a method yields, it's shown in the method
signature. Otherwise, your code won't be self-documenting, and it's going to
cause weird and hard-to-debug bugs later on.

~~~
dragonwriter
> Even though it's slower, I avoid the latter form as a matter of style. I
> think it's super important that if a method yields, it's shown in the method
> signature.

Using &block in the signature does not indicate that the method yields, it
indicates that the method reifies a Proc. To me, it's a sign that the method
does something which makes it likely that the passed block will be called
outside of the usual lifetime of the called method.

Now, it may be an issue that Ruby has no self-documenting signature element
which indicates that a method expects a block for the purpose of yielding
without reification, but overloading the signature element which is for
reification doesn't really address that (and destroys one element of
self-documentation in order to create another one).
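
A rough illustration of the distinction, with hypothetical names (`EventHub`, `on_event`, `fire`, `twice`): the first method has to reify the block because it keeps it past its own return, while the second only yields during the call.

```ruby
# Hypothetical class: it stores the block, so the block must become a Proc.
class EventHub
  def initialize
    @handlers = []
  end

  # &handler is required here: the block is kept after on_event returns.
  def on_event(&handler)
    @handlers << handler
    self
  end

  # Calls the stored blocks later, long after on_event has returned.
  def fire(payload)
    @handlers.map { |handler| handler.call(payload) }
  end
end

# No reification needed: the block is used only while twice is running.
def twice
  yield
  yield
end
```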

~~~
stormbrew
So, that the function reifies the block is not part of its _interface_,
though, it's part of its implementation. If anything it's a problem that this
implementation detail leaks out through the interface.

Furthermore, if you pass it a block with & on the call-side, it's not really
reifying a block anyways (even if there is still a performance penalty, which
I've never checked), so it's not even true that's what it means in general.

The truth is all functions take a block in ruby. Only some of them do anything
with it. Given that, indicating that you take a block with the &block argument
seems entirely reasonable to me, if you don't mind the performance hit.
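
A small sketch of both points, using made-up method names. Any method accepts a block whether or not it mentions one, and a Proc passed with `&` on the call side arrives as the same object rather than a newly created one.

```ruby
# A method that never mentions a block still accepts one and ignores it.
def ignores_block
  :done
end

# block_given? reports whether the caller supplied a block.
def reports_block
  block_given?
end

# Returns whatever Proc the &block parameter holds (nil if no block was given).
def capture(&block)
  block
end
```

With `pr = proc { 1 }`, the expression `capture(&pr).equal?(pr)` is true, so no new Proc is built in that case.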

------
mdavidn
This isn't just a syntactic difference. A Proc can outlive the method that
defined it, so the interpreter must lift local variables from the stack to the
heap. A block, however, can never outlive the method that defined it, so local
variables can remain on the stack. In theory, the interpreter could perform an
escape analysis and implicitly convert `block.call` to `yield` if no `block`
reference survives local scope. Presumably MRI 2.3 does this analysis.

Python has a similar performance impact for closures. Defining a lambda forces
local variables onto the heap, even if the lambda is never instantiated, e.g.
when defined inside an infrequently true condition.
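
To make the lifetime argument concrete, here is a minimal sketch with hypothetical helpers. The lambda returned by `make_counter` keeps the local `count` alive after the method has returned, while the block given to `each_pair_sum` can only run while that method is still on the stack.

```ruby
# The returned lambda outlives make_counter, so `count` must survive too.
def make_counter
  count = 0
  -> { count += 1 }
end

# The caller's block cannot escape: it is only invoked through yield
# while each_pair_sum is still executing.
def each_pair_sum(pairs)
  pairs.each { |a, b| yield a + b }
end
```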

------
jrochkind1
It is highly unlikely that this will make a non-trivial performance difference
in any real code. Please don't go micro-optimizing your code for this, it's
unlikely to be worth any loss in readability or maintainability. And it's
platform dependent, may behave entirely differently in JRuby, or in future
versions of MRI.

------
fataliss
I love these kinds of clear explanations! It's probably not life changing, but
at least you get a thorough explanation, not just a statement telling you
that x is faster than y. Props to Omniref and the reddit guys!

------
_pius
This title really should read "439% slower" rather than just "slower."

~~~
chippy
It was submitted as such, but must have been altered by a moderator.

------
ffn
Thank you, this was great! Both my work and side projects all run on Ruby
(except the ones that run on js), and this has been extremely helpful. I
actually wrote def(whatever, &block) ... all the time because I was told that
it would be clearer that a block was needed, but if just dropping the &block
means I can get a 4x speed upgrade... well, time for a bit of refactoring in
various critical loop sections.

~~~
knodi123
> if just dropping the &block means I can get a 4x speed upgrade

Well, _up to_ 4x. If you're doing anything interesting or complex, or, god
forbid, making a database call, then this might turn into an instance of that
famous anecdote where Bill Gates wouldn't even bend over to pick up a thousand
dollars lying on the ground.

~~~
stormbrew
This can't be emphasized enough. Do almost anything worth doing in that
function or in the block itself and the 400% improvement on the 1% of that
function's execution time becomes almost certainly meaningless. These kinds of
performance improvements have very steep diminishing returns.
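
The arithmetic behind this is Amdahl's law. As a sketch, assuming the block-call overhead is 1% of the total runtime and that dropping `&block` makes that part 4 times faster (both figures are illustrative, not measured):

```ruby
# Overall speedup when only `fraction` of the runtime becomes `factor`
# times faster (Amdahl's law). Hypothetical helper for illustration.
def overall_speedup(fraction, factor)
  1.0 / ((1.0 - fraction) + fraction / factor)
end
```

`overall_speedup(0.01, 4)` comes out at roughly 1.0076, i.e. the whole method gets less than 1% faster. Only when the block call is the entire runtime, `overall_speedup(1.0, 4)`, do you see the full 4x.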

------
filereaper
I like posts like this from OmniRef exploring the RubyVM internals.

Is there any way to search through all of these types of source code
annotations across the site?

I wanted to learn about method dispatch in Ruby. There was a post earlier on
HN about this using OmniRef, but I don't know how to retrieve it from the site
directly.

~~~
timr
We've made a page with all of the tutorials we've done so far:
[https://www.omniref.com/tutorials](https://www.omniref.com/tutorials)

I think this is the method dispatch one you're wanting:

[https://www.omniref.com/ruby/2.2.0/files/method.h?#annotatio...](https://www.omniref.com/ruby/2.2.0/files/method.h?#annotation=4081781&line=47)

------
voronoff
Is this 400-500% true for more expensive operations inside of the block? It
seems like he's just comparing the cost of procs to the cost of 1+1. I don't
think the generalization has been established here.

~~~
vidarh
You're right, he's comparing the cost for that very simple case. Do anything
remotely complex and it quickly becomes just noise.

~~~
timr
As with all things, it depends on context. If you're trying to write a
graphics or audio system in Ruby, this kind of thing can really matter. If
you're writing a Rails app, it's rounding error on the time waiting for I/O.

Profiling is essential when figuring out how to improve your code. What we're
doing here is _explaining_ one weird benchmark result.

~~~
voronoff
Except that's not what's being said:

"Why blocks make Ruby methods 439% slower" implies something that isn't
happening here, at least to me.

------
orf
Despite my having very little interest in Ruby, this was pretty interesting to
read. I would love to see some similar walkthroughs (?) with the Python
interpreter source code.

------
bilalq
There was a post on a Reddit discussion of this that was quite interesting:
[http://www.reddit.com/r/ruby/comments/2x9axs/why_blocks_make...](http://www.reddit.com/r/ruby/comments/2x9axs/why_blocks_make_ruby_methods_439_slower/coy5gcb)

Apparently, this is due to overhead from creating a Proc object. The
performance is better with blocks in JRuby, and the same will apply to MRI 2.3
when that gets released.

------
EpicEng
So, I know very little about the practical implementation of a non-trivial
interpreter (read: next to nothing), but... it's difficult for me to
understand why this would be the case. What is the reasoning behind the
interpreter's lack of semantic awareness? Can't it, in the most simple case,
see that the block is not being modified or accessed in any way and optimize
the reification out?

~~~
chrisseaton
It seems like it's obvious that it should be able to remove the allocation of
the block, however as with everything in Ruby there are edge cases.

What happens if you're halfway through executing this method, you've not
allocated the block, and somebody wants to start a debugger? Then you have to
go back and somehow recreate the Proc as if it was always there. Same thing if
someone uses ObjectSpace to look for all live Procs.
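
The ObjectSpace case can be sketched like this. The helper names are hypothetical, and the example assumes MRI: other implementations may restrict `ObjectSpace.each_object` for performance reasons.

```ruby
# Reifies the caller's block and returns the resulting Proc.
def reify(&block)
  block
end

# True if `target` shows up when walking every live Proc object.
# (Assumes MRI, where ObjectSpace.each_object(Proc) is available.)
def visible_in_object_space?(target)
  ObjectSpace.each_object(Proc).any? { |pr| pr.equal?(target) }
end
```

Once a Proc exists, any code in the process can find it this way, which is part of why an interpreter cannot quietly pretend it was never created.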

~~~
EpicEng
Huh. Is it standard in interpreted language land to avoid optimizations for
the debugger use case? Seems like that would rule out a ton of
potential optimizations...

~~~
vertex-four
> Is it standard in interpreted language land to avoid optimizations for the
> debugger use case?

Yes. Otherwise, you'd have to have separate "being debugged" and "not being
debugged" runtimes that keep track of different information and do different
things, and it's not unlikely that you'd end up with programs doing different
things as a result. You can actually access debugger information from inside
the program, and it's not that uncommon to do that.
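
One example of that kind of introspection, as a sketch with made-up method names: once a block has been reified, `Proc#binding` lets the callee read the caller's local variables, even ones the block itself never mentions.

```ruby
# With an explicit Proc in hand, the callee can look up a local variable
# in the scope where the block was written.
def peek_local(name, &block)
  block.binding.local_variable_get(name)
end

# `secret` is never referenced inside the block, yet it is still reachable
# through the block's binding.
def caller_with_secret
  secret = 42
  peek_local(:secret) { }
end
```

Ordinary code relying on this is one reason the runtime has to keep the information around whether or not a debugger is attached.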

------
dragonwriter
Interestingly, there's a comment there that indicates that the cost is _much_
lower on JRuby. Be interesting to see if this is true, and what the cost is
like on Rubinius? Possible low-hanging fruit for MRI to improve?

~~~
nahiluhmot
I wonder if it costs less in JRuby or if JRuby's version of yield is less
optimized.

~~~
stormbrew
It's probably that jruby is just much more likely to inline it either way, at
which point whether it was reified or not doesn't matter, unless you call
something other than .call on the Proc object.

------
VanillaCafe
If someone cares about speed, why would that someone be using Ruby? It feels
like optimizing a Ruby program is making "really fast slow code".

~~~
thomasahle
Perhaps because it's a nice language?

Languages are not the same as implementations, and languages are not fast or
slow.

~~~
theseoafs
Languages are not fast or slow, but languages do have traits that make them
more or less well suited to fast or slow implementations. Ruby's semantics do
not make it well suited for a fast implementation.

That said, the question just becomes "why would you use MRI if you care about
performance?"

