

The overhead of abstraction in C/C++ vs. Python/Ruby - frostmatthew
http://blog.reverberate.org/2014/10/the-overhead-of-abstraction-in-cc-vs.html

======
astrobe_
I think this observation is not as relevant to the task at hand as it seems.
The author is in the same position as the implementor of a VM, who has to
decide which operations will be primitives and which will not. If you care
about performance, what actually matters more than anything else is which
operations are likely to be called from hot loops. It's classic optimization
wisdom, except that you can only work on expected scenarios rather than
real-world uses (until you release your stuff and other people use it). So a
100x slower abstraction is ok(-ish) as long as it's not something to be used
in performance-critical sections.

~~~
haberman
Your comment amounts to: "this performance-related observation doesn't apply
in situations where performance is not an issue."

Yes, but I am working in a situation where performance _is_ an issue. So yes,
the observation is relevant to the task at hand.

------
chrisseaton
A while ago I wrote a blog post about how compilers for Ruby can remove
abstraction. Features that you might think of as expensive abstractions, such
as using #send to call methods instead of calling them directly, need not have
any overhead at all with a good enough compiler:
[http://www.chrisseaton.com/rubytruffle/pushing-pixels/](http://www.chrisseaton.com/rubytruffle/pushing-pixels/)

------
jaimebuelta
The problem with this article is that it is doing something extremely simple
and calling that an abstraction.

The power of Python or Ruby is not demonstrated by a for loop; it is
demonstrated by something like this:

print sum(int(n) for n in open('/a/file/with/numbers'))

We don't need to care about a lot of things that we would in C/C++ (memory
management, iterating through a file, the format of the data, terminating the
loop properly).

Are C/C++ interesting languages? Of course; sometimes you need that extra
level of control over what's going on.

But I don't agree that the cost of abstraction is higher in Python/Ruby. You
can create code that does A LOT in a few lines, and most of the time it will
be fast enough.

In C/C++ you need a lot of work to create a good abstraction, and it's not
guaranteed to perform well.

~~~
ajtulloch
FWIW, with folly::gen [1], this is just:

    
    
        using namespace folly::gen;
        byLine("/a/file/with/numbers") | eachTo<int>() | sum;
    

[https://github.com/facebook/folly/tree/master/folly/gen](https://github.com/facebook/folly/tree/master/folly/gen)

~~~
nly
It's only 2 lines even if you restrict yourself to the standard library:

    
    
        #include <fstream>
        #include <iterator>
        #include <numeric>
        using namespace std;
    
        int main() {
            ifstream nf ("numbers.txt");
            return accumulate (istream_iterator<int>(nf), istream_iterator<int>(), 0);
        }

------
TheLoneWolfling
Read: "The overhead of abstraction in languages that do optimizing compilation
versus languages that don't".

~~~
haberman
Also: CPython and MRI don't do optimizing compilation, which is not entirely
obvious considering they both _do_ compile to bytecode and optimize slightly.

~~~
TheLoneWolfling
Don't know about MRI. With Python (as it's the reference implementation, I
don't see the need to call it by another name), all that compilation with
optimization enabled does, IIRC, is remove assert statements and if __debug__:
blocks. Oh, and docstrings. So yeah... slightly. To put it mildly.

(Note that with this "benchmark", PyPy shows (almost) no difference in runtime
between the abstracted and non-abstracted versions.)

------
TillE
The "C with classes" code looks so weird. I'm a little surprised you got away
with int main() not explicitly returning 0, didn't realize that was still
valid C++. Anyway, C doesn't really do abstractions to any great extent, so it
would make more sense to focus only on C++ and delve a little deeper.

Observe what happens when you make a method virtual, for example.
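
For instance, a rough sketch (not the article's actual code) with a plain
member function next to a virtual one called through a base pointer; the
virtual call is an indirect jump the compiler generally cannot inline:

    #include <cstdio>
    
    struct Adder {
        int add(int a, int b) { return a + b; }  // non-virtual: trivially inlined at call sites
    };
    
    struct Base {
        virtual int add(int a, int b) = 0;       // virtual: dispatched through the vtable
        virtual ~Base() {}
    };
    
    struct Derived : Base {
        virtual int add(int a, int b) { return a + b; }
    };
    
    // Only a Base* is visible here, so (absent whole-program devirtualization)
    // the call below is an indirect call and cannot be inlined.
    int sum_virtual(Base* b, int x, int y) { return b->add(x, y); }
    
    int sum_plain(Adder* p, int x, int y) { return p->add(x, y); }  // direct, inlinable
    
    int main() {
        Adder a;
        Derived d;
        std::printf("%d %d\n", sum_plain(&a, 1, 2), sum_virtual(&d, 1, 2));
        return 0;
    }

Comparing the generated assembly for the two sum_* functions shows the
difference immediately.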

~~~
jeorgun
V-tables aren't a very expensive abstraction. They prevent inlining, which
isn't exactly ideal (hell, that's the main point of the post), but nowadays
their cost over a normal function call is basically negligible.

~~~
coherentpony
> V-tables aren't a very expensive abstraction. They prevent inlining, which
> isn't exactly ideal (hell, that's the main point of the post), but nowadays
> their cost over a normal function call is basically negligible.

This is not true. Sweeping general statements like this are almost never true.
It is really problem dependent whether a particular type of abstraction is
'too expensive' or not.

I have to actively avoid vtbl lookups in my domain, because if I don't I lose
about 15% in execution time. You might think 15% "isn't very much", but it is
when your application runs on several thousand MPI processes.

~~~
jasode
> I lose about 15% in execution time.

Can you paste a representative snippet of your code that demonstrates this 15%
penalty?

In my experience with tight loops executing a million iterations, the extra
indirection of a vtable lookup is never more than 1% slower. Others also
observe similar minor differences.[1]

[1][http://stackoverflow.com/a/667680](http://stackoverflow.com/a/667680)

~~~
haberman
I have definitely observed penalties much greater than 1% (closer to 15%).

This paper is old, but measures the _direct_ cost of virtual table lookups
(not taking into account indirect costs arising from the inability to inline)
as 5% in real C++ programs. When they converted the C++ programs to use all
virtual functions, the overhead rose to 13.7%:
[http://www.cs.ucsb.edu/~urs/oocsb/papers/oopsla96.pdf](http://www.cs.ucsb.edu/~urs/oocsb/papers/oopsla96.pdf)

~~~
jasode
I didn't make it clear, but I wasn't comparing inline to virtual.[1] That kind
of drastic difference can be greater than 10%, which is understandable. I was
comparing _non-inlined_ function calls to virtual function calls, and I
haven't been able to recreate a 15% slowdown even with compiler optimizations
turned off entirely.

The stackoverflow observation and mine were on more modern CPUs. Maybe that
has something to do with the conflicting anecdotes.

If someone has a small snippet of code that shows _non-inlined_ function calls
running 15% faster than vtable calls, I'd like to study it.

[1] The context in the grandparent post was already constrained to
(non-inlined) normal function calls: "_...but nowadays their cost over a
normal function call is basically negligible._"
[https://news.ycombinator.com/item?id=8476208](https://news.ycombinator.com/item?id=8476208)
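
To be concrete, the kind of harness I have in mind is something like this
rough sketch (the iteration count and the work done per call are just
placeholders; the pointer is chosen at runtime to make devirtualization
harder):

    // Rough sketch of a micro-benchmark: non-inlined direct call vs. virtual call.
    // Build with, e.g.: g++ -std=c++11 -O2 bench.cpp && ./a.out
    #include <chrono>
    #include <cstdio>
    
    struct Base {
        virtual int op(int x) = 0;
        virtual ~Base() {}
    };
    struct Impl : Base {
        int op(int x) { return x + 1; }
    };
    
    // noinline keeps this a genuine out-of-line call (GCC/Clang attribute).
    __attribute__((noinline)) int plain_op(int x) { return x + 1; }
    
    int main(int argc, char**) {
        const int iters = 100000000;
        Impl impl;
        // Choosing the pointer via argc makes it harder for the compiler to
        // prove the dynamic type and devirtualize the call below.
        Base* b = (argc > 0) ? static_cast<Base*>(&impl) : 0;
    
        using clock = std::chrono::steady_clock;
    
        unsigned acc1 = 0, acc2 = 0;
        clock::time_point t0 = clock::now();
        for (int i = 0; i < iters; ++i) acc1 += plain_op(i);   // direct call
        clock::time_point t1 = clock::now();
        for (int i = 0; i < iters; ++i) acc2 += b->op(i);      // virtual call
        clock::time_point t2 = clock::now();
    
        std::printf("direct:  %lld ns\nvirtual: %lld ns\n(checksums: %u %u)\n",
            (long long)std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count(),
            (long long)std::chrono::duration_cast<std::chrono::nanoseconds>(t2 - t1).count(),
            acc1, acc2);
        return 0;
    }

Disassembling the two loops would show whether the compiler kept the indirect
call or managed to devirtualize it.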

~~~
srean
> I was comparing non-inlined function calls to virtual function calls

Then you are fighting a strawman. The decision to use virtuals or not has to
be taken in the context of all the benefits you would give up, and inlining is
a prominent one among them. Another is vectorization. Inlining a single
function sometimes triggers an avalanche of other optimizations, so the cost
of using virtuals can be quite high in such scenarios. It is unrepresentative
to rule out some of the main motivators for choosing non-virtual over virtual.

In my experience, the people who harp on the line that virtual functions are
free are those who do not write number-crunching code, where the benefits are
most apparent.

That said, the vtable is really a neat performance optimization for
implementing runtime polymorphism. The problem lies in the fact that many
people believe that runtime polymorphism is the only path to polymorphism.
Java programmers certainly believe so, for a reason of course. Many uses of
runtime polymorphism can be replaced by compile-time polymorphism without any
loss in flexibility, and with frequent gains in performance. In many parts of
the code I know for certain that the types won't change, and in such cases
runtime polymorphism is unnecessary. Many a game engine and much
array-processing code has been written with zero or very sparse use of that
feature.
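
A trivial sketch of what I mean (hypothetical names; the concrete type becomes
a template parameter, so the per-element call can be inlined and the loop
vectorized):

    #include <cstdio>
    
    // Runtime polymorphism: every element goes through an indirect vtable call.
    struct Kernel {
        virtual double apply(double x) const = 0;
        virtual ~Kernel() {}
    };
    struct Square : Kernel {
        double apply(double x) const { return x * x; }
    };
    
    double sum_runtime(const Kernel& k, const double* xs, int n) {
        double s = 0;
        for (int i = 0; i < n; ++i) s += k.apply(xs[i]);  // indirect call per element
        return s;
    }
    
    // Compile-time polymorphism: the concrete type is known at instantiation,
    // so apply() is a direct, inlinable call.
    struct SquareCT {
        double apply(double x) const { return x * x; }
    };
    
    template <typename K>
    double sum_static(const K& k, const double* xs, int n) {
        double s = 0;
        for (int i = 0; i < n; ++i) s += k.apply(xs[i]);  // direct, inlinable call
        return s;
    }
    
    int main() {
        double xs[] = {1, 2, 3, 4};
        Square sq;
        SquareCT sqct;
        std::printf("%f %f\n", sum_runtime(sq, xs, 4), sum_static(sqct, xs, 4));
        return 0;
    }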

There is a raging debate about whether C++/D functions should default to
virtual, just like in Java. I am certainly in the camp that believes they
should not, because runtime polymorphism is not as uniform a necessity as it
is made out to be, as long as the language offers compile-time polymorphism.
Java is out of luck here; its designers did not include compile-time
polymorphism features or syntactic sugar for it (if I am not mistaken), but
for C++ and D their defaults make sense.

@Jasode: Indeed, and in fact I had upvoted your comment even before writing my
comment above.

~~~
jasode
> you are fighting a strawman. The decision to use virtuals or not has to be
taken in the context of all the benefits you would give up, and inlining is a
prominent one among them.

Agreed that the _decision_ to use vtables must consider _all_ the
disadvantages, including the loss of inlining. The previous poster already
mentioned that as well.

My question about the comparison was _not about the decision_ of yes-or-no to
virtual calls. It was about understanding the 15% penalty of vtable calls
compared to normal non-inlined calls. If someone had a snippet of code showing
a penalty that large on a modern CPU, I thought it would be interesting to
disassemble the compiler's output and study it.

When the previous poster (coherentpony) was complaining about 15%, I thought
he was specifically talking about normal function calls because the poster he
was responding to was restricting the word "negligible" to _normal_ function
calls. My questions were a continuation of that _narrow and focused_
benchmark.

Yes, I think most people understand vtables are not "free". They have a cost.
When Alexander Stepanov introduced the STL in the 1990s, one of the factors
leading to its fast adoption was that it used templates with extensive
inlining, and its performance blew away the previous attempts at
algorithms+containers designed with inheritance & vtables. Heck, a C++
std::sort() could even be faster than C's qsort().
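
The classic illustration of why (a minimal sketch, nothing exotic): std::sort
receives its comparator as a template parameter and can inline it, while qsort
is stuck calling out through a function pointer for every comparison:

    #include <algorithm>
    #include <cstdlib>
    #include <vector>
    
    // qsort's comparator: always an out-of-line call through a function pointer.
    static int cmp_int(const void* a, const void* b) {
        int x = *static_cast<const int*>(a);
        int y = *static_cast<const int*>(b);
        return (x > y) - (x < y);
    }
    
    // std::sort's comparator: a functor whose operator() can be inlined into the sort.
    struct Less {
        bool operator()(int a, int b) const { return a < b; }
    };
    
    int main() {
        std::vector<int> v1(5), v2(5);
        for (int i = 0; i < 5; ++i) v1[i] = v2[i] = 5 - i;
    
        std::qsort(&v1[0], v1.size(), sizeof(int), cmp_int);  // indirect comparator calls
        std::sort(v2.begin(), v2.end(), Less());              // comparator inlined after instantiation
        return 0;
    }

That per-comparison indirect call is exactly the kind of overhead the
template-based STL design avoided.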

Hope that clears up what my curiosity was about.

------
tokenrove
It would be interesting to see the same comparison with a whole-program
optimizing compiler like MLton, especially with the abstraction split across
source file boundaries (an area where C compilers have gotten much stronger
lately).

------
sbov
The overhead of abstraction is slightly more than 1.5 addition operations?
Sign me up.

------
zem
I was expecting just the opposite: the overhead in terms of programmer time
and effort involved in abstracting something out versus just copy/pasting the
code.

------
theflubba
This article conflates abstraction with OO and language features, IMO.

Putting some functionality into an object is not as interesting as running
that functionality as a monadic command.

It would be more interesting to run some benchmarks in Haskell using
functional abstractions rather than OO machinery.

------
jerven
Maybe this person should understand what they are timing. The time command is
a wonderful tool, but comparing the execution time of a 4-line C program with
the start-up cost of a VM and interpreter is inane.

One needs to measure the program doing the real work. And in this case, for
the C program, you might find that bash forking is the actual cost, not the C
code running through the loop.

JRuby and PyPy are starting to do things with their specialisation tricks that
mean the overhead of abstraction only goes away after something like 1000 loop
iterations.

A performance benchmark running for less than a few minutes is a joke. Startup
variance alone will dominate the results.

~~~
haberman
> comparing the execution time of a 4 line C program with the start up cost of
> a VM and interpreter is inane.

That's not what I compared. Maybe you should read the article again?

> And in this case for the C program you might find that bash forking is the
> actual cost not the C code running through the loop.

I didn't publish any benchmark numbers of any C programs, because I didn't
need to: the two C++ programs I was comparing compiled to exactly the same
machine code, making empirical observations of their execution time
immaterial.

------
mackwic
Even though the question is good, the article's answer is disappointing.

First, I really don't like the angle taken; the question of abstraction (why
we do it, how) and the choices made by language designers (and the variations
in their idioms) are so vast that you really _can't_ treat the question in a
<1000-character blog post.

Some quick points:

\- You don't code for the machine, you code to be read by another human
(possibly you) in the future. I insist: you will be read regularly and
frequently. Thus, your code needs to be clear, precise, and concise. This must
be the first thing in mind when coding: program what needs to be done in a way
that a complete stranger could understand.

\- Abstraction is a way to keep the structure of code clear when the
interactions become complex and/or abundant. If you can avoid abstractions
while still being crystal clear in your code, do it. Speaking directly is
always better than convolution.

\- The main requirements when you code are often one or two of: quick to
develop, easy to maintain, extensible, efficient (you control your big-O and
_WORST-case_ execution times), correct (no bugs. at. all.), real-time (when X
happens, Y is done between n µs and m µs).

\- So, know when performance is a goal, and know when it's not. Choose your
language, your technologies, and your team with these goals in mind.

\- And yeah, C89, C99 and the C++es have a very high overhead of abstraction:
in clarity, in concision, and sometimes in performance (not all abstractions
can be inlined). Think about it.

~~~
haberman
You are speaking in vague platitudes, most of which don't contradict what I
wrote in the article at all.

I was speaking of a specific context in which I was working (writing C or C++
extensions for Python and Ruby), and addressing a specific design question
(how much of the library should be in C or C++ vs. Python or Ruby). I guess I
didn't specifically say this, but I thought I made it pretty obvious that I'm
working in a situation where performance is a factor.

My observations were also specific to the question of the relative cost of
creating layered abstractions in Python/Ruby vs. C/C++.

> And yeah, C89, C99 and the C++es have a very high overhead of abstraction:
> clarity, concision

Now you're just being silly.

Abstractions are sometimes good, making things clearer and more concise. The
wrong abstractions can make code less clear and concise.

Here is my solution to that problem: if an abstraction is not making your code
clearer and more concise, don't use it. Saying that abstractions in C and C++
are inherently less clear and concise just shows anti-C/C++ bias.

~~~
theflubba
I don't know if your solution is correct. Many functional abstractions in
Haskell, such as those that involve high-level typeclass abstractions, make
code more esoteric, not clearer. However, this is a tradeoff to make code more
grounded in algebra and logical structure, which is more important than
readability and conciseness.

~~~
nostrademons
There is a very strong argument that you shouldn't be using the high-level
esoteric typeclass abstraction in Haskell, or even that you shouldn't be using
Haskell for production programming. I say this as someone who likes Haskell a
lot and wrote one of the top tutorials on the web for it. But if you run a
business based on software, and your core value proposition is not the 100%
provability of your code (eg. Galois), then readability and conciseness are
absolutely more important than algebra and logical structure.

~~~
theflubba
Very fair point. Thanks for replying.

