
The hidden cost of C++ (2009) - fogus
http://www.rachelslabnotes.com/2009/10/the-hidden-cost-of-c/
======
nostrademons
With modern optimizing compilers and CPU architectures, you can't really
predict the cost of code merely by looking at it anyway. Not in any language.

Will that function call be inlined? Does it have side-effects? Are the
arguments constant, so that it might even be evaluated at compile-time? Can it
be hoisted out of a loop? Do its pointer args alias, preventing many of these
operations? Is it big enough to blow the cache, or will it all fit in L1 and
benefit from really fast memory accesses? Can branch-prediction operate
effectively on its loops, or does it have an unpredictable pattern that will
flush the pipeline many times?

I've heard that the best way to get your program to run fast is to get it into
a compiler benchmark suite, so that compiler-writers account for it when
writing optimizations. Failing that, it's all about profiling so you can
measure the _actual_ cost rather than relying on what you think you know about
the programming language.
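To make one of those questions concrete: whether a call folds away entirely can hinge on qualifiers and aliasing that are invisible at the call site. A minimal sketch (all names hypothetical):

```cpp
// With constexpr, calls with constant arguments can be folded away
// entirely at compile time -- the "call" costs nothing.
constexpr int square(int x) { return x * x; }

// The same function called in a loop: if 'out' might alias 'a', the
// compiler must keep the store inside the loop instead of hoisting it,
// even though the code looks identical either way at the call site.
int sum_squares(const int* a, int* out, int n) {
    int total = 0;
    for (int i = 0; i < n; ++i) {
        total += square(a[i]);
        *out = total;  // possible aliasing with 'a' blocks hoisting
    }
    return total;
}
```

Nothing at the call site `square(a[i])` tells you whether it becomes a constant, an inlined multiply, or an actual call; only the definition and the surrounding optimization context decide.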

~~~
fauigerzigerk
These are good points, but I don't think that relieves you of having to know
whether you're making a virtual function call or not, or whether you're
passing an argument by value or by reference. You do have to know that, and in
C++ you cannot know it by looking at the call site. In C and in many other
languages you can.
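A small illustration of that point: the two call sites below are textually identical, and only the class definitions (possibly in a distant header) reveal that one of them dispatches through a vtable. A hedged sketch with made-up names:

```cpp
#include <string>

struct Logger {                       // non-virtual: direct call
    std::string name() { return "logger"; }
};

struct Sink {                         // virtual: dynamic dispatch
    virtual std::string name() { return "sink"; }
    virtual ~Sink() {}
};

struct FileSink : Sink {
    std::string name() override { return "file"; }
};

// Both calls to name() below read identically; you cannot tell from
// this function alone that s.name() goes through a vtable.
std::string describe(Logger& l, Sink& s) {
    return l.name() + "/" + s.name();
}
```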

~~~
eridius
In C there's nothing stopping the function you're calling from turning around
and invoking dynamic dispatch.

As a trivial example, objc_msgSend(). This is a C function, but it's also the
entry point for objective-C's message sending (e.g. dynamic dispatch).

Basically, what I'm trying to say is, focusing on the cost of the function
call itself is useless if you don't know the cost of the function
implementation.
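The objc_msgSend pattern can be mimicked in a few lines of plain C-style code: the call site looks like an ordinary static call, but the work is dispatched through a table at runtime. A toy sketch (names invented):

```cpp
// A plain C-style function whose body performs dynamic dispatch
// through a function-pointer table, in the spirit of objc_msgSend.
typedef int (*handler_fn)(int);

static int double_it(int x) { return 2 * x; }
static int negate_it(int x) { return -x; }

static handler_fn table[] = { double_it, negate_it };

// 'send(0, 21)' looks like any other C call; which code actually runs
// is decided at runtime by the selector.
int send(int selector, int arg) {
    return table[selector](arg);
}
```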

~~~
fauigerzigerk
Absolutely, but to learn more about the implementation you have to look at the
function's code. Every line of that code poses the same problem, so it's
recursive down to every last expression.

~~~
nostrademons
...which may involve loops or conditionals that then involve tracing more
function calls or performing other dynamic dispatch, and so on farther down
the tree until you have to know the whole codebase. Eventually you run into
the halting problem: you can never know whether a given line in an arbitrary
program will be executed, because if you could, just replace that line with
"HALT" and you've solved the halting problem.

Can't we just agree that you can't predict the performance of an arbitrary
program without measuring it? (You can predict the performance of _some_
programs quite accurately, but that has nothing to do with language choice and
everything to do with coding standards that bound execution time.)

~~~
fauigerzigerk
If you're saying that it doesn't matter if you know whether you're calling a
virtual function or whether a parameter is passed by value or by reference
then we cannot agree on that. If you have reason to use C++ at all, these are
not things you can simply ignore because they can affect performance rather
dramatically.

What we can agree on is that knowing these things is just a small part of what
affects performance and that performance can be rather unintuitive.

------
16s
I use C++ daily for systems level programming and I love it. If I could only
have one programming language it would be C++. I can write anything in it and
I actually enjoy doing so. Maybe I'm weird or something, but I know a lot of
other guys who feel this way too.

And, I know some other guys who do hard core embedded C programming and they
are migrating to C++. They love it compared to C and can't wait until every
line of their C code has been reimplemented in C++.

C++ will be around for decades to come and many folks actually enjoy writing
it, they just don't advertise that fact.

~~~
dkarl
Decades might be right. I can't think of a compelling reason for anyone to
learn C++ today, but having put in the time to learn it reasonably well, I
won't be surprised if I find myself still using it occasionally twenty years
from now.

~~~
mike-cardwell
I started learning C and C++ about three months ago. My reasoning is, they're
used all over the place. My operating system is written using C and nearly all
the applications I use are too. I want to be able to modify the programs I
use, and I want to be able to write native applications that run quickly.
Those are my motivations. And I haven't been this happy learning a language
for years.

~~~
lelele
"I started learning C and C++ about three months ago. My reasoning is, they're
used all over the place. My operating system is written using C..."

Your operating system is written in C. Why learn C++? The return on investment
of learning just C is much, much higher than learning C++. Been there, done
that.

~~~
mike-cardwell
I feel I will be a more rounded programmer if I know both.

I've been concentrating too much on C++ I think. This thread has prompted me
to go back and spend some time writing some C.

------
eridius
I'm disappointed. I expected this to be a post along the lines of Ridiculous
Fish's two posts on C++ features that you pay for even if you don't use them:

1\. [http://ridiculousfish.com/blog/posts/i-didnt-order-that-
so-w...](http://ridiculousfish.com/blog/posts/i-didnt-order-that-so-why-is-it-
on-my-bill-episode-1.html)

2\. [http://ridiculousfish.com/blog/posts/i-didnt-order-that-
so-w...](http://ridiculousfish.com/blog/posts/i-didnt-order-that-so-why-is-it-
on-my-bill-episode-2.html)

Thankfully, one of C++11's design requirements was not to introduce any new
functionality that imposes a penalty even when you aren't using it. Too bad
that wasn't a requirement for the original design of the language.

~~~
sltkr
I've read the first article, and I'm not convinced the problem he describes
incurs real costs in practice. After all, there is no reason to include inline
function definitions in dynamic objects unless they are called without being
inlined or their address is taken, both of which are extremely unlikely for
class-inline methods. The author fails to convince me that if C++ didn't
require taking the address of an inline function to yield a consistent value,
real-world C++ code could or would run significantly faster.

The second article is pretty much obsolete with C++11, which makes copy-on-
write string implementations impractical.

~~~
eridius
How does C++11 make copy-on-write string implementations impractical? I know
the short string optimization is used now, but I don't see how that's
incompatible with COW (for strings that are too large to use the short string
optimization). And I don't see how move constructor/assignment is incompatible
either. Is there some other C++11 change that I'm overlooking?

~~~
sltkr
I can't give you a definite answer, unfortunately, but I believe the more
detailed specification of the semantics of string references/iterators in the
new standard makes it very difficult to implement std::string with copy-on-
write semantics without losing the advantages of such an implementation.

For example, consider the following function:

    
    
      char f(string &s)
      {
        const string &t = s;
        const char &ch = t[1];  // line 2: a reference into the buffer
        s[2] = 'x';             // line 3: a write that may trigger a copy
        return ch;
      }
    

This should be valid code, but with a copy-on-write implementation, writing to
the string on line 3 could invalidate the reference created on line 2.

As far as I know there isn't a proper solution to this problem without
changing the return type of `std::string::operator[](size_type) const` in
some non-standard way.
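A stripped-down COW handle makes the hazard concrete. This is a toy sketch, not a real std::string: a shared buffer, and an unsharing copy on the first non-const element access.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Toy copy-on-write string: copies share a buffer until one of them
// hands out a writable reference.
class CowString {
    std::shared_ptr<std::string> buf_;
public:
    explicit CowString(const char* s)
        : buf_(std::make_shared<std::string>(s)) {}
    CowString(const CowString&) = default;   // shares the buffer

    const char& operator[](std::size_t i) const { return (*buf_)[i]; }

    char& operator[](std::size_t i) {        // unshare before writing
        if (buf_.use_count() > 1)
            buf_ = std::make_shared<std::string>(*buf_);
        return (*buf_)[i];
    }

    bool shares_with(const CowString& o) const { return buf_ == o.buf_; }
};
```

A const reference obtained before another handle's unsharing write keeps pointing into the old buffer, which is exactly the invalidation discussed above: whether that reference stays valid depends on the implementation strategy, not on anything visible in the calling code.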

~~~
eridius
Did you read the 2nd link? This is a problem even in C++03. The solution is to
invoke the copy on the access of `t[1]`, which is what libstdc++ does. That
was the whole point of the second link: that using any operator/method that
returns a char reference (operator[], at(), etc.) ruins COW.

~~~
sltkr
> The solution is to invoke the copy on the access of `t[1]`, which is what
> libstdc++ does.

No, libstdc++ specifically DOES NOT unshare the string buffer when calling the
CONST version of operator[] (really, go look it up). That's the whole point of
CoW: to only copy when the string is written (or at least, when you have to
pessimistically assume it will be, because somebody has gotten a non-const
reference to its contents).

You could "fix" the issue by unsharing the buffer whenever a const
iterator/reference to the string is returned, but then you lose most of the
performance benefit of sharing the buffer in the first place, because most
strings that are constructed are read at some point.

Because of these difficulties we'll see standard library implementations move
away from copy-on-write, and then the whole issue doesn't exist. That is to
say: the performance of the test case the blog author describes will be
consistently “bad” because std::string will behave pretty much like a
std::deque/std::vector.

------
BudVVeezer
Awesome! Another " _I_ don't understand writing good code in language X, thus
_that language_ has 'hidden' deficiencies." post.

You can write non-performant code in any language you'd like, and obscure it
in non-trivial ways. That's not the fault of the language.

~~~
hythloday
No, sorry. There are a bunch of factors that make the point a little more
complicated.

If you're writing, let's say, scenegraph traversal, it's really tempting to
get it working on the PPU and then port it to run on an SPU. A vtable
dereference to main memory will crash your SPU job, and it is not always
obvious why. Obviously, you can't use static analysis tools to make sure the
right instructions are transferred, so you have to do it by hand. A modern PS3
game has on the order of low hundreds of different SPU jobs...not much fun.
Even discounting that, SPUs work best when fed branchless, parallelized jobs;
I wouldn't be surprised to see an order of magnitude difference between a
vtable call on a pointer array and a static call on an object array.

Keeping pointers to objects in a polymorphic array is a games performance
anti-pattern because of the dereferencing cost, but it's necessary to call
virtual functions...there's a hidden cost right there.
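The anti-pattern can be sketched in a few lines; the hypothetical Particle type below stands in for a game object:

```cpp
#include <vector>

struct Particle {
    float x, y, vx, vy;
    void step() { x += vx; y += vy; }
};

// Cache-friendly: objects stored contiguously, static (inlinable) calls.
void step_values(std::vector<Particle>& ps) {
    for (auto& p : ps) p.step();
}

// The anti-pattern: an array of pointers to heap objects. Each iteration
// chases a pointer before touching data, and if step() were virtual it
// would also go through a vtable on every call.
void step_pointers(std::vector<Particle*>& ps) {
    for (auto* p : ps) p->step();
}
```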

Which loop is faster, by eyeballing, and by how much?

    
    
      void load(Assets* a) {
        for (int j=0; j<m_numAssets; j++) {
          loadAsset(a[j]);
          m_numLoadedAssets++;
        }
      }
    
      void load(Assets* a) {
        int numLoadedAssets=0;
        for (int j=0; j<m_numAssets; j++) {
          loadAsset(a[j]);
          numLoadedAssets++;
        }
        m_numLoadedAssets = numLoadedAssets;
      }
    

I've seen the former style run literally 1000 times slower than the
latter...that's obvious? I submit in a world of out-of-order processors it is
not at all.

I don't think that all the author's points are spot on...Koenig lookup is
Byzantine but IDEs do a good job of it, ditto source-level reasoning about
dispatch. The underlying theme, that C++ is not a good fit for modern game
development, shouldn't be so trivially dismissed.

~~~
rubashov
How is your example a language thing? Would a JITed language improve the
locality of the m_numLoadedAssets variable? I'm not quite sure I get it.

~~~
loup-vaillant
m_numLoadedAssets is a member of some unnamed class the snippet of code is
extracted from. When you call the method, that member is likely farther from
the top of the stack than any local variable, or even in the heap.

It depends where *this is allocated. In the worst case, it is allocated in
main memory while you wanted to stay in graphics memory, or something.

A naive compiler would then access memory (or the cache) instead of using
registers. A Sufficiently Advanced Compiler would guess that calling ++ many
times is the same as incrementing in one go, and hoist that out of the loop,
but apparently this one is a bit cruder.

Now, the same could be said about m_numAssets, but this one isn't written to,
so the compiler only has to put a copy in a register, which I guess is a
simpler optimization to make.

~~~
rubashov
You explain the memory locality issue well but the question is how is the
situation better in any other language.

~~~
loup-vaillant
Oops.

To answer your question, that particular situation would be better in any
language that forces you to make the reference to "this" (or "self")
explicit. Imagine how we could modify C++:

    
    
      void member_function() {
        int local_variable = 0;
        local_variable++;        // This is okay
        member_variable++;       // That should not be allowed
        this->member_variable++; // This should be written instead
      }
    

Applied to the example in the GGP above:

    
    
      void load(Assets* a) {
        for (int j=0; j<this->m_numAssets; j++) {
          loadAsset(a[j]);
          this->m_numLoadedAssets++;
        }
      }
    

We see that every non-local access is prefixed by something ("this->" and "a["
here). The heavier syntax suggests a heavier cost, so the programmer will more
easily think of hoisting those out of the loop, if possible (either manually
or through compiler optimizations).

------
dude_abides
The same underlying point can be made in such contrasting styles: Linus (
<http://harmful.cat-v.org/software/c++/linus> ) vs the OP.

~~~
tsahyt
Still, they're both effectively right about it. "Object oriented niceness"
comes at a cost. With C++ it might be smaller than in dynamic languages like
Python, where things like function call overhead can be ridiculously big, but
it's still there.

~~~
zwieback
I think they are both wrong about it. What the argument boils down to is that
the limitations of C prevent you from using things that might lead you astray.
That's not an argument for or against C++; it's just an argument against large
languages in general and an argument for programmer discipline in particular. It
might work for Linus if he reimplements C++ features in a better way but it
does not help all users and doesn't help all the time. Do we honestly believe
a beginning programmer will be better off reimplementing std::vector or just
using a fixed array instead?

If I remember correctly somewhere further down in that particular Linus rant
he was saying that you can look at a call site and tell what's going on, the
same argument the poster here is making, but do we honestly believe Linus
never uses function pointers? And the argument about the name of the function
giving us a rough estimate of the overhead does not sound valid to me.

As a C programmer and former C++ programmer I can see a valid argument hidden
in these types of posts but I think we'd be better off with a little less
attitude and a little more useful guidelines.

~~~
gte910h
I think this IS C++ specific.

ObjectiveC has a much smaller call site complexity penalty.

~~~
JoeAltmaier
Well, not specific. Surely it is in C++; surely it is in other languages too.

------
jon6
I don't necessarily disagree with this assessment, but you can easily create
functionality in C that requires looking at multiple source files to
understand what's going on.

    
    
        x = foo(bar(), baz());
    

Now you have to look at the definitions for bar and baz to understand the cost
of this line.

I guess this is the trade-off with using a high level language. Expressivity
vs transparency.

~~~
mpyne
This page gives another "nice" demonstration of C (in actual usage, not an
obfuscated code contest): <http://oldhome.schmorp.de/marc/bournegol.html>

What's especially nice is that TRUE is defined as -1 in that source, not 1,
which could easily confuse someone who isn't intimately familiar with C's
boolean operators and conditional expressions (since something like if((a < b)
== TRUE) { ... } would actually be incorrect).

Even with that in mind, it would still be nice to have a "safer to use"
systems language than C++ generally available. Perhaps Go or D can fill that
role (though good luck making that happen for game development on Windows...)

~~~
pjmlp
For many years that language for me was Turbo Pascal, but then it faded away.
:(

------
kqr2
Note: post is from 2009.

She also has a followup where she explores the "pressure points" of writing
games in C++:

[http://www.rachelslabnotes.com/2009/11/there-are-never-
enoug...](http://www.rachelslabnotes.com/2009/11/there-are-never-enough-
render-engines/)

------
VikingCoder
The days when you can reason about performance by looking at function
signatures are, in my opinion, long gone.

I can count on one hand the number of times that I thought that changing the
function signature of a few methods was going to impact whole application
performance. Every time, it was in a low-level function that was called all
over the place. Or, more to the point, I dug into function signatures when
evidence pointed me there.

I can't even begin to list all of the other problems I worry about that
actually do impact whole application performance on a regular basis.

------
cjensen
Slapping an "explicit" modifier on all your constructors eliminates what the
author is complaining about.
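For the record, a minimal sketch of what `explicit` buys, using a hypothetical Buffer type:

```cpp
struct Buffer {
    explicit Buffer(int size) : size_(size) {}  // no silent int -> Buffer
    int size_;
};

int total(const Buffer& b) { return b.size_; }

// total(42);          // without 'explicit' this would compile, silently
//                     // constructing a temporary Buffer from the int
// total(Buffer(42));  // with 'explicit', the conversion must be spelled out
```

The hidden-temporary case is exactly the implicit-conversion cost the article complains about; `explicit` turns it into a compile error at the call site.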

~~~
cobrausn
I really wish 'explicit' were the default and the keyword was 'implicit'.

------
halayli
Abstraction in general comes at a cost.

~~~
tsahyt
Obviously, yes. One can't expect jumps across multiple levels of libraries,
wrappers and APIs to be free of cost. That's what Carmack referred to as "20
layers of crap", and he's effectively right about it. On the other hand, we
can't reinvent the wheel every time we write a line of code.

One of the most important aspects of balancing productivity with performance
is knowing which parts of code are crucial to performance and should be
written with that in mind. That's one of the toughest parts as well. Usually
one is better off profiling than guessing but I believe an educated guess can
go a long way nevertheless.

------
malkia
In 1999 I was a total C++ dude. I had to work that year on a port of Metal
Gear Solid for PC. I was surprised to see such well-written C code by the
Konami developers, with a nice OO system that had a certain Tcl element to it.
The comments were all in Japanese, but that did not stop us from understanding
how it worked.

~~~
gte910h
Object oriented C is a great paradigm when done right.
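Done right, it usually means an explicit table of function pointers and an explicit self argument. A toy sketch, written in the C subset of C++ with invented names:

```cpp
/* Object-oriented C in miniature: the first field is a "method" pointer
   (a one-entry vtable); methods take the object as an explicit argument. */
struct shape {
    double (*area)(const struct shape*);
    double w, h;
};

static double rect_area(const struct shape* s) { return s->w * s->h; }
static double tri_area(const struct shape* s)  { return s->w * s->h / 2.0; }

/* "Constructors" just fill in the right method pointer. */
static struct shape make_rect(double w, double h) {
    struct shape s = { rect_area, w, h };
    return s;
}
static struct shape make_tri(double w, double h) {
    struct shape s = { tri_area, w, h };
    return s;
}
```

The dispatch cost is the same pointer indirection a C++ vtable pays, but here it is spelled out at the call site (`s.area(&s)`), which is precisely the visibility the thread is arguing about.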

~~~
pjmlp
Problem is that 99% of the developers that work in the industry usually do it
wrong.

~~~
gte910h
I'd say the number of people who do object oriented C right is as high or
higher than the people who do C++ right, for a given set of people. It's just
simpler, with fewer leaky abstractions.

------
lelele
For me, the hidden cost of C++ was the mental overhead of programming in it. I
had to be constantly alert about the many many ramifications of almost every
line I wrote.

------
LiveTheDream
Can we get a [2009] in the headline?

