"I would, for example, suspect that a 'correct' optimization strategy for 99% of all real-world cases (not benchmarks) is: if it doesn't have floating point, optimize for the smallest size possible. Never do loop unrolling or anything fancy like that."
And a similar thread: http://lkml.indiana.edu/hypermail/linux/kernel/0302.0/1068.h...
And the 2.6.15 (2005) changelog which exposes a configuration option to compile the kernel optimized for size http://lkml.org/lkml/2005/12/18/139
I don't think that's unusual at all; I've often found it to be the case. Cache misses will kill performance.
Activation record cleanup, on the other hand, lives inside your routines and is executed on both normal and abnormal exit. But this isn't the responsibility of exceptions; you have to do this cleanup even on normal exit.
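To make that concrete, here is a minimal C++ sketch (the `Tracker` type and function names are illustrative, not from the thread) showing that destructor-based cleanup runs on both the normal-return and the exception-unwinding path:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical resource whose cleanup must run on every exit path.
struct Tracker {
    int* count;
    explicit Tracker(int* c) : count(c) {}
    ~Tracker() { ++*count; }  // cleanup runs for normal and abnormal exit alike
};

int cleanups = 0;

void normal_exit() {
    Tracker t(&cleanups);
    // returns normally; the destructor still runs
}

void abnormal_exit() {
    Tracker t(&cleanups);
    throw std::runtime_error("boom");  // destructor runs during unwinding
}
```

Either way the cleanup code is emitted and executed; exceptions don't add that obligation, they only add a second path through it.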
Code that runs only in exceptional cases is comparatively rare. But with exceptions, you can move that code somewhere else entirely; and with PC-based exception handling, the space cost is only borne when an exception is thrown, when the PC lookup tables need to be paged in. In this scenario, the tradeoff between exceptions and error codes becomes relevant: exceptions let your code get smaller at the cost of a hit when an exception is actually thrown, while error codes bloat your code.
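The tradeoff can be sketched in a few lines of C++ (the step/pipeline functions are hypothetical, chosen only to contrast the two styles): with error codes, every call site carries an in-line check that sits in the hot path; with exceptions, the hot path is straight-line code and the failure handling is paid for only when a throw actually happens.

```cpp
#include <cassert>
#include <stdexcept>

// Error-code style: each call site checks the result, bloating the hot path.
int step_ec(int x, int* err) {
    if (x < 0) { *err = 1; return 0; }
    *err = 0;
    return x + 1;
}

int pipeline_ec(int x, int* err) {
    int a = step_ec(x, err);
    if (*err) return 0;        // in-line check on the hot path
    int b = step_ec(a, err);
    if (*err) return 0;        // and another one
    return b;
}

// Exception style: the hot path is branch-free; failure handling lives in
// out-of-line unwind tables that only need to be consulted on a throw.
int step_ex(int x) {
    if (x < 0) throw std::invalid_argument("negative input");
    return x + 1;
}

int pipeline_ex(int x) {
    return step_ex(step_ex(x)); // no per-call checks
}
```

The two pipelines compute the same thing on success; the difference is where the error-path instructions live.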
Of course, none of the above is specific to C++. C++ has other deficiencies that can lead to pathological code in practice.
However, none of these speedups is as large as the one from upgrading to Ruby 1.9.