
Interprocedural optimization in GCC - matt_d
https://kristerw.blogspot.com/2017/05/interprocedural-optimization-in-gcc.html
======
ignoramous
Tangential Q: someone mentioned that GCC is right up there among the most
complex pieces of software ever written. Would people with enough know-how
vouch for this? Are compilers the most complex piece of the puzzle? Or are
there other types of software that dwarf GCC, or compilers in general, in
terms of complexity (like, say, JIT interpreters, DB query execution engines,
or HDLs)?

~~~
swift
Web browsers are astoundingly complicated. They generally contain _multiple_
JIT compilers, a database engine, sophisticated rendering code for 2d and 3d
graphics and fonts, video and audio codecs, support for a variety of network
protocols, graphical debuggers and development tools... the list goes on. And
they expose all this to arbitrary applications downloaded from the internet
while attempting (and mostly succeeding) to maintain your security and privacy
and remain backwards compatible with 20+ years of legacy content. At a target
frame rate of 60 fps.

Browsers are, when you think about it, pretty damn amazing.

~~~
Arizhel
Taken as a whole, sure, but all those things you refer to in a browser are
components, probably developed by different teams or even reused by entirely
separate projects. By that measure, a modern Linux distribution is far more
complicated than a web browser, since it includes a kernel, compilers,
multiple web browsers, codecs, etc. The Firefox browser, for instance, I
believe uses the SQLite database, so you can't claim that the Mozilla devs
created that complexity, they just included it.

I think a compiler is different because you can't break it apart into separate
projects like that. All the parts that go into a compiler are really only
useful for that compiler, not as common pieces of "infrastructure" that can be
used by many disparate larger projects.

------
big_spammer
Identical code folding is slick. It's the opposite of using macros: it's
harder to fold code back together than to generate it.

------
pawadu
Devirtualization sounds like something that could significantly improve
performance in large C++ projects.

Hope to see some real-world benchmarks soon!

~~~
melkiaur
Hi, I just created an account to give you a bit of feedback.

When g++ 4.9 came out, we were finally able to build our old
mathematical/financial library with LTO.

See, to be effective, devirtualisation requires LTO; it also requires
profile-guided analysis to detect the right use cases to devirtualise, and it
requires that your functions can in effect be "hidden" from the outside, so
that the compiler can tune the inlined functions.

As our codebase was also portable to Windows (and so had all the right
__declspec(dllexport) annotations lying around), we were able to use the
-fvisibility=hidden flag. We just had to make sure that everything was within
one big fat shared lib.
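For readers wanting to try this, a rough sketch of the kind of build the
comment describes (file and library names here are hypothetical; the flags
are standard g++ options for LTO, hidden visibility, and PGO):

```shell
# Pass 1: build the shared lib with LTO, hidden default visibility,
# and profile instrumentation.
g++ -O3 -flto -fvisibility=hidden -fprofile-generate -fPIC -shared \
    mathlib.cpp -o libmathlib.so

# ...run a representative training workload against the instrumented
# library, then rebuild using the collected profile:
g++ -O3 -flto -fvisibility=hidden -fprofile-use -fPIC -shared \
    mathlib.cpp -o libmathlib.so
```

With -fvisibility=hidden, only symbols explicitly marked for export are
visible, which is what lets the compiler assume the rest can be inlined and
devirtualised freely.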

On old CPUs (e.g. Nehalem-class), our code ran twice as fast as the same code
compiled with plain -O3. On Haswell-E, it ran 50% faster.

There's also a very cheap way to force devirtualisation: use the C++11
"final" keyword on your methods or classes. G++ can then optimise and inline
where applicable.

~~~
pawadu
Thank you!

I was hoping this could automatically improve performance for existing
projects, but it seems it is a bit more complicated than that. Either way,
50-100% improvements sound very nice.

