
Optimizing Software in C++ [pdf] - signa11
http://www.agner.org/optimize/optimizing_cpp.pdf
======
blt
Anyone who is interested in maximizing the output of computer hardware should
read this. Every time I've been able to significantly speed up a piece of
C++ code, I did it by using a technique from this document. I cannot recommend
it highly enough.

------
pcwalton
This is great overall. One nitpick:

> How compilers optimize

This didn't mention scalar replacement of aggregates [1]! This is easily one
of the most important optimizations for programmers in C++ and similar
languages to know about.

[1]:
[https://books.google.com/books?id=Pq7pHwG1_OkC&pg=PA331&lpg=...](https://books.google.com/books?id=Pq7pHwG1_OkC&pg=PA331&lpg=PA331&dq=scalar+replacement+of+aggregates&source=bl&ots=4Y8WHoc7lU&sig=iqAwWCX6_gyYGfHoF_x392b_zIg&hl=en&sa=X&ved=0ahUKEwiHoraj1OHJAhUOxGMKHTuJBZcQ6AEIIzAB#v=onepage&q=scalar%20replacement%20of%20aggregates&f=false)
is a reasonable explanation.
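
For readers who have not met the term, here is a minimal sketch of what scalar replacement of aggregates amounts to. The `Vec2` struct and both function names are made up for illustration and are not taken from the manual or the linked book. The second function is only a hand-written picture of what an optimizer may turn the first one into when the struct's address never escapes; whether a given compiler actually does so depends on the compiler and its flags.

```cpp
#include <cassert>

// Hypothetical aggregate used only for this illustration.
struct Vec2 {
    int x;
    int y;
};

// Written with local aggregates. Their addresses never escape the function,
// so an optimizer is free to split each struct into independent scalars.
int dot_with_aggregates(int ax, int ay, int bx, int by) {
    Vec2 a{ax, ay};
    Vec2 b{bx, by};
    return a.x * b.x + a.y * b.y;
}

// Hand-written equivalent after scalar replacement: each field has become
// its own local variable, which can then live in a register.
int dot_scalarized(int ax, int ay, int bx, int by) {
    int a_x = ax, a_y = ay;
    int b_x = bx, b_y = by;
    return a_x * b_x + a_y * b_y;
}

// Both forms compute the same result.
void check_equivalence() {
    assert(dot_with_aggregates(1, 2, 3, 4) == dot_scalarized(1, 2, 3, 4));
}
```

Taking the address of `a` and passing it to an opaque function is the kind of change that can block this transformation, which is one reason it is worth knowing about.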

------
vvanders
Lots of good stuff there: caches (including i-cache and d-cache), volatile (and
how it gets misconstrued), pointer aliasing, float-to-int overhead, alloca. I
don't think I've seen so much good performance stuff in a single place.

------
0xFFC
Does this have a second part? (I saw "1" in the headline on the first page.) I
am asking because I am in love with this. Thank you so much.

~~~
0xFFC
[http://www.agner.org/optimize/](http://www.agner.org/optimize/)

For anyone who wants the complete series.

------
fitzwatermellow
Chapter 14 is twenty-four carat gold. Would love to see optimize.com as a
dedicated site with Rosetta Code-style ports of optimization recipes across
languages, platforms ;)

Quick link to grab all five manuals:

[http://www.agner.org/optimize/optimization_manuals.zip](http://www.agner.org/optimize/optimization_manuals.zip)

~~~
nly
> Chapter 14 is twenty-four carat gold

Maybe, but the bounds-checking optimisation [14.2] just looks dangerous. If
you end up with an erroneous 'int' index of -1 then the original code will
error out, whereas the new code will interpret it as 2^32 - 1. This is a
perfectly valid index in an array of 2^32 values. This optimisation just isn't
equivalent if INT_MAX < size <= UINT_MAX.

Am I missing something?
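
A minimal sketch of the two checks being compared, to make the concern concrete. The function names are invented here, and `size` is deliberately typed `unsigned int` so that it can exceed INT_MAX; that is an assumption for the sake of the example, not how the manual declares it. The `static_assert` lines only restate what the casts guarantee.

```cpp
#include <climits>

// Two-comparison form: a negative index is rejected explicitly.
constexpr bool in_bounds_two_compares(int i, unsigned int size) {
    return i >= 0 && static_cast<unsigned int>(i) < size;
}

// Single-comparison form: a negative i wraps to a large unsigned value,
// which is rejected only if that value is not below size.
constexpr bool in_bounds_one_compare(int i, unsigned int size) {
    return static_cast<unsigned int>(i) < size;
}

// When size fits in a positive int, the two forms agree.
static_assert(!in_bounds_two_compares(-1, 16u), "negative index rejected");
static_assert(!in_bounds_one_compare(-1, 16u), "negative index rejected");
static_assert(in_bounds_one_compare(15, 16u), "last valid index accepted");

// When size exceeds INT_MAX, they can disagree: -2 wraps to UINT_MAX - 1.
static_assert(!in_bounds_two_compares(-2, UINT_MAX), "still rejected");
static_assert(in_bounds_one_compare(-2, UINT_MAX), "accepted after wrap");
```

So the single comparison is a safe replacement as long as `size` is known to be no larger than INT_MAX, and it stops being equivalent once that guarantee is gone.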

~~~
VodkaInferno
size is a const int that holds a positive value, so you won't ever have INT_MAX
< size.

~~~
nly
Yes, and the compiler knows that as well. GCC even merges the two versions
because they're equivalent [0].

My point is, if size is dynamic, this is a dangerous optimisation.

[0] [http://goo.gl/4Ev8as](http://goo.gl/4Ev8as)

------
MichaelMoser123
This optimization guide is a very interesting book.

There is another article, "What Every Programmer Should Know About Memory" by
Ulrich Drepper:
[http://www.akkadia.org/drepper/cpumemory.pdf](http://www.akkadia.org/drepper/cpumemory.pdf)
It has a section on how to optimize memory access and on tools that help in the
process.

I put up my notes/summary on this article (more like a book, at 100+ pages):
[http://mosermichael.github.io/cstuff/all/blog/2015/12/11/wep...](http://mosermichael.github.io/cstuff/all/blog/2015/12/11/wepskn.html)

------
alvern
Thank you for sharing this.

