
Forced Inlining Might Be Slow - ingve
http://aras-p.info/blog/2017/10/09/Forced-Inlining-Might-Be-Slow/
======
userbinator
Note that this is about slow _compilation_, not execution. IMHO if the output
also becomes much faster, the tradeoff is reasonable. It would be interesting
to know how much runtime performance changed, and in what direction, by not
inlining.

On the other hand, the huge difference in compile times suggests there may be
a hidden quadratic or higher complexity algorithm somewhere in the code path
of the compiler when inlining is performed.

~~~
Animats
Right. Of course turning off forced inlining will speed up compilation. The
article is completely silent about what it does to run time.

This is an unusual case, in that it involves code with SIMD instructions.
Out-of-line calls usually mean the arguments have to be put on the stack, then
loaded into the SIMD registers. Inlining opens up the optimization possibility
of not doing that.
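
To make the comparison concrete, here is a minimal sketch of the same helper
compiled both ways. The macro names `FORCE_INLINE` and `NO_INLINE`, the
`float4` struct, and the function names are all hypothetical and not taken
from the article; `float4` is a plain array standing in for a real SIMD type.
Both variants return the same values, so any difference would show up only in
the generated code and in timings, which this sketch does not measure.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical portability macros (names are illustrative only).
#if defined(_MSC_VER)
#  define FORCE_INLINE __forceinline
#  define NO_INLINE    __declspec(noinline)
#else
#  define FORCE_INLINE inline __attribute__((always_inline))
#  define NO_INLINE    __attribute__((noinline))
#endif

// Stand-in for a SIMD vector type: four packed floats.
struct float4 { float v[4]; };

// Forced inline: the body is expanded at each call site, so the compiler is
// free to keep the operands in registers across the call boundary.
FORCE_INLINE float4 add_inlined(float4 a, float4 b) {
    float4 r;
    for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
    return r;
}

// Same body, forced out of line: arguments and the result are passed
// according to the platform calling convention on every call.
NO_INLINE float4 add_outlined(float4 a, float4 b) {
    float4 r;
    for (int i = 0; i < 4; ++i) r.v[i] = a.v[i] + b.v[i];
    return r;
}

// Accumulate n vectors using the inlined helper.
float4 sum_inlined(const float4* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    float4 acc = {{0.0f, 0.0f, 0.0f, 0.0f}};
    for (std::size_t i = 0; i < n; ++i) acc = add_inlined(acc, data[i]);
    return acc;
}

// Accumulate n vectors using the out-of-line helper.
float4 sum_outlined(const float4* data, std::size_t n) {
    assert(data != nullptr || n == 0);
    float4 acc = {{0.0f, 0.0f, 0.0f, 0.0f}};
    for (std::size_t i = 0; i < n; ++i) acc = add_outlined(acc, data[i]);
    return acc;
}
```

Timing `sum_inlined` against `sum_outlined` on a large array, and diffing the
disassembly, would be one way to see what the forced inlining buys at run time
on a given compiler.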

It's not clear what he's writing, but from the math it sounds like a physics
engine for animation or games.

~~~
amagumori
He's a senior developer at Unity.

------
halayli
Aside from compilation time, inlining doesn’t mean your program will run
faster. You’re now filling your icache with the inlined instructions rather
than a single call instruction, and that inlined code might be flushed away
on a branch misprediction.

~~~
smitherfield
True, but less so now than a decade ago; if it's a hot path (which anything
`__forceinline` presumably is, assuming the author has a reasonable
understanding of optimization) inlining is usually going to be a performance
win, and often a cache/code size win as well (optimizing the inlined code,
avoiding register spills).

~~~
rurban
No, it still matters. If it's an unlikely branch and pretty big, it's better to
move it into a separate function if possible. icache pollution is still an
issue.
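
A rough sketch of that pattern, assuming a bounds-checked lookup whose error
path is bulky and rarely taken. The `NO_INLINE` macro and the names
`describe_bad_index` and `checked_get` are hypothetical, made up for
illustration. The `cold` attribute is a GCC/Clang hint, and the MSVC branch
only prevents inlining.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical macro: keep a function out of line, and on GCC/Clang also
// mark it as unlikely to be executed.
#if defined(_MSC_VER)
#  define NO_INLINE __declspec(noinline)
#else
#  define NO_INLINE __attribute__((noinline, cold))
#endif

// Bulky, rarely taken path: string formatting and allocation live here,
// outside the instruction stream of the hot caller.
NO_INLINE std::string describe_bad_index(std::size_t index, std::size_t size) {
    return "index " + std::to_string(index) +
           " out of range for size " + std::to_string(size);
}

// Hot path: a compare, a branch, and a load. Only a call instruction to the
// error handler ends up in the caller's code.
inline bool checked_get(const int* data, std::size_t size, std::size_t index,
                        int& out, std::string& error) {
    assert(data != nullptr || size == 0);
    if (index >= size) {
        error = describe_bad_index(index, size);  // unlikely branch
        return false;
    }
    out = data[index];
    return true;
}
```

Whether this actually helps depends on how big the cold code is and how hot
the caller is, so it is worth measuring rather than assuming.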

