This feels a little unfair; the function is invoked the same number of times, but C++ has a mechanism for removing the overhead of calling a function (inlining).

I looked at the disassembly of the generated binary and, sure enough, the function calls inside quad sort were inlined as well.
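To make the inlining point concrete, here is a minimal sketch of the two calling styles. The names `compare_ints`, `sort_with_qsort`, and `sort_with_std_sort` are made up for illustration and are not taken from the benchmark. With `std::qsort` the comparator is passed as a function pointer into an already-compiled library routine, so each comparison is typically an indirect call. With `std::sort` the comparator is a template argument, so the compiler sees its body and can usually inline it when optimizations are enabled. Whether that actually happens depends on the compiler and flags, so checking the disassembly is still the only way to be sure.

```cpp
#include <algorithm>
#include <cstdlib>
#include <vector>

// Hypothetical C-style comparator. qsort only ever sees its address,
// so the call usually stays an indirect call inside the library's loop.
int compare_ints(const void* a, const void* b) {
    const int x = *static_cast<const int*>(a);
    const int y = *static_cast<const int*>(b);
    return (x > y) - (x < y);  // negative, zero, or positive; no overflow
}

// Sorts a copy of the input through the function-pointer interface.
std::vector<int> sort_with_qsort(std::vector<int> v) {
    std::qsort(v.data(), v.size(), sizeof(int), compare_ints);
    return v;
}

// Sorts a copy of the input through the template interface. The lambda's
// type is part of the std::sort instantiation, which gives the optimizer
// the chance to inline the comparison.
std::vector<int> sort_with_std_sort(std::vector<int> v) {
    std::sort(v.begin(), v.end(), [](int a, int b) { return a < b; });
    return v;
}
```

Both functions perform the same kind of comparison and return the same sorted result; the difference is only in how the comparison reaches the sorting loop.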

