For more clarification. The number 0, 1 in these benchmarks mean the level of indirection. IndirectionSort<0, xxx> is sorting pointers on their address. While IndirectionSort<1, xxx> is sorting pointers on the value it's pointing to. Notice how MergeSort regresses much more then the QuickSort. Without an indirection MergeSort is competitive with an indirection QuickSort becomes relatively much better. As you point out sorting pointers to structs is much more important use case. This property becomes even more important as L1 cache is typically kind of small O(32kb), while L2/L3 cache is large O(1mb). So when sorting pointers it's not unreasonable to hit L2/L3 cache but miss L1. With L2/L3 typically being 10-20 cycles latency (vs 5 cycle for L1) QuickSort is largely insensitive to the L1 cache misses.