

Memory cache optimizations - dbaupp
http://blog.libtorrent.org/2013/12/memory-cache-optimizations/

======
powertower
A dead comment (due to a hellbanned account) from an infamous person here - whose
comments don't usually make sense (because of schizophrenia) - is actually
quite good this time.

You can turn on dead comments in your HN account.

...Has anyone actually tested the performance improvements? Are there any?

~~~
stusmall
I wish he wasn't hellbanned. I enjoy the presence of his comments. They
usually aren't very insightful, sometimes not PC, and usually a little
frightening but he is without a doubt part of the character of this site. I
always keep dead comments on.

------
daviesliu
The cache miss metric in OProfile may be a better solution than this.

~~~
tenfingers
Do you know if the new "perf" tool can also be used for the same purpose?

~~~
minimax
Yes you can use perf to profile on cache misses (perf record -e cache-misses).

~~~
mtanski
Not only that but you can annotate the source and see which function and which
instruction (C or asm) caused it. This way you know exactly which field is the
one causing cache misses.

Ditto for branch misprediction.

------
FooBarWidget
Finally, a useful article about cache optimizations. All the other articles
I've been able to find give vague hints, but no actual, practical,
_measurable_ advice. With these tools I can finally see what's going on in my
code rather than making educated guesses.

~~~
xyzzy123
What I would have been interested to see is the author's analysis of whether
this was actually worth doing or not.

------
jheriko
this is interesting - i would leave a comment there but it's not obvious...

looking at the implementation I am curious how good the coverage is: what
about the array new/delete operators? what about placement new? what about
stack allocations? what about the static data area?

------
fulafel
Sad that this optimization has to be done manually in 2014…

~~~
nostrademons
It's because of backwards compatibility. The order of struct fields in memory
is defined to be their declaration order by the C standard, and a lot of
network protocol code will stop working if that assumption fails. So compilers
are not free to reorder fields in memory.

The packing algorithm for Cap'n Proto [1] is cache-aware, within the bounds of
also accommodating network optimizations, backwards-compatibility, etc. So
yes, newer systems do perform this optimization.

[1] http://kentonv.github.io/capnproto/encoding.html

