

Memory Access Patterns Are Important - jcdavis
http://mechanical-sympathy.blogspot.co.uk/2012/08/memory-access-patterns-are-important.html

======
jandrewrogers
I think programmers frequently underestimate the kinds of performance gains
that can be achieved via memory layout and access pattern optimizations. I do
a lot of high-performance algorithm design and have frequently seen the case
that what looks like a trivial and logically neutral algorithm implementation
tweak doubles performance by virtue of having better memory access behaviors.
For heavy duty memory access optimization, 10x is not unachievable in some
cases.

In my experience designing high-performance algorithms on modern processors,
I'd estimate 80% of all performance optimizations are actually memory behavior
hacks. Designing better algorithms is another 10% and is harder to do (most
software engineers have never designed a genuinely original algorithm). The
last 10% is everything else, such as redesigning the code to get closer to
saturating all 3 ALUs available on modern cores. (Code hacks to force ALU
saturation sounds like it shouldn't do much but I've seen 2x performance gains
for some algorithms where the "obvious" implementation tended to serialize
over one ALU if you analyze them.)

For high-performance code junkies, memory access optimization is the bread and
butter for integer factor performance gains.

~~~
sliverstorm
I'm not a programmer by trade, but I regularly process chunks of data 10GB or
larger. In this context fast vs. slow often means nigh-instant vs. twenty
minutes of runtime.

In summary, from my experience- I agree. Memory access (also, storage access)
is huge. Much bigger than tight loops or logical reduction/optimization.

------
malkia
This is one of the reasons why textures in the GPU's are swizzled -
<http://en.wikipedia.org/wiki/Swizzling_(computer_graphics)>

From the old demo days, there was the famous rotozoom effect, and it showed
how linearly arranged array when rotated 90 degrees was becoming way slower,
and if some swizzling was done it was better.

Another example is texture mipmaps. If you have one large image, say 4096x4096
without mipmaps, and you have to scale it down and display at say 64x64 - then
there would be a lot of cache misses (and aliasing problems - shimmering,
etc.)

And those are just the most obvious problems, there are lot of more tricky
ones.

Judy Arrays were written with that in mind, but maybe they are overly comlex
for the things achieved - <http://judy.sourceforge.net/>

~~~
spitfire
Was that demo lasse reining or toasted by cubic team and $een perhaps? iirc,
Pascal wrote an article about optimizing that code for the 486's cache.

------
zippie
The need to access memory efficiently is also applicable beyond the JVM:

<http://news.ycombinator.org/item?id=4339089>

~~~
jackowayed
It basically has nothing to do with the JVM. These speedups are entirely a
function of how modern processors work. To some extent the platform/language
matters because its guarantees determine how easily you can achieve good
memory locality.

(For example, one could implement a java compiler that correctly made all Java
code work, but gave absolutely no memory locality--different instance
variables of an object could be arbitrarily far apart, and you could even
implement arrays as linked lists (or at least chunk lists) so that arrays
don't guarantee locality either. One couldn't make the optimizations he made
under those conditions. But any language where you know what data will be
contiguous allows you to make these optimizations.)

I'd guess that you knew this, but your comment suggests that there's something
special about the JVM that makes it especially well-suited for these
optimizations, which might confuse readers.

~~~
zippie
I was trying to convey the opposite, as you stated, that inefficient memory
access is not limited to the JVM.

------
vampirechicken
Everything hard, or not well understood by the programmer is now branded a
pre-optimization and hand-waved into the "we'll fix it when it becomes a
problem" category.

