
Graph algorithms and software prefetching - based2
https://lemire.me/blog/2018/05/24/graph-algorithms-and-software-prefetching/
======
rurban
I knew he was wrong in his older article arguing against __builtin_prefetch[1],
because my own benchmarks showed otherwise. But of course it depends on the
structures (compile-time) and the data (run-time). With lots of randomly
spread out data, prefetch might be too expensive, but when e.g. a compacting
GC has laid out my data close together, it wins.
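
For illustration, a minimal sketch of the kind of loop where this matters,
assuming GCC or Clang. The names gather_sum and PREFETCH_DISTANCE are made up
for this example, and the distance of 8 is a guess that would need tuning per
machine and data set:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tuning knob: how many iterations ahead to prefetch. */
#define PREFETCH_DISTANCE 8

/* Sums data[idx[i]] for i in [0, n), hinting at the element that will be
 * needed PREFETCH_DISTANCE iterations later. */
uint64_t gather_sum(const uint32_t *data, size_t data_len,
                    const uint32_t *idx, size_t n)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Arguments: address, 0 = read access, 1 = low temporal locality. */
            __builtin_prefetch(&data[idx[i + PREFETCH_DISTANCE]], 0, 1);
        }
        assert(idx[i] < data_len); /* debug-only bounds check */
        sum += data[idx[i]];
    }
    return sum;
}
```

Whether the hint pays off depends on how idx is distributed: the result is the
same with or without the prefetch, only the timing changes, so it has to be
benchmarked on real data.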

And of course for graphs: avoid the costly pointers, think of relatively short
links instead. And best of all: avoid graphs. Use trees or arrays (tries)
instead.
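
A rough sketch of what "relatively short links" could look like, assuming the
nodes live in one contiguous array and each link is a small signed offset from
the node's own slot. The rel_node layout and the rel_chain_sum helper are
hypothetical, not taken from any particular library:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A node that refers to its successor by a signed 16-bit offset relative to
 * its own index in the array. An offset of 0 means "no successor". */
typedef struct {
    int32_t value;
    int16_t next_rel;
} rel_node;

/* Follows the relative links starting at `start` and sums the values.
 * Visits at most `len` nodes, so a cycle cannot loop forever. */
int64_t rel_chain_sum(const rel_node *nodes, size_t len, size_t start)
{
    int64_t sum = 0;
    size_t i = start;
    for (size_t steps = 0; steps < len; steps++) {
        assert(i < len); /* debug-only: the link must stay inside the array */
        sum += nodes[i].value;
        if (nodes[i].next_rel == 0) {
            break;
        }
        i = (size_t)((ptrdiff_t)i + nodes[i].next_rel);
    }
    return sum;
}
```

On typical 64-bit ABIs such a node is likely half the size of one holding a
full pointer, and short offsets tend to keep linked nodes within a few cache
lines of each other.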

BTW GraphQL is such a nonsense approach. A graph can contain pointers not only
to parents and siblings, but even to any other random node, and by searching
in a graph you can always run into endless cycles. That's why you avoid graphs
in a database. You avoid the additional serialization or the hash table to fix
the recursive cycles. That's why you use proper relations, trees, but for sure
not "graphs" or "objects". Whenever a PM asks you to use this fancy new graph
or object DB, tell him that this is nonsense. Instead, structure your data
properly so that you can avoid graphs, the most costly of all data
structures.

[1] https://lemire.me/blog/2018/04/30/is-software-prefetching-__builtin_prefetch-useful-for-performance/

------
dekhn
I worked on a large distributed machine learning pipeline. We used software
prefetch via Intel instructions: the training algorithm knew that it would be
operating on a bunch of examples in the memory space surrounding a fetched
item, so it explicitly requested a prefetch.
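
To make the idea concrete, here is a small sketch of prefetching the examples
around a fetched item, written with the GCC/Clang builtin rather than the
Intel intrinsics mentioned above. The function names, the float example type
and the 64-byte cache line are assumptions for illustration only, not the
actual pipeline code:

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_BYTES 64 /* assumed cache line size */

/* Clamped window [lo, hi) of `radius` examples on each side of `center`. */
static size_t window_lo(size_t center, size_t radius)
{
    return center > radius ? center - radius : 0;
}

static size_t window_hi(size_t n, size_t center, size_t radius)
{
    return (n - center > radius) ? center + radius + 1 : n;
}

/* Issues one read prefetch per assumed cache line in the window. */
void prefetch_neighbourhood(const float *examples, size_t n,
                            size_t center, size_t radius)
{
    assert(center < n);
    const char *p = (const char *)&examples[window_lo(center, radius)];
    const char *end = (const char *)&examples[window_hi(n, center, radius)];
    for (; p < end; p += CACHE_LINE_BYTES) {
        __builtin_prefetch(p, 0, 3); /* read access, keep in all cache levels */
    }
}

/* Stand-in for the real work: sums the examples in the same window. */
double sum_neighbourhood(const float *examples, size_t n,
                         size_t center, size_t radius)
{
    assert(center < n);
    double sum = 0.0;
    for (size_t i = window_lo(center, radius);
         i < window_hi(n, center, radius); i++) {
        sum += examples[i];
    }
    return sum;
}
```

The intended use would be to call prefetch_neighbourhood for the next item
while still working on the current one, so the loads have time to complete
before sum_neighbourhood touches the data.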

It was really hard to measure speedups; the prefetches worked fine if the code
ran on an otherwise idle machine with ample spare resources. But if other
code was running on the machine and hosing L2, the prefetch quickly caused
performance to drop.

In retrospect, explicitly bundled data and explicit (software) prefetch worked
more reliably under most scenarios.

