The links are not only about loops. Their general conclusion is that deciding whether or not to prefetch is very hard, and that most of the time the CPU has far more information available to make a well-informed decision. They also say that unless you prove with benchmarks that a manual prefetch helps, it is probably wrong.

Now, this is of course the general case. If you control the whole algorithm and its data structures during execution, a well-crafted prefetch /can/ be beneficial. Again, the general idea of the links I posted is that the CPU /generally/ has more information about the /overall/ system state to make a correct prefetching choice. I think the info/links are useful and interesting even if they do not apply to the specific case in TFA.
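
To make the "you control the algo and data structures" case concrete, here is a minimal sketch, not anything taken from TFA or the links. It assumes GCC or Clang (for `__builtin_prefetch`), and the function name `sum_indirect` and the `PREFETCH_DISTANCE` value are made up for illustration. The indices are known in advance, but the accesses are irregular enough that a hardware prefetcher may not guess them:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lookahead distance; the right value, if any, has to be
   found by benchmarking on the target machine. */
#define PREFETCH_DISTANCE 8

/* Sum data[idx[i]] for i in [0, n). The index array tells us which
   elements will be needed a few iterations from now. */
long sum_indirect(const long *data, const size_t *idx, size_t n)
{
    assert(n == 0 || (data != NULL && idx != NULL));

    long sum = 0;
    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__) || defined(__clang__)
        if (i + PREFETCH_DISTANCE < n) {
            /* Only a hint: read access (0), low temporal locality (1).
               The CPU is free to ignore it. */
            __builtin_prefetch(&data[idx[i + PREFETCH_DISTANCE]], 0, 1);
        }
#endif
        sum += data[idx[i]];
    }
    return sum;
}
```

Whether this is faster than the plain loop depends entirely on the data size, the access pattern and the CPU, which is exactly why the links insist on benchmarking before keeping it.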

To clarify a bit more: I didn't post that to contradict the article, just to provide some related info.

