

Replay: Unknown Features of the NetBurst Core (2005) - nkurz
http://www.xbitlabs.com/articles/cpu/print/replay.html?

======
nkurz
While it might sound from the title like this article is hopelessly out of
date, I think it's still highly relevant to the current generation of Intel
processors. I came across it in a footnote to Agner Fog's excellent
[http://www.agner.org/optimize/microarchitecture.pdf](http://www.agner.org/optimize/microarchitecture.pdf) [1].

The article details (what I think is) an otherwise undocumented 'replay'
mechanism: how the processor deals with data dependencies that aren't
resolved on the expected schedule, including L1 cache misses, TLB misses,
and failed Store-Load forwards.

[1] Footnote to self: Read the rest of Agner's footnotes!

~~~
raverbashing
AFAIK the replay feature is present only on NetBurst processors, i.e. the
P4 line (Northwood onwards; I'm not sure it was on the Willamette ones).

Modern Intel processors do not use the NetBurst architecture.

~~~
nkurz
Do you have specific knowledge about how more modern processors handle these
cases? I was particularly excited to find this article because it was the only
source that explained the hardware counter activity I found here:
[http://fastcompression.blogspot.com/2014/09/counting-bytes-f...](http://fastcompression.blogspot.com/2014/09/counting-bytes-fast-little-trick-from.html)

Yann was trying to write a fast histogram to record the number of occurrences
of each character. But the simple version of the program was much slower than
expected. After a fair amount of digging, it was determined that "impossible"
store forwarding was a factor. Checking the performance counters seemed to
confirm that many of the loads were being replayed (executed many times before
being retired). Is there another explanation for this?
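
To make the pattern concrete, here is a rough sketch of the kind of loop I mean. This is not the code from the linked post: the function names are made up for illustration, and the four-table variant is only my assumption about how one might keep neighbouring updates from touching the same counter. With long runs of the same byte, the naive loop has each increment loading the value the previous iteration just stored.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Naive version: a single table. On runs of equal bytes, every increment
   loads the counter that the previous iteration has just stored. */
void histogram_naive(const uint8_t *src, size_t len, uint32_t counts[256])
{
    memset(counts, 0, 256 * sizeof counts[0]);
    for (size_t i = 0; i < len; i++)
        counts[src[i]]++;
}

/* Split version (hypothetical name): four separate tables, so adjacent
   input bytes update different memory locations. The tables are summed
   into the caller's array at the end. */
void histogram_split4(const uint8_t *src, size_t len, uint32_t counts[256])
{
    uint32_t t[4][256];
    size_t i = 0;

    memset(t, 0, sizeof t);

    /* Main loop: consume four bytes per iteration, one per table. */
    for (; i + 4 <= len; i += 4) {
        t[0][src[i + 0]]++;
        t[1][src[i + 1]]++;
        t[2][src[i + 2]]++;
        t[3][src[i + 3]]++;
    }

    /* Tail: the remaining 0 to 3 bytes go into the first table. */
    for (; i < len; i++)
        t[0][src[i]]++;

    /* Merge the four partial histograms. */
    for (int s = 0; s < 256; s++)
        counts[s] = t[0][s] + t[1][s] + t[2][s] + t[3][s];
}
```

Both functions should produce identical counts; whether the second is actually faster depends on the input and the processor, so it needs to be measured rather than assumed.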

~~~
raverbashing
From this, it looks like it might really be something similar:
[https://docs.google.com/document/d/18gs0bkEwQ5cO8pMXT_MsOa8X...](https://docs.google.com/document/d/18gs0bkEwQ5cO8pMXT_MsOa8Xey4NEavXq-OvtdUXKck/pub)

I'm not very familiar with the more modern parts of the Core architecture.

~~~
nkurz
Yes, I think so too! But I'm almost surely biased. You probably didn't notice,
but the "Nathan Kurz" who wrote that email and the 'nkurz' that posted this
are both me. That's why I was excited to finally find some sort of external
confirmation. :)

~~~
raverbashing
Aaah sorry I didn't notice :)

Sorry, my knowledge of microprocessor internals has been going down these
past years (and they are becoming more complex).

