
Agner Fog - ibobev
https://www.agner.org/
======
gpderetta
The link is presented without commentary, but for those who do not know, Agner
Fog's manuals are pretty much the bible on x86 microarchitectural details and
optimization.

~~~
madspindel
Why does the page look like a joke?

~~~
compiler-guy
To filter out people who would judge a book by its cover.

Also, this is an ancient web site, built back before everyone had to have
glitzy junk. So it has a little whimsy. Did you notice how fast it loaded?

~~~
JdeBP
I noticed that it used <a> for fragment anchors instead of the id attribute. I
changed over most of those on my WWW site some years ago.
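
For anyone who wants to check their own pages, here is a minimal sketch (not anything from the site itself) that lists which fragment targets in a page come from `id` attributes and which come from the older `<a name>` style. The class name, function name, and sample markup are all made up for illustration; only Python's standard `html.parser` module is assumed.

```python
from html.parser import HTMLParser


class FragmentTargetCollector(HTMLParser):
    """Collect the names a URL fragment (#name) could jump to."""

    def __init__(self):
        super().__init__()
        self.by_id = []      # targets declared with id="..." on any element
        self.by_a_name = []  # legacy targets declared with <a name="...">

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.by_id.append(attrs["id"])
        if tag == "a" and attrs.get("name"):
            self.by_a_name.append(attrs["name"])


def fragment_targets(html):
    """Return (id targets, <a name> targets) found in the given HTML text."""
    collector = FragmentTargetCollector()
    collector.feed(html)
    collector.close()
    return collector.by_id, collector.by_a_name


# Hypothetical page mixing both styles.
SAMPLE = """
<h2><a name="manuals"></a>Optimization manuals</h2>
<h2 id="testing">Test programs</h2>
<p>Jump to <a href="#manuals">manuals</a> or <a href="#testing">testing</a>.</p>
"""

if __name__ == "__main__":
    ids, names = fragment_targets(SAMPLE)
    print("id targets:", ids)
    print("<a name> targets:", names)
```

Both styles still resolve in browsers; anything that shows up in the second list is a candidate for moving the name onto the heading itself as an `id`.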

~~~
exikyut
Well, I just learned that the id attribute is what one should use for fragment
anchors. All the HTML source code I've read (probably within the last year or
so) used <a>. :/

Perhaps the moral here is that "newer" code was so obfuscated (minified;
framework-infested) that I wasn't able to read it, so I only ever ended up
looking at very dated HTML to fish examples out of. (Figuring out how to
phrase something to Google can sometimes be slower and/or harder than simply
looking in existing source code where you know you'll find something...)

FWIW, I started tinkering with HTML circa 2003 and have slowly (not especially
concertedly) studied it over the years, which is my excuse^Hdefense.
HTML5/CSS3 aren't especially intimidating to me (thanks to browser
compatibility :D) and I could probably photocopy 60-80% of the designs I see
here given significant time (this likely describes a lot of people here), but
I don't really actively develop for the Web, so little oversights like this
happen. _Goes and reads how to include a CSS file again_

~~~
mbreese
I'm a bit ashamed to admit that you just taught me that anchor links work with
id attributes. I just always used <a name> tags... which I rarely used, but
still. I guess my excuse is that I started with HTML around 1996 :)

I guess it's a good thing that I don't do much frontend coding anymore!

~~~
exikyut
Important clarification: it was JdeBP's comment that taught _me_ that one
uses id attrs for anchors! That was the point I was making in my previous
comment.

I think I followed a similar path to what you did, albeit 7 years later. IMO
frontend dev is fine, given the time that seems to be required in order to
stumble on these "things we didn't know we needed to know", whether by
concerted tuition/learning or simple wandering.

------
glangdale
Agner is a legend. I had the mildly hilarious experience of, during our
acquisition by Intel, seeing that one of Intel's most talented network
engineers (the sadly departed Venky Venkatesan) was using Agner's material as
his reference rather than some super-secret awesome Intel docs.

His work is an even-handed and well-presented coverage of modern architecture.
I'm perennially amazed by the way that people will regularly spout off about
what might or might not be expensive or cheap on modern IA when you could
_just go look it up in Agner's work_.

~~~
ebikelaw
Intel's documentation is unbelievably bad. A damning statement on their
documentation is that Google staffs a project to parse their horrible PDF and
compare the supposed instruction latency and throughput figures to real
measurements.
[https://github.com/google/EXEgesis](https://github.com/google/EXEgesis)

~~~
glangdale
[ ex-Intel, although not connected to docs, and I did wind up submitting a
number of corrections which got taken up ]

I think "unbelievably bad" is unfair; specifically, compared to what? If you
think Intel docs are bad you must not have read many equivalent docs for other
processors. We worked on a range of processors for the Hyperscan project
before version 4.x, and while Intel docs leave a lot to be desired, you should
see the other guys.

That said, there are a lot of things that need to be fixed.

L/T numbers are always idiosyncratic as there are a bunch of ways of getting
them and none perfectly predict behavior for all instructions and all cases.

L/T numbers in the ORM frequently 'go missing' which isn't great.
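
To make the latency/throughput distinction concrete, here is a rough first-order sketch. It assumes the usual reading of L/T as latency and reciprocal throughput, and the figures in it are hypothetical, not taken from any manual. It is exactly the kind of simple model that does not predict every case, which is the point above.

```python
def dependent_chain_cycles(n_ops, latency):
    """Each op waits for the previous result, so latency dominates."""
    return n_ops * latency


def independent_stream_cycles(n_ops, latency, reciprocal_throughput):
    """Ops overlap: one starts every `reciprocal_throughput` cycles,
    and the last one still needs `latency` cycles to finish."""
    if n_ops == 0:
        return 0
    return (n_ops - 1) * reciprocal_throughput + latency


# Hypothetical figures for illustration only, not taken from any manual.
LATENCY = 3
RECIPROCAL_THROUGHPUT = 1
N = 1000

if __name__ == "__main__":
    print("dependent chain:", dependent_chain_cycles(N, LATENCY))
    print("independent ops:",
          independent_stream_cycles(N, LATENCY, RECIPROCAL_THROUGHPUT))
```

With these made-up numbers the same 1000 operations cost roughly three times as many cycles when each one depends on the previous result, which is why a single "cost" per instruction is never the whole story.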

The ORM is a giant 'additive song' that gets a new chapter with each new
extension, and tries to provide optimization direction for over 10 years' worth
of processors. There's still Pentium 4 material in there!

There's somewhat of a culture of secrecy about the architecture that Intel
would actually benefit from relaxing. It's clear that other vendors know how
to get close to IA in performance or even pass it on some metrics. So I think
it would benefit Intel a great deal to help people build better mental models
of how their processors work.

~~~
ebikelaw
The reason it is "unbelievably bad" is because it's not machine-readable. It's
a PDF that comes out of FrameMaker for crying out loud. Some parts of it are
not even human-readable, such as this garbled chart in ORM 3.7.6.1, the
otherwise rather confusing description of the possible benefits of REP MOVSB
on Ivy Bridge or later, under certain circumstances, maybe.

I mean, the image is just totally unreadable.
[https://imgur.com/a/jqxpqAf](https://imgur.com/a/jqxpqAf)

~~~
glangdale
I agree that it could be improved, but I think "unbelievably bad" is
hyperbole.

This is especially true in light of comparisons to other vendors; I'm not
aware of any up-to-date machine-readable information that's the equivalent of
Appendix C for any other vendor, and historically (pre Intel acquisition) we
had trouble finding this information out _at all_ for some of our platforms to
which we had Hyperscan ports. It was either out of date, not available, or
behind a paywall ("show you've bought our SDK to get basic processor docs").

------
DyslexicAtheist
I didn't know where to start, so I stumbled over a wealth of information here
[https://www.agner.org/cultsel/](https://www.agner.org/cultsel/) and it looks
like my weekend will be filled with studying this fascinating PDF:

[https://www.agner.org/cultsel/warlike_peaceful.pdf](https://www.agner.org/cultsel/warlike_peaceful.pdf)

> _Regality theory is a theory saying that people show a preference for strong
> leadership in times of war or collective danger, but a preference for an
> egalitarian political system in times of peace and safety. These
> psychological preferences in individuals are reflected in the political
> structure and culture of the society. A society in danger will develop in
> the direction called regal, which includes strong nationalism, discipline,
> strict religiosity, patriarchy, strict sexual morals, and perfectionist art.
> A society in peace will develop in the opposite direction called kungic,
> which includes egalitarianism and tolerance. This book is both theoretical
> and experimental. The theoretical explanation of the regal-versus-kungic
> dimension is based on evolutionary psychology and human ecology.
> Contributions from the social sciences and the humanities are added to
> further analyze historical examples of regal and kungic developments. The
> theory is tested on data from both contemporary and ancient societies. These
> tests confirm the predictions of the theory._

thanks OP

~~~
fsloth
This is actually why Agner is famous:
[https://www.agner.org/optimize/](https://www.agner.org/optimize/)

------
CalChris
Hennessy and Patterson don't cite Fog and that's just crazy. H+P then get
basic facts about x86 architecture and microarchitecture wrong: "The length of
80x86 instructions varies between 1 and 17 bytes." CA 5th, p A-23. No, it's 15
bytes as per the Intel Software Developer Manual.

Seriously, any practitioner should be reading Fog. Of course, they should read
the Intel + AMD optimization manuals, Dr. Bandwidth, Developer Zone, the
patents, but they definitely should be reading Fog.

Thanks Agner.

~~~
rayiner
The microarchitecture reference also has some educational illustrations of
how microprocessor design plays out in the real world. _E.g._ section 9.9:

> Register read stalls has been a serious, and often neglected, bottleneck in
> previous processors since the Pentium Pro. All Intel processors based on the
> P6 microarchitecture and its successors, the Pentium M, Core and Nehalem
> microarchitectures have a limitation of two or three reads from the
> permanent register file per clock cycle.

> This bottleneck has now finally been removed in the Sandy Bridge and Ivy
> Bridge. In my experiments, I have found no practical limit to the number of
> register reads.

This (and the preceding related discussion) gives you a ton of insight into
the limitations Intel engineers were facing and the trade-offs they made.
Here, ports on a register file have a cost--they require transistors, may
limit cycle time, etc. Intel realized that in many cases, the operands to an
instruction would be available on the bypass network, and you could get away
with having fewer ports on the register file than you'd theoretically need
given the number of functional units.

------
lossolo
Just be aware that some of the manuals there are partly outdated. For example,
_Optimizing software in C++: An optimization guide for Windows, Linux and Mac
platforms_. Compilers now perform a lot of the optimizations from this manual
automatically. The smart pointers section presents auto_ptr as the most used
smart pointer, although it was deprecated years ago (in C++11, and removed in
C++17). He also writes that composite types will almost always be copied out
of the function if returned by value, while we now have move semantics,
guaranteed copy elision, etc.

------
goodusername
I had Agner as a teacher when studying IT-engineering. At the time I didn't
know anything about him really, but always found him pleasant and
knowledgeable. Looking back now, I wish I had spent some more time picking his
brain about his expertise.

The webpage is awesome in my opinion. It's functional, simple and with a good
amount of humor :) It's a good example of a content-first approach.

------
fernly
I see he has a design for "a new open instruction set architecture and the
corresponding hardware and software standards" [1]. It occurred to me to
wonder if it had any relation to RISC-V, another open-source ISA [2], but
apparently none at all.

[1] [https://forwardcom.info/](https://forwardcom.info/)

[2]
[https://en.wikipedia.org/wiki/RISC-V](https://en.wikipedia.org/wiki/RISC-V)

~~~
fernly
I was incorrect: in a subordinate document [3] he compares his ISA to RISC-V,
OpenRISC, and others.

[3]
[https://www.forwardcom.info/comparison.html](https://www.forwardcom.info/comparison.html)

