I learned about MIPS CPUs in college, 20 years ago, with Patterson and Hennessy, and built one from logic gates. It was basically a high-powered 6502 with 30 extra accumulators, but split across the now-classic pipeline stages.
I understand that modern CPUs are much different than this. I understand that there are deeper pipelines and branch predictors and superscalar execution and register renaming and speculative execution (well, maybe a bit less than last year) and microcode. Also, I imagine they're not built by people laying out individual gates by hand. But since I can only interact with any of these things indirectly, I have no basis for really understanding them.
How does anyone outside Intel learn about microcode?
- Agner Fog's instruction tables for x86, which has the latency, pipeline, and ALU concurrency information for a wide range of instructions on various microarchitectures.
- Brief microarchitecture overviews (such as this one for Skylake), that have block diagrams of how all the functional units, memory, and I/O for a CPU are connected. These only change every few years and the changes are marginal so it is easy to keep up.
Knowing the bandwidth (and number) of the connections between functional units and the latency/concurrency of various operations allows you to develop a pretty clear mental model of throughput limitations for a given bit of code. People that have been doing this for a long time can look at a chunk of C code and accurately estimate its real-world throughput without running it.
Thanks for sharing the above very useful resources, blocked my Saturday for first scan!