This was very interesting, but is there some kind of formatting failure that prevents the actual instruction definitions (the syntax) from being shown alongside the explanations? Seemed ... oddly confusing. Even if these are undocumented "Black market" instructions, someone must have picked mnemonics?
Some of the headings in the document are white and blend into the background. Click the moon icon in the upper left corner to enable dark mode and make them visible.
As I've been studying the history of CPUs, I'm not so sure that RISC really makes sense to talk about outside of the original period in which it was created. It was a belief, at that point in time, that the budget allocated to decoding complex instructions was better spent on more registers and cache, and on allowing operations to clock faster due to their simplicity.
Now that we have speculative execution, cracking into micro-ops and multi-issue units, it's really not RISCy at all. POWER and x86 really have more in common than they don't.
Right, fundamentally RISC was about aligning the instruction set with the hardware. At the hardware level, you can’t do arithmetic on a value in memory, you have to load it into a register first. CISC might still give you an add instruction that operates on memory, and behind the scenes it does load, add. RISC says, let’s just do those more basic operations, and the program can combine them. The hardware can be much simpler as a result.
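To make the load/add point concrete, here is a rough sketch of a toy machine in Python. Everything in it (`ToyMachine`, `add_mem`, the four-register file) is made up for illustration and does not model any real ISA; it only shows that a memory-operand add still performs a hidden load, while the RISC-style program spells the same steps out.

```python
class ToyMachine:
    """Hypothetical machine: a flat memory and a tiny register file."""

    def __init__(self, memory):
        self.mem = list(memory)
        self.reg = [0] * 4

    # RISC-style primitives: memory access and arithmetic are separate.
    def load(self, rd, addr):
        self.reg[rd] = self.mem[addr]

    def add(self, rd, rs1, rs2):
        self.reg[rd] = self.reg[rs1] + self.reg[rs2]

    # CISC-style "add rd, [addr]": one instruction for the programmer,
    # but internally the operand is still loaded before the ALU sees it.
    def add_mem(self, rd, addr):
        hidden_tmp = self.mem[addr]               # implicit load
        self.reg[rd] = self.reg[rd] + hidden_tmp  # then the add


def run_cisc(memory):
    m = ToyMachine(memory)
    m.load(0, 0)
    m.add_mem(0, 1)   # load + add folded into one instruction
    return m.reg[0]


def run_risc(memory):
    m = ToyMachine(memory)
    m.load(0, 0)
    m.load(1, 1)      # the load is now visible to the program
    m.add(0, 0, 1)
    return m.reg[0]
```

Both functions compute the same sum; the only difference is whether the load is hidden inside the instruction or written out by the program.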
Now, the hardware is stupidly complex anyway and the instructions generally don’t line up too well with the underlying hardware regardless. Consider branch delay slots. Initially, this lined up well with the hardware. The branch takes extra time, so let’s expose that to the program instead of hiding it, then the hardware gets simpler and programs can take advantage of it. Trouble is, once your pipelining gets more complex than “branches take an extra cycle and we can perform another instruction in that time” then it doesn’t align with the hardware anymore. It makes things worse, since now your fancier hardware has to support this weird thing that doesn’t match how it actually operates.
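For anyone who hasn't run into delay slots, a minimal sketch of the semantics in Python may help. The three-instruction "ISA" (`addi`, `jmp`, `halt`) and the `run` function are invented for this example and are not taken from any real architecture; the only thing being shown is that, with a delay slot, the instruction right after the jump still executes before control transfers.

```python
def run(program, delay_slot):
    """Interpret a toy program and return the accumulator.

    ("addi", n)   adds n to the accumulator
    ("jmp", t)    jumps to instruction index t
    ("halt",)     stops
    With delay_slot=True, the instruction following a jmp runs
    before the jump takes effect.
    """
    acc = 0
    pc = 0
    pending = None  # jump target waiting for the delay slot to finish
    while True:
        op = program[pc]
        next_pc = pc + 1
        if pending is not None:
            # This instruction is the delay slot; jump once it is done.
            next_pc = pending
            pending = None
        if op[0] == "addi":
            acc += op[1]
        elif op[0] == "jmp":
            if delay_slot:
                pending = op[1]   # defer the jump by one instruction
            else:
                next_pc = op[1]   # jump immediately
        elif op[0] == "halt":
            return acc
        pc = next_pc


PROGRAM = [
    ("addi", 1),
    ("jmp", 4),
    ("addi", 10),    # sits in the delay slot
    ("addi", 100),   # skipped either way
    ("halt",),
]
```

Without a delay slot the program returns 1; with one, the `addi 10` after the jump also runs and the result is 11. That extra rule is cheap on a simple pipeline but becomes one more special case a more complex design has to emulate.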
> fundamentally RISC was about aligning the instruction set with [1970s] hardware
because once you start getting into superscalar designs in the 1990s, you are already starting to have slippage between the ISA and the implementation.
It's also worth remembering that CISC was a product of making assembly higher level and exposing 'instructions' that are basically function calls, to make assembly programming more productive and reusable at a time when compilers were still an undertaking and not a weekend project.
In the future with the proliferation of VM based languages I could see CPUs exposing their microcode publicly so that the JITs can target the actual hardware more effectively with less concern about compatibility because only the VMs need to get ported and really there are only 4 major ones (JVM, CLR, V8, and Python).
> In the future with the proliferation of VM based languages I could see CPUs exposing their microcode publicly so that the JITs can target the actual hardware
IBM AS/400 did this, except that (as a closed system) the microcode layer is not public (and these days is just some POWER variant anyway).
It's also sort of what Itanium tried, except that they made the same mistake of entrenching implementation in the ISA... and that most real-world code does not have the same usage patterns as processor-implementation microcode.
> In the future with the proliferation of VM based languages I could see CPUs exposing their microcode publicly
Like what Lilith and Modula-2 did back in the early eighties [0]? My original snarky comment was primarily aimed at all those bit-fiddling instructions: on one hand, while you can emulate them with shifts and bitwise logical instructions in a loop, having direct hardware support is not only much nicer, it's also quite cheap and simple, it's literally just several muxes connected together. But on the other hand, how often does one really need those instructions? They take precious encoding space.
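As a rough illustration of the "shifts and bitwise logic in a loop" fallback, here are two generic bit-fiddling operations written that way in Python. Population count and bit reversal are picked as typical examples only, not necessarily the instructions the article covers, and the function names and the `width` parameter are assumptions made for this sketch.

```python
def popcount_loop(x, width=32):
    """Count the set bits in the low `width` bits of x, one bit per iteration."""
    count = 0
    for _ in range(width):
        count += x & 1   # test the lowest bit
        x >>= 1          # move the next bit into place
    return count


def bit_reverse_loop(x, width=32):
    """Reverse the order of the low `width` bits of x."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (x & 1)  # append x's lowest bit to result
        x >>= 1
    return result
```

Each call runs `width` iterations of shift/and/or, where dedicated hardware would do the same job as a fixed piece of wiring in a single instruction.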
Nice article! Do you have a board to experiment on or is this all on QEMU? I've been wanting to try one of the Banana Pi boards, but I think it's going to be _quite_ painful.