My favorite feature was this: stack frames were cache aligned, and the line under the stack pointer would be marked valid on function entry (so that storing the callee-saved registers wouldn't bring in a line that would be overwritten anyway) and invalidated on exit (so the cache wouldn't have to write back stale data). As he said, this effectively enabled them to treat the cache as an extended register file. Strangely, nobody has copied this idea.
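To make the saving concrete, here is a toy model of the idea as I understand it, not the 801's actual mechanism: a write-back, write-allocate cache that only counts memory traffic. The names `ToyCache`, `establish`, `invalidate` and `traffic`, the 32-byte line size, and the one-line frame are all my own assumptions for illustration.

```python
LINE = 32  # bytes per cache line (assumed value, not taken from the 801)


class ToyCache:
    """Write-back, write-allocate cache that only counts memory traffic."""

    def __init__(self):
        self.lines = {}      # line number -> dirty flag
        self.fetches = 0     # lines read from memory
        self.writebacks = 0  # dirty lines written to memory

    def store(self, addr):
        line = addr // LINE
        if line not in self.lines:
            self.fetches += 1        # write-allocate miss: fetch the line first
        self.lines[line] = True      # line is now dirty

    def establish(self, addr):
        # Hypothetical hint: mark the line valid without fetching it.
        self.lines.setdefault(addr // LINE, False)

    def invalidate(self, addr):
        # Hypothetical hint: drop the line without writing it back.
        self.lines.pop(addr // LINE, None)

    def flush(self):
        # Stand-in for eventual eviction: every dirty line costs a write-back.
        self.writebacks += sum(self.lines.values())
        self.lines.clear()


def nested_calls(cache, sp, depth, use_hints):
    """Model `depth` nested calls, each with a one-line, line-aligned frame."""
    if depth == 0:
        return
    frame = sp - LINE
    if use_hints:
        cache.establish(frame)       # on entry: no fetch for the new frame
    for offset in range(0, LINE, 4):
        cache.store(frame + offset)  # save callee-saved registers
    nested_calls(cache, frame, depth - 1, use_hints)
    if use_hints:
        cache.invalidate(frame)      # on exit: the frame contents are dead


def traffic(use_hints, depth=50):
    cache = ToyCache()
    nested_calls(cache, 1 << 20, depth, use_hints)
    cache.flush()
    return cache.fetches, cache.writebacks
```

In this model, `traffic(False)` pays one fetch and one write-back per frame, while `traffic(True)` generates no memory traffic for the frames at all, which is the "cache as extended register file" effect in miniature.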
There are great stories about the development, but very little other detail is available and, AFAICT, very few were actually built. The 801 was not a single-chip design, and RISC didn't take off until it was implemented on a single chip (which enabled much higher clock frequencies and was the true key to its success).
ADDED: Also, their simulator had a neat trick to make it fast: every I$ line had an 8X (IIRC) shadow which held, for each fetched instruction, a short sequence of host instructions emulating it. Thus, to simulate an 801 instruction, they would just ensure the line was fetched and branch directly to the corresponding address in the simulation cache, where execution could fall through to the next instruction until either a branch or the end of the line was reached.
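A minimal sketch of that shadow-cache scheme, assuming a made-up three-instruction guest ISA (`addi`, `bnez`, `halt`) and Python closures standing in for the short host instruction sequences. Nothing here is taken from the real simulator; `translate`, `run`, `LINE_WORDS` and `PROGRAM` are hypothetical names.

```python
LINE_WORDS = 2  # guest instructions per I-cache line (illustrative value)
HALT = -1       # sentinel a host routine returns to stop the simulation


def translate(instr):
    """Build the host routine for one guest instruction.

    A routine returns None to fall through, a target pc for a taken
    branch, or HALT.
    """
    op = instr[0]
    if op == "addi":
        _, reg, imm = instr
        def host(regs):
            regs[reg] += imm
            return None
    elif op == "bnez":
        _, reg, target = instr
        def host(regs):
            return target if regs[reg] != 0 else None
    elif op == "halt":
        def host(regs):
            return HALT
    else:
        raise ValueError(f"unknown guest opcode: {op}")
    return host


def run(program, regs):
    shadow = {}  # line number -> host routines for that line
    pc = 0
    while True:
        line = pc // LINE_WORDS
        if line not in shadow:
            # "Fetch" the line and fill its shadow exactly once.
            base = line * LINE_WORDS
            shadow[line] = [translate(i) for i in program[base:base + LINE_WORDS]]
            if not shadow[line]:
                raise IndexError("pc ran past the end of the program")
        next_pc = (line + 1) * LINE_WORDS  # default: fell off the end of the line
        # Enter the shadow at the slot for pc and fall through.
        for host in shadow[line][pc % LINE_WORDS:]:
            result = host(regs)
            if result == HALT:
                return regs, len(shadow)
            if result is not None:
                next_pc = result           # taken branch leaves the line
                break
        pc = next_pc


# r0 += 5 on each pass while r1 counts down to zero.
PROGRAM = [
    ("addi", 0, 5),
    ("addi", 1, -1),
    ("bnez", 1, 0),
    ("halt",),
]
```

Calling `run(PROGRAM, [0, 3])` executes the loop three times but translates each of the two lines only once; the per-instruction decode cost is paid at line fill, not on every execution.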
I found a reference to the simulator trick in CHM's "IBM 801 Microprocessor Oral History", but I would love to see more detailed design documentation on the CPU and simulation. The documents available on bitsavers unfortunately are not that detailed.
Neither trend was very intentional on the part of its designers (e.g., why would anyone intentionally make their processor “complex”?); both were instead consequences of the technology available at the time and were simply the most reasonable way to go about things.
So-called CISC designs emerged from a time when assembly programming was widespread; RISC came about once compilers were in widespread use. Reasonable engineers working under the constraints of each era were focused on producing highly efficient designs.
Which is why the term “CISC” rubs me the wrong way. I almost have to wonder to what extent this naming was subversive product marketing leaking into academia.
Of course, today we have the hindsight to see that the constraints of contemporary processor design require a significant number of design elements from both RISC and CISC methodologies.
Also, even though PowerPC is gone, IBM is still very much relevant in high-end RISC development -- they still make POWER computers.