> IMHO these days ISA implications on performance and efficiency are being overstated.
Noooo. Beyond simply copying instructions 1-to-1, the process is way too involved, and it imposes 40-year-old assumptions about the memory model and many other things, which greatly limits the ways you can interact with the CPU, adds to transistor count, and makes writing efficient compilers really hard.
Interesting point. So on the one hand we have all these layers in the CPU to abstract away things in the ISA that are not ideal for block-level implementation... but on the other hand compilers are still targeting that high-level ISA... and ironically they also have their own, more general abstraction: the intermediate representation.
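To make the two software-visible layers concrete (just a minimal sketch, not from anyone's toolchain docs in particular): with clang you can dump both views of the same trivial function. `clang -S -emit-llvm add.c` emits the LLVM IR (the compiler's abstraction), `clang -S add.c` emits x86-64 assembly (the CPU's public ISA), and the micro-ops the core actually executes form a third layer that is never exposed at all.

    /* add.c -- small enough to trace through the layers.
       `clang -S -emit-llvm add.c` prints the compiler's IR,
       `clang -S add.c` prints the x86-64 assembly (the public ISA);
       the micro-ops the core actually runs sit hidden below both. */
    int add(int a, int b) {
        return a + b;
    }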
I'm probably not the first or last to suggest this, but... it seems awfully tempting to say: why can't we throw away the concept of maintaining binary compatibility and target some level of "internal" ISA directly (if Intel/AMD could provide such an interface in parallel to the high-level ISA)... with the accepted cost of knowing that this ISA will change, in ways that aren't necessarily forward compatible, between CPU revisions.
From the user's perspective, we'd either end up with more complex binary distribution, or with needing to compile for your own CPU, FOSS-style, when you want to escape the performance limitations of x86.
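The closest thing today's toolchains offer to the "compile for your own CPU" route (an illustration only, not the internal-ISA idea above) is building with GCC or Clang's -march=native: the compiler enables every extension the build machine reports, so the binary is tuned for that CPU but may not run on older ones, and it still targets the public x86 ISA rather than anything internal.

    /* saxpy.c -- build with e.g. `gcc -O3 -march=native saxpy.c -o saxpy`.
       With -march=native the compiler may auto-vectorize this loop using
       whatever SIMD width the build machine supports (SSE/AVX/AVX-512),
       trading portability for per-CPU tuning -- still the public x86 ISA,
       just the host's flavour of it. */
    #include <stdio.h>

    static void saxpy(float a, const float *x, float *y, int n) {
        for (int i = 0; i < n; i++)
            y[i] = a * x[i] + y[i];
    }

    int main(void) {
        float x[8] = {1, 2, 3, 4, 5, 6, 7, 8}, y[8] = {0};
        saxpy(2.0f, x, y, 8);
        printf("y[7] = %f\n", y[7]); /* expect 16.0 */
        return 0;
    }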