
This architecture tastes like microarchitecture [pdf] - fanf2
http://wp3workshop.website/pdfs/WP3_dunham.pdf
======
jcranmer
If you look at the context of the paper, it makes a little more sense. This is
a one-day workshop that's effectively a retrospective look at computer
architecture over the past 50 years and asking "what's next?" So the real
point of the paper is the somewhat meandering discussion of "why exposing
microarchitecture is bad" that's the first part of the paper, with the latter
part being a suggestion of how to take that advice to the future. That said,
it's (IMHO) the worst of the bunch.

The point the paper is trying to make is that exposing microarchitecture in
the ISA is a decision that ended badly. It then claims that some of the
bulwarks of modern ISAs still act too much like exposing microarchitectural
details, namely the presentation of a finite register set. It doesn't bode
well when the references backing "this is totally a fruitful idea to go down"
are all papers from the 1990s.

------
deepnotderp
Memory is the bottleneck... so let's use memory ever more intensely.

-_-

Seriously, hardware-accelerated context switching is a much more viable
option than saying "screw registers, we'll make memory do _absolutely
everything_". To make matters worse, in most languages you can't tell where the
pointers are going before you AG them, which results in pointer chasing pain.

Stateless ISAs are interesting, but I seriously doubt the way forward is to
make memory do everything for us.

Also, the biggest cost of context switching is not in the register file
migration, but in the flushing of the TLBs. Look to SASOS+VIVT to address
that, not whatever this is proposing.

~~~
redshirt
One point the authors make is that registers are memory. I’m not sure they’re
saying screw registers, just that they’re counter to portability and in
general only a yet faster cache for programs to use. The other point they seem
to make is that a secondary reason we have architected registers is that
renaming over an arbitrarily large space was (and in some ways is still)
difficult. By using graph coloring and reg alloc in the compiler, we narrow the
encoding space before execution and give the hardware a much simpler problem.
Recent research results on registerless archs show that there are quite a few
benefits. There’s a lie in calling it registerless though: the register
storage is still there, just transparent and hence portable.
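
For anyone who hasn't seen the graph coloring step, here is a rough sketch of
the idea, not what any real compiler does in detail. It assumes a hypothetical
interference graph (values that are live at the same time share an edge) and
uses a simple greedy heuristic; the function name, the value names, and the
spill-as-`None` convention are all made up for illustration.

```python
def color_interference_graph(interference, num_regs):
    """Greedily map each value to a register index, or None if it must spill."""
    assignment = {}
    # Visit the most-constrained values first (highest degree), ties by name.
    for node in sorted(interference, key=lambda n: (-len(interference[n]), n)):
        # Registers already taken by neighbours that have been assigned one.
        taken = {
            assignment[nb]
            for nb in interference[node]
            if assignment.get(nb) is not None
        }
        free = [r for r in range(num_regs) if r not in taken]
        assignment[node] = free[0] if free else None  # None means "spill"
    return assignment


# Hypothetical example: a, b, c are live simultaneously; d overlaps only with c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}

alloc3 = color_interference_graph(graph, num_regs=3)  # nothing spills
alloc2 = color_interference_graph(graph, num_regs=2)  # "b" is spilled
```

The point of the sketch is only that an unbounded set of values gets squeezed
into a small, fixed name space before the hardware ever sees the program.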

------
catern
Raising the abstraction level of the ISA and making more logic part of
proprietary CPU microcode is exactly the wrong direction to go in for
security. I'm not sure how the authors can possibly claim this will be
beneficial for security. Thankfully, recent events (Meltdown and Spectre) have
made the flaws in their philosophy abundantly clear.

>When considering the period of rapid evolution that microarchitecture is
about to face with the end of lithography scaling, abstractions that are free
from underlying microarchitectural influence are critical to minimizing future
disruption.

Ironically completely accurate! If chipmakers can successfully sell us on
super-high-level ISAs (moving the scheduler and hypervisor inside their
proprietary chips, as this article seems to suggest), they will be able to
lock us in and easily prevent any "disruption" of their business model.

The greatest advantage for innovation in ISAs in the past 20 years has been
the development of Linux and GCC, which allow any new chip to get a huge
amount of working software with relatively little porting effort. Moving more
logic out of this open source software portability layer and into the
proprietary chip will just make it harder to build new chips from scratch.

~~~
skybrian
I don't think that's what they're suggesting. This is more about replacing
most register references within opcodes with stack pointer offsets, for faster
context switching. It only replaces the register-spilling part of a context
switch, along with the register-allocation code in a compiler.

It sounds a bit like SPARC, but more flexible? And not particularly difficult
to port to.
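
A toy model of how I read it, very much a sketch under my own assumptions and
not the paper's actual design: operands are offsets from a per-thread stack
pointer, so a context switch only has to swap that one pointer. The `Machine`
and `Thread` classes, the two instructions, and the frame layout are all
hypothetical.

```python
class Thread:
    def __init__(self, base):
        # The only per-thread state this toy machine has to save or restore.
        self.sp = base


class Machine:
    def __init__(self, size):
        self.mem = [0] * size
        self.sp = 0

    def store_imm(self, dst, value):
        # dst is an offset from sp, not a register number.
        self.mem[self.sp + dst] = value

    def add(self, dst, a, b):
        # All three operands are sp-relative slots.
        self.mem[self.sp + dst] = self.mem[self.sp + a] + self.mem[self.sp + b]

    def switch_to(self, current, nxt):
        # No register spilling: save one pointer, load another.
        current.sp = self.sp
        self.sp = nxt.sp


m = Machine(size=32)
t0, t1 = Thread(base=0), Thread(base=16)

m.sp = t0.sp
m.store_imm(0, 2)
m.store_imm(1, 3)
m.add(2, 0, 1)          # t0's slot 2 now holds 5

m.switch_to(t0, t1)
m.store_imm(0, 10)
m.store_imm(1, 20)
m.add(2, 0, 1)          # t1's slot 2 now holds 30

m.switch_to(t1, t0)
t0_result = m.mem[m.sp + 2]   # t0's value is untouched by t1's work
```

Whether real hardware can make those sp-relative slots as fast as registers is
exactly the open question; the sketch only shows why the switch itself gets
cheaper.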

~~~
jcranmer
> It only replaces the register-spilling part of a context switch, along with
> the register-allocation code in a compiler.

The compiler would still need to do stack slot coloring. And, as I noted
elsewhere, they also seem to suggest dropping memory coherency, which makes
the stack here just a register file with single instructions that load and
store the registers to memory.

------
Nokinside
Welcome to the club.

* The Myth of Sufficiently Smart Compiler (SSC)

* The Myth of Sufficiently Smart Virtual Machine (SSVM)

* The Myth of Sufficiently Smart Instruction Set Architecture (SSISA)

All these ideas have the same goal: how to preserve and use high-level
information when transforming code to the lower level in a way that gives
maximum performance.

In a hypothetical dreamland where all these exist, compilers, hypervisors,
virtual machine monitors, microkernels, operating systems, ISAs and microcode
would form a "sufficiently smart" stack that provides a performance
increase.

[http://wiki.c2.com/?SufficientlySmartCompiler](http://wiki.c2.com/?SufficientlySmartCompiler)

[http://wiki.c2.com/?SufficientlySmartVirtualMachine](http://wiki.c2.com/?SufficientlySmartVirtualMachine)

------
phkahler
The author basically says "we've been doing it wrong" and doesn't provide an
alternative.

~~~
redshirt
Meh, for a retrospective paper, I think the authors probably provided more
hints as to the future than they really needed to. I feel like they captured
the exact reasons why ISA is so difficult to do right. As for “doing it
wrong”, I didn’t take it that way. I think the authors lament the influence of
what they seem to feel is a mantra that dictates the programmer must deal
directly with the hardware versus providing a sufficiently abstract ISA.
Building circuits for ML accelerators, etc. is actually damned easy; exposing
those to the programmer in a portable way that does not require rearchitecting
the program every time you change the accelerator is tough. I literally loathe
porting for Intel b/c the AVX insns behave differently and are essentially
architecting in the microarch, passing the complexity directly to the
programmer. I’d much rather have the RISC-V solution or the ARM solution.

------
FeepingCreature
Tl;dr, as far as I can tell: Context switch performance can be improved by
abolishing registers. Also, here's some indications that this won't
_necessarily_ completely annihilate processor performance.

~~~
woadwarrior01
Which sounds a bit like implementing the LLVM IR as an ISA. You don't have
registers, but instead you get an infinite number of temporary values.

