You are quite right that the document that 'specifies' RISC-V remains a key weak...

gchadwick · 2024-04-28T05:35:31

> No document that I could find has ever tried to specify an instruction set independent of an actual implementation.

What do you mean by this? I'd say most ISA specifications do this (e.g. the Arm spec doesn't refer to Arm's CPU implementations and has well defined ways to discuss things that can be implementation dependent).

acuster · 2024-04-28T06:52:38

Sorry, it's three in the (Sunday) morning, and I've been hitting the whisky trying to handle the establishment journalists having fun, while other journalists are talking about humans struggling to get water while themselves being asked if they will survive the night. I'm not at my best.

You're right to call me on my statement; I should have all my notes on hand to make that claim and I don't. Paah, no, I do: ARMv7-M Architecture Reference Manual ... Part A Application Level Architecture ... ...processor in Thread mode (vs. in Handler mode).

So ARM already has a lot of detail whereas the RISC-V architecture is trying to (has to?) start even more abstract, where code doesn't even have modes (no interrupts).

This all started a pandemic Saturday morning, cup of coffee in hand, enthusiasm to read the "RISC-V Spec" and see what I could learn. Download. Confusion: it says "manual," did I get the right thing? ... Ok, yeah, that's what's on offer. Half an hour later, I'm actually pissed off, like actively angry. I'm reading this from the point of view of "what's the execution environment that I'll be working against?" and I'm getting hit with "unprivileged" which is just wrong. It turns out they are mixing up "the environment of general purpose programmers" with "the minimum that needs to be implemented". It's a royal mess, and they kinda give up on it in the middle. I'm angry about being asked to read this as "the product"; it's not even properly proofread. So I took my frustration and tried to figure out 'what would you do to make this better?'

The 'RISC-V' spec is trying to specify: [instructions], and what they do to the [architecture]. I don't know much about the details, but I have a notion that there was pushback on writing this up as a 'state machine' and how each instruction might change that state. I assume Prof. Asanović had his own good reason to avoid framing things that way but he's yet to give us a good explanation of why. So probably he's right, I just don't know why.
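
For readers unfamiliar with the 'state machine' framing, here is a minimal sketch of what it could look like for a single instruction. It is not taken from the RISC-V spec or from any official model: `State` and `step_add` are hypothetical names, and the only RV32I facts assumed are 32 integer registers, x0 always reading as zero, and ADD wrapping modulo 2^32.

```python
from dataclasses import dataclass, field

MASK32 = 0xFFFFFFFF  # RV32I integer registers are 32 bits wide


@dataclass
class State:
    """Hypothetical architectural state: a program counter and 32 integer registers."""
    pc: int = 0
    regs: list = field(default_factory=lambda: [0] * 32)


def step_add(state: State, rd: int, rs1: int, rs2: int) -> State:
    """Transition for ADD: build the next state without mutating the current one."""
    regs = list(state.regs)
    if rd != 0:  # writes to x0 are discarded, so x0 keeps reading as zero
        regs[rd] = (state.regs[rs1] + state.regs[rs2]) & MASK32
    return State(pc=(state.pc + 4) & MASK32, regs=regs)


s0 = State()
s0.regs[1], s0.regs[2] = 7, 0xFFFFFFFF
s1 = step_add(s0, rd=3, rs1=1, rs2=2)
print(s1.pc, s1.regs[3])
```

Each instruction becomes a function from one state to the next, which is roughly the style a formal or executable specification would take.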

So how could this be done?

I went to look at the history. The original x86 spec was tied to the chip they were trying to sell. PowerPC, MIPS, if I remember right, were not 'specified' in a clean way; none of them had the same challenge as RISC-V does, starting in pure execution environment mode. I went to read the infamous von Neumann writeup and got side-tracked by his virtual neurons but didn't find the right level of abstraction there either.

So, I'm sorry I can't really justify myself here, but this is all subtle and hard. From what I have found, I don't think anyone has faced the challenge that RISC-V faces, so I don't think we have a roadmap for the spec that RISC-V ought to have.

cheers

thechao · 2024-04-28T13:50:27

I really think something like Ghidra/SLEIGH and a formal specification of the p-code could help; but, only if the following things happened:

1. A p-code parser front end in C existed;

2. An alternate XML/JSON version of SLEIGH; and,

3. A way to integrate the above to document (book) generation.

For the latter I'd prefer HTML. I've found the SLEIGH spec, itself, heavy enough going that I can't tell if it supports full constraint specifications, or not.

IAmLiterallyAB · 2024-04-28T18:56:54

> full constraint specifications

Is that a technical term? (if so, can you explain further)

I've made SLEIGH specs for two architectures. In my experience, it can describe 95% of the semantics well enough for decompilation (it gets weird when your ISA has quirks). It's not as comprehensive as SAIL appears to be.

Also, SLEIGH compiles to an XML format, which is what Ghidra actually uses.

thechao · 2024-04-28T23:55:28

CPUs are fairly orthogonal in terms of capabilities; if the instruction can encode it, the CPU can interpret it. Coprocessors (GPUs, NPUs, etc.) have ISAs where the legal encoding space is much larger than the legal instruction space: the set of valid instructions is not dense in its own encoding space. This smaller legal space is defined by a set of constraints on the set of legal encodings.
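
As a toy illustration of that idea (an invented 8-bit encoding, not any real GPU or NPU ISA), the legal instruction set can be written as a list of predicates over the decoded fields. The field layout and all three constraints below are assumptions made up for the sketch.

```python
def fields(word: int) -> tuple:
    """Split a hypothetical 8-bit word into (op, dst, src): 2 + 3 + 3 bits."""
    return (word >> 6) & 0b11, (word >> 3) & 0b111, word & 0b111


# Each constraint returns True when the encoding is acceptable.
CONSTRAINTS = [
    lambda op, dst, src: op != 3,                  # opcode 3 is reserved
    lambda op, dst, src: op != 2 or dst % 2 == 0,  # op 2 requires an even dst register
    lambda op, dst, src: op != 1 or dst != src,    # op 1 forbids dst == src
]


def is_legal(word: int) -> bool:
    """An encoding is a legal instruction only if every constraint holds."""
    return all(check(*fields(word)) for check in CONSTRAINTS)


legal = [w for w in range(256) if is_legal(w)]
print(f"{len(legal)} of 256 encodings are legal instructions")
```

Here the encoding space has 256 points but only a subset passes all constraints. A specification language that supports "full constraint specifications" would need some way to express predicates like these alongside the bit layout.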

acuster · 2024-04-28T15:12:57

(A better, more considered, response than I gave yesterday.)

Alastair Reid's article "How to improve the RISC-V specification" makes some great points: namely that the RISC-V specification needs improvement, (implicitly) that this work is worth doing, and that testing is integral to specification. These are all great points.

However, a new specification effort for RISC-V requires a much greater effort.

The core document certainly needs to be rewritten with better structure, a more formal presentation of each 'instruction,' better clarity on what it is saying, and fixes to numerous errata. The harder 'fix' requires resolving the tension between its two readerships: the implementors who must process instructions according to the requirements of the spec versus the coders who need to know what they can expect from the environment in which their instructions are processed. (The 'unprivileged' element in the subtitle of the spec is the code being executed.) Problematically, this tension might not be resolvable: since the core instruction set has no side effects and no requirements can be placed on how implementations are made, the spec inherently has no way to express knowledge of whether the code has been processed correctly! Fun.

A new specification effort has a bigger task than fixing, as best as possible, the current document. Alastair Reid's article mentions needing to integrate testing more centrally into the specification. There is also the need to think of this core spec as the foundational document on which to build the whole suite of RISC-V specification documents. A new spec also needs to cater to the even wider readership, beyond 'implementors' and 'coders,' that accompanies projects with widespread success. For example, the spec ought to serve companies or governments in their procurement contracts, so they can express what is meant by deliverables which are conformant with the specification. Ideally, this wider readership under consideration would also include 'students,' that is, smart people who have less a priori knowledge of the domain than that which the original 'manual' was able to assume.

This is all a lot of work, which would be of great benefit to the community but which no one in the community has any reason to, or really could, take on. Ideally, the RISC-V Foundation would scope out the work, take a position on what parts of the effort were worth pushing for, and then make that happen.

adreid · 2024-04-30T18:20:22

Fixing the whole problem is a large task, but there is a small sequence of steps that could be taken that would improve things a lot.

Specify the key formats in a JSON/XML file, then go one by one through tools, docs, etc., changing them to use the machine-readable file.
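
A minimal sketch of what that first step might look like, assuming a made-up JSON schema in which each format maps field names to `[high_bit, low_bit]` pairs. The schema and the `decode`/`encode` helpers are hypothetical; only the R-type bit positions come from the standard RV32I base encoding.

```python
import json

# Hypothetical machine-readable description of one instruction format.
FORMATS_JSON = """
{
  "R": {
    "opcode": [6, 0],
    "rd":     [11, 7],
    "funct3": [14, 12],
    "rs1":    [19, 15],
    "rs2":    [24, 20],
    "funct7": [31, 25]
  }
}
"""

FORMATS = json.loads(FORMATS_JSON)


def decode(word: int, fmt: str) -> dict:
    """Extract every field of `fmt` from a 32-bit instruction word."""
    out = {}
    for name, (hi, lo) in FORMATS[fmt].items():
        width = hi - lo + 1
        out[name] = (word >> lo) & ((1 << width) - 1)
    return out


def encode(values: dict, fmt: str) -> int:
    """Pack field values back into a 32-bit word using the same description."""
    word = 0
    for name, (hi, lo) in FORMATS[fmt].items():
        width = hi - lo + 1
        word |= (values[name] & ((1 << width) - 1)) << lo
    return word


ADD_X3_X1_X2 = 0x002081B3  # add x3, x1, x2
decoded = decode(ADD_X3_X1_X2, "R")
print(decoded)
```

Because the decoder and encoder both read the same description, a documentation generator or assembler could be driven from that file too, which is the point of converting tools one at a time.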

gumby · 2024-04-28T18:49:56

> No document that I could find has ever tried to specify an instruction set independent of an actual implementation.

The most famous example is MIX.

weebull · 2024-04-28T22:04:16

...because those are normally proprietary documents.

ARM, for example, put a tonne of effort into decoupling their ISA specification from their implementations, and guard this document.

You're seeing how the sausage is made.

pstoll · 2024-04-28T15:10:21

> The work would be a multi-person-year effort, requiring concomitant funding.

> It is not clear to me how this work might begin.

To me that screams for - develop interest in a US public sector (aka government) funding agency and start getting grants to do this type of work.

I assume RISC-V can reasonably be pitched to government tech & policy folks as having many directly applicable benefits to their set of problems in a way they may want to invest in. To procure RISC-V systems, they’d probably want a bunch of verification done. If the community isn’t investing, maybe they would.

Caveat: I’m not current on the state of RISC-V adoption in public sector systems. But I’ve done funded systems development work at startups with DARPA/ARPA, InQTel, etc. A motivated party should be able to make this happen.

Does anyone closer to RISC-V know what the state of adoption is in the public sector?

gumby · 2024-04-28T19:06:49

> To me that screams for - develop interest in a US public sector

I disagree 100%.

(note I'm a big fan of the RISC V effort)

First, the US government can/should seed pieces of fundamental technology that the private sector may later use, but in general you really don't want the government funding direct competitors to competitive technology already in the private sector (boosting RISC V when ARM, x86 etc are widely available). On a practical level, even if you do think the USG should do such a thing it won't happen.

Look at the members of RISC V International -- they are mostly huge multinationals who could both afford and benefit from such work. In fact the author of the post works for Intel. They are the people who should be paying for it (the smaller members, like SiFive, are mostly bleeding money).

There is one hack that could work, and might even be the most feasible: get DARPA to fund the pedagogical document acuster described. That to me is similar to previous successful research-supporting projects DARPA has funded in the past, such as SPICE. But that's a special case.

And anyway, why USG? Where are China and Europe in this? The EU funding this kind of pedagogical document would be a good step towards getting its chip mojo back.

Pet_Ant · 2024-04-28T15:26:05

This sounds like the perfect case for crowdfunding. The community would likely support it. I bet some industry partners would support it. If you had a reasonably capable team, throw it up there. $1M means 3 people at $330k; that seems feasible.