Hacker News

> They are going to split the RISC V ecosystem.

Part of the DNA of RISC-V is to provide a basis from which a million flowers can bloom. Instead of homebrewing, you can reuse RISC-V and make tweaks as you need... if you really do need a custom variant. Think of it as English, a common substratum from which there are lots of mostly interoperable dialects. This is what happened with Unix.

Now if we want mainstream consumer adoption, we will need a dominant strain or two. The RISC-V foundation is doing that with the RVA23 profiles etc., and if there are a few major ones, that should be navigable. Linux had support for PC-98 [1], which was a Japanese alternative x86 platform.

The changes I've seen proposed by Qualcomm don't seem drastically different [2], and could be incorporated in the same binaries by sniffing the supported features. The semantics are what matter, and they aren't different there at all. They could even be supported with trap-and-emulate.

[1] https://en.wikipedia.org/wiki/PC-98 [2] https://lists.riscv.org/g/tech-profiles/attachment/332/0/cod...




The code-size-reduction instructions are an extension that will go through and will eventually be supported by everyone; they are not the bone of contention here. They are designed to be "brown-field" instructions, that is, they fit into the unimplemented holes in the current spec.

The reason the spec is going to split is not them, but the fact that Qualcomm also wants to remove the C extension, and put something else in the encoding space it frees.


Hmm, that seems like a mistake, because C allows instruction compression at low decode cost, which is perfect for embedded use, a big part of RISC-V usage now.

That said, if they implemented C and then made their replacement toggleable with a CSR, that would still be backwards (albeit not forwards) compatible. So it would only be an issue if Qualcomm RISC-V binaries became dominant, but I don't think binaries are going to be that dominant outside of firmware going forward, and any that are will come from vendors that multi-target.


>> Hmm, that seems like a mistake because C allows for instruction compression with low cost to decode that is perfect for embedded use which is a big part of the RISC-V usage now.

It may be low cost to decode a compressed instruction, but having them means regular 32-bit instructions can cross cache lines and page boundaries.
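To make that hazard concrete, here is a minimal Python sketch, assuming a typical 64-byte cache line (the line size is an assumption for illustration): with C enabled, instructions may start at any 2-byte boundary, so a 4-byte instruction can straddle a line.

```python
# With the C extension, instructions may start at any 2-byte boundary,
# so a 4-byte instruction can cross a cache-line (or page) boundary.
CACHE_LINE = 64  # bytes; a typical line size, assumed for illustration

def straddles_line(pc: int, length: int = 4) -> bool:
    """True if an instruction at address `pc` crosses a cache-line boundary."""
    return pc // CACHE_LINE != (pc + length - 1) // CACHE_LINE

# With strict 4-byte alignment (no C), a 4-byte fetch never splits a line:
assert not any(straddles_line(pc) for pc in range(0, 4096, 4))
# With 2-byte alignment, some do, e.g. a 4-byte instruction starting at 62:
assert straddles_line(62)
```

The same check with a 4096-byte "line" models the page-boundary case.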

My own thought is that there should be a "next" version, a RISC-VI, that is mostly assembler-level compatible but changes all the instruction encodings to be more sane. What that means exactly is still a bit fuzzy, but I am a fan of immediate data being stored after the opcode.


> My own thought is that there should be a "next" version or RISC-VI that is mostly assembler-level compatible but changes all the instruction encodings to be more sane.

I feel like that is really a case of Chesterton's fence. It was done by people who literally wrote the book on processor design (David Patterson, author of "Computer Architecture: A Quantitative Approach", "The Case for RISC", and "A Case for RAID"). I have heard a talk explaining the rationale for where the bits are placed, which is to simplify low-end implementations.

> What that means exactly is still a bit fuzzy, but I am a fan of immediate data being stored after the opcode.

As a hobbyist, I get it... but except for when you are reading binary dumps directly, which happens so rarely these days, when is that ever relevant? That is just OCD. I think of this video when I get the same itch and temptation. https://www.youtube.com/watch?v=GPcIrNnsruc

Also, let's not forget that RISC-V is already a thing with millions of embedded units already shipped.


>> I feel like that is really a case of Chesterton's fence. It was done by people who literally wrote the book on processor design

It was originally intended for use in education where students could design their own processors and fixed instruction sizes made that easier. I'm not saying "therefore it's suboptimal", just that there were objectives that might conflict with an optimal design.

>> > What that means exactly is still a bit fuzzy, but I am a fan of immediate data being stored after the opcode.

>> As a hobbyist, I get it... but except for when you are reading binary dumps directly, which happens so rarely these days, when is that ever relevant?

How about in a linker, where addresses have to be filled in by automated tools? Sure, once the weirdness is dealt with in code it's "done", but it's still an unnecessarily complex operation. Also, IIRC there is no way to encode a 64-bit constant; it has to be read from memory.

Maybe I'm wrong, maybe it's a near optimal instruction encoding. I'd like to see some people try. Oh, and Qualcomm seems to disagree with it but for reasons that may not be as important as they think (I'm not qualified to say).


> Also, IIRC there is no way to encode a 64-bit constant; it has to be read from memory.

There never is; you can never set a constant as wide as the word length in one instruction. Instead you must "build" it. You can either load the high bits as low, shift the value, and then add the low bits, or, as SPARC has with 'sethi', use an instruction that combines the steps for you.

https://en.wikibooks.org/wiki/SPARC_Assembly/Control_Flow#Ju...

https://en.wikipedia.org/wiki/SPARC#Large_constants
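As a sketch of the "build it in pieces" idea, here is the hi/lo split the way RISC-V's lui/addi pair does it for 32-bit constants (the +0x800 fixup compensates for addi sign-extending its 12-bit immediate). A full 64-bit constant needs an even longer shift-and-add sequence, or a load from memory, as noted above.

```python
# Building a 32-bit constant from two instruction immediates, RISC-V style:
# lui sets bits 31..12, addi then adds a sign-extended 12-bit immediate
# (SPARC's sethi/or split works along the same lines).

def sext12(v: int) -> int:
    """Sign-extend a 12-bit value to a Python int."""
    return v - 0x1000 if v & 0x800 else v

def hi_lo(x: int):
    """Split a 32-bit constant into lui/addi immediates.
    The +0x800 compensates for addi's sign extension of the low part."""
    lo = sext12(x & 0xFFF)
    hi = ((x + 0x800) >> 12) & 0xFFFFF
    return hi, lo

def materialize(hi: int, lo: int) -> int:
    """What the `lui hi; addi lo` pair computes, truncated to 32 bits."""
    return ((hi << 12) + lo) & 0xFFFFFFFF

# Round-trips for easy and awkward cases (0x800 triggers the fixup):
for x in (0, 1, 0x7FF, 0x800, 0x12345678, 0xDEADBEEF, 0xFFFFFFFF):
    assert materialize(*hi_lo(x)) == x
```

This is also roughly what a linker does when it patches HI20/LO12 relocation pairs, which is the "unnecessarily complex operation" mentioned upthread.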


> millions of embedded units already shipped

10+ billion. With billions added every year.

THead says they've shipped several billion C906 and C910 cores, and those are 64-bit Linux application cores, almost all of them with draft 0.7.1 of the Vector extension. The number of 32-bit microcontroller cores will be far higher (as it is with Arm).


Yes, I was curious why the compression format didn't require:

1. Non-compressed instructions are always 4-byte aligned (pad with a 2-byte NOP if necessary, or use an uncompressed 4-byte instruction to fix the sizing)

2. Jump targets are always 4-byte aligned (a requirement that exists without C, but that C relaxes)

This avoids cache-line issues and avoids jumps landing inside an instruction. Each pair of compressed instructions can then be treated as a single 4-byte instruction.

It's a bit redundant to encode the C prefix twice, so there's room to make use of that (at least take up less encoding space by making the prefix twice as long), but that's not important.
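A sketch of rule 1 above in Python, using c.nop (RISC-V encoding 0x0001) as the 2-byte filler:

```python
# Rule 1: pad the instruction stream so the next uncompressed (4-byte)
# instruction starts on a 4-byte boundary, using c.nop as the filler.
C_NOP = bytes([0x01, 0x00])  # c.nop, encoding 0x0001, little-endian

def align_for_wide(stream: bytearray) -> None:
    """Append a c.nop if a 4-byte instruction would start misaligned."""
    if len(stream) % 4 != 0:
        stream.extend(C_NOP)

buf = bytearray([0x01, 0x45])  # one 2-byte instruction (c.li a0, 0)
align_for_wide(buf)
assert len(buf) % 4 == 0       # now safe to emit a 4-byte instruction
```

An assembler could equally satisfy the rule by choosing the uncompressed form of the previous instruction instead of padding.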


I completely agree. Not that everything has to be relaxed, but at least the things that made it impossible to decode RISC-V cleanly when C is enabled should be. The amount of code needed to detect when and how instructions are laid out is much larger than it should be.
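For context, here is a sketch of just the length-determination step from the base spec: the low bits of the first 16-bit parcel select the instruction length (48-bit and longer reserved formats are left out of this simplified version).

```python
# RISC-V instruction-length determination from the first 16-bit parcel:
# low two bits != 0b11 means a 16-bit compressed instruction; 0b11 with
# bits [4:2] != 0b111 means a standard 32-bit instruction. Longer
# (48/64-bit) reserved formats are omitted from this sketch.

def insn_length(parcel: int) -> int:
    """Byte length of the instruction whose first 16-bit parcel is given."""
    if parcel & 0b11 != 0b11:
        return 2  # compressed (C) instruction
    if (parcel >> 2) & 0b111 != 0b111:
        return 4  # standard 32-bit instruction
    raise NotImplementedError("48-bit and longer encodings not handled")

assert insn_length(0x4501) == 2   # c.li a0, 0
assert insn_length(0x0513) == 4   # first parcel of addi a0, x0, 0
```

The rule itself is small; the cost being complained about is in the fetch/decode hardware that has to handle variable lengths and misaligned 32-bit instructions, not in this check.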


"impossible"?

It's a little easier than ARMv7, and orders of magnitude easier than x86, which doesn't seem to be preventing high-performance x86 CPUs (admittedly at an energy-use and surely a silicon-area penalty).

Everyone else in the RISC-V community except Qualcomm thinks the "C" extension is a good trade-off, and the reason Qualcomm doesn't is very likely that they're adapting Nuvia's Arm core to run RISC-V code instead, and of course that core was designed for 4-byte instructions only.


That is a trade-off towards density that seems not worth it, when all it would take is a 16-bit NOP for padding and a few more bytes of memory to save transistors in the implementation.

Maybe they did the actual math and figured it's still cheaper? Might be worth it.


SiFive slides: https://s3-us-west-1.amazonaws.com/groupsioattachments/10389...

Their argument is that since there will eventually be 8-byte instructions, those will have the same cache-line issues (though that could be addressed by requiring 8-byte instructions to be 8-byte aligned).


Check your link? It isn't working for me.



C is good for high-performance instruction sets too. Funny how no company that starts with a green-field RISC-V design ever mentions it as a problem. And yet the one company that wants to leverage its ARM investment thinks it's a huge problem that will literally break the currently established standard.



