
Yeah, RISC-V has the `Zicond` extension, but it's not a "proper" conditional move in the traditional sense. For the usual situations where the compiler would use a single conditional move on any other ISA it now needs multiple instructions to get around the fact that `Zicond` will set the destination register to zero if the conditional move isn't made. This totally sucks for performance if you don't have a "sufficiently advanced magic core" which can macro-fuse `Zicond` instruction sequences.
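To make the "multiple instructions" point concrete, here is a rough C model of the two `Zicond` instructions and of the usual three-instruction sequence for a general select. The helper names (`czero_eqz`, `czero_nez`, `select_zicond`) are made up for illustration; this is a sketch of the semantics, not what any particular compiler emits.

```c
#include <stdint.h>

/* Model of `czero.eqz rd, rs1, rs2`: rd = (rs2 == 0) ? 0 : rs1 */
static inline uint64_t czero_eqz(uint64_t rs1, uint64_t rs2) {
    return rs2 == 0 ? 0 : rs1;
}

/* Model of `czero.nez rd, rs1, rs2`: rd = (rs2 != 0) ? 0 : rs1 */
static inline uint64_t czero_nez(uint64_t rs1, uint64_t rs2) {
    return rs2 != 0 ? 0 : rs1;
}

/* General select `cond ? a : b`, which a traditional cmov does in one
 * instruction. With Zicond it takes three:
 *   czero.eqz t0, a, cond    // t0 = cond ? a : 0
 *   czero.nez t1, b, cond    // t1 = cond ? 0 : b
 *   or        rd, t0, t1     // exactly one of t0/t1 is non-zeroed
 */
static inline uint64_t select_zicond(uint64_t cond, uint64_t a, uint64_t b) {
    uint64_t t0 = czero_eqz(a, cond);
    uint64_t t1 = czero_nez(b, cond);
    return t0 | t1;
}
```

A core that fuses this sequence can treat it as one select; a core that doesn't pays for all three operations.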

That said, RISC-V does have a proper conditional move instruction. And the funny part: it has multiple! `xtheadcondmov` and `xmipscmove` both implement "real" conditional moves. The catch is that those are vendor-specific extensions. Despite the official narrative that "it doesn't fit the design of RISC-V", the actual hardware vendors apparently see the value of adding real cmovs to their hardware. I wonder how many more vendor-specific extensions it will take before a common cross-vendor extension is standardized, if ever?

(And yes, I'm perfectly aware of why `Zicond` was designed the way it was. I don't really want to get into a discussion of whether that's the right design long-term.)



> For the usual situations where the compiler would use a single conditional move on any other ISA it now needs multiple instructions

Only if you need the full properties of cmove. In many cases the compiler just generates a single Zicond instruction.
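As a sketch of the kind of case meant here (helper names are hypothetical, and whether a given compiler actually emits these sequences is an assumption): when the "else" value is zero, or when the select can be folded into a following arithmetic op, the zeroing behaviour is exactly what you want.

```c
#include <stdint.h>

/* Model of `czero.eqz rd, rs1, rs2`: rd = (rs2 == 0) ? 0 : rs1 */
static inline uint64_t czero_eqz(uint64_t rs1, uint64_t rs2) {
    return rs2 == 0 ? 0 : rs1;
}

/* `cond ? x : 0` maps to a single czero.eqz; no extra instructions. */
static inline uint64_t mask_if(uint64_t cond, uint64_t x) {
    return czero_eqz(x, cond);
}

/* `cond ? a + b : a` maps to czero.eqz + add:
 *   czero.eqz t0, b, cond    // t0 = cond ? b : 0
 *   add       rd, a, t0      // adding 0 leaves a unchanged
 * A cmov-based version would also need an add plus the cmov, so Zicond
 * costs nothing extra in this pattern. */
static inline uint64_t add_if(uint64_t cond, uint64_t a, uint64_t b) {
    return a + czero_eqz(b, cond);
}
```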

While some companies implement a 3R1W integer pipeline and use fusion, others keep the integer side 2R1W. If you use 2R1W, you can get wider issue for the same area. If you have a four-issue integer pipeline, you may be able to add a fifth integer execution unit for less than it would cost to move to 3R1W, which may give you a higher performance gain.


"3R1W integer pipeline" is kinda ambiguous; I think it'd be extremely stupid for any core to have all its ALUs be 3R. Much more sane is having ~half be such (if even that), and the rest at 2R.

Or, better yet, have the 3R extra port come from some of the 2R being split up; e.g. for a block of 3×2R1W ALUs, be able to split one up for its read ports, reusing it as 2×3R1W when needed, thereby being able to do 3R1W at 66% the throughput of 2R1W without any extra register ports (i.e. 1.3x throughput benefit of 3R1W over two 2R1W instrs). Probably has some extra costs from scheduling & co needing to handle 3R though.


"sufficiently advanced magic core" is a fairly funny term when a lot of other RISC-V behavior basically assumes any real processor will have a macro-op-fusing frontend for specific sequences of instructions, and even provides recommendations for which fusions cores should implement.


When you say "xmips" was there a difference of technique between MIPS and ARM?


The `xmipscmove` is just the name of the extension. The 'mips' here means MIPS-the-company, and not MIPS-the-ISA. It's supported by the MIPS P8700 CPU which, counterintuitively, is a RISC-V CPU and not MIPS (it's named "MIPS" because the company which designed it is called "MIPS", not because it uses the MIPS architecture).



