LoongArch64 Subjective Highlights

unwind · 2025-01-24T07:42:45 1737704565

This was very interesting, but is there some kind of formatting failure that prevents the actual instruction definitions (the syntax) from being shown alongside the explanations? Seemed ... oddly confusing. Even if these are undocumented "Black market" instructions, someone must have picked mnemonics?

duskwuff · 2025-01-24T08:01:14 1737705674

Some of the headings in the document are white and blend into the background. Click the moon icon in the upper left corner to enable dark mode and make them visible.

Joker_vD · 2025-01-24T13:04:50 1737723890

Woah, what an array of varied and sometimes indispensable instructions. Does it all still count as RISC?

Pet_Ant · 2025-01-24T15:43:39 1737733419

As I've been studying the history of CPUs I'm not so sure that RISC really makes sense to talk about outside of the original period in which it was created. It was a belief that, at that point in time, the budget allocated to decoding complex instructions was better spent on more registers and cache, and on allowing operations to clock faster due to their simplicity.

Now that we have speculative execution, cracking into micro-ops and multi-issue units, it's really not RISCy at all. POWER and x86 really have more in common than they don't.

wat10000 · 2025-01-24T16:27:01 1737736021

Right, fundamentally RISC was about aligning the instruction set with the hardware. At the hardware level, you can’t do arithmetic on a value in memory, you have to load it into a register first. CISC might still give you an add instruction that operates on memory, and behind the scenes it does load, add. RISC says, let’s just do those more basic operations, and the program can combine them. The hardware can be much simpler as a result.

Now, the hardware is stupidly complex anyway and the instructions generally don’t line up too well with the underlying hardware regardless. Consider branch delay slots. Initially, this lined up well with the hardware. The branch takes extra time, so let’s expose that to the program instead of hiding it, then the hardware gets simpler and programs can take advantage of it. Trouble is, once your pipelining gets more complex than “branches take an extra cycle and we can perform another instruction in that time” then it doesn’t align with the hardware anymore. It makes things worse, since now your fancier hardware has to support this weird thing that doesn’t match how it actually operates.
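
To make the load-then-add point above concrete, here is a toy sketch in Python. It is not a model of any real ISA: the register names, the hidden `tmp` register, and the helper functions `load`, `add`, and `add_mem` are all made up for illustration. It only shows that a memory-operand add can be seen as the same two basic steps a RISC program would spell out itself.

```python
# Toy register/memory model; everything here is hypothetical, not a real ISA.

def load(regs, mem, rd, addr):
    """RISC-style load: copy a value from memory into register rd."""
    regs[rd] = mem[addr]

def add(regs, rd, rs1, rs2):
    """RISC-style add: operates on registers only."""
    regs[rd] = regs[rs1] + regs[rs2]

def add_mem(regs, mem, rd, rs, addr):
    """CISC-style add with a memory operand: rd = rs + mem[addr].

    Behind the scenes it performs the same load and add,
    using a temporary register the program never names.
    """
    load(regs, mem, "tmp", addr)
    add(regs, rd, rs, "tmp")

mem = {0x10: 5}

# CISC flavor: one instruction, the decomposition is hidden.
cisc_regs = {"r1": 7}
add_mem(cisc_regs, mem, "r2", "r1", 0x10)

# RISC flavor: the program issues the two basic operations itself.
risc_regs = {"r1": 7}
load(risc_regs, mem, "r3", 0x10)
add(risc_regs, "r2", "r1", "r3")

print(cisc_regs["r2"], risc_regs["r2"])
```

Both paths compute the same result; the difference is only in whether the intermediate load is visible to the program.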

Pet_Ant · 2025-01-24T17:15:25 1737738925

I think it's worth highlighting this

> fundamentally RISC was about aligning the instruction set with [1970s] hardware

because once you start getting into superscalar designs in the 1990s you are already starting to have slippage between the ISA and the implementation.

It's also worth remembering that CISC was a product of making assembly higher level and exposing 'instructions' that are basically function calls to make assembly programming more productive and reusable in a time where compilers were still an undertaking and not a weekend project.

In the future with the proliferation of VM based languages I could see CPUs exposing their microcode publicly so that the JITs can target the actual hardware more effectively with less concern about compatibility because only the VMs need to get ported and really there are only 4 major ones (JVM, CLR, V8, and Python).

kps · 2025-01-24T18:15:42 1737742542

> In the future with the proliferation of VM based languages I could see CPUs exposing their microcode publicly so that the JITs can target the actual hardware

IBM AS/400 did this, except that (as a closed system) the microcode layer is not public (and these days is just some POWER variant anyway).

It's also sort of what Itanium tried, except that they made the same mistake of entrenching implementation in the ISA... and that most real-world code does not have the same usage patterns as processor-implementation microcode.

Joker_vD · 2025-01-24T17:34:09 1737740049

> In the future with the proliferation of VM based languages I could see CPUs exposing their microcode publicly

Like what Lilith and Modula-2 did back in the early eighties [0]? My original snarky comment was primarily aimed at all those bit-fiddling instructions: on one hand, while you can emulate them with shifts and bitwise logical instructions in a loop, having direct hardware support is not only much nicer, it's also quite cheap and simple, it's literally just several muxes connected together. But on the other hand, how often does one really need those instructions? They take precious encoding space.

[0] https://archive.org/details/byte-magazine-1984-08/page/n186
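
As a rough illustration of "shifts and bitwise logical instructions in a loop", here is a minimal Python sketch of a 32-bit bit reversal, one of the bit-fiddling operations that some ISAs provide as a single instruction. The function name `bit_reverse32` is made up for this example, and the loop is just one straightforward way to emulate the operation, not how any particular compiler or library does it.

```python
def bit_reverse32(x: int) -> int:
    """Reverse the bit order of a 32-bit value using only shifts, AND, and OR."""
    x &= 0xFFFFFFFF  # treat the input as an unsigned 32-bit word
    result = 0
    for _ in range(32):
        # Move the lowest bit of x onto the low end of result,
        # pushing earlier bits toward the high end.
        result = (result << 1) | (x & 1)
        x >>= 1
    return result

print(hex(bit_reverse32(0x000000F0)))
```

The software version needs 32 iterations of shift, mask, and OR, whereas in hardware the same operation is essentially fixed wiring, which is the trade-off the comment describes.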

deivid · 2025-01-24T07:43:13 1737704593

Nice article! Do you have a board to experiment on or is this all on QEMU? I've been wanting to try one of the Banana Pi boards, but I think it's going to be _quite_ painful

pantalaimon · 2025-01-24T10:48:32 1737715712

You can get the Asus XC-LS3A6M from e.g. AliExpress. It comes with a 3A6000 CPU which is plenty fast (compared to ARM SBCs), similar to Zen 1.

Alpine is one of the few western distributions that has full support for the architecture.

sidewndr46 · 2025-01-24T15:46:17 1737733577

I wasn't aware of this. It looks pretty neat. But the price point I'm seeing is more than $400 US. So I'm not sure it is price competitive yet

snvzz · 2025-01-24T08:10:09 1737706209

The RISC-V Vector article[0] on the same site is quite interesting as well.

0. http://0x80.pl/notesen/2024-11-09-riscv-vector-extension.htm...