128-bit RISC-V assembler (github.com/fwsgonzo)
30 points by ingve on Sept 19, 2021 | 21 comments



When you need a pointer that can address any byte that has ever existed or ever will.


Looking for insight from people who work on low-level code optimisation, instruction sets, and hardware.

1. Do you see 128-bit processors/systems becoming mainstream anytime in the near future?

2. What type of applications would benefit consumers and general computing at scale?

3. Do we know of any of the main players working on these?


> 2. What type of applications would benefit consumers and general computing at scale?

I'm not an expert, but this would make nearly everything "SIMD".

In particular, it would allow data to be encoded as vectors and processed in batches, i.e. make that more common than it is now and an emergent property of the data itself, rather than an "extra" to be bolted on occasionally, like actual SIMD instructions today.

This is already a thing in some RDBMS/pipeline engines, where combining per-row and columnar processing is optimal.


So more in line with a GPU's purpose of parallel processing, but extending the applications available for it beyond what is currently reserved for the typical 'ML' and 'AI' workloads as we know them today?

Sounds like a merging of CPU and GPU. I understand the simplistic nature of that conclusion, but I can only guess at real-world results.


I don't think this is correct. 128-bit RISC-V has nothing to do with SIMD, at all. It was created to future-proof RISC-V wrt. exascale distributed computing, where even a full 64-bit address space might be insufficient.


> 128-bit RISC-V has nothing to do with SIMD,

Maybe. The point is that you can encode more per unit, so you can batch more things in one go. It's not exactly SIMD as it exists today, but it's the same idea.


Sorry, I was asking generally about how it relates to 128-bit general-purpose computing instructions. But I'm also interested in the RISC-related context.

Thanks for clarifying!


The SCALL instruction, used in his example, has been renamed to ECALL. Not a big deal though.
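For reference, a minimal sketch (not from the linked assembler, just the standard riscv64 Linux convention) of what an ECALL-based system call looks like from C inline assembly:

    /* Hypothetical example: write(2) on riscv64 Linux via ECALL.
       The instruction was formerly spelled SCALL in older toolchains. */
    long sys_write(int fd, const void *buf, unsigned long len) {
        register long a0 asm("a0") = fd;
        register long a1 asm("a1") = (long)buf;
        register long a2 asm("a2") = (long)len;
        register long a7 asm("a7") = 64;  /* __NR_write on riscv64 */
        asm volatile("ecall"
                     : "+r"(a0)
                     : "r"(a1), "r"(a2), "r"(a7)
                     : "memory");
        return a0;  /* result or negative errno */
    }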


Good catch. I will rename it to ecall and provide an alias for syscall instead.


And here I was thinking 48 bits would be optimal for a lot of things, from address space to posits and instruction lengths (actually 24 is probably better there).


Although I completely agree with you, I cannot help but remember "640K ought to be enough for anyone" by W.H. Gates.


It's a fun quote, but it's apocryphal.


Are there any 128-bit cores done in an FPGA yet?


If there isn't, it would be less than a day's work to change a few constants in a 64-bit core and add the corresponding equivalents of the same 10 instructions that were added to go from RV32I to RV64I.

Fabrice Bellard has had rv128 support in TinyEmu for some years already.

If it hasn't been done, it's not because it's in any way difficult (in an emulator, FPGA, or SoC) but only because no one has needed it yet.


Well, I doubt that simply widening the bus from 64 to 128 bits will get you a working bitstream for FPGAs, even top-notch ones. Something tells me that synthesising/placing/routing such a soft core will run into timing and resource limitations.

Studying recent ARM-based SoCs, I usually find that they often reduce the AXI4 buses to 48 or even fewer bits, making all internal I/O transactions block-wise.


Of course you can do it and it will work.

A "standard" (e.g. 5 stage pipe in-order) 64 bit RISC-V core takes a very small proportion of even a fairly small FPGA such as an Arty 100T. You can even do multiple cores. 128 bit would fit no problem at all.

You'd probably have a slightly lower Fmax, but there's no question it would work, and if you actually need 128-bit arithmetic it will be faster than using multiple instructions on a 64-bit CPU (e.g. 4 instructions for a 128-bit add).
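To make the "4 instructions" concrete, here is a rough sketch in C of a 128-bit add built from 64-bit halves; the four operations (low add, carry test, high add, carry add) map roughly onto the add/sltu/add/add sequence a 64-bit RISC-V core would emit, whereas an RV128 core could use a single add:

    #include <stdint.h>

    typedef struct { uint64_t lo, hi; } u128;

    static u128 add128(u128 a, u128 b) {
        u128 r;
        r.lo = a.lo + b.lo;            /* add  (low halves)        */
        uint64_t carry = r.lo < a.lo;  /* sltu (carry out of low)  */
        r.hi = a.hi + b.hi;            /* add  (high halves)       */
        r.hi += carry;                 /* add  (propagate carry)   */
        return r;
    }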


Who needs 2^128 integers?


It opens up some very interesting possibilities and optimization options that you wouldn't normally consider. In any case, I think programmers of most languages wouldn't even need to care, as all optimizations would be done by the teams working on the compilers/interpreters/VMs.

Note that it's not just about integers; floating point would also benefit from this.


You can use tagged pointers for memory protection.
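A hedged sketch of the idea: with wider registers there are plenty of spare bits to carry a tag alongside the address, so a runtime can check it on every dereference. The layout below (a 16-bit tag in the top of a 64-bit word) is only illustrative, not any particular scheme:

    #include <stdint.h>
    #include <assert.h>

    #define TAG_SHIFT 48
    #define ADDR_MASK ((UINT64_C(1) << TAG_SHIFT) - 1)

    /* Pack a small tag into the unused upper bits of a pointer. */
    static uint64_t tag_ptr(void *p, uint16_t tag) {
        return ((uint64_t)tag << TAG_SHIFT) | ((uintptr_t)p & ADDR_MASK);
    }

    /* Check the tag before handing the raw pointer back. */
    static void *untag_ptr(uint64_t tagged, uint16_t expected) {
        assert((tagged >> TAG_SHIFT) == expected);  /* protection check */
        return (void *)(uintptr_t)(tagged & ADDR_MASK);
    }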


Evergreen comment, "2^128" being an arbitrary upper limit.


It could speed up some crypto algorithms that use very large words.
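As a sketch of why wide words matter there: big-number and field arithmetic (e.g. Curve25519-style code) is full of 64x64 -> 128-bit multiplies and carry chains. On a 64-bit machine the high half needs a separate mulhu-style step; a 128-bit datapath could hold the whole product in one register. Using the GCC/Clang unsigned __int128 extension:

    #include <stdint.h>

    typedef unsigned __int128 u128;

    /* Full 64x64 -> 128-bit product, split into the halves a 64-bit
       target needs (roughly MUL for lo, MULHU for hi on RV64). */
    static void mul64_wide(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo) {
        u128 p = (u128)a * b;
        *lo = (uint64_t)p;
        *hi = (uint64_t)(p >> 64);
    }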





