128-bit RISC-V assembler (github.com/fwsgonzo)
30 points by ingve on Sept 19, 2021 | 21 comments



When you need a pointer that can address any byte that has ever existed or ever will.


Looking for insight from people who work on low-level code optimisation, instruction sets, and hardware.

1. Do you see 128-bit processors/systems becoming mainstream anytime in the near future?

2. What type of applications would benefit consumers and general computing at scale?

3. Do we know of any of the main players working on these?


> 2. What type of applications would benefit consumers and general computing at scale?

I'm not an expert, but this would make nearly everything "SIMD".

In particular, it would allow data to be encoded as vectors and processed in batches, i.e. make that more common than it is now and an emergent property of the data itself, rather than an "extra" to be bolted on occasionally, like actual SIMD instructions today.

This is already a thing in some RDBMS/pipeline engines, where combining per-row and columnar processing is optimal.


So more in line with a GPU's purpose of parallel processing, but extending the applications available for it beyond what is currently reserved for the typical 'ML' and 'AI' workloads as we know them today?

Sounds like a merging of CPU and GPU. I understand the simplistic nature of that conclusion, but I can only guess at real-world results.


I don't think this is correct. 128-bit RISC-V has nothing to do with SIMD, at all. It was created to future-proof RISC-V wrt. exascale distributed computing, where even a full 64-bit address space might be insufficient.


> 128-bit RISC-V has nothing to do with SIMD,

Maybe. The point is that you can encode more per unit, so you can batch more things in one go. It's not exactly SIMD as it exists today, but it's the same idea.


Sorry, I was asking generally about how it relates to 128-bit general-purpose computing instructions. But I'm also interested in the RISC-related context.

Thanks for clarifying!


The SCALL instruction, used in his example, has been renamed to ECALL. Not a big deal though.
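For reference, a minimal sketch (not from the linked assembler, just the standard riscv64 Linux convention) of what an ECALL-based system call looks like from C inline assembly:

    /* Hypothetical example: write(2) on riscv64 Linux via ECALL.
       The instruction was formerly spelled SCALL in older toolchains. */
    long sys_write(int fd, const void *buf, unsigned long len) {
        register long a0 asm("a0") = fd;
        register long a1 asm("a1") = (long)buf;
        register long a2 asm("a2") = (long)len;
        register long a7 asm("a7") = 64;  /* __NR_write on riscv64 */
        asm volatile("ecall"
                     : "+r"(a0)
                     : "r"(a1), "r"(a2), "r"(a7)
                     : "memory");
        return a0;  /* result or negative errno */
    }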


Good catch. I will rename it to ecall and provide an alias for syscall instead.


And here I was thinking 48 bits would be optimal for a lot of things, from address space to posits and instruction lengths (actually 24 is probably better there).


Although I completely agree with you, I cannot help but remember "640K ought to be enough for anyone" by W.H. Gates.


It's a fun quote, but it's apocryphal.


Are there any 128-bit cores done in an FPGA yet?


If there isn't, it would be less than a day's work to change a few constants in a 64-bit core and add the corresponding equivalents of the same 10 instructions that were added to go from RV32I to RV64I.

Fabrice Bellard has had rv128 support in TinyEmu for some years already.

If it hasn't been done, it's not because it's in any way difficult (in an emulator, FPGA, or SoC) but only because no one has needed it yet.


Well, I doubt that simply widening the bus from 64 to 128 bits will get you a working bitstream for FPGAs, even top-notch ones. Something tells me that synthesising/placing/routing such a soft core will run into timing and resource limitations.

Studying recent ARM-based SoCs, I usually find that they often reduce the AXI4 buses to 48 or even fewer bits, making all internal I/O transactions block-wise.


Of course you can do it and it will work.

A "standard" (e.g. 5 stage pipe in-order) 64 bit RISC-V core takes a very small proportion of even a fairly small FPGA such as an Arty 100T. You can even do multiple cores. 128 bit would fit no problem at all.

You'd probably have a slightly lower Fmax, but there's no question it would work, and if you actually need 128-bit arithmetic it will be faster than using multiple instructions on a 64-bit CPU (e.g. 4 instructions for a 128-bit add).
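To make the "4 instructions" concrete, here is a rough sketch in C of a 128-bit add built from 64-bit halves; the four operations (low add, carry test, high add, carry add) map roughly onto the add/sltu/add/add sequence a 64-bit RISC-V core would emit, whereas an RV128 core could use a single add:

    #include <stdint.h>

    typedef struct { uint64_t lo, hi; } u128;

    static u128 add128(u128 a, u128 b) {
        u128 r;
        r.lo = a.lo + b.lo;            /* add  (low halves)        */
        uint64_t carry = r.lo < a.lo;  /* sltu (carry out of low)  */
        r.hi = a.hi + b.hi;            /* add  (high halves)       */
        r.hi += carry;                 /* add  (propagate carry)   */
        return r;
    }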


Who needs 2^128 integers?


It opens up some very interesting possibilities and optimization options that you wouldn't normally consider. In any case, I think programmers of most languages wouldn't even need to care, as all optimizations would be done by the teams working on the compilers/interpreters/VMs.

Note that it's not just about integers; floating point would also benefit from this.


You can use tagged pointers for memory protection.
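A hedged sketch of the idea: with wider registers there are plenty of spare bits to carry a tag alongside the address, so a runtime can check it on every dereference. The layout below (a 16-bit tag in the top of a 64-bit word) is only illustrative, not any particular scheme:

    #include <stdint.h>
    #include <assert.h>

    #define TAG_SHIFT 48
    #define ADDR_MASK ((UINT64_C(1) << TAG_SHIFT) - 1)

    /* Pack a small tag into the unused upper bits of a pointer. */
    static uint64_t tag_ptr(void *p, uint16_t tag) {
        return ((uint64_t)tag << TAG_SHIFT) | ((uintptr_t)p & ADDR_MASK);
    }

    /* Check the tag before handing the raw pointer back. */
    static void *untag_ptr(uint64_t tagged, uint16_t expected) {
        assert((tagged >> TAG_SHIFT) == expected);  /* protection check */
        return (void *)(uintptr_t)(tagged & ADDR_MASK);
    }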


Evergreen comment, "2^128" being an arbitrary upper limit.


It could speed up some crypto algorithms that use very large words.
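As a sketch of why wide words matter there: big-number and field arithmetic (e.g. Curve25519-style code) is full of 64x64 -> 128-bit multiplies and carry chains. On a 64-bit machine the high half needs a separate mulhu-style step; a 128-bit datapath could hold the whole product in one register. Using the GCC/Clang unsigned __int128 extension:

    #include <stdint.h>

    typedef unsigned __int128 u128;

    /* Full 64x64 -> 128-bit product, split into the halves a 64-bit
       target needs (roughly MUL for lo, MULHU for hi on RV64). */
    static void mul64_wide(uint64_t a, uint64_t b, uint64_t *hi, uint64_t *lo) {
        u128 p = (u128)a * b;
        *lo = (uint64_t)p;
        *hi = (uint64_t)(p >> 64);
    }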





