
Porting SBCL to the RISC-V - pome
http://christophe.rhodes.io/notes/blog/posts/2018/beginning_an_sbcl_port/
======
brucehoult
If there is any assembly-language programming needed, or a code generator,
then I suggest you start with the code for MIPSle. Many of the instructions
and mnemonics are the same.

The biggest differences are that immediate arithmetic and load/store offsets
are 12-bit on RISC-V vs 16-bit on MIPS. To compensate, LUI loads 20 bits on
RISC-V vs 16 bits on MIPS. So it's only immediates or offsets between +/-2K
and +/-32K that are different.
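
A rough illustration of the immediate ranges described above, as a Python sketch rather than code from any real assembler. The helper names (`sign_extend`, `split_lui_addi`, `fits_simm`, `rebuild`) are made up for this example, and RV32 wrap-around semantics are assumed.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def split_lui_addi(const32):
    """Split a 32-bit constant into (hi20, lo12) for a LUI + ADDI pair.

    ADDI sign-extends its 12-bit immediate, so when bit 11 of the constant
    is set the upper part has to be bumped by one to compensate.
    """
    const32 &= 0xFFFFFFFF
    lo12 = sign_extend(const32, 12)              # ADDI immediate, -2048..2047
    hi20 = ((const32 - lo12) >> 12) & 0xFFFFF    # LUI immediate, 20 bits
    return hi20, lo12


def rebuild(hi20, lo12):
    """What LUI followed by ADDI would leave in a 32-bit register."""
    return ((hi20 << 12) + lo12) & 0xFFFFFFFF


def fits_simm(value, bits):
    """True if value fits in a signed immediate field of the given width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))
```

With these helpers, `fits_simm(3000, 16)` is true while `fits_simm(3000, 12)` is false, which is the "between +/-2K and +/-32K" case: one instruction on MIPS, a LUI + ADDI pair (or a different base register) on RISC-V.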

Also, RISC-V can compare two registers for ordering and branch in one
instruction, which older MIPS can't do.

~~~
microcolonel
From my (slow) hobby work on a V8 port, I can say that there are differences
in loading large (48-64 bit) arbitrary constants without a pool (it takes
_many_ instructions) or a scratch register. The FPU is also used differently
(it's more janky and bolted-on with MIPSel). It's nice not having exposed
delay slots, and the additional pc-relative addressing range is very
convenient (since, for example, in V8 there is a maximum code heap size known
at compile time, and the addresses are contiguous, so you can use pc-relative
immediate addressing for anything in that heap [as long as you keep track of
sources for relocation] at a penalty of one word [which is not a big deal in
the grand scheme of things]).
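
To give a feel for the "_many_ instructions" point, here is a Python sketch of one naive way to build a 64-bit constant in a single RV64 register with no pool and no scratch register. It is an assumption-laden illustration, not V8's or any toolchain's actual algorithm: `materialize`, `run` and `sext` are hypothetical names, and `run` is only a toy model of the four instructions used.

```python
def sext(value, bits):
    """Sign-extend the low `bits` bits of value."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def materialize(value):
    """Return (op, imm) steps that build a 64-bit value in one register."""
    value = sext(value, 64)
    lo12 = sext(value, 12)
    if -(1 << 31) <= value < (1 << 31):
        # Fits in 32 bits: at most LUI + ADDIW.
        hi20 = ((value - lo12) >> 12) & 0xFFFFF
        steps = [("lui", hi20)] if hi20 else []
        if lo12 or not hi20:
            steps.append(("addiw" if hi20 else "addi", lo12))
        return steps
    # Otherwise peel off the low 12 bits, build the rest recursively,
    # then shift it into place and add the low part back.
    hi = (value - lo12) >> 12
    shift = 12
    while hi & 1 == 0:          # fold trailing zero bits into the shift
        hi >>= 1
        shift += 1
    steps = materialize(hi)
    steps.append(("slli", shift))
    if lo12:
        steps.append(("addi", lo12))
    return steps


def run(steps):
    """Toy model of the sequence, starting from a zeroed register."""
    reg = 0
    for op, imm in steps:
        if op == "lui":
            reg = sext(imm << 12, 32)       # RV64 LUI sign-extends from bit 31
        elif op == "addi":
            reg = sext(reg + imm, 64)
        elif op == "addiw":
            reg = sext(reg + imm, 32)       # 32-bit add, then sign-extend
        elif op == "slli":
            reg = sext(reg << imm, 64)
    return reg
```

Under this scheme `materialize(42)` is a single ADDI, while `materialize(0x123456789ABCDEF0)` comes out as eight instructions (LUI, ADDIW, then three SLLI/ADDI pairs).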

~~~
brandmeyer
> I can say that there are differences in loading large (48-64 bit) arbitrary
> constants without a pool.

How arbitrary is arbitrary? ARMv7's Thumb2 format immediates are composed of
an 8-bit field shifted by up to 5 bits. So you can form any 32-bit variable,
but with limited precision.

ARMv8 modified immediates can describe a contiguous run of ones followed by a
contiguous run of zeros, and SWAR variations of the same. So you can describe
things like a repeating 0x3f... for example.
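
As a sketch of what "a run of ones, rotated and replicated" means in practice, the following Python check tests whether a 64-bit value has that shape. It reflects my reading of the A64 logical-immediate rules, not code from any assembler; `is_a64_logical_immediate` is a hypothetical name.

```python
def is_a64_logical_immediate(value, width=64):
    """True if value is a rotated run of ones replicated across the register."""
    mask = (1 << width) - 1
    value &= mask
    if value == 0 or value == mask:
        return False                      # all-zeros / all-ones are excluded
    size = 2
    while size <= width:                  # element sizes 2, 4, 8, 16, 32, 64
        emask = (1 << size) - 1
        elem = value & emask
        # The whole value must be this element repeated.
        replicated = 0
        for shift in range(0, width, size):
            replicated |= elem << shift
        if replicated == value and elem not in (0, emask):
            # The element must be some rotation of 0...01...1.
            for rot in range(size):
                r = ((elem >> rot) | (elem << (size - rot))) & emask
                if r & (r + 1) == 0:
                    return True
        size *= 2
    return False
```

The repeating 0x3f pattern mentioned above passes (`is_a64_logical_immediate(0x3F3F3F3F3F3F3F3F)` is true), whereas something irregular like `0x1234` does not.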

Do either of those formats encompass the kinds of literals that you need in
the V8 JIT?

> so you can use pc-relative addressing ... at a penalty of one word

Since the RISC-V PC-relative addressing capabilities are similar to ARMv8
(adrp) and x86-64 (rip-relative addressing), I would have thought that this is
basically a non-problem. You pay one more live register to hold the page
address, but you also get more registers, so I would think it mostly washes
out. Where do you pay a penalty?

~~~
microcolonel
> _How arbitrary is arbitrary? ARMv7's Thumb2 format immediates are composed
> of an 8-bit field shifted by up to 5 bits. So you can form any 32-bit
> variable, but with limited precision._

When it comes to encoding the address of an entry point, every bit of
precision you lose (above the first two or three) in the address loses you
memory compactness (and adds a certain amount of complexity to compilation and
relocation).

On the ARM and AArch64 V8 ports they use a constant pool for target addresses,
on RISC-V you can probably just use AUIPC to compute the target address in
place with no pool address register. You can, of course, do the exact same
thing on RISC-V that they do in the ARM ports, but RISC-V has the considerable
advantage of four extra bits (totalling 20) in U immediates vs. MIPS (16-bit U
immediates), and eight extra bits vs. ARM in some cases (though ARM's
immediate encodings are various and sundry, and produce a huge variety of
corner cases and microoptimizations which are mostly useless to JITs [in my
mostly amateur opinion]; to a lesser extent MIPS also has some interesting
features for loading immediates which make up for the shortfalls in AOT code,
but are harder [it seems to me, an amateur] to use effectively in a JIT).

~~~
brandmeyer
AArch64 uses adrp in almost exactly the same way that RISC-V uses auipc to
access literal pools. It isn't a strongly distinguishing feature between those
architectures.

The difference between them is that ADRP computes a 4 kB page-aligned pc-
relative address, which complements the 12-bit unsigned address offsets in its
base+disp addressing mode to get a uniform +(2 GB -1) to -2GB reach. RISC-V
doesn't compute a page-aligned address, in order to partially compensate for
the use of signed offsets in its base+disp12 addressing mode. I say partially
because RISC-V's PC-relative reach remains asymmetric +(2 GB - 2k - 1) to -(2
GB + 2k), but that probably doesn't matter much as long as you establish an
appropriate red zone.

ARM distinguishes immediate operands used for data processing and immediate
operands used for address generation. The alternative formats I was referring
to are mostly just available for the logical operations (although Thumb2
sometimes also uses them for arithmetic). I was thinking that they might make
pointer tagging a smidge easier to deal with.

*edit: Whoops, 12-bit signed, not 10-bit signed on the asymmetry of the RISC-V reach.

~~~
brucehoult
Your figures seem a little off there. Yes, RISC-V is a little asymmetric, with
the AUIPC being able to subtract exactly 2 GB from the PC or add 2GB-4KB to
it, and then a jalr/lb/sb can subtract an additional 2 KB or add 2KB-1.

But the AArch64 adrp is also asymmetric because the relative reach depends on
where in the 4 KB page the original PC is. It's only symmetric if the PC is 4
KB aligned. If you're part way through the page then there is more -ve reach
and less +ve reach.

A couple of KB fuzziness in what is basically a 32 bit reach in a 64 bit
address space is pretty much completely irrelevant in both cases.
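
The RISC-V figures above can be checked with a few lines of arithmetic. This is a Python sketch; the constant and function names are made up for the example, and RV64 is assumed so that the result does not wrap.

```python
KB = 1 << 10
GB = 1 << 30


def auipc_range():
    """Smallest and largest value AUIPC can add to the PC (20-bit signed << 12)."""
    return (-(1 << 19)) << 12, ((1 << 19) - 1) << 12


def disp12_range():
    """Range of the signed 12-bit offset in jalr/loads/stores."""
    return -(1 << 11), (1 << 11) - 1


def riscv_pcrel_reach():
    """Combined reach of AUIPC followed by a 12-bit displacement."""
    a_min, a_max = auipc_range()
    d_min, d_max = disp12_range()
    return a_min + d_min, a_max + d_max
```

`auipc_range()` gives (-2 GB, 2 GB - 4 KB) and `riscv_pcrel_reach()` gives (-(2 GB + 2 KB), 2 GB - 2 KB - 1), i.e. about 2 KB of skew toward negative offsets.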

~~~
brandmeyer
ADRP operates on the 4kB page of the pc (by truncation), not the entire pc.
RISC-V could have implicitly added 2k in auipc and balanced out the bias. But
they didn't.

~~~
brucehoult
I know how ADRP operates. It's symmetric about the truncated PC. It's not
symmetric about the _actual_ PC.

As I said in the last post.

~~~
brandmeyer
I think you are misunderstanding the benefit of having symmetric reach by the
page.

On AArch64, you can define a 2 GB contiguous slice of address space, built up
out of whatever page size you find convenient for your system and plant a
relocatable binary into it, up to 2 GB of size. Any instruction anywhere in
the last page can reach any address in the first page, and vice versa.

In RISC-V, if you try to do the same thing, you'd find that any instruction
in the last page can reach any address in the first page with room to spare,
but some instructions in the first page cannot reach portions of the last
page.
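
The RISC-V half of that claim can be sketched in Python. The assumptions here are RV64, a 2 GB region based at address 0 for simplicity, and the AUIPC + signed 12-bit offset reach discussed earlier in the thread; `riscv_can_reach` is a hypothetical helper.

```python
REGION = 1 << 31      # a contiguous 2 GB slice, assumed to start at address 0
PAGE = 1 << 12        # 4 KB pages


def riscv_can_reach(pc, target):
    """True if `target` is within AUIPC + signed 12-bit offset of `pc`."""
    offset = target - pc
    lowest = -(1 << 31) - (1 << 11)        # -(2 GB + 2 KB)
    highest = (1 << 31) - (1 << 11) - 1    # 2 GB - 2 KB - 1
    return lowest <= offset <= highest


# An instruction at the very end of the region reaches the very start...
last_to_first = riscv_can_reach(REGION - 4, 0)
# ...but an instruction at the very start cannot reach the last byte.
first_to_last = riscv_can_reach(0, REGION - 1)
```

Here `last_to_first` is true and `first_to_last` is false. In this model only instructions in the first 2 KB of the region are affected: `riscv_can_reach(2048, REGION - 1)` is true again.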

Sure, it doesn't matter most of the time. It isn't ever really an obstacle in
practice for the feller writing application code for the platform. But the
linker has to be aware of it in the 'medium' code model as a special case for
just this particular platform. Somebody had to write that special case to work
around the hardware.

~~~
brucehoult
The linker code to calculate the necessary auipc and remaining offset for a
relocation and do something else when out of bounds, was written years ago, is
two lines of code, and no doubt took less time than this conversation.
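
For reference, the calculation being described is roughly the following. This is a Python sketch of the usual hi/lo split with a bounds check, not the actual code from any linker; `split_pcrel` is a hypothetical name.

```python
def split_pcrel(pc, target):
    """Split a PC-relative offset into (hi20, lo12) for AUIPC + 12-bit offset.

    Adding 0x800 before shifting rounds hi20 to the nearest 4 KB multiple,
    so the remainder always fits the signed 12-bit field.
    """
    offset = target - pc
    hi20 = (offset + 0x800) >> 12
    lo12 = offset - (hi20 << 12)
    if not -(1 << 19) <= hi20 < (1 << 19):
        # A real linker would "do something else" here (relax, use a
        # different code sequence, or report an error).
        raise OverflowError("target out of AUIPC range")
    return hi20, lo12
```

For example, `split_pcrel(0, 0x12345FFF)` yields `(0x12346, -1)`, and the largest offset that does not raise is 2 GB - 2 KB - 1.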

I don't even know of any application that has 1 GB of code, let alone 2 GB
minus 2 KB (2,147,481,600 bytes).

------
jepler
Looks like the disparaging ARM-fronted website about RISC-V might have been
[https://riscv-basics.com/](https://riscv-basics.com/) which has disappeared
down the memory hole. One mention at
[http://www.osnews.com/comments/30562](http://www.osnews.com/comments/30562)
sheds a bit of light.

edited to add: here's HN's discussion at the time:
[https://news.ycombinator.com/item?id=17489504](https://news.ycombinator.com/item?id=17489504)

~~~
kbob
ARM got sort of a Streisand Effect: this project probably wouldn't have
happened if ARM hadn't drawn attention to its new competitor.

------
dang
We changed the URL from
[http://christophe.rhodes.io/notes/tag/riscv/](http://christophe.rhodes.io/notes/tag/riscv/)
to the introductory article in the series. The other article listed there is
[http://christophe.rhodes.io/notes/blog/posts/2018/first_risc...](http://christophe.rhodes.io/notes/blog/posts/2018/first_riscy_steps/).

------
Annatar
This has got to be the first processor for which the software is available
before a complete computer exists; RISC-V 19” rack mountable servers remain
distant science fiction.

~~~
floatboth
Nah, certainly not the first. Lots of software was ported to AArch64 before
_any_ chips existed, only ARM's "Foundation Model" (a rather slow emulator).

~~~
microcolonel
Not to mention, there is actual RISC-V hardware capable of running this
software; and you can buy it right now for a known public price (which is more
than could be said for AArch64 for a long time, and almost to this day) and
integrate it with standard peripherals (PCIe, SATA, USB, etc.).

Granted, the hardware is somewhat limited for now, since it's only in-order.

~~~
Annatar
Servers, I explicitly wrote “19” rack mountable servers”!

Where can they be bought? Link please!!!

~~~
floatboth
You could mount a HiFive Unleashed into a rack case I guess. It's not a
standard form factor though, so you'd need some custom mounting hardware (or,
well, hot glue :D)

~~~
Annatar
That would be hacking. I couldn’t build datacenters with that. I’m a
professional engineer, not a hacker.

