
First of all, kudos to WD. This makes me feel good about spending $1500 on their spinning drives just last week.

But on a more practical note, what kind of board and toolchain does one need to get this going on an FPGA? Is there a readme somewhere that would walk one through the process?



Depends on what you want to do I guess.

You'd need an SoC-style FPGA with a memory controller, since I didn't see one in this code base. Putting this in an FPGA seems feasible.

But I see some challenges. It looks like it's a Harvard-architecture core. That means separate buses for data and instructions, which is not common outside of embedded or specialized systems. I'm sure you can set up GCC to work with this, but it would be a project.

You could build (or find) a memory controller that can multiplex separate instruction and data buses to a single memory space... decide data will be in memory range A and instructions in memory range B, and inform the linker where to put code and data.
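A minimal linker-script sketch of that split might look like the following. The region names, addresses, and sizes here are made up for illustration; the real map would depend on the core's bus decode:

```ld
/* Hypothetical split: instructions in range A, data in range B. */
MEMORY
{
  IMEM (rx) : ORIGIN = 0x00000000, LENGTH = 64K  /* instruction bus */
  DMEM (rw) : ORIGIN = 0x10000000, LENGTH = 64K  /* data bus */
}

SECTIONS
{
  .text   : { *(.text*) }          > IMEM
  .rodata : { *(.rodata*) }        > DMEM
  .data   : { *(.data*) }          > DMEM
  .bss    : { *(.bss*) *(COMMON) } > DMEM
}
```

The multiplexing memory controller would then route fetches in range A to one port and loads/stores in range B to the other.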

I'd probably start by downloading whatever free versions of the FPGA tools the vendors offer and seeing if I can synthesize the code for any of the targets and how well it fits (assuming someone else hasn't posted that info already). If it isn't going to fit in anything supported by the free version of the tools, I probably wouldn't go any further with it myself.

Assuming that it did fit, I would switch gears, build a simulation testbench, and start tinkering to see how it worked compared to the docs. If it really is strictly Harvard, I'd build a bridge to the FPGA's memory controller that could map the two buses to a single memory space. If I got that far, I'd start working on setting up a compiler and linker to map out code and data partitions in that memory space.

At this point you might be ready to build all this and load the FPGA, but you have no peripherals (like Ethernet or VGA). I'd consider slaving it to a Raspberry Pi or something like that. I saw a debug module in the GitHub repo, so that might be a good thing to expose to the Raspberry Pi. Or pick a simple bus like I2C and use that to get some visibility from the Pi into the RISC-V state and bridge over to the RAM.

---

Another direction you could take would be to get something like a snickerdoodle. I believe it can boot Linux with the ARM core in the FPGA, and it has the peripherals you need, like Ethernet (Wi-Fi) and access to an SD card. So the direction I would take there is trying to supplant the ARM core with the RISC-V one. The effort there would be to disable the ARM core, which ought to be straightforward, and build a wrapper around the RISC-V core so it can talk to the peripherals in place of the ARM.

Given that it's an ARM core, I'm sure all the internal busing is AMBA (AHB/APB/AXI), so it's probably pretty reasonable to try this.


I wouldn't call this Harvard. It has ICCM/DCCM (closely-coupled memories) that appear to be strict (i.e. the core cannot ld/st to the ICCM), but everything is in the same (physical) address space, and the CCMs are just mapped into that space. In that sense, the ICCM is not much different from having a ROM in the memory map, which isn't usually seen as making something a Harvard architecture.

In my view, RISC-V pretty much precludes a hard Harvard architecture. The fact that RISC-V has a FENCE.I instruction that "synchronizes the instruction and data streams" means that a RISC-V implementation can't really be strictly Harvard, since if it were, invalidating the icache wouldn't make much sense.

This core has an icache, but that also doesn't make it a Harvard architecture. As long as the backing store of both the icache and the dcache is the same, it won't be any more difficult to work with than any other modern modified Harvard architecture (read: pretty much every desktop, laptop, and phone of the last few decades).


> I'm sure you can setup GCC to work with this, but it would be a project.

FWIW AVR is not only Harvard, but code memory addresses point to 16-bit words, and data pointers to bytes. Yet, gcc works great (mostly). It is mainly a matter of whipping up a good linker script and directing code and data to the appropriate sections. Not particularly hard, although Harvardness does leak into your C code, mostly when taking pointers to functions or literal sections stored in flash.

*mostly — gcc long ago stopped taking optimization for code size seriously. That's unfortunate for microcontroller users, since on small processors like the AVR, optimizing for size is pretty much also optimizing for speed. Gcc has had some pretty serious code-size regressions in the past — mostly unnoticed by people who aren't trying to shoehorn code into a tiny flash space.


For ARM, I have seen IAR emit unoptimized code half the size of GCC's output with speed optimizations enabled (which led to slightly smaller code than size optimization!). When optimizing for size, IAR shrank the code by another third to a half. When you are really struggling to fit functionality into a tight controller because your HW engineers won't put in a bigger controller, this is a dealbreaker.


Agree completely that it is a dealbreaker. But don't be too hard on the hardware engineers; sometimes the BOM cannot afford the extra 17 cents.


I guess the tone came out wrong. I understand why we got the HW we got. In this case it wasn't even about cost. The power available to the device as a whole was so low that we were counting microamps. It was all incredibly tight:

Just switching on the wrong SoC feature would bring the entire thing outside the envelope. Even the contents of the passive LCD affected power consumption in adverse ways. Showing a checkerboard pattern could make the device fail.


It's surprising to me that the RISC-V ISA specification is loose enough that a core could be considered RISC-V-compliant and yet also need a linker script to accommodate its peculiarities.


?? Linker scripts are about the layout of the executable, and the OS (if there is one) is the driver of that. The ISA spec is an orthogonal concept.


On x86-64, the page tables control memory access permissions (rwx). The layout and semantics of these page tables are defined by the ISA and work the same on any processor. You wouldn't need to lay out your binary differently on Intel vs. AMD, for example.

It sounds like AVR, by contrast, has some parts of the address space that are (x) only and others that are (rw) only, even though other RISC-V processors don't have this restriction. That seems odd to me.

Maybe it's not as odd if you think of memory as a device that is mapped into your address space. ROM could be mapped into an x86-64 address space and be read-only -- even if the page table said it was writable it would probably throw some kind of hardware exception if you tried to actually write it.


(AVR isn't RISC-V)

Having page tables at all is completely optional on RISC-V. This is a chip with no MMU.


> It sounds like AVR, by contrast, has some parts of the address space that are (x) only and others that are (rw) only, even though other RISC-V processors don't have this restriction. That seems odd to me.

AVRs have three separate address spaces (program, RAM/data, EEPROM), i.e. the address "0" exists three times. Additionally their registers live in RAM.


Every architecture has a gcc linker script, it's just that most of them have a default one hiding in an "arch" directory somewhere that works for normal use.


It occurs to me that the more likely explanation is that GCC devs aren’t up to date on the specs and that perhaps a unified situation is possible, just not implemented.


> It looks like it's a Harvard-architecture core. That means separate buses for data and instructions, which is not common outside of embedded or specialized systems.

I'd point out that WASM is also a Harvard architecture, so that's not so exotic anymore.


That sounds like considerably more work than I was hoping it would be. So the follow-up question, then. Who is this release for, in your opinion?


I don't know why WD released this, but it would be useful for people building SoC ASICs that don't want to license an ARM core. Depending on the licensing.


Maybe this will be picked up as teaching material in universities by students or teachers, which means potential future candidates for WD.

Maybe it's a way to help existing experts to federate around real use cases to discuss further field improvements, which means outsourced R&D for WD.

Either way, it will be less work in the future once more people get interested in the subject.


Memory controllers can be implemented in the soft logic, so this should not require a hard memory controller or an SoC FPGA.


The core speaks AXI and AHB-Lite, so for an experienced FPGA/core guy, it probably wouldn't be too much work to integrate into their own FPGA flow. And since it doesn't have any floating-point, it will probably be able to fit on modestly-sized FPGAs.


I'm curious whether the FPGA tools support the SystemVerilog syntax this is using. I'm an FPGA designer, but I have not switched to SystemVerilog.



