Hacker News
Designing a RISC-V CPU, Part 1: Learning hardware design as a software engineer (mcla.ug)
150 points by lochsh 17 days ago | 31 comments

Robert Baruch recently re-booted his "Build a RISC-V Processor without using an FPGA" series[1]. His approach is heavily dependent on nmigen and formal verification. Check it out.

[1] https://www.youtube.com/watch?v=YgXJf8c5PLo&list=PLEeZWGE3Pw...

I know it isn't direct HLS, but I have to say that Python in EDA seems like a weak combination. Even if it's effectively just glue code, when I write Python I'm constantly hanging myself on nooses that wouldn't exist in a statically typed language (not good when you have to pay millions for a respin).

What does the formal verification flow/s look like for it?

This! I'd rather write VHDL than migen. I really don't understand the obsession that the open source hardware community has with it. There are way better alternatives.

For software engineers wanting to get into hardware design, the most important thing to remember is that it isn't software, despite feeling a bit like writing software. Writing instructions and describing hardware are very different, regardless of how much an HDL may try to hide that difference with similar-looking constructs.

As someone who has done both hardware and software for decades, I'd forgotten how much I had internalized until I saw a programmer friend try to optimize the parts count on a circuit with an Arduino, 8 LEDs and 8 resistors.

He knew that the resistor was to limit current, and he knew about parallel and series circuits, so he just used one resistor on the other side of the LEDs instead of 8 of them... and then wondered why the LED brightness changed with the number of lit LEDs.

Hardware isn't the same as software... it's easy to forget that sometimes. 8)
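The single-resistor effect can be checked with a rough calculation. This is only a sketch, assuming an idealised LED with a fixed forward voltage drop; the 5 V supply, 2 V forward voltage and 330 ohm resistor are illustrative values, not taken from the circuit described above.

```python
# Idealised model: each lit LED drops a fixed forward voltage, so the
# resistor sees (supply - forward) volts no matter how many LEDs are lit.
# All component values below are assumptions chosen for illustration.
SUPPLY_V = 5.0         # assumed pin voltage
FORWARD_V = 2.0        # assumed LED forward voltage
RESISTOR_OHMS = 330.0  # assumed resistor value


def current_per_led_shared(lit_leds: int) -> float:
    """One resistor shared by all LEDs: the resistor fixes the total
    current, which is then split between however many LEDs are lit."""
    if lit_leds == 0:
        return 0.0
    total = (SUPPLY_V - FORWARD_V) / RESISTOR_OHMS
    return total / lit_leds


def current_per_led_individual(lit_leds: int) -> float:
    """One resistor per LED: each branch sets its own current."""
    if lit_leds == 0:
        return 0.0
    return (SUPPLY_V - FORWARD_V) / RESISTOR_OHMS


for n in (1, 4, 8):
    shared_ma = current_per_led_shared(n) * 1000
    own_ma = current_per_led_individual(n) * 1000
    print(f"{n} lit: shared {shared_ma:.2f} mA/LED, individual {own_ma:.2f} mA/LED")
```

Under these assumptions each LED gets an eighth of the current when all eight are lit through a shared resistor, which is why the brightness changes; with one resistor per LED the per-LED current stays the same.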

A big difference is also that things aren’t sequential at the gate level. Everything happens at the same time. It’s almost like a massively multithreaded program. Also, propagation delays can make you pull your hair out if you aren’t aware of them when debugging. There’s also a possibility of unclean signals (not Vcc or GND, transition bouncing, etc.).

One level of abstraction further down, gates are analog, not digital. If you don't clock things right, you could end up with huge currents going through two transistors trying to drive something opposite ways, and then lose the chip or battery life, or get an intermittent glitch.

Indeed. The best description that I’ve heard of the process is “work out the hardware you want to exist to solve the problem, and then write the HDL code which will synthesize into that hardware”. In verilog/vhdl there are a lot of ways you can write the HDL which will give correct outputs but synthesize into hardware which is garbage (too big or won’t meet timing). You have to learn the specific patterns to write which hint to the synthesis tool what you would like it to synthesize.

I'm hoping to talk about this in the next installment where I actually talk about the design. It's been really interesting getting a feel for how the processes are different. It's something I've been aware of before from talking to hardware engineers, but I think I'm only truly understanding it now that I'm doing it myself :)

Is anyone around who has implemented a substantial project in nMigen? Looking at the syntax it looks very unintuitive and awkward, mainly due to being shoehorned into Python syntax. Do the advantages of being able to metaprogram in Python outweigh the disadvantages of the syntax? (I’m comparing to a hypothetical HDL which is at a similar abstraction level but has a dedicated syntax and some other way of embedding metaprogramming)

I'm about 20kloc into a (not public) nMigen project, including a UDP/IP stack, various peripheral drivers, a DSP pipeline, and some custom networking code. So far I think the answer to "do the advantages of being able to metaprogram outweigh..." is a resounding "yes", but I came to it with extensive Python experience so it feels like a very natural fit to me, and given how other attempts at embedded metaprogramming go (Tcl...) I'm not sure Python is easily beaten. Being able to integrate with numpy for DSP work, pytest for simulation and testing, Python packaging (such as it is) for distributing modules and managing dependencies and versioning, and simulating inside Python too are all additional compelling advantages.

nMigen is a pretty fledgling project so there aren't a ton of big projects in it yet, but for example Luna[0] (from the makers of the HackRF) implements a full USB stack including USB3 support, with enough abstraction that you can create a USB-serial converter inside your FPGA using two I/Os for full-speed USB and wire it into the rest of your project in about ten lines of Python[1].

[0]: https://github.com/greatscottgadgets/luna

[1]: https://github.com/greatscottgadgets/luna/blob/master/exampl...

Migen, the Python-based project nMigen is based off, has been around for longer and has some large projects, such as LiteX[2] which uses Migen to glue together entire SoCs, including peripheral cores such as GigE, DDR3/4, SATA, PCIe, etc, all written in Migen, and is pretty widely used. It also pulls in Verilog/VHDL designs (such as many of its CPU core choices) since it's easy to pull in those from the Python side.

[2]: https://github.com/enjoy-digital/litex/

I'd be interested in this too. I'm all for improving on the standard HDLs, but from this example I see more downsides than upsides, mostly because the syntax looks very verbose. Part of this is because it's layered on top of Python, so even a simple switch must be written in a more elaborate way. The code shown in the example would be very readable in SystemVerilog and quite a bit shorter. Also, how does something like this handle carry for integer addition, sign extension, etc.?

Carries are documented. Adding two n-bit numbers results in an (n+1)-bit number. The “n+1th” bit is the carry of the result. Sign extension is handled by declaring a `Signal` as signed or unsigned during creation.

For example, both:

    from nmigen import Signal, Shape, signed

    x = Signal(Shape(4, True))   # width 4, signed=True
    y = Signal(signed(4))

will create a 4 bit signed number. Use `False` or `unsigned` for an unsigned one. Setting either x or y to a Python integer will handle the sign extension behind the scenes.
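To make the carry and sign-extension rules concrete, here is a plain-Python sketch of the bit arithmetic involved. It does not use nmigen at all; `add_with_carry` and `sign_extend` are hypothetical helper names for illustration only.

```python
from typing import Tuple


def add_with_carry(a: int, b: int, n: int) -> Tuple[int, int]:
    """Add two unsigned n-bit values; return (n-bit sum, carry bit).
    The full result needs n + 1 bits, and the extra top bit is the carry."""
    mask = (1 << n) - 1
    full = (a & mask) + (b & mask)      # always fits in n + 1 bits
    return full & mask, full >> n


def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Widen a two's-complement bit pattern by copying its sign bit upward."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):        # sign bit set: fill the new bits with 1s
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


print(add_with_carry(0b1111, 0b0001, 4))          # (0, 1)
print(format(sign_extend(0b1010, 4, 8), "08b"))   # 11111010
```

The first call shows 15 + 1 overflowing four bits, so the fifth bit carries the 1. The second shows the 4-bit pattern for -6 widened to eight bits while keeping its value.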

I will agree with you on the syntax, however. It’s caused by the fact that you’re not synthesizing your program, but writing code that generates code.

nmigen is a library, not a language. So it has to use what’s available to it. The upside is you don’t have to write tokenizers, parsers, etc, but the downside is it looks “hackish”.

For example:

    m.d.comb += x.eq(y + 1)
...means: in the combinatorial domain (clockless) of the m `Module`, set x equal to y+1. If you come from a VHDL/Verilog background, nmigen’s “syntax” is pretty off putting, but if you come from a programming background (like me), the Python syntax is easier to grok IMO.
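The "code that generates code" point can be illustrated with a toy sketch. This is not how nmigen is implemented; `Expr`, `Domain` and `ToyModule` are hypothetical names, and the output is only Verilog-flavoured text. It just shows why a Python-hosted HDL ends up with `x.eq(...)` and `+=`: `+` can be overloaded to build an expression tree, but `=` cannot.

```python
class Expr:
    """A node in an expression tree; nothing is computed when it is built."""

    def __init__(self, text: str):
        self.text = text

    def __add__(self, other):
        # `+` is overloadable, so `y + 1` builds a new node instead of adding.
        return Expr(f"({self.text} + {as_text(other)})")

    def eq(self, other) -> str:
        # Python's `=` cannot be overloaded, hence a method call for assignment.
        return f"assign {self.text} = {as_text(other)};"


def as_text(value) -> str:
    return value.text if isinstance(value, Expr) else str(value)


class Domain:
    """Collects statements; `+=` appends rather than doing arithmetic."""

    def __init__(self):
        self.statements = []

    def __iadd__(self, statement):
        self.statements.append(statement)
        return self


class ToyModule:
    def __init__(self):
        self.comb = Domain()


m = ToyModule()
x, y = Expr("x"), Expr("y")
m.comb += x.eq(y + 1)          # records a statement; x is never "set" here
print(m.comb.statements[0])    # assign x = (y + 1);
```

Running the Python program only records a description of the logic, which a later step would turn into hardware.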

Robert Baruch has a nice tutorial on nmigen: https://github.com/RobertBaruch/nmigen-tutorial

Thanks that helps clarify things, also nice to see Robert Baruch's tutorial.

Shameless plug: I had a similar impression when looking at nMigen, but wasn't super happy with Verilog either, so I wrote a new HDL called Wyre [0]. No metaprogramming, but a Verilog-like language with a focus on ergonomics instead. I'm currently making a basic Minecraft clone for the Lattice iCE40 with it.

[0] https://github.com/nickmqb/wyre

Does part 2 exist yet? This is basically an explanation of what FPGAs are and a "hello world" nMigen example.

That said, high-level HDLs like nMigen are very promising for education, and I hope more people take the same first steps as the article's author did. It's a fun field.

And along those lines, here's hoping that MIT updates their free 6.004 edX course to reflect the in-person class' new RISC-V focus soon.

> And along those lines, here's hoping that MIT updates their free 6.004 edX course to reflect the in-person class' new RISC-V focus soon.

While it hasn't been uploaded into Open Courseware yet you can simply get it directly from the web site. Though you "have to be logged in", materials can be downloaded without logging in if you use a search engine to find what you're looking for, e.g. https://6004.mit.edu/web/_static/spring21/resources/fa20/L02... which talks about RISC-V

I'm hoping to write part 2 this weekend or next week -- I realise anyone knowledgeable of logic design won't be super excited by this first installment. It's intended mostly for people who are new like me! I look forward to writing in more detail about RISC-V and my cpu design :)

Looking forward to reading part 2. Can you or someone else recommend some resources/tutorials for learning nmigen? Or are the official docs still the best source?

I've had this recommended and it looks v promising! https://vivonomicon.com/2020/04/14/learning-fpga-design-with...

Someone above has mentioned Robert Baruch too: https://github.com/RobertBaruch/nmigen-tutorial

I also found this helpful: http://blog.lambdaconcept.com/doku.php?id=nmigen:tutorial

And there is of course the IRC channel if you want to ask people questions, #nmigen on irc.freenode.net

The author states the following:

>"Despite being faster than schematics entry, hardware design with Verilog and VHDL remains tedious and inefficient for several reasons. The event-driven model introduces issues and manual coding that are unnecessary for synchronous circuits, which represent the lion's share of today's logic designs."

Can someone say whether "event driven" in the context of an HDL is different from what event-driven means in, say, a JavaScript web app, GUI, etc.? I'm having trouble wrapping my head around it in a digital circuit context. Also, in the second part of that passage, why would manual coding be unnecessary for synchronous circuits?

> Can someone say whether "event driven" in the context of an HDL is different from what event-driven means in, say, a JavaScript web app, GUI, etc.? I'm having trouble wrapping my head around it in a digital circuit context.

I think what they mean is that when you're defining synchronous logic in (System)Verilog or VHDL, the behaviour of that synchronous logic is defined as being in response to an event. For example, look through a Verilog codebase and you'll probably see lots of blocks that look like the following:

  always @(posedge clock) begin
      ... // Some logic
  end

What that block says is that the logic defined inside it - most often writing to some sort of storage element, or sampling a signal - will trigger at every positive edge of the "clock" signal, which is the "event". Usually, people will work in more control signals like a clock enable, reset, etc. to make it do more interesting things.
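As a rough software analogy, the behaviour of such a block can be modelled in a few lines of Python. This is a hypothetical toy model, not generated from any Verilog, and the `Register` class and its `enable` input are illustrative names: the stored value only changes on a 0-to-1 transition of the clock, and only when the clock enable is set.

```python
class Register:
    """Toy model of a flip-flop with a clock enable (illustration only)."""

    def __init__(self):
        self.q = 0
        self._prev_clock = 0

    def tick(self, clock: int, d: int, enable: int = 1) -> int:
        # The "event" is a 0 -> 1 transition on the clock, not the level itself.
        rising_edge = self._prev_clock == 0 and clock == 1
        if rising_edge and enable:
            self.q = d
        self._prev_clock = clock
        return self.q


reg = Register()
trace = [
    reg.tick(clock=0, d=5),            # no edge: q stays 0
    reg.tick(clock=1, d=5),            # rising edge: q becomes 5
    reg.tick(clock=1, d=7),            # clock still high: no new edge
    reg.tick(clock=0, d=7),            # falling edge: ignored
    reg.tick(clock=1, d=7, enable=0),  # rising edge, but enable is low
    reg.tick(clock=0, d=9),
    reg.tick(clock=1, d=9),            # rising edge: q becomes 9
]
print(trace)  # [0, 5, 5, 5, 5, 5, 9]
```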

Conceptually HDL is actually very similar to those cases, but with an important difference: simulated time. In an HDL simulator, the simulator starts executing by running code designated to run at time 0 (in Verilog, this is specified using an "initial" block).

Looking first at combinational logic: As the simulator goes through the "initial" code, it will set variables to new values. These value changes will activate event listeners throughout the code ("always @ *" or "assign" in Verilog), which represent combinational logic. So if variable "myvar" is updated, and it is an input to an adder in some other module, the always statement which updates the adder output will be triggered. Whenever a combinational event is triggered here at time 0, it is "scheduled" to be resolved at time 0 + delta, where delta just represents a time after time 0, but before time 0.000...01.

Alternatively, you can schedule events with a specific delay, such as setting up a clock signal to wait 0.5ns and then toggle. You can then set up event listeners to react to the rising edge of this clock signal ("always @ posedge" in Verilog), giving you synchronous logic.

Typically, a simulation will involve a bunch of setup at time 0, combinationally getting every variable to its initial condition. Then there will be no more events scheduled at the current simulator time, so the simulator advances until the next time it has an event scheduled (such as the clock edge at 0.5ns). The value changes from that event will likely trigger many more combinational events that will be resolved before moving on to the next clock edge.

So, all of the scheduling basically works the same as event driven javascript, the big difference from what I can tell is that events are scheduled relative to simulator time, rather than real world time, and the time doesn't advance until everything scheduled for the current time has resolved. This lets us simulate the massively concurrent nature of hardware even using a single simulator thread.
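The scheduling described above can be sketched as a tiny event queue in Python. This is a simplified illustration, not how any real HDL simulator is implemented: the `Simulator` class, the integer picosecond timestamps and the signal names are all assumptions made for the example. Zero-delay updates are placed in the next "delta" step of the current time, and simulated time only advances once nothing is left to do at the current time.

```python
import heapq
import itertools


class Simulator:
    """Toy event-driven simulator: time is just a number we advance ourselves."""

    def __init__(self):
        self.now = 0                 # simulated time in picoseconds
        self.values = {}
        self.listeners = {}          # signal name -> callbacks run on change
        self._queue = []             # heap of (time, delta, seq, name, value)
        self._seq = itertools.count()
        self._delta = 0

    def on_change(self, name, callback):
        self.listeners.setdefault(name, []).append(callback)

    def schedule(self, name, value, delay=0):
        # Zero-delay updates land in the next delta step of the same time.
        delta = self._delta + 1 if delay == 0 else 0
        entry = (self.now + delay, delta, next(self._seq), name, value)
        heapq.heappush(self._queue, entry)

    def run(self, until):
        while self._queue and self._queue[0][0] <= until:
            self.now, self._delta, _, name, value = heapq.heappop(self._queue)
            if self.values.get(name) != value:
                self.values[name] = value
                for callback in self.listeners.get(name, []):
                    callback(self)


def toggle_clock(sim):
    sim.schedule("clk", 1 - sim.values["clk"], delay=500)   # 0.5 ns half period


def count_on_posedge(sim):
    if sim.values["clk"] == 1:                               # rising edge only
        sim.schedule("count", sim.values.get("count", 0) + 1)


def double_count(sim):
    sim.schedule("doubled", sim.values["count"] * 2)         # combinational logic


sim = Simulator()
sim.on_change("clk", toggle_clock)
sim.on_change("clk", count_on_posedge)
sim.on_change("count", double_count)
sim.schedule("clk", 0)       # plays the role of an "initial" block at time 0
sim.schedule("count", 0)
sim.run(until=2000)
print(sim.now, sim.values)
```

Everything runs on one thread, yet `doubled` always settles to twice `count` before simulated time moves on to the next clock edge.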

When considering how this looks in actual hardware, you can still consider clocked elements as being event driven, but it's not obvious that it makes sense to think of combinational gates that way. Still, the tools are designed to construct a circuit that gives you the same result as the event driven semantics, as long as you meet timing constraints.

I'm not a simulator expert, so I may be slightly off in my explanation, but hopefully that gives you the general idea!

Migen looks good, but avoiding HDLs while trying to learn hardware design is not the best learning approach. They might be ugly languages, but they do force you to think in hardware and gates.

That’s interesting, my first impression of nMigen is that it looks like it does a much better job of forcing you to think in terms of the hardware you’re describing, compared to verilog which can lull you into just writing a sequential program.

If you want to learn hardware design in Chisel (an RTL-level HDL implemented in Scala), I would give this project here a huge recommendation (I'm the author): https://github.com/PeterAaser/RISCV-FiveStage.

It's coursework that takes you from knowing nothing about hardware design to designing your own RISC-V In-Order Five stage architecture. As far as I know a few students have actually done the work to run this on an FPGA, but for the most part you will have the luxury of an emulator, giving you things like stack traces compared to the model execution for all the test programs etc. Execution trace example shown here: https://github.com/PeterAaser/RISCV-FiveStage/blob/master/Im...

Has anyone worked with BlueSpec (https://github.com/B-Lang-org/bsc)? How does it compare to nMigen? I recently heard the Oxide Computer folks talking about it and they were all gung-ho about it.

Is RISC-V actually going anywhere?

Yes. But it's just an ISA, someone has to design good silicon

Last I remember, our main bottleneck is getting data to the CPU fast enough that it's not idling all day, so increasing the number of instructions by a large margin sounds counterproductive. It'll be interesting to see where it goes.

Except that by having fewer instructions in the ISA you may have to increase the number of instructions executed by the CPU, which is what matters for performance/efficiency.

Has RISC-V made the right tradeoffs? We'll have to wait a long time (until someone tries to make a high-performance RISC-V instead of glorified microcontrollers) to know.

Note, though, that RISC-V is a cleaned-up MIPS ISA. Besides being open, it doesn't bring much novelty (nothing to improve security or GCs), with the (significant) exception of the (unstable) vector ISA.
