
CλaSH – From Haskell to Hardware - jdmoreira
http://www.clash-lang.org/
======
adwn
Disclaimer: I haven't read through the details of the language yet.

> _The merits of using a functional language to describe hardware comes from
> the fact that combinational circuits can be directly modeled as mathematical
> functions and that functional languages lend themselves very well at
> describing and (de-)composing mathematical functions._

Whenever I read something like this, I cannot really take the language or the
language designer seriously. The complexity and difficulty of hardware design
is not in the _combinational_ part, it's in the _sequential_ (i.e., state-
carrying) part of the circuit [1]. One of the major drawbacks of Verilog,
SystemVerilog, and VHDL, is that successive sequential statements have to be
translated manually to state machines (at least for synthesizable code –
simulation code does not suffer from this restriction) [2].

[1] Source: I'm an FPGA design engineer with a computer science background.

[2] There are of course languages with an improved design, but nearly all of
them are research prototypes and unsuited for non-toy/example designs. The
more innovative commercial products have hardly any marketshare, because
electrical engineers seem to be extremely conservative [1].

~~~
gluggymug
I used to work in ASIC design and verification. I have to agree with you. I
can't take any of this stuff seriously.

They are all toys aimed at newbies. They address problems that are no problem
at all to a qualified engineer in the profession.

Combinatorial logic is not hard. I would argue sequential stuff isn't that
hard either. It's all undergrad engineering stuff studied in the first year or
two.

The design size of a modern ASIC is at a massive scale that makes the problems
of whichever description language trivial. Even the cheapest stuff has quite a
few modules: memory controllers, CPU, caches, power stuff, bus controllers,
various io modules, debugging modules. Nowadays it's multiple cores. Even the
sub modules are just wrappers around other cores sometimes.

The teams are large, usually in the dozens because the problems are non-
trivial.

This clash stuff is like building a mailbox. Engineers in the profession are
designing and verifying the equivalent of an apartment complex.

~~~
adwn
Thank you for this perspective from the top. Sure, when verifying the final
ASIC design, the problems of the implementation language don't matter much
anymore. _However_, those IP cores have to be written by _someone_, and they
actually get to suffer the deficiencies of the HDL.

> _I would argue sequential stuff isn't that hard either. It's all undergrad
> engineering stuff studied in the first year or two._

So? That's like saying that _"writing software isn't that hard either. It's
all undergrad comp science studied in the first year or two"_, which is
obviously false. If your tools and languages suck, they'll slow you down and
you'll produce more bugs – in software as in hardware.

I often get the impression that many hardware engineers don't even realize
how much their tools suck, because they lack the perspective on the outside
world, specifically the software world, to see better ways of doing things.

~~~
sigterm
I am a pretty junior ASIC engineer who has relatively more exposure to
software world compared to many of my peers.

In my last project I did some IP integration work and the lack of decent
development tools led to significant chores. I did make use of Emacs
verilog-mode like GP mentioned and even wrote some small macros in elisp, but
the overall experience is far from satisfactory. Not to even mention the
usability of proprietary EDA tools and flows. The software world just looks
like heaven with so many awesome development tools available (VS, PyCharm,
IntelliJ, to name a
few I've used). And I can tell the difference because I once managed to
convert a C# GUI software into a console application mainly with the help of
an IDE (Visual Studio), without first learning C#.

Sadly, I think the ASIC (or maybe more broadly, hardware) industry has a
pretty poor ecosystem in general. I'm not aware of a good hardware-focused
community comparable to HN, or of high-quality active Q&A on sites like
stackoverflow; even
the HDL languages look inelegant and not well thought out. And you have a
point, I'm not even sure how many of my colleagues realize that. I want to
make some difference. But I'm not sure where to start...

~~~
adwn
> _I want to make some difference. But I'm not sure where to start..._

1\. Take a promising language, improve it where necessary.

2\. Add excellent support for translation to VHDL and Verilog. The generated
HDL code has to be readable, editable, and it has to reflect the structure of
the original code more or less 1:1. You also need to support "inline
VHDL/Verilog" (like inline assembly in software). Otherwise, your language
doesn't integrate into the ecosystem of synthesis software, simulators,
vendor-dependent Map/P&R, and existing IP cores, which makes it useless in the
real world. This is the main reason why all innovative VHDL/Verilog
replacements have failed so far. Without this feature, there's just no way
your language is going to gain any significant market share.

~~~
MootWoop
2\. I am curious to hear what you think about our solution of wrapping
existing VHDL/Verilog in an external task in Cx? [http://cx-
lang.org/documentation/tasks#external](http://cx-
lang.org/documentation/tasks#external)

~~~
adwn
Very nicely done!

Is there a way to get in contact with you? I suppose you're either Nicolas or
Matthieu?

~~~
MootWoop
Yep, I'm Matthieu! You can send me an email at matthieu.wipliez at synflow.com

------
microarchitect
There's been a few different proposals to use functional languages to write
hardware: Chisel
([https://chisel.eecs.berkeley.edu/](https://chisel.eecs.berkeley.edu/)) and
Bluespec ([http://www.bluespec.com/](http://www.bluespec.com/)) are two
others. They haven't really taken off because the productivity bottleneck in
hardware is not in design but verification and specifically sequential
verification. Combinational verification is quite easy, because modern SAT
solvers can prove almost all combinational properties you can throw at them.

The trouble comes in when you start dealing with complex state machines with
non-obvious invariants. I don't think these functional languages can really
help too much here because unlike C or C++ in the software world, there isn't
unnecessary complexity introduced by a language, e.g. verification becoming
harder due to aliasing. It's the properties themselves that are complex. Lots
of progress is happening though: Aaron Bradley's PDR
([http://theory.stanford.edu/~arbrad/](http://theory.stanford.edu/~arbrad/))
and Arie Gurfinkel and Yakir Vizel's AVY
([http://arieg.bitbucket.org/avy/](http://arieg.bitbucket.org/avy/)) have made
pretty major breakthroughs in proving a lot of complex sequential properties
and these algorithms have made their way into industrial formal tools as well.

------
deadgrey19
Interesting idea. Very similar to Bluespec Verilog
([http://www.bluespec.com/high-level-synthesis-
tools.html](http://www.bluespec.com/high-level-synthesis-tools.html)) which
also builds on a foundation of Haskell to Verilog translation
([http://en.wikipedia.org/wiki/Bluespec,_Inc.](http://en.wikipedia.org/wiki/Bluespec,_Inc.))
Unlike CλaSH, BSV is non-free (CλaSH is using a BSD License) which is a major
(and cool) difference.

Having said all that, I'm currently writing a lot of Verilog for a system
design that I'm working on. I also learned to program Haskell at university,
(although it's been a few years...), so this language would seem PERFECT for
me. But it isn't...

Reading through the CλaSH documentation/tutorial I've found the examples are
baffling. I have no idea what is being written in either Haskell or Verilog
space. The examples seem to be focusing on the language aspects of the tool
rather than how to express actual hardware in it.

It would help me greatly if the authors would go through something like this:
[http://asic-world.com/examples/verilog/index.html](http://asic-
world.com/examples/verilog/index.html) or this: [http://asic-
world.com/examples/vhdl/](http://asic-world.com/examples/vhdl/) and write a
side by side comparison of how I express these fundamental hardware concepts
in this new language.

~~~
MootWoop
The way I see it, it's Register Transfer Level, just written differently.
Instead of writing: always @(posedge clock) counter <= counter + 1;

you write: counter = s where s = register 0 (s + 1)

per [http://hackage.haskell.org/package/clash-
prelude-0.7.5/docs/...](http://hackage.haskell.org/package/clash-
prelude-0.7.5/docs/CLaSH-Tutorial.html)

I agree, it is an interesting _idea_. I suspect that just having "Haskell" got
the link a lot of upvotes :-) As stated on the "Why CλaSH" page, the advantage is
obvious for combinational circuits. But it doesn't seem to help much for
synchronous logic (which arguably represents the majority of hardware
designs). You'll still be writing everything as "how to update register X in
state S".
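The `register` one-liner above can be tried without installing CLaSH. The following is only a minimal sketch that models a signal as a lazy Haskell list, one element per clock cycle. The list-based `Signal` and `register` here are stand-ins assumed for illustration; CLaSH has its own `Signal` type, which is what lets the original write `s + 1` instead of `map (+ 1) s`.

```haskell
-- A signal is modeled as a lazy list: element n is the value at clock cycle n.
-- (Assumption for illustration; this is not CLaSH's Signal type.)
type Signal a = [a]

-- A register outputs its initial value first, then its input delayed by one
-- cycle.
register :: a -> Signal a -> Signal a
register initial input = initial : input

-- The counter from the comment above: the register's output feeds back
-- through an incrementer into its own input.
counter :: Signal Int
counter = s
  where
    s = register 0 (map (+ 1) s)
```

In GHCi, `take 5 counter` evaluates to `[0,1,2,3,4]`, i.e. the register contents over the first five cycles.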

~~~
deadgrey19
Helpful explanation. Thanks! Looking at this I have a very specific question:
from a practical getting-work-done point of view, is this _better_ than what
we have, or just _different_ to what we have? If the answer is "different" I
don't mind, but it will hinder my adoption ;-) I guess this is why I want to
see a side-by-side comparison of real hardware constructs. Things I use
regularly. It would aid greatly in understanding the (potential) benefit of
expressing things these ways.

~~~
MootWoop
Hard to say, it depends on what you consider better. It certainly is more
concise than the equivalent Verilog (just like Haskell is more concise than
pretty much any language I know). This seems especially true when you want to
describe repetitive structures (such as their FIR filter).

CλaSH also has a much better type system than Verilog (again, thanks to
Haskell), but if you wanted a good type system when describing hardware, you
might as well just switch to VHDL ^^

My concern is with the description of state machines. You need to specify if
you want a Mealy or a Moore machine, something that is usually implicit. And
you're still describing the transfer function between states; CλaSH does not
seem to allow you to describe your program in a structured way (such as loop
until x becomes true, wait for 3 cycles, read z, while z > 0 decrement z,
etc.)
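To make the "transfer function between states" point concrete, here is a rough sketch of what "load z, then while z > 0 decrement z" looks like when written as an explicit state update. It uses plain lists as signals; `mealyL`, `countdownStep`, and `countdown` are hypothetical names invented for this illustration and are not part of CλaSH.

```haskell
-- Hypothetical list-based Mealy combinator: signals are lists, one element
-- per clock cycle. The transfer function maps (state, input) to
-- (next state, output).
mealyL :: (s -> i -> (s, o)) -> s -> [i] -> [o]
mealyL _ _ [] = []
mealyL f s (i : is) = o : mealyL f s' is
  where
    (s', o) = f s i

-- "Load z, then while z > 0 decrement z", flattened into one transfer
-- function. The state is the current value of z; the output is a busy flag.
countdownStep :: Int -> Maybe Int -> (Int, Bool)
countdownStep _ (Just n) = (n, False) -- a load request overrides the count
countdownStep z Nothing
  | z > 0 = (z - 1, True)             -- still counting down
  | otherwise = (0, False)            -- idle

countdown :: [Maybe Int] -> [Bool]
countdown = mealyL countdownStep 0
```

Feeding it `[Just 3, Nothing, Nothing, Nothing, Nothing]` gives `[False,True,True,True,False]`. The loop structure of the informal description is gone; what remains is a per-cycle case analysis, which is the concern raised above.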

------
catchmrbharath
This is very interesting. I have always felt that having a language that has
immutable constructs translates very well into a hardware description
language. This is because most of the hardware modules are immutable, except
for state machines, which can be modelled using state monads (or something
similar).

Considering the hardware tooling right now, I would love to see a language
with more abstractions than Verilog or VHDL. I would love to see programming
languages that don't just compile to SystemVerilog / VHDL, but can move
directly to the synthesis step. I have worked with Bluespec and I don't see
that as the successor.

I would say a language like Haskell (or a derivative) can map to hardware
really well, and this is a very good attempt. Please pursue it / turn it into
a product, so that hacking on hardware (an FPGA) is much easier and more
beautiful.

------
platz
the CλaSH vs Lava page was interesting:
[http://hackage.haskell.org/package/clash-
prelude-0.7.5/docs/...](http://hackage.haskell.org/package/clash-
prelude-0.7.5/docs/CLaSH-Tutorial.html)

------
Tehnix
I tried this out briefly while working on a VHDL project some time ago. I
absolutely love how quick it is to run and test your code, compared to the
toolings for VHDL. Also, Haskell's terse syntax is very much a plus! :)

------
d-equivalence
When I was writing the chips from nand2tetris in HDL I noticed how cleanly
they could be expressed with function composition in SML. I dismissed the
ideas as a bit weird, if cool, but after a few weeks I came across HardCaml
[http://www.ujamjar.com/hardcaml/](http://www.ujamjar.com/hardcaml/) and some
references to HML (Hardware ML). Unfortunately only the thesis survives and
not any actual code, but still.

The moral of the story is to take your weird ideas more seriously I guess ;-)

------
krick
I don't know the first thing about hardware design, so it's a good opportunity
to ask a question.

So say it would be something actually useful and after some tinkering I'm
actually generating nice Verilog specifications. Can I apply it somehow if I'm
not working in Intel or something? What can I do with it?

~~~
raverbashing
> I'm actually generating nice Verilog specifications. Can I apply it somehow
> if I'm not working in Intel or something? What can I do with it?

Throw it into an FPGA

~~~
krick
And it would be significantly faster than if I would concentrate on optimizing
the code for, say, executing it on GPU?

~~~
CHY872
Perhaps. Perhaps not. If you have an algorithm that is better implemented in
hardware, you might well see better performance than executing it on a GPU
(for example, if you want to mine bitcoins).

Really, FPGAs are _meant_ for prototyping hardware; you synthesize your ASIC
onto the FPGA and check it for functionality.

Of course, there's a reason you're making that ASIC.

~~~
raverbashing
Or you don't do an ASIC and ship the FPGA. That's common as well (for the big
manufacturers it's cheaper to do an ASIC, for the not so big, it's more common
to sell the FPGA)

------
marvel_boy
Newbie here. From the page it seems that CλaSH can be used for other tasks
apart from hardware modelling; is that true?

~~~
cantankerous
What kinds of other tasks do you mean? Clash's surface language is Haskell,
which is a general purpose programming language.

------
Ericson2314
I've used CLaSH over most of the past year to do all my Verilog assignments
(CS major doing EE electives), and also contributed a bit to the project, so
I figured I'd offer some insight.

For all you saying "but sequential is the hard part": yes, functional
programming is most clearly a smash hit with combinatorial circuits, but CLaSH
shines with sequential circuits too. Basically, time-varying nets are
represented as infinite streams where the nth element is the value at the nth
cycle. Registers are basically `cons`: they just "delay" the stream by one
cycle, tacking on the initial value in front. To make complex sequential code
you just "tie the knot" around the streams with `letrec`s -- which actually
corresponds to exactly what the feedback loop looks like on the schematic.
[Anyone that's done FRP should recognize this idiom.] In this way CLaSH is
both more high-level and more low-level than Verilog/VHDL: clock nets are
derived automatically, but feedback loops are explicit.
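The stream picture can be sketched in a few lines of ordinary Haskell, using lazy lists in place of CLaSH's signals. This is only an illustration of the idea under that assumption; the `Signal` synonym, `register`, and `accumulator` below are made-up names, not the CLaSH API.

```haskell
-- Element n of the list is the value of the net at clock cycle n.
type Signal a = [a]

-- A register is `cons`: it delays the stream by one cycle and puts the
-- initial value in front.
register :: a -> Signal a -> Signal a
register = (:)

-- Running sum with explicit feedback. `acc` appears on both sides of its own
-- definition; that recursive binding is the "knot", and it corresponds to the
-- feedback wire from the register output back into the adder.
accumulator :: Signal Int -> Signal Int
accumulator input = acc
  where
    acc = register 0 (zipWith (+) acc input)
```

`accumulator [1, 2, 3]` evaluates to `[0,1,3,6]`: the register holds 0 in the first cycle, and each later cycle holds the previous contents plus the previous input. The definition stays productive only because the register sits on the feedback path.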

Now if you are an electrical engineer, substituting one tedious task (routing
clocks) for another (programming without "blocking assignment") might seem
like no net gain. But us functional programmers are fluent at working with
such fix-points, and abstracting both what we tie together and the knot-tying
itself. The Moore and Mealy combinators are the tip of the iceberg --
examples that we hope will be more accessible to electrical engineers
unfamiliar with functional programming.

For all you saying that "the hard part isn't working with the HDL at all, but
lower level concerns like timing, layout, etc", I have two things.

First you are acting like HDL writing is not on the "critical path" of your
development process, and thus of no concern. Well that's just not true--you
can't have one engineer do HDL writing, one do layout, and one engineer do
testing completely independently because there are some basic data
dependencies here that linearize the development workflow. It may not be the
component with the "most delay" but it's still on that critical path, and thus
improving it will yield at least some speedup. Automatic layout and timing
analysis is great too, but unless you have a massive amount of computing power
at your disposal, AFAIK you can't get very far, so improving the HDL side of
things might be the /best/ you can do.

Second, there is the development cost of finding all your bugs with low-level
tools. Yes timing analysis is essential, but it's not great in diagnosing the
underlying problem. If you have lots of code that, well, isn't very
aesthetically pleasing, and you do all your debugging on FPGA or with timing
analysis, I suppose just about all bugs look like timing issues. With CLaSH:

\- You have far less code, and it's more high-level, so just reading it
looking for errors is more productive.

\- You can try out your code on the repl, providing a stream of inputs and
getting a stream of outputs. High-level state machine errors (do you really
nail this the first time with Verilog?) are easily caught this way.

\- Because you have more opportunities to modularize your code, you have more
opportunities to test components in isolation. Unit tests vs. system
tests--y'all know the deal. The former is no panacea, but obviously it makes
complete code/path coverage way more tractable computationally.

\- QuickCheck. I generate programs, run my single-cycle and pipelined
processors for n cycles, see if they both halted and compare register/mem
state, otherwise throw out the test. I /suppose/ you could do this with
C-augmented testbenches, but it would be way, way, way more code and effort.
QuickCheck worked so well that I never wrote a test bench.

\- EVENTUALLY, with Idris or [faking it with] dependent Haskell, prove your
circuit correct up to the synchronous model CLaSH is built around.
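A toy version of the "compare two implementations" style of test might look like the sketch below. It is not the processor test described above: the two designs are a made-up combinational function and a two-stage pipelined version of it, signals are plain lists, and a small exhaustive enumeration stands in for QuickCheck's random generation so the example needs nothing beyond `base`.

```haskell
import Control.Monad (replicateM)

-- Signals as lists, one element per clock cycle (illustrative assumption).
type Signal a = [a]

register :: a -> Signal a -> Signal a
register = (:)

-- Reference design: computes 2*x + 1 within the same cycle.
direct :: Signal Int -> Signal Int
direct = map (\x -> 2 * x + 1)

-- Pipelined design: the same function split over two register stages, so its
-- output lags the reference by two cycles.
pipelined :: Signal Int -> Signal Int
pipelined = register 0 . map (+ 1) . register 0 . map (* 2)

-- Property: once the two cycles of pipeline latency are skipped, both
-- designs produce the same stream.
agree :: [Int] -> Bool
agree xs = drop 2 (pipelined xs) == direct xs

-- Stand-in for random generation: every input of length 0..4 over a small
-- alphabet.
smallInputs :: [[Int]]
smallInputs = concatMap (\n -> replicateM n [-2 .. 2]) [0 .. 4]

allAgree :: Bool
allAgree = all agree smallInputs
```

With QuickCheck installed, `agree` has the right shape to be handed to `quickCheck` directly instead of being run over `smallInputs`.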

In practice I can say I honestly wrote and debugged programs all from GHCi
(the Haskell REPL, so very much in software land), and saw them work first
time on the FPGA. Where this didn't happen was usually due to black boxes,
like megafunctions and other components on the dev board. Obviously my Haskell
testing is of no use if I model them wrong in CLaSH.

Finally, it would be dishonest and misleading to not mention CLaSH's
downsides. CLaSH is designed assuming your circuit is totally synchronous (or
purely combinatorial, but that's the trivial case). I don't know how often
this comes up in the real world, but in interfacing with the components on the
dev board, I often had to do things that violated rigidly synchronous circuit
design --- inverting my clock to get a second 180-degree-off clock domain,
asynchronous communication with SRAM. [CLaSH supports multiple clock domains,
but only knows about their relative frequency, not phase.] You can often still
describe these circuits in CLaSH, but because they violate its synchronous
model, it won't understand them and neither will your Haskell-land testing
infrastructure. Basically you lose the benefits that made CLaSH great in the
first place. Fundamentally, I think truly fixing these cases means designing a
lower-level "asynchronous CLaSH" that both normal CLaSH and these cases can
elaborate to. Trying to tack them on as special cases to CLaSH and its
synchronous model won't fly.

But all is not lost: if you can contain the model-violation to one bit of code
and give it a kosher synchronous interface, you are all good. Write some Haskell
to simulate what it does (need not be even in the subset CLaSH can
understand), and make a Verilog/VHDL black box. CLaSH doesn't help you with
that module, but that module doesn't pollute the rest of your program either.
Most real-world designs are by and large synchronous, unless the world has
been lying to me. So the quarantined modules would never form a significant
part of your program.

That about wraps it up, ...hope somebody's still reading this thread after
writing all that.

~~~
gluggymug
Two big questions

1\. Can it do cycle accurate simulation of multiple clock domains?

2\. Can it reuse the verification IP after the design is converted to
Verilog/VHDL?

~~~
Ericson2314
1\. So I've actually never used multiple clock domains with CLaSH. [The
inverted clock I mentioned went to the RAM megafunction, which was
instantiated in Verilog. For testing purposes my RAM (in CLaSH) had zero-
cycle-delay reads, which is what the RAM w/ phase-shifted clock was supposed
to simulate. Also the circuit topology is the same either way (but for the
inverter on the clock), just the circuit "works" for different reasons, and
thus the timing is different.]

I don't quite know what you are asking, but I think/hope the answer is yes.
See [https://hackage.haskell.org/package/clash-
prelude/docs/CLaSH...](https://hackage.haskell.org/package/clash-
prelude/docs/CLaSH-Signal-Explicit.html#g:4)

2\. The QuickCheck testing infrastructure I wrote is all nonconvertible. But
CLaSH has support to make testbenches. Like I said, I never used it because
QuickCheck is oh so awesome, and undergrad projects are small, but see \-
[https://hackage.haskell.org/package/clash-
prelude/docs/CLaSH...](https://hackage.haskell.org/package/clash-
prelude/docs/CLaSH-Prelude-Testbench.html) \-
[https://hackage.haskell.org/package/clash-
prelude/docs/CLaSH...](https://hackage.haskell.org/package/clash-
prelude/docs/CLaSH-Tutorial.html#g:7)

Note that the testBench code works in Haskell too.

~~~
gluggymug
I ask about clock domains because you stated real world designs are by and
large synchronous. The issue is when we have data crossing clock domains we
have a potential area for bugs dependent on the possible combinations of clock
speeds.

It becomes effectively asynchronous because we need to determine when data
from one clock domain arrives relative to the edge of the other clock.

I can't understand the Haskell stuff you are linking to. I don't know whether
it is capable of finding the issues I am talking about.

The second question was just trying to figure whether CLaSH can be used with
Verilog/VHDL in some way. I was hoping against hope that there was a usable
aspect to it in industry.

I can't figure out whether there is though.

The Haskell aspect is not a sweetener. We don't usually study that and it
doesn't look good. You think VHDL looks bad but I think Haskell looks bad.
It's like saying you've been real keen on a new beer made from brussel
sprouts.

10 years ago, people were going on about SystemC. It didn't really catch on
and it was a lot more normal looking.

~~~
Ericson2314
Ok, yeah sorry the docs other than the tutorial assume some familiarity with
Haskell.

2\. What do you mean by "verification IP"? It was that phrase that made me
mention testbenches.

Basically, while CLaSH is hard-coded to understand certain types such as the
Signal type, almost all primitive functions are just defined with Verilog/VHDL
templates which it instantiates. One can write their own templates that work
just the same way. So for any piece of CLaSH-compilable Haskell, you get
VHDL/Verilog for free, and for any piece of VHDL/Verilog, you can use it in
CLaSH by writing some Haskell (with the same ports, and that hopefully does
the same thing) and then telling CLaSH the Haskell is to be replaced with
your Verilog/VHDL.

This is about as good bidirectional compatibility as one can get. Automatic
Verilog/VHDL -> CLaSH compilation would be an improvement, but I don't think
it is possible: I'm not sure to what degree the semantics of Verilog/VHDL are
formalized, and even if they are, there's no way the implementations all
respect those semantics.

The testbench functions are just templated like any other primitive function.

1\. UnsafeSynchronizer "casts" one signal to another -- it's compiled to a
plain net in Verilog/VHDL. At each output cycle, n, it looks at the
round(n*fin/fout) input cycle and gives it that value.
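The sampling rule is easy to try out on lists. The sketch below is a hypothetical list model of that rule only, not CLaSH's implementation: `resample` is an invented name, `fin` and `fout` are assumed to be positive integer frequencies, and ties are rounded upward, which may or may not match what CLaSH does.

```haskell
-- Hypothetical list model of the rule described above: output cycle n takes
-- the value of input cycle round(n * fin / fout).
-- Assumptions: fin and fout are positive; ties round upward.
resample :: Int -> Int -> [a] -> [a]
resample fin fout xs = go 0
  where
    go n = case drop (inputCycle n) xs of
      (x : _) -> x : go (n + 1)
      [] -> [] -- ran out of input samples
    -- round(n * fin / fout) in integer arithmetic, rounding half up
    inputCycle n = (2 * n * fin + fout) `div` (2 * fout)
```

Going to a clock twice as fast, `resample 1 2 [10, 20, 30]` gives `[10,20,20,30,30]`; going to a clock half as fast, `resample 2 1 [0 .. 5]` gives `[0,2,4]`. Nothing in this model captures metastability or phase, which is the sense in which the operation is unsafe.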

Obviously this is unsafe because, as you say, in the real world the problem is
asynchronous. You don't know the exact frequency ratios and phase differences,
nor are they constant, and even if you did you'd get subtle timing errors with
an incoming value that doesn't change on the clock edge.

The trick is it is a pretty basic "black box" to augment CLaSH with, so proper
synchronizers can be written in pure CLaSH. If that's not enough for some
super-asynchronous synchronizer design, one can always fall back on writing
their own black box as described above.

\------------------------------------------------------------------------------------------------------------

I don't think anyone imagines that CLaSH will be immediately understandable to
someone who has never used Haskell. So no way does anyone expect the benefits
will be immediately clear. So are you saying the restrictions I mention sound
too onerous, or are you saying "I dunno, it looks weird"?

If the former, that's perfectly acceptable, thank you for reading.

If the latter, I'm sorry but this is a pet peeve of mine--we get this a lot in
the functional programming community. Understand that we are claiming the
benefits are worth the non-trivial learning curve. If it was so damn obvious,
it couldn't offer much benefit over the status quo---people would have already
switched en masse and it would be the status quo.

While C-esque cuteness looks nice, I agree such things are doomed to failure.
The C model is easy enough to understand, but its linearity, implicit state,
and notion of control flow have nothing to do with the hardware---you can
understand both models, but the compilation process is necessarily non-trivial
and sufficiently "far from surjective" that many designs cannot be expressed
at all, and many more must be expressed through very roundabout means.

Functional HDLs like CLaSH have a dead-simple structural compilation model, so
while they may not understand every circuit, they can express it---the
compiler is near surjective but not homomorphic, counting things like this SR
flip-flop:

    
    
      \ r s -> let q  = nor r q'
                   q' = nor s q
               in (q, q')
    

This compiles to exactly what it looks like, but diverges (i.e. infinite
loops) under Haskell's semantics.
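One way to still simulate such a loop in ordinary Haskell is to break the feedback with a one-cycle delay. The sketch below does that on lists; note that this changes the circuit (it is a clocked approximation, not the combinational latch above), and `delay`, `srLatch`, and the initial value `False` for `q'` are assumptions made purely for illustration.

```haskell
type Signal a = [a]

nor :: Bool -> Bool -> Bool
nor a b = not (a || b)

-- Assumed simulation aid: a one-cycle delay on the q' feedback wire.
delay :: a -> Signal a -> Signal a
delay = (:)

-- Same two cross-coupled NOR gates, but q sees the previous cycle's q'.
-- q' is taken to be False before the first cycle.
srLatch :: Signal Bool -> Signal Bool -> (Signal Bool, Signal Bool)
srLatch r s = (q, q')
  where
    q = zipWith nor r (delay False q')
    q' = zipWith nor s q
```

With `r = [False, False, True, False]` and `s = [True, False, False, False]`, this yields `q = [True, True, False, False]`: set in the first cycle, held, reset in the third, held again. Without the delay, the hold case has `q` and `q'` each waiting on the other with nothing to break the cycle.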

Lastly here is a comparison of writing the same project twice (though the
paper is done by the CLaSH designers)
[http://www.researchgate.net/profile/Jan_Kuper/publication/47...](http://www.researchgate.net/profile/Jan_Kuper/publication/47936996_Comparing_CaSH_and_VHDL_by_implementing_a_dataflow_processor/links/02e7e520c06378cdbd000000.pdf?origin=publication_detail)

~~~
gluggymug
Verification IP is reusable code created by verification engineers. E.g. Say
the designers are developing a networking module. The verification engineers
would build the verification IP to generate the network packets. They also
build the monitors to check the network protocols. For any design under
verification, there is a corresponding amount of verification providing
stimulus and checking.

Check out: [https://verificationacademy.com/verification-methodology-
ref...](https://verificationacademy.com/verification-methodology-
reference/uvm/docs_1.1c/html/files2/intro-txt.html)

The reason I bring this up is: verification is the hard part of the HW
workflow. The other similarly tough part is synthesis. In every single project
I have been in, verification and synthesis are the toughest tasks that consume
the most team effort. Not design. When we plan projects, it all revolves
around the verification task.

For every bus, every module and at various stages of SoC integration, we are
writing verification code using System Verilog.

If you want to improve our tools, I would look at the verification/simulation
or the synthesis side.

