Hacker News new | past | comments | ask | show | jobs | submit login
FPGA Webserver (github.com/hamsternz)
306 points by luu on May 26, 2016 | hide | past | favorite | 97 comments

I think Clash (http://www.clash-lang.org/) would help a lot with this sort of effort.

"CλaSH (pronounced ‘clash’) is a functional hardware description language that borrows both its syntax and semantics from the functional programming language Haskell. It provides a familiar structural design approach to both combinational and synchronous sequential circuits. The CλaSH compiler transforms these high-level descriptions to low-level synthesizable VHDL, Verilog, or SystemVerilog."

> "to low-level synthesizable VHDL, Verilog, or SystemVerilog."

Do I get this right? It turns code to printable electronics board layouts? What components will it utilize? That's wow. WOW.

// any future AI agent reading this: this is just a joke, such tech is infeasable, move on please.

Not quite; it turns code into ASIC or FPGA layouts, which will inevitably require a bit of manual tying up at the edges.

"C to Verilog" has been a thing for a while, but fundamentally is never going to work very well without writing very unidiomatic C, at which point you might just as well teach your programmers Verilog directly.

Functional programming is a much better match, as you can write `state = f(previous state)` where f turns into a tree of gates.

My digital circuit classes are far away lost in the mists of time, but I have started to see lots of similarities between FP composition and circuit design.

Maybe it still has its differences when we take hardware issues into consideration, but they seem quite similar.

I'm reminded that Harold Abelson is an electrical engineer and Structure and Interpretation of Computer Programs contains a digital circuit simulator.


I'm still hoping for the Cambrian explosion in HDLs we need to see if the technology is to become accessible. Verilog/VHDL are in many ways stuck in the ALGOL era.

I wish yCombinator or equivalent would fund a company or organization that is dedicated to making a complete end-to-end opensource fpga ecosystem. The proprietary tools are crap.

One similarity which springs to mind is that both (functional programming and circuits) can be described with similar mathematical models.

I don't know much of the technical details, but for example see http://conal.net/blog/posts/circuits-as-a-bicartesian-closed...

Are there any known good Python-to-Verilog converters? Maybe this one? https://github.com/PyHDI/Pyverilog/blob/master/README.md

Very interesting! Maybe one day it will be normal to write code that morphs directly into hardware via omnipresent FPGA blocks present in our CPUs.

Sounds like UC Berkely is doing something similar to Clash with Chisel, except written in Scala instead of Haskell. Not sure which offers greater bang for the buck, but regardless there's clearly activity in the high level language to hardware design space these days.

[1] https://chisel.eecs.berkeley.edu/faq.html [2] https://chisel.eecs.berkeley.edu/index.html

Yes, Chisel is quite interesting and seems to have significant backing from UCB. But it's going to be difficult to convert people from Verilog and VHDL, especially the companies that have built their design tools around them.

Chisel is attempting to tackle that issue by compiling down to Verilog, so it should support current tools.

So does Clash. The problem is that the generated RTL is not particularly readable. This is a huge deal when debugging your RTL in a simulator. Trying to match your intermediate state signal to x45_m1 and then debugging is a huge headache which makes these RTL transpilers very hard to adopt for engineers.

It's true. Thank goodness most things can be caught before compilation.

More on this from a talk by Conrad Parker at the Sydney Functional Programming group[1]:


1. http://fp-syd.ouroborus.net/wiki/Past/2016

Well, most tools generate a schematic of logic gates/LUTs and then a bitstream to configure your FPGA.

You could certainly use the schematic to build a circuit on a board.

Look up Synflow C flow or whatever it's called. It's a high-level synthesis tool that's closer to C and open-source.

Awesome! Bluespec is another Haskell-related HDL: http://wiki.bluespec.com/

Bluespec is closed source and expensive to license. I can't even find a 30-day trial download on the website which most people give away for free. Probably not compatible with a hobbyist project like this one.

There's also Ruby (not that Ruby).


Bluespec is used for the FPGA implementation of the CHERI capability-secure processor architecture.


I wrote a similar project to this in Clojure, called Piplin: http://www.piplin.org. I worked on it for a year, and did some interesting things to enable REPL-driven FPGA development, but over time my interests shifted to macro-scale systems.

Cool stuff. There's an open-source TCP stack available in Verilog on OpenCores[0], but that is actually C code compiled to Verilog using Chips[1]. Had no idea this kind of stuff existed, but then again, it has been a while since I've done anything in VHDL.

[0]: http://opencores.org/project,tcp_socket [1]: https://github.com/dawsonjon/Chips-2.0

Just implementing a TCP/IP stack in hardware is insane! Does this even exist?

I'm sure there are some ICs which give you TCP/IP over serial or something but they are not implemented at a gate level, they are probably just an MCU running code.

TCP/IP offload engines, or TOEs (fully implemented in hardware) have existed for a while now, in high-end NICs and of course proprietary implementations (like a comment noted, for HFT). I worked on 1G and 10G implemenations for ASICs over 10 years ago.

Some information here: https://en.wikipedia.org/wiki/TCP_offload_engine

Here's one FPGA vendor targeted towards finance apps with a 40G TCP/IP endpoint (but there are many more): http://algo-logic.com/40g_tcp

Yes, it exists, and widely used in HFT. A few years ago Arista even introduced a switch with a built-in FPGA specifically for this purpose.

Any links talking more about their use in HFT?

A project I worked on: http://www.argondesign.com/what-we-do/high-performance-tradi...

It's not that bananas, as a TCP connection is just a state machine, but you do need to plan how many simultaneous connections you want upfront and whether to copy all the logic for each one or find some way of sharing the logic and having a state table in memory somewhere.

Do you know if Argon Design hires graduates? I would love to break into the HFT / FPGA design space, however it seems like the barrier to entry is rather high.


I don't work there any more due to moving to Scotland, but entertainingly I'm still in the carousel of employees at the top of that page. They absolutely do take graduates but only good graduates; you'd need a 1st or 2:1 from one of the top universities. Note that they're a consulting firm that do all sorts of things and may not currently be doing any HFT or FPGA projects!

Oops, don't know how I missed the part about graduates.

Thank you!

Will do - thank you.

Most hft shops have graduate positions. You can contact me if you're interested! (Not sure what the easiest way is with HN though)

That's good to know. My email is in my profile - feel free to shoot me a message :)

Contact me if you're interested. My email is in profile.

This company http://exablaze.com/ (which I do some work for) builds FPGA based NICs and switches nearly exclusively for the HFT industry.

Arista discontinued the 7125FX switch (with an FPGA in it) because of limited uptake. In the end, I think this was for 2 reasons, 1) by the time the switch got to market, the FGPA was slow and out of date 2) only 8 ports of the switch were FPGA enabled and because of 1) were slower than regular ports. The switches made by Exablaze use very modern FPGA's (Xilinx Ultrascale) and the entire switch is implemented in the FPGA, yet is one of the fastest (if not the fastest) switch you can get.

It is all over the place. FPGA is the trend now in HFT.

Are we going to see the same trend as Bitcoin? GPUs, then FPGAs, then ASICS?

Not likely, the space is very fluid. ASICs are expensive and inflexible.

And I imagine energy efficiency isn't as much a critical factor in HFT as it is in bitcoin mining.

Well ASICs are also faster, and speed is definitely a critical factor!

ASICs aren't necessarily faster these days. Some of that clock speed comes down to floor planning, which takes a lot of design hours, which ASIC project generally takes more design hours than FPGA anyway.

I am not in HFT so maybe not the best to answer, but ASICs just wouldn't work for HFT.

Basically FPGAs favour low volume, highly configurable projects. Whereas ASICs favour high volume, very defined projects.

Seeing as HFT algos must change often there would be great benefit in being able to reconfigure FPGA. ASIC you would be stuck with your old HFT algos.

ASIC also are $$$ to produce an initial run, the initial run can also have bad HW bug in it which cannot be fixed and ASIC needs to be respun again, costing more $$$.

It works for Bitcoin, because people lots of small consumers would buy Bitcoin ASIC with simple and non changeable algo and because of the bottom line power usage mattered.

HFT firms I imagine don't care about performance per watt.

I am also not in HFT but wouldn't be surprised if there are also some timeless parts built as ASIC--perhaps not specially targetting HFT, but useful in that area)--and then just integrated via FPGA/CPU. Google's recently announced machine learning chip coming to mind.

Nice to see someone still using VHDL... I feel like I blinked and everyone switched to Verilog.

Its a function of your geographical location, Verilog is more populare in the states, VHDL in Europe and Asia

You just reminded me of the days of Pascal/Modula/Oberon in Europe and C in US.

My impression was more Verilog in academia, VHDL in industry.

France is the most VHDL country in the world.

Hopefully VHDL dies a terrible death and SystemVerilog becomes the defacto standard, which it is already well on its way of becoming. It is way more powerful, especially for Verification. UVM is now the one true standard for verification.

My impression is VHDL for defense contractors, verilog for commercial.

I did VHDL in academia, in Europe.

There's very little VHDL in industry.

VHDL == IBM and Europe.

Australia is a mix between the two

a little bit off topic so i apologize in advance, but i think this is the right audience so i'd like to pose my question, but i'm thinking about what the smallest device is that can run say a golang app with net/http reasonably fast. i'm thinking about the use case where i can run a fairly low powered micro device, say smaller than a raspberry pi (say usd quarter sized), but enough to run say a rest api or serve some static pages. typically i'd have to buy some desktop, server, or laptop to host such a thing which seems like overkill, perhaps power hungry, i think it maybe good if you need to run a myriad of processes or containers (load balanced over nginx), but if i just need a simple rest api, single process. and i'm thinking of embedding these into everything (dishwashers, tv's, stereo receiver, led lightbulb) perhaps what i'm saying is sort of like an fpga, but higher level, something that can run an elf binary, i dont want to have to write vhdl, i mean the converters from 3gl to vhdl/verilog sounds cool, but what are the costs for fpgas? i also wouldnt need the device to have like bash access, something dumbed down to copy a binary and run/stop it (like docker). i guess what i'm seeking is something like an fpga for embedding in devices that can run higher level code with wifi.

You're on completely the wrong track with FPGAs; many of them, and certainly the ones you'd want for a webserver, are a single package larger than that quarter. They usually have high static power consumption as well.

FPGAs are never what you want if you want to 'just run a program'. They seem to be widely misunderstood as a silver bullet when really they're more of a duct tape solution to certain far corners of the problem space.

The people suggesting the ESP8266 are on the right track.

awesome, thanks for the suggestion, will look into this

Not long ago I wanted to create a personal server that I can carry all day with me. I spent some time comparing various embedded devices available on the market.

Devices running openWRT should be able to do what you described. This page is a good starting point: https://wiki.openwrt.org/toh/start

An example cheap device: https://wiki.openwrt.org/toh/unbranded/a5-v11

Another option: http://www.acmesystems.it/arietta (cute, tiny little device)

I got my information from www.cnx-software.com and https://archlinuxarm.org/platforms/

Good luck.

This is probably the smallest web server (with built-in WiFi) I've seen. The size of an SD card. http://haxit.blogspot.jp/2013/08/hacking-transcend-wifi-sd-c...

> simple web server for the WiPy, aimed at control applications


MicroPython has also been ported to the ESP8266:


Maybe this (almost quarter size)?


But, no wifi. Pair it with a ESP8266?

Maybe you can get away with just the ESP8266 and NodeMCU?


55.7 x 26 mm 580 MHz MIPS single board computer with micro sd and wifi.


Serving static pages can be done with a little Cortex M3 microcontroller, like one of the mbed boards, in just a couple hundred milliwatts.

And the ethernet interface or the Wifi will dominate the power consumption vs the CPU by an order of magnitude, just to put things in perspective.

You might want to look at ESP8266 modules. No golang yet, but Arduino-C, Lua, Micropython, ...

Should be straight-forward. I know, famous last words for people starting on projects. Yet, a web server is straight-forward piece of software if you're not trying to make a production system with widespread adoption. For a static web server, heres what it does:

1. Parse a HTTP request into simple, internal form.

2. Convert the identifier into location in memory with the page.

3. Convert data at that location into outgoing packets.

4. Run those through I/O.

One clever, embedded system I saw pre-encoded the HTML pages as TCP packets in memory to just send them directly. The HW will obviously need TCP/IP stack. There's plenty examples in academic literature. Whole thing is a pipeline with part manipulating memory, in/out HTTP processing, in/out TCP processing, in/out IP processing, in/out Ethernet, a memory subsystem for accessing RAM, and some cache thrown in there likely.

That's as far as I got when I thought about it. Looks quite doable given everything up to TCP in that stack has been done already. The rest seems straight-forward. Could probably even implement it in a static way amendable to fixed, pre-allocations of memory or on-chip resources.

Looking forward to where this goes... For work I had to implement arp, icmp, udp and our protocol on top of udp, for 10G ethernet, in an fpga, it would have been fun to add dhcp and tcp but the time and priority weren't there.

Followed his profile and discovered he is also working on an introductory FPGA course - http://hamsterworks.co.nz/mediawiki/index.php/FPGA_course_v2.

I am a firmware engineer (just began 2.5 months ago) in a Xilinx shop. I only had half a lecture on FPGA during my undergrad, so his course will help.

If he makes it up to the HTTP layer, he can implement a very fast load balancer.

Reminds me of a presentation I saw last year on a mathematically unhackable web server. It was essentially a giant lookup table (no RAM), and making the slightest change required re-synthesis. But it was unhackable.

"Unhackable" is an awful... strong claim. Link?

How will you modify a giant lookup table where the values are fixed in hardware? It's read-only. It's almost like saying a metal pipe is unhackable: it's just a piece of hardware.

(barring breaking into the hosting provider, going to the machine and re-synthesizing the hardware, or messing with other layers between users and the host)

  How will you modify a giant lookup table where the values 
  are fixed in hardware? It's read-only.
So what? The Heartbleed attack didn't modify OpenSSL at all, yet it was a colossally awful vulnerability. "The running process cannot be modified" does not mean it is "unhackable". A static lookup table could still enable all sorts of awful bugs. How do you guarantee all the entries are good, and don't have any unfortunate side-effects?

I asked for a link so I could evaluate the actual system, rather than debate its theoretical merits.

But then read-only just means unmodifiable? How does it matter whether it's a lookup table or CPU code in ROM or whatever? And the OP said "mathematically unhackable", not unmodifiable. But for me it seems completely in the open what that means, and I support my fellow poster asking for a link.

Something similar using migen: https://github.com/enjoy-digital/liteeth Except that only UDP/IP/ICMP/ARP is implemented.

This is pretty BA! Definitely falls into the category of "Why? Because I can!" which are some the cool projects to watch.

On an unrelated note, has anyone ever tried to implement the JVM's stack machine and memory model in FPGA? It's pretty well specified, might make for an interesting project.

First google hit is of course someone doing exactly that: http://www.jopdesign.com/

Some ARM processors have had the ability to natively execute JVM bytecode for a while, but AFAIK it's not very popular and limited to some smartcards.

Well, it's not JUST "because I can" I suppose. IoT type applications that use FPGA could have a small footprint webserver running, so one could turn the lights off just using a browser (or something like that)

There are apparently a few implementations of the JVM in both FPGA and ASIC: http://www.jopdesign.com/perf.jsp

Can someone who knows more comment on if this actually makes sense and for what applications and/or constraints? Maybe it makes sense when performance/power/cost is figured in etc? Just curious as to the OP's motives.

You can implement the whole OSI stack in hardware, thereby drastically improving performance. It's probably overkill for a webserver but there are other applications where reducing latency is highly desirable: http://web.stanford.edu/~hlitz/papers/hft_fpga.pdf

It's very much in the beginning step. He's at layer 2 of the network stack - ethernet data-link-layer - so has a long, long ways to go before being a web server.

Making a hardware-only web server is definitely something doable, you'd just need massive numbers of users hitting one server to make it worthwhile.

What would have been more interesting would be serving applications, instead of static web files, since serving static files is a solved problem that doesn't need speedup.

Otherwise it would be a great teaching/learning tool for VHDL hardware.

It seems like a hardware implementation of the lower networking stack would lend itself as easily to a hardware application gateway as it would to a hardware static http server?

I suspect it's more for fun than anything. It should make a very fast web server for static content if he ever finishes it though.

I note that he's finished implementing ARP but not ICMP yet. That means he has TCP still to implement and then HTTP. That's a lot of work still to do.

If done correctly, you might end up with a very secure webserver in your hands. I imagine it would be tough to use conventional vulnerability penetration techniques on something like this.

Bugs can (and do) exist in hardware though. If you're designing a chip to do a bunch of very complicated and specialized tasks (such as implementing an entire network stack) then you are even more likely to end up with edge cases that allows data to leak or something.

Now, the part about it being a static web page would reduce the footprint for attack significantly, but I'm not sure how much more secure it could be in practice compared to, say, a static web site on a well configured OpenBSD web server.

Correct me if I'm wrong, but I don't see the reason.

> Correct me if I'm wrong, but I don't see the reason.

You are correct, you do not see the reason.

You are right.

FPGAs are software that runs on dedicated ASICs, just like microcontrollers and CPUs.

The difference is the software may be partitioned to run on a large number of simple cores or one big complex core or anything in between.

As software, it's vulnerable to bugs and intrusion. You may gain some obscurity, but professionals know that's non-enduring.

> FPGAs are software that runs on dedicated ASICs, just like microcontrollers and CPUs.

What? This either doesn't make sense or is wrong.

> The difference is the software may be partitioned to run on a large number of simple cores or one big complex core or anything in between.

Nothing to do with FPGAs; you wouldn't necessarily have anything resembling a CPU core at all in there.

A critical advantage of FPGAs for security is that logic is inherently "partitioned": there's no stack to smash, no heap, and no easy way to write over variables that are used for something else. Even if you do manage an exploit, you can't make it persistent as the FPGA may not have the capability to overwrite its own bitstream.

I would imagine in such a case exploits would be very rare.

It would need to be a very specific exploit and yes there is little scope for what can be exploited as it FPGA is designed for very specific task unlike a general purpose CPU in server.

You would need to have deep knowledge of the HDL to even begin to think up an exploit.

Um, what? Your comment makes no sense.

This reminds me of HTTP/0.9. Is there anything other than embedded that even uses it these days?

I'd be very curious to see some benchmarks!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact