How FPGAs work, and why you'll buy one (2013) (yosefk.com)
109 points by mpweiher on April 16, 2015 | 107 comments



The major issue with FPGAs today is that there is no free toolchain for compiling the bitstreams that run on them.

The only project I know of to make any real progress on fixing this is fpgatools. [0] It supports a single model of the Xilinx Spartan-6 series, the XC6SLX9. I know almost nothing about FPGAs, but seeing as how almost no one is working on this problem, I figured I'd try to add support for the XC6SLX45, the model used on the Novena motherboard. So far I've added the C code to represent all of the pins. [1] Unfortunately, I haven't heard anything from the maintainer about how to proceed further.

fpgatools is only a single piece of the puzzle, though. It provides libraries to build the low-level bitstreams, but we'd still need Verilog/VHDL implementations that use it, as well as replacements for the proprietary "soft cores" that most people are using. The amount of work required is intimidating. Not having an HDL readily available did motivate me to have some fun rewriting the example C programs in Scheme, though. [2]

Does anyone know about other such efforts to free FPGAs? I'd be very grateful to hear about them.

[0] https://github.com/Wolfgang-Spraul/fpgatools

[1] https://github.com/Wolfgang-Spraul/fpgatools/pull/8

[2] https://gitorious.org/davexunit/guile-fpga


My roommate is working on this problem: http://diamondman.github.io/Adapt/ https://github.com/diamondman/Adapt

I don't understand this stuff well, so I'll try to get him to comment here.


I disagree strongly. The major issue with FPGAs today is that they cost money (because they are chips). They cost even more than your CPU. So you can't treat them like software. To put it in simpler words: everyone has access to a CPU, less than 1% have access to an FPGA. Fix that* and the tooling will follow.

* one way to make FPGAs "free" is to incorporate some FPGA blocks in an Intel CPU. Then it will feel free.


It's the programming model, not the FPGA itself, that people are asking to be free (as in freedom, not cost). Comparing to the world of software: the FPGA is analogous to the CPU, the programming model is analogous to the instruction set, the toolchain is analogous to the compiler, and the hardware description language is analogous to source code. The freedoms that apply to software/compiler/information can equally be applied to their analogues in the FPGA space.

Incidentally, FPGAs start at US$1.24 in single unit quantities from Digikey, and go up from there. US$6.75 will buy a "proper" logic cell based FPGA. They are no more expensive than a CPU. A Zynq development board, which has 2 x ARM cores with integrated FPGA, can be had for US$119 [1].

https://adapteva.myshopify.com/collections/featured-products...

--- Edit: fixed price on dev. board


Intel CPUs are anything but free. They contain nonfree microcode and other nastiness, so they would be unsuitable for free hardware projects.


I think the grandparent is referring to 'free as in beer.' It doesn't cost anything to tinker with programming because almost everyone already has a computer they use for other things. Not so with FPGAs.


PART 1/3

The lack of open tools is a huge problem. My friends on IRC are working on reverse engineering multiple chips enough that full compilation tool chains can work. My part of the project is working to make a highly generic method of loading the compiled files onto the actual chips no matter the programmer or chip used.

I will expand on the steps of compiling for an FPGA, loading a program onto an FPGA, the difficulties we face in making our own tools, and then talk about my project a little (which cgag was kind enough to name drop). As you can see from the post size this will be very long but should be very educational.

First, there are more parts to the compilation than just bitstream generation, and there are other tools in the toolchain required after bitstream generation is done.

The compilation is split into the following parts.

HDL SYNTHESIS HDLs can be used for ASIC design, FPGA design, and more general work like mathematical proofs that never intend to run on real hardware. This step takes in an HDL program and calculates all the gates required. The output, called a netlist, is more of a mathematical construct of connections over perfect wires (everything is ideal, i.e. not realistic). This step should also detect repeated gate patterns in the netlist and de-duplicate them (this is crazy important since it can shrink a netlist substantially). Commonly, tools that implement this step will produce everything in one gate type (like NAND http://en.wikipedia.org/wiki/NAND_logic). There are two major open source tools: Icarus Verilog (http://iverilog.icarus.com/), used more for the math proofs and simulation but not intended to generate information for real chips, and Yosys (https://github.com/cliffordwolf/yosys), a much newer tool that is made to be part of a real-life FPGA toolchain once the next tools are available. It only works with Verilog (more popular in industry).
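To make that concrete, here is a minimal sketch (the module and signal names are made up) of the kind of behavioral Verilog a synthesis tool such as Yosys takes as input; the '+' and the if/else below get flattened into a netlist of generic gates plus flip-flops:

  // Hypothetical synthesis input: behavior in, netlist of gates out.
  module counter (
      input  wire       clk,
      input  wire       rst,
      output reg  [3:0] count
  );
      always @(posedge clk) begin
          if (rst)
              count <= 4'd0;
          else
              count <= count + 4'd1;   // becomes an adder + 4 flip-flops
      end
  endmodule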

TECHNOLOGY MAPPING The link OP posted shows how the FPGA has specialty hardware peppered in with the general gates to speed up certain operations. The link briefly talks about the truth tables in the LUTs, which means that each LUT is not really just a gate; each is a box that can be configured to implement one of many common sequences of gates to save space and increase speed. Each FPGA can be implemented differently, and since the synthesis step outputs a netlist usually in a specific gate type, the gates of the netlist often do not match the gates that can be efficiently implemented in the LUTs. This tool is the first one that has to have deep knowledge of the FPGA's underlying technology (often called the fabric) so we can convert the netlist into something that could in theory be put on the target FPGA. Synthesis tools do not implement this step since it is FPGA specific.
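A hedged illustration of the LUT point (names are made up): the function below looks like three separate gates in a NAND-style netlist, but since it depends on only four inputs, a technology mapper can pack the whole thing into a single 4-input LUT whose 16-entry truth table is computed up front.

  // Illustrative only: three "gates" in the netlist, one LUT4 on the chip.
  module lut_example (
      input  wire a, b, c, d,
      output wire y
  );
      assign y = (a & b) | (c ^ d);   // any 4-input function fits one LUT4
  endmodule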

PLACE AND ROUTE After we generate a netlist of realistic gates, we have to figure out how to fit them onto the physical chip with enough room to connect to each other. This by itself is a VERY hard problem and takes most of the time of the full compilation. It is made more difficult by the fact that it requires extremely specific information about the FPGA's fabric, to know not only what paths are available, but also to make sure that the paths between LUTs are not too slow, so we can send signals around the chip fast enough. (Programs for FPGAs are not like programs for CPUs, where each instruction happens one after the other. In an FPGA all the gates are running at the same time, so we have to account for the time signals take to propagate, give the circuits time to settle within a clock cycle, and move LUTs around if our timing is not met. This timing work happens inside a CPU too, but the CPU designer already got it working so our instructions run in a predictable way.)
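One common way designers help timing closure when place-and-route reports a failing path (a hedged sketch with made-up names; this is something the designer does, not something the tools do for you) is to break a long combinational path with an extra register, trading a cycle of latency for shorter register-to-register paths:

  // If the multiply-then-add path is too slow for the clock, the pipeline
  // register 'product_q' splits it into two shorter paths, giving
  // place-and-route more timing slack at the cost of one cycle of latency.
  module pipelined_mac (
      input  wire        clk,
      input  wire [7:0]  a, b,
      input  wire [15:0] c,
      output reg  [16:0] result_q
  );
      reg [15:0] product_q;
      always @(posedge clk) begin
          product_q <= a * b;           // stage 1: multiply
          result_q  <= product_q + c;   // stage 2: add, one cycle later
      end
  endmodule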

BITSTREAM GENERATION The final step of the compilation toolchain is to take the LUT configuration and routing information from place-and-route and convert it to a binary file the FPGA can actually 'run'. Like place-and-route, this step is highly FPGA dependent. We already know the configuration of the FPGA we want from the last step, so this step is just converting that configuration into the binary file format for the FPGA. This format is also kept secret.


PART 2/3

LOADING or 'FLASHING' With our final bitstream file in hand, we obviously want to shove it into an FPGA. But how? Anyone who has programmed microcontrollers directly knows you need a box that lets your computer talk to the chip. For Arduino this is some SPI setup, but most FPGAs and ARM cores support JTAG (http://en.wikipedia.org/wiki/Joint_Test_Action_Group). JTAG was intended as a method to do unit tests on highly complicated boards, but its easily expandable architecture let people do what they wanted, including programming their chips. Unfortunately, each company implemented programming differently, and while most companies produce chips that comply with JTAG signaling, each company, and sometimes each chip, handles programming its own way. A new standard, IEEE 1532, was created to mitigate the damage of nonstandard interfaces. IEEE 1532 specified a format for people to write up the random nonstandard sequences required for chip configuration so they can be interpreted and run, but sadly the standard suffers from the common problems of electrical engineers writing specifications for software. The result is a standard too loose to really standardize anything, with tool writers often having to do custom work per chip, using the chip's 1532 spec as a guideline.

CHALLENGES In order to build tools for the TECHNOLOGY MAPPING, PLACE-AND-ROUTE, and BITSTREAM GENERATION steps, we need thorough documentation on the layout and configuration format of the FPGAs we are targeting. This information is unavailable and aggressively kept secret. One engineer bothered Xilinx until they told him that in order to receive the information his company would have to be making them several million dollars a quarter, and even then he would only get it under an NDA. I have many opinions on how counterproductive it is to keep this information secret and how it does not actually help Xilinx maintain competitiveness, but that is a different post. The FPGA manufacturers are not only unhelpful, they do not want there to be open tools. I am honestly not sure why, since Xilinx gives away their compiler for free. They do sell the professional version, but Xilinx admitted that they only made several million from software sales, and the cost of the engineers writing it must have been on the same scale (they made 14 billion last year). I would expect them to be indifferent to open toolchains, but they actively work against tools that have been made. The licensing language in their compiler says you are not allowed to reverse engineer the tool (honestly not a big problem since the code is terrible), but you are also not allowed to look at the output bitstreams to try to figure anything out about the chip. If you think this is an idle threat, they have sent cease and desist letters to at least one toolchain project that got pretty far reversing the bitstreams. It is likely that this part of the license would not hold up if challenged in court, but as I said above, Xilinx made 14 billion last year, and no engineer has been able to challenge them even if they are right. I hope to get help from the EFF in the future to challenge and strike down this surprisingly common clause.

WHAT IS BEING DONE Several of my friends on IRC cut FPGAs open and use high-power optical microscopes as well as electron microscopes to capture each layer of the chip's layout. At least one is currently working on a computer vision tool to automatically map the chips and produce Verilog code describing the FPGAs themselves. If this works well, we should be able to get enough information to build basic tools for each chip the whole process is applied to.


PART 3/3

MY PROJECT: I want to pave the way for FPGAs to be usable in everything from laptop/desktop computers to phones (if the static power consumption gets better). I have some very interesting ideas of what an average user could do with in-system FPGAs that can be reconfigured at runtime.

My main target is fixing the huge tangled JTAG mess. I mentioned earlier that even though the JTAG standard is pretty solid, everything built on it and around it is highly nonstandard. There is one more problem: there were never standards addressing how to build the boxes that let your PC talk JTAG, since that is not the IEEE's area. Each company has its own JTAG controller, which is often USB. Each controller only works with the company's software and only with the company's chips. The good news is that there is no physical reason these controllers can not talk to all JTAG devices. Each of them has a custom USB protocol for making the box talk JTAG, and only the company's software knows how to talk to it. The limitation here is the software on the PC and not the controller itself. All we need is documentation on each controller's USB protocol, which is where my project started.

As a final note on JTAG controllers, almost all of them require being loaded with firmware every time they plug in. This means that you have to have the proprietary firmware available. But since this firmware is not allowed to be distributed by anyone other than the company that created it, even if we know the secret USB protocol, we can not use these controllers with our own open tools out of the box without writing custom open firmware for each one.

I first started documenting JTAG controller protocols. This is available on my GitHub (though it needs to be cleaned up). Next I started work on an open tool for talking JTAG to all chips independent of what JTAG controller is used. I only support a few chips for now, but all you have to do to add support for another JTAG controller is write a driver, and all supported chips will work with it (independent of manufacturer). To address the proprietary firmware issue, I started writing my own open source replacements and have a stable (though limited functionality) version for the Xilinx Platform Cable controller available on GitHub, and I am working on firmware for Digilent boards (including the Atlys). My next step is to rewrite my tool in something other than Python (likely C, though I have a fantasy of Rust). The future version will not require controllers to be USB (and will support controllers that use ethernet, PCIe, etc.), will support more than just JTAG (Debug Wire, SPI, etc.), and will likely be a daemon that keeps track of all JTAG hardware. I want to see tools like OpenOCD (which provides GDB interfaces to embedded systems, etc.) replace their dated controller architecture with calls to my daemon. I want to unify how all of these devices are communicated with, so that when we get configurable FPGAs on the PCIe bus, the kernel or user software can configure them as needed.

NOTES: The mathematical theory behind the de-duplication step of SYNTHESIS, the TECHNOLOGY MAPPING step, and the PLACE AND ROUTE step, and how to implement it in software, is taught in the VLSI CAD Coursera course: https://www.coursera.org/course/vlsicad

http://diamondman.github.io/Adapt/ - My initial configuration tool

https://github.com/diamondman/adapt-xpcusb-firmware - Firmware for the Xilinx Platform Cable


Brilliant writeup, thanks! Please publish this somewhere more permanent and resubmit so it gets more attention (and hopefully discussion).


My friend put it up on his blog until I get around to making mine again. http://curtis.io/others-work/open-tooling-for-fpgas


Just implement an FPGA in Verilog, then use that virtual FPGA as your target. It has the added bonus of making your bitstream portable across Altera, Xilinx, Lattice, etc. One could use a genetic algorithm to map the search space so that the virtual FPGA compiles cleanly using the available resources of the hardware.


I did it (while playing with reconfigurable dataflow CPUs), and the timing was (predictably) awful.


I could see cases where there is an impedance mismatch between the guest FPGA and the host FPGA it runs on. The idea of using a genetic algorithm to map out the search space is to find some identity-function HDL that allows for a clean 1:1 or 2:1 mapping of resources without incurring some combinatorial inefficiency.

Maybe c-slowing could help with the timings, even if the area used was 4x it would still be a great research platform for reconfigurable computing since the bitstreams are proprietary, not documented and not portable.

http://blog.notdot.net/2012/10/Build-your-own-FPGA


Wow! How did I manage not to know of this project?!?


Maybe you also don't know about this project?

http://www.clifford.at/icestorm/

They've had some interesting progress lately:

http://hackaday.com/2015/03/29/reverse-engineering-lattices-...


Yep, hadn't heard of this one either. I'm more interested in the Xilinx chips at the moment, but may take a look at the Lattice stuff later.


Want to help with the XC6SLX45? :)


Maybe. I've got an Atlys board (but I'm not sure I'm ready to brick it yet).

Currently I'm more interested in trying to use it with the existing open source place&route tools ( https://code.google.com/p/vtr-verilog-to-routing/ ).


I think a lot of people posting here don't realize that programming an FPGA means programming how the gates are set up on the FPGA, not software programming.

VHDL/Verilog etc. are descriptive languages, not software languages a la C/Python/Java/etc. You should have enough hardware design knowledge (at the RT level) to be able to sketch your HW design on a piece of paper, and then you are ready to write VHDL/Verilog and put it onto the FPGA.


While this is sort of true, this attitude contributes to the problem another commenter mentioned about how working with FPGAs is like taking a trip to the 70s. There's no reason you can't use a full-fledged programming language like Python to specify your hardware with the strict HDL subset at the RT Level, then use the full language's power to test it off-FPGA, and use all the software tooling around your full language to make the whole experience as pleasant as possible. In fact, that's the approach of MyHDL: http://www.myhdl.org/


Yup, made an experiment with it -> http://www.eetimes.com/author.asp?doc_id=1323837

I really like MyHDL!


I disagree that you need that much hardware knowledge. I also think that statements like these can be discouraging to people that don't have a background in hardware. That's a shame because for me playing around with HDLs was one of the most mind opening experiences for a software developer.

I wrote tetris in Verilog that outputs to VGA[0]. I definitely encourage anyone interested in HDLs and FPGAs to start learning.

[0] https://github.com/jeremycw/tetris-verilog


I'm sure a lot of knowledge of hardware design helps, but it's not necessary. I've toyed around with VHDL on a DE0-Nano board, and while I have a lot of C experience I have none with HW design.

I can still program enough to do simple things, e.g. sample an analog temperature sensor and drive a simple LED display that shows the current temperature, the average over the last hour, etc.


I would argue that for the job you are aiming at, you could do it more than fine with a microcontroller like a $3 ATmega or an Arduino. You would program it in C, and you don't need a $60 FPGA plus $200 for the FPGA programming cable.

If you are learning, meddling with FPGAs your way is awesome. But nobody is going to use that approach for serious work on production.


While I do have an EE degree, I play with FPGAs fairly frequently and rarely think about the hardware that I'm laying out.

In my experience, unless you are building something either very large, where you need to worry about your gate count, or something very time sensitive, where you need to worry about component distance, you can just program and enjoy real concurrency. :)


But the point of FPGAs is to be cost effective. They are a substitute for custom PCB designs with RT components and somewhat low production numbers (500-5,000 devices, depending on the FPGA, etc.).

Either you build something big enough that it doesn't make sense to make a PCB + components, and mass produce them on a medium level using FPGAs, or you aren't using FPGAs for what they are. It's fine if you are learning and toying with them, we like to hack on things, but you are never going to use them that way in a job, in production.


Another way to think of register transfer level hardware description languages is as a type of dataflow programming language.


Our experience with GPU cloud computing at Graphistry should be pretty representative. We spend a lot of effort getting subsecond interactivity in funny C dialects (OpenCL/CUDA). To push latencies down further, we can put together a few GPUs and reuse most of the code. Eventually, however, data communication costs get too high, so FPGAs would be the next step. That is certainly doable: OpenCL -> FPGA compilers are a thing!

So the question is who has already been going down that pathway, and I'll leave that as an exercise to the reader :) My guess: that'll stay in private hands for awhile, and after a few years, as FPGAs get put into public clouds and everyone goes down the same pathway, a lot of people will be using FPGAs, even if indirectly.


With OpenCL + FPGA, what you do is basically instantiate a GPU architecture inside an FPGA and then execute the OpenCL algorithm on that.

I think that's cool but you aren't getting a better throughput than what you would get with a GPU.

The better solution would be to do your custom optimised control/data path for the FPGA architecture you are currently using (VHDL/Verilog).


That's not how it works, actually. The compiler generates the dataflow directly in hardware.


>>Eventually, however, data communication costs get too high

Looks like the program is spending too much time in 'MPI_RECV'? Clearly the same time would be spent by any other hardware device, and raw compute power won't fix this. You need a faster interconnect. Indeed, among the few fixes for this problem are complicated changes to the algorithm, such as maintaining some kind of network routing scheme, which often involves complicated logic difficult to implement on anything but a CPU.


The trouble I had with FPGAs is how radically different languages like Verilog and VHDL are when compared to languages like Java, Python, or C++.

Writing code that is concurrent by default is a massive paradigm shift when all I had been exposed to at that point were procedural languages.


That's because hardware description languages are not programming languages. This point was constantly emphasized in my digital systems courses. You do not describe a sequence of instructions using an HDL, you describe the layout of a circuit with registers, wires, and logic blocks. You have to consider the physics of the device to avoid violating timing constraints or driving a signal from two different sources, etc.


Hardware description languages are not imperative programming languages (even if they might superficially resemble them). That doesn't mean they aren't programming languages at all -- they just belong to a different category of language.

(I suspect that there's a connection between hardware design and functional programming, but I don't have enough hardware design experience to know how closely the two are related.)


Conal Elliott, Haskeller of FRP fame, was recently working for a company where he was developing a compiler from functional programs into categorical semantics, which is actually quite neatly similar to the layout of a circuit. The company went under, unfortunately, so I'm not sure of the status of his research, but some remarks are available on his blog [0] [1] [2].

Further, there is the notion of a Generalized Arrow [3], a useful, advanced functional programming technique which is quite nice for implementing FRP. These are somewhat obviously "wiring diagrams" but are shown to be in correspondence with a more normal "lambda calculus"-like syntax.

[0] http://conal.net/blog/posts/haskell-to-hardware-via-cccs

[1] http://conal.net/blog/posts/overloading-lambda

[2] http://conal.net/blog/posts/optimizing-cccs

[3] http://www.megacz.com/berkeley/garrows/


There's also CλaSH [0], based on Haskell and developed at the University of Twente, which has the same goal.

[0] http://www.clash-lang.org/


I'd rather classify HDLs as dataflow languages.


They are programming languages, but you don't program the FPGA device with them. You program a generator that runs on your CPU and creates the physical layout for the FPGA device as its output.


Of course they are programming languages. Anything with a well-defined semantics is a programming language.


[deleted]


And why is setting up LUTs and interconnect not programming?


(Apologies for deletion. I moved my comment to parent. Won't try that again.)

It is, but on a different level. I think the analogy would be that in VHDL/Verilog you are dealing with a template or macro language (which is executed to generate code) as opposed to a low-level language (which translates almost directly to machine instructions).


I'm not sure the presence of a stream of instructions, or anything comparable, is a requirement for a definition of "programming". You can program DNA, or a slime mold: http://www.wired.com/2013/06/slime-mold-computers/

Or even crabs: http://hackaday.com/2012/09/28/making-logic-gates-out-of-cra...

Anything which can be used to implement any kind of computing, in a broad sense, can be programmed, as long as it's done using a well-defined formal language.


Or a carnivorous plant! http://www.dailymotion.com/video/xbwqej_david-naccache-quel-... (sorry, it is in French).


Try to start modeling your problem as a state machine. Separate the combinational and the sequential parts into two processes. The flow becomes quite similar to regular embedded system software (which can be programmed using the state machine pattern as well).

Say, e.g.:

  state read_input =>

    reg_a <= input

    next_state <= multiply_input_by_10

  state multiply_input_by_10 =>

    reg_b <= reg_a * 10

    next_state <= send_to_output

  state send_to_output =>

    finish_reg <= '1'

and then parallel to that you would have

  finish_wire <= finish_reg

  value_wire <= reg_b

or something like that. Sorry for the bad example, but definitely use state machines.
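For reference, a minimal synthesizable Verilog sketch of the same pattern (signal and state names are made up to mirror the pseudocode above, collapsed into a single clocked process for brevity):

  module times_ten_fsm (
      input  wire        clk,
      input  wire        rst,
      input  wire [7:0]  in_value,
      output reg  [11:0] out_value,
      output reg         finished
  );
      localparam READ_INPUT = 2'd0, MULTIPLY = 2'd1, SEND_OUTPUT = 2'd2;
      reg [1:0] state;
      reg [7:0] reg_a;

      always @(posedge clk) begin
          if (rst) begin
              state    <= READ_INPUT;
              finished <= 1'b0;
          end else begin
              case (state)
                  READ_INPUT: begin
                      reg_a <= in_value;        // latch the input
                      state <= MULTIPLY;
                  end
                  MULTIPLY: begin
                      out_value <= reg_a * 4'd10;
                      state     <= SEND_OUTPUT;
                  end
                  SEND_OUTPUT: begin
                      finished <= 1'b1;          // signal completion
                  end
                  default: state <= READ_INPUT;
              endcase
          end
      end
  endmodule

The single always block is just one common style; the point is that the state register and the per-state assignments map directly onto flip-flops and LUTs, which is why the state machine pattern translates so naturally to hardware.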


I started writing a (hopefully) better and clearer tutorial on what HDL is, aimed at programmers. Feel free to contribute at https://en.wikibooks.org/wiki/Programmable_Logic/Verilog_for...


Sounds like flow-based-programming or even FRP (e.g. netwire)


I think it is very important to understand that writing Verilog or VHDL isn't like writing code in a programming language. Programming languages provide instructions to a fixed state machine. With hardware description languages (HDL) you are describing the function of hardware. There are no instructions. It is hard to make good designs without understanding the underlying building blocks of hardware designs.


I have mostly the same experience as a software engineer playing with FPGAs. I don't buy the argument that Verilog isn't a programming language - there are still plenty of software principles that apply to an HDL, and we should take advantage of those. There are interesting differences from software - we're used to sequential software being easy and parallelizing it being hard. In an HDL, parallel is easy but sequential logic is much harder, requiring state machines. Writing testable software was historically hard, but test cases were cheap. It is easy to write a testable Verilog module, but fairly expensive to write test cases compared to software.
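To illustrate the "testable module, expensive test cases" point, a minimal self-checking Verilog testbench might look something like this (the adder under test is a made-up stand-in); note that every stimulus/expected-value pair still has to be written or generated by hand:

  // hypothetical device under test: a simple 8-bit adder
  module adder (input [7:0] a, b, output [8:0] sum);
      assign sum = a + b;
  endmodule

  `timescale 1ns/1ps
  module adder_tb;
      reg  [7:0] a, b;
      wire [8:0] sum;

      adder dut (.a(a), .b(b), .sum(sum));

      // drive one stimulus pair and compare against the expected result
      task check(input [7:0] x, y);
          begin
              a = x; b = y; #10;
              if (sum !== x + y)
                  $display("FAIL: %d + %d = %d", x, y, sum);
          end
      endtask

      initial begin
          check(8'd1,   8'd2);
          check(8'd255, 8'd255);   // exercises the carry out
          check(8'd0,   8'd0);
          $display("done");
          $finish;
      end
  endmodule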

Software tools are also way ahead of FPGA tools. I'm not sure how folks that develop FPGA designs professionally retain their sanity after using the tools...


That might be a barrier to entry, but it's a good thing for your growth as a programmer.


Right now, FPGAs deliver slightly better perf/W than GPUs, but significantly inferior FP32 performance. That said, they do shine for simple embarrassingly parallel tasks assuming the task does not require FP32 operations. A good example here is bitcoin mining.

That changes this year with the release of Altera's Arria 10, which comes with 1.3 TFLOPS of hardened FP32 math units. The other gamechanger is the ability to write kernels in OpenCL rather than VHDL.

However, the build time for OpenCL on FPGAs is still many hours, compared to tens of seconds for GPUs, so my guess is that GPUs will remain the development platform for FPGA code for the foreseeable future, with deployment to the FPGA happening perhaps daily.

Drifting a little off-topic, the SDKs and libraries for all the accelerators should be free. That's what makes me such a CUDA fanboy. I'm looking at you, Intel. Charging for your Xeon Phi/CPU OpenCL compiler? Bzzzzt, wrong...


A sequel is in the making, titled "Why you won't buy an FPGA"

Did the followup article ever get written? I didn't find anything with a search for the proposed title on either his site or Google.


The reason is because GPUs gained more general-purpose capabilities. GPUs are the new FPGA, for all practical purposes.


Author here.

I didn't write the follow-up yet because it requires a lot of research and I'm busy :-)

If you ask me, GPUs are anything but "the new FPGA". GPUs are the least efficient hardware accelerator out there, but also the most accessible to the largest number of programmers. FPGAs are much more efficient than GPUs on DSP workloads and GPUs are useless for I/O while FPGAs are a godsend. On the other hand, FPGAs have a ton of problems GPUs don't have. The two do not look similar to someone caring about accelerators any more than snow and ice look similar to someone living at a place where they get to see both... though the two might seem similar to people from hot places where neither is common (or perhaps if they saw one but never the other.)


But are they the most efficient in cost-per-computation? For all the major data crunchers I'm familiar with, doing either finance or scientific calculations, that's the only metric they cared about.

Only place I can see the cost-per-computation metric not mattering is in space satellites. Am I way off?


Do you get more throughput per dollar with FPGA relative to GPUs? Most certainly, except for floating point stuff, especially double precision. (Finance would care much less than scientific computing and I think FPGAs are way more prominent there.)


You sure? The ones I'm personally familiar with are investment banks that have hundreds of thousands of computers doing machine learning modeling. They ran the costs, and found GPUs to be far more cost effective.


Machine learning software will tend to use floating point, hence the result IMO. In HFT for instance I'd expect things to be the opposite.


For any sort of mobile device energy-usage per computation is an important metric. Hence you have chips with multiple low power modes which can trade off different amounts of computational power with different amounts of power efficiency.


But ASIC is 100x better than FPGA for energy per computation. Know of any mobile devices that have FPGAs on them now?


It's not fair to compare a programmable circuit with a fixed-function circuit though, because programmability is often a requirement.


Sure: http://www.eejournal.com/archives/articles/20131118-lattice/

But you're right - when programmability isn't important you'd rather have an ASIC.


FPGAs can respond to signals in nanoseconds and talk directly to different peripherals (integrated circuits, SPI, DDR3, SATA, HDMI - whatever). State machines can "branch" on every clock cycle.

GPUs... usually respond in milliseconds. They usually can't talk to anything except the host across the PCIe bus. State machines and branches... uh, yeah, don't do those on a GPU!

Maybe tile CPUs will give FPGAs a fight in the future in some market segments. Easy (well, easier than FPGA) programmability and potentially good I/O. Transputer was so amazing back in the day. 30 years ago. Maybe Tilera and such will eventually succeed?

Anyways, FPGAs and GPUs are very different beasts.


I am currently just another college student, but I have some experience in this field, as I had interned at Xilinx (which is the largest FPGA maker right now, iirc, although Altera may have taken the crown).

I don't think your typical college CS student, and by extension the average programmer, would be interested in using FPGAs right now. This isn't an issue of performance, or costs, or lack of use cases - FPGAs are quite fast now and certainly can be extended to new niches that currently are CPU-bound. The issue isn't related to any of the advantages stated in this article; it's that the FPGA toolchain is still mired in the 1970s, a world dominated by the EE ecosystem that modern CS sprouted from.

Building things for an FPGA simply - for lack of a better word - sucks. There is a lack of beginner tutorials and of free (non-proprietary) implementation tools. Free example code and libraries are severely lacking. There are few ways to share code, like GitHub. Community help is much weaker.

Richard Stallman may be overzealous, but his impact on programming is striking if you compare it to what could have been, in the world of electrical engineers. Working on FPGAs now is crippling when you are used to coding in the 21st century. Imagine a world where programming is without the GCC compiler, with few libraries to build on, without GitHub, without Stack Overflow. And THAT is the reason why FPGA adoption is low.

There are currently over 400,000 questions on Stack Overflow tagged "python", and over 40,000 questions tagged "mongodb". [1][2] In contrast, there are fewer than 2,000 questions for the Verilog language, and fewer than 500 for Xilinx, even fewer for Altera. [3][4]

I have helped organize several hackathons at my university, the largest of which consisted of over 2,000 people from across the USA. Random hacks that innovate are encouraged. There are plenty of other obscure platforms in use, and yet very few people use FPGAs. This is indicative of the difficulty typical programmers have diving into FPGAs, and it will be a hard problem to solve.

[1] http://stackoverflow.com/questions/tagged/python

[2] http://stackoverflow.com/questions/tagged/mongodb

[3] http://stackoverflow.com/questions/tagged/verilog

[4] http://stackoverflow.com/questions/tagged/xilinx


One of the biggest problems is how insular and impenetrable that community is. The FPGA community, and to a certain extent the entire semiconductor industry, seems to have a prescribed path for engineers.

First, you get an electrical or computer engineering degree with a couple internships at semiconductor companies. Then, you get a junior role at one of those companies where you are mentored by senior engineers, who pass on the black arts of chip design, tool usage, Verilog/VHDL quirks, and such. You slowly move up the ranks, either at your first company or another semiconductor company, until you are the senior engineer mentoring new grads. Then you retire.

Any deviation from this path and you're screwed. You don't come into this clique from the outside, and they won't let you back in if you move too far away. It has resulted in a negative feedback loop: this attitude is bolstered by how odd, archaic, and inaccessible the tooling is, but also serves to keep different approaches out and keep the tooling odd, archaic, and inaccessible.

Disclaimer: I'm a disgruntled EE and systems software developer who gets a steady stream of pings for web and NLP dev jobs but can't get the time of day from a hardware company.


I was one of those semiconductor guys for about 15 years but now I am in the retirement phase. I worked for Motorola, Freescale, Canon and Qualcomm. Did the FPGA stuff at Canon.

Believe it or not they let people in all the time. And they let you back in. I got my Qualcomm gig after ~2 years of being out of tech entirely.

I reckon it's because of that crappy tooling thing you mentioned. Years later and nothing has really changed, hence I was ready to jump back in pretty easily. Things haven't really changed in 10 years IMO. Tool vendors just pump out the same crap.

I wouldn't worry about working for a HW company. The job is nothing special.

The main point is the engineering aspect: You start with the problem then weigh up the pros and cons of each solution. Most problems require a specialist in the domain. E.g. DSP, Image processing, telecommunications. Stuff that requires fast computations at low level. That can be a way for outsiders to get in. They learn the HW stuff as they go.

Stuff like web and NLP (Natural Language Processing I assume?) is a little bit too high level for most HW engineering work unfortunately.


> I wouldn't worry about working for a HW company. The job is nothing special.

It's more the nature of the work. I want to work on hardware, but not for a defense contractor again. I'm reasonably good at the CS-ey stuff I'm doing now, but I don't really like it. I like making physical devices do things.

> Stuff like web and NLP (Natural Language Processing I assume?) is a little bit too high level for most HW engineering work unfortunately.

Until four weeks ago I had never done any serious web dev, and NLP was just the first escape path that presented itself when I desperately wanted out of a defense contractor black hole. I am, fundamentally, a signal processing engineer. Even when I was doing real-time radar code nobody wanted to talk to me.


Radar is a bit niche. Embedded signal processing SW might be a good avenue. Verification is another one.

Like any job, HW companies either want experienced people or grads to do the crap work. You need a related skill-set to go for experienced positions.

FPGA hobbyists are not considered to be experienced.


This is nonsense; consider the thousands of self-taught FPGA programmers in Asia.


I was not aware there were thousands of self-taught FPGA programmers in Asia. Even accounting for that: a) I'll wager most are using pirated software; b) that number pales in comparison to the billions of people and millions of self-taught programmers in Asia; c) they still have a snowball's chance in hell of working for or otherwise influencing Western hardware companies.


We used FPGAs from Xilinx to implement a MIPS processor for a course in school, and I remember one of the most frustrating things of the whole process was the toolkit/documentation. Compilation errors were very unhelpful, and debugging was a nightmare for most relatively complicated tasks.


Based on my limited experience playing with FPGA dev tools and boards as a hobbyist, this is my experience as well.

Not only is there a huge learning curve - some of it necessary (you are essentially doing concurrent programming), but some of it just the arbitrary way these proprietary tools work. As a complete beginner, it's bad enough trying to get a very simple "divide the clock down to flash an LED on the dev board at 1 Hz" working; it's a whole additional thing to try to figure out "How do I edit the code in my favorite text editor instead of this stupid Windows IDE, and how can I invoke the build tools from make?"
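For what it's worth, the "flash an LED at 1 Hz" exercise itself is only a handful of lines of Verilog (a sketch assuming a 50 MHz board clock; the real friction is the pin-constraint files and IDE ceremony around it):

  module blink #(parameter CLK_HZ = 50_000_000) (
      input  wire clk,
      output reg  led
  );
      reg [25:0] count;
      initial begin count = 0; led = 1'b0; end

      always @(posedge clk) begin
          if (count == CLK_HZ/2 - 1) begin
              count <= 0;
              led   <= ~led;      // toggle every half second -> 1 Hz blink
          end else begin
              count <= count + 1;
          end
      end
  endmodule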


There have been periodic bursts of FPGA hype over the past several decades, and they have all fizzled out for precisely the reasons you cite.

"FPGAs: they're the technology of the future, and always will be!"


I agree, I'm a CS student taking a Digital Design course and programming VHDL is possibly the worst experience I've ever had. (I am very aware that VHDL is not a programming language but a HDL).


What I'm having trouble understanding (as just a regular, not-hardcore programmer) is this: What end-user advantage will cause people to buy devices that include FPGAs? What use are they to a consumer? What interesting programs or devices can be created with an FPGA that can't be done otherwise?

Basically, if "everyone can easily create whatever custom objects they need!" is the utopian vision for 3D printers, and "everyone can self-host their services and protect their privacy and freedom!" is the utopian vision for a home server[0], what is the utopian vision here?

[0] Such as one running https://sandstorm.io/


> What end-user advantage will cause people to buy devices that include FPGAs?

As always, cost is a large motivator, and here's a concrete example.

Back in the day, 3ware, the makers of real hardware RAID cards, shipped a line of PCI (or were they IDE?) cards that had an FPGA instead of an ASIC. You can bet that an ASIC would have been far more costly to produce than an FPGA.

As far as your utopian vision goes, GNU Radio is probably it, but uptake on that has also been slow.


> What end-user advantage will cause people to buy devices that include FPGAs?

Well, the "consumer" (a term I do not like) would never even know there was an FPGA in their device.


Sure, but what would cause manufacturers to start putting those FPGAs in those devices?


I think one issue here is that to achieve the great gains in performance FPGAs can provide, you do need to treat it like a proper HW design. C-to-RTL exists, but to get the best out of it you're basically using C as syntactic sugar for Verilog/VHDL.

So the overhead for programmability is higher than a DSP/GPU if you want to gain over and above them.

I do wonder if there's a sweet spot in here somewhere that's basically a sea of DSP or GPU (maybe better to call them stream processing units) with programmable wiring and some specialised ALUs/FP units mixed in that can also use the wiring allowing you to create special one-cycle fixed function ops?


As a C programmer, I don't think C is suitable for such a fundamentally parallel architecture as an FPGA. I would prefer a higher level language, maybe declarative, where it is up to the compiler to lay out the parallel operations. Unfortunately the engineering culture gap from HDL to a high level language is much wider than to C.


Actually it's easier and more natural to translate a (domain-specific) high level language into RTL than trying to fit an alien but so-called "general-purpose" C.

I'm using a language which translates into Verilog, but features native support for expressing things like FSMs, pipelines, buses, FIFOs - and something as simple as this makes HDL coding much simpler and much less error-prone than a conventional low-level RTL.


This is because C, as I understand it, is a language that best describes the fundamental workings of processors, and lets you work with a thin abstraction layer over the CPU/memory architecture.

FPGAs don't have this traditional CPU/memory architecture, and as a result, C fits them poorly.


Exactly. C assumes too much about the hardware semantics; it's got a peculiar memory model, sequential execution and all that. There are tricks, of course, that let you get out of the C box a little bit - e.g., using multiple address spaces to simulate distinct memory and memory-like blocks - but it's still unnatural and does not make hardware description any simpler than doing it manually at the RTL level.


Is there a possibility of using just-in-time compilers to port software to FPGAs? I can see how that could be used to accelerate applications written in bytecode languages (i.e. mostly everything in your mobile phone) to near-assembly speeds.


Porting regular software to FPGAs is certainly an interesting idea and many people have worked on it (google "C to FPGA"). Unfortunately, these solutions aren't quite what you would think, because the programming model of a microprocessor (serial execution of instructions) is very different from that of an FPGA (parallel evaluation of gates, non-uniform state access/routing -- you cannot just access any variable in Verilog anywhere; you have to make sure it is possible to efficiently lay out your algorithm on a 2D structure of gates).

As for JITing, compilation times for FPGAs are often in the hours, and the software involved is unfortunately proprietary (and not easily integrated with). There are some field-reconfiguration libraries, but I am not sure to what extent real-world applications compile new Verilog on the fly.


Just wanted to add that many FPGAs physically support updating parts of their configuration without turning off or stopping the rest of the chip. This is just very hard to do without the docs we need of the chips, and almost no one does it.


> ... almost no one does it.

I've heard an anecdote that Altera kept getting pestered by customers asking why they don't have it--"but why not? Xilinx has it!".

So they implemented it just so they could say, "We have it too!".

It shut them up, and since no non-academics actually use PR, i.e. partial reconfiguration (that I'm aware of; I'd love to hear counterexamples), they've promptly ignored it again :).


What jbangert said is the crux of the JITing problem: "compilation" for FPGAs is not really at all like compiling software, and can be hugely computationally expensive. On top of that, pretty much the entire synthesis, mapping, and place and route flow is usually proprietary.

That said, there have been some efforts in this direction. Search the literature for "warp processing".


As kyboren says, FPGA compilation is crazy intensive. This is because we are mapping physical resources, and things have to physically fit instead of just being able to arbitrarily jump somewhere else in the program. It becomes a huge constraint satisfaction problem; there is no single "correct" answer, and many times FPGA compilers will produce different results when run multiple times, because they often use randomness to start the process if there are too many options.


Back in the day I learned about using VHDL to design circuits, is that the kind of compilation also used for FPGAs?

That language paradigm is close to modern dataflow oriented programming languages. It should be theoretically possible to compile an application program written in that style and have the logical circuits in the FPGA be a direct mapping of the business logic in the application.


> Back in the day I learned about using VHDL to design circuits, is that the kind of compilation also used for FPGAs?

FPGAs can implement any digital circuit which fits, with some caveats. So yes, VHDL is commonly used to describe designs for FPGAs. I'm not sure what you mean by '[that] kind of compilation', though, as you never specify what kind of compilation you mean.

Implementing a circuit on an FPGA involves many of the same, or similar, steps and processes as implementing it on an ASIC (and obviously, does not involve all the physical design stuff). If that's what you're asking.

> That language paradigm is close to modern dataflow oriented programming languages. It should be theoretically possible to compile an application program written in that style and have the logical circuits in the FPGA be a direct mapping of the business logic in the application.

Sure, of course it's theoretically possible; there's nothing you can do in digital logic you cannot do on a UTM (space/memory constraints aside), and vice-versa. This field is known as "High-Level Synthesis", and it's been around for decades. It's only now becoming sort-of widely used for real-world stuff.

But it's nothing like "directly translating" to logic for the FPGA ;). It's probably easier for most programmers to design in such a language vs. an HDL, but AFAIK that's not even close to the biggest challenge with doing HLS. I think timing and resource allocation are much bigger problems (but I'm not very informed about HLS!).

Look through the HLS literature if you're interested in what's involved--I'm not an expert by any means. Also, if you're interested in alternative programming languages for hardware design, check out BlueSpec and Chisel.


I think FPGAs are an untapped technology. I would love a book by Manning or Pragmatic called "FPGAs in Action" or "101 FPGA Projects". I have no idea what to do with it besides try to make a bitcoin miner.


Seek and ye shall find :)

(1) http://www.fpga4fun.com/ "FPGA projects: 25 projects to build using an FPGA board."

(2) https://www.youtube.com/watch?v=Er9luiBa32k "FPGA 101 - Making awesome stuff with FPGAs [30c3]"

(3) http://hamsterworks.co.nz/mediawiki/index.php/FPGA_course "I want to help hackers take the plunge, purchase an FPGA development board and get their first projects up and running."

(3b) http://hamsterworks.co.nz/mediawiki/index.php/FPGA_Projects

(4) http://www.nandland.com/articles/fpga-101-fpgas-for-beginner... "The following pages give a quick overview of the basics of FPGA design. Read these tutorials to give yourself a solid foundation on your journey into the wonderful world of Digital Design. "


I have several FPGA-based devices, in hybrid digital-analog synthesizers. FPGAs allow rapid reconfiguration of signal routing between analog components, which latter deliver much nicer sound quality than DSPs for many requirements, especially filtering. There are also Field Programmable Analog Arrays but I don't have any of those, yet.


I would love to play with those for synths/audio/effects! Any recommendations on where to look? I look every few years, and products come and go. Just now I looked and found this, which looks intriguing! I'd love to see more about these!

http://anadigm.com/apex-devkit.asp


Check out the Milky Mist http://m-labs.hk/m1.html


If you're into hardware, robotics, or automation, then there's a HUGE space for this stuff. National Instruments has made their name here for some time.


Data science and Artificial Intelligence also comes to mind.


Suppose I wanted to get started with FPGAs for the purpose of computation (no interest in control or actuation of sensors/devices). What would be the best starting board for less than $300?

I'm not a student, but I can probably find one if your suggestion is a student dev board.


The Terasic series are great, especially if you aren't too fussed about peripherals on the board. The DE0-nano has a decent sized FPGA and SDRAM for less than $100. There is also a SoC variant of this board coming soon with an integrated ARM subsystem.


I've recently purchased the Parallella board:

http://www.parallella.org/

It's got USB/Ethernet/an ARM CPU, and has been incorporated into numerous other projects. It's ~$100 USD.


I second the Parallella: not only does it have an FPGA, it also has an ARM core and a massively parallel CPU. So much to play with, and it still runs Linux, so the workflow is super smooth.


http://valentfx.com/logi-pi/ (or logibone) is reasonably good; it's a Spartan-6 LX9, easily programmable via SPI, and you can communicate with your design using the same SPI, easily clocked up to 30 Mbps.

For bigger designs (e.g., a full OpenRISC SoC fits, with ethernet and VGA and all that) you can use Digilent Atlys board, it's cheap for students (although communicating with it is a bit more complicated).


Probably the Zedboard.


Xilinx is working on tools to make FPGA development easier for traditional software developers.

http://www.xilinx.com/products/design-tools/sdx.html


I wrote a number of IP cores for FPGAs over 10 years ago, and the sad fact is that the same problems and criticisms that led me to forget about using them anywhere other than at some big company are still very relevant today. Problems include:

* Hostile toolchains for independent developers / engineers. Lack of reasonably useful toolchains including simulators, synthesis tools, the works. iVerilog has been lame in my experience, for example.

* Expensive hardware that's artificially inflated in price due to lack of commodity concerns (perhaps lack of competition in general).

* Toolchains unfriendly / bad UX for traditional software developers. I read a Linux Journal article featuring a Python-like HDL - MyHDL - and its traction is what I suspected it would be: irrelevant to anyone but hobbyists, and hobbyists have basically zero input into the industry, unlike software's dynamics. After working with different compiler vendors and doing some research in the space (GPGPUs completely destroyed the market like I figured back then, because software developer use cases of hardware tend to be very narrow uses of FPGA / ASIC capabilities - parallel processing or custom FPUs for boutique DSPs being typical), I've seen rather little progress compared to the leaps in software productivity. Even the recent Scala-based Chisel language seems a bit trite as well, and more research-ware that won't be adopted by manufacturers, probably because...

* Most FPGA and ASIC targeted use cases for HDLs are primarily for hardware engineers first and foremost, not people who are software engineers first. When people get specs on hardware, nobody really cares about the HDL code written as much as whether the SoC has a lot of test benches; they look at the block diagrams instead of the HDL typically, and things tend to be developed as black boxes due to how IP cores work (per industry conventions).

* Compilers for hardware languages have had tons of issues for a long, long time because expressing bit math and signals (especially the analog extensions to VHDL and Verilog) has been really cumbersome with imperative-style languages. Historically, we've gotten far more concise, legible results with language clarity by using conventions from Matlab and R than with C and Ada, but Verilog and VHDL are just too damn dominant, and most experienced electrical engineers that are hardware gurus really do not like to learn a lot of new languages and semantics, unlike how software developers tend to behave.

However, I've always felt that hardware engineers and software engineers should talk a lot more, and the ever-increasing gap has been a disappointment for me. I'm totally on board with a lot of testing for software, but at the same time I understand that software is typically designed with the expectation that things change frequently, while hardware is written to be reliable-first and to go through insanely rigorous testing, because you cannot patch hardware. Most widely distributed IP cores have at least 2.5 times as much testbench / harness code as the code that actually does the work. Hardware typically has much more rigorous specification design than software, though, so this is at least possible. Haskell's QuickCheck would be totally helpful for how a lot of hardware is tested, because nowadays you can't test all the combinations of signals on, say, a 128-bit bus in a reasonable amount of time. So test benches are full of tons of statistical functions and are sometimes even generated with machine learning techniques to exercise the use cases and states most likely to fail. This all typically goes out the window if you switch languages. That kind of attention to testing is almost unheard of in software, in my experience.

A trivial-seeming example of how HDLs are hostile to software developers is the difference when you write a switch statement in C versus in an HDL (tremendous apologies, I haven't written Verilog in 10 years):

  // incomplete case: no default, so 'out' must hold its old value when
  // in_a is 2'b00 or 2'b11 -- the synthesizer infers a latch to do that
  always @(*) begin
    case (in_a)
      2'b10: out = in_a & in_b;
      2'b01: out = in_a ^ in_b;
    endcase
  end

vs.

  // complete case: every path assigns 'out', so this stays purely
  // combinational with no latch
  always @(*) begin
    case (in_a)
      2'b10:   out = in_a & in_b;
      2'b01:   out = in_a ^ in_b;
      default: out = 2'b00;
    endcase
  end
Omitting a default case causes a latch to be produced by the compiler, so the two are VERY different and will punish software developers that like to use shortcuts (my Perl background bit me real hard trying to do shortcuts in the Verilog compilers I used before - I was even more surprised to see my C-style macros failing after only 2 levels of indirection to try to make my cores more configurable, forcing me to rewrite my cores often). Unlike in software where something like whether a switch statement becomes a jump table or a series of if statements isn't typically make or break, in hardware the difference between a single clock cycle being used up or not is massively important.

After basically yelling at my compiler and tools enough times for doing things that defied their documentation, waiting hours for place-and-route runs to finish (they ARE NP-hard, after all), and uncovering so many edge cases in FPGAs that required use of clock trees and manual refinement of the resulting FPGA image file, I realized that this just wasn't going to get much better as an experience, because these levels of concerns are what hardware engineers are responsible for tuning themselves and you just can't abstract and delegate them away.

I asked my boss back then why Intel shouldn't enter the market and he responded "because if it's commoditized, the entire industry will lose all their margins, because the market for FPGAs is primarily large-company hardware engineers." It took over 10 years for Intel to finally buy Altera, it seems... and in the meantime we had an incredible number of things happen across the spectrum, while FPGAs still offer practically the same user experience and fundamental architectures (nothing like SMT happened in FPGA land, really; no revolutionary compilers to make writing hardware super easy).

There are a gazillion more reasons why we'll see more people using the Raspberry Pi and Arduino to learn to tinker with hardware, and Xilinx, Altera, etc. have always been the way they are now, so my decision to just never go back to the fun world of FPGAs seems to have been justified, sadly.


(2013)





