
FPGAs and Deep Machine Learning - chclau
https://fpgasite.wordpress.com/2016/08/25/deep-machine-learning-with-fpgas/
======
inetsee
The history of Artificial Intelligence goes back much further than the "early
1980s". It goes at least back to the conference at Dartmouth in 1956, if not
back to Turing's 1950 paper, "Computing Machinery and Intelligence".

~~~
chclau
Good point; the sentence was confusing. When I wrote it I was thinking about
expert systems, which blossomed during the 80s, not about AI as a whole. I
have changed that paragraph, adding some of your comments. Thanks.

------
jcbeard
So what's changed that FPGAs are making a comeback? They were all the rage in
2008/9; however, that faded relatively quickly when people realized
programming them was a hassle, and that how you programmed them had direct
implications for synthesizability, clock speed, and area. Looks like Google
Trends shows a downward trend in interest from 2004 to the present, so my
intuition was close. OpenCL -> HDL and Vivado's C/C++ -> HDL make the process
a bit easier, but it's still not as easy as writing optimized C code.

To get perf on an FPGA over a hard core you must go very wide (as in lots of
parallelism). I suspect you could order custom hard cores from places like
Tensilica and get far better perf/watt. I love FPGAs; they thrive on parallel
integer/fixed-point code if enough time is put into designing the pipeline.
It seems that for most float-heavy code, a hard core at a higher clock rate
with a dedicated MMU is much better? Am I missing something that changes that?
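
For what it's worth, here is a minimal Python sketch (my own illustration,
with made-up bit widths, not anything from the article) of the kind of
fixed-point arithmetic that maps well onto a wide FPGA pipeline: a quantized
dot product in which every multiply-accumulate could be its own DSP slice,
all running in parallel:

```python
import numpy as np

def to_fixed(x, frac_bits=7, width=8):
    """Quantize floats in [-1, 1) to signed fixed-point (Q0.7 by default)."""
    scale = 1 << frac_bits
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    return np.clip(np.round(x * scale), lo, hi).astype(np.int32)

def fixed_dot(a, b, frac_bits=7):
    """Integer dot product: on an FPGA, each product term below could be
    one DSP slice, with all of them able to run in parallel."""
    acc = int(np.sum(to_fixed(a, frac_bits) * to_fixed(b, frac_bits)))
    return acc / (1 << (2 * frac_bits))  # rescale the Q0.14 accumulator

rng = np.random.default_rng(0)
a, b = rng.uniform(-1, 1, 256), rng.uniform(-1, 1, 256)
print(fixed_dot(a, b), "vs. float:", float(np.dot(a, b)))
```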

~~~
asimuvPR
This won't directly answer your question, but FPGA boards are now more
accessible to hobbyists. That helps with popularity.

~~~
jcbeard
Not sure they haven't always been fairly accessible? I suppose prices are now
dirt cheap vs. 2010/11, when I'd have had to pay $300-400 for a mediocre FPGA
(I've used Digilent for years; prices there have dropped quite fast in the
last two years or so). Then again, Atmel had an FPGA co-processor for a while
at a relatively cheap price (can't remember the name...probably X28 with some
characters), but still not really capable of running a neural network at any
appreciable depth.

~~~
asimuvPR
They may have been available for sale, but people probably did not know about
them. Building that awareness itself takes most of the time: people need to
learn about the product and understand how it can be used.

------
doc_holliday
It's excellent to see FPGAs being increasingly utilised.

As Moore's law no longer holds true for CPUs, interest is increasingly
turning to FPGAs. You can create a truly bespoke processor for any task.

Of course, this is a trade-off between development time/cost and the
processing needs of the task.
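
A rough break-even sketch of that trade-off (every number below is an
invented assumption; only the shape of the calculation matters):

```python
# Toy break-even model: the FPGA's one-off development cost has to be
# recovered through a lower per-job running cost. All numbers are made up.

fpga_dev_cost = 500_000.0   # one-off NRE: HDL design, verification, tooling
sw_dev_cost   = 50_000.0    # the same algorithm as optimized software
fpga_cost_per_job = 0.02    # power + amortized hardware per unit of work
cpu_cost_per_job  = 0.10

extra_nre = fpga_dev_cost - sw_dev_cost
saving_per_job = cpu_cost_per_job - fpga_cost_per_job
print(f"FPGA pays off after ~{extra_nre / saving_per_job:,.0f} jobs")
```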

I suspect demand for FPGAs will only increase over the coming years. Intel's
integration of FPGAs into their line of processors is a promising step and a
sign of where things may be heading.

~~~
BenoitP
I wonder if common boring languages/runtimes, say Java/JVM, are easily
amenable to making use of on-die FPGAs.

Will we see a new generation of JITs emitting FPGA code?

~~~
doc_holliday
Unlikely for many years (if ever): compiling HDL down to an image to flash
the FPGA with takes hours. Routing the wires is a very difficult, very
compute-intensive process.

JIT for FPGA may well be intractable: placement and routing are NP-hard
optimization problems, so doing them at JIT speeds seems out of reach.

------
vonnik
There is literally no new information in this article about DL and FPGAs.

------
fnord123
I discussed this with some deep learning experts at the local university, and
they believed that FPGAs won't be as powerful as people seem to think, since
the many-to-many connections between blocks will not be up to the task.

In any event, Nerabus is a company (by the same guys as CodeThink) which is
interested in running FPGAs in the cloud:
[http://nerabus.com](http://nerabus.com). I'm not sure if they got off the
ground with the idea, or if they were too early, or what.

------
daveloyall
Hi, chclau, I see you are the author and you are present in this comment
thread.

Your post smells like spam. However, other posts on the same blog have
content: [https://fpgasite.wordpress.com/2016/08/09/pseudo-random-gene...](https://fpgasite.wordpress.com/2016/08/09/pseudo-random-generator-tutorial/)

What's up? What is this blog for?

~~~
chclau
I am sorry, but why do you say that this post is spam? It is an article I
wrote because I saw several other articles about deep machine learning and
about how quite a few companies are using FPGAs for it, and I thought it
would be nice for other developers to know.

My blog is for sharing knowledge on VHDL projects for FPGAs, but I also share
news about the field. I think that knowing where your field is heading is
part of the knowledge needed to be a good designer.

------
nl
Note that as far as I'm aware, all the custom FPGA solutions (and even
Google's TPU, which I think is an ASIC) are really only being used for
inference, and only "beat" GPUs on an efficiency basis.

That's interesting and useful, but the real bottleneck with deep learning is
the training stage. AFAIK everyone is still doing that on GPUs.

~~~
petra
But isn't efficiency (and density) almost everything when it comes to data
centers?

~~~
dgacmu
TCO is. FPGAs have a fairly high initial outlay, and fairly high costs to
program. In addition, they often lag behind CPUs both in density (except for
hard IP blocks, but that's not what you're buying them for!) and in frequency.
You have to be able to get a big win on parallel execution and avoiding
control overhead to make them truly cost-effective.
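
A toy model of that TCO argument (all numbers below are placeholder
assumptions, not real pricing; the point is just that dev cost and capex can
swamp the power savings):

```python
# Rough cost-per-performance sketch for a small fleet of accelerators.
# Assumed numbers throughout -- only the structure of the model matters.

def cost_per_perf(capex, dev_cost, watts, rel_perf,
                  n_units=100, years=3, usd_per_kwh=0.10):
    energy = watts / 1000 * 24 * 365 * years * usd_per_kwh  # per unit
    total = n_units * (capex + energy) + dev_cost
    return total / (n_units * rel_perf)

# Hypothetical: GPU baseline perf = 1.0; FPGA slower but far more frugal.
print("GPU  $/perf:", round(cost_per_perf(5000, 100_000, 300, 1.0)))
print("FPGA $/perf:", round(cost_per_perf(8000, 500_000, 75, 0.6)))
```

With these particular made-up numbers, the FPGA's power advantage doesn't
come close to covering its extra development cost, which is the TCO point
above.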

Given that Google's already gone down the custom ASIC path for DNNs (and see
Intel's recent acquisition of Nervana for $400m), and how much effort Nvidia
is putting into making Pascal a great architecture for deep learning, I'm
deeply skeptical of the wins from FPGAs in this space. The volume demands are
too high: I suspect they'll fade out pretty quickly in favor of custom
silicon.

~~~
petra
I agree about DNNs.

But in general, do you think that FPGAs will become a big part of the cloud?
Or will they just be employed in a few relatively small niches?

~~~
dgacmu
I don't know, but I'm skeptical. It depends too much on whether the
computationally intensive (and latency sensitive) parts of cloud-y workloads
become highly specialized, or common. If they're common, you'll just have a
set of ASIC-based accelerators, or Intel will start putting in accelerators
for them on their chips. This is probably the most likely path. But if it
turns out that genomics needs one thing, banking needs another, CAD/design/sim
another, etc., then FPGAs might work.

Roughly:

- Low volume + low money = CPU.
- Low volume + crazy money (e.g. high-frequency trading) = FPGAs viable.
- Low volume + really hard realtime control or fast signal processing = often FPGA, but DSPs are still viable.
- Medium volume + decent money = FPGA. Baidu's and MS's deep learning fit here.
- High volume = ASIC. Google's TPUs fit here.
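
That rule of thumb, encoded as a toy lookup (my paraphrase of the list above,
not a real costing model):

```python
# Toy encoding of the volume/money rule of thumb above.

def pick_silicon(volume, money, hard_realtime=False):
    if volume == "high":
        return "ASIC"              # Google's TPUs fit here
    if volume == "medium" and money == "decent":
        return "FPGA"              # Baidu's / MS's deep learning fits here
    if volume == "low":
        if money == "crazy":
            return "FPGA"          # e.g. high-frequency trading
        if hard_realtime:
            return "FPGA or DSP"   # realtime control, signal processing
        return "CPU"
    return "CPU"

print(pick_silicon("medium", "decent"))  # -> FPGA
```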

Machine learning and DNNs seem to be in the process of jumping from mostly-CPU
to serious ASIC.

FPGAs' success also depends on how well Intel and the ARM ecosystem can start
to support medium-volume customization. Intel has historically been more
exclusively high-volume, but if you look at the kinds of deals they're inking
with their tier-one (and two?) customers, it's clear they're trying to move
in that direction: [https://goparallel.sourceforge.net/intel-bets-big-custom-xeo...](https://goparallel.sourceforge.net/intel-bets-big-custom-xeon-chips-cloud-companies/)

The challenge for FPGAs is that the better the design & synthesis tools get
at supporting FPGAs, the easier it is for the CPU manufacturers to do the
same thing and adapt more rapidly. The core advantage FPGAs retain is that
they can be retargeted much more easily, and their dev cycle is shorter. My
crystal ball is fuzzy. :)

~~~
petra
Thank you.

>> Medium volume + decent money = FPGA.

I wonder, though: how important is reprogramming, given that you can create
structured ASICs (something between an ASIC and an FPGA) at a low enough
volume (maybe 500/1000+ units, like eASIC, which Intel is working with)?

