
Field programmable gate array that's 4.2x faster than a 16 core CPU - ColinWright
http://www.hpcwire.com/hpcwire/2012-04-16/latest_fpgas_show_big_gains_in_floating_point_performance.html
======
ramchip
The title doesn't mean much if you don't specify _at what_. It's only about
floating point, and the comparison is with a purely theoretical CPU.
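For reference, a "theoretical peak" of that sort is usually just cores × clock rate × FLOPs per cycle. A back-of-envelope sketch, where the clock rate and per-cycle throughput are illustrative assumptions, not figures from the article:

```python
# Back-of-envelope theoretical peak for 64-bit floating point.
# All hardware numbers below are assumptions, not from the article.
cores = 16
clock_ghz = 2.5          # assumed clock rate
flops_per_cycle = 8      # e.g. AVX: 4-wide DP add + 4-wide DP multiply per cycle

cpu_peak_gflops = cores * clock_ghz * flops_per_cycle
print(cpu_peak_gflops)                # 320.0 (GFLOPS)

# The article's 4.2x claim would then put the FPGA at roughly:
print(round(cpu_peak_gflops * 4.2))   # 1344 (GFLOPS)
```

Sustained performance on real code is typically a fraction of this number, which is part of why "theoretical peak" comparisons say so little on their own.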

~~~
mkl
Also, two of article's four authors work for Xilinx, which makes the FPGA in
question.

Does anyone know how much these things cost? A quick google yielded nothing,
but I may be using the wrong terms.

~~~
mseebach
Check farnell.com. They don't seem to stock the Virtex-7 family yet, but the
price-range of the Virtex-6 family might provide some guidance (£350-£850).

Edit: This is for the naked chip, not a board. There's quite a discrepancy
with int19h's findings; I don't know if that can all be down to the board?

~~~
Amadiro
Boards for high-end FPGAs like that are not cheap to make yourself either
(they need six layers or more, because the chips have so many densely packed
pins, plus one or two layers just to feed their considerable appetite for
power), so you'll probably end up buying some ready-to-use solution that
could quadruple that price...

------
Jimmie
"Field programmable gate array that's 4.2x faster than a 16 core CPU",
theoretically and only in regards to 64-bit floating point arithmetic.

What's with the link-baity titles lately?

~~~
ColinWright
I'm quoting directly from the article:

        Comparing theoretical peaks for 64-bit floating point
        arithmetic, the current generation of Xilinx’s Virtex-7
        FPGAs is about 4.2 times faster than a 16-core microprocessor.

And with regard to your question about titles "lately", I'd be interested to
see what other submissions I've made that you think are "link-baity".

Thanks.

~~~
tubs
I think it was in reference to other titles submitted lately in general, not
necessarily by you.

------
tgflynn
Anyone have an idea of how this would compare with current GPU performance?
My impression is that GPUs are currently way ahead of CPUs in floating point
performance (though maybe not for 64-bit?).

EDIT: To make this question a bit more specific, say I wanted to develop a
really fast neural net implementation, which basically reduces to matrix-
vector multiplication and function interpolation. Would I be better off
looking to do this with a GPU or an FPGA, given the current state of both
technologies?

From what little experience I've had with GPUs, I think bandwidth to the
device might be a limiting factor, but I'm guessing this would affect either
type of co-processor.
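For what it's worth, the workload described above (a neural-net layer as a matrix-vector multiply plus an element-wise activation) can be sketched in a few lines; the shapes and the tanh activation here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512))  # layer weights (assumed sizes)
x = rng.standard_normal(512)         # input vector
y = np.tanh(W @ x)                   # matrix-vector multiply + activation

print(y.shape)  # (256,)
```

Each such layer performs about 2·m·n floating-point operations against m·n weights that are read only once, which is the single-pass access pattern that makes offload latency and bandwidth matter so much.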

~~~
scott_s
In my experience, it's not bandwidth that is the limiting factor, but
_latency_. You'll hit the same problem with FPGAs if you're using one as a co-
processor, as they are typically connected to the motherboard over PCI
Express. If the vectors you're using are small (where "small" means small
enough to easily fit into an L1 cache on a processor), then you probably won't
see any performance improvement by offloading the computation to an
accelerator.

I say this because in a matrix-vector multiplication, only the vector has
data reuse; you do a single pass over the matrix. I wrote a paper where
latency killed any performance benefit from using a GPU, because the
computation we performed did only a single pass over the data:
<http://people.cs.vt.edu/~scschnei/papers/debs2010.pdf>. If you're doing a
matrix-matrix multiplication, then that's a different story, because each
element in each matrix will be reused.
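The reuse argument can be made concrete with a back-of-envelope arithmetic-intensity calculation (FLOPs per byte moved). The counts below are a sketch, assuming 8-byte doubles and each operand transferred exactly once, not measurements from the paper:

```python
# Arithmetic intensity: FLOPs per byte moved over the interconnect,
# for square n x n operands of 8-byte doubles (illustrative model).

def matvec_intensity(n):
    flops = 2 * n * n                   # n^2 multiply-adds
    bytes_moved = 8 * (n * n + 2 * n)   # matrix read once + input/output vectors
    return flops / bytes_moved          # ~0.25 FLOPs/byte, regardless of n

def matmul_intensity(n):
    flops = 2 * n ** 3                  # n^3 multiply-adds
    bytes_moved = 8 * 3 * n * n         # each matrix touched once, ideally
    return flops / bytes_moved          # n/12: grows with matrix size

print(matvec_intensity(4096))   # ~0.25
print(matmul_intensity(4096))   # ~341
```

So matrix-vector multiply stays at a constant quarter-FLOP per byte no matter how big the problem gets, while matrix-matrix multiply's intensity grows with n; only the latter can amortize the cost of shipping data to an accelerator.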

------
Estragon
What's a good text for learning to program these? (Or perhaps series of texts,
as my knowledge of electronics and computational hardware is very
superficial.)

~~~
scott_s
In grad school, I took a configurable computing course in the ECE department.
I'm a CS guy - I had never done any hardware design before. You may benefit
from reading over my short writeups of the assignments:
<http://people.cs.vt.edu/~scschnei/ece5530/>

I recall that in trying to describe the impact of the web to typical business
folks, Douglas Adams compared it to trying to explain the ocean to a river:
first, you have to understand that river rules no longer apply. Hardware is
similar. First, you have to understand that software rules no longer apply. If
you dive into this even a little, I predict you will be shocked (much as I
was) how much of your concept of "computation" is tied up in sequential,
memory-hierarchy based processors.

~~~
Estragon

      > If you dive into this even a little, I predict you will be shocked

Thanks, sounds like my kind of ride.

