
The Other Cray Launches CPU-FPGA Hybrids - luu
http://www.theplatform.net/2015/05/28/the-other-cray-launches-cpu-fpga-hybrids/
======
etep
Intel has an interesting angle on this: FPGA + Xeon.

One story I see here is similar to what has been happening in the GPU
business: Intel controls access to the CPU, and therefore has an advantage
because it can hold competing solutions at arm's length over PCIe.

The link is just the first result I got back from Google, but, for example,
20x perf upside claimed:
[http://www.extremetech.com/extreme/184828-intel-unveils-new-...](http://www.extremetech.com/extreme/184828-intel-unveils-new-xeon-chip-with-integrated-fpga-touts-20x-performance-boost)

The other takeaway, for me anyway, is that these are interesting times because
hw architecture and hw/sw co-optimization appear to be gaining in importance
(i.e. because Moore's law is slowing).

~~~
varelse
Correct, Intel is playing dirty with Skylake. Sandy Bridge-E (i7-3820), Ivy
Bridge-E (i7-4820K), and Haswell-E (i7-5930K) all offered high-end consumer
CPUs with 40 PCIe lanes. Such CPUs could be used with PLX PCIe switches to
build inexpensive quad-GPU supercomputers:

[http://exxactcorp.com/index.php/solution/solu_list/85](http://exxactcorp.com/index.php/solution/solu_list/85)

As far as I can tell, there is no Skylake analog to these CPUs. So instead of
building something better than a GPU, all Intel can do is buy Altera and do
their worst to raise the price of building a fat GPU server.

Previously, they blocked the ability to send P2P copies between GPUs over QPI.
But PLX 8747 PCIe switches provided a nice workaround (as will a single 8796
switch this round).

I guess those who can, do, and those who can't erect roadblocks.

~~~
etep
It doesn't make much sense to say that Intel can't build something better than
a GPU, right? It's like saying the reason NVIDIA doesn't build CPUs is because
they can't build a better CPU than Intel. It distracts the conversation.

So the question really is, what is the long term strategy and why? It appears
to me that Intel has validated some of this "custom hardware" FPGA strategy
and that their view is it will be "better together."

I agree that Intel is erecting roadblocks, but I doubt it is because they
"can't build a GPU"; more likely it is simply because they can erect
roadblocks. In fact, they are obligated, by their shareholders, to erect those
roadblocks (or to play dirty if you want).

~~~
varelse
You seriously need to investigate what a cluster#$%! Xeon Phi is. Their
attempt to kill NVIDIA was a joke and continues to be one for anyone not
getting paid to say otherwise.

IMO if Intel really cared about "shareholder value(tm)" they would have
acquired NVIDIA by hook or by crook. Instead, they bought the promising
redheaded stepchild Altera.

Meanwhile, NVIDIA _owns_ the ML/Deep Learning space for at least the next 2-3
years no matter what manure Intel tries to fling at them. If only AMD had a
decent driver/tools team, this battle could be far more interesting.

That said, 2018 or so and beyond is a green field(tm) if Intel stops choking
on its own process and exploits its process advantage to build a GPU killer
either as a co-processor or by integrating sufficient multiple AVX units into
the cores of its CPU roadmap.

All IMO of course.

------
pinewurst
This article is from May '15.

~~~
dang
Yup, but we asked luu to repost it because it didn't get any attention the
first time.

~~~
p1esk
Can you give a brief summary of this product?

~~~
GFK_of_xmaspast
It looks like they did something to OpenCL to get it to target FPGAs, and then
went a level beyond and added some kind of high-powered memory interconnect
between CPU and FPGA so that they can share a memory space.

~~~
p1esk
So this is a physically separate CPU, not part of the FPGA chip, right? If so,
what are the advantages compared to CPU implemented as a hard core in the
FPGA?

~~~
solarexplorer
The separate CPU is much faster, because it has an entire die for itself and
because it is a high-volume off-the-shelf chip. An embedded CPU/FPGA will
suffer much more when it executes the serial part of the application on the
embedded CPU. So if you remember Amdahl's law...
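
To put rough numbers on the Amdahl's law point, here is a minimal Python
sketch. Everything in it is an assumption for illustration: the 90% parallel
fraction, the 20x FPGA speedup on the offloaded loops, and the 4x slowdown of
a hypothetical embedded core relative to an off-the-shelf host CPU. None of
these are measurements of any real product.

```python
def amdahl_speedup(parallel_fraction, accel_factor):
    """Amdahl's law: overall speedup when only part of the work is accelerated."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / accel_factor)


def run_time(parallel_fraction, fpga_speedup, cpu_slowdown=1.0):
    """Normalized run time; 1.0 means the whole job on the fast host CPU alone.

    cpu_slowdown > 1 models a slower embedded core executing the serial part.
    """
    serial = (1.0 - parallel_fraction) * cpu_slowdown
    parallel = parallel_fraction / fpga_speedup
    return serial + parallel


if __name__ == "__main__":
    p = 0.90      # assumed fraction of the work that lives in tight loops
    fpga = 20.0   # assumed FPGA speedup on those loops

    fast_host = run_time(p, fpga)                    # separate fast CPU + FPGA
    embedded = run_time(p, fpga, cpu_slowdown=4.0)   # assumed 4x slower core

    print(f"fast host + FPGA:     {1.0 / fast_host:.2f}x overall")
    print(f"embedded core + FPGA: {1.0 / embedded:.2f}x overall")
```

With these made-up inputs the serial 10% dominates as soon as the CPU running
it gets slower, which is the effect described above.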

> What are the applications for this hybrid?

Highly parallel applications with tight loops. These loops can be offloaded to
the FPGA while the rest of the system can run as before. This approach allows
you to use languages like FORTRAN/C/C++ without resorting to Verilog/VHDL, so
it's much easier to adopt.

~~~
p1esk
But a separate die means CPU-FPGA communication has much lower bandwidth and
higher latency. So if data has to be constantly moved between them, it might
be a disadvantage. What are the applications for this hybrid?
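
The data-movement concern can be framed as a simple break-even check: offload
only pays off when transfer time plus accelerated compute time beats running
on the CPU alone. The sketch below is a back-of-the-envelope model, not a
description of any vendor's interconnect. The function names, the 8 GB/s link
bandwidth, the 1 microsecond latency, and the 20x speedup are all assumptions.

```python
def offload_time(bytes_moved, bandwidth_bytes_per_s, latency_s,
                 cpu_compute_s, fpga_speedup, transfers=2):
    """Estimated time to run a kernel on the FPGA, including data movement.

    transfers counts the separate copies (e.g. one in, one out), each paying
    the link latency once; bytes_moved is the total across all copies.
    """
    transfer = transfers * latency_s + bytes_moved / bandwidth_bytes_per_s
    return transfer + cpu_compute_s / fpga_speedup


def worth_offloading(bytes_moved, bandwidth_bytes_per_s, latency_s,
                     cpu_compute_s, fpga_speedup):
    """True when the offloaded run is faster than staying on the CPU."""
    offloaded = offload_time(bytes_moved, bandwidth_bytes_per_s, latency_s,
                             cpu_compute_s, fpga_speedup)
    return offloaded < cpu_compute_s


if __name__ == "__main__":
    link_bw = 8e9    # assumed link bandwidth in bytes per second
    link_lat = 1e-6  # assumed per-transfer latency in seconds

    # Heavy compute on 1 GB of data: the transfer cost is amortized.
    print(worth_offloading(1e9, link_bw, link_lat, 1.0, 20.0))

    # Light compute on the same 1 GB: moving the data costs more than it saves.
    print(worth_offloading(1e9, link_bw, link_lat, 0.1, 20.0))
```

A shared memory space between CPU and FPGA would shrink the transfer term in
this model, which is presumably why the interconnect matters so much here.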

~~~
stonogo
There is an entire chart devoted to this question in the article.

