
FPGA and Xeon combined in one socket - jsnell
http://www.theregister.co.uk/2014/06/18/intel_fpga_custom_chip/
======
pjc50
The barrier to adoption of FPGAs is not so much the hardware issue as the
toolchain issue. The toolchains are closed, frequently Windows-only, slow, and
not friendly to newbies.

(Not that shader languages or CUDA are _that_ accessible, but you can play
with shaders in a browser now. The time to "hello world" or equivalent isn't
too bad.)

~~~
chillingeffect
I argue the barrier to FPGA adoption is the actual use case, not the tools.
People who need them can use the tools just fine. FPGA designers are not
idiots. They do not walk around scratching their heads wondering, "how come
nobody is using our products? If only they were easier to use!" Every couple
of years a new open-source attempt at replacing the incumbent tools is begun
and dropped.

The use cases for FPGAs are a much harder impediment to adoption. Many people
get "FPGA boners" when they even hear the word, fancying themselves "chip
designers," but practical use cases are much rarer. As evidence, notice that
they predominate in the military world, where budget is less of an issue than
in the commercial world.

The technical issue with FPGAs is that they are still one level abstracted
from any CPU. They are only valuable in problems where some algorithm or task
can be done with specific logic more quickly than on the CPU, given 1. the
performance hit of reduced real estate, 2. the reduced clock speed relative to
a CPU, and 3. the higher cost relative to a CPU.

Further diminishing their value is that any function important enough to
require an FPGA can more economically get absorbed into the nearest silicon.
For example, consider the serialization/deserialization done for audio/video
codecs. That used to be done in FPGAs, but got moved onto a standard bus (SPI)
and absorbed into codecs and CPUs.

Because of this rarity, experienced engineers know that when an FPGA is
introduced to a problem in practical reality, _it's a temporary solution_
(most often to meet time-to-market). This confers a degree of honor, which is
why people get so emotionally aroused about FPGAs.

You can bet, though, that if whatever search function Microsoft is running on
those FPGAs proves to be useful, it will soon be absorbed into a more
economical form, such as an ASIC, or, more likely, additional instructions in
the CPU.

Really, an install on 1,600 servers, such as this article reports, is not that
impressive and is certainly only a prototype rollout.

~~~
ChuckMcM
Ok, I'll take the counterargument.

FPGAs promise the designer 'arbitrary logic' and deliver 'a place others sell
into.'

I disagree that FPGA experts "like" the tools they are given; they tolerate
them. One of my friends worked at Xilinx for 15 years and understood this all
too well. He felt the leading cause of the problem was that the tools group
was a P&L center: it needed to turn a profit in order to exist. It got
that profit by charging high prices for the tools and high prices for support.
His argument was that 'easier' tools cut into support revenue. When I've had
high level (E-level, but not C-level) discussions with Xilinx and Altera there
has been a lot of acknowledgement about the 'difficulty of getting up to
speed' on the tool chain and many free hours of consulting are offered. From a
business engagement point of view, making hard to use tools and then "giving
away" thousands of dollars of free consulting to the customer to gain their
support seems to work well. The customer feels supported, and stops wondering
why, if they have consultants around for free, those consultants won't just
make the tools more straightforward to use and available on a wider variety
of platforms.

But the biggest thing has always been intellectual property. You buy an
STM32F4 and it has an Ethernet MAC on it (using Synopsys IP, as evidenced by
the note in the documentation); you pay $8 for the microprocessor, work around
the bugs, and get it running. If you buy an FPGA, let's say a Spartan 3E, you
pay $18 for the chip, and if you want to use that Synopsys Ethernet MAC?[1]
It's $25,000 for the HDL source to add to your project, or $10,000 if you are
OK with just the EDIF output, which can be fed into a place-and-route back
end. Oh, and some royalty if you ship it in a product you are selling.

The various places that have been accumulating 'open' IP, such as OpenCores
([http://opencores.org/](http://opencores.org/)), have been really helpful
for this, but I suspect the space really needs a different pricing model. A
lot of HDL is where OS source was back at the turn of the century (locked
down and expensive).

[1] I did this particular exercise in 2005 when I was designing a network
attached memory device
([https://www.google.com/patents/US20060218362](https://www.google.com/patents/US20060218362))
and was appalled at the extortionate pricing.

~~~
Florin_Andrei
> _From a business engagement point of view, making hard to use tools and then
> "giving away" thousands of dollars of free consulting to the customer to
> gain their support seems to work well._

To me, the entire recent history of the computer industry (well, all of it is
recent, BTW) shows that, if you want your technology to become mass-adopted,
you need to make it easier for the little guy to get in the game. The high
school kid tinkering with stuff in the parents' basement; the proverbial
starving student. That's how x86 crushed RISC; that's how Linux became
prominent; that's how Arduino became the most popular microcontroller platform
(despite more clever things being available).

You make the learning curve nice and gentle, and you draw into your ranks all
the unwashed masses out there. In time, out of those ranks the next tech
leaders will emerge.

~~~
ChuckMcM
I don't disagree, and I suggested as much to the Xilinx folks (well, their EVP
of marketing at the time): if they just added $0.25 to the price per chip,
they could fund the entire tools effort with that 'tax', and since they would
be 'giving away' the tools they could re-task all of the compliance guys who
were ensuring that licenses worked or didn't work into building useful
features.

Their counter is of course that they have customers who sweat the $0.25
difference in price (which I understand, but $10,000 in tools and $15,000 in
consulting a year is a hundred thousand chips, to which they say "oh, at that
volume we would waive the tooling cost"). And that got me back to your point:
"You already have their design win, why give them free tools? Why not give
free tools to those who have yet to commit to your architecture?"

It is a very frustrating conversation to have.

------
msandford
This gets even bigger if they throw their IP muscle behind it like they do
with ICC. If you can get (for pay or for free) fast matrix multiply, FFT,
crypto, etc. cores for the FPGA, you will see even faster adoption.

If they're clever enough to make some of those IP cores available to, say,
MATLAB, adoption will be faster still.

Nothing sells hardware better than "do no extra work but spend another couple
of grand and see your application speed up significantly."

~~~
rdrdss23
Can you elaborate on what you're trying to say?

MATLAB already has MATLAB->HDL, which works very well. We have a team that
uses it exclusively for FPGA programming.

~~~
msandford
MATLAB will recognize if you've got FFTW or ATLAS or other highly tuned
numerical libraries installed. And MATLAB will then use them whenever
possible.

If Intel does a good enough job of providing a collection of compute kernels
and the surrounding CPU libraries to make using them roughly as "easy" as CUDA
then a lot of people will pick that up.

I don't have any hard numbers but I would suspect that there are a great many
more people who use MATLAB on a CPU than those who do MATLAB->HDL. So what I'm
speculating about is that Intel might support those folks who use MATLAB on a
CPU for more general purpose things.

Does that make more sense?

------
peterwwillis
"Intel reveals its FrankenChip ARM killer: one FPGA and one Xeon IN ONE SOCKET

Scattered reports of maniacal cackling amid driving rain and lightning at
Chipzilla's lab"

Is this just a Register thing, or do all UK rags use this kind of
unprofessional hyperbole? It's _literally_ the most annoying thing in the
world.

~~~
Aqwis
The Register is 50% satire and 50% tech news. Don't take it seriously.

------
rthomas6
This is their competition: [http://www.xilinx.com/products/silicon-
devices/soc/zynq-7000...](http://www.xilinx.com/products/silicon-
devices/soc/zynq-7000/silicon-devices/index.htm)

Zynq has been out and working in industry for a couple years now.

~~~
hga
I don't know how much competition they're giving Xilinx, but Altera is doing
the same thing with the same high performance ARM core:
[http://www.altera.com/devices/processor/soc-
fpga/overview/pr...](http://www.altera.com/devices/processor/soc-
fpga/overview/proc-soc-fpga.html) As the fine article notes, Intel is now
doing some fabbing for Altera.

I've gotten the impression that putting a general purpose CPU in the corner of
an FPGA was a pretty standard thing.

One of the things that should differentiate this new effort from Intel is FPGA
"_direct access to the Xeon cache hierarchy and system memory_", per the
"_general manager of Intel's data center group, Diane Bryant_".

~~~
rthomas6
Direct cache access sounds cool, but I'm sort of under the impression that
Altera and Intel are playing catchup with the FPGA SoC idea. I believe Xilinx
was the first mover by a large margin, though I could be wrong. That doesn't
mean Xilinx's Zynq will always be the best product, but it is already for sale
right now, is an established product, and works well.

~~~
blackguardx
FPGAs with CPU cores have been around for over a decade. The difference is
that in the past, the industry has mostly focused on the PowerPC core. Now
that ARM has such tremendous popularity, it makes sense to focus on it.

~~~
rjsw
It made sense for Xilinx to use PowerPC in the past, the chips were being
fabbed by IBM.

------
nomnombunty
I have always wanted to learn Verilog. However, I find it quite different from
a typical programming language such as C or Java. What is the best way for
someone who has programming experience to learn Verilog?

~~~
deadgrey19
The first thing to know is that Verilog is not a programming language. It is a
hardware description language. This may sound picky, but it fundamentally
changes the way you need to think about using the language. With
Verilog/VHDL/HDL, you describe a circuit, which requires very different
thinking from programming languages, where you describe a sequence of
instructions.
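
To give a feel for that difference, here is a minimal sketch of an 8-bit
counter in Verilog (the module and signal names are just for illustration):

    // This describes a piece of hardware -- a register plus an adder that
    // exist permanently and update on every rising clock edge -- rather
    // than a sequence of instructions executed one after another.
    module counter (
        input  wire       clk,
        input  wire       rst,
        output reg  [7:0] count
    );
        always @(posedge clk) begin
            if (rst)
                count <= 8'd0;         // synchronous reset
            else
                count <= count + 8'd1; // count up once per clock cycle
        end
    endmodule

Every always block and every module instance runs concurrently with all the
others, which is the big mental shift coming from C or Java.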

The other thing to know is that HDLs are mostly the domain of electrical
engineers and hence have suffered from a lack of any "computer science" in
them. The languages and all of the tools are clunky and reminiscent of
1970s/1980s-style programming, from when CS and EE diverged. Hence, do not
expect to find decent online tutorials or freeware source code available. It's
all locked up and proprietary, as with all other EE tools.

The best place to start is with a textbook; this one
([http://www.amazon.com/Fundamentals-Digital-Logic-Verilog-
Des...](http://www.amazon.com/Fundamentals-Digital-Logic-Verilog-
Design/dp/0073380547/)) is a nice introduction to digital design with examples
in Verilog.

Personally I prefer VHDL, for which there is this fantastic introduction
([http://www.amazon.com/Circuit-Design-VHDL-Volnei-
Pedroni/dp/...](http://www.amazon.com/Circuit-Design-VHDL-Volnei-
Pedroni/dp/0262162245/)).

To make either of these useful, you will need a hardware platform and some
tools to play with. The DE1/DE2 is a reasonably priced entry board with plenty
of lights, switches and peripherals to play with, and it is well matched with
the textbooks above.

[http://www.terasic.com.tw/cgi-
bin/page/archive.pl?Language=E...](http://www.terasic.com.tw/cgi-
bin/page/archive.pl?Language=English&CategoryNo=165&No=83)

~~~
typon
1. I recommend that a newbie get a DE0-Nano board. It's much cheaper than the
DE1/DE2 and has fun sensors like an accelerometer on it, which can lead to
pretty cool applications. I designed a quadcopter control system entirely on
the DE0, using a NIOS II-based Qsys system. The academic price is only $59:
[https://www.terasic.com.tw/cgi-
bin/page/archive.pl?No=593](https://www.terasic.com.tw/cgi-
bin/page/archive.pl?No=593)

2. Fun fact: the cover of the Fundamentals of Digital Logic book has chess on
it because the author, Zvonko Vranesic, is not only a father of the FPGA/CAD
industry, he is also an international chess master. Also, he's quite good at
ping pong for being 76 :(

~~~
swetland
Depends what you're doing -- the DE0-Nano is nice for integrating into larger
projects, but if you want to, say, implement a little CPU and peripherals,
something like the plain ol' DE0 is only slightly more expensive and has a
bunch of buttons/switches/LEDs/7-segment displays for just-getting-started
projects, as well as handy IO interfaces like VGA, PS/2 mouse/keyboard, etc.

------
ddalex
This may be huge. People will use the FPGA as they use the GPU now, but the
FPGA has the potential to greatly reduce the programming complexity associated
with GPUs.

In the end, the success will boil down to how easy the development is and how
well designed the libraries are - if the framework is capable of automatically
reconfiguring the hardware to offload CPU-intensive tasks, this has high
potential for widespread adoption, not just in the datacenter.

~~~
nomnombunty
Can you elaborate on how the FPGA has the potential to reduce the programming
complexity associated with GPUs? I personally think it is harder to program
with Verilog than to program using CUDA.

~~~
hderms
Well, it's probably easier to program CUDA for embarrassingly parallel tasks,
or other tasks well suited to CUDA, but FPGAs might make certain tasks easier
because of their flexibility.
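
As a toy illustration of that flexibility, here is a rough Verilog sketch (the
module and signal names are made up) of a small streaming stage that computes
a population count on every 32-bit word, one result per clock cycle -- the
kind of arbitrary, bit-level datapath an FPGA lets you shape exactly to the
problem:

    // Toy streaming stage: on every clock edge it accepts a 32-bit word
    // and registers the number of set bits in it. Synthesis unrolls the
    // for loop into combinational adders, so the stage produces one
    // result per cycle.
    module popcount_stage (
        input  wire        clk,
        input  wire [31:0] data_in,
        output reg  [5:0]  ones_out
    );
        integer i;
        reg [5:0] ones_next;

        always @* begin
            ones_next = 6'd0;
            for (i = 0; i < 32; i = i + 1)
                ones_next = ones_next + data_in[i];
        end

        always @(posedge clk)
            ones_out <= ones_next;
    endmodule

Whether that is actually easier than the equivalent CUDA kernel is debatable,
which is probably your point.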

------
pbo
If I recall correctly, Intel tried a few years ago to sell a system-on-chip
combining an Atom CPU with an FPGA from Altera. I believe it didn't work very
well, especially with regard to communication and synchronization between the
two cores.

~~~
deadgrey19
It didn't work very well, but there is a good reason: nobody wanted a slow and
comparatively low-performance chip paired with a small FPGA connected via a
(slow) PCI Express link. There are hundreds of big FPGA boards with PCIe
connectors that can be tied to big CPUs already. It was a non-product from the
get-go.

------
mindcreek
Hmm, this might have implications for digital currencies and their mining.

~~~
deadgrey19
Unlikely. Custom-built application-specific integrated circuits (ASICs) (i.e.
bitcoin mining chips, e.g.
[http://www.butterflylabs.com/](http://www.butterflylabs.com/)) will always be
faster than FPGAs (which are comparatively slow) and CPUs (which are fast, but
general).

~~~
tromp
Except that every other month or so sees the introduction of a new proof-of-
work system that won't have ASICs for years, if ever. FPGAs can easily
outperform the CPU/GPU competition on any alt-coin using such a proof-of-work.

------
pling
Sold! Seriously. This is what I've wanted for the last two decades.

------
retroencabulato
A good time to know Verilog.

------
YZF
Sounds like a bit of a gimmick.

FPGAs are typically used in ASIC development to emulate the ASIC being
developed. I've seen boards with 20 FPGAs emulate an ASIC design at less than
~1/10th of the speed and well over 10x the power. While FPGAs are programmable
hardware, they are far less efficient than custom hardware for various
reasons. Naturally, ASIC emulation is an application where FPGAs have a very
large advantage over software... At volume they're also a lot more expensive,
and good tools are also _very_ expensive (virtually no mass-produced
commercial product uses FPGAs). Now obviously if the FPGA is inside the Xeon
you're not really paying
much more for it (except you lose whatever other function could be crammed in
there).

Companies like Microsoft, Facebook, and Google have enough servers to make a
custom block inside Intel's CPU more attractive than an FPGA in terms of
price/power/performance (and they can get that from ARM vendors, which is
probably scaring Intel).

CPU vendors have spent the last several decades moving more and more
applications that used to be in the realm of custom hardware to the realm of
software. There are certainly niches of highly parallelizable operations but a
lot of general purpose compute is very well served by CPUs (and a lot of it is
often memory bandwidth bound, not compute bound). Some of these niches have
already been semi-filled through GPUs, special instructions etc.

The FPGA on the Xeon is almost certainly not going to have access to all the
same interfaces that either a GPU or the CPU has and is only going to be
useful for a relatively narrow range of applications.

I think what's going on here is that as the process size goes down, simply
cramming more and more cores into the chip makes less and less sense, i.e.
things don't scale linearly in general. So the first thing we see is cramming
a GPU in there, which eventually also doesn't scale (and also isn't really a
server thing). Now they basically have extra space and don't really know what
to put in it. Also, each of the current blocks (GPU, CPU) is so complicated
that trying to evolve it is very expensive.

EDIT: Just to explain a little where I'm coming from here. I worked for a
startup designing an ASIC where FPGAs were used to validate the ASIC design. I
also worked on commercial products that included FPGAs for custom functions
where the volume was not high enough to justify an ASIC and the problem
couldn't be solved by software. I worked with DSPs, CPUs, various forms of
programmable logic, SoCs with lots of different HW blocks etc. over a long
long time so I'm trying to share some of my observations... If you think
they're absolutely wrong I'd be happy to debate them.

EDIT2: Re-reading what I wrote it may sound like I am saying I am an ASIC
designer. I'm not. I'm a software developer who has dabbled in hardware design
and has worked in hardware design environments (i.e. the startup I worked for
was designing ASICs but I was mostly working on related software).

~~~
astrodust
FPGAs are terrible at emulating ASICs, but CPUs are even worse, yet FPGAs do
excel at certain problems that can be expressed as programmable logic that
operates in a massively parallel manner.

What if the Intel FPGA did have access to the same resources as a GPU? This
isn't inconceivable; it's in the same socket as the CPU.

This gives you the ability to implement specialized algorithms related to
compression, encryption, or stream manipulation in a manner that's way more
flexible than a GPU can provide, and way more parallel than a CPU can handle.

~~~
CamperBob2
_What if the Intel FPGA did have access to the same resources as a GPU? This
isn't inconceivable; it's in the same socket as the CPU._

An FPGA that competes with a modern GPU would probably cost in the
neighborhood of US $50,000 per chip.

~~~
astrodust
In a general sense, yes, but not in very narrow problems where the GPU would
stumble and flail because of architectural limitations that would prevent it
from fully applying itself.

