
On FPGAs as PC Coprocessors (1996) - zpiman
http://www.fpgacpu.org/usenet/fpgas_as_pc_coprocessors.html
======
13of40
Something I've often dreamed about is an FPGA board in PCIe card form with a
sane toolset alongside it, so I can treat it as software instead of getting
advanced degrees in desktop cable management and I/O pin mapping. Does
something like that exist?

~~~
slivym
What you're describing is OpenCL. Yes, it exists: both Xilinx and Intel
produce toolsets. No, they aren't sane by software standards, but they're
fantastic compared to hardware engineering. A card will cost you ~$10k for
something you'd actually get acceleration from
([https://www.xilinx.com/products/boards-and-kits/alveo/u250.h...](https://www.xilinx.com/products/boards-and-kits/alveo/u250.html)),
and you'll still need a degree in electronic engineering to produce something
that convincingly accelerates your task.

~~~
imtringued
Most FPGAs that are viable accelerators aren't for hobbyists, but they also
are not as expensive as you think. I can't find it anymore, but I once saw an
online shop with a huge variety of 500k+ LUT FPGA modules (just the chip
itself on a small PCB) for around 1000€, plus 500€ for a breakout
board/mainboard. At those prices it makes more sense as an individual to
invest in more CPU cores or a GPU (if your problem maps to it).

Edit: Maybe it's this one here:
[https://shop.trenz-electronic.de/de/TE0808-04-06EG-1EE-Ultra...](https://shop.trenz-electronic.de/de/TE0808-04-06EG-1EE-UltraSOM-MPSoC-Modul-mit-Zynq-UltraScale-XCZU6EG-1FFVC900E-4-GB-DDR4?c=449)

~~~
gudok
How much time would it take to synthesize a 500K-LUT design on a high-end
workstation?

~~~
nraynaud
Wait, we have an FPGA-based hardware module to accelerate synthesis!

(or even more ironic: an ASIC module)

------
parski
I'd think it would be cool to have an FPGA in my PC for various kinds of
emulation. If I want to play some old games I can use it for accurate
emulation (like the MiSTer project[1]) or if I'm in a DAW and want to produce
audio from some old synthesizer I can do that on the FPGA to get a more
authentic sound. Likely niche but I'd be all over it.

[1] [https://github.com/MiSTer-devel/Main_MiSTer/wiki](https://github.com/MiSTer-devel/Main_MiSTer/wiki)

------
sytelus
Probably a stupid question: instead of 6-core or 8-core CPUs, why doesn't
Intel make 4 traditional cores + 2 FPGA cores on the same die?

~~~
slivym
It's a good question. The answer is that they have done this (or at least
they are doing this):
[https://www.nextplatform.com/2018/05/24/a-peek-inside-that-i...](https://www.nextplatform.com/2018/05/24/a-peek-inside-that-intel-xeon-fpga-hybrid-chip/)

What's proving to be a problem though is where does this fit? If you don't
have a clear need for an FPGA then just buy a normal Xeon. If you do need an
FPGA then why compromise your Xeon? Have an FPGA card, or hell a group of FPGA
cards.

The only place this makes sense is if you can think of a use case where you
have an FPGA task that needs low-latency communication with your CPU. Even
with this chip, though, you have an uphill struggle, because the cache
hierarchy of a Xeon makes memory access latency non-deterministic, which
traditionally isn't what FPGA designs are built for. It's much more difficult
to design your algorithm on an FPGA to deal with arbitrary memory latency.

So the question back to you is: What would you use it for?

~~~
tyingq
The TI AM335x CPU has something (sort of) like this: basically 2
microcontrollers that share memory with the ARM CPUs.

People have done some pretty clever things with it. Audio processing, driving
LED matrix boards, emulating old video boards, driving precision servos,
software oscopes and logic analyzers, etc.

Though that's on a small dev board, like the Beaglebone Black, not a beefy
Intel server.

------
jakeogh
Kinda sorta related: Novena, since it has an on-board FPGA:
[https://kosagi.com/w/index.php?title=Novena_Main_Page](https://kosagi.com/w/index.php?title=Novena_Main_Page)

------
08-15
The article seems to say an FPGA on a high-latency bus can only accelerate
workloads that are streamed via DMA, and implies that a general-purpose
accelerator has to be closer to the CPU. Sounds like a coprocessor, like
putting an FPGA into the slot where the 8087 used to be.

That made me think, why not get even closer? Why not have an FPGA as execution
unit? Modern CPUs have multiple ALUs, multiple FPUs, multiple vector units.
Wouldn't it be great if an FPGA was added to that, such that the instruction
set becomes extensible?

The idea is too obvious to assume nobody ever thought of it. Why isn't it
done?
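
To make the latency point in the first paragraph concrete, here is a minimal sketch of a toy cost model. Everything in it is an assumption for illustration: the function names, the default throughput and latency figures, and the simplification that transfer and compute do not overlap. It only shows the shape of the trade-off, not measurements of any real bus or device.

```python
# Toy model: when does shipping work across a bus to an accelerator pay off?
# All default numbers below are hypothetical placeholders, not measurements.

def cpu_time(n_bytes, cpu_bytes_per_s):
    """Time to process n_bytes on the host CPU."""
    return n_bytes / cpu_bytes_per_s

def offload_time(n_bytes, latency_s, bus_bytes_per_s, accel_bytes_per_s):
    """One fixed round-trip latency, data sent there and back, plus compute.
    Transfer and compute are assumed not to overlap (pessimistic)."""
    transfer = 2 * n_bytes / bus_bytes_per_s
    compute = n_bytes / accel_bytes_per_s
    return latency_s + transfer + compute

def speedup(n_bytes, cpu_bytes_per_s=1e8, latency_s=1e-5,
            bus_bytes_per_s=1e9, accel_bytes_per_s=2e9):
    """Ratio > 1 means the offload wins; < 1 means the CPU alone is faster."""
    return cpu_time(n_bytes, cpu_bytes_per_s) / offload_time(
        n_bytes, latency_s, bus_bytes_per_s, accel_bytes_per_s)

small = speedup(64)            # one cache line's worth of work per call
large = speedup(64 * 10**6)    # a large block streamed in one go
```

Under these made-up numbers the small call is dominated by the fixed latency and loses badly, while the large streamed block wins. That is the article's argument in miniature: a far-away accelerator only helps when each trip carries a lot of work.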

~~~
GrumpyYoungMan
It has been researched: Alessandro Forin's eMIPS research project was on
integrating FPGA fabric as an execution unit.

Project page: [https://www.microsoft.com/en-us/research/project/emips/](https://www.microsoft.com/en-us/research/project/emips/)

Research paper: [https://www.microsoft.com/en-us/research/wp-content/uploads/...](https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/emips-emipsreport1.pdf)

Back then Moore's Law was still going full steam so there wasn't much interest
but, who knows, maybe that will change in a few years.

------
fromthestart
> So as long as FPGAs are attached on relatively glacially slow I/O buses --
> including 32-bit 33 MHz PCI

GPUs are on the PCI bus, aren't they? Has something changed in the last two
decades to increase bandwidth?

~~~
slededit
GPUs worked well because you could transfer all your large art assets upfront
and then only communicate your mesh and shader logic as the game ran. They
don't work so well if you need frequent access to system memory.

~~~
imtringued
They also do not have to send the result back to the CPU because they have a
video output directly attached to them.
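
For the raw bandwidth part of the question, a rough sketch of theoretical peak rates, assuming the 32-bit 33 MHz PCI bus named in the quote and, as a modern comparison point, a PCIe 3.0 x16 slot (8 GT/s per lane, 128b/130b encoding, per direction). The function names are invented for illustration, real throughput is lower than these peaks, and none of this changes the latency argument.

```python
# Theoretical peak bandwidth only; protocol overhead and latency are ignored.

def pci_peak_bytes_per_s(width_bits, clock_hz):
    """Parallel PCI: at best one bus-width word per clock."""
    return width_bits / 8 * clock_hz

def pcie_peak_bytes_per_s(gt_per_s, lanes, payload_bits, coded_bits):
    """Serial PCIe: per-lane rate scaled by line-code efficiency, per direction."""
    bits_per_lane = gt_per_s * 1e9 * payload_bits / coded_bits
    return bits_per_lane / 8 * lanes

pci = pci_peak_bytes_per_s(32, 33.33e6)               # roughly 133 MB/s
pcie3_x16 = pcie_peak_bytes_per_s(8.0, 16, 128, 130)  # roughly 15.75 GB/s
ratio = pcie3_x16 / pci                               # about two orders of magnitude
```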

------
vmh1928
The Netezza shared-nothing database appliance used FPGAs as helper cards on
each of the x86 data blade servers. A little more about how it worked here
(and via the DuckDuckGo):

[https://www.slideshare.net/classicboyir/netezza-pure-data](https://www.slideshare.net/classicboyir/netezza-pure-data)

