Edit: Maybe it's this one here. https://shop.trenz-electronic.de/de/TE0808-04-06EG-1EE-Ultra...
(or even more ironic: an ASIC module)
I caution you not to dismiss the entire field of computer hardware engineering as "cable management". If that's your view, best just stick to whatever you're doing now.
What's proving to be a problem, though, is where this fits. If you don't have a clear need for an FPGA, then just buy a normal Xeon. If you do need an FPGA, then why compromise your Xeon? Have an FPGA card, or, hell, a group of FPGA cards.
The only place this makes sense is if you can think of a use case where you have an FPGA task that needs low-latency communication with your CPU. Even with this chip, though, you face an uphill struggle, because the cache hierarchy of a Xeon makes memory access non-deterministic, which traditionally isn't what FPGAs are designed for. It's much more difficult to design your algorithm on an FPGA to deal with arbitrary memory latency.
So the question back to you is: What would you use it for?
People have done some pretty clever things with it. Audio processing, driving LED matrix boards, emulating old video boards, driving precision servos, software oscopes and logic analyzers, etc.
Though that's in a small dev board, like the Beaglebone Black, not a beefy Intel server.
There are a few things that can be done very well on an FPGA, but most things can't, and the market for it is tiny.
If you really have an application that's perfect for a CPU/FPGA combo, just buy a PCIe card with a beefy FPGA.
It will cost you, but the development of the FPGA logic will cost way more.
Think of them like graphics cards, but even more niche. Sticking them directly into the CPU isn't going to provide the power of a dedicated add-on card.
That made me think: why not get even closer? Why not have an FPGA as an execution unit? Modern CPUs have multiple ALUs, multiple FPUs, multiple vector units. Wouldn't it be great if an FPGA were added to that, such that the instruction set became extensible?
The idea is too obvious to assume nobody ever thought of it. Why isn't it done?
Project page: https://www.microsoft.com/en-us/research/project/emips/
Research paper: https://www.microsoft.com/en-us/research/wp-content/uploads/...
Back then Moore's Law was still going full steam, so there wasn't much interest, but, who knows, maybe that will change in a few years.
GPUs are on the PCI bus, aren't they? Has something changed in the last two decades to increase bandwidth?
PCIe 3: 16 lanes × 8 GT/s × 128/130 (encoding) ≈ 126 Gbit/s
So, yes, it has changed quite a bit!
But so has everything else.
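To put the comparison in perspective, here's a rough back-of-the-envelope calculation of theoretical per-direction bandwidth for classic PCI and a couple of PCIe generations (the PCIe 1.x figures are added for context; all numbers are line-rate maxima after encoding overhead, not real-world throughput):

```python
def pcie_gbit_per_s(lanes, gt_per_s, payload_bits, total_bits):
    """Theoretical per-direction PCIe bandwidth in Gbit/s after line encoding."""
    return lanes * gt_per_s * payload_bits / total_bits

# Classic 32-bit / 33 MHz PCI: a parallel bus shared by all devices.
pci_classic = 32 * 33e6 / 1e9                 # ~1.06 Gbit/s, shared

# PCIe 1.x: 2.5 GT/s per lane, 8b/10b encoding.
pcie1_x16 = pcie_gbit_per_s(16, 2.5, 8, 10)   # 32 Gbit/s

# PCIe 3.0: 8 GT/s per lane, 128b/130b encoding.
pcie3_x16 = pcie_gbit_per_s(16, 8, 128, 130)  # ~126 Gbit/s

print(f"PCI (32-bit/33 MHz, shared bus): {pci_classic:.2f} Gbit/s")
print(f"PCIe 1.x x16:                    {pcie1_x16:.1f} Gbit/s")
print(f"PCIe 3.0 x16:                    {pcie3_x16:.1f} Gbit/s")
```

So a PCIe 3 x16 slot offers over a hundred times the raw bandwidth of the old shared PCI bus, and each device gets its own dedicated lanes rather than contending for one bus.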
If you want performance, you're still better off doing it through DMA transfers that bypass the CPU, because otherwise the CPU will sit waiting for thousands of cycles to fetch data from the device on the other side of the bus.
And the transfers that are done by the CPU should be write-only to the bus as much as possible.
But PCIe 3 is a whole different beast than PCI.