Truly the hallmark of any reputable FPGA benchmark: "We hired an intern, put him through a lobotomy, and then had him write code for the GPU; our team of seasoned professional FPGA design engineers then spent the next 7 years writing code that really kicked his arse."
If I understand correctly, the price/performance of higher-end FPGAs is bonkers because production volume is nil and they are aimed at simulating larger circuits, not faster ones.
FPGAs' flexibility has potential, and they have come into a comparable price range, but they won't be truly competitive for some time yet.
They didn't implement it in proper VHDL/Verilog. They used an OpenCL compiler, which is a great waste of resources. Of course, that made the comparison quick.
A better solution would be high-level synthesis from Matlab/Python/C instead of blindly replicating OpenCL kernels designed for a GPU; see the sketch below. That might work even better on a less fancy FPGA than the Arria 10.
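For illustration, this is roughly what a C-based HLS kernel could look like. This is a minimal sketch in the Xilinx/AMD Vitis HLS style (the paper targets an Intel Arria 10, whose HLS tooling uses different directives); the function name, loop, and pragma placement are made up for this example, not taken from the paper.

    /* Hypothetical HLS sketch (Vitis HLS style), illustrative only:
     * a complex multiply over a stream of samples. */
    void cmul_stream(const float *in_re, const float *in_im,
                     const float *w_re,  const float *w_im,
                     float *out_re, float *out_im, int n)
    {
        for (int i = 0; i < n; i++) {
    #pragma HLS PIPELINE II=1  /* ask the tool for one result per clock */
            /* complex multiply, a core operation in imaging/gridding */
            out_re[i] = in_re[i] * w_re[i] - in_im[i] * w_im[i];
            out_im[i] = in_re[i] * w_im[i] + in_im[i] * w_re[i];
        }
    }

The point of HLS here is that you describe the arithmetic once in plain C and let the tool decide how deeply to pipeline and replicate it for the target device, rather than inheriting a GPU-shaped kernel structure.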
One of the key points of the paper is about how OpenCL makes it easier to implement things for an FPGA. Using Verilog/VHDL is about an order of magnitude more work, which would probably completely disqualify using the FPGA for a project like this.
"The source code for the FPGA imager is highly different from the GPU code.This is mostly due to the different programming models: with FPGAs, one buildsa dataflow pipeline, while GPU code is imperative."
Please explain how they used OpenCL kernels designed for a GPU.
You can think of OpenCL kernels (or any imperative sequence of low-level operations) as data flowing through math operations. Normally, we leverage a single set of math circuits to perform all of these operations in sequence, and orchestrate the data flow through a register file. You could imagine removing the register file and instantiating an actual circuit that represents the data flow of the program itself. This creates more opportunity for pipelining, which should be plentiful in a highly data-parallel computation. The issue with FPGAs is that they are clocked lower and are not very dense, so the tradeoff is generally not worth it.
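To make that concrete, here is a tiny, hypothetical OpenCL C kernel (the name and arguments are made up, not from the paper). On a GPU, thousands of work-items each execute this body on one element; an FPGA OpenCL compiler instead instantiates the multiplier and adder as a fixed circuit and streams elements through it, which is exactly the "dataflow pipeline" the paper describes.

    /* Hypothetical OpenCL C kernel, illustrative only: out = a*x + y */
    __kernel void axpy(__global const float *x,
                       __global const float *y,
                       __global float *out,
                       const float a)
    {
        size_t i = get_global_id(0);   /* which element this work-item handles */
        out[i] = a * x[i] + y[i];      /* one multiply + one add per element */
    }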
Yes, I don't think it's an issue with the compiler. The FPGA approach requires a flexible fabric that just has lots of overhead to provide programmability compared to an ASIC. For an FPGA to have value, you _really_ need to leverage its programmability. Emulating an ASIC design for verification and testing is a good use case.
The paper mentions that the SKA has specific requirements but doesn't really go into details.
If you put a radio telescope in the middle of nowhere, where you need to build your own power plant and deal with the logistics of transport, then you care about power efficiency and robustness.
That doesn't make everything irrelevant, but it's definitely weird to publish a paper about this in 2019.