
FPGA/DNN Co-Design: An Efficient Methodology for IoT Intelligence - Katydid
https://arxiv.org/abs/1904.04421
======
light_hue_1
This is a terrible paper and the authors didn't employ even the most basic
scientific standards. They search over the shape of the network that works
best on the FPGA to give them their target performance on the FPGA. But they
don't search over the network that gives the best target performance on the
GPU! Of course the FPGA wins.

We have known for years now that deep networks are extremely compressible once
they're trained. You can drop the vast majority of the weights and still
maintain almost all of the accuracy. Likewise you can drop the precision of
the weights from float32 to int8, and even to binary, and still perform well.
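A minimal numpy sketch of the kind of compression being described: magnitude pruning followed by symmetric int8 quantization. The 90% sparsity level and the layer shape are illustrative assumptions, not numbers from the paper or the comment.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for one trained layer's weight matrix.
w = rng.standard_normal((256, 256)).astype(np.float32)

# Magnitude pruning: zero out the 90% of weights with the smallest |w|.
threshold = np.quantile(np.abs(w), 0.9)
w_pruned = np.where(np.abs(w) >= threshold, w, 0.0)

# Symmetric int8 quantization of the surviving weights.
scale = np.abs(w_pruned).max() / 127.0
w_int8 = np.round(w_pruned / scale).astype(np.int8)

# Dequantize for float math; the per-weight error is at most scale / 2.
w_deq = w_int8.astype(np.float32) * scale

sparsity = (w_int8 == 0).mean()
err = np.abs(w_deq - w_pruned).max()
print(f"sparsity: {sparsity:.2f}, max quantization error: {err:.4f}")
```

In practice the pruned network is briefly fine-tuned to recover the small accuracy loss, but even this crude one-shot version shows how much redundancy a trained layer carries.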

This paper is a joke and it deserves to be rejected from anywhere it's
submitted. They optimize the shape of the FPGA network but not the GPU
network. They don't apply the standard weight-pruning methods, and they don't
compare against those methods either. They also pick one GPU network
essentially at random; plenty of object detectors are faster than YOLO, and
people have explored the speed-accuracy tradeoff closely.

About FPGAs in general: there are good reasons why FPGAs have been just over
the horizon for a long time. The toolchains suck and they're mostly very
closed. Debugging is very hard. They cost 20x or so as much as GPUs. Your
code becomes very specific to one particular FPGA, so upgrading is a total
pain. And more. There are plenty of better options out there, like TPUs.

~~~
lnsru
Only FPGAs have deterministic behavior, so they still have a bright future.
The tools are another topic, but FPGAs are perfect for this particular task.
It makes no sense to compare my two favorite IDEs, Qt and Vivado: their
purposes are very different. FPGA debugging is easy when you can simulate all
your code before going to hardware. Hardware debugging with an integrated
logic analyzer is really easy. For hardcore projects you can use a third-party
logic analyzer that streams every event in the system over a 10G Ethernet
interface. Portability of the code depends solely on the author.

Edit: GPUs are nice and powerful, but they are big, require a separate
computer, and draw lots of power.

~~~
Erlich_Bachman
> Only FPGAs have deterministic behavior

What? Calculations on GPUs are deterministic.

~~~
lnsru
You still have a computer with an operating system attached to the GPU. An
FPGA can operate standalone.

Edit: There are attempts at real-time Linux. I saw impressive demos from
OSADL, but none deployed in the field with a GPU.

~~~
Erlich_Bachman
But FPGAs are still much more expensive than using GPUs overall, even if you
include the cost of the computer.

------
m0zg
The question in 2019 is not whether the FPGA outperforms the GPU, but whether
it outperforms the various TPU-like accelerators: Google's Edge TPU, Apple's
Neural Engine, Huawei's NPU, Bitmain, etc. And the answer to that question is
very likely "no".

