Raspberry Pi provides documentation for its GPU architecture, so it would be possible to add support for it to open source machine learning frameworks. That would involve quite a bit of work, though, and the Pi isn't really competitive with modern hardware in performance-per-watt terms, even when using GPU compute.
I believe Idein did that. At least they regularly post impressively fast (for the Pi) examples to /r/raspberry_pi, like https://redd.it/a5o6ou. It seems the results aren't available standalone or as open source, though, only in the form of a service (https://actcast.io/).
There are some well-optimised libraries, for example a port of darknet that uses NNPACK and some other NEON goodies. You can get about 1 fps with Tiny YOLO. Not sure if it uses anything on the GPU, though.
Yes, I know. My point was that CPU-only deep learning is possible on the Pi if you don't need real-time inference. What I wasn't sure of was whether that specific port does anything on the GPU at all, or if it's only using NEON intrinsics.