Nvidia's Tegra X1 is supposedly capable of <10 ms inference for ImageNet-grade models. It's fair to assume, though, that this is for trimmed-down and/or 16-bit models rather than full Inception models.
And finally, Sam, who also facilitated building TF on the Pi, is about to host a six-week, half-theory, half-practice course on TF and deep learning (methinks he deserves this plug).
https://petewarden.com/2014/08/07/how-to-optimize-raspberry-... (example involves deep learning!)
https://www.raspberrypi.org/forums/viewtopic.php?f=29&t=7891... (someone actually tried making an LLVM backend!)
I really doubt someone integrated that into TensorFlow…
> it was not feasible to analyze every image captured from the PiCamera using TensorFlow, due to overheating of the Raspberry Pi when 100% of the CPU was being utilized
Just put a heatsink on the CPU. It's $1.50–$1.95 on Adafruit. I glue a heatsink onto every RPi3 unit I build.
> it was taking too long to load the 85 MB model into memory, therefore I needed to load the classifier graph to memory
Yeah, one of the first things you learn with TF on the RPi is to daemonize it: load everything you can up front, then just process everything in a loop. That initialization is super slow, but after that it's fast enough. YMMV.
Even with the heatsink (which we install on all of the Pis), we were still having overheating issues. We tried a few other things too to mitigate the problem:
1. Reducing the sampling rate for image recognition (but if we sampled less often than every few seconds we could miss the express trains)
2. Using a cooling fan (https://www.amazon.com/gp/product/B013E1OW4G/ref=oh_aui_sear...) - this still didn't prevent overheating when the CPU was continuously loaded at 100%.
3. Only sampling images where we detected motion (https://svds.com/streaming-video-analysis-python/)
We went with the third option: we use our motion detection algorithm, which is sensitive to false positives, to decide which images to sample, and then use deep learning image recognition to eliminate those false positives.
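The motion-gating idea can be sketched in a few lines of plain Python. This is a toy illustration, not the actual algorithm from the linked post: `mean_abs_diff`, `frames_to_classify`, and the threshold value are all made up for the example, and real frames would be 2-D arrays from the PiCamera rather than flat lists.

```python
def mean_abs_diff(prev, curr):
    """Mean absolute per-pixel difference between two grayscale frames."""
    return sum(abs(a - b) for a, b in zip(prev, curr)) / len(curr)

def frames_to_classify(frames, threshold=10.0):
    """Yield only frames that differ enough from the previous frame;
    everything else never reaches the (expensive) classifier."""
    prev = frames[0]
    for curr in frames[1:]:
        if mean_abs_diff(prev, curr) > threshold:
            yield curr            # candidate event: hand off to image recognition
        prev = curr

# Toy 4-pixel "frames": static scene, sensor noise, then a train passes.
frames = [[10, 10, 10, 10],
          [11, 10, 10, 10],       # tiny change: below threshold, skipped
          [200, 200, 200, 200]]   # large change: sent to the classifier
print(list(frames_to_classify(frames)))
```

The point is that the CPU only spikes on the rare frames with motion, so average load (and temperature) stays low even though the camera runs continuously.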
Happy to chat more about your experiences daemonize-ing TF applications!
Are you seeing anything happen, other than some slight throttling?
The chip cannot fry itself. It's designed to slow down so as to stay below the dangerous temperature range.
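If you want to see whether throttling is actually happening, you can watch the SoC temperature with `vcgencmd measure_temp`. A minimal parsing sketch; the output format shown (`temp=48.3'C`) is the Raspbian one, and the 80 °C limit here is my assumption, not an official spec:

```python
def parse_vcgencmd_temp(output):
    """Parse the output of `vcgencmd measure_temp`, e.g. "temp=48.3'C"."""
    return float(output.strip().removeprefix("temp=").rstrip("'C"))

def is_throttling_risk(temp_c, limit=80.0):
    """Assumed limit: the SoC starts slowing down around 80 C (an assumption)."""
    return temp_c >= limit

print(parse_vcgencmd_temp("temp=48.3'C"))  # -> 48.3
```

In practice you'd poll this in the background and log it alongside your frame rate to confirm the slowdown correlates with temperature.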
> Happy to chat more about your experiences daemonize-ing TF applications!
Eh, that was just a fancy way of saying I do what you do: launch the program once and let it run forever. It performs initialization (which takes a long time), then drops into a processing loop: wait for input / read / process / do something / repeat. Pretty basic stuff, really.
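The "initialize once, loop forever" pattern looks roughly like this. A sketch only: `load_model` and `classify` are placeholders standing in for the slow TF graph load and the per-frame inference, and the frame numbers are fake input.

```python
import queue
import threading

def load_model():
    """Stand-in for the slow one-time startup (e.g. loading an 85 MB graph)."""
    return {"ready": True}

def classify(model, frame):
    """Stand-in for per-frame inference; fast once the model is resident."""
    return "train" if frame % 2 == 0 else "no_train"

def daemon_loop(model, frames, results):
    """Core of the pattern: initialization happens BEFORE this loop, never inside it."""
    while True:
        frame = frames.get()
        if frame is None:          # sentinel: shut down cleanly
            break
        results.append(classify(model, frame))

frames = queue.Queue()
results = []
model = load_model()               # pay the slow startup cost exactly once

worker = threading.Thread(target=daemon_loop, args=(model, frames, results))
worker.start()
for f in range(4):                 # simulate incoming PiCamera frames
    frames.put(f)
frames.put(None)
worker.join()
print(results)
```

In a real daemon the queue would be fed by the camera callback and the process would be supervised by systemd or similar, but the shape is the same.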
Ever had any problems without it?
Either that or it was meant for outdoors operation in arctic regions.
EDIT: I guess the question should be: is there an RPi-sized machine available that is more suitable for training ANNs?
I imagine a proliferation of robots, security cameras, and smart open-source Siri/Alexa-like assistants.
On the Pi 3, our application processes 320×240 images at 10 FPS without any problems.
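For context, the arithmetic behind that claim: 10 FPS leaves a fixed per-frame time budget that everything (capture, motion check, any inference) has to fit inside.

```python
width, height, fps = 320, 240, 10

budget_ms = 1000 / fps              # time available per frame, in milliseconds
pixels_per_second = width * height * fps

print(budget_ms)                    # -> 100.0 ms per frame
print(pixels_per_second)            # -> 768000 pixels/s sustained
```

That 100 ms budget is why the TF classifier only runs on motion-gated frames rather than on every frame.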
Let me know if you have any questions!