We were working on a project to detect objects using deep learning on a Raspberry Pi, and we benchmarked various deep learning architectures on the Pi. With ~100-200 images, you can create a detector of your own with this method.
In this post, we detect vehicles in Indian traffic using the Pi, and we've added GitHub links to code for training the model on your own dataset plus a script to run inference on the Pi. Hope this helps!
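Roughly, the "train on ~100-200 of your own images" step is transfer learning. Here's a minimal sketch of the idea using Keras with a MobileNetV2 classification head; the folder layout and class setup are hypothetical placeholders, and the actual repos linked in the post (which do detection rather than classification) may look quite different:

```python
# Minimal transfer-learning sketch: freeze a pretrained base, train a small new head.
# Hypothetical folder layout: data/train/<class_name>/*.jpg
import tensorflow as tf

IMG_SIZE = (224, 224)

train_ds = tf.keras.utils.image_dataset_from_directory(
    "data/train", image_size=IMG_SIZE, batch_size=16)
num_classes = len(train_ds.class_names)

# Pretrained ImageNet base, frozen so only the new head is trained.
base = tf.keras.applications.MobileNetV2(
    input_shape=IMG_SIZE + (3,), include_top=False, weights="imagenet")
base.trainable = False

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # MobileNetV2 expects [-1, 1]
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=10)
```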
One advantage of the API-based approach is that you can get much higher FPS without compromising accuracy, and it's also independent of the Pi's CPU power, heating, etc.
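For reference, off-device inference is basically just shipping frames from the Pi to a hosted endpoint. A minimal sketch with `requests`; the URL, auth header, and response format here are placeholder assumptions, so substitute whatever the API you actually use documents:

```python
# Off-device inference sketch: POST a captured frame to a hosted detection API.
# The endpoint and credential below are hypothetical placeholders.
import requests

API_URL = "https://example.com/v1/detect"   # placeholder endpoint
API_KEY = "YOUR_API_KEY"                    # placeholder credential

with open("frame.jpg", "rb") as f:
    resp = requests.post(
        API_URL,
        files={"file": f},
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=10,
    )
resp.raise_for_status()
print(resp.json())  # e.g. a list of boxes/labels, depending on the service
```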
lets you customize a neural net with a small number of specific images (using a technique called 'transfer learning')
You didn't answer this question. Your "sorta-answer" suggests "yes", but the title "How to easily Detect Objects with Deep Learning on Raspberry Pi" suggests that your answer should be "no".
The title wasn't "How to easily Detect Objects with Deep Learning on Raspberry Pi with cloud services".
How am I suggesting "yes"? And how is the title suggesting the answer is "no"? There are pros and cons to both methods. If you are doing inference in a remote place with no internet access, off-device is out of the question. We are just trying to give the complete landscape so that if someone has a use case and is trying to come up with a solution, it might be helpful. Depending on the use case, you can pick on-device or off-device.
Squeezing a full-blown ML tutorial into a blog post is a tall order. Of course it's too thin to really do it yourself without a lot of further research; you can't really expect anything else. But I think the title leads people to expect more detail, hence posts like this and the owl cartoon above.
Maybe add a breakdown of performance for doing this locally on a Pi vs. using the API? It would make it easier for people to weigh the pros and cons.
$0 for 1,000 slow API calls
$79 for 10,000 fast API calls
To put that into perspective, 10k API calls is less than 10 minutes of 24 fps video (10,000 frames ÷ 24 fps ≈ 7 minutes). You should have a much higher plan or a pay-per-request overage price.
If you want to run hours' or days' worth of video through an object detector, you probably want to go out and buy a GPU and a machine of your own to stick it in...
I'm curious what application you're thinking of where this seems like "real-world usage"? (I can imagine applications like vision-controlled drones, but I'm pretty sure places like ETH Zurich have better solutions (as in "less generalised and more applicable to drone control") and in-house hardware to train and run them on.)
One application would be nudity detection for a family-friendly site; lots of video would need to be checked.
The argument that you would want to run your own machine validates my point. However, the same could have been said for video encoding or any other form of intense processing, all of which now have cloud alternatives.
Nudity detection though - I'd probably at least try doing something like "Check every 50+rand(100) frames, and only examine more carefully if you get hits on that sampling". Sure - that's "game-able" - but subliminal nudity isn't something I'd expect trolls or griefers to expend too much effort to slide past your filters...
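A minimal sketch of that sampling idea, with `classify_frame()` as a hypothetical stand-in for whatever detector or API call you'd actually use:

```python
# Sparse-sampling sketch: only classify roughly every 50 + rand(100) frames,
# and drop to frame-by-frame checking around any hit.
import random

def scan_video(frames, classify_frame, dense_window=200):
    i = 0
    flagged = []
    while i < len(frames):
        if classify_frame(frames[i]):
            # Hit: examine the surrounding window more carefully.
            start = max(0, i - dense_window)
            end = min(len(frames), i + dense_window)
            flagged.extend(j for j in range(start, end) if classify_frame(frames[j]))
            i = end
        else:
            i += 50 + random.randint(0, 100)
    return flagged
```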
I agree you need 24 FPS output, but you don't need to process all 24 frames raw as images.
But yes, no disagreement here.
$299 for 100k images
$499 for 1M images
We are adding plans for video, since 24 FPS doesn't always make sense, especially from a compute and data perspective: there is a huge amount of redundancy between frames. Expect the price for video to be at most 1/20 of the per-image price.
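One (hypothetical) way to exploit that redundancy on the client side is to only run detection on frames that differ meaningfully from the last processed frame, e.g. with simple OpenCV frame differencing; `detect()` and the threshold below are placeholder assumptions, not anything from the post:

```python
# Redundancy-skipping sketch: only call the detector on frames that differ
# meaningfully from the last processed frame. Threshold is an arbitrary example.
import cv2
import numpy as np

def process_video(path, detect, diff_threshold=10.0):
    cap = cv2.VideoCapture(path)
    last = None
    results = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Skip frames nearly identical to the last one we actually processed.
        if last is not None and np.mean(cv2.absdiff(gray, last)) < diff_threshold:
            continue
        last = gray
        results.append(detect(frame))
    cap.release()
    return results
```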
The ARM core on the SoC is part of the ASIC, as is the VideoCore.
Cambricon is going big in China, so it's not just Google and Apple. They claim to be 6 times faster than a GPU.
I am more interested in the potential of being able to run video processing and voice models effortlessly on tiny devices, and also to train models offline or locally.
I think there is good scope for solutions (like vision recognition) that port well across AI chips.
This is true. Google has TF for the TPU and Intel Nervana has neon. Each player will likely publish a software library.
The only reason one may want to avoid them now is that there are still enough people who were taught from crappy elementary school textbooks that had this bogus rule, and they will think your grammar is bad if you use split infinitives (and they remember enough from elementary school to recognize them).
Just trust your ear. If splitting an infinitive makes a sentence sound clearer, do it.
If someone gives you crap, cite the Oxford Dictionary people.
PS: same goes for ending a sentence with a preposition. Sometimes it is clearer to do so. In that case, do it! You can cite Oxford for this, too.
Haha, just kidding: one _fewer_ thing to worry about.