
Nvidia's demo of real-time object recognition using deep learning [video] - vkhuc
https://www.youtube.com/watch?v=zsVsUvx8ieo
======
locusm
The CEO butting in all the time was really annoying. I had a business partner
who did this in meetings all the time; it's fucking annoying and rude. On the
surface I have no idea if this is groundbreaking or not. My first thought was
"ahh, nVidia using Linux!"

~~~
razster
Unfortunately these are the ones that make it to the top and are named CEO.

------
polskibus
Looks great! Can anyone with more knowledge about deep learning say whether
this is an exceptional achievement in the field?

~~~
gamegoblin
As far as state-of-the-art classification goes, this is not terribly
impressive. See the recent results in something like:

[http://cs.stanford.edu/people/karpathy/deepimagesent/](http://cs.stanford.edu/people/karpathy/deepimagesent/)

I think the impressive thing here is that the GPU is presumably doing GIANT
matrix multiplications in real time. A prediction from a neural net is
essentially a series of matrix multiplications (with a cheap nonlinearity
between them), and multiplying two n x n matrices costs O(n^3) with the
standard algorithm, or about n^2.8 with Strassen's. So you can see how matrix
multiplications with thousands of rows/columns (often what these sorts of deep
image classifiers involve) are hugely computationally expensive.

So it's definitely important for real time machine learning systems to have
access to this kind of linear algebra power, but the actual machine learning
techniques demonstrated are not super impressive. The hardware is. Which makes
sense since this is an Nvidia demo.
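
To make the "series of matrix multiplications" point concrete, here is a minimal sketch in plain Python. It is not the network from the demo: the layer sizes, the random weights, and the helper names (`matmul`, `relu`, `forward`, `multiply_adds`) are all made up for illustration. It runs a tiny dense net forward and counts the multiply-adds, then does the same count for some larger, equally hypothetical layer sizes.

```python
import random

def matmul(a, b):
    """Naive product of an n x m and an m x p matrix (lists of rows)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def relu(m):
    """Element-wise max(0, v), the cheap nonlinearity between layers."""
    return [[max(0.0, v) for v in row] for row in m]

def forward(x, weights):
    """Push a batch x (one input per row) through dense layers."""
    for i, w in enumerate(weights):
        x = matmul(x, w)
        if i < len(weights) - 1:  # no nonlinearity after the last layer
            x = relu(x)
    return x

def multiply_adds(batch, layer_sizes):
    """Multiply-add count for one forward pass through dense layers."""
    return sum(batch * a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

# Hypothetical layer sizes and random weights, for illustration only.
layer_sizes = [8, 16, 4]
rng = random.Random(0)
weights = [
    [[rng.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_in)]
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
batch = [[rng.uniform(0, 1) for _ in range(layer_sizes[0])]]

scores = forward(batch, weights)
print(len(scores), len(scores[0]))                 # 1 4
print(multiply_adds(1, layer_sizes))               # 192
print(multiply_adds(1, [4096, 4096, 4096, 1000]))  # 37650432
```

Note that for a single input each dense layer is really a matrix-vector product; the matrix-matrix case shows up once you process a batch of inputs or image patches at once, which is the workload GPUs are good at.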

~~~
dogma1138
From what I can understand, what's even more impressive is that it was running
on a beefed-up version of their latest mobile SoC and not on some $5000
compute GPU card. Which means that this application can be both very
affordable and very practical, since people won't put a 300W GPU in their car.

~~~
gamegoblin
Definitely agreed. When well-maintained and easy-to-use machine learning
libraries meet very powerful, highly embeddable GPUs (or other dedicated
linear algebra sort of hardware), I think we'll see a big revolution in the
whole "smart object" field.

Right now you have iPhones and whatnot doing Touch ID with fingerprints, but
imagine if your phone could recognize you just by quickly analyzing the gyro
data as you raise your phone and comparing it against the other thousand times
you've pulled your phone out of your pocket, combined with the slight pressure
readings near the touchscreen's edge because it's learned where your fingers
fall on the case.

^ contrived example I just thought of, but you get the idea.
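
A toy sketch of the "compare against the other thousand times" idea, assuming each pickup has already been reduced to a short fixed-length list of numbers. The traces, the `distance` and `looks_like_owner` helpers, and the `k` and `threshold` values are all invented for illustration; a real system would presumably use a learned model rather than a raw nearest-neighbor check.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length traces."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def looks_like_owner(trace, enrolled, k=3, threshold=0.5):
    """Accept if the mean distance to the k nearest enrolled traces is small."""
    nearest = sorted(distance(trace, e) for e in enrolled)[:k]
    return sum(nearest) / len(nearest) <= threshold

# Hypothetical "gyro traces" recorded from the owner's earlier pickups.
enrolled = [
    [0.0, 0.2, 0.6, 1.0],
    [0.1, 0.3, 0.6, 0.9],
    [0.0, 0.25, 0.55, 1.0],
    [0.05, 0.2, 0.65, 0.95],
]

print(looks_like_owner([0.05, 0.25, 0.6, 0.95], enrolled))  # True
print(looks_like_owner([1.0, 0.8, 0.3, 0.0], enrolled))     # False
```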

~~~
agumonkey
Intel's RealSense drone demo was somehow impressive, and I say that as someone
who's not into the smart/IoT trend. A flying electromechanical bug on stage at
a mainstream show: to me that was a small but real inflection point.

[https://www.youtube.com/watch?v=Gn83Psbv61I](https://www.youtube.com/watch?v=Gn83Psbv61I)

ps: I'm not sure it was fully real-time though, the door avoidance restart
seemed a little too nicely cued.

------
nl
Here's how to do the street sign part of this yourself:
[https://gist.github.com/iandees/f773749c47d088705199](https://gist.github.com/iandees/f773749c47d088705199)

------
zwieback
Cool demo but I still wonder if fundamentally this is just a brute-force
approach. Wouldn't it be better to do some traditional preprocessing (e.g.
recognizing rectangles, circles, etc.) and feeding higher-level descriptors
into the classifier?

If the net learns based on pixels you still have to somehow solve rotation and
scale invariance. Or is there something new in deep-learning vs. old-school
neural nets that fixes the issues that bedeviled neural nets the first time
they were popular?

~~~
vkhuc
I think they used the methods described in
[http://www.cs.berkeley.edu/~rbg/papers/r-cnn-cvpr.pdf](http://www.cs.berkeley.edu/~rbg/papers/r-cnn-cvpr.pdf)

~~~
zwieback
Thanks, interesting paper.

------
plg
The video shows the demo happening in Ubuntu, or at least the video playback.

------
rasz_pl
@10:08

On the right, a Mercedes SLS is classified as an SUV.

On the left, one SUV is classified as two vans.

Their algorithm works at a rate of about 1 Hz when doing signs. This is ~state
of the art from 20 years ago, but running on a small mobile SoC at a slow rate.

~~~
SammoJ
Please show a paper where fine-grained vehicle classification in unconstrained
images is anywhere near this performance from 20 years ago. You will not be
able to, because it wasn't.

~~~
rasz_pl
I meant state of the art in classification accuracy/range, not speed.

