

Andrew Ng: Machine Learning in Robotics - helwr
http://youtube.com/watch?v=AY4ajbu_G3k

======
colincsl
Here is a more in-depth presentation he gave on the same subject at a Google
Tech Talk:

<http://www.youtube.com/watch?v=ZmNOAtZIgIk> (April 11, 2011)

~~~
euccastro
Thanks!

Also found related tutorials:

<http://www.stanford.edu/class/cs294a/tutorial.html>

<http://ufldl.stanford.edu/wiki/index.php/UFLDL_Tutorial>

------
hamner
Great non-technical overview. What early-stage startups are working on these
types of problems?

~~~
ippisl
The technique he uses is called deep learning. One startup is binatix.com.

------
ippisl
How complex is this technology?

Is it possible to offer it via a simple API, usable by someone without
machine learning knowledge, where you give it data to train on and you get
back a trained algorithm that can do perception tasks?

~~~
hamner
The implementation of these algorithms is relatively straightforward. The
challenge is that the state of the art in computer vision is not yet at a
point where it is possible to reliably detect tens to hundreds of object
categories in real time on current systems. It is currently possible to build
systems that get decent real-time performance on detecting a few categories
concurrently, or offline systems that get around 60% accuracy across hundreds
of well-defined categories. Thus, at least one of the following is necessary:
(1) faster general-purpose hardware, (2) better algorithms, or (3) running the
best algorithms on ASICs designed for CV. The top labs now typically use GPU
clusters to train and run their algorithms, with the computationally expensive
stages usually being feature extraction and/or classifier training.

Google Predict (<http://code.google.com/apis/predict/>) offers a general
machine learning API geared towards those who want to apply machine learning
to their applications without subject-specific knowledge. I've not used it, so
I can't speak to its accuracy. It is not geared towards computer vision, and I
imagine it would fail miserably at such tasks (since computer vision is highly
dependent on domain-specific feature extraction techniques), though I imagine
it performs well at NLP tasks. The primary limitation of such a system is that
it acts as a black box: you throw data in and get answers out without any
knowledge of the process behind it.

This black-box model is limiting for three major reasons. First, depending on
the domain, incorporating domain-specific knowledge can greatly improve
performance. Second, it is hard to understand the limitations of such a
system. Many ML algorithms can fail catastrophically when the input is
substantially different from the training data, and the black box makes it
hard to understand when the system is likely to fail and adjust accordingly.
Third, in many cases you face a tradeoff involving speed, memory, and
classification/regression performance. This tradeoff is automatically
determined for you and is not transparent.
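
To make the second point concrete, here is a minimal sketch of one crude way
to flag inputs that look unlike the training data: per-feature z-scores. This
is only an illustration, not how any particular service works. The function
names, the toy data, and the 3.0 threshold are all assumptions.

```python
import math

def fit_feature_stats(rows):
    """Return a (mean, std) pair for each feature column of the training rows."""
    n = len(rows)
    stats = []
    for j in range(len(rows[0])):
        col = [r[j] for r in rows]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / n
        stats.append((mean, math.sqrt(var)))
    return stats

def max_z_score(stats, x):
    """Largest absolute z-score of input x across all features."""
    scores = []
    for (mean, std), value in zip(stats, x):
        if std == 0:
            # A constant training feature: any deviation is maximally unusual.
            scores.append(0.0 if value == mean else math.inf)
        else:
            scores.append(abs(value - mean) / std)
    return max(scores)

def looks_unfamiliar(stats, x, threshold=3.0):
    """True if x is far from the training data on at least one feature.

    The threshold of 3.0 is an arbitrary assumption for this sketch.
    """
    return max_z_score(stats, x) > threshold

# Toy training set (made up for illustration).
train = [[1.0, 10.0], [2.0, 12.0], [3.0, 11.0], [2.0, 9.0]]
stats = fit_feature_stats(train)
```

With a check like this exposed, a caller could decline to trust a prediction
for an input such as `[50.0, 10.5]` rather than silently getting an answer. A
black box gives you no place to hook that in.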

I've been considering a general ML system that offers an API similar to Google
Predict, yet is transparent in the feature extraction / model selection stages
for those who would benefit from digging deeper into the system. Is this
something that you would pay for?
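
A rough sketch of what I mean by "transparent", under heavy assumptions: the
class names, the two toy candidate models, and the `report` attribute are all
hypothetical and do not describe Google Predict or any existing product. The
idea is only that the model selection step stays inspectable.

```python
from collections import Counter

class MajorityClass:
    """Baseline: always predict the most common training label."""
    name = "majority_class"

    def fit(self, X, y):
        self.label = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, x):
        return self.label

class NearestCentroid:
    """Predict the label whose mean feature vector is closest to x."""
    name = "nearest_centroid"

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for j, value in enumerate(row):
                acc[j] += value
        self.centroids = {
            label: [s / counts[label] for s in acc]
            for label, acc in sums.items()
        }
        return self

    def predict(self, x):
        def dist2(centroid):
            return sum((a - b) ** 2 for a, b in zip(x, centroid))
        return min(self.centroids, key=lambda label: dist2(self.centroids[label]))

class TransparentPredictor:
    """Tries each candidate and records validation accuracy in `report`."""

    def __init__(self, candidates):
        self.candidates = candidates
        self.report = {}   # model name -> validation accuracy
        self.best = None   # the candidate that was selected

    def train(self, X_train, y_train, X_val, y_val):
        for model in self.candidates:
            model.fit(X_train, y_train)
            correct = sum(model.predict(x) == y for x, y in zip(X_val, y_val))
            self.report[model.name] = correct / len(y_val)
        self.best = max(self.candidates, key=lambda m: self.report[m.name])
        return self

    def predict(self, x):
        return self.best.predict(x)

# Toy data (made up for illustration).
X_train = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0]]
y_train = ["a", "a", "a", "b", "b"]
X_val, y_val = [[0.5, 0.5], [6.0, 5.0]], ["a", "b"]

predictor = TransparentPredictor([MajorityClass(), NearestCentroid()])
predictor.train(X_train, y_train, X_val, y_val)
```

Someone who just wants answers calls `predictor.predict(...)`. Someone who
wants to dig deeper can read `predictor.report` and `predictor.best` to see
which model was chosen and why.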

Specifically for computer vision, there are a variety of startups and
companies working on providing systems for object recognition and
classification. One example is <http://www.numenta.com/>, though when I tried
their software about a year ago it did not seem to function very well compared
to the state of the art. Others that are making visual search type
applications include <http://www.tineye.com/> and <http://www.kooaba.com>.

