
Untapped opportunities in AI - dennybritz
http://radar.oreilly.com/2014/06/untapped-opportunities-in-ai.html
======
striglia
Cool article. I really like the repeated point that model complexity is not a panacea.
Seems like the industrial AI/ML movement as a whole has gone down a road where
practitioners will, by default, throw the most powerful model they know at a
problem and see how it pans out. Works well on benchmarks (if you
regularize/validate carefully) but isn't a very sustainable way to engineer a
system.

Separately, I do find it curious that his list of "pretty standard machine-
learning methods" included Logistic Regression, K-means, and... deep neural
nets? Sure they're white hot in terms of popularity and the experts have done
astounding things, but unless I've missed some _major_ improvements in their
off-the-shelf usability, they strike me as out of place in this list.

~~~
fchollet
Deep convolutional neural nets are the staple method for companies doing
computer vision at scale. Google, Facebook, Yahoo, and Baidu make extensive
use of them. In 2014 they definitely deserve their place on the shortlist of
"pretty standard machine-learning methods". They are the current state of the
art for visual recognition in particular (see the results on ImageNet from the
past few years).

They have also been commoditized by libraries such as Theano (Python) and
Torch (Lua). Google and Facebook use their own tools based on Torch.

My own version of the shortlist would be: Logistic regression, KNN, random
forests, SVM, and deep convnets.
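
For a sense of how little code most of that shortlist takes to try as baselines, here's a minimal sketch using scikit-learn (my choice of library here, not one mentioned above); deep convnets are omitted since they'd need Theano or Torch and considerably more setup:

```python
# Rough baseline sweep over the "standard methods" shortlist above,
# using scikit-learn. Deep convnets are left out: they need Theano/Torch
# and far more configuration than a one-liner allows.
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

baselines = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "random forest": RandomForestClassifier(n_estimators=100),
    "SVM (RBF kernel)": SVC(kernel="rbf", C=1.0, gamma="scale"),
}

for name, model in baselines.items():
    scores = cross_val_score(model, X, y, cv=5)   # 5-fold cross-validation
    print("%-22s %.3f (+/- %.3f)" % (name, scores.mean(), scores.std()))
```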

~~~
agibsonccc
This is the problem I'm starting skymind[1] to address. Many of these companies
are using cutting-edge techniques without a focus on usability (since they
already know this stuff). The hope here is to create an off-the-shelf package
people can just use with only knowledge of fairly conventional machine learning
techniques, while also avoiding the problems of having to program in Java
(which isn't realistic in data science) or even Lua.

Many people could benefit from a neural net's ability to create good features
for itself, but it's hard to use in a practical setting. That being said, I
believe this can change over the next few years.

[1]: [http://wired.com/2014/06/skymind-deep-learning/](http://wired.com/2014/06/skymind-deep-learning/)

~~~
fchollet
I like the idea a lot. Just trying to understand this better: it seems like
your company is entirely about selling consulting services, yet your stated
goal is to "give people machine learning without them having to hire a data
scientist". What's your path to that goal?

~~~
agibsonccc
In this case, by being that on-staff data scientist for them.

Many companies only need a one-off model to set themselves up with some sort of
baseline data product. This can also include training them on using machine
learning for their own problem solving.

The goal isn't necessarily to totally supplant data scientists (gotta love
press sensationalism), but to help companies build easy-to-use models into
their apps.

This can also save data scientists time: not necessarily by "skipping" the
feature extraction step (which deep learning lets them do while still
performing reasonably well), but by letting them use a fairly good
out-of-the-box machine learning model as a baseline.

The great thing about machine learning is the ability to mix different
techniques. Google's speech recognition is a great example of this: they use
neural nets for feature extraction and hidden Markov models for the final
translation of speech to text.
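
As a toy illustration of that general pattern (learned features feeding a hidden Markov model for sequence decoding), here's a sketch using hmmlearn with a stand-in feature extractor; this says nothing about Google's actual pipeline:

```python
# Toy sketch of the hybrid pattern: a feature extractor (standing in for a
# trained neural net) produces frame-level features, and an HMM decodes a
# hidden state sequence over them. Uses hmmlearn; not Google's actual system.
import numpy as np
from hmmlearn import hmm

def neural_net_features(raw_frames):
    # Placeholder for a trained network's learned representation:
    # here just a fixed random projection with a nonlinearity.
    rng = np.random.RandomState(0)
    projection = rng.randn(raw_frames.shape[1], 8)
    return np.tanh(raw_frames @ projection)

raw_audio = np.random.randn(500, 40)        # fake frames of spectral data
features = neural_net_features(raw_audio)   # "deep" features

# HMM over the learned features; in speech, states would map to phonemes.
model = hmm.GaussianHMM(n_components=5, covariance_type="diag", n_iter=20)
model.fit(features)
state_sequence = model.predict(features)
print(state_sequence[:20])
```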

I think deep learning (if wrapped in the right apps or SDKs), with automatic
feature extraction plus a user-specified "task", would let companies focus not
on machine learning but on straight problem solving. A task could be predicting
a value, labeling things, or even compressing data[1].

The idea would be that once they are familiar enough with how to feed data into
the system and specify a "task", they can do a lot of machine learning by
themselves without having to think too much about how to represent the problem
they are solving (what features work best given the data I have?).

[1]
[http://www.slideshare.net/agibsonccc/ir-34811120](http://www.slideshare.net/agibsonccc/ir-34811120)
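
Purely as a sketch of what that kind of "data in, task out" interface could look like (every class and method name below is hypothetical, not an actual Skymind API):

```python
# Hypothetical sketch of a "specify data + task" interface; none of these
# names exist in any real library, they only illustrate the idea above.
class Task:
    PREDICT_VALUE = "predict_value"   # regression
    LABEL_THINGS = "label_things"     # classification
    COMPRESS = "compress"             # e.g. autoencoder-style compression

class AutoModel:
    """Wraps feature learning + a final model behind a single 'task' knob."""
    def __init__(self, task):
        self.task = task

    def fit(self, records, targets=None):
        # 1. learn features from raw records (the deep-learning part)
        # 2. fit a model appropriate to the task on top of those features
        ...

    def predict(self, records):
        ...

# Intended usage: the caller thinks about data and task, not features.
model = AutoModel(task=Task.LABEL_THINGS)
# model.fit(customer_records, churn_labels)
# predictions = model.predict(new_records)
```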

------
hyp0
Massive datasets do outperform clever theories... but I think that's just
because no one has yet worked out the theories that work best with the data.
This requires insight, in addition to data, and could come from anyone.

The alternative - that massively complex probabilistic models _are_ the best
theory of the data - is hopefully not true. Especially not of our minds. But
it _could_ be true, and if so, it would mean that our intelligence is
irreducible, and we are forever beyond our own self-understanding (even in
principle). Our history is full of inexplicable mysteries that were eventually
understood. But not all of them: quantum randomness, for instance. I really
hope intelligence will be one of the former.

~~~
iandanforth
There are a lot of AI problems that can be solved with less-than-human
intelligence, but here are some human numbers for reference:

By the time you're 30 you have been exposed to:

~1.4 petabytes of visual information

~1.8 terabytes of auditory information

Touch and proprioceptive bandwidth is harder to calculate, but the ascending
pathway through the spinal cord is about 10 million fibers, which is 10x the
optic nerve (or 5x the number of fibers from both eyes). So:

Between 1.4 and 14 petabytes of touch and proprioceptive information.

So we're a fairly large data problem on top of millions of years of evolution
that have baked in some knowledge and abilities.
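
A quick back-of-envelope in Python, just unpacking the figures above into implied average rates (the numbers are the ones quoted, only the arithmetic is mine):

```python
# Back-of-envelope on the figures above: implied average input rates by age 30.
SECONDS_IN_30_YEARS = 30 * 365.25 * 24 * 3600            # ~9.5e8 seconds

visual_bytes = 1.4e15     # ~1.4 PB of visual input (figure quoted above)
auditory_bytes = 1.8e12   # ~1.8 TB of auditory input (figure quoted above)

print("visual:   %.1f MB/s average" % (visual_bytes / SECONDS_IN_30_YEARS / 1e6))
print("auditory: %.1f KB/s average" % (auditory_bytes / SECONDS_IN_30_YEARS / 1e3))

# Touch/proprioception: ~10M ascending fibers vs ~1M per optic nerve, so
# scaling the visual figure by roughly 1x-10x gives the 1.4-14 PB range above.
print("touch:    %.1f - %.1f PB" % (visual_bytes / 1e15, visual_bytes * 10 / 1e15))
```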

~~~
why-el
Not really. Our knowledge takes shape way, _way_ before we hit 30 in every
area you can think of. Some modular brain systems actually reach full form in
the first few years, some in the first few months (vision).

I would argue that the data we are exposed to is not only small, but actually
sparse.

~~~
MattHeard
If you consider brain development to be iterative over many generations of
brains through human and pre-human history, the training datasets are a lot
larger.

~~~
maaku
Except that doesn't make sense. The complexity of changes that can be carried
over is very, very small compared to the changes which go on in a single
lifetime. And the mechanism is totally different.

------
araes
I can honestly say that this post has revolutionized my thoughts on AI.
Primarily this is because of what I perceive as the thesis statement, which
is:

"<AI> is the construction of weighted tables (choices, data, meta relations,
whatever) from large sets of <prior data> by <method>"

This is kind of crazy, because I think it says you could make a Turing AI by
using large datasets of prior life data for humans. In essence, "<my life> is
the construction of weighted tables from large sets of <life experience> by
<human learning>." For example, if you had an AI that could learn through
text, you could have extensive transcribed conversation logs of people and
then large time-activity logs to use as your inputs.

If it could learn through video (i.e., it could view images, understand
objects, object relations, and events in time, and assign will to the person
behind actions / events), then you could instead just feed it huge video logs of people's
lives. If you wanted a copy of a person, you could feed it only a single
individual, and if you wanted a more general AI, then you could feed it cross
sections of the population.

In addition, there's a very cool meta aspect to the large dataset concept, in
that you can also have large datasets for deciding when to use, or when to feed
data to, specialized sub-AIs. For example, you might have a math sub-AI that
has been trained by feeding it massive sets of math problems (or perhaps it can
learn math through the video life logs of a person?). If it's then being used
as part of a larger piece, you'd want to know when to use it to solve problems,
or when to feed it experience inputs for further learning. In essence, it's
tables of categories for experience types, and then grown / paired sub-AIs for
those types.
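
A toy sketch of that "table of categories routing experience to sub-AIs" idea, entirely hypothetical and only meant to make the layering concrete:

```python
# Toy sketch of the meta-AI idea above: a router categorizes each piece of
# experience, then dispatches it to a specialized sub-AI for learning or use.
# Entirely hypothetical structure; nothing here is a real system.
class SubAI:
    def __init__(self, name):
        self.name = name
        self.experience = []

    def learn(self, example):
        self.experience.append(example)   # stand-in for actual training

    def solve(self, problem):
        return f"[{self.name}] best guess for: {problem}"

class MetaAI:
    def __init__(self):
        # the "table of categories" -> specialized sub-AIs
        self.sub_ais = {"math": SubAI("math"), "language": SubAI("language")}

    def categorize(self, item):
        # stand-in for a learned classifier over experience types
        return "math" if any(ch.isdigit() for ch in item) else "language"

    def observe(self, item):
        self.sub_ais[self.categorize(item)].learn(item)

    def answer(self, problem):
        return self.sub_ais[self.categorize(problem)].solve(problem)

meta = MetaAI()
meta.observe("2 + 2 = 4")
meta.observe("Jane grew up in a rural area.")
print(meta.answer("what is 3 * 7?"))
```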

I would wager that it is possible, right now, to create a chatbot that can pass
the Turing test using the above, by feeding it the equivalent of mass IRC chat
or some such huge dataset of text-based human interaction over a variety of
topics. This would naturally need sub-AIs for mechanical things like grammar or
parts of speech, and then possibly higher-level meta-AIs for interpreting
intent, orchestrating long-form thought, or planning. In a way, it's layers of
AI based on level of thought abstraction. If it were a human, the high-intensity
portions of sub-AI would occupy space relative to intensity within
reconfigurable co-processor zones (sight: visual cortex, sight: face
recognition: occipital and temporal lobes, executive functions: frontal lobes,
etc.).

~~~
nopinsight
Consider this simple sentence:

 _" Jane grew up in an idyllic rural area."_

No current AI implementation, to my knowledge, can understand such a sentence
nearly as well as humans do. A competent Turing-test judge could suggest a
novel situation, say _a broken-winged black Pegasus appeared in Jane's hometown
when she was seven_, and ask pointed questions to find out whether the
interlocutor is a human or a bot.

The issue with almost all current approaches to AI is that they are either
purely symbolic or purely sub-symbolic. The current symbolic approaches cannot
completely capture the preconceptual experience humans use to make sense of the
world. When we hear "idyllic rural area", we use our mental imagery and sensory
experiences to help us understand the sentence much more deeply than the list
of words suggests.

The sub-symbolic approach could potentially solve this issue, but it raises the
problem of integrating all those complex, interacting parts (vision, audition,
motor control, conceptual thought, etc.) into a unified whole. More
importantly, would we be able to control and direct the beast sufficiently
well once it becomes reality?

There is now some AGI (Artificial General Intelligence) research on
integrating the two paradigms. If anyone is interested, a presentation is
available here:
[http://ieet.org/index.php/IEET/more/goertzel20130531](http://ieet.org/index.php/IEET/more/goertzel20130531)

~~~
valas
Why is it relevant whether the AI "understands"? The post above doesn't claim
it would understand; it just claims that for all practical purposes it would
pass the Turing test.

~~~
hexagonc
Because you couldn't pass a Turing test, even for practical purposes, with that
approach. Passing the Turing test is a problem that cannot be solved by big
data alone. You have to model not just language patterns and word sequences
but _the prior causes_ for those word sequences. The prior causes for the
words (and images, and videos in the original example) are ultimately desires
and experiences of real human beings and truths about the universe.

In other words, a computer has to actually model the world and the changes to
the state of the world as the conversation goes on in order to pass a Turing
test. I didn't read anything in the original description to suggest that was
happening.
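
To make the distinction concrete, a toy sketch (purely illustrative, nowhere near a real system) of a pattern-matcher versus a bot that keeps even a minimal world state as the conversation goes on:

```python
# Toy illustration of the point above: a pure pattern-matcher vs. a bot that
# records a minimal world state updated as the conversation goes on.
# Purely illustrative; neither is anywhere near a real Turing-test contender.
class PatternBot:
    """Answers from canned word-sequence statistics; keeps no world state."""
    def reply(self, utterance):
        return "That's interesting, tell me more."   # plausible but empty

class WorldModelBot:
    """Records facts from the conversation and answers against them."""
    def __init__(self):
        self.facts = []

    def hear(self, utterance):
        self.facts.append(utterance)                 # crude state update

    def reply(self, question):
        # Answer the judge's pointed question against the recorded facts.
        relevant = [f for f in self.facts if "Pegasus" in f]
        if relevant:
            return "You told me: " + relevant[-1]
        return "I don't know anything about that."

statement = "A broken-winged black Pegasus appeared in Jane's hometown."
question = "What happened to the Pegasus?"

bot = WorldModelBot()
bot.hear(statement)
print(PatternBot().reply(question))   # generic deflection
print(bot.reply(question))            # answer grounded in conversation state
```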

------
jostmey
As a postdoctoral candidate in biology, I can say that my approach to problem
solving is exactly the opposite: My job is to infer as much as I can from the
scant amount of data I can obtain. The goal outlined in this article, by
contrast, is to collect as much data as you can, creating what is essentially a glorified
lookup table of results. I must say the latter approach seems a hell of a lot
easier.

