
The truth about deep learning - clmcleod
http://blog.claymcleod.io/2016/06/01/The-truth-about-Deep-Learning/
======
vonnik
Anyone following DL news knows that DL alone will not lead to strong AI. The
most impressive feats in the last year or so have come from combining deep
artificial neural networks with other algorithms, just as DeepMind combined
deep ConvNets with reinforcement learning and Monte Carlo Tree Search. There's
not really an interesting conversation to be had about whether DL will get us
to strong AI. It won't. It is just machine perception; that is, it classifies,
clusters and makes predictions about data very well in many situations, but
it's not going to solve goal-oriented learning. But it solves perception
problems very well, often better than human experts. So in the not too distant
future, as people wake up to its potential, we will use those infinitely
replicable NNs to extract actionable knowledge from the raw data of the world.
That is, the world will become more transparent. It will offer fewer
surprises. We may not solve cancer with DL, but we will spot it in X-rays more
consistently with image recognition, and save more lives.

Disclosure: I work on the open-source DL project Deeplearning4j:
[http://deeplearning4j.org/](http://deeplearning4j.org/)

~~~
MrBra
What is a goal-oriented AI problem that is not centrally based on perception?

------
AndrewKemendo
I understand and empathize with the skepticism, or rather the criticism of
hand-wringing over the implications of current deep learning methods.

However, as someone who builds them for vision applications I'm increasingly
convinced that some form of ANN will underlie AGI - what he calls a universal
algorithm.

If we assume that general intelligence comes from highly trained, highly
connected single processors (neurons) with a massive and complex sensor
system, then replicating that neuron is step one - which arguably is what we
are building, albeit comparatively crudely, with ANNs.

If you compare, at a high level, how infants learn with how we train
RNNs/CNNs, the two are remarkably similar.

I think the author, and the ML crowd in general, focus too much on
unsupervised learning as being pivotal for AGI.

In fact, if you look again at biological models, the bulk of animal learning
is supervised training in the strict technical sense. Just look at feral
children studies as proof of this.

Where the author detours too far is in assuming the academic world would have
demonstrated a broader scope for ANNs if one existed. In fact, research
priorities across the board are not focused on general intelligence, and most
machine learning programs explicitly forbid this research for graduate
students, as it's not productive over the timeline of a program.

Bengio and others, I think, are on the right track, focusing on the question
of ANNs as a path toward AGI, and I think it will start producing results as
our training methods improve.

~~~
clmcleod
Hey Andrew,

First, I'm curious where you thought I was focused on unsupervised learning?
It certainly didn't cross my mind when I was writing this --- I was
(implicitly) strictly talking about supervised machine learning.

My post actually supports what the latter part of your comment says, in a
roundabout way. In general, the people who are making huge strides in deep
learning (Bengio, Hinton, LeCun are obviously the big three) understand the
capabilities and, maybe more importantly, the limitations of DL. My main point
is that the ML community at large is actually not on the same page as the
experts, and that causes many more problems.

I want us, as a community, to stop treating deep learning any differently
than any other ML algorithm --- to have a consensus, based on scientific
facts, about its possibilities and limitations thus far. If we, "the experts",
don't understand these things about our own algorithms, how can we expect the
rest of the world to understand them?

~~~
dave_sullivan
> I want us, as a community, to stop treating deep learning any differently
> than any other ML algorithm --- to have a consensus, based on scientific
> facts, about its possibilities and limitations thus far. If we, "the
> experts", don't understand these things about our own algorithms, how can we
> expect the rest of the world to understand them?

I agree. It's interesting watching the "debate" around deep learning. All the
empirical results are available for free online, yet there's so much
misinformation and confusion. If you're familiar with the work, you can fill
in the blanks on where things are headed. For instance, in 2011, I think it
became clear that RNNs were going to become a big thing, based on work from
Ilya Sutskever and James Martens. Ilya was then doing his PhD and is now
running OpenAI, doing research backed by a billion dollars.

The pace of change in deep learning is accelerating. It used to be fairly easy
for me to stay current with new papers that were coming out; now I have a
backlog. To a certain extent, it doesn't matter what other people think; much
of the debate is just noise. I don't know what AGI is. If it's passing the
Turing test, we're pretty close: 3 years max, maybe by the end of the year.
Anything more than that is too metaphysical and prone to interpretation. But a
bunch of benchmark datasets/tasks have been established now. ImageNet was the
first one everyone heard about, I think, but sets like COCO, 1B Words, and
others have come out since then. Those benchmarks will keep improving;
pursuing those improvements will lead to new discoveries re: "intelligence as
computation", and something approximately based on "deep learning" will drive
it for a while.

~~~
gambler
_> If it's passing the Turing test, we're pretty close: 3 years max_

Well yes? If a Turing test you realize the simulation of some idiot in the
online chat, it has long been there - and nobody wants. But the system, which
can lead a meaningful conversation, today there is no trace. And there is even
no harbingers of its occurrence.

<- This was translated by Google Translate from a piece of perfectly
intelligible and grammatically correct text in another language. If this is
the state of the art in machine translation, how on Earth can you expect a
machine that can _converse_ on human level in three years?

~~~
XenophileJKO
Sadly, I read the Google-translated text and it read like it was written by a
person for whom English was a second language. I didn't realize it was an
"example" until I read your next paragraph. So it had me fooled.

~~~
gambler
You could probably replace 90% of YouTube comments with a simple trigram-based
chat bot and no one would notice. But that's hardly a good measure of AI
quality.

That said, your comment illustrates the main problem with the Turing test: it
depends on too many factors and assumptions that have nothing to do with the
AI itself.

A good AGI test should be constructed in such a way that any normal person
passes it with 100% certainty and no trivial system can pass it at all.

------
aab0
"Here is my personal answer to the second question: deep neural networks are
more useful than traditional neural networks for two reasons: The automatic
encoding of features which previously had to be hand engineered. The
exploitation of structurally/spatially associated features. At the risk of
sounding bold, that’s it — if you believe there is another benefit which is
not somehow encompassed by these two traits, please let me know."

Let me ask a very simple question. What set of hand-engineered features gives
<5% error on ImageNet?

~~~
clmcleod
Exactly --- none. But those features were born out of brute-forcing spatial
exploitation, not some magical connection that we humans had never thought of
before, which reinforces the point.

~~~
rspeer
I would go farther and say that the success of deep learning comes mostly from
one thing: putting convolution layers in neural nets, instead of just random
connections or fully-connected layers.

Google's image and video recognition? Deep Dream? That's all convolution.

Speech-to-text? That's convolution.

AlphaGo? That's convolution.

Convnets are a great advance in machine learning, don't get me wrong. I hope
that soon we get a generalizable way to apply convolution layers to text or
music.
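
To make the contrast concrete, here's a minimal sketch (PyTorch, my own choice
of framework, nothing from this thread): the convolution layers do essentially
all the perceptual work, with only a small fully-connected head on top.

```python
import torch
import torch.nn as nn

# A tiny image classifier: convolution layers extract the features,
# a single fully-connected layer just reads them out.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                  # 32x32 -> 16x16
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),                  # 16x16 -> 8x8
    nn.Flatten(),
    nn.Linear(32 * 8 * 8, 10),        # small fully-connected head
)

x = torch.randn(1, 3, 32, 32)         # stand-in 32x32 RGB image
print(model(x).shape)                 # torch.Size([1, 10])
```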

~~~
singhrac
We've been applying convnets to text for a while, though I'm not sure what you
mean by generalizable.

There's a few other major techniques (other than convnets) that are important,
like RNNs in general.

~~~
rspeer
Yeah, I should not make such sweeping statements. (Or maybe I should, because
it's a great way to find out what other people's perspective is when they show
up to correct me.)

My concern about the uses of convnets on text that I've seen is that I don't
think they can deal with little things like the word "not". (The Stanford
movie review thing can definitely handle the word "not", but that's
different.) I'm unconvinced as of yet that we're convolving over the right
thing. But maybe the right thing is on the way, especially when Google gave us
a pretty good parser. And maybe the right thing involves other things like
RNNs, sure.

I guess image recognition could have similar cases, and the results just look
more impressive to me because I work with text and not with images.

~~~
nl
LSTMs can handle negation. See [http://k8si.github.io/2016/01/28/lstm-networks-for-sentiment-analysis-on-tweets.html](http://k8si.github.io/2016/01/28/lstm-networks-for-sentiment-analysis-on-tweets.html)
for a pretty nice example.

I think there are some papers out of the IBM Watson group on question
answering where they use ConvNets. I don't remember looking at the negation
case specifically, but question answering generally has cases where that is
important.
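
For a rough idea of the shape of such a model, here's a toy LSTM sentiment
classifier (a PyTorch sketch of my own, not the linked post's code; names are
illustrative):

```python
import torch
import torch.nn as nn

class SentimentLSTM(nn.Module):
    """Toy LSTM sentiment classifier; tokenization/vocab handling omitted."""
    def __init__(self, vocab_size=20000, emb_dim=100, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)   # positive / negative

    def forward(self, tokens):             # tokens: (batch, seq_len) int ids
        _, (h, _) = self.lstm(self.emb(tokens))
        # The final hidden state depends on word order, so "good" and
        # "not good" can end in very different states -- this is how an
        # LSTM can, in principle, pick up negation.
        return self.head(h[-1])

model = SentimentLSTM()
fake_batch = torch.randint(0, 20000, (4, 30))  # 4 sequences of 30 token ids
print(model(fake_batch).shape)                 # torch.Size([4, 2])
```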

------
mrdrozdov
My top reasons why everyone getting into maths/stats/cs should go straight for
deep learning:

a. recent findings are documented incredibly well in both research and code

b. because of its success, there are many areas for useful contribution at
relatively less effort from the researcher

c. because of its success, it'll help you develop marketable skills

d. it's fun

Maybe it won't solve General AI, but it seems like a damn good foundation for
the person/people that will eventually come out with ideas that move us closer
in that direction.

~~~
tomp
Which paper/blog post/book specifically documents the most recent research?
E.g. just yesterday I figured out (after surfing the internet for a long time)
that the vanishing gradient problem has essentially been solved by ReLUs, so
autoencoder pretraining is no longer necessary and we can train a whole deep
network at once. Another example is batch normalization. Probably not news for
a researcher, but where could I learn about these developments?
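
Concretely, here's what I mean (a hedged PyTorch sketch of my current
understanding; the framework choice and the random data are placeholders): the
whole deep stack trains end-to-end with plain backprop, ReLUs keeping
gradients alive and batch norm stabilizing things, with no layer-wise
autoencoder pretraining.

```python
import torch
import torch.nn as nn

# A deep MLP trained end-to-end: ReLU keeps gradients from vanishing,
# BatchNorm further stabilizes training -- no layer-wise pretraining.
model = nn.Sequential(
    nn.Linear(784, 256), nn.BatchNorm1d(256), nn.ReLU(),
    nn.Linear(256, 256), nn.BatchNorm1d(256), nn.ReLU(),
    nn.Linear(256, 10),
)
opt = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(64, 784)           # stand-in batch; real data would be MNIST
y = torch.randint(0, 10, (64,))
for step in range(100):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()                # gradients flow through all layers at once
    opt.step()
```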

~~~
mrdrozdov
The only way to really stay caught up is to read the papers as they are
released (NIPS, ICML, ICLR, CVPR, etc.). If you use a course or guide, then
you're behind by definition. If you're okay with being a little behind, the
deep learning courses for NLP and computer vision are both excellent; NYU also
has similar courses. You can also try following some major deep learning
people on Facebook/Twitter, as they post about new work daily. People to
follow: LeCun, Lawrence, Bengio, Cho, Karpathy, just to name a few. Interested
to hear others' ideas on how to stay up to date.

------
chris_va
"The automatic encoding of features which previously had to be hand
engineered." Yes, that is the main benefit.

The drawback is that we are still hand tuning architectures, slowly inventing
(or incorporating) things like LSTM and the like into the model.

One goal would be to achieve a universal building block that can be
stacked/repeated without the need for architectural tuning.

Maybe something that combines recurrence, one-shot learning, deep learning,
and something stolen from AI (like alpha-beta, graph search, or something
self-referencing and stochastic with secondary neural networks) into a single
"node". Then we won't have to worry about architecture so much.

------
morgante
Why are we so preoccupied with the notion of "artificial intelligence" in the
first place?

Artificial intelligence, if it can even be defined, does not seem like a
particularly valuable goal. Why is emulating human cognition the metric by
which we assess the utility of machine learning systems?

I'd take a bunch of systems with superhuman abilities in specialized fields
(driving, Go, etc.) over so-called "artificial intelligence" any day.

~~~
alanbernstein
Driving and go were considered waaay deep into "hard AI" just 10 years ago.
Even if you mean to ask this question about "general artificial intelligence",
you only need to go as far as speech or text comprehension for an example of
something that needs to behave more or less like a person.

~~~
morgante
> Driving and go were considered waaay deep into "hard AI" just 10 years ago.

Oh, I know that. In fact, I think there are some very useful and very hard
tasks which even AGI (at least a basic one) would not be able to solve.

My point is that we should focus on applications, not person-ness. Whether a
technique is similar to human cognition or not, or the result is similar to a
person, is immaterial.

Put simply, I don't think it matters much if all our super-powerful machines
are ultimately Blockheads.

~~~
kazagistar
There is value in trying to emulate the human mind, which lies in learning how
the human mind works. A proper simulation of human intelligence would be an
invaluable building block for health fields like psychology. Of course this is
a very different kind of AI, and the focus on application-specific measures of
AI success has distracted from this goal.

------
return0
What are "traditional nets" ? What are the "other learning algorithms" ? What
is a universal algorithm (and for what)? Neural nets are universal function
approximators. There isnt something [edit: a function] they can't learn. When
stacked they seem to produce results that are eerily human-like.

I think the "universal algorithm" in the article refers to some kind of
emergent intelligence. Well, nothing that he mentions precludes it. Our brains
aren't magical machines. Neural nets may not model real neurons, yet it is
amazing how they can produce results that we identify as similar to the way we
think. There is nothing in computational neuroscience that comes close to
this. If anything, the success of deep nets bolsters my belief in
connectionism rather than the opposite. I would expect it is very difficult to
formulate "intelligence" mathematically, and to prove that DNs can or cannot
produce it.

~~~
ktRolster
>Neural nets are universal function approximators. There isn't something they
can't learn.

Teach one to sort an arbitrary list. A universal function approximator is not
a Turing machine.

~~~
riyadparvez
There is a new research going on in Neural Turing Machine which, in theory,
can exactly do that. Maybe we have to wait some time to see NNs can sort
numbers.

~~~
eva1984
On a high level, sort() is a seq2seq function, I think throw LSTM/RNN will
yield some interesting results to this problem.
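
A rough sketch of that framing (PyTorch, a toy setup of my own with length-8
digit sequences; all names illustrative): a bidirectional LSTM reads the whole
input and is trained to emit the sorted sequence position by position.

```python
import torch
import torch.nn as nn

SEQ, VOCAB, HID = 8, 10, 128  # length-8 sequences of digits 0-9

class SortNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(VOCAB, HID)
        # bidirectional so every output position sees the whole input
        self.rnn = nn.LSTM(HID, HID, batch_first=True, bidirectional=True)
        self.out = nn.Linear(2 * HID, VOCAB)

    def forward(self, x):                  # x: (batch, SEQ) integer digits
        h, _ = self.rnn(self.emb(x))
        return self.out(h)                 # (batch, SEQ, VOCAB) logits

model = SortNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(2000):
    x = torch.randint(0, VOCAB, (64, SEQ))
    y, _ = torch.sort(x, dim=1)            # supervision: the sorted sequence
    loss = loss_fn(model(x).reshape(-1, VOCAB), y.reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
```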

------
EGreg
The truth about most automation in general:

The logic is written by humans. The _main_ mechanism by which computers /
robots begin to outperform people in eg playing Chess, Go or Driving, is
_copying what works_.

Humans outperformed animals because they were able to try stuff, recognize
what works and transmit that abstract information using language.

The main advantage of computers is being able to _quickly and easily copy
bits_ and check for errors. You can have perfect copies now, preserving things
that before could only be copied imperfectly.

And now you copy algorithms that work. The selection process might need work
but the actual logic is still written by some human somewhere. It's almost
never written by a computer. Almost all the code is actually either written by
a human or at most generated by an algorithm written by a human, which takes
as input code written by another human.

What's the "smarter" thing is the system of humans banging away at a platform,
all making little contributions, and the selection process for what goes into
the next version. That's what's smarter than a single human. That and the
ability to collect and process tons of data.

All the current AI does is throw a lot of machines at a problem and store the
results in giant databases as precomputed input for later. That's what most
_big data_ is today. Whoever has the training sets and the results is now
hoarding them for a competitive advantage.

But really, the thing that makes the whole system smart is that so many humans
can make their own diff/patch/"pull request". Anyone can write a test and
submit a bug that something doesn't work. That openness is what made science
and open source successful.

Open source has served the _long tail_ better, too. Microsoft builds software
that runs on some hardware. Linux has been forked to run on _toasters_. Open
source drug platforms would have helped solve malaria, zika and other diseases
faster.

If we had patentleft in drugs, we'd outpace bacterial resistance. Instead we
have the profit motive, which has stagnated the development of new drugs.

~~~
return0
There is more to it than precomputed data; sometimes a network can point out
mappings that were not thought of beforehand. Humans copy what works; we spend
~25 years perfecting that ability.

~~~
EGreg
Aren't we simply storing the precomputed result of human-devised feature
extraction algorithms?

Yes they can run unsupervised, and yes they run for multiple iterations. But
they are written by humans for extracting features from data, not much
different than, say, PageRank 20 years ago iterating to approximate the
"popularity" of a page on the web.

This AI doesn't see a problem and come up with an algorithm to solve it. It
uses algorithms written by _humans_ to accomplish goals set by _humans_.
Nearly all AI today is still simply a preprogrammed machine.

What I am saying, though, is that _humans + a system_ becomes smarter than a
single human over time because we find ways to express "what works" in such a
manner that a massively parallel computing platform can then find a better
answer, faster and more consistently. That's it. This is _the algorithm
written by humans_ and collectively refined by humans banging away at a
platform.

And why open source would be better than the profit motive when it comes to
drugs.

~~~
Diederich
I hear what you're saying, but I don't think this is an algorithm made by
humans:
[http://karpathy.github.io/2016/05/31/rl/](http://karpathy.github.io/2016/05/31/rl/)

Of course, humans wrote the framework, but nothing to do with how to play the
game. Granted, the goals are set by humans.

~~~
EGreg
Actually, that algorithm is the result of many iterations of a simple
algorithm written by a human. It is itself the output of the algorithm, not
too different from the machine code generated by a compiler after
optimizations. The author says as much and names the algorithm-producing
algorithm:

 _But at the core the approach we use is also really quite profoundly dumb
(though I understand it’s easy to make such claims in retrospect). Anyway, I’d
like to walk you through Policy Gradients (PG), our favorite default choice
for attacking RL problems at the moment._

What the human is doing is identifying the problem, writing a solution, and
simply letting the computer work out the parameters through various
statistical methods. But the resulting algorithm itself is pretty much in the
narrow class of algorithms that were already described by the human. The
computer just executed a dumb and straightforward search through a space of
parameters, which itself was a simple preprogrammed algorithm.
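
To be concrete about what Policy Gradients amount to, here is a toy REINFORCE
sketch (my own construction in PyTorch, not Karpathy's actual Pong code; the
task and reward rule are made up): the human writes the update rule, and the
"learned" part is only the parameter values.

```python
import torch
import torch.nn as nn

# Toy policy-gradient (REINFORCE) loop on a made-up one-step task.
policy = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
opt = torch.optim.Adam(policy.parameters(), lr=1e-2)

for episode in range(500):
    obs = torch.randn(4)                          # stand-in observation
    dist = torch.distributions.Categorical(logits=policy(obs))
    action = dist.sample()
    reward = 1.0 if action.item() == 0 else 0.0   # arbitrary reward rule
    # The entire "intelligence": nudge up the log-probability of actions
    # in proportion to the reward they earned.
    loss = -dist.log_prob(action) * reward
    opt.zero_grad(); loss.backward(); opt.step()
```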

Look, most of our science is also pretty much parametrized models, often with
smoothness assumptions for calculus. Now with Deep Learning we may indeed find
more interesting parametrized models. But that is a far cry from understanding
abstract logical concepts and manipulating them to come up with _entire
algorithms from scratch_ to solve problems.

~~~
PeterisP
You seem to put a lot of weight on the notion that all these things can be
reduced to, essentially, a search for parameters within a particular (not all-
encompassing) solution space.

Do we have any good reason to suppose that a parametrized model isn't _enough_
for everything we'd want, including a system that has human-level or higher
intelligence and creativity? (assuming adequate structure that we don't yet
know, that allows for a sufficiently large solution space)

We have good reasons to assume that we ourselves, the sum of a particular
person's memories, skills, identity and intelligence, are contained within a
particular set of parameters encoded by different biochemical means in our
brains, and the process of how we learn skills, facts and habits is
_literally_ a search through that space of parameters.

"far cry from understanding abstract logical concepts" is more related to the
types of problems we're tackling - symbolic manipulation and reasoning is a
valid but very distinct field of AI, but it's not particularly useful for
these problems any more than it's useful to having a human programmer craft
explicit algorithms for computer vision.

You would expect a computer system "to come up with entire algorithms from
scratch to solve problems" if you were making a computer system to solve the
general problem of machine learning, i.e., a program to replace the research
scientist making learning systems, not the human who currently solves the
particular problem. We aren't trying to do that; it is a bit of a different
direction, isn't it?

~~~
EGreg
_We have good reasons to assume that we ourselves, the sum of a particular
person's memories, skills, identity and intelligence, are contained within a
particular set of parameters encoded by different biochemical means in our
brains, and the process of how we learn skills, facts and habits is literally
a search through that space of parameters._

Actually, I have better reasons to assume that a parametrized model would
poorly describe a human brain. We grow organically out of cells replicating in
an environment for which they have been adapted. These cells make trillions of
neural connections. Each cell has its own DNA etc.

We have already tried understanding just the DNA using straightforward
parametric models, and they are too organic to be described that way.

It is far more likely that human brains are specifically adapted to the world
they live in, and can operate with abstract concepts which are encoded in
fuzzy ways (like a progressive JPEG, for example) that allow us to _apply
concepts to situations_ and _search for concepts that fit situations_. The
concepts themselves are the hard part. It's not really a parametric model.
Each concept represents _experience_ that is stored between neurons.

Yes, we can teach these concepts to a computer eventually but we would have to
figure out a language to express this info and data structures to store it.
We'd still be designing the computer to mimic what we think we do. Ultimately
for the computer to truly replicate what humans do it might need to simulate a
gut brain, neurotransmitters etc. And even it would be only a simulation.

I think computer intelligence is just of a very different sort than human
intelligence. Less organic, far less able to come up with new concepts or
reprogram itself to "understand" concepts. It is fed parametrized models and
does a brute-force search or iterative statistical approximations, and then
saves the precomputed results, that's all. That's why humans can recognize a
cat with a brain that fits inside your head and consumes little energy, while
computers need a huge data center that consumes a lot of energy.

We aren't replicating human intelligence. We are building huge number
crunchers, and the algorithms are still written by humans.

Even our languages are so tied up in organic experience acquired over the
years (references to current events, puns, emotions, fear of some animals vs
dominance over others, inside jokes of each community, etc.) that language
recognition is currently quite dumb and has trouble with context. Once again
we solve this by dumbing down the human input, making people talk to computers
differently than they would if the computer "understood" anything they say the
way a _human with similar experience_ would.

When computers write algorithms to solve arbitrary problems the way groups of
humans do, then I'd admit we made a huge leap forward. As it is, AlphaGo and
self-driving cars are the result primarily of _human_ work and refinement of
the algorithms. It just looks amazingly smart because computers crunch numbers
fast and consistently, and _replicate_ what works across all the instances.

Computer AI does raise philosophical questions of identity and uniqueness, but
currently they are not capable of true abstract thought.

The closest system I know is Cyc: [http://www.wired.com/2016/03/doug-lenat-artificial-intelligence-common-sense-engine/](http://www.wired.com/2016/03/doug-lenat-artificial-intelligence-common-sense-engine/)

And once again the rules "we all know" were fed to it by humans, through a
language and data structures and code devised by humans, and now we will judge
whether it does well and replicate the result to millions of machines. We are
still doing nearly all the actual design.

------
argonaut
Not sure why this is so highly upvoted. Nobody is questioning that deep
networks work better than shallow ones, and there is a good understanding in
academia of why (that fits with most lay people's intuition). I hardly
consider that the most interesting or relevant question.

~~~
ewjordan
Actually, as a great fan of what has come out of the current deep learning
hypefest, I'd question whether "deep" really matters. Most of the great
successes have resulted from medium depth nets using the same shitty backprop
algorithms that have been known for decades.

"Deep learning" is a red herring. They're just doing exactly what the shallow
learning pioneers told them to do twenty years ago, just with more computing
power.

~~~
riyadparvez
There have been a lot of innovations recently, not just more computing power
-- layer-by-layer unsupervised training, batch normalization, ReLU, dropout,
momentum, gated memory units, neural language modeling, encoder-decoders, just
to name a few.

~~~
ewjordan
Yep, there are a lot of innovations, and plenty that people are holding back
as well.

None of it involves the word "deep"; it's all just bog-standard 1980-style
nets with tweaks.

------
arcanus
"Since I am feeling especially bold, I will make another prediction: deep
learning will not produce the universal algorithm. There is simply not enough
there to create such a complex system."

While I (emotionally) agree, it will be interesting to see if the complexity
(and non-linearity) of these algorithms permit 'emergent' behavior to appear.

~~~
ffwd
Or, without wanting to sound too conspiratorial: it's easier to sell the magic
of "intelligent machines" to the consumer if they have been drowned in AI
coverage in the media by the time there finally are helpful algorithms, useful
voice interactions, and so on. This is completely speculative on my part, but
I wonder about the autonomy of people, and how mental models could change if
all the machines around you are supposed to be smarter than you.

~~~
marlag
Has anyone proven logically that I, an entity, can create a new entity that is
more intelligent than me? My intuition might be wrong, but how would that ever
be a possibility, other than me aiding in the creation of a new human being?

~~~
mcbits
Relatively weak human arms have enough strength to assemble tools that are
much stronger than the original arms. A small, well placed flame can initiate
a reaction that burns even hotter and brighter than itself. Our abstract
concepts of strength, heat, brightness, and so on, are latent to varying
degrees in the environment. Certain arrangements of raw materials release that
potential.

If those analogies hold, then intelligence is still another concept expressed
by some configurations of matter. Who knows how much "latent intelligence" is
available to be released, but I guess the assumption is that it's much greater
than what's already manifesting in our brains.

------
nbvehrfr
From my intuitive understanding (not an expert), a very abstract description
of how it works in general:

- you have a real-world problem -> a task which you need to solve
- you build a model (algorithm, math method, etc.) which should solve the task
- you need to find the optimum of a complex function (the error function)

The third step is usually finding the optimum of the function. Deep neural
networks help you move complexity from step 2 to step 3. One example you
mentioned: feature engineering moves from step 2 to step 3. So you can use
simpler methods in step 2 to solve the same problems, or extend the range of
problems you can solve with the same complexity in step 2.
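
As a toy illustration of step 3 (my own example, in PyTorch): fitting
y = w*x by gradient descent is literally searching for the optimum of the
error function.

```python
import torch

# Fit y = w * x to data generated with slope 3 by minimizing squared error.
w = torch.zeros(1, requires_grad=True)
x = torch.linspace(-1, 1, 100)
y = 3.0 * x
opt = torch.optim.SGD([w], lr=0.1)

for _ in range(200):
    loss = ((w * x - y) ** 2).mean()   # the error function from step 3
    opt.zero_grad(); loss.backward(); opt.step()

print(w.item())  # converges to ~3.0
```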

------
estefan
Can anyone recommend a good resource, aimed at novices, that summarises what
the different algorithms are best suited for?

I've been working my way through
[http://neuralnetworksanddeeplearning.com/](http://neuralnetworksanddeeplearning.com/)
(with a big detour back into maths thanks to the Khan Academy) and have done a
few ML courses, but they mainly cover a couple of algorithms, not all the ones
available in Spark's MLlib or TensorFlow, for example.

------
yason
In my opinion, in the '80s and '90s, neural networks and machine learning used
to be 10% a solid concept in terms of academic research and 90% hype. Now
neural networks and machine learning are 10% a solid concept in terms of being
a practical, applicable tool and 90% hype. Things have changed a lot: I almost
run out of fingers when trying to count the orders of magnitude by which raw
processing power has increased. You can literally feed the network anything
when training and get reasonable results later in recognition. That's one
impressive yet humanly vague hash table there. And no, you don't have to wait
for months or weeks anymore to train new things. Not even days, necessarily.

Why people pull in artificial intelligence is both naively optimistic and
quite understandable. Modelling something like a neural system is so close to
how the biological brain works that the parallel is blatantly obvious. On the
other hand, the current deep networks do not translate to intelligence; not at
all. Machine learning might be, in part, something we could describe as
"intelligent", as it's able to connect dots that are very difficult to connect
with traditional algorithms, but it absolutely is not intelligence. Then
again, we do hang out in the same neighbourhood. If we ever create an
artificial intelligence in software, I'm quite certain it will be very much
based on some sort of massively deep and parallel network of dynamic
connections.

I'm not that interested in artificial intelligence myself. I would be
interested in artificial creativity and emotional senses, but to model those
there are bigger metaphysical questions to be answered first.

------
pkghost
I love the last sentence, and want to expand on it. If ANNs are tools to help
computers perceive, then they are analogous to components or layers in the
nervous system. If we map the nervous system thoroughly enough and understand
the inputs and outputs of each layer/region, then reproducing a human-like
nervous system might not be all that complicated.

~~~
narrator
If reproducing the human nervous system weren't that complicated, we could do
drug design inside a computer. The ability to do that alone would be worth
billions.

~~~
Xcelerate
> If reproducing the human nervous system weren't that complicated, we could
> do drug design inside a computer.

While the myriad nuances of the entire human body are indeed significant
roadblocks to drug development, we have a long way to go before those concerns
represent the primary bottleneck to progress. If a simulation of a protein's
local environment were to reach chemical accuracy (via either some algorithmic
breakthrough in quantum chemistry or the development of scalable quantum
computers), that would be a _huge_ boon to drug development.

------
peter303
People have been working on neural nets for over 50 years now. The topic goes
in and out of fashion. The nets are more capable now, and computers are vastly
more powerful.
[https://en.m.wikipedia.org/wiki/Perceptrons_(book)](https://en.m.wikipedia.org/wiki/Perceptrons_\(book\))

------
Cozumel
When you have to train a network with a zillion images of a dumbbell
([http://www.businessinsider.sg/googles-ai-can-teach-us-about-the-human-brain-2015-7/#.V1AWhT9VK1E](http://www.businessinsider.sg/googles-ai-can-teach-us-about-the-human-brain-2015-7/#.V1AWhT9VK1E))
for it to recognise what a dumbbell is, and then it still gets it wrong
(adding arms!), then something's fundamentally broken, inasmuch as humans
don't learn like that. DL is a huge step forward but it's not ever going to be
any kind of AGI.

------
DrNuke
As usual with tools, even these, a clear understanding of the specific
problem, the relevant metrics, and the expected goal is decisive. I am saying
that experimental protocols are still devised by humans against a
cost-vs-opportunity matrix. Brute computational force is not independent yet;
artificial intelligence has not emerged yet.

------
EGreg
This article shows how deep learning is different from true _human-like_
understanding:

[http://www.wired.com/2016/03/doug-lenat-artificial-intelligence-common-sense-engine/](http://www.wired.com/2016/03/doug-lenat-artificial-intelligence-common-sense-engine/)

------
Xcelerate
> deep learning will not produce the universal algorithm

I'm curious what HN users think the "universal algorithm" will end up looking
like?

My own guess (wild speculation) is that we'll start moving in the direction of
concepts like tensor networks. While that term sounds like it has something to
do with machine learning, it actually falls under the domain of theoretical
physics. Tensor networks are a relatively recent development in quantum
mechanics that show promise because of their ability to extract the
"interesting" information from a quantum state. Generally speaking, it's very
difficult to compute/describe/compress a quantum state because it "lives" in
an exponentially large Hilbert space. Traditionally, the field of quantum
chemistry has built this space up using Gaussian basis functions, and the
field of solid state physics has built it up using plane waves. The problem is
that regardless of the basis set chosen, it appears as though exponentially
more basis vectors are required to accurately describe a quantum state as the
system becomes larger.
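
To give a sense of the scale gap (my numbers, a standard back-of-the-envelope,
not from any particular paper): the state space of n qubits grows
exponentially, while a matrix product state, the simplest tensor network,
needs only polynomially many parameters for a fixed bond dimension χ:

```latex
% n-qubit Hilbert space vs. a matrix product state (MPS) ansatz
\dim \mathcal{H} = 2^{\,n}
\qquad\text{vs.}\qquad
\#\text{params}(\mathrm{MPS}) = O(n \, d \, \chi^{2}), \quad d = 2
```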

Tensor networks are an attempt to alleviate this problem. While it is true
that the state space of an arbitrary quantum system is exponentially large in
the number of particles, it turns out that for _realistic_ quantum systems,
the relevant state space is actually much smaller — i.e., real systems seem to
live in a tiny corner of Hilbert space. And this tiny subspace even includes
all of the possible states that one could put a collection of qubits into
within the lifetime of the universe.

The projection of a system's state vector into either the position or momentum
basis is known as the system's "wavefunction" (some texts allow more than
these two bases). Since the wavefunction exhibits the highly desirable
property of being localized in position/momentum space, this allows one to
build up a good approximation to the state using Gaussians or plane waves —
that is, unless the wavefunction exhibits strong electron correlation (quantum
entanglement). Quantum entanglement is the exception to nature's tendency to
localize state space about a point in spacetime, and thus it is frequently the
case that the most commonly used basis sets are highly suboptimal for many
real electronic systems (superconductors stand out as a notable and somewhat
pathological example).

I'm not entirely familiar with all of the math behind it, but tensor networks
essentially describe the small but relevant region of Hilbert space by
exploiting properties of the renormalization group. In this sense, a compact
way of describing "real world" quantum states is developed. I think this has
applications to a "universal algorithm", because real world data rarely
consists of a random or uniform scattering of information across the data's
state space. In my own research, I've found that a lot of the NP-hard problems
I run into are efficiently solvable in practice (stuff involving low rank PSD
matrices) precisely because the data _isn't_ random. If tensor networks are
good at finding a basis set that is "local" in abstract Hilbert space with
regard to some real-world set of quantum states, then it seems as though they
would work equally well for a lot of the real world data that lives on a low-
dimensional manifold in a high-dimensional space — the kind of data that
machine learning (and eventually artificial general intelligence) seeks to
tackle.

------
isseu
Talking about the benefits of DNNs, what about the levels of abstraction? Each
layer adds levels of abstraction that you can't see in shallow networks.

~~~
clmcleod
I think the concept of "interpretability" is what you are getting at. I group
that in with automatic feature engineering, since they are the same idea from
different perspectives. Sometimes that is a benefit, sometimes it's not:
[http://blog.keras.io/how-convolutional-neural-networks-see-the-world.html](http://blog.keras.io/how-convolutional-neural-networks-see-the-world.html)

------
stared
> "deep learning will not produce the universal algorithm"

I doubt that a general algorithm exists (why should it?).

But well, if we are talking about human-level (or superhuman-level) AI, it is
good to remember that WE are deep, recurrent neural networks (with a very
different implementation, and spikes instead of floats, but still). If it
works in vivo, why shouldn't its abstracted version work in silico?

------
armitron
Entirely content-free post. Click-bait most likely.

------
radarsat1
> Nothing is more frustrating when discussing deep learning than someone
> explaining their views on why deep neural networks are “modeled after how
> the human brain works” (much less true than the name suggests) and thus are
> “the key to unlocking true artificial intelligence”.

While I get what he is saying here, and more or less agree, I think it is not
to be taken lightly that there _is_ a significant difference in this
discussion now as compared to 30 years ago. The difference is not _how_ neural
networks work, which clearly differs but is related in some ways to the brain,
but rather _what_ neural networks see.

What is really significant when you can handle lots and lots of data, and
throw it all at a giant neural network, is what we see happening in the
network. The observation that the hidden-layer filters developed as an optimal
feature for classifying images appear to be Gabor-like directional filters
(I'm referring of course to this type of thing [1]) is not random, and not an
insignificant result. It really does relate to perception, in the sense that
1) we know that the brain has directional filters in the visual cortex and 2)
more importantly, from signal processing theory we know that such filters are
"optimal" from a certain mathematical point of view, and if they develop
naturally as the best way to interpret "natural" images (or other natural
data, such as audio [2]), it shows that development of such filters in the
brain is perhaps also quite likely. There is quite some research in
neuroscience at the moment looking for evidence of such optimal filters in
early neural pathways.
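
For the curious, a Gabor filter is just an oriented sinusoid under a Gaussian
envelope; here's a quick numpy sketch (my own, not from [1]):

```python
import numpy as np

def gabor(size=11, wavelength=4.0, theta=0.0, sigma=2.5):
    """Build a Gabor filter: a Gaussian envelope times an oriented cosine."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    x_rot = xx * np.cos(theta) + yy * np.sin(theta)    # rotate by theta
    envelope = np.exp(-(xx**2 + yy**2) / (2 * sigma**2))
    carrier = np.cos(2 * np.pi * x_rot / wavelength)
    return envelope * carrier   # responds strongest to edges at angle theta

print(gabor().round(2))         # 11x11 oriented edge detector
```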

So yes, neural networks are not models of "how the brain works", but the newly
established ability to process huge amounts of data, and to examine what kind
of learning happens in order to optimise this processing, can tell us a lot
about the brain -- not how it works, but what it must _do_. Complemented with
work in neuroscience, the idea of modeling information processing is _not_
unrelated and can really lead to some significant contributions to our
understanding of perception... and perhaps, eventually, cognition -- but who
knows.

The misunderstanding here is thinking that the be-all and end-all of
neuroscience is studying how neurons fire and interact. Neuroscience is much
more than that. Neuroscientists want to know how we experience and understand
the world, and a big part of that is understanding what is required to process
and interpret information, what is the information, what are its statistics,
and what kind of neural processing would be required to extract it from our
sensory inputs. Of course, this must be complemented by studies of how humans
_do_ react to stimuli, to try to verify that we _do_ process information
according to some model. But that model being verified -- that comes from what
we know about information processing, and computer science can contribute
there in a significant way.

[1]:
[https://computervisionblog.files.wordpress.com/2013/05/gabor.png](https://computervisionblog.files.wordpress.com/2013/05/gabor.png)

[2]:
[http://www.nature.com/neuro/journal/v5/n4/abs/nn831.html](http://www.nature.com/neuro/journal/v5/n4/abs/nn831.html)

------
dredmorbius
Define your terms. WTF is "Deep Learning"?

~~~
robotresearcher
[https://en.wikipedia.org/wiki/Deep_learning](https://en.wikipedia.org/wiki/Deep_learning)

I suspect your question was somewhat rhetorical, but since you asked.

~~~
dredmorbius
No, it's a direct criticism of lazy or sloppy writing style in which authors
fail to communicate effectively.

The term has specific meaning within a specific field (not clearly
identified), and has been growing rapidly (which is to say: hasn't reached a
stable level of penetration) within printed material over the past decade or
so.

