
What are the limits of deep learning? - pizza
https://www.pnas.org/content/116/4/1074
======
endymi0n
I think the current line of critique of Deep Learning as a technique is
itself a dead end. Now, I'm only a semi-professional in a sea of recent
development and research, but if you ask me, the technique itself can take us
all the way to General Intelligence and beyond.

What's missing from this discussion is that we humans don't just use a
single, vision-based "deep net" to identify the banana; we use lots of very
different deep nets that run in parallel and cross-validate each other.

We have a vision net that, like a computer's, can be fooled with similar
tricks (just google "optical illusion").

But a second net does three-dimensional inference of the banana's shape from
stereo vision and cross-validates the classification against it: a flat
toaster sticker would fail that check. If in doubt, our body is commanded to
make small, unnoticed parallax movements to get a better shape estimate.

Another net parses the scene for plausibility. A banana in a jungle or on a
table would be highly plausible; a banana in an operating room or on a roller
coaster would be implausible.

Another net parses context and timeline. If no banana was there before and no
banana has entered our continuously estimated 3D space, there is probably no
banana there.

There's smell, taste, all kinds of things running in parallel that
continuously cross-validate our perception of this being a banana; probably a
lot of different models from vision alone feed back into the main
classification thread.

We'll even happily do looped cross-validation if our eyes rest on a scene.
Our memory will fetch reference data for all first-order predictions and
start subconsciously processing it into the main prediction. It will predict
how a banana feels, how soft it is, how warm. If we see a banana on lava that
is not burning, our logic and reasoning will challenge the prediction. All of
this happens subconsciously, and there's no magic behind it.

If you ask me, there are no limits to deep learning; there's still just a
limit of imagination in feature extraction and processing.
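
To make the idea concrete, here's a toy sketch of that "many nets
cross-validating" architecture in Python. Everything here is made up for
illustration (the stub nets and their numbers are not a real system): several
independent estimators vote, and too much disagreement triggers a request for
more evidence instead of a classification.

    import numpy as np

    CLASSES = ["banana", "toaster", "nothing"]

    # Stubs standing in for independent perception "nets", each
    # returning a probability distribution over CLASSES.
    def texture_net(image):
        return np.array([0.10, 0.80, 0.10])  # fooled by an adversarial sticker

    def shape_net(stereo_pair):
        return np.array([0.85, 0.05, 0.10])  # a flat sticker fails the 3D check

    def context_net(scene):
        return np.array([0.70, 0.10, 0.20])  # a banana on a table is plausible

    def fuse(predictions, disagreement_threshold=0.5):
        # Cross-validate: average the nets, but if they disagree too
        # much, ask the body for more evidence (e.g. parallax movement).
        preds = np.stack(predictions)
        if preds.std(axis=0).max() > disagreement_threshold:
            return "move head, gather more data"
        return CLASSES[int(preds.mean(axis=0).argmax())]

    print(fuse([texture_net(None), shape_net(None), context_net(None)]))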

~~~
YeGoblynQueenne
>> We have a vision net that, like a computer's, can be fooled with similar
tricks (just google "optical illusion").

I don't know of any optical illusions where a bus turned upside down is
perceived as a plowshare, or an image of an elephant superimposed on the image
of a room causes the other objects in the room to be misidentified.

Human optical illusions reveal the structural bias in the way we see, but that
has nothing to do with adversarial examples and overfitting to a few pixels.

~~~
endymi0n
> I don't know of any optical illusions where a bus turned upside down is
> perceived as a plowshare, or an image of an elephant superimposed on the
> image of a room causes the other objects in the room to be misidentified.

I wouldn't be so sure about that, actually. Observing my daughters, I'm
sometimes extremely surprised by how they identify objects. As you grow, your
conscious mind probably simply stops noticing the continuous cross-validation
happening.

Also, given the severe augmentation of years of ~50 fps "live video" training
data, I can imagine there's another compensation algorithm at work in humans
that reorients any picture into its most common, "upright" position before
even trying to identify anything.

~~~
YeGoblynQueenne
I don't understand what you mean by cross-validation. I'm guessing you don't
mean actual cross-validation, where you test a model learned on a training
partition against a testing partition, etc?
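
For reference, "actual" cross-validation looks something like this: a model
is repeatedly trained on one partition of the data and scored on the held-out
partition. Minimal scikit-learn sketch (a toy dataset, not anyone's real
setup):

    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    X, y = load_iris(return_X_y=True)
    # 5-fold CV: train on 4/5 of the data, test on the remaining 1/5,
    # rotating through all five folds.
    scores = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5)
    print(scores.mean())  # average held-out accuracy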

~~~
tabtab
Maybe it's about context. One wouldn't expect to see a (realistic) elephant
in a living room. For humans, if something seems "out of place", we tend to
focus on it more to make sure we are interpreting it correctly. NNs have no
sense of "out of place" other than "knowing" that X and Y are not normally
seen together in the training set.

Humans give oddities "more CPU" (or run like heck to avoid being trampled and
worry later about being right). We apply life experience and logic to explore
them more deeply. In the elephant case, the shadows also look wrong, which
should trigger the "maybe it's a cheap Photoshop job" module in our brain.
But NNs don't understand the concept of cheap Photoshop jobs, because they
don't browse the web for cute kitten videos and fun-but-fake news like humans
do. Maybe NNs can be trained for fake-photo detection, but solving every
identification hurdle of the human world by adding yet another dimension to
the training step probably won't scale.
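
A toy sketch of that "more CPU for oddities" idea (the detectors here are
stubs I made up, not a real pipeline): a cheap first pass runs on everything,
and only detections that are implausible for the scene get escalated to
slower checks.

    # Made-up P(object | scene) priors; a real system would learn these.
    co_occurrence = {
        ("elephant", "living_room"): 0.001,
        ("sofa", "living_room"): 0.60,
    }

    def cheap_detector(image):
        # Stub standing in for a fast NN pass: (label, confidence) pairs.
        return [("elephant", 0.92), ("sofa", 0.88)]

    def expensive_checks(label):
        # Stub: shadow consistency, "cheap Photoshop job" forensics, etc.
        return f"re-examining {label} with slow checks"

    def perceive(image, scene="living_room", oddity_threshold=0.01):
        for label, confidence in cheap_detector(image):
            prior = co_occurrence.get((label, scene), 0.1)
            if prior < oddity_threshold:  # out of place: give it more CPU
                print(expensive_checks(label))
            else:
                print(f"accepting {label} ({confidence:.2f})")

    perceive(None)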

------
evrydayhustling
Yet another piece that cherry-picks AI failures in order to make the vague
thesis that "deep learning is not enough". I can't decide whether it makes
the article more or less honest that the "Beyond Deep Learning" section is
essentially a catalog of... recent developments in deep learning. The need to
(selectively) pigeonhole and then judge (on impossible-to-operationalize
criteria) a fast-evolving field is purely rhetorical and, worse, aimed at
folks who don't understand the way the field is progressing.

I've got nothing against raising platforms for alternative lines of AI
research, especially highlighting the ways they complement current deep
learning. But this article format is so transparent and tired.

~~~
dmreedy
The criteria certainly are impossible to judge currently, which makes your
assertion that the field is 'progressing' strange.

I think there's a lot of merit to remembering the last time we got this
excited about a given model for AI. And the time before that. And the time
before that.

Go back further, to the time that we were all so excited that we could model
the entire world with a little bit of beautiful logic. And the subsequent
disappointment.

There is so much nonsense out there, so much hand-waving, mixed signals, false
promises, downright naivety, and both malicious and completely inadvertent
miscommunication. A little rigor in a field _this_ excitable would be nice, I
say.

~~~
evrydayhustling
The progress is well defined. Both academic and industrial deep learning
groups aggressively define challenge tasks in vision, NLP, relational learning
and more, then document progress on them. There is even an unprecedented
release of code and datasets to let others replicate and extend work.

Detractions like your "so much nonsense out there" pick straw-man goalposts,
whether it's AGI (whatever that really means) or "the craziest thing I heard
from IBM marketing". The fact that some people are talking about crazy
milestones doesn't mean there aren't real ones being met.

------
alexandernst
I'm not an AI expert in any way; I've only just started learning some basic
stuff. But so far, the only thing I see from all the algorithms I have
covered is that AI, deep learning, ML and so on are just statistical
databases or sets, so to speak. They don't learn anything, they just repeat.
From that I can say that AI is limited to the data you can label and then
feed it.

------
YeGoblynQueenne
>> And it could make the networks far less vulnerable to adversarial attacks
simply because a system that represents things as objects, as opposed to
patterns of pixels, isn’t going to be so easily thrown off by a little noise
or an extraneous sticker.

But representing objects and their relations is how Good Old-Fashioned AI
works, and the inability to deal with noisy input was exactly where GOFAI
systems stumbled. How are relational representations suddenly going to help
neural nets (already famously robust to noise) handle noise better?

In any case, if we really want to learn relations there are symbolic learning
techniques that can do that very cheaply from very few examples. Are we
reinventing the wheel, only with more artificial neurons, now? And what's the
point of that?

~~~
AstralStorm
The problem is that neither kind of AI can differentiate between something
new and noise. Even probabilistic models have this problem, to a lesser
degree. So the idea is indeed to have an ANN learn a kind of semantic
network; but then the problem is still pruning it and clustering reliably, as
well as detecting the new.
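
A quick illustration of why that's hard (toy numbers, nothing trained): the
usual cheap uncertainty proxy is the entropy of the predictive distribution,
and it can't separate "genuinely new" from "just noise"; both land on an
equally flat distribution.

    import numpy as np

    def entropy(p):
        p = np.asarray(p)
        return float(-(p * np.log(p + 1e-12)).sum())

    print(entropy([0.97, 0.01, 0.02]))  # familiar input: low entropy
    print(entropy([0.34, 0.33, 0.33]))  # novel object: high entropy
    print(entropy([0.33, 0.34, 0.33]))  # pure noise: equally high entropy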

------
hprotagonist
"NNs that rely on backprop only really work on continuous functions and many
interesting things like language[0] are not products of those" is real high on
my list of limits.

[0]: [https://www.linkedin.com/pulse/google-hyping-why-deep-learni...](https://www.linkedin.com/pulse/google-hyping-why-deep-learning-cannot-applied-easily-berkan-ph-d)

~~~
yorwba
That argument seems to be based on the idea that because written language uses
a discrete encoding, it's impossible to use continuous functions. But as soon
as you replace deterministic discrete values by continuous probabilities of
those values, you're back in the space of continuous functions. Even non-NN
approaches to NLP use that representation, because optimizing continuous
parameters is something we can do very well. Discrete optimization is much
harder.
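
A minimal sketch of that relaxation (made-up numbers, untrained weights): the
discrete token becomes a continuous embedding vector, and the output is a
differentiable distribution over the vocabulary rather than a hard symbol.

    import numpy as np

    rng = np.random.default_rng(0)
    vocab = ["the", "cat", "sat"]

    # Discrete tokens -> continuous vectors: an embedding lookup table.
    embeddings = rng.normal(size=(len(vocab), 4))

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    # A random linear "model": continuous in, continuous out.
    W = rng.normal(size=(4, len(vocab)))
    logits = embeddings[vocab.index("cat")] @ W

    # Instead of a hard next token we get a differentiable distribution,
    # which is what makes gradient-based optimization possible.
    print(dict(zip(vocab, softmax(logits).round(3))))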

~~~
YeGoblynQueenne
And how well do all those techniques work when it comes to representing
meaning, rather than structure?

~~~
yorwba
I think meaning isn't really separable from structure. But assuming you're
talking about semantics vs. syntax, then continuous representations can
certainly do a lot more than just syntax.

They can model how words change in meaning over time [1], they can identify
words that mean the same in different languages [2] and use that to translate
whole sentences [3] without ever seeing a direct translation.

Of course none of those tasks are solved to perfection yet, but _some_ kind of
meaning is definitely being captured.

[1] [https://arxiv.org/abs/1703.00607](https://arxiv.org/abs/1703.00607)

[2] [https://arxiv.org/abs/1710.04087](https://arxiv.org/abs/1710.04087)

[3] [https://arxiv.org/abs/1710.11041](https://arxiv.org/abs/1710.11041)

~~~
YeGoblynQueenne
I disagree. If I tell you that argoubr + vliugo - viyt = plqtv, do you now
understand what any of those words mean? If I say that "gye" must always
precede "turo", do you know what "gye turo" means, or why it doesn't make
sense to say "turo" without "gye"? What kind of meaning is captured by the
relations between these words that the rules I just described define? Would
it help if I gave you all the possible ways that every word in the language
those words are drawn from is to be articulated together, without telling you
what any of those words mean? And how would I explain those words' meaning
without recourse to words whose meaning you already know?

Also, that's what I mean by "structure": the context in which tokens are
found in text. I don't want to say "syntax" or "semantics", because it seems
to me those are loaded terms with very specific, er, meanings, and I'm not
enough of a linguist to unpick them.
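
(For anyone who hasn't seen it, the arithmetic I'm parodying above is the
word2vec-style analogy trick. Toy sketch with made-up vectors, since the
point doesn't depend on a trained model:)

    import numpy as np

    # Made-up 3-d "embeddings"; a real model learns these from text.
    vec = {
        "king":  np.array([0.9, 0.8, 0.1]),
        "man":   np.array([0.5, 0.8, 0.1]),
        "woman": np.array([0.5, 0.1, 0.9]),
        "queen": np.array([0.9, 0.1, 0.9]),
    }

    target = vec["king"] - vec["man"] + vec["woman"]

    def nearest(t):
        # Cosine similarity against every word in the tiny vocabulary.
        sims = {w: t @ v / (np.linalg.norm(t) * np.linalg.norm(v))
                for w, v in vec.items()}
        return max(sims, key=sims.get)

    print(nearest(target))  # "queen" -- but is that meaning, or structure?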

~~~
yorwba
> Would it help if I gave you all the possible ways that every word in the
> language those words are drawn from is to be articulated together, without
> telling you what any of those words mean? And how would I explain those
> words' meaning without recourse to words whose meaning you already know?

That's why I linked the two papers on unsupervised machine translation. Given
enough text in an unknown language, it is already possible to turn those
unknown words into words which humans understand.

~~~
YeGoblynQueenne
But that's humans. Humans already understand language.

I don't disagree that humans can extract meaning from language models. But
that only speaks to the ability of humans to extract meaning from even very
badly formed language. Not to the ability of those models to represent
meaning.

Edit: OK, I don't think I understand what you, er, mean. I think you're saying
that language models essentially have a kind of understanding of language, and
not just the ability to reproduce it without understanding it. Is that what
you are saying?

------
cwkoss
The limit of deep learning is the quantity and accuracy of training data. For
some problems you can't run 10^x trials; you can synthesize training data,
but it may be of questionable real-world relevance and lead to overfitting to
the generation algorithm.

~~~
randcraw
The other big limit to DL is its inability to learn a new concept from a
_single_ training example, AKA one-shot learning.

The fact that humans _do_ most of their supervised learning this way
underscores that, even though DL nets intriguingly resemble the architecture
of the brain, DL's current learning process clearly departs from the one used
by the brain. (Unless the way a baby fiddles with each new object is in fact
a form of data augmentation, increasing the visual sample space by changing
the object's angles. But I digress...)

Better to drop the "Neural Net as Brain" metaphor before we Procrustes it to
death.

~~~
hnick
It also has no imagination, unless I've missed something (I admit I really
need to catch up in this area).

Humans are experts at imagining 'what if' scenarios and learning from things
that never even happened. "If this was bright red, is it still a cat? Or do
all cats have a limited range of colours on a fixed cat-colour-spectrum?" I'm
not sure if any DL systems do that?

~~~
randcraw
Proposing what-if scenarios, asking questions of your own, and using your
imagination generally falls under the name "counterfactuals" -- a big area of
AI research. But to date, the priorities of deep learning have focused
elsewhere (mostly on perception of images and speech via discrete
classification).

Likewise, 99% of the computing strategy used by DL practitioners and
researchers continues to be large-scale supervised learning -- because it
continues to make advances in many problem spaces just as it is.

Is it possible to escape DL's apparent limits and use it differently to solve
problems that lie in the counterfactual domain? Perhaps. But so far there's
been relatively little work, and only modest results, along those lines.

Given the architectural model used by DL (large matrix multiplications
iterated via SIMD operations on GPUs), the prospect of applying it to
alternative AI objectives that are more conventionally symbolic, and whose
methods are typically algebraic (and don't naturally share DL's computational
model), does not seem especially promising. Not so far, anyway.

------
darawk
> Then there’s the opacity problem. Once a deep-learning system has been
> trained, it’s not always clear how it’s making its decisions.

This is no less true of human decision making.

------
otabdeveloper1
What do you mean 'limits'?? There are no limits to deep learning! Singularity
here we come!

(That was sarcasm. Sad that a disclaimer is needed nowadays.)

