
The Flaw Lurking in Every Deep Neural Net - bowyakka
http://www.i-programmer.info/news/105-artificial-intelligence/7352-the-flaw-lurking-in-every-deep-neural-net.html
======
abeppu
(a) The link submitted by the OP is to a blog post that discusses and
extensively quotes another blog post [1], which in turn discusses an actual
paper [2]. Naturally, the paper is where the good stuff is.

(b) Both blog posts somewhat understate the problem. The adversarial examples
given in the original paper aren't just classified differently from their
parent images -- they're crafted to receive a specific classification. In
Figure 5 of the arXiv version, for example, they show clear images of a school
bus, temple, praying mantis, dog, etc., which all received the label "ostrich,
Struthio camelus".
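
Roughly, the recipe is: pick a target label and follow the gradient of that
label's loss with respect to the input pixels, keeping the total change tiny.
A minimal sketch of the idea (the paper itself uses box-constrained L-BFGS;
this assumes a differentiable PyTorch `model` and an `image` tensor in [0, 1],
and all names and step sizes are illustrative):

```python
import torch
import torch.nn.functional as F

def targeted_adversarial(model, image, target_class, step=0.01, eps=0.05, n_steps=50):
    """Nudge `image` so that `model` labels it `target_class`, keeping the
    perturbation within +/- eps per pixel (a sketch, not the paper's exact method)."""
    x = image.clone().detach().requires_grad_(True)
    target = torch.tensor([target_class])
    for _ in range(n_steps):
        loss = F.cross_entropy(model(x.unsqueeze(0)), target)
        loss.backward()
        with torch.no_grad():
            x -= step * x.grad.sign()                      # step toward the target class
            x.copy_(image + (x - image).clamp(-eps, eps))  # keep the change imperceptible
            x.clamp_(0.0, 1.0)                             # stay a valid image
        x.grad.zero_()
    return x.detach()
```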

(c) The blog post at [1] wonders whether humans have similar adversarial
inputs. Of course it's possible that we might, but I suspect that we have an
easier time than these networks in part because: (i) We often get labeled data
on a stream of 'perturbed' related inputs by observing objects in time. If I
see a white dog in real life, I don't get just a single image of it. I get a
series of overlapping 'images' over a period of time, during which time it may
move, I may move, the lighting may change, etc. So in a sense, human
experience already includes some of the perturbations that ML techniques
have to introduce manually to become more robust. (ii) We also get to take
actions to get more/better perceptual data. If you see something interesting
or confusing or just novel, you choose to focus on it, or get a better view
because of that interestingness or novelty. The original paper talks about the
adversarial examples as being in pockets of low probability. If humans
encounter these pockets only rarely, it's because when we see something weird,
we want to examine it, after which that particular pocket has higher
probability.

[1] [http://www.i-programmer.info/news/105-artificial-intelligence/7352-the-flaw-lurking-in-every-deep-neural-net.html](http://www.i-programmer.info/news/105-artificial-intelligence/7352-the-flaw-lurking-in-every-deep-neural-net.html)

[2] [http://arxiv.org/abs/1312.6199](http://arxiv.org/abs/1312.6199) or
[http://cs.nyu.edu/~zaremba/docs/understanding.pdf](http://cs.nyu.edu/~zaremba/docs/understanding.pdf)

~~~
arjie
I'm sure someone else has considered this before, but perhaps optical
illusions are similar examples of data that causes pathological behaviour?

Many of them constrain our viewpoint or the sequence of images viewed so we'd
be unable to take the actions you mention to handle them.

~~~
theforgottenone
Yes. But more importantly, we are continuously training our network. There is
no terminating "training set" beyond the set of everything we have ever tried
to classify. Discovering that we were "wrong" about a thing is our network
continually adjusting. We also have the notion of ignorance: these classifiers
are often forced to come to a conclusion, instead of having an "I don't know,
let me look at it from another angle" kind of self-adjustment process. "Aha,
it is a cat!" moments do not happen for AI. In us, such a moment effectively
wraps the classifier in a whole new layer of uncertainty logic: we are
motivated to examine the reasons behind our initial failure and to use the
conclusions to train this new layer, further developing strategies for
adapting to those inputs.

------
Houshalter
There are some possible applications of this:

Better captchas that are _optimized_ to be hard for machines, but easy for
humans.

Getting around automated systems that filter content, like detecting
copyrighted songs.

Training on these images improves generalization. Essentially these images add
more data, since you already know what class they should be given. But they
are optimal in a certain sense: they probe the things the NN gets wrong and
find the places where it has bad discontinuities.
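
A rough sketch of what that training loop could look like (PyTorch; the attack
here is a cheap gradient-sign step rather than the paper's optimization, and
all names are illustrative):

```python
import torch
import torch.nn.functional as F

def gradient_sign_attack(model, x, y, eps=0.05):
    """One gradient step away from the true labels -- a cheap stand-in for the
    paper's optimization-based attack."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0.0, 1.0).detach()

def train_step(model, optimizer, x, y):
    """Train on the clean batch plus adversarial copies with their original labels."""
    x_adv = gradient_sign_attack(model, x, y)
    loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```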

~~~
varelse
"Better captchas that are optimized to be hard for machines, but easy for
humans."

Nope, not gonna work. You'd need the classifier/ANN parameters in the first
place in order to locate its adversarial counterexamples. Otherwise, the
perturbations would likely be irrelevant noise.

~~~
Houshalter
The discovery of the paper was that these adversarial examples worked on
_other_ neural networks. Including ones trained on entirely different
datasets. They are not specific to a single NN.

~~~
varelse
Well... not really. They split the MNIST data set and trained on disparate
halves. Which is to say, I wouldn't generalize from two networks, each trained
on far fewer examples than 10x their parameter counts, all the way to _all_
neural networks in existence -- but of course, your opinions may vary...

------
bluekeybox
Nobody has pointed this out yet: it would be very interesting to keep finding
such perturbations and repeatedly add the newfound examples to the training
set, retraining the model in the process. I wonder whether, after a finite
number of iterations, the resulting model would be near-optimal (impossible to
perturb an image without losing its human recognizability) -- or, if that is
impossible, whether we could derive some proof of precisely why.

~~~
murbard2
The article points out that this is the approach taken by the researchers who
found the effect.

------
OscarCunningham
I don't find it so surprising that, out of the vast number of possible small
perturbations, there are a few that cause the image to be misclassified. I
suppose it is interesting that you can systematically find such perturbations.
But is there anything here which suggests that a neural network which does
well on a test set won't continue to do well so long as the images given to it
are truly "natural"?

~~~
wodenokoto
The interesting thing is that it gets misclassified across networks.

According to the blog post, I can build two NNs with different structures and
train them on random subsets of a collection of dog and cat pictures. If I
distort a random picture until network A misclassifies it, then according to
the article network B will also misclassify it, despite having a different
structure and a different training set.

I don't think it's obvious that network B will fail as well.
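
That cross-network transfer is something you can measure directly: attack
network A and count how often network B falls for the same images. A sketch
(assuming PyTorch tensors, two independently trained models, and some
`attack(model, x, y)` function; all names are illustrative):

```python
def transfer_rate(model_a, model_b, images, labels, attack):
    """Fraction of adversarial images crafted against model_a that also fool model_b."""
    fooled = 0
    for x, y in zip(images, labels):
        x_adv = attack(model_a, x, y)                         # crafted only against A
        if model_b(x_adv.unsqueeze(0)).argmax().item() != y:  # ...yet B is fooled too
            fooled += 1
    return fooled / len(images)
```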

~~~
OscarCunningham
You're right. I guess I just don't like the fact that it's titled as "The Flaw
Lurking In Every Deep Neural Net", when in fact neural nets will continue to
classify new data as well as ever.

I agree that what you point out is very interesting.

------
varelse
More like "The Flaw Lurking in Every Machine Learning Algorithm with a
Gradient"(1) IMO. For example, in a linear or logistic classifier, the
derivative of the raw output with respect to the input is the input itself
while the derivative of the input is the weight. Knowing this one can use the
weights of _any_ classifier to minimally adjust the input to produce a
pathological example.
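
Concretely, for a logistic classifier with score w.x + b, the gradient of the
score with respect to the input is just w, so the weights tell you exactly how
to nudge x. A sketch in numpy (eps and names are illustrative):

```python
import numpy as np

def flip_logistic(x, w, b, eps=0.1):
    """Minimally perturb x so that sigmoid(w.x + b) moves toward the opposite
    class. Since d(w.x + b)/dx = w, stepping against sign(w) changes the score
    fastest per unit of max-norm perturbation."""
    score = w @ x + b
    direction = -np.sign(w) if score > 0 else np.sign(w)
    return x + eps * direction
```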

As for humans, I submit we have all sorts of issues like this. It's just that
we have a temporal stream of slightly different versions of the input and that
keeps inputs like this from having any significant area under the curve. Have
you never suddenly noticed something that was right in front of you all along?

(1) And probably those that don't too, but it's harder to find cases like that
without a gradient (not that it can't be done, because I've found them myself
for linear classifiers using genetic algorithms, simulated annealing, and
something that looked just like Thompson Sampling but wasn't).

------
acjohnson55
Maybe it's a function of the fact that I'm not an AI expert, but I never
thought that specialization for features (whether semantically meaningful or
not) was localized to individual neurons rather than spread across the entire
net. Why would we think otherwise?

~~~
darkmighty
I suppose it was assumed the network works in a "divide and conquer" manner,
since that usually leads to a reduction in algorithmic complexity (and it was
then assumed that each division corresponds to a clear region over the
previous layer).

Of course there's no real need for the network to work that way, and perhaps
this interpretation can still be made if we assume the divisions are
"fuzzy"/arbitrary.

------
TheLoneWolfling
What happens when you add random noise to the inputs to the neural net?

~~~
sabalaba
If you rephrase your question to "What happens when I add random noise to the
inputs of a neural network and try to teach it to output a 'denoised' version
of the input?", then you've just invented denoising autoencoders.

[http://www.iro.umontreal.ca/~lisa/publications2/index.php/pu...](http://www.iro.umontreal.ca/~lisa/publications2/index.php/publications/show/217)
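
For anyone who hasn't seen one, the whole trick fits in a few lines: corrupt
the input with noise and train the network to reconstruct the clean version.
A minimal sketch (PyTorch; the layer sizes for 28x28 images are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenoisingAutoencoder(nn.Module):
    def __init__(self, n_in=784, n_hidden=256):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_in, n_hidden), nn.ReLU())
        self.decoder = nn.Sequential(nn.Linear(n_hidden, n_in), nn.Sigmoid())

    def forward(self, x):
        return self.decoder(self.encoder(x))

def denoise_step(model, optimizer, batch, noise_std=0.3):
    noisy = batch + noise_std * torch.randn_like(batch)  # corrupt the input
    loss = F.mse_loss(model(noisy), batch)               # reconstruct the clean input
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```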

~~~
TheLoneWolfling
Alright, that's neat. Not at all what I was suggesting, but neat nonetheless.

------
_greim_
I'm not an expert in any sense, just a curious bystander. Assuming that the
ratio of perturbations causing misclassifications to ones that don't is
extremely low, couldn't you perturb the image in the time dimension, such that
"dog" misclassifications would be odd blips in an otherwise continuous "cat"
signal, with some sort of smoothing applied that would average those blips
away? And in fact wouldn't that be the default case in some real world
implementations, such as biological NNs or driverless car ones? The input to
the NN would be a live video feed captured via light focused on some kind of
CCD, which is going to be inherently noisy.
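
Something like that is easy to sketch: average the per-frame class
probabilities over a short sliding window, so a one-frame "dog" blip gets
outvoted by the surrounding "cat" frames (numpy; the window size and names
are illustrative):

```python
from collections import deque

import numpy as np

class SmoothedClassifier:
    """Wraps a per-frame classifier and averages its class probabilities over
    the last `window` frames of a video feed."""
    def __init__(self, predict_proba, window=15):
        self.predict_proba = predict_proba   # frame -> vector of class probabilities
        self.history = deque(maxlen=window)

    def classify(self, frame):
        self.history.append(self.predict_proba(frame))
        return int(np.argmax(np.mean(self.history, axis=0)))
```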

------
ivanca
Of course having data that contradicts the patterns the neural network is
looking for is going to make it err, even when the contradiction is subtle.
Humans have it easier because we handle more abstract concepts. One way to
address this is "simplifying" the data: in practical terms (for images) that
means applying a bilateral filter[0], also known in Photoshop as "surface
blur".

[0]
[http://en.m.wikipedia.org/wiki/Bilateral_filter](http://en.m.wikipedia.org/wiki/Bilateral_filter)
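
With OpenCV that preprocessing step is a single call (the parameter values
here are just common illustrative defaults):

```python
import cv2

def surface_blur(image_bgr, d=9, sigma_color=75, sigma_space=75):
    """Bilateral filter: smooths texture and noise while keeping edges sharp."""
    return cv2.bilateralFilter(image_bgr, d, sigma_color, sigma_space)
```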

~~~
streptomycin
That's not the interesting part, the interesting part is:

 _What is even more shocking is that the adversarial examples seem to have
some sort of universality. That is a large fraction were misclassified by
different network architectures trained on the same data and by networks
trained on a different data set._

~~~
ivanca
Yeah, It is the interesting part. Because with sufficiently large inputs (even
from different sources) the networks are going to be looking for the same
patterns.

~~~
streptomycin
But why would networks with different structures, fit to different data, end
up with exactly the same type of "overfitting"?

------
nathanathan
The way people classify hybrid images is a function of how far away they are.
I wonder if these are essentially hybrid images for neural nets. It seems like
the noise being added is very high frequency. Given that, I would bet that
neural nets classify typical hybrid images the same way as they would the
sharper image component.
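
For reference, a hybrid image is just the low frequencies of one picture added
to the high frequencies of another, so this would be straightforward to test.
A sketch (scipy; assumes grayscale float images in [0, 1], and sigma is
illustrative):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def hybrid_image(img_far, img_near, sigma=6.0):
    """Low frequencies from img_far plus high frequencies from img_near:
    up close you see img_near, from a distance you see img_far."""
    low = gaussian_filter(img_far, sigma)               # coarse structure only
    high = img_near - gaussian_filter(img_near, sigma)  # fine detail only
    return np.clip(low + high, 0.0, 1.0)
```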

------
andrewchambers
Humans have a constantly changing view of things, so any adversarial examples
from one angle quickly change when viewed from another angle.

This issue can be framed another way: something incorrectly classified could
become easily classified with very minor changes in perspective.

~~~
Houshalter
It's not clear how resistant the changes are to random noise, or how easy it
would be to modify their procedure to create images which work even with
random noise.

I do suspect noise would help, but some of the changes are things like
blurring edges and lines that NNs are sensitive to. Adding noise would just
make that worse.

~~~
andrewchambers
I'm not talking about noise or blurring, though that's another point.
Something like a self driving car doesn't operate on a single image, it
operates on a time series of images from multiple angles and points in time. I
think an adversarial example which lasts for a prolonged period of time from
multiple angles would be rare if not impossible.

It needs to be investigated further.

------
coldcode
I've always thought that our brains are more complex than we can model at the
moment. There is some fundamental concept we are missing, one that allows the
brain to classify things we can't handle with even our best neural nets today.

------
bsaul
A side question I've always wondered about: if you train an NN to recognize a
particular person's face from photos, will it be able to recognize a drawing
or cartoon of that person?

~~~
tinco
No, unless you've explicitly trained it to match photos to cartoon images. The
article is a bit sensationalistic when it asks "what if an NN misclassifies a
pedestrian crossing as an empty road?". The truth is you can't compare a
simple photo recognizer with human perception. Human perception obviously can
do far more advanced things than just matching photos to memories of photos,
it's not even very good at matching photos. We have depth perception, we have
object isolation, we can remember and abstract shapes, we can extrapolate and
interpolate, we have loads and loads of context. We don't see the world in
RGBA. We see the world as a continuous stream of related information.

All these systems together make sure a human would never[1] mistake a
pedestrian crossing for an empty road, and they allow us to match abstract
paintings to realistic images. Any serious artificial autonomous agent would
similarly consist of many independent but contextualized systems.

edit: [1] Never as in, never unless the pedestrian makes a good effort to look
like an empty road to all of those systems.

------
guard-of-terra
Should be easy to counter: just show the NN several dithered copies of the
image. Of course, if there is a deliberate attack this may not be sufficient.
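
Something along these lines, say: classify several independently jittered
copies and average the votes (a sketch; the noise level and names are
illustrative):

```python
import numpy as np

def classify_with_dither(predict_proba, image, n_copies=10, noise_std=0.02, seed=None):
    """Average class probabilities over several randomly perturbed copies of the input."""
    rng = np.random.default_rng(seed)
    probs = [predict_proba(np.clip(image + rng.normal(0.0, noise_std, image.shape), 0.0, 1.0))
             for _ in range(n_copies)]
    return int(np.argmax(np.mean(probs, axis=0)))
```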

------
eximius
I am immediately forced to think of Gödel's incompleteness theorems. Can it be
proved that these examples always exist within some bounds of manipulation?

------
dang
Url changed from [http://thinkingmachineblog.net/the-flaw-lurking-in-every-deep-neural-net-by-mike-james/](http://thinkingmachineblog.net/the-flaw-lurking-in-every-deep-neural-net-by-mike-james/), which points to (actually, does a lot more than just point to) this.

------
michaelochurch
This is a really interesting find. At the same time, I have this lurking fear
that it will be misappropriated by the anti-intellectual idiots to marginalize
the AI community, cut funding, et cetera. Another AI winter, just like the one
after the (not at all shocking) "discovery" that single perceptrons can't
model XOR.

If anything, this shows that we need more funding of machine learning and to
create more jobs in it, so we can get a deeper understanding of what's really
going on.

