
Deep Neural Networks Are Easily Fooled - sinwave
http://arxiv.org/abs/1412.1897
======
zackchase
These arguments were introduced by Szegedy et al. earlier this year in this
paper:
[http://cs.nyu.edu/~zaremba/docs/understanding.pdf](http://cs.nyu.edu/~zaremba/docs/understanding.pdf).
Geoff Hinton addressed this matter in his Reddit AMA last month.

The results are not specific to neural networks (similar techniques could be
used to fool logistic regression). The problem is that a trained network
ultimately relies heavily on certain activation pathways, which can be
precisely targeted (given full knowledge of the network) to force
misclassification of data points that, to a human, seem imperceptibly changed
from correctly classified ones. It is important to understand adversarial
cases, but unreasonable to get carried away with sweeping pronouncements about
what this does or doesn't say about all neural networks, let alone
intelligence generally or the entire enterprise of AI research, as seems to
happen after a splashy headline.
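
To make the logistic regression point concrete, here is a minimal sketch (toy
data and step size made up for illustration) of crafting an adversarial
perturbation against a trained linear model by stepping along the sign of the
loss gradient with respect to the input:

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy, linearly separable binary data (made up for illustration).
    X = rng.normal(size=(200, 50))
    w_true = rng.normal(size=50)
    y = (X @ w_true > 0).astype(float)

    # Fit logistic regression by plain gradient descent.
    w = np.zeros(50)
    for _ in range(500):
        p = 1 / (1 + np.exp(-(X @ w)))
        w -= 0.1 * X.T @ (p - y) / len(y)

    # Nudge a correctly classified point against the model. Each feature
    # moves by only eps, but the logit shifts by eps * ||w||_1, which is
    # large in high dimensions.
    x, target = X[0], y[0]
    eps = 0.2
    grad_wrt_x = (1 / (1 + np.exp(-(x @ w))) - target) * w  # d(loss)/dx
    x_adv = x + eps * np.sign(grad_wrt_x)

    print("clean logit:      ", x @ w)
    print("adversarial logit:", x_adv @ w)

The same mechanics, run against a deep network's gradients instead of a linear
model's, is what produces the imperceptible perturbations described above.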

~~~
Dn_Ab
I'd characterize this differently, and as a lot more interesting than that.
understanding.pdf can be viewed as a sort of dual to this paper, but they're
not covering the same thing. In Szegedy et al., they constructed invisibly
perturbed images that caused the misclassification of previously correctly
classified images. Here, a search produced images that get classified
confidently despite bearing little to no visual similarity to typical members
of the assigned class.

In a way this is interesting because it's a sort of visualization of what the
network views as important in discriminating between different objects. It's
also interesting as a display of how alien the learned model's view of the
world is.

Take optical illusions: they are remotely similar to this sort of exploit,
although the sort of scene modeling we do is a lot more complex than
recognition or decomposition. Anyway, illusions exploit cues that result in
distorted recognition, but not drastically so, unlike the case for these
networks. My guess is that this is because animal vision uses a lot more
high-level cues -- cues that are also useful in a natural setting -- depending
on things like size, color, shade, lines, context and so on. Visual systems
are also a lot more proactive, filtering out things that don't make sense,
fudging color at the edges of vision, smoothing out shades, and generally
making inferences and deductions about what they should be seeing and how
things are "supposed" to be. In fact, a good number of illusions exploit those
aspects of vision.

In the case of these networks, the cues are incomprehensible, having no
natural counterpart, so we see most of them as noise. But sometimes they make
a kind of sense, as in the starfish, baseball and sunglasses examples. Based
on the observations in the paper, I would guess that only a handful of
activations, strongly associated with each feature, are responsible for each
susceptibility.

With animal brains the distortions usually end up in a slightly transformed
space, a different scaling or something. It's useful to match a bit
overzealously and get something like pareidolia but it also makes sense to
have the conflations actually be like something you might run into. The ANNs
have no such incentive.

Their paper also wonders about whether this is unique to discriminative
classifiers. Would a generative classifier, with access to a proper
distribution, be so susceptible? That'd be very interesting to see.

They also mention some real-world consequences, some of which I disagree with.
Neural networks are good at interpolating between examples, so if your
training set has good coverage of what is to be expected, then they'll work
very well. In the era of big data this isn't really a problem (that they don't
generalize as we do might explain some of why they have trouble with abstract
images), so I'm skeptical an image search solution would be thrown off by
textures.

There is, however, a better example involving facial or speaker recognition.
You could train a network to distinguish between faces or voices and then
evolve a pattern against it. That pattern could then be crafted to match some
individual in a target database. Not good. Driverless cars are also mentioned,
but those typically rely on more than just vision. Personally, I'd add medical
scans to the list of things to be careful with.
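
A minimal sketch of the "evolve a pattern against it" idea: a toy (1+1) hill
climber against a stand-in scoring function. This is not the paper's
CPPN/MAP-Elites setup, and `model_confidence` here is a hypothetical black box
standing in for a trained recognizer:

    import numpy as np

    rng = np.random.default_rng(1)

    probe = rng.normal(size=(32, 32))  # pretend these are learned weights

    def model_confidence(image):
        # Hypothetical black-box confidence that `image` matches the target
        # identity; here just a fixed linear probe squashed to (0, 1).
        return 1 / (1 + np.exp(-(probe.ravel() @ image.ravel())))

    # (1+1) hill climbing: mutate the image, keep the mutant if it scores
    # higher against the model.
    img = rng.uniform(0, 1, size=(32, 32))
    for _ in range(2000):
        mutant = np.clip(img + rng.normal(scale=0.05, size=img.shape), 0, 1)
        if model_confidence(mutant) > model_confidence(img):
            img = mutant

    print("final confidence:", model_confidence(img))

The attacker never needs gradients or internals, only the confidence score,
which is what makes the facial/speaker scenario above plausible.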

Finally, it's worth mentioning that some of the evolved images resemble
inspired works of art. And a few of the images optimized (not evolved) with an
L2 penalty are recognizable even without the label, while for a few more you
can see why the network gave the label it did.

Your offhanded dismissal was unwarranted IMO.

~~~
romaniv
_The whole point_ of this paper is that these 'illusions' look like noise to
humans and make no sense in relation to real-life objects, and you're hand-
waving it away as if it was some insignificant, tangential technicality.

-

I'm willing to bet that in the late '70s many people also thought that
rule-based systems could lead to real AI if they were "more complex" and had
more computing power available.

~~~
sgt101
Maybe they were right?

[http://www.cs.berkeley.edu/~russell/papers/ptai13-intelligen...](http://www.cs.berkeley.edu/~russell/papers/ptai13-intelligence.pdf)

------
akiselev
We humans are as brilliant at pattern matching as we are at finding patterns
that aren't really there, not just with our vision but with our understanding
of probability, randomness, and even cause and effect. Thankfully, our brains
are very complicated machines that can recognize a stucco wall or a cloud and
invalidate the false identification of a face or unicorn or whatever based on
that context.

With that in mind, is it really surprising that [m]any of our attempts at
emulating intelligence can be easily fooled? An untold number of species have
evolved to do exactly the same thing: exploit the pattern-matching errors of
predators to disguise themselves as leaves or tree branches or venomous
animals that the predator avoids like the plague. DNNs are relatively new and
we've got a long way to go, so is this a fundamental problem with the
theoretical underpinnings, or do we just need to train them with far more
contextualized data (for lack of a better phrase)?

Is there any chance of us having accurate DNNs if we can, as if gods during
the course of natural selection, peek into the brain of predators (algorithms)
and reverse engineer failures (disguises for prey) like this?

~~~
Dewie
What I don't get about AI optimism: we humans are very fallible and easily
make mistakes. Computers are better than us at problems that are clearly
defined and decidable (where an algorithm can be implemented to solve any
instance of the problem). But we need AI when the problems are fuzzier, like
recognizing a lion in a picture.

How can we build a mostly automated future if the AIs that are supposed to do
our jobs turn out to be very fallible as well? They won't - supposedly - have
the problem of being self-aware and following their emotions rather than their
own best judgement and reasoning. But it seems that some problems inherently
invite mistakes. Can that be avoided at all? And if not, who do we blame when
an AI makes a "mistake" like that? The training set?

~~~
derefr
People's fallibility goes down as you throw more of them at a task—not because
the majority will be right, but because the signal adds up while the noise
cancels out. This is what the efficient market hypothesis, "wisdom of crowds",
etc. are basically about.

If you train 1000 AIs on different subsets of your training corpus, their
ensemble will be much "hardier" than one AI trained on the entire corpus. The
automated future comes from the fact that you don't need 1000 full training
corpora to get this effect, nor do 1000 AIs cost much more to run than one,
once you've built out enough hardware for one.
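
A minimal sketch of the signal-adds-up, noise-cancels-out claim, with toy
numbers: many estimators whose errors are independent, averaged into one far
better estimate.

    import numpy as np

    rng = np.random.default_rng(2)

    truth = 1.0              # the quantity every model tries to estimate
    n_models = 1000
    # Each "model" sees the truth plus its own independent error.
    estimates = truth + rng.normal(scale=2.0, size=n_models)

    print("one model off by:", abs(estimates[0] - truth))
    print("ensemble off by: ", abs(estimates.mean() - truth))  # ~2/sqrt(1000)

The averaging only helps while the errors stay independent; a bias (or an
adversarial input) shared by all the models is exactly what breaks that
assumption.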

In other words, AI makes the application of "brute-force intelligence" to a
large problem cheap enough to be feasible, in the same way slave labor made
building pyramids by brute force cheap enough to be feasible.

~~~
nightski
There are many examples where crowd behavior exhibits less "wisdom". Take any
market bubble, for example. Have we ever looked at a tough scientific problem
and come to the conclusion that the best path forward was to collect as many
random people off the street as possible and shove them in a room to solve it?

Also, bootstrapping and model parameter selection techniques are already
heavily used in AI and have not yet brought us this future. I believe the
model you presented has been simplified a bit too much, ignoring a lot of
important variables.

~~~
compbio
> examples where crowd behavior exhibits less "wisdom"

When crowds act irrationally there is usually a problem in communication. The
crowd would still be able to solve problems better after it repairs these
communication channels. For instance, some rocket launches failed because
information from engineers lower in the chain of command did not work its way
up to the decision makers. The group was too compartmentalized, but launching
a rocket is necessarily a group effort. 9/11 could have been prevented, or the
aftermath lessened, if communication between intelligence agencies had been
better. In a market crash we often see a single actor making a decision or
prediction, and there is little to no reward for people down the chain to
disagree with that prediction, or even adjust it (leading to insufficient
variance in the predictions). Everyone blindly chases the experts, while in a
good group setting there is no need to.

> Have we ever looked at a tough scientific problem and came to the conclusion
> that the best path forward was to collect as many random people off the
> street and shove them in a room to solve it?

Has a scientist ever solved a tough problem growing up in isolation from other
scientists? I consider "standing on the shoulders of giants" to be a form of
group intelligence. But yes: we have done something similar at the RAND
Corporation. The problem was: forecast the impact of future technology on the
military. The solution was to collect experts (not random people), put them in
a room with an experiment leader, and gradually converge on the best forecast,
using anonymous feedback every round. It's called the Delphi Technique and it
is still in use.

Also, there is an experiment running right now that takes random civilians,
has them answer intelligence questions ("Will North Korea launch a nuke within
30 days?"), and weights their answers according to previous results. This way,
random civilians who _individually_ beat _a team_ of trained intelligence
analysts, simply using their gut or Google, trickle up to the top. It's called
the Good Judgment Project. Put ten of those civilians in a room and you have
an intelligence unit that is not afraid to be wrong, does not have a
reputation to uphold, and does not care about any group pressure, authorities
or restrictive protocols that may hamper a group of real intelligence
analysts.

> Also bootstrapping or model parameter selection techniques are already
> heavily used in AI

I believe the parent was talking about model ensembling / model averaging, not
the ensembling techniques used within a single model, like the bagging that
random forests use, or boosting. If you have a single attack input crafted
against a single model, then a voting ensemble of three models (let's say:
random forests with a Gini split, regularized greedy forests, and extremely
randomized trees) will not be fooled by it.
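
A minimal sketch of that kind of heterogeneous voting ensemble using
scikit-learn, with gradient boosting substituted for regularized greedy
forests (which scikit-learn doesn't ship). Whether it actually resists a given
attack is an empirical question the sketch doesn't settle:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import (ExtraTreesClassifier,
                                  GradientBoostingClassifier,
                                  RandomForestClassifier, VotingClassifier)
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Three model families vote; an input crafted against any single one of
    # them must now fool a majority of distinct learners at once.
    ensemble = VotingClassifier(
        estimators=[
            ("rf", RandomForestClassifier(criterion="gini", random_state=0)),
            ("et", ExtraTreesClassifier(random_state=0)),
            ("gb", GradientBoostingClassifier(random_state=0)),  # RGF stand-in
        ],
        voting="hard",  # majority vote over predicted labels
    )
    ensemble.fit(X_tr, y_tr)
    print("test accuracy:", ensemble.score(X_te, y_te))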

------
monochr
I have nothing intelligent to say without reading the full paper...

...But how different is this from the various optical illusions humans fall
for? I mean, we can't exactly tell the difference between a rabbit and a duck
ourselves[1], so isn't it just a universal property of all neural-network-like
systems that there will be huge areas of misclassification for which there
hasn't been specific selection?

[1] [http://mathworld.wolfram.com/Rabbit-DuckIllusion.html](http://mathworld.wolfram.com/Rabbit-DuckIllusion.html)

~~~
praptak
I believe the point of the article is that the triggers for optical illusions
are totally different in humans and in ANNs. I don't know how valid this
statement is - humans sometimes do recognize "shapes" in white noise too.

------
cLeEOGPw
The vulnerability exploits imperfections in the NN weights. To avoid this kind
of mismatch, all you need to do is shift the same image by one pixel (assuming
recognition is done per pixel) and cross-check the two results to detect
whether an error occurred.
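
A minimal sketch of that cross-check; `classify` is a hypothetical stand-in
for the trained network, and the point is only to flag disagreement, not to
fix it:

    import numpy as np

    def classify(image):
        # Hypothetical stand-in for the network's predicted label.
        return int(image.sum()) % 10

    def shifted(image, dx=1):
        # Shift one pixel to the right, zero-padding the exposed column.
        out = np.zeros_like(image)
        out[:, dx:] = image[:, :-dx]
        return out

    def cross_checked_classify(image):
        a, b = classify(image), classify(shifted(image))
        if a != b:
            return None  # disagreement: flag a possible misclassification
        return a

    img = np.arange(64, dtype=float).reshape(8, 8)
    print(cross_checked_classify(img))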

The human brain recognizes images better because it can sample them many times
from many slightly different angles. There's a reason saccades
([http://en.wikipedia.org/wiki/Saccade](http://en.wikipedia.org/wiki/Saccade))
exist.

~~~
phreeza
This is highly oversimplified. The earlier example of adversarial samples
still works even if the data is perturbed as you described. The reasons for
saccades are also a lot more complex, involving photoreceptor adaptation and
so on.

------
Animats
There was a similar result a few months ago for another type of machine
learning. (That's note 26 in this paper.) The problem seemed to be that the
training process produces results which are too near boundaries in some
dimension, and are thus very sensitive to small changes. Such models are
subject to a sort of "fuzzing attack", where the input is changed slightly and
the output changes drastically.
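
A minimal sketch of such a "fuzzing" probe; `predict` is a hypothetical
black-box score, rigged here to sit near a sharp boundary, so tiny input
changes swing the output drastically:

    import numpy as np

    rng = np.random.default_rng(3)

    def predict(x):
        # Hypothetical model score with a sharp transition near x.sum() == 10.
        return float(np.tanh(50 * (x.sum() - 10)))

    def fuzz_sensitivity(x, trials=100, eps=1e-3):
        # Largest output swing observed under small random input perturbations.
        base = predict(x)
        return max(abs(predict(x + rng.uniform(-eps, eps, size=x.shape)) - base)
                   for _ in range(trials))

    x_near_boundary = np.full(10, 1.0)  # sits right on the transition
    print("max swing from +/-0.001 inputs:", fuzz_sensitivity(x_near_boundary))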

There are two parts of this process that are kind of flaky. The problem above
is one of them. The other is feature extraction, where the feature set is
learned from the training set. The features thus selected are chosen somewhat
randomly and are very dependent on the training set. It's amazing to me that
this works at all. Earlier thinking was to have some canonical set of features
(vertical lines, horizontal lines, various kinds of curves, etc.), the idea
being to mimic early vision, the processing that happens in the retina.
Automatic feature choice apparently outperforms that, but may not really be
working as well as previously believed.

It's great seeing all this progress being made.

~~~
simonster
Automatic feature choice can actually lead to a set of features that resembles
V1 receptive fields (as demonstrated by Olshausen and Field in
[https://courses.cs.washington.edu/courses/cse528/11sp/Olshau...](https://courses.cs.washington.edu/courses/cse528/11sp/Olshausen-nature-paper.pdf)).

I recently attended a talk by Geoff Hinton on "capsules." He pointed out that
the max pooling used in convolutional neural networks effectively disregards
information about the relationships among features. Instead, he proposed a
network composed of "capsules" that each estimate whether an implicitly
defined intermediate feature is present, along with its pose. The idea is that
an object is present only if its intermediate features are present and their
poses agree. He showed some neat results from these models (some published in
[http://arxiv.org/pdf/1412.1897v1.pdf](http://arxiv.org/pdf/1412.1897v1.pdf),
and some from
[http://www.cs.utoronto.ca/~tijmen/tijmen_thesis.pdf](http://www.cs.utoronto.ca/~tijmen/tijmen_thesis.pdf)).
Notably, these models can evidently learn to classify MNIST with >98% accuracy
given only 25 labeled examples. (I am not sure how many unlabeled examples
were used.) I don't have any experience with these models, but given that most
of these images look like a single feature embedded in noise or a texture, I
would not be surprised if a capsule-based network were not so susceptible to
these images.
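
A minimal sketch of the pose-agreement test described above, with made-up
capsule outputs (real capsule networks learn these quantities; this only
illustrates the "parts present and poses agree" rule):

    import numpy as np

    # Each capsule reports (presence probability, 2-D pose) for one part.
    # The numbers below are made up for illustration.
    presence = np.array([0.9, 0.8, 0.95])
    poses = np.array([[1.0, 0.1],
                      [0.9, 0.0],
                      [1.1, 0.2]])

    def object_present(presence, poses, p_min=0.5, spread_max=0.3):
        # Object detected only if every part is present AND the poses
        # cluster tightly around their mean.
        if (presence < p_min).any():
            return False
        spread = np.linalg.norm(poses - poses.mean(axis=0), axis=1).max()
        return spread < spread_max

    print(object_present(presence, poses))                # parts agree: True
    print(object_present(presence, poses * [1.0, -5.0]))  # poses disagree

A random texture can activate isolated feature detectors, but it can't produce
a consistent set of poses, which is why such a network might shrug off these
images.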

------
benanne
The discussion about this paper on r/MachineLearning is quite insightful and
worth reading:
[http://www.reddit.com/r/MachineLearning/comments/2onzmd/deep...](http://www.reddit.com/r/MachineLearning/comments/2onzmd/deep_neural_networks_are_easily_fooled_high/)

------
MrQuincle
I'm starting to like this way of looking for false positives and false
negatives more and more.

It would be interesting to introduce some kind of aspects known from the human
brain and see if the misclassified items "move" in some conceptually
understandable direction.

* Introduce time. Humans are not just image classifiers; humans are able to recognize objects in visual streams of images. Such streams can be seen as latent variables that introduce correlations over time as well as space. What constitutes spatial noise might very well be influenced in our brains by the temporal correlations we see as well.

* Introduce saccades. A computer is only able to see a picture from one viewpoint. Our eyes undergo saccades and microsaccades. That's an unfair advantage for us, being able to see a picture multiple times from slightly different directions! (A minimal sketch of this follows after the list.)

* Introduce the body. We can move around an object. This again introduces correlations that 1.) are available to us, and 2.) might define priors even when we are not able to move around the picture. In other words, we can (unconsciously) rotate things in our head.
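
As promised in the saccades item, a minimal sketch of classifying several
jittered views and averaging the predictions (test-time augmentation;
`predict_probs` is a hypothetical classifier, deliberately position-sensitive
so the jitter matters):

    import numpy as np

    rng = np.random.default_rng(4)

    def predict_probs(image):
        # Hypothetical softmax output of a trained classifier.
        logits = np.array([image[0, 0], image[5, 5], image[10, 10]])
        e = np.exp(logits - logits.max())
        return e / e.sum()

    def jitter(image, max_shift=2):
        # A crude "saccade": roll the image by a small random offset.
        dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
        return np.roll(np.roll(image, dx, axis=0), dy, axis=1)

    def saccadic_classify(image, n_views=16):
        probs = np.mean([predict_probs(jitter(image)) for _ in range(n_views)],
                        axis=0)
        return int(probs.argmax())

    img = rng.uniform(size=(28, 28))
    print(saccadic_classify(img))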

~~~
sxyuan
You might be interested in this paper, if you haven't seen it already:
[http://arxiv.org/abs/1406.6247](http://arxiv.org/abs/1406.6247)

------
larrydag
Another journal paper covering the same thing:

[http://arxiv.org/abs/1312.6199](http://arxiv.org/abs/1312.6199)

And the article where I found it, which references it:

[http://www.i-programmer.info/news/105-artificial-intelligenc...](http://www.i-programmer.info/news/105-artificial-intelligence/7352-the-flaw-lurking-in-every-deep-neural-net.html)

~~~
krick
[https://news.ycombinator.com/item?id=8544911](https://news.ycombinator.com/item?id=8544911)

------
comex
One might say that Picasso's Bull is a human equivalent of this: he "evolved"
a sequence of images and ended up with something that has very few features of
a bull, but nevertheless gets recognized by humans as such.

Then again, unlike the neural networks in the paper, humans would be capable
of classifying abstract images into a separate category if asked.

------
ifdefdebug
I know literally nothing about this field, so the paper left me wondering
about the following question:

Given a visual face-recognition door lock or a similar system: if I wanted to
break such a lock, could I install the same system at home, train it with
secretly taken pictures of an authorized person, and evolve some kind of key
picture with my home system until I could show it to the target door lock and
fool it into giving me access?

OK, this is a very simplified way to put the question, but is that something
this paper implies would be possible (in a more sophisticated way)?

~~~
therobot24
If I'm understanding you correctly, you want to train a second system on the
first system's authorized user, then use the 'key' to open the first system.

As someone who actively researches biometrics, I can't say that this is a good
method, for a few reasons.

1) Systems often train templates which look very different from the original
input, especially if more than one image is involved in training. These
templates aren't necessarily going to be recognizable to the first system
(even if they can be represented as a 2D image).

2) Many enterprise systems (such as those from Honeywell or whoever) include
liveness tests and anti-spoofing measures. Though anecdotally they are not
very good, they check basic signals such as whether the pupil expands and
contracts in response to a burst of light.

3) Most biometric systems that control access to some place (verification)
usually include a third party monitoring said access.

If you were to do this for, say, someone's home: depending on the system, you
may gain access with a high-definition photo, as many consumer systems are set
to a higher false accept rate (FAR) to prevent user aggravation. However, if
it is set to be very strict (giving a larger false reject rate), then the best
approach would probably be to attack the sensor directly. That is, the system
often doesn't care about the surroundings; it's trained for one task (open if
authorized user).
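
A minimal sketch of the FAR/FRR trade-off mentioned above, with made-up
match-score distributions; lowering the accept threshold reduces rejections of
genuine users but lets more impostors through:

    import numpy as np

    rng = np.random.default_rng(5)

    # Made-up similarity scores: genuine attempts score higher than impostors.
    genuine = rng.normal(loc=0.75, scale=0.10, size=10_000)
    impostor = rng.normal(loc=0.45, scale=0.10, size=10_000)

    for threshold in (0.50, 0.60, 0.70):
        far = (impostor >= threshold).mean()  # impostors wrongly accepted
        frr = (genuine < threshold).mean()    # genuine users wrongly rejected
        print(f"threshold {threshold:.2f}: FAR {far:.3f}  FRR {frr:.3f}")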

------
jacobsimon
I don't actually have too much of a problem with this, because a lot of the
"nonsense" images bear a strong resemblance to the objects. The gorilla images
clearly look like a gorilla; the Windsor tie images clearly show a collar and
a tie. The coloring is way off, of course, but the gradients seem about right.

------
crimsonalucard
If we could find out the selection criteria behind each layer of the neural
network in the human visual cortex, we could possibly build something more
accurate.

Although I doubt the visual cortex is a simple feed-forward network like the
one used in the paper. It likely has a non-linear structure that's
significantly more complex.

------
bitL
So deep neural networks are like artists, able to see structure in chaos?
Just as Michelangelo looked at a large stone and immediately saw David in it,
so do DNNs recognize lions in white noise? We should applaud the introduction
of fantasy and imagination into science ;-)

------
fallenpegasus
What this tells me is that there probably exist deeply weird images that would
be recognized as something by one person, or by very few people, but would be
just an unrecognizable mash of colors and lines to everyone else.

------
yummyfajitas
I wish they had explained why evolutionary algorithms were used. They seem to
suggest gradient ascent also works - I wonder what the key criteria are for
constructing good adversarial images?
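
For reference, a minimal sketch of the gradient-ascent alternative (activation
maximization from noise against a toy differentiable scorer; `class_score` and
its gradient are made-up stand-ins for a real network's class logit):

    import numpy as np

    rng = np.random.default_rng(6)

    W = rng.normal(size=(16, 16))  # stand-in "class template" in the model

    def class_score(image):
        # Hypothetical differentiable class logit: a linear probe.
        return float((W * image).sum())

    def score_gradient(image):
        # Gradient of the linear score with respect to the image.
        return W

    # Gradient ascent on the input: start from noise, climb the class score.
    img = rng.uniform(0, 1, size=(16, 16))
    for _ in range(200):
        img = np.clip(img + 0.05 * score_gradient(img), 0, 1)

    print("final score:", class_score(img))

With an L2 penalty added to the objective (as in the paper's optimized
images), the ascent is pulled back toward smaller pixel values instead of
simply saturating.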

------
hippich
This raises an interesting question: is it possible to hack the human brain?
Would a specific set of stimuli make the brain react in a certain way?

~~~
compbio
A field in cognitive science that asks these questions is "gestalt
psychology".
[http://en.wikipedia.org/wiki/Gestalt_psychology](http://en.wikipedia.org/wiki/Gestalt_psychology)

------
SeanDav
If the shoe were on the other foot, I can imagine a race of supercomputer AIs
administering a similar test to humans and saying: look at the puny human
vision system. It is easily fooled by simple optical illusions that wouldn't
fool even a two-year-old AI. Clearly, there are questions about the generality
of the human vision system, and perhaps it is not fit for purpose...

~~~
romaniv
Yes, of course, what constitutes sensible image recognition is just a matter
of opinion, and the opinions of some algorithm are as valid as yours. In fact,
my pseudo-random number generator seeded with the image's binary accurately
recognizes 100% of the images I give it (based on my newly developed
definition of image recognition).

~~~
SeanDav
I have no idea what point you are trying to make. You seem to be criticizing
something I said, but I'm not sure what. You do realize that I was making a
somewhat ironic, somewhat humorous comment: you have to be careful that what
you test for is relevant, and that you aren't focusing on a narrow weakness
which may not actually matter much for the general case.

~~~
romaniv
You're making a comment in response to a specific research paper. Therefore, I
interpret the comment within the context of that paper. So you're implying
that the paper is "focusing on a narrow weakness which may not actually be
that relevant for the general case". I disagree.

 _Any_ 2D image is an optical illusion, so it makes no sense to criticize
human image recognition for being 'fooled' by illusions. The real criterion
for whether image recognition works well is something else altogether.

------
robg
So are human brains.

