
One pixel attack for deceiving deep neural networks - astdb
https://arxiv.org/abs/1710.08864
======
taneq
When discussing adversarial attacks on neural nets, everyone seems to just
kind of take for granted that the human visual system is immune to this kind
of fudgery. I'd be very interested (and a little afraid) to know if this is in
fact the case - we know of a bunch of optical illusions that are perceived
differently from what they "ought to be", so it's not too much of a stretch to
figure that we could come up with a technique to generate more extreme
examples.

~~~
aqsalose
Having read some very introductory stuff about the human visual system, I'd
be willing to guess that a partial reason is that human vision works, at the
input level, quite differently from "observe an NxM grid of colored pixels".
The act of constructing what we'd think of as a single "view" of our
surroundings (which one intuitively might equate with a still pixel image)
includes lots of low-level processing.

(E.g. everyone has probably heard about the blind spot at the point where the
optic nerve 'connects' to the retina. But I was surprised to learn that most
of the stuff in our FOV is seen very poorly, and only the tiny area called the
macula is responsible for high-acuity vision. And the macula covers only a
small spot in our FOV at any one time: what we'd think of as a single view is
really constructed in a post-processing step as our eyes move and scan our
surroundings. My amateur-level explanation might be a bit off, but the point
is, it stands to reason that this kind of "dynamic" processing system is more
robust (especially to small pixel-level "attacks" as in the linked paper) than
simply constructing mappings of still pixel grids to outputs, as CNNs
currently do.)

~~~
TeMPOraL
What we know about the brain points to lots and lots of top-down influence on
visual processing. Your learned preconceptions can go down the stack and
alter/override raw sensory data that gets fed from your eyes up the stack,
forming loops that are arguably absent in ANN solutions.

------
tbabb
If by knowing the structure of a NN a "trick" image can be specially crafted
to fool it, what is the equivalent for a human brain?

If we get to the point of having a human connectome to analyze-- or the kind
of access to neural topology that a neural lace would provide-- could an
optimizer generate an image of static that a human would mistake for the
president of the United States?

It seems outwardly implausible that such an image could exist, but perhaps
that is only because we've never seen one (or if we had, would we know?), and
a "blind" search of images would never find it, as the space of images is
galactically huge. With a "map" of the brain it might suddenly become
plausible.

And if so, that world sounds absolutely terrifying to me.

~~~
YeGoblynQueenne
>> If we get to the point of having a human connectome to analyze-- or the
kind of access to neural topology that a neural lace would provide-- could an
optimizer generate an image of static that a human would mistake for the
president of the United States?

Probably not, because our brain doesn't seem to work the way Artificial
Neural Networks do. Most notably, we learn to identify novel objects after
seeing only a single example of them, while ANNs may require many thousands of
examples. We don't seem to learn to identify individual pixels, either (though
what exactly our brain does when we learn to identify objects from their
images is anyone's guess).

Digital images are also not a very good analogy for what human eyes see: our
vision doesn't have "pixels", and we don't even need images to be particularly
clear to identify them with good accuracy (we can still tell what things are,
up to a point, even in the dark, when it rains, when our visual field is
occluded, etc.).

Generally, you can't expect the human brain to work like an ANN. Like others
have said before [1], the "neural network" analogy is not a very good one. It
often serves only to create confusion about the capabilities of ANNs and the
human brain.

________________

[1] [https://spectrum.ieee.org/automaton/robotics/artificial-inte...](https://spectrum.ieee.org/automaton/robotics/artificial-intelligence/facebook-ai-director-yann-lecun-on-deep-learning)

      Yann LeCun: My least favorite description is, “It works just like the brain.” 
      I don’t like people saying this because, while Deep Learning gets an inspiration 
      from biology, it’s very, very far from what the brain actually does. And describing 
      it like the brain gives a bit of the aura of magic to it, which is dangerous. It 
      leads to hype; people claim things that are not true. AI has gone through a number 
      of AI winters because people claimed things they couldn’t deliver.

~~~
Houshalter
I think the differences between ANNs and BNNs are exaggerated. They probably
work on similar principles, even if there are some differences.

But none of that is particularly relevant. Even if they are completely
different, so what? The same procedure could still work. Take a biological
brain, backprop through it to find exactly which input changes move the
outputs by some small degree, and tweak the input bit by bit until you change
the output. You can apply this to any function; it's general.
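
A rough sketch of what I mean, with a toy differentiable network standing in
for the "brain" (PyTorch; the net, step size, and target class are all
arbitrary placeholders):

    import torch
    
    # Stand-in for the system under attack: any differentiable mapping
    # from inputs to class scores will do.
    model = torch.nn.Sequential(
        torch.nn.Linear(32 * 32, 64),
        torch.nn.ReLU(),
        torch.nn.Linear(64, 10),
    )
    
    x = torch.rand(1, 32 * 32, requires_grad=True)  # the input "image"
    target = torch.tensor([3])          # the class we want it mistaken for
    
    for _ in range(100):
        loss = torch.nn.functional.cross_entropy(model(x), target)
        loss.backward()                 # gradients w.r.t. the *input*
        with torch.no_grad():
            x -= 0.01 * x.grad.sign()   # tweak the input, not the weights
            x.clamp_(0, 1)              # keep it a valid image
        x.grad.zero_()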

Adversarial examples are exceedingly rare in natural data. They require
tweaking exactly the right pixels in exactly the right direction. It's
something stupid like a one-in-a-billion-billion chance of such a malicious
example occurring if you just randomly add noise to images. It requires a
very precise optimization procedure.

So if adversarial examples did exist for human vision, we probably wouldn't
know it yet. They don't occur in nature. So there's no reason for the brain to
have evolved defenses against them. (Though camouflage is an interesting
natural analogy, it's not quite the same.)

~~~
YeGoblynQueenne
>> Take a biological brain and backprop through it to find exactly what inputs
change the outputs by some small degree and tweak it bit by bit until you
change the output.

How exactly do you "backprop through" a (biological) brain?

Also, I don't see why you'd ever want to do that "tweak it bit by bit" thing
to a brain. Human brains seem to catch on to ideas pretty quickly. They don't
need to go back and forth on their synapses a million times until they learn
to react to a stimulus. Whatever the brain does is light years ahead of
backprop, which is, all things considered, a pretty poor algorithm. So why
would you ever want to do that "backprop on a brain" thing, if you could do--
you know-- what brains do normally?

~~~
Houshalter
If you have a perfect simulation of it, you can just run it step by step and
create a computation graph and go backwards through it.

The algorithm used to do the optimization doesn't really matter. Use a GA or
hillclimbing if you want.
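
E.g. something like this hill-climbing loop -- no gradients needed at all,
only the ability to query the model (`predict` is a hypothetical black box
returning class scores, image values in [0, 1]):

    import numpy as np
    
    def hillclimb_attack(predict, image, wrong_class, steps=10000, eps=0.05):
        """Black-box attack: propose random single-value tweaks and keep
        only those that raise the score of the wrong class."""
        x = image.copy()
        best = predict(x)[wrong_class]
        for _ in range(steps):
            candidate = x.copy()
            i = np.random.randint(x.size)              # pick one value to perturb
            candidate.flat[i] += np.random.uniform(-eps, eps)
            candidate = np.clip(candidate, 0.0, 1.0)   # stay a valid image
            score = predict(candidate)[wrong_class]
            if score > best:                           # keep improvements only
                x, best = candidate, score
        return x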

EDIT since you added more to your comment:

>why you'd ever want to do that "tweak it bit by bit" thing to a brain. Human
brains seem to catch on to ideas pretty quickly.

So what? That's the process for creating an adversarial image. Does it matter
how many steps it takes to create it?

>They don't need to go back and forth on their synapses a million times until
they learn to react to a stimulus.

That's exactly how learning works in humans. Try learning to juggle in just
one or two tries. It takes thousands. And that's after you've spent years in
your body learning how to coordinate your muscles, locate objects with your
eyes, understand how physics works, etc.

>Whatever the brain does is light years ahead of backprop, which is, all
things considered, a pretty poor algorithm.

I really, really doubt that. There are a number of theories about how the
brain might implement a variation of backpropagation for learning. Hinton has
one.

Backpropagation is not a poor algorithm; it's probably close to optimal. No
one has been able to come up with something better besides just little
heuristic tweaks. It's very difficult to see how you could do so. It's so
simple and elegant and general.

~~~
YeGoblynQueenne
Ah, so all you need is a perfect simulation of a brain?

~~~
Houshalter
Ideally. I'm of course talking about whether it's hypothetically possible to
do this. Once we understand the brain better, it may be possible to create a
reasonably accurate computer model and do this for real.

------
Animats
The underlying problem seems to be that deep neural network classifiers tend
to place their classification boundary surfaces very close to data points in
at least one dimension in a high-dimensional space. That makes them brittle -
perturb the data very slightly in the wrong direction and they move through a
boundary into some other classification.
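
One toy way to see how small per-pixel changes can add up to a large move in
a high-dimensional space (a bare linear score w.x -- nothing specific to deep
nets or to this paper):

    import numpy as np
    
    rng = np.random.default_rng(0)
    n = 32 * 32 * 3              # dimensionality of a small RGB image
    w = rng.normal(size=n)       # weights of a linear score w.x
    eps = 0.01                   # imperceptible per-pixel change
    
    delta = eps * np.sign(w)     # nudge every pixel slightly "uphill"
    print(np.dot(w, delta))      # score shift = eps * ||w||_1 (~25 here),
                                 # which grows linearly with the dimension n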

I don't know enough about the subject to know why training does that, or what
can be done about it.

~~~
mamp
What's concerning is that the network used dropout, which I thought was aimed
at making networks less brittle (i.e. reducing overfitting).

~~~
RSchaeffer
Go watch Yarin Gal's talk on dropout in neural networks. He shows pretty
convincingly that the belief that dropout reduces network overfitting by
introducing noise is wrong.

~~~
mannigfaltig
Wait, that can't be wrong, because that is literally what DO does. It is a
convex hull regularizer around the network activations using noise. That is
also why dropout does not solve susceptibility to adversarial examples: it
merely extends the regions that the NN generalizes to outward; but that is
limited, because high-dimensional spaces are counter-intuitively large and the
noise required to cover a decent fraction of the "unmapped" space would
completely prevent learning. AFAIK, Yarin Gal merely provides a Bayesian
interpretation of the noise.
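
For concreteness: standard "inverted" dropout is just a multiplicative
Bernoulli mask on the activations (a minimal numpy sketch):

    import numpy as np
    
    def dropout(activations, p=0.5):
        """Multiplicative Bernoulli noise: zero each unit with probability p,
        rescale the survivors so the expected activation is unchanged."""
        mask = np.random.random(activations.shape) >= p
        return activations * mask / (1.0 - p)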

~~~
RSchaeffer
IIRC, his "Bayesian interpretation of the noise" actually shows that dropout
performs approximate integration over model parameters. As he says, dropout
doesn't work _because_ of the noise but _despite_ the noise.

[https://youtu.be/3ONLxYeM1Sc?t=19m21s](https://youtu.be/3ONLxYeM1Sc?t=19m21s)
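
The recipe that falls out of that view is MC dropout: leave dropout enabled
at test time and average several stochastic forward passes. A rough sketch,
assuming a PyTorch `model` that contains dropout layers:

    import torch
    
    def mc_dropout_predict(model, x, samples=50):
        """Approximate integration over model parameters: average the
        softmax outputs of many forward passes with dropout left on."""
        model.train()   # train mode keeps dropout active, unlike eval()
        with torch.no_grad():
            preds = [model(x).softmax(dim=-1) for _ in range(samples)]
        return torch.stack(preds).mean(dim=0)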

~~~
mannigfaltig
That seems like a strange/unnecessary way to put it because DO _is_ noise.

------
metaphor
Figure 8. Without a label, I wouldn't have been able to independently tell
anyone that this was a picture of a dog.

For what it's worth, images used in this paper are 32x32 pixels.

~~~
JepZ
I can't even see a dog when I know there should be one...

~~~
wingerlang
Looks more like one of these dump trucks to me
[http://3.bp.blogspot.com/__EzFEHn2YBI/SbEdsBdbL7I/AAAAAAAAAR...](http://3.bp.blogspot.com/__EzFEHn2YBI/SbEdsBdbL7I/AAAAAAAAARY/wDqn0-t74R0/s1600/cat797.jpg)
(why are these always yellow?)

Or half a torso of a dog... with a dump truck in the background...

~~~
mattkrause
> (why are these always yellow?)

The yellow is the manufacturer's (Caterpillar) trademark color--and it
probably helps that it's very high visibility.

~~~
wingerlang
That would be my guess, but I see lots of other types of vehicles that are
normally yellow in other colors. But this specific kind I don't recall ever
seeing in another color.

Even here [0] most are yellow despite having a different manufacturer. Maybe
the difference is enough to differentiate them though.

[0]
[https://en.wikipedia.org/wiki/Haul_truck#Ultra_class](https://en.wikipedia.org/wiki/Haul_truck#Ultra_class)

------
jszymborski
So, I might be wrong, but my understanding of these attacks is that they
require you to know what the model of the classifier is.

If my understanding is correct, I guess my question is: how general are these
attacks? Can we ever say "oh yeah, don't try to classify this penguin with
your off-the-shelf model, it'll come up iguana 9/10"?

~~~
Donald
See [https://arxiv.org/abs/1610.08401](https://arxiv.org/abs/1610.08401) for
model agnostic examples of adversarial attacks.

~~~
justifier
This is a great paper... the perturbation map is very reminiscent of
psychedelic imagery.

Psychedelics come up a lot with NN-created images, but it's interesting that
under the influence of a perturbation the classifier suddenly starts
presenting illogical assumptions.

Also similar to how people recollect their own classifiers unfolding or
showing bias when considered under the influence of psychedelics

Perhaps there is some deeper analogue with a substance-influenced brain's
neuronal activity.

------
jaimex2
Interesting, can this be prevented by simply randomly blurring images first?

The NN should still be able to recognise something partially blurred if
trained that way.

~~~
draugadrotten
or perhaps adding random pixel noise to the image before attempting
training/recognition to prevent "smoothness" in an area from being a
recognized property

~~~
viraptor
Alternatively, eliminate the noise by low-pass filtering in the frequency
domain. Even a small LED light will bleed into more than one pixel in a real
photo.
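
A quick sketch of both filtering ideas -- a spatial blur and a hard
frequency-domain cutoff (scipy/numpy; the filter parameters are arbitrary,
and the input is assumed to be a 2-D grayscale array):

    import numpy as np
    from scipy import ndimage
    
    def defend_blur(image, sigma=1.0):
        """Spatial low-pass: smear any single-pixel spike into its neighbors."""
        return ndimage.gaussian_filter(image, sigma=sigma)
    
    def defend_lowpass(image, keep=0.25):
        """Frequency-domain low-pass: keep only the lowest `keep` fraction
        of spatial frequencies and zero out the rest."""
        f = np.fft.fftshift(np.fft.fft2(image))
        h, w = image.shape
        mask = np.zeros((h, w))
        rh, rw = int(h * keep / 2), int(w * keep / 2)
        mask[h // 2 - rh:h // 2 + rh, w // 2 - rw:w // 2 + rw] = 1.0
        return np.real(np.fft.ifft2(np.fft.ifftshift(f * mask)))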

------
DrNuke
From a workflow point of view, this is a dataset security/integrity problem,
isn't it? Public and open-source datasets should then come with some sort of
sanity check. A best-practice protocol for pre-processing private datasets /
unvetted sources should also be made public and disseminated.

------
alexott
Related article: "BadNets: Identifying Vulnerabilities in the Machine Learning
Model Supply Chain"
([https://arxiv.org/abs/1708.06733](https://arxiv.org/abs/1708.06733))

------
Too
Related: Slight Street Sign Modifications Can Fool Machine Learning Algorithms
[https://news.ycombinator.com/item?id=14935120](https://news.ycombinator.com/item?id=14935120)

------
m3kw9
One way to beat back these types of attacks is to have the software do a type
of image filtering that averages out adjacent pixels in an indeterminate
(random) way before feeding the image into the NN for inference.

~~~
yodon
Not really. The single-pixel attacks are impressive because they modify just
individual pixels, but the concept of adversarial image generation is a broad
and increasingly well-studied area within neural networks. There are
infinitely many ways to structure adversarial images, not just single-pixel
defects.

------
knolan
Wouldn’t it be sensible to median filter the input images to remove these
types of single pixel outliers?
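
Something along those lines (scipy; the 3x3 kernel is a guess) -- a median
filter is the classic remedy for exactly this kind of salt-and-pepper
outlier:

    from scipy import ndimage
    
    def defend_median(image, size=3):
        """Replace each pixel with the median of its size x size
        neighborhood, discarding isolated single-pixel spikes."""
        return ndimage.median_filter(image, size=size)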

------
bfirsh
If you're on a phone, here's a responsive HTML version of the paper:
[https://www.arxiv-vanity.com/papers/1710.08864/](https://www.arxiv-vanity.com/papers/1710.08864/)

~~~
jwilk
Beware that figure numbers in the HTML version are incorrect. :(

------
threecoins
This makes me feel like neural networks are nothing but glorified hash
functions. Well, if you think about it, here we are just optimizing for hash
collisions of similar things.

~~~
Sharlin
The whole point of classification algorithms, including NNs, is to map similar
inputs to similar outputs, or indeed the same output, and different images to
different outputs. Hash functions usually attempt to erase all information of
"similarity" between inputs. However, the metric that determines what
"similar" means in a NN is not necessarily what we expect it to be.

But of course NNs are definitely just functions, in the strict mathematical
sense. You could replace one with a large lookup table. The interesting part
is the training: how to come up with the function in the first place.

------
pmarreck
Maybe Marvin Minsky was right about neural nets (he’d surely be shaking his
head right now, maybe dropping an “I told you so” or two...)

------
thriftwy
It is kinda obvious that you should make multiple passes at recognition with
jitter and dithering.
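
A rough sketch of that voting scheme (numpy; `predict` and the jitter range
are hypothetical):

    import numpy as np
    
    def jittered_predict(predict, image, passes=10, max_shift=2):
        """Classify several randomly shifted copies of the image and
        return the majority-vote class."""
        votes = []
        for _ in range(passes):
            dy, dx = np.random.randint(-max_shift, max_shift + 1, size=2)
            shifted = np.roll(image, shift=(dy, dx), axis=(0, 1))
            votes.append(int(np.argmax(predict(shifted))))
        return max(set(votes), key=votes.count)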

------
NPMaxwell
The field of Music Theory is algorithms & patterns for music. Computer music
provides the opportunity for rapid iteration. If that direction were followed
intensely, music theory would become a sub-topic within Psychology.

------
fagnerbrack
Thank you, an academic article without a paywall.

------
kmbriedis
Does this publication bring any value, given that it was known well before
that such a thing is possible? Or had no one measured how many pictures can
be "transformed" by changing only one pixel?

