
Attacking machine learning with adversarial examples - dwaxe
https://openai.com/blog/adversarial-example-research/
======
Dowwie
"attackers could target autonomous vehicles by using stickers or paint to
create an adversarial stop sign that the vehicle would interpret as a 'yield'
or other sign"

yeah, this article needs to go to the top of HN and stay there for a while

~~~
echelon
I wonder if such "patterns" could find use in clothing or as bumper stickers.
I can envision this sort of thing taking off as a counter-culture or anti-
technology social weapon. It certainly wouldn't be hard to produce and iterate
on.

I imagine it would be hard to legislate against, let alone enforce a ban on,
subtle visual cues that trigger machine-vision systems.

Interesting times lie ahead...

~~~
rhcom2
People have been using a similar idea to counter facial recognition.

[https://cvdazzle.com/](https://cvdazzle.com/)

~~~
sp332
More recently, CMU made glasses that can make you show up as someone else.
[http://qz.com/823820/carnegie-mellon-made-a-special-pair-of-glasses-that-lets-you-steal-a-digital-identity/](http://qz.com/823820/carnegie-mellon-made-a-special-pair-of-glasses-that-lets-you-steal-a-digital-identity/)

------
pakl
Adversarial examples are just one way to prove that deep learning (deep
convolutional nets) fails at generalizable vision. It's not a security
problem; it's a fundamental problem.

Instead, ask yourselves why these deep nets fail after being trained on huge
datasets -- and why even more data doesn't seem to help.

The short answer is that mapping directly from static pixel images to human
labels is the wrong problem to be solving.

Edit: fixed autocorrect typo

~~~
TTPrograms
"Adversarial examples are just one way to prove that deep learning fail at
generalization"

Do you know what proof is? Adversarial examples demonstrate that there is one
esoteric failure mode of current deep learning models, one that for all we
know is present in human vision (we can't take derivatives with respect to the
parameters of our own neurons). It will likely be solved in the next few
years. At a minimum, you could start training on adversarially generated examples.
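
Concretely, that kind of adversarial training looks roughly like the sketch
below (FGSM-style; the model, optimizer, and epsilon here are made-up
placeholders, not anything from the post):

```python
# Rough sketch of adversarial training with FGSM-style examples.
# `model`, `optimizer`, and `epsilon` are hypothetical placeholders.
import torch.nn.functional as F

def fgsm_example(model, x, y, epsilon=0.03):
    """Perturb x in the direction that most increases the loss."""
    x = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + epsilon * x.grad.sign()).detach()

def adversarial_training_step(model, optimizer, x, y, epsilon=0.03):
    """One update on a mix of clean and adversarially perturbed inputs."""
    x_adv = fgsm_example(model, x, y, epsilon)
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```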

This response is absolute hyperbole and clearly devoid of any factual
knowledge of the nature of deep conv nets and their properties.

~~~
pakl
Training on adversarial examples doesn't solve the fundamental problem; it
merely tries to plug the holes. But in such high-dimensional spaces there are
many, many holes to be plugged. :)

Agreed, the failure mode may seem esoteric, but note that OpenAI is making a
big deal about these examples.

A non-esoteric way to demonstrate the lack of generalization is to feed a deep
conv network real world images (from outside the dataset). Grab a camera and
upload your own photo. Roboticists who try to use deep conv nets as real world
vision systems see these failures all the time.

~~~
TTPrograms
FYI, @OpenAI:

"At OpenAI, we think adversarial examples are a good aspect of security to
work on because they represent a concrete problem in AI safety that can be
addressed in the short term."

[https://openai.com/blog/adversarial-example-research/](https://openai.com/blog/adversarial-example-research/)

Hardly proof that deep learning is fundamentally flawed.

Regarding real-world issues: those come up when you don't separate the
training, test (and real-world) sets properly. My worries would be about the
implementation.

~~~
pakl
I'm certainly not saying that deep learning is fundamentally flawed. It's a
great method, very powerful. (Excellent algorithm.)

I'm saying it's not reasonable to expect good generalization in deep convnets
that learn mappings from static images to human labels. (Wrong problem.)

------
scythe
I'm actually wondering how much the no-free-lunch theorem for data compression
affects adversarial examples. A neural network can be conceptualized as an
extremely efficient compression technique with a very high decoding cost[1];
the NFLT implies that such efficiency must have a cost. If we follow this
heuristic intuitively, we're led to the hypothesis that an ANN needs to expand
its storage space significantly in order to prevent adversarial examples from
existing.

[1] -- consider the following encoding/decoding scheme: train an NN to
recognize someone's face, and decode by generating random images until one of
them is recognized as said face. If this works, then the Kolmogorov complexity
of the network must exceed the sum of the complexities of all "stored" faces.
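
(A toy illustration of that decode step, treating the trained network as a
black-box accept/reject function; `recognizes_face` is hypothetical:)

```python
# Toy version of the footnote's scheme: the "compressed" face lives in the
# network, and decoding is rejection sampling of random images until the
# network accepts one. `recognizes_face` is a hypothetical predicate.
import numpy as np

def decode_face(recognizes_face, shape=(64, 64), seed=0, max_tries=10**7):
    rng = np.random.default_rng(seed)
    for _ in range(max_tries):
        candidate = rng.integers(0, 256, size=shape, dtype=np.uint8)
        if recognizes_face(candidate):  # network says "that's the stored face"
            return candidate
    raise RuntimeError("no accepted image within the sampling budget")
```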

------
danbruc
So what features are those networks actually learning? What are they looking
for? They cannot be much like the features used by humans, because the
features used by humans are robust against such adversarial noise. I am also
somewhat tempted to say that they cannot be too different from the features
used by humans either, because otherwise, it seems, they would not generalize
well. If they just learned some random accidental details in the training set,
they would probably fail spectacularly in the validation phase, but they
don't. And we would of course have a contradiction with the former statement.

So it seems that there are features quite different from the features used by
humans that are still similarly robust unless you specifically target them.
And they also correlate well with the features used by humans unless you
specifically target them. Real-world images are very unusual images in the
sense that almost all possible images are random noise, while real-world
images are [almost] never random noise. And here I get a bit stuck: I have
this diffuse idea in my head that most possible images do not occur in the
real world, and that there are way more degrees of freedom in directions that
just don't occur in the real world, but this idea is just too diffuse for me
to pin down and write out right now.

~~~
pakl
> I have this diffuse idea in my head that most possible images do not occur
> in the real world, and that there are way more degrees of freedom in
> directions that just don't occur in the real world, but this idea is just
> too diffuse for me to pin down and write out right now.

Yes! You're on the right track! The number of degrees of freedom of images of
pixels and textures is HUGE. There is not enough data to practically learn
directly from those images. So the deep networks are starved for data -- even
with the big datasets they are trained on. (It's only thanks to the way they
are set up that they do well when tested on very similar images, like sharp
hi-res photos. But they fail to generalize to other kinds of images.)

So how can you reasonably reduce these degrees of freedom?

It turns out that the continuity of reality itself provides a powerful
constraint that can reduce the degrees of freedom. See, when a ball rolls
along, this physical event is not just a collection of textures to be
memorized. It's an ordered sequence of textures that vary in a consistent and
regular way because of many learnable physical constraints (like lighting).

So, it turns out you can reduce the dimensionality by making a particular kind
of large recurrent neural net learn to predict the future in video. Our very
preliminary testing shows it works shockingly well.
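
(Not our actual architecture, but the generic shape of a next-frame predictor,
just to make the idea concrete; the flattened frames and sizes are arbitrary
placeholders:)

```python
# Generic next-frame prediction sketch in PyTorch, only to illustrate the
# "predict the future in video" idea; sizes and the flattened-frame encoding
# are arbitrary placeholders, not the architecture described above.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NextFramePredictor(nn.Module):
    def __init__(self, frame_dim=64 * 64, hidden=512):
        super().__init__()
        self.rnn = nn.LSTM(frame_dim, hidden, batch_first=True)
        self.readout = nn.Linear(hidden, frame_dim)

    def forward(self, frames):          # frames: (batch, time, frame_dim)
        h, _ = self.rnn(frames)
        return self.readout(h)          # predicted next frame at every step

model = NextFramePredictor()
video = torch.rand(8, 16, 64 * 64)      # 8 clips of 16 flattened 64x64 frames
pred = model(video[:, :-1])             # predict frames 2..16 from frames 1..15
loss = F.mse_loss(pred, video[:, 1:])   # penalize error in the predicted future
loss.backward()
```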

~~~
hulahoof
That sounds very interesting. Is there somewhere I can keep up with the
progress?

------
zitterbewegung
There was a presentation at DEF CON 2016 about a software package for
attacking deep learning models. See

[https://www.youtube.com/watch?v=JAGDpJFFM2A](https://www.youtube.com/watch?v=JAGDpJFFM2A)

[https://github.com/cchio/deep-pwning](https://github.com/cchio/deep-pwning)

------
spott
Are there any examples of these kinds of adversarial patterns that don't look
like noise?

While it is pretty easy to add noise to another image, it isn't exactly easy
to do it to a real object. The noise wouldn't remain the same as you change
perspective with respect to the sign, which would likely change its
effectiveness.

~~~
jdc
[https://cvdazzle.com](https://cvdazzle.com)

------
jseip
I can't see that image without thinking of Snow Crash. This is almost
literally Snow Crash for neural nets.

------
L_226
Anyone want to make a mobile app that emits 'noisy' light, so that when you
use your phone in public, CCTV facial recognition fails?*

I'd be interested to know if this is a viable concealment strategy. It might
only be effective at night or in low-light situations, where sunlight doesn't
wash out the noise. It would be pretty subtle to use as well; how many people
do you see walking around with their noses stuck to a screen?

* For research purposes only, of course.

------
Terribledactyl
I've also been interested in using adversarial examples to extract sensitive
info from models: either extracting unique info from the training set (doesn't
seem feasible, but I can't prove it) or doing a "forced knowledge transfer"
when a competitor has a well-trained model and you don't.

------
a_c
I wonder if adversarial examples can be deliberately used as a kind of
steganography? Kind of like a hidden QR code. On the surface, the product
looks like a panda, with a deliberately added signal. Under the hood, it is
classified as a gibbon. It could be used to verify the authenticity of a
particular product.

------
oh_sigh
As a defensive measure, why can't random noise just be added to the image
prior to the classification attempt?

~~~
antognini
The issue is essentially due to the curse of dimensionality. Since your input
is very high-dimensional, you can find some direction where you don't have to
go very far to maximize some other output. You can find a few of these
adversarial examples and add them to the training set, but there are going to
be an exponential number of other directions where you only have to go a
little farther. So you can't just add all these perturbations to your training
set, because there are too many of them. It's a hard problem. (I've been
thinking about this issue a lot for my research...)

~~~
m-j-fox
I think she means you can add white noise to the front end, making the output
a little bit non-deterministic. Since it won't give the same confidence number
twice, it frustrates gradient descent.
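
Something like this wrapper, I mean (`classify` is a stand-in for the model's
forward pass):

```python
# Sketch of the "white noise on the front end" idea: perturb the input before
# every classification so repeated queries return slightly different
# confidences. `classify` is a hypothetical stand-in for the model.
import numpy as np

def noisy_classify(classify, image, sigma=0.05, rng=None):
    rng = rng or np.random.default_rng()
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return classify(np.clip(noisy, 0.0, 1.0))  # non-deterministic output
```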

~~~
antognini
Just adding white noise won't work. If you add white noise to an image the NN
will make almost exactly the same prediction. The issue with adversarial
images is that you add a _very particular_ perturbation to the image, not just
any perturbation at all.

~~~
darawk
I think the idea proposed is that if you add white noise on top of the
adversarial perturbation, it will destroy that very particular perturbation.

~~~
esoterica
It won't destroy the perturbation. There is a low dimensional manifold along
which the cost function will decrease/increase. The adversarial perturbation
lies on that manifold, but random white noise (which is a random high
dimensional vector) will have close to zero length with high probability when
projected onto the manifold and hence won't affect the cost function.
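
A quick numerical sanity check of that projection argument (one fixed
direction standing in for the manifold):

```python
# The projection of a random unit vector onto a fixed direction shrinks like
# 1/sqrt(d), so high-dimensional white noise barely moves you along the
# adversarial direction.
import numpy as np

rng = np.random.default_rng(0)
for d in (10, 1_000, 100_000):
    adv_dir = rng.normal(size=d)
    adv_dir /= np.linalg.norm(adv_dir)   # fixed "adversarial" direction
    noise = rng.normal(size=d)
    noise /= np.linalg.norm(noise)       # unit-norm white noise
    print(d, abs(noise @ adv_dir))       # tiny for large d
```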

------
Florin_Andrei
It's like 'fake news', but for computers.

~~~
visarga
You're right. Like fake news, it's sometimes hard to identify an adversarial
example because it exploits weaknesses in perception or judgement.

------
aidenn0
Without knowing much about ML, it seems that using two (or more) very
different methods could be a reasonable defense; if the methods are
sufficiently different then it will get exponentially harder to find a
gradient that fools all the methods; what to do when the outputs strongly
disagree is a good question, but switching to a failsafe mode seems better
than what we have now.
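
Something along these lines (the two classifiers here are hypothetical and
return a label plus a confidence):

```python
# Sketch of "two very different models plus a failsafe": only act when both
# models agree with high confidence. `model_a`/`model_b` are hypothetical
# classifiers returning (label, confidence).
def classify_with_failsafe(model_a, model_b, image, min_confidence=0.9):
    label_a, conf_a = model_a(image)
    label_b, conf_b = model_b(image)
    if label_a == label_b and min(conf_a, conf_b) >= min_confidence:
        return label_a
    return "FAILSAFE"  # e.g. slow the vehicle down / defer to a human
```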

~~~
lerid
These adversarial inputs have been shown to generalize to separately trained
models.

~~~
aidenn0
I was thinking: use the same training set but different ML techniques. It's
been almost 20 years since I took an AI class, but RNNs weren't the only thing
in there...

------
huula
Every example provided in the article is model- and training-data-specific.
It only tells you one thing: your data is not telling the truth. So why not
get better data?

~~~
lerid
These adversarial inputs have been shown to generalize to separately trained
models.

------
n3x10e8
This is going to be the SQL injection of the AI age.

------
jmcminis
Adding high-frequency noise "fools" ML but not the human eye. It feels like
this is a general failure in regularization schemes.

Why not try training multiple models on different levels of coarse-grained
data? Evaluate the image on all of them. Plot the class probability as a
function of coarse graining. Ideally it's some smooth function. If it's not,
there may be something adversarial (or bad training) going on.
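
Roughly this kind of check (PyTorch-style sketch; `models_by_scale` is a
hypothetical dict of classifiers trained at each coarse-graining level):

```python
# Sketch of the multi-scale check: coarse-grain the input at several factors,
# evaluate each model, and look at how the class probability changes with
# scale. `models_by_scale` is a hypothetical {factor: model} dict.
import torch
import torch.nn.functional as F

def class_prob_vs_scale(models_by_scale, image, target_class):
    # image: (1, C, H, W) tensor in the range the models expect
    probs = {}
    for factor, model in models_by_scale.items():
        coarse = F.avg_pool2d(image, kernel_size=factor)         # average-pool = coarse grain
        restored = F.interpolate(coarse, size=image.shape[-2:])  # back to original size
        probs[factor] = torch.softmax(model(restored), dim=1)[0, target_class].item()
    return probs  # smooth curve expected; a jagged one is suspicious
```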

~~~
lerid
This seems like an obvious thing to try, but technically the convolutional
structure already looks at different scales (in a way similar to mipmaps).

~~~
jmcminis
I've built a CNN before, and that's not my understanding of it. The
high-frequency noise changes the output of the first layer of the CNN. This is
what gets pooled as you go deeper into the net. Coarse graining is like
getting rid of the weights you have for the first layer and replacing them
with something uniform (averaging the smallest details together).

