
A Guide to Synthesizing Adversarial Examples - anishathalye
http://www.anishathalye.com/2017/07/25/synthesizing-adversarial-examples/
======
anishathalye
Last week, I wrote a blog post ([https://blog.openai.com/robust-adversarial-
inputs/](https://blog.openai.com/robust-adversarial-inputs/)) about how it's
possible to synthesize really robust adversarial inputs for neural networks.
The response was great, and I got several requests to write a tutorial on the
subject because what was already out there wasn't all that accessible. This
post, written in the form of an executable Jupyter notebook, is that tutorial!

Security/ML is a fairly new area of research, but I think it's going to be
pretty important in the next few years. There's even a very timely Kaggle
competition about this ([https://www.kaggle.com/c/nips-2017-defense-against-
adversari...](https://www.kaggle.com/c/nips-2017-defense-against-adversarial-
attack)) run by Google Brain. I hope that this blog post will help make this
really neat area of research slightly more approachable/accessible! Also, the
attacks don't require that much compute power, so you should be able to run
the code from the post on your laptop.

~~~
zitterbewegung
Thank you very much! I requested this, and I think I might have come off as
rude or entitled, but I am so happy you donated your time to create this
tutorial!

~~~
anishathalye
Oh, I didn't take it that way at all. In fact, I thought it was a great idea!
:)

------
0xdeadbeefbabe
> This adversarial image is visually indistinguishable from the original, with
> no visual artifacts. However, it’s classified as “guacamole” with high
> probability!

May "guacamole" become as prominent as "Alice and Bob".

------
dropalltables
This is delightful. As someone who uses AI/ML/MI/... for security, I find
there is not nearly enough understanding of how attackers can subvert
decision systems in practice.

Keep up the good work!

~~~
zitterbewegung
I was introduced to this at a Defcon panel I went to in 2016. See
[https://www.youtube.com/watch?v=JAGDpJFFM2A](https://www.youtube.com/watch?v=JAGDpJFFM2A).
It gives a good conceptual overview.

------
jwatte
I have the feeling that the fact that imperceptible perturbations change the
labels means that our networks/models don't yet look at the "right" parts of
the input data.

Hopefully, this means research will focus on more robust classifiers based on
weaknesses identified by adversarial approaches!
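
For concreteness, here is a minimal sketch of the kind of perturbation being
discussed, assuming PyTorch and an untrained stand-in model on random data
(the linked post itself uses TensorFlow and a pretrained Inception
classifier). The fast gradient sign method nudges every pixel by a tiny
epsilon in whichever direction increases the loss:

```python
# Minimal FGSM-style sketch. Assumptions: PyTorch, an untrained stand-in
# linear model, and a random "image"; with a trained classifier and a real
# image, a perturbation this small is typically enough to change the label.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
x = torch.rand(1, 1, 28, 28)      # hypothetical image with pixels in [0, 1]
y_true = torch.tensor([3])        # hypothetical correct label

# One signed gradient step, bounded by epsilon in the infinity norm.
x.requires_grad_(True)
loss = nn.functional.cross_entropy(model(x), y_true)
loss.backward()
epsilon = 0.05                    # small per-pixel change
x_adv = (x + epsilon * x.grad.sign()).clamp(0, 1).detach()

print("original prediction:   ", model(x).argmax(dim=1).item())
print("adversarial prediction:", model(x_adv).argmax(dim=1).item())
```

The two images differ by at most epsilon per pixel, which is why the
adversarial version looks identical to a human while the model's output can
change.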

------
lacksconfidence
Is the next step generating adversarial examples and injecting them into the
training pipeline?

~~~
blt
I think there have been some papers on that already. Sorry, I don't know them
off the top of my head. It's definitely a good idea.

~~~
moyix
Yes:

> Adversarial training seeks to improve the generalization of a model when
> presented with adversarial examples at test time by proactively generating
> adversarial examples as part of the training procedure. This idea was first
> introduced by Szegedy et al. [SZS13] but was not yet practical because of
> the high computation cost of generating adversarial examples. Goodfellow et
> al. showed how to generate adversarial examples inexpensively with the fast
> gradient sign method and made it computationally efficient to generate large
> batches of adversarial examples during the training process [GSS14]. The
> model is then trained to assign the same label to the adversarial example as
> to the original example—for example, we might take a picture of a cat, and
> adversarially perturb it to fool the model into thinking it is a vulture,
> then tell the model it should learn that this picture is still a cat. An
> open-source implementation of adversarial training is available in the
> cleverhans library and its use illustrated in the following tutorial.

[http://www.cleverhans.io/security/privacy/ml/2017/02/15/why-...](http://www.cleverhans.io/security/privacy/ml/2017/02/15/why-
attacking-machine-learning-is-easier-than-defending-it.html)

[https://arxiv.org/abs/1312.6199](https://arxiv.org/abs/1312.6199)

[https://arxiv.org/abs/1412.6572](https://arxiv.org/abs/1412.6572)
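
As a rough illustration of the loop described above, here is a minimal
adversarial-training sketch, assuming PyTorch, a stand-in model, and random
data in place of a real dataset (the cleverhans library linked above provides
a maintained implementation of the same idea):

```python
# Adversarial training sketch: generate FGSM examples on the fly and train
# the model to give them the original label. Model and data are stand-ins.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()
epsilon = 0.1

def fgsm(x, y):
    # Fast gradient sign method: cheap enough to run inside the training loop.
    x = x.clone().requires_grad_(True)
    loss_fn(model(x), y).backward()
    return (x + epsilon * x.grad.sign()).clamp(0, 1).detach()

for _ in range(100):
    # Stand-in batch; a real pipeline would draw (x, y) from the training set.
    x = torch.rand(32, 1, 28, 28)
    y = torch.randint(0, 10, (32,))
    x_adv = fgsm(x, y)

    # Train the model to assign the *original* label to both the clean and
    # the adversarially perturbed inputs, as in the quoted description.
    opt.zero_grad()
    loss = loss_fn(model(x), y) + loss_fn(model(x_adv), y)
    loss.backward()
    opt.step()
```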

------
bane
I was really inspired by this paper at USENIX [1]. This looks like very _very_
early research, but the outline it provides leaves lots of room for
adversarial ML research.

Bonus: if you tackle this problem, you get several semi-orthogonal
technologies for "free".

1 -
[https://www.usenix.org/system/files/conference/cset16/cset16...](https://www.usenix.org/system/files/conference/cset16/cset16_paper-
kaufman.pdf)

------
yters
If it is so easy to fool deep learning, why is it so hyped? It seems like a
great security risk.

~~~
suryabhupa
Many machine learning and reinforcement learning models are susceptible to
adversarial attacks; it's not unique to deep learning. However, because so
many systems that are currently deployed in applications use deep learning,
it's under particular scrutiny.
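
A tiny numpy sketch of that point, with hypothetical weights and input: even
a plain linear classifier, with no deep learning involved, can be flipped by
a perturbation that is small in every individual feature.

```python
# Hypothetical linear classifier and input; nothing here is deep learning.
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=100)   # made-up weights of a "trained" linear classifier
b = 0.0
x = rng.normal(size=100)   # made-up input with features of scale ~1

score = x @ w + b

# Smallest per-feature step (in the infinity norm) guaranteed to flip the
# sign of the score: push every feature against its weight by epsilon.
epsilon = abs(score) / np.sum(np.abs(w)) + 1e-6
x_adv = x - np.sign(score) * epsilon * np.sign(w)

print("clean score:       ", score)
print("per-feature change:", epsilon)          # small relative to feature scale
print("adversarial score: ", x_adv @ w + b)    # opposite sign
```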

~~~
yters
Then it seems the hype around machine learning is not well founded. Machine
learning in general is a big risk if it is so easily fooled.

------
jcims
This seems like a direct argument against camera-only systems for autonomous
vehicles.

~~~
jwatte
Robustly synthesizing and delivering adversarial input to cameras in the wild
in real time is a very different problem. It's easier to just blind them with
a laser, which you could also do to humans, if you're adversarial enough.

~~~
Twisell
But randomly "running into" a naturally generated adversarial input in the
wild should be a more common problem.

The end result will vary depending on whether the adversarial input is a child
or the rear end of a truck illuminated by sunshine at a certain angle.

So far only the second one has been tested IRL, but for some reason I'm not
really fond of the idea that we should be gathering more field data on
adversarial inputs...

My main fear with current autonomous driving applications of ML is that they
might not be that ready for prime time and that we are only a few deadly
accidents away from a major setback in public trust relating to autonomous
driving.

