
Optical illusions that flummox computers - pixelcort
http://www.theverge.com/2017/4/12/15271874/ai-adversarial-images-fooling-attacks-artificial-intelligence
======
mholt
I'm finishing a survey paper that discusses research on adversarial examples
plus about 9 or 10 other attacks on or weaknesses of neural networks (and
machine learning models in general). The overall conclusion is: much like the
early Internet, we're rapidly advancing towards machine learning tech that
works but isn't secure. And 20 years later, we're still trying to make the
Internet secure...

If neural networks are here to stay, maybe we should slow down their public
deployment for a moment and understand them better first. It would be ideal to
find fundamental structural/algorithmic changes that can harden them rather
than relying on heuristics or other "wrappers" to make input/output safe to
use in autonomous environments. The more security is treated as "extra", the
less those features will be implemented. (We see this rampantly on the web
today with HTTPS.)

~~~
tormeh
Worse is better. Sure, making the internet safe from the start seems
technically superior, but I guess that it would have delayed adoption so
severely we wouldn't have come out ahead.

~~~
sathackr
Exactly. Which is worse overall? No Internet? Or a hopelessly insecure
internet?

Should we wait till something is perfect before we deploy? Should we hold back
on self driving cars until the tech is completely foolproof? What about the
number of accidents/deaths that could be prevented by even a flawed autonomous
car?

Sometimes insecure and imperfect now is better than delaying a technology
years until it can be 'perfected'.

~~~
mcguire
This is a fundamental problem with statistical approaches like NNs: we
currently have no good idea when or how they will fail; it's the explanation
problem.

Putting a great deal of reliance on a system in that situation seems like a
great mistake. "It looks like it works under normal circumstances" is not very
reliable.

~~~
scarmig
Devil's advocate: suppose humans succeed only 99% of the time, but we can
understand and characterize the 1% of failures fairly well. Now suppose
machine learning-based approaches succeed 99.9% of the time and only fail
0.1%, but that 0.1% is for totally unknowable reasons. Is the former better
than the latter?

It's fun to imagine someone constructing adversarial examples to cause self-
driving cars to crash. But is that really a big deal? Humans can just as
easily be tripped up by an "adversarial" example: I can pretty easily point a
laser pointer from my window at random humans driving a car. The legal system
has well-defined mechanisms for dealing with this.

~~~
mcguire
Collectively, we have a lot of experience with the foibles of our visual
system (to the extent that we don't notice what it does poorly, which is a
problem, too). Further, we've arranged civilization to play to its strengths,
which is one of the reasons we're so interested in facial recognition.

The recognition systems look like they are doing the same thing we are, but
they're not. Adding an adversary taking advantage of the difference is just
sauce on top of the problem.

------
candiodari
TLDR: high-energy overlays (sudden changes from one pixel to the next) "fool"
AIs. If you think about it you can see why: they change the statistical
properties of an image a lot "without" disturbing it.

Less so, obviously, if you first do something like downsample the image, or
otherwise soften it, or ... with filters. Nor do they fool neural networks
with attention (at some point they simply decide the overlay isn't worth
looking at and identify the picture by something else).
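
To make that concrete, here is a minimal sketch of that kind of low-pass
preprocessing defense; the pretrained torchvision classifier is purely an
illustrative assumption, not something from the article:

    # Dampen high-frequency adversarial overlays by low-pass filtering the
    # input before classification. The pretrained ResNet-18 is just a stand-in
    # for "any image classifier".
    import torch
    import torch.nn.functional as F
    from torchvision import models

    model = models.resnet18(pretrained=True).eval()

    def classify_with_lowpass(x, scale=0.5):
        """x: (N, 3, 224, 224) normalized image batch."""
        # Downsample then upsample: throws away the pixel-to-pixel "high
        # energy" changes while keeping the coarse structure.
        small = F.interpolate(x, scale_factor=scale, mode="bilinear",
                              align_corners=False)
        smoothed = F.interpolate(small, size=x.shape[-2:], mode="bilinear",
                                 align_corners=False)
        with torch.no_grad():
            return model(smoothed).softmax(dim=-1)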

And the ridiculous example given (the misidentified panda) does not work
without being able to read the mind of the neural network.

Most neural network classification mistakes are "understandable" (e.g. look at
the misidentified carousel). These are really 99.99% or more of the total
mistakes made. Also, that network probably needs more Indian elephants in its
training set (kids make stupid mistakes classifying animals they've never
seen, or have only seen a very few times, as well [1]).

Given how a lot of animals look, I wonder if this doesn't work on "real"
brains as well. I for one have trouble seeing zebras in pictures, and it's of
course not for lack of contrast. Counting them or accurately judging distance
is just out of the question. But many animals look way more colorful and
contrast-rich than seems advisable, from chickens, of course peacocks, to
ladybugs.

A number of optical illusions seem based on high contrast patterns being
included in images, especially if, like in the examples here, the high
contrast patterns don't line up with the objects in the image (e.g. move a
vertical-and-horizontal slit filter over an image and you will not be able to
see through it, yet in any single freeze frame you won't have that problem).

[1]
[https://www.youtube.com/watch?v=bnJ8UpvdTQY](https://www.youtube.com/watch?v=bnJ8UpvdTQY)

------
zeteo
Burying the lede:

> The fact that the same fooling images can scramble the “minds” of AI systems
> developed independently by Google, Mobileye, or Facebook, reveals weaknesses
> that are apparently endemic to contemporary AI as a whole. [...] “All these
> networks are agreeing that these crazy and non-natural images are actually
> of the same type. That level of convergence is really surprising people.”

~~~
shmageggy
This isn't surprising to me at all, since these systems are only "developed
independently" in a very narrow sense. Each company may have its own
implementation (TensorFlow vs. Torch, etc.), but they are implementations of
nearly identical underlying models, namely deep convolutional neural networks.
We shouldn't be surprised that they all agree since they are all the same
under the hood, probably even down to the level of choice of hyperparameters
(stride length, filter size, etc) since there is plenty of research on what
kind of deep net works best on particular applications.
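
To give a sense of what "the same under the hood" means in practice, here is a
toy sketch of the canonical conv block most of these classifiers are built
from; the layer sizes are made up for illustration:

    # A canonical convolutional block: 3x3 filters, stride 1, ReLU, 2x2
    # max-pooling -- the near-universal defaults referred to above.
    import torch.nn as nn

    def conv_block(in_ch, out_ch):
        return nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=1, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2),
        )

    # A few stacked blocks plus a linear head is, roughly, what
    # "independently developed" image classifiers have in common.
    tiny_cnn = nn.Sequential(
        conv_block(3, 32),
        conv_block(32, 64),
        conv_block(64, 128),
        nn.AdaptiveAvgPool2d(1),
        nn.Flatten(),
        nn.Linear(128, 1000),  # e.g. 1000 ImageNet classes
    )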

~~~
mholt
That would be convenient, but even when the machine learning models are
implemented differently, as long as they accomplish the same task, the attacks
often transfer from one to another. Papernot et al.'s latest research suggests
that it is the space of the embeddings within a network, rather than the
weights and nodes themselves, that defines the effectiveness of adversarial
examples. Models which accomplish the same task often have embedding spaces
that overlap, and that overlap grows with dimensionality, if I read it right.

------
itchyjunk
In nature, the natural neural network called the brain seems to weigh
information from multiple sources instead of relying on just one. Not just in
terms of the number of senses, but even within the same sense. Vision, for
example: it gets input from two eyes, and if the data from the two is too
inconsistent, it sometimes drops it.

Machine learning systems will probably use similar tricks at some point. You
might need to double the resources to process the data twice or more, but you
end up with a harder-to-fool system, at least against the current adversarial
attacks.
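
A rough sketch of that kind of redundancy check, using two off-the-shelf
pretrained classifiers as stand-ins for the two "eyes" (the model choices are
purely illustrative):

    # Run two differently-trained models on the same input and only accept a
    # prediction when they agree; disagreement gets flagged (label -1).
    import torch
    from torchvision import models

    model_a = models.resnet18(pretrained=True).eval()
    model_b = models.densenet121(pretrained=True).eval()

    def consensus_predict(x):
        with torch.no_grad():
            label_a = model_a(x).argmax(dim=-1)
            label_b = model_b(x).argmax(dim=-1)
        agree = label_a == label_b
        # "Drop" inconsistent inputs instead of trusting either model.
        return torch.where(agree, label_a, torch.full_like(label_a, -1))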

\------------------

[0] [https://www.scientificamerican.com/article/two-eyes-two-views/](https://www.scientificamerican.com/article/two-eyes-two-views/)

[1] [http://www.bbc.com/future/bespoke/story/20150130-how-your-eyes-trick-your-mind/](http://www.bbc.com/future/bespoke/story/20150130-how-your-eyes-trick-your-mind/)

~~~
lisper
Even the human brain often gets things catastrophically wrong. The problem is
that it's only evident when it happens to other people, not so much when it
happens to you. And if you're surrounded by people all of whom have the same
bug in their wetware, then it gets _really_ hard to tell.

~~~
craigching
This is a really interesting avenue to explore. How easy is it for humans to
misunderstand something such that it leads to "catastrophic" failures? Here, a
catastrophic failure could be something as simple as a huge fight between two
people over a simple misunderstanding. If humans can have "breakdowns" based
on simple misunderstandings, what cascading effects can follow from a simple
misclassification?

~~~
lisper
[https://en.wikipedia.org/wiki/Religious_war](https://en.wikipedia.org/wiki/Religious_war)

------
amelius
What happens if you train the system with these optical illusions in place, as
well as the original images? Will it become harder to find new illusions? Or
will illusions always be able to trick the system no matter how many illusions
you trained with?

Remark: I noticed that even a watermark in the lower-left of the image (as you
see on TV) can totally mess up DL prediction.

~~~
mholt
This is called "adversarial training" and is ineffective because of
transferability. Also the space of adversarial examples is thought to be large
(especially as dimensionality increases), making robust adversarial training
intractable in practice. See
[https://arxiv.org/abs/1312.6199](https://arxiv.org/abs/1312.6199) and
[https://arxiv.org/abs/1510.05328](https://arxiv.org/abs/1510.05328)
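
For context, a minimal sketch of what "adversarial training" looks like in
practice, i.e. FGSM-style augmentation of each training batch (the model,
optimizer, and epsilon are illustrative assumptions, not anything prescribed
by those papers):

    # Adversarial training sketch: augment each batch with fast gradient-sign
    # (FGSM) examples generated against the current model.
    import torch
    import torch.nn.functional as F

    def fgsm(model, x, y, eps=8 / 255):
        x = x.clone().detach().requires_grad_(True)
        F.cross_entropy(model(x), y).backward()
        return (x + eps * x.grad.sign()).clamp(0, 1).detach()

    def adversarial_training_step(model, optimizer, x, y):
        x_adv = fgsm(model, x, y)
        optimizer.zero_grad()  # clear gradients left over from crafting x_adv
        # Train on clean and adversarial inputs together.
        loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()
        return loss.item()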

~~~
amelius
Interesting, what do you mean by transferability?

~~~
mholt
Training a model on adversarial examples is essentially a type of "gradient
masking" ([https://blog.openai.com/adversarial-example-research/](https://blog.openai.com/adversarial-example-research/)).
A model trained on one set of adversarial examples in an attempt to harden it
against all adversarial examples is still vulnerable to adversarial examples
produced from a totally separate model that performs the same task.
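
Concretely, the transfer scenario looks roughly like this sketch; the
"substitute" and "victim" models are placeholders for two independently
trained classifiers:

    # Craft adversarial examples against a substitute model you control, then
    # measure how often they also fool a separately trained victim model.
    import torch
    import torch.nn.functional as F

    def fgsm(model, x, y, eps=8 / 255):
        x = x.clone().detach().requires_grad_(True)
        F.cross_entropy(model(x), y).backward()
        return (x + eps * x.grad.sign()).clamp(0, 1).detach()

    def transfer_success_rate(substitute, victim, x, y):
        x_adv = fgsm(substitute, x, y)  # white-box attack on the substitute
        with torch.no_grad():
            preds = victim(x_adv).argmax(dim=-1)  # black-box check on victim
        return (preds != y).float().mean().item()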

------
paulsutter
The real issue here is that imagenet doesn't have enough funny glasses. Same
reason detection networks will think that gravel is broccoli, a network learns
just enough to distinguish between the categories it's presented.

Improvements in datasets, transfer learning, and online learning will help.
Unfortunately this underscores the issue that giants such as google have more
pictures of funny glasses than anyone else...

~~~
Jweb_Guru
No, that's not the only issue (it's not just that it doesn't have enough
adversarial samples), as you would know if you'd read the papers involved.

~~~
paulsutter
The central issue is that networks are trained on a limited database of images
(usually imagenet) and then frozen in time. Going beyond this involves
evolving the nature of networks, and there's nothing the human brain can do
that can't be done.

The design of networks is limited by the dataset, since researchers are just
trying to hit the metric for the dataset. Better datasets will lead to better
networks. Better data eventually means billions of cameras, for example.

~~~
kem
The issue is that the networks are basing classification decisions on
trivial, nonessential stimulus features. It's like basing the decision on a
feature direction with a very small eigenvalue. That could be a training set
issue, or it could be something about the network structure.

The article was kind of interesting to me because it reveals that networks are
probably sometimes making decisions on highly discriminating but non-essential
stimulus features.

It's as if the networks have a high success rate over a large number of
replicated examples (a large sample size), but not over a large number of
distinct examples.

They're overfitting, but in a way that isn't immediately obvious because the
test stimuli tend to be limited.

My guess is the answer is to incorporate into the training set a lot of quasi-
random or structured but abstract images as controls for training.
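
As a sketch of what that could look like, mix random "abstract" images into
each batch under an extra "none of the above" label; the class index and
mixing ratio here are made-up assumptions:

    # Add quasi-random control images so the network can't treat high-energy
    # texture alone as evidence for any real category.
    import torch

    def add_control_images(x, y, num_classes, frac=0.25):
        n = max(1, int(frac * x.size(0)))
        noise = torch.rand(n, *x.shape[1:])  # meaningless but image-shaped
        control_labels = torch.full((n,), num_classes, dtype=y.dtype)
        return torch.cat([x, noise]), torch.cat([y, control_labels])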

~~~
paulsutter
Yes, exactly: they just barely work for the examples shown to them. It is
caused by the training set, the network structure, and the metric, which are
all intertwined.

The design of the network is a direct consequence of the dataset+metric
because researchers focus on accuracy scores against the test data. Given a
better dataset and metric, researchers will solve it with a better network
design. I guarantee it.

------
mcguire
" _To add to the difficulty, it’s not always clear why certain attacks work or
fail. One explanation is that adversarial images take advantage of a feature
found in many AI systems known as “decision boundaries.” These boundaries are
the invisible rules that dictate how a system can tell the difference between,
say, a lion and a leopard. A very simple AI program that spends all its time
identifying just these two animals would eventually create a mental map. Think
of it as an X-Y plane: in the top right it puts all the leopards it’s ever
seen, and in the bottom left, the lions. The line dividing these two sectors —
the border at which lion becomes leopard or leopard a lion — is known as the
decision boundary._ "

I am not sure I buy this theory. Moving an image across a nearby boundary
shouldn't result in the image producing a _higher_ confidence value, should
it?

I'm thinking of the panda/gibbon example in the article.

~~~
caseysoftware
Yes, that's exactly what it does.

By shifting it a few points across the boundary, it creates a higher
confidence that X is actually Y instead of X.

The problem is that without studying the neural network and trying out
different inputs and seeing the results, it's hard to figure out exactly what
will shift things across the boundary. Even if you could get the same starting
data set and attempt to train your own, you don't necessarily know which data
was used for testing vs validation.

You'll likely come up with similar boundaries, but not the same ones, so you
don't know the effectiveness of your adversarial approach until you actually
try it against that particular system.
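
A small sketch of that shift, a targeted variant of the gradient-sign idea
behind the panda/gibbon figure; the pretrained model and class indices are
illustrative assumptions:

    # Take an image, nudge it a small step toward a chosen target class, and
    # compare the classifier's confidence before and after.
    import torch
    import torch.nn.functional as F
    from torchvision import models

    model = models.resnet18(pretrained=True).eval()

    def push_toward(x, target_class, eps=0.01):
        x = x.clone().detach().requires_grad_(True)
        # Descend the loss of the target class, i.e. raise its probability.
        F.cross_entropy(model(x), torch.tensor([target_class])).backward()
        return (x - eps * x.grad.sign()).detach()

    def report(x, a, b):
        with torch.no_grad():
            p = model(x).softmax(dim=-1)[0]
        print(f"P(class {a}) = {p[a].item():.2f}, "
              f"P(class {b}) = {p[b].item():.2f}")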

~~~
mcguire
But how do you go from 60% certainty that it is an A to 90% certainty that it
is a B?

------
JulianMorrison
Uhhh, this has me a little worried that they are digging in territory that
could contain basilisks. Humans are neural networks. Please do not generalise
a means of hacking me? Thank you and much obliged.

~~~
anchpop
Neural networks have very little in common with how our brains work

------
rhaps0dy
"My, what a shiny red motorbike."

Totally what happens inside of a NN :D

------
adangert
You know you probably could extend this to AI magic tricks, where you are able
to make certain objects do things they are not supposed to, perhaps with an AI
viewing video.

------
verroq
Note that to generate perturbations you need access to the underlying model.
The attack is just optimising towards a specific class with added cost for
visual differences.
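
In code, that optimisation is roughly the following sketch; the step count and
the weighting constant are illustrative:

    # Gradient-descend the input toward a chosen target class while penalizing
    # visual difference from the original image. Needs white-box access to the
    # model's gradients.
    import torch
    import torch.nn.functional as F

    def targeted_attack(model, x0, target, steps=100, lr=0.01, lam=10.0):
        x = x0.clone().detach().requires_grad_(True)
        opt = torch.optim.Adam([x], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            loss = (F.cross_entropy(model(x), target)
                    + lam * (x - x0).pow(2).mean())  # visual-difference cost
            loss.backward()
            opt.step()
            x.data.clamp_(0, 1)  # keep it a valid image
        return x.detach()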

~~~
mholt
That's not entirely true:

> If we transfer adversarial examples from one model to another model trained
> with one of these defenses, the attack often succeeds, even when a direct
> attack on the second model would fail [PMG16].

[http://www.cleverhans.io/security/privacy/ml/2017/02/15/why-attacking-machine-learning-is-easier-than-defending-it.html](http://www.cleverhans.io/security/privacy/ml/2017/02/15/why-attacking-machine-learning-is-easier-than-defending-it.html)

------
McKayDavis
Clever 2 level pun with "Hi" hidden in the Magic Eye <-> Magic-AI.

------
mrkgnao
More blatant "SEO walks into a headline" jokes, anyone?

------
rochellle
But these aren't optical illusions!

~~~
dingo_bat
They are for image classifiers.

~~~
rochellle
It's anomalous data for half-baked, fault-intolerant software.

------
spyckie2
Actually a pretty interesting article.

The ease with which you can 'fool' machine learning right now adds an
additional layer to practical machine learning in the wild: the risk of
malicious attacks.

Imagine someone putting up a lawn sign that tricks self-driving cars into
seeing something that isn't there and applying the wrong behavioral pattern
because of it. Or, even simpler, someone taping a sticker over the self-
driving car's cameras that causes erratic behavior. That can have really bad
consequences and seems really simple to do.

~~~
AndrewOMartin
I wrote a bit on this subject just last week; each paragraph covers the
problems of increasingly uncontrolled environments, finishing up with actively
hostile environments like the examples you just mentioned.

[http://www.aomartin.co.uk/what-robots-can-and-cant-do/](http://www.aomartin.co.uk/what-robots-can-and-cant-do/)

