Beyond the fact that unexpected inputs produce unwanted outputs in both cases, I'm not sure I see the connection.
In the DNN case, the network should be able to end up in either state. However, the boundary is fairly sharp, and small changes in the input not only push it over the boundary but also give it high confidence that the new state is the correct one.
The brain, however, shouldn't ever end up in an epileptic state; there are feedback mechanisms that decorrelate neural activity and keep excitation in a certain range. These mechanisms are weakened or defective in epilepsy, which allows very potent stimuli to drive neural activity and kick off positive feedback loops.
The odds that there are adversarial perturbations that reliably push the system into an unstable state between two or more output states are probably quite high. It just hasn't hit the threshold at which it becomes interesting for postdocs to look for yet.
(E.g. everyone has probably heard about the blind spot at the point where the optic nerve 'connects' to the retina. But I was surprised to learn that most of the stuff in our FOV is seen very poorly, and only a tiny area called the macula is responsible for high-acuity vision. And the macula can only cover a small spot of our FOV at a time: what we'd think of as a single view is really constructed in a post-processing step as our eyes move and scan our surroundings. My amateur-level explanation might be a bit off, but the point is, it stands to reason that that kind of "dynamic" processing system is more robust (especially to small pixel-level "attacks" as in the linked paper) than simply constructing mappings of still pixel grids to outputs, as CNNs currently do.)
You can cover someone's eyes and obstruct their visual system entirely, and they'll be fooled for a second until they realize it's probably just their friend because they can feel your hands. Did you fool your friend's visual system? Or does it need to stay fooled forever before you can say you fooled the visual system?
So the visual system on its own is not enough, as we have many other senses that you'd need to fool in order to fool you entirely.
I think optical illusions are a great way to fool the visual system alone, but they don't fool you (all your senses). There are magic tricks for that sort of thing, but even then we know it's just a trick.
As raverbashing pointed out, our visual system is more than just a 2D grid of points, each with a certain color, so obviously the latter is easier to fool.
When I saw the 32x32 image of the frog with the pixels I wasn't even sure what it was for a few seconds. But I had context, like I know these images are probably things that are easily classified, so it's gotta be some common animal, airplane, car, house, etc. I also knew that I was supposed to ignore the pixels.
If you present the frog image with pixels to people on the street and tell them to classify the image, I'm sure many of them will get it wrong if they are given less than 2 seconds to look at it. More so when they're focused on something completely irrelevant.
Clearly it isn't, but given that animals use camouflage, the visual system is the end product of millions of years of an evolutionary arms race against confounding inputs.
Human vision has natural blur (some people's more than others'), natural temporal integration of inputs, automatic centering of subjects, limited resolution, and gain adjustment.
Hence nudging one pixel makes absolutely no difference.
One is the short-lived "persistence of vision" effect.
The other is observing a scene multiple times from slightly different angles and positions.
If some weird arrangement causes an illusion, changing position slightly usually fixes it.
Also, human data is noisy; I guess augmentation strategies might want to take that into account as well.
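Something like this, maybe (just a sketch using PyTorch/torchvision; the jitter amounts and noise level are made-up illustrative values):

    import torch
    from torchvision import transforms

    class AddGaussianNoise:
        # Add zero-mean Gaussian noise to a tensor image, mimicking sensor noise.
        def __init__(self, std=0.05):
            self.std = std
        def __call__(self, img):
            return (img + torch.randn_like(img) * self.std).clamp(0.0, 1.0)

    # Jitter position/angle a little and add noise, loosely mimicking the
    # "many slightly different noisy glimpses" idea above.
    train_transform = transforms.Compose([
        transforms.RandomAffine(degrees=5, translate=(0.05, 0.05)),
        transforms.ToTensor(),
        AddGaussianNoise(std=0.05),
    ])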
A neural net does not. That's the missing element: neural nets "need to know how to know when they don't know".
When we do know that there is an optical illusion, it is because of context we have that the machines don't, e.g. we can read the title of the article ("These 11 optical illusions will blow your mind").
For a much more subtle and in some ways (imo) disturbing one, look into the blindness constantly induced by our visual perception during saccades. A malicious attack on that would be... well, I don't really have a word for it. Disturbing doesn't really fit. Suggestions welcome!
However, if you look at the images in the article, I think it says a lot about how much more sophisticated human perception is. The authors even have a note on the first image along the lines of "look carefully, because you might have trouble locating the aberrant pixel."
I think these types of attacks speak volumes about the fragility of DL optimizations: I think there's more overfitting going on than people acknowledge or realize. My sense from reading about and working with some ML systems is that this extends to natural language data as well.
DL systems are often highly optimized to a particular test set, and might do well on cross-validation to other exemplars of that test set, but that's not the same as generalizing across different types of test sets.
Maybe I'm wrong, though--I think DL NN models have been fantastic, but there's a certain amount of hype in ML in general.
But in reality the brain is stateful and noisy and works on a stream of images. Even if now and then, by sheer accident, for a short time, your brain might be misled, it surely wouldn't go, "well, I was pretty sure that was an X, but just now Y is scoring higher than X, so I'll just flip my opinion on that in an instant".
Obviously, humans aren't going to have this amount of error from such small changes (except maybe for very autistic people who have dramatic reactions to changes in their environment, but I may also be wrong; I'm not autistic, and I'm not a psychologist), but our innate perceptual biases do play heavily into how stimuli are processed and reacted to.
> This first example of the Berryman Logical Image Technique (hence the usual acronym BLIT) evolved from AI work at the Cambridge IV supercomputer facility, now discontinued. V.Berryman and C.M.Turner  hypothesized that pattern-recognition programs of sufficient complexity might be vulnerable to "Gödelian shock input" in the form of data incompatible with internal representation. Berryman went further and suggested that the existence of such a potential input was a logical necessity ...
But we know neural nets are not smooth in that sense: a tiny change in input can cause a huge change in output, as suggested here.
If we get to the point of having a human connectome to analyze--or the kind of access to neural topology that a neural lace would provide--could an optimizer generate an image of static that a human would mistake for the president of the United States?
It seems outwardly implausible that such an image could exist, but perhaps that is only because we've never seen one (or if we had, would we know?), and a "blind" search of images would never find it, as the space of images is galactically huge. With a "map" of the brain it might suddenly become plausible.
And if so, that world sounds absolutely terrifying to me.
Is this a face?
Are these lines straight?
Is this stationary?
What colour is this?
These even work across vast numbers of people.
With full knowledge of a person's brain and the connections, we would surely at least be able to enormously improve on this.
What other things might be possible? Could we make people move in a certain way, do or say certain things? There's no fundamental reason we'd be limited to affecting the visual processing sections of the brain.
And now I need to read Snow Crash again.
The dress is sort of a variation on the same theme, except it divides people by how they extrapolate the rest of the scene.
At least to some extent it's possible, with hypnotism.
Probably not, because our brain doesn't seem to work like Artificial Neural Networks do. Most notably, we learn to identify novel objects after only seeing a single example of them while ANNs may require many thousands of examples. We don't seem to learn to identify individual pixels, either (though what exactly our brain does when we learn to identify objects from their images is anyone's guess).
Digital images are also not a very good analogy for what human eyes see: our vision doesn't have "pixels" and we don't even need images to be particularly clear to identify them with good accuracy (we can still tell what things are up to a point, even in the dark, when it rains, when our visual field is occluded etc).
Generally, you can't expect the human brain to work like an ANN. Like others have said before, the "neural network" analogy is not a very good one. It often serves only to create confusion about the capabilities of ANNs and the human brain.
Yann LeCun: My least favorite description is, “It works just like the brain.” I don’t like people saying this because, while Deep Learning gets an inspiration from biology, it’s very, very far from what the brain actually does. And describing it like the brain gives a bit of the aura of magic to it, which is dangerous. It leads to hype; people claim things that are not true. AI has gone through a number of AI winters because people claimed things they couldn’t deliver.
But none of that is particularly relevant. Even if they are completely different, so what? The same procedure could still work. Take a biological brain and backprop through it to find exactly what inputs change the outputs by some small degree and tweak it bit by bit until you change the output. You can apply this to any function, it's general.
Adversarial examples are exceedingly rare in natural data. They require tweaking exactly the right pixels in exactly the right direction. It's something stupid like a one in a billion billion chance of such a malicious example occurring by chance if you just randomly add noise to images. It requires a very precise optimization procedure.
So if adversarial examples did exist for human vision, we probably wouldn't know it yet. They don't occur in nature. So there's no reason for the brain to have evolved defenses against them. (Though camouflage is an interesting natural analogy, it's not quite the same.)
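For what it's worth, the "tweak it bit by bit" procedure above, applied to an ANN, looks roughly like this (a hedged sketch of the standard iterative gradient-sign attack, not the differential-evolution method from the paper; "model" is assumed to be any differentiable PyTorch classifier):

    import torch
    import torch.nn.functional as F

    def adversarial_example(model, image, true_label, eps=0.01, steps=20):
        # Iteratively nudge every pixel a tiny bit in the direction that
        # increases the loss for the true class (basic iterative FGSM).
        x = image.clone().detach().requires_grad_(True)
        label = torch.tensor([true_label])
        for _ in range(steps):
            loss = F.cross_entropy(model(x.unsqueeze(0)), label)
            loss.backward()
            with torch.no_grad():
                x += eps * x.grad.sign()   # step *up* the loss surface
                x.clamp_(0.0, 1.0)
            x.grad.zero_()
        return x.detach()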
How exactly do you "backprop through" a (biological) brain?
Also, I don't see why you'd ever want to do that "tweak it bit by bit" thing to a brain. Human brains seem to catch on to ideas pretty quickly. They don't need to go back and forth on their synapses a million times until they learn to react to a stimulus. Whatever the brain does is light years ahead of backprop, which is, all things considered, a pretty poor algorithm. So why would you ever want to do that "backprop on a brain" thing, if you could do, you know, what brains do normally?
The algorithm used to do the optimization doesn't really matter. Use a GA or hillclimbing if you want.
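A dumb gradient-free version is easy to sketch, for example (purely illustrative; predict_prob is a hypothetical black box that returns class probabilities for an HxWx3 image):

    import numpy as np

    def hillclimb_attack(predict_prob, image, true_label, iters=10000, step=8):
        # Randomly nudge single pixels; keep any change that lowers the
        # classifier's confidence in the true label. No gradients needed.
        best = image.astype(np.int64)
        best_score = predict_prob(best)[true_label]
        for _ in range(iters):
            candidate = best.copy()
            y = np.random.randint(image.shape[0])
            x = np.random.randint(image.shape[1])
            nudge = np.random.randint(-step, step + 1, size=3)
            candidate[y, x] = np.clip(candidate[y, x] + nudge, 0, 255)
            score = predict_prob(candidate)[true_label]
            if score < best_score:
                best, best_score = candidate, score
        return best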
EDIT since you added more to your comment:
>why you'd ever want to do that "tweak it bit by bit" thing to a brain. Human brains seem to catch on to ideas pretty quickly.
So what? That's the process for creating an adversarial image. Does it matter how many steps it takes to create it?
>They don't need to go back and forth on their synapses a million times until they learn to react to a stimulus.
That's exactly how learning works in humans. Try to learn to juggle with just one or two tries. It takes thousands. And that's after you've spent years in your body learning how to coordinate your muscles and locate objects with your eyes and how physics works, etc.
>Whatever the brain does is light years ahead of backprop, which is, all things considered, a pretty poor algorithm.
I really, really doubt that. There are a number of theories about how the brain might implement a variation of backpropagation for learning. Hinton has one.
Backpropagation is not a poor algorithm, it's probably close to optimal. No one has been able to come up with something better besides just little heuristic tweaks. It's very difficult to see how you could do so. It's so simple and elegant and general.
It doesn't look anything like a dog!
I think all the other fail-safes human brains have would be pretty good protection against OP's scenario. When I'm not seeing clearly, I automatically take a second look, try a different angle, realize that something isn't quite right, and evaluate the whole situation accordingly. All things that today's neural networks don't and can't do.
Having been taught the difference between a photo of a dog and a drawing of a dog, would it then be able to differentiate between a photo of any object and a drawing of that object, or do we need to teach it the difference again for every different object there is?
If I teach it to identify a simple two colour single-line drawing of a dog, like that Picasso picture, will it then be able to handle surrealist drawings of dogs, and impressionist, and cubist, and a picture of a sculpture, and watercolours and charcoal and all the other varieties of form and style, or do I need to teach it separately for everything? Don't forget slide puzzles! I can tell this is a dog - https://lh3.googleusercontent.com/oAtmNcl25MPQOZ5Occ_fr7_BKr...
These of course are hypothetical questions; I suspect the answer is that there is going to be an awful lot of teaching, with a few pleasant surprises when it gets one style of artwork from having seen enough of other styles.
Except the first one; to the NN, a real dog is a picture of a dog. With no concept of a real object behind the picture, the NN's universe is pictures, and it will only ever be a simple machine for identifying things in a very, very narrow universe.
Nothing I'm saying here is news to anyone, of course, but sometimes it seems like these NNs are portrayed as general identifiers, when they're actually very narrow.
There are many known ways to create optic illusions that trick humans this quickly and thoroughly. Efficiency is the real question if you are concerned about deception and human brains.
I don't know enough about the subject to know why training does that, or what can be done about it.
Doesn't this imply the Jacobian of the network is ill-conditioned near the adversarial examples? If so, it seems like we could test this by imposing regularization on the ratio of the min and max singular values of the network's Jacobian, and examining what effect, if any, that has on adversarial examples.
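Something like this, maybe (a rough PyTorch sketch; it forms the full Jacobian explicitly, which is only practical for small inputs, and the penalty weight is a placeholder):

    import torch
    from torch.autograd.functional import jacobian

    def jacobian_condition_penalty(model, x):
        # Condition number (max/min singular value) of d(logits)/d(input),
        # to be added to the training loss as a regularizer.
        J = jacobian(lambda inp: model(inp.unsqueeze(0)).squeeze(0),
                     x, create_graph=True)      # shape: (num_classes, *x.shape)
        J = J.reshape(J.shape[0], -1)           # flatten the input dimensions
        svals = torch.linalg.svdvals(J)         # sorted in descending order
        return svals[0] / (svals[-1] + 1e-8)

    # loss = task_loss + lam * jacobian_condition_penalty(model, x)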
For what it's worth, images used in this paper are 32x32 pixels.
Or half a torso of a dog... with a dump truck in the background..
The yellow is the manufacturer's (Caterpillar) trademark color--and it probably helps that it's very high visibility.
Even here, most are yellow despite having a different manufacturer. Maybe the difference is enough to differentiate them, though.
If my understanding is correct, I guess my question is: how general are these attacks? Can we ever say "oh yeah, don't try to classify this penguin with your off-the-shelf model, it'll come up iguana 9/10"?
Psychedelics come up a lot in connection with NN-generated images, but it's interesting that under the influence of a perturbation the classifier suddenly starts presenting illogical conclusions.
It's also similar to how people recall their own internal classifiers unravelling, or showing their biases, when examined under the influence of psychedelics.
Perhaps there is some deeper analogue in the neuronal activity of a brain under the influence of a substance.
If they write regular blog posts or publish the architecture in a journal, however, and the API gives you probabilities, you could maybe recreate it with a process similar to distillation?
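Roughly like this, maybe (a toy sketch; query_api and the student model are hypothetical, and real distillation setups are more careful about temperature and data coverage):

    import torch
    import torch.nn.functional as F

    def distill_step(student, optimizer, images, query_api):
        # Train a local "student" to match the probability vectors returned
        # by the remote classifier, in the spirit of model distillation.
        teacher_probs = torch.stack([query_api(img) for img in images])   # (N, C)
        log_student = F.log_softmax(student(images), dim=1)
        loss = F.kl_div(log_student, teacher_probs, reduction="batchmean")
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        return loss.item()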
The fundamental problem is that there are only so many training examples and layers that you can fit in a GPU's memory. The current state-of-the-art neural networks work well for very specific tasks (e.g. handwriting recognition), but don't generalize.
Additionally, the size of the network isn’t the same thing as GPU memory, and you don’t fit the training examples in memory anyway (?!).
The NN should still be able to recognise something partially blurred if trained that way.
You could trivially thwart this attack by rate-limiting--it needs many passes through the network to evolve an image--or by caching the classifier's output and returning it for all similar (e.g., by Hamming distance) images.
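E.g. something along these lines (a toy sketch; the crude difference-hash and the distance threshold are arbitrary illustrative choices):

    import numpy as np

    hash_cache = {}   # maps hash bytes -> previously returned label

    def dhash(image, size=8):
        # Tiny difference-hash: downscale, then compare neighbouring pixels.
        gray = image.mean(axis=2)
        small = gray[::gray.shape[0] // size, ::gray.shape[1] // (size + 1)]
        small = small[:size, :size + 1]
        return (small[:, 1:] > small[:, :-1]).flatten()

    def classify_with_cache(image, classifier, max_hamming=4):
        h = dhash(image)
        for cached_bytes, label in hash_cache.items():
            cached = np.frombuffer(cached_bytes, dtype=bool)
            if np.count_nonzero(h != cached) <= max_hamming:
                return label            # reuse the answer for near-duplicates
        label = classifier(image)
        hash_cache[h.tobytes()] = label
        return label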
Instead, I think this work is interesting because it shows the limits of the network's generalization abilities.
Reading PDFs on screen sucks.
But of course NNs are definitely just functions, in the strict mathematical sense. You could replace one with a large lookup table. The interesting part is the training: how to come up with the function in the first place.