yeah, this article needs to go to the top of HN and stay there for a while
In the case of a human not seeing a stop sign, some drivers will remember that a stop sign used to be at that location. A self-driving car is less likely to forget, and might (in a world with networked cars, which I find slightly scary for different reasons than this) be able to "remember" that other cars had previously seen a stop sign at that location.
While an AI might already know that a given intersection is supposed to be controlled, there are plenty of driving situations in which memory (or maps) aren't much help.
When it comes to self-driving cars, probably the way around attacks on signs or other static road features is just to fail safe even when you're misdirected by them (that is, you drive cautiously enough that even if you don't understand the road situation, you can find a way to stop without hurting anyone). But that's not a general AI solution, and may not even be a general self-driving car solution.
Deep neural nets should really be called "really complicated compositions of differentiable functions", and all the training algorithms should really be called "really complicated function root finding", because that's all these things are at the end of the day: a rat's nest of functions with a billion knobs.
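For what it's worth, that framing is easy to write down. Here's a toy numpy sketch (all the sizes and the learning rate are made up for illustration): a two-layer net really is just nested differentiable functions, and training really is just nudging the knobs along a gradient.

```python
import numpy as np

# f(x) = W2 @ relu(W1 @ x + b1) + b2: a composition of differentiable
# functions whose "knobs" are W1, b1, W2, b2.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 4)) * 0.1, np.zeros(8)
W2, b2 = rng.normal(size=(1, 8)) * 0.1, np.zeros(1)

x, y = rng.normal(size=4), np.array([1.0])
for _ in range(100):
    h = np.maximum(W1 @ x + b1, 0.0)        # relu(W1 x + b1)
    pred = W2 @ h + b2
    err = pred - y                          # gradient of squared error w.r.t. pred
    # backprop is just the chain rule applied through the composition
    gW2, gb2 = np.outer(err, h), err
    gpre = (W2.T @ err) * (h > 0)
    gW1, gb1 = np.outer(gpre, x), gpre
    for p, g in ((W1, gW1), (b1, gb1), (W2, gW2), (b2, gb2)):
        p -= 0.1 * g                        # turn each knob a little
```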
That's not the reason at all.
It's because if you add exactly the right kind of specially crafted noise to an image of a stop sign such that, to a human, it looks like a panda, then who is going to be the arbiter that decides "no, it's still actually an image of a stop sign and not a panda"? The machine?
The real reason is the subjective nature of the perception of reality. With the advent of AI, all sorts of schools of philosophy that used to seem like useless navel-gazing are going to find some real important practical applications real soon. This is metaphysics and ontology. On a similar note (not in TFA), in the "friendly AI" debate, we're going to be looking real hard at the foundations of the philosophy of ethics (where does it really come from?), not just the teleological vs deontological (etc) debates we use to reason about law, politics, governance and appeasement of our "justice" instincts.
Or quite possibly we just don't know how to construct the noise yet :P
Isn't that illegal in most jurisdictions?
E.g., look at the panda images at the start of the linked post. The third image looks like a panda to humans (and most humans probably wouldn't be able to see the difference between the first and third images), but it greatly affects the machine learning interpretation.
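For context, the noise in that panda figure isn't random; it's computed from the model's own gradient (the "fast gradient sign method" described in the post). Roughly, and assuming a PyTorch classifier `model` and a correctly labeled input `x`, it looks like this:

```python
import torch
import torch.nn.functional as F

def fgsm_perturb(model, x, label, eps=0.007):
    """One gradient step in the direction that increases the loss,
    with every pixel moved by at most eps (imperceptible to a human)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), label)
    loss.backward()
    return (x + eps * x.grad.sign()).detach()   # confidently misclassified
```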
Modifying the sign in this way, if it is possible, would be much harder to detect and enforce against than an attack targeting human drivers.
An image that triggers a false positive in the AI won't be noticeable to the average human, however good they are at detecting bullshit. At worst it probably looks like some annoying abstract pattern.
To the car, however, it could represent anything that makes it react adversely. Seems like figuring out how to find those images might pose a challenge.
I imagine it would be hard to enforce, let alone legislate against subtle visual cues that trigger machine vision signals.
Interesting times lie ahead...
At the time, I was thinking of posing for my DMV photo with it on, because I thought it was interesting and kinda funny. Failed to do so and never resumed the experiment.
I fear for the future if the fascists control the neural networks.
> We find that both adversarial training and defensive distillation accidentally perform a kind of gradient masking. Neither algorithm was explicitly designed to perform gradient masking, but gradient masking is apparently a defense that machine learning algorithms can invent relatively easily when they are trained to defend themselves and not given specific instructions about how to do so. If we transfer adversarial examples from one model to a second model that was trained with either adversarial training or defensive distillation, the attack often succeeds, even when a direct attack on the second model would fail. This suggests that both training techniques do more to flatten out the model and remove the gradient than to make sure it classifies more points correctly.
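The transfer experiment they describe is simple in outline: craft the adversarial image against one model, then hand it to a different model. A rough sketch, reusing the fgsm_perturb sketch from earlier in the thread (model_a, model_b, x, and label are all placeholders):

```python
# Transfer attack: the image is crafted against model_a but is then shown
# to model_b, which was trained separately (possibly with adversarial
# training or defensive distillation).
x_adv = fgsm_perturb(model_a, x, label)
clean_pred = model_b(x).argmax(dim=1)       # model_b on the clean image
adv_pred = model_b(x_adv).argmax(dim=1)     # model_b on model_a's adversarial image
print("attack transferred:", bool((adv_pred != label).any()))
```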
The AI doesn't need to be perfect, just better than humans.
Humans are much worse than AI when it comes to self-driving tasks like image/pattern recognition, sure. But humans fail in predictable ways, so countermeasures can be taken: larger and brighter signs, stricter rules, etc. What happens when you don't know for sure when the system can fail and when it won't?
People have been seeing the magic of machine learning for a few years now, but something like this was only just discovered. Are you sure that if someone stood in front of a Tesla holding a picture like the one the parent comment quoted, it wouldn't crash and cause harm to the people inside?
Isn't it scary not to know how and when it can crash? You don't think about being blindsided at each intersection...
IIRC, significant research has been performed to identify whether the resulting lane change is net positive (frees space in the slower lane) or net negative (causes cut off car to slow down).
If autonomous vehicles prioritize safety over efficiency by stopping at any ambiguous intersection, will that have a net negative effect on traffic as all these self-driving cars slow down for each other?
Nothing (and likely no one) is foolproof.
Instead, ask yourselves why these deep nets fail after being trained on huge datasets -- and why even more data doesn't seem to help.
The short answer is that mapping directly from static pixel images to human labels is the wrong problem to be solving.
Edit: fixed autocorrect typo
Do you know what proof is? Adversarial examples demonstrate that there is one esoteric failure mode of current deep learning models, one that for all we know is present in human vision (we can't take derivatives with respect to the parameters of our own neurons). It will likely be solved in the next few years. At a minimum you start training on adversarially generated examples.
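The "train on adversarially generated examples" step is straightforward to write down, at least in sketch form. Assuming a PyTorch model, a data loader, and the fgsm_perturb sketch from above:

```python
import torch
import torch.nn.functional as F

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
for x, y in loader:
    x_adv = fgsm_perturb(model, x, y)                  # attack the current model
    loss = 0.5 * (F.cross_entropy(model(x), y)         # be right on clean images
                  + F.cross_entropy(model(x_adv), y))  # and on perturbed ones
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Whether this actually closes the hole, rather than just masking the gradient as the quote above suggests, is exactly what's in dispute.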
This response is absolute hyperbole and clearly devoid of any factual knowledge of the nature of deep conv nets and their properties.
Agreed, the failure mode may seem esoteric, but note that OpenAI is making a big deal about it.
A non-esoteric way to demonstrate the lack of generalization is to feed a deep conv network real world images (from outside the dataset). Grab a camera and upload your own photo. Roboticists who try to use deep conv nets as real world vision systems see these failures all the time.
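That experiment is cheap to run yourself. Here's a sketch using a stock torchvision classifier; the file name is obviously a placeholder for your own photo:

```python
import torch
from PIL import Image
from torchvision import models, transforms

# Standard ImageNet preprocessing for a stock pretrained classifier.
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

model = models.resnet50(weights="DEFAULT").eval()
img = preprocess(Image.open("my_photo.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    probs = model(img).softmax(dim=1)
print(probs.topk(5))   # compare the top-5 guesses to what you actually photographed
```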
"At OpenAI, we think adversarial examples are a good aspect of security to work on because they represent a concrete problem in AI safety that can be addressed in the short term."
Hardly proof that deep learning is fundamentally flawed.
Regarding real-world issues, these come up when you don't separate training and test (and real-world) sets properly. My worries would be with implementation.
I'm saying it's not reasonable to expect good generalization in deep convnets that learn mappings from static images to human labels. (Wrong problem.)
No credible machine learning researcher will tell you that deep learning has totally solved "generalizable" computer vision. The only people claiming such a broad statement are usually the media or enthusiasts who have never done any actual research.
So it might be technically correct to say adversarial examples prove (by counterexample) that deep learning fails at generalization, but nobody in the field claimed that in the first place.
It is hyperbole to claim that adversarial examples will be solved in the next few years. That is extremely unlikely, since they exist because of the linear nature of convolutions (and I don't think anyone is suggesting we get rid of convolutions entirely).
Consider the following encoding/decoding scheme: train a NN to recognize someone's face, and decode by generating random images until one of them is recognized as said face. If this works, then the Kolmogorov complexity of the network must exceed the sum of the complexities of all "stored" faces.
So it seems that there are features quite different from the features used by humans that are still similarly robust unless you specifically target them. And they also correlate well with features used by humans unless you specifically target them. Real-world images are very unusual images in the sense that almost all possible images are random noise, while real-world images are [almost] never random noise. And here I get a bit stuck: I have this diffuse idea in my head that most possible images do not occur in the real world, and that there are way more degrees of freedom in directions that just don't occur in the real world, but the idea is too diffuse for me to pin down and write out.
Yes! You're on the right track! The number of degrees of freedom of images of pixels and textures is HUGE. There is not enough data to practically learn directly from those images. So the deep networks are starved for data, even with the big datasets they are trained on. (It's only thanks to the way they are set up that they do well when tested on very similar images, like sharp hi-res photos. But they fail to generalize to other kinds of images.)
So how can you reasonably reduce these degrees of freedom?
It turns out that the continuity of reality itself provides a powerful constraint that can reduce the degrees of freedom. See, when a ball rolls along, this physical event is not just a collection of textures to be memorized. It's an ordered sequence of textures that vary in a consistent and regular way because of many learnable physical constraints (like lighting).
So, it turns out you can reduce the dimensionality by making a particular kind of large recurrent neural net learn to predict the future in video. Our very preliminary testing shows it works shockingly well.
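(Not your model, obviously, and purely for other readers: the general shape of "learn to predict the next frame" is something like the following generic sketch, with every layer size and name invented for illustration.)

```python
import torch
import torch.nn as nn

class NextFramePredictor(nn.Module):
    """Generic sketch: encode each frame, run a recurrent net over time,
    decode a prediction of the next frame. Not any particular published model."""
    def __init__(self, hidden=256):
        super().__init__()
        self.encode = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.Flatten(), nn.Linear(64 * 16 * 16, hidden))
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        self.decode = nn.Linear(hidden, 3 * 64 * 64)

    def forward(self, frames):                    # frames: (batch, time, 3, 64, 64)
        b, t = frames.shape[:2]
        z = self.encode(frames.reshape(b * t, 3, 64, 64)).reshape(b, t, -1)
        h, _ = self.rnn(z)
        return self.decode(h[:, -1]).reshape(b, 3, 64, 64)   # predicted next frame

# The training signal is just "how wrong was the predicted next frame":
model = NextFramePredictor()
clip = torch.rand(2, 8, 3, 64, 64)                # dummy video clip
loss = nn.functional.mse_loss(model(clip[:, :-1]), clip[:, -1])
```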
There are two issues with this: first, we know that the lower feature detectors of neural nets closely mimic the feature detection of the human visual cortex. Second, the features could be the same while there are technical imperfections in the later stages of current neural nets.
While it is pretty easy to add noise to another image, it isn't exactly easy to do it to a real object. The noise wouldn't remain the same as you change perspective with respect to the sign, which would likely change its effectiveness.
I'd be interested to know if this is a viable concealment strategy. It might only be effective at night or in low-light situations, so sunlight doesn't wash out the noise. It would be pretty subtle to use as well: how many people do you see walking around with their noses stuck to a screen?
* For research purposes only, of course.
Edit: I should add that these perturbations appear to be very robust to different architectures and datasets. So, the same adversarial perturbation will trick different NNs that were trained on different datasets. This suggests that it will probably be fairly robust to noise. But maybe not! I'm not aware of this experiment having been done.
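That experiment would be easy enough to sketch: take the perturbed image, push it through the kinds of jitter a real camera and viewpoint change would introduce, and count how often the attack still works (model, x_adv, and true_label are placeholders; this says nothing about what the result would be):

```python
import torch
import torchvision.transforms as T

# Does the adversarial perturbation survive camera-like changes?
jitter = T.Compose([
    T.RandomAffine(degrees=10, translate=(0.05, 0.05), scale=(0.9, 1.1)),
    T.ColorJitter(brightness=0.3, contrast=0.3),
])

def attack_survival_rate(model, x_adv, true_label, trials=100):
    fooled = 0
    with torch.no_grad():
        for _ in range(trials):
            pred = model(jitter(x_adv)).argmax(dim=1)
            fooled += int((pred != true_label).all())
    return fooled / trials   # fraction of transformed views that still fool the model
```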
From the article:
"Adversarial examples are hard to defend against because it is difficult to construct a theoretical model of the adversarial example crafting process. Adversarial examples are solutions to an optimization problem that is non-linear and non-convex for many ML models, including neural networks. Because we don’t have good theoretical tools for describing the solutions to these complicated optimization problems, it is very hard to make any kind of theoretical argument that a defense will rule out a set of adversarial examples."
Why not try training multiple models on different levels of coarse-grained data? Evaluate the image on all of them. Plot the class probability as a function of coarse graining. Ideally it's some smooth function. If it's not, there may be something adversarial (or bad training) going on.
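A rough sketch of what that could look like, assuming you already have a dict of models keyed by the blur radius they were trained at (all the names here are made up):

```python
import torch
import torchvision.transforms.functional as TF

def class_prob_vs_coarseness(models_by_blur, image, target_class):
    """models_by_blur: {blur_radius: model trained at that level of coarse graining}.
    Returns the target class probability at each level, so you can check whether
    it varies smoothly or jumps around (a possible adversarial tell)."""
    curve = {}
    with torch.no_grad():
        for radius, model in sorted(models_by_blur.items()):
            coarse = TF.gaussian_blur(image, kernel_size=2 * radius + 1)
            probs = model(coarse).softmax(dim=1)
            curve[radius] = probs[0, target_class].item()
    return curve
```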