One can always train the network with more and more scenarios, but how do you know when to stop? How good is good enough in this regard?
It doesn't really have to be perfect as long as it doesn't fail in common scenarios.
In practice it probably doesn't matter anyway: the chance of the exact required perturbation of the input happening at random is infinitesimal, due to the high dimensionality of the input. And even if it were a problem, there are ways around it.
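To put a rough number on that intuition (the dimension and seed below are arbitrary; this is a sketch, not a proof): in high-dimensional input spaces, a random perturbation is almost always nearly orthogonal to any particular adversarial direction, so random noise essentially never moves an input the "right" way by accident.

    # Back-of-envelope check: random unit vectors in high dimensions are
    # nearly orthogonal, so random noise rarely aligns with a fixed
    # adversarial direction. Dimension mimics a flattened 224x224x3 image.
    import numpy as np

    d = 224 * 224 * 3
    rng = np.random.default_rng(0)
    adv = rng.standard_normal(d)
    adv /= np.linalg.norm(adv)      # a fixed adversarial direction
    noise = rng.standard_normal(d)
    noise /= np.linalg.norm(noise)  # a random perturbation
    print(abs(adv @ noise))         # on the order of 1/sqrt(d), ~0.003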
This is a good question. My impression is that humans fail and artificial neural networks fail, but we don't know enough about the brain to say whether artificial neural networks fail in the same way humans do.
As another poster notes, humans accept human error more than computer error, and I think that's because humans have an internal model of what other humans will do. If I see a car weaving in a lane and going slowly, I have some idea of what's happening. I don't think that model would extend to a situation where a neural-network-driven car was acting "wonky".
Have you seen the AI Formula 1 series called Roborace? Once those cars get good enough to beat Lewis Hamilton or Seb Vettel, I'll trust one with me and my family.
I agree that it doesn't have to be perfect, but the standard should be higher than "doesn't fail in common scenarios." We should also expect graceful handling of many uncommon but plausible scenarios. We expect human drivers to handle more than just common scenarios, and human drivers are pretty bad.
Please do anyway.
I'm wondering whether adversarial examples can also be found for autoencoders to the same extent. It seems very intuitive that you can overstep the decision boundary a discriminative network learns by slightly shifting the input in the direction of a different, nearby label.
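For what it's worth, that "shift the input toward a nearby label" intuition is roughly what the fast gradient sign method (FGSM) from Goodfellow et al. does against classifiers. A minimal PyTorch sketch, where the model, input, and epsilon are all placeholders:

    # Minimal FGSM sketch: nudge the input a small step in the direction
    # that increases the classifier's loss for the true label.
    import torch
    import torch.nn.functional as F

    def fgsm(model, x, true_label, eps=0.01):
        x = x.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x), true_label)
        loss.backward()
        # eps-sized step along the sign of the input gradient
        return (x + eps * x.grad.sign()).detach()

Whether the analogous move against an autoencoder's reconstruction loss works to the same extent is exactly the open question.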
There are no experiments that support your statements, unfortunately.
On top of that, you're going to need to fool them over multiple frames, while the sensors get a different angle on the subject as the car moves. For example, in the first deep Q-learning paper, "Playing Atari with Deep Reinforcement Learning", they use four frames in sequence. That was at the end of 2013.
I don't think anyone will be able to come up with a serious example that fools multiple sensors over multiple frames as the sensors are moving. Even if they do, inducing an unnecessary emergency stop is still not the same as getting the car to drive into a group of people. And even if fooled in some circumstances, the cars will still be safer than most human drivers, and still have a massive utilitarian moral case in relation to human deaths, on top of the economic case, to be used.
The fooling of networks is still an interesting thing, but to my mind it's been overplayed. It's not particularly more interesting than someone being fooled for a split second into thinking a hat stand with a coat and hat on it is a person when they first see it out of the corner of their eye.
http://arxiv.org/pdf/1312.5602.pdf page 5
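(For the curious: the frame stacking described on that page just means the network sees the last four preprocessed frames at once, so a single-frame perturbation is diluted across time. A rough sketch, with illustrative shapes rather than the paper's exact pipeline:)

    # Keep the four most recent 84x84 grayscale frames and stack them
    # along the channel axis to form the network input.
    from collections import deque
    import numpy as np

    history = deque(maxlen=4)

    def stack_frames(frame):
        history.append(frame)
        while len(history) < 4:                  # pad at episode start
            history.append(frame)
        return np.stack(list(history), axis=0)   # shape (4, 84, 84)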
2. A sequence of frames does not solve the issue, because you can have a sequence of adversarial examples (although that would certainly make the physical process of projecting onto the camera more difficult, it's not really any harder than the original problem of projecting an image onto a camera at all).
3. Using something conventional like LIDAR as a backup is the right approach IMO, and I totally agree with you there. But Tesla and lots of other companies aren't doing that because it's too expensive.
2. I honestly can't think of a situation where this could occur. It's the equivalent of kids shining lasers into the eyes of airline pilots, except the kids would need a PhD in deep learning and specialised equipment to pull it off. A hacker pushing an update to the software over a network sounds much more plausible than attacking the system through its vision while it's traveling.
3. This is the real point in the end, I guess. This Google presentation (https://www.youtube.com/watch?v=tiwVMrTLUWg) shows that the first autonomous cars to be sold will be very sophisticated, with multiple systems and a lot of traditional software engineering. Hopefully LIDAR costs will come down.
2. Solving the problem of projecting an image onto a car's camera at all already implies you'd be able to sustain it for a few seconds.
Studying induced failure in neural networks may help us understand the failure modes and mechanisms of these systems.
This is why any model that lacks explanatory power can't be used in mission- and safety-critical systems. If it can't reason about things the way people can, then the system overall can't really be trusted. It's one thing when a translation from English to Spanish is wrong; it's another thing entirely when the control software of a self-driving car decides to accelerate instead of brake and the root cause analysis is people throwing their hands up and saying neural networks are inherently susceptible to these kinds of problems.
You made a very specific claim that random fluctuations could have the same effect as adversarial examples. I was addressing that.
Aren't all minds subject to the same limitation?
So when we see someone who just started driving perform well under some circumstances, we can predict good performance under circumstances that seem similar to the human mind. The problem the "fooling neural networks" experiments show is that two things that are similar for humans can be wildly different for a NN that's been trained to recognize them.
What is the accuracy of the human brain in recognizing traffic situations? It is probably not that hard to get a NN to do better, even if it still periodically causes an accident. This is the uncanny valley effect for self-driving cars: it's not enough to be better than the average human at driving, which I think they already are; they have to be perfect at driving for people to trust them.
Is there any reason they couldn't just put a driving test examiner in the car and test it like you would a human? Just ask the thing to drive around town, emergency stop, park, navigate a roundabout etc.
I think this sort of thing is something developers are going to have to find ways of dealing with; a car can be technically driving in a safe, legal way but if it's too different from how a human would drive, they are going to be a safety hazard.
Of course, standard driving behavior varies dramatically from place to place. For instance, in the United States, everyone is expected to get out of the way of whichever car has the right-of-way in that situation. In Indonesia, the car that has the right-of-way is expected to slow down, stop, or move over to accommodate other cars that do things like pull out in front of them in an intersection or pass on a two-lane road with oncoming traffic. A self-driving car in Jakarta would need to be trained very differently than a self-driving car in Seattle or Paris. Not just because the traffic laws are different, but because drivers have very different expectations about what is normal behavior.
I feel like this is already an urban myth, given the small number of people who have actually been driving around the cars. And won't some of the people at fault for hitting them try to put the blame on the robot anyway?
Is there even a credible source for what you read?
I was curious, so I went through the whole list. By my count, in the history of the program they've been involved in 19 accidents during autonomous operation, and the car was only at fault in one of those.  The majority of the other crashes were caused by other drivers rear-ending the car while it was stopped.
It's hard to argue that a car stopped at a red light violates anyone's expectation of how a human would drive.
Here is one example. It is hard to be sure exactly what happened, because Google obviously phrases its accident reports to put its cars in as favorable a light as possible.
"April 28, 2016: A Google self-driving prototype vehicle travelling westbound in autonomous mode on Nita Avenue in Palo Alto was involved in an accident. The prototype vehicle came to a stop at the intersection of San Antonio Road, then, prior to making a right turn on San Antonio Road, began to gradually advance forward in order to get a better view of traffic approaching from the left on San Antonio Road. When the prototype vehicle stopped in order to yield to traffic approaching from the left on San Antonio Road, a vehicle approaching at approximately 9 mph from behind the prototype collided with the rear bumper of the prototype vehicle."
The fact is, slow-speed rear-end collisions are really common. They happened to me several times during a year when I commuted every day, and I've done it myself to another car.
I would not be surprised if over the course of the next few years Google cars get rear-ended at light to moderate speeds hundreds of times.
About 23-30% of human accidents are rear-end collisions. It is entirely possible that the car drives so well that other types of collisions are minimized.
That leaves rear-end collisions, the type the car can't control, misleadingly appearing abnormally frequent.
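To see why that skews the statistics, here's some illustrative arithmetic with made-up numbers:

    # If rear-end collisions are ~25% of human accidents, and the car
    # avoids 80% of the other types (the ones it can influence), rear-ends
    # dominate its remaining accident mix without being any more frequent.
    rear_end, other = 0.25, 0.75
    other_after = other * (1 - 0.80)
    share = rear_end / (rear_end + other_after)
    print(f"{share:.0%}")   # ~62% of the car's accidents are now rear-ends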
Sounds like a textbook example of blaming the victim.
Plus, the whole point here, I think, is to save lives. Google self-driving cars aren't really a hazard; they're more just very annoying because they are overly cautious.
I will admit that once I became aware of a Google car coming up to pass me on my left, and I did a little jink toward it on my bicycle. It reacted conservatively but decisively. It didn't jump into another lane or slam on its brakes. It just quickly gave me some more room and gently passed.
Kind of creepy, but very cool.
"It can't be bargained with. It can't be reasoned with It doesn't feel pity, or remorse, or fear. And it absolutely will not stop, ..."
It would never leave him, and it would never hurt him, never shout at him, or get drunk and hit him, or say it was too busy to spend time with him. It would always be there. And it would die, to protect him.
I mean, sure, that's an easy problem to solve, but in that case why use cars at all and not people movers or the like?
- Was that an obscene gesture or a thank you wave?
- Did the other driver suggest I go forward or tell me to stop?
To train a CNN to do lane following we only select data where the driver was staying in a lane and discard the rest.
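(As a sketch of what that selection step could look like; the predicate and record format below are hypothetical stand-ins, not NVIDIA's actual pipeline:)

    # Keep only frames where the driver was lane-keeping; discard lane
    # changes, turns, and other maneuvers before training.
    def select_lane_following(records, is_lane_keeping):
        return [r for r in records if is_lane_keeping(r)]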
I'm inclined to agree, especially because it helps in 1) providing diagnostic information (such as the great driving visualizations shown in the video), and 2) making it easier to incorporate other algorithms and sensors (like with Google's cars) as a redundancy in case the neural network hits a crazy edge case.
Maybe the system can still be globally optimized, though, as long as individual subsystems are still verifiably correctly trained. E.g., lane detection and pedestrian detection could share some of the same convolutional layers and still be tested separately.
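Something like this, say; a minimal PyTorch sketch in which the heads, layer sizes, and output shapes are all hypothetical:

    # Shared convolutional trunk with separately testable task heads.
    import torch.nn as nn

    class SharedTrunkNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.trunk = nn.Sequential(               # shared features
                nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
            )
            self.lane_head = nn.Conv2d(64, 1, 1)        # lane-mask logits
            self.pedestrian_head = nn.Conv2d(64, 1, 1)  # pedestrian logits

        def forward(self, x):
            f = self.trunk(x)
            return self.lane_head(f), self.pedestrian_head(f)

Each head could then be validated against its own test set while the trunk is trained jointly.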
My personal prediction is that all of this 2D convolutional network stuff will be extended to 3D within a few years. The front-end will do a full 3D scene reconstruction from first principles, and then some sort of 3D features will be learned on roughly that data.
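As a purely speculative sketch of that idea (hypothetical voxel grid, arbitrary channel counts), the "learned 3D features" part might look like 3D convolutions over a voxelized reconstruction:

    # 3D convolutions over a voxelized scene reconstruction.
    import torch
    import torch.nn as nn

    voxels = torch.rand(1, 1, 64, 64, 64)   # batch, channel, D, H, W
    net = nn.Sequential(
        nn.Conv3d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
        nn.Conv3d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
    )
    print(net(voxels).shape)                 # torch.Size([1, 32, 64, 64, 64])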
In all seriousness though, I'm really enjoying the number of companies exploring the space of self-driving cars. It can only make the reality of a road full of autonomous cars come all the sooner. (Though with the rate at which I see Google's self-driving cars around Austin, you'd think they were already out for public consumption.)
More seriously, this is amazing work and I am really impressed. I just wonder if we can't get something more topical, like training lasers on mosquitoes. I feel like I did when 3D graphics was the new thing: every day there was a new advance. Now the same kind of acceleration is happening in machine learning.
Edit: Is a joke really worth this many downvotes? I mean, who hasn't had occasional trouble with Nvidia drivers and games?