I have talked to a number of different autonomous driving startups and every one...

Animats · on July 4, 2019

Yes, it's scary.

We had an entry in the DARPA Grand Challenge in 2005. It was too slow, but it didn't crash. We profiled the road ahead with a LIDAR, so we had an elevation map of what was ahead. This is essential for off-road driving, and it's the gold standard for on-road driving.

But there's a range limit. Not from the LIDAR itself. From the geometry. If the sensor is 2m from the ground, and you're looking 30m out, the angle is so shallow you can barely sense elevation. You can't see potholes. Anything that looks like a bump hides a lot of road behind it.

The Stanford team's answer to this was to use machine learning to compare the near road texture, which they could profile, with the far road texture, which they couldn't. If the near road was flat, and the far road looked like the near road, they could out-drive their profiling range. If it didn't match, slowing down brought the stopping distance down to profiling range, and they could work through difficult terrain. The 2005 Grand Challenge didn't have much difficult off-road terrain. No rock-crawling or cratered roads. The 2004 course was harder, and nobody got past mile 7. So most of the time, fast mode was usable.

Google/Waymo started from the Stanford approach, trying hard to profile the immediate surroundings. Their machine learning efforts were focused on identifying other road users - moving objects, not fixed ones. Their earlier videos make this clear.

Google/Waymo built a cute little bubble car with a top speed of about 25MPH and a LIDAR on top. At that speed, you can profile the terrain all the way out to your stopping distance, so you have solid detection of fixed obstacles. That was something that had a good chance of working well. They decided not to manufacture it, probably because it would cost far too much.

Machine learning isn't that accurate. You can get to 90% on many problems. Getting to 99% is very tough, and getting to 99.9% is usually out of reach. The killer apps for machine learning are in low-accuracy businesses like ad targeting. Or in areas where humans have accuracy in the 80%-90% range.

Here lies the problem. Humans are very good at obstacle avoidance. Much better than 99%. Machine learning isn't as good as this problem needs.

There's another side to the problem - the false alarm rate. If you build a system which insists on a valid ground profile, a piece of crumpled cardboard on the road will result in slowing down until the sensors can look down on the cardboard and see past it to solid pavement. You get a jerky ride from a conservative system. That's why Uber disabled automatic braking and killed a pedestrian. That's why Tesla's system fails to react to fixed obstacles it could potentially detect. Waymo has struggled with this. Customer evaluation of driving quality seems to be based on good lane-keeping and low jerk. These are things that trouble poor human drivers. Self-driving has different strengths and weaknesses. This is what leads to systems which seem to be doing great, right up until they crash.

What self-driving seems to need right now is a rock-solid way of detecting the lack of obstacles ahead. All we have so far is LIDAR. Radar is still too coarse and has trouble distinguishing ground return from obstacles. Even LIDAR is rather coarse-grained. Stereo vision doesn't seem to be hugely successful at this. We need that before self-driving vehicles can be trusted not to run into obvious obstacles.

If you have to recognize what the obstacle is before determining that it's an obstacle, it's not going to work.

There are a whole series of secondary problems, from left turns to double-parked cars. But those are not the ones that kill people. It's the basic "don't hit stuff" problem that is not adequately solved.

computerex · on July 4, 2019

Do you have any references to substantiate anything you have said?

Animats · on July 4, 2019

Chris Urmson's talk at SXSW is good for how Google/Waymo's system worked. The DARPA Grand Challenge is well documented. Our team's code is now on Github, just for historical interest.[1]

[1] https://github.com/John-Nagle/Overbot

kqr2 · on July 4, 2019

In your opinion, which company is taking the best approach to solving this problem?

Animats · on July 5, 2019

Don't know. Nobody in commercial self driving publishes much in the way of technical details any more.

(That's part of the price of hostility to software patents - more trade secrets.)

mli168 · on July 4, 2019

Having actually worked in these companies, I can say the actuation and controls are still deterministic algorithms. Talking on some buzzword is one thing, but actual implementation is another. I would be surprise if any of these companies have largely NN based modules in their stack for controls and lower level planning.

hnick · on July 4, 2019

Can NNs be corrected, like a child or a dog being told a firm "no" when doing the wrong thing? Can the conditions be replayed such that those conditions result in a different response with a human auditor providing the correct response?

I'm guessing the answer is currently no. Which is interesting because one of the early benefits touted for self-driving cars was that people might die, but a patch would ensure no one dies twice for the same reason (within reason, which is more than can be said for humans).

Jasper_ · on July 4, 2019

The even scarier thing is that a lot of these startups are training their neural networks on game videos like GTA or cars they get off the Unity asset store. I seriously doubt any of the artists, graphics programmers, AI developers, etc. involved in these titles think their work is suitable for safety critical systems.

yo-scot-99 · on July 4, 2019

the positions for Vision/Perception at zoox all mention nn's: https://jobs.lever.co/zoox i bet even simple questions like "how does this scenario change decisions when the lighting / sensor rotation are altered" cannot be answered.

ericd · on July 4, 2019

Changing lighting and rotation on sensor input is a pretty standard way to improve neural net performance (it's called Data Augmentation), so I'm pretty sure they could answer that.