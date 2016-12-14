
How Self-Driving Cars Work
This article misses a lot of things technology-wise. OK, I understand that.

This article also misses a lot of car companies like Daimler, who showed off self-driving car capabilities publicly in the early 2000s and on the web 3 years ago. It misses the prototypes shown off by Audi and BMW. It also misses the prototypes shown by Japanese car manufacturers.

Among technology companies, it misses Bosch, who will show off their own in-house car, and with it their expertise, at CES. It misses Delphi, who is working with Mobileye on a self-driving platform for OEMs, and it misses other Tier 1 suppliers like Conti, who have also shown their work.

By the way, the Mitsubishi Outlander is currently the best-equipped production car for self-driving capabilities: it has a stereo camera system and Lidar combined with other radar sensors. Next are the luxury cars from Audi, BMW, Daimler, and Volvo. Tesla comes last here, with only a simple radar and camera system (no stereo view) and none of the night vision capabilities the others have.

I just had a look at the website of the Outlander. It's not mentioned anywhere that it has self-driving capabilities. The sensors are used to warn the driver and intervene if a crash is imminent.

So can I conclude that the state of implementation differs between the hardware (sensors) and the software (self-driving intelligence)?

Humans only have one sensor: a rotatable stereo camera. So at least in theory the number of sensors seems not the most important element ;-)

Tesla seems to be number 1 in pushing the frontiers in marketing, so it may also be ahead in software, to compensate for what it lacks on the hardware side?

> Humans only have one sensor: a rotatable stereo camera. So at least in theory the number of sensors seems not the most important element ;-)

I hear this comment a lot when defending Tesla's choices, and it's a red herring. The fact that humans only rely on two cameras means nothing. Repeating old comments of mine:

You also don't "need" megawatts of power to play top-level Go: humans do it with 100 watts. Yet Google needed who knows how many megawatts to train and run AlphaGo on their massive server farm.

Imagine two companies competing to win at Go. One company had the attitude that megawatts of power were not necessary for training and prediction, and the other threw the biggest GPU farm they could at the problem. The second company just played top-level Go this year. The first company is ~10 years away from a low-energy elite Go computer.

Humans implicitly perform SLAM (simultaneous localization and mapping). What do I mean? Look around your room. Close your eyes. Visualize the room. As a human, you've built a rough 3D model of the room. And if you keep your eyes open and walk through the room, that map is pretty fine-grained/detailed too, and humans can keep track of where they are in the map.

Doing this accurately in moving environments (especially with lots of pure forward motion) with just two cameras is still a wide open research problem.

Doing this with LIDAR/GPS/IMU/recorded maps is solved. That's why people use LIDAR.

Matching the abilities of human perception is an insanely hard problem. Don't let cute problems like image classification fool you. Why make the problem even harder?
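
To make the "pure forward motion" point above a bit more concrete, here is a toy sketch (not taken from any real SLAM system) of how much a point shifts in the image when a pinhole camera drives straight ahead. The function name, the focal length of 1.0, and the example coordinates are all made up for illustration. A point dead ahead produces no parallax at all, so its depth cannot be recovered from that motion.

```python
def forward_parallax(point_x, point_z, forward_step, focal=1.0):
    """Image shift of a static point when the camera moves forward.

    Toy pinhole model: the camera starts at the origin looking along +z,
    then moves forward_step along +z. Returns the change in the image
    x-coordinate of the point (point_x, point_z).
    """
    before = focal * point_x / point_z
    after = focal * point_x / (point_z - forward_step)
    return after - before


# A point straight ahead (x = 0) does not move in the image at all.
ahead = forward_parallax(point_x=0.0, point_z=20.0, forward_step=1.0)

# A point off to the side at the same depth shifts measurably.
side = forward_parallax(point_x=5.0, point_z=20.0, forward_step=1.0)

print(ahead, side)
```

Depth from motion relies on this shift, so the region around the direction of travel, which is where the obstacles a car cares about most tend to be, is exactly where the signal is weakest.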

I've been wondering why no self driving systems seem to use DTAM or similar methods. Realtime dense 3D reconstruction and camera localisation on commodity hardware seems perfect for the job.

There is a big open research problem in the state of the art in visual SLAM (visual SLAM = SLAM from cameras): it doesn't work when the environment is moving (!!!).

Visual SLAM is still traditional computer vision built on linear algebra, geometry, and keyframes (including variants that incorporate GPS/accelerometer info). I think the state of the art is stereo LSD-SLAM, but I could be wrong.
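
A toy illustration of why a moving environment breaks geometric reconstruction. This is a simplified two-view triangulation under an assumed pinhole model, not how LSD-SLAM or any production system works, and every name and number here is invented for the example. Triangulation assumes the point stayed put between the two views. If the object moved, the math still returns a depth, just a wrong one.

```python
def project(point_x, point_z, cam_x, focal=1.0):
    """Pinhole projection onto a camera at (cam_x, 0) looking along +z."""
    return focal * (point_x - cam_x) / point_z


def triangulate_depth(u_first, u_second, baseline, focal=1.0):
    """Depth from two views, assuming the point did not move in between."""
    disparity = u_first - u_second
    return focal * baseline / disparity


def estimated_depth(point_x, point_z, baseline, object_shift=0.0):
    """Observe a point from two camera positions and triangulate it.

    object_shift is how far the point moved sideways between the two
    observations; the triangulation step knows nothing about it.
    """
    u_first = project(point_x, point_z, cam_x=0.0)
    u_second = project(point_x + object_shift, point_z, cam_x=baseline)
    return triangulate_depth(u_first, u_second, baseline)


# True depth is 10 in both cases; the camera moves 0.5 sideways.
static = estimated_depth(2.0, 10.0, baseline=0.5)
moving = estimated_depth(2.0, 10.0, baseline=0.5, object_shift=0.25)

print(static)  # about 10: correct for a static point
print(moving)  # about 20: the object's own motion doubles the estimate
```

In this setup the estimate works out to true depth times baseline / (baseline - object_shift), so an object moving along with the camera looks farther away than it is, and one moving at exactly the camera's speed appears infinitely far.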

A great way to explain this to people would be to ask them if they've ever seen a panoramic photo glitch, and tell them that's what the car would be seeing most of the time (well, assuming anything is moving, like a car or a person or a piece of trash).

Would you add Nvidia to your list?

This year, a colleague gave a seminar for the Nvidia people in the UK. I was not personally present. But from what I've been told, they had no clue about automotive. The only thing they have is a high-performance processor designed for embedded use, with which they want to become an automotive supplier. AFAIK, Tesla is going to be the first to use them. But I believe they currently have no automotive experience in terms of software. They are building it up right now.

I'd pass on reading this if you clicked through to the comments first. There's very little detail about how self-driving cars work. This is a very basic survey of brands. Remember the Murray Gell-Mann amnesia effect. Does anyone have real examples / links / videos of how self-driving cars work?

Firstly, there are different approaches. Google and Uber seem to have a similar LIDAR + map approach. Tesla and Mobileye have a camera-focused approach.

The CEO of Mobileye (who is a former machine learning professor) gave a very good, mildly technical talk on their approach at CVPR: https://www.youtube.com/watch?v=n8T7A3wqH3Q

What about Novideo and Baidu?

