Several other companies are working on these (Quanergy, Blackmore) too, but so far they seem to be just press releases. Hopefully we'll see some real ones soon; the current state of the art for wide-field lidar costs many thousands of dollars and is (imo) too fragile for use in production vehicles.
The allure of solid state lidar is intense though. Not needing avalanche diodes gives me a bubbly sensation around my prostate. There is probably no such thing as cheap lidar without solid state.
No laser forms a perfectly straight line, but 30 degrees is more like a floodlight. It makes it very difficult to take measurements; you have to apply a complex filter over everything. Basically all of the data is massively blurred when you get it and has to be deconvolved, which is never perfect. It's very hard to turn a blurry image into a sharp one.
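For intuition, here's a toy 1-D sketch (numpy only, entirely hypothetical numbers, not anyone's actual pipeline) of why that deconvolution step is lossy: blur two sharp returns with a wide beam profile, add a little noise, and a Wiener-style inverse filter only partially recovers them.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 256

    # "True" scene along one scan line: two sharp returns close together.
    true = np.zeros(n)
    true[80], true[90] = 1.0, 0.6

    # A wide-divergence beam acts like a broad blur kernel (made-up width).
    kernel = np.exp(-0.5 * (np.arange(-32, 33) / 8.0) ** 2)
    kernel /= kernel.sum()

    # What the sensor actually sees: blurred returns plus noise.
    blurred = np.convolve(true, kernel, mode="same") + 0.01 * rng.standard_normal(n)

    # Wiener-style deconvolution in the frequency domain.
    H = np.fft.fft(np.fft.ifftshift(np.pad(kernel, (96, 95))))  # kernel centred at sample 0
    G = np.conj(H) / (np.abs(H) ** 2 + 1e-2)                    # 1e-2 ~ assumed noise/signal ratio
    recovered = np.real(np.fft.ifft(np.fft.fft(blurred) * G))

    # The two returns come back broadened and noisy, not as clean spikes.
    print(recovered[75:95].round(2))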
> Velodyne today announced a solid-state automotive lidar ranging system that the company will demonstrate in a few months, release in test kits later this year, and mass produce at its new megafactory in San Jose, Calif., in 2018. The estimated price per unit is in the hundreds of dollars.
Advanced Scientific Concepts has had good flash LIDAR units for sale for years. They just cost too much. They sold them to the DoD and SpaceX.
Continental, the German auto parts maker (a very big company, not a startup), has purchased the ASC technology and expects to ship in volume in 2020. Here's a Continental prototype mounted on a Mercedes. It is mounted at bumper height, has a 120 degree field of view, and has only 30m range. So this is for city driving or slow driving in tight spots.
Continental says they intend to ship in volume in 2020. Nobody is yet ordering units in the kind of volume a major auto parts manufacturer needs.
Google uses a high-mounted LIDAR with longer range as well as the bumper height sensors. This doesn't address that market.
Yeah I'm not impressed with Velodyne's lidars either. The work at Continental is going, quietly, to volume production.
For more range, you need bigger collecting optics, which means a bigger unit, or a narrower field of view. The tradeoffs are straightforward.
You also have to spread the laser output over a wider area to keep it eye-safe. The laser eye-safety requirement is on power through a 1/4" hole, corresponding to the pupil size of an eye. This protects people staring directly into the emitter. The power can be greater if the beam is wider. If you devoted the top inch of the windshield to sensors, and spread the laser output over a wide area of windshield, the power could be much higher.
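As a very rough back-of-the-envelope version of that scaling argument (assumed numbers, and not a substitute for the actual IEC 60825 eye-safety calculation, which also accounts for exposure time, wavelength, divergence, and viewing conditions): if the power density at the exit aperture is held at whatever level keeps the power through one pupil eye-safe, the total emitted power scales roughly with the area ratio.

    import math

    PUPIL_D_MM = 7.0                               # ~1/4", the measurement aperture the comment refers to
    pupil_area = math.pi * (PUPIL_D_MM / 2) ** 2   # mm^2

    # Hypothetical: spread the emission over a strip 25 mm tall ("top inch of
    # the windshield") and 1000 mm wide instead of a single small emitter.
    strip_area = 25.0 * 1000.0                     # mm^2

    # If power density is held at the eye-safe limit, total power scales ~ with area.
    scale = strip_area / pupil_area
    print(f"roughly {scale:.0f}x more total optical power for the same power through a pupil")
    # -> roughly 650x, under these simplifying assumptions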
I ordered several of their current lidars 5 months ago and now it looks like the total delivery time will be about a year after the order. And they're mostly unresponsive about what is going on.
Apparently some folks have heard that they're having some sort of supplier issue that they're not doing a good job of working through or finding a replacement for.
Our world around us is 3D. A camera projects this information onto a 2D plane and throws away a lot of information in the process.
When we move in the world we need to know where obstacles are relative to us. We infer this from all the sensory input we receive and have a pretty good idea how far things are away from us because we have learned how size relations should be.
A computer could and can (to some degree) infer this information from regular 2D videos as well, because in the end most algorithms that do obstacle detection/avoidance have to extract positions in 3D to know exactly where those obstacles are.
A LIDAR (and other depth-sensing technologies) provides a distance for each measurement (sometimes the distance is the only information, but some sensors provide intensity, color, etc. as well).
So we immediately know the position in space of the measured point and thus can more or less directly extract obstacles in 3D space relative to the robot/car/drone without going the detour of "imagining" 3D information from the projected 2D image.
So the comparison is:
3D world -> 3D sensor -> 3D points -> 3D obstacles
versus
3D world -> 2D sensor(s) -> extraction of depth -> 3D points -> 3D obstacles
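To make the "3D sensor -> 3D points" step concrete, here's a minimal sketch (illustrative parameter names, not any particular driver's output format) of turning one lidar return, given as azimuth, elevation, and range, directly into a point in the sensor frame:

    import math

    def lidar_return_to_xyz(azimuth_deg, elevation_deg, range_m):
        """One lidar measurement -> (x, y, z) in the sensor frame, x forward."""
        az = math.radians(azimuth_deg)
        el = math.radians(elevation_deg)
        x = range_m * math.cos(el) * math.cos(az)
        y = range_m * math.cos(el) * math.sin(az)
        z = range_m * math.sin(el)
        return x, y, z

    # A return 20 m away, 10 degrees to the left, 2 degrees below the horizon.
    print(lidar_return_to_xyz(10.0, -2.0, 20.0))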
No reason to stick with our limitations.
Lidar is great for distance representations, but requires a highly consistent and exact timing source. Dual optical inputs allow us to approximate the same data through other methods that don't require nearly as exact timing.
Just a small FYI:
If you are doing time-of-flight measurement, then yes. But many (not all) LIDARs out there don't do time-of-flight, because to get the resolution wanted or needed, you need an oscillator (and sensors for the return pulse) capable of many, many gigahertz (it gets absurd quickly if you want anything below a resolution of around 10 cm). This isn't easy or cheap (the oscillator actually is fairly easy and cheap - it's the return sensor that needs a very fast rise time that is expensive and difficult, though such devices do exist - some all solid-state, like flash LIDAR).
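For a sense of those numbers (straightforward physics, nothing vendor-specific): the round-trip timing resolution needed for a given range resolution is delta_t = 2 * delta_d / c, so centimetre-level ranging means sub-nanosecond timing.

    C = 299_792_458.0  # speed of light, m/s

    def timing_resolution_ns(range_resolution_m):
        """Round-trip timing resolution needed for a given range resolution."""
        return 2.0 * range_resolution_m / C * 1e9

    for d in (1.0, 0.10, 0.01):  # 1 m, 10 cm, 1 cm
        dt = timing_resolution_ns(d)
        print(f"{d*100:5.0f} cm resolution -> {dt:.3f} ns timing (~{1/dt:.1f} GHz)")
    # 10 cm already needs ~0.67 ns timing, i.e. roughly 1.5 GHz-class electronics.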
What many LIDAR sensors use (not sure about Velodyne) is actually a kind of phase-shift measurement of a laser modulated with a carrier wave. It still needs a high-speed multi-gigahertz oscillator to resolve small changes in distance, but everything else is fairly off-the-shelf and nothing real special. One downside of the method is an ambiguity between the farthest and minimum distances measured: when comparing the received modulated beam to the outgoing modulated beam, you can't tell at such a point what the distance is (because both distances look the same).
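That ambiguity falls out of the modulation frequency directly: the unambiguous range of a phase-shift measurement is c / (2 * f_mod), after which the phase wraps and a far target looks like a near one. A quick illustration with assumed modulation frequencies:

    C = 299_792_458.0  # m/s

    def unambiguous_range_m(f_mod_hz):
        """Max range before the modulation phase wraps around."""
        return C / (2.0 * f_mod_hz)

    for f in (10e6, 100e6, 1e9):
        print(f"{f/1e6:6.0f} MHz modulation -> ambiguity every {unambiguous_range_m(f):7.2f} m")
    # Higher modulation frequency = finer resolution but a shorter unambiguous range,
    # which is why such sensors often cycle through several frequencies.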
Units like this:
that go for $75 (or less) do this. They're super accurate; the only downside is that it takes a second to cycle through all the necessary frequencies.
But sensors like Velodyne, SICK and I think Hokuyo are pulsed and measure the actual time it takes for the pulse to travel from the emitter to the target and back to the detector.
I wonder: if we started heavily colonizing Mars 100 years from today, would we bother to build roads?
At first people would anyway live by whatever resource they were using. If a deposit ran out they might choose to build a haul road over moving their entire settlement.
Cue action music.
It's a sucky situation, but it makes for an awesome video game!
For that matter, neither can a camera (or multiple cameras), but I just wanted to point it out, because it is something that can be useful to know in self-driving vehicles, and isn't often considered or mentioned.
(Disclaimer: Cynical Plug)
This is exactly the sort of AI project that I'm working on: being able to recognize patterns even with lots of noise or incomplete information. I tried a Show HN submission but there was absolutely no interest. :(
Question: Does anybody know if Google is at this level yet? Would Google's AI be able to recognize the dalmatian?
I think my AI algorithm would be able to recognize the dog but I'm still not at that stage yet. Really need to get more computer power before that. Thanks for the image though, you've given me some really good testing ideas.
Pro-tip: for a moving viewer, anything in a 2D image that doesn't move horizontally or vertically is on a collision course with the camera. This is why deer seem to "come out of nowhere": they are not moving within the field of view as they run toward the car, and can actually remain hidden behind the A-pillar the entire time they are running toward the road, right up until you hit them.
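A tiny pinhole-camera sketch of why that's true (toy numbers, not any real perception stack): for an object on a pure collision course, X/Z stays constant as Z shrinks, so its projected image coordinate doesn't move even though it's getting closer.

    # Pinhole projection: image coords are u = f * X / Z, v = f * Y / Z.
    f = 1000.0  # focal length in pixels (arbitrary)

    def project(x, y, z):
        return f * x / z, f * y / z

    # A deer 4 m to the left at 40 m out, closing along a straight line that
    # ends at the camera: its lateral offset shrinks in proportion to range.
    for z in (40.0, 30.0, 20.0, 10.0, 5.0):
        x = -4.0 * (z / 40.0)   # collision course: X/Z is constant
        print(f"range {z:4.0f} m -> image u = {project(x, 0.0, z)[0]:.1f} px")
    # u stays put at -100 px the whole way in; nothing appears to "move".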
Apparently not, based on trying out the dalmatian image here:
Also, Lidar can avoid some of the optical illusions that will cause humans to crash. E.g. 
And, hopefully, precise data will let future computerized cars drive more precisely than humans can.
I imagine lidar gets you one step closer to the data you're really after.
Given that, it would be foolhardy to restrict ourselves to vision and sound only because humans can get by with mostly just those. Better sensors push down the intelligence requirements for complex behavior. For example, insects as a group have a wider variety of sensors than any other group of organisms. With pin-sized brains and milliwatt power, they forage, hunt, fly, walk and some even learn or communicate with an elaborate (combinatorial) language. Their array of specialized sensors plays a large role in allowing for their surprisingly rich sets of behavior.
While I'm here, I'll also note that optical illusions are not things to get fooled by; rather, they show the strength and assumptions of our predictive world models. Brains do not see the world as it is (see the checker shadow illusion and inattentional blindness); they predict from, smooth, filter and adjust their inputs to act more effectively in the world. All inferential systems must have particular structural biases to do well in their target domain. In animals, between what is evolved and what is learned from the early environment, expectation and experience end up top-down affecting what we see. It's very non-trivial -- many were surprised to learn, for example, that even the "simple" retina receives inputs from non-visual areas (and the details differ from species to species).
This is used a lot in mapping and research into e.g. forest canopies in the Amazon. LIDAR has been extensively deployed on aircraft for this kind of purpose.
I've seen high-res LIDAR scans made available by my country's state survey for e.g. archaeologists, and they show the 'true ground' through the vegetation. It's like the vegetation has been stripped away; it has a clinical feel when rendered.
Some lidars also offer two distance estimates per point, a first distance and a secondary distance. On the lidar I used that had this capability, the secondary distance was not always robust. But on permeable materials like vegetation, the secondary distance would typically indicate solid materials behind the initial curtain of material.
Anyway, the presence of multiple distance returns in a small solid-angle area (whether from the above primary/secondary, or from nearby scan points) can indicate density of vegetation. You can certainly see "through" the small gaps in a hedge. Depending on the lidar, and the range, gaps as small as a few cm^2 could be enough (talking about vehicle-mounted lidar for autonomous navigation, not remote sensing of forest canopy).
The difficult part however is pairing the points between the images. You and I do that easily, because when we see, for example, the top left corner of our refrigerator, we can easily identify that same point in both images, probably because we recognize the objects we are looking at. A computer has more trouble pairing the points.
Lidar is a way of cheating, by temporarily shining a dot on the locations so that the points can be found and matched easily in both images.
To get accurate range through stereo, you need two calibrated cameras, rigidly mounted (with respect to each other), and with a high resolution (pixels x pixels in each image of the stereo pair). All this implies very expensive sensors, large camera rigs, and lots of image processing. To get acceptable stopping distances for a vehicle going 60mph using stereo only to detect obstacles requires things like 1.5-meter camera bars, 2048 pixels per image in the stereo pair, and onboard computing to compute stereo correspondence at 30 or 60 fps. It's hardware-intensive!
TBH, I forget how the stereo range error scales with range - I think it is linear with range, but may be super-linear. This can be a problem for mapping. I think lidar is superior in this regard, in other words, its error scaling is sub-linear with range.
If you're used to using only stereo, the concept of having a lidar for that point-range measurement looks pretty magical. Of course, stereo vision offers some advantages relative to lidar - it's a passive measurement, for example.
e_z = e_d * Z^2/(b*f)
e_d is the disparity error (i.e. matching error) in metric units (i.e. a few microns), Z is distance, b is baseline, f is focal length.
It's relatively easy to estimate the accuracy you can achieve on a car because the maximum baseline is fixed to < 2m typically. You can plug in reasonable fields of view, sensors etc. Typically you can assume 0.25 px matching accuracy, assuming the algorithm does sub-pixel interpolation.
Example: e_d = 2.2µm, Z=30m, b=1.5m, f=3mm (for a 2000px wide sensor that's 70 degrees FOV): e_z = ~40 cm. That's not too bad - enough to identify something big. At 100 m you'd have a disparity of 20 pixels. If you had a search radius of 256 px you'd have a close-range of 8m.
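Plugging those same numbers into the formula, as a quick sanity check (the parameter values are the ones assumed above, nothing new):

    # Stereo depth error: e_z = e_d * Z^2 / (b * f)
    e_d = 2.2e-6   # disparity/matching error, m
    b   = 1.5      # baseline, m
    f   = 3e-3     # focal length, m
    px  = 2.2e-6   # pixel pitch, m

    def depth_error(Z):
        return e_d * Z**2 / (b * f)

    print(f"error at 30 m:  {depth_error(30.0)*100:.0f} cm")     # ~44 cm
    print(f"error at 100 m: {depth_error(100.0):.1f} m")         # grows with Z^2

    disparity_100m_px = (b * f / 100.0) / px
    print(f"disparity at 100 m: {disparity_100m_px:.0f} px")     # ~20 px

    # With a 256 px search window the closest resolvable range is b*f/(256*px):
    print(f"closest range with 256 px search: {b * f / (256 * px):.1f} m")  # ~8 m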
Nowadays the compute problem isn't too bad - you just throw a GPU at it (or an FPGA, but you have memory limitations there).
LIDAR has a more or less constant error with range, provided you've got a high enough signal-to-noise ratio. However, this does mean that at short distances LIDAR is relatively poor compared to stereo.
Visual SLAM is an open research problem, and lidar lets companies sidestep a lot of those problems. I think that's a reasonable bet: falling prices for an existing technology is more predictable than hoping for algorithmic/research advances.
Probably some day "true" native neural network consumer hardware will exist, and I would expect it to be much more efficient. It would be a complete paradigm change from what we have now though, so it's probably going to be awhile.
...and while not inexpensive, and not quite the same as "neural network hardware", Nvidia does sell several deep learning platforms (beyond their GPU offerings):
That's 0.02 FLOPS/W.
RIKEN apparently is at about 7 GFLOPS/W. That's roughly 350 billion times more efficient.
Not a fair comparison, but that's kind of the point.
This argument loses a lot of its bite after AlphaGo beats the top humans.
You could argue that you still need lots of energy for training: even a 100-watt human needs to live for 30-40 years before he's a top Go player. So it's not unreasonable that accelerating the process might use more power. And if you have all that power for training, you might as well use it for inference.
You can probably build a bad Go playing machine in much less than 100 watts. There might be a way to extract something more power-efficient from AlphaGo, but it doesn't seem to be important to anyone.
Humans are also unable to learn effectively beyond roughly 4 hours per day, due to limits on attention we don't yet understand. A talented human can reach pro level in about 8 years, enough to beat a neural-net-only player. The total energy in joules spent learning is still about an order of magnitude less.
Honestly both seem a little silly though: would an early, poorly trained version of AlphaGo confirm the need for lidar because it doesn't beat top humans?
It's much more likely that a stereo vision based system will give you false positives due to an incorrect feature match.
The main downside compared to cameras is that LIDAR data is much sparser. You can get a 1 MPt/sec LIDAR, but that's normally spread over an entire hemisphere, so in one camera's field of view you might only get 20k measurements compared to millions of pixels.
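Rough arithmetic behind that (the camera field of view and frame rate are assumed, just to show the order of magnitude):

    import math

    points_per_sec = 1_000_000            # "1 MPt/sec" lidar
    hemisphere_sr  = 2 * math.pi          # solid angle the lidar spreads over

    # Hypothetical camera: 70 x 50 degree FOV, lidar grouped into 10 Hz "frames".
    fov_sr   = math.radians(70) * math.radians(50)   # small-angle approximation, ~1.07 sr
    frame_hz = 10

    pts_in_fov_per_frame = points_per_sec * (fov_sr / hemisphere_sr) / frame_hz
    print(f"~{pts_in_fov_per_frame:,.0f} lidar points per camera-sized view per frame")
    # -> roughly 17,000 points vs. the millions of pixels a camera delivers per frame.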
In the end you need both (plus RADAR, ultrasound and everything else). Cameras will always be required for tasks like lane marking detection, reading road signs and visual identification of obstacles like pedestrians.
By introducing more dimensions of sensing (LiDAR, Sonar etc.) as some others put it, it should theoretically help judgement during driving and thus improve the safety aspect.
But I find it fascinating that you can already ask "we can do it, so why can't computers?" You're just a couple of years too soon. :)
And we're kinda shitty at it, killing large numbers of people every year doing it.
In 50 years, we'll be shocked humans were ever allowed to do it.
But sure, self driving cars might also be a solution. Who knows. It might certainly be a faster solution.
I'll be the first to say I'm not a great driver. I can and have driven long distances, but I always know I can do better. But all of our training here in the US to pass the test is mainly "ad-hoc". There isn't any kind of "certified training" program that you have to take and pass in order to then get a license. But there probably should be.
Honestly, what there really needs to be is such a program, but done over the course of weeks at a facility like what is used at the Bondurant Racing School (https://bondurant.com/). Having that kind of knowledge and experience to really know your vehicle and its capabilities (both what it can and can't do, and how to handle an emergency) might go a long way toward making better drivers (now that I think of it, I might have to go to that school myself - assuming I can afford it).
I think if we did do a better, stricter job at teaching drivers how to drive, and requiring it for a license, it would go a long way to cutting down on accidents. It wouldn't eliminate them, but it would surely reduce them, probably severely. Learning to keep control of emotions, and also handle stressful situations while driving would have to be a part of this as well.
It wouldn't be a perfect solution, but it would probably be a damn sight better than what we currently do.
Road fatalities per billion vehicle-km:
USA and territories: 7.1 (includes Guam, Samoa, Puerto Rico, etc.)
USA, Massachusetts: 3.25
S. Korea: 16
Yes you can infer distance using parallax (2 regular cameras, like our eyes). But if you can directly SENSE depth using a sensor capable of it (Radar/LIDAR), then you have a lot more certainty, redundancy, simplicity, and can even sense in the dark where cameras aren't able to effectively judge distance.
But LiDAR isn't necessarily enough either. A lot of scanners end up being very noisy, or have a limited range, whereas cameras are good as long as there's line of sight. Using the two technologies together makes a whole lot more sense than just using one or the other.
Disclaimer: I am a graduate student who works with LiDAR, photogrammetry, and 3D imaging technologies.
That said, you bring up a good point: Humans do a pretty good job driving vehicles with only a pair of eyeballs (usually), ears, and "feel" (I'm not sure how to put this last one, but a good driver tries to stay and feel "as one" with the vehicle, getting information about the road, various forces internal and external about the car, etc - in order to drive it properly and in control while making various maneuvers). So why couldn't we do the same with a self-driving vehicle?
Well - the truth is - we can:
I mention that paper so often in these kinds of discussions, that I am sure people are getting sick of it. Note that in that paper, only a single camera is used - like driving with one eye closed (or blind in one eye), yet it still was able to do so properly. Part of Nvidia's purpose here is to advance self-driving vehicles, but they also are trying to sell self-driving vehicle manufacturers on their tech as well:
If the need for expensive and exotic LIDAR systems can be overcome, and simpler systems like cameras and radar can instead be used, these cars will be both cheaper to manufacture and more reliable (as the article noted, it's one thing to make a LIDAR - making a reliable LIDAR for automotive use over hundreds of thousands of miles/kilometers is a different ball of wax - but manufacturers already have experience with making radar and cameras for vehicles robust).
For another take on using cameras for autonomous navigation - there's this project using high-speed winged drones (and completely open source - including the drone design files):
That one only uses a couple of small cameras for stereo vision.
These two projects (and others if you care to search) both show that such a system is possible; we probably don't need LIDAR for the majority of driving scenarios. Radar can probably cover other areas. You also mention the concept of humans using hearing - honestly I am not aware of any self-driving vehicle research on that kind of sensing, but if none exists, it seems like it might be a fascinating and promising area of study for someone! Audio sensors would also be another one of those "simple and robust" sensors for automobile usage that manufacturers would like.
That said: If they can make a simple but robust LIDAR system, with no moving parts (such designs do exist, like flash LIDAR, for instance), and a low-cost (that part is key), having that on a vehicle certainly can't hurt. There are areas (particularly at night, or in inclement weather, or in other low-visibility situations) where a camera alone will have a similar trouble as a human would (though I wonder if a camera operating at other wavelengths (FLIR, Near-IR, and Near-UV for instance) could help?)
In those situations, having a LIDAR might be the key to a safer driving experience for a self-driving vehicle, whereas a human without that ability might make a wrong (and potentially fatal) decision.
They certainly claim that, but have they got the test data to back that up? All the companies that have actually demonstrated truly self driving cars under real world conditions use Lidar.
Who says? The chattering masses on HN?
Tesla has certainly deployed a lot of assisted driving systems, but then again, so have a lot of car companies. Lane-keeping and auto-braking are not new, and to date, that's all Tesla has actually shipped.
People who know better have serious doubts that you can do full autonomy with only video/radar input under real-world driving conditions (like darkness). That's why most of their competitors are using LIDAR.
I really hope Velodyne delivers. Quanergy has a nice site but seems like vaporware in the sense that you can't actually buy it.
A $100 light weight Lidar is truly game changing for robots, self driving cars and drones.
This is probably going to end up like the Oculus Rift; it won't be $100.00 - it will probably end up north of $500.00, possibly north of $1000.00.
If it were easy, SICK or Hokuyo would have done it already. The fact that neither has can mean many things, of course, but I bet one of the big ones is that it isn't easy to pack a 3D LIDAR into a small package and make it robust and cheap. Both of those companies' 2D LIDAR solutions already hit the robust portion; Hokuyo's offerings hit the small-package portion (SICK's 2D systems are mostly the size of a coffeemaker - I own a couple), but neither company hits the low price mark.
That could also mean that they have a niche market that's willing to pay those prices, but given the interest and want for fast 3D LIDAR for self-driving vehicles and other uses, the fact that they don't have anything out is telling.
Now doing it all solid-state? Well - there are companies that have these systems as well (supposedly at least) - called "flash LIDAR": essentially firing a laser to "flash" the scene, then using a grid array of CCD-like high-speed elements to measure the per-element delay between the flash and reception. From what I've seen, even the low-resolution modules make the former two companies' offerings look dirt-cheap in comparison...
Not an exact analogy, but this reminds me of ancient "software modems", which shaved off a few chips and offloaded processing to the CPU. They were cheap, but had a real impact on your computer's performance.
The trouble with image processing is that it seems basically impossible to perfect with the level of certainty you'd need for 100% autonomy, whereas LIDAR gives you very straightforward data to act on. You'd still need the cameras for recognizing traffic lights and such, of course.
(Edit: whoops, embarrassing: totally missed the "hundreds of dollars." Sorry!)
By having hundreds of wiggles you average out the fabrication error in any individual wiggle and produce an extremely controllable beam. Of course they're still working on the coherent part.
Electro-optic materials are more like LCDs. Light interacts with the material and the material interacts with electricity. In a cathode ray tube there are electrons, not light, and they interact directly with the electric and magnetic fields. CRTs also need a dozen-odd fields to form a good beam, or 30+ for things like electron microscopes.
I mean - obviously you could just create a stationary disc with lasers mounted all around it and switch between them electronically, but that hardly seems cheaper.
Visible light is radio, just with different wavelength, so the principle is the same. The much, much smaller wavelength (centimeters vs sub-micrometer) makes it harder to build a device, though.
 Velodyne Says It's Got a "Breakthrough" in Solid State Lidar Design (Dec 2016) [ http://spectrum.ieee.org/cars-that-think/transportation/sens... ]
 [ http://www.businesswire.com/news/home/20170419005516/en/Velo... ]
Car companies are getting used to reading data from standard Lidar. Getting them to suddenly dump it for Solid-State Lidar may be a step too fast and they would rather go through a transition period first (standard + solid-state) until they are happy with the performance of solid-state lidar.
You are referring to this in the article, right? “I don’t necessarily believe that [the solid-state lidar] will obviate or replace the 360-degree units—it will be a complement,” Marty Neese, chief operating officer of Velodyne, told IEEE Spectrum earlier this month. “There’s a lot of learning yet to go by carmakers to incorporate lidar in a thoughtful way.”
The ROS tf2 (tf for "transformations", not TensorFlow) library allows you to basically input a 3D model of your vehicle, like you might get from CAD, add the pose and location of your various sensors, and it will automatically handle the spatial transforms required to build a singular world model for you.
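A minimal sketch of what using it looks like in ROS Python (frame names like "velodyne" and "base_link" are just assumed examples; the transform tree itself would come from your URDF / static transform publishers):

    import rospy
    import tf2_ros
    import tf2_geometry_msgs  # registers transform support for geometry_msgs types
    from geometry_msgs.msg import PointStamped

    rospy.init_node("lidar_point_to_vehicle_frame")

    buf = tf2_ros.Buffer()
    listener = tf2_ros.TransformListener(buf)  # fills the buffer from /tf in the background

    # A point measured in the lidar's own frame (assumed frame name).
    pt = PointStamped()
    pt.header.frame_id = "velodyne"
    pt.header.stamp = rospy.Time(0)          # "latest available transform"
    pt.point.x, pt.point.y, pt.point.z = 12.0, -1.5, 0.3

    # tf2 walks the transform tree (sensor mount -> chassis -> ...) for you.
    pt_in_base = buf.transform(pt, "base_link", rospy.Duration(1.0))
    rospy.loginfo("in base_link: %s", pt_in_base.point)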
1) We don't just rely on our vision when driving but also on sound (nearby cars) and touch (feel of the road). I actually haven't seen any self driving car projects talk about this aspect which is interesting.
2) Self driving cars can't just be as good as humans; they need to be effectively perfect. Quite a lot of people think they are amazing drivers (i.e. infallible), so mistakes from a self driving car aren't going to go down well.
Are driverless cars even measuring sound, vibration, and G-Forces? I'd like to better understand how those play into the whole sensor fusion of these systems.
Tesla famously doesn't use LIDAR, saying cameras (possibly with parking radar) are good enough.
Here's a nice summary: https://cleantechnica.com/2016/07/29/tesla-google-disagree-l...
With LIDAR, well, there's a new type of LIDAR coming out. Then there may be different successor designs and form factors.
Active sensors are needed for human safety applications; passive doesn't make sense.
NVIDIA is also betting on CV for self driving cars.
The Mind's Eye by Oliver Sacks has an interesting chapter on a woman who managed to develop stereovision in her late 40s through vision therapy.
I'm pretty sure I'm stereoblind. I think my brain is too lazy to combine images into 3D. Pot seemed to kick my brain into overdrive and process that signal. My friend laughs at me for "seeing in 3D for the first time" and doesn't believe me.
Modern photographic lenses tend to contain a large number of lenses in series, a few of which often have an exotic property like being apochromatic, aspheric, or made of fluorite, but such well-corrected lenses may be counter-productive for machine vision. Phase detection in a DSLR relies on separate collection sites like stereo vision, but autofocus still hunts under bad conditions.
The hack people used to do (maybe still do?) was to combine a laser pointer and a webcam, and infer the distance based on where in the Y axis the laser appeared in the image that the webcam received.
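That hack is simple triangulation. A rough sketch of the math (assuming the laser is mounted a fixed offset below the webcam and aimed parallel to its optical axis; all numbers hypothetical):

    def range_from_laser_dot(pixels_from_center, baseline_m=0.06, focal_px=700.0):
        """
        Single-point laser/webcam rangefinder.
        The dot from a laser mounted baseline_m below the camera, aimed parallel
        to the optical axis, appears pixels_from_center below the image center;
        similar triangles give distance = baseline * focal_length / pixel_offset.
        """
        if pixels_from_center <= 0:
            raise ValueError("dot at or above center => target effectively at infinity")
        return baseline_m * focal_px / pixels_from_center

    # The closer the target, the farther the dot sits from the image center.
    for px in (5, 20, 80):
        print(f"dot {px:3d} px from center -> ~{range_from_laser_dot(px):5.2f} m")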
The lidar in the article, which measures many points in many directions, is much, much better, and it sounds like it will be cheap (at least much cheaper than these devices have been).
Right now there's not much in the way of LiDAR on anything, but once there is, we will need to find ways to stop other active sensors from interfering with our own.
Totally not my field at all, was just daydreaming a bit. Thanks for the info.
Garmin has one on the market for $150 https://www.sparkfun.com/products/14032?gclid=CIzZ86-fs9MCFQ...
That LIDAR-Lite is only a single "pixel". The Velodyne press release doesn't specify the resolution, but I'm certain it's > 1x1.
Sometime back Quanergy made the same splash that Velodyne is making now. When you talk to them, their price points are very high, compared to what they were talking about in their press release.
I am still waiting for a decent Lidar unit that is dependable, and costs < $500 (with > 1k points per second spread across the sphere).
Their spinning lidars have 16 to 64 vertical beams and a horizontal resolution that depends on the rotation speed, with a sampling rate of 300,000 to 2.2M points/sec.
edit - totally missed how many people already pointed this out, sorry.
The device in the article maps a much larger field of view.
 "Quanergy Announces $250 Solid-State LIDAR for Cars, Robots, and More" (Jan 2016) [ http://spectrum.ieee.org/cars-that-think/transportation/sens... ]
But, as I understand it, to do something useful with that, you need to be pulling in a lot of data in real time and processing it. So it may not be the Lidar unit itself that's the weight constraint.