
That was hilarious. Basically (unless this needs a reframing/realignment/repositioning/reorienting):

Q: "are less sensors less safe/effective?"

A: "well more sensors are costly to the organization and add more tech debt so safety is orthogonal and not worth answering".




Uh, that's not at all a good paraphrase.

Q: "Does [removing some sensors] make the perception problem harder, or easier?"

(note, this is literally what Lex asked, your restatement is misleading)

A: [paraphrasing] "Well more sensor diversity makes it harder to focus on the thing that I believe really moves the needle, so by narrowing the space of consideration, I think we'll get better results"

Karpathy might not be telling the truth, I don't know. But it's a much more credible pitch than you make it sound, because it's often true that you can deliver better by focusing on a smaller number of things. Engineering has always been about tradeoffs. Nobody is offering Karpathy infinite money plus infinite resources plus infinite time to do the job.

Again, I'm not saying Karpathy is honest or correct. I'm saying that the rephrasings in this comment and this thread are hilariously unfair.


It is definitely a clever marketing pitch, as there is plenty of evidence to back up that LIDAR makes self-driving cars significantly safer. However, despite the hype, Teslas aren't really self driving cars at the moment, so it seems an acceptable commercial decision wrapped up in a clever sales pitch.


That's also true for high resolution maps. The question is whether you're solving for self-driving on highways or a handful of mapped city centers or whether you want to solve for the real thing. Tesla is all-in on FULL self driving, and most other companies are betting on driver assistance or gps-locked self-driving. If Tesla can get FSD to work in the next couple of years then they're vindicated. If FSD requires a weak form of generalized intelligence (plausible) then FSD isn't happening anytime soon and investing in more sensors and GPS maps is correct.


High resolution maps do not give you an accurate 3D representation of nearby objects.

Our brains do an amazing job interpreting high resolution visual data and analyzing it both spatially and temporally. Our brains then take that first analysis and apply a secondary, experiential, analysis to further interpret it into various categories relevant to the current activity.

What I’ve seen from Tesla so far indicates to me that FSD shouldn’t be enabled regardless of what sensor package they’re using, let alone based on camera data only. They need to solve their ability to accurately observe their surroundings first, especially temporally. Objects that have been clearly visible to the human eye the entire time shouldn’t be flashing in and out. Additionally, this all ignores the experiential portion of driving. When most people approach something like a blind driveway or crosswalk obscured by a vehicle (a dynamic, unmapped, situation), they pay special attention and sometimes change their driving behavior.


Actually that's a fair point. My goal was only for simplification and not intended to malign.


> because it's often true that you can deliver better by focusing on a smaller number of things.

This is true / dogma in linear / non-linear regression world, but of no real import in deep learning or Bayesian methods.


There seem to be 2 points of view here. One technical (sensors and the algorithms), one organisational (people and teams working on the problem).

My understanding is that by focusing on fewer things (vision only), they bet to make progress faster because of the simplified organisational aspect.


I think they’re talking about number of different systems doing the same thing. Have one system doing it that is sufficiently abstracted away from a common set of hardware vs various systems competing for various aspects of control.


Sorry, it's your opinion that researchers and/or engineers working on DL or Bayesian methods work better when they're distracted by many diverse tasks? What?


No, it's my opinion that in linear regression an inordinate amount of time is spent on feature selection and ensuring there are no correlations among the features. When data is cheap in both X and Y, winnowing down X is a lot of work.


Munro’s cost breakdown is much more informative in just how much it’ll save in terms of parts/labor. https://youtu.be/LS3Vk0NPFDE

In general the ‘harm to consumers’ is really just making it more likely they damage the car in a parking lot or their garage, which tells you where their priorities are (sales, Automotive gross profit). Assuming the occupancy network works, the only real blind spot left is if something in front of the car changes between it turning off and on (assuming the occupancy network will 'remember' the map around it when it goes to sleep).

Also, Tesla’s strategy for safety is seemingly “excel in industry standard tests, ie. IIHS and EuroNCAP”, so this might be a case of the measure becoming a target.


This thread is unhelpfully mixing radar and ultrasonic sensors. Ultrasonic sensors, as your video explains, are primarily used as a parking aid; they are tuned for too low a distance to be helpful in just about any kind of driving scenario at speed.

Meanwhile, radar is the principal sensor used in systems like automatic emergency braking across the industry. It has no intersection with any of the parking stuff because it generally has to ignore stationary objects to be useful (hence the whole "Teslas crashing full speed into stopped vehicles" thing).


The first famous autopilot crash was because a white semi-truck was washed out by the sun and confused for an overhead sign.

That's literally trivial for a car with radar to detect.

Amazing how people talk about stuff they have no idea about when it comes to Tesla.


> ”The first famous autopilot crash was because a white semi-truck was washed out by the sun and confused for an overhead sign.

That's literally trivial for a car with radar to detect.”

That crash occurred on a car which was using radar. Automotive radar generally doesn’t help to detect stationary objects.

Further, that crash occurred on a vehicle with the original autopilot version (AP1), which was based on Mobileye technology with Tesla’s autopilot software layered on top. Detection capabilities would have been similar to any vehicle using Mobileye for AEB at the time.


I find very strange the claim that a moving doppler (pulsed doppler?) radar 'generally doesn't help to detect stationary objects'. I mean if the car is moving, it generates a doppler shift on all objects moving at a different speed, right?

Maybe it's difficult for reasons of false alarm detection (too many stationary objects that are not of interest) but you can get very good results with tracking (curious about these radars' refresh rate), STAP, and classification/identification algorithms, especially if you have a somewhat modern beamformed signal (so, some kind of instant spatial information). Active-tracking can also be of help here if you can beamsteer (put more energy, more waveform diversity on the target, increase the refresh rate). Can't these radars do any of that 'state of the art 20 years ago' stuff?

There's something I don't get here and I feel I need some education...


Source: have worked with some of the (admittedly last-gen) automotive RADAR chips, NXP in particular.

The issue is the number of false positives: stationary objects need to be filtered out. Something like a drainage grill on the street generates extremely strong returns. RADAR doesn't have high enough resolution to differentiate the size of something; you only have ~10 degrees of angular resolution, and after that you need to go by the strength of the returned signal. So there's no way to differentiate a bridge girder or a railing or a handful of loose change on the road from a stationary vehicle. On the other hand, if you have a moving object, RADAR is really good at identifying it and doing adaptive cruise control etc.
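
To make the filtering concrete, here's a rough sketch in Python; the detection format and every number are invented for illustration, not anything from a real pipeline:

  import math

  # Hypothetical detections: (range_m, azimuth_deg, radial_velocity_mps).
  # The radial velocity is what the doppler measurement gives directly;
  # negative means the target is closing on the radar.
  detections = [
      (45.0, 2.0, -27.7),   # stopped car in the lane ahead
      (60.0, 1.0, -12.0),   # slower car ahead, still moving
      (30.0, 8.0, -27.4),   # drainage grill / bridge girder off to one side
  ]

  ego_speed = 27.8  # m/s (~100 km/h), known from wheel speed / odometry

  def ground_speed(radial_velocity, azimuth_deg):
      # A stationary target at azimuth theta shows a radial velocity of
      # roughly -ego_speed * cos(theta); adding that back estimates the
      # target's own speed along the line of sight.
      return radial_velocity + ego_speed * math.cos(math.radians(azimuth_deg))

  for r, az, vr in detections:
      vg = ground_speed(vr, az)
      verdict = "stationary -> filtered out" if abs(vg) < 1.0 else "moving -> tracked"
      print(f"range {r:5.1f} m, az {az:4.1f} deg, ground speed {vg:+5.1f} m/s: {verdict}")

  # Both the stopped car and the drainage grill land in the "stationary"
  # bucket, and with ~10 degrees of angular resolution there is nothing
  # left to tell them apart. That is the false-positive problem.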

Edit: it looks like some of the latest Bosch systems have much better performance in terms of resolution and separability: https://www.bosch-mobility-solutions.com/media/global/produc...


Hi, thanks. Thought it might be so. Still...

RADAR can have high(er) angular resolution with (e.g.) phased arrays (linear or not) and digital beamforming. I guess it's the way the industry works and it wants small cheap composable parts, but using the full width of the car for a sensor array you could get amazing angular accuracy, even with cheap simple antennas. MIMO is also supposed to give somewhat better angular accuracy, since you can perform actual monopulse angular measurement (as if you had several independent antennas). There's even recent work on instant angular speed measurement through interferometry if you have the original signals from your array.

And with the wavelengths used in car RADARs you could get far down on range resolution, especially with the recent progress on ADCs and antenna tech.
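
Back-of-envelope numbers for those two claims, with assumed figures (77 GHz carrier, an aperture the full width of the car, 1 GHz of sweep bandwidth):

  c = 3.0e8                   # speed of light, m/s

  # Angular resolution: beamwidth ~ wavelength / aperture (in radians)
  wavelength = c / 77e9       # ~3.9 mm
  aperture = 1.8              # m, roughly the full width of a car
  print(wavelength / aperture * 180.0 / 3.14159)   # ~0.12 deg, vs the ~10 deg quoted upthread

  # Range resolution of an FMCW radar: dR = c / (2 * sweep bandwidth)
  sweep_bandwidth = 1e9       # Hz (assumed; the 76-81 GHz band leaves room for more)
  print(c / (2 * sweep_bandwidth))                 # 0.15 m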

I'm not saying you're wrong, you're describing what's available today (thanks for that).

Wondering when all this (not so new) tech might trickle down to the automotive industry... And whether there's interest (looking at big fancy manufacturers forgoing radar isn't encouraging there).


In theory a big phased array of cheap antennas is cheap, in practice it isn't, because you need equal impedance routing to all of the antennas, which means you need them all to be roughly equidistant to the amplifier. You could probably get away with blowing it up to the size of a large dinner plate, but then you also need a super stiff substrate to avoid flexing, and you need to convince manufacturers that they should make space for this in their design language without any metallic paint or chromed elements in front.

Which car brand do you think would take up these restrictions, and which customer is then going to buy the car with the big ugly patch on the front?


Thanks for the thoughtful reply.

Modern phased arrays can have independent transmitters (synchronized digitally or with digital signal distribution) or you can have one 'cheap and stupid' transmitter and many receivers, doing rx beamforming, and as for complexity you mostly 'just' need to synchronize them (precisely). The receivers can then be made on the very cheap and you need some signal distribution for a central signal processor.

Non-linear or sparse arrays are also now doable (if a bit tricky to calibrate) and remove the need for complete array or rigid substrate or structure.

If you imagine the car as a multistatic many-small-antennas system there's lots that could be done. Exploding the RADAR 'box' into its parts might make it all far more interesting.

I'll admit I'm way over my head on the industrial aspects, so thanks for the reality check. Just enthusiastic, the underlying radar tech has really matured but it's not easy to use if you still think of the radar as one box.


I know even for the small patch antennas we were looking at, the design of the waveguides was insanely complicated. I can't imagine blowing it up to something larger with many more elements.

If you wanted separated components to group together many antennas I suspect the difficulty would be accurate clock synchronization what with automotive standards for wiring. I'm still not sure I understand how they can get away without having rigid structures for the antennas, but this would be a critical requirement because automotive frames flex during normal operation.

Cars are also quite noisy RF environments due to spark plugs.

I guess what you're speaking of will be the next 10-20 years of progress for RADAR systems as the engineering problems get chipped away at one at a time.


Ah I'm probably oversimplifying and working in an industry with a far higher price per sold unit, so have a very distorted view of 'easy' or 'recent'.

Thanks for humouring me. RADAR is a very fun and interesting topic.


There's also a legitimate harm to consumers with such a large radar array in the front bumper. Because even a minor fender bender could total a $50k car.

So the car would be very difficult to sell since few people are willing to pay much higher insurance premiums just for that.


Ah thanks, didn't take that into account.


I've heard people on the internet claim that, in automotive radar the first thing they do when processing the signal is discard any stationary objects. Apparently this is because the vast majority of the time it's a sign or overhead gantry or guard rail - any of which could plausibly be very close to the lane of travel thousands of times per journey - and radar doesn't provide enough angular resolution to tell the difference.

Personally I've never seen these claims come from the mouth of an automotive radar expert, and many cars do use radar in their adaptive cruise control, so I present it as a rumour, not a fact :)


Indeed, my VW, which uses a forward-looking radar, has signaled several times for stationary objects. In fact, the one time it literally stopped an accident was for a highway that suddenly turned into a parking lot. People keep repeating BS said by Tesla and Tesla apologists about why their cars run into stopped things when others seem to have less of a problem with it.


> I find very strange the claim that a moving doppler (pulsed doppler?) radar 'generally doesn't help to detect stationary objects'. I mean if the car is moving, it generates a doppler shift on all objects moving at a different speed, right?

I’m in the same boat as to not understanding why, but from what I have read the problem indeed isn’t that it doesn’t detect them, it’s that there are too many of them, and nobody has figured out how to filter out the 99+% of signals you have to ignore from the ones that may pose a risk, if it’s doable at all.

I think that at least part of the reason is that the spatial resolution of radar isn’t great, making it hard to discriminate between stationary objects in your path and those close to it (parked cars, traffic signs, etc). Also, some small objects in your path that should be ignored such as soda cans with just the ‘right’ orientation can have large radar reflections.


Especially when most car radars are FMCW radars. They not only know the speed, they also know the distance.
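
For reference, a toy sketch of how an FMCW radar gets both; the chirp parameters and the "measured" values below are made up:

  c = 3.0e8                        # m/s

  # Assumed chirp: sweep 300 MHz of bandwidth in 50 microseconds
  bandwidth = 300e6
  chirp_time = 50e-6
  slope = bandwidth / chirp_time   # Hz per second

  # Range from the beat frequency between the transmitted chirp and its echo:
  #   f_beat = slope * (2R / c)  =>  R = f_beat * c / (2 * slope)
  f_beat = 2.0e6                   # say we measure a 2 MHz beat tone
  print(f_beat * c / (2 * slope))  # 50 m

  # Speed from the phase step of that tone across successive chirps (the
  # doppler term), usually extracted with a second FFT over the chirp sequence:
  #   v = wavelength * delta_phi / (4 * pi * chirp_time)
  wavelength = c / 77e9
  delta_phi = 1.0                  # say 1 radian of phase change per chirp
  print(wavelength * delta_phi / (4 * 3.14159 * chirp_time))  # ~6.2 m/s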

Some of the newest car radars can do some beamforming, but not all.

Most models have multiple radars pointing in multiple directions as that's cheaper than AESA.

Only just recently have "affordable" beamformers come to the market. And those target 5G base stations.

So the spec in most K/Ka-band models starts at 24.250GHz, where the 5G band starts. While the licence free 24GHz band that the radars use is 24.000-24.250GHz.

If this was not bad enough, there has been a consistent push from regulators to get the car radars onto the less congested 77GHz band. And there are even fewer affordable beamformers for that band.


Might be time for some state sponsorship to have the beamforming asics, fpga designs for these bands. Although I might be missing something: once you're back down in your demodulated sampling frequency, your old beamformer should suffice? Or are we talking 'adc+demodulator+filter+beamforming' asic?


Not a fan of Tesla removing the sensors, but detecting a vehicle on a highway that isn’t moving in the same direction as the car is not “trivial” with radar. No AEBs that use radar look for completely stopped objects after a certain speed because the number of false positives is so high.


So, yes, cars that are programmed to have AEB perform well at AEB and not at other tasks. We are in agreement here. (I even agree with you that those cars use radar for AEB).

Now, where we disagree is you implying that cars with AEB-level radar (literally $10 off-the-shelf parts with whatever sensor fusion some MobilEye intern dreams up) are somehow the same as self-driving cars (the goal of Tesla Autopilot).

Every serious self-driving car/tractor-trailer out there uses radar as a component of its sensor stack because lidar and simple imaging are not sufficient.

And that's the point I was trying to make - we agree it's trivial for radar to find things; they just need sensor fusion to confirm the finding and begin motion planning. This is why a real driverless car is hard despite what Elon would like you to believe. There is no one sensor that will do it. Full stop.

And this cuts to the core of why Tesla is so dangerous. They are making a car with AEB and lane-keeping and moving the goal posts to make people (you included) think that's somehow a sane approach to driverless cars.


> ”There is no one sensor that will do it. Full stop.”

Yet somehow, humans can drive cars with just a pair of optical sensors (mounted on a swivelling gimbal, of sorts).

In theory, a sufficiently capable AI should be able to drive a car at least as well as a human can using the same input: vision.


> Yet somehow, humans can drive cars with just a pair of optical sensors

A pair of optical sensors and a compute engine vastly superior to anything that we will have in the near future for self-driving cars.

Humans can do fine with driving on just a couple of cameras because we have an excellent mental model (at least when not distracted, tired, drunk, etc.). Cars won't have that solid of a mental model for a long, long time, so sensor superiority is a way to compensate for that.


The optical sensors are just a small part of the human (and animal in general) vision system. A much bigger component is our innate (evolutionarily acquired) understanding of basic mechanics, simple agent theory, and object recognition.

When we look at the road, we recognize stuff in the images we get as objects, and then most of the work is done by us applying basic logic in terms of those objects - that car is off the side of the road so it's stationary; that color change is due to a police light, not a change in the composition of objects; that small blob is a normal-size far-away car, not a small and near car; that thing on the road is a shadow, not a car, since I can tell that the overpass is casting it and it aligns with other shadows.

All of these things are not relying on optics for interpreting the received image (though effects such as parallax do play a role as well, it is actually quite minimal), they are interpreting the image at a slightly higher level of abstraction by applying some assumptions and heuristics that evolution has "found".

Without these assumptions, there simply isn't enough information in an image, even with the best possible camera, to interpret the needed details.


> "A much bigger component is our innate (evolutionarily acquired) understanding of basic mechanics, simple agent theory, and object recognition. ... they are interpreting the image at a slightly higher level of abstraction by applying some assumptions and heuristics that evolution has "found"."

Of course, and all this is exactly what self-driving AIs are attempting to implement. Things like object recognition and understanding basic physics are already well-solved problems. Higher-level problem-solving and reasoning about / predicting behaviour of the objects you can see is harder, but (presumably) AI will get there some day.


Putting all of these together amounts to building AGI. While I do believe that we will have that one day, I have a very hard time imagining it as the quickest path to self-driving.

Basically my contention is that vision-only is being touted as the more focused path to self-driving, when in fact vision-only clearly requires at least a big portion of an AGI. I think it's pretty clear that this is currently not a realistic path to self-driving, while other paths using more specialized sensors seem more likely to bear fruit in the near term.


> sufficiently capable AI

And Tesla lacks that, so they ought not simply rely on cameras; they ought to use extra auxiliary systems to avoid danger to their customers. They are not doing this because it would reduce their profit margins. Alas, hence this HN thread.


> Yet somehow, humans can drive cars with just a pair of optical sensors (mounted on a swivelling gimbal, of sorts).

In fairness, humans have a lot more than just optical sensors at their disposal, and are pretty terrible drivers. We've added all kinds of safety features to cars and roads to try to compensate for their weaknesses, and it certainly helps, but they still make mistakes with alarming regularity, and they crash all the time.

When you have a human driver, conversations about safety and sensor information seem so straightforward. The idea of a car maker saving a buck by foregoing some tool or technology at the expense of safety is largely a non-starter.

What's weird is, with a computer driver, (which has unique advantages and disadvantages as compared to a human driver) the conversation is somehow entirely different.


> We've added all kinds of safety features to cars and roads to try to compensate for their weaknesses

This is a super important point. Whenever self-driving cars comes up in conversation it's like, "we're spending billions of dollars on self-driving cars tech, but what if we just, idk, had rails instead of roads". We're putting all the complexity on the self-driving tech, but it seems pretty clear that if we helped a little on the other end (made driving easier for computers), everything would get better a lot faster.


My kid took 12 hour-long driving lessons before she could drive a stick shift on a road, albeit very slowly.


> In theory, a sufficiently capable AI should be able to drive a car at least as well as a human can using the same input: vision.

In theory, cars should use mechanical legs instead of wheels for transportation; that's how animals do it. In theory, plane wings should flap around; that's the way birds do it. My point being: the way biology solved something may not always be the best way to do it with technology.


> ”In theory, cars should be use mechanical legs instead of wheels for transportation, that's how animals do it.”

Wheels and legs solve different problems. Wheels aren’t very useful without perfectly smooth surfaces to run them on. If roads were a natural phenomenon that had existed millions of years ago, then isn’t it plausible that some animals might have evolved wheels to move around faster and more efficiently?


GP was stating that "two cameras mounted 15cm apart on a swivel slightly left of the vehicle center of geometry" has proven to be a _sufficient_ solution, not necessarily the best solution.


>Yet somehow, humans can drive cars with just a pair of optical sensors (mounted on a swivelling gimbal, of sorts).

This is wrong and I was surprised to hear them say it was enough in the video.

We don't have car horns and sirens for your eyes. You will often hear something long before you see it. This is important for emergency vehicles. Once you hear it, a good driver will immediately slow down and pull to the side, or delay movement to give space for the vehicle.

Does this mean self driving vehicles can't detect emergency vehicles until they appear on camera? That's not encouraging.


>Once you hear it, a good driver will immediately slow down and pull to the side, or delay movement to give space for the vehicle.

Robotically performing an action in response to single/few stimuli with little consideration for the rest of the setting and whether other responses could yield more optimal results precludes one from ever being a "good" driver IMO.

"See lights, pull over" is not going to cut it. See any low effort "idiot drivers and emergency vehicles" type youtube compilation for examples of why these sorts of approaches fall short.


Deaf people (or those who blast music) can drive. People who are blind in one eye can drive.


That might have something to do with the general intelligence prediction supercomputer sitting between the ears. If Tesla is saying they won't have real (not just an 80 percent solution that they then lie and say is complete) self driving until they develop an AGI, I agree


Optical sensors, an innate understanding of the world around them that they are perceiving, and most importantly, a social understanding of what other humans around them are likely to do.


Our two eyeballs plus brain is SO MUCH MORE than just two mediocre CCDs.

Our eyes provide distance sensing through focusing, the difference in angle of your two eyes looking at a distant object, and other inputs, as well as having incredible range of sensitivity, including a special high contrast mode just for night driving. This incredibly, literally unmatched camera subsystem is then fed into the single best future prediction machine that has ever existed. This machine has a powerful understanding of what things are (classification) and how the world works (simulation) and even physics. This system works to predict and respond to future, currently unseen dangers, and also pick out fast moving objects.

Two off the shelf digital image sensors WILL NEVER REPLACE ALL OF THAT. There's literally not enough input. Binocular "vision" with shitty digital image sensors is not enough.

Humans are stupidly good at driving. Pretty much the only serious accidents nowadays are ones where people turn off some of their sensors (look away from the road at something else, or drugs and alcohol) or turn off their brain (distractions, drugs and alcohol, and sleeping at the wheel).


Yes, a "pair" of optical sensors. Tesla is at a disadvantage compared to humans -- they do not do stereoscopic imaging, which makes distance of objects less reliable -- they try to infer distance from a single flat image. Humans having two sensors pointed in the same direction gives us a very reliable way of determining distance (up to a relevant distance for driving at least).


Interestingly, even people with missing stereoscopic vision are allowed to drive. We don't require depth perception to drive. The assumption is that they can compensate.


Binocular vision isn't even the only source of depth information available to humans. That's why someone missing an eye can still make reasonable depth estimations.


Yes, and a sufficiently smart compiler should make optimization unnecessary. ;-)


Isn't this a bit like saying we can do better than fixed-wing aircraft, because birds can flap their wings? With sufficiently advanced material science, flapping-wing human flight too, is possible. But that doesn't mean Boeing and Cessna are misguided.


But that's not how people drive. They use their ears, they move their head around to generate parallax, they read the body-language of other drivers, they make eye-contact at intersections, they shift position to look around pillars, or stoop to see an inconveniently placed stop light. Fixed forward cameras do none of that.


Yes, in theory, something sufficient will suffice. I think we are trying to bring some clarity to the topic, however.


I believe in the sentiment, but it's true that we humans also crash our cars a LOT. From minor bumps and scrapes to multiple fatalities.


Running neural networks tweaked by millions of years of evolution.


You seem to be fine with the 30k Americans per year who die in car crashes.

That number is probably too high for robots to do though.

Humans are weird like that.


> humans can drive cars

Needs emphasis on can


But if the radar just sees a static object and can't tell if it's an overhead sign or a car, and the camera vision is too washed out, how would sensor fusion help in your example?


Perhaps stop cheaping out on the cameras and procure those with high dynamic range. Then again those may be "expensive and complicate the supply chain for a small delta".


But then, again, what is the point of the radar?


A human driver slows down and moves their head around to get a better view when the glare from the sun is too strong to see well. I’d expect a self driving car to similarly compromise on speed for the sake of safety, when presented with uncertainty.


Lidar would make it pretty obvious whether it's a sign or a car, even if the camera didn't tell you. The part where the lidar doesn't bounce back at vehicle level would be a dead give away.
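
A toy illustration of that giveaway, with invented return heights and a crude "vehicle height band" (real lidar classification is obviously much richer):

  # Heights (m above the road) of lidar returns at the range where radar
  # reports a strong stationary target. All numbers invented.
  overhead_sign = [4.9, 5.0, 5.1, 5.2]
  stopped_truck = [0.4, 0.9, 1.3, 2.1, 3.6]

  def blocks_the_lane(return_heights, vehicle_band=(0.3, 2.0)):
      lo, hi = vehicle_band
      return any(lo <= h <= hi for h in return_heights)

  print(blocks_the_lane(overhead_sign))   # False: nothing at bumper height, carry on
  print(blocks_the_lane(stopped_truck))   # True: brake / replan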


An approaching incline has entered the chat.


Approaching inclines aren't exactly hard to interpret from lidar either.


Which is another reason that "full" FSD is at least a decade away and likely from another supplier, if at all. Cannot believe people are funding this.


  That's literally trivial for a car with radar to detect.
In principle that is correct… but are radars in automotive applications unable (or rather just not used) to detect non-moving targets?

Asking this because I know first hand that the adaptive cruise function in my car must have a moving vehicle in front of it for the adaptive aspect to work. It will not detect a vehicle that is already stopped.

The resolution of the radar is pretty good though, even if the vehicle in front is merely creeping off the brakes… it does get detected if it is at or more than the “cruising distance” set up initially.

The AEB function on my car depends on the camera.


My understanding is that your typical automotive radar will have insufficient angular resolution to reliably distinguish, say, an overpass from a semi blocking the road, or a pedestrian standing in the middle of the road from one on the footpath.

Radar does however have the advantage of measuring object speed directly via the doppler effect, so you can filter out all stationary objects reliably, then assume that all moving objects are on the road in front of you and need to be reacted/responded to.

So I think it's the case that radar can detect stationary objects easily, but cannot determine their position enough to be useful, hence in practice stationary objects are ignored.


Adaptive cruise control is solving a totally different problem. It is specifically looking for moving objects to match pace with. That's very different from autonomous driving systems.

Radar is quite good at finding stationary metal objects, particularly. Putting it in a car, if anything, helps, because the stationary objects are more likely to be moving relative to the car...


The newer “4D imaging” radars can track stationary objects too. This is what the self driving companies have turned to in recent times.


What a smug post from someone who is completely wrong.

The car that crashed had radar, vision, USS, AND it was based on another company's technology.


The kicker for me is that the area covered by the ultrasonic sensors is essentially all blind spots for the cameras. The sensors currently are able to tell you when something too low to see is getting within a few inches of the car. It also gives an exact distance when parking, so I can know that I'm parking exactly 2ft from the wall every time. As much as they claim otherwise, it simply cannot be a matter of fixing it in software. The cameras can't tell you what they can't see. They simply don't have the coverage to do this, and clearly don't even have the coverage to hit parity with radar enabled autopilot either.


The video has a more reasonable answer.

The sensors are unreliable and expensive in terms of R&D. Having marginal parts which takes money from a finite R&D budget can easily result in a worse product. “They contribute noise and entropy into everything.” … “you’re investing fully into that [vision] and you can make that extremely good. You only have a finite amount of spend of focus across different facets of the system.”

His standpoint can be summed up as “I think some of the other companies are going to drop it.” Which would be really interesting if true.


> Having marginal parts which takes money from a finite R&D budget can easily result in a worse product.

"Less sensors can be more safe/effective if that allows us to focus on making effective use of the sensor information we do have, which is the result we're aiming for with this descision." would be a reasonable answer (if true), but that doesn't seem like a fair interpretation of what he actually said.


It’s fairly rambling but he touches on that exact point several times most specifically here at the 2 minute mark:

“Organizationally it can be very distracting. If all you want to get to work is vision resources are on it and you’re actually making forward progress. That is the sensor with the most bandwidth the most constraints and you’re investing fully into that and you can make that extremely good. You only have a finite amount of spend of focus across different facets of the system.”

Which was from this section: Q: “Is it more bloat in the data engine?”

“100%” (Q:“is it a distraction?”) “These sensors can change over time.” “Suddenly you need to worry about it. And they will have different distributions. They contribute noise and entropy into everything and they bloat stuff”.

Even earlier he says:

“These sensors aren’t free…” list of reasons including “you have to fuse them into the system in some way. So that like bloats the organization” “The cost is high and you’re not particularly seeing it if you’re just a computer vision engineer and I am just trying to improve my network.”


Isn’t that his point though? Where exactly is he not saying that?

I listened to his answer three times and I’m not able to come up with a different interpretation than that.


> Isn’t that his point though? Where exactly is he not saying that?

Andrej eventually gets to it. But his first response was to evade. Lex is a skilled interviewer. By not letting him wriggle out of a difficult question we eventually got a substantive answer. But Andrej's first instinct was to evade. That's notable.


I don't agree Lex is a skilled interviewer, he's great at creating interesting conversations in the aw-shucks way Joe Rogan is, but he mostly plays a fanboy role. I still love a Lex interview.

Otherwise I agree.


> I don't agree Lex is a skilled interviewer

Fair enough. Seemingly practiced may be a better description. (I'm not super familiar with his work.)


That was exactly his point. I listened earlier today. (It being his point doesn’t mean he is necessarily right.)


Didn't they used to talk about how the Tesla radar could actually see the reflections of the car ahead of the one just in front of you? i.e. the radar reflection bouncing underneath the car just in front of you?

This is what doesn't add up to me. Either a lot of that previous wonder-talk was actually a lie, or there's something else going on here.


> Either a lot of that previous wonder-talk was actually a lie, or there's something else going on here.

Tesla dropped radar and ultrasonic due to supply shortages. Nothing to do with their AI being smart.

Many first-hand reports on Tesla fanboy forums on how no-radar and no-ultrasonic autopilot is far worse than with the sensors.


It is a hard question to answer. It’s like asking if more programmers on a project will allow it to be completed faster with higher quality. Ya, theoretically they could, in practice not likely. More sensors are like more programmers, theoretically they can be safer and more effective, but in practice they won’t be. Sensor fusion is as hard a problem as scaling up a software team.


It is not a hard question to answer at all. LIDAR will make a self driving car safer, period. There is a lot of research to back this up.


LIDAR can be safer than an optical system, I can believe that. LIDAR and an optical system being safer than either alone without a lot of extra complexity: maybe not.


> More sensors are like more programmers

More programmers are like having more testicles: theoretically they should enable you to have more kids, but in practice the bottlenecks are elsewhere.

Both your reasoning by comparison and mine are equally valid.


That isn't it though. It isn't like pumping a baby out in 1 month using 9 women. No, the problem is the fusion of too much information that varies substantially. They have completely different views of the world and you can't just lerp them together.

I bring up the programmers working on a project example just to illustrate how more isn't always better even if it theoretically can be.


I mean yeah, but it is a friggin multi-ton object moving at high speed controlled by a computer. Having another kind of sensor system to cross-check might be the reasonable thing to have, even if you happen to make it work well in 99% of the cases just with optics — the issue is that the other 1% kill people.

Your optical system can be good as heck till a bug hits it directly on the lens, covering an important frontal area and making it behave weirdly.


Not more sensors, different sensors.

In your metaphor it's like asking if you should have project managers as well as engineers on your project. And Tesla has decided that having only engineers allows them to focus on having the best engineers. And they avoid the distraction of having to manage different types of employees.


Different sensors are even worse for sensor fusion. Actually, fusion only applies to different sensors: incorporating different signals with different strengths and weaknesses into a model that is actually better, and not worse, is difficult.


Lack of focus is a major problem for companies and we all know that tech debt leads to increased bug counts.

Team focus on vision which is by far the highest accuracy and bandwidth sensor allows for a faster rate of safety innovation given a constant team size.


Tesla's cameras often get blocked by rain or blinded by the sun, or don't see that well in the dark. It's really hard to imagine those cameras replacing the ultrasonic sensors which do a pretty good job at telling you where you are when you're parking etc. I can't see how the camera is going to detect an object in pitch dark and estimate the distance to it better than an ultrasonic sensor. But hey, if people ding their cars it's more revenue.

The bottom line seems to be part shortages that would have slowed production, plus cost cutting. The rest of the story seems like a fable to me. It was pretty clear Tesla removed the radar because it couldn't get enough radars.

The interview didn't really impress me. I'm sure Andrej is bound by NDA and not wanting to sour his relationship with Tesla/Elon but a lot of the answers were weak. (On Tesla and some of the other topics, like AGI).


One interesting side effect of only using visual sensors is that the failure modes will be more likely to resemble human ones. So people will say "yeah, I would have crashed in that situation too!". With ultrasonic and radar and lidar it may make far fewer mistakes but it is possible they might not be the same ones people make, so people will say "how did it mess that up?"


Sadly, that’s the worst way to actually design the system. I’d rather have two different technologies working together, with different failure modes. Not using radar (especially in cars that are already equipped) might make economic sense to Tesla, but I’d feel safer if visual processing was used WITH radar as opposed to instead of radar.

I also expect an automated system to be better than the poor human in the drivers seat.


You have to eventually decide to trust one or the other, in real-time. So having multiple failure modes doesn't solve the problem entirely. This is called 'Fusion', meaning you have to fuse information coming from multiple sensors together. There are trade offs because while you gain different views of the environment from different sensors, the fusion becomes more complicated and has to be sorted out in software reliably in real-time.
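
In its simplest textbook form that fusion is just weighting each sensor by how much you trust it at that moment; the hard part is knowing those trust levels reliably in real-time. A toy sketch with invented numbers:

  def fuse(estimates):
      # Inverse-variance weighting: each (value, variance) pair pulls the
      # answer toward itself in proportion to how much it is trusted.
      weights = [1.0 / var for _, var in estimates]
      fused = sum(w * v for w, (v, _) in zip(weights, estimates)) / sum(weights)
      return fused, 1.0 / sum(weights)

  # Distance to the obstacle ahead: the camera is unsure (sun glare), the radar is not.
  camera = (38.0, 25.0)   # (metres, variance)
  radar = (31.0, 1.0)
  print(fuse([camera, radar]))   # ~31.3 m, dominated by the radar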


> There are trade offs because while you gain different views of the environment from different sensors, the fusion becomes more complicated and has to be sorted out in software reliably in real-time.

If you're against having multiple sensors though, the rational conclusion would be to just have one sensor, but Tesla would be the first to tell you that one of the advantages their cars have over human drivers is they have multiple cameras looking at the scene already.

You already have a sensor fusion problem. Certainly more sensors add some complexity to the problem. However, if you have one sensor that is uncertain about what it is seeing, having multiple other sensors, particularly ones with different modalities that might not have problems in the same circumstance, sure makes it a lot easier to reliably get to a good answer in real-time. Sure, in unique circumstances, you could have increased confusion, but you're far more likely to have increased clarity.


This is one side of the argument. The other side of the argument is that what matters more than the raw sensor data is constructing an accurate representation of the actual 3D environment. So an argument could be made (which is what this guy and Tesla are gambling on and have designed the company around) that the construction & training of the neural network outweighs the importance of the actual sensor inputs. In the sense that even with only two eyes (for example) this is enough when combined with the ability of the brain to infer the actual position and significance of real objects for successful navigation. So as a company with limited R&D & processing bandwidth, you might want to devote more resources to machine learning rather than sensor processing. I personally don't know what the answer is, just saying there is this view.


The whole point of the sensor data is to construct an accurate representation of the actual environment, so yes, if you can do that, you don't need any sensors at all. ;-)

Yes, in machine learning, pruning down to higher signal data is important, but good models are absolutely amazing at extracting meaningful information from noisy and diffuse data; it's highly unusual to find that you want to dismiss a whole domain of sensor data. In the cases where one might do that, it tends to be only AFTER achieving a successful model that you can be confident that is the right choice.

Tesla's goal is self-driving that consumers can afford, and I think in that sense they may well be making the right trade-offs, because a full sensor package would substantially add to the costs of a car. Even if you get it working, most people wouldn't be able to afford it, which means they're no closer to their goal.

However, I think for the rest of the world, the priority is something that is deemed "safe enough", and in that sense, it seems very unlikely (more specifically, we're lacking the telltale evidence you'd want) that we're at all close to the point where you wouldn't be safer if you had a better sensor package. That means, in effect, they're effectively sacrificing lives (both in terms of risk and time) in order to cut costs. Generally when companies do that, it ends in lawsuits.


> You have to eventually decide to trust one or the other, in real-time.

More or less. You can take that decision on other grounds - e.g. "what would be safest to do if one of them is wrong and i don't know which one?"

The system is not making a choice between two sensors, but determining a way to act given unreliable/contradictory information. If both sensors allow for going to the emergency lane and stopping, maybe that's the best thing to do.


It's far from the worst way, because if humans are visually blinded by the sun or snow or rain they will generally slow down and expect the cars around them to do the same.

Predictability especially around failure cases is a very important feature. Most human drivers have no idea about the failure modes of lidar/radar.


> I can't see how the camera is going to detect an object at pitch dark

Lights?


A car typically doesn't have lights shining in all directions. My Tesla doesn't, at any rate. At night, backing into my driveway, I can barely see anything on the back-up camera unless the brake lights come on. If it's raining heavily it's much worse. But the ultrasonic sensors are really good at detecting obstacles pretty much all around.


Interesting. I find the rear camera in my Tesla is outright amazing in the dark. I can see objects so much more clearly with it than with the rear view mirror. It feels like I'm cheating... almost driving in the day.


Reverse lights are literally mandated by law. Your Tesla has them, and if they're not bright enough that's a fairly cheap and easy problem to fix relative to the alternatives.

Ultrasonics struggle in the rain, btw.


The sensors also detect obstacles on the side of the car where there's no lighting. Every problem has some sort of solution, but removing the ultrasonic sensors on the Tesla is going to result in poorer obstacle detection performance. Sure, if they add 360 lighting and more cameras they can make up for that.

EDIT: Also I'm not quite positive why the image is so dark when I reverse at night. But it still is. The slope and surface of the driveway might have something to do with that... Still I wouldn't trust that camera. The ultrasonic sensors otoh seem to do a pretty good job. That's just my experience.

EDIT2: I love the Tesla btw. The ultrasonic sensors seem to work pretty reliably, they're pretty much their own system, the argument about complexity doesn't really seem to hold water and on the face of it the cameras won't easily replace them...


Just curious, do teslas have a reverse light?


Yes. Only one on my 2015 Model S 70D. Not sure how many on other Teslas.


I just assumed they used something similar to iPhone FaceID and Xbox Kinect dot emitters. https://www.theverge.com/circuitbreaker/2017/9/17/16315510/i...


You are greatly overestimating the functionality of the sensors, and underestimating the importance of the rest of the system. Sensors are important, but the majority of the work, effort and expense is involved with post-sensor processing. You can't just bolt a 'Lidar' on to the car and improve quality of results. Andrej and other engineers working on these problems are telling everyone the same story. The perfect solution is not obvious to anyone, and they have chosen one path. Engineers aren't trying to scam people out of a few dollars so they can weasel out of making high quality technology. This has Nothing to do with cost-cutting.


"The perfect solution is not obvious to anyone, and they have chosen one path. Engineers aren't trying to scam people out of a few dollars so they can weasel out of making high quality technology. This has Nothing to do with cost-cutting."

It has everything to do with cost cutting?


Lidar vs. Stereo camera vs. multiple cameras vs. ultrasound is a separate problem that engineers are trying to solve, not how can we sell cheaper mops. The decision to not use Lidar, as he says, and is the common debate being explored by people working on autonomous driving is whether it makes more sense to focus on stereo image sensors with highly integrated machine learning, or maybe use Lidar or other sensors and include data Fusion processing. Both methods have trade-offs.


"Lidar vs. Stereo camera vs. multiple cameras vs. ultrasound is a separate problem that engineers are trying to solve, not how can we sell cheaper fucking mops."

Okay? Tesla is a car company and they are absolutely trying to sell a cheaper car. That's obvious to anyone that's been in one.

"Both methods have trade-offs."

Right, isn't that why most other systems use both?


Both methods have trade-offs as in there are positive and negative merits for both approaches. Using both systems requires the sensor data to be fused together to make real-time decisions. This is the whole point, why people are trivializing this problem, and why it is easy to believe that they are just trying to scam people by going cheap on using multiple sensors. If you want to argue that it is better to use Lidar then explain why apart from 'others do it'.

The podcast, and previous explanations by this guy and others that agree with him (which occurred way before some shortage issues), are about what is the best way to solve autonomous driving. You don't solve it by simply adding more sensors. There are multiple hours of technical information about why this guy Andrej thinks this way is best. Others make arguments for why multiple sensors and fusion make more sense. No one knows the correct answer; it will be played out in the future.

Maybe what some people care about is cheaper cars. That is not what the podcast was about, and that is not how the Lidar + stereo camera vs. stereo-camera only decision was made. And in terms of the advancement of human civilization, it is not interesting to me whether Tesla has good or bad quarterly results compared to what is the best way to solve the engineering problems & the advancement of AI, etc.

I don't really care very much, but it is slightly offensive when many people just dismiss engineers who are putting in tons of effort to legitimately solve complicated problems as if they are just scam artists trying to lie to make quick money. That is also a stupid argument. No company is going to invest billions(?) of dollars and tons of engineering hours into an idea they secretly know is inferior and will eventually lose out, just so they can have a good quarter. That is not a serious argument.


I'm not sure why you'd assume all of that. You keep saying engineers, but it's a business decision. Seems like you are getting caught up in marketing.


I am an engineer working on autonomous vehicles. Nothing personal, just responding to the thread as a whole. I don't believe this guy is conspiring to trick anyone. Business decisions, of course. I think they are in good faith gambling on this one approach. So I am interested to see if their idea will win, or if someone else figures out a better way.


The problem is not that he was wrong; the problem is that he's made a motherhood statement in response to a very specific question.

He's not conspiring to trick people per se but he's also not being super clear. His position obviously makes it difficult to answer this question. It's possible he really believes this is better but if he didn't he wouldn't exactly tell us something that makes him and his previous employer look bad. Also his belief here may or may not be correct.

Is it a coincidence that the technical stance changed at the same time when part shortages meant that cars could not be built and shipped because of shortages of radars?

More likely there was some brainstorming as a result of the shortages and the decision was made at that point to pursue an idea of removing the additional sensors and shipping vehicles without those. This external constraint, along with some reports of increases in ghost braking (anecdotes), makes the claims that this is actually all-around better a little difficult to believe. It's not clear whether there was enough data at that time to prove this, and even Andrej himself sort of acknowledges that it's worse by some small delta (but has other advantages; well, shipping cars comes to mind).

So yes, sensors have to be fused, it's complicated, it's not clear what the best combination of sensors is, the software might be larger with more moving parts, the ML model might not fit, a larger team is hard to manage, entropy - whatever. Still seems suspicious. Not sure what Tesla can do at this point to erase that; they can say whatever they want, we have no way of validating that.


Maybe you're right, I don't care about Tesla drama.

Here is one possible perspective from an engineering standpoint:

Same amount of $$, same amount of software complexity, same size of engineering teams, same amount of engineering hours, same amount of moving parts. One company focuses on multiple different sensors and complex fusion with some reliance on AI. Another company focuses on limited sensors and more reliance on AI. Which is better? I don't think the answer is clear.

The other point is that I am arguing that many people are overstating the importance of the sensors. They are important, but far more important is the post-processing. Any raw sensor data is a poor actual representation of the real environment. It is not about the sensors, but about everything else. The brain or the post-sensor processing is responsible for reconstructing an approximation of the environment. We have to infer from previous learned experiences of the 3D world to successfully navigate. There is no 3D information coming in from sensors, no objects, no motion, no corners, no shadows, no faces, etc. That is all constructed later. So whoever does a better job at the post-processing will probably outperform regardless of the choice of sensors.


People absolutely get that. Their issue is that Tesla is relying only on visual data and then, on what is a disingenuous basis, insisting that this is okay because humans "only need eyes", or some similar sort of strawman argument.


Okay so they are "good faith" gambling? I don't want to drive in a car that has any gambling... I don't get how it being in good faith (generous on your part) makes it less of a gamble?


Uhh, highest accuracy and bandwidth for what? You can have a camera that can see a piece of steak at 100K resolution at 1000 FPS, but that doesn’t mean you can use a camera to replace a thermometer. Blows my mind how people eat up that cameras can replace every sensor in existence without even entertaining basic physics. ML is not omnipotent.


For the specific task of (for example) cooking a steak it’s not hard to envision a computer vision algorithm coupled with a model with a some basic knowledge of the system (ambient temperature, oven/stove temperature, time cooking, etc.) doing an excellent job.


No, I can't envision this. Surface texture alone will not tell you if meat is cooked. There is no getting around the temperature probe.

Now, simple color matching models are used in some fancy toasters on white bread to determine brownness. That's the most I've ever seen in appliances...


You cannot use vision to see the state of the side of a steak touching the pan, nor the internal temperature.


Tesla engineers are currently doing post-commit review of Twitter source code. Focus is the last thing I would credit them with.


I don't think it was your intent, but your statement makes it seem like all Tesla engineers are looking at Twitter code. I bet this number is closer to 4.

Tesla has ca. 1000 software engineers working in various capacities. The ca. 300 that work on car firmware and autonomous driving are probably not participating in the Twitter drama.


4 people is enough to review source that was written by thousands of engineers at Twitter? That's not even enough for an architecture review.


I don't think the goal is to review all Twitter source. That should be the job of the (new?) development team. I think the goal was to look at the last 6 months of code, especially the last few weeks, for anything devious.


All of them?


do you have data to back this claim?


Have you been keeping up with the Twitter deal? This was covered Friday.

https://www.bloomberg.com/news/articles/2022-10-27/tesla-eng...

And the corresponding HN thread: https://news.ycombinator.com/item?id=33365065


I have not been following the whole thing super closely, no. Thank you for the links.


> "Team focus on vision which is by far the highest accuracy and bandwidth sensor allows for a faster rate of safety innovation given a constant team size."

By hiding the ball that you are starting from a much more unsafe position


> vision which is by far the highest accuracy and bandwidth

They are literally the least accurate of all sensors.

Radar tells you distance and velocity of each object. Lidar tells you size and distance of each object. Ultrasonic tells you distance. Cameras? They tell you nothing!

Everything has to be inferred. Have you tried image recognition algorithms? I can recognise a dog from 6 pixels; the image recognition needs hundreds, and has colossal failures.

We have no grip on the results AI will produce and no grasp on its spectacular failures.

Driving will have to be solved without AI



