Tesla has bet the company on robotaxis, but their vision-only tech stack doesn't seem capable of solving the problem, which matters because Tesla has repeatedly promised FSD is right around the corner, or less than a year away. It's hard to believe Karpathy would step down if he felt they were close to solving it.
This announcement comes after a four-month sabbatical in which Karpathy said he wanted to take some time off to “sharpen my technical edge,” which makes it sound like this is the result of frustration with the technical approach rather than burnout.
FSD beta tester here. I think they are a minimum of 3 years away from anything exciting in the beta, but the localization, mapping, and visualization are not the reason. I don't think LIDAR would contribute substantially to improvement.
The fundamental flaws are in the decision-making: it is based on 10-30 seconds of feature memory, it ignores features outright, and it depends only on visible road features instead of persisted map data.
For instance, near my house there's an intersection where it will try to use a turn-only lane with a red arrow when it's trying to go straight thru a light. 100% of the time. Even if I'm in the correct lane with no traffic around.
That's because the turn arrow on the ground is worn off. It is not a perception problem; it is:
a) deliberately ignoring the obvious red turn arrow signal's significance for lane selection,
b) making no attempt to persist or consult map data for 'which lanes go where' (toy sketch of this below), and
c) completely disregarding the painted lines on the ground in the "no drive here" striping.
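Roughly what I mean by (b), as a toy sketch; the map format and function are made-up illustrations, obviously not Tesla's actual code:

    # Toy sketch: prefer persisted lane-connectivity data over (possibly worn-off)
    # visible paint when choosing a lane. Map layout and interface are invented.
    LANE_MAP = {
        # (road_id, lane_index) -> allowed maneuvers for that lane
        ("main_st_eb", 0): {"left_turn_only"},
        ("main_st_eb", 1): {"straight"},
        ("main_st_eb", 2): {"straight"},
    }

    def allowed_maneuvers(road_id, lane_index, perceived_markings):
        # Persisted map data wins when the paint is missing or unreadable.
        persisted = LANE_MAP.get((road_id, lane_index))
        if persisted is not None:
            return persisted
        # Otherwise fall back to whatever the cameras can still read off the pavement.
        return perceived_markings or {"unknown"}

A lane whose persisted maneuvers don't include "straight" should never be selected when the route says to go straight through the light, no matter how faded the arrow is.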
Also one block from my house, FSD will stay still indefinitely waiting for trash cans (displayed as trash cans) to clear the leftmost lane so that it will turn left.
None of the failures I encounter are due to lack of perception.
> I think they are a minimum of 3 years away from anything exciting in the beta
I think that's optimistic. Ten plus years.
There's too many exceptions. In my home town, an intersection. Imagine, if you will:
Traveling westbound, there are four straight-ahead lanes. There's a traffic light for each straight lane. The traffic lights will alternate "left two straight, green; right two straight, red" and then "left two straight, red; right two, green". It does this because there's a tunnel and a roundabout right there. I guarantee that FSD will choke on this.
If the pavement is clearly marked I would give it a good chance. We have a 6-way intersection with similar lighting that it succeeds at... other than completely and recklessly disregarding the clearly visible "no turn on red" sign.
We also have a road with a reversible lane in the middle. Thankfully it recognizes the red X sign for this. Unfortunately, if it's driving for 11 seconds without seeing another one, it will suddenly decide to merge into the oncoming lane to make an upcoming left...
Completely agree. I think this is part of the hazard of the FSD training ethos that 'San Francisco is the hardest case.' It completely dodges the issue that California actually maintains safe road markings compared to most other states.
I think in Japan or China this approach could actually work, but there is zero hope for it to work generally in the US.
Tesla has been collecting thousands of dollars, each, from car buyers and utterly failing to deliver what it represented, and keeping the money year after year. Would we let GM, Toyota, or Audi do this? Where is the criminal prosecution? Where are the refunds?
Elon Musk has literally said in a public conference keynote that robotaxis would be running in 2019, that they would have been earning Tesla owners $30k/yr for the past three years, and that it would be financially foolish to buy anything but a Tesla.
Juries have in the past often not been swayed by fine print, where consumers are involved. It seems to be there mainly to dissuade angry consumers from suing, rather than to win cases based on it.
I vividly remember a conversation I had circa late 2018 with an acquaintance who had recently taken an engineering position with an AV company. He claimed that they were at most 1 year away. In fact, his exact words were something along the lines of "They're already here. It's just a few edge cases to work out and some regulatory hurdles to overcome."
The reality is that the first 80% of the problem had been solved quickly and significant progress had been made at the time on the next 10%. The end was in sight. Unfortunately, that next 10% ended up taking as long as the first 80% to solve, and the final 10% will likely take decades if it's even possible.
It's naive to call yourself a beta tester. People who have access to the beta are basically just part of a PR exercise by Tesla, one that tries to hide the lack of meaningful progress behind multiple missed deadlines.
You aren't beta testing a complex automation system that's operating on public roads. To do any meaningful testing you would need a defined operational domain, specific behaviors to test, a direct line to the engineering team to report issues, etc., etc.
You're thinking of QA testing. Beta testing is the final stage of testing before release and aims to exercise the system in as close to a real-world environment as possible - that's why betas are often seeded to the general public: you want to find out whether your system fails in some configuration or circumstance that you didn't anticipate in internal testing. Better to have it fail for a limited number of users who signed up for that possibility than to have to do a Cyberpunk-style recall.
> You're thinking of QA testing. Beta testing is the final stage of testing before release and aims to exercise the system in as close to a real-world environment as possible - that's why betas are often seeded to the general public: you want to find out whether your system fails in some configuration or circumstance that you didn't anticipate in internal testing.
I'm thinking of being an unwitting crash-test dummy on public roads and sidewalks for "Full Self Driving". They just slap the binding legal agreements and waivers on their drivers, but it is tantamount to fraud. I'd rather this didn't exist on public roads in such numbers, and that people didn't have to pay to be Tesla's unpaid test drivers, beta testers, and data points instead of being considered people.
QA testing doesn't exist for FSD. And you're applying standards used to test apps for sharing dickpics to software that's moving killing machines through a shared environment with other people.
It's not acceptable to have it fail for a limited number of users, because my kid, who may get killed by the failure, didn't sign up for it.
How would you feel if this were the standard for aircraft? "Let's just push this change to a small portion of the fleet, and boy, if that plane crashes, it's soooo much better than if all our planes crashed. High five, where's my bonus?!"
That's not 3 years away; it's lacking fundamental knowledge of the world, which might require AGI to begin with. It is just shitty driver-assist features sold as much more.
it definitely needs a broad knowledge base about all things road/traffic/sign related, but that's definitely not G nor fundamental.
still a very hard problem.
but what it requires is a consistent world model. and if it detects that the current stretch of road is incomprehensible for its model it should safely disengage.
of course Tesla opted for the cowboy version of this, and turned the confidence of the model up .. without having a good model :|
3 years was my minimum, assuming they scrap their current model and begin working on one with real geo-spatial awareness now.
They have the data for it already.
To over-essentialize the problem with their approach, you simply cannot safely drive in Atlanta as a person who has never seen the streets. Humans are no exception. You have to know how each road works, and remember that for next time.
Isn't Tesla supposed to be producing Optimus, their human-like android, next year?
Elon has been over-promising (i.e. flat-out lying) about self-driving every year since... 2014 (there's a YouTube video compilation of it)?
It seems like his strategy is to just come up with increasingly grandiose promises every year when he fails to deliver on his past promises. He's trapped in his swirling vortex of bullshit. Very worrying to see Karpathy leaving...
And Jobs' "reality distortion field" was primarily used to motivate / burn-out his team by demanding things that were technically very difficult but feasible (one of the original anecdotes was about reducing the boot-time for the original Macintosh), not something like solving AGI which FSD would ~apparently require.
I fully believe Elon and Jim Keller truly thought the deep learning HW delivered by Tesla was capable of self-driving in the early 201x time frame because, frankly, a lot of us did. More than likely, like the rest of us, they slowly came to see that the problem was much more difficult, and deep learning much less capable, than we had originally envisioned. Most of us didn't make a bunch of stupid promises to shareholders based on the original, incorrect assumptions though. Unlike Elon, we're able to admit we were wrong.
As someone who has worked for almost a decade with safety-critical software, system design, and functional safety analyses (both aerospace and automotive), I don't think I ever met anyone who believed they would be anywhere near FSD/Level 5/what-have-you in the timeframe they presented.
The only ones I met who thought so were managers, software developers, and ML people, i.e. people who have never in their life seen the level of effort it actually takes to get something qualified to RTCA or other safety standards.
> As someone who has worked for almost a decade with safety-critical software, system design, and functional safety analyses
And you were right, but most people aren't like you, and people like you don't drive the discussion on what trends are coming. The world needs more folks with your background. Generally no one wants to listen to folks like you because you come off as a Debbie Downer despite being correct most of the time.
Hah, to be fair I am somewhat of a Negative Nancy from time to time ;)
I think Mercedes has a reasonably sensible game-plan as far as I know. They currently have Level-3 on some highways in Germany which is still freaking hard to achieve, but still a narrow enough scope that you might be able to pull it off.
Time will tell; my bet is it's still 10+ years off.
The ML approach for FSD was criticized right from the start by the Mobileye CEO, who gave a very thorough talk on why it couldn't possibly work. I think the talk is still on YouTube.
The guy was criticized for being old school and not having adapted to the latest tech (including by Elon Musk when he dumped Mobileye), but in fact he was just plain right.
Until any FSD solution delivers, it's far too early to vindicate the Mobileye CEO. As it stands, even with additional (expensive) sensors, it seems very likely that ML will be required regardless. Running purely off sensors is not going to be feasible when braking distances alone are greater than the range the sensors can accurately measure.
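For a rough sense of the numbers (my own back-of-envelope, not from anyone's talk): stopping distance grows with the square of speed, plus whatever you travel during system latency.

    # Back-of-envelope stopping distance: d = v*latency + v^2 / (2*mu*g).
    # mu, latency, and the example speed are illustrative assumptions.
    def stopping_distance(speed_ms, mu=0.7, g=9.81, latency_s=0.5):
        return speed_ms * latency_s + speed_ms**2 / (2 * mu * g)

    print(round(stopping_distance(31.0)))   # ~85 m at ~112 km/h (70 mph)

So at highway speeds you need reliable perception out to roughly 100 m just to stop for a stationary obstacle, before you even account for wet pavement or worse.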
He did give a second talk a few years later explaining how the purely cognitive aspect may be handled by ML, and how the total system is probably going to end up being a mix.
But his fundamental opposition to ML for things related to safety was, IMHO, absolutely correct.
You don’t think SpaceX and Tesla have delivered quite a lot? FSD is really the only big project I can think of this far behind, and maybe Cybertruck. I don’t think it’s fair to ignore all his other accomplishments.
Solar rooftops haven't gone anywhere. The Boring Company and Hyperloop didn't go anywhere.
Let's face it, Neuralink isn't going anywhere either.
Tesla wasn't started by Musk and relies heavily on Panasonic battery developments, and SpaceX and Starlink are good old-fashioned ex-NASA and MIT talent.
What does Musk actually do? Annoy the SEC, go on Joe Rogan to smoke weed, and have more girlfriends per hour than Genghis Khan?
I made my point. Pointing to additional examples of grandiosity (some might call ambition) doesn't invalidate it. Neither do attacks on his character or lame arguments like "Tesla wasn't started by Musk" ignoring that he grew the company from 4 people to 100,000 people. Or that somehow recruiting large amounts of NASA and MIT talent is a bad thing? That's good leadership.
Seriously? Either you're trolling or you have some irrational hatred of the guy.
Oh you don't get what I mean. Yeah those things are good and I'm grateful for the progress SpaceX is making. What I want to know is how much is really attributable to Musk and what his Ego is really worth at the end of the day.
Let's remember the only reason he's around today is because some very rich people and NASA bailed him out. Left to just Musk's ability, ego, and resources, SpaceX would have died in 2008.
He relies so much on mysticism for parting people from their money, eventually some people get fed up with the 'mysterious future and repeated failure' business model.
Expectations around his personal abilities should be lowered. His track record is not that great, and that's a fact people call trolling. Sorry, not sorry.
The abilities that made SpaceX and Tesla were hundreds of highly skilled engineers. Elon just raised money and is the brand face for Japanese battery tech.
He's not worth worshipping and his incompetence is covered by idolatry.
HN safely ignores anything positive about Elon because they prefer to think the richest man on the planet is both an idiot and also a failure in everything he does
I don't recall Tesla ever saying they were producing Optimus as a product any time soon. He's said there will be a prototype demo later this year. A prototype is very far from a finished product. As a product businesses/consumers can buy, it's at least 5+ years from now.
Well, to be fair, he only has to hit _that_ goal - at which point he can go back in time at his leisure and fix all the others. And he could hit even the time machine goal as late as he wants, and it won't matter.
Something something 'light cone.' If the machine was turned on a million years ago in another star system and then brought to Earth, the Primer model would put you in the other star system when you climb out of the machine. And then you'd need to make your way back to Earth. (It's unclear to me whether you would continue aging while sitting in the box and waiting for a million years to go by in reverse... But I assume you would.)
So I think the 'other civilization' thing only works if they're already approximately here with a running machine.
I think we can tell for certain that time travel will never be invented because if it were to happen, then logically we’d already be experiencing people visiting us from the future.
Yeah, but Google's vision + lidar tech doesn't seem any better at solving it either. They have been working on this problem the longest and they aren't even confident enough to produce a product with it. Google is probably the leader in AI and AI research. They are also the leader in data and mapping. They have billions of cash to play with. Yet it seems like they haven't gotten any closer to solving this problem either.
They are just going about it better by not trying to sell it.
Any reason why everyone seems to be stuck on this problem?
Waymo and Cruise routinely have driverless cars on city streets. In California, all collisions, however minor, have to be reported, and DMV posts them on their web site.[1] Most are very minor. Here's a more serious one from last month:
"A Cruise autonomous vehicle ("Cruise AV") operating in driverless autonomous mode, was traveling eastbound on Geary Boulevard toward the intersection with Spruce Street. As it approached the intersection, the Cruise AV entered the left hand turn lane, turned the left
turn signal on, and initiated a left turn on a green light onto Spruce Street. At the same time, a Toyota Prius traveling westbound in the
rightmost bus and turn lane of Geary Boulevard approached the intersection in the right turn lane. The Toyota Prius was traveling
approximately 40 mph in a 25 mph speed zone. The Cruise AV came to a stop before fully completing its turn onto Spruce Street due to the
oncoming Toyota Prius, and the Toyota Prius entered the intersection traveling straight from the turn lane instead of turning. Shortly
thereafter, the Toyota Prius made contact with the rear passenger side of the Cruise AV. The impact caused damage to the right rear door,
panel, and wheel of the Cruise AV. Police and Emergency Medical Services were called to the scene, and a police report was filed. The
Cruise AV was towed from the scene. Occupants of both vehicles received medical treatment for allegedly minor injuries."
Now, this shows the strengths and weaknesses of the system. The Cruise vehicle was making a left turn from Geary onto Spruce. Eastbound Geary at this point has a dedicated left turn lane cut out of a grass median, two through lanes, a right turn bus/taxi lane, and a bus stop lane. It detected cross traffic that shouldn't have been in that lane and was going too fast. So it stopped, and was hit.
It did not take evasive action, which might have worked. Or it might have made the situation worse. By not doing so, it did the legally correct thing. The other driver will be blamed for this. But it may not have done the thing most likely to avoid an accident. This is the real version of the trolley problem.
I think the way I would say it is, after a few tens to hundreds of thousands of such events, we'll be able to estimate the parameters of the human preference functions associated with a non-fatal trolley problem (the real trolley problem is irrelevant to the car ML folks, they are not aiming to maximize global utility) and that will help guide our confidence in rolling out more.
For all the woe and gloom in the news reporting, Google (and Cruise)'s rollouts have been more or less what I expected: no enormous accidents that were clearly caused by a computer, but instead, a small number of small accidents usually due to the human driver of another vehicle doing something wrong. That seems to lead towards greater acceptance of self-driving cars and confidence that they are roughly as good as an attentive newbie.
The next big situation, I think, will be some really large-scale pileup with massive damages and deaths, and a press cycle where the self-driving car gets blamed. But the self-driving car collected a forensic quality audit log, which of course will aid the police in determining which human caused the accident.
I've been reading those DMV reports for years, and there are clear patterns which repeat. One is where an autonomous vehicle started to enter an intersection with poor sight lines to the cross street. Sensing cross traffic, it stopped, and was then rear-ended by a human-driven vehicle that was following too closely. Waymo has had that happen twice at the same intersection in Mountain View. There's a tree in the median strip there which blocks the view from their roof sensor until the vehicle starts to enter the intersection. So the Waymo system advances cautiously until it has good sensor coverage, then accelerates or stops as required.
Humans tend not to do that, and, as a result, some fraction of the time they get T-boned.
AI should absolutely mimic the behavior of real (good) drivers.
Although that's not what you're describing here, another problem for AI could result from it knowing more than an average driver; for example, if a high-mounted LIDAR were able to see around corners and let the car decide it's "safe" to do a turn that no human would attempt for lack of visibility, that could cause problems.
(Also, it's surprising that an autonomous car doesn't detect that another car is following it too closely and slow down appropriately in anticipation. How is this not taken into account?)
An autonomous vehicle has to protect its occupants at the expense of everything else (or, at the very least, appear to do so in a convincing manner), because otherwise no one will step inside.
(At the very least, if a machine is going to sacrifice me or my family to save a third party, I need to know what hierarchy it is following, and how it was decided and by whom.)
But what this incident seems to illustrate is that it's difficult for a self-driving car to share the road with human drivers and behave like a human -- meaning, allowing human drivers to anticipate what it will do.
I drive a motorcycle and a bike in Paris; the reason I'm still alive is because after so many years of this I can tell what all the other cars will do at all times, before they know it themselves.
But an autonomous vehicle that would behave so differently from a human driver as to be unpredictable, would be terrifying.
Totally agree; beginner and truly dangerous drivers share a quality: unpredictability. Even the most aggressive, fast drivers can easily be accounted for when defensive driving. Unpredictable drivers are how collisions happen when speed and weather are not factors.
Even the simple automatic emergency braking features in my Model 3 result in some dangerously unpredictable behaviour at times. I have them set as off and insensitive as possible and they still do some awful stuff from time to time.

I was on a road trip in northern Ontario this week passing a service vehicle moving very slowly along the shoulder on the right. I was doing 105 km/h in the right lane of a 3-lane road: 2 lanes in my direction and 1 opposing. The 2nd lane switches directions every 5-10 km to serve as a passing zone. The posted speed limit is 80 km/h, but prevailing speeds on this road are 90-110 and you'd never be ticketed for anything under 110 in this region. There was a truck gaining on me coming up on the left, so I signalled left and partially moved over to give the service vehicle some room, but didn't fully take the left lane, to let the faster truck know I would let them through as they came past. Very common pattern on this type of road and circumstance.

The AEB slammed on the brakes as I came level with the service vehicle despite the fact there was plenty of space to complete the pass. I wasn't expecting my car to slow down, let alone apply emergency braking force, so in my surprise I nearly collided with the service vehicle. I have no idea what the other drivers thought, but neither of them could have possibly been predicting or expecting me to slam on the brakes. I was really upset with the car since it took a highly dangerous action in an otherwise perfectly safe and common situation.

And that was the automatic stuff that can't be turned off; I don't let Autopilot drive ever because it is the worst type of driver: unpredictable. The choices it made in the extremely short time I tested it about lane placement, follow distance, defensive driving (and the complete lack thereof), and general behavioural cues provided to other drivers were genuinely terrifying.
> By not doing so, it did the legally correct thing. The other driver will be blamed for this. But it may not have done the thing most likely to avoid an accident. This is the real version of the trolley problem.
It performed the only sane solution to the problem - stopping. You can't possibly predict what the most likely thing is to avoid an accident, because there is a multitude of factors beyond just 2 cars moving towards each other. Are there pedestrians present? Other parked cars? Storefronts with customers inside? How close to the sidewalk? Trees? Construction/Debris?
Once we get AI working better than 99.9% of humans on roads with speed limits of up to 30 mph - we can expand it to faster roads and introduce advanced behaviors. But for now, stopping in uncertainty is the best available option.
> It performed the only sane solution to the problem - stopping. You can't possibly predict what the most likely thing is to avoid an accident, because there is a multitude of factors beyond just 2 cars moving towards each other. Are there pedestrians present? Other parked cars? Storefronts with customers inside? How close to the sidewalk? Trees? Construction/Debris?
Not only are there other factors, but you also need to predict what the other human driver will do. As stated in a sibling comment, this is a huge part of safe driving.
> Once we get AI working better than 99.9% of humans on roads
But what you stated above requires AGI, so what's the plan to get there? It's even more blurry than commercial fusion at that point.
Yes, this applies to autonomous vehicles being tested in the state. Tesla skirts this rule by reporting their system (including the supposedly ‘Full Self-Driving’ beta) as level 2 driver assistance features. That marks them out of scope for the regulation of reporting collisions (and crucially system disconnections).
Because self-driving has a bunch of tricky edge cases and most of them will kill people. Problems with hundreds of important edge cases cannot be solved by simply throwing more training data at the problem; that's how you solve AI problems in a "dumb" manner, and it works for lots of problems (like recognizing dogs in images) -- but not for self-driving.
To solve the self-driving problem we need "smart" A.I., which means we have to approach it with systematic engineering, and the solution will probably involve some combination of better sensors, introspectable neural nets, symbolic A.I., and logical A.I.
I remain convinced that "real" self driving (as in: go ahead and sleep in the backseat) will never happen without changes to road infrastructure and possibly some sort of segregation between robot-driven cars and people-driven cars.
Things like traffic signals that actively communicate their status to nearby robot cars (more than just a red lamp that can be occluded by weather, other vehicles, or mud on the camera lens). Or lane markings that are more than just reflective paint, but can be sensed via RF. Rules around temporary construction that dictate the manner of signage and cone placement that the robot cars can understand. The cones might have little transponders in them, I don't know.
But without a massive leap forward in AI capability, our current road system—optimized for human drivers over the past century—is not going to work.
If we can't make the cars just as smart as an alert and capable driver, then maybe we need to meet halfway and make the roads a little "dumber" (simpler) to accommodate the robots.
> Things like traffic signals that actively communicate their status to nearby robot cars (more than just a red lamp that can be occluded by weather, other vehicles, or mud on the camera lens). Or lane markings that are more than just reflective paint, but can be sensed via RF. Rules around temporary construction that dictate the manner of signage and cone placement that the robot cars can understand. The cones might have little transponders in them, I don't know.
The problem you then face is that any of those could be forged / faked without some way of securely validating the message. You could cause absolute chaos by driving down the road broadcasting false messages. It's a little harder to hack and modify traffic light signals, for example. But we've also seen hackers screw up Tesla cars by sticking stuff on the back of their car to deliberately mislead it based on vision.
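What "securely validating" might look like, as a minimal sketch; the message format, key distribution, and freshness window here are all assumptions, not any real V2X standard:

    # Hypothetical sketch: verify a signed traffic-signal broadcast and reject
    # stale messages. Message layout and the 2-second window are invented.
    import json, time
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey
    from cryptography.exceptions import InvalidSignature

    def accept_broadcast(payload, signature, pubkey_bytes):
        try:
            Ed25519PublicKey.from_public_bytes(pubkey_bytes).verify(signature, payload)
        except InvalidSignature:
            return None        # forged or corrupted: ignore it
        msg = json.loads(payload)
        if abs(time.time() - msg["timestamp"]) > 2.0:
            return None        # stale: a replayed capture fails the freshness check
        return msg             # e.g. {"signal_id": ..., "state": "red", "timestamp": ...}

Of course that just moves the problem to key management, which is exactly what other comments here point at.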
> You could cause absolute chaos by driving down the road broadcasting false messages
Even without self-driving cars, an "attacker" can go into a theater and yell "fire" and cause a stampede.
They can get a high-viz vest and clipboard, and stand in intersections directing cars to take detours they don't need and holding up traffic.
My point here is that society has a lot of trust baked in. We trust people don't just yell "fire" without reason. Just because it's FSD cars doesn't mean people will start broadcasting the equivalent of "fire" constantly. It's already easy to cause accidents.
Those attacks don't scale. They're also tied to a person or people who need to be physically present, making it possible to arrest them.
The consequences of potential attacks on centrally-orchestrated traffic are a lot more severe. Hack the control node, and you can stop traffic nation-wide. Or cause mass accidents that overwhelm first responders. And they can be executed by anyone, anywhere in the world, for a cost within range of many medium-size corporations (let alone nation states).
I won't comment on the challenges of the approach Tesla et al. are currently taking, but I don't think central control is the panacea commenters in this thread are making it out to be (and I'm personally glad this isn't the route we're pursuing).
An attacker can already do this, at scale. Whether it be overriding traffic lights to show green in all directions, or taking down critical air traffic control systems.
It’s like arguing that we can’t possibly build autonomous cars because then someone might turn it into an autonomous bomb.
Keep in mind that solving this is “worth” about 40,000 lives a year in the US - nearly $1 trillion in economic damages a year in life and property.
Bad things can always be done with good tools. As always, you provide layers of protection that make sense and in the end must rely on the underlying fabric of civilization to persevere.
Bad bad things.
I've had my Tesla on a highway lose GPS precision and believe I was on an adjacent local road..
It immediately reduced speed from 60mph to 25mph .. aggressively.
This falls into a pattern of Tesla autopilot/NoA where it just doesn't seem to have much memory or foresight.
For example the car is driving itself on the highway, it knows it's been on the highway, for 20 minutes. I am not even in the exit lane, it knows what lane I am in. How could it think I am suddenly on the local road below the highway based solely on the GPS pin movement in the span of a second, without having moved to the exit lane and gone down the exit ramp?
For an example of lack of foresight - the car will happily speed towards an obvious semi-distant slowdown right until it needs to aggressively brake from 60mph down to 30mph as it approaches following distance of the nearest car. I also find it can get really weird in stop&go traffic, not easing up to speed or down to a stop very well, as if it has only GO or STOP.
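A toy illustration of the kind of memory I'd expect: a map-matcher with a sticky prior on the road you've been on can't be teleported to the frontage road by one noisy GPS fix. All names and numbers below are made up:

    # Toy map-matching sketch: a penalty for switching roads means a brief GPS
    # excursion doesn't relabel a highway drive as the local road below it.
    def match_road(candidates, gps_dists, current_road, switch_penalty=90.0):
        best_road, best_score = None, float("inf")
        for road in candidates:
            score = gps_dists[road]               # metres from GPS fix to road centerline
            if road != current_road:
                score += switch_penalty           # stickiness: prefer the current road
            if score < best_score:
                best_road, best_score = road, score
        return best_road

    # One noisy fix that lands nearer the local road still matches the highway;
    # only sustained evidence (actually taking the exit ramp) should flip it.
    print(match_road(["highway", "local_rd"], {"highway": 40.0, "local_rd": 8.0}, "highway"))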
> For example the car is driving itself on the highway, it knows it's been on the highway, for 20 minutes. I am not even in the exit lane, it knows what lane I am in. How could it think I am suddenly on the local road below the highway based solely on the GPS pin movement in the span of a second, without having moved to the exit lane and gone down the exit ramp?
Oh wow. This happens often with a car-mounted GPS (or on a phone) and it's pretty annoying. Sometimes the GPS instructs you to do a U-turn at the next available fork in the road, and it takes a moment to understand what's going on.
But in a self-driving car it's terrifying! And absurd.
Who controls the signing keys and the whole signing process? What about key revocation if someone steals the key? Will a municipality in Texas really be willing to not be allowed to create a new stoplight without approval from the federal agency in charge of the keys? What’s to stop someone stealing a “real” stoplight from bumfuck nowhere and putting it in the middle of the 101 at rush hour? What about replay attacks? What about signal jamming? Etc. etc.
You raise some genuine possible concerns but generically I'd probably tend to just say, "laws will stop them" just like they already stop someone deliberately endangering lives by placing a real fake stop light in the middle of the 101 at rush hour.
The real concern would be whether someone can engineer a terrorist-level mass-scale attack, but as long as it requires physical tampering, that adds up to a tremendous amount of work. So if the signalling is largely burned into fixed infrastructure, it eliminates a lot of that, or at least sets the bar high enough that it's probably more work than various other types of attack that are likely to be just as impactful.
ADS-B (used everywhere for airplanes) already works this way, and there is no authentication or signing whatsoever. It's a single global broadcast frequency (1090MHz).
> The problem you then face is that any of those could be forged / faked without some way of securely validating the message. You could cause absolute chaos by driving down the road broadcasting false messages.
> I remain convinced that "real" self driving (as in: go ahead and sleep in the backseat) will never happen without changes to road infrastructure and possibly some sort of segregation between robot-driven cars and people-driven cars.
Yep, I talk to people working in traffic engineering, and their mindset is always building new road tech and road-side and cloud infra to support autonomous driving. They have no expectation of fully autonomous vehicles without road and infrastructure assistance.
And from a historical perspective, the arrival of the automobile and the replacement of horse and other animal carts was facilitated exactly by the transformation of roads, which has been the single largest-scale infrastructure project in human history.
It makes no sense that an even bigger transformation of the vehicle would require less drastic road transformation.
Yes. You bring up a great point: With some infrastructure improvements we could have "virtual" trains. Vehicles would talk to each other on the highway and organize into closely-spaced convoys. Only the lead vehicle would need an active human driver; the rest could follow at extremely close distances--drafting off each other--and would not need a human driver (or their human drivers could go off-duty). The point of using asphalt rather than rails is that it makes it easy to switch between individual car mode and automated virtual train mode.
This idea is not new and it mostly applies to freight convoys but I think it also has merit for ad hoc passenger car convoys on long highway trips.
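As a toy sketch, the core of such a convoy follower is just a gap controller; the gains, the time gap, and the vehicle interface here are invented placeholders, not any real platooning spec:

    # Toy constant-time-gap follower for a "virtual train" convoy: accelerate or
    # brake to hold a small, speed-dependent gap to the vehicle ahead.
    def follower_accel(ego_speed, gap, lead_speed, time_gap=0.5, kp=0.4, kv=0.8):
        desired_gap = 2.0 + time_gap * ego_speed      # metres; tighter than human headway
        return kp * (gap - desired_gap) + kv * (lead_speed - ego_speed)

The hard parts are everything around it: vehicle-to-vehicle comms, fault handling when a follower loses the link, and human-driven traffic cutting into the gap.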
> With some infrastructure improvements we could have "virtual" trains. Vehicles would talk to each other on the highway
I think the idea is to get rid of the roads and have the robo vehicles travel instead on tracks. That increases fuel efficiency and bypasses a lot of AI challenges.
Not only that. Overhead electrical solves a lot of the mining/environmental impact of batteries.
It’s the low-key case that Elon will be remembered poorly (like Robert Moses’ rapidly degrading legacy) for
1) Having the wrong vision for EVs (but successfully executing on it anyway)
2) Making space travel cheap (thereby increasing the amount of carbon energy dedicated to it) without really improving an average human’s quality of life
I wonder if all the "hype" from the Ubers and Googles about imminent self-driving is not actively harmful in that it is probably sucking out the oxygen from any perceived need to implement these things.
But for people, in the same way as for parcels, power, cable, and fiber-optic, there's the last-mile problem. Europe has almost solved it with high-speed trains and efficient city transit. But then, it solves nothing for all those folks living in rural areas and having to drive F-150s.
> But then, it solves nothing for all those folks living in rural areas and having to drive F-150s.
Fortunately that's not a lot of people.
When electricity was being rolled out, Westinghouse and Edison didn’t bellyache about not being able to provide electricity to rural areas. They electrified all the cities.
Rural areas will just never be able to pay for modern infrastructure. And… that's OK, it's not a lot of people.
I think the point was that in rural areas they'll still use cars or something, because it's way more efficient for the low density there. In cities, you would switch away from cars as much as possible. I don't think the commenter meant we'd "abandon the hicks" or something rude like that; the rural folks who farm and do other important jobs are a crucial part of our society for sure.
> Things like traffic signals that actively communicate their status
This is a thing in Europe, and even some US cities - my Audi has traffic sign recognition and, at a compatible intersection, knows what state the light is in (via radio, not by looking at the light), and how long until it changes (it will show a countdown in seconds till the next light change).
> optimized for human drivers over the past century
Are our roads really? Most in cities over a certain age are just haphazard relics of times gone by, and don't get me started on "stroads" which are good for nobody
> If we can't make the cars just as smart as an alert and capable driver, then maybe we need to meet halfway and make the roads a little "dumber" (simpler) to accommodate the robots.
This aspect of FSD has always fascinated me and I'm a little surprised it doesn't get more discussion. Meeting halfway. At what point could/would/should FSD influence the environment around it?
For example - a poorly painted road sign*. Tesla/Waymo could say "We cannot support L5 FSD on this road until you fix this sign." If it meant a step forward in autonomy, Tesla/Waymo could even offer to share the cost of that improvement!
There are a million reasons why implementation of that would be problematic. Costs and incentives would be all over the place. But I am more interested in the framing: the machines are the ones that need to adapt. Which is essentially hoping for continued hardware improvements or a spaghetti mess of if/else statements, i.e. "do this weird thing if you see this other weird thing in front of you". Can we get rid of the weird thing and avoid the engineering challenge altogether?
* Yes, this is an overly simple example. Some environment changes could be so large that they would require a full redesign of a city/buildings/traffic patterns. But surely there are classes of improvements where some are easier than others.
massive AI leap? but we can ask language models about how to drive through an intersection and it will list the steps correctly (and wildly incorrectly at other times)
what's missing is combining this kind of "human concept relations" model (language, rules, minimal reasoning, text encoded human preferences) with perception, and safety (which means that the model should know that if other cars are driving just fine in front then it's unlikely that the road is on fire, or that the low certainty crack in the road is okay if two other cars already went over it unimpeded, if the road marks and the signs are inconsistent, but other vehicles have formed a slow but consistent pattern of traffic then that's the local ruleset, and so on)
it's still a very hard problem. and the required amount of compute is still bonkers, the required amount of data and training is still absolutely huge, and the whole problem of safely disengaging, handling the asleep/drunk passengers (likely target audience after all)... are all hard problems too :)
You can get a self driving car ride and sleep in the back in Arizona, from Waymo. If you're on a special list, it sounds like you can do it in SF today with Cruise. Your claim is manifestly false.
`possibly some sort of segregation between robot-driven cars and people-driven cars.` just build out public transit at that point, what's the difference.
The pattern of zoning and development that the U.S. has used for the past 70 years assumes that nobody wants to walk or cycle anywhere and that they feel so strongly about it that they want cities to be built such that people will die if they attempt to do so. I grew up in a neighborhood where the elementary school was about half a mile away... on the other side of a five lane highway with no crosswalk and no sidewalk on the other side. Zero chance to assert your independence as a kid. Zero chance to live responsibly as an adult.
A part of the reason people wish they had FSD so badly is because they want to be rescued from this fundamental failure of NA-style urban planning that necessitates driving, all the time, across both short and long distances.
This is changing though. Cities are getting bikes and bike lanes. They’re building dense developments close to transit. ebikes + protected bike lanes + TOD might be the solution.
I certainly hope so, and I've since moved to a place that is much more friendly for cycling; however if you go back to where I'm from (which for the purposes of this topic really could be one of any untold hundreds of counties in the U.S.), there are no signs of improvement or change. It's just the largest urban cores that seem to slowly be getting it. There is a breathtakingly large amount of suburban sprawl where the conversation about this topic is completely broken, and for the people willing to admit there even is a problem, they believe that the solution is one more lane, or one more stop sign, or just educating drivers a little bit better and nobody wants to talk about things like traffic calming, or upzoning, or cycle paths or anything else that makes anything other than car trips safe and desirable.
This is a bit of an oddity of US urban planning. The US has maybe 10 big to very big cities, where people can get around on foot, and then a host of what are called cities in the US but are essentially just collections of shopping centres, office parks, industrial estates, vaguely near each other.
It doesn't work like that in most countries; small cities and towns are navigable on foot and public transport.
Public transport often means buses, though, so self-driving AI is still very relevant there.
If you could replace double-decker buses that arrive every 15–30 minutes with self-driving minibuses that arrive every 3–5 minutes, that would be great! (for everyone except the bus drivers who lose their jobs)
Typically in cities the biggest constraint on bus frequency is bus congestion at bus stops (and to a lesser extent congestion in bus lanes), not number of buses. To the point where banning cash on buses can meaningfully increase system throughput, as it decreases lag time at bus stops. This can be alleviated with planning where people have to transfer to get anywhere (moving buses out of chokepoints) but in practice people don't like that.
Actually, I suspect this makes self-driving a particularly _bad_ solution for city buses; getting into the bus stops takes some manoeuvring, particularly when there are other buses there.
One place that self-driving buses could be interesting (and indeed there are already a couple of systems like this) is on fully/near-fully segregated lines, where they don't have to deal with human-operated traffic. Another would be small towns, but you're looking at full magic level 5 at that point.
I think a lot of those problems are solvable in principle, although unfortunately it’s probably not practical and incremental enough to actually happen...
I’m envisaging a system of minibuses, either AI-driven or at least dynamically directed by a central control system. People would use a phone app to book journeys; the app would tell them where to get on and where to transfer, and the central control system would optimise the fleet to get everyone where they need to go.
With much smaller buses, and electronic tap-in rather than cash payments, stops should be fast enough that buses can just queue up in order at each stop, hopefully mitigating the parking difficulties you mentioned (modulo breakdowns, medical emergencies, etc). Likewise, if transfers are fast and easy, hopefully they’d be less objectionable to travellers.
Requiring a phone isn’t ideal as it limits accessibility and privacy. There could also be pre-printed tickets, with QR codes that you scan at the bus stop to see the route info.
Why minibuses and not just taxis? I suspect there’s a good balance to be made between efficient road usage (buses) and efficient routing for each traveller (cars).
I’m also envisaging that if this system were to take off, personal cars could be gradually removed from city centres! Again, that makes life easier for the AI vehicles.
This is all pie-in-the-sky stuff, I know; but I do feel like there are ways we could radically improve city transport mostly using existing roads, rather than building new rails or tunnels.
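To make the central-control idea concrete, a toy greedy dispatcher might look something like the sketch below; a real fleet optimiser would batch requests, plan transfers, and re-route continuously, so treat everything here as a deliberate oversimplification:

    # Toy dispatcher: assign each ride request to the nearest minibus with a free
    # seat. Stops are points on a 1-D line purely to keep the example short.
    def dispatch(requests, buses):
        # requests: list of (request_id, pickup_stop)
        # buses: {bus_id: (current_stop, free_seats)}
        assignments = {}
        for req_id, pickup in requests:
            best, best_dist = None, float("inf")
            for bus_id, (stop, seats) in buses.items():
                dist = abs(stop - pickup)
                if seats > 0 and dist < best_dist:
                    best, best_dist = bus_id, dist
            if best is not None:
                assignments[req_id] = best
                stop, seats = buses[best]
                buses[best] = (stop, seats - 1)    # reserve a seat on that bus
        return assignments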
A social one where you use experience to make inferences about what is going to happen next. Who hasn't observed an aggressive driver coming up from behind, weaving around other cars and said to themselves something like "that guy is going to cut me off, better ease off the gas so there's a little extra room in front for when he does". That level of situational awareness isn't coded for is it?
Edit: I have a whole mental model for other drivers and different approaches for them. Someone driving like a grandma? Pass when available. Nervous/erratic/lost driver? Keep extra distance then pass as soon as possible. Aggressive driver? Relax, give some space and let them get ahead. And so on. I get that stereotyping is bad but ignoring the subtle signals other drivers give off seems like it would be myopic. An AI that doesn't anticipate what others will do on the road will always be reactive rather than proactive.
Most self-driving vehicles aren't just coded; they incorporate some form of ML as well. Categorizing patterns of behavior is well within the reach of ML algorithms, but my understanding is we have far more basic problems to solve first (Tesla's system seems to struggle with object tracking over time, which would be a necessary first step to recognizing patterns).
Developing object tracking and a sense of object permanence would be a pretty big prerequisite.. And here I was thinking about the RL model needed to decide what to do in the presence of other agents.
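A minimal sketch of what "object permanence" means at the tracking layer: a track survives a few frames of missed detections instead of being forgotten. The thresholds and the nearest-neighbour matching below are illustrative assumptions, not any production tracker:

    # Toy track persistence: detections and positions are scalars to keep it short.
    class Track:
        def __init__(self, track_id, position):
            self.id, self.position, self.misses = track_id, position, 0

    def update_tracks(tracks, detections, max_misses=15, match_radius=2.0):
        unmatched = list(detections)
        for track in tracks:
            hit = next((d for d in unmatched if abs(d - track.position) < match_radius), None)
            if hit is not None:
                track.position, track.misses = hit, 0
                unmatched.remove(hit)
            else:
                track.misses += 1                  # unseen this frame, but not forgotten
        survivors = [t for t in tracks if t.misses <= max_misses]
        survivors += [Track(len(survivors) + i, d) for i, d in enumerate(unmatched)]
        return survivors

The pedestrian who stepped behind a parked van still "exists" for a while; predicting what they'll do next only makes sense once that much is in place.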
How many people have been killed by Cruise or Waymo cars for the number of miles they've driven?
I maintain the position that if self-driving cars handle the most common situations as well as average human beings, there won't be a strong drive to make cars that drive significantly better than humans. Cruise and Waymo are collecting a limited version of that dataset right now. It's unclear how many deaths because some details are being kept under wraps.
I disagree with you. On the contrary, I think it can only be solved by throwing more data at it, but much more data than we have available now, to first get a smart AI that has common sense about the world (a world model). Then we can start trying to drive in it.
The counterexample to your point is that brand new 16-year-old human drivers seem to do a pretty good job of driving with very little training data. Yes, they do kill people and they're much worse drivers than 40-year-olds who have much more training data, but the fact that 16YOs kill as few people as they do with so little training data means some other mechanism than raw data quantity is important.
> Any reason why everyone seems to be stuck on this problem?
Because it's really, really difficult. A lot of AI-ish stuff pretty rapidly gets to the point where it _looks_ quite impressive, but struggles to make the jump to actual feasibility. Like, there were convincing demos of voice recognition in the mid-90s. You could buy software to transcribe voice on your home computer, and people did. And, now, well, it's better than in the mid-90s certainly, but you wouldn't trust it to write a transcript, not of anything important. Maybe in 2040 we'll have voice recognition that can produce a perfect transcript, and human transcription will be a quaint old-fashioned concept. But I wouldn't like to bet on it, honestly.
And voice recognition is arguably a far, far easier problem.
I'm old enough to remember all the breathless pronouncements in 2015 about how self-driving cars would be everywhere within 5 years. I was skeptical then and I'm still skeptical that it will ever happen outside of some limited range of well known paths that have been pre-determined to be safe - ie we'll have (and we do have) some self driving cars but they'll essentially be on a closed course and will not leave the course. There are already some small bus lines like that - the bus just goes around a closed circuit picking up and dropping off people always taking the same route.
I'd argue that voice recognition is in a worse place now — It creates a FALSE sense that it can do a good transcription, when it is actually corrupted in the worst way.
I took part in a legal deposition where an "AI Transcription Software" was being used. When I received the transcript it had numerous errors, but they were all subtle. More common names were inserted in place of the name that I said, e.g., "Kennedy" instead of "Kemeny". "You have a [something]" was transcribed as "I have a [something]", completely reversing the meaning. And many more errors.
The common thread between the errors was that what was inserted into the transcript would have been the MOST EXPECTED word or phrase, instead of the ACTUAL MORE SURPRISING (surprising in an information-theory way) word or phrase. It's evident that on top of the phoneme recognition layer, this transcription software checked questionable items against tables/graphs of most likely words to occur near the other words it confidently identified in that context. Makes a transcription sound great, but it is WRONG.
The result was that the "AI Transcription" actively destroyed key information and hid that destruction under the guise of a smoothly edited transcription.
Although this surely was not the intent of the system's creators, I cannot think of a better way to make a more evil transcription system.
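A toy illustration of the mechanism (all scores invented): combine an acoustic score with a language-model prior, and the rare name loses to the expected one even when the audio slightly favours it.

    # Toy decoder: acoustic evidence slightly favours the rare name, but the
    # language-model prior drags the output toward the common one.
    import math

    candidates = {
        "Kemeny":  {"acoustic": 0.55, "lm_prior": 1e-7},   # what was actually said
        "Kennedy": {"acoustic": 0.45, "lm_prior": 1e-4},   # what "sounds expected"
    }

    def best_word(cands, lm_weight=1.0):
        return max(cands, key=lambda w: math.log(cands[w]["acoustic"])
                                        + lm_weight * math.log(cands[w]["lm_prior"]))

    print(best_word(candidates, lm_weight=0.0))   # acoustic only -> "Kemeny"
    print(best_word(candidates, lm_weight=1.0))   # with LM prior -> "Kennedy"

Tuned this way, the transcript reads fluently precisely because the surprising (i.e. information-carrying) words have been smoothed away.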
I also imagine it is a space where...you very quickly get to a product that seems magical...but you can see in the data the places where your magical product will regularly be unable to make clear decisions. Unlike a lot of traditional product development, that knowledge makes it very hard for you to release, because the liability and ethical problems are very real.
It's somewhat unfair because we used to simply collect less comprehensive data on performance and therefore know less about our corner cases - but you don't get to live in the future without dealing with the problems of the future.
> You could buy software to transcribe voice on your home computer, and people did. And, now, well, it's better than in the mid-90s certainly, but you wouldn't trust it to write a transcript, not of anything important.
Is this the fault of software or people generally frequently using bad sound equipment in poor and noisy conditions, such as talking on the phone while driving, on the street, poor connection, wind/rain, etc. ?
Personally, because English isn't my first language, I frequently "fail" to transcribe what is being said and have to ask the other person to repeat themselves. It seems like an AI voice transcriber in 2022 is going to work better than me.
Even in ideal recording conditions, machines are quite bad at transcribing human speech. Look at subtitles on live TV shows, typically produced these days by machine transcription; they're typically barely usable.
>Any reason why everyone seems to be stuck on this problem?
ML maximalism focused on the narrow problem of 'solving driving' while not recognizing that any task as complex as driving requires probably something closer to general intelligence, and theoretically the field has been impoverished in favor of "throw more graphics cards at everything".
Couldn't agree more. Especially when it comes to city driving, which would obviously be necessary for robotaxis, when AI zealots promise "it'll be here in a year or two", I always wondered "Have these people ever driven in the city?" I mean, to drive in a city, you basically:
1. Need to understand all standard signage (seems possible with AI).
2. Need to understand all "unstandard" signage (not sure how possible).
3. Need to understand the cop with the thick NY accent yelling at you saying "Can't you see there's been an accident and the road is covered with glass you dufus? Turn the F around."
I can certainly see AI solving the problem of driving in specially designed limited access highways (which could also support normal human drivers), and that alone would be a huge benefit, but I never saw how so many were willing to make the leap to "robotaxis that can drive you anywhere in the city."
6. Need to reasonably predict what that human that just made eye contact with you would likely do next, and how that's different from what he might do when he doesn’t make eye contact with you. And all of that differs if you're in NYC or SF or small town, Indiana
7. Need to constantly read other drivers' body language (or "car language"?) and infer what their intentions are. I can constantly tell what a driver is literally thinking by picking up on their subtle motions: drifting over slightly (they're looking to change lanes but haven't yet turned on their blinker), or being slow to respond to green lights or to the acceleration of forward traffic (e.g. they're on the phone, texting, etc.). I even routinely look at the eyes of the driver in front of me by looking in their rear-view mirror (yes, this is entirely possible and actually helpful). Are they looking down at their phone in their lap?
8. Generally understand the true context of things you're seeing. E.g a 4x8 sheet of plywood in the road, a squashed piece of road kill, a large metal pipe, a stopped vehicle, a baby carriage, a dishwasher, a tumbling large slab of styrofoam (all things I've encountered in the last year).
9. Is there a stopped firetruck in front of me (I list this because of the ludicrous fact that a Tesla plowed full-speed into such an obstacle).
7. Need to understand that drunk person staggering along the roadside has been repeatedly slipping off the sidewalk and there's a non-zero chance they trip and fall right in front of you.
8. Need to understand that sometimes the stripes you see aren't the real stripes, you're driving near sunset and are seeing the sun reflected off of old stripes that were painted over with glossy black (wtf WSDOT).
7.5. Need to understand that the drunk person staggering _inside_ the robotaxi has just thrown up in the back seat. After they are dropped off, the robotaxi cannot pick up any new customers.
This is a dance move known as the "tenderloin lurch". Well, not really; the Tenderloin Lurch is when a person staggers up to a crosswalk, waits until they have a NO WALK sign, and proceeds to cross while ignoring all the drivers who are about to run them over.
I've been in situations like this. In SF. I grew up here.
What you describe is the classic west coast non-confrontationalism. You're probably annoying enough that it's simply not worth it to counter anything you say because you just dive into a petty and self-aggrandizing argumentative mode.
Brash and confused conservatives think everyone agrees with them, but in reality people don't want to get caught up in absolute bullshit by joining any kind of interaction with them. This is a common pattern by now. You would do well to recognize it.
Even then, even if you can solve every case involving actual roads with perfect markings and intact signs and functioning signals, I’ve found myself just this week:
- driving across an unmarked grassy mound to park a car at a store in their designated area.
- paying a fee with coins to enter and exit a toll road.
- stopping to move around roadworks based solely on hand signals from one of the workers.
These aren’t even scratching the surface in terms of edge cases that could be encountered regularly.
People point at statements like "it just needs to drive better than the average driver" and "well lots of idiots drive".
On the flip side, lots of intelligent people with other skills do not have the confidence & comfort to drive outside of their town's local roads, or at all.
There are problem sets for ML like quant trading or phone album image tagging where you just need to be right 51% of the time to make money / be novel & useful.
We've seen plenty of examples showing ML is not appropriate for life&death problems like being the primary reviewer of diagnostic imaging for cancer screening.
The bar for ML to drive a car is much much higher, and I'm not even sure we have the right sensor suites feeding into the models, not to mention the compute capacity to operate at low enough latency.
Driving is a social problem, not a technical one. It's functionally the same as walking down a crowded sidewalk. The car is just a tool, just an extension of our bodies.
We can't build a robot which can walk down a sidewalk without running into people either. The sensor tech and mapping fidelity are red herrings. People drive well because only people are good at predicting human behavior.
This. I'm always amazed by a skilled taxi driver maneuvering their car down a busy street in Tokyo with bustling pedestrians and bikes, some of them old and slow, others ignoring traffic lights, just passing them by centimeters. It's as if the car is a part of their body and they're threading through a crowd. They have to anticipate everyone else's moves. The cars really behave like people here, and others treat them as such.
Sort of. Your wording actually assumes down to its core that driving is inherently social, when in fact I think it may be only provisionally social. Driving currently "is social" in a few senses, but of course the important (and obvious) one is that it currently involves accounting for human-operated vehicles.
Alternatively, an autonomous vehicle operator in a homogenous network full of other autonomous operators has capabilities and characteristics that greatly simplify failure modes. Maybe even majority autonomous, partially heterogeneous? You can literally slow or stop the whole show to deal with a catastrophic event. It’s still “social” but probably much reduced from the scenario where you’ve got the full scope of human expressivity behind the wheel.
The REAL problem is how do we take our roads to the crossover point where those simplified network features become accessible.
> “This assumes our roads are used exclusively by autonomous vehicles”
I not only assume it, I say it out loud: “[fully autonomous, …] maybe even majority homogenous, partly heterogenous?”
> “When our roads are not used exclusively by motor vehicles to begin with.”
I’m not following what you’re saying here in the context of the earlier clause. If you mean not used exclusively by autonomous vehicles, yes, and that’s why I’m pointing out the provisional aspect.
You know, I could be snide, and say, “We’ve already invented that — it’s called a train.” :D
But I’ve thought the same, too. If every car on the road is robot-controlled then it changes the problem. Modulo failures, discrete algorithms should behave predictably towards each other, like the unix API philosophy.
It seems hard to get there, though. Even today it’s a PITA to maintain API boundaries in simple libraries, never mind make sure that the new Tesla v12.4 Full Self Driving For Real This Time doesn’t trigger edge cases in Volvo v7.7a Actually Real Self-Driving We Promise.
Can we make software that allows cars to behave as predictably as rail cars, but without the rails? Maybe, but I expect only on limited-access freeways. I'm sure these robot-driven cars will remain incompatible with common road use cases like pedestrians, cyclists, and children chasing balls.
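To make the API-boundary worry concrete, here's a minimal, purely hypothetical sketch (in Python) of what a versioned, vendor-neutral vehicle-to-vehicle "intent" message could look like. None of these field names or this protocol exist anywhere; the point is only that version negotiation and a fallback behaviour would have to be part of any such standard.

    from dataclasses import dataclass
    from enum import Enum

    class Intent(Enum):
        KEEP_LANE = "keep_lane"
        LANE_CHANGE_LEFT = "lane_change_left"
        LANE_CHANGE_RIGHT = "lane_change_right"
        EMERGENCY_STOP = "emergency_stop"

    @dataclass(frozen=True)
    class V2VMessage:
        protocol_version: tuple   # (major, minor); reject unknown majors
        vehicle_id: str
        intent: Intent
        speed_mps: float
        time_to_action_s: float

    def compatible(msg: V2VMessage, supported_major: int = 1) -> bool:
        # Only act on messages whose major version we understand; otherwise
        # treat the sender like any unpredictable human-driven vehicle.
        return msg.protocol_version[0] == supported_major

    msg = V2VMessage((1, 3), "volvo-demo", Intent.LANE_CHANGE_LEFT, 27.0, 1.5)
    print(compatible(msg))  # True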
I paid ~$10 for two rides after signing up as a regular ole user in Mesa AZ. It was great, the first ride was a bit nerve wracking, but the second felt very nice.
I certainly wouldn't argue with you that it isn't ready for prime time and wide distribution, but it is interesting to see their progress in San Francisco, a much different driving problem.
If it takes them 10 years to get to prod in Mesa, two (maybe three?) in SF, maybe they start shrinking that a lot in metros without winters. ¯\_(ツ)_/¯
Waymo has superior performance based on their historical statistics. It makes sense, since their lidar sensors capture more of the environment, and directly in 3D. Their AI also seems better QA’ed.
The Tesla AI Day[0] surprised me as it showed they only had a simple architecture for a very long time, simply feeding barely processed camera pixels to a DNN and hoping for the best with little more than supervised learning off human feeds. Their big claim to glory was that they rearchitected it to produce a 2D map of the environment… which I thought they had years ago, and which is still a far cry from the 3D modeling that is needed.
After all, sure, we humans only take two video feeds as input… But we can appreciate from it the position, intent, and trajectory of a wealth of elements around us, with reasonable probability estimates for multiple possibilities, sometimes pertaining to things that are invisible, such as kids crossing from nowhere when near a school.
Cruise also seems to have better tech; they had a barely-watched 2h30 description of their systems[1] which shows they do create a richer environment model, evaluate the routing of many objects, and train their systems on a very realistic simulation, not just supervised training, which means it can learn from very low-probability events. They have a whole segment on including the probability that unseen cars may be approaching from perpendicular roads; Tesla's hit-or-miss creeping is well documented on YouTube.
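For illustration only, here is a toy stand-in (not Tesla's or anyone's actual architecture) for the "camera pixels in, 2D map out" pipeline described above: frames from several cameras go through a small shared backbone and get projected into a flat bird's-eye-view grid. All sizes and layer choices are made up.

    import torch
    import torch.nn as nn

    class ToyCameraToBEV(nn.Module):
        def __init__(self, n_cameras=8, bev_size=64):
            super().__init__()
            self.backbone = nn.Sequential(        # tiny stand-in for a real backbone
                nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d((8, 8)),
            )
            self.to_bev = nn.Linear(n_cameras * 32 * 8 * 8, bev_size * bev_size)
            self.bev_size = bev_size

        def forward(self, frames):                # frames: (batch, n_cams, 3, H, W)
            b, n, c, h, w = frames.shape
            feats = self.backbone(frames.view(b * n, c, h, w)).view(b, -1)
            bev = self.to_bev(feats).view(b, 1, self.bev_size, self.bev_size)
            return bev                            # per-cell "occupancy" logits

    bev = ToyCameraToBEV()(torch.randn(1, 8, 3, 128, 128))
    print(bev.shape)                              # torch.Size([1, 1, 64, 64])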
> After all, sure, we humans only take two video feeds as input…
No, we use both our perception and proprioception when we drive, or walk for that matter. We have two acute visual sensors mounted in a housing with six degrees of freedom and a sensor feedback mechanism.
We also have fairly sensitive motion and momentum sensors all throughout our bodies. Additionally we have audio sensors feeding into our positional model.
All this data is fed into an advanced computing system that can trivially separate out sensor noise from useful signal. The computing system controls the vehicle, navigates, and even performs low priority background tasks like wondering if the front door of the house was locked.
We have dozens of integrated sensor inputs. It's just silly to assume we only use our eyes when driving. They're certainly an important sensor for driving but definitely not the only one.
All the vision-only maximalists also ignore how insufficient Tesla cameras are compared to your human eyes in terms of resolution, dynamic range and refresh rates.
If you ever review your Tesla Dashcam footage you'll see that you can rarely even make out a license plate. The cameras are not even HD, let alone UHD.
Further, the refresh rate is a paltry 36 fps.
Drive the car at high noon or at night and you realize how poor the dynamic range is, with highlights and/or shadows getting lost.
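A back-of-the-envelope check of that complaint, with every number an assumption for illustration (character size, field of view, sensor resolution), not an actual Tesla spec:

    import math

    char_width_m = 0.05     # width of one license-plate character (assumed)
    distance_m = 30.0       # typical following distance
    fov_deg = 50.0          # assumed horizontal field of view of the main camera
    h_pixels = 1280         # assumed horizontal resolution (roughly 720p-class)

    angular_size_deg = math.degrees(2 * math.atan(char_width_m / (2 * distance_m)))
    pixels_on_char = angular_size_deg * (h_pixels / fov_deg)

    print(f"~{pixels_on_char:.1f} px across one character")   # ~2.4 px

A couple of pixels per character is nowhere near enough to read a plate, which matches what the dashcam footage shows.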
> Yeah, but I am sure Tesla's software can do the same.
They absolutely cannot. They won't even try, they require a driver to be there to be ready to take over with no notice. Their software also makes so many basic mistakes that even what they're allowing it to do is dangerously reckless.
There's been a lot of progress despite AVs not meeting initial hyped predictions. Waymo and Cruise are operating driverless robotaxis in SF. We're probably a couple of years from many major cities having them.
We often can figure out how to take a product that works 90% and bring it to 99%, and then to 99.9%. The engineering challenges involved in each nine are often vastly more than the percentages indicate, but they're conceivable. With AI we have absolutely no idea how much effort might be required to get to that next level of reliability. We hope that bigger models, or better AI technology might get us there, but there's also a chance that they won't.
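A toy way to put numbers on what each "nine" means for driving, assuming (purely for illustration) one independent decision per second:

    decisions_per_hour = 3600   # assume one decision per second of driving

    for reliability in (0.90, 0.99, 0.999, 0.99999):
        bad_per_hour = decisions_per_hour * (1 - reliability)
        print(f"{reliability} -> ~{bad_per_hour:.2f} bad decisions per hour")

    # 0.9     -> ~360.00 bad decisions per hour
    # 0.99    -> ~36.00
    # 0.999   -> ~3.60
    # 0.99999 -> ~0.04 (roughly one every ~28 hours)

The gap between "impressive demo" and "unsupervised robotaxi" lives in those last couple of rows.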
Probably because it's really, really, really hard to solve the thousands of edge cases that occur in real-world driving situations. I don't think FSD happens until government gets behind it and starts putting infrastructure behind it. If we start building roads (and cars) to be highly visible to AI one way or another, it all becomes much easier.
>If we start building roads (and cars) to be highly visible to AI one way or another, it all becomes much easier.
Or we could build 1-dimensional roads, which would make the AI's job much easier. Like, we could put down two parallel pieces of metal, which vehicles could "hook" onto somehow...
> Any reason why everyone seems to be stuck on this problem?
Because they're all trying visual or line-of-sight methods only. I call this the "robo-human" fallacy in ML: trying to automate the processes that humans undergo so that you eventually have a drop-in replacement for a human. But that is a myopic and unimaginative approach, because you could be re-assessing the system itself and eliminating inefficiencies that lead to poor performance.
In the autonomous vehicles space, there is massive potential for self-organizing swarm algorithms to control pelotons of cars, rather than individual cars with no intrinsic sense of the general flow of traffic. You wouldn't need a top-down "commander" style architecture, it could be designed so that cars only talk to their immediate neighbors and emergent patterns keep traffic flowing smooth and fast.
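As a toy illustration of that neighbor-only coordination (the update rule and constants here are invented, not from any real system): a line of cars where each one reacts only to the car directly ahead and behind still diffuses a sudden slowdown through the whole peloton, with no central commander.

    def step(speeds, gain=0.3):
        """One update where each car nudges toward the average of its
        immediate neighbours' speeds (m/s)."""
        new = []
        for i, v in enumerate(speeds):
            ahead = speeds[i + 1] if i + 1 < len(speeds) else v
            behind = speeds[i - 1] if i > 0 else v
            target = (ahead + behind) / 2       # only immediate neighbours
            new.append(v + gain * (target - v))
        return new

    speeds = [30.0, 30.0, 30.0, 12.0, 30.0]     # one car brakes hard
    for _ in range(10):
        speeds = step(speeds)
    print([round(s, 1) for s in speeds])        # the dip smooths out along the line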
I have always been skeptical of the attempts to reduce the amount of information about the road that a car receives. (Moving from stereoscopic to monocular vision to save the cost of one camera seems just stupid.) But people who dream of "smart cities" really seem to see little more than The Jetsons in their mind, and it limits the scope of research to our detriment.
I’d like to hear an actual response from the people downvoting this comment. It’s an interesting perspective, and while I can imagine legit criticisms (SPOF, privacy concerns, coordination with vehicles outside the system, etc.), actually articulating them would yield a more productive discussion than downvoting without comment.
How is AI supposed to confidently distinguish a real stop sign from someone/something holding up a picture of a stop sign?
Yes, this is a weird edge case, but I think it gets at the core issue being that it takes way more sophistication to release this tech into the wild than people would like to admit.
Has Zoox even gotten the permits to do fleet-scale driverless testing? Last I heard, they were still limited to 2 vehicles on specific streets in Foster City. Doesn't seem very solved.
Google/Alphabet is stuck and will make advances in specific territories but will never get there without a fundamental change in approach. Their approach relies on very detailed mapping/modeling of specific terrain, so they can make a usable case sooner, but outside of the map/model territory, they're literally lost. And maps/models change constantly and rapidly.
Tesla is taking a fundamentally more broad and deep approach - working with the fundamental fact that a pair of visual sensors and a compute engine (eyes & brain) can successfully figure out driving in strange areas in real time, ergo, it should be possible without a map/model or lidar. Once they get it solved, it will be solved once and for all. Bigger gamble, bigger payoff.
Equipping the car with dozens of eyes is the easy part. The question is whether enough compute power can be brought to bear on solving the recognition problems, and the edge cases. They have obvious issues with failing to recognize large objects like trucks in unexpected orientations, left turns, etc.
Using millions of miles of live human driver data as a training set is great, except that the average driver is really bad, so it's entirely polluted with bad examples, ESPECIALLY around the edge cases that get people killed. There, examples from professionally trained drivers, who really understand the physics and limits of the car, adhesion, traffic dynamics, etc., are what you want to train on, but that isn't what they have.
It is also possible that even if the set of training data would actually be sufficient, the big question will kill them - perhaps the solution requires orders of magnitude more compute power to approach human performance, and they just don't have the hardware to simulate human compute power. So, have they just hit the limits of what their compute power can do?
I think Tesla's approach is fundamentally the way to go, as it is a general solution, compared to everyone else's limited map/model approach.
But both may require more specifically programmed higher-level behaviors, and/or something much closer to AGI than exists: something that has actual understanding of the machine-learned objects and relationships, which does not yet exist (if one is known, please correct me - I'd love to know about it).
Uber bet the company on robotaxis and lost. Tesla is still building very good cars that happen to not be able to drive themselves. Just like every other car. If they could lose their obsession with self-driving and just focus on their incredible cars, they'd still make money.
They are very much "default alive", BUT I get the sense that a lot of the company's plans and valuation are based on the idea that Tesla will not "just" produce competitive electric cars. They could make money just doing that - but they would never meet the expectations around their company's trajectory that way, and so they would be a "bad" investment (in ROI terms). So I think that everyone who is still working on the "more than a car maker" goal is going to insist the company will be more than that until they really can't.
Their leadership also doesn’t know when to STFU. There was a time when Tesla was _the_ luxury electric vehicle. But Musk is polarizing. He’s isolating formerly loyal customers and more and more compelling alternatives are coming to market every day.
Tesla is so overvalued only because of their claims of self driving. Musk utters a lot of bs but he's right that without this Tesla is doomed.
They have a big head start, but other car companies are now investing much more in battery tech etc. and will quickly catch up. Not to mention Teslas have terrible build quality, and they have a lot of shady business practices, like overcounting sales and reusing sold parts, which came out in the recent leak.
Things like the 4680 battery, which was so hyped, turn out to be not that much better. FSD is years away. Stop selling vaporware.
What they need to focus on is things they innovated on like OTA updates, integrated systems, no dealerships etc.
They have terrible manufacturing quality. The screens melt in Arizona heat. Maybe the ride is cool and feels good, but the car itself is not incredible.
How many other cars do you know that turn their A/C on while stationary for ‘cabin overheat protection’? My 2021 model 3 did that this week while sitting in the sun in northern Ontario and was helpful enough to send a push notification to let me know.
This is not to helpfully make sure the driver gets into a nice cool car when they finish their errand. It’s because the non-automotive-grade screen isn’t rated or tested above 40C and was failing in the heat. Rather than upgrade the part and do a recall, Tesla opted to just use battery to keep the cabin below 40-45C.
I love EVs, but that’s some pretty dodgy behaviour.
I take your point but to me this seems like a feature rather than a bug. Many of the plastic parts in most cars (not just Teslas) are degraded by oxidation in the greenhouse heat of the car being parked in the sun for a long time. Some of that oxidation is caused by UV, but to the extent that it's caused by heat, keeping the interior of the car somewhat cooler should prolong the life of its interior plastics.
This is not really an option in an ICE car because it would require running the engine. In an EV, it's easy.
Vinyl and plastic degradation is a years-long process before it has any real impact on the performance of the parts. Failure or yellowing of the non-automotive-grade screen in a Model 3 has been seen in just a few weeks or months.
I've put a lot of miles on my ModelY in Phoenix in the summer. Phoenix in the summer is not my idea of fun, but the screen (and the rest of the car) worked fine.
I think the whole "terrible manufacturing quality" thing is a talking point people seized on. They had problems with manufacturing quality as they brought up their whole factory/assembly process -- was their manufacturing quality ever "terrible"? Meh, I think it was kind of "we have kinks to work out." Nowadays, they seem fine.
> If they could lose their obsession with self-driving and just focus on their incredible cars, they'd still make money.
They may want to think about that strategy soon. Model 3 is starting to seem dated (not to mention Model S, which is ten years old). There are very competitive alternatives on the market now that have strengths where Tesla is weak, and which are not especially weak in the areas Tesla is strong.
Model 3 is still the best electric car for most people. It's just too expensive now, but so are the others. No one else has a battery that's as efficient, the charging network, convenience features like the key card, or integrated navigation (although CarPlay/Android Auto are good too); the handling is much sportier, they hold their resale value, and there are no dealerships.
For a sedan, it's pretty compelling. But I expect my wife's next daily driver to be a crossover size, and more likely than not (at this point) a Mach-E. The range is there, the interior is good, CarPlay, and $15K less than a Model Y.
There are certainly some questions that are hard to answer. Maybe the Tesla will continue to hold value. But given that I see them everywhere now, they're feeling more like a commodity every day. I'm not sure how it will play out, the market is dynamic and recent events have been very disruptive. All EVs are selling out months ahead of production at this point.
In what sense is the Model 3 seeming dated? It's still one of the best electric cars on important measures like range and efficiency. It also has access to hands down the best charging network and is well loved by its owners, despite the well-documented problems.
Bjørn Nyland's review of the 2022 Model 3 Performance [1] confirms your view that the Model 3 is absolutely not dated. He tests scores of EVs and still considers the Model 3 the best balance of comfort, features, technology, performance, etc. If anybody here is seriously concerned that the Model 3 may be "dated," I suggest watching the video below.
Eh, looks dated to me. But I had an early one for a while. They're everywhere (at this point I lump 3 & Y together since for all practical purposes they're the same design).
Range is okay, but Tesla overestimates. Mine lost about 1.2 miles of range for every mile driven, and as a practical matter it was more of a 200 mile car than a 300 mile car, unless you really wrung it out. There are a bunch of 300+ mile EVs either on the roads now or releasing in the next few months. The only Tesla that has a range worth bragging about is the 400 mile version of the Model S. I do look forward (hopefully!) to regular cars having that kind of range, instead of just the top end ones.
I personally think Tesla is playing with fire leaving the quality so low. Well over half of all Model 3s have to return for repair within the first month. That's terrible. Recall how long it's taken other domestic manufacturers to regain any kind of reputation for quality, and even then many people will never believe they make good cars. For what a Tesla costs, people have high expectations. They're enjoying fad status right now, and that's great, but there is no shortage of Tesla owners already who've sworn off ever buying another one.
How so? They're not selling robotaxis or building factories to build them.
> Tesla has repeatedly promised FSD is right around the corner
Which means it's years away and/or "FSD" means "automatic cruise control and lane keep assist" or whatever standard feature from auto manufacturers they've renamed
Because they chose to back themselves into that corner. Musk says that Tesla is worth nothing without full self-driving. Certainly it's the only thing left to justify the stock price.
The lies have been profitable so far. People have bought into the false promises. Perhaps they'll start demanding refunds for the full self-driving they paid for that has still not been delivered.
And I also think they could do some clever stuff with home HVAC, possibly using waste heat from crypto miners as the H.
Lastly, afaik they went camera-only in their cheap cars (3/Y) and still use fancy stuff in the S/X.
They gave fsd customers new computers once. What’s to stop them from going back to vision+lidar or whatever once the parts are available and retrofitting as needed?
The whole ‘vision-only is better’ gag seemed like an obvious ploy to keep being able to ship cars from the beginning of supply chain problems.
And yeah, Elon does not appear to be a good person.
> And I also think they could do some clever stuff with home HVAC, possibly using waste heat from crypto miners as the H.
That doesn't sound... particularly clever. Crypto miners are essentially disposable, with a useful economic lifetime of a couple of years, typically. And they are essentially 100% efficient at turning electricity into heat. Which sounds quite good, until you consider that an air-source heat pump typically delivers around 300% (a COP of about 3), rising to up to 500% for ground source.
So, once your crypto miner is no longer really making anything (a couple of years), you're left with a really inefficient heating system.
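Rough numbers behind that comparison (typical COP figures, not measurements):

    heat_sources = {
        "crypto miner / resistive heater": 1.0,   # ~all input electricity becomes heat
        "air-source heat pump": 3.0,              # typical COP around 3
        "ground-source heat pump": 5.0,           # up to ~5 in good conditions
    }
    for name, cop in heat_sources.items():
        print(f"{name}: ~{cop:.0f} kWh of heat per kWh of electricity")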
> Tesla has just confirmed that it has removed radar from the Model S and Model X as of mid-February 2022, moving its entire lineup to what it calls ‘Tesla Vision,’ which is an array of cameras that Tesla says negates the need for radar. The manufacturer did the same for the Model 3 and Model Y in May of last year and even though this prompted some questions from the IIHS, the institute is now fine with it after testing.
> This announcement comes after a 4 month sabbatical where Karpathy said he wanted to take some time off to “sharpen my technical edge,” which makes it sound like this is the result of frustration with the technical approach instead of burnout.
That's exactly what I would expect someone burning out to say. You feel the burnout so you need time to get over it and feel 100% (regain your technical edge). You're still burnt out after 4 months, so you don't come back.
Frustration with the technical approach can also cause burnout.
Yes, the vision-only approach isn't something that you do in robotics.
There is usually a hierarchy of sensors, mainly for redundancy.
Example: bumper sensors at the wheel base, sonar/lidar at mid height, and a camera at the top for advanced sensing.
For the sake of cost cutting, Tesla has done away with the radar sensors at the front of the vehicle. Keeping them is a real cost overhead, but removing them has very real repercussions for safety, since radar also provides a "ground truth" for what at least the front-facing cameras are seeing.
I don't think Lidar is a practical sensor for them to adopt, because it is quite bulky and has limited viewing angles, but I would expect them to have adopted some novel, lower cost radar solution.
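A minimal sketch of what that radar-as-ground-truth cross-check could look like (thresholds and weights invented purely for illustration, not taken from any production system):

    def fused_range(camera_m: float, radar_m: float, max_disagreement: float = 0.2) -> float:
        """Distance estimate (metres) to the object ahead, cross-checking
        a camera depth estimate against a radar range."""
        if abs(camera_m - radar_m) / max(radar_m, 1e-6) > max_disagreement:
            # Sensors diverge badly: fall back to the closer, more conservative reading.
            return min(camera_m, radar_m)
        # Sensors agree: weight radar more, since it's usually better at range.
        return 0.7 * radar_m + 0.3 * camera_m

    print(fused_range(42.0, 40.0))   # agreement -> blended estimate (~40.6 m)
    print(fused_range(80.0, 35.0))   # disagreement -> conservative 35.0 m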
Apart from the lower cost of the cameras, I think Elon's rationale for a camera-only FSD is not valid, and it has made the problem needlessly complex and unsafe. He believes that since we have eyes and can drive a car, eyes should be sufficient to drive the car; but we only use eyes because those are the sensors we were born with - they're the best we have. In my mind, Elon's approach is like looking at a horse and deciding to build a car based on a horse, where instead of wheels you have four mechanical legs. Those mechanical legs are limited in so many ways; they might still "work", but there is no reason to limit locomotion that way. The same goes for the vision system in FSD: the whole spectrum of light is available, in any number of configurations, providing data at rates and with precision far beyond what a camera system can do.
IIRC there was a presentation from Karpathy talking about the challenges with sensor fusion, particularly in resolving divergence between e.g. the vision and the radar stacks: https://www.youtube.com/watch?v=NSDTZQdo6H8&t=1949s
My background is in physics, but I find myself having a growing appreciation for the vision-only stack. It's really challenging to build a formal understanding of the world that is robust to outliers as numerous as those you meet navigating an urban environment. With vision, you have multiple kinds of information that are highly correlated (colour, spatial distribution, depth, etc.) and self-consistent. Whereas fusing radar with vision, where object responses to radar are highly geometry- and material-dependent, is a much harder task.
I'm really not an expert, so this reads more as an opinion than an experienced view, but I can see the merits in doubling down on vision.
Monocular vision only seems pretty clearly not capable of solving the problem. Stereo/multi view systems have a shot (humans are proof), but Tesla bet against that long ago. I wonder what could’ve been in a proper multi view setup.
Humans are perfectly capable of driving on racing simulators using a flat screen though, where binocular vision makes no difference.
And Tesla cars have more than one camera on them. The front-facing camera is actually an array of 3 cameras (the two farthest ones are at about human eyes distance), but they're also equipped with forward and rearward looking side cameras, and back cameras.
I think Tesla underestimated how hard vision-only FSD is, but having a single camera (they don't) is not the reason.
Driving simulators are a bad example. It’s too easy to learn priors. A better analogy is humans with one eye that still drive - most are taught to induce parallax.
Also, I never said they have one camera. Multi camera != multi view.
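The textbook pinhole-stereo relation Z = f·B/d (depth from focal length, baseline, and disparity) shows why baseline, and parallax in general, matters; a toy calculation with assumed camera parameters:

    def depth_m(focal_px: float, baseline_m: float, disparity_px: float) -> float:
        return focal_px * baseline_m / disparity_px

    f_px = 1000.0                       # assumed focal length in pixels
    for baseline in (0.065, 0.30):      # ~human eye spacing vs. a wider rig
        d_true = f_px * baseline / 30.0                        # true disparity at 30 m
        z_with_error = depth_m(f_px, baseline, d_true - 1.0)   # 1 px disparity error
        print(f"baseline {baseline:.3f} m: 1 px error at 30 m -> {z_with_error:.1f} m estimate")

    # baseline 0.065 m: 1 px error at 30 m -> 55.7 m estimate
    # baseline 0.300 m: 1 px error at 30 m -> 33.3 m estimate

A one-pixel error with an eye-width baseline already blows up the depth estimate at range, which is part of why wider-baseline multi-view rigs (or other depth cues) help so much.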
BTW, Andrej, if you're reading this: it is not just excellent, it is beyond excellent. I do a lot of tinkering with transformers and other models lately, and base them all on minGPT. My fork is now growing into a kind of monorepo for deep learning experimentation, though lately it has started looking like a repo of Theseus, and the boat is not as simple anymore :)
> but their vision only tech stack doesn’t seem capable of solving it
Well, I'm not sure that anyone's tech stack is capable of solving it; the live examples of robotaxis are, well, not something you'd bet your company on (and generally their creators are _not_ betting their companies on them). There was, I think, a decade ago the idea that fully self-driving cars were a near-term inevitability. That's fading, now.
I think a lot of that came from the Tesla hype machine creating a strong association between electric and self-driving as being the immediate future of cars in popular consciousness, so when people saw electric becoming a reality they assumed self-driving was right around the corner, when in actuality their maturity levels aren't related much at all. Fallacious thinking that may doom a few companies between Lyft, Uber, and Tesla.
This is a good point. I think the market has to adjust to a reality where electric cars are just cars that happen to be electric, rather than a hyped up techno-dream. Electric propulsion all by itself is pretty great, we don't need to tie it to FSD dreams.
I don't see how smart roads would solve the issues of "the unexpected".. weird pedestrian behavior, getting cut off, parked trucks, construction, etc. It seems to me that smart roads would only solve the issue of general routing, and that already seems to be dealt with as far as I can tell.
It wouldn't. A real "smart road" implementation would require us to literally fence off those roads so pedestrians, deer, children, bicyclists, etc. cannot use them. And they would likely fail in any kind of inclement weather -- fog, snow, rain, etc.
Essentially an admission of failure for the self-driving car industry.
How is that any different from geofencing? The goal of an AV should be a car that can drive on any road in any country, not just fancy smart roads in Western countries.
I've long had this fantasy of a smart road pilot program wherein manufacturers would partner with governments (or toll road owners in some places?) to make self-driving-only smart lanes, where you have to surrender control but the car goes 120+mph. I imagine getting even a handful of popular longer routes enabled for that would be quite popular.
> if he felt they were close to solving the problem anytime soon
He felt? It's evident enough that the approach they used doesn't allow them to prepare FSD for real life and real streets. I think he just understood that the approach has to change/improve significantly to reach the goal.
If humans can master driving with 2 eyes looking forward, why would a car with plenty of cameras in all directions not have sufficient sensory input to master it?
Same reason airplanes don't fly by just flapping their wings like birds. There's not always a biological equivalent for solving a problem, especially when you take into account human brain's evolution over millions of years. Sometimes computers need more help.
It does seem reasonable, though, that if humans can solve safe driving with 3 lbs of brain mass, computers shouldn't need that much more help. Put another way: every machine learning "expert" who has a kid is amazed to learn how simple unsupervised learning can be and how few training examples are required.
> This announcement comes after a 4 month sabbatical where Karpathy said he wanted to take some time off to “sharpen my technical edge,” which makes it sound like this is the result of frustration with the technical approach instead of burnout.
I think Karpathy realized (probably way back) that cheap sensors + no HD maps + their (reckless) public testing feedback loop doesn't advance towards L5 self driving and is bailing out. Karpathy has always backed Elon Musk whenever he talks about their technical approach, so it can't be frustration with the approach all of a sudden.