
The Next Leap in Self-Driving: Prediction - edward
https://olivercameron.substack.com/p/the-next-leap-in-self-driving-prediction
======
irjustin
Predicting is absolutely key. If you've been around the Waymo cars, they
handle simple situations horribly. They're VERY safe because they take no
risk, but that won't work at scale. "What's that? Stop wait..., what's that?
stop... wait wait wait"

They're very jerky at intersections where there's even a mild amount of
uncertainty and really only go when it's very clear. It reminds me of lots of
first-time drivers reacting to new situations, where playing it safe is the
best way forward. But if we only had first-time drivers, we'd be in gridlock
with even a mild amount of traffic entering and leaving the roadway.

Looking forward to this next set of 5 years. Hopefully us drivers will get
replaced in 10? maybe?

~~~
jacquesm
I actually prefer the Waymo approach over the Tesla and Uber approach. When
there are lives on the line you simply do not cross certain lines and in the
longer term Waymo will be the only player left whose reputation is still in
one piece. That more than anything will decide whether self driving vehicles
gain long term acceptance. Anything else will lead to a 'self driving winter'.

~~~
deegles
My fear is that Tesla and Uber playing fast and loose with their development
will lead to Waymo getting regulated as collateral damage.

~~~
ajross
Tesla's getting enough hours under control at this point that they're likely
to win the regulation battle _for_ Uber simply by virtue of statistics,
though. Obviously there has been the occasional tragedy (and there no doubt
will be more), but the point where they launch the "Autopilot is provably
safer than humans!" press release isn't too far out at this point.

Again, the measuring stick is real drivers. And real drivers are really bad at
this stuff too.

~~~
jacquesm
That press release was sent out more than a year ago and it is about as
fallacious as it gets and I'm surprised the whole thing keeps getting airtime.

For the comparison to hold you need to do the following:

- discount all the hours when humans are in adverse conditions and autopilot
is not normally engaged

- remove motorcycles and other vehicles from the pool

After you do that Tesla is still _much_ less safe than your average driver.
Their whole spiel of taking the easy 80% in miles of driving and claiming
victory has been debunked so often now it is getting boring. It's the
remaining 20% where the fatalities are, and that - coincidentally - is the
hard part.

------
snowwrestler
I would argue that driving in traffic is essentially a physically social
activity, like a pack of wolves or a flock of birds or a school of fish moving
together. There is a tremendous amount of communication happening between the
individuals, but not overt language. It's physical communication which
requires empathy (i.e. seeing things from someone else's perspective) to
accurately interpret and make predictions from.

A simple example is to be in a lane next to a lane that ends. Everyone sees
the same signs that indicate that lane is ending; everyone knows the cars in
that lane are going to get over. But there are a variety of strategies for
doing so: get over early; wait until the last minute; accelerate to merge;
slow down to merge; use the opportunity to pass; etc. Some drivers are nervous
and cautious; some are aggressive. The way the cars move on the road, the
signals they provide like brake lights and turn signals, are a form of
communication that you, the potentially yielding driver, can interpret and
make decisions about how to deal with the merge. Or prevent it! If that is
your inclination.

This seems like a hard problem IMO. Perceiving objects is essentially about
physics--detecting and recognizing. The next step is harder: making physical
predictions, e.g. how far a moving object will travel in 200 milliseconds.
Maybe that's enough to keep a self-driving car safe, but it won't make it very
efficient in heavy traffic (which is the norm in urban areas). To interpret
intent as part of a prediction seems way harder than even that. Reminder that
when humans start driving they typically have 16-18 years of continuous social
learning... and even then it usually takes years to become a comfortably safe
driver in traffic.
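
To make the "physical prediction" step concrete, here is a minimal sketch of
constant-velocity extrapolation in Python. The function name
`predict_position` and the sample speed are assumptions for illustration only;
real systems presumably use much richer motion models.

```python
def predict_position(position, velocity, dt):
    """Extrapolate a 2D position, assuming velocity stays constant for dt seconds."""
    x, y = position
    vx, vy = velocity
    return (x + vx * dt, y + vy * dt)


# Hypothetical numbers: a car moving at 15 m/s along x, predicted 200 ms ahead.
future = predict_position((0.0, 0.0), (15.0, 0.0), 0.2)
print(future)  # roughly 3 m further along x, no change in y
```

This only answers "where will it be if nothing changes", which is exactly why
it says nothing about intent.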

Of course this would be way easier if all the cars were self-driving, all at
once, but that seems like a really hard way to implement this technology. I
think you'd have to have a city with very strong leadership that defines a
small area that is self-driving only (working with one or more private
companies to provide the vehicles), and then slowly grow the boundaries of
that zone.

~~~
DennisP
You may be right, but one of your analogies undermines your point. Realistic
bird flocking has been famously replicated in simulations by having birds
follow three simple rules, none of them requiring empathy.

[https://en.wikipedia.org/wiki/Flocking_(behavior)#Flocking_r...](https://en.wikipedia.org/wiki/Flocking_\(behavior\)#Flocking_rules)

So it might be a relatively easy problem for cars, if all the cars could be
made to follow a common set of rules. It gets more difficult with humans who
may follow different rules, or even actively exploit the rules of the
automated vehicles.
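
For readers who have not seen them, the three flocking rules are usually given
as separation, alignment and cohesion. Below is a rough Python sketch of one
update step. The neighbourhood radius, time step and weights are arbitrary
values picked for this example, not parameters from any particular simulation.

```python
import math


def step(boids, radius=10.0, dt=0.1, w_sep=1.5, w_ali=1.0, w_coh=1.0):
    """Advance all boids by one time step.

    Each boid is a tuple (x, y, vx, vy). The radius, time step and weights
    are arbitrary illustrative values.
    """
    updated = []
    for i, (x, y, vx, vy) in enumerate(boids):
        neighbours = [
            b for j, b in enumerate(boids)
            if j != i and math.hypot(b[0] - x, b[1] - y) < radius
        ]
        ax = ay = 0.0
        if neighbours:
            n = len(neighbours)
            # Cohesion: steer toward the neighbours' average position.
            ax += w_coh * (sum(b[0] for b in neighbours) / n - x)
            ay += w_coh * (sum(b[1] for b in neighbours) / n - y)
            # Alignment: steer toward the neighbours' average velocity.
            ax += w_ali * (sum(b[2] for b in neighbours) / n - vx)
            ay += w_ali * (sum(b[3] for b in neighbours) / n - vy)
            # Separation: push away from each neighbour, harder when closer.
            for b in neighbours:
                dx, dy = x - b[0], y - b[1]
                d2 = max(dx * dx + dy * dy, 1e-9)
                ax += w_sep * dx / d2
                ay += w_sep * dy / d2
        vx, vy = vx + ax * dt, vy + ay * dt
        updated.append((x + vx * dt, y + vy * dt, vx, vy))
    return updated


# Two boids with different headings: after one step the alignment rule
# narrows the gap between their y velocities.
flock = [(0.0, 0.0, 1.0, 0.0), (5.0, 0.0, 1.0, 1.0)]
flock = step(flock)
```

Note that each boid only reads its neighbours' positions and velocities; there
is no model of what any other boid "wants".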

~~~
snowwrestler
> So it might be a relatively easy problem for cars, if all the cars could be
> made to follow a common set of rules.

Yes, one way to make a problem easier to solve is to make the problem easier.
(But how...)

I admit that if people acted exactly like birds, self-driving cars would be
easier to implement.

~~~
DennisP
I wonder whether game theory would be helpful. If flocking rules could be
found which were a Nash equilibrium, then the incentive for human drivers
would be to play along. Maybe it would even be a decent predictor of what
humans already do.
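
As a toy illustration of the game theory idea, here is a Python sketch that
finds the pure-strategy Nash equilibria of a two-driver merge. The payoff
numbers are invented for the example, and real traffic is obviously not a
2x2 game.

```python
from itertools import product

# Hypothetical payoffs (higher is better), as (merging driver, through driver).
PAYOFFS = {
    ("yield", "yield"): (0, 0),      # both hesitate and lose time
    ("yield", "go"): (1, 3),
    ("go", "yield"): (3, 1),
    ("go", "go"): (-10, -10),        # collision
}
ACTIONS = ("yield", "go")


def pure_nash_equilibria(payoffs, actions):
    """Return action pairs where neither driver gains by deviating alone."""
    equilibria = []
    for a, b in product(actions, repeat=2):
        pay_a, pay_b = payoffs[(a, b)]
        a_is_best = all(payoffs[(alt, b)][0] <= pay_a for alt in actions)
        b_is_best = all(payoffs[(a, alt)][1] <= pay_b for alt in actions)
        if a_is_best and b_is_best:
            equilibria.append((a, b))
    return equilibria


print(pure_nash_equilibria(PAYOFFS, ACTIONS))
```

With these made-up payoffs there are two equilibria (one driver goes, the
other yields), so the drivers still have to coordinate on which one they are
in.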

------
TaylorAlexander
It’s funny, I see people online in random comments saying “AI” hasn’t
materialized into anything useful. But if you realize that most talk of “AI”
is talk of ML, and if you pay attention to robotics, you’ll find it’s making a
huge difference. The fact that anyone can credibly say perception is not the
major hurdle anymore is a testament to what the current generation of machine
learning algorithms has done to advance the state of the art. I’m excited for
the time when even prediction feels like a solved problem.

I would think that perception in poor weather remains a real challenge, but
certainly we’re leaps and bounds ahead of where we were even 5 years ago.

~~~
KKKKkkkk1
It's categorically untrue that perception is solved, given that things like
this happen:
[https://twitter.com/greentheonly/status/1228067903666348045](https://twitter.com/greentheonly/status/1228067903666348045).
And that's considering that Tesla's perception team is run by one of the
world's most respected experts in the field, and that their access to data is
essentially unlimited.

Now what bothers me is that leaders in the industry say without batting an
eyelid that perception is solved, given that they witness this type of event
on a regular basis.

~~~
anonymousCar
There are a lot of assumptions in OP's post, but maybe we can assume they're
talking in the context of their sensor suite.

Tesla has made a name for itself by overpromising and underdelivering on FSD,
and refuses to consider lidar as a member of its sensor suite.

I don't think you can adequately compare two companies with vastly different
approaches.

~~~
HellDunkel
Elon Musk recently claimed "correct vector-space representation" is the
hardest task of autonomous driving. Whether that's true for Tesla I don't
know, but I guess it's a yet-to-be-solved problem for everyone in the field.
Would you agree?

------
ChrisClark
Recently I saw a good video showing Tesla's pedestrian prediction.

You can see it on reddit here:
[https://old.reddit.com/r/teslamotors/comments/f4fbfu/nav_on_...](https://old.reddit.com/r/teslamotors/comments/f4fbfu/nav_on_autopilot_construction_worker_and/)

In the video
([https://reddit.com/link/f4fbfu/video/5yg4oclye5h41/player](https://reddit.com/link/f4fbfu/video/5yg4oclye5h41/player)),
skip to 0:38. There is a construction worker walking at a constant speed
towards the car's lane. The car could only assume he would keep walking, so it
swerved and stopped.

For a human, we'd probably assume the worker is paying attention and will not
walk in front of us. Though at times that also ends up being wrong.

~~~
megablast
If only human drivers erred on the side of safety, we wouldn’t have over a
million deaths a year due to drivers.

------
chadmeister
What the hell have we been working on if not that? What good is an autonomous
driving system that is not able to consider what happens next?!?!?!?!????

~~~
asdfasgasdgasdg
Presumably that? I think the author is just saying that it's going to get
better.

He is not privy to what his competitors have been working on, in any case,
except whatever has been announced publicly. (Unless he is doing industrial
espionage.) Maybe it's just that Voyage's cars weren't predicting up until
now?

~~~
loopz
I would be careful in assuming competence in such a new dev area. Think: rules
vs predicting vs rise in false positives

Heard of control systems waiting for seconds before acting on new information?
So don't assume flawless, perfect technology.

~~~
asdfasgasdgasdg
I don't think there's any implication that the technology is perfect in my
comment. But the idea that e.g. Waymo's software is not making some attempt at
predicting future states, even implicitly, is absurd. This video[1] was
released two years ago and references prediction at the provided timestamp.

[1]: [https://youtu.be/B8R148hFxPw?t=77](https://youtu.be/B8R148hFxPw?t=77)

------
mightybyte
I could be missing something, but this sounds like a horrible idea. I've had
two accidents in my life. One of them was caused by inattention--hooray that
self driving cars have the potential to wipe this cause out. The other was
caused because I was a new driver and I predicted what the car in front of me
would do and it didn't do it. It was in my first year of driving and
fortunately it ended up being a very minor collision, but I distinctly
remember that my conscious takeaway was that I shouldn't try to predict what
other cars will do.

If my car is driving, instead of trying to predict the behavior of other
objects I want it to do something like a minimax search and do something that
will be safe in the presence of the unexpected, not the expected. Prediction
sounds like the complete opposite of good driving practices like the rule of
following at least 2 seconds behind the car in front of you, which is
specifically there to prevent accidents in the rare case where the car in
front of you on the freeway does NOT behave in the predicted way of continuing
to go 70 mph in the middle of the lane.
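
To illustrate the difference I mean, here is a small Python sketch comparing
a worst-case (minimax) choice with a choice based on predicted probabilities.
The cost table and the probabilities are hypothetical numbers made up for the
example.

```python
# Hypothetical costs: COST[my_action][other_car_behaviour]; lower is better.
COST = {
    "follow_close": {"keeps_speed": 0, "brakes_hard": 100},
    "follow_2s": {"keeps_speed": 1, "brakes_hard": 5},
}
# Assumed prediction of what the car in front will do.
PREDICTION = {"keeps_speed": 0.999, "brakes_hard": 0.001}


def minimax_action(cost):
    """Pick the action whose worst-case cost is smallest."""
    return min(cost, key=lambda action: max(cost[action].values()))


def expected_cost_action(cost, probs):
    """Pick the action with the smallest cost averaged over the prediction."""
    return min(
        cost,
        key=lambda action: sum(probs[b] * c for b, c in cost[action].items()),
    )


print(minimax_action(COST))                    # safe against the unexpected
print(expected_cost_action(COST, PREDICTION))  # trusts the prediction

# The 2-second rule at 70 mph, in metres (1 mph = 0.44704 m/s).
gap_m = 70 * 0.44704 * 2
print(round(gap_m, 1))
```

With these numbers the worst-case rule keeps the 2-second gap, while the
prediction-based rule tailgates because hard braking is rated as very unlikely.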

~~~
gilbetron
The prediction talked about here is if you are at a 4-way stop, and the car to
your left has its right turn signal on, you can predict that they are
probably turning right, and so it is ok for you to go through the intersection
without waiting for them. Currently, an autonomous car will wait until the
other car is no longer in the intersection. With prediction, it will act more
like a human. That doesn't mean it 100% believes the car will turn right, but
will modify its behavior if it sees the car starting to go straight.

Basically, autonomous vehicles almost always assume the worst outcome in all
situations, and wait until it is safe. So at a 4-way stop, the vehicle will wait for
all cars to be out of the picture before going, which results in horrendous
performance if the intersection is busy.
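
One way to picture "doesn't 100% believe, but modifies its behavior" is a
Bayesian update of the belief that the other car is turning right. This is
only a sketch: the prior, the likelihoods and the 0.8 threshold are made-up
numbers, not how any real vehicle is known to work.

```python
def update_belief(prior_right, likelihood_if_right, likelihood_if_straight):
    """Posterior probability of a right turn after one observation (Bayes' rule)."""
    numerator = prior_right * likelihood_if_right
    evidence = numerator + (1.0 - prior_right) * likelihood_if_straight
    return numerator / evidence


# Assumed prior after seeing the right turn signal.
belief = 0.9
proceed_before = belief > 0.8

# Observation: the car starts rolling straight ahead. Assume this is seen 5%
# of the time when a car is really turning right and 80% when going straight.
belief = update_belief(belief, 0.05, 0.80)
proceed_after = belief > 0.8

print(proceed_before, round(belief, 2), proceed_after)
```

The belief starts high enough to proceed, then drops sharply once the
observed motion contradicts the signal, so the car would hold back again.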

~~~
c22
If I am at a 4 way stop and there is a car to my left my prediction is that
it's ok for me to go through the intersection regardless of the status of
their turn signal because the rules of the road require them to yield to their
right. I still pause to observe the trajectory of their vehicle, though,
because a lot of people don't seem to know that rule.

~~~
frickinLasers
I'm not sure if I'm reading your comment correctly, but in case I am...the
"yield to the right" rule is only to break a tie if two people arrive at the
intersection at the same time. Otherwise, the first to stop is the first to
go. I haven't run across many people who don't play by these rules (or maybe
my assertive driving style just means everyone else follows my rules anyway).

~~~
c22
Well yes, I always yield to vehicles that are _already in the intersection_ ,
irrespective, again, of their turn signals. Also it is not purely true that
"the first to stop is the first to go". Specific minutiae of the law differ
depending on region, but generally, say I am approaching a 4 way stop with the
intent of turning left, and another car straight across from me is already
stopped and going straight. If, while I am waiting for that driver to cross the
intersection, another vehicle arrives at the stop to my right, I am still
obligated to yield the intersection to the new car despite having "arrived
first". Likewise, when many cars are enqueued at the intersection, turns
should continue clockwise, as in most board games, without regard to who is
able to zoom up and brake the quickest.

------
Piskvorrr
"Now that we can safely detect the critical objects around us"...except if
they're a pedestrian with a bike. Or something equally unlikely; meh, just run
it over, probably uninteresting anyway. /s

That's some _massive_ handwaving.

~~~
xiphias2
I agree with you. ImageNet is so far from really understanding the world, that
based on this article I wouldn't put any money in this company.

Tesla had a lot of presentations about the complexities of detecting the right
light to use in a crossing with about 20-30 traffic lights for example.

Depth detection is also crucial, and as you wrote there are lots of strange
objects/animals/people to handle as edge cases.

------
nihonium
I completely agree. If prediction algorithms had been properly implemented,
the Uber crash could have been avoided.

I think the next step after proper visual prediction is interpreting
surrounding sounds. For example, a human can hear an ambulance siren
approaching and slow down at a junction without seeing the vehicle. I think
sound interpretation is very important, especially in city driving.

~~~
Retric
I don’t think accurate prediction is the right long term strategy. It’s edge
cases that cause accidents, so a more adversarial approach is safer.

Taken to the extreme this causes problems, but gently slowing down in the case
of extreme tailgating is a response to slow reaction times not direct path
prediction.

~~~
mellosouls
It's edge cases that will leave 100% self-driving unlikely in the near future.

------
prattatx
Context matters. For example, at an intersection, both the context of what is
happening with others at that intersection as well as what has happened to the
vehicle coming up to that intersection matter. An intersection near a soccer
field at 4PM has different dynamics than at 4AM. An intersection following a
highway exit has different dynamics than an intersection in a neighborhood.

~~~
zubspace
You're right. And that's why I can't imagine a solution without AGI. There are
simply too many context-sensitive parameters and we humans are quite good at
applying past knowledge to circumstances which are new to us. And we also have
access to information, which a computer may not have.

But software in its current state? Is it possible to provide enough logic to
handle all cases like humans? Can you do that with pattern matching, feature
detection, decision trees, bayesian models and anomaly detection alone? And if
we do, didn't we just replicate a human driver with all his flaws?

~~~
Piskvorrr
Aha, that's an interesting point: is the current SDV paradigm "build a faster
horse", to quote Henry Ford?

------
bobosha
I would modify the article to say "the next leap in AI is prediction". There
is a lot of activity in this space called Predictive Coding.

Look up the work of David Cox[1], Karl Friston et al. A fascinating variant is
Contrastive Predictive Coding[2].

[1]
[https://www.youtube.com/watch?v=P0yVuoATjzs](https://www.youtube.com/watch?v=P0yVuoATjzs)

[2] [https://ankeshanand.com/blog/2020/01/26/contrative-self-supe...](https://ankeshanand.com/blog/2020/01/26/contrative-self-supervised-learning.html)

------
sammyo
Looking forward, prediction should be superseded by announcement. Each vehicle
should be broadcasting its travel vector and its next plan to alter that
vector. At some point the SDC traffic will look like a high speed motorcycle
team weaving between each other; a pedestrian who wanders into the middle is
just avoided. High efficiency, high safety.
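
As a sketch of what could be done with such broadcasts, here is a Python
example that takes two announced travel vectors and computes when and how
closely the vehicles would pass if neither changed course. The `Announcement`
fields are hypothetical and not taken from any real vehicle-to-vehicle
standard.

```python
import math
from dataclasses import dataclass


@dataclass
class Announcement:
    """Hypothetical broadcast: position in metres, velocity in m/s."""
    vehicle_id: str
    x: float
    y: float
    vx: float
    vy: float


def closest_approach(a, b):
    """Return (time in s, distance in m) of closest approach at constant velocity."""
    rx, ry = b.x - a.x, b.y - a.y          # relative position
    vx, vy = b.vx - a.vx, b.vy - a.vy      # relative velocity
    speed2 = vx * vx + vy * vy
    # Only look forward in time; with no relative motion the gap never changes.
    t = 0.0 if speed2 == 0 else max(0.0, -(rx * vx + ry * vy) / speed2)
    return t, math.hypot(rx + vx * t, ry + vy * t)


# Two cars heading for the same point on crossing roads.
car_a = Announcement("a", 0.0, 0.0, 10.0, 0.0)
car_b = Announcement("b", 50.0, -50.0, 0.0, 10.0)
t, dist = closest_approach(car_a, car_b)
print(t, dist)
```

A small distance at a near-future time would be the cue for one of the
vehicles to announce a changed plan.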

~~~
Piskvorrr
... assuming everybody plays nice for the common good, ignoring optimizations
with externalities.

------
Blake_Emigro
I think another leap can be achieved with communication between vehicles. Once
there are many more self-driving cars on the road, they can feed info to each
other as needed. A car experiencing black ice can pass that info to the cars
around it. Cars approaching an intersection can adjust their speed so that
they all pass through without stopping, etc. But, because there are several
manufacturers, this will be hampered without an open standard, and it doesn't
solve the randomness of the world that will still be experienced.

------
vladislav
"the state-of-the-art in computer vision has moved so significantly that it’s
arguably now not the primary blocker to commercial deployment of self-driving
cars."

The state of the art in machine perception is still far from human level
perception, regardless of how quickly progress has been made. Prediction is a
function of perception, so it's also bottlenecked on perception getting
better. Let's not confuse recruitment-oriented blogposts with credible
scientific writing.

------
nmaley
Has anyone observed that the problem of object classification is really just a
component of the prediction problem? That the old lady doing donuts in the
wheelchair is identifiable as such because her appearance changes in certain
(predictable) ways from moment to moment? In other words prediction error
minimisation is the basis of the classification, not something made possible
by object classification.

~~~
highfrequency
No, prediction is not the basis of classification. You can build a classifier
that looks at a single out of context frame and learns to label the objects in
it. It won't be able to predict where those objects move in the next frame--it
was never even trained using subsequent frames.

------
tlofreso
I've thought in the past, once the compute is available in a low-enough-power
form, you can continually increase the resolution (frame rate) at which
autonomous vehicles view the world.

Essentially giving them 'The Flash' like vision. You could then have all kinds
of predictive models based on various patterns.

------
zie1ony
"It's hard to make predictions - especially about the future." — Robert Storm
Petersen

------
alexcnwy
I actually think humans playing a game on their phones is the solution to
self-driving.

Check out www.sebenz.ai

