> Perception for autonomous vehicles is effectively a solved problem.
The problem is that you can't separate "perception" from "cognition". A human could drive a car reasonably well via a webcam feed, so does a webcam count as "solved perception"?
"Perception" in the context of autonomous agents means a mapping from sensors to a model of the world; how effectively can the agent establish a reliable model given its sensor readings?
What GP means by perception being a solved problem is that existing sensors and algorithms produce high-quality, reliable models across a wide range of environmental conditions.
A webcam is a sensor, but a human observing only the webcam feed most likely cannot construct a good model of the car and its surroundings: they can't shoulder check or glance in the rear-view mirror, which a good driver does frequently to keep track of where other cars are.
But then "reasonably well" for a human driving a car remotely via webcam feed is likely to be a far lower standard than we're holding self-driving cars to.