
Fusion of Stereo Vision for Pedestrian Recognition using CNNs [pdf] - Katydid
https://hal.inria.fr/hal-01501735/document
======
roymurdock
Problem: Improve the classification component of an ADAS system to be able to
discriminate between the obstacle type (pedestrian, cyclist, child, old
person) in order to adapt the car driver system behavior according to the
estimated level of risk.

Approach: Train a CNN-based pedestrian classifier using a combination of
intensity, depth and flow modalities on the Daimler stereo vision data set.
The resulting CNN classifies objects in images as "Pedestrian" or "Non-
Pedestrian" (rather than the more granular goal above).

Result: Outcomes are measured by the False Positive Rate (FPR) at a 90% True
Positive (detection) rate. Lower FPR is better.
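
To make the metric concrete, here is a minimal sketch of how an FPR at a fixed detection rate can be computed from classifier scores. The function name `fpr_at_tpr` and the input layout are my own assumptions for illustration, not something taken from the paper.

```python
import math


def fpr_at_tpr(scores, labels, target_tpr=0.90):
    """Return the false positive rate at the threshold that reaches target_tpr.

    scores: classifier scores, higher means more pedestrian-like (assumption).
    labels: 1 for pedestrian, 0 for non-pedestrian.
    """
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")

    # How many positives must be accepted to reach the target detection rate.
    # The small epsilon guards against floating-point overshoot in the product.
    needed = max(1, math.ceil(target_tpr * len(positives) - 1e-9))

    # The strictest threshold that still accepts that many positives.
    threshold = positives[needed - 1]

    # Negatives scoring at or above the threshold are false positives.
    false_positives = sum(1 for s in negatives if s >= threshold)
    return false_positives / len(negatives)
```

Fixing the detection rate first and then reading off the FPR is what lets two classifiers be compared with a single number.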

The CNN created in this experiment ("Late Fusion") achieves an FPR of 0.0125%,
which is 37.5% lower than the 2010 "Early Fusion" CNN's 0.02% FPR over the
same dataset.

Still, a 2011 model ("HOG+LBP/MLP, Monolithic HOG classifier") that uses
"explicit features" vastly outperforms both, achieving an FPR of 0.00026% over
the same dataset.

Conclusion: "We showed that the late-fusion classifier outperforms the early-
fusion one but does not go beyond the state-of-the-art."
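
For readers unfamiliar with the term, late fusion means each modality gets its own classifier and only the outputs are combined. The sketch below assumes a weighted average of per-modality pedestrian probabilities. That combination rule, the function names and the 0.5 threshold are all illustrative assumptions, and the paper may combine the outputs differently.

```python
def late_fusion_score(modality_scores, weights=None):
    """Combine per-modality pedestrian probabilities into one score.

    modality_scores: dict such as {"intensity": 0.9, "depth": 0.6, "flow": 0.3},
    each value being the output of a separate per-modality classifier.
    weights: optional dict of non-negative weights (hypothetical parameter).
    """
    names = sorted(modality_scores)
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Weighted average of the individual classifier outputs.
    return sum(weights[name] * modality_scores[name] for name in names) / total


def classify(modality_scores, threshold=0.5):
    """Map the fused score to the two classes used in the summary above."""
    fused = late_fusion_score(modality_scores)
    return "Pedestrian" if fused >= threshold else "Non-Pedestrian"
```

Early fusion, by contrast, would stack intensity, depth and flow into one multi-channel input and train a single classifier on it.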

~~~
selestify
Wow, thanks for doing this summarization. I wish papers were written more in
this clear manner.

------
itchyjunk
Pedestrian Recognition system with the ability to retrain for Gait Recognition
[0] [1] would allow systems to track any individuals uniquely. Add to this
sentiment analysis [2] and we end up in a very sci-fi realm. This makes me
curious about how an adversarial system [3] would defeat a system like this.
Maybe early on, putting a quarter dollar in my shoes and thinking happy
thoughts is all it will take.

On a side note, I am reminded of an anime called Psycho Pass [4] which uses
psychological rating given by a super intelligence to decide who needs to be
taken off the streets to be treated.

-------------------------

[0] [https://www.newscientist.com/article/mg21528835-600-cameras-...](https://www.newscientist.com/article/mg21528835-600-cameras-know-you-by-your-walk/)

[1] [https://en.wikipedia.org/wiki/Gait_analysis](https://en.wikipedia.org/wiki/Gait_analysis)

[2] [https://en.wikipedia.org/wiki/Sentiment_analysis](https://en.wikipedia.org/wiki/Sentiment_analysis)

[3] [http://www.popsci.com/byzantine-science-deceiving-artificial...](http://www.popsci.com/byzantine-science-deceiving-artificial-intelligence)

[4] [https://en.wikipedia.org/wiki/Psycho-Pass](https://en.wikipedia.org/wiki/Psycho-Pass)

~~~
mabbo
Then add in some of the face2vec algorithms coming out (takes in an image of a
person, outputs a vector that stays roughly stable for the same person in
different images).
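
A rough sketch of how such a vector is typically used: compare two embeddings with cosine similarity and call it a match above some threshold. The embedding model itself is not shown, and the 0.8 threshold is a made-up value for illustration only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def same_person(embedding_a, embedding_b, threshold=0.8):
    """Hypothetical decision rule: similar enough vectors mean the same person."""
    return cosine_similarity(embedding_a, embedding_b) >= threshold
```

In practice the threshold trades false matches against missed matches, so it would be tuned on labeled pairs rather than picked by hand.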

Imagine the airport of the future where they won't bother with "Mr Smith,
please report to Gate 6 for final boarding" while Mr Smith sits oblivious at
the bar. Instead it will be "Mr Smith? We brought you your bill early because
your flight is leaving in 10 minutes and you should get going" "But I didn't
tell you my name or flight..."

~~~
kirubakaran
Or more realistically 'The probability of Mr Smith making his flight has
dropped below x%, based on his body fat estimate, number of drinks, and
distance from gate. Sell his seat.'

"Would you like another drink sir?"

Why alert him when they can sell his seat, sell him a few more drinks, a
massage, and some entertainment?

~~~
icebraining
And there won't even be any "whistleblowers", since everyone will be able to
honestly say they had no idea. The programmers will just plug all the
information they have into a generic machine learning system, and ask it to
predict the probability of a missed flight for each flyer, based on opaque
correlations.

~~~
fenwick67
Okay this is freaky.

This is a legitimate reason to worry about AIs. The "world domination" angle
seems overblown, but machine learning has the potential to evolve some immoral
behaviors like this without anybody even knowing.

~~~
icebraining
There are people talking about those issues:
[http://www.econtalk.org/archives/2016/10/cathy_oneil_on_1.ht...](http://www.econtalk.org/archives/2016/10/cathy_oneil_on_1.html)

------
dclowd9901
It's great that this can recognize pedestrians, but I've still seen no theory
around decision-making. This last week, I happened upon a tree that fell into
the road after a bad wind storm. I _know_ a tree when I see it, and I know it
is large and heavy and difficult to move and would take a great deal of time
for that to happen, so I immediately cut to a route around it. But what if it
was a bus or truck turning around? Well I know that a bus turning around is
large and heavy and difficult to move, but it will be moving out of the way in
a moment, so I can wait for that to happen.

When are we going to start seeing methods for measuring intent or theory
around guessing the future nature of an object based on contextual
understanding of that object? How does a car recognizing a pedestrian help if
the pedestrian starts running toward the car?

It just feels like there's a lot of patting of backs that happens around this
stuff, when we really aren't even close until we have a system that has a
learning and understanding approach as abstract as a human's.

~~~
roymurdock
As we move from relying on humans to relying on machines, we'll need to
develop better infrastructure to guide those machines.

Our current roads are OK for human drivers: they are designed heavily around
visual signals, and we can process and synthesize a lot of
visual/audio/learned data well.

Machines can process visual data faster than us, but can't synthesize it with
other data nearly as well as we can. Adding sensors that passively communicate
information, such as the location of lanes, other cars, and humans, will
reduce a lot of the issues and edge cases of visual-based ADAS systems.

I could imagine a future where cars are always listening for smartphones (or
other wearable/implanted sensors) in "pedestrian" mode, which could be used to
override regular function and stop the vehicle if detected within a certain
distance, or intersecting at a certain trajectory.
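
As a sketch of the "certain distance or intersecting trajectory" check, one option is a closest-point-of-approach calculation on the pedestrian's position and velocity relative to the car. Constant velocities, the 2 m safety radius and the 3 s horizon are all assumptions chosen for illustration.

```python
import math


def closest_approach(rel_pos, rel_vel):
    """Return (time, distance) of closest approach, assuming constant velocity.

    rel_pos: (x, y) of the pedestrian relative to the car, in metres.
    rel_vel: (vx, vy) of the pedestrian relative to the car, in m/s.
    """
    px, py = rel_pos
    vx, vy = rel_vel
    speed_sq = vx * vx + vy * vy
    if speed_sq == 0:
        t = 0.0  # no relative motion, so the distance never changes
    else:
        # Minimise |p + v*t|; clamp to 0 because the past does not matter.
        t = max(0.0, -(px * vx + py * vy) / speed_sq)
    return t, math.hypot(px + vx * t, py + vy * t)


def should_stop(rel_pos, rel_vel, safe_distance=2.0, horizon=3.0):
    """Stop if already too close, or if the paths come too close soon."""
    if math.hypot(*rel_pos) <= safe_distance:
        return True
    t, distance = closest_approach(rel_pos, rel_vel)
    return t <= horizon and distance <= safe_distance
```

A phone-based beacon would at best provide the position and velocity inputs; the geometry afterwards is the easy part.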

------
joshontheweb
Frankly, I'm very terrified at the prospect of being tracked in my every move
and it seems entirely unavoidable. License plate scanners are already in wide
deployment. There was an article on HN last week about the FBI's 100 million
strong facial recognition database using people's driver's license photos
without their consent or knowledge. Not to mention the wholesale tracking of
our online activities by the NSA which is already a done deal. I'm seriously
thinking about leaving the US to buy myself some time. I'm not hopeful that
anywhere will be safe in the end as other countries catch up with this
technology. What can be done if this isn't the world you want to live in?

~~~
BurningFrog
"Unavoidable" is the key thing to realize.

Even if you manage to get your government to never do this, it is (or soon
will be) easily within reach for private hobbyists.

My window faces a major freeway. I could probably rig up a license plate
reader that registers all cars going by fairly cheaply. If not now, then
surely in 5 years.

Everyone has a quite nice camera in their pocket. The car I bought last year
comes with 3 cameras. And so on. I think it's an unavoidable fact that in the
very near future, you have to assume you are being filmed whenever you're in
public.

So I think the real question is: How do we adapt to this new world? How can we
limit the bad aspects and enjoy the good aspects? Are we really sure what is
bad and good?

~~~
randcraw
Two thoughts:

1) It's a lot easier to recognize 100M distinct plates than faces. I don't see
anyone having the resources and patience to do the latter any time soon. Even
the FBI isn't claiming they can recognize 1 face in 100M with low error today.
At best, they have such a database of pictures. I have big doubts that usefully
matching against so many faces is even technically possible.

2) I think you're right, given the rapid advancement of tech, our emphasis
shouldn't be whether we should outlaw surveillance practices and methods.
Instead we should focus on controlling access to such databases. Who gets to
gather or use such data and when should it be authorized? Should we require
that each ID match request be explicitly granted (like a warrant) or should we
at least require that all such requests be logged, and perhaps publicly
reported?

Once access to sensitive data is regulated, it no longer matters how the data
is gathered. Abuse of such a law should poison any fruits of the data, at
least on the part of lawful agencies.
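
On point 1, the scale problem can be put in numbers with a back-of-the-envelope calculation. The per-comparison false match rate below is a hypothetical figure, not a measured value for any real system, and comparisons are assumed to be independent.

```python
def expected_false_matches(gallery_size, false_match_rate):
    """Expected number of wrong identities returned for one probe face."""
    return gallery_size * false_match_rate


def prob_at_least_one_false_match(gallery_size, false_match_rate):
    """Chance that a single probe produces at least one false match,
    assuming every comparison is independent (a simplification)."""
    return 1.0 - (1.0 - false_match_rate) ** gallery_size


if __name__ == "__main__":
    gallery = 100_000_000   # 100M faces, as in the comment above
    fmr = 1e-6              # hypothetical: one false match per million comparisons
    print(expected_false_matches(gallery, fmr))
    print(prob_at_least_one_false_match(gallery, fmr))
```

Even an error rate that sounds tiny per comparison yields on the order of a hundred false candidates per search at that gallery size, which is the core of the doubt about usefulness.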

~~~
jhartmann
Regarding 1, there are systems @ Facebook and inside Google that can do this
reasonably well and I know the technology intimately. I would consider this a
nearly solved problem, I give it 1 to 2 years at most to be more generally
available.

------
w_t_payne
Within the next 3-8 years every new passenger car will be fitted with camera
sensors. (Mandated by public safety standards). Most new cars will also be
networked. The ECUs on the vehicles may not be powerful enough for real-time
biometrics, and the available bandwidth may not be sufficient for this
particular application either, but it wouldn't take _much_ to make it
technically feasible. Now, I feel uneasy about the societal risks involved
with us going down this route, but I know that there are others who would jump
at the chance to exploit this opportunity.

~~~
7952
Safety technology can take a long time to be adopted. The vast majority of
cars still emit toxic gas for example.

------
emcq
So an ensemble beats a monolith? Would be cool to see if the ensemble with
distillation could fit to a more computationally friendly model!
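
For anyone curious what distillation would look like here, a minimal sketch follows: average the temperature-softened outputs of the ensemble members and train a smaller student to match them with a cross-entropy loss. The temperature of 2.0 and all function names are illustrative assumptions, and no training loop is shown.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperature gives softer outputs."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_soft_targets(teacher_logits, temperature=2.0):
    """Average the softened class probabilities of every ensemble member."""
    probs = [softmax(logits, temperature) for logits in teacher_logits]
    count = len(probs)
    return [sum(p[i] for p in probs) / count for i in range(len(probs[0]))]


def distillation_loss(student_logits, soft_targets, temperature=2.0):
    """Cross-entropy between the ensemble's soft targets and the student."""
    student = softmax(student_logits, temperature)
    return -sum(t * math.log(s) for t, s in zip(soft_targets, student))
```

The loss is smallest when the student's softened outputs equal the ensemble average, which is what pushes a single small model toward the ensemble's behavior.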

------
yitterational
No pedestrian recognition or accident avoidance system is going to be 100%
effective. But now we have the opportunity to apply to motoring what we have
done to aviation: every rare fatality and the situation that led up to it can
be analysed in depth to prevent recurrence.

------
waynecochran
Next we develop camouflage to avoid detection...

------
saosebastiao
Stuff like this is why I believe that level 4 autonomy within the next decade
is quite the lofty goal and level 5 is completely out of the question.

For example, in WA State, you can cross at any unmarked crosswalk which is
defined as any intersection (plus a few non-intersections), unless there is a
sign explicitly disallowing crossing. I wouldn't know how to accurately
estimate this, but there are probably over 100k unmarked crosswalks in Seattle
alone (most of which are ignored completely by humans). The pedestrian's
responsibility is to not enter the roadway without enough time for a car to
stop. The car's responsibility is to stop once the pedestrian signals intent
to cross. Signaling intent does not require waiting at the edge of the
roadway...merely walking towards the roadway while on the sidewalk is
considered signaling intent to cross.

What this effectively means is that the car must do far more than understand
pedestrian behavior in crosswalks. It must understand the periphery of the
road...not just the pedestrians it can see, but the pedestrians it can't. Most
speed limits are actually too high to take into account legal pedestrian
behavior. The speed limit is merely an upper limit; driving too fast for
conditions can always override the speed limit as a factor in an accident.
This means that trees, buildings, etc., that obscure the view and behavior of
pedestrians on the sidewalk implicitly lower the speed at which cars can
drive on that road.
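
The way obstructed sight lines cap speed can be sketched with the standard stopping-distance relation d = v*t + v^2/(2a), solved for v. The 1.5 s reaction time and 7 m/s^2 deceleration are assumed example values, not figures from any regulation or from the paper.

```python
import math


def stopping_distance(speed, reaction_time=1.5, deceleration=7.0):
    """Distance covered during the reaction time plus braking, in metres.

    speed in m/s, reaction_time in s, deceleration in m/s^2 (all assumptions).
    """
    return speed * reaction_time + speed ** 2 / (2.0 * deceleration)


def max_safe_speed(sight_distance, reaction_time=1.5, deceleration=7.0):
    """Largest speed (m/s) whose stopping distance fits in the visible distance.

    Obtained by solving d = v*t + v^2 / (2a) for the positive root v.
    """
    if sight_distance < 0:
        raise ValueError("sight distance must be non-negative")
    return deceleration * (
        -reaction_time
        + math.sqrt(reaction_time ** 2 + 2.0 * sight_distance / deceleration)
    )
```

Calling `max_safe_speed` with a short sight distance, such as a sidewalk hidden behind a hedge, returns a lower cap than with an open view, regardless of the posted limit.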

Now today we essentially get away with favoring the driver. The driver can
assume that they were acting reasonably by driving the speed limit, and we can
say that the pedestrian was out of line by walking too fast out into the
street, and most situations are resolved blindly based on testimony with no
evidence. But the risk model of autonomous cars is different. The manufacturer
is gonna be on the hook, not the driver, and the car needs to be able to obey
the letter of the law with a higher burden of proof and with dozens of sensors
recording the situation.

As we can see from this paper, _pedestrian recognition is still a hard
problem_. It's hard in the engineering sense, _and_ the mathematical sense.
But cars don't need to just recognize pedestrians...they need to recognize and
understand _the context in which a currently invisible pedestrian could appear
in the near future_. And perhaps more importantly, they still need to solve
the problem of _instantaneous_ path planning in a 2d space with a dynamically
changing 3d obstruction model consisting of far more than just pedestrians.
It's easy to look at the past and say we've made rapid progress in the past so
we'll make rapid progress in the future...but in the process of maturing a
technology we regularly see exponential progress in the beginning and
asymptotic progress near the end. This is going to be very difficult.

------
dagenleg
Wait, who ate most of the paper?

------
donatj
Automated jaywalking tickets, here we come.

~~~
gotchange
Or better, stopping vehicle-ramming terrorist attacks like the one in Stockholm
last week. Just imagine a system set up on the vehicle: when it detects
above-average acceleration in a downtown area (GPS supplied), in addition to
abnormal swerving coupled with pedestrian data from the sensor network, the
auto-pilot system kicks in and takes over the vehicle, bringing it to a
grinding halt immediately.

~~~
donatj
I feel like that would be easily circumventable given time and resources.

I imagine most attacks aren't people being like "You know what, I'm sick of
life. I'm just going to ram my truck into people"

~~~
gotchange
Right, it's an arms race between us and the terrorists, but at this point in
time we need to turn this weapon from low-tech to high-tech to operate, thus
raising the barrier and making it prohibitively difficult for losers to
hijack or drive trucks to kill pedestrians on the street.

~~~
donatj
It's not an arms race. No amount of user-unfriendliness will ever make it
"that" difficult to purposefully use a vehicle maliciously. Terrorist
organizations are well funded. As long as vehicles are powered by engines or
even motors, everything else in the vehicle is gravy. Rip out all the electronics,
put in new ones that just fire the cylinders or run the engine. It's not a
solvable problem.

[https://www.youtube.com/watch?v=HnOPD2MOngU](https://www.youtube.com/watch?v=HnOPD2MOngU)
Here is a small engine whose timing is controlled by an Arduino. You vastly
overestimate what can be done, and underestimate the intelligence and
willpower of terrorists.

All you're achieving is making it harder for everyday people to work on their
vehicles.

------
bArray
Please update the fact it's a download in the title.

~~~
gridit
Here is a more proper link, if anyone wants to avoid going straight to pdf:

[https://hal.inria.fr/hal-01501735/](https://hal.inria.fr/hal-01501735/)

