
A Face Recognition Algorithm That Finally Outperforms Humans
https://medium.com/the-physics-arxiv-blog/the-face-recognition-algorithm-that-finally-outperforms-humans-2c567adbf7fc
======
fenollp
> But when the algorithm is faced with images that are entirely different from
> the training set, it often fails.

Over training it is.

------
drpgq
NIST's latest FRVT results for commercial face recognition (FRVT 2013) are
available here:

[http://www.nist.gov/itl/iad/ig/frvt-2013.cfm](http://www.nist.gov/itl/iad/ig/frvt-2013.cfm)

------
gizmodo59
I have to say, the articles from medium.com focus more on PR than on the
quality of the content. Just today I saw two such articles: the Schrödinger's
cat one and this.

------
shreyassaxena
No, that is wrong: the actual human performance on full images is close to
99%. [1] Ctrl+F "human"

The human performance it reports is on cropped images shown to humans, which
is not a fair comparison.

[1] [http://vis-www.cs.umass.edu/lfw/results.html](http://vis-www.cs.umass.edu/lfw/results.html)

~~~
apu
No, the 97.53% figure is the fairer one. See here:
[https://news.ycombinator.com/item?id=7638269](https://news.ycombinator.com/item?id=7638269)

------
apu
_sigh_ no, it doesn't "outperform humans" on "face recognition". In
particular, see my previous comments [1] and [2] for discussion on why this
method might be doing well.

As for "outperforming humans", a more accurate statement might be, "this
algorithm outperforms (for this simplistic task) one experiment done with a
limited set of humans on this one particular dataset which has been in the
community for 10 years now and is thus highly gameable."

But I realize that's a lot less pithy.

In particular, this dataset is nearing saturation, and whenever that happens,
differences in the accuracy numbers often don't mean much. So for example with
Facebook's number at 97.53% and this paper's at 98.52%, you're talking about
the difference between getting 148 pairs of faces wrong vs 89 pairs wrong. In
practical terms, as a researcher working with a dataset like this, you very
quickly learn to focus on just the ones your algorithm gets wrong, and it's
impossible not to subconsciously optimize for getting those few cases
correct, even if those techniques wouldn't actually help in the general case.

[1]
[https://news.ycombinator.com/item?id=7637866](https://news.ycombinator.com/item?id=7637866)

[2]
[https://news.ycombinator.com/item?id=7638269](https://news.ycombinator.com/item?id=7638269)
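For concreteness, the error counts above follow directly from the accuracies, assuming the standard LFW verification protocol, which scores 6,000 face pairs:

```python
# Error counts implied by accuracy on the standard LFW verification
# protocol, which evaluates 6,000 face pairs.
N_PAIRS = 6000

def pairs_wrong(accuracy):
    """Number of face pairs misclassified at a given accuracy."""
    return round((1 - accuracy) * N_PAIRS)

print(pairs_wrong(0.9753))  # 148 pairs wrong
print(pairs_wrong(0.9852))  # 89 pairs wrong
```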

~~~
yummybear
You seem to know something about this - how accurate are the best face
detection algorithms on real-world datasets?

~~~
KaiserPro
It depends on the lighting, camera setup, CPU, available processing time, and
finally your intended purpose.

Face detection/recognition is a broad subject, and the applications are broad.

There are things like HAAR cascades which are able to pick out a face-like
object from others (or anything else they have been "trained" on), but they
can't tell faces apart. They can be tuned so that they can be used in
realtime apps (like the autofocus on cameras).

HAAR cascades are limited in a number of ways: they can't tell faces apart,
you need a different cascade for different views (profile/portrait/other),
and they can also have trouble with skin colour (depending on training).

More advanced algorithms are able to work out face orientation in realtime
(Google Hangouts moustaches and the like), but once again they aren't able to
tell two faces apart.

However, there are no accurate realtime (or anywhere near realtime) algorithms
that are able to tell faces apart (i.e. put a name to a face in a crowd). In
fact, I would go so far as to say that there are no non-realtime ones either.

~~~
Leszek
Haar* cascades (specifically, Viola-Jones, which is what I assume you are
talking about) are hardly state-of-the-art; they're over 13 years old now.

* Also, Haar is a surname, not an abbreviation; you don't write it in all caps.

------
chriskanan
This isn't really face recognition; this is face verification. In computer
vision, "face recognition" usually means "tell me who this person is" (in
psychology it means "have you seen this person before?"). Face verification
gives an algorithm (or a person) two images of faces and asks, "Are these the
same person?"
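The distinction can be sketched with toy embeddings and a distance threshold (the embedding function is a placeholder here; the paper's actual model, a Discriminative GP-LVM, is not shown):

```python
import math

def verify(emb_a, emb_b, threshold=1.0):
    """Verification: "are these the same person?" No identity is named."""
    return math.dist(emb_a, emb_b) < threshold

def recognize(emb, gallery):
    """Recognition (the vision sense): "who is this person?"
    Returns the nearest identity enrolled in a gallery."""
    return min(gallery, key=lambda name: math.dist(emb, gallery[name]))

# Toy 2-D embeddings; a real system derives these from face images.
gallery = {"alice": (0.0, 1.0), "bob": (3.0, 0.0)}
probe = (0.1, 0.9)

print(verify(probe, gallery["alice"]))  # True  (distance ~0.14)
print(recognize(probe, gallery))        # alice
```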

Their result is impressive, and it improves a bit over Facebook's recent
result on the same dataset with their DeepFace system (97.35% for DeepFace vs
97.53% for people vs 98.52% for the system discussed in the article).

Also, it is interesting that they are not using deep learning for this. They
are using a Discriminative Gaussian Process Latent Variable Model.

~~~
dheera
I'm pretty sure in reality Facebook also uses your social network graph to
restrict the candidate set and get higher recognition accuracy. This makes it
hard to compare Facebook's results to a pure recognition algorithm.

On that note, Facebook could do even better by restricting the candidate set
based on time, precise location, and compass orientation, given that most
mobile users have Facebook installed and are running it in their pockets when
they get their picture taken by others. (They could do rough recognition
purely based on position and orientation without even looking at the camera
image, if they really wanted to, so with the camera image it could really be
near 100% accurate, and even work if you take a picture of a friend's back.)

~~~
apu
Sure, but the numbers in the research paper are "pure" (i.e., not using all
this additional information). In production, I'm sure they must be using all
these additional cues as well.

~~~
dheera
Ah, I see, you mean the 97.35% is a "pure" algorithmic result. That's pretty
impressive.

