

Engineers Test Highly Accurate Face Recognition - edw519
http://www.wired.com/science/discoveries/news/2008/03/new_face_recognition

======
iamwil
This is rather neat, though the actual paper is less breathless than the Wired
article. I skimmed through it, and here's what I got:

The algorithm assumes face images have been cropped and normalized, and it has
only been run on frontal views. Pose variation, detection, and alignment aren't
handled by this algorithm--the authors claim that's a different problem. But it
does handle different illumination and expressions, as well as occlusion and
noise.

According to the no-free-lunch theorem, there is no general classifier that
performs better than every other classifier on all classification tasks.
Therefore, in any classification task, the solution is often tailored to the
problem, and picking features is one way to tailor it. Oftentimes you can't use
all the features available to you, because they could be noisy, or because
using them all would be computationally expensive. This is why people use SVD,
PCA, or other techniques to drop the features that affect the classification
the least, so you only run classification on the features that affect it the
most.
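As a rough sketch of what that dimensionality reduction looks like (computing PCA directly from numpy's SVD, with random data standing in for real features):

```python
import numpy as np

def pca_reduce(X, k):
    """Project the rows of X onto their top-k principal components via SVD."""
    Xc = X - X.mean(axis=0)                        # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                           # keep the k highest-variance directions

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))                     # 200 samples, 50 noisy features
X[:, 0] *= 10                                      # one feature carries most of the variance
Z = pca_reduce(X, 5)                               # classify on 5 features instead of 50
```

The top component ends up aligned with the high-variance feature, which is the whole point: you keep the directions that matter most and discard the rest.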

That's why people use heuristics like picking eyes, mouths, and noses as
features--it would seem that's where most of the information is.

The weird thing being claimed here is that it actually doesn't matter which
features you pick, as long as 1) the feature space is huge, 2) you pick enough
features, and 3) you use a "correct" sparse representation of the features.

Apparently, they do recognition by trying to represent the new face as a
linear combination of all known faces in the database. So if the new face
belongs to a person in the database, it can be represented mostly by the faces
in the database known to belong to that one person. If the new face belongs to
someone not in the database, then we can only represent it as a combination of
different people's faces in our database.
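A minimal sketch of that scheme as I understand it (using sklearn's Lasso as a stand-in for whatever l1 solver the paper actually uses, and random unit vectors in place of real face images):

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_classify(A, labels, y, alpha=0.01):
    """Pick the identity whose training faces best reconstruct test vector y.

    A: (d, n) matrix whose columns are flattened training faces,
    labels: length-n array of identities, y: length-d test face.
    """
    # Find a sparse x with y ~ A @ x (the l1 penalty encourages using few faces).
    x = Lasso(alpha=alpha, fit_intercept=False, max_iter=10000).fit(A, y).coef_
    # Reconstruct y using only each person's coefficients; smallest residual wins.
    residuals = {}
    for lbl in np.unique(labels):
        x_lbl = np.where(labels == lbl, x, 0.0)
        residuals[lbl] = np.linalg.norm(y - A @ x_lbl)
    return min(residuals, key=residuals.get)

rng = np.random.default_rng(1)
d, per_person = 64, 5
A = rng.normal(size=(d, 2 * per_person))      # "faces" of two people, as columns
A /= np.linalg.norm(A, axis=0)                # normalize each face vector
labels = np.array([0] * per_person + [1] * per_person)
y = A[:, 0] + 0.01 * rng.normal(size=d)       # a noisy photo of person 0
```

A face of someone not in the database would leave large residuals for every identity, which matches the rejection behavior described above.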

In order for the surveillance mentioned in Wired to work, you'd still need an
annotated database of people's faces built from somewhere. Same with
auto-tagging people's photos on Facebook. And you'd need more than one photo
of a person, too.

However, given Luis von Ahn's method of using games to entice people to do
recognition tasks, and Facebook tagging, you could build a database of people's
faces fairly easily.

~~~
pixcavator
>The weird thing being claimed here is that it actually doesn't matter which
features you pick as long as 1) feature space is huge 2) you pick enough
features 3) you use a "correct" sparse representation of the features

I haven't read the paper yet, but from your description this part is indeed
the least convincing. It still matters which features you use! What if I pick
the average color of the image (and millions of similar "features")? A feature
like that won't help me in face recognition.

~~~
iamwil
You're in computer vision, so you'd know more than me. Like you, I'm a bit
skeptical about it, but maybe this particular technique happens to work well
enough for faces, due to the nature of face image data. Just as naive Bayes
classifiers assume independence when there isn't any and still work "pretty
damn well".

If you do get around to reading it, lemme know if you pick up anything from it
that also smells fishy.

~~~
pixcavator
When they say "features" they mean simply pixels! I have no idea why. So, if
you have a collection of 100x100 images, they are all points in a
10,000-dimensional space. This seems to be a common approach in pattern
recognition. Now for face identification, the approach could only work under
the assumption that people are "linearly independent": no face in the
collection can be represented as a linear combination of the rest of the
collection. It’s a bold assumption. If it is true, then the challenge is to
make the algorithm efficient enough. The idea is that you don’t need all of
those pixels/features and they could in fact be random. That must be the point
of the paper.
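The pixels-as-points view is easy to see in code (a toy example, with random arrays standing in for real face images):

```python
import numpy as np

rng = np.random.default_rng(0)
faces = rng.random((20, 100, 100))   # 20 hypothetical 100x100 grayscale images
X = faces.reshape(20, -1)            # each image is now a point in R^10000
# With far fewer images than dimensions, a generic collection like this is
# linearly independent: its rank equals the number of images.
print(X.shape, np.linalg.matrix_rank(X))
```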

~~~
iamwil
Yeah, they simply mean pixels. I failed to mention that in my previous post.
It seems odd to me that it works, because doesn't representing it as a linear
combo of pixels mean they throw out the structural information of the image?

I don't know how they came to make that assumption, but could it be that the
linear independence comes about because of the high dimensionality? The higher
the dimension, the harder it is to find a linear combination of sparse
representations that gives you the sparse representation matching the test
image. And if you do find one, and it all comes from the same person, then
they say it matches.

There must be some shape or property of the face subspace that allows for
this, since potentially, you can match a duck to someone's face.

~~~
pixcavator
I don’t think they throw out structural information (like adjacency of
pixels). Since each pixel corresponds to an independent dimension, the
adjacency is still contained in those coordinates: (a,b,...) is not the same
as (b,a,...). So, as long as all images have the same dimension, it’s OK. If
you had both 100x100 images and 1x10,000 images in the collection, that would
mess up everything!

As we know from linear algebra, if a collection is linearly dependent, there is
a nontrivial weighted combination of some subcollection that adds up to 0! In
other words, everything cancels out and you end up with a blank photo. Is it
possible? If the dimension is too low (the images are small), yes. If the
collection is large (a lot of images), maybe. What if it is small? (It is, in
fact!) In a very extreme case you may need the negative of an image to cancel
it: the same shape with dark vs. light hair, skin, eyes, teeth (?).
Interesting stuff...
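The cancellation point can be seen with tiny made-up "images" as vectors:

```python
import numpy as np

# Three 2x2 "images", flattened to vectors in R^4; c is built to depend on a and b:
a = np.array([1.0, 0.0, 2.0, 1.0])
b = np.array([0.0, 1.0, 1.0, 3.0])
c = a + b
# Linear dependence means some nontrivial combination cancels to a blank image:
blank = 1.0 * a + 1.0 * b - 1.0 * c
```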

------
ed
Here's the paper:
[http://www.scribd.com/full/2359066?access_key=key-2ntxn9tb3u...](http://www.scribd.com/full/2359066?access_key=key-2ntxn9tb3uh1s4v7ie2t)

------
TheTarquin
Anyone have a guess as to whether or not this will pass the "bar fight test"?

I go to the bar and start shit with Big McLargeHuge and he busts up my face.
My nose is broken, one or both eyes are swollen, my jaw line is resting
differently because of busted teeth or possibly a broken jaw.

Can I still be recognized? People who know me will still be able to see who I
am. A person comparing my face to a photo of me in an unbeaten condition would
be able to recognize me. Will this software?

~~~
iamwil
I think only if your beat-up face retained the same ratio of colors as your
normal face, if I'm not mistaken. It represents faces as a linear combination
of pixels, so it throws away structural information.

~~~
TheTarquin
Interesting. So it'd probably work alright if not too much of my face was
bloody/bruised?

------
mynameishere
This should obviate the need for RFID implants in our skulls.

------
jakewolf
That would make tagging photos of people in social networks a cinch, as you
could check against photos of a user's friends. Facebook app?

~~~
maximilian
The only problem, as iamwil pointed out, is that the algorithm doesn't work if
it isn't given perfectly cropped faces in photos. It's a "different problem,"
as they say, which I'd say is pretty valid.

------
kurtosis
Haven't read the paper, but it reminds me of the work that's been done on
single-pixel cameras and compressed sensing.

