

A Global Database for Facial Recognition - drusenko
http://david.weebly.com/1/post/2011/01/a-global-database-for-facial-recognition.html

======
evandavid
I worked in Biometrics for a few years. I was on a fingerprint team, but I was
part of a wider team that was also responsible for forensic-quality facial
recognition technology (a world leader). To put it bluntly, the technology is
still very, very, very weak. When it works, it's really impressive, but that's
normally during a demo from a nicely curated database. There are some facial
technologies that can "work" from a distance (and they are used in airports
already), but the success rate is low and the original image still needs to be
of decent fidelity. Often the results from the automated matching will
be shortlisted for comparison by human operators. Maybe I was blinded by the
awesomeness of fingerprint technology (still blows my mind thinking about it).

Combine this general lack of automated matching awesomeness with the fact that
people age, can wear glasses or a beard, etc., and we are still many years away
from this kind of capability.

Still, it's always fun to think about the possibilities that biometrics can
and will provide.

------
drusenko
One thing I didn't elaborate on: Imagine pairing this with augmented reality.
Instantly know who everybody is on the street, in a bar, etc.

Quite creepy.

~~~
jared314
I thought that feature was cut from Google Goggles because of the creepy
feeling.

------
liuliu
Sometimes I have my doubts about how current facial recognition technology is
going to perform on tens of millions of samples. Admittedly, we have become
quite good at the scale of tens of thousands of real-world images (see the
Labeled Faces in the Wild project), but how reliable is it really at large
scale? My friend circle (everyone I have met in life, even for a short period
of time) probably consists of only a few thousand people, but I can remember
many people who are only distantly related and yet have very similar faces. My
main concern is that the facial variations between people may not be as large
as we have imagined once we are at the scale of tens of millions.

Here is an idea to validate my concern:

1\. Crawl all photos on Facebook and run a frontal-face detector on them;

2\. Use current state-of-the-art methods (attribute-based or hybrid methods)
to get several pairs of similar faces (the highest-scoring ones) of different
people;

3\. Use the same method as in step 2 to get several pairs of similar faces of
the same person, with roughly the same scores as in step 2;

4\. Have human volunteers blindly judge which pairs are in fact the same
person, given only a small region of each picture (e.g. the face should occupy
20% of the total area);

5\. Check the results of step 4 against ground-truth data.
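
A minimal sketch of how steps 2, 3 and 5 could be wired together, assuming the
crawling, face detection and matching have already produced a similarity score
and a ground-truth label for each pair. The names `Pair`, `hardest_impostors`,
`score_matched_genuines` and `human_accuracy` are hypothetical, and the human
answers from step 4 are assumed to arrive as a plain list of booleans:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pair:
    score: float       # similarity score from some matcher (assumed precomputed)
    same_person: bool  # ground-truth label for the pair


def hardest_impostors(pairs, k):
    """Step 2: the k highest-scoring pairs of *different* people."""
    impostors = [p for p in pairs if not p.same_person]
    return sorted(impostors, key=lambda p: p.score, reverse=True)[:k]


def score_matched_genuines(pairs, impostors):
    """Step 3: for each impostor pair, take the unused same-person pair
    whose score is closest to it."""
    pool = [p for p in pairs if p.same_person]
    chosen = []
    for imp in impostors:
        if not pool:
            break
        best = min(pool, key=lambda p: abs(p.score - imp.score))
        pool.remove(best)
        chosen.append(best)
    return chosen


def human_accuracy(trial_pairs, human_says_same):
    """Step 5: fraction of human judgments that agree with ground truth."""
    correct = sum(
        p.same_person == says for p, says in zip(trial_pairs, human_says_same)
    )
    return correct / len(trial_pairs)
```

If the human accuracy on these score-matched pairs came out near chance, that
would support the concern that faces are simply not distinctive enough at this
scale.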

~~~
apu
The face recognition problem is very tough and we are nowhere close to solving
it for most real problems. The results on Labeled Faces in the Wild (LFW) [1]
show that even on the easier "verification" problem -- "are these two images
of the same person?" -- the best performance is under 90%.

To figure out how this translates to recognition -- "who is this person?" --
you can roughly take the verification rate V and number of different people in
your database N and get a recognition rate of roughly V^sqrt(N). (This is a
very very rough estimate!)

So with 1 person in your database, your rate is just V = ~90%. With 100
people, it's already down to 35%. And so on...
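
The rule of thumb above is easy to play with. The sketch below only evaluates
V^sqrt(N) for a few database sizes; the function name is made up for
illustration, and the formula is the same very rough estimate, not a measured
result:

```python
import math


def rough_recognition_rate(v, n):
    """Very rough estimate: verification rate v, n people in the database."""
    return v ** math.sqrt(n)


for n in (1, 100, 10_000, 1_000_000):
    print(n, rough_recognition_rate(0.9, n))
```

With V = 0.9 and N = 100 this gives roughly 0.35, matching the figure quoted
above.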

(I'm the author of one of the best methods on LFW right now -- the attribute-
based one [2].)

[1] <http://vis-www.cs.umass.edu/lfw/>

[2] <http://www.cs.columbia.edu/CAVE/projects/faceverification/>

~~~
liuliu
I am eager to replicate your attribute method but haven't had much time to do
so. Very excited to know that you are on Hacker News too. My impression is
that the "verification" problem is already a leap forward compared to the
"identification" problem, which consists of fewer people and many samples. The
problem space is much more interesting now than before, with the astronomical
number of digital photos available on the Internet (considering that in about
3 years, the "verification" results have improved from 69% to around 85%). But
my main concern is that we human beings may not be good at the "large-scale"
face recognition problem after all. If that were true, the foundation of the
computer-based face recognition problem would be flawed.

~~~
apu
We have results in our face verification paper showing exactly how well people
do on the LFW verification task. The answer is: very good, at 99.2%. However,
we also show that much of this is coming from the background and context in
the images in LFW. The fact that those images were originally gathered from
Yahoo News means that many images are taken at the same event -- with the same
background and same clothes, etc., and people can use this to make good
guesses, even without seeing the person's face!

I think the vision community is thus looking for how to exploit context
effectively. There have been a number of interesting approaches recently, from
looking at "things" vs "stuff", to using additional information such as
transcripts and captions, etc.

And of course, there are still lots of people looking at how to better solve
the basic recognition problem itself.

------
uptown
Facial-recognition alone is probably a ways off, but if you were able to
combine a number of digital hints you'd be able to give your system a
tremendous boost. Think about everything we do that's logged into a database.

Cell phone location, credit-card purchases, mass-transit card swipes (and
cameras by those turnstiles), future location-specific purchases (airline
tickets, concert tickets, etc.), EZ-Pass for your car, IP address of your web
browser, location-tagged tweets, Foursquare check-ins, historical matches based
on a routine, and on and on and on.

You may not be able to easily match a random face against your entire database
... but if you were able to combine all of those elements together you could
almost say where someone was more times than not without even having an image
of the person. Right now only the government could get access to everything
listed above ... but if the trend of over-sharing and openness continues it's
not a stretch to imagine this being heavily commercialized. Depending on how
many of their services you use, Google probably has a pretty good idea where a
large percentage of their users are at any given time. Combine that with the
natural evolution of StreetView (live video) and you've got your product.

------
trotsky
As I understand it, facial recognition only works that way on TV. Facial
recognition is still hugely computationally expensive. In the real world it's
used by security services by having a good picture of known bad guy X and then
having the system watch CCTV in a couple of zones to try to spot him. Even
that takes hella CPU time, but it's still matching only a small fraction of
the faces... thousands vs. hundreds of millions.

------
raphar
I'm sure that if the gambling industry thinks that this idea is useful and
feasible, there is a system using such a database already installed in their
casinos.

~~~
MichaelApproved
They work from a smaller database with better pictures.

------
Tichy
I keep reading "crawl Facebook". How viable is that kind of thing? I could
think of various ways to use the data (e.g. I proposed to a friend that I could create
a service that would help him identify the woman he did not approach - as a
bonus, that project might boost privacy awareness). But I assumed that
crawling Facebook would not be appreciated by Facebook, and also easily
prevented by Facebook. Since there are 500 million profiles to crawl, a lot of
different IP addresses for the crawler would be required.

Also, how legal is it to actually use data from such scrapes?

------
greendestiny
Yeah I've thought about it before, and I suspect Facebook would likely object
on copyright grounds. You could argue that it's simply an index to a visual
search engine.

I wouldn't be surprised if people had done this though, just not made it
publicly available.

------
spoiledtechie
It's already been done. By private companies selling to the highest bidder,
mainly the Govt. How do I know? 'Cause I have been watching my coworkers
working on the project for the past year. It's just not public yet...

------
51Cards
There may not be an expectation of privacy in public, but I suspect it could
be argued there is an expectation of anonymity. The tech is a long way from
this being a reality, but the thought of it is beyond scary.

------
ricaurte
Face.com seems to be essentially going down that path. Now whether they store
all of the faceprints or not, I don't know. They also have a developer API
that is in alpha right now.

------
kuahyeow
Someone's done research on anti-facial recognition too.
<http://ahprojects.com/blog/122>

------
barmstrong
If it's such a hard problem (as many commenters have pointed out), maybe
FindPeopleWhoLookLikeMe.com would be a nice pivot in the meantime.

------
phlux
DO. NOT. WANT.

