Show HN: Our new online face recognition demo (pittpatt.com)
50 points by lbrandy on March 10, 2009 | 25 comments

Very good demo and it also happens to be related to my interests so here's my little analysis.

1. The face detection of this program is more impressive than the face recognition.

2. The detector handles a variety of poses and produces an estimate of the pose of each face. It does a lot more than the best open-source detector out there (OpenCV).

3. If you download the Mac program and open up the package contents, you will see that the data directory holds a different set of files (features.bin, log_likelihood.bin) for frontal and profile view detection.

4. If you look at the company's CEO's background and publications here:


That would suggest the detector is histogram-based, as outlined in his paper. The other detector made by CMU is the Rowley-Kanade one, but I don't think that one is fast enough to run in real time.

Shameless plug: I implemented a JavaScript face detector (with source code) here: http://blog.kpicturebooth.com/?p=8 . It only handles frontal, non-rotated faces =P

I have a few comments for you guys about our demo:

1. Image to Image is a bit misleading because we are fairly strict in our online demo in pose requirements, so many faces are "non-suitable" because they aren't frontal.

2. It works much better for video settings (but that type of online demo is way too CPU intensive) because we can organize each person into continuous tracks and we only need them to become 'frontal' once to do matching. And the more they are frontal, the more data points we have.

Here's an example of recognition applied to video: http://www.youtube.com/watch?v=jsjf3IDXef8
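
To make the track idea concrete, here is a minimal sketch, not the demo's actual code. It assumes each track is a list of per-frame detections with an estimated yaw angle; the `yaw` field, the 15-degree limit, and both function names are hypothetical. A track becomes matchable once it has at least one frontal frame, and every extra frontal frame is another data point.

```python
def frontal_frames(track, max_yaw_degrees=15.0):
    """Return the detections in a track that are close enough to frontal.

    `track` is a list of dicts like {"frame": 3, "yaw": -5.0}. Both the field
    names and the 15-degree cutoff are assumptions made for illustration.
    """
    return [d for d in track if abs(d["yaw"]) <= max_yaw_degrees]


def is_matchable(track, max_yaw_degrees=15.0):
    """A track can be matched once at least one of its frames is frontal."""
    return len(frontal_frames(track, max_yaw_degrees)) > 0


# One person tracked over three frames: profile at first, then turning frontal.
track = [
    {"frame": 0, "yaw": 60.0},
    {"frame": 1, "yaw": 10.0},
    {"frame": 2, "yaw": -5.0},
]
usable = frontal_frames(track)
```

In a still image there is only one "frame" per face, which is why the strict pose requirement rejects so many photos while the same person in video usually turns frontal at some point.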

Does this technology work using both images simultaneously? Or could lots of images be pre-processed individually into something that allowed a cheaper comparison?

Just to try to explain the question better, here's an example. Let's say I have 10 images and I want to find the most similar people among any pair of images. Do I need to run every pair of images (45 full comparisons) or can I pre-process the 10 images into something such that the 45 comparisons can be done in a less expensive way?

Recognition is generally broken down into two steps: processing faces into "templates" and then comparing those templates. Generating templates includes all the preprocessing stuff as well: detecting faces in an image, estimating their pose, and finding landmark points. Our site goes into these issues in some depth (with some examples). So yes, we do break the process down: generating the templates can be done individually for every image, which allows you to store that result and use it for future comparisons.

Generating 2 templates is many times more expensive than comparing those templates. However, as your dataset grows, the cost of generating templates grows as N, while the number of comparisons you need to do grows as N^2. So eventually, comparisons dominate.
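
For the 10-image example above, the split looks roughly like the sketch below. This is only an illustration of the template-then-compare structure, not our implementation: `make_template` and `compare` are hypothetical stand-ins (a unit-length vector and a dot product), and an "image" here is just a list of numbers.

```python
from itertools import combinations


def make_template(image):
    """Hypothetical stand-in for the expensive step (detection, pose, landmarks).

    Here an "image" is a non-zero list of numbers, scaled to unit length.
    """
    norm = sum(x * x for x in image) ** 0.5
    return [x / norm for x in image]


def compare(t1, t2):
    """Cheap step: dot product of two unit-length templates (cosine similarity)."""
    return sum(a * b for a, b in zip(t1, t2))


def most_similar_pair(images):
    """Build N templates once, then score all N*(N-1)/2 pairs of templates."""
    templates = [make_template(img) for img in images]  # N expensive calls
    scored = [
        (compare(templates[i], templates[j]), i, j)
        for i, j in combinations(range(len(templates)), 2)  # cheap calls
    ]
    return max(scored), len(scored)


# Ten toy "images": 10 template builds, 45 template comparisons.
images = [[1.0, float(k)] for k in range(10)]
best, num_comparisons = most_similar_pair(images)
```

The templates can be stored, so adding an 11th image costs one more template build plus 10 cheap comparisons rather than redoing everything.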

It worked quite well for my face but not my wife's ( http://webdemo.pittpatt.com/recognition_demo/view.php?id=X3Y... ) because her head is slightly tilted and not facing front in these pictures. I'm certain your video recog. is better but I think you should look into making the requirements for photos a bit more lax because very few photographs have full-frontal, upright faces. And once you get it right, talk to Facebook/Myspace etc. :)

Impressive. What's the implementation language? I assume C.

It's a hodgepodge of low-level stuff. Some C. My favorite is the SSE-optimized assembly :)

Why do you assume that? It could just as well be MATLAB or something similar that makes fiddling with multimedia very easy.

How would you create a stand-alone executable file using MATLAB?


MATLAB performance can be quite good :-)

OK, I'm not familiar with MATLAB performance. :-) But image processing is very computationally expensive, and the demo here is unbelievably fast. It must be optimized for its specific tasks. You just can't get that from a general package.

In retrospect, it's not surprising that they're optimizing some parts with assembler - even SSE (Streaming SIMD (Single Instruction, Multiple Data) Extensions http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions), which I hadn't heard of, but which is the kind of vector parallelism that gives supercomputers their speed.

Michael Jackson vs. Jackson 5


It needs a threshold of -3 to detect it, but it gets the right face (I think).

So... how exactly does one go about getting hold of this technology? There is no download link, and it's not clear whether using the contact link will get me sent a demo.

My first R&D project at my company when I just got out of college was in image recognition, so this is pretty interesting.

Here's 2 I did:



The second one is pretty impressive, even if it's under the default 0.00 threshold. I mean, RMS's mouth is obscured behind a katana and it still got a -0.14.

You need to work on the PG recognition though. It doesn't seem to want to recognize two photos as being both him. :P

Funny, my first test was also a pic of RMS.

CMU has been working on facial recognition for a long time. VASC and the now defunct MAPS lab did quite a bit of research for recognizing objects in general as well as facial features. It is good to see a spinoff company that can make money from the hard work put into it. More info:


I totally misunderstood what this is. You recognize and match two faces across two pictures! I thought you just identified that a face is there. You REALLY need to explain a bit more clearly what it is: for you it's obvious, but for a random person clicking through, it's not obvious at all. The Michael Jackson picture would be the best sample of this.

who else thought of this?


(synopsis: ancient mr. show sketch involving a corporate mascot called pit-pat. warning: swearing)

It was certainly the first thing that came to mind for me. After seeing the URL, I was pretty disappointed not to find a magical nonthreatening pansexual spokesthing on the actual page.

Sorry, we are experiencing a large volume of submissions. There are 1 images ahead of you. You can try refreshing this page later, or resubmit the same image to see the results.

How is this better than what Facebook uses?

Looks like something to give the government to track your EVERY MOVE.

Good job though, guys. It's pretty impressive.

Seems to work, good job.

This is amazing... although it needs some additional work (no ideas in my head for now).

Did you use your own algorithm for it?
