Very good demo and it also happens to be related to my interests so here's my little analysis.
1, The face detection of this program is more impressive than the face recognition.
2, The detector is able to detect faces across a variety of poses and come up with an estimate for the pose of the face. It does a lot more than the best open-source detector (OpenCV) out there.
3, If you download the mac program and open up the package contents, you will see that under the data file directory, it uses a different set of files (features.bin, log_likelihood.bin) for frontal and profile view detection.
4, If you look at the company's CEO's background and publications here:
It would suggest that the detector is using a histogram-based detector as outlined in his paper. The other detector made by CMU is the Rowley-Kanade one but I don't think that one is fast enough to run in realtime.
I have a few comments for you guys about our demo:
1. Image to Image is a bit misleading because we are fairly strict in our online demo in pose requirements, so many faces are "non-suitable" because they aren't frontal.
2. It works much better for video settings (but that type of online demo is way too CPU intensive) because we can organize each person into continuous tracks and we only need them to become 'frontal' once to do matching. And the more they are frontal, the more data points we have.
Does this technology work using both images simultaneously? Or could lots of images be pre-processed individually into something that allowed a cheaper comparison?
Just to try to explain the question better, here's an example. Let's say I have 10 images and I want to find the most similar people among any pair of images. Do I need to run every pair of images (45 full comparisons) or can I pre-process the 10 images into something such that the 45 comparisons can be done in a less expensive way?
Recognition is generally broken down into two steps: processing faces into "templates" and then comparing those templates. Generating templates includes all the preprocessing stuff as well: detecting faces in an image, estimating their pose, and finding landmark points. Our site goes into these issues in some depth (with some examples). So yes, we do break the process down: generating the templates can be done individually for every image, which allows you to store that result and use it for future comparisons.
Generating 2 templates is many times more expensive than comparing those templates. However, as your dataset grows, the cost of generating templates grows as N, while the number of comparisons you need to do grows as N^2. So eventually, comparisons dominate.
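To make the split concrete, here is a rough sketch of the two-step idea in Python. It is not our actual code: make_template and compare are hypothetical stand-ins (a unit-length vector and a dot product), and real template generation would involve detection, pose estimation and landmark finding. The point is only the shape of the work: one expensive call per image, then one cheap call per pair.

```python
import itertools
import math


def make_template(image_pixels):
    """Hypothetical stand-in for the expensive step (detect, pose, landmarks).

    Here it only normalizes the input to a unit-length vector.
    """
    norm = math.sqrt(sum(p * p for p in image_pixels)) or 1.0
    return [p / norm for p in image_pixels]


def compare(template_a, template_b):
    """Hypothetical cheap comparison: dot product of two unit vectors."""
    return sum(a * b for a, b in zip(template_a, template_b))


def most_similar_pair(images):
    """Return ((score, i, j), number_of_comparisons) for the best pair."""
    # Step 1: N expensive calls, one per image. The results can be stored.
    templates = [make_template(img) for img in images]

    # Step 2: N*(N-1)/2 cheap calls, one per unordered pair of templates.
    best = None
    comparisons = 0
    for i, j in itertools.combinations(range(len(templates)), 2):
        comparisons += 1
        score = compare(templates[i], templates[j])
        if best is None or score > best[0]:
            best = (score, i, j)
    return best, comparisons
```

With 10 images this does 10 template generations and 45 comparisons, so the expensive step is never repeated per pair.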
It worked quite well for my face but not my wife's ( http://webdemo.pittpatt.com/recognition_demo/view.php?id=X3Y... ) because her head is slightly tilted and not facing front in these pictures. I'm certain your video recognition is better, but I think you should look into making the requirements for photos a bit more lax because very few photographs have full-frontal, upright faces. And once you get it right, talk to Facebook/Myspace etc. :)
OK, I'm not familiar with MATLAB performance. :-) But image processing is very computationally expensive, and the demo here is unbelievably fast. It must be optimized for its specific tasks. You just can't get that from a general package.
In retrospect, it's not surprising that they're optimizing some parts with assembler - even SSE (Streaming SIMD Extensions, where SIMD means Single Instruction, Multiple Data: http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions ), which I hadn't heard of, but which is the kind of vector parallelism that gives supercomputers their speed.
CMU has been working on facial recognition for a long time. VASC and the now defunct MAPS lab did quite a bit of research for recognizing objects in general as well as facial features. It is good to see a spinoff company that can make money from the hard work put into it. More info:
I totally misunderstood what this is. You recognize and match two faces across two pictures! I thought you just identify that a face is there. You REALLY need to explain a bit more clearly what it is - for you it's obvious, but for a random person clicking through, it's not obvious at all. The Michael Jackson picture would be the best sample of this.