
Ask HN: What is the current state of open-source face recognition? - boramalper
I am aware of

- OpenCV

- OpenBR

- OpenFace

but I do not know the (significant) differences between them or which one I should prefer in specific cases.

I would love to hear your comments if you have had a chance to use any of them. Also, please share your reasoning so that people can decide for themselves whether it applies to their specific situation.

Thanks!
======
lovelearning
I have used OpenCV's face detection and recognition capabilities for a couple
of projects (a home security system using Odroid and IR camera modules, a side
project for cat recognition, testing low-res cheap USB cameras in low
lighting) and have become fairly familiar with its gotchas.

Its weak area currently is the accuracy of face _detection_. Before
recognizing the identity of a face, you have to first find the positions of
faces in an image.

OpenCV provides multiple algorithms for this - cascades of weak classifiers
such as Haar cascades, Local Binary Pattern (LBP) cascades and Histogram of
Oriented Gradients (HOG) cascades - and a number of pretrained models of
frontal and profile human faces for each of those algorithms. There's even a
frontal cat face model! But all of them suffer from high false positive or
false negative rates depending on subject distance and ambient lighting
levels. The cat model has trouble with even the slightest of angles.

So face alignment is a mandatory pre-processing step with OpenCV's models. But
OpenCV doesn't provide any end-to-end alignment routines - it's all up to you
to write the alignment code. The dlib library has all that built in.
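
To give an idea of what "writing the alignment code" involves, here is a
minimal sketch of the geometry, assuming you already have the two eye centres
from some landmark detector. The function names and the 60-pixel target eye
distance are my own illustrative choices, not anything from OpenCV or dlib.

```python
import math

def eye_alignment(left_eye, right_eye, target_eye_dist=60.0):
    """Return (angle_degrees, scale, center) that would level the eyes.

    Eyes are (x, y) pixel coordinates as they appear in the image
    (left_eye has the smaller x; y grows downward).
    target_eye_dist is a hypothetical choice for this sketch.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    angle = math.degrees(math.atan2(dy, dx))      # tilt of the eye line
    scale = target_eye_dist / math.hypot(dx, dy)  # normalise eye distance
    center = ((left_eye[0] + right_eye[0]) / 2.0,
              (left_eye[1] + right_eye[1]) / 2.0)
    return angle, scale, center

def transform_point(point, angle, scale, center):
    """Apply the alignment to one point: rotate by -angle about center, then scale."""
    theta = math.radians(angle)
    x = point[0] - center[0]
    y = point[1] - center[1]
    xr = x * math.cos(theta) + y * math.sin(theta)
    yr = -x * math.sin(theta) + y * math.cos(theta)
    return (center[0] + scale * xr, center[1] + scale * yr)

# A slightly tilted face: the right eye sits 20 px lower than the left one.
angle, scale, center = eye_alignment((100, 120), (160, 140))
left = transform_point((100, 120), angle, scale, center)
right = transform_point((160, 140), angle, scale, center)
print(left, right)  # both eyes now share the same y coordinate
```

In a real pipeline the same angle, scale and centre would be applied to the
whole image (in OpenCV, probably via cv2.getRotationMatrix2D followed by
cv2.warpAffine) before cropping the face.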

Coming to OpenCV face recognition, it provides 3 approaches - Eigenfaces,
Fisherfaces and LBPH (Local Binary Patterns Histograms). Its docs explain the
shortcomings of each well. In theory, LBPH should give the best accuracy, but
I consistently found Fisherfaces giving the highest among the three.
Recognition too requires considerable preprocessing - left and right histogram
equalization, cropping out hair and neck areas, etc. All the pre-processing
makes dataset preparation cumbersome. But it does work passably - 65 to 75
percent accuracy - with smallish datasets of just 20 frontal faces per person.
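
For anyone wondering what the "LBP" in LBPH actually computes, here is a rough
pure-Python sketch of the basic 3x3 local binary pattern and its histogram.
This is the textbook operator, not OpenCV's implementation; the neighbour
ordering, the tiny test image and the chi-square comparison are assumptions
made for illustration. As far as I understand it, a full LBPH recognizer also
splits the face into a grid of cells and concatenates the per-cell histograms.

```python
def lbp_code(patch):
    """8-bit local binary pattern of a 3x3 grayscale patch (list of 3 rows)."""
    center = patch[1][1]
    # Neighbours taken clockwise starting at the top-left pixel
    # (the ordering is a convention chosen for this sketch).
    neighbours = [
        patch[0][0], patch[0][1], patch[0][2],
        patch[1][2], patch[2][2], patch[2][1],
        patch[2][0], patch[1][0],
    ]
    code = 0
    for bit, value in enumerate(neighbours):
        if value >= center:
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels of the image."""
    hist = [0] * 256
    height, width = len(image), len(image[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            patch = [row[x - 1:x + 2] for row in image[y - 1:y + 2]]
            hist[lbp_code(patch)] += 1
    return hist

def chi_square(h1, h2):
    """Chi-square distance between two histograms (0 means identical)."""
    return sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)

# Made-up 4x4 "image" and a uniformly brighter copy of it.
tiny = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 40, 20, 10],
    [15, 25, 35, 45],
]
brighter = [[v + 50 for v in row] for row in tiny]
print(chi_square(lbp_histogram(tiny), lbp_histogram(brighter)))
```

The two histograms come out identical because the codes depend only on
whether each neighbour is brighter than the centre, which is why LBP tolerates
uniform lighting changes better than raw pixel comparisons.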

If you plan to start with OpenCV for face capabilities, I suggest using dlib
instead.

I haven't used OpenBR but eyeballing the code tells me it too uses OpenCV face
APIs underneath and another library named stasm which has face alignment
capabilities similar to dlib but using OpenCV. OpenBR seems to make building
preprocessing pipelines easier using its own DSL - that should reduce the
trial and error time significantly. But it doesn't add any new algorithm.

I haven't used OpenFace but looked into it in the past. It uses dlib for face
detection and alignment, and then uses a deep convolutional neural network for
feature extraction and recognition instead of Eigenfaces, Fisherfaces or LBPH.
These convolutional features are likely to do a better job than OpenCV's
cascade features. I'm not sure about the ideal training dataset size though.
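
Once a network has turned each face into a feature vector (an "embedding"),
the recognition step itself can be very simple. Below is a minimal sketch of
nearest-neighbour matching with a rejection threshold. The toy 3-dimensional
vectors stand in for real, much longer embeddings, and the names, gallery
layout and threshold of 1.0 are assumptions for illustration, not OpenFace's
API.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(query, gallery, threshold=1.0):
    """Return (name, distance) of the closest enrolled face.

    gallery maps a person's name to a list of their embeddings.
    If even the closest match is farther than threshold, the name is None
    (unknown person). The threshold value is a made-up placeholder and
    would need tuning on real data.
    """
    best_name, best_dist = None, float("inf")
    for name, embeddings in gallery.items():
        for emb in embeddings:
            dist = euclidean(query, emb)
            if dist < best_dist:
                best_name, best_dist = name, dist
    if best_dist > threshold:
        return None, best_dist
    return best_name, best_dist

# Toy gallery with two enrolled people.
gallery = {
    "alice": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "bob": [[0.0, 0.9, 0.3]],
}
print(identify([0.85, 0.15, 0.05], gallery))  # close to alice's vectors
print(identify([5.0, 5.0, 5.0], gallery))     # far from everyone
```

With a scheme like this, adding a new person means storing a few more vectors
rather than retraining a model, which is a large part of the appeal over the
OpenCV recognizers.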

Generally, in such cases where a dataset is likely to be small due to
practical restrictions, the preferred deep learning approach is transfer
learning, where a large model pretrained on a dataset like ImageNet is used
for the initial layers and only the last few layers are retrained on the
user's face dataset.

I've used deep object detection frameworks like YOLO and ResNet R-CNN in other
contexts, and found them to be good for person detection. I think a deep
object detector trained on faces to output face positions, combined with deep
face recognition, is the best combination. FaceNet does exactly that
([https://github.com/davidsandberg/facenet](https://github.com/davidsandberg/facenet))
and is probably the best one right now.

All said, identity recognition in our brains is actually multimodal (face,
body, gait, voice, gesture, etc). AFAIK, all the existing stacks support only
frontal face recognition with some tolerance for transformations, and none of
them even support recognition using profile face images, let alone multimodal
identity recognition.

~~~
boramalper
Thanks for the detailed answer! facenet and OpenFace look promising, so I'll
have a look at them.

edit: LFW accuracies:

facenet (20170512-110547) -> 0.992

OpenFace (nn4.small2.v1) -> 0.9292 ± 0.0134

[https://github.com/davidsandberg/facenet#pre-trained-models](https://github.com/davidsandberg/facenet#pre-trained-models)

[https://cmusatyalab.github.io/openface/models-and-accuracies...](https://cmusatyalab.github.io/openface/models-and-accuracies/#accuracy-on-the-lfw-benchmark)
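
For context on what these numbers measure: as I understand it, LFW is a pair
verification benchmark, where the system decides whether two photos show the
same person, and the accuracy is averaged over several folds (hence the ±).
A rough sketch of the per-fold calculation follows. The pairs and the
threshold are made up purely for illustration and have nothing to do with the
figures above.

```python
def verification_accuracy(pairs, threshold):
    """Fraction of pairs classified correctly at a given distance threshold.

    pairs is a list of (distance, same_person) tuples, where distance is the
    embedding distance between the two photos and same_person is the ground
    truth label.
    """
    correct = 0
    for distance, same_person in pairs:
        predicted_same = distance < threshold
        if predicted_same == same_person:
            correct += 1
    return correct / len(pairs)

# Hypothetical distances: three genuine pairs and three impostor pairs.
toy_pairs = [
    (0.4, True), (0.7, True), (1.3, True),
    (0.9, False), (1.5, False), (1.8, False),
]
print(verification_accuracy(toy_pairs, threshold=1.0))
```

Here one genuine pair is too far apart and one impostor pair is too close, so
4 of the 6 toy pairs are classified correctly.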

