Face detection library with speed of 1500FPS (github.com)
137 points by aginovski 42 days ago | 29 comments

The number of frames (or faces?) per second is meaningless without context...

- Detect at least one face in a frame (count)

- Detect all the faces in a frame (count)

- Detect the location of the first face in a frame

- Detect the location of all the faces in a frame

- Is the resolution 64x64 or 1028x1028

- Is it only finding some of the faces (what’s the accuracy)

- 1500FPS on what hardware

Does anyone have the paper? Otherwise, why is this here?

Extraordinary claims require extraordinary proof; I don’t even see a README.

> Extraordinary claims require extraordinary proof; I don’t even see a README.

I'm confused. The link points to GitHub, which literally shows the README on the front page.

> Is the resolution 64x64 or 1028x1028

From the README: cnn (CPU, 128x96): 2.35 ms (425.95 FPS), 0.64 ms (1562.10 FPS)

> 1500FPS on what hardware

Intel(R) Core(TM) i7-7700 CPU @ 3.6GHz.

I could go on, but the answers to your questions are literally in the README that you somehow didn't see.
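
As a quick sanity check on the README row quoted above, the FPS figures appear to be just the reciprocal of the per-frame time. A minimal Python sketch (the function name is made up for illustration; the small differences from the quoted FPS presumably come from the times being rounded to two decimals):

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Frames per second implied by a per-frame latency in milliseconds."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# Timings quoted from the README row for cnn (CPU, 128x96).
for latency_ms, quoted_fps in [(2.35, 425.95), (0.64, 1562.10)]:
    computed = fps_from_latency_ms(latency_ms)
    print(f"{latency_ms} ms -> {computed:.2f} FPS (README: {quoted_fps})")
```

So the headline number is a per-frame latency for one small input, restated as a rate.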

For reference, just naming the CPU isn’t exactly what I meant. However, I do see that some of it is there.

What I’m pointing out is that generally this is a very loose claim. 1500FPS is for a very specific problem, on a rig that is still not defined, using images that are similarly not defined.

What’s the RAM, IO speed, what type of images, etc. Comparing to other methods on the same rig solves that.

Again, I see your point, but I was generally pointing out that the analysis or claims of this nature are not clean or clear.

It's even more meaningless once you realize that you can get arbitrary FPS just by running multiple programs side-by-side.

Indeed, kinda what I mean by “hardware”. If I have 120 CPUs and 8 GPUs, personally I’d expect it to be faster. Face detection is super parallelizable.
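
To make the latency-versus-throughput point concrete, here is a rough Python sketch. It assumes perfect linear scaling across independent workers, which real hardware will not deliver (shared caches, memory bandwidth, thermal limits), and `aggregate_fps` is a hypothetical helper, not anything from the repo:

```python
def aggregate_fps(latency_ms: float, workers: int) -> float:
    """Idealized total frames/second when `workers` independent processes
    each need `latency_ms` per frame (assumes perfect scaling)."""
    if latency_ms <= 0 or workers < 1:
        raise ValueError("need positive latency and at least one worker")
    return workers * 1000.0 / latency_ms


# Same per-frame latency, very different headline FPS.
for workers in (1, 4, 120):
    print(workers, "workers ->", round(aggregate_fps(2.35, workers)), "FPS")
```

The per-frame latency never changes here, which is why an FPS figure without the worker count and hardware says little.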

You guys are silly, it's in the README. But you're right, total clickbait title. At reasonable resolution it's like 60FPS.

From a 128x96 image. Another way to look at it is that on an RPi you can handle 5 frames per second at 640x480, which probably means one frame every few seconds at the full RPi camera resolution. Sounds like the same results as Python with OpenCV?

You always scale the original image down; it is pretty standard.

The point being made here isn't how to get face recognition running at a workable speed on a Pi, though; it's that this is no faster than the common implementations, despite the catchy title.

Yes, but not to 128x96. Usually you want at least 640x480.

Wrong. Most facial recognition software reduces the largest image dimension to 128. This allows caching to become effective and massively speeds up the process (hence the 1.5k fps).

A facial recognition pipeline contains a few steps. For feature extraction people DO resize the image to something like 128x128, but for face detection (which is what this repo does) you DON'T.

Also, the reason we scale down the input image is not cache effectiveness; it's simply to reduce the computation needed. The possible exception is naive matrix multiplication and convolution implementations, which is what this repo uses. But there is no point in discussing performance if you are using a naive implementation that by design ignores cache, instruction latency, and anything else computer-architecture related. Please, at least take QNNPACK as a baseline.

Seems like something one might use in a pipeline following a Haar cascade to detect faces in a larger image in some sort of surveillance camera scenario.

I don't think that would be helpful. Haar cascades have too many false negatives (in part because they rely too heavily on specular reflections from your face's T-zone). Probably worth just using the CNN; you can make something else do region proposals if you really want. Probably another net :)

But they don't list the accuracy. If that's not important, I'll have 3000fps ready for investment by noon.

My initial thought was, does FPS stand for frames per second or faces per second? :P

The SOD embedded CV library also implements real-time frontal face detection via its Realnets architecture (~5 ms on an HD video stream).



1500FPS when the image is scaled down from 640x480 to 128x96. 64FPS at full resolution, which is slower than the "reference" model at 81FPS.

I don't understand. You feed in a 0.3 megapixel image, then scale down - how is there enough resolution to recognise a face? If the face is 25 pixels wide in the original it's now 5px, that can't be enough detail??
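
The arithmetic behind the two comments above can be sketched in a few lines of Python. The 640x480 and 128x96 sizes, the 25 px face, and the 1500 and 64 FPS figures come from the thread; `scale_factors` is a hypothetical helper name:

```python
def scale_factors(src, dst):
    """Return (linear_scale, pixel_ratio) for a resize from src to dst,
    where each size is a (width, height) tuple."""
    sx = src[0] / dst[0]
    sy = src[1] / dst[1]
    if abs(sx - sy) > 1e-9:
        raise ValueError("aspect ratio changes; no single scale factor")
    pixel_ratio = (src[0] * src[1]) / (dst[0] * dst[1])
    return sx, pixel_ratio


scale, pixel_ratio = scale_factors((640, 480), (128, 96))
face_px_after = 25 / scale   # a face 25 px wide in the original frame
fps_ratio = 1500 / 64        # downscaled FPS vs full-resolution FPS

print(f"linear scale: {scale}, pixel ratio: {pixel_ratio}")
print(f"face width after resize: {face_px_after} px")
print(f"FPS ratio: {fps_ratio:.1f}")
```

The downscale removes a factor of 25 in pixels, close to the roughly 23x gap between the two FPS figures, so most of the headline speed seems to come from the smaller input rather than from the detector itself.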

These are rather poor specs. Commercial applications in this space are in the tens of millions per second range, while consuming multiple HD resolution videos, and quite a bit more: spoof detection, expression neutralization, illumination normalization, and even 3D reconstruction. The software I write, commercially, is in the high tens of millions if you are only doing face detection.

They are 128 bytes.

Cool, link to repo? Oh...

Does anyone know the paper describing the CNN model?

Casual search didn't turn it up among the usual suspects (Scholar paper list, author's site).

I'm really interested in an accuracy benchmark comparison (as in detection IoU score) with other methods. CPU cost is not the whole story.
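
For anyone unfamiliar with the metric mentioned above, intersection over union for two axis-aligned boxes can be computed as below. This is a generic sketch, not code from the repo; the `(x1, y1, x2, y2)` corner format is an assumption:

```python
from typing import Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle, if there is one.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


predicted = (0.0, 0.0, 10.0, 10.0)
ground_truth = (5.0, 5.0, 15.0, 15.0)
print(f"IoU: {iou(predicted, ground_truth):.3f}")
```

A detection is typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold, which is what lets accuracy be compared across detectors alongside speed.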

I am not fully certain but based on the caffe file names, I think it may be a variant of what is in this paper,


which itself looks like a variant of MTCNN generally.

Probably a good thread to post this in... Anyone have an extraordinary use case for the domain facialrecognition.ai? I'd love to put it to good use.

Pro bono, right? That’s very kind of you in this day and age of domain squatting!

How does this compare to Haar Cascade?

Faces per second.
