- Detect at least one face in a frame (count)
- Detect all the faces in a frame (count)
- Detect the location of the first face in a frame
- Detect the location of all the faces in a frame
- Is the resolution 64x64 or 1028x1028
- Is it only finding some of the faces (what’s the accuracy)
- 1500FPS on what hardware
Does anyone have the paper? Otherwise, why is this here?
Extraordinary claims require extraordinary proof, and I don’t even see a README.
I'm confused. The link points to GitHub, which literally shows the README on the front page.
> Is the resolution 64x64 or 1028x1028
From the README: cnn (CPU, 128x96): 2.35 ms (425.95 FPS), 0.64 ms (1562.10 FPS)
> 1500FPS on what hardware
Intel(R) Core(TM) i7-7700 CPU @ 3.6GHz.
I could go on, but literally the answers to your questions are in the README that you somehow didn't see.
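As a quick sanity check on the README figures quoted above, the FPS values should be roughly the reciprocal of the per-frame times. A minimal sketch, assuming the times are per-frame latencies in milliseconds (the function name is made up for illustration):

```python
def fps_from_latency_ms(latency_ms: float) -> float:
    """Frames per second implied by a per-frame latency in milliseconds."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# (latency in ms, FPS as reported) pairs copied from the quoted README row.
quoted = [(2.35, 425.95), (0.64, 1562.10)]

for latency_ms, reported_fps in quoted:
    implied = fps_from_latency_ms(latency_ms)
    print(f"{latency_ms} ms -> {implied:.1f} FPS implied, {reported_fps} FPS reported")
```

The small gaps between implied and reported values are what you would expect if the latencies were rounded to two decimals before being published.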
What I’m pointing out is that generally this is a very loose claim. 1500FPS is for a very specific problem, on a rig that is still not defined, using images that are similarly not defined.
What’s the RAM, the IO speed, the type of images, etc.? Comparing to other methods on the same rig solves that.
Again, I see your point, but I was generally pointing out that analyses or claims of this nature are not clean or clear.
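For what "comparing on the same rig" could look like in practice, here is a rough sketch of a timing harness that runs every method over the same frames on the same machine. The detectors and frames below are stand-ins invented for illustration, not the repo's API; real use would plug in actual detector callables and decoded images.

```python
import statistics
import time
from typing import Any, Callable, Dict, Sequence


def benchmark(
    detectors: Dict[str, Callable[[Any], Any]],
    frames: Sequence[Any],
    warmup: int = 3,
    repeats: int = 10,
) -> Dict[str, Dict[str, float]]:
    """Time each detector on the same frames; report median latency and FPS."""
    results = {}
    for name, detect in detectors.items():
        # Warm-up calls so one-time setup costs are not counted.
        for frame in frames[:warmup]:
            detect(frame)
        per_frame_ms = []
        for _ in range(repeats):
            for frame in frames:
                start = time.perf_counter()
                detect(frame)
                per_frame_ms.append((time.perf_counter() - start) * 1000.0)
        median_ms = statistics.median(per_frame_ms)
        fps = 1000.0 / median_ms if median_ms > 0 else float("inf")
        results[name] = {"median_ms": median_ms, "fps": fps}
    return results


# Hypothetical stand-ins: each "frame" is a flat list of 128*96 pixel values,
# and each "detector" is just some work over the frame.
frames = [list(range(128 * 96)) for _ in range(5)]
detectors = {
    "method_a": lambda frame: sum(frame),
    "method_b": lambda frame: sorted(frame, reverse=True),
}

for name, stats in benchmark(detectors, frames).items():
    print(f"{name}: {stats['median_ms']:.3f} ms median, {stats['fps']:.0f} FPS")
```

Reporting the median rather than the best run, and publishing the input set alongside the numbers, is the part that makes the figures comparable across methods.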
Also, the reason we scale down the input image is not cache effectiveness; it's simply to reduce the computation needed. The exception might be naive matrix multiplication and convolution implementations, which is what this repo uses. But there is no point in discussing performance if you are using a naive implementation that by design ignores cache behavior, instruction latency, and anything else related to computer architecture. Please at least take QNNPACK as a baseline.
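To put a number on the "fewer computations" point: for a convolution layer, the multiply-accumulate count grows linearly with the number of output pixels. A back-of-the-envelope sketch, where the layer shape (3 input channels, 16 output channels, 3x3 kernel) and the 640x480 comparison size are assumptions picked for illustration, not taken from the repo:

```python
def conv2d_macs(height: int, width: int, c_in: int, c_out: int,
                kernel: int = 3, stride: int = 1) -> int:
    """Multiply-accumulates for one 'same'-padded 2D convolution layer."""
    out_h = -(-height // stride)  # ceiling division
    out_w = -(-width // stride)
    return out_h * out_w * c_in * c_out * kernel * kernel


# 128x96 is the input size quoted from the README; 640x480 is a hypothetical
# full-size frame for comparison.
small = conv2d_macs(height=96, width=128, c_in=3, c_out=16)
large = conv2d_macs(height=480, width=640, c_in=3, c_out=16)

print(f"128x96:  {small:,} MACs")
print(f"640x480: {large:,} MACs")
print(f"ratio:   {large / small:.1f}x")
```

Shrinking each side by 5x cuts the work for this layer by 25x, independent of how cache-friendly the implementation is.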
A casual search didn't turn it up among the usual suspects (Scholar paper list, author's site).
I'm really interested in a benchmark comparison of detection performance (as in IoU score) against other methods. CPU performance is not the whole story.
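For readers unfamiliar with the metric, IoU (intersection over union) scores how well a predicted box overlaps a ground-truth box. A minimal sketch, assuming axis-aligned boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2; the example boxes are made up:

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


predicted = (0.0, 0.0, 10.0, 10.0)
ground_truth = (5.0, 5.0, 15.0, 15.0)
print(f"IoU = {iou(predicted, ground_truth):.3f}")
```

A detector can be very fast and still score poorly here, which is why speed numbers alone do not settle the comparison.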
which itself looks like a variant of MTCNN generally.