
An On-Device Deep Neural Network for Face Detection - olivercameron
https://machinelearning.apple.com/2017/11/16/face-detection.html
======
QasimK
I really have to applaud Apple for doing these things on the device instead of
on their cloud servers. They respect your privacy far more than most large
tech companies (I would just say “they respect your privacy, others don’t”.)

~~~
williamxd3
How would they do it on the server? No internet connection = can't use the phone?

~~~
spynxic
I don't have the time at the moment to check whether or not Apple uses this
approach; however, Google went about solving offline functionality in a clever
way. [https://research.googleblog.com/2015/07/how-google-
translate...](https://research.googleblog.com/2015/07/how-google-translate-
squeezes-deep.html)

------
pat2man
This appears to be what powers the basic face detection in the Vision
framework, not what powers Face ID:
[https://developer.apple.com/documentation/vision](https://developer.apple.com/documentation/vision)

It is not capable of identifying individual people, just recognizing faces and
facial features in an image. Still quite useful for things like Snapchat's
masks.

~~~
trumpeta
In my limited trials it was not terribly performant, running at only 15fps or
so.

~~~
ctlaltdefeat
I would consider that pretty performant for a task such as this on a phone.

~~~
cycrutchfield
Compared to non-DNN techniques? I don’t have concrete numbers but I’m pretty
sure “classical” techniques can do this with much higher throughput.

~~~
tedunangst
With equal accuracy?

~~~
tzs
If the non-NN techniques are faster, but the NN techniques are more accurate,
couldn't they be combined to provide both speed and accuracy?

Use the NN to find the faces, and the non-NN to track faces once the NN has
found them, and use the NN to periodically check to see that the non-NN is
still locked on.

~~~
tedunangst
Does that work? A NN might detect half a face, but then how do you switch over
to the "find two dots" technique when there's only one eye? This seems
susceptible to a lot of flapping.

~~~
tzs
I'd expect that the non-NN part would be more of a "track movement of this
arbitrary blob" thing rather than a "track movement of this face" thing.

Suppose the NN runs at only 25% of the speed needed to support the frame rate
you want. Then every time you get a new face blob list from the NN, the non-NN
tracker would have to track the blobs for 3 frames. My guess is that in most
common photography situations where you need face detection, faces won't move
very far or change orientation or lighting very much in 3 frames.
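A minimal sketch of the interleaving scheme described above. Everything here is illustrative, not any real API: `nn_detect` stands in for the expensive NN detector, and the single per-frame `(dx, dy)` motion estimate stands in for a cheap blob/optical-flow tracker (a real system would estimate motion per box).

```python
# Hybrid face tracking sketch: run the slow NN detector on every Nth frame,
# and bridge the in-between frames with a cheap motion-shift tracker.
from dataclasses import dataclass
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h)

@dataclass
class HybridTracker:
    nn_detect: Callable[[object], List[Box]]  # expensive but accurate
    nn_every: int = 4            # NN at 25% of frame rate => 1 in 4 frames
    boxes: List[Box] = None
    frame_idx: int = 0

    def step(self, frame, motion: Tuple[float, float]) -> List[Box]:
        """Return face boxes for this frame."""
        if self.frame_idx % self.nn_every == 0 or not self.boxes:
            # Slow path: re-anchor to ground truth from the NN.
            self.boxes = self.nn_detect(frame)
        else:
            # Fast path: shift last known boxes by the estimated motion.
            dx, dy = motion
            self.boxes = [(x + dx, y + dy, w, h)
                          for (x, y, w, h) in self.boxes]
        self.frame_idx += 1
        return self.boxes
```

Over 8 frames, the NN fires only on frames 0 and 4; the cheap tracker carries the boxes (with some drift) in between, which is exactly the "NN periodically checks that the non-NN is still locked on" division of labor.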

------
newscracker
I understand this article is about face detection, but on a tangential note,
face identification against known faces in the Photos app has been laughable
for me on iOS 10 (haven’t checked if this has improved in iOS 11, since I
don’t have many new photos with new faces). Even with some manual training,
the face identification and name matching seemed quite primitive, getting most
of its guesses wrong (for example, matching kids’ faces against adults’
faces).

That said, I do strongly admire and appreciate Apple’s stance on privacy and
the software doing this kind of work on the device and not in the cloud. I can
wait for these things to improve.

~~~
nkristoffersen
For me, the faces function in iOS Photos has been pretty amazing. Accurately
detecting blurry background faces is surprising. I just checked it, and it
picked out a picture of me where just my nose and part of a cheek are showing!
(aviator sunglasses and a scarf covering my face).

------
cromwellian
I feel like this paper reads partly like a paper on AI and partly like a
marketing humblebrag: “back in 2014 when we were doing X”, which seems like
it’s trying to say “we’ve been doing DNN face detection since before Google
and Facebook made it popular”. But if you read papers from Google, FB, MS, et
al. on AI, they don’t have these kinds of humblebrags, and they contain far
more citations of previous work in the field.

~~~
ucaetano
Yep, plenty of it:

 _However, due to Apple’s strong commitment to user privacy, we couldn’t use
iCloud servers for computer vision computations. Every photo and video sent to
iCloud Photo Library is encrypted on the device before it is sent to cloud
storage, and can only be decrypted by devices that are registered with the
iCloud account_

~~~
sarreph
What do you want them to say?

Something like "We didn't use an already-existing, massive database of ours
because [REDACTED]."

Despite my somewhat facetious alternative, I am genuinely interested in how
you think they could have worded that 'better'.

~~~
cromwellian
They could have just matter-of-factly described their algorithm. How many
papers do you read from the ACM that start off with a long, marketing-oriented
set of paragraphs justifying why the paper's algorithm was written?

Like, if you're reading a paper on style transfer using DNNs, you don't get
"Well, we wanted to paint replicas of these paintings, but due to corporate
issues, our motto, or our lack of impressionists, we were forced to invent
this algorithm..."

I just felt like there was an excess of Apple branding and marketing in what's
supposed to be a scientific paper. Yes, Google does this too, but on its
blogs, not in the actual CS papers it publishes.

The papers on Map Reduce or Dremel aren't full of humble brags.

~~~
threeseed
This isn't a research paper, and Apple never claimed it to be one. It's not in
a scientific journal, it's not peer reviewed, it doesn't contain much detail,
and it lacks the norms you would expect from a paper. It's even hosted on
Apple.com in a blog-style format.

So why are you holding it to a research paper standard?

~~~
cromwellian
Maybe because I expect research to be published not just as marketing fluff,
but also as actual papers. Apple was long known for blocking its AI
researchers from participating in conferences or publishing. They published
their first paper only recently, in 2016
([https://www.macrumors.com/2016/12/26/apples-ai-team-first-
pa...](https://www.macrumors.com/2016/12/26/apples-ai-team-first-paper/)), and
it seems like squeezing water from a rock.

This behavior rubs me the wrong way. Science is a collaborative community
endeavor. Look at how many papers Google has published
([https://research.google.com/pubs/papers.html](https://research.google.com/pubs/papers.html)):
874 in machine intelligence alone, arguably Google's main secret sauce, which
you could argue it should keep as a trade secret for competitive
differentiation.

Apple's secrecy in product development is fine, but IMHO, if you're consuming
the fruits of community academic and commercial research, and trumpeting your
products' advancements based on it, it behooves you to publish the papers more
openly, at least on preprint services like arXiv.

------
margorczynski
Pushing human-identification capable software and libraries everywhere is
starting to get pretty creepy...

~~~
yathern
Not sure if this was your implication - but this is not capable of identifying
people in terms of putting a name to a face. Just identifying where there is a
face.

~~~
MBCook
Right. And it's used for things like helping the camera make sure it keeps
people in focus.

~~~
StanislavPetrov
What it is used for is unrelated to what it's capable of doing. The danger
always lies in the capability, not the intent.

------
satyajeet23
"Every photo and video sent to iCloud Photo Library is encrypted on the device
before it is sent to cloud storage"

------
dontreact
No quantization? There's a lot of optimization being left on the table by
doing float math at inference time.

------
amelius
Isn't this part of every consumer camera and phone already?

