Re 3)
I don’t know any details of Apple’s implementation, but computer vision algorithms typically integrate data from multiple sensors to generate a 3D model. The more data you have, the more robust the output will be.
It’s possible to generate reasonable 3D models of faces from a single photograph. [1]
The highest-resolution 3D scans I’ve seen are produced by aligning data from multiple high-resolution photographs.
The big problem with that approach is that it requires a lot of detail in the source material. Smooth surfaces, blurry images, or noise from poor lighting make it impossible for the algorithm to find features to align.
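To make that concrete, here’s a minimal sketch of the feature-finding and matching step in Python with OpenCV. ORB is just one detector choice (real photogrammetry pipelines often use SIFT or proprietary detectors), and the input filenames are hypothetical:

```python
import cv2

# Two hypothetical views of the same face.
img1 = cv2.imread("face_view1.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("face_view2.jpg", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=2000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# On smooth, blurry, or poorly lit images this is exactly where things
# fall apart: few or no keypoints, so there is nothing to align.
print(f"view 1: {len(kp1)} keypoints, view 2: {len(kp2)} keypoints")

# Match descriptors across views; the surviving correspondences are
# what a photogrammetry pipeline triangulates into 3D points.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
print(f"{len(matches)} candidate correspondences")
```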
This is where the dot matrix projector comes in: by projecting a bunch of dots on your face, you get features that the algorithm can align, making the scan faster and more robust in low light.
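Again, just a sketch and nothing to do with Apple’s actual pipeline: once dots are projected, even a featureless surface yields trackable points, and something as simple as OpenCV’s blob detector can pick them up (the IR filename is made up):

```python
import cv2

# Hypothetical IR frame with the dot pattern projected onto a face.
ir = cv2.imread("ir_dot_frame.png", cv2.IMREAD_GRAYSCALE)

params = cv2.SimpleBlobDetector_Params()
params.filterByColor = True
params.blobColor = 255   # bright dots on a darker background
params.filterByArea = True
params.minArea = 2       # each projected dot is only a few pixels wide
params.maxArea = 64

detector = cv2.SimpleBlobDetector_create(params)
dots = detector.detect(ir)

# Every detected dot is a trackable feature, even on a smooth cheek
# that would yield no natural keypoints.
print(f"recovered {len(dots)} dot features")
```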
And if you're interested in building 3D models from multiple photographs, try Helicon Focus. You take a focus-stacked set of images (basically: get a macro lens, open it up fairly wide, and take a picture with the focal plane about 1 cm (or so) apart each time until every part of your subject is in sharp focus), and it looks for the sharply focused parts of each image to infer depth information for the stack. It can then build you a 3D model.
Pretty neat stuff, though I've never found any actual artistic or practical use for it.
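I don’t know what Helicon Focus does internally, but the core depth-from-focus idea is simple enough to sketch: score every slice of the stack for local sharpness and take the per-pixel argmax. The filenames and stack size here are made up; the 1 cm spacing matches the example above:

```python
import cv2
import numpy as np

# Hypothetical focus stack, nearest focal plane first, shot with the
# focal plane moved ~1 cm between frames.
stack = [cv2.imread(f"stack_{i:02d}.jpg", cv2.IMREAD_GRAYSCALE)
         for i in range(20)]

def sharpness(img):
    # Local sharpness: Laplacian magnitude, lightly smoothed so the
    # per-pixel argmax below is less noisy.
    lap = cv2.Laplacian(img, cv2.CV_64F, ksize=3)
    return cv2.GaussianBlur(np.abs(lap), (9, 9), 0)

scores = np.stack([sharpness(img) for img in stack])  # shape (n, h, w)

# The index of the sharpest slice at each pixel is, up to the
# focal-plane spacing, that pixel's depth.
depth_index = np.argmax(scores, axis=0)
depth_mm = depth_index * 10.0  # 1 cm per slice in this example

# Write the depth map scaled to an 8-bit grayscale image.
cv2.imwrite("depth_map.png",
            (depth_index * (255 // (len(stack) - 1))).astype(np.uint8))
```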
[1]: http://kunzhou.net/2015/hqhairmodeling.pdf