Seems easy enough, except for the 'standing still' part: holding the same pose long enough to be captured lit from multiple angles. Is that the key? Was it all that difficult, and did you use any particular techniques?
This seems a bit like various volume rendering techniques, except that the objects have no transparency (as they would in a typical tomographic reconstruction).
Maybe we're using different terms or definitions here, but for me "volume rendering" is clearly a separate domain from "extracting spatial information from 2D imagery".
Extracting the spatial information is a subset of the same problem, I think. Techniques like tomographic reconstruction use either multiple rays through the same object or parallel rays through the object at different angles; this yields the density throughout the object, and from that the volume and its shape. If you consider the same problem where each point has either zero density or 100% density, you get the spatial information from multiple 2D projections.
I'm not really savvy on these problems, though, so the math could be totally different and unrelated, but I think you can accomplish the same thing with either method.
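That binary-density special case is essentially silhouette-based voxel carving (the "visual hull"). A minimal sketch, assuming a toy 8x8x8 grid and three axis-aligned orthographic "shadow" views in place of calibrated cameras; all names here are illustrative, not from any real pipeline:

```python
# Toy silhouette-based voxel carving ("visual hull") on an 8x8x8 grid.
# Three axis-aligned orthographic views stand in for calibrated cameras.
N = 8

def inside_object(x, y, z, r2=9.0, c=3.5):
    # Ground-truth object: a solid sphere. Used only to generate silhouettes.
    return (x - c) ** 2 + (y - c) ** 2 + (z - c) ** 2 <= r2

grid = [(x, y, z) for x in range(N) for y in range(N) for z in range(N)]
obj = {v for v in grid if inside_object(*v)}

# Silhouettes: the binary shadow of the object along each axis.
sil_x = {(y, z) for (x, y, z) in obj}
sil_y = {(x, z) for (x, y, z) in obj}
sil_z = {(x, y) for (x, y, z) in obj}

# Carve: a voxel survives only if it projects inside every silhouette.
hull = {(x, y, z) for (x, y, z) in grid
        if (y, z) in sil_x and (x, z) in sil_y and (x, y) in sil_z}

# The visual hull always contains the true object; with few views it
# over-approximates (concavities the silhouettes can never carve away).
assert obj <= hull
```

With only a handful of views the hull is a loose superset of the object, which is why shadow-based reconstructions can't recover concave regions.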
There is a lot of research in the Computer Vision literature on this topic and its generalization. One particularly prolific researcher on the subject is Jan Koenderink[1].
Or search for "Shape from Shading" or "shape from occluding contours"
That paper is great. I see people were thinking my thoughts before I was even born. I am going to have to spend a weekend reading through all this new material.
May I shamelessly plug my YouTube video of voxel carving and coloring. I am currently working on my master's thesis, where I'm building a 3D scanner with a turntable and a camera:
This is pretty much what I did for my Master's, in 2000.
I also applied a triangular mesh to the surface, extracted textures for each triangle, and then tried to optimize edge flips to minimize blurriness in the resultant triangles.
Yes, the linked video just shows the first step of the whole scanning process. I'm optimizing the 3D reconstruction based on texture correlation and estimating true depth for each surface point via the cross ratio from four known points. Depth estimation from calibrated images via the cross ratio is particularly interesting (IMHO), since it seems it hasn't been done before.
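What makes the cross ratio usable here is that it is invariant under perspective projection, so a ratio measured in the image equals the ratio of the true depths. A quick numeric check of that invariance (pure Python, illustrative only; the 1D map below is a stand-in for a camera projection, not the thesis's actual pipeline):

```python
def cross_ratio(a, b, c, d):
    # Cross ratio (A,B;C,D) of four collinear points, given 1D coordinates.
    return ((c - a) * (d - b)) / ((d - a) * (c - b))

def project(x, f=2.0, t=1.0):
    # A 1D perspective (Moebius) map x -> f*x / (x + t), standing in for
    # projection of points on a world line onto an image line.
    return f * x / (x + t)

pts = [1.0, 2.0, 4.0, 8.0]
before = cross_ratio(*pts)
after = cross_ratio(*[project(x) for x in pts])

# The cross ratio survives projection, so if three of the four points'
# depths are known, the fourth depth can be solved for from the image.
assert abs(before - after) < 1e-9
print(before)  # 9/7 ~ 1.2857...
```

In practice this means three calibrated reference points plus one image measurement pin down the unknown surface depth with ordinary algebra, no triangulation needed.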
The Radon transform was the first thing that came to my mind, too[0].
It does the same thing, but more rigorously and flexibly. Instead of just looking at the shadows (which are essentially a binary 'this is part of the object / this is not') and reconstructing the surface ad hoc as OP did[1], you could also apply it to a translucent object and reconstruct its interior as well.
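The contrast between the two can be shown with a toy example: a Radon-style projection sums density along parallel rays, while a shadow only records whether each ray hit anything. Here is a pure-Python sketch using just the two axis-aligned projections of a tiny binary image (a full Radon transform integrates along lines at every angle):

```python
# Parallel-ray projections of a tiny binary "object" at 0 and 90 degrees.
# Each projection sums densities along rays; a shadow instead records only
# whether each sum is nonzero, discarding the interior information.
img = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

proj_rows = [sum(row) for row in img]                        # rays along x
proj_cols = [sum(row[j] for row in img) for j in range(4)]   # rays along y

# Binarising a projection gives exactly the silhouette used in shadow carving:
silhouette = [1 if s > 0 else 0 for s in proj_rows]

print(proj_rows)   # [0, 2, 2, 0]
print(proj_cols)   # [0, 2, 2, 0]
print(silhouette)  # [0, 1, 1, 0]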
As a crystallographer, I find this analogous to the reciprocal space -> real space transform in X-ray diffraction crystallography.
Only, in this case the negative space (from a shadow) is the raw data instead of diffraction spots. So no Bragg's law or Ewald spheres involved here, but fascinating nonetheless!
Awesome writeup and experiments. Especially like the crossed-eyes stereo image.