A back-projection algorithm to extract 3D volume from shadows (minardi.org)
144 points by doctoboggan on July 29, 2013 | 25 comments



This result (the shape that can theoretically be reconstructed from its shadows) is called the Visual Hull. See http://pdf.aminer.org/000/293/374/the_visual_hull_of_curved_... for the original paper.

Awesome writeup and experiments. Especially like the crossed-eyes stereo image.


Space carving is a similar idea [Kutulakos and Seitz, 2000]: http://www.cs.toronto.edu/~kyros/pubs/00.ijcv.carve.pdf
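
For anyone who wants to play with the idea: with orthographic views, carving reduces to intersecting the silhouettes extruded along their viewing axes. A toy numpy sketch of my own (not the algorithm from either paper):

    import numpy as np

    def carve_visual_hull(silhouettes, n=64):
        # Keep a voxel only if its projection lands inside every silhouette.
        sil_x, sil_y, sil_z = (np.asarray(s, dtype=bool) for s in silhouettes)
        vol = np.ones((n, n, n), dtype=bool)
        vol &= sil_x[np.newaxis, :, :]   # view along x constrains (y, z)
        vol &= sil_y[:, np.newaxis, :]   # view along y constrains (x, z)
        vol &= sil_z[:, :, np.newaxis]   # view along z constrains (x, y)
        return vol

    # Carve with three identical circular "shadows" of a sphere.
    n = 64
    yy, zz = np.mgrid[0:n, 0:n]
    circle = (yy - n / 2) ** 2 + (zz - n / 2) ** 2 < (n / 3) ** 2
    hull = carve_visual_hull([circle, circle, circle], n)
    print(hull.sum(), "voxels survive carving")

Note the result is the intersection of three cylinders rather than the sphere itself; the hull always over-approximates the object, which is exactly what the visual hull result above formalizes.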


Very interesting. I knew I was not the first to think along these lines; I just was not sure what terms to google for prior research.


You may also enjoy http://www.cs.princeton.edu/~smr/papers/rt_model/rt_model.pd...

http://gfx.cs.princeton.edu/proj/3dscanning/

Or you might find some things of interest in my article collection https://dl.dropboxusercontent.com/u/315/articles/index_no_im...

(No doubt Dropbox will disable my public links soon. I wish there were an easy way to mirror my Dropbox folder to S3.)



I, too, am casually obsessed with algorithms for extracting geometry data from photographs. Here's a JavaScript experiment/demo from a few years ago: http://francoislaberge.com/labs/normal_mapping/me/

I'd always meant to make a writeup about it. I thought your article was great. Email me sometime if you continue to geek out in this space.


Very cool. Any info on how you created the normal maps for the pictures of faces?

I can see how you'd do it for a 3d object you can easily render out various ways, but I'm a bit puzzled by the photos.

Edit: Ok so I found this: http://www.zarria.net/nrmphoto/nrmphoto.html

Seems easy enough, except for the model standing still long enough to be captured lit from multiple angles in the same pose. Is that the key? Was it all that difficult, and did you use any particular techniques?
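
If I'm reading that page right, the recovery step is essentially photometric stereo: per-pixel least squares under a Lambertian model. A rough numpy sketch of that idea (my own notation; `images` and `light_dirs` are assumed inputs, not anything from the article):

    import numpy as np

    def normals_from_lit_photos(images, light_dirs):
        # images:     list of K grayscale arrays, each H x W, same pose
        # light_dirs: K unit vectors (K x 3), one per photo
        # Lambertian model: intensity I = L . n, solved per pixel.
        L = np.asarray(light_dirs, dtype=float)          # K x 3
        I = np.stack([im.reshape(-1) for im in images])  # K x (H*W)
        n_vec, *_ = np.linalg.lstsq(L, I, rcond=None)    # 3 x (H*W)
        h, w = images[0].shape
        n_map = n_vec.T.reshape(h, w, 3)
        # Normalize; the vector length is the albedo, discarded here.
        norm = np.linalg.norm(n_map, axis=2, keepdims=True)
        return n_map / np.maximum(norm, 1e-8)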


That article essentially captures it; I just wanted to write up a more intuitive explanation.


This seems a bit like various volume rendering techniques, except with the objects having no transparency (as they would in a typical tomographic reconstruction).

See: http://dl.acm.org/citation.cfm?id=378484 and http://en.wikipedia.org/wiki/Tomographic_reconstruction

Very cool for a simple implementation, and considering that the author doesn't appear to have dug deep into existing work.


Maybe we're using different terms or definitions here, but for me "volume rendering" is a clearly separate domain from "extracting spatial information from 2D imagery".


edit: The above comment https://news.ycombinator.com/item?id=6123729 is more correct than I am, I think :)

Extracting the spatial information is a subset of the same problem, I think. Tomographic reconstruction uses either multiple rays through the same object or parallel rays through the object at different angles; that gives you density throughout the object, and from the density you get the volume and its shape. If you restrict the same problem to binary density (either 0% or 100%), you get the spatial information from multiple 2D projections (see the sketch at the end of this comment).

I ran into this when playing with IDL ages ago: http://northstar-www.dartmouth.edu/doc/idl/html_6.2/VOXEL_PR... and learned about the rest from the references there.

I'm not really savvy on these problems, though, so the math could be totally different and unrelated; still, I think you can accomplish the same thing with either method.
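
To make the contrast concrete in 2D, here's a toy sketch of my own (nothing from the IDL docs): summing back-projections gives a density estimate, while intersecting binary shadows gives the shape directly.

    import numpy as np

    # 2D toy: a box and its 1D "shadows" along the two axes.
    n = 32
    obj = np.zeros((n, n), dtype=bool)
    obj[8:24, 12:20] = True
    shadow_rows = obj.any(axis=1)    # shadow seen looking along x
    shadow_cols = obj.any(axis=0)    # shadow seen looking along y

    # Tomography-style: back-project and sum -> a density estimate.
    density = shadow_rows[:, None].astype(float) + shadow_cols[None, :]

    # Binary (0% / 100%) version: back-project and intersect -> the shape.
    hull = shadow_rows[:, None] & shadow_cols[None, :]
    print(np.array_equal(hull, obj))   # True: a box is exactly recoverable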


My research group actually implemented the tomographic approach 14 years ago[1] (and in real time![2]). It worked pretty well.

[1] http://www.disp.duke.edu/~dbrady/imaTutorial/papers/science6... [2] http://www.disp.duke.edu/~sfeller/Publications/spiecr76-13.p...


There is a lot of research in the Computer Vision literature on this topic and its generalization. One particularly prolific researcher on the subject is Jan Koenderink[1].

Or search for "Shape from Shading" or "shape from occluding contours".

[1] http://www.cs.rutgers.edu/~decarlo/readings/koenderink-perce...


That paper is great. I see people were thinking my thoughts before I was even born. I am going to have to spend a weekend reading through all this new material.


Here is an implementation of a similar idea: http://www.vision.caltech.edu/bouguetj/ICCV98/index.html


You can do this with uploaded photos (not shadows) online using Autodesk 123D Catch:

http://www.123dapp.com/catch


May I shamelessly plug my YouTube video of voxel carving and coloring? I am currently working on my master's thesis, in which I am building a 3D scanner with a turntable and a camera:

http://www.youtube.com/watch?v=h1lQid08a3k


This is pretty much what I did for my Master's, in 2000.

I also applied a triangular mesh to the surface, extracted textures for each triangle, and then tried to optimize edge flips to minimize blurriness in the resultant triangles.


Yes, the linked video just shows the first step of the whole scanning process. I'm optimizing the 3D reconstruction based on texture correlation and estimating the true depth of each surface point via the cross ratio of four known points. Depth estimation from calibrated images via the cross ratio is particularly interesting (imho), since it seems it hasn't been done before.
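
For the curious: the cross ratio of four collinear points is preserved under perspective projection, so three known points plus image measurements pin down a fourth. A tiny sketch of the arithmetic (made-up numbers, purely illustrative, not my actual calibration):

    def cross_ratio(a, b, c, d):
        # Cross ratio (A,B;C,D) of four collinear points given as scalar
        # positions along their common line: (AC * BD) / (BC * AD).
        return ((c - a) * (d - b)) / ((c - b) * (d - a))

    def recover_fourth_point(a, b, c, cr):
        # Solve cr = ((c-a)(d-b)) / ((c-b)(d-a)) for d; it is linear in d.
        k = cr * (c - b) / (c - a)
        return (b - k * a) / (1 - k)

    # The cross ratio measured between four image points equals the
    # cross ratio of the corresponding world points.
    cr_img = cross_ratio(0.0, 1.0, 2.5, 4.0)              # from the image
    print(recover_fourth_point(0.0, 10.0, 25.0, cr_img))  # -> 40.0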


Reminds me of the Hough transform and the Radon transform.

Edit: Hough


The Radon transform was the first thing that came to my mind, too[0].

It does the same thing, but more rigorously and flexibly. Instead of just looking at the shadows (which are essentially a binary 'this is part of the object / this is not') and reconstructing the surface ad hoc the way OP did[1], you could apply it to a translucent object and reconstruct its interior as well (there's a sketch after the footnotes).

[0] https://en.wikipedia.org/wiki/Radon_transform

[1] Not to criticize OP here. It's cool to solve problems from scratch as practice and I don't think the Radon transform is common knowledge.
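
To see the difference in code: scikit-image ships radon/iradon, so you can push a density phantom (not just a binary mask) through the forward transform and recover its interior with filtered back-projection. A quick toy sketch, assuming scikit-image is installed (the phantom is my own made-up example):

    import numpy as np
    from skimage.transform import radon, iradon

    # A "translucent" phantom: interior density, zero outside a central disk.
    n = 128
    yy, xx = np.mgrid[0:n, 0:n]
    r2 = (xx - n / 2) ** 2 + (yy - n / 2) ** 2
    phantom = np.where(r2 < (n / 3) ** 2, 1.0 - r2 / (n / 3) ** 2, 0.0)

    theta = np.linspace(0.0, 180.0, 90, endpoint=False)
    sinogram = radon(phantom, theta=theta)    # line integrals at each angle
    recon = iradon(sinogram, theta=theta)     # filtered back-projection
    print(np.abs(recon - phantom).mean())     # small mean reconstruction error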


Hough


As a crystallographer, I find this analogous to transforming reciprocal space -> real space in X-ray diffraction crystallography.

Only in this case the negative space (from a shadow) is the raw data instead of diffraction spots. So no Bragg's law or Ewald spheres involved here, but fascinating nonetheless!


One could probably do backprojection with the otherwise useless Leap Motion controller and get 3D reconstructions of your hands... or other things.


Marching cubes is a beautiful algorithm, and it's the natural last step here: once the shadows have carved out a voxel volume, it turns the voxels into a triangle mesh.
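
A minimal sketch with scikit-image's implementation (a toy sphere stands in for a carved hull):

    import numpy as np
    from skimage import measure

    # A solid sphere stands in for a voxel volume carved from shadows.
    n = 64
    zz, yy, xx = np.mgrid[0:n, 0:n, 0:n]
    volume = (((xx - n / 2) ** 2 + (yy - n / 2) ** 2 + (zz - n / 2) ** 2)
              < (n / 3) ** 2).astype(float)

    # Extract the 0.5 iso-surface of the binary volume as a triangle mesh.
    verts, faces, normals, values = measure.marching_cubes(volume, level=0.5)
    print(len(verts), "vertices,", len(faces), "triangles")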



