When in the viewer, press C to see the 3D interpretation of individual shots and M for a map of the path taken by the camera.
libmv's codebase seems to be forked, with an earlier version at Google Code: http://code.google.com/p/libmv/ which also contains an interesting summary of other libraries in the 3D reconstruction space. Blender also has its own fork, which it uses for matchmoving, i.e. the integration of animated objects into a real-world scene.
In turn, libmv seems to be influenced by the work of Marc Pollefeys, whose tutorial is a readable summary of how to go from a collection of 2D images to a 3D model.
Question: Can a knowledgeable person here suggest which codebase is the best to start experimenting with, to build an application that converts a 2D photo sequence into a dimensionally accurate 3D model?
We actively maintain GTSAM and release new features as they are published. While we don't provide a full out-of-the-box pipeline (yet!), there are plenty of examples and documentation which walk you through the math, implementation, and other issues. If you want to read about the graphical models underlying GTSAM, see
Utilizing OpenCV for feature detection and association is pretty much all you really need to add to a program in order to recreate Photosynth using GTSAM. I'd also recommend KAZE features, from a former post-doc out of our lab: it's state of the art, and OpenCV wrappers were recently added. However, it's also trivial to integrate other sensors such as IMUs, GPS, lasers, etc. for full navigation problems.
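To make that concrete, here's a minimal sketch of the OpenCV side, assuming a pair of images from a photo sequence (filenames and the ratio-test threshold are placeholders). The resulting pixel correspondences are exactly the kind of associations you would then feed into GTSAM's projection factors or any other bundle adjuster:

```python
import cv2
import numpy as np

# Hypothetical pair of consecutive shots from a photo sequence
img1 = cv2.imread("shot_000.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("shot_001.jpg", cv2.IMREAD_GRAYSCALE)

# KAZE detector/descriptor (shipped with OpenCV 3.0+)
kaze = cv2.KAZE_create()
kp1, des1 = kaze.detectAndCompute(img1, None)
kp2, des2 = kaze.detectAndCompute(img2, None)

# Brute-force matching with Lowe's ratio test to drop ambiguous matches
matcher = cv2.BFMatcher(cv2.NORM_L2)
knn = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in knn if m.distance < 0.7 * n.distance]

# Pixel coordinates of the surviving correspondences
pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
pts2 = np.float32([kp2[m.trainIdx].pt for m in good])
```

If KAZE is too slow for your sequence, AKAZE (`cv2.AKAZE_create()`) is a faster binary-descriptor variant in the same family.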
If you wish to know more about the actual subject, I definitely recommend Hartley and Zisserman's Multiple View Geometry book.
The one I spend the most time at is pretty flat, though. I suspect the green hills are hard to get a match on? Here is an example: http://www.youtube.com/watch?v=EyZcERAlBeE
Briefly, there are three main steps required to go from images to a 3d viewer like PhotoSynth:
1. Figure out where each image was shot from (the "camera pose") and get a sparse set of 3d points from the scene. These two are estimated simultaneously using bundle adjustment (see the sketch below).
2. Go from a sparse set of 3d points to a dense 3d model. This is done using a technique called Multiple View Stereo (MVS), of which the leading (open) implementations are PMVS/CMVS [3,4].
3. Build an image-based rendering system that intelligently blends between the 3d models and images to minimize artifacts.
The VisualSFM software will do steps 1 and 2. Step 3 is still quite a challenging problem, but depending on what you're doing, you could use standard 3d modeling environments to look at your data.
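For a feel of what step 1 involves at its smallest scale, here is a hedged two-view sketch using OpenCV: the intrinsics matrix K is made up, and the matched points pts1/pts2 are assumed to come from a feature matcher like the one sketched earlier. Real pipelines such as VisualSFM chain many such views together and refine everything jointly with bundle adjustment:

```python
import cv2
import numpy as np

# Assumed inputs: matched pixel coordinates pts1, pts2 (Nx2 float32 arrays)
# and a camera intrinsics matrix K (the values below are placeholders).
K = np.array([[2000.0,    0.0, 960.0],
              [   0.0, 2000.0, 540.0],
              [   0.0,    0.0,   1.0]])

# Essential matrix with RANSAC to reject mismatched features
E, inliers = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)

# Relative camera pose (rotation R and translation t, known only up to scale)
_, R, t, pose_mask = cv2.recoverPose(E, pts1, pts2, K, mask=inliers)

# Projection matrices for the two views, then a sparse triangulated cloud
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t])
pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
pts3d = (pts4d[:3] / pts4d[3]).T   # homogeneous -> Euclidean, one row per point
```

The resulting sparse cloud is the seed that MVS tools like PMVS/CMVS densify in step 2.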
A nice reminder that Microsoft really does have some great engineering talent and they can break new ground.
Not to dispute this--they certainly do.
But sadly for Microsoft, Blaise Agüera y Arcas (one of the creators of the original Photosynth) just left MS for Google in December.
Does anyone see an environment in this new version that still allows freedom of movement?
The technology is clearly deeply related to the more freeform movement variants of the past. It's likely that even better freeform movement can be stitched together from a collection of linear videos than could be from a collection of stills in the past. I wouldn't be surprised if we see that start to happen soon.
In the previous version, random pictures around a scene could be stitched together allowing an experience you could explore. For example, one of the original, popular photosynths allowed you to explore inside an art studio. You could look up at the ceiling, walk on various paths, move close into pictures, etc. In this new version, you're stuck on rails.
tl;dr: In this version each node has two exit points: next or previous picture. In the previous version each node had an unlimited number of exit points to other pictures.
I imagine there could eventually be better interactivity with the underlying 3D model than video could provide. Certain surfaces could be links to more information or another photosynth, for example. It kind of reminds me of some of the VRML demos from the 90s, but without the plugins and working backwards from photos instead of forward from models.
Is it possible to release the source code of some of the older projects like the PhotoTour Viewer?
There are still some strange artefacts remaining, though. For example, on this demo - http://photosynth.net/preview/view/c7287786-a863-4291-a291-d... - watch the bases of the dragons as the camera pans left to right. The first two seem to stitch together fine, but the last two go wrong and bend outwards as if they are moving in the wrong direction. It's strange because other parts of the scene are perfect.
But making meaningful 3d triangulations out of point clouds is a whole other story.
The glitchy charm of the new version pales next to the wonder of seeing an explorable point cloud created out of a pile of photos of Stonehenge.
First off, I have a ton of respect for everyone I've met and spoken with on the Photosynth team. They represent all that is great about Microsoft Research (though Photosynth moved to the Bing Maps department a few years ago).
The first iteration of Photosynth was the one shown by Blaise Aguera y Arcas in what is now one of the most popular TED talks of all time. Basically, it automatically arranged photos in 3d space from where each picture was taken, and allowed the user to "fly" from one photo to the next, giving a real feeling of navigating through 3d space.
The prospect and amazing, working demonstration of taking all the world's photos and mapping them together into a single quasi-3d space was a pretty incredible idea (for which Apple has just had a patent approved - WTF!).
The Photosynth service itself, in my opinion, did not go far enough in combining the content of different users to achieve the goal of huge groups of images spanning very large (even city-wide) spaces. (There must have been significant usage/copyright issues which prevented a service like this from aggregating as many photos as would be required to achieve this.)
On the user side, regular people had some trouble with the UI of Photosynth -- while the technology was obviously impressive, breathtaking at times, navigating this 3d space on a 2d screen is a very difficult thing to design well, and there is a learning curve. This was something which I think prevented more wide-scale adoption. (The other thing which personally turned me off was the Silverlight requirement...)
Around the same time, Google built a "look around" feature in Panoramio which offered very similar functionality, but it remained fairly obscure, despite eventually being baked into the Panoramio layer on Google Maps/Streetview.
A couple years later, the Photosynth team built an iPhone app for stitching panoramas, and redesigned the Photosynth service to be more centered around 360° photography.
The Photosynth iPhone app was absolutely groundbreaking for its time, blowing away every comparable app in every respect (the size limit on the output pano, as remarked elsewhere here, is small, but that is mainly due to the strict RAM limitations of the iPhone rather than any fault of the app itself). It has taken 3 or 4 years for anything to catch up to the quality and usability of the Photosynth app (Android Photosphere now has that crown).
Now, we are seeing the "New Photosynth" (which Microsoft seems to be calling "Photosynth 2", though it seems to me more like "Photosynth 3"). This New Photosynth, to me, is simply awesome. What is interesting about it is that it seems to have the same guts as the original Photosynth, but the UI is completely redesigned and built in a very linear way, which obviously addresses the original "weirdness" of the Photosynth 1 UI. This accomplishes a few things: it directs users to make a more consistent type of content (you now have 4 different types of photo sequences you can shoot), and it gives viewers one and only one way to consume that content. It also allows a better kind of "autoplay" functionality, if you want to simply watch the sequence of images without interacting with it.
What I don't like about the content that I've personally created so far is that it seems to be quite glitchy. Even when I shoot something carefully, there seem to be numerous artifacts in the 3d shapes that are created. I am guessing that this could be reduced considerably if the full resolution of the images were used for the 3d reconstruction, at the cost of more computation.
All things considered, I really like where Microsoft is headed with Photosynth, and I look forward to seeing where things move.
One hint at what could be to come is that the amazing new Ricoh Theta has Photosynth support, which hopefully means that there will be some way to join together spherical panoramas into a "synth" at some point in the future, allowing a more freeform navigation within the 3d space.
The Louvre: http://photosynth.net/view.aspx?cid=3d67aa96-ac60-43ee-9644-...
Underneath Eiffel Tower: http://photosynth.net/view.aspx?cid=f0f50007-42cb-4236-83a9-... (look up!!)
They're very fun to take, and the apps they have make it super easy to do. Curious to try these new versions (though they seem somewhat more cumbersome...)
- walkthrough of a wealth manager's office
- boat cruising around a marina
- a walk through an exclusive shopping district with an Hermes and Louis Vuitton.
- a duomo in Florence.
EDIT: I guess not having to use a dolly for smooth motion is a huge plus. But the tradeoff, of course, is loss of quality in the interpolated "frames".
I guess the Oculus would improve the 3D aspect of it but you wouldn't be able to look around left and right while traveling.
I tried pressing `c`, but it showed only some cracked images; it doesn't seem really meaningful.
The smooth transitioning between the stills is the tech here.
Given that most cameras nowadays record video, this seems rather pointless. The only advantage I can think of is that you can normally capture higher-quality images with stills than with video.
Interesting tech, but that's a real pity.
I beg to differ that it never caught on ;-) Full disclosure: I've based my career around it ;-)
Also, in Chrome Canary it frequently crashes the tab or gives the "WebGL hit a snag" message, which requires you to click reload before the site works properly again.
Edit: Why has this been downvoted? All you get on the first page is a large photograph and a circle with more photos. Until you click a photo or the Learn More link, it is not clear what the site is about...
Also, I think I'm missing something, because I just get an HTML5 video of a scene.
However, I'd personally find it useful to know if my project was working in pre-release browsers, especially if it is more than your basic web app, to ensure future compatibility before that pre-release version makes it out as a stable version.
If you visit the site for the first time, like I just have, you have no idea what it is. You are just looking at a large image with some additional photos in circles. Just a simple phrase, such as the first line from the Learn More page -- "Capture [and view] the places you love in amazing resolution and full 3D." -- and perhaps a "Try it, select a scene" near the circles would make it much more obvious.
Or perhaps even better, when someone visits for the first time, give them a quick demo or walkthrough.