I hope we see this technology actually become readily available. There might still be work to be done, but in general if they can reproduce the demo videos with other content then they're on to something people would want.
It appears that what they're doing here is simply extracting keyframes from the video, using them to compose a photosynth, then converting the synth's autoplay into a video. If you load a photosynth and press "c", you can even see the same point clouds and scene reconstruction shown on the research page.
Source: I worked on photosynth.
EDIT: After reading their description, I agree they are going the photosynth route. Why not? They have the technology you worked on. And they say that the naive subsampling I described above doesn't work...
Wow. Actually, if they add that technique to the mix it might solve the deformed "pop" effect you see in some videos, like the deformed building around 16 seconds into this video:
With sensors (gyros etc.) obtaining the camera path would be trivial, instead of recovering it from the video. Rendering the results would be possible on a mobile GPU, leaving only the frame-to-point-cloud conversion as the heavy part in terms of compute and memory.
Maybe some scheme where you downsample the input frames to create the deformation mesh, then apply that to the full-size frame, would be the way to go.
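A minimal sketch of that idea in NumPy (the mesh here is just a grid of offsets for illustration; a real pipeline would estimate it from downsampled frames and interpolate it smoothly rather than nearest-neighbour):

```python
import numpy as np

def apply_coarse_deformation(frame, coarse_flow):
    """Apply a low-resolution deformation mesh to a full-size frame.

    frame:       (H, W) image (grayscale for simplicity)
    coarse_flow: (h, w, 2) per-cell (dy, dx) offsets in cell units, h << H
    The mesh is computed cheaply at low resolution, then upsampled
    (nearest-neighbour here) and used to resample the full-res frame.
    """
    H, W = frame.shape
    h, w = coarse_flow.shape[:2]
    # Upsample offsets to full resolution and rescale them to pixel units.
    flow = np.repeat(np.repeat(coarse_flow, H // h, axis=0), W // w, axis=1)
    flow = flow * [H / h, W / w]
    ys, xs = np.mgrid[0:H, 0:W]
    sample_y = np.clip((ys + flow[..., 0]).round().astype(int), 0, H - 1)
    sample_x = np.clip((xs + flow[..., 1]).round().astype(int), 0, W - 1)
    return frame[sample_y, sample_x]
```

The expensive part (estimating the mesh) scales with the coarse grid, while the cheap resampling is the only full-resolution step.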
Well... not quite trivial. The sensors are calibrated differently per device model, and it's actually quite tricky to reconstruct the path from accelerometers and gyroscopes alone. There's also the likely issue of synchronising the data from these sensors with the video input. If you solve that second issue, however, it could in theory at least help with recovering the path from the video, for example by creating better predictions of where the point cloud has moved to.
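To illustrate why it's not quite trivial: even the easy part, turning gyroscope samples into an orientation path, is an integration that compounds error. A toy sketch (my own illustration, with no bias calibration, no accelerometer fusion, and no timestamp jitter, which are exactly the hard parts):

```python
import numpy as np

def integrate_gyro(omega, dt):
    """Integrate angular-rate samples into a sequence of rotations.

    omega: (N, 3) angular velocity in rad/s (body frame)
    dt:    (N,) sample intervals in seconds
    Returns a list of 3x3 rotation matrices, one per sample.
    Any bias in omega accumulates as drift over the whole path.
    """
    R = np.eye(3)
    path = []
    for w, h in zip(omega, dt):
        angle = np.linalg.norm(w) * h
        if angle > 1e-12:
            axis = w / np.linalg.norm(w)
            K = np.array([[0, -axis[2], axis[1]],
                          [axis[2], 0, -axis[0]],
                          [-axis[1], axis[0], 0]])
            # Rodrigues' formula for the incremental rotation.
            dR = np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)
            R = R @ dR
        path.append(R.copy())
    return path
```

Translation is worse still: it needs double integration of the accelerometer, so drift grows quadratically.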
For normal speed you wouldn't need this :)
It would, I think, be even more distracting if the video was higher resolution.
This is good stuff, I like it, but it isn't as wow as the structure from motion work.
And for the folks saying to just up the framerate: that won't really help, because the head needs to be back in the same position as in a previous frame. It's a function of the amplitude and frequency of the motion you want to remove.
This was on my todo list, item removed.
On the other hand, if it's actually generating a lot of "best guess" images to fill gaps that are too large to bridge with the existing frames (too many bad frames in a row), I could see that taking a bit longer, but not a week.
I now think we both got it wrong (but me more so than you): Table 2 specifies "1 min/frame", but the source frame selection happens for output frames, not input frames. Table 1 lists a total of 2189 output frames for the 23700 input frames of the "BIKE 3" sequence, so I guess we're looking at 2189 minutes?
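Back-of-the-envelope, using the numbers quoted above (Table 1's 2189 output frames, Table 2's 1 min/frame):

```python
output_frames = 2189      # Table 1, "BIKE 3" sequence
minutes_per_frame = 1     # Table 2, source selection time
total_minutes = output_frames * minutes_per_frame
print(total_minutes, "minutes, about", round(total_minutes / 60, 1), "hours")
```

So roughly a day and a half of compute for the source selection step alone, before any of the other stages.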
Correct me if I misread anything. Again.
Also, you probably saw this over the past week: http://jtsingh.com/index.php?route=information/information&i... (disregarding the politics of it). Whatever he's doing (I assume a lot of manual work), it has a very similar effect, and it has these beautiful transitions between speeds.
Amazing work and the videos are stunning.
This would be possible. Although it would require providing some UI so the user could specify which parts should be sped up.
I've seen the Pyongyang video; it is beautiful work. It requires very careful planning and shooting, and a lot of manual work, to create such nice results. We're trying to make this easier for casual users, but it's still far away from the quality of professional hyperlapse.
Some of the videos demonstrate unusual "popping" effects and deformations when standing still, especially notable in this video (top right, sixteen seconds in):
I understand how the extreme situation of climbing is a challenge, but what is it about standing still that causes this? Do you have any thoughts on how you might tackle this problem in future work? (although it appears you already combine an amazing breadth of techniques, so I'm not sure how many options you haven't looked at)
The hyperlapse of the climbing video looks like an FPS game from a decade ago with texture refreshing as you get closer.
What we're trying to say is: if you feed your video to the program, you're going to get output that is sped up 600x compared to real life. That's a ridiculously high speedup.
I see they have a Windows app listed as coming. Is that a Windows desktop app?
They said they'll offer it as a Windows app, and I imagine it's for very corporate reasons.
It'll still be slow (a couple hours for a 10 min video) but running on a single standard PC (with GPU).
> "In this work, we were more interested in building a proof-of-concept system rather than optimizing for performance. As a result, our current implementation is slow. It is difficult to measure the exact timings, as we distributed parts of the SfM reconstruction and source selection to multiple machines on a cluster. Table 2 lists an informal summary of our computation times. We expect that substantial speedups are possible by replacing the incremental SfM reconstruction with a real-time SLAM system [Klein and Murray 2009], and finding a faster heuristic for preventing selecting of occluded scene parts in the source selection. We leave these speed-ups to future work."
Set aside $50-100 a month; it'll probably be a lot cheaper in a year (assuming optimizations and cheaper cloud services).
Upload a video, generate the hyperlapse, generate a URL, and view the higher-bitrate video on iPhone, Android, or Windows. Considering GoPro/drone videos generate lots of interest, this could be a highly useful service.
This video http://vimeo.com/13669078 has time-lapsed audio. I wonder what it would sound like using the hyperlapse effect.
I'm having difficulty imagining what this more in-depth model would represent, and how you'd strategically take the clips to "paint" this model.
I've been waiting for that for years. Searched for it a few months ago, still nothing.
They may not be killer apps but it is one way for Microsoft to distinguish itself from Apple and Google.
Edit: It's not even working. I photographed something from different angles, synthed it, and it appears on their website as a slideshow. Like a normal jQuery slideshow, except you need Silverlight®.
I make a lot of 4K hyperlapse movies, and it's tedious: After Effects' Warp Stabilizer is useful only in a small fraction of cases, Deshaker is more consistent but also not perfect, and the only option in the end is multi-pass manual tracking and stabilizing, which is very time-consuming and tricky for long panning shots.
I'm curious to see what happens if they insert more action-packed footage. An MTB course with trees, switchbacks, and jumps would be an interesting stress test of this technique.
I recall those childhood days when we couldn't explain why the moon appeared to come along as our car moved. :-) Those differences in apparent speed betray something about the distance of the object/point under consideration.
That's what this system does: it uses the movement of the camera to reconstruct a 3D map of the area around the path taken, and selects the original images that best fit the part of the 3D map being viewed.
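The parallax intuition above is exactly what triangulation exploits: two views of the same point from different positions pin down its distance. A toy sketch (midpoint method; real SfM refines many points and cameras jointly with bundle adjustment):

```python
import numpy as np

def triangulate_midpoint(c1, d1, c2, d2):
    """Closest-point triangulation of two viewing rays.

    c1, c2: camera centers; d1, d2: unit viewing directions.
    Returns the midpoint of the shortest segment between the rays,
    i.e. the 3D point implied by the parallax between the two views.
    """
    # Solve for ray parameters t1, t2 minimizing |(c1+t1*d1)-(c2+t2*d2)|.
    A = np.stack([d1, -d2], axis=1)          # 3x2 system
    b = c2 - c1
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    p1 = c1 + t[0] * d1
    p2 = c2 + t[1] * d2
    return (p1 + p2) / 2
```

Nearby points shift a lot between views, distant ones barely at all, which is the same effect that makes the moon seem to follow the car.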
The technical video breaks down some of the techniques they used. Global match graph is particularly interesting. This technique alone could lead to a big improvement in timelapses, by trying to select consistent changes between frames.
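Frame selection over a match graph can be sketched as a shortest-path problem: each frame-to-frame transition gets a cost (how poorly the frames match), plus a penalty for deviating from the target speed-up, and dynamic programming picks the cheapest chain. This is the general idea only, not the paper's exact algorithm; all names and weights below are my own:

```python
import math

def select_frames(cost, n_frames, target_skip, max_skip, skip_weight=0.1):
    """Pick a smooth subset of frames via dynamic programming.

    cost[i][j]: how badly frame j follows frame i (e.g. derived from
    match-graph overlap). A quadratic penalty keeps the skip close to
    the target; the cheapest path from frame 0 to the last frame is
    the selected output sequence.
    """
    best = [math.inf] * n_frames
    prev = [-1] * n_frames
    best[0] = 0.0
    for j in range(1, n_frames):
        for i in range(max(0, j - max_skip), j):
            step = cost[i][j] + skip_weight * (j - i - target_skip) ** 2
            if best[i] + step < best[j]:
                best[j] = best[i] + step
                prev[j] = i
    # Walk back from the last frame to recover the chosen path.
    path, j = [], n_frames - 1
    while j != -1:
        path.append(j)
        j = prev[j]
    return path[::-1]
```

With all match costs equal, the optimizer just takes every `target_skip`-th frame; uneven costs make it locally speed up or slow down to stay on well-matched frames, which is what produces those smooth speed transitions.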
http://cg.cs.uni-bonn.de/aigaion2root/attachments/FastSimila... <- maybe this?
That being said, it really does look amazing!
now to implement it open source ;)
Also, I will pay $$$ for this to use with my motorcycle footage from GoPros.
I'm also curious if anyone else got motion sickness while watching the video.
For a slightly more practical use this could be a tool to give people previews of hiking trails, tours, or routes they're about to take.
One of the by-products of this algorithm is a fully textured 3D model of the filmed environment. Offering that as a raw data dump, or even as a manual process where the user controls the camera, would be as valuable as the fully automatic one-off timelapse no one ever watches (except maybe your granny).
What sounds better - a video tour of a house, or a 3D model of a house you can traverse however you like?
I wonder if three-letter agencies have better structure-from-motion implementations, a la "Enemy of the State" (isn't it sad that this film turned out to be a documentary?). I suspect something like a 3D reconstruction of the Boston Marathon (the FBI did collect all video footage of the event) would be very helpful to the investigation.
I would guess that I could upload a shaky video to YouTube to get it smoothed out, download it, speed it up to a rate similar to theirs, and get similar results. The timelapse they show, which looks so much worse, uses far fewer frames of the raw footage (every 10th frame?) and goes way faster than their "hyperlapse". It isn't a fair comparison.
From the paper intro:
Video stabilization algorithms could conceivably help create smoother hyper-lapse videos. Although there has been significant recent progress in video stabilization techniques (see Section 2), they do not perform well on casually captured hyper-lapse videos. The dramatically increased camera shake makes it difficult to track the motion between successive frames. Also, since all methods operate on a single-frame-in-single-frame-out basis, they would require dramatic amounts of cropping. Applying the video stabilization before decimating frames also does not work because the methods use relatively short time windows, so the amount of smoothing is insufficient to achieve smooth hyper-lapse results.
As mentioned in our introduction, we also experimented with traditional video stabilization techniques, applying the stabilization both before and after the naive time-lapse frame decimation step. We tried several available algorithms, including the Warp Stabilizer in Adobe After Effects, Deshaker 1, and the Bundled Camera Paths method [Liu et al. 2013]. We found that they all produced very similar looking results and that neither variant (stabilizing before or after decimation) worked well, as demonstrated in our supplementary material. We also tried a more sophisticated temporal coarse-to-fine stabilization technique that stabilized the original video, then subsampled the frames in time by a small amount, and then repeated this process until the desired video length was reached. While this approach worked better than the previous two approaches (see the video), it still did not produce as smooth a path as the new technique developed in this paper, and significant distortion and wobble artifacts accumulated due to the repeated application of the stabilization.
No you certainly wouldn't. Watch the technical video at the bottom of the page. It will explain why this is not trivial to do and why standard stabilisation technologies aren't useful to smooth out time lapses.
I never said it was trivial, just that similar stuff has already been done and made a "standard stabilization technology", automatically and easily, just by uploading to YouTube. It seems YouTube's techniques aren't necessarily completely different: there's a screenshot in this video of an article from Google called "Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths". However, I do appreciate and shouldn't disrespect the specialized work being done for time-lapsed videos. My apologies.