
The most interesting thing about this (compared to results from similar research/projects) is that in all of the examples, camera movement is forward and down relative to the original perspective. The results are really good, and some of that may be due to a superior algorithm, but it's also aided in large part by the choice of movement. Since objects lower in the frame tend to be closer (foreground elements), the downward camera movement causes those objects to occlude parts of the background above (and behind) them, meaning that a relatively small portion of the background needs to be inpainted by the algorithm. If too much is interpolated, visual artifacts often ruin the illusion.

The forward movement also helps in this case as those foreground objects grow in size (relative to the background) with time, so there's even less need for interpolation. If the movement were primarily lateral (or reversed relative to the original image), I imagine the algorithm would have a much harder time producing good results [1].
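To put rough numbers on the lateral case, here is a back-of-the-envelope sketch. It assumes an ideal pinhole camera and a purely sideways translation, and it is not the paper's method; `disocclusion_width` and its parameters are hypothetical names for illustration. A point at depth z shifts by about focal_px * shift / z pixels on screen, so the strip of background uncovered at a depth edge is the difference between the foreground and background shifts.

```python
def disocclusion_width(focal_px, shift, z_near, z_far):
    """Approximate width (in pixels) of the background strip that becomes
    visible next to a foreground edge after a sideways camera move.

    Assumptions (illustrative only): pinhole camera, pure lateral
    translation, foreground at depth z_near, background at depth z_far,
    with depths and shift in the same units.
    """
    if z_near <= 0 or z_far <= 0:
        raise ValueError("depths must be positive")
    if z_near > z_far:
        raise ValueError("z_near must not exceed z_far")
    # Each surface shifts by focal_px * shift / z pixels; the gap that
    # opens at the edge is the difference between the two shifts.
    near_shift = focal_px * abs(shift) / z_near
    far_shift = focal_px * abs(shift) / z_far
    return near_shift - far_shift


# Foreground at 2 m, background at 10 m, focal length 1000 px,
# camera moved 10 cm sideways: roughly 40 px of background to inpaint.
print(round(disocclusion_width(1000, 0.1, 2.0, 10.0), 1))
```

The gap grows linearly with the size of the move and with the depth contrast at the edge, which fits the intuition that large lateral moves around close foreground objects are the hard case.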

EDIT: After skimming the paper, it appears that the algorithm is automatically choosing these best-case virtual camera movements:

> This system provides a fully automatic solution where the start- and end-view of the virtual camera path are automatically determined so as to minimize the amount of disocclusion.

That is pretty impressive. I had originally assumed the paths were cherry-picked by humans, so it's cool that they are chosen automatically (and that the algorithm lands on the intuitive best-case scenario in most cases). The results are still slightly misleading, though: the authors mention that user-defined movements can be used instead, but quality is likely to suffer significantly if the movement doesn't match the optimal path chosen by the algorithm.

[1] The last example shown in the full results video illustrates the issue with too much background interpolation in lateral movement: http://sniklaus.com/papers/kenburns-results

Thank you for sharing your thoughts! We designed the automatic camera path estimation to minimize the amount of disocclusion, which indeed simplifies the problem. As you correctly pointed out, inpainting the background is an additional challenge, and while we address it, the inpainted results sometimes lack texture.
