Hacker News
3-Sweep: Extracting Editable Objects from a Single Photo [video] (youtube.com)
354 points by rellik on Sept 10, 2013 | 72 comments

The key here is really complementary use of ‘what humans are good at’ and ‘what machines are good at’.

In this case, it’s fair to say the machine, by analyzing pixels, can’t figure out perspective very well. The human can do that just fine, given an interface mechanism.

The machine is good at detecting edges and seeing similarity between pixels. Given hints from the human that ‘this point is within an object’ and here is the perspective, the machine can infer the limits of the object based on edges/colors and project it into 3 dimensions. Amazing.
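As a toy illustration of the "machine finds edges" half of that division of labor (not the paper's actual method), a Sobel gradient picks out the intensity boundaries that a human hint can then anchor to:

```python
import numpy as np

def sobel_edges(img):
    """Approximate gradient magnitude with 3x3 Sobel kernels (toy illustration)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = img.shape
    mag = np.zeros((h, w))
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            patch = img[y - 1:y + 2, x - 1:x + 2]
            gx = np.sum(kx * patch)  # horizontal intensity change
            gy = np.sum(ky * patch)  # vertical intensity change
            mag[y, x] = np.hypot(gx, gy)
    return mag

# A synthetic image with a vertical step edge starting at column 4.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
edges = sobel_edges(img)  # responds strongly only along that step
```

Real systems use far more robust detectors, but the principle is the same: the machine excels at this local, pixel-level work, while the human supplies the global interpretation.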

The perspective analysis is done pretty darn well by the machine in these examples.

I'm not a HN etiquette stickler, and I'm not accusing anyone of any foul play, but the actual YouTube video was submitted 17 hours prior to this post: https://news.ycombinator.com/item?id=6358080

This is just in case you want to throw a few upvotes their way for being first. This also illustrates that late night (PDT, UTC-7) posts don't get a whole lot of votes and proper timing is crucial to getting lots of votes.

It was also submitted here even earlier: https://news.ycombinator.com/item?id=6351712

Personally, I'm just glad to see this video finally getting traction. It really is such a cool demo. It even stands out in the field of consistently high-quality SIGGRAPH demos. Can't wait to read the paper!

I submitted it too https://news.ycombinator.com/item?id=6352371

It's weird that it has received quite a few votes each time and never made it to the front page. Was it a timing issue (late night, early morning, non-American hours) or is YouTube "weighted down" somehow?

What I was thinking all along: "Oh come on! It can't be this perfect, show me where it fails." And they did!

This is indeed magic. I'm so happy to live in this age, and be part of the "Sorcerers' Guild".

Yeah, I was thinking the same thing. Funny how them pointing out the failures of the product makes it seem cooler (since it seems more real).

The paper is not out yet, but you can read the abstract here:


If you marked shadows and associated them with their source, could you then recover the light source(s) and be able to remove the baked shadows and recast them in real time?

Also, with the shiny objects, could you specify the material properties and have it "back out" the reflection such that the reflection was recomputed as you moved the shape around?

Yes, there are other projects that do things like insert a synthetic object into a scene, with natural in-context lighting that is inferred from the light gradients on other objects in scene.

Yes, here's a cool demo video from 2011 SIGGRAPH Asia...can't even imagine how much more things have progressed since then:



Forget the Photoshop stuff, this needs to be integrated with 3D printing immediately.

Spit out a design file into Tinkercad[1] for some minor adjustments and BAM, you've made a printable 3D model.

[1] https://tinkercad.com/

That's what I thought when I saw it. Break something, take a quick snap and import it, fix the damage, send to printer. Very little 3d modelling skill required, making it way more accessible to the average person.

I want this + i❤sketch now, but unfortunately I suspect that jumping up and down and shouting isn't likely to help.


Thanks for the reference. Are you aware of any (commercially available) hand-drawing-to-CAD software? I need to do a bunch of relatively simple figures for my thesis, and it would be much easier if I could just use hand drawings and then transform them into professional-looking figures.

Have you tried Google SketchUp? It might be the closest thing to this type of ease-of-use that I've seen so far, and it's free.

Thanks, will try it out, but from the tutorial vids it looks more like CAD than hand-drawing.

Wow, super impressive. And meanwhile, Silicon Valley is working on the gazillionth social photo sharing app.

The other side of the argument is that social networks improve far more lives than academic research projects like this.

Without social networks like Reddit or HN I would have been unlikely to ever see most of the academic research that I do come across, like this!

Citation needed.

Is it not self-evident? How many people connect through social networks, which is an obvious benefit? How many people benefit from research papers about 3d model generation from photographs?

> Is it not self-evident?

No, it's not.

> How many people connect through social networks

That's roughly quantifiable. FB has roughly 1.15 billion users, not sure of its daily use stats. Some numbers: http://expandedramblings.com/index.php/resource-how-many-peo...

> which is an obvious benefit

Now _there_ is a questionable assumption. Given that increasing numbers of people are _leaving_ FB in saturated markets, and peak membership seems to top below 50% of the population, there seems to be a limit. And I could turn up studies showing negative effects of social networking / media saturation ranging from social isolation and depression to broken marriages and lost jobs to health and life-expectancy loss due to inactivity.

> How many people benefit from research papers about 3d model generation from photographs?

First: a false equivalence and shifting goalposts. Your initial claim was "most of the academic research".

Secondly: academic research covers a huge range of areas, from improved health and diet to better machines and alternative energy sources to faster and more accurate computer algorithms.

Third: what you see as a useless toy has some pretty evident applications that I can consider. Attach this method to a 3d CAD/CAM or printing system and you have manufacturing or parts replacement from a 2D photograph (AutoDesk has demonstrated similar modeling/capture systems but based on multiple images, but these can come from any camera). Art interpretation, archaeology, X-Ray modeling, geological imaging, and astronomical applications come to mind. There might be applications in protein folding or other microscopic imaging applications.

And the beneficiaries of such technologies could extend far beyond just those who are currently plugged in.

Blindly claiming social media vastly exceeds the value of such research fails to pass the most casual of sniff tests.

I don't think it's reasonable to question that assumption. Humans are social creatures, and social networks make it easier to connect with people over arbitrary distances. To argue that social networking is not beneficial is equivalent to arguing that telephones and postal services are not beneficial.

Your analysis focuses only on Facebook. Of course people are leaving Facebook. But is the total user population of all social networking apps decreasing? I doubt it.

> First: a false equivalence and shifting goalposts. Your initial claim was "most of the academic research".

Poor phrasing on my part. My original goalpost was "the academic research like this," which is admittedly vague. What I meant was research projects focused on image processing and interpretation.

> Third: what you see as a useless toy has some pretty evident applications that I can consider.

I don't see it as a useless toy. I just think it's far less useful than social networking services, which have a very practical obvious benefit.

> Blindly claiming social media vastly exceeds the value of such research fails to pass the most casual of sniff tests.

It's not a blind claim, it's what I feel is an extremely obvious claim.

> I don't think it's reasonable to question that assumption

It's reasonable to question ALL assumptions.

> Your analysis focuses only on Facebook.

No it doesn't. I pointed at FB as the largest of the present SNs, but referenced other SNs as well. FB is a leading exemplar of the field. My use of it isn't intended as exclusionary of other SNs.

> My original goalpost was "the academic research like this,"

Which largely moots the rest of the argument. Though as I pointed out, "research such as this" actually does pose some reasonably interesting and useful applications. We can argue over those magnitudes, but I'll stick with my initial assessment that the net benefits of such research are likely to be high.

Also, by narrowly identifying what you feel is and isn't valuable research, you're sharply skewing the results in your favor. It's as if I said "but I meant by 'social media' 4Chan and HotOrNot".

> it's what I feel is an extremely obvious claim.

And it's what I feel requires citation.

Which you've failed to provide, being rather more inclined to engage in rhetoric.


This is sorcery!

This technology is awesome. If it's as user-friendly as they make it look, I could see a lot of applications for it!

One application I can see is teaching people how to model objects in 3d. You could use this as the 3-dimensional analog to tracing and have tutorials where you first get good at tracing the model and then try to recreate it from scratch.

For example, I have only tried my hand at 3d modelling once or twice (and sucked at it enough to give up), but just watching this I feel like I could model vases and lamp posts with a bit of practice.

The most impressive thing for me about this demo is how good the shape detection is (it seems way better than the magnetic lasso in Photoshop), and how they brought different pieces of separate technology together into such a fluid experience. And how the presenter sounds about 12.

These guys/girls know what they're doing.

> seems way better than magnetic lasso in Photoshop

Indeed, and it's very impressive work.

It makes sense that this is the case, because this system is doing edge detection with fairly strict constraints: the edges must match the outline of a fairly simple shape whose size and orientation you roughly know. That seems inherently likely to yield better results than the completely unconstrained edge detection in Photoshop.
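That constraint can be sketched in a few lines: once the sweep tool roughly knows where the boundary should be, it only has to find the strongest intensity change along a short search ray, rather than solving a general segmentation problem. A hypothetical simplification, not the authors' code:

```python
import numpy as np

def snap_to_edge(profile):
    """Return the index of the strongest intensity change along a search ray.

    Toy stand-in for constrained edge snapping: `profile` is a 1-D strip of
    pixel intensities sampled outward from the user's rough boundary guess.
    """
    grad = np.abs(np.diff(profile.astype(float)))
    return int(np.argmax(grad))

# Background pixels (~10) give way to a bright object (~240) at index 4,
# so the boundary edge sits between samples 3 and 4.
profile = np.array([10, 11, 10, 12, 240, 241, 239])
boundary = snap_to_edge(profile)
```

Because the search space is a short 1-D ray instead of the whole image, even this crude maximum-gradient rule is hard to fool, which is part of why the demo's snapping looks so clean.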

This is the single most impressive example of image processing I've seen to date.

I think the PatchMatch algorithm they use to fill the background is cooler, to be honest. Check out their video: https://vimeo.com/5024379#at=0

Nothing new though, Photoshop has had content-aware fill for a while.

The video is 4 years old.

you need to get out more.

It looks so simple, yet my limited understanding of image processing tells me this requires a ton of research and technology. The pace of innovation is staggering!

I wish I had the time to sit down and understand all the math and algorithms behind this. It's awesome.

I am skeptical, although I remain hopeful that my skepticism is misplaced. The "software" somehow seems to know what pattern of colors should exist on the other side of the object. Can someone explain to us how this aspect of the software works?

Looks like they flip whatever is on the visible side. If you look at the underside of the telescope, it's just a repeated pattern of what was originally visible.

Looks like they just reverse the front.
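That trick is cheap to sketch: mirror the visible texture to stand in for the hidden side. A toy illustration of the idea, not the system's actual texturing code:

```python
import numpy as np

# The visible half of an object's texture, as a small array of pixel values...
front = np.array([[1, 2, 3],
                  [4, 5, 6]])

# ...mirrored left-to-right to cover the unseen back side.
back = np.flip(front, axis=1)
```

This is why repeated patterns show up on the underside: the back is a literal reflection of the front, which looks plausible for roughly symmetric objects but breaks down for anything with distinct front/back detail.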

Is there a reason many of these crazy image processing technologies never seem to have actual demos or releases? The only exception I can think of is the "smart erase" idea, which has been implemented in Photoshop as well as Gimp.

This is an academic project - the main goal is typically to publish a paper, not to release a demo or product.

Photoshop has added, and is regularly adding more, features based on similar technologies. An example of a content-aware feature: http://www.youtube.com/watch?v=D58CG_-AWnY

What other image manipulation software do you follow closely?

I don't follow any closely, I just remember seeing several tech demos similar to this.

A lot of cool rendering/modeling research seems amazingly well-suited for the film industry and this is a perfect example ... besides the obvious applications in making CGI versions of real-world scenes, you can just imagine the director saying "oh no, that lamp is in the wrong location in all that footage... move it (without reshooting)!"

I wonder if it's just a coincidence, or whether the mega-bucketloads of money the film industry throws at CGI are a major factor in funding related research even in academia?

Not to imply that this technology is anything short of fantastic, but if you look closely at the video again you will notice fairly obvious artifacts when an object is 'moved' from its original location - the background replacement is only so good from a single image. Likewise, the 3D objects created by this system show unrealistic artifacts. I'd like to see the results after they expand this system for multi-photo input, of the type used in film with multiple images from a moving camera. My point being, this is a fantastic combination of known technologies to create something truly new, and with refinement it will be suitable for feature film work. However, as shown in the video, it's not high enough quality for VFX applications. (Disclaimer: VFX pipeline developer here.)

> However, as it is shown in the video, not high enough quality for VFX applications

Sure, understood.

The thing is, I imagine film VFX guys are already doing this kind of task—making 3D versions of real objects from the movie and doing CGI additions from them—and tools like this (with, as you say, refinements) could be a great help in speeding up that process...

This is one of the most impressive things I've seen in a while.

Question for the entrepreneurs: how would one monetize such a cool algorithm? I come across plenty of cool stuff like this, but without any idea how they can solve real problems.

This tech, as is, is suitable for one to make models of most of their household furniture as well as the rooms of their house. Possible applications: 1) Virtual home makeover, 2) child's "play/doll house" is their own home (virtual or 3D printed)... and on and on and on... Note that this system does not handle irregular, organic shapes (people, plants), so those need a different solution.

Software for use with 3d printers. Modeling stuff for printing is really hard right now, definitely a big barrier.

Also awesome is that it handles the background replacement so well. This could also be used to just remove an ugly lamp post, telephone pole, etc from an otherwise good photo. (assuming you can remove objects and resave the image)

Edit: I am aware that Photoshop has some of this available. I've not played with it so I don't know how they compare.

If you're just removing part of the image after cutting around it with a tool like this, having the object interpreted as 3D isn't really going to be of any benefit.

The impressive thing here, imho, is the seemingly effortless and seamless transition and replacement. The background is fixed and the surface texture is stretched in what seems like real time.

Yes... I know the 3D part is the more impressive part. But I was also impressed with its ability to back fill the background.

The video says they used the PatchMatch algorithm for the background fill, and a quick search reveals that PatchMatch was developed in 2009 in association with Adobe and later incorporated into Photoshop CS5.

AFAIK, Photoshop uses essentially the same method -- half the authors on the original Patchmatch paper were from Adobe.
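For the curious, the core PatchMatch idea fits in a short sketch: initialize a nearest-neighbour field randomly, then alternate propagation (reuse a scanned neighbour's good match, shifted by one pixel) with a shrinking random search. This is a drastically simplified illustration of the published idea, not Adobe's or the authors' implementation:

```python
import numpy as np

def patch_dist(A, B, ay, ax, by, bx, p):
    """Sum of squared differences between two p x p patches."""
    return float(np.sum((A[ay:ay+p, ax:ax+p] - B[by:by+p, bx:bx+p]) ** 2))

def patchmatch(A, B, iters=4, p=3, seed=0):
    """Tiny PatchMatch sketch: for every patch of A, find a similar patch of B."""
    rng = np.random.default_rng(seed)
    h, w = A.shape[0] - p + 1, A.shape[1] - p + 1    # valid patch origins in A
    bh, bw = B.shape[0] - p + 1, B.shape[1] - p + 1  # valid patch origins in B
    # Random initial nearest-neighbour field and its per-patch cost.
    nnf = np.stack([rng.integers(0, bh, (h, w)),
                    rng.integers(0, bw, (h, w))], axis=-1)
    cost = np.array([[patch_dist(A, B, y, x, nnf[y, x, 0], nnf[y, x, 1], p)
                      for x in range(w)] for y in range(h)])

    def try_offset(y, x, by, bx):
        """Adopt candidate match (by, bx) for patch (y, x) if it improves cost."""
        if 0 <= by < bh and 0 <= bx < bw:
            d = patch_dist(A, B, y, x, by, bx, p)
            if d < cost[y, x]:
                nnf[y, x] = (by, bx)
                cost[y, x] = d

    for it in range(iters):
        step = 1 if it % 2 == 0 else -1  # alternate scan direction each pass
        ys = range(h) if step == 1 else range(h - 1, -1, -1)
        for y in ys:
            xs = range(w) if step == 1 else range(w - 1, -1, -1)
            for x in xs:
                # Propagation: shift the already-scanned neighbour's match by one.
                if 0 <= y - step < h:
                    try_offset(y, x, nnf[y - step, x, 0] + step, nnf[y - step, x, 1])
                if 0 <= x - step < w:
                    try_offset(y, x, nnf[y, x - step, 0], nnf[y, x - step, 1] + step)
                # Random search around the current best, halving the radius.
                r = max(bh, bw)
                while r >= 1:
                    try_offset(y, x,
                               nnf[y, x, 0] + rng.integers(-r, r + 1),
                               nnf[y, x, 1] + rng.integers(-r, r + 1))
                    r //= 2
    return nnf, cost
```

Since candidate matches are only ever accepted when they lower the cost, each patch's match can only improve on the random initialization; the insight of the paper is that propagation spreads good matches across the field far faster than exhaustive search.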

This is amazing. My first thought is this could allow F1 teams to get a much better idea of what new packages their competitors are bringing to races early on just by looking at photos and video footage and modelling the new parts.

This doesn't tell you any more about the contents of the photo than what is already visible. It can't actually see what's behind an object, it just synthesizes a plausible fictional fill.

This indeed is very impressive, and I can see how much work and passion went into this project. But I still have to say it handles almost only round or cylindrical objects; there is still a long way to go.

He did a couple of rectangular ones too. And its cylindrical tool handles a lot more than cylinders.
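That's because a "generalized cylinder" is just a profile swept along an axis with a varying radius, which covers vases, lamp posts, handles, and so on. A minimal sketch of the idea (hypothetical function names, not the paper's code):

```python
import numpy as np

def sweep_profile(radii, n_seg=16):
    """Vertices of a generalized cylinder: a circle of varying radius
    swept along the z axis, one ring of n_seg points per entry in `radii`."""
    theta = np.linspace(0.0, 2.0 * np.pi, n_seg, endpoint=False)
    rings = []
    for z, r in enumerate(radii):
        ring = np.stack([r * np.cos(theta),          # x coordinates of the ring
                         r * np.sin(theta),          # y coordinates of the ring
                         np.full(n_seg, float(z))],  # height along the sweep axis
                        axis=-1)
        rings.append(ring)
    return np.concatenate(rings)

# A vase-like shape: the radius bulges, then narrows along the sweep.
verts = sweep_profile([1.0, 1.4, 1.2, 0.8])
```

In the demo, each stroke of the "3-sweep" gesture effectively supplies one of these per-ring radii from the image, which is why a single tool can model so many household objects.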

Is it too much to hope that this tech will be implemented in a program that's within an "average" user's budget? (i.e. non-enterprise).

I think this is really impressive. Do you think it will be years before this actually gets used in public 3D modelling tools?

I vote for this to be used with a 3D printer.

This is awesome - but how do they reconstruct the backgrounds that the objects previously obscured? There must be more photos?

I thought about that too - I think the background is simply a mirror image of the foreground, and that the object's 3D shape is symmetrical.

Apparently they are using some other algorithm to do this - even more impressive!

However, it seems strange in the first example how mountain ranges appear where none were before... how did the algorithm know to put them there?

This is the PatchMatch algorithm:


This video is currently unavailable. Anyone else getting static@youtube?

Simply astonishing... imho this technology is revolutionary.

A worthy successor to SketchPad, beautiful user interface.

I read this as "from a single photon"

I'm going to need to buy more filament...

