This Is a Photoshop and It Blew My Mind - Photosketch (gizmodo.com)
320 points by AjJi 338 days ago | comments


47 points by anigbrowl 338 days ago | link

Sigh...third time today and I only got one karma point for pointing it out first. http://news.ycombinator.com/item?id=862216 if you feel sympathetic :)

EDIT: gee guys, the smiley should tell you this ^ is not a serious complaint. I thought people might actually be interested in reading the actual paper v, since the project page link has been inaccessible. Sheesh.

Site is down but this is the Siggraph paper: http://www.ece.nus.edu.sg/stfpage/eletp/Papers/sigasia09_pho...

I have downloaded the binaries (they also require OpenCV 1.1; the recently updated OpenCV 2.0 doesn't work) and have made some progress, though it's very clunky and the instructions are, um...lacking. http://sourceforge.net/projects/opencvlibrary/files/ http://opencv.willowgarage.com/wiki/

-----

21 points by hyperbovine 338 days ago | link

In today's hypercompetitive world, branding is key. Compare:

"Photorealistic image composition from simple sketches"

             vs.
"This Is a Photoshop and It Blew My Mind - Photosketch"

Which one would you rather click on?

-----

63 points by anigbrowl 338 days ago | link

The former, obviously. This is why I can't have nice things :-D

-----

6 points by BearOfNH 337 days ago | link

The one that isn't a PDF.

-----

2 points by chanux 337 days ago | link

And someone said HN doesn't accept sensational headings.

PS: It was in the days I started here. Fair enough, it has been a while.

-----

5 points by nopassrecover 338 days ago | link

This guy got downvoted for giving the most informative comment in this thread?

-----

4 points by cema 337 days ago | link

Well, I think he got his karma back now. But yeah, the way some people downvote gives me a funny feeling.

-----

-1 points by joe_the_user 337 days ago | link

I'm tired of meta-comments. I'd rather discuss the link. I down vote all meta-comments.

-----

3 points by ugh 337 days ago | link

Don’t you now have to down vote yourself?

It’s ironic :)

-----

1 point by nopassrecover 336 days ago | link

You can downvote our subsequent comments but his original was about the paper that belongs to the article :-)

-----

-2 points by avinashv 337 days ago | link

> the smiley should tell you this ^ is not a serious complaint.

Then why put the link at all?

-----

3 points by ComputerGuru 337 days ago | link

because it's useful?

-----

18 points by chime 338 days ago | link

I'm having a very hard time believing that this is real. If it is, they have just made breakthroughs in multiple domains in computer graphics, recognition, and composition at the same time. Here's hoping it's real.

-----

11 points by albertsun 338 days ago | link

The Gizmodo story says that they presented it at SIGGRAPH Asia 2009. According to http://www.siggraph.org/asia2009/ that doesn't take place until December.

-----

14 points by apu 338 days ago | link

Papers for SIGGRAPH & SIGGRAPH Asia are accepted a few months before the conference.

These are the best two conferences in computer graphics, and the bar to get in is extremely high -- reviewers are very very tough and submitted papers have to not only be technically accurate, but also very polished in writing and presentation (and video).

That being said, a lot of these kinds of systems have been coming out in the past few years, and it's a little hard to judge how successful they are due to the enormous amount of work required to reproduce results (their releasing a binary is commendable).

My own personal feeling is that this system probably works quite well for "common" things in their database, but there might be many small artifacts in generated images. Also, if you start trying to include stuff that's not well represented in the database, then the artifacts probably become quite severe.

-----

2 points by Pistos2 338 days ago | link

Well, it is listed here: http://www.siggraph.org/asia2009/for_attendees/technical_pap... "Friday, 18 December | 9:00 AM - 10:45 AM | Room 301/302" Not sure what to think. Maybe they're writing in past tense in the same way sports sites write "such and such team played on Tuesday" even though it was Tuesday at the time of writing.

-----

5 points by jerf 338 days ago | link

Surprisingly, not really. Computer vision has been moving along lately. This is good stuff, but incremental progress against the backdrop of other good stuff in the field.

A few times in the past few months the question of whether "all the good stuff" has been discovered or whether there's no progress left to be made came up, and computer vision has been one of my go-to examples of a field that has just been booming lately. Interestingly, digging in shows that it's still not "AI", just as you don't see any "AI" here, but there's still been a qualitative sea change in the past few years. I think the fact that a $1000 machine is now unbelievably powerful has been a real boon for the field.

-----

6 points by joeyo 338 days ago | link

I have seen this argument being made frequently here recently--that the "AI" that we have produced is not "real" AI. This is understandably true in a way, since after all, AIs cannot do many of the things that natural intelligences can do. And so there is this perceived disconnect between the weak AI that we have (rules, heuristics, belief networks, Bayesian inference and friends) and the strong AI that we want (magic?).

But there is a similar disconnect that looms between the rudimentary intelligence of simple organisms and the more sophisticated intelligence of higher mammals. Many researchers have noted this and taken the insight that with a simple set of rules you can get complex behavior. This has led to something of a revolution in ML, robotics and neuromorphic engineering. If this analogy holds, then it may be that the distance between weak and strong AI will be bridged in the same way that the distance between single-celled organisms and primates appears to be bridged: by iteratively building complex systems on top of simpler building blocks. There simply won't be a secret sauce to be found, but each layer of complexity will allow for increasingly sophisticated behavior.

-----

2 points by fortybillion 337 days ago | link

When I took an AI class back in University, the professor noted that the term AI is usually only applied to the problems we can't solve. As soon as we determine an algorithm for something, they no longer call it AI.

So technically, it was a Data Mining class I guess.

-----

8 points by shib71 338 days ago | link

The technical terms and researcher's names pass the Google sanity check, e.g. http://cg.cs.tsinghua.edu.cn/prof_hu.htm, http://en.wikipedia.org/wiki/Kadir_Brady_saliency_detector.

-----

3 points by jcl 337 days ago | link

This is one of those SIGGRAPH papers that are like magic tricks -- unbelievable until you find out how they're done and what their limitations are.

There are existing projects that pair an image composition function with a tagged image database (like this one: http://graphics.cs.cmu.edu/projects/photoclipart/). This project sounds like it adds a fitness function so that it can find an output image that looks good.
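For anyone who wants the idea spelled out, here is a toy sketch of "tagged database plus fitness function". It is only an illustration under loud assumptions, not the paper's method: the `Candidate` type, the `mean_color` feature, and the colour-distance score are all made up for the example, and a real system would score far richer cues.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Candidate:
    """One tagged image in a hypothetical database."""
    tag: str
    name: str
    mean_color: tuple[float, float, float]  # stand-in for real image features


def color_distance(a: Candidate, b: Candidate) -> float:
    """Euclidean distance between the two candidates' mean colours."""
    return sum((x - y) ** 2 for x, y in zip(a.mean_color, b.mean_color)) ** 0.5


def fitness(combo: tuple[Candidate, ...]) -> float:
    """Assumed score: sum of pairwise colour distances (lower is better)."""
    total = 0.0
    for i in range(len(combo)):
        for j in range(i + 1, len(combo)):
            total += color_distance(combo[i], combo[j])
    return total


def best_composition(database: list[Candidate], wanted_tags: list[str]):
    """Try every combination with one candidate per tag; keep the best score.

    Returns None if any requested tag has no candidates.
    """
    pools = [[c for c in database if c.tag == tag] for tag in wanted_tags]
    return min(product(*pools), key=fitness, default=None)


# Invented sample data, purely for illustration.
database = [
    Candidate("bear", "bear_a", (0.4, 0.3, 0.2)),
    Candidate("bear", "bear_b", (0.9, 0.9, 0.9)),
    Candidate("fish", "fish_a", (0.5, 0.3, 0.2)),
    Candidate("fish", "fish_b", (0.1, 0.1, 0.8)),
]
best = best_composition(database, ["bear", "fish"])
print([c.name for c in best])  # the pair whose colours clash least
```

The exhaustive search is fine for a handful of candidates per tag; with a large database you would need to prune candidates before scoring combinations.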

-----

3 points by jacobolus 338 days ago | link

Most of the parts have been done (a couple things look new, but just read through the list of SIGGRAPH papers for the last few years; the field is sprinting forward, as computers become fast enough to do this sort of computation), but putting it all together is pretty damn impressive.

Now, as hard as what's been done on the algorithms side, someone has to figure out a usable user interface for these tools, so that regular people can play with them.

-----

14 points by NickM 338 days ago | link

My guess is that it's real, but that it doesn't usually work as well as the example images provided. If it really works for any kind of input sketch, then how come there are so many images of dogs catching frisbees and bears catching fish?

-----

6 points by JacobAldridge 338 days ago | link

Yup. When I see a real-time demonstration showing a bear catching a frisbee (probably, but not definitely, while jumping out of a shark-threatened helicopter), then I'll believe it.

Edit: I'll also believe it if anyone on this site claims to have seen same.

-----

4 points by lurkinggrue 337 days ago | link

The algorithm is optimized for dogs/Frisbees and bears/fish.

Reminds me of a company I worked for: they were working on video compression, and the video source the developers had was a security camera.

The compression was optimized for bookshelves and hand waving.

-----

3 points by fname 338 days ago | link

I agree... But they have the source available if you can get to the site. Otherwise, here's the Google cache link:

http://74.125.95.132/search?q=cache:http://cg.cs.tsinghua.ed...

-----

10 points by patio11 338 days ago | link

The download doesn't include source.

It does include 5 executables totaling ~140k. I am not quite brave enough to run them but I did run them through strings, and a cursory inspection suggests that they're either attempting to do what they're claimed to do or they're the best disguised malware in history.
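If you don't have the `strings` utility handy, the same kind of inspection can be approximated in a few lines of Python. This is a rough sketch, not a reimplementation of the real tool: it assumes plain printable ASCII and a minimum run length of 4, and the sample bytes below are invented for the example, not taken from the actual executables.

```python
import re


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return every run of at least min_len printable ASCII bytes in data."""
    # 0x20-0x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]


# Invented sample: two readable runs separated by binary junk and a short run.
sample = b"\x00\x01cvLoadImage\x00\xffab\x00GrabCut failed\x00"
for text in printable_strings(sample):
    print(text)
```

To look at a real file you would pass `open(path, "rb").read()` as `data`. Readable function names and error messages are a hint about what a binary does, but they prove nothing about whether it is safe to run.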

-----

5 points by anigbrowl 338 days ago | link

Oh the software's legit - I ran them all and nothing bad happened :) It's just not very stable or easy to get going with, is all.

-----

5 points by uiohnuipb 337 days ago | link

So gentlemen, our plan is simple. We study computer graphics for 10 years, get PhDs, develop a system that gets headlined at SIGGRAPH, become famous and - then we release a binary that the HN readers will download.

World domination is in our sights <evil laugh>

-----

2 points by joe_the_user 337 days ago | link

"Real" is relative. It might be "real" for a set of a hundred images but it might seriously bog down or spit out garbage if you tried to implement it for ten thousand, never mind ten million.

Image processing is full of things that sort-of work, that are impressive but not fully reliable.

-----

1 point by NathanKP 338 days ago | link

We've been seeing some pretty fascinating developments in graphics and pattern recognition recently. A few days ago someone submitted that amazing Photoshop plugin that allows you to select a region and move it around seamlessly, and also fill in areas seamlessly. The demo showed the software filling in the broken areas of the Pantheon and the results were quite impressive. A combination of both technologies would be simply incredible.

Edit

The official site of the plugin is definitely down. It's probably overloaded by all the people who are interested in checking it out. ;)

-----

4 points by anigbrowl 338 days ago | link

Just FYI, that stuff to which you refer is a teaser for CS5, the next iteration of Photoshop...due around April, most likely.

-----

1 point by fortybillion 337 days ago | link

The SIGGRAPH presentation video for this is here: http://www.youtube.com/watch?v=dgKjs8ZjQNg

-----

1 point by ja2ke 338 days ago | link

Siggraph is home to some crazy presentations. I'm always surprised by the more impressive videos and papers, but on the other hand, never that surprised, just because it's always only one step crazier than the crazy thing from last year's.

-----

1 point by jeremyawon 338 days ago | link

the automated object recognition and extraction is what impresses me. everything else is a killer usage example of that core technology.

-----

55 points by wmf 338 days ago | link

The obligatory shark attacking a helicopter is a nice touch. This technology has epic potential in the LOLcat market.

-----

9 points by jonhohle 338 days ago | link

Ariel Shamir (4th author listed in the paper) also worked on Seam Carving (http://www.seamcarving.com/) and Improved Seam Carving (http://www.shaiavidan.org/papers/vidretLowRes.pdf), two widely publicized papers from the past two years.

-----

6 points by mseebach 337 days ago | link

Hey, this is the technology from Wag The Dog.

De Niro's character is a Hollywood producer, hired to produce a fake war in Albania to distract from a sex scandal before an election. At one point, he's directing news footage by shouting something like "there's a girl in front of a village, it's on fire .. hmm, no, more smoke.. her hair is too light. Can she have a cat? Show me cats" while having a technician type in the request and watching the result appear in real time.

-----

2 points by chriskelley 336 days ago | link

I work in the VFX industry, and can confirm this is how we have been doing things for years.

-----

3 points by cwg 337 days ago | link

Ha, true. FWIW, though, the character you're describing is played by Dustin Hoffman. Awesome movie, though.

-----

1 point by mseebach 336 days ago | link

Yeah, you're right. I think the quotes page on IMDB has the names backwards, or I'm really in need of re-watching that film.

-----

5 points by jf 338 days ago | link

A link to the original research paper that works: http://cg.cs.tsinghua.edu.cn/montage/main.htm

-----

5 points by ErrantX 337 days ago | link

This is going to make the copyright brigade go into apoplexy if it starts to become popular!

-----

3 points by harisenbon 338 days ago | link

The probability of mis-matched images seems like it would be incredibly high, but if they manage to pull this off, it would be amazing.

However, I wonder about this technology being used for evil (custom porn). =/

-----

6 points by hyperbovine 338 days ago | link

Wait, custom porn is evil? I guess I'm the only one who's tired of all this cookie-cutter porn.

-----

6 points by JacobAldridge 338 days ago | link

So glad the downvote maxes at -4

|-----------|.......................................

|..............|Bear...................................

|..............|.........|-------------|...............

|..............|.........|-------------|Fish...........

|-----------|................................|--------|

....................................Yo Mom|--------|

-----

10 points by growt 337 days ago | link

Box for 'Yo Mom' needs to be bigger! ;)

-----

2 points by tezza 337 days ago | link

  probability of mis-matched images seems like it would be incredibly high

It would still be quite useful if the sketch narrowed the options to a list. Traditional point-n-click could refine/revise from there.

This innovation seems to be a shortcut mechanism based on pictographic gestures, where they also take the position, relative size into account. This is very nifty.

I cannot see how this would be faster than traditional Illustrator/InDesign workflows of selecting the elements from dropdowns, dragging to the correct location and dragging the handles. If you use their sketching method, you are probably going to have to fall back to 2D drag'n'size methods post generation.

-----

1 point by ABrandt 338 days ago | link

Alright I may have a tendency to be overly excited about new technologies, but this is seriously amazing. The possibilities (and problems) that arise from this application are vast. This could completely revolutionize the stock photography industry and web design in general.

Copyright and privacy issues are my greatest concern though. For example, could you imagine seeing yourself as a digital model on some corporate website--doing something you never actually did. That is a scary thought in my opinion. I wonder if you could input your own pictures into the system and have it perform the same procedure though (I loathe the pen tool).

-----

3 points by stcredzero 338 days ago | link

Sounds like hyperbole, but this technology has the potential to change the way people use language and speak and think.

Ever notice how many Americans use "like" as a preamble to a reenactment of a scene from a TV show, or an event, or even an abstracted, generalized occurrence? "It was like..." then on to the enactment. People wouldn't speak like this if it wasn't for ubiquitous video entertainment.

In David Brin's Uplift trilogy, uplifted sentient dolphins sometimes "spoke" to each other by mimicking echolocation returns and beaming pictures and scenes directly into each other's heads. If software like this gets good enough to compose scenes for us on the fly, it will drastically alter the way we speak, just as television did.

-----

1 point by gnaritas 338 days ago | link

People wouldn't speak like this if it weren't for California girls in the 80's.

-----

2 points by anigbrowl 338 days ago | link

That's not just a California thing, actually. Growing up in 1970s Ireland, the word was sprinkled liberally throughout our sentences, like. You may like this exploration of metaphor as a fundamental part of consciousness (if so, do follow up and read the mentioned book): http://www.therebel.org/opinion/health/the_thing_to_be_descr...

-----

2 points by thomasfl 337 days ago | link

If this could be extended to movies, the possibilities would be awesome. Just write a script for a movie, sketch out the different scenes and voila, a ready made rough of your new romantic comedy!

The program could be extended further to recycle old movies, and to replace the actors' heads with new ones.

I have to start writing a business plan for this right away. Please send me a message if you want to join in on this.

-----

2 points by anigbrowl 337 days ago | link

In all seriousness, that's part of why I find it so interesting. The software works (I use the term loosely!) on downloaded images, ie it doesn't yet include a web client that does an image search for you.

So for movie purposes, you could save a lot of time using a (more polished) version of this to prepare your storyboards - shoot pictures of your selected or desirable actors, background plates that look like your desired locations, and major props. Draw stupid sketches and voila, you have rough photoboards. In conjunction with some other imaging technologies, it has massive possibilities. There's a saying that a film (especially a low budget one) lives or dies in pre-production; the more decisions you can make before you begin shooting, the less expensive the production process is and the more predictable and cheaper your post-production will be. Sure, quite a lot of Great Art happens on the spur of the moment, but serendipity is rare while dithering is common, and expensive. Adobe, for one, is pushing strongly to bring the use of their tools forward in the production process, so that the film is sketched out before shooting takes place and directors can spend more of their time 'filling in the blanks'.

What you mention (going from this to an actual movie) is obviously not practical now, but I'm happy to say that it's being reduced to an engineering problem - the fundamental technology to do most of what you describe already exists, and it's a matter of making it usable and timely. I will venture a guess that we'll be able to do this in clunky/very expensive form by 2015, and by 2020 it will be possible to make an entire feature film this way that looks about as good as a mid-90s low-budget sci-fi film - say, Escape from LA - at home.

tl;dr although this has been presented as an entertaining toy, with thematically-organized material there's a lot of near-term commercial potential.

-----

2 points by noonespecial 337 days ago | link

  ------------------------------------
  | |+Obi-Wan|                       |
  |  --------                        |
  |     -------------                |
  |    |+Qui-Gon Jinn|               |
  |     -------------                |
  |  -----------------------------   |
  | |s/Jar-Jar Binks/Halle Berry/g|  |
  |  -----------------------------   |
  ------------------------------------

Yeah. I'm so in.

-----

1 point by yosho 337 days ago | link

I don't get how this is going to replace photoshop.

The images will not look as good, or as clean, as a regular photoshopped image. I doubt that automated image editing is that advanced right now.

The examples they gave are probably the best that were available. I bet that on average the results don't look nearly as good or clean.

But hey, I could be wrong.

-----

4 points by gcb 337 days ago | link

photoshop + 24hours of work = awesome image

photoshop + 1h of work = worthless image

this snake oil thing + 1h of work = so-so usable image

-----

1 point by uiohnuipb 337 days ago | link

Good enough for display on TV news or a newspaper ?

Get me a picture of the president, put a girl in a red dress in his eye-line. Get me a picture of evil dictator - add some WMD in the background.

At the moment you need an intern to do this; it will be much better when you can do it yourself.

-----

2 points by rooshdi 338 days ago | link

Wow, this is revolutionary. Anything that can use sophisticated technology to actually make things simpler for the end-user, instead of the other way around, is definitely a plus. I want to try this thing out right now.

-----

3 points by qd 338 days ago | link

Did anyone get a chance to try it before their site went down so we can hear some first hand accounts?

-----

2 points by akrymski 337 days ago | link

If this is for real, then why can't the same thing be done for music? Drum a beat, whistle a tune, sing a few notes - and let the software extract best-matching samples from a library of millions of songs.

-----

1 point by jrockway 338 days ago | link

I really want to try this, but the actual site seems to be down now.

-----

2 points by hussong 337 days ago | link

Gotta love the random deer in picture 5.

-----

-1 points by gorm 338 days ago | link

That picture with the weeding, the sunset, the sailboat and the birds is so unnatural that it makes my soul twist.

-----

1 point by cubicle67 337 days ago | link

ah, nothing better than a bit of weeding at sunset

-----

-4 points by nikcub 338 days ago | link

boring. i have been able to do this in emacs for over 10 years now

-----

-4 points by blogimus 338 days ago | link

Images aren't from flickr, not enough cats.

-----



