Photoshop 'unblur' leaves MAX audience gasping for air (9to5mac.com)
564 points by suivix on Oct 11, 2011 | 122 comments



If you are wondering how they might be doing it, here is one approach that I saw in a computer vision class (no idea if they are doing anything similar to this):

(slides: http://cs.nyu.edu/~fergus/presentations/fergus_deblurring.pd... (~60 MB ppt) paper: http://cs.nyu.edu/~fergus/papers/deblur_fergus.pdf (~10 MB pdf) )

The basic idea is that you have an unknown original image that is convolved with an unknown blurring kernel to produce the observed image. It turns out that this problem is ill-posed: you could have a bizarre original image blurred with just the right bizarre blurring kernel to produce the observed image. So to estimate both the original image and the kernel, you have to minimize the reconstruction error with respect to the observed image, while penalizing unlikely blurring kernels or original images. If you extract enough statistics from a dataset of natural images, you can tell whether an image is likely or not by comparing its statistics to the corresponding statistics of that dataset. Similarly, simple blurring kernels are favored over complex ones (think "short arc of motion" vs. "tracing the letters of a word with your camera").
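Purely as a toy illustration of that objective (not the actual Fergus et al. algorithm, which uses a variational Bayesian formulation; the weights lam and mu here are made up), the energy being minimized looks roughly like this in Python:

    import numpy as np
    from scipy.signal import fftconvolve

    def deblur_objective(x, k, y, lam=0.01, mu=0.001):
        # x: current estimate of the sharp image
        # k: current estimate of the blur kernel
        # y: observed blurry image
        # Data term: how well blurring x with k reproduces the observation.
        residual = fftconvolve(x, k, mode="same") - y
        data_term = np.sum(residual ** 2)
        # Image prior: natural images have sparse (heavy-tailed) gradients,
        # so "unlikely" images with lots of gradient energy are penalized.
        gx, gy = np.gradient(x)
        image_prior = np.sum(np.abs(gx)) + np.sum(np.abs(gy))
        # Kernel prior: favor simple, compact kernels (a short arc of motion)
        # over complex ones.
        kernel_prior = np.sum(np.abs(k))
        return data_term + lam * image_prior + mu * kernel_prior

A blind deblurring method then alternates between updating the image estimate and the kernel estimate to reduce this energy.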


----------------------------------------------------

Here are the actual research papers for this video:

http://people.csail.mit.edu/sparis/#publi2011

- Blur kernel estimation using the Radon Transform http://people.csail.mit.edu/sparis/publi/2011/cvpr_radon/Cho...

- Modeling and Removing Spatially-Varying Optical Blur http://people.csail.mit.edu/sparis/publi/2011/iccp_blur/Kee_...

----------------------------------------------------


Fascinating stuff --- thanks for digging that up. The main question in methods like this is how to constrain the blur kernel and the original image. I've only skimmed the first one, but the key idea used there seems to be to constrain things by looking at the edges in the image. The image has its own, natural distribution of edges, but there are also artificial edges created by the blurring that can serve as clues to the motion of the camera (imagine everything blurred diagonally --- there will probably be a lot of artificial diagonal edges).


I recently discovered two neat papers on deconvolution from a Bayesian perspective, written by Kevin Knuth.

http://knuthlab.rit.albany.edu/papers/knuth-ica99.pdf

Abstract: The problem of source separation is by its very nature an inductive inference problem. There is not enough information to deduce the solution, so one must use any available information to infer the most probable solution. We demonstrate that source separation problems are well-suited for the Bayesian approach which provides a natural and logically consistent method by which one can incorporate prior knowledge to estimate the most probable solution given that knowledge. We derive the Bell-Sejnowski ICA algorithm from first principles, i.e. Bayes' Theorem and demonstrate how the Bayesian methodology makes explicit the underlying assumptions. We then further demonstrate the power of the Bayesian approach by deriving two separation algorithms that incorporate additional prior information. One algorithm separates signals that are known a priori to be decorrelated and the other utilizes information about the signal propagation through the medium from the sources to the detectors.

http://knuthlab.rit.albany.edu/papers/knuth-eusipco05-final....

Abstract: Source separation problems are ubiquitous in the physical sciences; any situation where signals are superimposed calls for source separation to estimate the original signals. In this tutorial I will discuss the Bayesian approach to the source separation problem. This approach has a specific advantage in that it requires the designer to explicitly describe the signal model in addition to any other information or assumptions that go into the problem description. This leads naturally to the idea of informed source separation, where the algorithm design incorporates relevant information about the specific problem. This approach promises to enable researchers to design their own high-quality algorithms that are specifically tailored to the problem at hand.


Yes, and it seems that their algorithm needs hints for natural images versus text.


I assume the RedLaser barcode app uses something similar from captured video frames combined with data from the accelerometer, no?


Barcodes are designed to be easy to decode. Even blurred, the 1- or 2-dimensional frequency information is pretty well preserved. If I needed to combine data from several blurred images of a barcode, I'd extract the likely values of the barcode from each image separately and then combine those.
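As a sketch of that last step (assuming each frame has already been decoded into a per-digit probability vector by whatever means, and that the frames are independent), you can just multiply the per-frame probabilities, i.e. sum the logs, and renormalize:

    import numpy as np

    def combine_digit_estimates(per_frame_probs):
        # per_frame_probs: list of length-10 probability vectors for one
        # barcode digit position, one vector per captured frame.
        log_post = np.sum(np.log(np.asarray(per_frame_probs) + 1e-12), axis=0)
        post = np.exp(log_post - log_post.max())  # subtract max for stability
        return post / post.sum()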


Does it make a difference on the technique used if the image is out-of-focus versus blurred by camera motion?


So long as the motion blur covers a small enough angle/translation, the difference is only in the shape of the blurring kernel: circular or gaussian for out-of-focus blur, a line or arc for motion blur.

In one dimension, you'll be looking at a slice (more accurately, a summed projection) of the kernel. Focus = gaussian, motion = square wavelet (line) or irregular (arc).
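To make the two shapes concrete, here's a rough sketch in Python (sizes and parameters arbitrary; a flat disk is also a common defocus model):

    import numpy as np

    def gaussian_kernel(size=15, sigma=2.0):
        # Rough out-of-focus model: an isotropic Gaussian.
        ax = np.arange(size) - size // 2
        xx, yy = np.meshgrid(ax, ax)
        k = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
        return k / k.sum()

    def motion_kernel(size=15, angle_deg=30.0):
        # Rough linear-motion model: a one-pixel-wide line through the center.
        k = np.zeros((size, size))
        c = size // 2
        theta = np.deg2rad(angle_deg)
        for t in np.linspace(-c, c, 4 * size):
            x = int(round(c + t * np.cos(theta)))
            y = int(round(c + t * np.sin(theta)))
            k[y, x] = 1.0
        return k / k.sum()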


I doubt it. Blind deconvolution is hard. I think they may use something like super-resolution, where multiple blurred pictures can be used to recreate a less blurry picture (and this is significantly easier to do than blind deconvolution).


It's called blind deconvolution. Blind means that they have to first estimate the original convolution/blur kernel and then, in a second phase, apply the deconvolution. If there's an acceleration sensor on the camera, you can use its data for the blur kernel.
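For the non-blind second phase, a classic choice is a Wiener filter. A minimal sketch, assuming the kernel has already been estimated and ignoring edge handling:

    import numpy as np

    def wiener_deconvolve(blurred, kernel, nsr=1e-2):
        # Frequency-domain inverse filter, regularized by an assumed
        # noise-to-signal ratio (nsr) so that frequencies the kernel
        # suppressed don't blow up into noise.
        K = np.fft.fft2(kernel, s=blurred.shape)
        B = np.fft.fft2(blurred)
        W = np.conj(K) / (np.abs(K) ** 2 + nsr)
        return np.real(np.fft.ifft2(W * B))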

It's nothing new really, but the algorithms for it have advanced tremendously. For example, here are some results from 2009: http://www.youtube.com/watch?v=uqMW3OleLM4

Teuobk on HN also made a startup/app based on this, but it seems to be down now: http://news.ycombinator.com/item?id=2460887


One key problem with deconvolution is it's very susceptible to noise. I'm guessing they developed a way to ramp up the coefficients so you see an increase in clarity while keeping the noise below visible levels. So much of image (and audio!) processing is about getting away with noise the person can't detect :)

On a side note: does anybody know of workable deconvolution algorithms that vary the kernel over the image? The example would be compensating for a bad lens.


> So much of image (and audio!) processing is about getting away with noise the person can't detect :)

Adobe's noise processing algorithms in their software are something else, especially in ACR6/Lightroom 3.

I moved 'down' from a 5D2 to a Panasonic GF1 as I don't shoot pro anymore, and ISO1600 on this thing with a slightly-off exposure can be pretty noisy. Lightroom cleans it up incredibly well without losing clarity/sharpness. Before that, I'd just do all I could to keep the ISO down so I didn't 'lose' images to noise.


It's a very different kind of noise. Artifacts is what he meant to write. Motion deconvolution done wrong tends to lead to things like ringing artifacts and other things that look like the image has been really badly and inaccurately oversharpened.

The basic theory for doing this type of deblurring isn't too bad, but making it easy and automatic becomes a really difficult computer-vision problem. Adobe has been working on this (and a lot of other pretty cool stuff) for quite a while. It'll be interesting to see if they ever ship it.


I'm guessing that there is a whole lot of analysis hidden behind loading pre-configured coefficient files to deconvolve these images. Getting that process to be relatively easy for the layperson will be a challenge.


The early Hubble images were fixed this way; maybe there's something around for those?

(http://adsabs.harvard.edu/full/1992ASPC...25..226K)

What an awful URL.


Just 10 minutes ago I put my daughters to bed and told them about Hubble. I realized how incredible it is that we humans have dragged what is basically a huge camera out into space just so we can shoot pictures of the stars without lots of air being in the way. I promised them we would Google for Hubble photos tomorrow.

Sorry for OT, was just happy to see the coincidental link to a Hubble photograph.


>> So much of image (and audio!) processing is about getting away with noise the person can't detect

For example, if you see what looks like blurry skin, and you de-blur it into skin, no-one will complain. Unless your de-blur noise elimination thinks the blurry eyes are noise, and turns the subject into a faceless monster.


That's just a different kind of kernel applied to the whole image.


Indeed, I gave it a go with a blind deconvolution product for the consumer market. In the end, I decided to kill it. Here's my blog post describing why I pulled the plug: http://www.keacher.com/?p=872


FWIW - Trying to turn what is naturally expected to be an item in a graphics editor's menu into a paid service was a ballsy, but otherwise futile, idea. The problem is there, but your solution was way too complicated from a UX perspective.


I'm no expert in this so correct me if I'm wrong, but isn't a blur convolution only ambiguously reversible? You still have to make smart guesses about how to undo the averaging, right?


Yes! So now you can guess why he was loading a specific, different parameter file for each picture. Translation: someone tried many different parameters for each image in this demo and then manually selected the ones that produced the best result. This might still be useful if there is a good UI for users to do the same.


Blind deconvolution is easier in the time domain, and for digital signals (because you know both the input and output can have only certain levels, thus constraining the problem). It's been used routinely in modems for years.
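As a toy illustration of how the known symbol levels constrain the problem (decision-directed LMS for a +/-1 signal, one of several classic approaches; tap count and step size here are arbitrary):

    import numpy as np

    def decision_directed_equalizer(received, n_taps=11, mu=0.01):
        # Because the transmitted symbols can only be +1 or -1, the hard
        # decision on the output can stand in for the unknown reference
        # signal, which is what lets this run "blind" without knowing the channel.
        received = np.asarray(received, dtype=float)
        w = np.zeros(n_taps)
        w[n_taps // 2] = 1.0                   # start as a pass-through filter
        out = np.zeros(len(received))
        for n in range(n_taps, len(received)):
            x = received[n - n_taps:n][::-1]   # most recent sample first
            y = np.dot(w, x)                   # equalizer output
            d = 1.0 if y >= 0 else -1.0        # nearest valid symbol level
            w += mu * (d - y) * x              # LMS update toward the decision
            out[n] = y
        return out, w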


Does this algorithm lend itself to video?

It would probably take a massive cloud of systems to correct a high-def video signal, but it would be impressive for many applications (news broadcasts, sports or any live event, security, or remote robots).

Granted, to achieve performance on the order of near-realtime DSP, it would require an impressive hardware system. Then again, these days I can spend the price of a coffee and get access to a cloud of CPUs...


One of the things I did during my stint at Sony was video stabilization for their desktop nonlinear video editing software. We were hoping to not only stabilize (which is a pretty straightforward thing to do these days) but also use the motion data to inform a deconvolution to clean up motion blurred frames. Alas, my project ran out of time and we ended up licensing a third party stabilizer.

But yes, most image processing techniques like this lend themselves to video. This is something I hope my startup will someday be able to integrate into our collaborative video editor...


I should clarify: many things like this lend themselves to video, but not directly. As ibuildthings pointed out, you have to be very careful because, for video, the human eye is very sensitive to the sharp-edge artifacts algorithms like this can produce; much more so than it is to motion blur.


Yep, I can imagine continuity problems, flicker etc. could show up in video.


Without having looked into it too much (mostly just the video), there seem to be fairly strong parallels with the motion tracking/estimation that modern video codecs do.

As I understand it, their system is trying to estimate the causes of noise in the image (camera shake, badly focused or dirty lens, etc) and apply an inverse transformation to produce the original 'real' image.

They can't extract detail out of nowhere, so a lot of it will likely be heuristically driven, accounting for specific sources of error and applying adaptive transforms with some sort of 'looks like what we think it should look like' cost function.

Figuring out the cost function is probably one of the hardest parts here.

In terms of video, you've got more data to work with, which in theory means more constraints to your solution making it potentially more accurate, but I suspect it'd also be very easy to get bogged down in the quantity of data.

Combining and merging data from multiple independent sources is a different problem entirely, probably more like synthetic aperture imaging: http://vision.ucsd.edu/kriegman-grp/research/synthetic_ap_tr...


Actually, the human eye/brain is much more forgiving of motion blur in video than in still images. The chances are that one will spot quite a number of artifacts [compression as well as blur] in a grabbed frame of HD-quality video that would be unacceptable in a photograph.

In theory, you can tweak this method so that it runs on frames in parallel, 'almost' independently. I say 'almost' since you will have to factor in temporal coherence, so that the deconvolution kernels do not vary much between subsequent frames (otherwise it will cause unwarranted flicker- and jitter-like effects).

However, a more pertinent, related problem for video is image stabilization, and the computer vision community is making some exciting strides on that front; e.g.: http://web.cecs.pdx.edu/~fliu/ .


For a while now, digital camcorders have had a digital image stabilisation feature built in. It seems to work like this:

http://camcorders.about.com/od/camcorders101/a/optical_vs_di...


The random anti-intellectual comments from the guys in the wheely chairs were extremely annoying and unfunny. This guy is there, showing something truly amazing, and they're all "What's an algorithm? Haha!". And they'll get away with it too.


Rainn Wilson is actually a really smart guy. He was playing the classic stupid-guy-being-wowed-by-a-genius card; it was an actor's way of complimenting him.


Anti-intellectualism aside, they still interrupted an amazing talk without adding any value. An actor's way of dominating the stage even when it's not your turn. A bit rude and unnecessary.


He was paid by the company to interject with quips during the demos, so any rudeness is understood by both parties. An awkward situation for all involved.


Being paid to be rude doesn't make it not rude.


He was paid to be rude to the people who were paying him to be rude? That might not make it not rude, but I can't imagine anything that would come closer.


"Look, I CAME HERE FOR AN ARGUMENT, I'm not going to just stand..."

"OH, oh I'm sorry, but this is abuse."

"Oh, I see, well, that explains it."

"Ah yes, you want room 12A, Just along the corridor."


It was rude to be rude to the people paying him to be rude because it appears to the other members of the audience who were not being paid to be rude that being rude and interrupting the talk is acceptable practice.


How about you all lighten up, yeah?


Whoosh


I said all, not just you.


I'm guessing you weren't at the talk and therefore are making some assumptions about the participants and their behavior based on the 4 minutes of video you just saw.


You're right, I was. During that four minutes, I thought he was being a bit rude and playing an unnecessary role. Better?

Honestly though, if people were enjoying them, I don't think it's any less rude. It says more to me about the anti-intellectualism mentioned earlier. Reminds me of the Diesel campaign surrounding the word "Stupid".

I don't know, I think you can nail this kind of repartee and it's funny and makes things flow - which stops the day from getting boring and stiff. But I hate it when it's not quite right. Kinda like the MC at a gig who tries to make bad jokes and ends up being super awkward.


They added humor, and value is a subjective idea.


He made a good point. The presenter might as well have said "it works by being a computer program."

It was a pretty funny way of asking for more details imho.


It was funny. "This should be in the next version of Photoshop. Will I pay for it? No."


Let me load the specially constructed set of parameters specific to this image so that when I do the next step you get a really clear image.

That was a little too hand-wavy. I'm a little dubious until I see what went into that phase.


To an extent, this is already available -- for example in the Topaz Labs InFocus Photoshop plugin. There are some params to play with that make it easier to find the blur trajectory when the blur is motion-related (although if you leave it in "figure it out for yourself" mode, it gets it right often enough). InFocus (the current version) will only do linear trajectories, though -- it can't handle curves as well as this Photoshop sneak does.

The parameter preload isn't cheating -- if they're anything like the InFocus params, they're pretty obvious but somewhat tedious. They're things like telling it that you're trying to correct motion blur rather than focus blur, what level of artifacting you're willing to put up with (for forensics or text recovery, you can put up with a lot of noise in the uninteresting part of the picture), the desired hardness of recovered edges, that sort of thing. It would have just been a time-waster for the demo (and, like in the demo, InFocus allows you to save the params as a preset).


Yeah, I tried the Topaz InFocus plugin after it got a bit of buzz from the TWIP podcast. It wasn't quite as magical as the demo images on the site made it seem, and I ended up not buying it (Topaz's Denoise plugin, OTOH, is quite incredible).

I suspect people will have to manage their expectations with this Adobe plugin/feature as well.


I've gotten some rather amazing results with InFocus myself, but it does take a lot of tweaking of sliders and so on. It was mostly recovering irreplaceable stuff, otherwise it wouldn't have been worth the bother. (Taking better pictures is always the better option when you have it.) I do prefer the output of a mild application to unsharp masking for photos that aren't actually blurry, though. And, of course, running it after DeNoise just gets you your noise back, often sharper and more noticeable than before.

I sort of expect Adobe to do better -- they've got a lot more resources to work on the problem. Maybe I'll finally find a reason to upgrade from CS3.


Curious - how badly blurred were your images?

When I tried InFocus, I used it on some shots where I blew the focus at wider apertures (now that I'm using a camera with better high-ISO performance, that kind of defocus is rarely an issue, because I have the luxury of stopping down), and I couldn't get adequate results, and I wasn't willing to spend a lot of time tweaking sliders.

I am totally with you on the idea of taking better pictures though. The more you can reduce your effort in post with technique, the better. A lot of stuff is unfixable in post.


He used the same parameter profile each time.


No, he's loading a different parameter file each time.


Either way, there's no point in watching a tech preview and being cynical about it.


I think there is. I much prefer cynicism to the blind adulation of "this tech is amazing and magical". Otherwise it's almost marketing spin.


I think both are extreme ends of a spectrum on which you should try to stay in the middle somewhere.


I'm more impressed with that overhead display - seems impossible?

How does it disappear at the end - or is that a virtual digital overlay?

Wait, is the entire background rear-projected, like a borderless movie theater screen? Must be massive resolution?!


I was thinking the same thing.


I'm inclined to say much or all of the background is indeed all rear-projected. If you look at the top-right around 5:06 or 5:07, you see what looks a lot like it might be light-emitting text floating in midair. (EDIT: Correction! It's not just floating there, it scrolls left just as the camera is coming down, I missed that on the first viewing. So it's definitely not just on a physical banner.)

As for "massive resolution", slicing up a framebuffer and shooting out the components to multiple projectors wouldn't be a new idea, and I'll bet that's what was done here.


Yes, it was all projected. (also the side walls of the theatre). Some 20+ projectors pushing 300 million pixels/sec. The intro to the keynotes was pretty amazing as well, meshing the projection with light effects and live performance: http://www.youtube.com/watch?v=VrDPgUjqTQ8&feature=relat...


Forensic police drama writers everywhere: vindicated.

This is seriously cool technology.


Not really, because it's not the same thing.

What CSI does is add information that wasn't initially there, whereas what this is doing is just unscrambling the information. This is only for photos where the camera has moved during the exposure, so it should be great for low-light shots (indoors, etc.) where you need a shutter speed a bit slower than optimal, say 1/5 second.


Yes. This achieves the maximum possible theoretical resolution of an imaging system. The CSI-ish "enhance" feature typically goes well beyond the maximum possible resolution of an image.


hahaha i was just thinking the same thing.

"zoom in on that. good. now....ENHANCE."


I assume this just undoes the effects of an unsteady camera during exposure--all the information is present in the image, just smeared along a predictable path, and can be reversed by something akin to a deconvolution (but more complicated as it involves arbitrary nearby points).

Upscaling an intrinsically low-resolution image is still in the realm of creating information where there was none, I think.


"... and ROTATE"



Does this work with just motion blur or also with aperture blur? It seems like they are calculating the motion of the camera so perhaps just the former.


Defocus (or "aperture blur") cannot be corrected by the methods they mention in the video. However, there are other kinds of blur you can correct.


This is an interesting way of completely avoiding defocus blur: http://www.lytro.com/picture_gallery


How is this different from what FocusMagic http://www.focusmagic.com/ has been offering for over a decade?


FocusMagic handles only focus blur or linear motion blur, and either way, it requires a high level of user interaction to direct the deblurring. In effect, it is "non-blind" deconvolution.

The Adobe approach, on the other hand, handles complex (non-linear) motion blur and does so in a so-called "blind" way.


We're obviously going to need many more independent samples to compare both.


Wish they applied that algorithm on the video...


Indeed. And held the camera steady. I have motion sickness now, and I'm sitting at my desk :(


Been hoping for this for a while! The information is there, it’s just distorted. Great to see Adobe keep pushing this kind of photo editing magic forward. I bet the maths are crazy.


The information is not really there, because the phase is not captured by the sensor. All you have is the intensity of the light.


I believe the speaker mentioned this algorithm was based on the point spread function[1] but modified to model the movement of the point over time. Dougherty has a static PSF deconvolution implementation[2] that is fun to play with.

[1] http://en.wikipedia.org/wiki/Point_spread_function [2] http://www.optinav.com/Iterative-Deconvolution.htm
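For anyone who wants to experiment, Richardson-Lucy is the textbook iterative deconvolution for a known, static PSF. A bare-bones numpy sketch (no edge handling or regularization):

    import numpy as np
    from scipy.signal import fftconvolve

    def richardson_lucy(blurred, psf, iterations=30):
        estimate = np.full(blurred.shape, 0.5)      # flat initial guess
        psf_mirror = psf[::-1, ::-1]
        for _ in range(iterations):
            reblurred = fftconvolve(estimate, psf, mode="same")
            ratio = blurred / (reblurred + 1e-12)   # avoid divide-by-zero
            estimate *= fftconvolve(ratio, psf_mirror, mode="same")
        return estimate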


To me it seems the magic is in getting the blur kernel in the first place. How do you get that?


Now all Photoshop needs is an unCrash feature.


Well, it KINDA looked like stuff was being unblurred, but it's really hard to tell with the camera panning around out of focus. The only part I could really be sure was actually unblurred was the phone number.



I suppose this is more image sharpening than reconstruction. Is this very different from the technology on cameras/phones that tries to reduce photo blurriness due to unsteady hands?


What makes you suppose that? On an abstract level, you can model blur and camera shake with a convolution kernel. You can then invert the kernel and get back the original image. As an analogy, imagine that someone gives you an audio file with an echo. You can subtract the echo with a filter. Camera shake is harder because of the extra dimension. (Of course, you only get back the exact original in the world of mathematics)
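The echo case in one dimension, as a sketch (assuming a single echo with known delay and attenuation, which in practice you'd have to estimate, just like the blur kernel):

    import numpy as np

    def remove_echo(y, delay, attenuation):
        # Undoes y[n] = x[n] + a * x[n - d] by recursive inverse filtering:
        # x[n] = y[n] - a * x[n - d], working forward through the signal.
        x = np.array(y, dtype=float)
        for n in range(delay, len(x)):
            x[n] -= attenuation * x[n - delay]
        return x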


It should hopefully get easier with a dedicated ASIC crunching on the extra dimension gathered from future built-in gyroscopes in cameras.


I don't believe the echo is a simpler example; it's really just as complicated as blur. You can remove both to a large extent, provided you have the correct model with the correct parameters. Discovering the right parameters is a hard problem in both cases.


What's the difference between "sharpening" and "reconstruction"? Both take an input image and apply a filter that is the best estimate of the inverse of the filter originally applied (e.g., wrong focus).

The technique for unsteady-hands artifacts often has access to motion data, which this thing does not.


One of the tricky parts of this method is extracting/guessing the camera motion path purely from image measurements. The better the estimate, the better the deblur kernel will be. What might be cool is if they could extract meta-information from inertial and gyroscope-like sensors (which are fast becoming standard in phones and cameras) to supplement the motion path computation.
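A purely hypothetical sketch of what that supplement could look like (none of this is anything Adobe has described; the names and the small-angle conversion are assumptions): integrate the gyro's angular rates over the exposure into an image-plane path and rasterize it as a starting blur kernel.

    import numpy as np

    def kernel_from_gyro(angular_rates, dt, focal_length_px, size=31):
        # angular_rates: (N, 2) pitch/yaw rates sampled during the exposure.
        angular_rates = np.asarray(angular_rates, dtype=float)
        angles = np.cumsum(angular_rates * dt, axis=0)   # integrate rates to angles
        path = angles * focal_length_px                  # small-angle approx -> pixels
        path -= path.mean(axis=0)                        # center the path
        k = np.zeros((size, size))
        c = size // 2
        for dx, dy in path:
            xi, yi = int(round(c + dx)), int(round(c + dy))
            if 0 <= xi < size and 0 <= yi < size:
                k[yi, xi] += 1.0
        return k / k.sum() if k.sum() > 0 else k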


So now we need blurring algorithms that cause actual information loss (I'm sure they already exist, but now there's suddenly a bigger market for them).


We always did need those algorithms. There was a paper a few years ago on decoding gaussian-blurred documents to reveal the redacted passages. The only safe way is to completely remove the original pixels, e.g. by drawing a black box over the text.


And not just by drawing that box over them in the original (textual-content) PDF. You need to flatten it down to images, then destroy the bits you need to. (There might be other ways, but enough Big Government Agencies have screwed it up to make it worth noting.)

Is the 'decoding' paper this one: http://dheera.net/projects/blur.php ? The blur function is just smearing pixel values across their neighbours in blocks, so you can treat it as a hashing function and then generate enough candidates that eventually you get something that hashes correctly (or close enough).

I'm not sure how practical it'd be on data longer than a credit-card number, but it's an interesting hack nevertheless.
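A sketch of that brute-force idea (assuming you can reproduce the original rendering closely enough, fonts, sizes and blur radius included, which is the hard part):

    import numpy as np
    from PIL import Image, ImageDraw, ImageFont
    from scipy.ndimage import gaussian_filter

    def render(text, size=(200, 24)):
        # Render a candidate the same way the redacted original was rendered
        # (font, position and image size here are stand-in assumptions).
        img = Image.new("L", size, 255)
        ImageDraw.Draw(img).text((2, 2), text, fill=0, font=ImageFont.load_default())
        return np.asarray(img, dtype=float)

    def best_candidate(blurred_target, candidates, sigma=3.0):
        # Treat the blur as a hash: blur each candidate and keep the closest
        # match to the blurred target (which must have the same dimensions).
        def score(text):
            return np.sum((gaussian_filter(render(text), sigma) - blurred_target) ** 2)
        return min(candidates, key=score)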


Just use content-aware fill in CS5.


I've read a few of the very technical responses and they are great, but, for me, the takeaway was the audience response. It's exactly what I look for when I write software. I want that gasp, that moment where someone realizes they can do a hard thing much more easily. Where they realize that they just got a few moments of time back.


Can this overcome some of the soft blurring media companies/journalists use to hide naughty bits and to protect identities?


Now that is a feature I would upgrade for.


ImageMagick can already use this "algorithm". See "Fourier transform" applications such as:

http://www.fmwconcepts.com/imagemagick/fourier_transforms/fo...)


Deconvolution is easy. The main difficulty is estimating the blur kernel in a so-called "blind" manner. That's the advance being shown off by Adobe.


Right. It wasn't clear to me from the video that Adobe is using "blind" deconvolution. I did catch a glimpse of a small black square within a pure white field (in the right-side palette in the video). I'm assuming that it is a spatial-domain motion blur filter, a variable by which the interface tweaks the effect.


I thought that was an output field, showing what motion it had estimated. Guess we'll have to wait until it's released to find out :-)


Is this much different than the Lytro "focus later" camera? http://www.lytro.com/. I don't know much about imaging, but I've been drooling over the demos I've seen online.


Completely different. This is a way to remove motion blur. Lytro is a way to move some of the focusing elements from physical to computational.


What they should do is partner with a DSLR maker and put this in the firmware of the camera itself. Imagine one button blur correction. That'd be amazing.


I don't think DSLRs have the processing power needed to unblur the images. It took several seconds to unblur a section of an image. It'd probably be faster to upload the image to a cloud processor and have it unblur it for you.


> It'd probably be faster to upload the image to a cloud processor and have it unblur it for you.

It might be more efficient to just teach people how to use their cameras so that they can minimize unwanted blurring.


Then perhaps it can be implemented on a newer Android version or in the (soon to arrive) big 10" Kindle Fire.

This is a perfect "web service": you profit from selling the device, which comes with free access to the service.


Maybe they could apply this technology to Flash, so that video streaming on YouTube isn't so blurry :)


My version of Photoshop had that feature for years.

1) Load image. 2) Filter -> Gaussian Blur 3) Undo

;-)


So, why did the whole video look really fake? It seemed to bob around in a very predictable manner. When the first sharpening took effect, it panned and zoomed exactly in time with the appearance of the second image.

I'm not claiming the demo is fake, I'm just wondering why the video looks so strange.


I think it would be unreasonable to claim that it is fake anyway, since the relevant research has been public for some time. If there's any trickery involved, it's in the parameter files he loaded (they might be nontrivial/nonautomatic to produce).


Again, I'm not questioning the demo, I'm asking why the video has so many very strange traits.


I know, I was referring to people who might question it (such as the posters below you), hence the "anyway".


I'm guessing it's a hand held camera with some stabilization algorithm in post processing.


That plus the kinda fakey-looking screen made me think the whole thing was staged. Then again, if you were going to stage it, you could do a lot better than that camera motion without trying very hard.


You do realize that the video is not from Adobe, but just from visitors of the conference?


Yes. That's what makes the weird motion seem so fake. I expect random movements from a hand-held video, but this really, really had the feel of pseudo-random, repetitive motions that I've seen in CGI effect demos.


All first-look demos like this are staged/faked to some degree; why would you think otherwise?


The demo doesn't necessarily look fake, the video looks fake.


So who shot Kennedy?


Need this for my wife. Will that camera's RAW format need to be compatible with this algorithm?



It leaves out a couple of my favs:

Super Troopers: http://www.youtube.com/watch?v=KiqkclCJsZs

Red Dwarf: http://www.youtube.com/watch?v=KUFkb0d1kbU



