It's called blind deconvolution. "Blind" means they first have to estimate the original convolution/blur kernel, then apply the deconvolution in a second phase. If there's an acceleration sensor on the camera, its data can be used to estimate the blur kernel.
One key problem with deconvolution is that it's very susceptible to noise. I'm guessing they developed a way to ramp up the coefficients so you see an increase in clarity while keeping the noise below visible levels. So much of image (and audio!) processing is about getting away with noise the person can't detect :)
On a side note: does anybody know of workable deconvolution algorithms that vary the kernel over the image? The example would be compensating for a bad lens.
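To make the noise-susceptibility point concrete, here's a minimal 1-D numpy sketch (the kernel, signal, and noise level are all made up, and the real problem is 2-D and blind, i.e. the kernel must also be estimated). A naive inverse filter divides by the kernel's frequency response, which has near-zeros; Wiener-style regularization damps exactly those frequencies:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 256
x = np.cumsum(rng.normal(size=n))              # smooth-ish "scene"
k = np.zeros(n); k[:9] = 1.0 / 9.0             # 9-tap box blur (made up)
K = np.fft.fft(k)
y = np.fft.ifft(np.fft.fft(x) * K).real + 0.05 * rng.normal(size=n)

Y = np.fft.fft(y)
naive = np.fft.ifft(Y / K).real                # divides by near-zeros of K
lam = 1e-2                                     # regularizer; tune to noise level
wiener = np.fft.ifft(Y * K.conj() / (np.abs(K) ** 2 + lam)).real

mse = lambda v: np.mean((v - x) ** 2)
print(mse(naive), mse(wiener))                 # naive blows up; Wiener doesn't
```

The `lam` knob is the "ramp up the coefficients" trade-off: lower it and you gain sharpness but the amplified noise becomes visible.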
> So much of image (and audio!) processing is about getting away with noise the person can't detect :)
Adobe's noise processing algorithms in their software are something else, especially in ACR6/Lightroom 3.
I moved 'down' from a 5D2 to a Panasonic GF1 as I don't shoot pro anymore, and ISO1600 on this thing with a slightly-off exposure can be pretty noisy. Lightroom cleans it up incredibly well without losing clarity/sharpness. Before that, I'd just do all I could to keep the ISO down so I didn't 'lose' images to noise.
It's a very different kind of noise. Artifacts, is what he meant to write. Motion deconvolution done wrong tends to produce ringing artifacts and other defects that make the image look badly and inaccurately oversharpened.
The basic theory for doing this type of deblurring isn't too bad, but making it easy and automatic becomes a really difficult computer vision problem. Adobe has been working on this (and a lot of other pretty cool stuff) for quite a while. It'll be interesting to see if they ever ship it.
I'm guessing that a whole lot of analysis was left out by loading pre-configured coefficient files to deconvolve these images. Getting that process to be relatively easy for the lay person will be a challenge.
Just 10 minutes ago I put my daughters to bed and told them about Hubble. I realized myself how incredible it is that we humans have dragged what is basically a huge camera out into space just so we can shoot pictures of the stars without lots of air being in the way. I promised them we would Google for Hubble photos tomorrow.
Sorry for OT, was just happy to see the coincidental link to a Hubble photograph.
>> So much of image (and audio!) processing is about getting away with noise the person can't detect
For example, if you see what looks like blurry skin, and you de-blur it into skin, no-one will complain. Unless your de-blur noise elimination thinks the blurry eyes are noise, and turns the subject into a faceless monster.
Indeed, I gave it a go with a blind deconvolution product for the consumer market. In the end, I decided to kill it. Here's my blog post describing why I pulled the plug: http://www.keacher.com/?p=872
FWIW - Trying to turn what is naturally expected to be an item in a graphics editor's menu into a paid service was a ballsy but ultimately futile idea. The problem is there, but your solution was way too complicated from a UX perspective.
I'm no expert in this so correct me if I'm wrong, but isn't a blur convolution only ambiguously reversible? You still have to make smart guesses about how to undo the averaging, right?
Yes! So now you can guess why he was loading a different parameter file for each picture!
Translation: someone tried many different parameters for each image in this demo and then manually selected the ones that gave the best result.
This might still be useful if there is a good UI for users to do the same.
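The ambiguity asked about above is easy to demonstrate: a blur kernel has zeros (or near-zeros) in its frequency response, and any signal component living at those frequencies is simply erased, so many different sharp inputs map to the same blurred output. A minimal 1-D numpy illustration (the kernel and sizes are made up):

```python
import numpy as np

n = 64
k = np.zeros(n); k[:4] = 0.25                  # 4-tap box blur
K = np.fft.fft(k)                              # exact zeros at f = 16, 32, 48

rng = np.random.default_rng(0)
x1 = rng.normal(size=n)
x2 = x1 + np.cos(2 * np.pi * 16 * np.arange(n) / n)  # component the blur kills

blur = lambda v: np.fft.ifft(np.fft.fft(v) * K).real
print(np.max(np.abs(x1 - x2)))                 # the inputs clearly differ...
print(np.max(np.abs(blur(x1) - blur(x2))))     # ...their blurred versions don't
```

Any deconvolution has to guess what lived at those dead frequencies, which is where priors (or hand-tuned parameter files) come in.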
Blind deconvolution is easier in the time domain, and for digital signals (because you know both the input and output can have only certain levels, thus constraining the problem). It's been used routinely in modems for years.
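A toy sketch of the modem case (this is decision-directed LMS equalization, one of several blind schemes; the channel, tap counts, and step size are all made up). The point is the alphabet constraint: because the receiver knows the symbols can only be ±1, the nearest-symbol decision can stand in for a training sequence:

```python
import numpy as np

rng = np.random.default_rng(1)
sym = rng.choice([-1.0, 1.0], size=5000)   # BPSK: transmit alphabet is {-1, +1}
h = np.array([1.0, 0.4])                   # "unknown" 2-tap channel causing ISI
rx = np.convolve(sym, h)[:len(sym)]        # what the receiver actually sees

# Decision-directed LMS: snap the output to the nearest symbol and adapt
# toward that decision -- the known alphabet replaces a training sequence.
w = np.zeros(4); w[0] = 1.0                # 4-tap equalizer, start as passthrough
mu = 0.01
for t in range(len(w) - 1, len(rx)):
    xv = rx[t - len(w) + 1:t + 1][::-1]    # newest sample first
    y = w @ xv
    w += mu * (np.sign(y) - y) * xv        # error vs nearest symbol

mse = lambda v: np.mean((np.sign(v) - v) ** 2)   # distance to nearest symbol
eq = np.array([w @ rx[t - 3:t + 1][::-1] for t in range(3, len(rx))])
print(mse(rx[1000:]), mse(eq[1000:]))      # equalized output is much cleaner
```

An image has no such known alphabet, which is a big part of why photographic blind deconvolution is so much harder.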
It would probably take a massive cloud of systems to correct a high-def video signal, but it would be impressive for many applications (news broadcasts, sports or other live events, security, or remote robots).
Granted, achieving performance on the order of near-real-time DSP would require an impressive hardware system. Then again, these days I can spend the price of a coffee and get access to a cloud of CPUs...
One of the things I did during my stint at Sony was video stabilization for their desktop nonlinear video editing software. We were hoping to not only stabilize (which is a pretty straightforward thing to do these days) but also use the motion data to inform a deconvolution to clean up motion blurred frames. Alas, my project ran out of time and we ended up licensing a third party stabilizer.
But yes, most image processing techniques like this lend themselves to video. This is something I hope my startup will someday be able to integrate into our collaborative video editor...
I should clarify: many things like this lend themselves to video, but not directly. As ibuildthings pointed out, you have to be very careful because for video the human eye is very sensitive to the sharp-edge artifacts algorithms like this can produce; much more so than it is to motion blur.
Without having looked into it too much (mostly just the video), there seem to be fairly strong parallels with the motion tracking/estimation that modern video codecs do.
As I understand it, their system tries to estimate the causes of degradation in the image (camera shake, a badly focused or dirty lens, etc.) and apply an inverse transformation to recover the original 'real' image.
They can't extract detail out of nowhere, so a lot of it will likely be heuristically driven, accounting for specific sources of error and applying adaptive transforms with some sort of 'looks like what we think it should look like' cost function.
Figuring out the cost function is probably one of the hardest parts here.
In terms of video, you've got more data to work with, which in theory means more constraints on your solution, making it potentially more accurate; but I suspect it'd also be very easy to get bogged down in the sheer quantity of data.
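The 'inverse transform + cost function' framing above can be sketched as regularized (MAP-style) deconvolution: minimize a data-fit term plus a prior that encodes 'looks like what we think it should look like'. A hedged 1-D numpy version with the simplest possible prior, a squared-gradient penalty (the real system's prior and optimizer are unknown; kernel, sizes, and weights here are made up, and the kernel is assumed known rather than blindly estimated):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 128
truth = np.repeat(rng.normal(size=n // 8), 8)      # piecewise-constant "image"
k = np.ones(5) / 5.0                               # assumed-known box blur

def blur(v):
    return np.convolve(v, k, mode="same")

y = blur(truth) + 0.01 * rng.normal(size=n)        # blurred + sensor noise

# Minimize ||blur(x) - y||^2 + lam * ||grad(x)||^2 by gradient descent.
x = y.copy()
lam, step = 0.02, 0.5
for _ in range(800):
    data_grad = np.convolve(blur(x) - y, k[::-1], mode="same")   # adjoint of blur
    prior_grad = -np.convolve(x, [1.0, -2.0, 1.0], mode="same")  # from ||grad x||^2
    x -= step * (data_grad + lam * prior_grad)

mse = lambda v: np.mean((v - truth) ** 2)
print(mse(y), mse(x))      # the regularized estimate beats the blurry input
```

Swapping the squared-gradient prior for something edge-preserving (total variation, sparse gradients) is where most of the hard cost-function design lives.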
Actually, the human eye/brain is much more forgiving of motion blur in video than in still images. Chances are you'll spot quite a number of artifacts [compression as well as blur] in a grabbed frame of HD-quality video that would be unacceptable in a photograph.
In theory, you can tweak this method so it runs on frames in parallel, 'almost' independently. I say 'almost' since you have to factor in temporal coherence, so that the deconvolution kernels do not vary much between subsequent frames (otherwise they will cause unwarranted flicker- and jitter-like effects).
However, a more pertinent related problem for video is image stabilization, and the computer vision community is making some exciting strides on that front; see e.g. http://web.cecs.pdx.edu/~fliu/ .
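The temporal-coherence idea above can be sketched very simply: estimate a kernel per frame in parallel, then smooth the estimates over time before deconvolving, so the kernel can't jump frame to frame. Here `est_kernels` stands in for whatever a per-frame blind estimator produced (the true kernel, noise level, and smoothing factor are all made up):

```python
import numpy as np

rng = np.random.default_rng(2)
true_kernel = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
# Stand-in for independent per-frame blind estimates: truth + estimation noise.
est_kernels = true_kernel + 0.05 * rng.normal(size=(100, 5))

alpha = 0.8                                    # higher = more temporal coherence
smoothed = np.empty_like(est_kernels)
smoothed[0] = est_kernels[0]
for f in range(1, len(est_kernels)):           # exponential smoothing over frames
    smoothed[f] = alpha * smoothed[f - 1] + (1 - alpha) * est_kernels[f]

jitter = lambda ks: np.mean(np.abs(np.diff(ks, axis=0)))   # frame-to-frame change
print(jitter(est_kernels), jitter(smoothed))   # smoothing kills most of the jitter
```

A real system would need something smarter than exponential smoothing (the shake genuinely changes between frames), but the flicker argument is the same.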
I remember doing some blind source separation for audio when I was in school; it lets you discriminate multiple voices in a noisy environment. I wonder what the set of output signals would be in the case of images... maybe the image as illuminated by different sources?
It's nothing new, really, but the algorithms for it have advanced tremendously. For example, here are some results from 2009: http://www.youtube.com/watch?v=uqMW3OleLM4
Teuobk on HN also made a startup/app based on this, but it seems to be down now: http://news.ycombinator.com/item?id=2460887