I think labelling this as "DSLR quality images" is highly misleading. It's another interesting image enhancement technique, this time using deep convolutional networks to predict what information is missing or incorrect.
However, as the article says, camera phones are fundamentally limited by the size of their lenses and sensors, so they can never gather the same amount of information as a larger DSLR camera. They simply can't capture the same detail, and although we can get better at predicting detail in some cases, it surely can't ever be the same as having that information in the first place.
From the sample images, all I can see is that the "DSLR"-ized images look brighter and have subtly more pleasing colors. But is that all there is to it? Am I missing the big picture?
If you right click and view the images in a tab (at higher resolution!), they basically all look like they've had a brightness/contrast and sharpness filter applied. Honestly, I much prefer the originals; they look cleaner, and the lighting is less harsh with better contrast. But maybe I just have unusual æsthetic tastes.
Having actually seen DSLR output, I can say that neither the original nor the "enhanced" version looks anything like what a DSLR can do. The debayering artifacts and noise are still there.
They tested quality in a blind test, where people were asked to choose the better-looking photo among versions originally taken with an iPhone and enhanced by one of three methods: the Apple Photo Enhancer, their CNN method, and a professional graphics artist using Photoshop. 42 people were shown 9 pictures each. From the paper:
"Although human enhancement turns out to be slightly preferred to the automatic APE, the images enhanced by our method are picked more often, outperforming even manual retouching."
Was thinking exactly the same thing. From those images alone, it's not entirely clear that the same can't be achieved by brightness/contrast and hue/saturation sliders.
I agree. It looks like something you could do quickly in Lightroom. Raise the shadows, lower the highlights, set white balance. Maybe you could use the same algorithm for DSLR RAW pictures and then get real DSLR quality?
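For concreteness, here is the kind of global "slider" edit being described, sketched with Pillow; the file names and enhancement factors below are placeholders for illustration, not anything taken from the paper:

    # Rough sketch of global "slider" edits with Pillow; file names and factors
    # are made-up placeholders, not values from the paper.
    from PIL import Image, ImageEnhance

    img = Image.open("phone_photo.jpg")
    img = ImageEnhance.Brightness(img).enhance(1.15)  # lift exposure a touch
    img = ImageEnhance.Contrast(img).enhance(1.10)    # slightly stronger contrast
    img = ImageEnhance.Color(img).enhance(1.20)       # boost saturation
    img = ImageEnhance.Sharpness(img).enhance(1.30)   # mild sharpening
    img.save("phone_photo_tweaked.jpg")

A handful of global adjustments like that won't recover detail, but it would be interesting to see how close they get in the same blind test.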
> The images were captured in automatic mode, we used default settings for all cameras throughout the whole collection procedure.
I guess while that technically makes the images "DSLR quality", that's not how DSLR cameras are generally used. Shooting in auto and doing no post-processing will never yield the results you see with even amateur DSLR photography.
For people who view camera settings as an inconvenience, yes, that sounds attractive. When using a camera as a tool, a photographer sometimes wants complete control over the capturing process. Some things you don't get to change later no matter what algorithms are involved, such as the balance between noise and exposure time.
If the camera had a hardware light meter and also recorded the ambient light level, wouldn't it be more powerful to be able to make full exposure adjustments after the fact?
No, because the act of taking the photo freezes the exposure settings: the shutter time and sensor gain are baked in at capture.
I've played a bit with astronomical photography, and the amount of work that goes into getting a good enough signal in the picture without introducing noise is quite high. (Going as far as taking pictures with the cap on the telescope to capture the sensor's own noise, a calibration that is usually good for a few hours of use.)
However, you don't get a second try on the computer: if the signal is too weak or the noise is too strong, the picture is ruined, and no amount of editing will fix that.
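For anyone curious, the dark-frame part of that calibration is conceptually small; here is a rough numpy sketch (frame loading and the rest of a real calibration chain — flats, bias, alignment — are left out):

    # Minimal sketch of dark-frame subtraction plus stacking; assumes the frames
    # are already loaded as float arrays of identical shape.
    import numpy as np

    def calibrate_and_stack(light_frames, dark_frames):
        # Master dark: the sensor's own noise pattern, shot with the scope capped.
        master_dark = np.median(np.stack(dark_frames), axis=0)
        # Remove the dark signature from each exposure, then average to raise SNR.
        calibrated = [np.clip(f - master_dark, 0, None) for f in light_frames]
        return np.mean(np.stack(calibrated), axis=0)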
The only solution I'd see for that is to take multiple images at once with different sensors, each using a different exposure. But that still doesn't provide a sliding scale, only a step function of "is this close enough?"
The better solution is to use a proper light meter and properly set up the camera in the first place.
Or you can shoot several shots with different exposures simultaneously (see e.g. the Light L16). It may not work for astronomy, but it should work in many other cases.
That's not really a solution, especially because it requires a lens per exposure setting (you can't expose a single sensor with different settings at the same time; that's basic physics). So any number of exposure levels over two is going to be a huge problem.
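Merging a bracketed burst after the fact is easy enough in software; here is a sketch using OpenCV's Mertens exposure fusion, with placeholder file names and no claim about how Light actually combines its camera modules:

    # Sketch: merge a bracketed burst with OpenCV's exposure fusion.
    # File names are placeholders for illustration.
    import cv2

    exposures = [cv2.imread(p) for p in ("under.jpg", "normal.jpg", "over.jpg")]
    fused = cv2.createMergeMertens().process(exposures)  # float image, roughly [0, 1]
    cv2.imwrite("fused.jpg", (fused * 255).clip(0, 255).astype("uint8"))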
An unsynchronised manual gearbox also gives you full control over the process. And like an unsynchronised manual gearbox, a fully manual camera is great in a few situations but most of us don't find it that convenient as a daily driver.
> To tackle the general photo enhancement problem by mapping low-quality phone photos into photos captured by a professional DSLR camera, we introduce a large-scale DPED dataset that consists of photos taken synchronously in the wild by three smartphones and one DSLR camera. The devices used to collect the data are iPhone 3GS, BlackBerry Passport, Sony Xperia Z and Canon 70D DSLR. To ensure that all devices were capturing photos simultaneously, they were mounted on a tripod and activated remotely by a wireless control system.
It opens up the space for others to experiment. Who knows, maybe an app for cheap smartphones that takes Pixel-quality pics.
I am curious: what is the inference cost of doing this? How much memory and CPU does it take to run the trained network on each image captured by the camera? Are we talking milliseconds, and can it run on a phone?
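One way to get a ballpark without the authors' code is to time a similarly sized image-to-image CNN yourself; a PyTorch sketch with a generic stand-in network (not the DPED architecture), so treat the result as an order-of-magnitude guess only:

    # Ballpark latency for a generic image-to-image CNN (NOT the DPED model).
    import time
    import torch
    import torch.nn as nn

    layers = [nn.Conv2d(3, 64, 9, padding=4), nn.ReLU()]
    for _ in range(4):
        layers += [nn.Conv2d(64, 64, 3, padding=1), nn.ReLU()]
    layers += [nn.Conv2d(64, 3, 9, padding=4)]
    net = nn.Sequential(*layers).eval()

    x = torch.rand(1, 3, 1080, 1920)        # one full-HD frame
    with torch.no_grad():
        net(x)                              # warm-up pass
        start = time.perf_counter()
        net(x)
        print(f"CPU inference: {time.perf_counter() - start:.2f} s per frame")

Memory is dominated by the intermediate feature maps (64 channels at full image resolution), which is the bigger obstacle on a phone than raw FLOPs.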