Image Scaling using Deep Convolutional Neural Networks (flipboard.com)
89 points by hlfw0rd on Aug 20, 2015 | 17 comments



John Resig used a ConvNet, waifu2x, to upscale Japanese prints:

http://ejohn.org/blog/using-waifu2x-to-upscale-japanese-prin...


Neat, specializing for Japanese prints must have improved the outcome.


Interesting stuff! It would certainly benefit from a comparison to other super-resolution techniques, e.g.

Glasner et al., "Super-resolution from a single image"
Freeman et al., "Example-based super-resolution"


What's intriguing about this is that the output isn't really real. The best place to see this is the bark patterns on the trees in the last 3-way comparison. The output is convincing and yet not quite right. The neural net didn't know, so it guessed plausibly. Keep scaling and I bet you'd see Google Inceptionism-style dream details slipping in.


This is very cool! The next DNN adventure should be a network trained on lots of videos to animate still pictures!


You might like this paper: "Multi-view Face Detection Using Deep Convolutional Neural Networks", http://arxiv.org/abs/1502.02766


I'll admit that I skimmed the article, but I have the feeling this CNN didn't learn what they intended it to learn. Looking at the examples shown, they started with a full-resolution image and applied some downsampling algorithm to produce the lower-resolution input for their algorithm. Their algorithm has learned to undo the downsampling they applied. That doesn't mean it will perform well on images that haven't been downsampled, or on images that were downsampled in a different way.
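
Concretely, the training setup being described is something like this (a minimal Python/PIL sketch; the choice of bicubic as the downsampling filter is my assumption, since the post doesn't say which filter they used):

    from PIL import Image
    import numpy as np

    def make_training_pair(path, factor=2):
        # Ground truth: the original full-resolution image.
        hr = Image.open(path).convert("RGB")
        w, h = hr.size
        # Synthetic input: downsample the ground truth. The network is
        # then trained to map lr back to hr, i.e. to undo this filter.
        lr = hr.resize((w // factor, h // factor), Image.BICUBIC)
        return np.asarray(lr), np.asarray(hr)

The net learns to invert whatever filter sits on that resize line; a native low-resolution capture was never produced by that filter, which is exactly the generalization worry.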


This is possibly true, but it's pretty much the only practical way to do this.

If we look at most of the literature around upscaling, this method is used pretty frequently.

For a more comprehensive look at using CNNs for image upscaling, see e.g. http://research.microsoft.com/en-us/um/people/kahe/publicati...


There is a more recent version of this paper published here: http://arxiv.org/abs/1501.00092v3

At Flipboard, we did not have time to do a full comparison with related upscaling research, but we were happy with the low error our CNN achieved.
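
For reference, the network in that paper (SRCNN) is only three convolutional layers. A rough PyTorch sketch, assuming the paper's default 9-1-5 filter sizes and a single luminance channel, with padding added to preserve the image size (the input is first bicubic-upscaled to the target resolution):

    import torch.nn as nn

    class SRCNN(nn.Module):
        def __init__(self):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(1, 64, kernel_size=9, padding=4),  # patch extraction
                nn.ReLU(inplace=True),
                nn.Conv2d(64, 32, kernel_size=1),            # non-linear mapping
                nn.ReLU(inplace=True),
                nn.Conv2d(32, 1, kernel_size=5, padding=2),  # reconstruction
            )

        def forward(self, x):
            # x: a bicubic-upscaled low-res image, shape (N, 1, H, W)
            return self.body(x)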


They downsampled by 2x; in other words, they just dropped half the pixel data. They then fed that reduced data to the algorithms. The full data was only used so they would have something to compare the output to.

How is that any different from just not capturing half the picture data? I don't see how it would be. You do realize a digital camera is just an array of sensors, right? What happens if your camera has half as many sensors? The same thing as what they did: you have half as many pixels.


Is it possible to differentiate a downsampled image from an image captured natively at a given resolution? My gut tells me no, but this certainly isn't my field.


Not my field either. If a picture is just a big 2D array, then I don't think so: you're deleting bits, and information is lost.

If the picture is a collection of summed sine waves, maybe. If the bigger picture is just sampled more frequently, then maybe you could tell by looking at the encoding: the smaller resolution will have sampling problems, losing higher-frequency data because it isn't sampled often enough.

I dunno. I can see the OP's point; maybe there are artifacts introduced by scaling down. Still, regardless of the mechanism, information theory tells us downsampling is lossy: information is lost, and the NN needs to make something up to fill in the blanks. Looks better than bicubic to me!
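
To make the sampling point concrete, here's a toy NumPy example (1D rather than an image, with made-up numbers): a 400 Hz tone sampled at 1000 Hz shows up where it should in the spectrum, but after throwing away every other sample it folds back to 100 Hz, and nothing in the remaining data says which frequency it originally was.

    import numpy as np

    t = np.arange(0, 1, 1/1000)           # 1000 samples/s over 1 second
    signal = np.sin(2 * np.pi * 400 * t)  # a 400 Hz tone

    full = np.abs(np.fft.rfft(signal))       # Nyquist limit: 500 Hz
    half = np.abs(np.fft.rfft(signal[::2]))  # every other sample; Nyquist: 250 Hz

    print(np.argmax(full))  # 400 -- the true frequency
    print(np.argmax(half))  # 100 -- aliased: |400 - 500| = 100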


What would have made this article more interesting is examples of different kinds of images being scaled.


There are several examples in the article that show original, bicubic and DeCNN upscaling side by side for comparison.


They aren't exactly super convincing; I wouldn't be surprised if something like super2xsai + GS4xHqFilter beat those examples :/


waifu2x is based on the same basic principle as this, and it beats the pants off those algorithms for the kind of images it's best at (waifu2x was designed for, and therefore trained on, anime/manga images).


FWIW, waifu2x was inspired by this paper from researchers at the Chinese University of Hong Kong and Microsoft Research Asia: http://arxiv.org/abs/1501.00092v3

waifu2x is a great demonstration of this approach applied to a specific domain.

By coincidence, Flipboard's DNN approach was developed around the same time as the MSRA research in summer 2014.

I'm excited to see future research in applying deep learning to generative tasks. Some of the CNN music composition work is quite impressive.



