
Pixel Recursive Super Resolution - somerandomness
https://arxiv.org/abs/1702.00783
======
throwaway287391
I hate to be a debbie downer but the results don't look particularly great to
me. For comparison, see "AffGAN" [1] and "LAPGAN" [2] for better (IMO) super-
resolution results using GAN-like techniques. Granted, all three papers are
applied in somewhat different settings (different input/output resolutions,
different datasets), so direct comparison is difficult.

[1]
[https://openreview.net/pdf?id=S1RP6GLle](https://openreview.net/pdf?id=S1RP6GLle)

[2] [https://arxiv.org/abs/1506.05751](https://arxiv.org/abs/1506.05751)

~~~
ygra
Does research always have to yield a result that's better than any other
approach? In my understanding it's worthwhile to pursue different ways of
approaching a problem even if some of them don't work as well as others (but
you don't know that beforehand).

~~~
aHeng
Basically, this is a survival law in the academic jungle, depressing as it is.

------
web007
It's really cool to see a writeup for this on real-life images. Similar work
at Flipboard
[http://engineering.flipboard.com/2015/05/scaling-convnets/](http://engineering.flipboard.com/2015/05/scaling-convnets/), for
Anime scaling
[https://github.com/nagadomi/waifu2x](https://github.com/nagadomi/waifu2x) and
for sprite scaling (can't find the reference I'm thinking of).

This tech has been around for several years, and some variation was presented
in concert with the Boston Marathon investigation.
[https://arstechnica.com/information-
technology/2013/05/hallu...](https://arstechnica.com/information-
technology/2013/05/hallucinating-a-face-new-software-could-have-idd-boston-
bomber/)

(Not clear if this was _used_ as part of the investigation, or if it _could be
used_ for future investigations)

~~~
mortenjorck
Kind of alarming to see this being proposed as an investigative tool. Isn't
the entire point of this area of image processing that the network is creating
_plausible_ information where there _is none?_ That's great for creating
higher-resolution versions of entertainment assets, but it would seem
categorically inappropriate for forensic science.

~~~
Geee
However, it is possible to extract high-resolution images from multiple frames
of low-resolution video.

~~~
dreamcompiler
Yep. Happens because the phase of the sampling grid changes on different
frames, which effectively increases resolution.
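
To make the sampling-grid point concrete, here is a minimal sketch of the idea (often called shift-and-add). It is an idealized toy, not what any real forensic or video tool does: it assumes a static scene, pure point sampling with no blur or noise, and sub-pixel shifts that are known exactly. The names `capture_frame` and `shift_and_add` are made up for illustration.

```python
def capture_frame(scene, factor, dy, dx):
    # Hypothetical sensor model: keep every `factor`-th pixel of the scene,
    # with the sampling grid shifted by (dy, dx) high-res pixels.
    return [row[dx::factor] for row in scene[dy::factor]]


def shift_and_add(frames, shifts, factor, height, width):
    # Put each low-res sample back at the high-res position it came from,
    # averaging where several frames landed on the same position.
    total = [[0.0] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for frame, (dy, dx) in zip(frames, shifts):
        for i, row in enumerate(frame):
            for j, value in enumerate(row):
                y, x = dy + i * factor, dx + j * factor
                total[y][x] += value
                count[y][x] += 1
    # Positions that no frame covered stay None (unknown) rather than guessed.
    return [
        [total[y][x] / count[y][x] if count[y][x] else None for x in range(width)]
        for y in range(height)
    ]


# An 8x8 "scene" observed as four 4x4 frames, one per grid phase.
scene = [[(3 * y + 5 * x) % 7 for x in range(8)] for y in range(8)]
factor = 2
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
frames = [capture_frame(scene, factor, dy, dx) for dy, dx in shifts]
recovered = shift_and_add(frames, shifts, factor, 8, 8)
```

With all four phases the 8x8 scene comes back exactly from 4x4 frames, and with fewer phases the uncovered pixels stay unknown. That is the difference from single-image methods: the extra detail is measured, not synthesized.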

------
somerandomness
Interesting tidbit: First author is Ryan Dahl, creator of Node.js, now a
Google Brain Resident.

~~~
robinduckett
I caught that. I was wondering what ry was up to these days!

------
dbcooper
Another recent "super resolution" method (RAISR) from Google Research:

[https://arxiv.org/abs/1606.01299](https://arxiv.org/abs/1606.01299)

[https://research.googleblog.com/2016/11/enhance-raisr-
sharp-...](https://research.googleblog.com/2016/11/enhance-raisr-sharp-images-
with-machine.html)

>Given an image, we wish to produce an image of larger size with significantly
more pixels and higher image quality. This is generally known as the Single
Image Super-Resolution (SISR) problem. The idea is that with sufficient
training data (corresponding pairs of low and high resolution images) we can
learn set of filters (i.e. a mapping) that when applied to given image that is
not in the training set, will produce a higher resolution version of it, where
the learning is preferably low complexity. In our proposed approach, the run-
time is more than one to two orders of magnitude faster than the best
competing methods currently available, while producing results comparable or
better than state-of-the-art.

>A closely related topic is image sharpening and contrast enhancement, i.e.,
improving the visual quality of a blurry image by amplifying the underlying
details (a wide range of frequencies). Our approach additionally includes an
extremely efficient way to produce an image that is significantly sharper than
the input blurry one, without introducing artifacts such as halos and noise
amplification. We illustrate how this effective sharpening algorithm, in
addition to being of independent interest, can be used as a pre-processing
step to induce the learning of more effective upscaling filters with built-in
sharpening and contrast enhancement effect.

~~~
dbcooper
BTW, if you're interested in using advanced scaling methods in GPU-accelerated
video playback, check out the madVR and MPDN projects:

[http://forum.doom9.org/showthread.php?t=146228](http://forum.doom9.org/showthread.php?t=146228)

[http://forum.doom9.org/showthread.php?t=171120](http://forum.doom9.org/showthread.php?t=171120)

------
B0073D
Forgive me if I've missed something here, but these were only trained against
synthetic images (images that were scaled down using various formulas). Due to
this, I'd expect this to not work as well as it could on actual images taken
by sensors.
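
For anyone unsure what "synthetic" means here, a rough sketch, assuming a plain box-average downscale (I am not claiming this is the resize method the paper used, and `box_downscale` is a made-up helper name):

```python
def box_downscale(image, factor):
    # Average each factor x factor block into one low-res pixel.
    # Nothing here models a real sensor: no optics blur, noise, or demosaicing.
    height, width = len(image) // factor, len(image[0]) // factor
    return [
        [
            sum(
                image[y * factor + i][x * factor + j]
                for i in range(factor)
                for j in range(factor)
            ) / factor ** 2
            for x in range(width)
        ]
        for y in range(height)
    ]


# A tiny 4x4 "photo" and the 2x2 input synthesized from it.
high_res = [[float((x + y) % 4) for x in range(4)] for y in range(4)]
low_res = box_downscale(high_res, 2)
training_pair = (low_res, high_res)  # (network input, target)
```

The network only ever sees inputs produced by this kind of clean formula, which is why I'd expect a gap on real low-resolution captures.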

Do any datasets even exist where the images are at sensor pixel level?

That way the model would 'know' about imaging effects (I can't think of any
specifically mechanical effects that could be in play here right this second)
etc?

Or am I way off base here....

~~~
petters
No, I think you are correct. I think the result for the CelebA dataset is a
toy. But many results in this area are toys, e.g. deep dream.

------
bryogenic
Zoom! Enhance!

[https://youtu.be/LhF_56SxrGk](https://youtu.be/LhF_56SxrGk)

(sorry couldn't help myself)

~~~
glacials
Super enhance!

[http://www.dailymotion.com/video/x2qlmuy](http://www.dailymotion.com/video/x2qlmuy)

------
Pica_soO
I wonder, could you craft a shader for tree-foliage from this?

Given the background and the leaf texture + alpha, instead of rasterizing,
anti-aliasing and then using z-baked light sources and probe reflections to
light it semi-correctly, what would a neural net implementation look like?

Would you even notice the mistakes in a constant flickering scenery like this?

------
drcode
There seems to be an error on figure 7 in the third row: The face image for
"ground truth" is a duplicate of the "Ours" result.

~~~
anigbrowl
Specific observations like this are best forwarded to the authors, who are
unlikely to see your (entirely valid) observation here.

------
amelius
Any actual implementations of this (or other superresolution algorithms) to
play with?

~~~
contravariant
There's the waifu2x project [1] which also uses neural networks for super
resolution. There's also the MPDN extensions project [2] which has various
kinds of image scaling methods. It depends what kind of algorithms you want to
play with really.

[1]:
[https://github.com/nagadomi/waifu2x](https://github.com/nagadomi/waifu2x)

[2]:
[https://github.com/zachsaw/MPDN_Extensions](https://github.com/zachsaw/MPDN_Extensions)

------
anigbrowl
Amazing, but also a bit scary. This will surely be used for retroactive
identification from existing photographs, and will be a free gift for
authoritarian law enforcement. Of course such extrapolative technologies are
subject to challenge, but criminal juries have a tendency to accept forensic
claims at face value notwithstanding their actual scientific reliability. It's
partly because of this that if I ever found myself on trial for a crime I
didn't commit I'd probably waive my right to a jury trial - laypeople are far
too easily fooled.

~~~
dharma1
The results are synthesised/generated - there is no way to use this for face
recognition from low res images because the result, while plausible, is not
real.

~~~
anigbrowl
So what? That's never stopped people before. If it's good enough to be useful,
it will be used. That's how things are in the real world.

[http://www.livescience.com/49929-faulty-forensic-science-
fai...](http://www.livescience.com/49929-faulty-forensic-science-failing-
united-states-court-system.html)

[https://ncforensics.wordpress.com/2013/03/04/thousands-of-
ca...](https://ncforensics.wordpress.com/2013/03/04/thousands-of-cases-
compromised-due-to-faulty-forensic-analysis/)

------
pmoriarty
I wonder if something like this could be used to boost the fidelity of radio
signals that are picked up by SDR (software defined radio).

Here are some manual techniques people currently use to hunt signals on
SDR.[1] A lot of what they do is visual, and enhanced visual fidelity of
potential signals would definitely be a big help, if it worked.

[1] -
[https://www.youtube.com/watch?v=9fXnwkK2kQI](https://www.youtube.com/watch?v=9fXnwkK2kQI)

------
ivemadeahugem
So far I think this has only been optimized for anime girls
[http://waifu2x.udp.jp/](http://waifu2x.udp.jp/)

~~~
NTripleOne
It works surprisingly well for non-anime-styled stuff too. Just about any 2D
art with decently-defined edges upscales beautifully.

------
deepnotderp
They don't have a comparison to GANs which is weird.

------
tunnuz
Imagine how cool it would be if this were implemented in JS and used in
websites to increase the resolution of low-quality pictures.

------
aHeng
Technically, the work should be compared with the well-known GAN SR work
it cites as [18] to show its power.

------
VMG
The celeb sets look like nightmare fuel.

------
dharma1
nice. wish pixelCNN/wavenet wasn't so computationally heavy to train and run

~~~
zardo
We'll have an Intel vs NVidia arms race kicking off this year. And... There
are probably some major algorithmic speedups on the table still.

~~~
ReverseCold
Nvidia v AMD, Nvidia will win

Intel doesn't make GPUs.

~~~
zardo
No, but they intend to go head to head on deep learning performance.
[https://newsroom.intel.com/news-releases/intel-ai-day-
news-r...](https://newsroom.intel.com/news-releases/intel-ai-day-news-
release/)

------
whatnotests
Zoom. Enhance.

