
Fast-SRGAN: Deep learning model to convert low res pictures to high res - ivymike3257
https://github.com/HasnainRaz/Fast-SRGAN
======
anjc
I looked through the GAN paper that this idea uses, and it seems that they
generate training data by downsampling high res images. So you can presumably
create large amounts of high quality training data for problems like this.

Do such methods introduce any problems in terms of e.g. overfitting (say,
overfitting to the method that you use to degrade images)? Also, do fields
tackling ML problems such as this where you can generate perfect training data
tend to have faster improvements over problems where you can't (i.e. where you
need to manually label training data, or where the target is ambiguous like
sentiment analysis)?
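For context, a rough sketch of that pair-generation step (the SRGAN paper uses bicubic downsampling; simple average pooling is used here as a dependency-free stand-in, and the sizes are just illustrative):

```python
import numpy as np

def make_lr_hr_pair(hr, scale=4):
    """Degrade a high-res image to create a (low-res, high-res) training pair.

    Average pooling stands in for the paper's bicubic downsampling so this
    runs with NumPy alone.
    """
    h, w, c = hr.shape
    # Crop so both dimensions divide evenly by the scale factor.
    h, w = h - h % scale, w - w % scale
    hr = hr[:h, :w]
    # Average each scale x scale block into one low-res pixel.
    lr = hr.reshape(h // scale, scale, w // scale, scale, c).mean(axis=(1, 3))
    return lr.astype(hr.dtype), hr

hr = np.random.rand(96, 96, 3).astype(np.float32)
lr, hr = make_lr_hr_pair(hr)
# lr.shape == (24, 24, 3), hr.shape == (96, 96, 3)
```

The point is that the high-res original doubles as a free, perfect label for its own degraded copy.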

~~~
ivymike3257
Hey thanks for the question.

Regarding your first question: that was exactly what I wondered, and I found
out that they don't necessarily overfit to the type of degradation (the repo
shows a section called "extreme super resolution" where the input is already
high res, and the output shows no artifacts). But they do tend to overfit to
the domain of images. So you would probably observe artifacts running this on
pictures of...circuit boards? Such artifacts are also partly a function of
batch norm, as some of the literature points out, and I tried to mitigate this
by leaving batch norm layers out of the end of the network. I always welcome
people testing it out and reporting any problems they observe though.

For the second question: I would say definitely. You don't need to spend a lot
of time worrying about data quality, or even annotation in this case. Also,
the ground truth the model sees is real and noise-free, so it may be able to
learn features that are more representative of the data distribution. I'm sure
there's an argument to be made that models trained on noisy data generalize
better, but I personally think cleaner data is better data, especially for a
GAN, where the goal is to learn the distribution of your input space. Noise
would only hinder that.

------
faeyanpiraat
Is there something similar which works with video?

~~~
ivymike3257
That's the goal of this repository, but the code base doesn't support video
atm.

You could theoretically use OpenCV to extract video frames and upsample them
one by one through this model in real time (384 -> 1024 upsampling runs at
30fps).
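A minimal sketch of that per-frame loop, with a nearest-neighbour repeat as a placeholder for the actual generator (in practice OpenCV's `cv2.VideoCapture` and `cv2.VideoWriter` would handle the frame I/O; plain arrays stand in here):

```python
import numpy as np

def upscale_frame(frame, scale=4):
    # Placeholder for the SRGAN generator: nearest-neighbour upsampling
    # by repeating each pixel `scale` times along both spatial axes.
    return frame.repeat(scale, axis=0).repeat(scale, axis=1)

def upscale_video(frames, scale=4):
    # Process frames one by one, as the parent comment describes; with
    # OpenCV, `frames` would come from cv2.VideoCapture.read() in a loop.
    for frame in frames:
        yield upscale_frame(frame, scale)

frames = [np.zeros((96, 96, 3), dtype=np.uint8) for _ in range(3)]
out = list(upscale_video(frames))
# each output frame: (384, 384, 3)
```

Dropping the real model into `upscale_frame` is the only change needed to turn this into the pipeline described above.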

