
SinGAN: Learning a Generative Model from a Single Natural Image - groar
https://arxiv.org/abs/1905.01164
======
jeeceebees
I think the approach is really cool but the processing time required is too
much for this to be very useful at the moment.

On a 1080 Ti it takes 45-90 minutes to train networks for the various tasks on
256px images (depending on some quality parameters and which task). Each task
also requires training individually, so if you'd like to try them all for a
given image you'll need to train six times.

Also, the pyramid-of-GANs approach is very memory hungry. I was only able to
get up to 724px images with 11 GB of VRAM, and even that was only possible
with a higher scale factor (a sparser pyramid), which sacrifices a lot of
quality and is incredibly noticeable at larger image sizes. I only tried
larger sizes with the animation task, though; perhaps there is a way to
combine the super-resolution and animation tasks and achieve better results.
Training at larger sizes was taking upwards of 6-8 hours.

All of this was tested with the official repo[1] about a month ago.

[1] [https://github.com/tamarott/SinGAN](https://github.com/tamarott/SinGAN)
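The memory pressure follows from the multi-scale design: each pyramid level carries its own generator/discriminator pair and activations at that resolution, so the number of levels drives VRAM use. A rough sketch of the level count, treating the scale factor as the ratio between successive scales and the ~25px coarsest size as an assumption (not the repo's exact defaults):

```python
import math

def pyramid_levels(full_size, min_size=25, scale_factor=1.33):
    """Rough count of GAN scales needed to grow min_size up to full_size.

    min_size (~25px coarsest scale) and scale_factor (ratio between
    successive scales) are illustrative assumptions, not the official
    repo's exact defaults.
    """
    # Each level is scale_factor times larger than the previous one.
    return math.ceil(math.log(full_size / min_size, scale_factor)) + 1

# A sparser pyramid (larger ratio between scales) needs fewer levels,
# hence fewer networks held in VRAM at once -- at a cost in quality.
for size in (256, 724):
    for ratio in (1.33, 1.67):
        print(f"{size}px, scale factor {ratio}: "
              f"{pyramid_levels(size, scale_factor=ratio)} levels")
```

Under these assumptions, going from 256px to 724px adds several extra levels unless you thin the pyramid out, which matches the quality/memory trade-off described above.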

~~~
spunker540
Could you put that in context in terms of $$? How much does it cost in
aws/gcp/azure to run a 1080 ti for 90 minutes?

Would you say that could be a downside of the single image approach? Rather
than feeding images into a generalized model you’re training a whole model per
image which is costly to scale?

~~~
marcyb5st
You can't run NVIDIA consumer cards in datacenters; you need to use the
expensive datacenter versions. However, for a one-off 90-minute run they
still come cheap (like
[https://cloud.google.com/products/calculator/#id=cd604b5c-76...](https://cloud.google.com/products/calculator/#id=cd604b5c-7644-48f7-a52c-afa3b2898b85)
)
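Back-of-the-envelope, the per-image cost is just the hourly GPU rate times the run length. The rates below are illustrative assumptions only (cloud prices change; check the linked calculator for real numbers):

```python
def run_cost(hourly_rate_usd, minutes):
    """Cost in USD of a single training run at a given hourly GPU rate."""
    return hourly_rate_usd * minutes / 60

# Illustrative rates only -- e.g. a cheap preemptible GPU vs a pricier
# on-demand one; not quotes from any provider's current price list.
for rate in (0.35, 2.48):
    print(f"${run_cost(rate, 90):.2f} for one 90-minute run at ${rate}/hr")
```

So a single run is cheap in absolute terms; the scaling concern raised above is that the cost is per image, since every new image means training a new model from scratch.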

------
maffydub
I haven't read the paper in detail yet, but it reminds me of Deep Image Prior
([https://sites.skoltech.ru/app/data/uploads/sites/25/2018/04/...](https://sites.skoltech.ru/app/data/uploads/sites/25/2018/04/deep_image_prior.pdf)).

~~~
xenonite
The authors of the paper were actually aware of Deep Image Prior and even
compare against it. SinGAN is apparently clearly superior to the Deep Image
Prior (DIP).

------
kidintech
This looks fantastic and gets me excited about where the field is going,
despite the performance issues.

------
ticktockten
Having spent some time trying to do style transfer, I find this very
promising.

The harmonization aspect of the paper actually makes it very useful. There
certainly are cases where you want to introduce an image component as an
overlay and have its style integrate with the rest of the image.

Really cool stuff, and with code!

------
sharemywin
Were there any images with people?

~~~
alleycat5000
Check out

[https://github.com/tamarott/SinGAN/tree/master/Downloads](https://github.com/tamarott/SinGAN/tree/master/Downloads)

They ran it on the Berkeley Segmentation Dataset; the human faces came out a
little interesting...

