
Progressive Growing of GANs for Improved Quality, Stability, Variation [video] - visarga
https://www.youtube.com/watch?time_continue=1&v=XOxxPcy5Gr4
======
duskwuff
I love how using training data from the Internet has resulted in the GANs
believing that a picture of a "cat" often contains text -- or, at least,
shapes resembling text -- at the top and bottom of the image. (Visible on the
right side of the screen starting around 4:20.)

~~~
pault
The interpolating animals look so much like my dreams it's unsettling. I often
wake up when things get too "unrealistic" and people or animals are changing
shape too rapidly. Whenever I see something come out of a NN that looks like
something that came out of my brain I get a little future shock.

~~~
platz
Did you see the horror faces it generated for humans? Good for Halloween!

------
a1371
The progress in this field is astonishing. Only a few years ago, the typical
demonstration in a GAN paper was an array of very small images; I recall often
seeing resolutions between 64x64 and 256x256. At the time, the argument was
raised that the resolution could hardly be increased, because it is our brain
that tries to identify objects and faces in the photos to figure out what they
are representing.

Then here we are, with indistinguishable 1024x1024 recreations and trippy
latent space interpolations. I know not every researcher or entrepreneur has
the resources of NVIDIA to train for this many days, but let's not forget,
that part needs to occur only once. It makes me wonder about the day that a
GAN manages to bankrupt stock photography services.
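
The "trippy latent space interpolations" mentioned above come from walking between points in the generator's input space. A minimal sketch of how such a walk might be computed, using spherical interpolation (slerp), which is a common choice for Gaussian latents; the generator itself and the latent dimension of 512 are assumptions here, not part of the comment:

```python
import numpy as np

def slerp(z0, z1, t):
    """Spherical interpolation between two latent vectors at fraction t."""
    z0n = z0 / np.linalg.norm(z0)
    z1n = z1 / np.linalg.norm(z1)
    omega = np.arccos(np.clip(np.dot(z0n, z1n), -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return (1 - t) * z0 + t * z1  # vectors nearly parallel: fall back to lerp
    return (np.sin((1 - t) * omega) * z0 + np.sin(t * omega) * z1) / np.sin(omega)

# Walk between two random latents; in a real setup each step would be
# fed to a trained generator to render one interpolated frame.
rng = np.random.default_rng(0)
z_a = rng.standard_normal(512)
z_b = rng.standard_normal(512)
path = [slerp(z_a, z_b, t) for t in np.linspace(0.0, 1.0, 8)]
```

Slerp is preferred over straight linear interpolation because midpoints of a lerp have smaller norm than typical samples from a high-dimensional Gaussian, which tends to produce off-distribution frames.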

~~~
rjtavares
> I know not every researcher or entrepreneur has the resources of NVIDIA to
> train for this many days, but let's not forget, that part needs to occur
> only once.

It's not like they trained this on a GPU farm. According to the paper [1],
they "trained the network on a single NVIDIA Tesla P100 GPU for 20 days".

[1] [http://research.nvidia.com/sites/default/files/pubs/2017-10_...](http://research.nvidia.com/sites/default/files/pubs/2017-10_Progressive-Growing-of//karras2017gan-paper.pdf)

~~~
a1371
Where I live that thing alone is $17,000.

~~~
bitL
The P100 is essentially a 1080 Ti, so you can grab a 1080 Ti and get the same
speed. The V100 might be the expensive version, up to 10x faster.

~~~
jamesblonde
You should look at GPU memory bandwidth as a proxy for performance when
training DNNs. The P100 is about 40% faster than a 1080Ti. The V100 is only
about 75% faster than a 1080Ti.

Based on this, I expect these commodity GPU servers (with 10 1080Ti cards)
that cost 1/10th of the DGX-1 will be huge:
[https://www.servethehome.com/deeplearning11-10x-nvidia-gtx-1...](https://www.servethehome.com/deeplearning11-10x-nvidia-gtx-1080-ti-single-root-deep-learning-server-part-1/)

------
argonaut
Worth pointing out the main criticism of GANs, which is that right now
researchers don't really have a way to tell whether a GAN is just copying and
pasting the training data (there is no "test set", unlike in supervised
learning). And in fact an _ideal_ GAN could just learn to output the training
set. One example someone found in the generated images for this model:
[https://twitter.com/nalkalchbrenner/status/92401333254951321...](https://twitter.com/nalkalchbrenner/status/924013332549513217).
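
One crude way people probe for this kind of memorization is a nearest-neighbor search: for each generated sample, find the closest training image in pixel space and eyeball whether it is essentially a copy. A small sketch under that assumption (the toy shapes and the 0.001 noise level are illustrative, not from any real dataset); pixel-space L2 is a weak proxy, which is part of why evaluating GANs is hard:

```python
import numpy as np

def nearest_training_image(generated, training_set):
    """Return (index, L2 distance) of the training image closest to a sample.

    A tiny distance suggests the generator may have memorized that image.
    """
    diffs = training_set - generated  # broadcasts over the whole set
    dists = np.sqrt((diffs ** 2).reshape(len(training_set), -1).sum(axis=1))
    idx = int(np.argmin(dists))
    return idx, float(dists[idx])

# Toy arrays standing in for real images (hypothetical 8x8 RGB shapes).
rng = np.random.default_rng(1)
train = rng.random((100, 8, 8, 3))
sample = train[42] + 0.001 * rng.standard_normal((8, 8, 3))  # near-copy of image 42
idx, dist = nearest_training_image(sample, train)
```

In practice such searches are often done in a feature space (e.g. embeddings from a pretrained classifier) rather than raw pixels, since pixel distance misses near-duplicates that are merely shifted or recolored.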

~~~
317070
That is true, but if you look at them not as a machine learning method but
rather as a computer graphics method, then it is quite impressive. It has the
added benefit of being allowed to overfit, as long as the average human does
not notice. If you optimize for psychovisual metrics, GANs are fine.

~~~
mannigfaltig
Actually, GANs reach state of the art in anomaly/outlier detection and
drug/molecule prediction, so there is certainly more to it than just artistic
applications:

[https://openreview.net/forum?id=S1EfylZ0Z](https://openreview.net/forum?id=S1EfylZ0Z)

[https://www.ncbi.nlm.nih.gov/pubmed/28703000](https://www.ncbi.nlm.nih.gov/pubmed/28703000)

[http://pubs.acs.org/doi/abs/10.1021/acs.molpharmaceut.7b0034...](http://pubs.acs.org/doi/abs/10.1021/acs.molpharmaceut.7b00346)

------
asciimo
Looking at synthetic celebrities is unsettling. They're all familiar. They're
obviously celebrities, but it's impossible to remember their names.

~~~
kpil
Exactly like the actual celebrities, then... I think I recognized one of them.

------
visarga
Generated images start at 0:43 in the video. Link to paper here:

[http://research.nvidia.com/sites/default/files/pubs/2017-10_...](http://research.nvidia.com/sites/default/files/pubs/2017-10_Progressive-Growing-of//karras2017gan-paper.pdf)

Source: NVIDIA

------
heroprotagonist
Well, I guess the propaganda bots will be a little more convincing now that
they can have unique profile photos.

Would be nice if I could use this to convince Facebook that some fictional
image is myself, though.

~~~
kpil
Oh, damn. I've already had enough of fake personas, but now there is virtually
no limit.

------
v64
Holy crap that video. It makes me think of the scramble suit from A Scanner
Darkly[1]

[1]:
[https://www.youtube.com/watch?v=BWne23FfKW8](https://www.youtube.com/watch?v=BWne23FfKW8)

~~~
subcosmos
Had the same thought! Imagine how much time they could have saved drawing it
all!

------
yeldarb
I’ve seen this going around twitter and the video looks cool but I don’t
really understand what’s going on.

Can someone explain it in layman’s terms?

~~~
partycoder
[https://www.youtube.com/watch?v=Sw9r8CL98N0](https://www.youtube.com/watch?v=Sw9r8CL98N0)

~~~
mattss
Wow. Watching this guy is painful

~~~
edanm
Why?

------
olalonde
I wonder if something like this could be combined with 3D animation to produce
super realistic computer-animated films.

~~~
IshKebab
Nice idea.

------
davesque
Some of the generated celebrity images seem pretty much indistinguishable from
reality. Does reality exist any more? In the future, it may get harder and
harder to know.

------
avaer
Wonder how long until this is co-opted by the porn industry and what the law
will have to say about it.

Is it illegal to own a digital brain that can think up illegal porn?

~~~
IshKebab
I would say that it probably is. If you trained a network on illegal data
(e.g. cp images), then not only did you have to possess that data at some
point, which is of course illegal, but the data itself is at least partially
encoded in the network weights, which I think should make the network illegal
too.

~~~
yorwba
I guess you might be able to get around this if you train only on legal
content and can interpolate into content that would be illegal if it were a
real recording.

However, I'm not sure whether there are any other applications for this
specific interpolation scenario that would lead to it being developed, as the
effort required to make it work is likely much higher.

~~~
letlambda
Having the model produce realistic interpolations through areas of the latent
space that had no associated training data is surely something that people
will be trying to make happen.

------
stablemap
The authors have a GitHub repo containing code and links to the paper and more
outputs:

[https://github.com/tkarras/progressive_growing_of_gans](https://github.com/tkarras/progressive_growing_of_gans)

~~~
rayuela
Written in Theano, of course! R.I.P.

------
eltoozero
A GAN is a Generative Adversarial Network[0], the video is like animated Deep
Dream[1] stuff but way more refined.

I don't like the horizontal sliding transition because I'm way too focused on
the bizarre iterations of the various targets.

Gonna have to update our camouflage patterns again to combat computer
vision...

[0]:
[https://en.m.wikipedia.org/wiki/Generative_adversarial_netwo...](https://en.m.wikipedia.org/wiki/Generative_adversarial_network)

[1]:
[https://en.m.wikipedia.org/wiki/DeepDream](https://en.m.wikipedia.org/wiki/DeepDream)

------
aleju
For people interested in the details, I wrote a small summary of the paper:
[https://github.com/aleju/papers/blob/master/neural-nets/Prog...](https://github.com/aleju/papers/blob/master/neural-nets/Progressive_Growing_of_GANs.md)
(Prior knowledge of GANs is recommended; otherwise it's probably hard to
understand.)

------
NTDF9
This+VR = porn industry dream (also propaganda industry dream)

------
geraltofrivia
Meanwhile, my Neural translation models don't converge. Sigh.

------
goombastic
Interesting. In another 300-500 years, I am pretty sure we will start
simulating sensory experience and ultimately the past. I am not sure if I am a
toy simulation of the past from the future, right now.

~~~
has2k1
> In another 300-500 years

instead of "In the future" triggers me. Do you know something we do not? I am
resigned to the notion that everything is up for grabs.

> I am not sure if I am a toy simulation

We need "reality discriminants". The trippy thing is that they could exist and
yet their output is not necessarily boolean. There would be a threshold point
at which beings exist along the _simulated-real_ spectrum, whereby they can
understand the output of the "reality discriminants", yet they are not real.

~~~
subcosmos
The way these GAN algorithms work is precisely by building a network that
discriminates real from fake. That's why they become so good at it!
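
The adversarial setup can be sketched as two opposing losses: a discriminator pushed to score real samples high and fakes low, and a generator pushed to make its fakes score as real. A minimal numpy sketch of the non-saturating variant from the original GAN formulation; the score values at the end are illustrative, standing in for the outputs of real networks:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def discriminator_loss(real_scores, fake_scores):
    """Binary cross-entropy: push real scores toward 1, fake scores toward 0."""
    return -(np.mean(np.log(sigmoid(real_scores))) +
             np.mean(np.log(1.0 - sigmoid(fake_scores))))

def generator_loss(fake_scores):
    """Non-saturating generator loss: make fakes score as real."""
    return -np.mean(np.log(sigmoid(fake_scores)))

# If the discriminator confidently separates real from fake...
real = np.array([3.0, 2.5])    # high scores on real images
fake = np.array([-3.0, -2.5])  # low scores on generated images
d_loss = discriminator_loss(real, fake)  # small: D is winning
g_loss = generator_loss(fake)            # large: G has a strong gradient to improve
```

Training alternates gradient steps on the two losses, and it is exactly this arms race that forces the discriminator to become a good real-vs-fake detector.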

