
Researchers spent several years just trying to get a GAN to fit a distribution of aligned faces well (which resulted in StyleGAN1/2); with a simple UNet, an epsilon objective, and a cosine schedule you can fit far more complex distributions, still using a single loss: L1/L2.
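For concreteness, the whole training step I mean is roughly this (a rough sketch, assuming a PyTorch-style UNet `model(x_t, t)` that predicts the added noise; the cosine alphas follow the usual Nichol & Dhariwal form):

    import math
    import torch
    import torch.nn.functional as F

    T = 1000  # number of diffusion steps

    def cosine_alpha_bar(t, s=0.008):
        # cumulative signal level at step t (cosine schedule)
        return math.cos((t / T + s) / (1 + s) * math.pi / 2) ** 2

    alpha_bar = torch.tensor([cosine_alpha_bar(t) / cosine_alpha_bar(0) for t in range(T)])

    def training_step(model, x0):
        # x0: batch of clean images, shape (B, C, H, W)
        t = torch.randint(0, T, (x0.shape[0],), device=x0.device)
        eps = torch.randn_like(x0)                    # the noise we try to predict
        ab = alpha_bar.to(x0.device)[t].view(-1, 1, 1, 1)
        x_t = ab.sqrt() * x0 + (1 - ab).sqrt() * eps  # noised sample
        eps_pred = model(x_t, t)                      # epsilon objective: predict the noise
        return F.mse_loss(eps_pred, eps)              # one L2 loss, nothing else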

Reading your comments makes me feel like you believe every researcher (even extremely smart people like Karras) switched to diffusion models because they are idiots, that they should have kept focusing on GANs instead, and that today we would have GANs as powerful as (or more powerful than) current diffusion models that also work in one step; this is just a weird delusion. Diffusion models are simply much easier to train (just an L1/L2 loss in most cases) and to write (for example your buggy BigGAN implementation); they usually work out of the box on different resolutions and aspect ratios, and you can just finetune them if you want an inpainting model. As things stand, you also need far less compute to reach good image coherency, or perhaps a level of coherence that GAN models have never reached. I would be curious to see, even as a small-scale experiment, what a GAN (~55M parameters) could do after one or two GPU-days of training on the Icon645 dataset, because I can assure you my diffusion model is much better than I could have imagined, while being trivial to implement (I just implemented a UNet as I remembered it, nothing rigorous, and of course no architecture sweep).
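On the inpainting point, the usual recipe is just to widen the UNet's input and keep training with the same loss; a hedged sketch (the channel layout and `unet` are my own stand-ins mirroring common inpainting finetunes, not any specific released model; it reuses `T` and `alpha_bar` from the sketch above):

    def inpainting_step(unet, x0, mask):
        # x0: clean images (B, C, H, W); mask: (B, 1, H, W), 1 where pixels are missing
        t = torch.randint(0, T, (x0.shape[0],), device=x0.device)
        eps = torch.randn_like(x0)
        ab = alpha_bar.to(x0.device)[t].view(-1, 1, 1, 1)
        x_t = ab.sqrt() * x0 + (1 - ab).sqrt() * eps
        # condition on the known region: concatenate mask and masked image as extra channels
        cond = torch.cat([x_t, mask, x0 * (1 - mask)], dim=1)
        return F.mse_loss(unet(cond, t), eps)         # same loss as before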




There are so many holes in this, but to pick just one:

> for example your buggy BigGAN implementation

The bug was that the batchnorm gamma was initialized to zero instead of one, so the batchnorm outputs were being scaled by zero instead of one. Literally any model that uses batchnorm is susceptible to this bug, and it certainly has nothing to do with GANs.
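For anyone wondering what that looks like: batchnorm's learnable scale (gamma, `weight` in PyTorch) sits after the normalization, so initializing it to zero zeroes out the layer's output entirely. A tiny repro (assuming PyTorch, not the actual BigGAN code):

    import torch
    import torch.nn as nn

    bn = nn.BatchNorm2d(16)
    nn.init.zeros_(bn.weight)   # the bug: gamma = 0 instead of 1

    x = torch.randn(8, 16, 32, 32)
    print(bn(x).abs().max())    # prints 0: the layer outputs nothing
    # nn.init.ones_(bn.weight) is the intended default and restores the signal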

I think it’s too easy to spout dogma, and calling gwern delusional is something I’ve learned from experience will prove foolish as t approaches infinity.


It does have to do with GANs: the added complexity makes the model harder to debug. The other day a friend of mine had a problem with the normalization layers of his diffusion model, and it was fixed almost immediately; if something goes wrong, you have less to worry about.

Also, no, I don't take "he is gwern" as an argument, ahah.


Bookmarking this for when time proves you wrong in a couple of years. I'll drop by to say hello.



