
Researchers spent several years just trying to get a GAN to fit a distribution of aligned faces well (which resulted in StyleGAN1/2); with a simple UNet, an epsilon objective, and a cosine schedule you can fit far more complex distributions, still using a single loss: L1/L2.
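For concreteness, the whole training step I mean is roughly this (a rough sketch, assuming a PyTorch-style UNet `model(x_t, t)` that predicts the added noise; the cosine alphas follow the usual Nichol & Dhariwal form):

    import math
    import torch
    import torch.nn.functional as F

    T = 1000  # number of diffusion steps

    def cosine_alpha_bar(t, s=0.008):
        # cumulative signal level at step t (cosine schedule)
        return math.cos((t / T + s) / (1 + s) * math.pi / 2) ** 2

    alpha_bar = torch.tensor([cosine_alpha_bar(t) / cosine_alpha_bar(0) for t in range(T)])

    def training_step(model, x0):
        # x0: batch of clean images, shape (B, C, H, W)
        t = torch.randint(0, T, (x0.shape[0],), device=x0.device)
        eps = torch.randn_like(x0)                    # the noise we try to predict
        ab = alpha_bar.to(x0.device)[t].view(-1, 1, 1, 1)
        x_t = ab.sqrt() * x0 + (1 - ab).sqrt() * eps  # noised sample
        eps_pred = model(x_t, t)                      # epsilon objective: predict the noise
        return F.mse_loss(eps_pred, eps)              # one L2 loss, nothing else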

Reading your comments makes me feel like you believe every researcher (even extremely smart people like Karras) switched to diffusion models because they are idiots, that they should have kept focusing on GANs instead, and that today we would have GANs as powerful as (or more powerful than) current diffusion models that also work in one step; this is just a weird delusion. Diffusion models are simply much easier to train (just an L1/L2 loss in most cases) and to write (for example your buggy BigGAN implementation); they usually work out of the box on different resolutions and aspect ratios, and you can just finetune them if you want an inpainting model. As things stand, you also need far less compute to reach good image coherency, or perhaps a level of coherence that GAN models have never reached. I would be curious to see, even as a small-scale experiment, what a GAN (~55M parameters) could do after one or two GPU-days of training on the Icon645 dataset, because I can assure you my diffusion model is much better than I could have imagined, while being trivial to implement (I just implemented a UNet as I remembered it, nothing rigorous, and of course no architecture sweep).
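On the inpainting point, the usual recipe is just to widen the UNet's input and keep training with the same loss; a hedged sketch (the channel layout and `unet` are my own stand-ins mirroring common inpainting finetunes, not any specific released model; it reuses `T` and `alpha_bar` from the sketch above):

    def inpainting_step(unet, x0, mask):
        # x0: clean images (B, C, H, W); mask: (B, 1, H, W), 1 where pixels are missing
        t = torch.randint(0, T, (x0.shape[0],), device=x0.device)
        eps = torch.randn_like(x0)
        ab = alpha_bar.to(x0.device)[t].view(-1, 1, 1, 1)
        x_t = ab.sqrt() * x0 + (1 - ab).sqrt() * eps
        # condition on the known region: concatenate mask and masked image as extra channels
        cond = torch.cat([x_t, mask, x0 * (1 - mask)], dim=1)
        return F.mse_loss(unet(cond, t), eps)         # same loss as before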




There are so many holes in this, but to pick just one:

> for example your buggy BigGAN implementation

The bug was that the batchnorm gamma was initialized to zero instead of one, so the batchnorm outputs were being scaled by zero instead of one. Literally any model that uses batchnorm is susceptible to this bug, and it certainly has nothing to do with GANs.
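For anyone wondering what that looks like: batchnorm's learnable scale (gamma, `weight` in PyTorch) sits after the normalization, so initializing it to zero zeroes out the layer's output entirely. A tiny repro (assuming PyTorch, not the actual BigGAN code):

    import torch
    import torch.nn as nn

    bn = nn.BatchNorm2d(16)
    nn.init.zeros_(bn.weight)   # the bug: gamma = 0 instead of 1

    x = torch.randn(8, 16, 32, 32)
    print(bn(x).abs().max())    # prints 0: the layer outputs nothing
    # nn.init.ones_(bn.weight) is the intended default and restores the signal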

I think it’s too easy to spout dogma, and calling gwern delusional is something I’ve learned from experience will prove foolish as t approaches infinity.


It does have to do with GANs: the added complexity makes the model harder to debug. The other day a friend of mine had a problem with the normalization layers of his diffusion model, and it was fixed almost immediately; if something goes wrong, you have less to worry about.

Also, no, I don't take "he is gwern" as an argument, ahah.


Bookmarking this for when time proves you wrong in a couple of years. I'll drop by to say hello.



