
glid-3 is trained specifically on photographic-style images, and generalizes a bit better than the latent diffusion model (LDM).

e.g. prompt: half human half Eiffel tower. A human Eiffel tower hybrid (I mostly get normal Eiffel towers from LDM, but some sensible results from glid-3)

glid-3 will be worse for things that require detailed recall, like a specific person.

With smaller models you kind of have to generate a lot of samples and pick out the best ones.
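
A rough sketch of that "generate many, keep the best" loop, in case it helps. Nothing here is glid-3's actual API: `generate` and `score` are hypothetical callables you would supply yourself (for example, a sampler that takes a seed, and some image-text similarity score or just your own rating).

```python
import heapq
from typing import Callable, Iterable, List, Tuple, TypeVar

T = TypeVar("T")


def best_of_n(
    generate: Callable[[int], T],   # hypothetical: seed -> one sample
    score: Callable[[T], float],    # hypothetical: sample -> quality score
    seeds: Iterable[int],
    k: int,
) -> List[Tuple[float, int, T]]:
    """Generate one sample per seed and return the k highest-scoring ones.

    Each result is (score, seed, sample), sorted from best to worst, so a
    good sample can be regenerated later from its seed.
    """
    scored = []
    for seed in seeds:
        sample = generate(seed)
        scored.append((score(sample), seed, sample))
    # Rank on the score only, so samples themselves never need to be comparable.
    return heapq.nlargest(k, scored, key=lambda item: item[0])


if __name__ == "__main__":
    # Toy stand-ins: the "sample" is just the seed, and the score prefers
    # values close to 5. Swap in a real sampler and scorer.
    top = best_of_n(lambda s: s, lambda x: -abs(x - 5), range(10), k=3)
    for sc, seed, sample in top:
        print(seed, sc, sample)
```

Keeping the seed next to each sample is the main design choice: once you spot a good one, you can rerun that seed at higher quality settings instead of saving every output.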



