glid-3 is trained specifically on photographic-style images, and generalizes somewhat better than the latent diffusion model (LDM).
e.g. prompt: "half human half Eiffel tower. A human Eiffel tower hybrid" (I get mostly normal Eiffel towers from LDM but some sensible results from glid-3)
glid-3 will be worse for things that require detailed recall, like a specific person.
With smaller models you generally have to generate a lot of samples and pick out the best ones.
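The generate-many-and-pick workflow can be partly automated. Below is a minimal sketch, not part of glid-3 itself: `pick_best`, `generate`, and `score` are hypothetical names, and the caller is assumed to supply both callables (for example a wrapper around the sampling script and some image-text similarity score). Picking by eye works just as well; this only ranks the candidates first.

```python
from typing import Callable, List, Tuple, TypeVar

Sample = TypeVar("Sample")


def pick_best(
    prompt: str,
    generate: Callable[[str], Sample],      # hypothetical: returns one sample for the prompt
    score: Callable[[str, Sample], float],  # hypothetical: higher means a better match
    num_samples: int = 16,
    keep: int = 4,
) -> List[Tuple[float, Sample]]:
    """Generate num_samples candidates and return the top `keep` as (score, sample) pairs."""
    if keep > num_samples:
        raise ValueError("keep must not exceed num_samples")

    # Draw all candidates first; each call is assumed to use fresh randomness.
    samples = [generate(prompt) for _ in range(num_samples)]

    # Score every candidate, then sort from best to worst by score only.
    scored = [(score(prompt, sample), sample) for sample in samples]
    scored.sort(key=lambda pair: pair[0], reverse=True)

    return scored[:keep]
```

Calling `pick_best(prompt, generate, score, num_samples=64, keep=8)` would return the eight highest-scoring candidates for manual review. How useful the ranking is depends entirely on the scoring function you plug in.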