Wow, not only do they get this wrong, it’s the core example they use to demonstr...

Imnimo · on Jan 14, 2023

Yeah, and their next figure isn't any better. They show a latent space interpolation figure from the DDPM paper, and they seem to think this is how diffusion models produce a "collage" (as they describe the process). Of course, this figure has nothing to do with how image generation is actually performed. It's just an experiment in the paper, meant to demonstrate that the latent space is structured.

In fact, this only works because the source images are given as input to the forward process. The details being interpolated therefore come from the inputs, not from the model. If you look at Appendix Figure 9 from the same paper (https://arxiv.org/pdf/2006.11239.pdf), it is clear what's going on. You can only interpolate successfully when you take a smaller number of diffusing (q) steps. When you take a large number of diffusing steps (top row of Figure 9), all of the information from the input images is lost, and the "interpolations" are now just novel samples.
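To make that concrete, here is a rough NumPy sketch of the forward half of the interpolation experiment. It is not the paper's code: the function names are my own, plain arrays stand in for images, and the reverse (denoising) pass that decodes the mixed latent is left out. I'm assuming the linear noise schedule from the DDPM paper (1000 steps, beta from 1e-4 to 0.02). The point is just how much of each input survives t forward steps.

```python
import numpy as np

T = 1000  # total diffusion steps (assumed: DDPM linear schedule)
BETAS = np.linspace(1e-4, 0.02, T)
ALPHA_BAR = np.cumprod(1.0 - BETAS)  # ALPHA_BAR[t-1] = product of (1 - beta_s) for s <= t


def signal_scale(t):
    """Coefficient multiplying the input image after t forward (q) steps."""
    return float(np.sqrt(ALPHA_BAR[t - 1]))


def q_sample(x0, t, rng):
    """Sample x_t from q(x_t | x_0): scaled input plus Gaussian noise."""
    ab = ALPHA_BAR[t - 1]
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * noise


def interpolate_latents(x0_a, x0_b, t, lam, rng):
    """Noise both inputs for t steps, then mix the two latents linearly."""
    xa = q_sample(x0_a, t, rng)
    xb = q_sample(x0_b, t, rng)
    return (1.0 - lam) * xa + lam * xb


if __name__ == "__main__":
    # How much of the source image is still present in the latent?
    for t in (250, 500, 1000):
        print(t, signal_scale(t))
```

With few steps the latent is still mostly the input image, so whatever you decode from the mix inherits details from the two sources. At the full 1000 steps the coefficient on the input is close to zero, the latent is essentially pure noise, and anything decoded from it is a fresh sample.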

It's very hard for me to find a reason to include Figure 8 but not Figure 9 in their lawsuit that isn't either a complete lack of understanding, or intentional deception.