
Something seems fishy here. Take the example with the guy next to the robot figure: their model just happened to predict exactly the same type of figure?! Diffusion models are not omnipotent…



That's the entire point. It didn't "happen" to predict exactly the same type of figure. It used the context photos to know what type of figure it should render.

You might be getting confused because here the training process has to happen every time you use it, whereas most AI applications only run inference at use time.
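
Roughly, the per-use step is a handful of gradient updates on the reference photos before any sampling happens. A toy sketch of that idea (everything here is made up for illustration: real systems fine-tune a large pretrained denoiser with a proper noise schedule, DreamBooth-style, rather than this stand-in):

    import torch
    import torch.nn as nn

    class ToyDenoiser(nn.Module):
        # stands in for a pretrained diffusion denoiser
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 64))
        def forward(self, noisy, t):
            return self.net(noisy)  # predict the added noise (t ignored in this toy)

    model = ToyDenoiser()  # pretend this arrives pretrained
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)

    reference_images = torch.randn(8, 64)  # stand-in for the user's context photos

    # "Training on every use": a few adaptation steps on the reference set
    for step in range(100):
        noise = torch.randn_like(reference_images)
        t = torch.randint(0, 1000, (8,))
        noisy = reference_images + noise  # toy forward process, no real schedule
        loss = ((model(noisy, t) - noise) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

    # Only after this per-use adaptation does ordinary sampling/inference run.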


The model gets the reference images as "context", so it can just copy the robot from one of the other images.
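
If it's conditioning rather than per-use fine-tuning, a common mechanism is cross-attention over the reference images: the denoiser queries tokens from the context photos, so details like the robot can be copied rather than imagined. Hypothetical sketch (names, shapes, and dimensions all invented for illustration):

    import torch
    import torch.nn as nn

    class ContextConditionedDenoiser(nn.Module):
        def __init__(self, dim=64):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
            self.out = nn.Linear(dim, dim)
        def forward(self, noisy_tokens, context_tokens):
            # query: the image being denoised; key/value: the reference images
            attended, _ = self.attn(noisy_tokens, context_tokens, context_tokens)
            return self.out(attended)

    model = ContextConditionedDenoiser()
    noisy = torch.randn(1, 16, 64)    # tokens of the image being generated
    context = torch.randn(1, 48, 64)  # tokens pooled from the reference photos
    pred_noise = model(noisy, context)  # prediction is grounded in the context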


Ahh I see, this makes a lot more sense now!


I wonder if he is holding that umbrella to help the model recover the 3D scene/scale from the reference images.



