
No, not related. We just took some of Loopy's demo images and audio clips, since they came out 2 days ago and people were aware of them. We want to do an explicit side-by-side at some point, but in the meantime people can make their own comparisons, i.e. compare how the two models perform on the same inputs.

Loopy is a U-Net-based diffusion model; ours is a diffusion transformer. This is our own custom foundation model that we trained ourselves.






This took me a minute - your output demos are your own, but you included some of their inputs, to make for an easy comparison? Definitely thought you copied their outputs at first and was baffled.

Exactly. Most talking avatar papers re-use each other's images and audio clips in their demo clips. It's just a thing everyone does... we never thought that people would think it means we didn't train our own model!

Anyone who wants to can re-make all the videos themselves with our model by extracting the first frame and the audio from each clip.
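For anyone trying this, here is a rough sketch of pulling the first frame and the audio track out of a demo clip. It assumes `ffmpeg` is installed and on your PATH; the function names and output file names are made up for illustration and are not part of our tooling.

```python
import subprocess
from pathlib import Path


def first_frame_cmd(video: str, out_image: str) -> list[str]:
    # -frames:v 1 makes ffmpeg stop after writing one video frame (the first).
    return ["ffmpeg", "-y", "-i", video, "-frames:v", "1", out_image]


def audio_cmd(video: str, out_audio: str) -> list[str]:
    # -vn drops the video stream; pcm_s16le writes uncompressed 16-bit WAV audio.
    return ["ffmpeg", "-y", "-i", video, "-vn", "-acodec", "pcm_s16le", out_audio]


def extract_inputs(video: str, out_dir: str = ".") -> tuple[str, str]:
    # Hypothetical naming scheme: <clip>_frame0.png and <clip>.wav in out_dir.
    stem = Path(video).stem
    image = str(Path(out_dir) / f"{stem}_frame0.png")
    audio = str(Path(out_dir) / f"{stem}.wav")
    # check=True raises CalledProcessError if ffmpeg exits with an error.
    subprocess.run(first_frame_cmd(video, image), check=True)
    subprocess.run(audio_cmd(video, audio), check=True)
    return image, audio
```

Calling `extract_inputs("clip.mp4")` should leave an image and a WAV file next to each other, which you can then feed to whichever model you want to compare.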


Yes, exactly! We just wanted to make it easy to compare. We also used some inputs from other famous research papers for comparison (EMO and VASA). But all videos we show on our website/blog are our own. We don't host videos from any other model on our website.

Also, Loopy is not available yet (they just published the research paper). But you can try our model today, and see if it lives up to the examples : )


[flagged]


No


