Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency (loopyavatar.github.io)
10 points by caohongyuan 12 days ago | 2 comments





Amazing examples. Will you be putting this up on Hugging Face to play with, releasing the model weights, or going commercial and locking it up?

TL;DR: we propose an end-to-end, audio-only conditioned video diffusion model named Loopy. Specifically, we designed an inter- and intra-clip temporal module and an audio-to-latents module, enabling the model to leverage long-term motion information from the data to learn natural motion patterns and to improve audio-portrait movement correlation. This method removes the need for the manually specified spatial motion templates that existing methods use to constrain motion during inference, delivering more lifelike, higher-quality results across a variety of scenarios.
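To make the "audio-to-latents" idea concrete, here is a minimal NumPy sketch of one plausible design: a fixed set of learned query tokens cross-attends over per-frame audio features to produce latent tokens that could then condition a video diffusion model. All names, shapes, and the cross-attention formulation are my assumptions for illustration; the paper's actual module may differ.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class AudioToLatents:
    """Hypothetical audio-to-latents module (illustrative only).

    A set of learned query tokens cross-attends over per-frame audio
    features, compressing a variable-length audio clip into a fixed
    number of latent tokens for conditioning a diffusion model.
    """
    def __init__(self, audio_dim, latent_dim, num_tokens, seed=0):
        rng = np.random.default_rng(seed)
        # Placeholders for learned parameters (random init here).
        self.queries = rng.standard_normal((num_tokens, latent_dim)) * 0.02
        self.w_k = rng.standard_normal((audio_dim, latent_dim)) * 0.02
        self.w_v = rng.standard_normal((audio_dim, latent_dim)) * 0.02

    def __call__(self, audio_feats):
        # audio_feats: (frames, audio_dim) per-frame audio features
        k = audio_feats @ self.w_k                     # (frames, latent_dim)
        v = audio_feats @ self.w_v                     # (frames, latent_dim)
        scores = self.queries @ k.T / np.sqrt(k.shape[-1])
        return softmax(scores) @ v                     # (num_tokens, latent_dim)

# Usage: compress 25 audio frames into 8 conditioning tokens.
module = AudioToLatents(audio_dim=128, latent_dim=64, num_tokens=8)
latents = module(np.zeros((25, 128)))
print(latents.shape)  # (8, 64)
```

The fixed-token design means downstream cross-attention layers in the diffusion backbone see a constant-size conditioning signal regardless of clip length, which is one common way such modules are wired up.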


