Tell HN: Open-source AIGC has a data problem

Call to action: upload your local video collection to Hugging Face and share it with the world.

If you want powerful open-source video models, consider uploading your hoarded videos to Hugging Face. With Sora's release, we have proprietary video models that surpass anything that came before, but there's no champion for open-source video the way Meta and Mistral have been for open-source LLMs.

Stable Diffusion was built on the easily accessible LAION dataset, which comprises around 5 billion images. Now, as video becomes the dominant medium, 99% of online video is concentrated on a single platform (YouTube), which limits data diversity.

Many of you have stashed videos over the years, just like the folks on r/DataHoarder. It’s time to share those archives! By sharing them as torrents or uploading them to Hugging Face, you can contribute to the development of the next generation of open-source video models. Let’s make this effort count!