
Because training intrinsically involves making a copy.

Perhaps requiring that deep-pocket companies actually compensate copyright holders would be a starting point for a fairer system?




Mostly I see two outcomes. Either the holders "lose" and it basically follows "human" rules, meaning you can train a model on any (legally obtained) material, but there are restrictions on use and regurgitation. I say "human" rules because that's basically how people work. Any artist or writer worth their salt has been trained on gobs of copyrighted material, and you can totally hire people to use their knowledge to violate copyright now.

The other option is the holders "win" and these models must only be trained on owned material, in which case the market will collapse into a handful of models controlled by companies that already own huge swaths of intellectual property. Basically, think DisneyDiffusion or RandomHouse-LLM. Nobody is getting paid more, but it's all above board since it's been trained on all the data they have rights to. You might see some holders benefit if they have a particularly large and useful dataset, like Reddit or the Wall Street Journal.


Both, no?

People with power and money can get paid. Artists who have no reach and recognition get exploited, especially those from countries which aren't in North America or Europe.



