Meta Admits Use of 'Pirated' Book Dataset to Train AI

kristianp · 2024-01-11T21:03:09

> While the use of Books3 is not contested by Meta, the question remains whether the company was in the wrong when it did so.

It seems that AI companies will have to pay publishers to use their content, after the courts have decided that Meta and OpenAI were indeed wrong to use pirated works.

artninja1988 · 2024-01-11T22:02:40

Well, hopefully courts will decide you can train on content freely.

regularjack · 2024-01-11T22:10:51

I hope courts decide that rights owners have a say on how others use their content.

artninja1988 · 2024-01-11T22:43:43

Courts have already decided that people can use others' content in various transformative ways, such as parody or commentary. AI training should fall under this as well, aside from the odd memorisation. Really, if the output is not infringing, then the model shouldn't be either.

wang_li · 2024-01-11T23:35:35

Fair use doesn’t allow you to pirate the source material in the first place, even if your parody or commentary constitutes fair use. It sounds to me like Meta committed 195,000 acts of copyright violation for commercial purposes. What is $250,000 x 195,000 in criminal penalties?

EMIRELADERO · 2024-01-12T07:52:18

> Fair use doesn’t allow you to pirate the source material in the first place, even if your parody or commentary constitutes fair use.

Why wouldn't it? While the two instances (the initial acquisition of the copy and the actual use of that copy) are analyzed separately, they're still both ultimately copyright infringement.

sn0n · 2024-01-11T20:39:58

Not me looking at my large book collection and pondering..