Hacker News
OpenAI Wants New York Times to Show How Original Its Copyrighted Articles Are (torrentfreak.com)
17 points by HieronymusBosch 5 months ago | 9 comments



I find this interesting because the NYT is saying its work is original and copyrightable, and OpenAI is saying "ok, prove it."


It's called a "burdensome discovery request" and will likely be rejected outright or curtailed greatly in scope.

To the extent that it's interesting it's because (on top of brazenly stealing everyone's copyrighted material in the first place) it simply showcases just how thuggish and antidemocratic the brain trust behind this company truly is.

In a nutshell this is like a thief breaking into your house, stealing all your underwear and jewelry -- and then going to the judge to compel you to produce receipts to prove that it's all yours. And certificates of origin to prove they aren't all counterfeits.


> thief breaking into your house, stealing all. . .

This is where your analogy is flawed. You are presupposing the "defendant" is indeed the thief that stole your property, whereas that is entirely a legal determination which is the outcome of a trial AND at the heart of this discovery request. More aptly, if you thought Steve stole your Red Ryder BB gun, and Steve was indeed found to be in possession of a Red Ryder BB gun, it would still be the prosecution's burden to prove that Steve stole it from you (instead of purchasing it from a store).

Similarly here, if NYTimes is claiming that OpenAI's GPT-4 illegally reproduces "to be or not to be. . ." (or whatevs) from issue #8628 page 76, it's still their burden to prove that this is actually a thing that is copyrightable and that they own the copyright to, vs. OpenAI just reproducing Hamlet instead of a NYTimes reporter's particular review of a production of Hamlet in that issue. Etc. etc.

More germanely, if you point an LLM at a pile of source documents and ask it to write a newspaper article, it'll happily do so in 2024. Understanding if/how this is fundamentally different from what a reporter does when synthesizing that same article goes to the very heart of this case (i.e. which transformative works are indeed copyrightable).


It's not an exact analogy, but I think it captures the basic moral point of what's at issue here.

It's perfectly obvious what OpenAI did - and it knows perfectly well how limited its horizons will be if it can't build its empire on the basis of stolen material.

That's why it's now hissing and spitting like a cornered animal.


It could be a little more subtle than that, and a legal sleight of hand. All the work is original and copyrightable, because someone wrote it. But does the NYT control that copyright? Or has the NYT violated someone else's copyright, in exactly the same way OpenAI does? And if OpenAI can demonstrate that some of the work NYT is claiming copyright on is not in fact copyrightable by the NYT, then they might be able to get the whole case tossed. It would be even more fun if the NYT was found to be claiming copyright on work produced by OpenAI's software.


So the tactics seem to stem from this https://web.archive.org/web/20240114171420/https://www.texas... "lawsuit from hell", tiring out the publishers.


I wonder if the inevitable conclusion of this is further consolidation, so like OpenAI buys Reuters or something. AI companies end up also becoming the publishing companies, because the only way to make money off of novel material is if it's not there for all the AI companies to scrape up.


It'd be cheaper to pay people to summarize articles and train on the summaries.


Right, but what happens if content starts eating itself such that there are no articles to summarize?



