
Guaranteed value is a licensing payment that compensates the publisher for allowing OpenAI to access its backlog of data, while variable value is contingent on display success, a metric based on the number of users engaging with linked or displayed content.

...

“The PPP program is more about scraping than training,” said one executive. “OpenAI has presumably already ingested and trained on these publishers’ archival data, but it needs access to contemporary content to answer contemporary queries.”

This also makes sense if they're trying to get into the search space.




> OpenAI has presumably already ingested and trained on these publishers’ archival data

So they're admitting to copyright violations and theft?


Whether training a model on text constitutes copyright infringement is an unresolved legal question. The closest precedent would be search engines using automated processes to build an index and links, which is generally not seen as infringing (in the US).



No, they have not done that. Presumably they believe that the model training was fair use, and no court has said otherwise yet.

It will take years for that stuff to settle out in court, and by that time none of that will matter, and the winners of the AI race will be those who didn't wait for this question to be settled.


They believe a lot of things, I'm sure.

> and the winners of the AI race will be those who didn't wait for this question to be settled.

Hopefully they'll be in jail.


It's not just the big companies you have to think about, lol.

Sure you can sue OpenAI.

But will you be able to sue every single AI startup that happens to be working on open source AI tech that was all trained this way? Absolutely not. It's simply not feasible. The cat is out of the bag.


The US government has worked hard to make the lives of copyright infringers miserable for years, even driving them to suicide.


> The US government has worked hard to make the lives of copyright infringers miserable for years

They really have not. The fact that I can download any movie in the world right now, and use all of the open source models on my home PC proves that.

I am sure there are some random one-off cases of infringers being punished, but it mostly doesn't happen.

Especially if we are talking about the entire tech industry.

The government isn't going to shut down every single tech startup in the US, because they are all using these open source AI models.

The government isn't going to be able to confiscate everyone's gamer PCs. The weights can already be run locally.



My point stands. That's like one guy. That's not "an entire industry gets shut down by the government".

That was my point. Sure, they might go after like one guy or one company. They aren't going to take out half of the tech startups in all of the US though. They also aren't going to confiscate everyone's gamer PCs.

I also think it's funny that you literally posted a Wikipedia page where the page itself contains the "illegal" numbers.

So that proves my entire point. Your best example is apparently one where I can access the "illegal" information on a literal public Wikipedia page!


> Thats like one guy

Also known as an example

> So that proves my entire point

Your point is that you can't use it commercially? Great! We're aligned, then.



