The argument the AI companies are making is that training LLMs is fair use, which means a copyright statement means fuck all from their point of view. (Even if it does carry weight, assuming you're in the US, unless you register the copyright with the US Copyright Office, you can only sue for actual damages, which means the cost of filing a lawsuit against them--not even litigating, just the court fee for saying "I have a lawsuit"--would be more expensive than anything you could recover. Even if you did register and sued for statutory damages, the cost of litigation would probably exceed the recovery you could expect.)
Of course, the big AI companies are already trying to get the government to codify AI training as fair use and sidestep the litigation which doesn't seem to be going entirely their way on this matter (cf. https://arstechnica.com/google/2025/03/google-agrees-with-op...).
Fair use requires transformation. LLM is as transformative as it gets. If I'm on the jury, you're going to have to make new copyright law for me to convict.
I am personally happy to have everyone, people and LLMs alike, learn from my wisdom.
No, it doesn't. There are four factors for fair use, and whether the use is transformative is part of one of them. And you don't need to win on all four factors.
> LLM is as transformative as it gets.
The current ruling precedent for "transformative" is the Warhol decision, which effectively says that to look at whether or not something is transformative, you kind of have to start by analyzing its impact on the market (and if you're going "doesn't that import the fourth factor into the first?" the answer is "yes, I don't like it, but it's what SCOTUS said"). By that definition, LLMs are nowhere near "transformative."
Even pre-Warhol, their role as "transformative" is sketchy, because you have to remember that this is using its legal definition, not its colloquial definition.
> If I'm on the jury
Fortunately, for this kind of question, the jury isn't going to be involved in determining fair use, so it doesn't matter what you think.
That's untrue. See my comment elsewhere around here. It doesn't rely on the commercial aspect, though if it's not commercial the bar for fair use is set lower.
The argument in Warhol relies on the fact that the derivative work, i.e., Warhol's painting, is substantially similar in function to the original photograph. If Warhol had used the picture as stuffing for a soft sculpture, it would not infringe.
In addition, we need to start paying attention to the growing body of law on AI and copyright. There was an article on HN, I think this week (or last), about a judge ruling that AI cannot own copyright on its generated materials.
IANAL, but I do wonder how this ruling will be used as a point of reference whenever we finally ask the question "Does material produced by GenAI violate copyright laws?" Specifically if it cannot claim ownership, a right that we've awarded to trees and monkeys, how does it operate within ownership laws?
And don't even get me ranting about HUMAN digital rights or Personified AIs.
Copyright is for topics like redistribution of the source material. You can’t add arbitrary terms to a copyright claim that go beyond what copyright law supports.
I think you’re confusing copyright with a EULA. You would need users to agree to the EULA terms before viewing the material. You can’t hide contractual obligations in the footer of your website and call it copyright.
What about if my index says "These are the EULA terms; by clicking "Next" or "Enter", you are accepting them", and an LLM scraper "clicks" Next to fetch the rest of the content?
It's reasonably likely, but not yet settled, that LLM training falls under fair use and doesn't require a license. This is what the https://githubcopilotlitigation.com/ class action (from 2022) is about, and it's still making its way through the court.
This prediction market has it at 12% likely to succeed, suggesting that courts will not agree with you: https://manifold.markets/JeffKaufman/will-the-github-copilot...
> It's reasonably likely, but not yet settled, that LLM training falls under fair use and doesn't require a license.
I would say it's not reasonably likely that LLM training is fair use. Because I've read the most recent SCOTUS decision on fair use (Warhol), and enough other decisions on fair use, to understand that the primary (and nearly only, in practice) factor is the effect on the market for the original. And AI companies seem to be going out of their way to emphasize that LLM training is only going to destroy the market for the originals, which weighs against fair use. Not to mention the existence of deals licensing content for LLM training which... basically concedes the point.
Of the various options, a ruling that LLM training is fair use I find the least likely. More likely is either that LLM training is not fair use, that LLM training is not infringing in the first place, or that the plaintiffs can't prove that the LLM infringed their work.
I do not read it that way at all. The Goldsmith decision mainly turns on the idea that an artist's protections extend to derivative works. Warhol produced a work that does substantially the same thing as Goldsmith's, i.e., it is a picture that can be viewed.
When talking about parody, they note that the usage as the foundation for parody is always substantially different from the original and thereby allowed, even if it would otherwise infringe. LLMs are always substantially different from the original, too.
If I want to write software that draws that picture exactly, the code would not be a copyright violation. It is text and cannot be printed in a magazine as a picture. If I used it to print a picture that was a derivative work and sold that, it might be.
A large language model has no intersection with the picture or, for that matter, anything that it absorbs. It is possible that someone might figure out how to prompt it to do exactly the same picture as Goldsmith did but fairly unlikely.
Unless you could show that this was easy, common and part of the intent of the LLM creator, I can see no possibility that it is infringing.
> This prediction market has it at 12% likely to succeed
Randos on the internet with a betting addiction are distinctly different from a court of law. I wish people would stop talking about prediction markets as if they mattered.
this isn't about copyright but about computer access. the CFAA is extremely broad; if you ban LLM companies from access on grounds of purpose you have every legal right to do so
in theory that legislation has teeth, too. they are not allowed to access your system if you say they are not; authentication is irrelevant.
every GET request to a system that doesn't permit access for training data is a felony
The reality is that a lot of these small websites have very permissive licenses. I really hope we don't get to the point where we must all make our licenses stricter.
The reality is that none of these LLM scrapers give a damn about copyright, because the entire AI industry is built on flagrant copyright violation, and the premise that they can be stopped by a magic string is laughable.
You could sue, if you can afford it, meanwhile all of your data is already training their models.
Sure, because Meta certainly followed copyright law to the letter when they torrented thousands of copyrighted books from hundreds of published and known authors to train Llama. Forgive me if I doubt a text disclaimer on the page will slow them down.
Unfortunately copyright is no limit to these companies.
Meta is stating in court that knowingly downloading pirated content is perfectly fine (ref https://news.ycombinator.com/item?id=43125840) so they for one would have absolutely no issue completely ignoring your copyright notice and stated licensing costs. Good luck affording a legal team to try to force them to pay attention.
Copyright is something for them to beat us with, not the other way around, apparently.
The only reason copyright is so strong in the US is that there are big players (Disney, Elsevier) who benefit from it. But big tech is much bigger, and LLMs have created a situation where big tech has a vested interest in eroding copyright law. Both sides are gearing up for a war in the court systems, and it's definitely not a given who will win. But, if you try to enter the fray as an individual or small company, you definitely aren't going to win.
Greenpeace USA and its subsidiaries are US-based non-profits with assets in the US. Nowhere near $600M, but what they do have can be collected, and Greenpeace USA and all its subsidiaries will certainly be driven into bankruptcy.
I would encrypt a document with 1000 lines that each say "get a warrant" and email that encrypted document to them, stating "My key is on your server you sunsetted last week" :)
Because many autocrats do this; a recent example is the Cultural Revolution in China. It took 50 years and a lot of hard work for China to recover from that.
Destroying the educational system allows these people to consolidate and maintain power.
> If you want to see the US rapidly lose its place in the tech world over the next decade, this is a great way to go about it.
Too late, unless DOGE is stopped now and Trump is impeached, the US will lose its lead in tech and health (pharma) and many other industries. Pure and simple. Already the smartest of the smart are leaving the US for Europe and probably China.
If this is allowed to continue, in 6 months to a year, the US will be isolated and a third rate economy. All it will have is a first class war machine, which will not bode well for the world.
And that won't last for much longer after losing those other sectors, either, as military dominance is a function of economic and technological superiority.
Yes, it is theft of the US Treasury, pure and simple. DOGE is salivating to get their hands on the Social Security Trust Fund, a fund all US people paid into over the past ~80 years.
I'm not defending DOGE here but the Social Security "Trust Fund" is more of an accounting gimmick than a real fund. It's not like a regular individual trust fund with a named account beneficiary. Excess Social Security taxes that weren't immediately needed to pay current benefits were used to buy Treasury bonds (or bills). All of the funds are co-mingled. One branch of the government owes money to another branch of government but there are no real assets.
Correct, and a great example of this is "Greenspan's Bait and Switch" back in the Reagan years - basically a bunch of measures put in place to increase how much was being paid into FICA (which notably has a cap on how much income it applies to, so it's regressive) and slow outgo.
Any overage beyond current needs is put into a "trust fund" which is required by law to be kept in US Treasury Bonds, aka loaned to the US Government. For the truly cynical, think about it as years of loaning a bunch of money to your uncle, and around the time that money starts needing to be paid back your uncle starts looking for contract assassins (aka "privatization"). If Social Security can be killed then oh my! Guess all that money owed to it just doesn't need to be paid back.
If the money "borrowed" in that way had been spent in ways that would make providing the services it's for easier and more cost effective that would be one thing, but that's not how it works out because the best ROI for private capital is purchasing politicians and policy.
By that standard, isn’t any cash I hold in treasuries also an accounting gimmick?
Sure, in one case the parties are me and the US Department of the Treasury and in the other case it’s the SSA and the US Department of the Treasury, but I don’t really see how this matters.
The sooner we can get America off of relying on Social Security and into a real investment program, the better.
SS is literally a ponzi scheme disguised as a social insurance program. Everyone would be much better off if you were forced to invest the same amount of money into a personal 401k and some % of returns was skimmed off to be redistributed to low income earners. At least then we'd all be honest about what the program's goals are and how much money is redistributed.
I prefer to have universal enrollment in a safe old age payment system like social security so that even if people do fall for ponzi schemes they will be protected by social security.
In most countries, certainly in the UK where I live, social security payments are made by workers to the government. At the same time the government pays out social security to those in need. There is no fund; payments in approximately match payments out. It is not a ponzi scheme, it's just normal government: taxing those with money and paying out to those in need. It's nice the US has a fund but it's not necessary.
My point is that it's not a scheme and no damn politician is going to take social security away from us. And I and most others disagree with changing it to not be universal.
Social security is already being cut. Elon is slashing their budget, removing offices while requiring more visits for services. They're starving the beast so they can claim it doesn't work, which means they'll have cause to eliminate or destroy it. Social Security is about to bottleneck to the point they can't fulfill their mandate. This is by design.
It doesn't matter what you or most others think. The US Govt is being held hostage by like 4 people right now and they don't care what you want.
Look, I am not even opposed to treating Social Security like a Pension, where the money is invested. I might even support a slow switchover (it would take decades) to a real pension fund. But would I trust these clowns to do it? HELL NO.
You're doing that thing where you are creating a plausible justification of their goals and methods that they have not in fact presented and that is not real. The fact that someone could have a good reason for wanting to change social security does not mean that Musk does, or that any good result can come from this one.
They are doing this illegally, these are crimes. Stop propagandizing in support of them.
What is your definition of a ponzi scheme? It is literally, definitionally one. The only difference from other ponzi schemes where new entrants pay out unsustainable amounts to old entrants is that new entrants are forced in.
It's not literally, definitionally one because a ponzi scheme is first and foremost a scheme -- fraud. Social security is literally, definitionally not a scheme; it's legal, audited, and fully transparent.
> new entrants are forced in.
Yes and that makes it materially different from a ponzi scheme.
"Scheme" can just mean "plan". For the first "difference", ~nobody studies the terms of how social security operates before starting to pay into it. Would you say that if I run a ponzi scheme but put a bit of fine print somewhere the investor will not read when joining, admitting it is in fact a ponzi scheme, it is suddenly not a ponzi scheme? I can even make it audited to warn the investors when it runs out of money decades later, no recourse though! Still a ponzi scheme.
For the 2nd one, indeed, that is a material difference. Of course, that makes it worse than just a regular ponzi scheme! In the above example, if I somehow read the disclaimer and refuse to participate, the ponzi scheme operator now points a gun at me and forces me!
And of course, this "material difference" only makes a real difference as long as demographics are favorable.
So yeah, it's a thinly-disguised ponzi, combined with forced labor.
Give people the option of taking a refund of all their SS payments so far and rolling it over to an IRA/401K. No more SS. I'm sure some people will be interested in that. Government doesn't need to collect from everyone to reallocate. Govt can print money to give to the poor, and doesn't need anyone's permission to print money, which they do by the trillions.
Sad things are getting to this point. Maybe I should add this to my site :)
(c) Copyright (my email), if used for any form of LLM processing, you must contact me and pay 1000USD per word from my site for each use.