"I suspect the demise of stock image providers to be the first palpable win for generative AIs, if the copyright question doesn't bog this whole field down"
I'm surprised the copyright issues aren't given more attention. It's technically not legal (in the US) to modify copyrighted images without the author's permission. I don't see how it's possible that systems like DALL-E haven't already done that. There's a near 0% chance that they aren't trained on at least one copyrighted image.
Human photographers are also trained on copyrighted images.
They look at countless numbers of them and learn what is the correct "professional style", etc. This is why you can instantly recognize most stock photos, because they all follow the "stock photo template".
The difference is that AI models so closely recapitulate specific features in copyrighted images that stock image company watermarks show through [0]. This is several levels beyond a human artist implicitly getting inspiration from copyrighted images, and more on the level of that artist explicitly copy/pasting specific pixels from them.
That's exactly my point — they replicate highly specific features in images with such fidelity that their training is not analogous to humans' artistic inspiration.
They replicate common features. If you paint the same happy little tree in your picture as thousands of other people, then it will probably show up in an image produced by a model trained on those images, but then your tree is hardly unique, is it?
How is the AI supposed to know these watermarks aren't a style element? They're present in tens of thousands of input images, after all. Therefore, I'd say this is a bad example of an AI literally copying from one specific source. It's similar to it using Arial letters: they're everywhere in the source data.
The I stands for imagination/ignorance at the moment. Intelligence (or something indistinguishable from it) doesn't seem too far away but isn't here yet.
So all we have is a dumb bot that can appropriate styles and ideas. Revolutionary, but not quite to the extent needed to sue it for copyright.
Copyright law doesn't work like that for photos. When you take a photo of something you become the owner of the image.
In the context of AI, the issue is specifically with using a copyrighted image and creating something new based off of that. That is explicitly illegal for human artists.
But where do you draw the line? If AI imagines 3 people around a business table in front of a flip chart, is that copyright infringement on similar stock photos? Note that in the AI-created image, the people are unique, they never existed, the business table is unique, the flip chart is unique, and in general you can't point to any existing photo it was trained on and say "it just copied this item here".
If so, why isn't it also copyright infringement when a human photographer stages another similar shot?
Well that's sort of the whole thing with copyright law. It's fairly arbitrary. Copyright specifically forbids derivative works:
"A derivative work is a work based on or derived from one or more already existing works."
It's vague on purpose because copyright infringements generally need to be handled on a case by case basis.
Now there are AIs trained on images that are copyrighted. If the image is copyrighted, should the AI have been allowed to train on it?
The reason human training/inspiration isn't specifically forbidden is because it can't be. We are impressioned by things whether we like it or not. Regardless, we can't prove where someone's inspiration came from.
But the act of training an AI on copyrighted images is deliberate. I feel that's a key difference.
> The reason human training/inspiration isn't specifically forbidden is because it can't be. We are impressioned by things whether we like it or not. Regardless, we can't prove where someone's inspiration came from.
And there are plenty of cases that say if you're too inspired, that's illegal and/or you owe damages/royalties.
Then the AI is performing a sort of collage of copyrighted work and the AI / prompt writer would not own the copyright to the derivative work. If a photographer stages a photo based on an existing photo, and it shares enough features with the original work, it likely would be copyright infringement.
The court has already ruled that you can't own the derivative work anyway, because copyright law requires an individual artist. If I ask Bob to make a picture for me, Bob actually owns the copyright to start (but can assign it to me). I don't automatically get given copyright because I 'prompted' Bob with what I wanted drawn (draw me a mouse). Copyright is given to the artist on the artist's specific output.
If I ask an AI for a picture, there is no artist 'Bob' to be assigned ownership under copyright law, and therefore it's not copyrightable under existing law.
Funny how originally all these pro-AI art people were anti-copyright law, but I can see them sometime soon lobbying for MORE restrictive copyright law (granting it in a larger pool of circumstances, hence making more things copyrighted) so that they can overcome this.
It’s explicitly allowed to create new works based on photographs, assuming the resulting work is not similar to the original.
> For example: if they base their painting on an oft photographed or painted location, generic subject matter, or an image that has been taken by numerous photographers they would likely not be violating copyright law.
> However: if they create their painting, illustration or other work of art from a specific photograph or if your photography is known for a particular unique style, and their images are readily identifiable with you as the photographer, and an artist copies one of your photographic compositions or incorporates your photographic style into their painting or illustration they may be liable for copyright infringement.
"incorporates your photographic style into their painting or illustration"
Seems pretty cut and paste to me. If it has trained on my images and then uses that trained dataset to generate new images, those images are in violation. Using training sets that include unlicensed copyrighted works requires attribution and licensing. To be legal otherwise, the end user/AI company would have to be able to prove in a court of law that without training on my copyrighted work it would have still generated that specific image, which I can't see the users/company being able to do.
It is not illegal for a human to look at something another human created and learn composition, strokes, lighting, etc... and then apply it to their own future creations. This is all the AI is doing.
Taking copyrighted images and dumping them into a machine learning model is deliberate usage. The AI isn't a person, so it doesn't draw on past experience by happenstance.
AI is just a lossy form of storing the copyrighted work and using pieces of the copyrighted work for future output. It definitely requires licensing of the works stored (I mean 'trained on') if used outside of 'personal use'. I can't just re-compress tons of pictures into crappy jpg format and then use them however I'd like. I also can't just come up with a new format for machine storing copyrighted images to be used for creating derivative works, call it AI, and say it's 'different'. The AI company has to be able to prove in a court of law it could have generated the image if it hadn't been trained on my copyrighted work. We already covered this area of law with sampling in music. If you don't want to fight over ownership of the work with the owner of the 'sample', you either license it or... don't use it.
If it is storing the copyrighted work, then I'm sure you could point to which part of the weights corresponds to a specific work, right? The same way that you could do it if we were to "re-compress tons of pictures into crappy jpg format", or if we were "sampling music". Oh, you can't do it? Then, I'm afraid it's not the same.
It's hugely different - imagine the number of decisions a person makes when making an oil-painting - each stroke is somewhat influenced by past experience but also by the current state of the painting, their emotional state etc. The AI is just directly interpolating based on past input.
Making the two processes equivalent is very reductive.
The AI is a product created by a company. A vacuum sucking up the scraped remnants of the internet. Hundreds of millions of dollars are spent to pull this off. Stop acting like this is a human or anything resembling one. This is a product and not a person.
Yes, it can be illegal. It happens plenty of times in music, where artists produce songs which are too similar to previously existing songs, and owe damages.
Am I allowed to take an image and apply a lossy algorithm (say jpg) to it and then use it as my own for business purposes? Nope. You say learn, I say apply a lossy algo and then use the result for business purposes. Seems like a clear copyright violation.
This kind of 'training' is not at all equivalent. There's a reason copyright places value on the expression of an idea (i.e. taking the photo) - image-making is difficult and was a valuable skill, even for a stock photo.
Getty's case is active in the court system in multiple jurisdictions; until we get the outcome of that, we're not going to have a resolution of this. Unless countries legislate/decide to allow training on publicly accessible documents, e.g. as Fair Use/Fair Dealing or whatever.
In short, the copyright issues appear to be given a lot of attention? Legal precedent takes time.
This will take years for the courts to figure out. In the meantime, Adobe Firefly has apparently not been trained on anything copyrighted, so people that are nervous about lawsuits will use that.
Isn’t it just fair use? Reading the four-factor test for fair use, it seems like these generative models should be able to pass the test, if each artwork contributes only a small part to a transformative model that generates novel output. The onus will be on demonstrating that the model does not reproduce works wholesale on demand, which currently they sometimes still do.
Arguably also, the copy is achieved at generation time, not training time, so the copyright violation is not in making the model or distributing it, but in using it to create copies of artworks. The human artist is the same: in their brain is encoded the knowledge to create forbidden works, but it is only the act of creating the work which is illegal, not the ability. The model creators might still be liable for contributory infringement though.
Anyway, I reject the notion that any use of unlicensed copyrighted works in training models is wrong. That to me seems like the homeopathic theory of copyright, it’s just silly. If copyright works that way we might as well put a cross over AGI ever being legal.
Should the model be allowed to train on the copyrighted image in the first place? I think the answer is no. If I'm an artist, I don't volunteer my art for you to do what you please.
Now consider that these systems are already being used for profit, before this matter has even been settled.