...which is the Twitter user's reading into that line. The user acknowledged that File 4 (the source of the picture) [1] doesn't mention Nightshade and claimed that File 5 [2] does.
As File 5 is not searchable, I skimmed through it and found no mention of Nightshade. In fact, some points actually expressed the opposite of the Twitter user's reading:
Section III 1. (1) 電子透かしが導入すべきである "[The content creator] should add digital watermarks."
The seventh point: ノイズ付与の場合、電子計算機損壊等業務妨害罪になるといった可能性が指摘されているので、省庁や国としてこういった無断学習妨害ツールや電子透かし技術の正当化を担保してほしい "It has been pointed out that adding noise could constitute the crime of obstruction of business by damaging a computer, so we would like the ministries and the national government to guarantee the legitimacy of these tools for disrupting unauthorized training and of digital watermarking technology."
The other points similarly stated that content creators should watermark their works and that AI-generated content must embed metadata so it can be identified as AI-made.
On top of that, these files are just minutes. Basically, the Japanese government invited some companies, lawyers, and AI-related associations to the Prime Minister's Office to discuss AI and IP law, so anyone could say anything. It's not legally binding. And there are statements like "we hope to use NFTs as identifiers," so you can judge how serious this is.
For context, Nightshade is a tool that alters images in such a way as to poison a generative image model if the altered images are included in its training set. Poisoned models produce noticeably worse output, and the only way to fix them is to identify and remove the poisoned training images and retrain.
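For anyone curious what "altering images to poison a model" looks like mechanically, here's a rough sketch of the general idea. This is not Nightshade's actual algorithm (which optimizes against the generator's own encoders under a perceptual constraint); resnet18 is just a stand-in feature extractor, the filenames are placeholders, and the step sizes and budget are made-up numbers:

    # Rough sketch of a poisoning-style perturbation: nudge an image's features
    # toward a different concept while keeping the pixel change small.
    # NOT Nightshade itself; resnet18 and all hyperparameters are stand-ins.
    import torch
    import torchvision.models as models
    import torchvision.transforms.functional as TF
    from PIL import Image

    device = "cpu"
    extractor = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval().to(device)
    for p in extractor.parameters():
        p.requires_grad_(False)

    def poison(src_path, target_path, eps=4/255, steps=50, lr=1/255):
        """Return a copy of src whose features drift toward target, within an L-inf budget."""
        src = TF.to_tensor(Image.open(src_path).convert("RGB")).unsqueeze(0).to(device)
        tgt = TF.to_tensor(Image.open(target_path).convert("RGB")).unsqueeze(0).to(device)
        tgt = TF.resize(tgt, list(src.shape[-2:]))
        with torch.no_grad():
            tgt_feat = extractor(tgt)                      # features of the "wrong" concept
        delta = torch.zeros_like(src, requires_grad=True)  # the perturbation we optimize
        for _ in range(steps):
            loss = torch.nn.functional.mse_loss(extractor((src + delta).clamp(0, 1)), tgt_feat)
            loss.backward()
            with torch.no_grad():
                delta -= lr * delta.grad.sign()  # step toward the target's features
                delta.clamp_(-eps, eps)          # keep the change visually subtle
                delta.grad.zero_()
        return (src + delta).clamp(0, 1).squeeze(0)

    # e.g. TF.to_pil_image(poison("dog.png", "cat.png")).save("poisoned_dog.png")

To a human the result looks essentially identical to the original, but a model trained on many such images sees "dog" captions paired with pixels whose features point somewhere else.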
That said, it's kind of bonkers to me that publishing an image with alterations like this might be considered illegal, particularly if an image is being published by the copyright holder. It's like saying you can't watermark anything you publish.
This kind of claim is almost always suspect. There was a time when rotations and other manipulations caused issues for the models, but those were largely tamed. Hinton's work with Capsule Networks [1] produced much more resilient models, though they are computationally much more intensive.
It's also interesting that this is coming from Japan, where they also have a law regarding copyright and training that is very permissive. [2]
> That said, it's kind of bonkers to me that publishing an image with alterations like this might be considered illegal, particularly if an image is being published by the copyright holder. It's like saying you can't watermark anything you publish.
This should be generally true, but in law, intent typically matters. If you are doing something with the primary reason being to cause harm, then normal protections often don't apply.
"If you are doing something with the primary reason being to cause harm, then normal protections often don't apply."
This seems pretty confused. There is no law banning "harm" (a term so vague that everyone would be guilty), and even if there were, watermarking images does not appear to be any more "harmful" than making an AI using the copyrighted images of others without permission.
Good point on intent. I have to think that if an artist published poisoned images with a notice like "don't use these for training, they're poisoned; contact me for licensing of clean images," it would be hard to argue either negligence or malicious intent to disrupt training.
Which would be distinct from one silently poisoning and reposting images everywhere with the goal of gumming up everyone's training.
The problem with Nightshade is that while it worked in laboratory conditions, in practical application it wouldn't work without extensive coordination between everyone using it.
For example, if one artist who draws dogs uses it to bias towards cats and another uses it to bias towards horses, the bias data becomes less signal and more noise. In the paper, they biased all the images the same way.
The issue compounds when you consider the multiple data points that need to be biased. An artist who draws impressionist dogs, an artist who draws cartoon dogs, and an artist who draws cartoon cats would need to bias 'impressionism,' 'cartoon style,' 'dogs,' and 'cats' all in ways that are standardized across Nightshade users for the tool to be effective.
This isn't realistically going to happen.
So ultimately it's about as effective as the users who put the "you don't have permission to use my data" clauses on their MySpace two decades ago. Feels good, and totally ineffective.
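To put a rough number on the dilution argument above, here's a toy simulation (my own back-of-the-envelope illustration, not something from the Nightshade paper): when everyone pushes a concept in the same direction the shifts add up, and when everyone picks their own direction they mostly cancel.

    # Toy illustration of coordinated vs. uncoordinated poisoning directions.
    # Dimensions and counts are arbitrary; this is not a model of any real training run.
    import numpy as np

    rng = np.random.default_rng(0)
    dim, n_artists = 512, 1000

    # Coordinated: every artist biases "dog" toward the same target direction.
    shared_direction = rng.standard_normal(dim)
    coordinated = np.tile(shared_direction, (n_artists, 1))

    # Uncoordinated: each artist picks a different target (cats, horses, ...).
    uncoordinated = rng.standard_normal((n_artists, dim))

    print("coordinated mean-shift norm:  ", np.linalg.norm(coordinated.mean(axis=0)))
    print("uncoordinated mean-shift norm:", np.linalg.norm(uncoordinated.mean(axis=0)))
    # The coordinated shift keeps its full magnitude (~sqrt(dim)); the uncoordinated
    # one shrinks toward ~sqrt(dim / n_artists), i.e. most of the effort washes out.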
Interesting, but that doesn't really answer my question. Has anyone done wider studies about the effectiveness of this sort of thing generally? As in, does it have to be tailored to a specific model or does it offer generalized protection with all models?
The Ars article doesn't talk about this aspect.
If these methods are effective, can a similar thing be done with text?
I can't speak to broader studies of this, but I suspect this is only doable with images because you can make meaningful changes to the pixel data that remain mostly imperceptible to the eye. I think there's less room for this sort of thing in text.
It's a terrible practice to pollute art that is owned by the public and ultimately belongs in the public domain. Are they going to come back 75 years later and filter out their data sewage?
I can go get a course of antibiotics if I have to eat shit soup, but ruining the data ruins it forever.
It's the artist's own work; they can do whatever they want with it. If people don't like that, they should be filtering this out of their dataset before training.
The Japanese Government knows that it is their job to further the public good, and that LLM AIs will help the public good. Therefore any attacks on the development of LLM AIs should be illegal.
Surprising and somewhat depressing. I wouldn’t be surprised if this leads to many artists in Japan who had been posting their works on Twitter, Instagram, and Japanese SNS sites deleting their posts and ceasing sharing. There’s no incentive to feed a machine that will never compensate you for your involvement and efforts despite generating value for others.
Submitted title was "Japan: Using Nightshade, etc. tools to disrupt AI training is illegal" but that's too far from what the OP says, so I've attempted to put a shortened version of the latter up there instead. In a case like this, that's the best way to follow the guideline "Please use the original title, unless it is misleading or linkbait; don't editorialize." - https://news.ycombinator.com/newsguidelines.html
I am not a lawyer, but my understanding is that interference with business operations in general is a crime in Japan. You will sometimes see in their media people asking others to leave a store under the threat that failure to comply would result in being reported to police for interfering with their businesses. It seems quite logical that this would extend to things done to interfere with businesses that do AI training.
> asking others to leave a store under the threat that failure to comply would result in being reported to police for interfering with their businesses.
Does Japan not have laws against trespassing? That seems it would be a much more straightforward charge for that sort of thing, as it's more black-and-white.
I can't read the Japanese text, but the "with the intent of business interference" makes it seem like it could be about harming a competitor, and not about running your family photos through Nightshade before uploading them.
“Business interference” and “harming a competitor” are awfully vague, though. Who’s considered a competitor? Is it “business interference”, for example, to protect one’s character designs with Nightshade, with the tool serving as a sort of complement to copyright protection that can be enforced without court involvement? What about using it with landscape stock photography as an AI-proof counterpart to visual watermarking?
For sure it is vague. But there is already law covering illegal business practices and competition. Without understanding Japanese, it's hard to say if it's under that umbrella, which might make it not vague at all.
> In the newly released AI-related meeting documents by the Japanese government, which does not want to regulate AI, it is stated that using technology to disrupt AI learning with the intent of business interference could lead to criminal penalties.
[1]: https://www.kantei.go.jp/jp/singi/titeki2/ai_kentoukai/gijis...
[2]: https://www.kantei.go.jp/jp/singi/titeki2/ai_kentoukai/gijis...