
You could make the same argument that as long as you are using lossy compression you are unable to infringe on copyright.



That's a huge understatement. 5 billion images compressed into a roughly 5 GB model works out to about one byte per image. Let's see whether one byte per image would constitute a copyright violation in any field other than neural networks.
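A back-of-the-envelope check of that ratio, using only the two figures quoted above (everything else is just arithmetic):

    # Rough bytes-per-image ratio implied by the figures above:
    # ~5 billion training images, a model checkpoint of ~5 GB.
    num_images = 5_000_000_000
    model_bytes = 5 * 10**9

    print(model_bytes / num_images)   # ~1.0 byte per image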


You took the images, encoded them via a computer process, and the result is able to reproduce some of those images. I fail to see why the size of the training set in bytes and the size of the model in bytes matter. Especially since, as other commenters have noted, much of the training data is repeated (thousands of Mona Lisas), so a straight division (training set size / model size) says nothing about the bytes per copyrighted work.


Except that you can't recreate them. At least not without a process that would be similar to asking an artist to create a replica of a painting. Just because Photoshop has the right color palette available to recreate art, it doesn't mean the software itself is one big massive copyright infringement against every art piece that exists.


Past a certain level of overfitting you can definitely recreate them just by asking for them by name. And it's possible to unintentionally or even intentionally overfit.

So it would be quite easy to make a trademark laundering operation, in theory.


It will be interesting to see how they legally define the moment where compression stops being compression and starts being an original work.

If I train on one image I can get it right back out. Even two, maybe even a thousand? Not sure where the line would be between OK and not OK, but there will have to be some answer.


There only needs to be an answer if it's determined that some number isn't copyright infringement. The easy answer would be to say that the process is what prevents the works from being transformative (and thus copyrightable), and not the size of the training set.


Another thing worth referencing in this context might be hashing. If a few bytes per image are copyright infringement, then likely so is publishing checksums.


Once you start recreating copyrighted works from hashes, this analogy becomes relevant. Until then, how can you compare the two, when the distinguishing feature is the model's ability to reproduce its training data?


What is a 1080p MP4 video of a film if not simply a highly detailed, irreversible but guaranteed unique checksum of that original content?


I think this is overstretching it. That would be a checksum that can be parsed by humans and contains artistic value, which serves as the basis for claims to copyright. An actual checksum no longer has artistic value in itself and can't reproduce the original work.

Which is why this is framed as compression: it implies that fundamentally SD makes copies instead of (re)creating art. Leaving aside the issue of recreating forgeries of existing works, using the training data for the creation of new pieces should be well covered inside the bounds of appropriation. Demanding anything more than filtering the output of SD for 1:1 reproductions of the training data is really pushing it.

edit: Checksums aren't necessarily unique, btw. See "hash collisions".
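A minimal sketch of that point. It uses a deliberately truncated 4-byte digest (my choice for illustration, nothing specific to SD or MP4s) so the birthday bound makes a collision appear after roughly 2^16 random inputs; the same logic applies to any fixed-length checksum, just at far larger scales:

    import hashlib, os

    # Deliberately short 4-byte digest so a collision is cheap to find;
    # by the birthday bound one is expected after ~2**16 random inputs.
    seen = {}
    while True:
        data = os.urandom(16)
        digest = hashlib.sha256(data).digest()[:4]
        if digest in seen and seen[digest] != data:
            print("collision:", seen[digest].hex(), "and", data.hex(), "->", digest.hex())
            break
        seen[digest] = data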


Overfitting seems like a fuzzy area here. I could train a model on one image that could consistently produce an output no human could tell apart from the original. And of course, shades of gray from there.
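As a toy illustration of that one-image case (not how SD actually works; the prompt vector, network size, and target image here are all made up), a small network with a fixed input and enough parameters will simply memorize its single training target:

    import torch

    target = torch.rand(3, 64, 64)   # stand-in for the single training image
    prompt = torch.randn(16)         # stand-in for "asking for it by name"

    model = torch.nn.Sequential(
        torch.nn.Linear(16, 256), torch.nn.ReLU(),
        torch.nn.Linear(256, 3 * 64 * 64),
    )
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    for _ in range(2000):
        opt.zero_grad()
        loss = ((model(prompt).view(3, 64, 64) - target) ** 2).mean()
        loss.backward()
        opt.step()

    print(loss.item())  # approaches 0: the "generator" just reproduces its one image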

Regarding your edit, what are the chances of a "hash collision" where the hash is two MP4 files of two different movies? Seems wildly astronomical... impossible, even? That's why this hash method is so special, plus the built-in preview feature you can use to validate your hash against the source material, even without access to the original.


Once you are down to one picture, collisions become feasible given the right environment and resolution of the image.

Pretty sure this is nitpicking about an overused analogy though.


The distribution of the bytes matters a bit here. In theory the model could be overtrained on one copyrighted work such that it is almost perfectly preserved within the model.


You can see this with the Mona Lisa. You can get pretty close reproductions back by asking for it (or at least you could in one of the iterations). Likely the model overfit because it is such a ubiquitous image.


If it's sufficiently lossy, yeah. Don't know where you'd draw the line, though. Maybe somewhere similar to fair use of video clips.


Citing fair use is putting the cart before the horse here. The debate is about whether the Stable Diffusion training and generation processes can be considered to transform the works into a new one in the same way we accept for humans, which is what allows for the fair use of video clips. To say that it would be similar to fair use is assuming the outcome as evidence, aka begging the question.



