An artist can look at images for reference, and draw something new inspired by them. Why does it matter if a software tool can do this much faster?
If the artist makes the image very similar to one of the reference photos, it may be a copyright violation. It doesn't matter if the artist used a pencil or software to create the new work.
Current AI image generation does, however, make it easy to violate copyright unknowingly. If it generates an image similar to something that already exists, you wouldn't know.
I don't know much about copyright law though, am I wrong?
The resolution is much weirder than that: the court argued that the pose isn't original enough for the photo to deserve copyright at all, independently of what the plagiarist did with it.
Every original image is copyrighted. You're suggesting making a digital copy of every image there is to check that AI isn't generating digital copies of every image there is.
If I understand correctly, wouldn't a hash database of <just the training set> be larger than the actual model? (in fact by 1 or 2 orders of magnitude?)
Yeah, I guess so. The models are only 4 or 8 GB. A giant list of hashes would be bigger, sure. But they're two very different things. The model is for generating new images; the hash database is for copyright enforcement. If you really want to check for violations, I don't know how else you're going to do it.
Hash yes, fingerprint maybe no. Maybe I'm using the term incorrectly here, but I think of a fingerprint as a lossy hash. One way of doing this would be to resize the image to, say, 8 by 8, and quantize it to, say, 16 colors (4 bits per pixel). So the fingerprint size is 8*8*4 bits = 256 bits = 32 bytes. Tiny changes aren't likely to change the fingerprint. You'd probably have to do something a little more clever so as not to get too many false positives though. Or once you get a hit, do a deeper comparison.
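A rough Python sketch of the lossy fingerprint idea from the comment above, to make the size arithmetic concrete. It is not a production perceptual hash. It assumes the image is already decoded into rows of grayscale values (0 to 255) and is at least 8 pixels in each dimension, and it reads "16 colors" as 16 gray levels. The names `fingerprint` and `distance` are hypothetical, chosen only for illustration.

```python
GRID = 8      # downscale target: 8 x 8 cells
LEVELS = 16   # 16 levels per cell, i.e. 4 bits


def fingerprint(pixels):
    """Return a 32-byte lossy fingerprint of a grayscale image.

    `pixels` is a list of rows of ints in 0..255 (an assumption of this
    sketch); the image must be at least GRID pixels wide and tall.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for gy in range(GRID):
        y0, y1 = gy * height // GRID, (gy + 1) * height // GRID
        for gx in range(GRID):
            x0, x1 = gx * width // GRID, (gx + 1) * width // GRID
            # "Resize" by averaging every pixel that falls in this cell.
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            mean = sum(block) / len(block)
            # Quantize the mean from 0..255 down to 0..15.
            cells.append(min(LEVELS - 1, int(mean * LEVELS / 256)))
    # Pack two 4-bit cells per byte: 64 cells -> 32 bytes.
    return bytes((cells[i] << 4) | cells[i + 1] for i in range(0, len(cells), 2))


def distance(fp_a, fp_b):
    """Sum of absolute differences between corresponding 4-bit cells."""
    total = 0
    for a, b in zip(fp_a, fp_b):
        total += abs((a >> 4) - (b >> 4)) + abs((a & 0xF) - (b & 0xF))
    return total
```

Exact equality of two fingerprints gives the cheap first-pass lookup. A small `distance` threshold is one possible way to tolerate cells that sit on a quantization boundary, with the deeper comparison run only on those hits.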