
It seems possible to generate images that are very similar to existing stock photos if you feed a Getty Images description into DALL-E.

I tried it with a distinctive banana image:

https://imgur.com/a/0OrIr6e




"very similar" insofar as it's following the narrow prompt, sure.

> Different runs can generate different size, orientation and placement of the bananas, as well as different shades of pink.

At that point it's definitely the curation causing any possible derivation. The image generator is innocently doing what you ask in an unbiased way.


Those bananas are completely different. There's no copyright infringement there. I could take a photo of a banana and photoshop it repeatedly onto a pink background. That would look just as similar, and there's no copyright problem there.

You can't copyright an idea.


Images are different, but it appears that DALL-E is inspired by the aesthetics and the layout of the copyrighted material.

Another example, picking a random image from the Getty Images site. "A young parkour flips through the city,guangzhou,china, - stock photo":

https://imgur.com/a/pPruwzA

The images are obviously different, but it appears that DALL-E maps the Getty Images description to a similar tone, perspective, background, and weather conditions. I'm sure there are thousands of possible backdrops in Guangzhou, and many ways to show a parkour flip. Even the Google image search results show more variance than the output of DALL-E.

So you can't copyright an idea, but you can certainly scrape a copyrighted DB with image metadata, and use it to create your own product. My point is that DALL-E itself might be a derivative work of Getty Images and thousands of other online catalogs.


Interesting. Did adding "stock photo" to the string generate that Getty tag? That is probably the most attackable (alas, easy to fix) part of the issue. It will be interesting to see how close to the original a picture has to be to be considered the same (I'm sure there's some case law), and maybe there's new research to be done on how to recreate the training data images with the correct search string (I suppose one could build an ML model for that).

Fun times ahead


No, I didn't get the tag. But I suppose that Getty metadata as well as the images were used for training.



