Either, (1) LLMs are just super lossy compress/decompress machines and we humans find fascination in the loss that happens at decompression time, at times ascribing creativity and agency to it. Status quo copyright is a concern as we reduce the amount of lossiness, because at some point someone can claim that an output is close enough to the original to constitute infringement. AI companies should probably license all their training data until we sort the mess out.
Or, (2) LLMs are creative and do have agency, and feeding them bland prompts doesn't get their juices flowing. Copyright isn't a concern; the model just regurgitated a cheap likeness of Indiana Jones as Harrison Ford that the world has seen ad nauseam. You'd probably do the same thing if someone prompted you the same way, you lazy, energy-conserving organism you.
In any case, perhaps the idea "cheap prompts yield cheap outputs" holds true. You're asking the model to respond to the entirely uninspired phrase: "an image of an archeologist adventurer who wears a hat and uses a bullwhip". It's not surprising to me that the model outputs a generic pop-culture-shaped image that looks uncannily like the most iconic and popular rendition of the idea: Harrison Ford.
If you look at the type of prompts our new generation of prompt artists are using over in communities like Midjourney, a cheap generic sentence doesn't cut it.
You don't even need to add much more to the prompts. Just a few words, and it changes the characters you get. It won't always produce something good, but at least we have a lot of control over what it produces. Examples:
So... ask it to dress them differently. You can just ask it to make whatever changes you want.
"An image of an archeologist adventurer who wears a hat and uses a bullwhip. He is wearing a top hat, a scarf, a knit jumper, and pink khaki pants. He is not wearing a bag" (https://sora.com/g/gen_01jqzkh4z2fqctzr9k1jsfnrhy)
Those are great, I would watch any one of those movies. Maybe even the "Across the Indiana-Verse" one where they are all pulled into a single dimension.
You just have to write the prompt in a way that is not so obviously pointing to Indiana Jones, and you get something that is not Indiana Jones...
"A nerdy archaeologist adventurer in a pith helmet, with glasses and a backpack, stumbling his way through a green overgrown abandoned temple. Vines reach for his heels" (https://sora.com/g/gen_01jr0yd810e8xsenp85xy2g47f)
"A nerdy archaeologist adventurer in a pith helmet, with glasses and a backpack, nervously sneaking her way through a green overgrown abandoned temple. She is wearing pink khaki pants, and a singlet" (https://sora.com/g/gen_01jr0z837jecpa770v009bs1m3)
Is it as creative as good humans? Not at all. It definitely falls into tropes readily. But we can still inject novel ideas into our prompts for the AI, and get unique results. Especially if you draw sketches and provide those to the AI to work from.
This is the opposite of how people have thought about creativity for centuries, though.
The most creative person is someone who generates original, compelling work with no prompting at all. A very creative person will give you something amazing and compelling from a very small prompt. A so-so creative person will require more specific direction to produce something good. All the way down to the new intern who needs paragraphs of specs and multiple rounds of revision to produce something usable. Which is about where the multi-billion-dollar AI seems to be?