
> SD does okay on some artists that don't have works in the image dataset

I don't think this is true. Can you provide an example?




It works because SD (and DALL-E 2) don't infer only from the priors in their image training dataset; they also infer and mix concepts coming from the text embedding, which was itself previously trained on images (as CLIP or OpenCLIP).

So CLIP may have picked up an association that a named artist is roughly synonymous with, for example, "broad strokes, moody lighting", and that is fed into the diffusion model, which doesn't know the artist but DOES know what broad strokes and moody lighting are.

But sure, if CLIP doesn't know about the artist's name either, it won't work, of course.
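You can see the idea in a minimal sketch (nothing from SD's own code; it assumes the openai/clip-vit-base-patch32 checkpoint via Hugging Face transformers, and the prompts are made up): an artist's name in a prompt can land close to a plain-language description of their style in CLIP's text embedding space.

    # Sketch: does a named artist sit near a style description in CLIP space?
    import torch
    from transformers import CLIPModel, CLIPTokenizer

    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

    prompts = [
        "a landscape by Vincent van Gogh",                             # named artist
        "a landscape, broad swirling brush strokes, moody lighting",   # matching description
        "a landscape, flat pastel colors, clean line art",             # mismatched description
    ]
    inputs = tokenizer(prompts, padding=True, return_tensors="pt")
    with torch.no_grad():
        emb = model.get_text_features(**inputs)
    emb = emb / emb.norm(dim=-1, keepdim=True)  # unit-normalize for cosine similarity

    # Cosine similarity of the artist prompt to each description;
    # the matching style words should tend to score higher.
    print(float(emb[0] @ emb[1]))
    print(float(emb[0] @ emb[2]))

If the association holds, the first similarity comes out higher, which is exactly the "artist name = bundle of style words" effect described above.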

By the way, you can still just enter the particulars of the artist you want to mimic by text as well. There is not THAT much information in a style and you won't need to feed an image into the system.
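As a rough sketch of what that looks like in practice (assuming the diffusers library and the runwayml/stable-diffusion-v1-5 checkpoint; the prompt wording is just illustrative):

    # Sketch: describe the style in plain words instead of naming an artist.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    # No artist name: just the particulars of the style, spelled out.
    prompt = ("portrait of an old fisherman, thick impasto brush strokes, "
              "moody chiaroscuro lighting, muted earth tones, oil on canvas")
    image = pipe(prompt).images[0]
    image.save("described_style.png")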

I guess all of this, with artists trying to protect their online works by watermarking or glazing, will only be a very short speedbump, for better or worse. If a human can do one-shot style transfer with a single glance at a work, the next round of AIs will as well, and won't be hampered by noise added to the works. You might even get "style extraction" tools that work like ChatGPT, in that you iteratively give text instructions to get closer, without ever letting the AI look at an image.


Thank you, this is a clear explanation.

> There is not THAT much information in a style and you won't need to feed an image into the system.

I think it really depends on the style. Some styles are simple, others are more complex. How that translates to an AI "understanding" it, though, I have no idea. But in terms of brush strokes I don't think it's fair to say every style can be described as "broad strokes, moody, dark". Some very simple styles, maybe.


The recently released game "Atomic Heart" has a robotic style that didn't exist in SD's training dataset, and yet I have seen similar robotic styles generated for it. Of course, I can't tell whether it was a fine-tuned model made for that purpose.

But I do feel that unless your style is very unique, with no roots in existing styles, it won't be possible to "protect" it technologically.


Why wouldn't it be true?

That's the purpose of the training, no? To generalize the model so that it can produce images it has never seen before. That includes images in styles it has never seen before.

Whether or not models have generalized to that point is a different question, but if they have (and let's be honest, a lot of artistic styles aren't that unique or different), the only difference is that the model can't conjure up the style from the artist's name in the prompt; instead, one would have to describe the style in other ways.



