ML revolutionized translation a long time ago and the demand and pay for translators went down. It also used the works of thousands of humans who translated text and they never saw a dime. Copyright applies to translations so we've already gone through something similar. The same will happen with art and every other medium that ML touches. I have a love/hate relationship with ML because of this. It seems that there will be a painful transition period as many humans are displaced, probably even average coders like me. In a way, it's no different than what a human does but humans can't scale like computers. I hope someone can engineer a new economic system that works for what's coming.
> ML revolutionized translation a long time ago and the demand and pay for translators went down.
ML translation also caused a noticeable drop in the average quality of translated text in my experience. Companies now ship machine-translated manuals for minor languages that are often little more than gibberish. Human translators weren't perfect - for example, often you could see that the translator didn't know the terminology of the field. But at least the rest of the text was usually intelligible.
If this is where we're heading with visual art as well, I'm not looking forward to the future. Imagine an instruction manual where the illustrations are machine-created. Everything looks kind of weird and inconsistent. If you quickly flip through the book, it looks like all the important points are shown in the pictures, but look closely and the shapes don't match reality and the details are all wrong. The number of bolts changes from picture to picture. It's all there just to tick a box on someone's list, but it doesn't really help you in servicing the thing.
I disagree with the first part, I remember home appliances made in Asia in the 90s that came with completely ridiculous manuals, possibly translated with a dictionary and little else. Machine translation can really be good if used properly. Of course I wouldn't use it to translate literature.
It was better than those, but many "professional" companies that would in the past hire a group of translators to translate their documentation into various languages now just use machine translation with a cursory editing pass.
If you read mainly English, you won't notice it, but you'll see it if you speak a language that does get translations but isn't a primary one.
That sort of translation always had a lemon problem. How do you know if the translation to Portuguese is any good, if you don't speak Portuguese and don't know anyone who does? You can pay the expensive translator company, or the dodgy cheaper translator company that claims to be equally good.
Did you choose the responsible, expensive option? Sucks, because they have been competing with the bad/automated translators for so long that they squeeze their human translators so hard they produce bad translations anyway.
Also, it didn't help that you didn't give them any context, but just sent them one sentence at a time, extracted from the strings in your program. Because we've all done that, haven't we?
Hah. Yes. And as an amusing result, emails and chats are filled with errors resulting from automated translation. The service was supposed to be a crutch, but it became widely adopted and lowered the standards everywhere.
Anecdotally, I now double-check my email before sending, because Outlook likes to change what I wrote.
It's honestly pretty fascinating to watch perfect be the enemy of the good in this space. As a result of the birth of cheap-to-free machine translation of human language, the space of things available on the global market has become incredibly vast, because the good-enough stab at translation is often good enough for a dedicated hacker to make the thing work anyway.
It applies even to English-language transcriptions. Do I pay pennies a minute or $1+ a minute? It depends on the use case and my budget. If my time is expensive and I'm publishing externally (e.g. for a professional podcast or an interview for a client), I'll probably go with a human. If I just want to refer to some notes to pull out a quote or go back to check something? An ML transcription with timestamps is probably fine. Ditto if the alternative to ML transcription is nothing.
Former professional translator here (Japanese-to-English, 1986 to 2005). I have very mixed feelings about MT.
On the one hand, as you and others point out, rates for human translation have been hurt, and the quality of a lot of MT-enabled translation visible to the public is mediocre to poor.
On the other hand, MT is enabling communication among people that was not possible or practical before. A small start-up is able to find customers and suppliers in countries with different languages. People with common interests but diverse languages are able to chat and share ideas on social media. A person receives an e-mail from a long-lost relative in a language the recipient doesn’t know; MT enables them to correspond and eventually meet up. Little of that communication would be taking place if a human translator had to be found and paid for each interaction.
> On the other hand, MT is enabling communication among people that was not possible or practical before. A small start-up is able to find customers and suppliers in countries with different languages. People with common interests but diverse languages are able to chat and share ideas on social media. A person receives an e-mail from a long-lost relative in a language the recipient doesn’t know; MT enables them to correspond and eventually meet up. Little of that communication would be taking place if a human translator had to be found and paid for each interaction.
Yep. I've translated tens of thousands of words between English and Polish by now - thanks to DeepL, which speeds this up by at least an order of magnitude (and the end result is probably better too).
I even planned to translate a book only available in Polish ([this one](https://en.wikipedia.org/wiki/Perfect_Imperfection)) - I translated about 3% already, but unfortunately the author didn't consent to sharing the translation anywhere - which is annoying, because the original (Polish) text is freely available on libgen. And an English version doesn't exist. And the book is 18 years old at this point. So the potentially lost profits are negligible or nil. Meh.
As an example, as a hobby, I am interested in things like anime video compression and such. Two of the most essential sites for me in terms of resources and information have been almost entirely Chinese. And while they appreciate and respect the international users, at the end of the day, they can’t translate each and every single forum post into English or any other language that someone may need.
And also unfortunately, I don’t have the time yet to invest in learning the language enough to use those sites without assistance. Instead, I rely heavily on things like Google Translate, Yandex, and DeepL. They’ve been totally essential, to the point where a major pain point for me with my preferred browser (Firefox) has been the lack of native MT support (until very recently, which has been nice for Russian and such, but I’m still waiting for Chinese support in Bergamot).
Similarly, while I studied in Russia for a while, I’m not a native speaker, and my skills aren’t what they used to be now that I’ve been out of the country for a while. There is a lot of interesting stuff on the Russian internet (Habr, RuTracker, various articles, etc.) that I only really have the level of convenient access to that I do because of MT.
I don’t think it’s all sunshine and rainbows, certainly, as with stuff like DALL-E 2, but it definitely provides a lot of people real value and happiness. Hopefully, humanity will figure out how best to balance the positives and negatives of AI services like these.
I see that both sides (lol, sorry) of this often take pretty hardline stances on “art is art; there is no difference” vs. “these are just copy-pasting people’s hard work and can’t actually create anything themselves”. I think the value is sort of indisputable, but I worry about how it’ll affect people’s livelihoods — e.g., after playing with DALL-E 2, I think it is totally capable of replacing things like stock photos a large percentage of the time (not always, of course, but it definitely can sometimes).
I’m of the mindset that you should be able to remix, adapt, copy, etc., virtually everything in an ideal world. Maybe I am just entitled, but I don’t see why I should have to waste the man-hours recreating or reverse-engineering something simply because of a license incompatibility, for example.
Similarly, I don’t see why I shouldn’t be able to transform or straight-up copy and distribute someone else’s art if it provides people with happiness. As an example, talking about anime earlier, I think things like fansubs can provide lots of additional value to people, either through more natural translations or through extensive typesetting that companies tend not to do because it is not “necessary”. For example, those familiar with the show Bakemonogatari are probably aware of the extreme amount of dialogue displayed exclusively visually, with rapid flashing cards and such. While I think the official subtitles don’t get enough credit 99% of the time, that is a prime example where fansubs, which go in and actively mask over the text on the screen and replace it with the viewer’s native language, are very helpful. However, despite the value these things provide, it is copyright infringement.
Rationally, I understand that the emotional and protective side of people exists, but at the same time, I don’t “understand” it; I want people to use my stuff — I want people to be happy with their limited time on this planet. In that sense, I think the emotional argument for copyright does not appeal to me.
That being said, pragmatically, people need jobs. We live in a society where you must work. A society where you have bills to pay, groceries to buy, and fees for schools and hospitals. Stripping away copyright protections entirely for things like this would hurt a lot of people right now. And I am really curious how this’ll all turn out in court when the majority of these datasets are trained on things that they have no right to be using.
Will we be in a world where you can essentially bypass all copyright protections by throwing everything into TensorFlow before feeding the computer a little prompt? Or will we be in a world where many of these tools are crippled to the point of uselessness? Or will we be in a world where companies just accept that sometimes someone will take them to court and settlements will just be another business expense?
Aside from the economic issues, if I’m going to be writing some Adderall-infused essay right now, I figure I may as well rant about the other two.
The first of those is biases; while I have always been aware of this problem, it was very evident after playing with DALL-E 2 this past week. The datasets seem to be very European-centric (US, Canada, NZ, AU, etc included in that). There were a number of things that I could not get it to generate correctly or would produce wildly different results based on the prompts I fed it.
As a relatively minor, silly example: while I had no trouble getting it to produce white-washed movie posters by asking it for a “Netflix live-action adaptation of Dragon Ball Z”, I couldn’t get it to do the same for India and Bollywood. I am sure someone who has spent more time doing “prompt engineering” than I have could get better results, but the point still stands: it seems to like producing what Americans and Europeans are more exposed to.
Another example is (just guessing ignorantly here) the way things are weighted — e.g., I wanted it to produce art in the style of shows like Tatami Galaxy and The Night Is Short, Walk On Girl, and things like the album covers for Asian Kung-fu Generation. All of those are done by Yusuke Nakamura. However, perhaps rubbing salt in the wound for artists, I was unable to get the results I wanted by asking for things in Yusuke Nakamura’s style. Instead, I had much better results using “Yuasa Masaaki”, the director of Tatami Galaxy and The Night Is Short, Walk On Girl. My only guess is that Yuasa is a more common occurrence in text, at least on the English internet.
In that sense, I worry that these tools will not only replace people, but they’ll reinforce existing cultural and societal biases even more so than we already do on our own now.
Further compounding issues like that is the censorship and filtering on OpenAI; virtually anything political, violent or sexual is not currently allowed (and may never be? I figure no company wants to be associated with the potential PR disasters that come with it, like Microsoft’s Tay). This is extra problematic given how important all of those things are in art. I worry that we will lose artists and have fewer people pushing the boundaries of art.
And ironically, for a society that criticizes countries like China so fervently for censorship, we rely so heavily on giant capitalist companies which self-enforce much of the same censorship, helping further enforce the status quo and restricting marginalized minorities. I am not a fan to say the least, despite understanding the potential for violent speech, propaganda, etc.
Anyway, that’s the end of my long-winded 6 AM phone rant. Apologies for any weird formatting or incoherent thoughts!
Speaking of MTL, I have never found a satisfactory replacement for ChiiTrans. An app that shows multiple translations side-by-side, including literal dictionary translation and phonetic translation. It would even show you alternative meanings on mouse-over. Is there any more modern app like that? The western fans of Japanese visual novels were decades ahead of their time.
And I agree on the censorship angle too. Imagine if politicians had to stoop to Youtube Video Essay levels of self-censorship, talking about the parallels between badguy Germany and present day badguyism. Or about the psychological effects of naughty trafficking and childhood naughty abuse. I do not want my political discourse limited to PEGI-6 language.
> Similarly, I don’t see why I shouldn’t be able to transform or straight-up copy and distribute someone else’s art if it provides people with happiness.
Would you stop doing it if the artists said “this is making me unhappy”?
He doesn't have to - parody is inherently transformative under copyright law - but he's a decent guy, and he's kept a song or two in the crate because the original artist didn't want a parody released.
I support that, and I also support not caring, as a matter of principle. It's supererogatory; it shows that Weird Al is a good person, but not that someone who releases a parody without asking first is a bad person.
You gloss over the economic issues, but those are fundamentally at the center of it. Some copyable work can be done by semi-amateurs in their free time, and some can be paid for as a one-off contract without needing exclusivity, but most seems to require trained labor with a guaranteed return on investment.
What it comes down to is that under capitalism, a lot of art - movies, TV, video games, certain more polished professional tools, etc. - requires an artificial concept of ownership to ensure that people are able to spend time creating it, because choosing to do too much uncompensated labor is effectively choosing to starve to death.
But maybe, like me, you see the possibility of a future where we can find a way to compensate art that isn't locked into the grim realities of capitalism. Well, cool, but then political strategy has to be considered. As is, ML seems to be a technology that funnels money away from compensating creators and towards large corporations that can invest in ML and their shareholders. And in our practical reality, more money buys more speech, more speech tends to lead to more power. Thus, self interest rules the day, with the wealthiest uninterested in changing the fundamental economics systems and working to stop it.
That's why I am concerned about ML and copyright. It would be no problem in an ideal world, but in our world, it makes the status quo worse in a way that prevents progress towards that ideal.
That the translators "never saw a dime" isn't accurate. A large corpus of parallel texts is supplied by the EU, which pays its translators and then publishes the documents for free. Other parallel corpora include manuals. While there's definitely copyright on that, it doesn't lie with the translators.
Machine translation has been a curse for those who enjoy content that isn't promptly translated and instead sits there for years, if not decades. It's not uncommon to see people asking in broken English how to use certain tools (e.g. automated DeepL scripts) and then releasing their "translations" - generally edited machine translations with just some of the most glaring issues changed. This "burns" a release: real translators, who need weeks or months of draining work so that context, subtlety, and most of the meaning of the original source are not lost, won't pick it up afterwards.
It's a race to the bottom. It doesn't sound nice to say, but as things become more accessible it gets harder and harder to filter genuine content from look-alikes, asset flips, and low-effort automations - this will be familiar to anybody looking at Steam releases, for example. I wonder how it'll turn out for artwork. I want to be optimistic and think of it as an additional tool for artists and people who want to create new things, rather than a displacement wave with its significantly smaller costs.
I can concur; there was a scandal within the fan translation scene for PC-98 games when it came out that at least one high-profile "translator" was just lightly-editing machine translated output to sound correct. Problem is, there are all sorts of mistakes you can make in translation that will not sound incorrect at all unless you speak both languages.
I must ask what you mean by "burning" a release, though. If we're talking about fan translations, stuff gets retranslated all the time. And most of the stuff that gets fan translated is far too niche to justify a commercial release to begin with. If something is popular enough to get an official release, the fan translations either get taken down or disappear on their own.
Translation is not exclusively a salaried job. Most translators in the fields I'm familiar with (games and manga) are freelance contractors, and most are getting paid by the word/page, not a yearly salary.
ML has depressed the rates those people earn. It's been good for the demand for editors though since they need to pay someone to fix the DeepL output up so they can release it.
> I hope someone can engineer a new economic system that works for what's coming.
What's coming is post-scarcity for all kinds of intellectual work. There will be nothing to economize: the supply of both labor and product will be infinite. I don't want corporations engineering a made-up economy where there is none.
Unfortunately, all intellectuals also need to eat, bathe, and clothe themselves - areas where no "post-scarcity" is on the horizon (with climate change, both food and water may become less abundant). So an economic system will still need to be invented.
I predict a new system arises in which styles themselves are copyrightable and eligible for royalties.
Human-sourced styles will become the trend, as people seek originality to establish identity in an otherwise indistinguishable world of machine-driven experiences.
Good luck figuring out a style that isn't "cribbing" off another artist's style. And if that becomes legal, then ML will just be used to generate unique styles which are similarly cribbing.
There is no definition of "style" here you can rely on without causing just as much harm to human artists as you would be restricting AI art. Everything is derivative of something/someone else after all.
I'd say the patent system, and the current implementation of the copyright system, are far from perfect and commonly abused, yet they remain intact. I'm not sure your criticisms will stop governing bodies from introducing lobbyist-friendly legislation. Fair-use policies will help keep the average Joe out of prison, while also diverting revenue.
Just like copyright today, even though the individual supposedly benefits from this scheme, it's companies who have the greater stake in the outcome.
I think your idea is prescient. I can imagine a market of proprietary "styles", I suppose in the form of pre-trained networks/datasets, that can generate any number of concrete instances of images, texts, and other media.
Yes, big-name artists working with corps to create style generators for a campaign. Watermarked images which cause proprietary, legal ML scrapers to reject the sample.
Suddenly, copping someone's steez becomes a matter of revolution.
This is kind of ironic, as many artists who have a unique visual style insist their art is about the deeper meaning and not just the look. (I don't really agree that is that important to most observers)
Yes, if you were ruler of the world, you'd have an awfully huge number of people doing f.a. There are only so many hairdressers, masseurs and plumbers required.
What you need is a kill switch, something to slow down population growth.
Anyway, are you up to date with your mandatory medical treatments?
Population growth is slowing all by itself. India has reached replacement, and China is close to starting population decline. Few places outside sub-Saharan Africa are left with above-replacement birth rates, and they are dropping there too.
Community ownership. We need everyone to be part owner in the economy that they rely on. Then productivity gains are good for people even if there is less work to do.
The question that fascinates me about this whole debate is on what basis (if any) one draws the line between a machine learning from copyrighted data vs a human learning from copyrighted data.
And I do think “learning” is the correct metaphor here. There are good reasons to believe that the brain is doing something conceptually similar to generative statistical models. i.e compressing inputs (/percepts) into some kind of abstract representation space. Even if we don’t understand the specific details yet.
As far as I’m concerned, it makes no sense to prevent training on copyright data. If I use a trained model to substantially reproduce parts of a copyrighted work, feels like the issue is largely with me and what I do with the content.
I think it makes sense to have a distinction between human and machine learning.
• Human memory has natural limits in capacity, accuracy, speed, etc. Even if you wanted to, you just can't launder the copyright of all of humanity's works in the blink of an eye. Things done by humans are limited to human scale, but automation can elevate things to an industrial scale (it's legal to fart; it's not legal to have an industrial-scale sewage spill).
• Humans have rights, machines don't. Being able to learn and use knowledge is fundamental for humans, and (pre-Disney) copyright is supposed to be a balance between individual's freedom of creativity and control over their works. Machines don't have needs or rights to learn and express themselves or need to control "their" creative work. Without creator's needs in the picture there's nothing to balance, it's just a business model based on using other people's work.
It does give broader access to illustrations and artworks, and this is likely just a stepping stone to even better tools. ”Prompt engineering” is becoming a creative skill itself.
OTOH a corporation scraping artworks and making money from producing knock-offs (you can even directly name an author to copy) doesn’t feel fair to me.
I'm wondering how the best possible model at some given time in the future would not be reliant on the resources of a huge tech corporation like OpenAI and cloud organizations that have the computing power to process massive datasets like CLIP's.
One of the first links provided on the subreddit for Disco Diffusion is a page describing the effects of adding an artist's name to an image prompt. There are hundreds of artists listed on that page. How long will it take until one of those artists notices that ML art associated with their name is getting significant attention and says it makes them unhappy? How many of the deceased artists on said list, unable to give their opinion, would have objected if they were still alive?
I suspect that even if all the arguments about copyright and learning fail to make any difference in the legal system, the original artists will eventually catch on and provide their individual opinions on what they think of the entire concept. Then the issue becomes not one of copyright infringement, but disrespect of someone else's time and labor. It would become a social problem with no technological solution. You can't tell creators what they ought to feel about people generating new art by reducing their name to a weighted keyword to be fed into a corporation's million dollar ML model.
As precedent, some authors don't want people making derivative works from their content, for reasons that might not necessarily have to do with copyright. An example is not wanting NSFW content derived from the same fictional universe. In the case of copyright, sites like fanfiction.net ultimately sided with the authors who didn't want fanworks and banned users from posting them. AO3 was subsequently created in response to fanwork bans.
I'm wondering if a "no AI" clause will become part of many services' terms of service in the future, splitting them along ideological lines instead of legal ones and causing a general air of distrust to arise from the ensuing need to detect which art was not created by humans. It feels like ethical reservations about including artist keywords in prompt engineering would be stronger over on the artists' side (depending on the artist's opinions), and could cause pushback at some point. Whether or not it will make a difference remains to be seen. My guess is it won't really, because it's impossible to ban GPUs or technological progress.
> How many deceased artists on said list that are unable to give their opinion would have objected if they were still alive?
If a deceased artist has an estate that's profiting off their work, said estate will be especially vicious in trying to establish precedent that anyone running an artist's name through an image generator needs to work out some kind of licensing agreement with the artist, or get sued.
As a working artist, I am looking forward to this happening.
Humans get away with much more, such as just photographing other people's work and then selling those photographs[0]. I don't see how generating a bunch of images, selecting one, and perhaps tweaking it more is any less fair use.
> Humans get away with much more, such as just photographing other people's work and then selling those photographs[0]. I don't see how generating a bunch of images, selecting one, and perhaps tweaking it more is any less fair use.
Humans have also gotten away with murder. I don't think that kind of reasoning is very solid.
My understanding is in industries where it really matters, like music, "tweaking" like that would require royalty payments at least.
I suspect that the difference is in knowing versus unknowing use, and the deliberate choice of including learned elements or not on the part of the user.
In the case of the ML model, it can reproduce a part of a training example perfectly, without the user knowing this happened (see the early Copilot analysis wherein it was possible to get it to quote well-known blocks of code verbatim). This is much less likely to happen when a user is consulting her own memories, though obviously never say "never".
On the basis that the "machine" is not a human and there is nothing that necessitates allowing neural networks to do all the things humans are permitted to do.
I don’t think it makes sense to talk about these systems as though they have any agency (and I think I was careful not to).
Of course there is nothing that necessitates letting these systems do what humans do.
What I am saying is that I am allowed as a human to ingest copyrighted material, learn from it, form mental abstractions and use those learnings to generate new art. Why, then, would I be forbidden from externalising some of that process into an artificial system that operates on similar principles?
The trained model obviously does not have agency. It is a creative and computational tool, much like any other digital tool or product.
> What I am saying is that I am allowed as a human to ingest copyrighted material, learn from it, form mental abstractions and use those learnings to generate new art. Why, then, would I be forbidden from externalising some of that process into an artificial system that operates on similar principles?
For the same reason you aren't allowed to make unlimited photocopies of a library book.
Where I think this analogy breaks down is that these models are not simply copying / memorizing the training data. They are using the data to fit a model, highly compressing the specifics while holding on to general patterns and abstractions.
I really feel like "reading lots of books" is a better analogy. (Of course it's not perfect, since these systems can scale way past the number of books a human can read.)
This analogy doesn't hold. The human equivalent of a photocopier is scribing a copy of a book. Scribing 100 copies and distributing those is just as illegal as creating and distributing photocopies.
Usage of the model can (and will) be automated. Also, unlike you, the model scales. This is a qualitative difference in the output produced. You are left only with the superficial similarity of the input process.
The article gives several examples of AI-regenerated Starry Night, and says:
> It looks like it, but it’s not the exact same thing, it’s almost as if the AI is drawing it from memory, which in some way it is, it’s re-constructing what Starry Night looks like.
This seems to ignore that those are pretty clearly derivative works of Starry Night?
I'm no copyright expert, but I would have imagined the answer is relatively straightforward: if an image would be infringing if drawn by a human, it's also infringing if drawn by a model, and similarly for non-infringing. Does the manner in which the image is generated play into whether it infringes a copyright?
> Does the manner in which the image is generated play into whether it infringes a copyright?
Yes, it does. Let's leave AI out for a moment.
If you lock yourself into your room with no Internet access, and you draw (or write, etc.) something independently that just happens to look exactly like an already existing copyrighted work, you are still not infringing any copyright.
If you produced the same work by making a copy, you would infringe copyright.
AI isn't locked in a room. No end user can verify the sourced training images, so anything coming close to an existing work should be assumed to have been part of the source material for the derivative.
Oh, but you can. I've scanned the LAION training data set (which was used for DALL-E mini, Imagen, and Disco Diffusion), and I recognized several images that belong to business partners of mine.
How does an AI know what Obama looks like? It memorized thousands of images of him, most of which were by professional photographers and, hence, copyrighted.
Well, it probably didn't memorize any single one of them. One thing that has changed dramatically since I studied AI (a whole five-ish years ago, sigh...), to my shock, is that the number of training epochs has dropped dramatically - sometimes down to just one. If it has only ever looked at your image and updated its weights once, could that really be enough to copy it? Only if virtually everything in your image is also present in thousands of other images, which would suggest there was very little originality in your photo to plagiarize. (This is certainly the case with Obama's press photographers - political press photographers aren't hired for originality!)
> political press photographers aren't hired for originality!
Yes they almost definitely are; what else would they be hired for? You wouldn't watch the same news every day (even if it feels that way).
> the number of training epochs has dropped dramatically
So? Let's imagine a documented neurodivergent person (say, one with a photographic memory) "snapshots" a famous work and then, after 5 years of training, sits in a room to produce "some work". Are they free of copyright claims? Have you not seen the neurodivergent artist who draws the entire New York skyline from one helicopter ride?
Your approach seems nonsensical. 1 epoch or 10,000 epochs, it doesn't matter: it's still a copy / derivative work (which may exempt it, fair enough).
Personally I think all current AI work is just copyright infringement on steroids. DALL-E outputs Shutterstock logos given the right input, and GitHub's Copilot is just a lawsuit waiting to happen (assuming anyone has the funds to actually sue Microsoft these days; GitHub's T&Cs must be incredibly broad).
> Yes they almost definitely are, what else would they be hired for?
Craftsmanship. They have subtle, but probably very good, intuitions about lighting, facial expressions, etc., and also a lot of concrete knowledge about these things. All of which an algorithm can pick up on by examining thousands of images once each.
Once or many times matters, because if you only examine each data point once, there can be no overfitting in the traditional sense. And that's what shocks me about the "one epoch is all you need" results.
> Once or many times matters because if you only examine each data point once, there can be no overfitting in the traditional sense.
What makes you so sure? If my algorithm was literally just storing stuff in a hashtable to look up later, you'd get overfitting from a single exposure.
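To make the hash-table point concrete, here is a toy sketch (purely illustrative, not any real ML system): a "model" that is literally a lookup table overfits maximally after seeing each example exactly once.

```python
# Toy illustration: a "model" that memorizes each example in a hash table.
# One exposure per example is enough for perfect recall and zero generalization.
def train_lookup(examples):
    """One pass over the data; each (input, label) pair is seen exactly once."""
    table = {}
    for x, y in examples:
        table[x] = y  # single exposure, perfect memorization
    return table

def predict(table, x, default=None):
    # Perfect on training inputs, clueless on anything unseen.
    return table.get(x, default)

data = [("img_001", "cat"), ("img_002", "dog")]
model = train_lookup(data)

assert predict(model, "img_001") == "cat"  # memorized after one "epoch"
assert predict(model, "img_999") is None   # no generalization at all
```

Gradient descent over a fixed-size parameter vector obviously behaves nothing like this, but it shows why "one epoch" alone doesn't rule out memorization in principle.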
Well gradient descent doesn't do that. And the models, while big in terms of parameter data, are not nearly big enough to actually store all the training data.
Think of it in terms of updating beliefs about the target distribution. With backpropagation, you predict based on the input, and update your beliefs according to how wrong you were. So in a sense it's unsound to re-use data - your beliefs already incorporate them! And traditional overfitting is all that - it's when you use up all the information in your training data. This was many people's objection to neural nets (and I thought it was a good objection at the time, and thought myself that the future lay with more "sound" methods, which performed better on most metrics anyway at the time, rather than with dodgy biomimicry which wasn't really even similar to biological brains at all).
But yes, there are other types of overfitting if you want to get philosophical about it. It's just that the one I and everyone used to worry about, from training too much on your data, just isn't important anymore. And most of those clever principled and less-principled regularization methods just don't matter anymore!
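The "traditional" overfitting described above is easy to reproduce in a toy setting. A minimal sketch (illustrative numbers only, nothing like a real image model): an overparameterized linear model barely moves after one pass over the data, but repeated passes let gradient descent drive the training error to essentially zero, i.e. memorize the training set.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(5, 50))  # 5 "examples", 50 features: heavily overparameterized
y = rng.normal(size=5)

def train(epochs, lr=0.01):
    """Full-batch gradient descent on squared error; one epoch = one pass."""
    w = np.zeros(50)
    for _ in range(epochs):
        w -= lr * X.T @ (X @ w - y) / len(y)
    return 0.5 * np.mean((X @ w - y) ** 2)  # final training loss

loss_one = train(1)        # a single look at the data
loss_many = train(10_000)  # the classic "train too long" regime

# One pass barely changes the weights; thousands of passes drive training
# error to ~0, i.e. the model has fit its 5 examples exactly.
assert loss_many < 1e-6
assert loss_one > loss_many
```

This is the sense in which re-using data "uses up" its information: every extra pass pushes the parameters further toward reproducing the training set exactly.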
I think the problem is with the definition of copyright. It doesn't apply to the current environment. It could even be the case that it never made sense in the first place, and only now are we realizing it.
I have never seen Obama in real life. If I can draw him because I know him from copyrighted works, is my drawing copyright infringement?
Are you copying one or more of those works? If so, yes. If you aren't, then no. How do you know? Well as an artist, after thousands of hours of copying, and thousands of hours of drawing from imagination, you know which activity you are doing. As an untrained person, you might not know, but the quality/value is so low no one cares.
But drawing from imagination is just copying from an abstraction existing in your brain. And that abstraction comes from inputs from the outside that you saw in the past, where would it come from if not that?
In the Obama example it comes from copyrighted inputs. Without the photos in magazines and newspapers I would have no idea what Obama looks like. Any drawing I make is going to be derivative of those copyrighted photos and, maybe, of other more generic inputs I saw in the past (which could be copyrighted too).
How is that different from what this software is doing? It's not; it's just done at an industrial scale.
This is just the same problem an artisan had when factories started to appear.
You are using the words 'input', 'output' and 'abstraction' interchangeably between the computer version and the brain version. We know next to nothing about what those words mean regarding the brain, so declaring an equivalence seems like a substantial overreach. What we consistently do is use the terms from the latest technology to describe the brain[0]. What's particularly egregious about this iteration is that we are also using human terms to describe the latest technology, and then using those terms ('inspired by', 'creative', 'intelligence', 'learning') as if the algorithms have anything to do with what humans are doing. And then, even worse, we go on to use them as a smokescreen for piracy: "the computer 'learned', therefore it's somehow not copying the source material".
Publishers license text to print in books. Music producers license samples to use in songs. Artists license photos and textures to use in illustrations and scenes. _License the source material used in training sets_. _Then_ go out and industrialize art, everyone wins!
How does a human know what Obama looks like? They memorized thousands of images[1] of him, most of which were by professional photographers and, hence, copyrighted.
At what point is a neural network sophisticated enough that it can be compared to human learning for copyright purposes?
[1]: For most people at least. Some will have actually seen him in person, and for those this statement would of course not be true.
Wait: if you draw Pikachu or Homer Simpson from memory, even without access to the internet, you are still infringing the rights of the holders of those characters.
How is it different than OpenAI or Midjourney drawing pictures of Homer Simpson from their database/"memory" ?
Characters are different, as recognisable characters have copyright on their own (not all characters do). But let's say you're drawing a painting from memory, you may or may not be infringing copyright. The question is about substantial copying.
> If you lock yourself into your room with no Internet access, and you draw (or write etc) something independently that just happens to look exactly like some already existing copyright-ed work, you are still not infringing any copyrights.
> Does the manner in which the image is generated play into whether it infringes a copyright?
Is taking a picture of a painting enough to violate copyright? Is a digital copy of a VHS tape a copyright violation? Is copy/pasting an image a copyright violation?
A computer is not creative because it cannot think. It can only copy what others do; it doesn't understand what it's doing, only how to do something. Equating a human with a computer is unreasonable at the current state of computing and I doubt we'll see a truly conscious AI in the future.
For an automated tool to replicate art into a state such that it would no longer be a mere reproduction of copyrighted material, one would assume that the tool would gain such sophistication that the tool itself should deserve copyright rather than its operators. After all, a commissioner of an art piece may provide the prompt but the art itself is under copyright by the author.
We'll have to see how the courts look at these reconstructions. Personally, I believe tools like DALL-E and Copilot are no more than fancy copy/paste systems and should only be trained on copyright-free materials for their use not to be subject to copyright issues.
I’m also not a copyright expert, so I wonder about things like if I draw a (crappy) picture of Spider-Man and slap it on a tshirt, can I sell it without violating copyright? I’m sure there’s established law about this kind of thing.
If your shitty Spidey is a drawing you did yourself, you are not violating copyright. You are however blatantly violating Disney’s collection of trademarks on “Spider-Man”, the distinctive appearance of the character, and a host of associated marks.
I have to admit that I did not see it coming. Ignorant of AI's capabilities, I always imagined it to automate away our boring, mechanical, repetitive stuff. Which in some hopeful utopian scenario would mean that we could focus on the humanities: family, health/sports, culture and anything creative.
For AI to come for art, at the very core of our humanity (the oldest cave painting is 64K y/o) is quite the confrontation indeed. One that will rapidly escalate more broadly (poetry/writing, photography, cinema, music) and more deeply. The amount of AI art will explode, in turn serving as input for even more AI art.
The speed at which this will happen suggests to me we need to be far more alarmist in coming up with answers to many questions. Philosophical, legal, socio-economical, etc.
I really think the attitude "I'm not an artist so this will only benefit me" is shortsighted, and I'm saying that as a non-artist myself. It's like we're at the beginning of a very short and steep exponential curve of total disruption.
This is how I feel as well. I'm a software engineer by trade, but I'm really much more interested in writing music and stories. I always imagined myself retiring the day I have saved up enough money so I could focus on art and creativity. My life certainly has felt much less meaningful since I saw what DALL-E 2 can produce.
And my main concern is not that of professional artists' financial situations. I'm mainly worried about how the massive influx of computer-generated content will inflate away the meaning of human-created art. And I'm afraid that when art becomes so good and so customized to each individual consumer, we won't have much common culture left.
In a way, I'm so mad at my fellow engineers. There's so much good we could do; is really killing the creative arts the right thing to do?
And I don't believe for one second this talk about AI just being a tool, and prompt-engineering being a new craft for artists to learn. I think this kind of interface will quickly go away, and soon enough AI apps will look much more similar to TikTok or YouTube than they do today.
We seem to have some things in common. I'm also a software engineer with a creative hobby: wildlife photography.
In that space, there's already a kind of abundance problem. It's saturated. So even in the case of a "hit piece", the appreciation lasts about a few hours. A bunch of likes and some shallow comments. That's it.
If I were to base my meaning on that, I'd say it has no meaning. That's why I find meaning in my deep love for my subjects, as well as the process of photography itself. I decouple external validation from my intrinsic motivation. Few photographers do this, they crave the attention, which is why they're restless and miserable. They would surely feel even worse in an AI world where current saturation levels will do a 1000x.
The above you can probably apply to lots of other creative endeavors. For example, creative writing on a blog. You already can't get noticed today, imagine the avalanche of AI writing making this problem exponentially worse.
So we will have little shared meaning, at best personal meaning, but only for the strongly intrinsically motivated, the rest may stop altogether. On top of that, the new "creators" will also experience little meaning. You can generate the most beautiful piece of AI art, but can't seriously claim: I made that.
Computer-generated content may pose an essentially insurmountable challenge to human practitioners of most art forms, but not all of them. The forms of art that are at risk of being overtaken by ML are those in which individual works of art can be faithfully represented in some storage format (which may be anything from a video file to a piece of paper with things printed onto it), which is used to transmit the work of art to an audience via inanimate means.
Those art forms that feature the physical presence of an actual human being, such as theater, dance, stand-up comedy, musical performances, etc, will presumably remain somewhat safe from the flood of computer-generated content, and people might even flock to such art forms in pursuit of authenticity. Things like improvisational theater also add an element of genuine human reactions to the mix, which will no doubt attract some interest in the age of AI art, which has no direct human will behind it. Of course, AI could produce imitations of recordings of such performances, but not the actual physical performances themselves, and people already seem to very strongly favor actual performances over recordings.
Ironically, mass-produced AI art might conceivably cause a cultural shift from our status quo of having an abundance of inanimate mass-market art with essentially global reach to a culture favoring local performative arts (which, aside from concerts, have no real mass appeal at the moment), which would essentially foster a unique local art scene with some limited number of performers for each city that mostly stay in that city. Such a scenario wouldn't result in hyperindividualized art, quite the opposite.
I like this view. That is a future I could find meaning in. I actually spent some time during my college years doing amateur musical theater, something I didn't think I would return to, but maybe I will.
> The amount of AI art will explode, in turn serving as input for even more AI art ... The speed at which this will happen suggests to me we need to be far more alarmist in coming up with answers to many questions ...
I agree. I don't want to live in an AI-hamster-wheel future.
Human creativity is driven by emotion, curiosity, intuition... etc. AI creativity is basically an advanced form of copy-paste/blend using context -- all of which is learned from human content.
We should be working harder to empower human-creativity... which is an argument AI artists often make in favor of generative creativity... however this narrative falters when money gets cheap and automation disrupts whatever economies exist for creatives (esp writers and artists).
Brynjolfsson and McAfee made a similar argument a decade ago: we will use automation to replace rather than augment human labor, the result being a local optimum far from the global one (that's in terms of results; the dystopian disempowerment of humanity is another story).
> I always imagined it to automate away our boring, mechanical, repetitive stuff
Which includes most art produced.
Do a Google image search for something stock-photo-y like "man in business suit kicking". You'll find a staggering number of photographs and cartoons designed to be used in PowerPoint slides.
If you play around, you'll find any variation of "man in business suit" paired with an action will produce similar results. People were paid to make all of those, but it's so repetitive that it's not hard to see how an AI could find the pattern.
Copyright will need to be rethought in this new world, where artistic interpretations of copyrighted works can be mechanically produced. Personally I don't see how it can be made coherent.
It depends on what the painter does with the 'new painting'. If it is solely that the painter (considered a student) is learning how to paint, then copyright infringement would not apply, as it would be covered under 'fair dealing'.
This would apply in the UK, perhaps Europe. I'm not familiar with US law.
Quoting the Act: "Fair dealing with an artistic work for the purposes of private study does not infringe any copyright in the work."
If the painter then goes on to sell this painting, well, that's copyright infringement, because they are making a copy of a painting available to the public (assuming no licensing or authorisation from the legal copyright owner was previously obtained).
Either you or I misread the question. I think it wasn't about selling a copy. It was about using copying to learn how to paint (which is fine, as you've stated) and then produce your own painting which you then sell. The sold painting isn't a copy. So .. no copyright infringement?
Ah, I think it depends how you interpreted the question. I took 'new painting' to mean a copy of other painter's work.
Looking at it from your perspective (which on reflection is the right interpretation), I would say no copyright infringement, assuming the 'new painting' is an original creation, i.e. not a duplicate of an existing piece of art by another artist.
Plenty of people, myself included, learn creative arts by mimicking others and then developing our own style, which might be similar to the artists we got our inspiration from. I think as long as it's clear that there is no duplication of the original works (for sale), then all is okay.
So if I learnt my art by mimicking Van Gogh and my new paintings are of a similar Van Gogh style, as long as they are original creations, no copyright infringement imo.
Okay, I dislike the US copyright system immensely; I virulently yearn for reasonable terms and an end to sampling payments forevermore…
But let’s get practical here for a minute - if an ape can’t own the copyright to a selfie, then nothing generated by a machine - no matter how sentient - will legally be protected.
Just keep that in mind, because the ML results will not abide by human creation protections.
Feels like people would have said the same about images 6 months ago, tbh. (I mean, sure, you could get a fake photograph of a person, but the text control that CLIP has unlocked is just wild…)
I’m not sure I get your argument. I’m saying that it’s impossible to distinguish between machine written music and human written music.
You could maybe make the argument for distinguishing between a human and a machine playing a piece of music.
But if I write a song that just goes A, D, C, G on a loop (plenty of songs like that), how would you determine that a machine song that goes A, D, G, C is not human written?
>Moreover, the developers of these tools are aware of the potential pitfalls of producing exact replicas of art in their training datasets. OpenAI admitted that this was a problem in some of the earlier iterations of the program, and they now filter out specific instances of this happening.
This is the most telling thing. They have a system, which might not even be part of the "AI": it might simply be that they compare outputs against the original training dataset for close matches and then hide them from the end user.
I mean we've seen OpenAI's solution to their un-diverse data set so we know what standards they work to.
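We don't know what OpenAI's filter actually does; one plausible minimal sketch of "comparing outputs for close matches" is a perceptual hash. Here, a simple average hash over a toy NumPy array standing in for an image (all names and thresholds are assumptions for illustration):

```python
import numpy as np

def average_hash(img, size=8):
    """Block-average a square grayscale array down to size x size, then
    threshold against the mean to get a 64-bit perceptual fingerprint."""
    h, w = img.shape
    small = img.reshape(size, h // size, size, w // size).mean(axis=(1, 3))
    return (small > small.mean()).flatten()

def is_near_duplicate(a, b, max_hamming=10):
    # Small Hamming distance between hashes suggests the same picture,
    # even after mild noise or re-encoding.
    return int(np.sum(average_hash(a) != average_hash(b))) <= max_hamming

rng = np.random.default_rng(0)
original = rng.random((64, 64))                                   # stand-in training image
noisy = np.clip(original + rng.normal(0, 0.01, original.shape), 0, 1)  # slight perturbation
unrelated = rng.random((64, 64))                                  # different "image"

assert is_near_duplicate(original, noisy)      # perturbed copy: flagged
assert not is_near_duplicate(original, unrelated)
```

A production system would more likely compare learned embeddings than raw pixel hashes, but the principle of filtering outputs that land too close to the training set is the same.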
What's fascinating to me is how many of the people decrying what's happening as unfair to artists also brag about being software pirates, where actual human labor is going unpaid for
> Assuming that the input phase is fine and the datasets used are legitimate, then most infringement lawsuits may end up taking place in the output phase. And it is here that I do not think that there will be substantive reproduction to warrant copyright infringement.
This might be true, but it feels like the most practical answer to this whole question is to focus on the inputs and to establish ways for artists to control whether their art is used as input. Don't let legally questionable content into training, and you automatically prevent legally questionable output. Why would we do anything else, really? It's potentially a large assumption that the datasets are currently legitimate under the lens of copyright. Pulling images off the internet just because they're accessible is not valid, and humans aren't currently allowed to do that under the law. But it might be comforting to artists and practical for both artists and AI researchers if artists could specify a form of copyright or a license that allows ML training provided that their work is not recognizably reproduced. Certainly platforms like Flickr, with their Creative Commons search tools, allow a wide variety of public use that is entirely legal. What reasons do we have not to establish something similar for machine learning?
If I interpret that article correctly, at least Dall-e does not claim copyright on the generation nor does it grant it to the person writing the prompt. The copyright question seems entirely unanswered.
Perhaps that should be codified in law explicitly: AI generated art cannot be copyrighted.
If it was created from the collective input of humanity (unasked), its output may as well benefit all of humanity as well. It's hard to make the case that somebody writing a prompt deserves full copyright protection. Even if so, just add one word and one can claim it's once again an "original".
Not respecting AI copyrights would also help protect the vast area of commercial art, since AI output loses value if it cannot be legally protected. This could get hairy, though: what happens if you take AI output as input and then start manually tweaking/editing it? It would then be a mixed AI/human work.
This just says that the AI can’t own the copyright, just like how Photoshop, the software program, can’t be granted a copyright but a person using Photoshop can.
This is not what this says. This says that the US Copyright Office currently holds that human authorship is a requirement for any work to be copyrighted. They hold that the AI is a form of non-human creator, and thus all of its works are not eligible for copyright.
If this all worked in a fair way, it would be treated similarly to sampling in music, where the original artists get a royalty each time a track is used. That works when the source is clear, as in "x in the style of y" using these elements of x explicitly. However, the complexity is often too great: how can you prove fractional sampling of your work? Very tricky, as the article points out! Enforcing the fractional-sampling idea in law would also block many new forms of art.
On the other hand, small startups selling to customers who need to produce proprietary content could incur a risk with AI-generated products. That is, the customer won't want to risk their product by including potentially infringing items (as with Copilot's potential use of unpaid labour, open-source code, and other licences).
Isn't this the other half of the equation for "I host all my pictures/art on site xyz because it's free" and somewhere buried in the accepted terms of use was "you give us permission to use it for anything we want"?
I have a different but related question. Let's say I use a DALL-E-like AI to generate character models for my next game. Are they copyrightable? What would stop someone from outright ripping off the models from my game and using them in other works?
> It looks like it, but it’s not the exact same thing, it’s almost as if the AI is drawing it from memory, which in some way it is, it’s re-constructing what Starry Night looks like.
The blog post also has a picture of four slightly-different recreations of a Van Gogh painting.
The thing is, under current copyright law in the US, all four of those recreations would be legally considered the same image. The standard for copyright infringement is access plus substantial similarity. It would be easy to construe access to a training image set through ML software that was trained on that set; and thus any regurgitated output would be infringing.
I don't know if "copyright infringement" is really a useful legal construct here. The issue is not so much that a person's work is being duplicated, but that a person's work is being used to reduce the value of their labour without consent or compensation. It probably falls more under unfair competition. Ultimately, though, this issue probably requires new constructs and laws, which is unlikely to happen. The law is the ultimate expression of the power of the dominant class, and these sorts of AI/ML systems benefit the socially dominant class of business owners and investors.
Does anyone have any thoughts on how this might relate to the Marvin Gaye estate vs Robin Thicke and Pharrell Williams "Blurred Lines" case? I know music has very different copyright rules to visual images but that case springs to mind as it was a situation where infringement was judged to have occurred on much looser grounds than previously required.
Could a similar expansive judgement happen in the case of AI art and shift the goalposts so to speak?
Independent of whether one thinks AI generated images are really art, I feel like they are getting very good at simulating "inspiration" and other indirect ways of copying other artists and their styles. I think the last thing we need to do is start incorporating copyright and other filters into these models to filter out content.
This just empowers others to control and monopolize content, worse than today.
It's more difficult to trademark something like style; there are certain rules about what can be trademarked. Long story short, non-graphical representations need to be clear, precise, self-contained, easily accessible, intelligible, durable and objective.
Trademark also gives a more limited protection in some ways.
I did a quick search, and, for example, Getty does not allow its pictures to be used for ML.
Anyway, I think it's pretty clear that it usually is not allowed, except where a user uploads a picture to a platform and gives all rights away because the platform says so in the EULA.
I think it’s clear that what’s happening is (mostly) legal, but the better question is whether it should continue to be. It will be a bad outcome if a few people profit off of a huge group of people’s labor without their consent and without compensating them, in order to put that large group out of work.
Technological advances change the labor market all the time, but it's absolutely different when your own labor is exploited in this way specifically to put you out of a job.
I think if someone wants to build a research ML model, it should be fine to use whatever data you can get your hands on. But if you want to make a commercial model, you should have to have the content creators’ permission to use their data to train your product.
The cat's out of the bag. Stable Diffusion runs on an M1 Mac, is competitive with DALL-E, and will soon be open source. Midjourney was made independently from DALL-E and is arguably better. This is the most revolutionary artistic tool to ever exist, and anyone with a few thousand bucks can download a script from GitHub and train it themselves. My advice to creatives is to figure out how to thrive in this new environment, because you aren't going to be able to hold back progress or stop time.
(Don't worry, us programmers will be next. Language models can already generate training data in the form of tests for code-authoring neural nets.)
The creative process, like many kinds of fields applicable to business is about problem-solving. AI text-to-image generation doesn't replace that function, however it does form an excellent tool, especially when it comes to rapid conceptualisation.
This will allow more people to be creative problem solvers without needing to possess technical skills in image creation, much in the same way that graphics apps allowed people to make images without needing to learn studio or art skills, or DTP tools allowed more people to publish without the tedium and high setup costs.
I will still be hiring illustrators and designers, and this may be one of their tools, and it would be their responsibility to be experts in it, but it doesn't replace them - it makes them better illustrators and better artists.
The right way to think about this is not that it shrinks the field, rather it opens it and accelerates it - for that it's a very welcome addition.
No creative is scared of this - they're looking forward to the next-generation approach, and it's clear that 2D images are not the end point. Soon we'll have 3D (already in progress), soon we'll have music, soon we'll have this for animation and programming.
One can easily imagine all sorts of ways this technology could be implemented. Take a simple example of a video game: imagine a horror game that hones in on what scares the player and delivers further upon that.
It's not just the model publishers that profit from generative models, the users gain something as well. And the access barrier being lowered, now many more people can benefit. Even the artists can benefit.
Let's not forget the contribution of the prompt to the final result. The users have a finger in there too. That means they are directly involved in the creative process, they deserve a claim to the copyright.
Training neural nets is a novel, transformative process. It only takes gradients for each training example; it doesn't save any image, just takes the gradients and sums them up, overlapping the influence of all the examples and mixing everything together in a way where we can't tell which feature comes from which training example. It's more like inverting the dataset into abstract principles and then generating novel examples top-down.
It's like learning, but differs in that humans can't possibly see that much data. It's not like copying, but it can generate copies of training examples as an edge case. It feels like stealing, but you can't assign who the victim is; maybe it draws a bit from multiple sources of inspiration. It's very creative and inspires all sorts of idea juxtapositions, but some people consider it mere parroting of known styles.
In the end I think we will want to use the new toys so we will allow training on anything. Maybe a "Do-not-use-for-training-neural-nets" flag will be observed by the conscientious labs.
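The "sums up the gradients" point can be made concrete with a toy linear model (illustrative only, nothing like a real diffusion model): the per-example gradients add up to the batch gradient, so every example's influence is mixed into a single parameter vector and no training example is stored as such.

```python
import numpy as np

rng = np.random.default_rng(42)
X = rng.random((4, 3))  # four tiny "training examples" with 3 features each
y = rng.random(4)       # their targets
w = np.zeros(3)         # model parameters: all that is ever kept

def grad(w, x, t):
    """Gradient of the squared error 0.5 * (w.x - t)^2 for one example."""
    return (w @ x - t) * x

# Per-example gradients summed up...
summed = sum(grad(w, x, t) for x, t in zip(X, y))

# ...equal the gradient of the total batch loss: the contributions of all
# examples overlap in one update, and none can be read back individually.
batch = X.T @ (X @ w - y)

assert np.allclose(summed, batch)
```

Of course, summed influence doesn't preclude memorization outright (a model can still regurgitate an example as an edge case), but it does show why attributing a feature of the output to any one training image is generally ill-posed.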
If I've legitimately obtained a copy of your work, then I'm entitled to use it, read it, even make my own copies for personal use. Any new work I attempt to distribute would be measured against your original to determine infringement.
The only thing that should worry people is the fact that it is cheap and bad, much like machine translation: people will come to rely on it, downgrading the baseline of art for companies forever.
> I prompted DALL·E for “Crumbling ruined city, sci-fi digital art by Simon Stålenhag”, and it produced a few images that were very much in the style of Stålenhag.
The image shown has some concepts in common with Stålenhag, but it has a lot more depth. I can't find any Stålenhag piece that evokes things the same way.
In Stålenhag's work, or what of it is easily accessible online, there are some recurring gimmicks, which are absent in the DALL-E work, or present in a disguised form.
DALL-E doesn't juxtapose any mint-condition 1970s Volvos or Saabs against a background showing a menacing superstructure from the distant future. It doesn't invoke any cheesy mechanical monsters or characters inspired by popular sci-fi.
Stålenhag lays out everything plainly. In about a millisecond, you see what you're supposed to see: the easily recognizable elements like ordinary people or retro objects, against fantasy objects, set in some nice lighting: atmospheric effects like haze, sunset or whatever that completes the mood.
In the DALL-E image there are some old objects, but they are not instantly seen, and they are in worn condition. For instance, what looks like the wreck of an airliner appears in the rubble in the foreground, but you might not see it right away. That makes the scene plausible. It is more of a spooky glimpse at something that could actually someday be, somewhere.
It has the DALL-E weirdness in it. Why, for example, do the cables on what looks like a telephone pole seem to connect to elements in the distant background? DALL-E seems to have perpetrated an impossible object (https://en.wikipedia.org/wiki/Impossible_object) there, but it's not wham-in-your-face; not everyone will spot it right away.
There is what looks like a channel of water through the debris, but what looks like a reflection in it is so weirdly wrong that you can't make up your mind whether it's a weird local reflection or a glimpse through an aperture in the bottom of a floating sky structure, where a distant reflection is shining up from some ocean below.
Stålenhag's work doesn't have any of these sorts of demented optical labyrinths in store for the viewer. He's like a Norman Rockwell*, targeting the nostalgia of the viewer who grew up on a diet of 70s, 80s and 90s anime and sci-fi and is longing for the 20th century.
--
* E.g. check out the cheesy fromage called "The Long Way Home". Rockwell plus Storm Troopers. This just screams "Hey, you look transparently like a Generation X geek; would you be interested in buying a limited edition print of me?"