Hacker News
Adobe Firefly: AI Art Generator (adobe.com)
908 points by adrian_mrd 6 months ago | 483 comments



What this reinforces is that unlike with previous big innovations (cloud, iphone, etc), incumbents will not sit on their laurels with the AI wave. They are aggressively integrating it into their products which (1) provides a relatively cheap step function upgrade and (2) keeps the barrier high for startups to use AI as their wedge.

I attribute the speed at which incumbents are integrating AI into their products to a couple things:

* Whereas AI was a hand-wavey marketing term in the past, it's now the real deal and provides actual value to the end user.

* The technology and DX of integrating with products from OpenAI, Stable Diffusion, etc. are good.

* AI and LLMs are capturing a lot of attention right now (as seen easily by how often they pop up on HN these days). It's in the zeitgeist, so you get a marketing boost for free.


I think you're missing a fundamental reason: adding AI functionality into products is simply easier.

That is, these companies are largely not doing the hard part, which is creating and training these models in the first place. The examples you gave of cloud and the iPhone both have huge capital barriers to entry, and in the iPhone's case, other phone companies just didn't have the unique design-talent combination of Jobs and Ive.


>That is, these companies are largely not doing the hard part, which is creating and training these models in the first place.

fyi, for Adobe Firefly, we are training our models. From the FAQ:

"What was the training data for Firefly?

Firefly was trained on Adobe Stock images, openly licensed content and public domain content, where copyright has expired."

https://firefly.adobe.com/faq

(I work for Adobe)


I am definitely glad to see attention being paid to ethical sourcing of training images but I am curious: did the people who made all those stock images get paid for their work being used for training? Did they check a box that explicitly said "Adobe can train an AI on my work"? Or is there a little clause lurking in the Adobe Stock agreement that says this can be done without even a single purchase happening?


I'm not a lawyer and I don't work for Adobe. :)

The contributor agreement linked from here[1] is this: [2]

"You grant us a non-exclusive, worldwide, perpetual, fully-paid, and royalty-free license to use, reproduce, publicly display, publicly perform, distribute, index, translate, and modify the Work for the purposes of operating the Website; presenting, distributing, marketing, promoting, and licensing the Work to users; developing new features and services; archiving the Work; and protecting the Work. "

I guess this would fall under the "developing new features and services".

What is funny is that "we may compensate you at our discretion as described in section 5 (Payment) below". :) I like when I may be compensated :)

And in section 5 they say: "We will pay you as described in the pricing and payment details at [...] for any sales of licenses to Work, less any cancellations, returns, and refunds."

So yeah. Sucks to the artist who signed this. They can use your work to develop new features and services, and they do not have to pay you for that at all, since it is not a sale of a license.

1: https://helpx.adobe.com/stock/contributor/help/submission-gu...

2: https://wwwimages2.adobe.com/content/dam/cc/en/legal/service...


The terms seem like legalese for "you pay me money now and get to do anything with it". It doesn't seem far-fetched for training AI models to be a valid use-case. This is way better than scraping the whole internet, for art by artists who have had no commercial arrangement with Adobe.


> They can use your work to develop new features and services, and they do not have to pay you for that at all, since it is not a sale of a license.

And in this case, to develop new features and services that specifically undercut your existing business, viz. selling stock photos for money. Sucks to the artists, indeed.


That is basically the same boilerplate legal release for photography/videography/audio works in general. You sell me this thing, and I can do whatever I want with it.

From the popular Getty Images, used by media worldwide:

> When a customer uses one of your files, it may appear in advertising, marketing, apps, websites, social media, TV and film, presentations, newspapers, magazines, books, or product packaging, among many other uses.

https://www.gettyimages.com/workwithus


I'm starting to think that use of works in a training set is a category not covered well by existing copyright law, and it may be important to require separate explicit opt-in agreement by law (and receipt of some consideration in return) in order to be considered legitimate use.

The vast majority of copyrighted works were conceived and negotiated under conditions where ML reproduction capabilities didn't exist and nobody knew what related value they were negotiating away or buying.


This kind of abstract copyright regime of 'I had the idea first, and anyone who uses a derivative of my idea must pay me money!' is a very slippery slope toward anyone who makes art or music of any genre needing to pay a royalty to Sony/Disney, because that is where these 'flavor copyrights' will end up going. The right kind of ambitious amoral lawyer in a common-law regime will leverage an AI royalty law into a generic style copyright law, because that is what will be needed to write this law properly.

And on top of that, it will become a Spotify where each creator gets a sum total of $0.00000000001 per AI their media item was trained on, and maybe a few dollars a month, while paying a greater tax to Apple-Sony-Disney whenever their AI style recognizers charge you a royalty bill for using whatever bullshit styles they notice in your media items.

Copyright should stay in its 'exact duplication' box, lest we release an even worse intellectual property monster on the world.


> This kind of abstract copyright regime of 'I had the idea first, and anyone who uses a derivative of my idea must pay me money!' is a very slippery slope

This is a cartoon conception of both existing copyright and the proposal I described.

Copyright applies to works, not "ideas" (William Gibson doesn't have a copyright claim on the "idea" of a virtual reality but he does on _Neuromancer_ as a work). It gives creators incentives and a stake in their work by allowing for some degree of control in how works are used and especially how they're distributed.

What I'm proposing isn't that different, and it's simple:

People should be able to decide if their work is put into a training set, and what they get in return for it.

That's it. No "flavor copyrights." No ownership of ideas. Just a claim on the creator's work and how it's used/distributed. Specifically updated to cover a case that didn't exist 10 years ago, a case whose explicit intention is reproduction of not just one narrow aspect of the work but a combinatorially large number of aspects of the work (there's no other reason for adding it to the training set; that's the nature of these models).

> it will become a spotify where each creator gets a sum total of $0.00000000001 per AI their media item was trained on

"the artists will get such a small payout so we should make sure they get nothing" is quite the take.

I think "artists should be able to negotiate what consideration they want in return for their work for a training set" is a better one. It's certainly better than "people collecting training data should be able to take anything they can get their digital hands on without any consideration other than possession." Maybe some artists would take a nanocent per use. Maybe they wouldn't. That should be up to them.

Part of the reason why services like Spotify are so terrible economically is the way digital rights were assumed by labels and streamers often as extensions of older pre-digital conceptions without much in the way of negotiation by artists. The current moment is a chance to do better across an equally large (if not larger) technological shift.


Derivative works or remixes usually require a license. Artists could very reasonably argue that AI-generated images are derivative works from their own images - especially if there is notable similarity or portions appear to be copied. They could also point out that their images were used for commercial purposes without permission and without compensation to generate works that compete with their own.

For example, even a short sample used in a song usually has to be licensed. Cover versions of songs may qualify for a compulsory license with a set royalty payment scale.

However some reuse (such as transformative use, parodies, or use of snippets for various purposes, especially non-commercial purposes) may be considered fair use. AI companies could very reasonably argue that use of images for training AI models is transformative and qualifies as fair use, that no components of the original images are reused in AI-generated images, and that AI-generated images are no more infringing than human-generated images which show influences from other artists.

Absent additional law, I expect the legal system will have to sort out whether AI-generated images infringe the copyright of their training images, and if so what sort of licensing would be appropriate for AI-generated (or other software/machine-generated) images based on training data from images that are under copyright.


I propose that it is impossible to prove that any content created after 2022 did or did not utilize ML/AI during the process of its creation (art, code, music, audio, text). Thus, anything produced after 2022 should not be eligible for copyright protection. Everything pre-2022 may retain the existing copyright protection but should be subject to extra taxes on royalties and fees, given the exorbitant privilege.

Though this sounds extreme, enforcing the alternative would break any last remnant of human privacy. It would kill the independent operation of computing as we know it and severely cripple AI/ML research when we need it most: human alignment.

It is possible that a catastrophic event occurs and halts the supply chain of advanced semiconductors in the near future, in which case the debate can be postponed indefinitely.


> AI companies could very reasonably argue that use of images for training AI models is transformative and qualifies as fair use

Yep. They could. And people do.

But this particular use is entirely new, though. Old conceptions of fair use can't and shouldn't cover it. "Transformative"? Sure, but there's a difference between transforming a work once and laying hold of it in automation to be transformed on request millions or billions of times in degrees ranging from simple to convoluted. It's hard to argue that the right to this exists in fair use when awareness of the possibility didn't exist when fair use conceptions were constructed.

The output isn't the issue so much as the consent & consideration for use of the input. We'll need case law or statutory law that understands this. It should be possible, but it should be possible on an opt-in basis. When you're training a model, you should know.

> AI-generated images are no more infringing than human-generated images which show influences from other artists.

If we end up with a policy differentiating the two and privileging humans and works created by them, I'm fine with that.


We are working on a compensation model for Stock contributors for the training content, and will have more details by the time we release.

The training is based on the licensing agreement for Adobe contributors for Adobe Stock.

(I work for Adobe)


I would be very, very interested to see a compensation system that took into account the outputs of the trained model - as in, weights derived from your work are attributable to X% of the output of this system, and therefore you are due Y% of the revenue generated by it. It sounds like Adobe is taking seriously the question of artist compensation, and I'd love to see someone tackle the "Hard Problem" of actual attribution in these types of systems.


I've looked a few times, but have not seen any research on assigning provenance to the weights used in a particular inference run. It's a super interesting space for a bunch of reasons.

But the naive approach of having a table of how much each individual training item influenced every weight in the model seems impossibly big. For DALL-E 2's 6.5B parameters and 650m training items, that's about 4.2 quintillion (4.2×10^18) associations. And then you have to figure out which weights contributed the most to an output.

I would love to see any research or even just thinking that anyone's done on this topic. It seems like it will be important in the future, but it also seems like a crazy difficult scale problem as models get bigger.
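
To make the scale concrete, a quick back-of-envelope (the one-score-per-(weight, item) layout and the 4 bytes per entry are just my assumptions about the naive table):

    params = 6.5e9            # DALL-E 2 parameter count
    items = 650e6             # training items
    assocs = params * items   # one influence score per (weight, item) pair
    print(f"{assocs:.2e} associations")                     # ~4.22e+18
    print(f"{assocs * 4 / 1e18:.0f} exabytes at 4 B each")  # ~17 exabytes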


https://arxiv.org/abs/1703.04730

> How can we explain the predictions of a black-box model? In this paper, we use influence functions -- a classic technique from robust statistics -- to trace a model's prediction through the learning algorithm and back to its training data, thereby identifying training points most responsible for a given prediction.
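
For intuition, here's a minimal sketch of that influence-function recipe on a model small enough that the Hessian is tractable - L2-regularized logistic regression on toy data. This is nothing like diffusion-model scale; it just shows the mechanics of scoring training points by -grad_test^T H^-1 grad_i:

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 200, 5
    X = rng.normal(size=(n, d))
    y = (X @ rng.normal(size=d) + 0.5 * rng.normal(size=n) > 0).astype(float)
    lam = 1e-2  # L2 regularization keeps the Hessian invertible

    def sigmoid(z):
        return 1 / (1 + np.exp(-z))

    # Fit weights with Newton's method
    w = np.zeros(d)
    for _ in range(50):
        p = sigmoid(X @ w)
        grad = X.T @ (p - y) / n + lam * w
        H = (X.T * (p * (1 - p))) @ X / n + lam * np.eye(d)
        w -= np.linalg.solve(H, grad)

    # Influence of up-weighting training point i on a test point's loss:
    # I(i) = -grad_test^T H^{-1} grad_i
    p = sigmoid(X @ w)
    H = (X.T * (p * (1 - p))) @ X / n + lam * np.eye(d)
    x_test, y_test = rng.normal(size=d), 1.0
    g_test = (sigmoid(x_test @ w) - y_test) * x_test
    per_example_grads = (p - y)[:, None] * X
    influences = -per_example_grads @ np.linalg.solve(H, g_test)
    print(np.argsort(-np.abs(influences))[:3])  # most influential training points

The catch is the Hessian solve: at billions of parameters you need approximations, which is exactly the scale problem raised upthread.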


This would be an interesting way to distill/finetune a model by finding the optimal minimum corpus needed to match a given use case.


Oh thank you! Will go read and digest.


Could you not use the tags used to label the image? If your image contains more niche tags that match the user input, your revenue share will be higher. Depending on how much extra people earn for certain tags, it might incentivise people to upload more images of what is missing from the training data.


That's interesting, but I'm not sure it works. I think that works out to "for any given prompt, distribute credit to every source image that has a keyword that appears in the prompt, proportional to how many other source images had that same keyword".

If I include the tag "floor", do I get some (tiny) percentage of every image that uses "floor" in the prompt, even if the bits from my image did not end up affecting model weights much at all in training?

Worse, for tags like "dramatic lighting", it's likely that the important source images will depend on the other words in the prompt; "sunset, dramatic lighting" will probably not rely on the same weights or source images as "theater interior, dramatic lighting".

And then you get the perverse incentives to tag every image with every possible tag :)

I'd love to be convinced otherwise, but I'm not seeing prompt-to-tag association working.
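
To make that concrete, here's roughly what the tag-proportional scheme works out to (toy library, made-up $1 payout per generation). Note how the "floor" images capture half the payout on a sunset prompt just by sharing one generic tag:

    def distribute_credit(prompt_tags, library, payout=1.0):
        """Split one generation's payout across source images,
        proportional to how many of their tags appear in the prompt."""
        matches = {img: len(prompt_tags & tags) for img, tags in library.items()}
        total = sum(matches.values())
        if total == 0:
            return {}
        return {img: payout * m / total for img, m in matches.items() if m}

    library = {
        "sunset_photo":  {"sunset", "dramatic lighting"},
        "kitchen_photo": {"floor", "interior"},
        "tile_closeup":  {"floor"},
    }
    print(distribute_credit({"sunset", "dramatic lighting", "floor"}, library))
    # {'sunset_photo': 0.5, 'kitchen_photo': 0.25, 'tile_closeup': 0.25}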


The tags could be added by a model rather than the user submitting the image. Maybe do both and verify the tags with a model? Users could get a rating based on how reliably they tag their pictures and are trusted to add more niche tags at higher ratings. You could even help tag other pictures to improve your rating.


> And then you have to figure out which weights contributed the most to an output.

Why do you have to, though? What do you hope to trace back to exactly?


The business problem is “which items from the training corpus contribute the most to this inference”.


That is impossible. You might be able to do it if you invented a completely different method of image generation, but the amount of original images present in a diffusion model is 0% with reasonable training precautions, and attributing its weights to any particular input is nearly arbitrary.

(Also, it's entirely possible that eg a model could generate images resembling your work without "seeing" any of your work and only reading a museum website describing some of it. Resemblance is in the eye of the beholder.)


I think the naive approach of just dividing revenue equally across all contributors could be acceptable, and would have lower overhead costs.
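
Back-of-envelope on the equal split, with a made-up revenue pool (the image count is the ~313M Adobe Stock figure cited elsewhere in this thread):

    pool = 10_000_000      # hypothetical annual contributor pool, USD
    images = 313_000_000   # approximate number of contributed images
    print(pool / images)   # ~0.032 -> about 3 cents per image per year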


Thanks! I am delighted to know that Adobe's got plans on that front.


What about the openly licensed content you mentioned above?


Hope I can get better rates from this than those offered by Spotify to my musician friends. How much are we talking about? 0.00001¢ per licensed generated image?


Nobody owes creators who have been paid fully for their work "extra" compensation just because AI is involved. Assuming they have been paid, the work belongs to Adobe.


The prior deal was based on royalties for use. Adobe pays you 33 percent of anything they make. It is consignment. So if someone licenses a specific photo for $20, you get paid $6.60; no money is paid upfront.

So what should Adobe pay you for using the data in training? Some kind of fraction of the overall revenue they generate from the new product? The license currently used for their stock program makes it seem like they don't have to pay anything at all, because this use case wasn't understood previously. Adobe reserved the rights to do it, so legally they can - but if they want to continue getting contributions they will need to figure out some kind of updated royalty-sharing agreement.
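
For concreteness, the consignment math as described above (the 33% rate is from the current program; the $0 for training use is my reading of the license, not anything Adobe has committed to):

    ROYALTY_RATE = 0.33

    def payout(sale_price):
        """Contributor's cut on an ordinary stock-license sale."""
        return round(sale_price * ROYALTY_RATE, 2)

    print(payout(20.00))  # 6.6 -> $6.60 on a $20 license
    # Training use involves no "sale of a license", so under the
    # current agreement the contributor payout is simply 0.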


> Assuming they have been paid

That is the question we are asking, yes. Based on the reading of the contributor agreement it sounds like Adobe doesn’t have to pay a cent to the creators to train models on their work.

Does that sound fair to you?


Some very strange responses in this sub-thread.

When the agreement was signed no one was even able to imagine their work being used for AI. As far as they knew they were signing a standard distribution agreement with one particular rights outlet, while reserving all other rights for more general use. If anyone had asked about automated use in AI it's very likely the answer would have been a clear "No."

It's predatory and very possibly unlawful to assume the original agreement wording grants that right automatically.

The existence of contract wording does not automatically imply the validity of that wording. Contracts can always be ruled excessive, predatory, and unlawful no matter what they say or who signed them.


> If anyone had asked about automated use in AI it's very likely the answer would have been a clear "No."

Maybe. Maybe not. Very clearly there is a price point where it could be worth it for the artist. Like if Adobe paid more for the rights than they reckon they will ever earn in a lifetime or something. But clearly everybody would have said “no” at the great price point of 0 dollars.


And I am sure that you use your computer for work, to make money, and yet, based on the reading of the [purchase] agreement, it sounds like [the computer buyer/you] doesn’t have to pay a cent to the [computer creator] for all the money you make using that computer.

Does that sound fair to you?

See how stupid that sounds?


It sounds stupid because it's a completely different thing.

A tool maker does not have a claim on the work made with a tool, except by (exceptionally rare) prior agreement.

Creative copyright explicitly does give creators a claim on derivative work made using their creative output.

That includes patents. If you use a computer protected by patents to create new items which specifically ignore those patents, see how far that gets you.

I expect you find this inconvenient, but it's how it works.


I don't know if there is a concept in copyright that prevents someone from viewing your work.

Like, if you created a lovely piece of art, hung it on the outside of your house, and I was walking on the sidewalk and viewed it. I would not owe you money and you would have no claim of copyright against me.

Copyright covers copying. Not viewing.

So an AI views your art, classifies it, does whatever magic it does to turn your art into a matrix of numbers. The model doesn't contain a copy of your art.

Of course, a court needs to decide this. But I can't see how allowing an AI model to view a picture constitutes making an illegal copy.


> Of course, a court needs to decide this. But I can’t see how allowing an AI model to view a picture constitutes making an illegal copy.

Memory involves making a copy, and copies anywhere except in the human brain are within the scope of copyright (but may fall into exceptions like Fair Use.)


And this is where it becomes tricky: if holding a copy in memory so software can "use" it is against copyright, that also prevents it being displayed on Adobe's website, or Adobe shrinking it to fit on an advertising collage for Adobe Stock images, etc., all of which the artist would say is fine, as otherwise their work cannot be advertised for them to make money through Adobe's service.

Supposing the AI said "here's the picture you asked for, btw it's influenced most by these four artists and here's links to their works on Adobe Stock"... is that better (or worse)?


> Creative copyright explicitly does give creators a claim on derivative work made using their creative output.

No actually, not for this situation. They don't if they sold the right to do that, which they did.

> except by (exceptionally rare) prior agreement

Oh ok. So then, if in situation 1 and situation 2 there is the exact same prior agreement on the specific topic of whether you are allowed to make derivative works, then the situations are exactly the same.

Which is the situation.

So yes, the situations are the same, because of the same prior agreement.

That's why the situation is stupid. The creator sold away the rights to make derivative works. Just like if someone sold you a computer.

And then people used the computer, and also used the sold rights to make derivative works of the art, because both the computer and the right to make derivative works were equally sold.

> which specifically ignore those patents

Ok, now imagine someone sells the rights to use the patent in any way that they want, and then you come along and say "Well, have you considered that if the person didn't sell the patent, this would be illegal?"

That wouldn't make any sense to say that.


The creators of these images assigned the rights to adobe, including allowing Adobe to develop future products using the images. So yes, this is perfectly fair.

It's completely different than many (most?) other companies, which are training on data they don't have the right to re-distribute.


> So yes, this is perfectly fair.

I think you are making a jump here. I’m not a lawyer, but your first sentence seems to be about why it is legal. And then you conclude that that is why it is also fair. I’m with you on the first one, but not sure on the second.

The creators uploaded their images so Adobe can sell licences for them, and they get a share of the licence fees. Just a year ago, if you had asked almost anyone what “using the images to develop new products and services” means, they would have told you something like these examples: Adobe can use the images in internal mockups if they are developing a new iPad app to sell the licences, or perhaps a new website where you can order a t-shirt print of them.

The real test of fairness, I think, is to imagine what would have happened if Adobe rang the doorbell of any of the creators and asked whether they could use their images to copy their unique style and generate new images. Probably most creators would have agreed on a price. Maybe a few thousand dollars? Maybe a few million? Do you think many would have agreed to do it for zero dollars? If not, then how could that be fair?


No that isn't how it works with Adobe or any of the other big stock photo companies. The photographers or creators of the images still own the copyright. Both with rights managed and royalty free they aren't assigning rights to anyone else.


> Both with rights managed and royalty free they aren't assigning rights to anyone else.

Have you read the contributor agreement? That seems to contradict what you are saying.


The word "assign" doesn't appear at all


ChatGPT trained on my GitHub code and I wasn't paid anything at all. Is that preferable?


"Since someone screwed me, they should screw you"? Crab bucket mentality?


No one is screwing anyone. I also learned from reading GitHub code and continue to do so. I haven't "screwed" the original code authors by forming an abstract memory of their code.

Any country that tries to forbid this will be torpedoing their economic competitiveness against countries that allow it.


This is a common fallacious argument.


What is "this" ?


It might also be wrong, yes. I have plenty of code licensed under very permissive licenses that still requires attribution. It is an open question of how much the AI system is a "derived" work in a specific, technical sense. And it probably will remain hard, since the answer is probably on a continuum.


> Is that preferable?

I don’t see why you are asking this. Which part of my comment made you think it is preferable?


If they signed the contract, then yes.


I'm not sure it's true that creators are owed nothing further... It seems analogous to a musician signing over rights for one thing, like recording rights on wax disks, records or whatever. Then along comes radio, after the artists signed away a smaller set of rights. The radio companies claim that they owe the artists nothing. But is that true?

And that's a different question from whether or not they deserve extra compensation. Is it moral or ethical to use their work to directly undercut them via ai 'copying' their work?


Heh Heh Heh

Half the problems with music are because of record companies magically inventing new ways to try and extract more money from each other and their supply chain.

"Oh, your band looked at some hookers they passed on the way to the recording studio? Well, they obviously owe those hookers a cut of the royalties now for inspiration..."

Trying to use AI as an excuse to be paid a 2nd time (for previously fully paid works) seems like another attempt at rent seeking in a similar manner.


Switch "creators" with "musicians" and "AI" with "streaming" and I we feel like we went back to early 2000. Same arguments.

Adapting what Steve Albini said at the time: If you're an artist, negotiate your future "AI training rights" separately. Make sure they're in writing. Make sure attribution rights stay yours.


Defining how “fully” paid a creator has been is the entire point of license agreements. It defines the extent of how much the rights have been purchased away from them.

It merits investigation whether these creators have been “fully” paid to the extent that they have no claim to any future royalties and can have no objection to their work being used as training data.


I'm sure that's a no. When you license a stock image you license it for any use whatsoever. You don't get to complain if it becomes the background to a porn movie or an advert for a product or person you despise. Songs can be licensed on a case-by-case basis, but images are so plentiful as to be a commodity.


Not quite. For example, this is one thing Adobe says in their FAQ: Images showing models can't be used in a manner that the models could perceive as offensive. (For example, avoid using images with models on the cover of a steamy romance novel or a book about politics or religion, etc.)

There are also a few other more Adobe-specific restrictions.


Well, Adobe are publishers of the stock images so they can enforce such terms. What I mean is that the person selling a stock image to Adobe, Getty, Shutterstock or whoever has no say on what happens to it afterward. The publisher is going to assert universal rights in exchange for cash. If they narrow their own terms as in your example above, they don't necessarily create any sort of duty to the seller whose photo rights they purchased.

I should have been clearer and specified sale of rights rather than licensing of a stock image in my original reply, which I now realize is confusing.

Companies certainly can offer more narrow terms/commitments about what the stock images can or cannot be used for if they want to, but economically they're incentivized to maximize their own freedom, and not to negotiate a complex bundle of rights for individual images. Of course for journalistic and artistic images that are unique in some way, the original creator can negotiate.


That's not what happens when you sign a contributor license with Adobe, Getty, Shutterstock or other stock image libraries... These companies pay you 0 until the time a customer of theirs acquires a license to reproduce the image you contributed. Only at that point do you get paid a percentage of the license fee. Currently nobody has been paid anything by Adobe for their stock images used in the training (at least as far as I know).


Sure, but once you sign you're agreeing to whatever terms are in force at the time of signing. I would guess that Adobe granted itself universal rights to do whatever they think will bring the customers in, but I haven't looked at one of their contracts.


It's probably not very common but people can and will absolutely bring lawsuits if you misrepresent them based on stock photography including but not limited to manipulating the photographs. (Of course, people can sue for whatever reason but using random stock photos for "bad people" can get you in trouble.)


Simply untrue, legally and socially


Thanks for correcting my bad assumption, appreciated.


Public domain content. So if I host a large cluster of AI-generated lighthouses that slightly look like cocksOnTheRocks, Firefly will suggest that after crawling and retraining?

What a time to be alive.


Ha, I didn't even need to read the article to assume this!

I instantly thought of how bitchin' their library of images must be.

Can you tell us how many images/size of set?


This is why I love this website. Mostly civil communication between smart people.


Sure but are you standing on the shoulders of Stable Diffusion or not?

Fine tuning Stable Diffusion with your own images is way easier than creating Stable Diffusion in the first place.

If you're creating your own I stand corrected and that's some serious investment.


Stable Diffusion 1.x isn't original work either; it uses OpenAI CLIP.

But training your own is pretty doable if you have the budget and enough image/text pairs. Most people don't have the budget, but at least Midjourney and Google have their own models.


Adobe Firefly is not based on Stable Diffusion.

(i work for Adobe)


Where did you read that they were using Stable Diffusion?


This is not based on just fine tuning Stable Diffusion.


Adobe in particular, however, has been more toward the forefront of AI research. I'm pretty sure they aren't just using SD here. They might not even be using transformers at all. See https://news.ycombinator.com/item?id=35089661


They also know most of their business customers already have GPUs, and often have high-end GPUs, so they're able to tailor solutions to the hardware their customers already have. For example, the speech-to-text feature in Adobe Premiere runs on local hardware, and is actually pretty good.

Hopefully they'll continue to push the potential for locally run models.


They also have the resources to build a huge training set, together with people who willingly upload their art and photos to them, which they can use to make the training set better than publicly available data.


Just to be really clear what we do and do not train on:

Firefly was trained on Adobe Stock images, openly licensed content and public domain content, where copyright has expired.

https://firefly.adobe.com/faq

We do not train on user content. More info here:

https://helpx.adobe.com/manage-account/using/machine-learnin...


Is it also trained on all of the Adobe Stock images which were created with generative AIs in the first place? I think that could get into a messy spot too. It looks like there are already millions on there, not all of which are actually labeled as generative.

If you go to https://stock.adobe.com/ and search for `-generative` you can see it filters out ~3M of them, but then go to the "undiscovered content" filter, and it appears to me like the vast majority of it is generative too.


One step further, they already have a huge training set. Stock libraries have the luxury of the hard part already being done: labeling. As of today, that's >313M labeled images they can use with no fear of legal woes: https://stock.adobe.com/search/images?filters%5Bcontent_type...

Stable Diffusion was trained on billions of images, of course. But having explored some of LAION-2B, it's clear that Adobe Stock has far better source images and labels.


> That is, these companies are largely not doing the hard part, which is creating and training these models in the first place.

No, there is no real "hard part" to AI currently. Training is simply "the expensive part".

It seems "the bitter lesson" has gone from reflection to paradigm[1] and with that as paradigm, the primary barrier to entering AI is cash for cpu cycles, other things matter but recipes is relatively simple and relatively available.

[1] http://www.incompleteideas.net/IncIdeas/BitterLesson.html


I get your point, but please do read the training logs for Meta's OPT, it's some Greek drama I tell ya


Which checks the box for great timing. All these companies with billions in the bank needing a new thing to prove they still warrant their growth valuation will dump obscene sums into this and cause things to evolve at a staggering pace, much like we've seen with AR and VR.


I don't think they missed it, their second point is that the DX (developer experience) is good.


> That is, these companies are largely not doing the hard part,

Are they the hard parts though? The short time it took from the first waves of public excitement around DALL-E to Stable Diffusion being the well-established baseline looks more like the class of problems that can be reliably solved by shoving enough resources at it. What I consider hard problems are those full of non-obvious dead ends, where no matter how much you invest, you might just dig yourself in deeper.


The hard part is building a product customers want, delivering it at scale, and iterating on product value and revenue.

The rest is just technology.


And that had been true enough even before the "quantity matters!" of ML entered the stage.


I think you are both correct

But this phrase from GP is pretty darn salient:

>>"They are aggressively integrating it into their products which (1) provides a relatively cheap step function upgrade and (2) keeps the barrier high for startups to use AI as their wedge."


Also missed: All these big tech companies were already invested in AI and using it in their products: it just happens that the latest batch of AI tools are far more impressive than their internal efforts.


Creating AI models has proven to simply be easier than other past innovations. Much lower barrier to entry, the knowledge seems to be spread pervasively within months of breakthroughs.

People seem to take offense at this idea, but the proof is in the pudding. Every week there's a new company with a new model coming out. What good did Google's "AI Talent" do for them when OpenAI leapfrogged them with only a few hundred people?

It's difficult to achieve high margins when the barrier to entry is low. These AI companies are going to be deflationary for society rather than high-margin cash cows, as the SaaS wave was


It's easier for large rich companies with infrastructure and datasets. It's very hard for small startups to build useful real world models from scratch, so you see most people building on top of SD and APIs, but that limits what you can build, for example it's very hard to build realistic photo editing on top of stable diffusion.


Most of the cutting edge models are coming from companies with a few dozen to a few hundred people. Stability AI is one example.

Training an AI model, while expensive, is vastly cheaper than most large scale products.

This wave will be nothing like the SaaS wave. Hyper competitive rather than weakly-competitive/margin preserving


I wrote it from the perspective of a small startup (<10 people, bootstrapped or small funding). I think it's far cheaper and easier to build a nice competitive mobile app/saas than to build a really useful model.

But yes I agree, it will be very competitive with much smaller margins.


Someone was able to replicate GPT 3.5 with $500. The training of models is getting very cheap.

[1] https://newatlas.com/technology/stanford-alpaca-cheap-gpt/
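
For a sense of the recipe's shape (not a claim about its quality - see the replies): it's supervised fine-tuning of an existing base model on instruction/response pairs. A hedged sketch with Hugging Face tooling, using a deliberately tiny placeholder model rather than the LLaMA-7B the actual project used:

    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    base = "EleutherAI/pythia-160m"  # placeholder; Alpaca fine-tuned LLaMA-7B
    tok = AutoTokenizer.from_pretrained(base)
    tok.pad_token = tok.eos_token
    model = AutoModelForCausalLM.from_pretrained(base)

    # Alpaca's 52K instruction/response pairs were generated with an OpenAI
    # model; here we just take a small slice of the released dataset.
    ds = load_dataset("tatsu-lab/alpaca", split="train[:1000]")

    def to_tokens(ex):
        text = (f"### Instruction:\n{ex['instruction']}\n\n"
                f"### Response:\n{ex['output']}")
        return tok(text, truncation=True, max_length=512)

    ds = ds.map(to_tokens, remove_columns=ds.column_names)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="alpaca-sketch",
                               per_device_train_batch_size=4,
                               num_train_epochs=1),
        train_dataset=ds,
        data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
    )
    trainer.train()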


I've tried it, sure it's good, but not even close to the real thing. But yes it's getting cheaper through better hardware, better data and better architectures. Also it builds on Facebook's models that were trained for months on thousands of A100 GPUs.


By fine-tuning a leaked model trained with a lot more money than that. If somebody leaks the GPT 3.5 model, can I say that I replicated it for $0?


That's not really true though. 4 months on and no one else is close to matching the original ChatGPT.

It's too early to say how hard this is, for all we know no one but OpenAI will match it before 2024.


> * Whereas AI was a hand-wavey marketing term in the past, it's now the real deal and provides actual value to the end user.

Ehhh.... Sometimes. It's still a hand-wavey marketing term today. In almost every sales call I'm in, either the prospect is asking about AI, or, more likely, the vendor is saying something like "We use our AI module to help you [do some function not at all related to AI]". Also, even when it's "real" AI (in the sense that we're talking about here), it's not always providing actual value to the end user. Sometimes it is, and sometimes it isn't.


Yes, not everything AI is working out – never has, and never will. The same is true in any field. And yes, there will be a display of incompetence, delusion and outright fraud. Again, in any field, always.

However, with AI in general, we have very decidedly passed the point where it Works (image generation probably being the most obvious example of it).

Even if, starting now, the underlying technology did not improve in the slightest, while adoption rises as it will with any new technology that provides value, anyone who does not adopt is going to be increasingly uncompetitive. It quite simply is already too good not to be used to challenge what a lot of average humans are paid to do in these fields.


Like a toothbrush with “AI” that tells you when you’ve brushed enough


> I attribute the speed at which incumbents are integrating AI into their products to a couple things

And also it's something many companies have been working on for a big part of the last decade. Kevin Kelly in particular has been talking about it for at least the past 7 years. In 2016 he released a book titled "The Inevitable: Understanding the 12 Technological Forces That Will Shape Our Future" and the addition of AI to everything is covered in that book.


Another point is that many of the incumbents have seen the trend for far longer than the general public, and had time to gather in-house talent. For example, this isn't Adobe's first foray into generative AI; back in 2016 they announced (and quietly dropped after the backlash) Adobe VoCo, "photoshop for voice".


Right? Adobe was first to market with properly integrated AI-based photo editing features with stuff like Content Aware Fill back in 2015 iirc.


This is a great point. It appears the success of OpenAI has validated their approach, specifically around (1) using the web as a training set and (2) using transformers.

I imagine a lot of conversations with in-house AI folks are around deploying these methods.


I attribute it more to open source and free plugins into existing Adobe products like Photoshop.

People are already using these plugins to do inpainting etc with Stable Diffusion. Adobe is trying to provide official support simply to keep up.

To me, the most novel thing is the data source being free of licensing concerns.

But that, too, will be eroded as more models appear based on datasets with straightforward licensing for derived works.

Image stock collections (and prior deals around them) seem more valuable now than they did before all this.


> * Whereas AI was a hand-wavey marketing term in the past, it's now the real deal and provides actual value to the end user.

The skeptic in me thinks it's more:

* The market is rewarding companies for doing X (integrating AI), so companies are doing X (integrating AI).

Song as old as time.


It's important to note that this is generative AI.

As pretty much everyone on HN is aware, AI is a broad term for a variety of technologies.

AI has been in our everyday lives for quite some time, but not in a way that generated (no pun intended) such buzz.

Having my iPhone scan emails and pre-populate my calendar with invite suggestions is far less newsworthy than the ability to generate a script for an Avengers film where all the members are as inarticulate as the Incredible Hulk.

If anything, with generative AI being so buzzy, this latest round of AI integration is more marketing.


I can’t think of a company more suited to take advantage of the generative AI hype. If Firefly is built into the Adobe stack, you’ll have a rather elegant composition and refinement toolkit to modify anything you dislike about the generative output.


> They are aggressively integrating it into their products

And no reason to suspect that it won't spectacularly backfire. Note that they're integrating (or, more correctly, slapping on with some hot glue) transformative technology they barely understand into legacy products which are themselves likely to go defunct. Why even use a spreadsheet when you can get GPT-4 to write a whole data analysis pipeline and summarize everything? Or why bother with Word documents when GPT-4 can generate, spell/grammar check, translate (if necessary), format and output into any format you want?


From a practical perspective, because of inertia and all managers seemingly being "burned once, twice shy" on re-designs.

If my company already uses spreadsheets and has teams of people skilled at manipulating them, it might not be very hard to convince my manager that good AI use cases (quick formatting, intelligent auto-completes, intention detection from sheet structure) could probably speed us up by 20% - but convincing management of the value of a full system re-design by comparison would likely be a very tough sell, if not viewed as an insult or a sign that I'm not a team player.

Just because the future may prove these methods antiquated doesn't mean management is liable or likely to change their tune today.


> good AI use cases (quick formatting, intelligent auto-completes, intention detection from sheet structure) could probably speed us up by 20%

That would imply these AI systems have been integrated well, which in most cases they are not, and I suspect the amount of work it would take to do it properly would not pay off considering the limited window available before people transition to a fully chat-based system.

> but convincing management of the value of a full system re-design by comparison would likely be a very tough sell

In this case it would be convincing the management to transition to a system where you can just order around and someone/something would get the job done, which is well in their comfort zone (I'm only half joking).


I think the other thing driving this is the market / investors.

Companies have to get involved in this field because it's all the buzz; while everything else you said is true, this means companies have no choice but to get involved.

If someone like Adobe, Google, etc. doesn't get into the game, there are sure to be a lot of questions on their next investment/financial calls.

The incumbents need to keep up appearances and not look like they are behind / unaware of AI, else investors will be worried that they will simply be sidelined/replaced at some point in the future.


Except for ChatGPT, I haven't yet seen super impressive implementations. DALL-E, code copilots, text to speech, etc. are still all not good enough to use for more than playing around.

However, this landing page looks amazing.

Any other good tips?


Midjourney v5 might be ready for prime time, in most cases it seems to have solved faces, hands, etc. The difference between version 1 from exactly one year ago and v5 now is rather striking: https://substackcdn.com/image/fetch/w_1456,c_limit,f_webp,q_...


This is effectively DALL-E 2 (apparently retrained on different data) with a native frontend designed by experienced designers.

You don't see the value in effectively not needing any art/design skills to make aesthetically pleasing logos/mockups/memes? Even just for "playing around" - I bet there's a large market of people who want to add "fun edits" to their existing personal photo library.

Focus less on landing pages and more on the implications a technology brings. Then you'll see where things are headed.


I've tried to use DALL-E for practical purposes such as generating icons, art etc. for commercial products but most of it is unusable.


Dall-e is far behind Midjourney and Stable Diffusion.

As we speak I’m generating images of my dog as every Pokémon type (for fun) that are frankly quite a bit better than the art of most Pokémon cards.

I’ve also used it for a photo composite: https://youtu.be/c_1mRWthhek


Yeah, I was just using DALL-E 2 for the first time in a while because I hoped it would integrate more simply alongside my code for the GPT-4 API. Results are just objectively underwhelming after having used Stable Diffusion for so long.

It’s also still in beta, and given the recent Codex termination, I wouldn’t be surprised if they get rid of it.


> “incumbents will not sit on their laurels with the AI wave.”

Yes, because they are already technology companies who understand tech. In the disruptive times of the Internet, then iPhone (mobile Internet), the companies being disrupted were not tech companies, so they didn’t know how to catch up.

This time this is a massive tech improvement, but in a market already saturated by key tech power players who are in prime position to implement these improvements into their existing competitive advantage.


Exactly this. This will eventually make it difficult for new AI startups, once those existing solutions get technically up to speed, if they can't differentiate in a meaningful way.

My own startup is https://inventai.xyz. Subscriptions aren't ready, so you can't actually generate anything yet, just trying to move quickly.


I think what we're going to see is that all the small startups going for big, broad ideas ("we do AI writing for anything", "your one-stop marketing content automation platform", etc) are going to flat out lose to the big companies. I predict that the startups we'll see survive are the ones that find tight niches and solve for prompt engineering and UX around a very specific problem.


> unlike with previous big innovations (cloud, iphone, etc), incumbents will not sit on their laurels with the AI wave

Adobe had a whole bunch of smart people including Dave Abrahams helping transition to iPhone and iPad, and to build for the cloud, so I'm not sure what you're thinking here.


> unlike with previous big innovations (cloud, iphone, etc), incumbents will not sit on their laurels with the AI wave

Wasn’t “cloud” also added to major products? Even Adobe relies on that heavily. Similarly, most big companies (including Adobe) offer an App Store app.


Let’s just take this moment to declare that AI did not turn out to be the next 3D hype.

That’s something I almost forgot about. This was not a foregone conclusion - there was a pretty big divide between people who thought AI would have real value and staying power and those that didn't.


> It's in the zeitgeist so you get a marketing boost for free.

I never thought about it that way, but now that you mention it, it makes so much sense.


Startups? Cloud and mobile were not spearheaded by startups but by megacorporations.


iPhone?

Didn't the incumbents come together for Android as a group?


From that page's FAQs:

> trained on a dataset of Adobe Stock, along with openly licensed work and public domain content where copyright has expired

> We do not train on any Creative Cloud subscribers’ personal content. For Adobe Stock contributors, the content is part of the Firefly training dataset, in accordance with Stock Contributor license agreements. The first model did not train on Behance.

Not sure what "first model" means there.

Also interesting: https://helpx.adobe.com/stock/contributor/help/firefly-faq-f...

> During the beta phase of Adobe Firefly, any Adobe Firefly generated assets cannot be used for commercial purposes.

> Can I opt [my Adobe Stock content] of the dataset training?

> No, there is no option to opt-out of data set training for content submitted to Stock. However, Adobe is continuing to explore the possibility of an opt-out.


From Adobe's reddit post[1]:

> We are developing a compensation model for Stock contributors. More info on that when we release

If they can properly compensate the stock contributors based on usage then I think this is a very fair approach.

[1] https://www.reddit.com/r/photoshop/comments/11xgft4/discussi...


I didn't see this before I posted, but I'm glad that's the case. In fact, it might be great for contributors that don't have a large library or aren't ranked as well.


With AI image generation, the value of stock imagery in terms of licensing out to third parties approaches zero. On the other hand, in terms of its contribution to training a model, the image value also approaches zero on a per-image basis.

I don't have any confidence they can create a fair compensation method.


Still on the fence for whether or not you should be able to opt out of training (I'm sure many artists would love to "opt out" of humans looking at their art if the human intends to, or might, copy the artists' style at some point).


hi, I'm an artist, I do not give a shit about other humans looking at my work, I am delighted when a younger pro comes to me and thanks me for what they learnt from my work. That tells me they were fascinated enough with it to look at it and analyze it again and again. I made a connection with them via my drawing skills.

I am catastrophically unhappy at the prospect of a corporation ingesting a copy of my work and stuffing it into a for-profit machine without my permission. If my work ends up significantly influencing a generated image you love, nobody will ever know. You will never become a fan of my work through this. You will never contribute to my Patreon. You will never run into me at a convention and tell me how influential my work was to something in your life. Instead, the corporation will get another few pennies, and that is all.


As a fellow artist, I can kind of see where you are coming from. I've had my art copied in several places; nobody ever asked for permission, and probably nobody knows I made it. Nor will they. I've seen my work in memes, Doom mods, promotional posters and even ads. It's harsh, but it's naive to expect a reward; the world doesn't work like that in practice. And that's a human fault, and it happened way before AI imaging was even considered a possibility. Look at how many "classic" artists only became known by chance (and posthumously), despite the quality of their output.

Also, if anything, being included in a training set means you are already well known. The big ones are usually trained on stock art or famous professionals/classics, and the ones trained by random people are going to use quality filters to get the best images of a specific subject. If you aren't already well known, you simply will not make it into a set. The other possible motivation is spite, but that'll only happen if you participate in internet slapfights.

And...Patreon? Fans? Influence? "Younger PRO" (so a non-pro doesn't count even if they are trying their best?) I sincerely recommend you choose sentences more carefully. It makes you sound phony, as if you're using art as a social strategy. When there's so much talk about art and its "soul", you really don't want to make yourself sound like art is a mere means to an end, or you risk making the cold, calculating machine seem "purer".

Personally, art should be its own goal. I've had plenty of time to accept that one isn't entitled to anything just because of effort, so I just want to make my art better. And it saves me from the trouble of chasing trends to stay relevant.


I'm an artist as well. I think this can happen whenever anyone sees your art anywhere online. They can copy it. They won't tell you about it. They might copy it really well. And they might copy not just your technical style, but what your art says and how it says it.


Honestly having worked as an artist and illustrator for a long time I'm kinda relaxed when it comes to this.

Companies stealing your work isn't a thing that just became possible because AI can make it easier to imitate. The real-world scenario is much rather: _they just steal your existing work_

Ever since I've been publishing my own illustrations, I've always run an image search on the stuff I've got on my homepage once or twice a year, and almost every time I did this I've found multiple individuals or companies who just used my illustrations for something without ever contacting me.

On the other hand, if my style was ever to become known and sought after enough for people prompting AIs with my name, I'm pretty confident that this would mean that:

A) I've grown big enough to have more inquiries than I could serve anyway, and

B) The ones doing this are neither people or companies I'd liked to have as clients, nor would they have booked my services for the price I ask if it weren't for AI.

Big agencies with interesting clients couldn't allow themselves to risk their reputation by faking original work, and other large corporations that just don't give a fuck, because they are ready to litigate if I challenge their usage, are likely to have such reach that they'll end up still contributing to the desirability of my original artwork if they appropriate it. And the smaller ones, they fulfill premise B) on the one hand and would also feed the hype.

So I truly feel what actually bothers you; forever I've been drawing first and foremost because I want to inspire people to express themselves and to have the same experiences I've had when losing myself in a drawing. And I do think the bittersweetness of imaging AI remains that while some people will gain the confidence to try to express themselves creatively, others will at the same time never seek to achieve this through their own physical skill and raw will.

But when it comes to myself and my art, I actually see only benefit and progression, regardless of what current dystopic scenarios might manifest :)


The problem is that being impacted negatively doesn't necessarily mean "I've grown big enough to have more inquiries than I could serve anyway".

It could just mean AI can imitate, above a certain "good enough" level, the kind of style you and others working in a similar vein do (say "whimsical children's illustrations"), and all other styles besides, so once-would-have-been customers now just use some AI output with a generic prompt.

In other words, the AI doesn't have to be prompted with your specific name to affect you. It's enough that it provides art for many/most gigs that would now go to a human illustrator.

It would be hard to see only "good developments" when paying gigs dry up.


It's not like 90% of the work available to human commercial artists would even be there in the future with these developments...

Most customers currently paying an artist will just slap some quick AI stuff and call it a day.

Big customers might still make it a matter of prestige or pride or "quality" differentiator to use "real human art" from some famous graphic designers, but that is like 1/1000 of the commercial gigs going on today.


Do you think Rembrandt knew his fans? Until the Renaissance, artists didn't even sign their work.

We have made huge steps in promoting ourselves in recent decades.


Is there a license that exists that you could put on your work to prevent its use in model training?


Not as far as I know. There needs to be one, and internet-scrapers need to be able to be sued for ludicrous amounts of money if they violate it, IMHO. Training AI models feels way outside the scope of what I think "fair use" should cover.


They do not currently operate on the basis of fair use. Operating as a human by looking at images and learning how to draw or paint is not 'fair use'; it's a right given to you by either God or Mother Nature. So the legal basis for neural nets learning from other art is that they are learning like a human and creating new art, just from knowing what art humans think is good and optimizing their creation of art to mimic, if not borrow, the same qualities while still making something new.


Also, if you are going to say that the neural net has the same divinely given right as humans, then I feel you are also implying that the neural net has all the other rights of a human, and I think labor laws would have something to say about keeping a six-month-old idiot savant in a box and making them draw whatever random querents on the Internet ask for, 24/7.


As far as I know there are no religions or legal systems that posit that there are any rights inherently given to machines.


The rights are exercised by the humans building and operating the machines. You're saying the hammer doesn't have rights. Well, yeah, duh. It's the person holding it.


"so the legal basis for neural nets learning from other art is that it's learning like a human and creating new art from just knowing what art human think is good and optimizing its creation of art to mimic if not borrow the same qualities while still making something new" sure sounds like it's attributing the rights to the machine, not the person using it, to me.


Not for lack of trying though. https://transhumanist-party.org/tbr-3/


Non-commercial internet scraping for model creation is explicitly legal in the EU; the result of a model trained on a billion images really has nothing to do with anyone in particular's art. Although the model would likely work pretty well without ever seeing any "art" images.


If, as I alluded to, you and the SCOTUS (and other courts) interpret AI art generation as similar enough to a human that the 'training' process is analogous to a human looking at art and learning how to create good art (or even copy another artist's style), then the license you apply to art does not matter, because training would just be "learning" how art works, not any actual usage of the original work. In this case the AI would be treated like a human for the purposes of copyright infringement: it would infringe on the original work only if it recited or recreated a single work from memory without substantial changes turning it into either a parody (fair use) or its own work separate from the images it has learned from, even if it mimics the art style of a single artist (since artists can't copyright their styles).


I think you’d consider those responsible for the model to be humans, the model to be a machine, and all outputs to be derivative works of the training set with the humans holding culpability.


It's not quite the same... And I'm not sure how people on HN of all places are failing to grasp that these algorithms aren't sentient, much less people.

I think this is incredibly cool technology, but using other people's property without their consent is stealing (I'm not talking about legality, but morality here).

The second reason why it's not the same is that people can't look at X million pictures and become proficient in thousands of different art styles. So, again, it's not about legality but ethics.

I guess some people have lower moral standards than others, and that's always been part of the human condition.

With all that out of the way, I think artists won't get replaced, because these tools don't really produce anything... Substantial on their own. An artist still needs to compose them to tell a story. So, all this nonsense about how it will replace artists is misguided. It can only replace some parts of an artist's workflow.

I know there was an art competition where someone won with a piece that was AI-aided, but honestly it looked like colour sludge. The only thing that was really well executed in it was the drama created by the contrast from sharp changes in values near the centre of that work, and something vaguely resembling a humanoid silhouette against it. You could've called it abstract art if you squinted.


I'm curious, do you hold the same beliefs about text?

Do you think ChatGPT should not be allowed to read books and join ideas across them without paying the original authors for their contribution to the thought?


I do! If the works aren't in some way public domain, then the authors should have a say, or be compensated if the work is purchased.

I have a bit of cognitive dissonance on the subject of blog posts or articles in general, since those are kinda public domain? But I still think it should be opt-in/out-able.

I realise I'm also a bit of a hypocrite since I've enjoyed playing with these AI tools myself, and I realise they'd be nowhere as cool if they didn't have access to such large datasets.


IANAL: Authorship is protected in the US by default https://www.copyright.gov/engage/writers

In order for blog posts (or other written works) to be in the public domain, authors must explicitly waive those rights. But, not that it needs saying, copyright's applicability in training data is basically the entire subject of debate right now. https://creativecommons.org/2023/02/17/fair-use-training-gen...


Ah, I had no idea that was protected too! That's good. I think the reason I was morally on the fence was that people already put blog posts out with the intent of sharing their knowledge with the rest of the Internet...

So my assumption was that anything trained on it will just help further expand that knowledge.

Although I do realise now as I'm typing this—AI could diminish their audience, clout and motivation, which isn't what I'd want.


But these stock image artists provided consent when signing a contract and selling their work to Adobe. The contract is pretty clear that you basically don't own the work anymore and Adobe can do whatever they want with it.

If you don't like it, don't sign the contract.


Oh right, sorry. I was talking generally, not specifically about Firefly.

Yeah, I think Adobe is a publisher and, as such, you give it distribution rights. So I agree with you in this case.

Slightly tangential, but imagine a singer's or actor's voice or face being used without their consent just because the publisher has rights to distribute their performance. That probably wouldn't fly very well, and I assume this doesn't fly with some artists either (even though they signed a contract).

I assume publishers will probably have an AI consent form soon.

It's all very exciting, and I hope we don't ruin it with greed and disregard for the works of the very people that made these technologies so successful. Like, if it weren't for the scraped works, the AI feats would've been both much more underwhelming and much more expensive to achieve.


> thousands of different art styles

Are you saying all those art styles are original? They are not. Artists influence artists. Sometimes the influence is strong, other times just a glimmer.

Perhaps the burden of responsibility shouldn't be on Ai machine learning, but on the human prompt author who chooses the instruction "...in the style of Jo Blow". Who owns the ethical dilemma in that case? Who is being unfair to Jo Blow in that case? I'm quite certain whoever wrote the prompt is 100% responsible.


I think this is a futile discussion. I think originality certainly exists, and I've seen it many times in the art world.

I could be wrong, but you seem to imply originality is a made up concept? Do original thoughts even exist?

Are our comments original or just regurgitation?

Do you believe nothing ever created throughout history is original then?

If you believe originality does exist, where and how do you draw the line?

I don't think people are as boring as you think they are. Or maybe you just don't appreciate art. I can't tell. Some people say they do, but they really don't.

And further, even if we ignore originality as a concept and pretend for the sake of argument that it doesn't exist, someone still values that style or approach to creation. That person probably makes a living based on the subjective value it brings to the table. They sold their work. An AI company now profits off of their work, having used it in an algorithm, without paying anything to those artists.

As an experiment: Give me an example of something you've made yourself that you valued enough to sell, and then tell me why I deserve to have it for free. Maybe that will help me understand. Maybe you can work for me for free?

If none of their work was original or valuable, why even train on it, and why can you even ask for "in style of X"?

Also, the approach of shifting responsibility onto individuals is too naïve. If that worked, we'd not need laws or any type of regulation.

Anyway, even with this diatribe, I still don't think AI is a bad thing or that it will replace people or their work, in its current iteration at least. It'll probably improve their work, but I certainly understand why people feel violated.


You spent a lot of time deconstructing something I never said or implied.

The keyword you missed is "style". Your word, that I quoted.

As in, "the style of that artist". I wasn't referring to individual pieces of original art.

"Black and white street photography, gritty contrast, low wide angles, urban decay, weathered characters"... could be a style shared by many, that Ai learns.

Nobody owns a "style". A style is merely a convenient way to describe art. It belongs to the art critic more than the artist.

Many have made art in the style of say Andy Warhol - minimal flat colors, mass produced product imagery. Or surrealist styles, "like Dali", with dreamscapes, imaginary landscapes, twisted architecture and optical illusions. Things Ai can learn without being accused of stealing. Unless you're arguing a kind of "theft of humanity".

> Also, the approach of waving responsibility onto individuals is too naïve.

You want others to take responsibility for your actions?

Prompt: "image like Jon Doe's fantasy illustrations"... Prompting with intent. That's on you, not others to regulate your adult choices.

Before Ai, you could hire an artist to help you learn the style of another artist, living or dead. Maybe a drawing style you like. So you hire a skilled artist to point you in the right direction for acquiring a similar result with the pencil. Nothing ethically unsound there. Ai just skips the lessons and time it takes you to draw in the style of your favourite artist. Nothing has been stolen.


> You spent a lot of time deconstructing something I never said or implied.

> The keyword you missed is "style". Your word, that I quoted.

> As in, "the style of that artist". I wasn't referring to individual pieces of original art.

Sorry, I misinterpreted a little.

I guess we maybe disagree on what style means:

> "Black and white street photography, gritty contrast, low wide angles, urban decay, weathered characters"... could be a style shared by many, that Ai learns.

I still think there's a lot more individual flavour in art that can't be captured with words.

You described a specific art style in your example, and within it there are probably a plethora of artists that fit that description roughly, but there's often a unique signature every artist brings to the table through a different application and mixture of those "sub-styles" and techniques.

You can try to reduce Matt Inman's (The Oatmeal) style to "line art, blobs for humans, vivid colours, absurdist facial expressions, comic", etc., and spend ages explaining to an AI how to steal the style, and it probably wouldn't get it, but just slapping on "in style of Oatmeal" would probably get you very close.

Similar for xkcd, Oglaf, specific anime styles, and concept artists (oftentimes this is what gets them hired). They all have a flavour of style that's unique to them even if they use similar approaches and techniques. All manga and anime is in the similar vein of line art, but Miyazaki Hayao is indisputably Hayao; any fan of anime will recognise his style immediately. If this weren't true, nobody would be putting "in style of X"; they'd just be saying "{generic technique}" instead.

Listing the individual composition of techniques won't really get you close to how they work, but saying "in style of X" gets you crazy close.

You bring up a bunch of dead and famous artists, but ignore the fact that their styles can only be mimicked because they were studied. And they can only be mimicked by someone with enough knowledge of the theory behind the techniques that compose the style, and the skills they spent years building to execute it.

> You want others to take responsibility for your actions?

No, but the problem is that currently nobody is being held responsible. Not the model devs for using works without consent, nor the actors benefitting from it. For example, Fiverr is currently flooded with AI scam artists who get away with saying "in style of X" without even a basic comprehension or intuition of what that style is composed of, much less the ability to describe the style intelligently. Everyone's turning a blind eye, artists are protesting, and there are people saying "yes, this is the way it's meant to be".

You can't expect everyone to have the same moral standards as you do. If you create a tool and put it into the world, people will use and abuse it. That's just the way it is.

Who should be responsible? I agree with you that the profiteers abusing this "in style of X" are much worse and bigger assholes than the people developing the models, but using someone else's work without their consent, and without any way for them to prevent abuse of that content (e.g. by properly anonymising the training data, at the very least), is also bad. The devs didn't even pay a one-time fee to the artists to use their work, which is IMO the rock-bottom expectation of decency. They just paid to get the training set, which was probably collated through some scummy ToS loophole.

So they're eating the market of someone who spent a lot of time developing a unique style. Can these artists use the tools? Yeah, and I hope they do, but I still think it's a bit ludicrous how easy it is for me (or anyone else) to mimic, with a few words, a style someone spent years building. And that's "thanks" to that artist trying to build a name for themselves. So the clout the artist tried to build is literally coming back to bite them in the ass. Not to mention that their goodwill in sharing artwork with the internet is coming back to bite them too.

It's crazy to me people are being so incredibly flippant about the livelihood, careers and lives of other people.

Anyway. I'm happy to agree to disagree with you on this. It's an incredibly complex subject, and I wish people weren't so dismissive of... Well, other people. And if you feel like you aren't: you are; you're just oblivious to it.

Anyway, I hope I'm wrong and you're right, and maybe there won't be any detrimental effect to artists and art. But given how many scammers are already abusing this, I'm incredibly doubtful.


It's fine we disagree. Let's avoid the trap of right or wrong. Ai is new, and our opinions will evolve over time. With that said...

> "indisputably Hayao; any fan of anime will recognise his style immediately"

No they wouldn't. If you're talking about a still frame, nobody can be certain who the artist is. Why? Because LOTS of artists make similar work. You think there's an indisputable signature in every drawing from every artist? There really isn't.

There may be an obvious signature, such as placing an "upside down umbrella" in every piece they make. But the Ai won't copy the umbrella, because that would be replicating the work rather than style.

There are books about Miyazaki's style, authored by him that go into detail and share everything from concept sketches to the use of computer graphics in his films. It's all there in the books he wrote explaining and SHARING his techniques with the world. But what he can't share, and what can't be replicated with prompts, is the brilliance of the storytelling that threads the vast body of visual work to create the whole.

> unique style

Again, you're claiming an artist's style is unique, when in fact the vast majority of "styles" are blends of other styles and influences. No artist was born in a bubble.

Even if there is a unique visual signature, so what? It's not the heart of the work. It's a preferential technical characteristic.

Not only that, but this world is saturated with images. Even in music, it's difficult to come up with an original riff that doesn't sound like an existing song. The musician, in order to be awarded "originality", needs to use the full 3 minutes of the song to involve lyrics and other elements before something truly "unique" emerges.

You think there's an infinite bucket of "styles" visual artists can tap into? There isn't.

Content choices and narrative choices are more important anyway. For example, the artist may have something to say about humans being cogs in the machine. Sounds interesting. Such concepts form part of the style and reputation of the artist. People follow artists largely because of their artistic substance.

You're too focused on visual technique. The guts of any good artwork is the whole package, including the symbolism and balance of meaning in the work.

What is the artist SAYING with the work. Ai won't help prompters with quality of symbolic meaning, concepts and interlinking symbolism, subtext, balance of imagery, subtle visual tension... all the things that make ART ART -- the ESSENCE of art is found in the body of work.

So... Ai can generate a sci-fi robot running through the city, and your concern is this impedes the profit margin of real artists pumping out the same generic stuff? Sorry, but I won't shed a tear for the loss of artistic mediocrity in a world saturated with stock images.

Artists are adaptable, and should be doing a lot more than selling robot images to stock art collections anyway. Time to level up. Nothing wrong with Ai keeping us on our toes.


I find this situation a bit funny. I agree with you on almost everything you said.

I think the only part where we maybe disagree is on the importance of style as a form of expression. I think style is a big part of it (or a mixture of styles). To me, there is a style-space where each technique is a dimension (kinda like a colourspace), and you can position yourself anywhere within that space. And I think the choice of that position matters to artists a lot. It doesn't matter _for_ the artist, but it matters _to_ the artist. They could've spent a long time searching the style-space to find their own individual style, and it's a part of their art. Maybe thinking about styles mathematically is a mistake here?

To me it feels unfair their efforts to build this style is being treated dismissively by some. I don't think you yourself treat it dismissively, since you seem to appreciate art too. But I don't think every employer in need of art will see it like that.

Also, as I read your reply and thought about the topic a bit, you partially changed my mind:

I believe established artists won't be impacted by this, but I fear that certain entry-level positions in some industries will be jeopardised by AI. I'm not even sure why I think this. Maybe it's no more than a hunch based on what I've seen happen to people in entry-level positions, who are already treated as throwaways in the film and game industries.

An alternative and more optimistic way to look at it would be that entry-level positions will evolve to be more quality/fulfilling work. I hope that's what happens, then.

I also think it's hard to develop sufficiently in any field if you don't do some of the "grunt work" too. I guess I'm not trained, nor is art my career (it's just a self-taught hobby). So maybe the exploration and development of styles isn't as big a deal as I make it out to be for day-to-day artists. Maybe they do the grunt work in art school?

You mentioned mediocre art and stock images. I know what you mean, but I also think that's just a phase a lot of people go through. Nobody is born a great artist (or anything else); they become one. People need to start somewhere.

I think it's all very complicated... This is somewhat tangentially related (which I think was the real thing that sent me down this spiral, more than anything else): I wish the artists' wishes about their works were treated with more respect by the people training the models, since I think it's difficult to expect every user of the tool to do the same. Or allow them to opt in and be compensated for their works.

Anyway, thank you for indulging me in this long thread! :)


"I guess different people have lower moral standards than others, and that's always been part of the human condition."

Instead of lower morality, I'd say it's selective morality.

I bet quite a few artists (rightfully) feeling threatened by this phenomenon would have absolutely no problem watching a pirated movie, using an ad blocker, reading paywall-stripped articles, and the like... while this is in principle the same thing: taking the work of others without consent or compensation.


> I'm sure many artists would love to "opt out" of humans looking at their art if the human intends to, or might, copy the artists' style at some point

I’m pretty sure that would be a death knell for art. Where are these mythical artists who have never looked at anyone else’s art?


It's the same problem with fake copies of Van Gogh and so forth, except historically those fakes were produced at a much slower rate because of the time needed to master the skills to produce those fakes. With modern tools, those fakes could be mass produced, while the original artists are still alive.


> It's the same problem with fake copies of Van Gogh and so forth, except historically those fakes were produced at a much slower rate because of the time needed to master the skills to produce those fakes. With modern tools, those fakes could be mass produced, while the original artists are still alive.

Those people got in trouble for recreating specific works, or creating new works in his style and defrauding people by saying they were originals. Safe to say that not disclosing "this is not actually a work created by <artist>, just in their style" would be grounds for fraud, especially if you were to sell it.


I can train a LoRA on my own PC in less than an hour. Good luck opting out of that.
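
For the curious, a minimal sketch of what this looks like with the Hugging Face diffusers and peft libraries (the model id, rank, and target modules are illustrative assumptions, not a tested recipe):

    # Minimal sketch: attach LoRA adapters to a Stable Diffusion UNet.
    # Assumes `diffusers` and `peft`; hyperparameters are placeholders.
    from diffusers import UNet2DConditionModel
    from peft import LoraConfig, get_peft_model

    unet = UNet2DConditionModel.from_pretrained(
        "runwayml/stable-diffusion-v1-5", subfolder="unet"
    )

    # Low-rank adapters on the attention projections; only these train.
    config = LoraConfig(
        r=8,            # rank of the low-rank update matrices
        lora_alpha=16,  # scaling factor applied to the update
        target_modules=["to_q", "to_k", "to_v", "to_out.0"],
    )
    unet = get_peft_model(unet, config)
    unet.print_trainable_parameters()  # typically well under 1% of the model

The training loop is just the standard diffusion noise-prediction objective over your own images, and only the adapter weights (a few megabytes) get saved - which is why this fits on a home GPU in under an hour.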


What does that matter? Generate as much as you want for your own personal reasons.

It's about actually being able to use that content legally (and commercially) that matters to most in this conversation.


AI training is a one-way operation - you can't reconstruct the dataset from a model/LoRA/TI. Unless it's something really blatant - real people, or widely recognised copyrighted characters like Batman or Iron Man - it's going to be hard to prove that someone used your art to train an AI model. I'm not required to publish my model or the datasets I used anywhere.


I can trivially torrent movies at home also. But then going out and selling them is widely accepted as being "wrong".


Should Github Copilot be trained on private, closed-source, proprietary code?


Yes, AI should be trained on every piece of information possible. Am I allowed to become a better programmer by looking at private, (illegally leaked) closed-source, proprietary code?


One motivation for artists to create and share new work is the expectation that most people won't just outright copy their work, based on the social norm that stealing is dishonorable. This social norm comes with some level of legal protection, but it largely depends on a common expectation of what is considered stealing or not.

Once we have adopted the attitude that we can just copy as we please without attribution, it would be much more difficult to find motivated artists, and we would have failed as a society.



I didn't ask if I can use other people's proprietary closed-source code; obviously they have the right to that code and how it's used.

I asked if I can learn from that code, which obviously I can. There is no license that says "You cannot learn from this code and take the things you learn to become a better programmer".

That's exactly what I do, and it's exactly what AIs do.


> I asked if I can learn from that code, which obviously I can.

Did you actually read the link you were given? Clean-room design exists because you may inadvertently plagiarize copyrighted works from your memory of reading them.

i.e. the act of reading may cause accidental infringement when implementing the "things you learn"


> i.e. the act of reading may cause accidental infringement when implementing the "things you learn"

Surely you know this isn't the case right? Maybe you're confused because we're talking about programming and not a different creative artform?

Great artists read, watch, and consume copyrighted works of art all day; if they didn't, they wouldn't be great artists. And yet the content they produce is entirely their own, free from the copyright of the works they learned from.

What's the difference then in programming? Why can an artist be trusted not to reproduce the copyrighted works that they learned from but not the programmer?


Artists get into trouble all the time for producing works very close to something that already exists. That's like the number one reason artists get shunned in the communities they were in.


Every filmmaker watches movies

Every author reads books

Every painter views paintings

Unless you're arguing that every single artist across every field of artistic expression is constantly being jeopardized by claims of copyright infringement, this is a nonsensical point to make.


But they’re not creating similar works, unlike AI which IS. Why is this so complicated for you?


I would seriously question whether this happens all the time these days. The whole copyright system is way behind the digital and internet revolutions. Look at what the Prince case did for transformative fair use in copyright.


The process of online artists shaming each other doesn't really have anything to do with the legal system, though they all act like it is.


> Why can an artist be trusted not to reproduce the copyrighted works that they learned from but not the programmer?

They can't, which is why the quote "Good artists copy, great artists steal" exists.

AI has already been shown to be "accidentally" reproducing copyrighted work. You, too, can do the same.

It's likely no one (including yourself) will ever be aware of it - but strictly speaking it would still be copyright infringement. This is the relevance and context of the link you were given.


If everyone is infringing copyright, no one is infringing copyright. This is a dead-end thought.


Sure but the infringement is the problem, not the ideas themselves.

You're describing thought crime right now. It's not illegal to learn things.


And if you "learn" something and accidentally rewrite it verbatim? Thats what clean-room design is to protect against


Rewriting the code verbatim and distributing it would be copyright infringement, yes; you do not have a right to distribute code written by other people.

That's completely different from reading and learning from code, which is what grondo described.

Clean room design relies on this, in a clean room design you have one party read and describe the protected work, and another party implement it. That first party reading the protected work is learning from closed-source IP.


> That's completely different from reading and learning from code, which is what grondo described.

AI (e.g. Copilot) has already been shown to break the copyright of material in its training set. That's the context of this whole thread.


Perhaps, but not of Grondo's point.

If an AI infringes on copyright then it infringes on copyright, that's unfortunate for the distributors of that code.

Humans accidentally infringe on copyright sometimes too. It's not a unique problem to machine learning. The potential to infringe on copyright has not made observing/learning/watching/reading copyright materials prohibited for humans, nor should it or (likely) will it become prohibited for machine learning algorithms.


> Perhaps, but not of Grondo's point.

Grondo said that AI should be given access to all code, including private and unlicensed code.

He was given a link to Clean Room Design demonstrating the problem with the same entity (the AI) reading and learning from the existing code and the risk of regurgitation when writing new code.

He goes on to say that's what he does, which doesn't change that fact.

> Humans accidentally infringe on copyright sometimes too.

Indeed we do, and it's almost entirely unnoticed, even by the author.

> nor should it or (likely) will it become prohibited for machine learning algorithms.

If those machine learning algorithms are taking in unlicensed material and then later output unlicensed and/or copyrighted material, then they are a liability. Why would you want that when you can train it otherwise and be sure it NEVER infringes on others' IP? It's a no-brainer, surely. Or are you assuming there is some magic inherent in other people's private code?


> If those machine learning algorithms are taking in unlicensed material and then later output unlicensed and/or copyrighted material, then they are a liability. Why would you want that when you can train it otherwise and be sure it NEVER infringes on others' IP?

Because it could produce a better model that produces better code.

You're now arguing a heavily reduced point. That a model that trained on proprietary code is at higher risk of reproducing infringing code is not a point under contention. The clean room serves the same purpose, it is a risk mitigation strategy.

Risk mitigation is a choice, left up to individuals. Maybe you use a clean room design, maybe you don't. Maybe you use a model trained on closed-source IP, maybe you don't. There are risks associated with these choices, but that is up to individuals to make.

The choice to observe closed source IP and learn from it shouldn't be prohibited just because some won't want to assume that risk.


If you study a closed source compiler (or whatever) in order to write a competitive product, and the company who wrote the original product sues you for copying it, as the parent suggests, you're on shaky legal ground. Which is why clean room design is a thing.


A clean room design ensures the new code is 100% original, and not a copy of the base code. That is why it is legally preferable, because it is easy to prove certain facts in court.

But fundamentally the problem is copyright, the copying of existing IP, not knowledge. grondo4 is completely correct that there is no legal framework that prevents learning from closed-source IP.

If such a framework existed, clean room design would not work. The initial spec-writers in a clean room design are reading the protected work.


> The initial spec-writers in a clean room design are reading the protected work.

Right. And they're only exposing elements presumably not covered by copyright to the developers writing the code. (Of course, this assumes they had legitimate access to the code in the first place.)

Clean room design isn't a requirement in the case of, say, writing a BIOS which may have been when this first came up. But it's a lot easier to defend against a copyright claim when it's documented that the people who wrote the code never saw the original.

Unlike with patents, independent creation isn't a copyright violation.


I don't understand what your point here is. The initial spec-writers learned from the original code. This is not illegal, we seem to be agreed on this point. grondo made the point that learning from code should not be prohibited.

What are you contesting?


My point was that, assuming access to the code was legit and the information passed from the spec-writers to the developers wasn't covered by copyright (basically APIs and the like), it's a much better defense against a copyright claim: any code written by the developers can't be a copy, given they never saw the original code.


I think you're missing the one big flaw here. How exactly do you have access to closed source code?

Did you acquire it illegally? That's illegal.

Was it publicly available? That's fine, so long as you aren't producing exact copies and violating normal copyright law.


You're obviously not


Is that a joke?

Yes you are allowed to read closed-source, proprietary code and become a better programmer for it.

I've decompiled games to learn how they structure their code to improve the structure of games that I program. I had no right to that code and I used it to become a better programmer just like AI do.

That's not copyright infringement. You have a right to stop me from using your code, not learning from it.


This is a pretty extreme stance. There is a fine line between "learning from" proprietary code and outright stealing some of the key insights and IP. Sometimes it takes a very difficult conceptual leap to solve some of the more difficult computer science and math problems. "Learning" (aka stealing) someone's solution is very problematic and will get you sued if you are not careful.


If you think that's extreme, wait until you hear my stance that code shouldn't be something that you can own (and can therefore "steal") to begin with.


Now, granted, most EULAs and Terms of Service documents aren't legally enforced, but most software licenses explicitly prohibit decompiling or otherwise disassembling binaries.

So, yes: They have a right to stop you from "learning" from their code. If you want that right, see if they're willing to sell that right to you.


> They have a right to stop you from "learning" from their code.

They absolutely do not, and as pedantic as it may be I think it's very important that you and everyone else in this thread know what their rights are.

If you sign a contract / EULA that says you cannot decompile someone's code, then yes, you are liable for any damages promised in that contract for violating it.

But who says that I ever signed a EULA for the games I decompiled? Who says I didn't find a copy on a hard drive I bought at a yard sale or someone sent me the decompiled binary themselves?

Those people may have violated the contract but I did not.

There is no law preventing you from learning from code, art, film, or any other copyrighted media. Nor is there any law (nor should there be one, IMO) that stops an AI from learning from copyrighted media.

Learning from each other regardless of intellectual property law is how the human race advances itself. The fact that we've managed to automate that is incredible, and it's very good that our laws are the way they are so that we can allow it to happen.


> Am I allowed to become a better programmer by looking at private code?

Your argument is based on the idea that you and AI should have the same rights?

I do not see how this works unless AI is going to be entitled to minimum wage and paid leave?

Otherwise it is just a money grab


He's not saying that he and the AI have the same rights, rather that he and the person running the AI have the same rights.


> trained on a dataset of Adobe Stock, along with openly licensed work and public domain content where copyright has expired

As someone who has contributed stock to Adobe Stock I'm not sure how I feel about this. I'm sure they have language in their TOS that covers this, but I'm guessing all contributors will see nothing out of this. Fine if this is free forever, but this is Adobe.


It's also worth considering that there are quite a number of fraudulent images on Adobe Stock, which means that the Firefly dataset without a doubt contains some amount of unlicensed material.


LLM-based AI is tech's equivalent of mortgage-backed securities. Lump in the bad stuff with the legitimate, hope no one notices, and when they do, blame the inherent black-box nature of the product.


I strongly suggest everyone read this: https://techcrunch.com/2023/01/06/is-adobe-using-your-photos...

I hope it's fair to say that they do train on your work.


We don't. More info here:

>The insights obtained through content analysis will not be used to re-create your content or lead to identifying any personal information.

https://helpx.adobe.com/manage-account/using/machine-learnin...


Thanks for the response. This and the proposed compensation for stock contributions demonstrate that you are taking the right path.

I hope you continue doing so. I'm quite disappointed in others' approaches in this area, and it paints a very bad image of the potential of AI tools.


I would not be surprised if behind the scenes they are starting the lobbying engine to safely mine whatever they want. The universe of existing content is simply too enticing and too available. This is Google Search vs. authors all over again.


Speaking of Adobe and AI, I really hope they somehow make it possible to find useful help in their help system. It's an endless labyrinth with an almost-useful but usually ruined item at each dead end.


Adobe employee, not in Creative Cloud.

We got access to this in beta a week ago and it was an instant hit across the whole company.

This is just the tip of the iceberg; there are a lot of really cool in-house products around generative AI. The team is going to great lengths to make this ethical and fair (try and generate a photo of a copyrighted character like Hello Kitty or Darth Vader). I'm excited to see the final product of all the internal work that has been going on for so long.


> The team is going to great lengths to make this ethical and fair (try and generate a photo of a copyrighted character like Hello Kitty or Darth Vader)

are you saying it won't work? if that's the case, that seems really silly. actually, it goes against everything I believe in (as well as my understanding of even the kindest meaning of the word "hacker"). it drives me up the wall, it makes my blood boil

who is going to stop me from drawing hello kitty myself?

it's not the tool's job to regulate my creativity. the law exists to regulate the use of my art, not the act of creating the art. I can draw hello kitty all I want and leave it in my drawer, if it floats my boat

limiting the tool just makes me never want to use it. you're like Sony fighting digital music in the 2000s. the future is right in front of you but you just can't see it.

    ⠀⠀⠀⠀⢀⣀⣀⡀⠀⠀⠀⠀⠀⠀⠀⣠⠾⠛⠶⣄⢀⣠⣤⠴⢦⡀⠀⠀⠀⠀
    ⠀⠀⠀⢠⡿⠉⠉⠉⠛⠶⠶⠖⠒⠒⣾⠋⠀⢀⣀⣙⣯⡁⠀⠀⠀⣿⠀⠀⠀⠀
    ⠀⠀⠀⢸⡇⠀⠀⠀⠀⠀⠀⠀⠀⢸⡏⠀⠀⢯⣼⠋⠉⠙⢶⠞⠛⠻⣆⠀⠀⠀
    ⠀⠀⠀⢸⣧⠆⠀⠀⠀⠀⠀⠀⠀⠀⠻⣦⣤⡤⢿⡀⠀⢀⣼⣷⠀⠀⣽⠀⠀⠀
    ⠀⠀⠀⣼⠃⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠉⠙⢏⡉⠁⣠⡾⣇⠀⠀⠀
    ⠀⠀⢰⡏⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠙⠋⠉⠀⢻⡀⠀⠀
    ⣀⣠⣼⣧⣤⠀⠀⠀⣀⡀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⡀⠀⠀⠐⠖⢻⡟⠓⠒
    ⠀⠀⠈⣷⣀⡀⠀⠘⠿⠇⠀⠀⠀⢀⣀⣀⠀⠀⠀⠀⠿⠟⠀⠀⠀⠲⣾⠦⢤⠀
    ⠀⠀⠋⠙⣧⣀⡀⠀⠀⠀⠀⠀⠀⠘⠦⠼⠃⠀⠀⠀⠀⠀⠀⠀⢤⣼⣏⠀⠀⠀
    ⠀⠀⢀⠴⠚⠻⢧⣄⣀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⠀⣀⣤⠞⠉⠉⠓⠀⠀
    ⠀⠀⠀⠀⠀⠀⠀⠈⠉⠛⠛⠶⠶⠶⣶⣤⣴⡶⠶⠶⠟⠛⠉⠀⠀⠀⠀⠀⠀⠀


GP works for Adobe, and Adobe's bread and butter are the professional creators who would love a world where there is hardware DRM on your eyes and you can't even see their creations or a likeness of them without paying a royalty (and one to "rent" the memories of the visualization, not to "own" the memories like we do now). While I largely agree with you, the GP post is exactly what I would expect from an Adobe person.


I forgot for a second that this is Adobe; the top stories on HN about Adobe are almost all negative.


Are you suggesting there's a positive side to Adobe? I haven't seen any.


> "it's not the tool's job to regulate my creativity. the law exists to regulate the use of my art, not the act of creating the art. I can draw hello kitty all I want and leave it in my drawer, if it floats my boat"

This is very well said. Thank you!


And you, good sir, are one of the few types of people who keep my sanity in check. Thanks for calling out the tyrants :) - they are terrible and, just like the kings of old, want constant praise and their rings kissed for the blessing of imposing serfdom on all but their chosen nobility.


I think there is a broader phenomenon in society of complete disregard for law and order, and of taking it into one's own hands. Enabled by Big Tech and giant ESG corporations, this is a method of sidestepping written laws, the constitution, and the judicial process, and force-feeding the public with an authoritarian whip. It is anything but 'democratic', hiding under ostensibly kind causes such as safety, climate, misinformation, etc.

It's not a good look for the tech industry and the ESG corporate culture. It is destroying the very hacker culture that made them.


> "the law exists to regulate the use of my art, not the act of creating the art"

Thanks airstrike, for dropping this truth bomb.

And just like that, Adobe expands its ministry of creativity powers. In the beginning, it was dollar bills on the no-can-do list. Now it's pictures of Darth Vader and hello kitty.


This runs into the core problem with technology--we answer "What can we do" before "Should we do it" and "What are the impacts"

Let's say you take your Hello Kitty dot art and make a poster promoting a commercial event. You then take it to FedEx Kinkos and use a self-service copy machine to make 1000 copies. You could reasonably argue that you are committing copyright infringement, and the photocopier / FedEx Kinkos isn't.

Now instead, you have AI generate a poster, and it generates a very similar image to Hello Kitty. It's arguably so similar that a reasonable person would say it's a copy. You take that poster and again make 1000 copies. Is there copyright infringement? If so, who, if anyone, is liable for damages?


Whoever put the poster up for display and reaped some reward out of it is liable for damages. Everyone else is just doing their job in the supply chain. We want supply chains to work for the good of the economy, which is a proxy for increasing availability and reducing prices of "goods and services" to the average person.


Imagine a paying Adobe CC customer.

They use Firefly to generate a poster, and unbeknownst to them, the image it generated is a reasonable facsimile of a copyrighted/trademark character.

The person has inadvertently committed copyright infringement.

So does Firefly need to come with a warning?

The safer solution, to the chagrin of another commenter, is for Adobe to neuter the tool by only training on data that Adobe has express permission to use.


Surely with all our contemporary AI prowess we can train a model that identifies "reasonable facsimiles of copyrighted/trademark characters" after generating them and alerts the user that the output could be argued as such. Still, let the user decide.

We do not need creative technology to regulate observance of copyright law.

(By the way I think the chagrined other commenter was yours truly ;-))
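
A hypothetical sketch of such a check (to be clear, this is not Adobe's method; the model choice and threshold are assumptions): embed the generated image with CLIP, compare against a precomputed bank of embeddings of protected characters, and warn rather than refuse.

    # Hypothetical post-generation similarity check, not Adobe's actual
    # pipeline: embed the generated image with CLIP and compare against
    # a precomputed bank of embeddings of known protected characters.
    import torch
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    def embed(image: Image.Image) -> torch.Tensor:
        inputs = processor(images=image, return_tensors="pt")
        with torch.no_grad():
            features = model.get_image_features(**inputs)
        return features / features.norm(dim=-1, keepdim=True)

    def looks_like_protected_work(generated: Image.Image,
                                  bank: torch.Tensor,  # normalized rows, one per known work
                                  threshold: float = 0.9) -> bool:
        similarity = (embed(generated) @ bank.T).max().item()
        return similarity > threshold  # warn the user; let them decide

The threshold is the hard part: too low and everything vaguely cat-shaped gets flagged, too high and obvious copies slip through.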


With that approach you risk ending up in a very frustrating loop of copyright rejections... A bit like picking a name in an MMORPG that's been out for a few months: a hell of constantly getting your name requests rejected over and over again.


Not really. You just have a checkbox somewhere that says "don't copy" and then it won't.

Leave the decision to the user instead of baking it into the technology.


Pretty sure "just having a checkbox" is a massively difficult problem when it comes to AI tech and baking it in is much easier.


A simple warning that what’s been generated looks similar to something that’s copyrighted is not a bad idea. Then it’s up to the AI user to do their due diligence if they intend to use the resulting work for commercial purpose. Neutering the tool from the get go is a step too far.


People accidentally recreate other companies logos in Adobe Illustrator all the time.


> This runs into the core problem with technology--we answer "What can we do" before "Should we do it" and "What are the impacts"

The counter is always, “Well, we can see our way forward to doing it, which means someone else can also do it. So… we better beat them to the punch.”


Reminds me of the recent ChatGPT/GPT-4 virtue signaling in the update notes: "Now AI will refuse more prompts, so it's safer". Yeah, this was requested by exactly 0 real human beings (lawyers and politicians excluded, of course).


I hate limitations as much as the next person, but these tools are viewed as generators by company xyz. You don’t want Disney to sue Adobe because the tool can circumvent IP and abuse it.

“Draw a Disney logo but for adults”

That image now lives on Adobe.com


> You don’t want Disney to sue Adobe

No, you don’t want Disney to sue Adobe.


So, the above was somewhat flip and terse, but the kind of lawsuit being avoided is also the kind of thing that provides clarity on legal issues and removes spaces of doubt. This can be broadly beneficial.

Giants battling it out can result in a clearer environment for everyone else that couldn’t afford legal risk in an environment of doubt.


What if I draw the logo with a regular 2B pencil?

I want to see Disney sue Faber-Castell for making great pencils I used for my deviant art

Also, IANAL, but even then there are probably fair use rights in parodying their logo


Does your art live on Adobe.com? I tried to make this clear. Currently "who made it" isn't as clear as it is for paper and pencil. AI isn't seen as a plain tool yet; it's part of the artist.


There will be open source tools replicating this within months. You can build your own model based on billions of images on the web or use someone else's or contribute to one.


To expand on this, what we're seeing with LLaMa is that you can fine-tune your model using other models.

It's not clear that the quality will be exactly the same (in fact it will very likely be worse), but working generators are essentially ways to quickly generate training data. And I can't think of a legal argument for why generated output from a model would be less legal to use as training data than an unlicensed photo off of DeviantArt.

Nobody has really called OpenAI out on this, but OpenAI has a clause in its TOS that you won't use output to build a competing model. But that's... just in its TOS. If you don't have an OpenAI account, it's not immediately clear to me (IANAL) why you can't use any of the leaked training sets that other people have generated with ChatGPT to help align a commercial model.

Certainly, if someone makes the argument that generators like Copilot/Midjourney aren't violating copyright by learning from their sources, it's very hard to argue that Midjourney/Copilot output is somehow different and can't be used to help generate training datasets.
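
To make the mechanics concrete, here's a rough sketch of the kind of harvesting loop people already run against ChatGPT (the prompts and file format are made up for illustration; doing this with an OpenAI account is exactly what that TOS clause forbids):

    # Rough sketch of harvesting a model's output as training data for
    # another model; prompts and output format are illustrative only.
    import json
    import openai  # openai-python pre-1.0 style API

    seed_instructions = [
        "Explain binary search to a beginner.",
        "Write a haiku about rain.",
    ]

    with open("synthetic_train.jsonl", "w") as f:
        for instruction in seed_instructions:
            response = openai.ChatCompletion.create(
                model="gpt-3.5-turbo",
                messages=[{"role": "user", "content": instruction}],
            )
            answer = response.choices[0].message.content
            # Each line becomes one supervised fine-tuning example.
            f.write(json.dumps({"prompt": instruction,
                                "completion": answer}) + "\n")

This is essentially how the Alpaca-style instruction datasets were built, and nothing in the resulting file marks it as machine-generated - which is the enforcement problem.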


O_o If you wanna draw hello kitty go for it.

You just have to, you know, draw it.

“How dare you not let me copy your stuff for free with zero effort!”

… entitled much?


hello kitty is not "their stuff"?

"draw it" in this context means "use AI to draw it for you"

am I being entitled or are they acting like they are The Law?


> The team is going to great lengths to make this ethical and fair (try and generate a photo of a copyrighted character like Hello Kitty or Darth Vader).

imagine doing something as unethical as drawing hello kitty


I know why, but why do you guys make your subscription management such an awful experience for users? I used to like Adobe, now I hate the company and will go as far as suffering massive inconvenience to avoid Adobe products.

Last time I canceled a subscription (you can't do it through the website, only by talking to support), when I finally got in touch with someone it took several hours to actually convince them to cancel.


One of the great aspects of open-source stable diffusion (civitai.com et al.) is there's a model for every purpose.

Does your inpainting model work with every style? Or is it going to have trouble matching the content for e.g. specific fanart?


It would have trouble matching on trademarked styles, or individual artists / creators styles.

One of the primary goals for Firefly is to provide a model that can generate output that respects creators and is safe for commercial use.

(I work for Adobe)


I don't really understand the negative comments on this. Though a hacker at heart, I'm a designer first and foremost.

And I'm extremely eager to get my hands on AI tools that let me extend my capacities based on _my own_ styles & context, and that are focused enough on this scope to evade future legal obstacles when used in production.

Very excited to try this tool!


Same. This is seriously tempting me to subscribe to Creative Cloud, if it's possible to run Photoshop on Linux.


It'll be useful, I'm sure, but if it can't match styles then you can only use it with itself. Clashing styles inside the same image is a terrible look.


I understand the intent but the result will clearly sway in the direction of protecting big brands, artists, and individual styles. There’s simply no way that it couldn’t. At some point in the pipeline, there’s a blocklist for copyrighted works of a finite size that’s decided by employees, no?


So that means it would have trouble matching my style, too.


Yes. Although we are working on allowing you to train on your own content.


Sounds like a good way of making it useless or otherwise 100X less useful than Stable Diffusion.


is civitai.com literally just 90% japanese porn?


You can hide NSFW stuff, and apparently they’re going to implement a way to hide anime.


I was just surprised that when you don't have filters on, it seems like it's just a porn platform. I generally don't like hiding NSFW stuff since those filters are usually overly conservative, but... it's the AI porn site in my head now.


Hey open-paren, do you know if the videos on this page are high fidelity mock ups or a demo of the current working beta?

From the wording on the page it sounds like these may be mockups:

> With the beta version of the first Firefly model, you can use everyday language to generate extraordinary new content. Looking forward, Firefly has the potential to do much, much more. Explore the possibilities below.

> Imagine generating custom vectors, brushes, and textures from just a few words or even a sketch. We plan to build this into Firefly.

> We’re exploring the potential of text-based video editing with Firefly.

> Get ready to create unique posters, banners, social posts, and more with a simple text prompt. With Firefly, the plan is to do this and more.

> In the future, we hope to enable Firefly to do amazing work with 3D.


The upscaling example is 100% just a downsampled image. Really makes you question the rest of the demos.


Please provide some insight into why you believe that to be the case. Avoid spreading FUD without any substantive backing of your claims.


What if you need to generate a picture of Hello Kitty for an article someone is writing about the art style of Hello Kitty or something? I.e., fair use cases.


Copy it? Use a different image generator?

This is just one tool. It doesn't need to be fully general.

