DALL·E 3 (openai.com)
704 points by davidbarker on Sept 20, 2023 | 505 comments



> Coming soon

Ah - this isn't out yet. That puts this in the "announcement of an announcement" category (https://hn.algolia.com/?dateRange=all&page=0&prefix=true&sor...).

Let's have a thread once the actual thing is there to be discussed. There's no harm in waiting (https://hn.algolia.com/?dateRange=all&page=0&prefix=false&so...).


I never saw a post being more deboosted before but this "announcement" was extremely underwhelming so I welcome it.


Happy to exceed expectations in underwhelming!


If anyone's interested: last year I generated around 7,000 images using DALL•E 2 and uploaded them to https://generrated.com/

I wanted a way to experiment/see what DALL•E 2 could create and share with others as some sort of inspiration/a starting point.

This was before the API was available, so I had to generate and save them all manually. And it was rather expensive! But fun.

Looks like I'll have to update them all for DALL•E 3 when I get access.


If you end up doing that, it would be cool to see the comparison of the v2 vs v3 images.


Absolutely. I've already started thinking about how I could incorporate some kind of comparison feature.

Because there's no way to control the seed, a direct comparison (using a before/after slider, for example) probably wouldn't make sense. But I could put the group of 4 images from each version above/below each other as a general comparison, perhaps?


> Because there's no way to control the seed, a direct comparison (using a before/after slider, for example) probably wouldn't make sense.

Even if it was the same seed, from my understanding Dalle3 would have to be just a further trained version of the same checkpoint to even resemble Dalle2's image. Like stable diffusion 1.4 vs 1.5 and 2.0 and 2.1 will make identifiably similar images, but 1.5 vs 2.1 vs SDXL won't look remotely similar.

Even more so because I'd wager they changed their encoder and/or decoder too.

* I think that if they generated something like a controlnet for guidance the same way in both models then they might be comparable but from my understanding Dalle2 doesn't work that way at all.

Comparisons would still be interesting though!


I think you're right — I thought about it a little more after I replied.

I guess it'll just have to be comparisons of the general concepts. It'll be good to see the change in understanding of the prompt and the change in image detail.

If anyone at OpenAI wants to give me early access to give me a head-start… smiles


Yeah I'll be interested to see how much you have to change the prompts to get similar styles


You can't use artist names in prompts anymore so I don't think you'll be able to use DALL-E 3 there?

EDIT: it's only living artists actually that you can't prompt (hopefully, the article says so at least)


That's kind of an interesting way to handle the copyright issues. Not sure how effective it is as I suspect that can be bypassed by including a bunch of details about the artist but not the name.

> ... in the style of a famous Spanish artist who was born in 1881 and passed away in 1973 [and a bunch of other shit about Pablo Picasso]

(I also notice that this is more verbose than just "in the style of Pablo Picasso", which probably helps OpenAI's bottom line given costs associated with token counts. I doubt that's their intention with the change, just something of note. And, of course, a living example would be more applicable for copyright issues but the idea is still demonstrated.)


The artist whose name includes Diego José Francisco de Paula Juan Nepomuceno María de los Remedios Cipriano de la Santísima Trinidad Ruiz.


I think it'll decline if you ask for a living artist, but sounds like it'll work for other artists.

But otherwise you're right — some might not work.


Crafting descriptions of an artist and/or individual works may function as a reasonable replacement for specific names. That's what's happening behind the scenes anyway.

It's an interesting problem. Like, what's the point of conception for a work of art?


Is anybody aware of the specific technical reason that it struggles with words so much? There seems to be enough of a pattern to create a million reasonable hypotheses - curiosity makes me want to know which it really is!

Looking at the images it's particularly interesting how it seems to have never once gotten the text correct, always just being a little bit off. Well sometimes way off, but mostly quite close.


Points to them still relying on a single small text embedding which is lossy instead of full cross-attention; but the fact that it can follow instructions reasonably well at least means that they've moved off CLIP - phew! It was frustrating arguing with people about DALL-E 2 limitations which were ultimately due to nothing but CLIP.


Yeah, it's really strange how hard they make it to manage, download, and get full prompts for your images on all these platforms. I made this discord bot for midjourney which with some easy configuration can download and annotate all your images, including as much info as I could grab about version, etc. https://github.com/ernop/social-ai/tree/main/SocialAI

Even then it's not perfect since I'm getting info off of the command you send, which may have fallen into whatever the defaults were at the time, and so when interpreted today, not easily possible to reconstruct the version/seed/etc. from that point in the past, if you didn't include it in the prompt. But still, I just like having a folder of 30k images that I can never lose, with at least the prompt, so I can go through and re-run them later (even manually) to get comparisons over time.
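A minimal sketch of the record-keeping problem described above, assuming prompts carry Midjourney-style `--v` and `--seed` flags. The function name and record fields are hypothetical, and this is not taken from the linked bot. The point is that anything not written explicitly in the command comes back as `None`, which is exactly the information that can't be reconstructed later.

```python
import re


def parse_prompt_command(command: str) -> dict:
    """Split a prompt command into its text and any explicit parameters.

    Simplification: only flags followed by a value are captured, and a
    value may not start with '-'.
    """
    params = dict(re.findall(r"--(\w+)\s+([^\s-]\S*)", command))
    # Everything before the first flag is treated as the prompt text.
    text = re.split(r"\s--\w+", command, maxsplit=1)[0].strip()
    return {
        "prompt": text,
        # None means "whatever the default was at the time", which is lost.
        "version": params.get("v") or params.get("version"),
        "seed": params.get("seed"),
    }


record = parse_prompt_command("a lighthouse at dusk --v 5.2 --seed 1234")
print(record)
```

Storing a record like this next to each downloaded image would at least make the gaps visible when re-running old prompts.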


That was really good. At first I thought it was a bit dull with just those 5 images, but the tons of different styles and concepts exemplified made it a great inspiration.


Yes, I'm interested! It's absolutely incredible to have your pictures with the associated prompts. Thanks a ton!


Very kind of you to say — thanks!


Thoughts:

- ChatGPT integration is absolutely huge (ChatGPT Plus and enterprise integrations coming in October). This may severely thwart Midjourney and a whole bunch of other text-to-image SaaS companies, leaving them able to focus only on NSFW use cases.

- Quality looks comparable to Midjourney - but Midjourney has other useful features like upscaling, creating multiple variations, etc. Will DallE3 keep up, UX wise?

- I absolutely prefer ChatGPT over Discord as the UI, so UI-wise I prefer this.


What I think could be amazing about ChatGPT integration is (holding my breath…) the ability to iterate and tweak images until they’re right, like you can with text with ChatGPT.

Currently with Midjourney/SD you sometimes get an amazing image, sometimes not. It feels like a casino. SD you can mask and try again but it’s fiddly and time consuming.

But if you could say ‘that image is great but I wanted there to be just one monkey, and can you make the sky green’ and have it take the original image and modify it, then that is a frikkin game changer and everyone else is dust.

This _probably_ isn’t the way it’s going to work. But I hope it is!


I am using Automatic1111 and SDXL and this is exactly the workflow people are doing with that.

Generate an image, maybe even a low quality one. Fix the seed and then start iterating on that


do you have a link on how this is done? what does "fix the seed" mean? I was experimenting last night with seed variation (under Extra checkbox next to the seed input box) and i couldn't accurately describe what the sliders did. Sometimes i'd get 8 images that were slight variations, and sometimes i'd get 3 of one style with small variations, and then 5 of a completely different composition with slight variations, in the same batch.

As far as the OP goes, they claim you don't need to prompt engineer anymore, but they just moved prompt engineering to chatgpt, with all of the fun caveats that comes with.


I think by "fixing the seed" they just meant using the "Reuse seed from last generation" button. By default, this will mean that all future images will get generated with the same seed, so they'll look identical. The "variation seed" thing mixes slight variations into this fixed seed, meaning that you're likely to get things that look similar, but not identical to the original. The "variation strength" slider controls how much impact from the variation seed is mixed into the original, with 0 being "everything from the initial seed", and 1 being "everything from the variation seed". I'm pretty sure that leaving the width/height sliders at their default positions is fine.

Also, just for future reference - lots of UI elements in A1111 have very descriptive tooltips, they help a lot when I can't quite remember the full effects of each setting.
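To make the seed / variation-strength description above concrete, here is a toy sketch in plain Python. It assumes the starting noise is fully determined by the seed and that the variation seed is blended in with a spherical interpolation. As far as I know that is roughly what the web UI does, but treat the blending method, the function names, and the 8-element "noise" as illustrative assumptions, not the actual implementation.

```python
import math
import random


def noise_from_seed(seed: int, n: int = 8) -> list[float]:
    """Draw n Gaussian samples; the same seed always gives the same list."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def slerp(strength: float, a: list[float], b: list[float]) -> list[float]:
    """Spherical interpolation from a (strength=0) to b (strength=1)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    dot = sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)
    omega = math.acos(max(-1.0, min(1.0, dot)))  # angle between the vectors
    sin_omega = math.sin(omega)
    if sin_omega < 1e-6:
        # Nearly parallel vectors: fall back to a plain linear blend.
        return [(1 - strength) * x + strength * y for x, y in zip(a, b)]
    weight_a = math.sin((1 - strength) * omega) / sin_omega
    weight_b = math.sin(strength * omega) / sin_omega
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]


base = noise_from_seed(1234)        # the fixed ("reused") seed
variation = noise_from_seed(5678)   # the variation seed

same_again = noise_from_seed(1234)  # identical to base: fixed seed, same noise
slight = slerp(0.1, base, variation)  # mostly base, a little variation
```

With strength 0 the result is just the base noise, and with strength 1 it is just the variation noise, which matches the slider behaviour described above.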


Yes, that describes it pretty well.

By default the random seed is different for each generation, so even if the prompt and every setting is the same you'll get vastly different images ( which is the expected outcome ).

When you set the seed to a fixed number you get the same image every time you generate if all other settings and the prompt are exactly the same.

So I start off with a pretty high CFG Scale to give it enough room to imagine things ( Refiner and Resize turned off )

Once I have a good base image, I'll send it to img2img and iterate some more on it.


Thanks for info. I don't use img2img except when i want to "AI" a picture, but this makes sense, as one of the steps. I usually just "run off"* dozens of images and pick the best one - but i don't do this for a living or anything, just to be able to "find" a picture of something in less than a minute.

* like a mimeograph!


Not quite. In A1111, the process is "pick a seed and iterate the prompt". You're still regenerating the image from noise, and if SD decides to hallucinate a feature that wasn't in the prompt, it's often very hard to remove it.

Since gpt4 natively understands images, there's the potential for it to look at the image, and understand what about it you want to change


Exactly - iterative "chat-driven" improvement of images will be a paradigm shift


That looks exactly like what was in the video (though the video seems to be down?)


Didn’t see a video! OK I am pretty excited now.

So long as it actually can create an image virtually the same but changed only how I want it. That would just blow everything else away.

I mean I guess it’s not that ridiculous. Generative fill in photoshop is kind of this, but the ability to understand from a text prompt what I want to ‘mask’ - if that’s even how it would work - would be very clever


You can _kind_ of do this now, but it's excessively manual. I agree that it would be awesome to have the AI figure out what you mean to iterate on a given image.

I recently had a somewhat frustrating experience of having a generated foreground thing and a generated background that were both independently awesome - and trying to wheedle Stable Diffusion into combining them the way I wanted was possible, but took a lot of manual labor, and the results were "just ok".


Dall-e 2 had variations, inpainting, etc well before Midjourney. However, I 100% agree that it's going to be interesting to see who and what wins this race.


Stability AI with Stable Diffusion is already at the finish line in this race: it costs $0, it's open source, and it isn't exclusively a cloud-based AI model, so it can be used offline.

Anything else that is 'open source' AI and allows on-device AI systems eventually brings the cost to $0.


I agree. I am barely excited for DALL-E 3 because I know it's going to be run by OpenAI who have repeatedly made me dislike them more and more over the last year plus. My thoughts are: "Cool. Another closed system like MidJourney. Chat integration would be cool but it's still going to likely be crazy expensive per image versus infinite possibilities with Stable Diffusion."

Especially with DALL-E. Honestly I'd be more excited if MidJourney released something new. DALL-E was the first but, in my experience, the lower-quality option. It felt like a toy, MidJourney felt like a top-tier product akin to Photoshop Express on mobile, still limited but amazing results every time, and Stable Diffusion feels like photoshop allowing endless possibilities locally without restrictions except it's FREE!


They all have their place. OpenAI literally started every major AI revolution including image gen with DALL-E. Let them be the peleton while SD and others follow closely and overtake eventually.


I like painting them as the peleton! You're not wrong it's just not super exciting for me


What's a peleton? Genuine question from a non native English speaker.


I think they mean peloton. It's not English, I believe it is French, meaning a group of bicyclers. In this context it refers to the peloton leading the race.


there has to be a way to link the API from automatic1111 and a "gpt" or "bert" model, to allow similar flexibility, right? The only issue i see is training the llm on the rules of image composition correlated to what CLIP/Deepbooru sees. maybe there will be a leak, or someone can convince one of the "AI Art example/display" sites to give them a dump of images with all metadata. enough of that and this sort of thing seems like a "gimme".

I just started training LoRA, and the lack of definitive information is a burden; but following the commonalities in all of the guides i was able to get a LoRA to model a person i know, extremely accurately, about 1/8th of the time. in my experience, getting a "5 star" image 12% of the time is outstanding. And when i say 1/8th i mean i ran off 100 images, and hand ranked the ones that actually used the LoRA correctly, and 4 and 5 star ranks were 21/100 - i just double checked my numbers!


That's a different race entirely; most people don't even have the hardware, let alone the knowledge to run SD.


unless it's way worse


SDXL and older models are quite good. It's not yet very user friendly but will get there.


> ChatGPT integration is absolutely huge

Probably not. Bing Chat (which uses GPT-4 internally) already has integration of Bing Image Creator (which uses Dall-E ~2.5 internally), and it isn't good. It just writes image prompts for you, when you could simply write them yourself. It's a useless game of telephone.


To be fair, nsfw is a considerable market, which itself is plenty big enough to hold many many businesses. llm trained on literotica, anyone?


> llm trained on literotica, anyone?

I'm not going to post links, but there are several active projects & companies already doing that.


I don't see it. I use chatgpt to create prompts for midjourney. It takes me a couple of clicks only. I don't see the massive difference. Especially since midjourney is much much better than DallE


> DALL·E 3 is designed to decline requests that ask for an image in the style of a living artist.

> Creators can now also opt their images out from training of our future image generation models.

So this version was again trained (without permission) on copyrighted work. And they try to shift the burden onto artists to manually opt out.

Aren't they afraid some court might, at some point, force them to pay each artist back a fee for each generated image?


> So this version was again trained (without permission) on copyrighted work.

So am I!


Yes but you were allowed because you are a human (I assume). Human rights are not transferable to even animals, much less machines.

You also have the right to have your eyes open in a public dressing room, but you’re not allowed to turn on your camera and film.

You’re also allowed to have fireworks but not recreational C4s. And so on.


Completely true, but copyright is not a "right" in the sense of human rights, it's a legal construct that we created to create certain social benefits. And it certainly wasn't my impression that most HN users view the current state of copyright law as an unmitigated positive force in the world.


Except you were actually trained, while the AI applied some form of computation that we (perhaps opportunistically) call "training" but it isn't really the same thing. You can't win in court just by naming things the same.


What is the qualitative difference between the two?


How do we access your brain's API for fractions of a cent? I've got $100 ready to go!


$100 grants you continuous access to my higher-level functions for about half an hour. Inquire within!


The usual way. Send me a check every couple of weeks, and a 1099 at the end of the tax year.


> Aren't they afraid some court might, at some point, force them to pay each artist back a fee for each generated image?

I'd say they're banking on the horse having bolted by the time such a thing might happen (i.e. courts would need to force 1000s of very large powerful companies to pay millions of people - an insurmountable legal effort).


Seems like we are experiencing another historical robbery.


Oh no! Really? Surely you can point to what has been stolen, right?


Livelihoods


"I am no longer able to sell my product in the market since it has been commoditized to the benefit of everyone else" is not the same as "I have been robbed".

This is literal Luddism.


From the start OpenAI started with semantically overloading the "AI is an existential risk" argument from "AI is going to make me starve" to "AI is going to go rogue"


I don't really see AI as an existential risk (nor something that'll single-handedly starve people nor "go rogue" - unless one defines that as something like "having CVEs").

This is less about AI, per se, and more about corporate -vs- personal IP rights. Historically, IP law has bent to benefit large corporations while citing personal IP rights as the raison d'être (& never fulfilling on that justification). What OpenAI (& many others) are doing here is just very flagrantly demonstrating how that justification was only ever an excuse, & that restrictions imposed by the Berne Convention, et al, have never really applied to corporations at scale (outside of small case-by-case exceptional examples).

The livelihoods being stolen are not being stolen by AI - rather it's a further reinforcement (scaling up) of a system that has been doing so for well over 100 years.


> again trained (without permission) on copyrighted work

So far there is no solid proof of that. They didn't disclose the sources or the methodology. Except for 'trained' and 'copyrighted' the rest is questionable. Otherwise they would be already paying royalties.

They could have used the output from prev version 2 with prompts generated by GPT, and then corrected by humans based on the produced image. Also they could use CV to analyze new/old images. I.e. if there is a new feature in the image add it to prompt and train again.


A living artist should be able to request an image in the style of themselves.


It’s absolutely insane this is “allowed” under regular copyright, and now going into mass-consumer commercial products. Given the state of courts, I don’t have any hopes this will be reversed.

They are certainly striking under-the-table deals with big IP holders like Disney to not poke the bears, but leave all smaller actors defenseless (or rather penniless, more so than they already are).


>Aren't they afraid some court

Sam Altman has already empirically proven himself to be rich enough to be above the courts with the whole WorldCoin thing, why should he assume it would suddenly be different now?


There is nothing illegal about World Coin.


It's already illegal in some places and will be in more soon

https://www.google.com/amp/s/www.coindesk.com/policy/2023/08...


It’s a price they are willing to pay.


All artists are trained on copyrighted work.


And they pay a license fee to do so. Artist textbooks are not free, because they pay the rights holders to reproduce their work.


categorically false.


What is categorically false?


that artists pay a license fee to learn from other artists' work. it’s an utterly ridiculous argument.


Bullshit. You can look up all works for free online. What a sorry argument. Copyrighted means you can’t reproduce it, not that you can’t look at it.


There’s probably an argument against my point, but this sure ain’t it. I can watch every Disney movie for free on the internet, but that doesn’t make it legal.


It's legal in Germany.

What's illegal here is unauthorized distribution. Downloading a copy of something from a public server cannot get you in trouble here. Only publishing does.


It's not and never was. In 2008 it even got raised from a civil to a criminal offense. It's a common misconception because it is more lucrative to go after people that also upload/publish due to higher possible damages.


For the most part we’re talking about static images here, at least for the current state of LLMs and those are all freely viewable online.


This does look like it might be a real threat to Midjourney, but isn't going to dethrone Stable Diffusion. I'm guessing prompt adherence will be excellent, but the lack of customizability and the art style gimping are going to limit its use greatly. People are going to produce base images using Dall-E 3 to get composition, then run them through Stable Diffusion for style/upscaling/details.


Why isn't this a threat to Stable Diffusion? I see more super high quality stuff from midjourney that I struggle to ID as generated than I do SD.


Stable Diffusion is open and fully deterministic: a given version of SD+tools+seed shall always give exactly the same output. The model is available to everyone so you run it locally.

Which means there are countless (free) amazing tools around SD.

StableDiffusion is threatened by exactly nothing.

(others have mentioned that SD shall happily generate porn: I don't care about that... But I care about SD being the actual "open AI").


Maybe I'm doing it wrong, but I couldn't get SD+tools+seed to be deterministic.

Images generated, with the exact same settings (including seed), on m1 laptop are not the same as the images from my nvidia GPU desktop with the SD-webui.
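One commonly cited reason for this kind of mismatch (not a diagnosis of the setup above) is that floating-point addition is not associative, so backends that sum the same numbers in a different order can produce slightly different values, and small differences can grow over many denoising steps. Random number generation may also differ between devices, but that part is an assumption on my side. The rounding effect itself is easy to show in plain Python:

```python
# The same three numbers, added in two different orders.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right)                   # 0.6000000000000001
print(right_to_left)                   # 0.6
print(left_to_right == right_to_left)  # False
```

So "same settings, same seed" is only guaranteed to reproduce an image when the hardware and software stack perform the arithmetic in the same way.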


Because SD runs locally and produces the kind of content people want.


> the kind of content people want.

Well put. I was wondering how that aspect was going to be acknowledged & phrased here.


true, fortunately there are many more things to create than nudity, sexually explicit, and political figures

“so for everything else, there’s Midjourney”


SD is also free and there are custom models for anything you want whether it’s creating landscapes or anime waifus. And those specific models are better than the general model that Midjourney uses which is “good enough” at everything.


Not only does it have custom models, you can train your own custom model. This is a big deal for many workflows that Midjourney and DALL-E just can’t accommodate (at least for now!).


Yep. 99% of what I do with SD is generate images of my wife, daughter, and dog.


Dall-E also doesn't allow you to say "in the style of X" and similar. Which I guess appeases some concerns from artists, but also makes it harder to guide it where you want it.

So it's not just about wanting to make explicit images, it's about not having a different entity control what you can and cannot do.


I'm able to observe that as well, I'm also able to observe that caring about that isn't necessary in most use cases and the convenience still wins.

I do branding and presentations all the time with graphics from Midjourney and rapidly iterate in that discord chat in various branches compiled in parallel all day. Make something as convenient, cheaper, private/offline, and faster and I'm there. Otherwise the ideology is irrelevant to me, and the unsaid secret is that that's true for most everyone else too.


Midjourney is good for inspiration, but as more people use it things that are just whole cloth generations are going to start to look same-y and cheap. Some iteration/post processing will be necessary not to look low rent.


It was more a rebuttal to you claiming that it only has explicit images going for it. Almost like you were labeling those using SD to use it only for your claimed reasons.


with SD you can use LoRA, and "artist style" is, to my understanding, the easiest thing to make a model of. With LoRA (Low Rank Adaptation) you describe everything in an image that is not what you want the model to model. With a style, you can just use BLIP/CLIP or deepbooru to describe all of your images. At worst, you might have to remove other styles/artists in the tags. the model will only learn about the style. supposedly. I don't follow art enough to have a favorite style; i don't have enough source material on hand to do a style model, nor do i know enough to go out and grab a dataset.

As an aside, textual inversion also made some great inroads into this sort of thing. So smallest size to largest: textual inversion(megs), Locon/lycoris/lora(tens of megs), full model (gigs!). The accuracy range over all compositions follows the same respective order, as well.
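A rough worked example of why a LoRA file ends up so much smaller than a full model: instead of storing a new `d_out x d_in` weight matrix, it stores two thin matrices `B` (`d_out x rank`) and `A` (`rank x d_in`) whose product is added to the frozen weights. The layer size and rank below are made-up illustrative numbers, not taken from any particular Stable Diffusion checkpoint.

```python
def full_params(d_out: int, d_in: int) -> int:
    """Parameters in one full weight matrix."""
    return d_out * d_in


def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the low-rank pair: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


d_out = d_in = 768  # hypothetical layer size
rank = 8            # hypothetical LoRA rank

full = full_params(d_out, d_in)        # 589824
lora = lora_params(d_out, d_in, rank)  # 12288
print(full, lora, full / lora)         # 589824 12288 48.0
```

For this one hypothetical layer the low-rank update is 48 times smaller, which is consistent in spirit with the "tens of megs vs gigs" ordering above, though real file sizes depend on which layers are adapted and at what rank.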


Loras are 1 to hundreds of megs for sd15 (I've seen 1mb Lora and also 700mb Lora merges) and I think around 1gb+ for sdxl.

Hypernetworks exist as well for sd15 at least, they're up to I think around 80mb?


Stable Diffusion + fine-tuning is quite powerful for creating specific art styles that aren't easily described to DALL-E / MJ


Because SD has way less commercial restrictions and provides far more control than a text prompt alone ever could even with a real human on the other end.


Stable Diffusion is extensible - for example, ControlNet images have gone super viral on Twitter in the last few days. https://twitter.com/0xgaut/status/1702394230478360637 https://twitter.com/deepfates/status/1701055664603426970

Also, it's basically free if you own a Mac or an iPhone.


It is completely free even if you don't own an Apple product


SD is uncensored, you can generate anything you want with no "safety" rails to stop you.

Just take a look at civitai to see the kind of finetuned models that are out there.


I do feel Midjourney is better in the "press button and pretty image pop out" but if you want to have control over the process and results then tools like Invoke are generations ahead.


Because people can make porn with locally hosted Stable Diffusion.

They can't do that with MidJourney and DallE.


Yeah honestly the art style gimping made me roll my eyes. Not really interested. MidJourney's filtering was annoying enough and ChatGPT's pointless refusals to do simple things because they could be misinterpreted annoyed me to no end. Combine the two and add explicit filtering for artist styles... yeah I'll just pass on this one.

EDIT: for what it's worth, I'm not making NSFW stuff with MidJourney. I'm talking about things like being unable to use the word "cutting" or "slicing" because they could be used to make gore but I wanted "A stock photo of a person cutting cheese on a counter"


Funny how they say: "Creators can now also opt their images out from training of our future image generation models." and then the link is just a form to submit a single image at a time.

They mention you can disallow GPTBot on your site, sure, but even if you do, what happens if the Bot already scraped your image? In any case, probably other people would just publish your picture in some other website that does not disallow GPTBot anyway.
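For reference, the site-level opt-out mentioned above is a `robots.txt` rule for the `GPTBot` user agent. Below is a small sketch that uses Python's standard `urllib.robotparser` to check what such a rule permits; the example URL and the `SomeOtherBot` name are placeholders. It only shows what a well-behaved crawler would conclude from the file, and says nothing about images that were already scraped or re-posted elsewhere.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that blocks only GPTBot from the whole site.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/art/piece.png"  # placeholder URL
gptbot_allowed = parser.can_fetch("GPTBot", url)
other_allowed = parser.can_fetch("SomeOtherBot", url)  # placeholder crawler

print(gptbot_allowed, other_allowed)  # False True
```

As the output shows, the rule is per-crawler and per-site, which is exactly the limitation raised in the comment above.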


I have been creating a large number of Midjourney images for a project recently, and it’s made me remember/realize something that I think the AI Art doomsayers seem to not understand: the importance of curation.

A quick glance at /r/Midjourney or even the images featured in DALLE link above shows how boring the “default result” is when using a generator. While it may be easier to create images, you still need some artistic sense and skills to figure out which ones are appealing. In the bigger picture I think this basically means that illustration-type art will become more of a curatorial activity, in which being able to filter through masses of images becomes the predominant skill needed.


Something like photography?


Yeah I think that’s a good comparison. Images are cheap to make and everywhere. That doesn’t mean everyone is suddenly taking good photos.


I’m not terribly familiar with the text to image tools, but you can provide source images as baseline, right? I’d wager that if you’re able to create a baseline image to feed in, your results will be better. The better the input, the better the output. It definitely feels like a situation where artists who can leverage ai will be the ones pulling ahead in the commercial sector.


It doesn’t really work that way. Yes, you can use images as a source, but they are more just mined for “pieces” to rearrange, not overall aesthetic effects.


That is not how image-to-image approaches work.

ControlNet is an obvious counterexample. If you think "diffusion is just collaging", upload a control image using this space that cannot exist in the source dataset (e.g. a personal sketch) and generate your own image: https://huggingface.co/spaces/AP123/IllusionDiffusion


It’s not that I think it’s purely collage, but that inputting a high-quality image doesn’t somehow lead to generating better quality output by default. The various silly images created by using keywords like “Greek sculpture” or “Mona Lisa” are an example.


You can do it with ControlNet guidance for SD.


I don't know how long such a phase will last, and I don't think anyone should count on it.


They can probably train an AI to filter based on human appeal. But IMO there's still room for artists with taste and technical talent to manipulate images closer to a curated ideal. Like the current SD photoshop workflows where generated content creates a base. I imagine once the workflow matures, there's going to be more manual input again, i.e. drafting specific postures / arrangements for controlnet to block out composition before AI fills to 90%, and then human taste tries to refine/reiterate the last 10%.


Given the bullet point of "DALL·E 3 is built natively on ChatGPT" and the tight integration between ChatGPT and the corresponding image generation (and no research paper released with the announcement), I strongly suspect that DALL-E 3 is a trial run of GPT-4 multimodal capabilities and may be run on a similar infrastructure.


GPT-4 can only do text-to-text and image-to-text. It can't generate images itself. So it will simply use an API call. Really nothing special, Bing does the same thing.


The art produced by GPT-4 so far hasn't been at this level but this may be a newer version. See: https://arxiv.org/pdf/2303.12712.pdf


Have they removed copyrighted "training" material or are they still relying on people's hard work which they "learned from" without consent and are selling without permission?


From the end of this announcement (emphasis mine):

"DALL·E 3 is designed to decline requests that ask for an image in the style of a living artist. Creators can now also opt their images out from training of our future image generation models."

Very carefully-worded statement. So...still relying on people's hard work, but on the upside, you get to opt out of having your work be fodder for DALL·E4. </s>


So

* using living artists' work to train models = good

* generating living artists' work using said models = bad

Good ethical consistency from the OpenAI crew


Well OpenAI is consistent in that it consistently tries to monetise other people's work without paying for it and in doing so it leverages the gullibility of the masses to defend their actions. Clever.


An artist using an artbook as reference material = good

An artist tracing an image in an artbook and selling it = bad

Seems consistent to me


I think it's worth reframing this analogy:

An artist using an artbook as reference material = good

An artist with one billion arms, using every artbook ever published, creating millions of images per second in every possible style for fractions of a penny per image, outcompeting every other artist forever = bad


I think I'm sympathetic to that idea, but the ethical consideration there isn't that it violates copyright. It's that it's disruptive to society/the economy.

Or maybe one could argue that because it's owned?


Art books are not free, they require paying a licensing fee to the rights holder.


If the artist stole the artbook from a shop, the problem is not that they are using it as a reference; the problem is that they stole it. Likewise if they download the pdf from a pirate site. It is a separate, different problem. You can tell because there would still be a problem even if they didn't then go on to produce art, or read it at all.

If, however, the rights holder chooses to just give them away; or, y'know, puts them on their own website for anyone to look at - there is then no license fee to be paid for looking.

Note that this still does not mean someone can make copies and sell them. That's a separate right. But using such materials as a reference is just fine, and people do this all the time.


But not opt out of training Dalle4 on Dalle3 outputs, which were trained on your art.


Legally this is likely a non-issue. It depends on if they successfully make the case that their AI learns like a human, and thus any outputs that aren't direct copies of existing art are new creations.

This even applies if the AI copies an artist's art style (in the same vein as a human looking at one artist's art over a weekend and then being commissioned to paint something in the same style, which is completely legal since you can't copyright an art style; although Adobe would love that[0]).

0: https://twitter.com/UltraTerm/status/1679294173793628161


This kind of argument is always absurd to me because it doesn’t matter if it learns like a human or not, it’s still not a human.

(I would also argue that it learns and generates images in ways that are non-human, just based on speed and scale alone)


How would you turn that into legislation though? I can simply draw a few pieces of art in the style of Naoki Urasawa, train a model on it, and claim that the outputs of the model are non-infringing. An artistic style is either copyrightable or not - I don't think a blurry middle ground helps anyone.


Corporations are people too... in a particular legal perspective.


Corporations are not literally people in a legal sense. Corporations have some of the same rights as people in certain legal transactions because treating them as entities rather than each individual within a corporation separately is more convenient, and because to do otherwise would in some cases infringe upon the rights of the individual humans who make up the corporation (for example, corporations have a right to free speech because the individuals within that corporation have a right to free speech, and it would be impossible to deny free speech to corporations without also denying it to individuals.)

But systems of law are still capable of recognizing the distinction between the personhood of corporations and of people, just as they can recognize the difference between humans and AI even if AIs can be demonstrated to "learn" the way humans do. As always, context and nuance matter. Laws aren't written or decided upon based on pure logic or calculus but on what human beings want and consider to be in their self-interest.


So the law says that a painter can't steal copyrighted images, but a programmer can. Maybe the law has some catching up to do.


That's the exact opposite of the argument being made.

If you wouldn't be able to sue a painter over producing an image, why should you be allowed to sue a programmer over producing that image?


They haven't. Probably one reason Adobe's AI will beat them out long term.

Also another thing that's been on my mind: I wonder if all this AI generation stuff could cause a games-industry-style crash, where an oversaturation of highly advertised but meaningless/worthless AI content makes consumers lose interest and stop spending money in the respective industries (books, games, films, digital art, music, etc.), which then crash.


If there's an AI crash it will be due to the vast number of AI companies with insufficiently differentiated products and subsequent race to the bottom, not due to the ubiquity of the output.


I am not aware of a games industry crash, it would appear that gaming is an industry larger than all other forms of art combined. But indeed niches that are saturated by enshittified content have almost crashed and I get your point. I suppose the average will turn even more average and indeed people will stop spending money on it. AI being a statistical machine, it will excel at making whatever is common and plentiful, and as such those industries will suffer even more. Average music, content writing, drawings, etc, will drop to near zero value, that's guaranteed.


The comment you're replying to is referring to the video game industry crash in North America in the early 1980s. Basically the market was flooded with games of poor quality due to a lot of factors, including Atari's complete lack of quality control on games they put on their 2600 console. Nintendo ended up redesigning their Famicom console as the Nintendo Entertainment System with an emphasis on it looking like a VCR as opposed to a cheap game console like NA audiences were used to (the Famicom itself is fairly small and plasticky with permanently attached controllers). Additionally they were strict about licensing development on the system with the goal of fostering a crop of family friendly, high quality games. It was a couple years after the fact (iirc) but Nintendo's efforts to differentiate themselves in the wake of the crash obviously paid off and led to a long period of Japanese ascendancy in the games markets. So the crash cleared out a lot of the market and led to a huge opportunity for Nintendo.


Idk. I think this is an overcomplexification. To a large extent, the NES was an instant success because Super Mario Bros was simply an amazing game.


No, this has been directly confirmed by Nintendo developers at various points. For people to even be able to try SMB, they'd have to be convinced that the NES wouldn't suffer the fate of the Atari 2600 and the like. It wasn't just about the VCR-looking design - the marketing materials had very careful wording to avoid associations with the failed game consoles, and accessories like R.O.B. (that never even existed in Japan!) were mostly made to make the NES look like a complex electronic toy, and not a game console.


Sure. But that doesn’t mean it was the thing that mattered. The NES is clearly a video game console.



OpenAI licenses image data from Shutterstock, so it's possible that this is trained entirely on licensed images.

https://investor.shutterstock.com/news-releases/news-release...

More transparency about the training data, as always, would be greatly appreciated.


Truly, I have always hated how human artists have "trained" by looking at other people's art without permission, downloaded those without permission into their meat-brains, and trained their organic neural networks on this art.

You can't do that. It's copyright-maximalist copyright infringement.


The fundamental differences in scale between manual recreation by a human and automated replication by a machine are what led to the creation of copyright law in the first place.


No, that isn't what led to the creation of copyright in the first place. The concept predates Gutenberg's printing press.

Copyright in the United States was drafted into the Constitution as a way of rewarding creators so they could create more.

They're already being rewarded, perhaps too handsomely, there is no need to extend it further. If they persist in trying to take more than they're given, then the public will just need to revoke the privilege. It's not a human right.


The concept only barely predates the press, with the first actual copyright laws being in the 1700s. https://en.wikipedia.org/wiki/History_of_copyright


Proto-copyright goes back centuries before that. I think around 600AD, King Dermott said "to every cow its calf, and to every book its copy". Then it really picks up in the Middle Ages.

It's in your own fucking link for godsakes.


As I said, barely.

Also in my "own fucking link":

> Prior to the invention of movable type in the West in the mid-15th century, texts were copied by hand and the small number of texts generated few occasions for these rights to be tested. During the Roman Empire, a period of prosperous book trade, no copyright or similar regulations existed, copying by those other than professional booksellers was rare. This is because books were, typically, copied by literate slaves, who were expensive to buy and maintain. Thus, any copier would have had to pay much the same expense as a professional publisher. Roman book sellers would sometimes pay a well-regarded author for first access to a text for copying, but they had no exclusive rights to a work and authors were not normally paid anything for their work. Martial, in his Epigrams, complains about receiving no profit despite the popularity of his poetry throughout the Roman Empire.

> The printing press came into use in Europe in the 1400s and 1500s, and made it much cheaper to produce books. As there was initially no copyright law, anyone could buy or rent a press and print any text. Popular new works were immediately re-set and re-published by competitors, so printers needed a constant stream of new material. Fees paid to authors for new works were high, and significantly supplemented the incomes of many academics.

Incidentally, if you click the link about King Dermott and get to https://en.wikipedia.org/wiki/Battle_of_C%C3%BAl_Dreimhne, it says that's "an account that first appears... nearly a thousand years after the alleged events supposedly took place, and therefore a highly unreliable source".


> without permission

or attribution even


That ship sailed long, long ago with ImageNet, I'm afraid. All that theft is part of "the economy" now, which means it ain't comin' back. Best we can hope for is a legal decision that says, "AI doesn't make shit. It's all public".


Funny how people complained about Chinese factories stealing IP and wanting protection, but now they defend OpenAI and others, and even rejoice at the idea that this will "free society". Not clear what it will be free to do, since many who were left without jobs due to said factories have switched to white collar work, which is now to be stolen by ... the people that complained about China stealing their IP. It's hilarious to watch this mass hysteria kickstarted by one single corporation. People are literally like cattle - you can steer them any direction you wish if you know how.


> but now they defend OpenAI and others, and even rejoice at the idea that this will "free society"

A tale as old as time, "it's different when we do it".


Funny how there's so many well paid IP lawyers around, and they focused so hard lobbying to extend the copyright to 100+ years for simple copying, but they never had the imagination and creativity that copyright is supposed to be all about protecting, to extend copyright beyond that.


Frankly, I'm quite tired of having artists steal each others' ideas without payment. To say nothing of children.


Have human artists done so?


But a machine is not a human, so your analogy is moot.


Most of the times this argument is fielded, including this one, it is formulated in the shape of an appeal to a general moral principle. I don't see what this principle is supposed to be: as my analogy shows, there is clearly no general moral principle against learning from copyrighted material and people's hard work without their explicit permission. The more narrow interpretation, in which the claimed principle is that a machine must not learn from copyrighted material (...), is also implausible: since we have no real history of machines learning from copyrighted material in any way that is recognizable as learning, it stands to reason that a principle addressing that scenario can not yet have become general.

The appeal is thus to a completely novel principle that you have come up with for yourself; and it seems that rather than presenting arguments for why others should adopt this principle, you are trying to present it in such a way that someone not paying close attention would be fooled into believing that it is common sense and widely accepted. An analogy with the classic "you wouldn't download a car" comes to mind.


But a human is a machine.


Yeah if we overdose on LSD often the two are indeed equally capable of memorising data and mixing it together.


I've honestly got no idea what that means. Can you elaborate?


Humans are nothing like a machine. Machines can be taken apart piece by piece and put back together again.


Humans can indeed be taken apart piece by piece and put back together again. We just don't have that level of technology yet. There's nothing physically stopping it from happening though.


You can melt down a lathe and it is quite hard to reassemble it, and even if you did reform the entire thing people would doubt if it is the same lathe.

Humans have had parts removed and reattached. With transplants components have been replaced entirely. There is a point at which you can destruct a machine from which it is impossible to reconstruct without getting into ship of Theseus issues. That point is different for different things.


How would you take apart a ruler?


A ruler isn't a "machine"...


Sometimes debating the issue with AI fans is like arguing with children. They come up with all sorts of what they think are "clever comebacks" but really all they do is a reduction to absurdity. It only proves the fact that they are indeed children and fail to understand the topic altogether. In such scenarios it is best to leave them be.


Prod the anti arguments enough and half of them are religious and the other half are pragmatic. I've got no objection to either, but they're not deep.


Does that even apply to an LLM?


Are you trying to start a conversation that will revolve around looking up dictionary definitions?


Hmm. I don't think so. Whether humans are machines or not is really a matter of faith, not dictionary definitions, I would think.

It was a pithy one-liner about categories in response to a pithy one-liner about categories.

But I'd say the underlying question I'm trying to ask is philosophical: what property do humans have and machines lack that makes the first's learning from copyrighted works acceptable, and the second's unacceptable? (eastof suggested a property below).


And another round of races commences!

Some folks seem to have some strong ire towards OpenAI (maybe a bit less recently), but for one, they seem to do a really, really, _really_ good job at making themselves "the benchmark to beat" for certain things, and in doing that, I think they really seem to push the field quite far forward. <3 :'))))


I dislike OpenAI because they were founded to work on AI safety, and the most anti-safety thing you can possibly do is encourage competition over AI capabilities, which is exactly what they are doing over and over again.


AFAICT, “AI safety” was a term created by the overlapping (sometimes in the same body) group of X-risk cultists and corporate AI marketeers as part of their effort to redirect concern from the real and present problems created and exacerbated by existing and imminently-being-deployed AI systems into phantom speculative future problems and corporate prudishness.


X-risk concerns have been around for a long time and were not invented by AI marketeers. I agree that the marketeers are abusing the concept to try for regulatory lock-in and to make their products look maximally impressive.


> X-risk concerns have been around for a long time and were not invented by AI marketeers

I didn't say X-risk concerns were invented by AI marketeers, I said the “AI safety” language was invented by the overlapping groups of X-risk cultists and AI marketeers, some of whom (Sam Altman, for one) are the same people.


AI safety is the dumbest idea in the world by people who think computers are magic, so confusing its meaning is great. The original AI safety people now think LLM training might accidentally produce an AI through "mesa-optimizers", which is more or less a theory that if you randomly generated enough numbers one of them will come alive and eat you.


If there's any magic being alluded to, it's by the people who say that AIs will never reach or exceed human intellectual capabilities because they're "just machines", with the implication that human brains contain mystical intelligence/creativity/emotion substances.


"AIs will never reach or exceed human intellectual capabilities" is an example of Wittgenstein's point that philosophical debates only sound interesting because they don't define their terms first. If you define AI this is I think either trivially true or trivially false.

In the cases where it's false (you could get an artificial human) it still doesn't obviously lead to bad real life consequences, because that relies on another unfounded leap from superintelligence to "takes over the world" ignoring things like, how does it pay its electricity bills, and how does it solve the economic calculation problem.

It's more like having children. Sure they might become a serial killer, but that's a weird reason not to do it.


True, and a good way to explain it to a layperson is through a comparison of Html and Python.

Are there any implementations of Python in Html? No, because Html is not a programming language. Are there any implementations of Html in Python? Many, because Python is a programming language.

Given these assumptions, one can easily imagine that Html is a weaker language than Python.

So if Html is weak, let's make it stronger! Let's give webpages more than three Html headers. Html now has 1 million headers! Is it less weak now? Does it come closer in strength to Python?

No, because the formal properties of Html did not change at all, no matter the number of headers. So, are the formal properties of the grammar generator called GPT any different depending on how many animals it has statistical data on? No, the formal properties of GPT's grammar do not change at all, whether it happens to know about 3 animals or a trillion.


While I dislike the silliness that you're alluding to, I think you're using multiple meanings of the phrase 'AI Safety' there all lumped into one negative association.

There are risks, esp in a profit-motivated capitalistic environment. Most researchers don't take the LessWrong in-culture talk seriously. I'm not sure many people are going to be able to actually understand the concerns of people in that group given the way you've presented their opinion(s).


> Most researchers don't take the LessWrong in-culture talk seriously

Yes but politicians do, for some reason. AI Safety has become a meaningless term, because it is so broad it ranges from "no slurs please" over "diverse skin colors in pictures please" to the completely hypothetical "no extinction plz".


“diverse skin colors in pictures”, and, more critically, “AI” vision systems in government use for public programs should work for people of different skin colors, is not so much “AI safety”, as the kind of AI ethics issue that the broader “AI safety” marketing campaign was designed to marginalize, dilute, and distract from.


Those people are the ones who founded OpenAI, so it's specifically the kind of thing they are pivoting away from.


Look, there is no AI safety advancement without AI capability advancement. I think we can learn fuck all about AI safety if we don't try to actually build those AIs, carefully, and play around with them. AI safety is not an actual field of study when you don't have AIs of the corresponding level to study - otherwise there are zero useful results.


Sure, but you snuck an assumption in there. Just because AI is possible, or someone else will do it, doesn’t obligate us to build it. If we can’t make AI without risk of significant or existential harm, then we shouldn’t do it at all.


My point is that we have to try, because there are still big issues in this world we should try to fix. We're just gonna have to try and make it work.


>Some folks seem to have some strong ire towards OpenAI...

Yes, they should. OpenAI IS Microsoft, never forget this. Any old timer like myself remembers the crap Microsoft pulled in the 90's. And nowadays they would still do the same (and in the background they sometimes still do) if they led in those areas. I have no love towards FB/Zucky boi, but the move to make LLAMA free is a good one. Hopefully another leak comes from inside OpenAI and we get access to everything.


Not even in the background. In order to use BingGPT you have to use the Edge browser (there are ways around this but they are not obvious to a nontechnical user). What could possibly be the reason for that besides anticompetitive behavior?


> I have no love towards FB/Zucky boi, but the move to make LLAMA free is a good one.

I think that might be a bit "enemy of my enemy". Remember "commoditize your complement"? Not that I'm averse to the tech giants forcing each other into a race to the bottom.


><3 :'))))

What is the meaning of this? Why is it part of your post?


I think it's a heart and a smiling face with a tear :)


Yeah it's just a habit for what I do, feels most comfortable for me when commenting or messaging people.

We humans all have our quirks! <3 :'))))


As an isolated image, I prefer the Dall-E 2 sample (of the basketball player) to all the others on that page, aesthetically. Due perhaps to having used a more fine-art-heavy training corpus, or a less specific correspondence to prompts?


I appreciate your preference (I like things heavier on impressionism too), but I don't think it's due to the corpus but rather the model capability. DALL-E 2 is just behind in capability. Of course we won't know until October but I suspect you could prompt v3 to get a style closer to v2 if you wanted.


This is actually an interesting issue the Midjourney team has thought a lot about. As each version has gotten "better", ie more realistic, there's been some loss of the "artistic" side. There are a lot of users who still use the old V2 model (compared to the most recent V5) specifically because of how "bad" it was. The grimy and less coherent parts are what they're actually looking for, instead of a more precise or perfect looking result. This has led to there being flags for adding in more stylisation or "weirdness" or being able to choose between more realistic or more artistic versions of models.


agreed. the new version (which they obviously view as so much better that this is their one "we made improvements" sample) is just repugnant


reminds me of kitsch


Artistic preference aside, the Dall-E 3 version definitely follows the prompt more closely (in that it shows someone dunking a ball).


That's part of my point. It better reflects the banal concept expressed by the prompt.


it looks less like an 'oil painting' though. Looks to me like one of those stencil, spray-painted images you see people selling at tourist attractions.

Perhaps the Dall-E 2 unintentionally got that better.


The reason that this will be a good product is that it is accessible natively within the ChatGPT interface.

In addition, having access to a library of prompts, and being able to produce, create, and store images within the web interface will unlock this type of generative ai for images to many more people.

Compare this to the Midjourney way, in which users must not only sign up, they also have to use a Discord bot (not saying this is hard, but it is a larger barrier to entry).

Native integration will mean instant adoption by millions on day 1.


For all the press releases the images displayed are always very impressive, however whenever I try a similar (at least from my point of view) prompt I get undefined blobs full of mistakes. Is my prompt-fu that low or is that a more widespread issue?


Even the press release images have multiple pretty significant issues. They look good at first glance but are pretty much nonsensical if you take a closer look.

Why is the spoon writing on the back of a clipboard, for example?


Especially for the previous generation of models (from Dall-E 2 to SD 2.1) without any sort of finetune, my experience was that you can get good results only if you have an amazing prompt. This obscure problem seems to be slowly disappearing with newer generation (SDXL) models or the existing fine tunes (RealisticVision).


It's a widespread issue. With DALL-E 2, yes, the vast majority of the time you do get undefined blobs full of mistakes. You can get some amazing things, but only with clever prompt engineering.

This is about DALL-E 3, which is just announced. Nobody's played with it yet so we don't know if it's a lot better or not.


You're probably not using DALL·E 3:

> DALL·E 3 [...] will be available to ChatGPT Plus and Enterprise customers in October via the API, and in Labs later this fall.


I'm definitely not using DALL·E 3 :) but this has been my experience for all other image generators


Prompt-fu is definitely part of it. It takes dozens to hundreds of hours of playing around before you have a good eye for it... which isn't going to happen with something you have to pay for per image, let's be honest.


rtf headline


Never understood how blocking adult content generation is considered safety.


I assume it's to stop people from generating revenge porn.


I am not familiar enough with current dall-e, would I be able to upload a photo of someone and ask it to turn it into adult imagery if this restriction wasn't there?


Yes. Inpainting lets you erase part of an image and have the AI model fill it in based on your prompt.
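
For anyone unfamiliar with the idea, here is a toy sketch of the masking step only. It is not how DALL·E or any real model implements inpainting: the `inpaint` helper and the `fill` callback are hypothetical names, and `fill` stands in for whatever the model would generate from the prompt. The point is just that pixels outside the mask are left exactly as they were.

```python
from typing import Callable, List

Image = List[List[int]]   # toy grayscale image: rows of pixel values
Mask = List[List[bool]]   # True marks a pixel the user erased

def inpaint(image: Image, mask: Mask, fill: Callable[[int, int], int]) -> Image:
    """Return a new image where erased pixels come from fill(row, col).

    `fill` is a stand-in (an assumption) for the generative model;
    pixels where the mask is False are copied through unchanged.
    """
    return [
        [fill(r, c) if mask[r][c] else image[r][c] for c in range(len(image[r]))]
        for r in range(len(image))
    ]

image = [[1, 2, 3],
         [4, 5, 6]]
mask = [[False, True, False],
        [False, True, True]]

# A trivial "model" that paints every erased pixel with 0.
result = inpaint(image, mask, lambda r, c: 0)
```

In a real system the fill would be conditioned on both the prompt and the surrounding unmasked pixels, which is what makes the result blend in.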


This would only justify blocking nudity for inpainting.


That is not what revenge porn is. This is de novo artwork.


This feels like semantics. If I use SD/Dall-E to create a compromising image of a woman I know, and I release it, what difference does it make if it's a photo or generated? Especially as tech gets better and it's hard to tell at a glance whether or not it's a fake.


If you are a painter and you create a compromising painting, does your argument still hold water?


Nice to see OpenAI catching up to Midjourney. It's been interesting to see how good Midjourney is compared to DALL-E and Stable Diffusion. There has been a wide gap in quality for a while now.


I think it’s more subjective than that. For me the Stable Diffusion experience is far ahead of Midjourney. Running it locally, hundreds of custom models online for any kind of image you want, controlnet, plus it’s free.

The open source community is pushing SD forward far faster than Midjourney is improving.


can midjourney be used without that proprietary and inappropriate "discord" interface?


Not yet, but from the news I've heard of Version 6, that is the plan. They're working on a big website update and at first they will allow the use of both, and over time I think the plan is to build new features for the website UI only. That's because they're running into major limitations with Discord when it comes to more advanced stuff like region painting and so on.


That was one of the weirdest experiences I've had on the web in a long time.


Ooooh exciting ! Been using DALL-E 2 to generate space images every 30 mins from prompts generated by GPT-4; and I've noticed the DALL-E output is so dramatically worse and more constrained than MJ (no API though) or SD (which I'm not running locally yet). Been sad at the image quality and was just wondering this week when DALL-E 3!

You can see it at https://cosmictrip.space

(and coming soon is an adventure game series backed by DALL-E and GPT-4)


Interesting idea, i love it!

If you want to self-host, check out comfyui. It is a breeze to install, and offers an api for headless interactions. On my 5 year old i7 NUC it produces a 1024x1024 image (cpu only! no gpu needed) in around 20 mins using the SDXL model.


I haven't heard of comfyui yet but a simple install with a headless API and ability to run on my linode (hopefully..!) without a gpu is all I really need to kickstart the local-running plan. Interesting, will check that out, thanks! And thanks for the kind words as well, glad you love the idea haha!


Good luck! Dunno if linode will cut it, my box has 6 cores and 32 GB of RAM; it's a pretty resource heavy tool... but yeah if you are okay with >10 minute generation times this is a very cost effective solution.


I like that pixelated stuff.

Wonder if DALL-E itself counts as a forbidden living artist or if soon we will „generate x in the style of DALL-E 2“


"Credits" couldn't have been more ironic, looks like name of every possible person at the company got a mention except for the real artists upon whose data their Generative AI is working.


You know, I've been trying to use some of these new generative AI and LLM models and I think I'm sort of reflecting off of them. I've gotten a few good things out of ChatGPT in terms of writing advice and such, but I find the generative art bits really frustrating, and typing in long prompts to be also kind of not worth the effort, I can usually just make what I want in less time than it takes to refine my prompts down.

However...I'm a very visual thinker, when I'm thinking or speaking, I often see images in my head of what I'm trying to convey.

I wonder if aligning ChatGPT and DALL-E or something similar so that I can "see" an image of what the computer is saying as well as the text might be a great next step towards making me feel more engaged with the technology.

That and native speech to text would be nice so I can just talk at it and have it just sitting on the side as a helper-bot instead of being the main point of my focus while I work on or do other things.


I’m also excited if they can make an image generator you can chat to and iterate. That would be awesome.

You might know this already but have you tried using ControlNet to make images with stable diffusion? It allows you to input an image that it follows along with the prompt, and there’s a bunch of ways to influence it. You can even give it a hand drawn sketch. Or have it pick out position of limbs, or use the edges of objects and keep those.

If you have something specific you want to create then it’s amazingly helpful.

The only easy to use site I know of that offers it is happyaccidents.ai (or you can run yourself if you have SD installed)


It's quite a bit more narrow, but FusionAI is a really cool example of this too! https://www.producthunt.com/posts/fusionart-ai


Thanks for the tip! I'll be checking it out.

This conversation reminds me of the old Star Trek: The Next Generation episodes in the Holodeck, only they're talking to "the computer" to iterate on a holodeck scenario design.

https://youtu.be/vaUuE582vq8

https://youtu.be/NXX0dKw4SjI


> but I find the generative art bits really frustrating, and typing in long prompts to be also kind of not worth the effort

I think this might just be inherent in the domain - the state space for images is so much larger than it is for text, so there's just a lot more ways to interpret a text prompt. Sane "defaults" help, but it might just be inherently true that it takes a longer prompt to get close to what you're seeing in your head.


They do seem to really struggle with prepositions...at least for what I've been trying.


Can’t really beat SDXL with loras, but glad they’re still advancing the tech


There is no paper, so no technical detail about the model. Yeah this is very different than the SDXL world, where you are much more in control.

Too bad, since there might be some interesting advances (the way the model follows the prompts better for instance), but OpenAI is continuing to advance the tech behind closed doors.


Is this the first time OpenAI has had a big announcement like this without it actually being released? Normally they tend to ship when they announce.


I actually bought credits with DALL-E only to be disappointed by the results, because there isn't a version 3 available.

They honestly could have just waited to announce this until it was actually released!


I tried watching the video of Dall-E in ChatGPT (with the thumbnail of "my 5 year-old keeps talking about a"), but it is giving an error "sorry, this video does not exist". Anyone else having trouble watching the video?



Very unlikely to be better than SDXL combined with its rich ecosystem of loras, controlnets, and other custom content in WebUIs like Automatic1111 or comfyUI and their extensions.

Would love to be proven wrong!


To add a little bit to this: ControlNet.

If you are actually a visual artist, I think the leader of the pack right now is controlnet, because you can exactly determine the visual structure of your image. While MJ or Dall-e may be better at "imagining" concepts, or have a more aesthetic sensibility (with Loras and custom-trained models, I'm not even sure about that) with controlnet you can very precisely specify how your image should be structured.

This is closer to how traditional artists work. They don't go for the details (color, texture, shading) first. They do a sketch: what are the big forms in this image? How do they fit together? Then they begin filling in intermediate details. What is the color palette? Where are light sources? Which areas have contrast? Which do not? Only after they have done all of that preliminary work will they actually implement the frills on the dress, or the twirls of the mustache.

If you are just a person who wants to make some pretty pictures, Midjourney (and, Dalle3, now) is probably your best bet. If you are an artist who wants to use an actual tool, you are using StableDiffusion. I think it's unlikely that the centralized "plug and play" Midjourney or OpenAI will ever be able to or interested in replicating the complex interface of stablediffusion. But there is a tremendous opportunity for a startup that can improve the UX of the complex workflows that are being developed by "ai artists."

That's also why I am convinced that MidJourney / Dalle will not replace artists. You simply cannot, with a single prompt, replace the work of a true visual artist.


>That's also why I am convinced that MidJourney / Dalle will not replace artists. You simply cannot, with a single prompt, replace the work of a true visual artist.

They already are, because employers don't care about "true visual art," they care about cost and productivity, and getting an intern or someone outsourced to write prompts is both cheaper and faster than paying an actual artist, and capitalism dictates the path of least resistance is the path all competitive business must take. Companies are already replacing their creative staff with AI, or are planning to. AI generated art is already everywhere in advertising. And yes, they contain obvious errors that wouldn't exist with a real artist. And no, companies do not give a damn.


I replaced my Art department of 0 people in my hobby project with midjourney and added 2500+ visuals to my hobby project (online-RPG). So on the one hand you are right, on the other hand...


On the other hand, I'm obviously not talking about hobbyists, but plenty of people (including hobbyists) are using AI art to generate game assets.


MJ is already better than SDXL if you don't need any of the tools that SDXL has and MJ doesn't. So it isn't that hard to imagine Dalle 3 being better than SDXL.


> > Very unlikely to be better than SDXL combined with its rich ecosystem […]

> MJ is already better than SDXL if you don’t need any of the [elements of the rich ecosystem beyond the base models]

Yeah, I think you kind of missed the point there. (Also, not convinced you are right even there, MJ seems to be much worse at prompt-following than the base SDXL model, and on other qualities in the range where subjective opinions are going to vary considerably on which is better, judging from the head-to-head comparisons with prompts I’ve seen, largely from people claiming that MJ is better so presumably not trying to subtly favor SDXL in the construction of the comparisons. Because of the ecosystem, it’s been a long time since I found MJ more useful than even SD 1.5-based toolsets.)


From my experience, while MJ is not as good at following the exact prompt, it's been much better at producing overall visually appealing images.

I don't have as much experience with SDXL, but I've used plenty of Dall-E and other models; most of them generate results that follow the prompt well, but look more like a collage than a piece of art. Which may very well be what you're going for, but for doing more abstract and creative brainstorming, I much prefer Midjourney, which definitely has a specific "look".


Is your hypothesis that DALL-E 3 won't be using additional methods? Or what do you consider "better" -- I agree Automatic1111 and comfyUI make SDXL great for home use, but it feels a bit like an argument of PC vs Mac, where OpenAI is providing the Mac experience (polished, instantly easy to use) while PC offers more customization and nitty gritty.


> but it feels a bit like an argument of PC vs Mac, where OpenAI is providing the Mac experience (polished, instantly easy to use) while PC offers more customization and nitty gritty.

1984, the year the Mac was introduced (and when the more customizable, nitty-gritty Apple II series was their main seller) was also Apple’s all time peak in share of the personal computer market in terms of units sold.

So…yeah, I think the comparison is apt, but maybe not in the way you think it is.


The hands fidelity seems impressive. Any ideas how this is comparing to latest stability model?


Really? On their large, featured image, the female character's left hand seems to be two deformed blobs sneaking behind her other arm, while the male character's left forearm seems twice as long as it should be, eventually merging with the countertop.


Yeah, I was surprised how "ai blobby" that image was. Guess it makes the rest of the page seem less cherry-picked? Maybe there's a trade-off in "have lots of little details" and "have uniform good quality"?


Oh that is a good callout, I wasn't even thinking that is a hand in the first place


SDXL 1.0 still produces quite a lot of terrible hands, but fine ones too. If a picture is great except the hands and some other details, inpainting can be used as a workaround.

Also, cherry picked examples from Dall-e 3 may not be representative of the average output. Like some SD 1.5 models may look amazing on civitai or reddit, but you soon realise that they are terrible on average and overfitted to very specific kind of pictures and characters.


I feel like MidJourney has already almost mastered hands. The one thing I definitely want to see more of is Text.

There's one image with "Explore Venus", and in the video, the hedgehog has a mailbox with Larry on it. Both of those look good, but obviously super cherry picked.


> DALL·E 3 is designed to decline requests that ask for an image in the style of a living artist.

There can be only one!

This seems weird to me, but I admit I'm about as far from an artistic person as it is possible to be. I understand why it was done (people kept asking for pictures in the style of that one dude and they were better and he hated it) but it still just seems strange.


They want to make it less obvious that they've stolen people's content. That's what these procedural images are based on. Loads and loads of content taken from people for free.


> Loads and loads of content taken from people for free.

The content that the people themselves posted to the Internet? I'm pretty sure that OpenAI didn't break into any artist's studios and steal anything.


If I wear my wallet out in the open does that mean it's ok for you to steal it? Or if I expose my painting in a museum is it ok to copy and resell it? People that used to steal software used to claim that they just made a copy, they didn't steal a car. Yet somehow corporations prevented them from making said copies. Now we see people are taking the side of corporations stealing people's content. The content that they themselves don't want to be stolen. Weird world, and the markets are clearly irrational. This IP theft will backfire and it will be glorious.


If your art style can be put into a bunch of parameters (which it can) it can be copied by a human. Copyright of style is stupid and muddy, it's always a matter of people thinking "the magical humanity" not being possible to put into ones and zeroes.


Correct, but there are clear rules to how humans can copy things. Also humans don't usually need to ingest and store billions of parameters about anything to learn about it nor could they. If they could they'd be software.


It's nice of you to say that, but I know what I did.


But OpenAI and Midjourney did train on their artwork without consent, and as a result artists are at risk of losing income from these models.


Copyright extends well after death, though.


Yeah, I think it's more that they're trying to reduce how much they anger/upset artists than anything with copyright specifically.


How is that stealing?

Do art students steal when they tour the museum? Please, I really need to know... what sort of dingbat philosophy is it that thinks that this even slightly resembles stealing, in either of the "copyright infringement is stealing" or the "actual theft" meanings.

I hope the DSM VI includes copyright maximalism in its list of mental illnesses.


That's a projection. Also OpenAI didn't walk into museums to "learn". It downloaded billions of images that it then ingested, tokenised, and mixed to produce results. It's not a human "touring a museum". This argument is absurd and is based on the flawed notion that software has "equal rights" to humans, or that it possesses some sort of intelligence.


I don't speak whatever incoherent, irrational language you're using.

None of what you've said is relevant. The same or similar processes the brain uses are at work here.

> This argument is absurd and is based on the flawed notion that software has "equal rights" to humans,

The people who run the software have the same "equal rights" as humans. If you're allowed to download one of those images and look at it, they're allowed to download it and train their software with it.

You're trying to invent new intellectual property rights out of thin air. No thanks, we already have more than enough of that lunacy.


[flagged]


>Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.

>When disagreeing, please reply to the argument instead of calling names.

>Comments should get more thoughtful and substantive, not less, as a topic gets more divisive.

Sure looks like it's your posting privileges that are in danger, not his.


Also weird since chatgpt is more than willing to do it


Is there any information on the resolution of the outputs? This is indeed a new low in announcing something: „hey we have this awesome new model, which right now no one can use, and we won’t tell you what has changed - except that it’s magically better. Oh and don’t ask for any technical details on the output you get… details are for n00bs“.


The banana image is 1792 x 1024, the hedgehog with watermelon is 1024 x 1024, DALL-E 2 does 1024 x 1024, or more if you use outpainting. The composition on the portrait and landscape images seems better than I would expect if they were just outpainted normally though.
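For reference, the arithmetic on those two sizes, as a quick sketch. The pixel dimensions are the ones quoted above; `aspect_ratio` and the labels are just names made up for the example.

```python
from math import gcd


def aspect_ratio(width, height):
    """Reduce pixel dimensions to the smallest whole-number ratio."""
    divisor = gcd(width, height)
    return width // divisor, height // divisor


sizes = {"banana": (1792, 1024), "hedgehog": (1024, 1024)}
for name, (w, h) in sizes.items():
    rw, rh = aspect_ratio(w, h)
    print(f"{name}: {w}x{h} -> {rw}:{rh}, {w * h / 1_000_000:.2f} MP")
```

So the landscape image is 7:4, with 1.75 times the pixels of the square one.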


OpenAI is turning out to be not supportive to the developer community with its hidden intent of becoming the monopoly.


How are they becoming a monopoly with midjourney and stable diffusion as strong opponents?


By having backroom discussions with governments around the world and discussing how to regulate their competitors out of existence? [0] None of them have the money or influence to lobby for such changes like OpenAI does (alongside Microsoft)

The pathway to an OpenAI monopoly is quite clear, especially with the controlling stake from Microsoft. So I won't be surprised to see OpenAI continuously attempt to revive their regulatory capture using licences [1] against actual 'open' AI companies who release their papers, code, models, etc.

[0] https://news.ycombinator.com/item?id=35960125

[1] https://www.reuters.com/technology/openai-chief-goes-before-...


Yeah, good luck with keeping something open sourced secret again. Worked like a charm with encryption and DVD CSS.


Have other models solved the problem with text in images or is that new here? That "I just feel so empty inside." one is so much better than anything I've seen before from generated images, and their video has some really high quality text in some of the images too.


DeepFloyd did it, at the cost of lower quality images overall. Maybe that was because it wasn't a commercial product, though, more a proof of concept? ideogram.ai has similarly good text with higher general quality, as long as the text is short.


If i remember correctly, google's imagen did that a while ago.


What's the point? Stable Diffusion could basically do all of this almost a year ago.


I haven’t really used it. Can it follow all the details of a precise prompt as shown here?


SDXL can. I suspect the VAE will be better with DALL-E 3 but SDXL can do it too.


Except text.


I wonder how comparable this will be to MidJourney and I'm excited to try it out.


One thing I immediately noticed was the text, which doesn't seem to be something I can get MidJourney to do.


If you want text, check out https://ideogram.ai

The quality is good and is free currently.


There's only a small handful of examples, which very well could be cherrypicked, but yes I am excited about text, that's one thing most models struggle at currently.


Pedantic but FYI the J isn't capitalised, it's just Midjourney.


The cartoon with the text wouldn't be possible in Midjourney. Also Dall-E 3 seems to have very good text comprehension, an area where Dall-E was always relatively good compared to other models with better image quality.


>DALL·E 3 is now in research preview, and will be available to ChatGPT Plus and Enterprise customers in October via the API

Does this mean that ChatGPT plus price will increase? Otherwise the value you will get for the subscription is crazy!


Via api, I take that to mean we’d pay per call, just get earlier access than non-plus/enterprise customers. I could be wrong though.


I'd actually say that the quality of the second dunking example is better in v2 instead of looking like some ArtStation output

“An expressive oil painting of a basketball player dunking, depicted as an explosion of a nebula.”


Yeah, I agree. The other examples of v3 output were better - especially anything with text, I'm surprised they didn't focus on that more as it was a significant weakness of v2 (and mainline StableDiffusion, but ideogram.ai seems to have figured it out, so it'll probably be widely available eventually).


Dunno what model Ideogram uses underneath but DeepFloyd-IF for proper text was already open sourced a while back.


what does being 'built on chatgpt' mean? does it mean they use gpt3.5 embeddings? or is it a simple plugin?


It means that ChatGPT will be used to optimize your prompts before they are used in DALL-E 3!


The subheading says:

> When prompted with an idea, ChatGPT will automatically generate tailored, detailed prompts for DALL-E 3

Then there’s a video of someone typing into ChatGPT and it responding with images


Every transformer model uses a small/tiny llm (CLIP, OpenCLIP, BLIP2, FLAMINGO) as their text encoder and I guess they use gpt 3.5 for theirs.


even the first version was described by them as gpt: "DALL·E is a 12-billion parameter version of GPT-3", but i am not sure it's the GPT-3 that we know (as trained on all the stuff), but rather like, same/similar architecture? and in other places they mention CLIP as a part of the model (I'm talking about the first version)? all of this is quite confusing for me, and leaves me wondering, if this is simply "same architecture" or "reuse of embeddings" or "use in training" or "fine-tuning", especially for the exciting new version


DALL·E 3 is built natively on ChatGPT

Plus users will be able to directly create images within the ChatGPT web interface.


Marketing I suppose.


Never felt so far removed from being an actual artist ...


Anyone else noticed the Venus poster having actually legible text? That’s very cool. Previously I’ve only ever seen generated images having garbled text.


I asked DALL-E 2 for several pictures of "DO NOT ENTER" signs and a few of them had signs that said "DO ENTER".


Almost! The subtext is still in a dreamstate


I'm surprised Stable Diffusion and others haven't cracked legible text by now. OCR has been solved for decades now, surely they could include some OCR term in the loss function while training it?
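To make "some OCR term" concrete, here is a rough sketch of the score such a term could be built on: character error rate between the text the prompt asked for and whatever an OCR pass reads back. The OCR step itself is assumed and not shown, and this metric is not differentiable as written, so it could only serve as an evaluation score or a reward signal unless you swap in a differentiable recognizer. That part is my assumption, not something any model is known to do.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def character_error_rate(target, recognized):
    """Edit distance normalised by the length of the target text."""
    return edit_distance(target, recognized) / max(len(target), 1)


# Hypothetical OCR readout of a sign that was supposed to say "DO NOT ENTER".
penalty = character_error_rate("DO NOT ENTER", "DO ENTER")
```

A sign that drops "NOT " is 4 edits away from a 12-character target, so it scores a penalty of one third even though every remaining letter is perfectly legible.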


This was the most exciting part to me.


This page is somehow quite interesting, returning 404:

    $ http https://openai.com/dall-e-3
    HTTP/1.1 404 Not Found
    Cache-Control: no-cache
    Connection: keep-alive
    Content-Encoding: gzip

But other pages seem to return a normal 200.


Fooocus has a similar "prompt expansion" feature with a model dedicated to just that, as well as a list of style "presets" that are known to be in SDXL's training data:

https://github.com/lllyasviel/Fooocus

Other stable diffusion UIs have this as an option too.
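A very rough sketch of the general idea, for anyone who hasn't seen it: the short user prompt gets expanded with style and detail phrases before it reaches the image model. This is a template-based stand-in, not a language model, and it is not how Fooocus or DALL-E 3 actually do it. The presets and the `expand_prompt` helper are hypothetical.

```python
# Hypothetical style presets; not the actual Fooocus or DALL-E 3 lists.
STYLE_PRESETS = {
    "oil": "expressive oil painting, visible brush strokes",
    "photo": "photograph, natural lighting, shallow depth of field",
}


def expand_prompt(idea, style=None, extras=()):
    """Append an optional style preset and extra detail phrases to a short idea."""
    parts = [idea.strip()]
    if style is not None:
        parts.append(STYLE_PRESETS[style])
    parts.extend(extras)
    return ", ".join(parts)


expanded = expand_prompt(" a hedgehog ", style="oil", extras=("warm palette",))
```

A dedicated expansion model would pick the added phrases based on the prompt itself instead of a fixed lookup table.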


Wonder if the cost is prohibitive enough that casual users (like myself) would benefit from a ChatGPT Plus subscription style service.

I'd love to see some basic usage included in my GPT+ sub

edit: Well... they say "Try ChatGPT (DALL·E 3 coming soon!)" ... if they are including some DALLE usage in GPT i'm going to resub instantly. So huge.


Interesting about OpenAI is that they can use old models to generate new training sets. For them it's cheap. Example of that would be the use of GPT-4 to generate sets tailored for training smaller or MoE models. Which is hard to do manually.

Really accelerating AI development, with humans in the loop for now.


When MidJourney releases the web version and API, competition with OpenAI's DALL-E will be intense!


@samaltman how is this open? what was the reasoning behind not releasing any of the underlying models?


It's hard to make money if anyone can copy (and improve!) your work.


> DALL·E 3 is built natively on ChatGPT, which lets you use ChatGPT as a brainstorming partner and refiner of your prompts.

Nice! I teach ChatGPT on midjourney’s documentation and it gives me great prompts on my loose ideas

This follows that concept


> DALL·E 3 is now in research preview, and will be available to ChatGPT Plus and Enterprise customers in October via the API, and in Labs later this fall.

What does this mean? How do you use ChatGPT Plus through the API?


The antecedent is "Enterprise customers".


When searching for OpenAI and Enterprise I only find ChatGPT Enterprise. Does that mean they have a separate API that is not in the regular OpenAI API?


No, they mean they think of the API as being used only by companies. Which is probably true, overall.


Apparently, to create such stunning detail, one must be a Shakespeare.


I don't know. DALL E 2 was terrible. I regret paying for it-- the results look like a 6 year old with Ms Paint. Dalle 3, no matter how much better, can't be that much better?


Very excited to try the text fix feature!

Anyone have any tips on getting beta access to these models earlier? I've been in GPT3 beta since June 2021, and I was only ever able to get Codex early.


Check out the prompts to those examples used...to create such stunning detail, one must be a Shakespeare...


It can generate readable text based on the prompt! Has that been possible before?


I hope we are able to access the generated image URLs from ChatGPT Plugins.


Genuine question: are people using this in a business use case ala ChatGPT?


looks amazing, the text generation looks really great


The spoon is writing on the back of a clipboard.


> As with DALL·E 2, the images you create with DALL·E 3 are yours to use and you don't need our permission to reprint, sell or merchandise them

How generous.


What if you type "Buzz Lightyear from Disney's Toy Story"?


You get an image that is yours, that you do not need permission from OpenAI to do anything you want with. (Fine print: But you definitely need permission from Disney to use it.)


Ok, if I type "toy astronaut", how am I supposed to know if/when it produces a Disney character?


You might need Disney's permission but you don't need OpenAI's (or rather...it wouldn't help you regardless).


„prevents misinformation and propaganda“.

who at OpenAI decides the right and wrong??


Can it draw hands yet?


Hands are a more or less solved problem. At least in midjourney.


Not only are there no model weights or code released, there’s not even a paper. Nothing is revealed about the model. “OpenAI”, ladies and gentlemen.


At first I thought of it as Open(Web|Source|File Format|StreetMap|etc), now I group it with OpenTable.

Also at first this objection really resonated with me. I think the meaning of OpenAI has spread pretty well now, and that it's getting to the point where raising this objection is tiresome.

There is an important point to be made about how it got popularized as being open and then they went and closed it while keeping the momentum, but that should be made instead of just saying, "wait, it's called OpenAI but isn't open?!?!"

However, stuff gets started as an open play all the time and gets closed, without open in the brand name, for instance https://ghuntley.com/fracture/ - hence the term https://en.wikipedia.org/wiki/Openwashing


This is the first time they have not posted any sort of a paper when releasing a new model. Even the sort that accompanied GPT-4 announcement.


Qualifier intended. It's getting to the point where raising this objection is tiresome; soon it will be :-)


I still think publicly expressing how we feel about their “openness” is the right thing to do. And I’m sure there are people inside OpenAI who feel the same way.


To be fair, that usually implies that you're made tired by the recipient's statement. Like if your child (hypothetical) was misbehaving and you told him "I'm getting tired of your disobedience" that child should expect that you mean you want them to stop being disobedient.


Is it the complaining that is tiresome? Or is it the lying perpetrator who is tiresome?


We should just start calling them Closed AI from now on, just like we used to call Microsoft M$.


That’s always been bad taste as a joke. Like the French boomers who transform “Facebook” into “Face de bouc” thinking it’s funny. It’s really tiresome for people familiar with the subject, but it feels novel to them because they are not well-connected with other people.


Cynical monikers applied to corporations and/or brand names are not meant to be funny, at all. They are often useful warning signs, meant to demonstrate corporate deviousness over time.


C’est Fessse de Bouc :)


ShutAI is catchier.


Open could be used to describe the public interface to the closed model. As compared to Google's pre-chatgpt models which were inaccessible to the public.


Yeah, the trouble is, that they're open like the Apple App Store. They claim to have a fair policy and follow it consistently, but they don't.

https://community.openai.com/t/my-account-has-been-banned-fo... (see related topics for some more)


I think the least they could do is change their name, to better match their new philosophy.


They’ll wait until the first PR meltdown, rebrand as something silly like “Meta” or something, and then fade back into the collective unconscious.


Yeah, the whole 'open' part of this was them aligning themselves philosophically with open-source. But of course they throw that away when the money comes talking...


Meanwhile, competitors are choking in papers, but fail to release anything I can get my hands on (that’s worth a damn).

I can actually appreciate “open” as “open to access” or “open to actually having a product iso posturing about being so far ahead but never releasing anything worthwhile” (looking at Google).


Agreed, but there's room in the middle. Meta releases Llama papers and weights. And the community is all the better for it.


There's room for all here! May the best one win


Llama was leaked which forced Facebook's hand.


But if they release model weights to their algorithmic raster image generator and don't succeed in their lobbying efforts to ban their competitors, it will result in the AI apocalypse. Think of the children!


I am one of the more powerful open source fanatics out there, and yet this constant refrain, over and over, lamenting that the name of the company is inaccurate, is very tiresome. I think we know here. This does not add anything to the discussion IMO. Observing that no weights, code, or paper were released is useful, but the line about “OpenAI is not really open” is IMO tired and unproductive. They aren’t going to change it. It’s just sour grapes at this point.


So pretty much, "shut up and accept it, peon"

Yeahno... that's not gonna suffice. Squeaky wheel gets the oil!


Are you suggesting that if we complain hard enough about the name on Hacker News that… OpenAI is going to change their name? Because it is my position that this will never succeed, and all we’re doing is adding noise to the discussion here.


I don't want them to change their name, I want them to open-source their models.

And it's not about HN, it is about the larger community grumbling. Not that it will do anything but if they're going to annoy us with their disingenuous naming then we can annoy them by calling them out for it. There are a million other names of blithe corpo-gibberish they could use, but they didn't.

And it is oh-so-convenient how discussion that you don't like is "noise". Us smarties are fantastic at rationalizing, aren't we? Yes let's just limit the conversation to breathless panty-creaming only. I'm sure the CEO would really like that!


Shut up and accept what? A company being closed source / for profit? Are we rioting against capitalism or something? Should I get my pitchforks? Just to be sure, we'll include literally all major tech companies, or is only OpenAI the target?


Only openAI, they're the ones abusing the word 'open'

> Are we rioting against capitalism or something?

That would be nice... can we do it Fight Club style?


I love and use OpenAI services everyday, but I also think a fundamental revulsion at Orwellian doublespeak is fine.


I agree - however is it really productive to the discussion if multiple people are commenting the exact same thing every single time? Upvoting one comment that shares this sentiment is enough, in my opinion...otherwise any valid criticisms are overshadowed by this outrage over the name.


Feeling it is fine. Taking time to say it every single time is tedious. We don’t have to speak every feeling we have.


Sure, but when you say something people resonate with it gets upvoted until it's the top comment relatively consistently. So while I have no issue with you objecting to this, most people clearly still agree with it. For what it's worth, I am very much in the "don't be a deceptive asshole" camp.

The other aspect you're missing is once you stop criticizing it, you are passively normalizing it. And that's horseshit. Call out the fuckers that do bad shit more often because they deserve it. I am not saying you are a shill, but you are doing exactly what the company wants - normalizing their shitty behavior and defending them against criticism for it.


You don’t have to. It’s clear many of us feel this way, and you’re not the arbiter of which feelings are valid.


Just to be clear I’m explicitly saying the feelings are valid, but that saying them over and over doesn’t add anything to the conversation.


Open in this context means open for business.


Open your wallet ;-)


I am certainly looking forward to the day that open source models can be evenly matched with "open"ai's models.


We get it. OpenAI is a for profit company now. The "Open" in name doesn't reflect their vision anymore. Can we stop it already? Or is this some kind of rant against for profit / closed source in general?


Fedex is part of the federal government, right? Names mean nothing.


https://www.fedex.com/en-us/about/history.html

> Mr. Smith thought the word “federal” suggested an interest in nationwide economic activity, and hoped the name would resonate with the Federal Reserve Bank, a potential customer.


A surprising number of people, still, to this day, believe that to be true. Names really do mean something.

Edit: and don’t get me started on the Better Business “Bureau”


A lowercase "i" prefix would really tie it together.


Has anyone who funded OpenAI back in 2015 spoken out against their non-openness? Elon Musk only did after he no longer had a stake in it.

Edit:

OpenAI is a corporation and their stakeholders include Microsoft, Peter Thiel, and Infosys.

Yeah, not really surprising that they are not "Open".


I think I remember there being an organizational structure where the outer organization was a non-profit and they separated out what is now OpenAI into a for-profit subsidiary with a profit cap of 100x investment.

I still can't believe such a deal is legal.
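For what a 100x cap would mean in numbers, a quick sketch. The 100x multiple is only what I remember; the dollar figures are made up, and the assumption that anything above the cap flows back to the non-profit is mine.

```python
def capped_return(investment, gross_return, cap_multiple=100):
    """Split a gross return into the capped investor share and the excess above the cap."""
    cap = investment * cap_multiple
    to_investor = min(gross_return, cap)
    excess = gross_return - to_investor
    return to_investor, excess


# Hypothetical: $10M invested, two different outcomes (figures in $M).
below_cap = capped_return(10, 500)    # return stays under the 1000 cap
above_cap = capped_return(10, 2500)   # return exceeds the 1000 cap
```

In the second case the investor share stops at 1000 and the remaining 1500 is what would go elsewhere, under the assumption above.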


OpenAI is revolutionizing AI and somehow the only thing that occurs to you is this tiresome criticism of their name?


Joke's on you pal if you're still expecting that from 'OpenAI' :/


At this point, I wouldn't doubt for a second that this company literally stole stable diffusion and built on top of it. Like how will we ever know? If they were so good they could have released before SDXL. But they waited.


The real open AI research institute these days seems to be DAIR.


Follow us on Instagram.


NotActuallyOpenAI


Democratic People’s Republic of Korea. Democratic Republic of the Congo. Anyone who puts the word Democratic in their name is likely to be less democratic than others


ClosedAI


Are you willing to fund their org and all their salaries? That's what it would take them to be open in a capitalistic society, realistically.

Don't get me wrong, I love open-source, open-weights research, but the elephant in the room is that people aren't willing to do that on dirt-poor postdoc salaries anymore, for good reasons, especially when greedy landlords are now charging upwards of $4000/month just to have a reasonable living space, and the government takes close to 50% of your salary.


Not only is there no nutritional information or recipe released, there's not even an ingredients list. Nothing is revealed about this fruit. 'Apple,' ladies and gentlemen. Heck, the only Apples you'll find at an Apple Store are running on iOS, not growing on trees!


Reading about advances in AI art is always bittersweet. Generative text-to-image systems have come very far in the past 2 years. Impressively so. Frankly, I am in awe at the (cherrypicked!) outputs on this page.

Years on, it's still a little hard to fully grasp the imminent, momentous impact this will have (has yet to have?) on commercial artists. I fear it will become pretty much impossible to make any sort of living off art in the next decade.

I mean: the outputs on the page are just awesome. Leagues ahead of the stuff we have now. And I'm already seeing old-gen generated AI images on corporate blog posts—everyone will jump on DALL-E 3.

Being in the profession right now must be very discouraging indeed. My heart goes out to those artists who will eventually be replaced by cheap, intuitive text prompts.

Commercialising art for corps was one of the last ways to exist as an artist in today's economy and get by. I fear the extinction of the profession will have a big impact on our cultural capital.


> I fear it will become pretty much impossible to make any sort of living off art in the next decade.

Software developers have been having technology completely eat their work out from under them since the dawn of the industry. But jobs aren’t being lost over it; there’s more demand for software developers than ever. When software advancements reduce the work needed to produce output, the world has moved on by taking that as the new baseline which software developers build upon and demanding more software to be built with bigger and better features.

Back when I first started working, it was somebody’s entire job to take pages and pages of content and transcribe them into HTML manually so that they could be published on a website. People used to do that all day, every day. Then CMSs came along and completely eliminated that work. Now sure, if a developer decided that all they wanted to do was write static HTML and refused to adapt their skills, they would be out of a job. But we all used the new technology to build more dynamic websites that provided more value. The new technology didn’t take away our jobs, it provided an opportunity to do a better job.

It’s the same with this. These aren’t tools to replace artists – these are tools artists can use to do more and better work. These aren’t tools that will reduce demand – when everybody can get on-demand, totally custom artwork, people will want more of it, not less.


I’ve been slowly warming up to this idea, and I’m not sure I’m totally convinced, but I think it does make sense to compare the web development industry to the art industry here.

Have web builders totally replaced developers? No, not at all. It’s definitely cut out the lowest end of the market, but that’s because your local restaurant really doesn’t need more than a squarespace template.

But warping a website builder to do something more complex is complicated. To the point where you need to hire someone trained in using that builder tool. Then that person finds the edges of the customizability of that tool and reaches for just writing HTML/CSS/JS.

I could totally see a similar trajectory here. We’ll have prompt engineers, and they’ll realize that sometimes it’s just easier to tweak the output in Photoshop, then maybe they’ll realize it’s easier to prompt for components of an image and composite them manually… and then we’ve reached a point where it’s just a tool that makes art easier.

Though… it still does feel somehow different from web development. Will just have to see how things shake out.


I wish this were the case but I'm not really sure I buy it. I'm a software engineer with absolutely no eye for design or illustration. Now I can write a prompt to get me a good-enough image in 90% of cases whereas before that was totally unreachable without an artist. I don't need any training to use this tool and in many cases it gets me to the full end result without any modifications needed.

The reverse doesn't hold -- there are no tools that allow an artist with no training or study to be an effective developer. I'd say the CMS example is more of a data entry job than a development job now, it's two different things.

Most of the advancements in tech still need trained developers to utilize them. GPT-4 cannot create non-trivial programs itself and does not seem particularly close to doing so, it's mainly used for scaffolding and bug fixing and guiding, all of which still need a trained developer at the wheel.

I'm worried for the future of artists.


> I'd say the CMS example is more of a data entry job than a development job now, it's two different things.

Right, but it used to be a development job, which is the point. The CMS took a development job and turned it into a data entry job -- exactly what DALL-E does to some limited forms of artistry.

But development didn't disappear, and neither will artistry.


Were you using artists for anything you can use DALLE for now? Probably not, because what they're best at is too different. But making them more similar will actually increase demand for them because you'll have higher standards for whatever you get out of them. (Jevons paradox)

Basically, there is no circumstance under which automation reduces demand for labor. It's like if you got a pay raise and started worrying it might make you poorer.


> Were you using artists for anything you can use DALLE for now?

I mean sure yeah, in the past I would pay for logos from artists and now I just make them with AI for my projects instead.


That's true, but I still worry because it feels like we are reaching some kind of inflection point. For instance, Vercel's latest product (https://v0.dev) makes me nervous about the impact to front-end development.


What's interesting about seeing these conversations on Hacker News is that 10 minutes ago, our profession was putting thousands out of work on a daily basis without even a second thought.

Ever stopped at one of those self-checkout registers at the grocery store that didn't exist in your childhood and wondered what happened to the cashier who used to be there?

A lot of people mumble something about "jobs not worth doing", but I'm not sure if web front-end development (how many unique, differentiated, worthwhile websites exist? how much of that code is glue and boilerplate, and people solving the same problem in parallel every day?) fares much better in that gauntlet.

Software developers right now are, for the first time in their lives, experiencing a little backlash to their privilege -- hundreds of millions of other people have been riding that rollercoaster all their lives.


> Ever stopped at one of those self-checkout registers at the grocery store that didn't exist in your childhood and wondered what happened to the cashier who used to be there?

They're still there, they're restocking shelves.

Similarly, the invention of ATM machines increased the number of bank tellers.


My local supermarket normally seems to be staffed by maybe two or three people total. They stock shelves until someone needs them to fix the machines, do an age check or temporarily use the old checkout line. I remember supermarkets of that size used to have several people stacking shelves and then the same number again on the tills. So I think it has reduced the number of those jobs.


You're right, and I'll gladly bite that bullet and say that the changes you describe are good. It's almost always a net positive to be able to produce more output with less human effort.


I feel embarrassed to admit that I've never thought about it from that perspective. Totally agree.


There are two different questions:

- will more artwork be bought? (probably)

- will the increase in the quantity of artwork being bought (assuming there is one) compensate for the decrease in price for each piece of art?

Clearly for software, _so far_, the increase in productivity / decrease in price has led to such a big increase in demand for software that some programmers are better off - I say _some_ because many jobs within the software industry stopped existing.

Yet if we look at something like agriculture, there has been some increase in demand for food products (e.g. much more meat is being sold), but one can only eat so much - so most of the increase in productivity has led to way fewer people working as farmers, and not much increase in farmers' income.

So no "these aren't tools to replace farmers - these are tools farmers can use to do more and better work" there.

Basically people are making statements about the elasticity of demand with respect to prices for prices & quantities no one has ever observed. If making a piece of art is now 1/2 (random number) the cost thanks to AI, will people buy 50% more? 100% more? 1000% more? I have no idea and I am not sure why people think they do.
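To make the break-even concrete, here is a rough sketch in Python. The numbers are made up for illustration, not observed data, and `revenue_ratio` is just a hypothetical helper: if the price per piece halves, total spending on art only holds steady if the quantity sold doubles.

```python
def revenue_ratio(price_factor: float, quantity_factor: float) -> float:
    """Ratio of new total spending to old total spending.

    price_factor: new price / old price (0.5 means the price halved).
    quantity_factor: new quantity / old quantity (1.5 means 50% more sold).
    """
    return price_factor * quantity_factor


# Illustrative scenarios only: the price halves and the quantity sold
# grows by 50%, 100%, or 1000% (i.e. 1.5x, 2x, or 11x the old quantity).
scenarios = {"50% more": 1.5, "100% more": 2.0, "1000% more": 11.0}
results = {label: revenue_ratio(0.5, q) for label, q in scenarios.items()}

for label, ratio in results.items():
    print(f"{label}: total spending is {ratio:.2f}x what it was")
```

Under these assumptions, anything short of a doubling in quantity means less total money going to art, and nobody knows which of those scenarios we are in.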


This certainly seems to reduce how much experience and study an artist needs to have in order to produce art. And I think it's likely to reduce how much the viewer values the product as well.

Automation does not always increase demand for the jobs being automated. Tractors didn't create more demand for oxen than ever.


The automation of tractors over manual plowing didn’t replace the oxen, but the people driving them.

Farm automation allowed a single man to plow many fields in the time one field would have taken.

Automation allows one person to accomplish more with less.

Tractors didn’t reduce the demand for product, but increased it many fold.

Automation increases demand; every time.


The number of people working in agriculture was divided by c. 6 in the US [0], despite the population being multiplied by c. 3.5.

Sure, automation increased demand for food, but demand did not increase enough to avoid destroying 5/6 of farmers' jobs.

Now this may not be a bad thing at all, that's not the point, but I am not sure "hey demand for art will increase but 5 out of 6 artist jobs will no longer exist!" is a super hopeful statement to artists.

[0] https://ourworldindata.org/grapher/number-of-people-employed...
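A quick back-of-the-envelope check of those figures, taking the approximate factors above as given (employment divided by about 6, population multiplied by about 3.5). The variable names are only for illustration.

```python
# Approximate factors quoted in the comment above, not exact statistics.
employment_factor = 1 / 6   # farm employment now relative to then
population_factor = 3.5     # US population now relative to then

# Share of the population working in agriculture, relative to the starting point.
share_factor = employment_factor / population_factor

# Fraction of the original farm jobs that no longer exist.
jobs_lost = 1 - employment_factor

print(f"farm workers per capita: about 1/{1 / share_factor:.0f} of the original share")
print(f"farm jobs lost: about {jobs_lost:.0%}")
```

So on these rough numbers, the share of the population working in agriculture fell to about 1/21 of what it was, even though far more food is being produced.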


> Automation increases demand

Yours is a different definition of demand than the economics definition, which is basically the amount customers will buy as a function of price. Higher demand means buyers will pay a higher price for the same art.

Automation increases supply, which increases the volume of art consumed, mainly because it is available at a lower price. That is shifting to a different point on the same demand curve, and doesn't mean that demand has increased.
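A minimal sketch of the distinction, assuming a hypothetical linear demand curve with made-up numbers (`quantity_demanded`, the intercept, the slope and the prices are all illustrative, not estimates of any real art market):

```python
def quantity_demanded(price: float, intercept: float = 100.0, slope: float = 2.0) -> float:
    """Hypothetical linear demand curve: quantity = intercept - slope * price."""
    return max(0.0, intercept - slope * price)


price_before = 40.0  # made-up price before automation
price_after = 20.0   # made-up lower price once automation expands supply

# Movement along the SAME demand curve: lower price, more units bought.
q_before = quantity_demanded(price_before)
q_after = quantity_demanded(price_after)

# An actual increase in demand is a shift of the curve itself:
# buyers want more at the old price (here modeled as a larger intercept).
q_shifted = quantity_demanded(price_before, intercept=140.0)

print(f"same curve, old price: {q_before:.0f} units")
print(f"same curve, new price: {q_after:.0f} units")
print(f"shifted curve, old price: {q_shifted:.0f} units")
```

In both the second and third cases more art gets bought, but only the third is "increased demand" in the economics sense; the second is the same buyers responding to a lower price.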


Automation increases demand.

Just look at the dairy industry. Pre-automation sales numbers vs. post-automation sales numbers.

This is true for every industry on earth, and doesn't care about the pedantic definition of demand. Consumer demand = more purchases. More purchases = more revenue.


What's the alternative? Outlaw AI art? Too late for that.


It's inevitable, but that doesn't mean we should delude ourselves into thinking it's win-win for everybody.


Who's deluding themselves?


To refresh your memory, the context of this thread is that JimDabell argued that software automation increased demand for software programmers, and the same will be true for art. I responded that there's no guarantee of that, and then you pointed out that it can't be stopped. Which may be the case but it has no bearing on the question of whether artists will lose their careers. Just because it's inevitable doesn't mean it's not a cause for legitimate concern.


But despite the work, it still takes a programmer to glue all of our crap together even if that crap is mostly preassembled. That's our fault and frankly a failing of our industry but it's why automating most of programming doesn't mean that your team can get by with 0 programmers.

Clip art/stock photography though, you can do that with 0 artists if you have the actual art bit automated.


>you can do that with 0 artists

you could do that with zero artists for decades because it's already fully commoditized. Everyone who is cheap already just buys a subscription to some gigantic vault of stock imagery owned by some digital rights holder for pennies.

the mechanical reproduction of art did not start in 2020


It's a good point. I'm not a consumer of that market so I don't know it well, and I'm projecting my lack of even rudimentary mspaint skills onto that situation. I wouldn't be able to remotely customise something I found in the gigantic vault of stock imagery so I'd need somebody to help me with even that, but presumably somebody with a recurring need for stock art doesn't lack those skills to the extent that I do.


The fine art world, which is typically what people mean by "art", has basically nothing to do with generative AI and probably never will. It's an entirely different set of metrics.

What you're talking about is illustration, which will indeed have a difficult time in the future.


art/fine art, will only be at risk when ai learns to physically paint or interact with the world. I think it needs to actually use a paint brush. Pictures of paintings are never the same as seeing them in person. They just can't be captured the same way. Ai is being trained on photographs of these physical paintings.

a lot of art isn't always so much about the technique but the content. ai might close the gap from beginner and crap to pleasing to look at in a technical sense, but that doesn't mean the art itself will be interesting or meaningful.

if you really think art as a whole is at risk of being destroyed by ai you should go to your nearest high quality museum and really think if ai would make that. i know like half the posters are in the bay area lol, go to sf moma this weekend and think about this.


> art/fine art, will only be at risk when ai learns to physically paint or interact with the world.

So you just need to pair DallE with an axidraw?


it would need to understand shading, layering, strokes etc. and how to utilize them to get pleasant effects.

idk how a computer would deal with the unpredictability of a medium like water color. axidraw looks more like a printer that's using a pencil instead of ink.

skip to 36min in here, https://www.youtube.com/watch?v=OokWxQPU5kw


Yes, your opinion seems like the obvious truth to me and I don’t quite understand why this is eluding people.


probably just people commenting on fields they dont have experience with


Some “artists” I know are scraping by on work that could be eaten by generative AI. Even if they don’t self identify as “illustrators”.


We can argue all day about the definition of Artist. My point is that for artists, painters, sculptors, etc. that show their work in galleries and museums, generative AI will have about as much effect as digital art: a little, but not much.


Maybe in the short term. In the longer term, who is going to bother developing these skills? Not everyone “hits a home run” and is successful in galleries and museums, but there have always been fallback options like commercial illustration. If those options are gone, then it’s either 1) be good/lucky enough to succeed in galleries/museums 2) starve or 3) be independently wealthy before becoming an artist.


I mean, the vast majority of artists operating in the gallery system are not moonlighting as commercial artists. It’s an entirely different skill set.

I encourage you to check out the work at museums and galleries in whichever city you live in. It should be immediately obvious that these aren’t really skills that transfer to Photoshop and blog post images.


> The fine art world, which is typically what people mean by "art"

Maybe in your circle, but that's certainly not the case generally.


Art galleries? The art world? These are pretty universal terms and they refer to the system of museums, art fairs, galleries, etc. that have nothing to do with commercial illustrations.


The book illustrator is an artist who works in publishing, the art gallery painter is an artist who participates in the "fine art" world.

I've never heard these terms used like you are using them.


https://en.wikipedia.org/wiki/Art_world

See also: Art Forum. ArtNews. MoMa. The Met. Etc.

I didn’t invent the term.


"Art world" is a term connected to "fine art" not "art" or "artists". That's where we disagree.


>Maybe in your circle, but that's certainly not the case generally.

Great, now that's solved, let's agree on a definition of 'punk' next


In the first (cherry-picked, as you say) example, the man's moustache is doubled, the telephone handset is in two places, and his sideburns are on sideways.

It would be a terrible outcome if authorship and illustration are mostly reduced to editing and touching up errors in AI-generated statistically likely art.

(Although oddly, for programming, I'm really looking forward to that outcome).


This is going to open new doors for future artists, at least in the near term.

I can easily imagine a future where a single dedicated individual or at least a very small independent team can make a full-length movie without leaving their apartment on a trivial budget that rivals a big Hollywood production costing hundreds of millions to produce today.

Right now you've got the writers striking, worried that the studios are going to replace them with AI. I think this is totally backwards: it's the studios who should be worried, because the barriers to entry that protect them today are about to come crashing down.

I am already seeing a few people make short films using these tools. Right now anything "AI generated" has a certain novelty factor, like back in the day when 3D CGI was new and we were all rendering chrome 3D spheres and shiny red cubes and cylinders on black and white checkerboards. This phase is going to pass soon enough. Perhaps that surreal Midjourney glow will be part of '20s nostalgia in the decades to come. There's a whole new set of skills a new generation of artists is going to master, and they'll do things largely unimaginable a year or two ago. They're going to make art that expresses their own perspectives and ideas and not just what's currently allowed by the current consensus, just as the artists before them did.


> I can easily imagine a future where a single dedicated individual or at least a very small independent team can make a full-length movie without leaving their apartment on a trivial budget that rivals a big Hollywood production costing hundreds of millions to produce today.

This is definitely exciting to contemplate, but it's still quite tricky for the economics: Right now, the fact that you need 4000 people to make a blockbuster movie limits the number of movies being made, giving each movie enough of an audience (potentially) to make their money back.

Better tools will enable more awesome creatives to make content alone - but will there be enough eyeballs to consume that content, even if it's cheap to make?


We went through this with YouTube. The answer is yes. There's more than enough demand for stories that the current movie industry isn't interested in telling to sustain millions of cottage-industry movie or TV show producers.


Artists don't create art to make a living, but to experience the joy of creation.

Making a living from art is something that the majority of artists even before GenAI did not reach.

I agree that it will change things a lot, but fearing the "extinction of art" is a bit dramatic IMO.


I've heard more than once an artist say doing art is the only way to keep themselves from committing suicide. Artists gonna do art.

Another example would be someone who loves to garden. If a gardening robot was invented those people would still garden themselves.

Generative AI allows people without mechanical skill to do art, by iteratively interacting until the AI gets it just right for them, so in that sense generative AI opens up art to many more people.

As for making money from art, I believe we're going to have to implement some sort of universal basic income. I currently don't have to work because I collect social security, so I pursue my hobbies the way most other people on UBI will do.


I think you're defining a very particular type of 'artist'. If you broaden the definition, think of graphic designers or concept artists: now their job has been completely automated.


> if you broaden the definition then think of graphic designers or concept artists, now their job has been completely automated.

Sure, but will that have an effect on art? Most artists I know have day jobs completely unconnected to their art. Even the ones that draw stuff for a living don't consider that drawing their art. After spending all week working in the "drawing factory" they spend their weekends working on their 'real art'.

The only argument I can see is that fewer people will be willing to really learn the craft of drawing/illustration etc. if the chance of getting a day job doing that diminishes greatly.


Why do we even want to contribute to a culture so austere and calculating and uncaring anyway? Like, kind of a chicken/egg situation.

It is all just impetus to go and find the prior form of "art" we have had all along. Art need not be about the Artist and her Works, or about creating something for some kind of vague consumption. Art can be about collectivity and shared truths. Art used to be about speaking to God or whatever, and whoever actually placed the pieces of glass were very secondary. At my most optimistic, I feel like some turn like this (but probably more secular) is inevitable. People need to get stuff out, and if this pressure is not relieved by our current ideas of labor and such, it will find a way nonetheless like water through stone.

http://web.mit.edu/allanmc/www/benjamin.pdf


My guess is this will devalue art overall. That is indeed momentous; it’s been a cornerstone of modern culture for about 500 years. What does a world without art look like?

Does a new form of “art” evolve that makes use of these seemingly omnipotent brushes?


I think it will devalue visual art the same way photography has devalued photorealistic paintings.

It’s simply a new tool for artists to create art with. It will change art but I doubt it will destroy art. Like you said, a new form of art will evolve from it.


Sadly, I feel it has already started to devalue art. I find myself unironically wondering whether some pieces are AI-generated on my feed. When I can't tell, I mentally devalue the image.

I'm becoming more and more redpilled on the generative art stuff. The risk-benefit analysis checks out in favour of LLMs, but maybe art is something we can keep our Markov chains off of.


Alternatively, it creates the opportunity for research into different art styles, such as the ControlNet optical illusion spirals that went viral on Twitter/X last week: https://arstechnica.com/information-technology/2023/09/dream...

Those spirals got backlash for being "too easy to do", but others have done more creative things with the same technology: https://www.reddit.com/r/StableDiffusion/comments/15wdoqa/be...

That's the importance of accessibility, and it has happened all the time in art history (e.g. photography did not kill painting on a canvas).


It becomes all about the artist; something that cannot be replicated. What the art says about the artist, what the artist intended, and the mere fact that a specific artist created it. Photography played a part in engendering this transition to "modern art" in the last century.


Graphic arts have always been about the money, so I'd guess whatever happens on the technical side will have little effect on the art market in general.

For instance the emergence of extremely good reproductions didn't affect the price of the original paintings. Same way the evolution of photoshop doesn't have much effect on the price of picture prints.

Now stock photography could be impacted. But otherwise generative AI only opens the door to more picture production, and won't be put in competition to the traditional painting/prints market/direct artist support market.

> What does a world without art look like?

As an aside I always find it funny when 'art' is used as short form for 'pictures', especially in the world's context.


> I always find it funny when 'art' is used as short form

I think, given that the context is a generative AI for images, it should be fairly obvious what I’m talking about.

I’m sure it’s only a matter of time before they do it for music, dancing, fighting and whatever else you care to tag with the term.


I dunno. The value of the type of art that is a "cornerstone of modern culture" (i.e. the type of art you see in museums, etc.) doesn't seem to be related to how hard it is to produce.

For example, Flag by Jasper Johns sold for $110M in 2010 [1]. The fact that almost anyone could make something like this (the subject matter is public domain, and the technique is within reach of anyone who has taken an art class) doesn't seem to have diminished its value.

And as for all of the art you see in the everyday world, being able to design it more easily would likely lead to more of it, not less. Instead of seeing the same 20 things that they carry at IKEA over and over, you'd have custom artwork everywhere.

[1] https://en.wikipedia.org/wiki/Flag_(painting)

EDIT: Added the last paragraph while you were replying.


> i.e. the type of art that you see in museums, etc.)

Nope. I mean the visual art we see all around us. If it’s trivial to generate, it’ll be everywhere and nobody will value it. It will become an annoyance to people - so what fills that space then? Bare walls? Or will things go to a new level? That was kind of my line of thinking.


We'll still have theater.


Looking forward to the day I have a little stage in my living room where little androids act out scenes from Shakespeare, Beckett. Ha I bet Beckett never saw that one coming.


Film/video already devalued theater and generative video is marching right behind generative imagery.


> Film/video already devalued theater

Is that true? Are there fewer live music shows, plays, live comedy, etc... than there were before cinema and television?


In total, probably not but per capita?


I have paintings hanging in my house produced by artists I personally know. If anything, I think ubiquitous, throwaway and infinitely reproducible digital art makes my paintings even more special to me (although none are worth all that much to start with).


Until we see successful video games/movies/tv shows built using 100% generative art I will remain sceptical that this will happen. What's more likely is that generative AI art will be its own category like digital art, where creative folks will use their creativity to create amazing works of art with the help of generative AI. After all, it takes creativity to know what to generate. You can't ask the AI to make you a whole movie or set of video game levels.


100% might be a high bar for just skepticism. If there was something massively successful that was even 50 or 60% AI (and the novelty of AI wasn't the sole reason for its success), I think I would be firmly convinced.


It might increase the preciousness of art made with physical media. Such real art will be seen as the real deal, handcrafted, etc.

I think the market for affordable prints and copies will definitely suffer, because I can create prints of Midjourney art that looks as good as anything I see in our small local galleries and gift shops.


Artists would have to find creative ways to separate themselves from AI-generated art.

Also, it would be nice if there was something akin to a watermark that was perhaps invisible but, upon inspection using a certain tool, would reveal whether the image was created using a generative AI model (similar to how real currency can be inspected and differentiated from fake currency notes).


All the AI generated images still make my head hurt when I look at them for any length of time. You start seeing oddities with shapes and lines, or some bizarre errant merge happened and my brain struggles to comprehend it. There's also a sort of "natural messiness" hand made images have that I don't think AI will ever capture.


Would you pass a blind test? I'm pretty sure I wouldn't.


i feel the same way, but for big money movers, unless you can put it on a spreadsheet, it might as well not exist.


For now, all we know is that this technology replaces the human capacity to draw, with good technique, what a human wanted to be drawn. I think the relevant part of art is not the ability to generate an image, but to put thought into what you want to be there.


The language on this page also suggests they want to put the prompt engineers out of a job.


Luckily AI art will most likely be in the public domain, so artists will probably always be involved, but you'll become some editing cog in a machine, trying to change a piece of AI art enough that it becomes copyrightable.


There's still a significant lag time before uncensored generative AI catches up with this. So gracious of OpenAI to keep porn artists employed a little longer with all their moralizing.


Seems like a problem for generative systems too though, fewer artists creating original artwork results in less input and variety to generate work from.


Images aren't all created by "artists", and you don't need to train on images created by "artists" in order to produce something that could be called "art".


I don’t really see how this has any impact outside of digital art. People are still buying paintings even though photos exist, and still buying sculptures even though cnc machines exist.

I also think the output isn’t yet good enough to be usable without some human intervention (fixing hands, spot treatments, etc.)


That’s just until AI combines with CNC to produce unique sculptures. Which won’t take long.


The impact has already happened, and this will accelerate. Paintings and even sculptures can now be bought dirt cheap online, as they're being mass-produced by machines in China. Now fine art, graphic design, etc. are next to go.


Probably being produced by factory workers, not machines. (It's like clothing - clothes are often cheap now, but every one of them was man-made.)

And Chinese wages are rising, so it'll eventually be too low-level for them.


I see they're continuing the campaign to make people dislike something called "AI safety" by redefining it as corporate prudery.


This hijacking of the term really bugs me. As though The Terminator himself would have been "safe" if only he didn't speak cursewords to his neighbour.


I think they've just redefined "AI safety" as an analogue of "brand safety." To corporations, that's what safety means:

https://en.wikipedia.org/wiki/Brand_safety


When car companies talk about safety, they mean the car is unlikely to kill its occupants, rather than that the stereo plays only unoffensive music to protect the brand.

AI safety is a thing apart from brand safety, and OpenAI would be well aware of this, just like GM is aware of what crash safety means.


> they mean the car is unlikely to kill its occupants, rather than that the stereo plays only unoffensive music to protect the brand

Right, so in that case the occupants are their customers, and they're hopefully protecting them from harm. They're not optimizing for, say, pedestrian safety[0].

In this case, OpenAI's customers are other companies, and they're keeping them from harm, and the number one harm that companies are worried about re: AI is "what if we deploy an AI tool and it generates nudity etc that damages the bottom line."

I'm not saying this is a good thing, but it seems to describe the situation as it is, doesn't it?

[0] https://driving.ca/auto-news/driver-info/blind-spots-on-pick...


It's the perfect next vehicle for activist-censors to parasitize now that they more-or-less have secured control of "content moderation".


> Like previous versions, we’ve taken steps to limit DALL·E 3’s ability to generate [...] adult [...] content.

And so continues the trend of "progressive" AI companies deliberately handicapping their models for no real good reason.


True. Not only is the handicapping already over the top with most models, but even mentioning it here gets you flagged; see my honest comment at the end (-1). Amazing how people are censoring themselves now.


Progressive = porn-enabling? No real good reason = let’s have kids use this without limiting either their imagination or the psychological damage?


Conservatives (in the US meaning of the word) are exclusively the ones clutching their pearls about pornography. I don't know how you can be so confused on the subject. Three red states have even started the process of banning pornographic websites.

https://www.pcmag.com/how-to/pornhub-blocks-access-utah-here...

In the end, this is about liability, not morals.


From OpenAI's own statements:

> OpenAI claims it focused a lot of work on DALL-E 3 in creating robust safety measures to prevent the creation of lewd or potentially hateful images.

The safety measures at OpenAI are pretty much cliché progressive hyper-moderation in 2023. You see the same approach on Reddit and elsewhere, and the left-leaning communities are consistently the most locked down to a particular Overton window. This is not a very controversial take.

It goes well beyond just pornography and explicitly violent stuff.


Go ahead and try to sell a product to enterprises where the creation of violent and pornographic images is possible. I promise you that you'll get zero sales. This isn't censorship nor is it about "progressive" values. This is capitalism.


Again, this goes well beyond pornography and violence. It was already very aggressive, and now they claim that they invested a ton more work in making it even more “safe”.


Your reply reminds me of that meme "everything I don't like is woke".

How is blocking porn a "progressive" thing? Aren't conservatives the ones blocking porn these days?

For example: https://www.rollingstone.com/culture/culture-features/republ...


I think you should trust Rolling Stone on conservatives about as much as you should trust Fox News on the topic of progressives.



So continues the trend of bad takes.


Complaining about censorious moralizing is valid. You don’t have to defend it just because you view it as a partisan attack.


What?


Please provide more substance when saying something like this. Kindly explain why you feel it is flawed instead of flatly rejecting it without elaboration.


The comment is not forthcoming, but the point is valid: it’s pretty obvious that they are afraid of generating adult content because of the risk that people will use it to make, e.g., porn of people without their consent, CP, shock images, or anything else that will harm OpenAI’s reputation and get them sued/regulated into the ground. It's not because Sam Altman is some kind of evangelical moralizer.


Interesting that they don't have the watermark on the "DALL-E 3" images.


Perhaps they use some form of steganography.


Why would you expect one? I don't think I've heard of any AI image generators placing a watermark.


v2 has a watermark in the lower right corner of every image.


That's the case now too if you use the API and not the labs version.


Why do you think this is?


There's probably no point in terms of a watermark's goals.

Additionally, since OpenAI is asserting no rights over the image, a watermark would be inappropriate.


Makes sense



