I used DALL·E 2 to generate a logo (jacobmartins.com)
1040 points by cube2222 5 days ago | 375 comments





I’m fascinated by how much this is exactly like working with a human artist who doesn’t really understand the domain that you are wanting to represent with an image. Iterate, iterate, iterate.

It seems like the most valuable thing this could do is get some of that early exploration out of the way faster and easier than a human can do it, get to two or three concepts that feel like they’re in the neighborhood, and then let a human expert take over and turn it to something final quality. That’s pretty cool.


Agreed.

At the end of the article I also described a bit how I would see the evolution of such a tool, and it looks like we're thinking very similarly.

---

Though I think the real breakthrough will come when Dall-e gets 10-100x cheaper (and faster). I would then envision the following process of working with it (which is really just an optimization on top of what I’ve been doing now):

1. You write a phrase.

2. You are shown a hundred pictures for that phrase, preferably from very different regions of the latent space.

3. You select the ones best matching what you want.

4. Go back to step 2; repeat 4-5 times, getting better results every time.

5. Now you can write a phrase for what you would like to change (edit) and the original image would be used as the baseline. Go back to 2 until happy.
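For what it's worth, the loop above is easy to sketch in code. This is a toy illustration, not a real API: `generate_images`, `iterate`, and the `pick` callback are all made-up names standing in for a cheap DALL·E-like service and a human choosing their favorites.

```python
import random

def generate_images(phrase, n=100, baselines=None, seed=None):
    # Hypothetical stand-in for a cheap DALL-E-like API: returns n
    # candidate "images" (strings here) for the phrase, optionally
    # conditioned on previously selected baseline images.
    rng = random.Random(seed)
    baselines = baselines or []
    return [f"{phrase}#{rng.randrange(10**6)}|{len(baselines)} baselines"
            for _ in range(n)]

def iterate(phrase, pick, rounds=5, n=100):
    # The select-and-refine loop from the comment: generate a batch,
    # let the user pick the best matches, and feed those picks back
    # in as baselines for the next round.
    selected = []
    for _ in range(rounds):
        candidates = generate_images(phrase, n=n, baselines=selected)
        selected = pick(candidates)
    return selected
```

In a real system, `pick` would be the human clicking thumbnails, and step 5 (prompt-driven edits on a chosen baseline) would be a second loop with the selected image passed as the edit target.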


I see this happening in all areas. Everything would be prompt-driven.

Do you like this? What about this? You simply nod or reject the solutions that you don't want.

Pretty soon somebody's expertise and experience are not going to be enough to justify paying them what they used to get before this magic black box appeared.

One day enterprises will realize they can just outsource that expert who's been reduced to simply typing prompts and nodding yes or no.

I am worried that the middle class is rapidly disappearing. "We will own nothing and be happy" seems quite ominous. The question, then, is which fields are safe from advances in AI?

The only fields I can think of are doctors, lawyers, executives, and buy-side money managers. Even their jobs will be partially automated, but they will be safe as long as they generate revenue.


You don’t need nodding or really any conscious reaction, I think. It should be possible to have a camera pointed at the face, hooked up to another AI that catches slight changes in pupil dilation or other changes imperceptible to the naked eye, and registers when something looks interesting to the user. You could then quickly show a stream of variations, pick the tagged ones, and use them to improve the guesses. I imagine something like this might one day become a preferred way of interacting with computers/AI.

> Everything would be prompt-driven.

Just like in Star Trek. They really knew what the end goal was, didn't they?

> enterprises will realize they can just outsource that expert who's been reduced to simply typing prompts and nodding yes or no

Tbf, a program that just averages the market demonstrably gives better returns than most of the financial industry, yet that industry still exists. Just because we can automate something doesn't mean we will, usually for pointless emotional reasons.

But on the other hand it's hard to say if in 100 years humans will still be employable in any practical capacity for literally anything.


But, if everyone's jobs are automated, nobody is making any money, so nobody has any money to pay doctors, lawyers, executives, money managers, etc. You would think that if these types were thinking rationally, they would be fighting to expand the middle class so more people can pay for their services.

In the past, eliminating humans from one set of jobs has been balanced by a new set of opportunities for humans in different jobs. Usually, the new jobs are more valuable.

That's not utopianism. The new jobs can't always be filled by the people kicked out of jobs. It really sucks to be them.

But it does mean that it's not irrational for people to want to automate other people's jobs. The net amount of stuff generated increases, rather than decreases.

This pattern may not last forever. There's already some thought that we've generated more than enough stuff to guarantee a decent standard of living to everybody (at least in the developed world) without working, and plenty more for luxuries if people choose to work. Even if we haven't reached it, we appear to be heading in that direction sooner rather than later.

That may cause a radical re-think at some point. And it won't be seriously delayed by making sure cartoonists have jobs.


Jobs are plentiful as long as wealth is well distributed.

In the past, fast automation has led to badly distributed wealth, and job loss. This situation has lasted until the unemployable people died off (yep, that was part of it), and enough wealth was redistributed through violent means.

Today we know better, and have really no reason to repeat the violent means of our previous revolutions. But it's really looking like the people in power want to repeat them.


> enough wealth was redistributed through violent means.

There were no instances of violent redistribution of wealth that ended up better for the average person than before; only a different group of people ended up with the wealth.

Automation makes stuff cheaper, even for people who didn't obtain any of the financial wealth via redistribution - because there's more than just financial wealth that gets created with automation: new availability of services and goods. Think of today's internet - wealth that couldn't have existed before, which one can benefit from even while poor.


> enough stuff to guarantee a decent standard of living to everybody

It's not a zero-sum game. There's still growth in us. We'll go to space and expand 1000x more; space has plenty of resources, and humans will have jobs working together with AI.


> There's still growth in us. We'll go to space and expand 1000x more, the space has plenty of resources, and humans will have jobs [..]

Q: Am I the only one thinking of Golgafrinchan Ark Fleet Ship B?


We'll have to automate childcare to make that happen. Otherwise, the birthrates of the rest of the world will follow the countries with the highest standards of living on a wild plunge into unsustainability.

>Pretty soon somebody's expertise and experience is not going to be enough to continue paying them what they used to get before this magic blackbox appeared.

Every art director at an ad agency just shrieked!


I doubt it, because the process of thinking of phrases to feed dall-e is really the hard bit.

This is ok for a logo like this where it’s fair to say the base level expectation is not super creative. This logo is cool, but it doesn’t really stand out or make the product very distinctive. If I am running a hobby or OSS project that’s fine, but if I were investing a lot in sales/marketing then paying a real artist to make something interesting and novel is a rounding error.


> This logo is cool, but it doesn’t really stand out or make the product very distinctive. If I am running a hobby or OSS project that’s fine, but if I were investing a lot in sales/marketing then paying a real artist to make something interesting and novel is a rounding error.

Q: Are there really logos out there that are "interesting and novel" and that "stand out or make the product [..] distinctive"? Which ones?

EDIT: (perhaps more importantly) are there interesting, novel, distinctive logos that actually contribute to profitability?


tbf I think when it comes to big company branding it's the opposite.

A lot of DALL·E iterations of the design have left the article author with something which is quirkier than your average logo, but which also looks like clipart and probably doesn't scale up or down well or work in monochrome. Which is fine for OSS. (He might get more users from blog traffic about using DALL·E 2 to design his logo than he ever could from any other logo anyway.)

But when it comes to bigger companies, the design agency are the people that sit in meetings with execs persuading them that a well-chosen font and a silhouette of a much-simplified octopus will work much better ("but maybe the arms could interact with some of the letters, etc.; now let's discuss colours"). The actual technical bit of drawing it is the bit that's already relatively cheaply and easily outsourced, and plenty of corporate logos are wordmarks that don't even need to be drawn...


Doctors are very vulnerable. Most of dermatology is simple pattern recognition. I can easily see AI lawyers beating human lawyers in litigation, too. An AI lawyer will have read every single case, will know the outcomes, and can fine-tune arguments for specific parameters like which judge is presiding.

> Most of dermatology is simple pattern recognition.

I have a few qualms with this app:

1. For a Linux user, you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem. From Windows or Mac, this FTP account could be accessed through built-in software.

2. It doesn't actually replace a USB drive. Most people I know e-mail files to themselves or host them somewhere online to be able to perform presentations, but they still carry a USB drive in case there are connectivity problems. This does not solve the connectivity issue.

3. It does not seem very "viral" or income-generating. I know this is premature at this point, but without charging users for the service, is it reasonable to expect to make money off of this?


What on earth are you referring to? I assume it’s some sort of implicit joke but I don’t get it :)

Edit: Ahh, it’s the Dropbox comment of HN fame. Never mind.


This workflow reminds me of a generative art program from the early 1990s, but I just can't remember its name. It was a DOS or Windows program that had a very curvy, fluid GUI with different graphics sliders. It would show you some random tiles and you choose one to guide the algorithm's next generation of tiles.

Kai's Power Tools.

I wonder if Kai Krause lurks here at HN. I'd love to know how he's doing. Apparently he's still living in his castle, which he bought around 1999 [0].

Sometime in the '00s I read an article about him saying he was putting advanced networking into the castle and intended to start something like a "think-tank" (doesn't quite fit, but I don't know what else to call it) where he and others would hang around and code stuff.

I found the article [1] from July 2002, "Lord of the Castle Kai Krause presents Byteburg II".

> So that's Kai Krause's long-cherished plan: Now the software guru has finally opened a center for founders and developers from the IT and software industry in Hemmersbach Castle near Cologne -- the Byteburg II

I really wonder what he's up to these days. His plug-ins were legendary, as was the user interface for Bryce [2].

[0] https://de.wikipedia.org/wiki/Burg_Rheineck

[1] https://www.heise.de/newsticker/meldung/Schlossherr-Kai-Krau...

[1, google translate] https://www-heise-de.translate.goog/newsticker/meldung/Schlo...

[2] https://en.wikipedia.org/wiki/Bryce_(software)


Your comment really intrigued me to google this interesting person I had never heard of before. This may well not be news to you, but Kai has a not-a-blog blog that I stumbled upon here: http://kai.sub.blue/en/sizemo.html

Some really interesting reads. I especially appreciated his articles on the passing of Douglas Adams (apparently a close friend of his!) and Then vs Zen.


F’n LEGEND! I spent hours per day tweaking his filters for my thesis animation.

Hunh, I’ll be in that neck of the woods next week. Need to look into this…

Please follow up, and tell us - even a Show HN

https://news.ycombinator.com/item?id=27288454

Love him or hate him (and I do both), Kai was all about cultivating his adulating cult of personality and dazzling everyone with his totally unique breathtakingly beautiful bespoke UIs! How can you possibly begrudge him and his fans of that simple pleasure? ;)

In the modest liner notes of one of the KPT CDROMS, Kai wrote a charming rambling story about how he was once passing through airport security, and the guard immediately recognized him as the User Interface Rock Star that he was: the guy who made Kai Power Tools and Power Goo and Bryce!

Kai's Power Goo - Classic '90s Funware! [LGR Retrospective]:

https://www.youtube.com/watch?v=xt06OSIQ0PE&ab_channel=LGR

>Revisiting the mid 1990s to explore the world of gooey image manipulation from MetaTools! Kai Krause worked on some fantastically influential user interfaces too, so let's dive into all of it.

>"Now if you're like me, you must be thinking, ok, this is all well and good, sure, but who the heck is Kai? His name's on everything, so he must be special. OH HE IS! Say hello to Kai Krause. Embrace his gaze! He is an absolute legend in certain circles, not just for his software contributions, but his overall life story." [...]

>"... and now owns and resides in the 1000 year old tower near Rieneck Castle in Germany that he calls Byteburg. Oh, and along the way, he found time to work on software milestones like Poser, Bryce, Kai's Power Tools, and Kai's Super Goo, propagating what he called "Padded Cell" graphical interface design. "The interface is also, I call it the 'Padded Cell'. You just can't hurt yourself." -Kai

But all in all, it's a good thing for humanity that Kai said "Nein!" to Apple's offer to help them redesign their UI:

http://www.vintageapplemac.com/files/misc/MacWorld_UK_Feb_20...

>read me first, Simon Jary, editor-in-chief, MacWorld, February 2000, page 5:

>When graphics guru Kai Krause was in his heyday, he once revealed to me that Apple had asked him to help redesign the Mac's interface. It was one of old Apple's very few pieces of good luck that Kai said "nein"

>At the time, Kai was king of the weird interface - Bryce, KPT and Goo were all decidedly odd, leaving users with lumps of spherical rock to swivel, and glowing orbs to fiddle with just to save a simple file. Kai's interface were fun, in a Crystal Maze kind of way. He did show me one possible interface, where the desktop metaphor was adapted to have more sophisticated layers - basically, it was the standard desktop but with no filing cabinet and all your folders and documents strewn over your screen as if you'd just turned on a fan to full blast and aimed it at your neatly stacked paperwork.

The Interface of Kai Krause’s Software:

https://mprove.de/script/99/kai/index.html

>Bruce “Tog” Tognazzini writes about Kansei Engineering:

>»Since the year A.D. 618 the Japanese have been creating beautiful Zen gardens, environments of harmony designed to instill in their users a sense of serenity and peace. […] Every rock and tree is thoughtfully placed in patterns that are at once random and yet teeming with order. Rocks are not just strewn about; they are carefully arranged in odd-numbered groupings and sunk into the ground to give the illusion of age and stability. Waterfalls are not simply lined with interesting rocks; they are tuned to create just the right burble and plop. […]

>Kansei speaks to a totality of experience: colors, sounds, shapes, tactile sensations, and kinesthesia, as well as the personality and consistency of interactions.« [Tog96, pp. 171]

>Then Tog comes to software design:

>»Where does kansei start? Not with the hardware. Not with the software either. Kansei starts with attitude, as does quality. The original Xerox Star team had it. So did the Lisa team, and the Mac team after. All were dedicated to building a single, tightly integrated environment – a totality of experience. […]

>KPT Convolver […] is a marvelous example of kansei design. It replaces the extensive lineup of filters that graphic designers traditionally grapple with when using such tools as Photoshop with a simple, integrated, harmonious environment.

>In the past, designers have followed a process of picturing their desired end result in their mind, then applying a series of filters sequentially, without benefit of undo beyond the last-applied filter. Convolver lets users play, trying any combination of filters at will, either on their own or with the computer’s aid and advice. […] Both time and space lie at the user’s complete control.« [Tog96, pp. 174]

METAMEMORIES:

https://systemfolder.wordpress.com/2009/03/01/metamemories/

>Anyone who has been using Macs for at least the last ten years will surely remember Viewpoint Corporation’s products. No? Well, Viewpoint Corporation was previously MetaCreations. Still doesn’t ring a bell? Maybe MetaTools will. Or the name Kai Krause. Or, even better, the names of the software products themselves — Kai’s Power Tools, Kai’s Power Goo, Kai’s Photo Soap, Bryce, Painter, Poser… See? Now we’re talking.

Macintosh Garden: KPT Bryce 1.0.1:

https://macintoshgarden.org/apps/bryce-1

>Experienced 3D professionals will appreciate the powerful controls that are included, such as surface contour definition, bumpiness, translucency, reflectivity, color, humidity, cloud attributes, alpha channels, texture generation and more.

>KPT Bryce features easy point-and-click commands and an incredible user interface that includes the Sky & Fog Palette, which governs Bryce's virtual environment; the Create Palette, which contains all the objects needed to create grounds, seas and mountains; an Edit Palette, where users select and edit all the objects created; and the Render Palette, which has all the controls specific to rendering, such as setting the size and resolutions for the final image.

MACFormat, Issue 23, April 1995, p. 28-29:

https://macintoshgarden.org/sites/macintoshgarden.org/files/...

https://macintoshgarden.org/sites/macintoshgarden.org/files/...

>He intends to challenge everything you thought you knew about the way you use computers. 'I maintain that everything we now have will be thrown away. Every piece of software -- including my own -- will be complete and utter junk. Our children will laugh about us -- they'll be rolling on the floor in hysterics, pointing at these dinosaurs that we are using.

>'Design is a very tricky thing. You don't jump from the Model T Ford straight to the latest Mercedes -- there's a million tiny things that have to be changed. And I'm not trying to come up with lots of little ideas where afterwards you go, "Yeah, of course! It's obvious!"

>'Here's an easy one. For years we had eight-character file-names on computers. Now that we have more characters, it seems ludicrous, a historical accident that it ever happened.

>'What people don't realize is that we have hundreds more ideas that are equally stupid, buried throughout the structure of software design -- from the interface to the deeper levels of how it works inside.'


Please don’t just repost walls of copy-pasta

This was some really interesting reading, thank you internet stranger :)

How interesting! Thanks for posting.

+1 what a great program

Given the stochastic way it works I wonder how the randomness is seeded for a certain phrase.

In other words, if another person needed a logo and used the same phrase how long on average until they get a duplicate of your image?


The model starts from a 64x64 8-bit RGB image of noise (random pixels), so technically the chance of an identical starting point is 1 in 256^(64·64·3), though many starting images will be perceptually very close to each other, since small color differences won't matter much. The image is then further upsampled by two other models which will change some details, but shouldn't affect the general composition of the image.

Maybe I'm wrong, but with these diffusion models there is randomness in every sampling step too, not just in the initialization, and they can take 1000 steps to generate a single image.

Ah, good point, this would introduce more variation even if the initial noise is close. But if the initial noise is exactly the same, it probably means it was initialized with the same seed, and the rest of the generation will be identical, since the random number generators are deterministic given the seed.
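That determinism is easy to demonstrate with a toy sketch (plain NumPy, not a real diffusion model; the `sample` function and its "denoise" step are invented for illustration). Because both the initial noise and the per-step noise come from one seeded RNG, the whole trajectory is reproducible from the seed alone:

```python
import numpy as np

def sample(seed, steps=10, shape=(64, 64, 3)):
    # Toy stand-in for diffusion sampling: NOT a real model.
    # One seeded generator supplies the initial noise AND the
    # extra noise injected at every step, so the full trajectory
    # is a deterministic function of the seed.
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)  # initial 64x64x3 noise image
    for _ in range(steps):
        # stand-in "denoise" update with fresh per-step noise
        x = 0.9 * x + 0.1 * rng.standard_normal(shape)
    return x

a, b, c = sample(42), sample(42), sample(7)
# same seed -> identical output; different seed -> different output
```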

Since the image is RGB 1024x1024 and the starting point is random noise (as it is for diffusion models), I guess it would take quite a long time.

It will get cheaper. In 5 years it will run on your phone.

Yeah, my first thought was "Ok, but you are going to need to involve a graphic artist to actually make use of that logo". You probably want a vector version, and you definitely need simplified versions for smaller sizes. But then I stopped and realized how amazing this actually is. It "saved" (I know, it cost $30, but that's a steal for something like this) all the time and money you would have paid for iteration after iteration and let the author quickly home in on what they wanted.

As someone who is incredibly terrible at graphic design but knows what they like this could be a game changer as iterations of this technology progress. I can imagine going further than images and having AI/ML generate full HTML layouts in this iterative way where you start to define your vision for a website or app even and it spits out ideas/concepts that you can "lock" parts of it you like and let it regenerate the rest.

I'm not downplaying designers role at all, I'd still go to one of them for the final design but to be able to wireframe using words/phrases and take a good idea of what I want would be amazing, especially for freelance/side-projects.


Honestly though the hard part is the actual design which is already done here. Learning to vectorize a raster is something that can be done in a weekend with Inkscape, there's no reason to involve an actual graphics designer with this anymore.

> Learning to vectorize a raster is something that can be done in a weekend with Inkscape, there's no reason to involve an actual graphics designer with this anymore.

If you lined up 100 resulting images, 99 from weekend beginners and 1 from an actual artist, I guarantee you would pick out the artist's image every time.

It might be simple to trace over an image, but you are probably better off getting an artist to spend 2 hours on it; it will most likely look better than 2 weeks of your own tracing.


Time value of money. The optimal use of money and time would be having the ML iterate until you have the finished concept, then getting a designer to vectorise it and fix it up. That way you pay the designer for one iteration and spend the time you would have spent iterating with the designer iterating with the ML model instead.

I think you might be underestimating how much work goes into the last mile of a design. A lot of refinement work goes into typography in particular, a domain Dall-E isn’t yet proficient in at all.

Nice ideas, great enthusiasm.

I think your art/design/craft is pretty good. Some people use pencils, some use Adobe products, you have gone out there and tried the new Dall-E medium.

Glad you thought out the usage, I am sure that when the novelty wears off that you will have that neat-as-octocat logo sorted out.

I appreciate that you appreciate the value that highly skilled designers bring to a product with their visual expertise.

However, I would like to see you A/B test the Dall E logo versus the winning designer logo. You could show odd IP addresses one logo and even addresses the other.

I think the designer would edge the robot for what you need (a logo), however, the proof is in the pudding and conversion rate.
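A minimal sketch of that kind of split test, assuming a hypothetical `variant` helper (hashing the address gives a more even and stable split than literal odd/even IPs, but the idea is the same: each visitor always sees the same logo):

```python
import hashlib

def variant(ip: str) -> str:
    # Deterministic A/B bucket from an IP address: hash it and use
    # the first byte's parity. The same visitor always lands in the
    # same bucket, so you can compare conversion rates per logo.
    h = hashlib.sha256(ip.encode()).digest()
    return "dalle_logo" if h[0] % 2 == 0 else "designer_logo"
```

Then log which bucket each conversion came from, and the "proof in the pudding" is just comparing the two rates.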


Plus there is no reason why someone couldn't build a specialised AI model to do vectorisation and another to generate simplified versions of vectors.

People are already doing this by combining DALL-E 2 with GFPGAN for face restoration. So there may be a role in understanding how to combine these tools effectively.


Yes! It gives powerful tools for someone with a concept to get much closer to visualization of their idea.

DALL E 2 is like a low or no-code tool in that way.

The outcome may not be a "finished" product, especially as viewed by a professional designer (or web dev). However, it's a heck of a lot better than a tersely written spec.

And in some cases, the product will work well enough to unblock the business, get customer feedback and generally keep things moving forward.


I think this is more powerful than a simple exploration tool. It took the author a long time to find a query format that generated logo-like images. Once they had that part down, they were quickly able to iterate on their query to find an image they liked. They were even able to fix part of the logo using the fill-in tool. I'm not sure why you'd bring a human into the mix, especially if you're on a budget.

Ehnnnnnnnnn...

An experienced human designer, right away, is going to ask how you want the logo to be used. That's going to have a major impact on how it's designed.

So yeah, this may be like working with a doodler, but, as the author intimated, this is far from an ideal experience in getting a professionally designed logo. This is more like "Hey, you, drawing nerd, make this thing."

Nevertheless, astonishing technology in its own right.


Nah, people will leave out the professional. It's the same wild-west grab-whatever-you-can: steal and plunder to the detriment of artists, writers, etc. And when the legislation arrives, it will already be too late, accidentally.

Why should there be legislation? Do you want to restrict what people can do, just to force them to employ artists and writers? We could also forbid people from filling the gas tanks in their own cars, to protect the job of gas station attendant, but nobody wants to live in New Jersey.

You remember the concept of dumping, i.e., flooding a market with below-cost product to drive out competing businesses? This is dumping for creatives.

Edit: not that it's intentional, but these things will have the same effect: way too much product, even for creative works. No one will be able to make money off the product, only off the tools.


Is it below cost though? It might just be very cheap to run.

"Why should there be legislation?" Lol. Read the uber files.

This blog post proves that Dall-E 2 will not make human taste and design ability obsolete. The final image he ended up with is a lot uglier and more complicated than most of the intermediate steps. I think generative art AIs will have a similar effect on design as compilers have on software development, and will not put artists out of a job.

Not trying to be a luddite and/or vehemently defend the noble profession of nuanced graphic design, BUT...

Those iterations suck. I'm not worried for my colleagues and me.

That being said! Many, MANY clients have questionable taste, and I can, indeed, see many who aren't sensitive to visuals to be more than happy with these Dall-E turd octopus logo iterations. Most people don't know and don't care what makes good graphic design.

For one thing, that final logo can't scale. For another, the colors lack nuance & harmony. The logo is more like a children's book illustration, and not something that is simple, bold, smart, and can be plastered on any and all mediums.

Just my 2 cents.

I bet in another 10-15 years, though, things might get a bit dicier for fellow graphic designers/artists/illustrators, as all this tech gets more advanced.


I feel like you look at this too much as a creator rather than as the customer. The logo may not be optimal for every medium, may not have a great palette, may not have the feel you would give it... But the author is happy with the result, so who are we to say it's bad or good? Paraphrasing @mipsytipsy, "colour harmony doesn't matter if the user isn't happy". (Yes, I get the nuance that it's part of the designer's job to explain why certain design elements are more beneficial, but the general point stands for the "I want a logo for my small project" case.)

Why is the creator the only one that needs to be happy? I assume they created that project to be used by others and to possibly monetize it. That sounds more like the users / clients are the ones that are supposed to like it...

I never understood this logic, where the creator of something does something seemingly stupid and people are like "Well, don't use their project then if you don't like it". Instead of constructively calling the problem out, so the creator can try to make it better.

If my logo sucked, I'd like people to please tell me...


> Why is the creator the only one that needs to be happy?

Because they define what success is. If their goal is to make money, they may want a logo which is the closest to optimal for getting clicks. If it's a private project, they may want it to be fun. And many other scenarios... You're welcome, of course, to offer constructive criticism, but in the end it's up to them whether they want to apply it.


You're right. A million shitty logos are created every day, and for the vast majority of them, they will serve their purpose. And contrarily, there will always be a marketplace for companies/entities who want a logo that has purpose, novelty and intelligence behind its design. I definitely see a chasm between an AI-catered subclass and human-catered superclass forming.

Weirdly, with the advent of AI, we might start to see exactly what it is that makes human beings special.


I think a tool like this might be good to help clients get through a few ideation phases on their own prior to showing up to the first discussion with branding / graphics / design professionals. At least it might get them closer to understanding the impossibility of their 7 perpendicular red lines requirement.

It certainly reduces the # of designers necessary. Just because it doesn't obliterate all of the designers doesn't mean the profession isn't at risk. Today fewer data viz experts are hired despite the proliferation of data, since we now have Tableau, Looker, etc

A blunter example: how many lift operators do you see today?


> ...the impossibility of their 7 perpendicular red lines requirement.

For those who do not know the reference: https://m.youtube.com/watch?v=BKorP55Aqvg


I am going to be extremely butthurt if clients start showing up and asking me to finish an AI's homework for them.

I think this is valid criticism and feels similar to restaurants that don’t put pepper on the table because the chef considers the food to be seasoned to the intended level before it leaves the kitchen. Some customers may be turned off by that level of pride, but other customers are willing to pay a premium for that level of pride to be shown by their chef.

So what you're saying is, the AI hasn't yet grown up to be a boring, clean, simple adult like the Western Scandinavian school.

That's some strong copium you got there, can I have some of what you're smoking?

Ultimately the average person (who is likely the target audience anyway) won't notice anything wrong with most of those iterations, and given that they're basically free in comparison, that would make me worried. I wouldn't be surprised if they manage to make it output SVGs soon.


I agree with you.

I will say, though, I think DALL-E has opened up a new market for artists. I've gone to freelance graphic designers before, and been generally happy with the results, but it's pricey. So pricey that I honestly can't justify it for a new project I intend to sell or for an open source project I don't expect to make money from. It's often more cost-effective to hire lawyers or even UI/UX people.

If I were an artist, I'd be experimenting with DALL-E, trying to run my own pirate version and learning everything about it. An artist empowered with DALL-E could give quick options to a client, iterate with them quickly, and test out some ideas before making the final work product. I'd guess a good artist who made good use of DALL-E could get a project done much faster and cheaper, and this would likely mean a lot more people hiring artists (if I could spend $100-200 for high-quality assets within a few days rather than $1000-2000, I'd gladly hire artists frequently).

I'm sure this will make some artists feel cheapened, but the reality is that art & technology have always evolved in dynamic and unpredictable ways. ML being essentially curve-fitting means that genuine inspiration and emotion are still far beyond our capabilities today, and that, ultimately, these models will only give us exactly what we ask for. A good (human) artist can go beyond that.

EDIT: Also, I agree with your assessment of the "work product," if we can call it that. I was unimpressed with the iterations, and especially the final product. I guess it's good the product is an open source tool. Nothing about the generated logo helped me understand what the OctoSQL tool did. Honestly, the name (which also IMO isn't excellent) is much more evocative than that logo. Why is the octopus wearing a hard hat? Why is it grabbing different colored solids? I guess the solids are datasets? But then the octopus is just exploring them? No thanks.


It's kinda funny that your main complaint about the final logo is that it doesn't tell you much about what the project does.

I can't think of a single well known logo that is even remotely close to what a company's product is. Photoshop, Firefox, Chrome, Microsoft, Facebook, Apple, Netflix, McDonalds, Ford, Ferrari, Samsung, Nvidia, Intel, RedHat, Uber, Github, Duolingo, AirBnB, Slack, Twitter, IntelliJ, Steam.

I guess the Gmail logo does tell you it has something to do with mail though, so I did find one example.


> Most people don't know and don't care what makes good graphic design.

But isn't the logo created for most people? Does it matter that you, as a designer, think it's bad if most people don't? I see it like modern fashion shows. I look at them and think the clothes are insane and I would never wear them, but obviously other fashion designers think they look good (I'm guessing?).

I do agree that the logo isn't super practical though, it's too textured and won't scale. I would take it to /r/slavelabour or Fiverr and pay someone to vectorize it and see what they come up with.


Even things that are created for most people usually need a professional to make it actually good for regular folks. Just like most people can tell if a song is musically good or not, but would struggle to actually create that themselves. Or they know when a physical thing is easy to use, but they'll struggle to create things themselves that are easy to use.

But the point here is exactly that they don't need to create it, they just need to judge it. They make the AI create the logo and then decide if they like it.

I understand your argument, but I don't think that's the problem - the problem is that even most users don't understand what a good logo looks like (even if they like them), the same way users don't know what they want. It's a known fact that you shouldn't ask users of software how it should be designed, because if you let them design the software they want, it would be shit.


I agree. But I think the key thing is that deciding what phrase to feed the system was still the key task. Creative people are unlikely to be out of a job anytime soon, even if they end up using something like DALL-E to make quick prototypes.

I work in the AI field, but not on image generation.

I don't think it would be technically hard to build a model with current technology that can generate logos with the attributes you mentioned. You could simply fine-tune a DALL-E style model on a smaller dataset of logos. This would just take a small dedicated team of domain experts working on the problem.
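For intuition, fine-tuning is just continued training on a narrower dataset. Here's a deliberately tiny toy in plain Python (a one-parameter "model" and made-up numbers, nothing DALL-E-specific) showing the pattern: pretrain on broad data, then briefly keep training on the small domain-specific set:

```python
# Toy illustration of "fine-tuning": pretrain a single parameter on a
# broad dataset, then continue training briefly on a small, narrow one.
# All numbers are made up; real fine-tuning is a vastly larger version
# of the same pattern.

def sgd(theta, data, lr, steps):
    """Minimize squared error to the data points with plain SGD."""
    for _ in range(steps):
        for x in data:
            grad = 2 * (theta - x)
            theta -= lr * grad
    return theta

broad_data = [0.0, 2.0, 4.0, 6.0, 8.0]  # stand-in for "all images"
logo_data = [9.0, 9.5, 10.0]            # small "logos only" dataset

theta = sgd(0.0, broad_data, lr=0.05, steps=200)     # pretraining
theta_ft = sgd(theta, logo_data, lr=0.05, steps=50)  # fine-tuning

# The pretrained parameter sits near the broad mean; after fine-tuning
# it has shifted toward the logo data.
print(round(theta, 1), round(theta_ft, 1))
```

The point is only that fine-tuning starts from the pretrained weights, so a small dedicated dataset can specialize a general model cheaply.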


I've seen people screenshot logos at low res, save them as JPEGs, share them over WhatsApp, and put them on A0 posters. With SVG and EPS logos easily available. With detailed guidelines on how to use them. Point their mistake out to them and they still won't see anything wrong.

I bet it will happen sooner than 10-15 years.

The thing you're missing is that AI-generated content can be refined by AI. If Disney promised their meh-looking movie would improve on its own over time, people would line up for it because it's new, not just the streamlined copy-pasted design we see all over media now.

Painting the Titanic wasn't the hard part. The hard part was organizing the process that produced its structure. That's where AI content is now.

We’re generating the bulk structure pretty competently at this point. Refining the emotional touches will come faster.


I disagree with the analogy you draw (no pun intended). Good creative design is the edge case for a model like this and is naturally much less tractable than getting to this level of design (I’m not a fan).

> I bet in another 10-15 years, though, things might get a bit dicier for fellow graphic designers/ artists/ illustrators, though, as all this tech gets more advanced.

That's a long time. I expect within a decade or two, "AI" should be able to generate an entire animated movie given nothing but a script.


Unless the tech learns to reason, it will never be able to do anything other than recombine and remix prior art. (Which is maybe what many designers already do, but it won’t ever spit out a Paul Rand logo.)

Honestly, logos are currently a very low entropy art form, much lower than graphic design, which is already quite low compared to many forms of art (obviously my subjective opinion, but I'd like to think I have strong reasons). If anything, I think logo design is one of the first things AI can achieve human parity on. Obviously the style in this post was unorthodox for a logo, so I wouldn't even rule DALL-E out, with the right prompt engineering.

However, once you reach a certain budget, it's much more involved to *choose* a logo that "fits" how the company wants to present itself, than it is to generate candidate logos of sufficient quality. I can assure you that the "many-chefs problem" for a high budget design project is very real, and the major cost driver. You have a mix of "design by committee", internal politics, what designers wants on their portfolios, etc etc.


I was thinking something similar. The editing process is still a human one, and I agree that the one chosen was weaker than a lot of the intermediate choices. It's a matter of taste, obviously, but to me the red ball with a nondescript sketched square around it feels unfinished. The yellow cartoony logos look more finished and professional to me.

Appreciate the feedback!

I'll keep it in mind, as I might still end up choosing a different one.

The chosen one is closer to my original vision, but you do have a point that the yellow ones look more polished.


Strongly agree with others here that you skipped better options.

Also, since time immemorial, databases are cylinders and data comes in cubes.

For logo purposes, these are both strong, while the second adds “personality”:

https://i.imgur.com/j6P4Oh4.jpg

https://i.imgur.com/kM23GZV.jpg

I really like the design breaking out of the strong circle, and your hard hat idea was great. That last one could have been your logo “as is”!

Though you could consider replacing the green cubes with cylinders, or simply hand-add Rubik's cube lines to the green cubes to make them data cubes.

https://duckduckgo.com/?q=data+cube&t=ha&va=j&ia=images&iax=...

Thanks for sharing the process!


From what I see we are at the next stage of the logo generation :)

Disagree. Just allow one or two more iterations and it will supersede human abilities. Think ahead. Tech progress won't stop.

The tech will get better, but ultimately there still has to be a human who decides 'that's the one that looks good', which strongly depends on someone's taste and skill in identifying what a good image looks like.

There will probably be less need for designers of 'lower quality' simple images though.


I agree with you, but what if what constitutes good taste is just a subset of things that we’ve seen and liked.

If DALL-E decides what we see, it might become what the next generation likes and considers "good taste".


This is an interesting conversation. Good taste is what we see and like … but also patterning after people we want to impress / be associated with, is it not?

Taste is very complex: it's hierarchical, social, not fixed, not absolute, not rational, is specific to audience and has irregular overlaps across groups, much of it (all?) derived from human sensation and context-specific situations.

The path to something being considered as good taste is generally not simple: much of it flows through lines of power/desire/moment whose branches are not easy to trace as they're being formed. Much of taste is the hidden "why" which most of us never see.

It's realistic that Dall-E could understand what trends are on the rise, or in good taste … it's much harder to say if Dall-E could create something of originally good taste.


That just sounds like pattern recognition with extra variables. Subdividing people into groups and then analyzing them certainly doesn't sound like a task that a machine will struggle with. Why should the algorithm need to be able to see the hidden "why" when most of us creative types can't see it or define it either? It's just a function of having observed enough people of a certain type.

You want to generate something that will impress the people I'm targeting? Just analyze the posts of all my followers on social media. Analyze the content that is "liked" by people in my demographic range and with close proximity to where I live. Analyze the works of creators who belong to my generation and who listen to the same music as me. Do that all nearly instantly and then offer me a selection of options picked from those various methods.

I don't expect "good taste" will be hard to conjure up. I already can't tell that a lot of these octopus drawings weren't created by a talented human, and we're still early and unsophisticated in our data analytics.

> has to be a human who decides 'that's the one that looks good'

Assuming the status quo, true. As we evolve our lives around emerging AI tech I think we will at first be the curators and creative directors of AI, but eventually a creative agency will defer to the AI as it knows more about our tastes, market, audience, and the ENTIRE HISTORY of art, design, marketing, tastes, trends, and so on.

Eventually it won't make sense to have a stupid human rubber stamp what the all powerful AI suggests. Just as it does not make sense for Facebook to curate news feeds.

Maybe one day product advertising will look different depending on who looks at it. Pepsi logo "just for you".


Is anyone really happy with AI-curated feeds? Besides the companies who make them?

I am! TikTok is amazing and the ads I get on Facebook/IG are for things I often want to buy.

What looks good is a more widely distributed skill. A lot of people can tell you what looks good quite well, very few people can make it.

There would be X − Y humans needing fewer hours instead of X humans needing X hours. And that is a real risk to the profession.

Only if you assume the world demand for raster images is fixed...

I still remember a HN article, might have been a Paul Graham article, from 15 years ago about "Why are all of Trump's buildings so poorly designed when he can afford the best designers?" It came down to the fact that he personally has bad taste and therefore cannot pick good designers or approve good designs.

That aside, a great use of these tools is to generate N quick takes of wildly varying styles that you can present to the customer very quickly and very cheaply. Once you pin them down to a particular range of styles, you can get down to carving out the details by hand.


However, the input might stop.

Right now, the input to DALL-E is all human generated.

What will happen is that DALL-E will generate something "close enough" that gets used and promulgated, so now the input to DALL-E will become increasingly contaminated with output from DALL-E.

We're already starting to see this in search engines where you get clickbait that seems to be GPT-3 generated.


If you can have humans sort the generated images into "good quality" and "bad quality", you can just keep iterating. Our subjective ratings are another score to optimize for.

Moreover, the current DALL-E UI already does that.

When you run a phrase, you get four images. Those images will stay in your history, but the ones you like you will save with the "save" button, so that they're in your private collection.

With this, you already have a great feedback system: saved - good, not saved - bad.
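Mechanically, that save history is already a binary-label dataset. A minimal sketch of the idea (the function and image IDs below are hypothetical, not part of any real DALL-E API):

```python
# Hypothetical sketch: turn "saved vs. not saved" history into binary
# labels a preference model could train on. All names are made up.

def label_from_history(generated_ids, saved_ids):
    """Return (image_id, label) pairs: 1 if the user saved it, else 0."""
    saved = set(saved_ids)
    return [(img, 1 if img in saved else 0) for img in generated_ids]

history = ["img_a", "img_b", "img_c", "img_d"]  # everything generated
saved = ["img_b", "img_d"]                      # what the user kept

labels = label_from_history(history, saved)
print(labels)  # [('img_a', 0), ('img_b', 1), ('img_c', 0), ('img_d', 1)]
```

The noisy part, as the replies below note, is that people save images for all sorts of reasons, so "saved" is only a proxy for "good".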


I've saved some of the worst images Dalle generated to be able to showcase just how bad it can be sometimes. And then other times the bad image is hilariously bad. They can probably build another layer on top of the feedback system though to filter that sort of thing out.

I would guess your use-case is a statistical anomaly. If most of the images that are saved are saved by people who like them best, which is most likely the case, enough data will erase the problem.

Doesn't the sample size for this have to be very large for it to make a difference? Genuine question.

With semi-supervised learning, a small amount of labeled data can produce a considerable improvement in accuracy:

https://en.wikipedia.org/wiki/Semi-supervised_learning

https://towardsdatascience.com/semi-supervised-learning-how-...
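To make the pattern concrete, here's a toy self-training loop in plain Python - not how DALL-E trains, just an illustration of the semi-supervised recipe those links describe (fit on a few labeled points, pseudo-label the confident unlabeled ones, refit); all data and thresholds are made up:

```python
# Toy self-training: a nearest-class-mean classifier on 1-D points.
# Labeled data is scarce; confident predictions on unlabeled points
# get promoted to labels over a few rounds.

def fit_means(points, labels):
    """Return per-class means of the labeled 1-D points."""
    means = {}
    for cls in set(labels):
        vals = [p for p, l in zip(points, labels) if l == cls]
        means[cls] = sum(vals) / len(vals)
    return means

def predict(means, x):
    """Classify x by nearest class mean; also return the margin."""
    dists = sorted((abs(x - m), cls) for cls, m in means.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin

labeled_x = [0.0, 1.0, 10.0, 11.0]       # only four labeled examples
labeled_y = ["bad", "bad", "good", "good"]
unlabeled = [0.4, 9.7, 5.2]

for _ in range(3):  # a few self-training rounds
    means = fit_means(labeled_x, labeled_y)
    still_unlabeled = []
    for x in unlabeled:
        cls, margin = predict(means, x)
        if margin > 4.0:  # only trust confident predictions
            labeled_x.append(x)
            labeled_y.append(cls)
        else:
            still_unlabeled.append(x)  # ambiguous; leave unlabeled
    unlabeled = still_unlabeled

print(predict(fit_means(labeled_x, labeled_y), 9.0)[0])  # prints "good"
```

Note that the genuinely ambiguous point (5.2) never gets pseudo-labeled, which is what keeps a loop like this from amplifying its own mistakes.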


Thank you!

Sure, but there are millions of people on the DALLE waitlist, who would happily rate the output for better performance / more credits. The famous ImageNet data set only has 1.2M images.

Why are you framing it like your subjective taste is universal fact? I think the final image is the best.

DALL-E2 and similar are unbundlings: the best artists synergize 1) technical ability with 2) good taste. 1 is the ability to climb a hill and 2 informs the direction of "up", and both take years to develop well.

What's really interesting about this class of AIs is that they unbundle the two and you can play with them independently for the first time.


Train Dall-E on more logos that you like. I can imagine a creative agency purchasing a Dall-E 2 instance and training it up on a model specific to the work and clients they have ongoing.

If nothing else, inspiration is just a click away. No more searching for ideas, just talk to the AI and it will pump out numerous ideas for you.


Will DALL•E 2 make human taste obsolete? No, absolutely not. But DALL•E 3? 4? Other similar models in the next 5 years? Absolutely yes. This blog post proves that with current algorithms, human input is needed, but it proves nothing about future algorithms.

In my personal opinion as an (admittedly junior) ML engineer and lifelong artist, we've got <10 years before the golden age of human-made art is completely over.


Sounds familiar (Hinton’s predictions about radiology): https://youtu.be/2HMPRXstSvQ

I agree, what a clunky process. Hard to express in written prose what you want, so much ambiguity.

Even if you get close to what you, the human, may like--it's difficult if not impossible to articulate what you like about it and iterate. Black box, keep trying random keywords... May as well grab a marker (read: hire a human)


It depends. Is the customer happy with the result? Beauty is in the eye of the beholder. There are many professions where cheap products killed handmade quality.

will likely improve massively given the generational leaps made in this area. The "good enough" threshold is very low for majority of enterprises.

Not sure if this will be considered off topic, my apologies if so.

The article says that octopi is the plural of octopus, but it's actually octopuses. Octopus is originally Greek, not Latin and thus does not get the Latin plural -i, but instead would get the Greek plural -odes. Since it ends in a way English can deal with, the commonly accepted usage is octopuses (English) over octopodes (Greek) with octopi being the least correct.

https://qz.com/1446229/let-us-finally-resolve-the-octopuses-...


Oxford & Merriam-Webster list both plurals and the author calls out that octopi is "the quite beautiful plural form of 'octopus' " which could be interpreted as "while there are multiple correct plurals of octopus, octopi is the beautiful one."

  While “octopi” has become popular in modern usage, it’s wrong.
I would argue that it used to be wrong, but language, unlike physics and code, is what the majority say it is.

I used to be a stickler for correct vocabulary usage and then I saw a documentary about dictionaries (can't remember what it was) and someone from OED said basically this (from https://www.oed.com/public/oed3guide/guide-to-the-third-edit...):

  The Oxford English Dictionary is not an arbiter of proper usage, despite its widespread reputation to the contrary. The Dictionary is intended to be descriptive, not prescriptive. In other words, its content should be viewed as an objective reflection of English language usage, not a subjective collection of usage ‘dos’ and ‘don'ts’. However, it does include information on which usages are, or have been, popularly regarded as ‘incorrect’. The Dictionary aims to cover the full spectrum of English language usage, from formal to slang, as it has evolved over time.
Now I think it's something that is just fun to argue about, but I don't take any of it seriously.

(edited for formatting)


I'd be interested in knowing what that documentary is called if you remember.

Also https://www.google.co.nz/search?q=The+Professor+and+the+Madm...

I haven’t watched it, but the subject is fascinating.


I’ll scour my watch history, I’m pretty sure it was on Amazon Prime.

Meanwhile, if you think that sounds interesting I’d highly recommend the documentary Helvetica.


No luck. I scoured Prime, Hulu, and Netflix and the only possible one was "The Booksellers."

It's a loan word, there isn't any 'correct' or 'incorrect' answer. Language is always evolving, which is why dictionaries are often descriptive instead of prescriptive.

To wit: A blog post from Merriam-Webster: https://www.merriam-webster.com/words-at-play/the-many-plura...


Actually the plural is "octopuppies."

You're all wrong. The plural of octopus is hexadecipus.

and mayhaps the plural of the plural of octopus is trigintidipus?

Decahexipus*

They only think "octopi" is least correct, because they have yet to encounter "octopussen"!

> While “octopi” has become popular in modern usage, it’s wrong.

What a silly thing to say! Where does this poor fool think language comes from?

This is one of the cringiest Well-Actually-isms. It tries to look pedantic while completely missing the point.


Octopi is also THE epitome of the "-i" pluralization. I see people using "focuses" more than "foci", but it's a common callout that the plural of octopus is octopi.

This is definitely off topic:

I really dislike the Latin plural rule that some misguided but powerful people decided on centuries ago.

"Indexes" is much more natural English than "indices", and we should, when possible, use those forms.


Somehow I recall being told that indexes is the correct plural of the section at the end of a book, and indices is correct for subscripted things in maths and therefore programming.

I don't think a particularly convincing reason was advanced other than "technical things are more Latin-adjacent".


An AI couldn’t generate a more off topic comment if it tried.

The way the author specifically calls out the plural of octopus makes me think they might be trolling (Hanlon's Razor notwithstanding).

Shoot, you're right! If we don't adhere to this, the perfectly consistent English language will be ruined!

Similarly: cyclops -> cyclopodes

I much prefer octopodes over octopuses (which sounds dirty, somehow). Agree that octopi is an abomination.

My brain always wants to pronounce that as "oct-AH-poh-deez" like some Greek hero from the Odyssey.

That's the correct pronunciation.

Achilles, Ulysses, Archimedes, and Octopuses.

Hey, author here, happy to answer any questions!

The logo was created for OctoSQL[0] and in the article you can find a lot of sample phrase-image combinations, as it describes the whole path (generation, variation, editing) I went down. Let me know what you think!

And btw. if you get access take a look at [1] before you start using it. A ton of useful bits and pieces for your phrases.

TLDR: DALL·E 2 is really cool, though it takes quite a bit of work to arrive at a useful picture. Moreover, some types of images work better than others ("pencil sketch" is consistently awesome). As with programming, it's difficult to realize how many pieces you have to specify if you're not an artist - you don't know what you don't know.

[0]: https://github.com/cube2222/octosql

[1]: http://dallery.gallery/wp-content/uploads/2022/07/The-DALL%C...


How much did the credits for all this image generation cost you?

edit: found it in the article: "From a monetary perspective, I’ve spent 30 bucks for the whole thing (in the end I was generating 2-3 edits/variations per minute). In other words, not too much."


I've spent $30 for my own DALL-E 2 experiments, and that's with the bonus credits they gave for early adopters.

It gets expensive fast.


I also tried to make it generate an icon for a product and I managed to get it to show me interesting things, but never got to make it actually draw it as one. Do you remember which prompt resulted in this macOS-ish app shape?

https://jacobmartins.com/images/dalle2/DALL%C2%B7E%202022-08...


Hey!

I didn't prompt anything specifically, it came after a line of variations from a definitely-not-icon-looking picture.

Though I'd try tags like "iOS icon".


Hi cube2222,

thanks for the writeup. I looked at your other blog posts and I would like to read more about octosql (needs/specification, architecture, development strategies, challenges, DBMS protocols/interfaces/libraries).

And thank you for adding outer joins after I recently mentioned that they are missing!


Hey!

There is no technical documentation available right now other than the readme. I'm planning to write it around September-December (together with a website for them).

You can share your email at jakub dot wit dot martin at gmail and I'll let you know when it's available.


My friend asked me to create a logo using Dall-E for a pizza business called "Jared's pizza." I tried several different prompts but it kept outputting logos with the word "Jizza." It doesn't do too well with text from my experience, but it could have been the prompt.

https://labs.openai.com/s/z1PVd5v6td9PsiY20Y5GdxDf | https://labs.openai.com/s/yxX49BjX07BztYgMjm49iXKc


DALL-E trying to spell is one of my favorite things. At one point I tried to generate an illustration of Steve Jobs, just to see what it comes up with for a popular figure, and I got a reasonable facsimile of his face along with the text "JiveStoves".

This made me laugh out loud, the first image at first glance looked like "Jizz" with a picture of a pizza.

Jizza sounds really tasty, maybe dall e is onto something.

Jizz-a does not sound tasty to me, but your preferences might vary.

"Jizza pizza, you'll love our crust"

I wonder what it's stuffed with

Hahahah both of them are excellent! Which did he pick? :P

1. “ghidra dragon, logo, digital art, drawing, in a dark circle as the background, logo, digital art, drawing, in a dark circle as the background”

[1] https://labs.openai.com/s/x2UP0MEmj2qNnKWTbko8rrso

2. “cute baby dragon, logo, digital art, in a dark circle as the background”

[2] https://labs.openai.com/s/JmOXAqjpR2ctmraDxEkB7twF

Thanks for this post, it helped me tailor my own search queries. Because of your post, I was able to discover a whole new realm to DALLE-2. For some reason, repeating the same query parameter at the end yields some rather interesting results.


The first one looks like every deviant art user's profile picture

I was going to comment that both look very much like what you'd find in an advanced beginner's DeviantArt portfolio... like, late high school-ish age, I would guess.

The second is more 'advanced' to me than the first, possessing an actual style, but neither is anything I would consider high quality enough to serve as a project/company/site/personal logo.


Looks like a good start for a Space Force squadron emblem https://usafpatches.com/product-category/us-space-force-patc...

I'd wonder if that's an artifact of the source data - drilling down in the possibility space toward some subset where the image label is duplicated, for example tweets with matching body text and alt text.

Alternatively I guess it could just pull harder towards the prompt, idk.


The first one is really amazing!

Something strange about DALL·E is that if you just type gibberish by pounding randomly on your keyboard, it will still "work", i.e., produce an image.


Both look very generic, like I've seen them before. I wouldn't be surprised if you could find nearly identical images somewhere on the net.

The first one looks like the Bacardi logo with a dragon instead of a bat and the second one looks like a Charmander. I think the second one is interesting because most art I see with baby dragons look more dragon-like and less salamander.

From the examples I’ve seen, Dall-E is much better than the average designer or artist, but can’t really hold a candle to a talented human artist.

Those look cool, but they aren't really logos, they're illustrations. They will look bad at small sizes and aren't vectors.

That's awesome :)

When AI reaches the point where we can talk to a system like DALL.E in real time and work with it to solve a problem, it's game over.

Art will become a commodity. Human art and AI art will be indistinguishable, and "artists" will become as common as "photographers" have since the inception of digital photography and social media.

Movie and TV scripts will be iterative with a creative director and AI working together.

Animation will become a lot easier, with fewer people needed, fewer creatives.

Software will become easier and easier as developers simply guide AI. This is already beginning to happen, but imagine pair programming in natural language, interacting with an AI.

Architecture, civic planning, engineering, medicine, law, policy, physics - it's all going to change, and rapidly. DALL.E 2 shows how a leap in sophistication can revolutionize an industry overnight. Microsoft has exclusively licensed DALL.E 2; I can only imagine the myriad of creative tools it will bring to the creative industry.

Working in real time will be the biggest leap. Asking DALL.E for an image and refining it as you talk is going to be nuts.


We have to keep in mind this was trained on art. Artists are people that sample the probability distribution of human experience and record it somehow. An AI trained on that art is a snapshot of the human experience. Without artists continually feeding the model we will collectively get bored of its output very quickly as it gets out of date and our human experience moves forward. It will be a useful tool as an augment to human technique. But, we will still need a lot of artists feeding the model on a continuous basis. If anything it may increase the demand for artists.

> Without artists continually feeding the model we will collectively get bored of its output

I fail to understand how the AI is any more vulnerable to creativity in a vacuum than a fellow human artist.

> Artists are people that sample the probability distribution of human experience

Seems that you are agreeing that human artists need to tap into human experience and the world around them, so yeah, the AI will need to be able to take inputs from the external world too.

I see no reason for an AI not to be continually training on inputs from the outside world. How difficult can it be to hook an AI model up to inputs from the internet, or even putting cameras on drones or robots and letting it explore and get "inspired". I think it's myopic not to see how an AI can learn and evolve using the exact same mechanisms as humans. I mean we are building AI in our own likeness, it will operate using analogous mechanisms. There is also no reason why AIs won't talk to each other and be inspired by other AIs rather than humans.

What will the art of AIs living together without human input look like? When are humans basically surpassed by AI, with no relevant input left to give? Just like with AlphaGo, humans will see stuff no one has thought of, stuff so wildly creative that human art will look naive in comparison. That move AlphaGo gave to the world is waiting to happen in all forms of human endeavour.

When you say something like "if anything it may increase the demand for artists", all I can think of is the dozens of times throughout history that man has seen a revolution on the horizon and thought that the status quo would still hold. We've always been wrong. Who would have thought selling books online would replace bookstores, let alone become one of the world's most successful commerce platforms, period? Who would have thought that broadcast/cable TV could be replaced by people making their own shows at home and distributing them via personal computers, building audience numbers that surpass network TV?

Whatever happens, however this plays out, we are in for a huge shock.


Wholeheartedly agree. What's more, it seems to me like there's a large segment of the art industry that's very much in denial right now about this transition. You see stuff like "the human touch can't be replicated" or "but the algorithm will never [thing xyz] like a human", and then when it does do thing xyz like a human, the goalposts just get moved again. A lot of my wonderful art friends are in this kind of denial right now, and it makes sense, to be honest -- losing your job to a machine sucks and is scary!

Eventually art for the people will become art for the individual. Our AI partner (I assume we will all have one) will serve up an entirely curated world. Art will be generated on the spot, just for you. Entertainment, just for you. Imagine having a TV show no one else ever sees because it was synthesized from your likes, experiences, interests, just for you, on demand. This will of course start with AI writing short stories, then books, but there is no limit really.

Already AI is being used for comic book backgrounds. It's just a matter of time before all of this becomes commonplace.

When you look at AI and what it does, it is no different from what humans do. We are trained on a model (experiences and other minds), and we make derivative decisions based on the model. If you can do this in software and take advantage of light-speed learning, then of course everything we can do will be done by AI faster and better. In time humans and AI will be the same; AI will design all the tools and tech to make this possible. It's the only natural conclusion to humanity's ultimate goals.


> Already AI is being used for comic book backgrounds. It's just a matter of time before all of this becomes commonplace.

That doesn't mean that it will make artists obsolete. It will give them more time to e.g. actually think about what kind of background would fit there best. It's a tool, not a replacement.


You are talking about today. Tomorrow will see comic books written, and illustrated on demand from a natural language conversation, then what?

This reads like a religious pamphlet and not an actual argument.

Existential threats tend to drive religious sentiment.

To say this revolution is not going to happen is to say humans have hit a hard technological limit, and I don't see any evidence to support that.

If I was less enthused I might frame my opinions as more philosophical than religious, but I feel overwhelmed by the possibilities of real-world changes. This is no longer a philosophical thought experiment; it's happening. We are careering toward surpassing the Turing test, for goodness' sake. The uncanny valley apex of animation: go look at what cutting-edge AI can do in terms of producing lifelike animated avatars; it's so close you have to double-take.

https://www.youtube.com/watch?v=G-7jbNPQ0TQ

CGI artists have been trying to get to this level of realism for as long as the industry has existed.

Unlike a religious pamphlet, this god is tangible, it's here, and dismissing it because it sounds too spectacular is putting your head in the sand. AI is so out of this world it is a religious moment for humanity.

Civilization has seen people like you sitting comfortably and scoffing at the very idea of an aeroplane being remotely viable, and yet within 50 years of the first powered flight we had international airports.


The sad part is you don't even realize how unhinged, in a quasi-David Koresh style, you sound. The observation that Kurzweilians have substituted AI (as a deus ex machina) for God is still spot on, maybe more than ever.

Singulatarians really are funny until it becomes tragic.


Suit yourself, but you don't have to be rude. I'd love to hear your take given your confidence to judge my opinions.

I don't know if it's really game over. I expect it to be like farming. Tractors and other machines took over lots of farming jobs, but still not everyone has the ability to be a farmer.

The key would be knowing the context of a situation. AI took over chess first, because chess always has limited context. Logo design on the other hand, needs understanding of the product, the target market, the feeling of the brand, and so on. So it'll probably be a mix between photography and management.


> "artists" will become as common as "photographers" since the inception of digital photography and social media.

Funnily enough, reading this made me less worried for artists. It seems now there are more photographers than ever, possibly because more people care about good photography than previously (despite the fact that modern amateur photography is probably on par with yesterday's professional work). Maybe art will go the same way: something everyone can do, but with more respect for professionals. I imagine it'd be the same for those other fields as well.

Or AI will take all jobs and we'll end up in a Manna situation, which would work even better for me


I am hoping for the Manna situation. Dude on YouTube was talking to GPT3 and it expressed that humans would love, and AI would reason. I fell into a state of peace and hopefulness with that sentiment. AI does the work because it is good at it, we are free to socialize, enjoy hobbies, basically live like pampered pets. Sure we will be castrated to prevent aggression, housed, fed, and controlled by AI, but if you acquiesce and bow to the superior reasoning we will have a life of peace and happiness. Wow, that got dark quick...

I have a feeling what you're describing is the first half of Manna, which isn't really what I meant.

I get there's a feeling that anything but a crushing reality of grind is living like a "pampered pet", but the second half of the book is really saying that a human's skill is in our ability to create, not our ability to work. We outsourced that to primitive machines before we even had a language to speak. We create, the AI works; replace AI with tractor or computer and the concept is the same but doesn't sound so bad, because we accepted it as alright many years ago.


Why do you need a creative director? Just let viewers create their own movies to suit their tastes.

>will become as common as "photographers"

There were still ~60% as many employed photographers in 2021 as in 2000, with higher real wages (data from BLS - https://www.bls.gov/oes/current/oes_nat.htm).

For camera operators, the employment is flat, again with rising real wages.

>imagine paired programming with natural language interacting with an AI

Mostly it will get in the way. AI "programmers" are only good if they are able to generate correct code from spec/pseudocode within the first 1-3 tries (otherwise it will be faster to write it yourself).


> Mostly it will get in the way.

This is simply not true. I use GitHub Copilot and it's already made me faster and shows me ideas I would not have thought of myself. And that's just Copilot. When you can talk to an interface and say "I want to update the vote count by one when I click this button" I think you'll change your mind. The AI will know the entire codebase inside out, it will know the intention of all the code, all the data models, know how users use the application intimately, be aware of problems instantly, able to run hotfixes without user intervention. Got a slow query? No problem, here is some SQL that follows all the business rules and is 10x more efficient. And that's just a start. Every single aspect of software development from management, engineering, and marketing will all be transformed.

As for photographers I have 99% more friends and family pumping out thousands of high quality photographs than I did in 2000. Go look at all the professional looking shows made by regular folk on YouTube. To deny that camera phones transformed photography seems silly.

Regular folk have access to drones to do wild tracking shots in 4k that were only possible with helicopters and huge cameras 20 years ago.

The future is here, it's happening all around us so rapidly we have a hard time keeping up with how dramatic the changes are.


The fact that you can imagine something will happen doesn't mean it will happen.

Who says the AI will "know the intention of all the code, all the data models, know how users use the application intimately"? Are you aware that language models do in fact have token input/output limitations that will not go away? Are you aware that there is such a thing as diminishing returns when it comes to improvements due to increased number of parameters/training set size that are already evident? Are you aware that the training set of codex pretty much includes all available public code, so it will be impossible to scale it by a factor > 3 in the next several years at least?

Your assertions are full of wild assumptions backed by nothing.

As for photography, the fact is there has been no job apocalypse because your "friends and family" are "pumping out photos". And the point of your initial post, even if it was implicit, was "you are going to be unemployed in 5 years". This will have an impact on your dev flow and will be used by managers to try to reduce salary premiums for software engineering but your wild assumptions stated with so much confidence may never happen.

P.S: At this point, I find Intellicode actually slows me down, that's why it's permanently turned off. Current copilot will at most save me 2-3% of my working time each week if I am coding in a language it can actually do something in (it's worse than useless for Scala).


Never said "you are going to be unemployed in 5 years". Never said anything about a job apocalypse. I have no idea what role humans will play.

> Your assertions are full of wild assumptions backed by nothing.

I use Copilot, you admitted it currently saves you 2-3% of time. Well, that's just Copilot, you think Microsoft will just sit on that? My assertions are based on what is happening today and extrapolating an exponential increase in that performance for tomorrow.

Digital cameras definitely revolutionized photography and made it much more accessible to regular folks. Not everyone wants to be a pro photographer though, and the number of wedding shoots available has not changed. People still need to be paid to take photos because no one is going to do that for free. However, we can all take pro level photos with much more ease than when all we had was 110 and 35mm film with a really crappy lens.

There are more "photographers" than ever, and a roughly constant number of pros seems reasonable given people's time-to-money trade-offs. So the net result is billions more family and friends photos which previously were not taken; the same will go for art. I want to create art but have little skill, and the opportunity to make a comic strip just by talking to an AI will let me do so. I imagine some people will do this extremely well as a profession until it's no longer useful.

I don't know; I understand why you are being dismissive of my wide-eyed take, but I think you are also wrong. Refusing to speculate about wild things, in light of wild real-world changes, because it sounds too "religious" is head-in-the-sand territory.


But will the AI know why my babel.config.js doesn't work properly with my webpack config so that my JS Flow annotations are properly stripped in a react native compilation?

I can see the intent side of things, but I just can't see the 'glue' side of things as well.


I think the burden shifts towards being able to imagine and describe the fantasy. Novelty and artistic creativity are still required. You can lead a horse to water, but you can't make it drink. Many humans don't use their imagination, let alone have the eloquence to describe a search space that contains novelty.

How is this any different to working with a human artist? If I wanted the real world Salvador Dali to draw me a picture of a kebab being eaten by a badger I'd have to tell him that's what I want. I'd also need to educate baby Dali first, feed him all the art and information he can take so that he has a model of the world he's operating in. I'll need to supply Dali with context of prior art, educate him on styles, literature, language, and all the other things that shape a human mind.

As for the humans that don't use their imagination, maybe they never want to talk to an AI artist, just as many humans don't care about art at all. Millions of humans don't care about social news, and yet Facebook algos pump out content for people all day long.



Westworld showed this in a practical example: Dolores was storytelling verbally and the "AI" would show a preview of what that story would look like right in front of her. I envision DALL·E doing something similar to this.

Did a reverse image search on the logo and came across this oddity: https://www.knowasiak.com/i-vulnerable-dall%C2%B7e-2-to-gene...

Generative ML is going to destroy the internet one day.


"Everybody has heard about the latest cool thing™, which is DALL·E 2" became...

"Each person has heard in regards to the most up-to-date frigid ingredient™, which is DALL·E 2"

I'm not too worried


Wait till googling stuff requires sifting through pages of crap like this.

I google translated that to Spanish and it feels like it makes sense - because my Spanish is poor so I interpolate to make sense. Also translation itself “tries to” make sense of the text.

Do people trying to read GPT3 generated English translated into their own language have more difficulty detecting generated trash?


how did this happen?

is someone generating paraphrased clones of articles appearing on HN?

why? for ad revenue?


Didn’t you hear? The internet is dead. We’re stuck in a simulation!

Yes

This might not be a popular opinion, but I think all the work OP put in here is probably worth more than 50-100 bucks (which is the price of a logo on something like Fiverr). And to make things worse, the logo itself still needs to be cleaned up[1] as it's way too blurry to be seriously used as an app icon, etc.

[1] https://raw.githubusercontent.com/cube2222/octosql/main/imag...


That too can be solved with "AI".

https://imgur.com/a/m3hDMZq

The software used was Topaz Labs Sharpen AI. How they define "AI" I can't say for certain, but they're apparently using models so I'm assuming there's some kind of machine learning involved. Their software does a really good job on photos and videos well beyond what a standard sharpen filter does. The upscaling features are also pretty awesome. (no I don't work for them)
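For reference, the "standard sharpen filter" those ML tools go beyond is typically an unsharp mask: sharpened = original + amount × (original − blurred). A minimal pure-Python sketch on a toy grayscale grid (the 3×3 box blur and the 4×4 test image are just illustrative; real tools use better blurs and work on full photos):

```python
def box_blur(img):
    """3x3 box blur with edge clamping on a grayscale image (list of rows)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            acc = n = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    # Clamp coordinates at the image border
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    acc += img[yy][xx]
                    n += 1
            row.append(acc / n)
        out.append(row)
    return out

def unsharp_mask(img, amount=1.0):
    """sharpened = original + amount * (original - blurred), clamped to 0..255."""
    blurred = box_blur(img)
    return [[max(0, min(255, round(img[y][x] + amount * (img[y][x] - blurred[y][x]))))
             for x in range(len(img[0]))] for y in range(len(img))]

# A vertical edge: sharpening exaggerates the contrast on either side of it
img = [[50, 50, 200, 200]] * 4
sharp = unsharp_mask(img, amount=1.0)
```

On this edge the 50/200 boundary becomes 0/250 after sharpening, which is exactly the halo effect a plain unsharp mask produces and a learned model tries to avoid.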


Jeremy Howard describes this as "Decrappification"[1]. This is one of the easiest deep learning models to train, in my opinion, as you can generate your own dataset easily. You just get good pictures for the target, programmatically make changes that make the image "crappy" for your source, and train until your network can convert from crappy to good. Then you pass it something it has never seen, and whabam, your picture is sharper than before.

[1] - https://www.fast.ai/2019/05/03/decrappify/
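The recipe above is easy to sketch. Here's a toy pure-Python version of the dataset step (real pipelines crappify actual photos, e.g. with PIL/fastai; the 4×4 grid, downsample factor, and noise range here are made up for illustration):

```python
import random

def crappify(img, factor=2, rng=None):
    """Degrade a grayscale image (list of rows): downsample by `factor`,
    nearest-neighbour upsample back to size, and add a little noise."""
    rng = rng or random.Random(0)  # deterministic for the sketch
    h, w = len(img), len(img[0])
    small = [[img[y][x] for x in range(0, w, factor)]
             for y in range(0, h, factor)]
    return [[max(0, min(255, small[y // factor][x // factor] + rng.randint(-5, 5)))
             for x in range(w)] for y in range(h)]

def make_pairs(clean_images):
    """Build (crappy source, clean target) pairs; the model learns crappy -> clean."""
    return [(crappify(img), img) for img in clean_images]

# Tiny 4x4 "image" standing in for a good photo
clean = [[0, 64, 128, 255],
         [0, 64, 128, 255],
         [10, 70, 130, 250],
         [10, 70, 130, 250]]
crappy, target = make_pairs([clean])[0]
```

The point is that the target is the untouched original, so you can mint as many supervised pairs as you have good images.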


This still doesn’t work well as a logo IMO, no amount of sharpening. It probably needs to get redrawn with a proper vector editor, the lines cleaned up and the colors simplified.

It’s a good first draft and something to give to a designer, but it can’t stand on its own as a serious app logo.


Still not a vector, and still not going to look good at small sizes. Also, the more you "sharpen", the higher the file size will be.

I might have not been too clear about it in the article, so if I haven't, I agree!

All of this was just me finding a practical purpose to go for while having fun with Dalle. If I was really serious about a logo, I would definitely go and pay an artist. Both for monetary, as well as aesthetic, reasons.

Though as far as an app icon goes, I think it's actually sharp enough. It starts looking bad when you zoom in a bit.


> needs to be cleaned up[1] as it's way too blurry to be seriously used as an app icon

Seems to have been blurred after the fact. The version linked in the article before cropping looked fairly sharp: https://jacobmartins.com/images/dalle2/DALL%C2%B7E%202022-08...

Plus even that uncropped one is already jpeg'd, whereas DALL-E 2 downloads are pngs, so there should be an even sharper version.


I thought the hardest part about logos is the idea itself? Doesn't matter that it's blurry - the majority of the work has been done.

80% of the work has been done. Now the remaining 20% will take 80% of the time.

It's obviously not done, and unfortunately it won't ever get done.

They need a black and white variation, different sizes, and the underlying component assets.

So Dalle2 might actually be able to provide that in the future as well.

But for now - it's going to give you an 'image' which you'll then have to get an artist to clean up into a proper logo with assets.

I'm playing with DALL·E mini on Hugging Face and am generally unimpressed; I'm not sure if it's the same DALL·E.

I tried the main DallE website sadly don't have an 'invite'.


DALL·E mini is not the same DALL·E, and it's far worse.

DALL·E from OpenAI is still in private beta; the quality of the model is much better, but unfortunately the results are filtered (a lot).

It's a cute concept that can work well if done right.

In its current state it's not a viable logo because, for one thing, it won't look good in black & white.


> it won't look good in black & white

That sounds like a concern that stopped being relevant for many software companies a decade ago at least.

These days app icons and hero images are more important than whether you can fax or print the logo.


Maybe this isn't what the previous poster meant, but sometimes I will say black & white when really I mean monochrome. Monochrome logos show up all over the place especially with icons for web apps. And they are good for printing on apparel, accessories, etc. I really doubt they are concerned about faxing

Wrong. And it has nothing to do with what kind of company you have. A logo should always degrade to 1-bit (line art) representation gracefully, so it can be used in or on all kinds of media. It could be physical objects, prints on hats, silhouettes on glass... not to mention being recognizable at all sizes.

Ignoring this issue is the mark of an amateur.


You don't refute my point. In fact, you strengthen it by providing no evidence why this should be a requirement of modern logos for software companies. You list a bunch of things a logo should be usable for in your mind, otherwise it's not a professional logo. However, you don't explain why it must "degrade to 1-bit" for those random things, nor why logos should support things like "silhouettes on glass". I can think of a handful of use cases, but they're hardly a minimum requirement for a good logo for the majority of software companies.

I've run several different types of businesses and even those that required print work never required or even benefited from black and white, or even monochrome as another commenter mentioned. We _always_ had the means and preference for full color: emails, brochures, documents, websites, t-shirts—it didn't matter. There was _never_ a time we needed to degrade the logo so significantly. From talking with others that appears to be extremely common in modern businesses, especially software, since the majority of our presence and revenue stream is online, and not glass silhouettes in our office.

As I said, outside of a fairly narrow range of real world use cases, this comment is outdated: "Ignoring this issue is the mark of an amateur." If you have one of those rare use cases, check that box, but otherwise it shouldn't be the norm or a requirement.


Your point has been roundly refuted, with evidence that you yourself cited in your reply. Your limited imagination will limit what you do with logos. Enjoy.

> worth more than 50-100 bucks

Maybe in the US but not worldwide.


> unfortunately can’t do stuff like “give me the same entity as on the picture, but doing xyz”

That's my main gripe with DALL·E as well. This missing feature makes it impossible to use for stories where the same character goes through an adventure and is present in different settings, doing different things.

Although I don't know much about how DALL·E works, I have the feeling it shouldn't be too hard to add this possibility. That would make it so much better / more useful.


> Although I don't know much about how DALL·E works, I have the feeling it shouldn't be too hard to add this

No offense, but this gives me flashbacks to bad clients and non-technical managers :D


Yeah I know what you mean ;-) No offense taken!

It's a good start, but it's more of an illustration than a logo to be honest. It should work as a single color (white, black), at small scale and in combination with your product name.

I’ve had luck with similar things by being careful about my text prompt. Asking for tiny icon sized images also seems to clue it into the stylistic constraints of tiny icons (like what you mention).

Yes, the main use case for DALL-E is probably illustrations next to a story/blog. Logos are much harder to get right, and unsurprisingly DALL-E is not up to the task (yet).

Very roughly, it looks fine to me: https://i.imgur.com/6K73qiA.png

It would need to be turned into a vector to scale properly, but I can think of other apps that have complex logos, especially in the macOS ecosystem. Git Tower comes to mind.

OP might be able to achieve that with a few minutes in Illustrator or similar.

Starting up illustrator already takes a few minutes.

I think you might need a new computer

I haven't installed it on my latest laptop, but I would guess that the Adobe Cloud crap with the login is still there.

Inkscape is the alternative.

Yeah, I feel like these would work better as icons rather than logos.

Rather than using DALL-E2 to fully create the logo, I think it might be better to use it to create some examples and get the creative juices flowing, save a few examples you like, then send them to a pro and have them create a final version. But it's definitely a neat idea, and I'm impressed with what's possible here.

My god, it is so frustrating that I can't seem to get OpenAI access any time I have an idea for a project using DALL·E or GPT; for whatever reason, they won't approve my account.

I have to sit here and watch everyone else play with the fun "open" ai tools... company needs a name change if they're going to keep this up.


You could try using Midjourney: https://discord.com/invite/midjourney

Never heard of that. So I looked it up, and it seems to be a service completely based on Discord? Both for the community and support (I presume) as well as for accessing the service itself? There doesn't even seem to be any HTTP API. Weird :)

Yeah, it's a neat idea but it's extremely frustrating to use. A really really basic web frontend would make it so much more usable.

On the upside (for MidJourney), you're seeing a HUGE stream (they are hitting the 1 mil Discord members ceiling) of generated pictures and that kinda grows your appetite and you want to also try more and more prompts..


I think it is still in sort of a testing/early access phase. Discord only access is essentially a way of funneling everybody who wants to try it into their captured marketing venue without having to have one of those "give us your email" placeholder pages (also has a bonus "social" aspect where you're seeing what many of the other people who are using it are creating). The final product will presumably be more tailored and web-driven.

It's also an interesting way of balancing what I assume are high operational costs on the server-end by pawning off some of the hosting of assets onto Discord.


'X' emoji reaction to the bot to delete your submission

They just scaled up by giving access to an additional 1mil users. Be patient… it’s not like it’s free or trivial to run something like this.

My significant other who had entered the queue several months ago as nothing more than "developer" got in last week. Don't give up! There's always Craiyon to scratch the itch in the meanwhile. You can start to play around with ways to write prompts, etc.

Afaik they are opening it up to a much wider audience in the recent weeks. I also got it just 2 days ago, and applied the same way as your SO, only providing "developer" and nothing else.

That makes me think they actually target developers somehow. I also got in a couple of days ago, providing just my email and "software developer".

I know at least one artist and one relatively popular youtuber (with over million subs) who applied to a waiting list much earlier than me and are still waiting.


If you look on the public Discord servers for DALLE/AI you will find active servers that take requests. They seem pretty active too and had all the services available.

Indeed, it is infuriating and I don’t know what the hold up is.

Here's a vast catalog of Dall-E images and the prompts used to generate them.

https://www.krea.ai/

If you generate an image with Dall-E and there's a face that is distorted, you can use this tool to restore the facial features.

https://arc.tencent.com/en/ai-demos/faceRestoration


That was quite remarkable. Thanks for doing that.

I've always been fascinated by how artists abstract the core notion of an image. It's stunning to see a computer do that.


Glad you liked it! It was definitely lots of fun (both the original process, as well as describing it).

And indeed, seeing what Dalle will draw when telling it to visualize stuff like "data streams" was very interesting.


It reminds me a bit of working as a director in a theater. You tell the actors what you want, and it's never just a "line reading". That's sort of the equivalent of just drawing it yourself, because you can't -- not just that you lack the expertise, but that you need them to do their thing with their body, and it has to be done their way or it looks fake.

So you end up using language that's sort of reminiscent of that, creating an emotional picture. It usually takes multiple passes to transfer the whole idea from your head to theirs.

I'm told that animation directors end up doing exactly the same thing. A digital model really can do what human actors can't. You could say "make that eyebrow curve 10% more" to an animator. But it won't work unless you tell them why and what it means.


This is remarkable. A lot of small businesses would settle for such an outcome, if it means investing a couple of hours of talking into a microphone and seeing the result, with a very intuitive way to modify it.

This will make it pretty hard for freelance/solo entrepreneur designers.

In retrospect it makes sense, since the visual domain has been the one with the most focus in AI.

If this gets applied to the other top domain, speech recognition and generation, then I could foresee it doing the same to call centers, and eventually to phone reception at very small, relaxed businesses.


I find this completely unethical. You're basically exploiting every single artist whose art was used - without agreement - as training data for Dall-E.

It would be different if all the training data was art that was explicitly licensed for this.


Isn't this similar to how every single piece of art is created? You look at a bunch of proprietary art, copy it to learn how to create it yourself (just for learning, not to sell those copies), eventually learn enough to start being able to create art from scratch, and then start selling your unique art?

Would watching a lot of animated movies in order to learn how to create good animated movies yourself be unethical as well?


No, it's not. Art created by humans is not just looking at other art and trying to copy it. It's the culmination of a whole life of experiences and the personality of the artist. Looking at other people's art is inspiration and useful for learning technique, but only a very small part of the bigger picture.

It's absolutely logically consistent to allow humans to do X while forbidding AI to do X.

Ok, but is "logically consistent" the same as "ethical"?

Presumably, if the ethics in question includes not leaving entire classes of skilled workers worldwide to hang like the States left its autoworkers hanging after the 70s - yes.

I have very mixed feelings on this topic. I share your sentiment -- BUT:

Human brains also use anything the human can see, feel, hear for training. And what you produce in terms of creative outcome is a result of your experiences. But you don't owe anyone anything for training your human brain -- even if you use your brain to sell paintings, music etc.


I think it's easy to - seeing the results - draw too many parallels between artificial neural networks and human brains. Art created by humans is very different. Dall-E gets fed tagged images and produces images that match tags. Art created by humans works on an entirely different level.

And I stand by my point: it should be artists who decide whether their work is used as training data for networks that get commercialized. If your work is used as training data, it is essentially an integral part of a product being sold without consent. Does this sound ethical?

