Is the latest Midjourney now considered better than Stable Diffusion? If so, will Stable Diffusion catch up? I strongly prefer Stable Diffusion's open source nature and ability to run locally.
SD can be better than MJ if you dig really deep into the open source nature of SD. Open source UIs and plugins, community-made models of many types and uses, tribal knowledge of techniques... It's very complicated and requires a lot of searching, reading, installing and experimentation.
The base tech and models straight from Stability AI give pretty crap results if you just plainly describe a scene.
MJ, in contrast, provides great results out of the box. Say anything, get a beautiful picture of something. From there you need to figure out specifically what you actually want.
However, if you really want to iterate on just specific details of a scene, with a specific layout, with specific characters in specific poses, with specific style elements, MJ is too chaotic to control at that fine a detail. So too is SD out of the box. But if you take the time to learn how to install and use ControlNet, highly specialized models/LoRAs/textual inversions from wildly varying sources, in-painting, latent upscaling, hook up Photoshop/Krita/Blender integrations, etc., etc... you can eventually get very precise control of SD's results. And then new, better tech releases next week! :D
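To give a flavor of that stack outside a UI, here's a minimal sketch using Hugging Face's diffusers library (Automatic1111 and ComfyUI wrap the same pieces); the LoRA filename, input files and prompt are placeholders:

    import torch
    from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
    from diffusers.utils import load_image

    # An OpenPose ControlNet steers composition from a stick-figure pose map.
    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-openpose", torch_dtype=torch.float16)
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
        torch_dtype=torch.float16).to("cuda")

    # Hypothetical community LoRA layered on top for style.
    pipe.load_lora_weights("my_style_lora.safetensors")

    pose = load_image("pose.png")  # placeholder pose map
    image = pipe("a knight resting in a mossy forest, oil painting",
                 image=pose, num_inference_steps=30).images[0]
    image.save("knight.png")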
> Is the latest Midjourney now considered better than Stable Diffusion?
It's better in terms of having a no-user-configuration service available that gets you from zero to decent results with nothing more than prompting.
SD is better in available specialized, customized models (finetuned checkpoints and the various kinds of mix-and-match auxiliary models and embeddings that can be used with them); in not having banned topics, because it is self-hostable; and in available tooling and UIs that expose tuning parameters and incorporate support for techniques like guided generation with the various types of ControlNet models, animation, inpainting, outpainting, prompting-by-region, etc.
Midjourney provides much better images by default. It's really impressive.
Stable Diffusion's advantage is in the huge amount of open source activity around it. Most recently that resulted in ControlNet, which is far more powerful than anything Midjourney can currently do - if you know how to use it.
Look around a bit for info on ControlNet. You can use depth maps, scribble in where you want objects to be, or place human poses in a scene and SD will use them to generate an image. You can combine multiple ControlNet models and control how much each contributes to the scene. The level of control available is pretty awesome. I say that as someone who was in the DALL-E beta and used Midjourney for a few months (though I guess I don't know what advancements they've made in the last few months).
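For the combining part specifically, diffusers accepts a list of ControlNets plus per-model weights; a rough sketch (the depth and pose maps are placeholders):

    import torch
    from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
    from diffusers.utils import load_image

    controlnets = [
        ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-depth",
                                        torch_dtype=torch.float16),
        ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-openpose",
                                        torch_dtype=torch.float16),
    ]
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnets,
        torch_dtype=torch.float16).to("cuda")

    image = pipe(
        "two dancers on a stage",
        image=[load_image("depth.png"), load_image("pose.png")],
        controlnet_conditioning_scale=[0.6, 1.0],  # per-net contribution
    ).images[0]
    image.save("dancers.png")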
Both StabilityAI and the open source community are working on improvements to Stable Diffusion.
Keep in mind StabilityAI is also pursuing LLMs and a host of other model types, whereas text-to-image is Midjourney's single core competency and value prop. Midjourney is very focused on staying ahead.
edit: I wanted to add that the extensive training costs can be prohibitive for the OSS community to fully participate. Coordination via groups such as LAION can help, but gone are the days of individual OSS participants contributing directly to core foundational model training.
In fact here's a list of painters whose style is immune to direct mimicry in Midjourney because their name is banned:
Ambreen Butt
Jan Cox
Constance Gordon-Cumming
Dai Xi
Jessie Alexandra Dick
Dong Qichang
Dong Yuan
Willy Finch
Spencer Gore
Ernő Grünbaum
Guo Xi
Elena Guro
Adolf Hitler
Prince Hoare
William Hoare
Fanny McIan
Willy Bo Richardson
Shang Xi
Wang Duo
Wang E
Wang Fu
Wang Guxiang
Wang Hui
Wang Jian
Wang Lü
Wang Meng
Wang Mian
Wang Shimin
Wang Shishen
Victor Wang
Wang Wei
Wang Wu
Wang Ximeng
Wang Yi
Wang Yuan
Wang Yuanqi
Wang Zhenpeng
Wang Zhongyu
Xi Gang
Xie Shichen
Xu Xi
I have this list because I recently made a site[1] that displays the 4 images from a prompt of "Lotus, in the style of <paintername> <birth-death dates> [nation of origin]" for every painter listed on Wikipedia's "List of painters" -- except for those in the above list.
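The prompt side of that is just a template looped over the painter list; a toy sketch (not the site's actual code, and the painter tuples are stand-ins for the scraped Wikipedia data):

    # Stand-in data; the real list came from Wikipedia's "List of painters".
    painters = [
        ("Claude Monet", "1840-1926", "France"),
        ("Wang Wei", "699-759", "China"),
    ]
    banned = {"Wang Wei", "Dong Yuan"}  # names Midjourney rejects outright

    prompts = [
        f"Lotus, in the style of {name} {dates} [{nation}]"
        for name, dates, nation in painters
        if name not in banned
    ]
    print(prompts)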
The fact that they banned both Xi and Jinping separately to prevent Xi Jinping was surprising to me. Twice as banned as Adolf Hitler.
[1] https://lotuslotuslotus.com - small chance you get an NSFW image if you hit upon Fernando Botero or John Armstrong, perhaps there's more.
So if you're an artist and don't want your style to be used in an AI product, all you have to do is change your name to include a variation of a controversial leader's name (Xi, Adolf) or a slightly offensive name (Dick, Gore)?
> So if you're an artist and don't want your style to be used in an AI product, all you have to do is change your name to include a variation of a controversial leader's name (Xi, Adolf) or a slightly offensive name (Dick, Gore)?
Nope, that doesn't stop models from being trained on your art. It makes it somewhat more difficult for people to prompt specifically for your style, but your art still influences output, and there may be other ways (e.g., titles of specific works) to deliberately and specifically evoke it in particular.
ChatGPT made me a Python script for automating pasting results, as well as most other tasks related to this project.
I already had a Discord bot I'd written by hand before, for downloading the images.
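For the curious, the download half of that is only a few lines with discord.py (a sketch, not the actual bot; the save directory and file filter are placeholders):

    import os
    import discord

    intents = discord.Intents.default()
    intents.message_content = True
    client = discord.Client(intents=intents)

    @client.event
    async def on_message(message: discord.Message):
        # Midjourney posts finished grids as message attachments.
        os.makedirs("downloads", exist_ok=True)
        for attachment in message.attachments:
            if attachment.filename.lower().endswith((".png", ".jpg", ".webp")):
                await attachment.save(os.path.join("downloads", attachment.filename))

    client.run(os.environ["DISCORD_TOKEN"])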
I thought of the project in the morning while my kid was getting ready for school and had it running jobs before we were out the door, worked a little before I left for work and a little more after work, and it was done before dinner time.
I listened in on their weekly chat today and it sounded like they'd be happy if they could just ban all political mimicry. I think the AI images of Trump being arrested (which looked like Midjourney output to me) were a disappointment to them.
Maybe that just means banning images that are meant to fool people, and obvious satire would be ok, but they might be ok erring on the side of caution.
(Nothing in the talk was this explicit, but this was my read of the subtext)
It's been noticeably better than Stable Diffusion since v3 at least (I wasn't paying attention before that). It's on v5 now, and I think MJ has continued to get better faster than SD through this time period.
ControlNet for Stable Diffusion may be an exception to this.
There are lots of versions of Stable Diffusion, so I've had a hard time knowing exactly what to compare. But from what I've seen, none of them come close to Midjourney.
Stable Diffusion does more things, though, like in-painting, where you can erase part of an image and then have it recreated. I've seen videos of people doing impressive things with in-painting, extensively regenerating each portion of an image until it's just right. Seems like a ton of work, though. Still, I've had some fun using it to modify or extend images.
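For reference, that erase-and-recreate loop looks roughly like this in diffusers (one of several front ends; filenames are placeholders, and white pixels in the mask mark the region to regenerate):

    import torch
    from PIL import Image
    from diffusers import StableDiffusionInpaintPipeline

    pipe = StableDiffusionInpaintPipeline.from_pretrained(
        "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
    ).to("cuda")

    init = Image.open("photo.png").convert("RGB").resize((512, 512))
    mask = Image.open("mask.png").convert("RGB").resize((512, 512))

    # Only the masked region is regenerated; iterate with new masks/prompts
    # until each portion of the image is just right.
    out = pipe(prompt="a wooden bench in a sunlit park",
               image=init, mask_image=mask).images[0]
    out.save("photo_fixed.png")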
Midjourney v4 was already better than Stable Diffusion. The new DALL-E (which you can use on Bing) I also find better...
The main difference with Stable Diffusion is that you can fine-tune it with your own dataset. There's img2img and a bunch of other tools. But the base model is really worse than the competitors right now.
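img2img, for instance, takes an existing image plus a prompt and a strength knob; a minimal diffusers sketch (filenames and prompt are placeholders):

    import torch
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    init = Image.open("sketch.png").convert("RGB").resize((512, 512))
    # strength controls how far the output may drift from the input image.
    out = pipe(prompt="an oil painting of a harbor at dusk",
               image=init, strength=0.6).images[0]
    out.save("harbor.png")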
Also, SD can do porn, which Midjourney forbids for some reason. They're leaving an assload of money on the table and somebody will nab it sooner or later.
The DALL-E API sucks so much right now. I've been experimenting with it the past few days and it produces a lot of horrors. I even used the DALL-E prompt book as a guide, but still so many more misses than hits. Even when it gives a non-horrifying image, it's just decent. 5/10 rating
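(For reference, DALL-E API calls at the time looked roughly like this with the openai Python package's 0.x-era API; the prompt is illustrative:

    import openai

    openai.api_key = "sk-..."  # your API key

    resp = openai.Image.create(
        prompt="a watercolor of a lighthouse at dawn, soft light",
        n=4,                 # number of candidates, like MJ's 4-image grid
        size="1024x1024",
    )
    for i, item in enumerate(resp["data"]):
        print(i, item["url"])

)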
I started testing out the official Stable Diffusion API and it already gives you way more control than the DALL-E API, and it seems to produce less horrifying, better quality images, but I feel like DALL-E understands the prompts better. 7/10
I would love to try Midjourney, but I uninstalled Discord years ago and have no plans to ever reinstall it. So I'll wait for API access, if they ever offer it. 0/10 (only for being Discord-only)
Standard DALL-E 2 is worse than Stable Diffusion... The experimental DALL-E available on Bing is, I guess, DALL-E 3... A similar approach to what they did with GPT-4 on Bing, I guess...
From what I understand, SD doesn't handle color space correctly (or at all), hence all the weird saturated blue-magenta-orange-beige gradients in a lot of its example outputs, and why its output often feels more like a bad Photoshop collage than a proper blend. It's probably trained on unmanaged sRGB. In that case the SD model is fundamentally flawed, since doing math in gamma-encoded sRGB space is nonsense and causes bias (those saturated gradients are a sign of exactly that). Although I don't know for sure: I didn't find any color management code in their scripts when I looked for it, so I'm assuming this is the case.
I'd be happy to be corrected if anyone knows the details on this in SD.
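To illustrate the underlying point: blending gamma-encoded sRGB values as if they were linear light skews results. A tiny self-contained example, mixing black and white:

    # Naive sRGB averaging vs. converting to linear light first.
    def srgb_to_linear(c):
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    def linear_to_srgb(c):
        return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

    a, b = 0.0, 1.0  # black and white as sRGB-encoded values
    naive = (a + b) / 2  # 0.5 in sRGB: physically too dark a blend
    correct = linear_to_srgb((srgb_to_linear(a) + srgb_to_linear(b)) / 2)
    print(naive, round(correct, 3))  # 0.5 vs ~0.735

Per-channel errors like this differ between R, G and B depending on the pixel values involved, which is the kind of bias that can show up as hue shifts in blended regions.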