I've been working on a web client[1] that interacts with a neat project called Stable Horde[2] to create a distributed cluster of GPUs that run Stable Diffusion. Just added support for SD 2.0:
New 512x512 model supported with all samplers and inpainting
New 768x768 model supported with the DDIM sampler only
Not yet supported are upscaling and depth maps.
To be honest, I'm not sure the new model produces better images, but maybe they will release improved models in the future now that they have the pipeline open.
It had all the same issues as migrating 1.5 to M1s. It went fast because I upgraded my existing codebase, which already had those fixes, instead of building off the new CompVis one.
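For anyone curious what a request to the Horde looks like from the client side, here is a rough Python sketch. The endpoint paths, payload fields, and the anonymous API key are assumptions from memory, so check the Horde API docs before relying on any of it:

    import time
    import requests

    HORDE_BASE = "https://stablehorde.net/api/v2"   # assumed base path; check the Horde docs
    API_KEY = "0000000000"                          # assumed anonymous key, or your own registered key

    # Submit an async generation job targeting the SD 2.0 model (the same model name
    # ArtBot passes in its query string).
    payload = {
        "prompt": "a lighthouse at dusk, oil painting",
        "models": ["stable_diffusion_2.0"],
        "params": {"width": 512, "height": 512, "steps": 30, "sampler_name": "k_euler"},
    }
    resp = requests.post(f"{HORDE_BASE}/generate/async", json=payload, headers={"apikey": API_KEY})
    resp.raise_for_status()
    job_id = resp.json()["id"]

    # Poll until some worker on the horde picks up and finishes the job.
    while not requests.get(f"{HORDE_BASE}/generate/check/{job_id}").json().get("done"):
        time.sleep(5)

    result = requests.get(f"{HORDE_BASE}/generate/status/{job_id}").json()
    print(result["generations"][0]["img"])  # image payload (base64 or a URL, depending on options)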
Nicely done; this seems to work for me. In my own attempt, I got stock Stable Diffusion 2.0 "working" on M1 using the GPU but it's producing some of the most cursed (and low-res) images I've ever seen, so I've definitely got it wrong somewhere. The reader can infer the usual rant about dynamic typing causing runtime misconfiguration in Python.
How much of this is Stable Diffusion 2, and how much is something else? For instance, the text-based masks, the AND/OR prompt syntax, the face upscaling: are these all part of Stable Diffusion 2 (and usable via other Stable Diffusion APIs)?
There isn't a single figure. Some models require as little as 3.5 GB of free memory; others demand 8-16 GB. macOS is weird with memory management and everyone's Mac is different; I'd really only recommend running the model on 32 GB machines to avoid writing into swap, but technically it's possible on 8 and 16 GB machines.
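If you want to check how much headroom you actually have before loading a model, here's a quick Python sketch using psutil; the thresholds are just the figures quoted above, not anything official:

    import psutil

    # Rough pre-flight check before loading a model. Thresholds are the figures
    # from this thread (about 3.5 GB minimum, 8-16 GB to be comfortable).
    avail_gb = psutil.virtual_memory().available / 1024**3
    if avail_gb < 3.5:
        print(f"Only {avail_gb:.1f} GB free - likely to swap heavily or fail to load.")
    elif avail_gb < 8:
        print(f"{avail_gb:.1f} GB free - smaller models may fit, expect some swapping.")
    else:
        print(f"{avail_gb:.1f} GB free - should be comfortable.")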
2.0 is a mixed bag. It has set pixel art generation back entirely. I'm pretty sure this is down to the aesthetic filter, which has a very biased idea of what a good image is. It's silly to bake that in at the training stage; it should be something you do in the prompt.
Fine-tuning is out of reach for me, so I'm sticking with 1.5.
Wow, this looks awesome! I noticed that the sample notebook doesn’t include SD 2.0 by default, and says that it’s too big for Colab. Is that a disk size/RAM limitation?
As an aside, it would be cool if you versioned that notebook in the repo, so that it could be easily opened with Codespaces.
This would have been perfect if it worked on Windows too. I need to look into dual booting Linux (opening a can of worms) just to give it a try, as WSL doesn't seem to cut it.
AWS g5.xlarge instances. Very fast (roughly RTX 3080 speeds) and about $1 an hour. And you can just turn the instance on and off, so you pay nothing except the ongoing EBS storage cost while it's stopped.
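A minimal boto3 sketch of that on/off workflow, with a placeholder region and instance ID:

    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")   # placeholder region
    INSTANCE_ID = "i-0123456789abcdef0"                   # placeholder instance ID

    def pause():
        # Stop the instance; you keep the EBS volume (and keep paying for it),
        # but compute charges stop accruing.
        ec2.stop_instances(InstanceIds=[INSTANCE_ID])

    def resume():
        # Start it back up for the next session and wait until it's running.
        ec2.start_instances(InstanceIds=[INSTANCE_ID])
        ec2.get_waiter("instance_running").wait(InstanceIds=[INSTANCE_ID])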
Awesome library, I hadn't seen this before. I just added it to my Stable Diffusion API service so you can query Stable Diffusion 2.0 if you don't have GPUs set up currently: https://88stacks.com
[1] https://tinybots.net/artbot?model=stable_diffusion_2.0
[2] https://stablehorde.net/