If, like me, you're using Real-ESRGAN-ncnn-vulkan [1] and are curious what the upscayl-ncnn CLI [2] changed from it: not much, and nothing substantial [3]. Not a criticism, I just wanted to learn whether it's worth upgrading to as a CLI tool ($subj is a separate GUI app based on it).
Upscayl NCNN has several fixes (like the latest Vulkan, bug fixes, feature additions), and more are coming soon (we just haven't pushed the changes yet). We're also working on CPU support.
If you're going to use Real-ESRGAN, you're going to have to deal with a lot of issues and a codebase that hasn't been updated in a long time. Afaik, Upscayl NCNN is the only project maintaining Real-ESRGAN NCNN as of now.
I've only used -i -o -n mode, but for me that just works. And when something just works, I tend to see "a codebase that hasn't been updated in a long time" as a good thing generally :)
(Don't take it personally, it's my burnout from modern software cycles.)
I understand. With Upscayl it's a little more configurable.
For example, you can add compression to the images, change the format (which Real-ESRGAN can't always do), drag and drop images, use extra models not shipped by default, and in the future even get face enhancement.
I have used RealESRGAN for some years now, and several pictures that were enlarged with it hang in my house.
I enlarged a lot of old pictures from family albums, and then fixed the bad spots with something like Photoroom, Snapseed, etc., and then sent them to be printed.
Works really great.
And not only that: if I find an old picture on the internet that is low-res, with no high-res version found after much google-fu on DuckDuckGo or Kagi, I use RealESRGAN to upscale it, and then use it as my wallpaper, presentation background, or whatever!
Yes! Real-ESRGAN is still the best algorithm for enhancing images. The only one better is Topaz's, but that's to be expected from a paid product.
Real-ESRGAN can get better with a better trained model too, which is on Upscayl's roadmap.
Does anybody have experience training/finetuning Real-ESRGAN models? I downloaded one of their datasets to see what they are working with, and it has 800 images at ~3 MP resolution. I am curious whether others have results based on higher resolutions and more/fewer images in the dataset. I would like to use it for fine art paintings by period/style, etc.
I think I've got a twitch now when I think about this. Remember how those stupid moviemakers would say "enhance!", and I (along with many of my geek brethren) would go, "there are no more pixels, that can't be done!"
Or does it? If you enhance security footage to make a barely visible face recognizable, it will produce a recognizable face, but it won't be the person in the footage anymore.
The MRI scanner near you is potentially using AI to increase image signal, then to increase pixel count 4x (well, 2x in the X direction and 2x in the Y direction, potentially also generating a slice between each slice in 3D datasets).
If you turn off the AI and actually acquire the information, the scan is long, the patient moves, and the result is blurry. However, if the patient is still, the non-AI images look like the AI images.
Disagreeing with it is sort of moot now - it’s in the machines and is running. It works well, I’ve been using it daily for a few years.
All vendors have it - I use the Siemens product ‘Deep Resolve’ [1]. Their PR department undersells it, in my view this is a bigger change than the 1.5T to 3T transition.
Which may well be, but those detectives still aren't going to suddenly be able to read the license plate from an "enhanced" blurry photo, like their TV counterparts were doing back in the early 00s.
Not necessarily. As long as some mashed pixels of the license plate remain, and assuming different license plates would result in differently mashed pixels, it might be possible to restore the original from the highly compressed image data.
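The idea can be sketched as template matching. Here's a toy NumPy example (hypothetical data, not a real plate-recognition pipeline): random binary patterns stand in for rendered plates, each candidate is "mashed" with the same block-averaging downsample, and we pick the candidate whose mashed version best matches the noisy observation.

```python
import numpy as np

def mash(img, factor=4):
    """Simulate heavy downsampling: average over factor x factor pixel blocks."""
    h, w = img.shape
    return img[:h - h % factor, :w - w % factor] \
        .reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

# Hypothetical "plates": random binary patterns standing in for rendered text.
rng = np.random.default_rng(0)
candidates = {f"PLATE-{i}": rng.integers(0, 2, (16, 48)).astype(float)
              for i in range(100)}

# Observation: one candidate, mashed down, plus a little sensor noise.
truth = "PLATE-42"
observed = mash(candidates[truth]) + rng.normal(0, 0.05, (4, 12))

# Match: mash every candidate the same way and take the nearest one.
best = min(candidates, key=lambda k: np.sum((mash(candidates[k]) - observed) ** 2))
print(best)  # "PLATE-42" — the mashed pixels still discriminate between candidates
```

Of course this only works because the candidate set is small and the forward "mashing" process is known exactly; a real blurry photo is a much harder inverse problem.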
From a series of images, a.k.a. video, sure. From a single image? Not so much.
In video there is a lot of temporal information: even if the spatial resolution isn't high enough in a single frame, one can accumulate a higher-resolution version of the scene from multiple observations.
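A minimal 1-D sketch of that accumulation idea, assuming idealized conditions (known sub-pixel shifts, pure decimation with no blur): two half-resolution frames taken with a half-pixel shift between them jointly contain the full-resolution signal.

```python
import numpy as np

# High-res "scene": a 1-D signal we pretend the camera can't fully resolve.
rng = np.random.default_rng(1)
scene = rng.normal(size=64)

# Two video "frames", each at half resolution, captured with a half-pixel
# shift between them (pure decimation; a real sensor would also blur/average).
frame_a = scene[0::2]
frame_b = scene[1::2]

# Neither frame alone contains the scene, but interleaving the two
# observations reconstructs it exactly.
recovered = np.empty_like(scene)
recovered[0::2] = frame_a
recovered[1::2] = frame_b

print(np.allclose(recovered, scene))  # True
```

Real multi-frame super-resolution has to estimate the shifts and undo blur too, but this is the core reason video can beat any single frame.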
The models usually bundled with ESRGAN don't enhance like CSI. They straighten lines, reduce JPEG blocking a little, and correct "subatomic" details, but cannot restore large areas like faces, objects, etc. It's more of an AI-flavored "Effects → Sharpen" than the stereotypical "Enhance!".
Yes - I can't find it now, but a few weeks ago I saw a demo website for some other upscaler with an example image of food, and it was pretty clear that the upscaled "version" was a different bread than the original lower-res version, and things like herb sprinkles (oregano?) in the original became pine nuts or something in the upscaled version...
Yes, I personally think it's false advertising. AI Re-painter is not AI Upscaling. I even received a review on Mac App Store saying Upscayl is worse because it does upscaling instead of repainting like Magnific.
Except that when you try to enhance the number plate of a suspected killer, you get a plate number that's based on average noise and maybe different from reality.
I tried this out back in December. It is very straightforward. Would recommend for anyone who is testing the waters and just trying to start exploring the various tools.
From my understanding though, the quality is pretty far behind that of the cutting edge. A friend recently recommended Topaz to me, but that isn't open source.
The quality is actually comparable in some places. Upscayl, being a free project, does not have its own model yet, so we depend on community models (which are great in their own right).
I do have plans to create our own robust model though, once I collect enough money :)
The Extras tab in A1111 gives the best results I can get. You need to know what all those options mean, of course - it's a more complicated interface for sure.
UpScayl is great, I use it a lot for work. Upscaling low-res graphics and illustrations for use in graphics in a pinch, upscaling portrait photos of people for print and photoshop… upscaling old copies of things for editing purposes, you name it.
It’s not perfect but no alternative is. Bloody useful though.
I'm glad you like it! We're working on making it even better :D and with the introduction of Upscayl Cloud the possibilities for professionals will be endless (within reason of course, haha)!
Like it? My brother, I was trying to hide my enthusiasm to look cool in front of people on YC.
I use UpScayl pretty much every day and doing so feels like magic every time. Your user interface tweaks here and there are always appreciated and UpScayl is a joy to use.
upscayl is very approachable, but lacked many features i needed. i ended up using https://github.com/AUTOMATIC1111/stable-diffusion-webui after upscaling became part of my regular workflow, but for someone who just needs a few images enhanced, it's an ideal tool.
Are there any models out there for cleaning up an image, not just upscaling? I have a bunch of old photos taken on early low-res point-and-shoots that have JPEG artifacts etc and this seems like something a modern model could easily be fine-tuned to resolve, but every few months I look around and have yet to find anything
Unfortunately, for video, nothing I've seen yet has matched the quality of Topaz Labs' (paid) tool. Clarity and consistency always seem to be an issue with other implementations. I'd love to be proven wrong, because I have a project that's stalled due to the low quality/resolution of the source.
But doesn't that defeat the purpose of upscaling? I see a lot of people promoting Magnific AI (which costs $40 a month) but it's not really an upscaler. It's more of an AI repainter. It replaces most details by re-imagining them instead of guessing the original details. I think we should be clear about what an AI Upscaler is and what an AI re-painter is.
ESRGAN is the same thing. You cannot recover signal that is already lost without additional information (in the diffusion case, the model's prior). Upscalers are hallucinators by that definition.
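The "lost signal" point can be made concrete: downsampling is many-to-one, so no upscaler can invert it without a prior. A toy example (my own illustration, not tied to any tool in this thread) — two visibly different images that produce the identical downsampled image:

```python
import numpy as np

def downscale_2x(img):
    """2x box downsampling: each output pixel is the mean of a 2x2 block."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

# Two different 4x4 images...
flat = np.full((4, 4), 0.5)
checker = (np.indices((4, 4)).sum(axis=0) % 2).astype(float)  # 0/1 checkerboard

# ...whose 2x2 downsamples are identical (every 2x2 block averages to 0.5),
# so nothing in the low-res image can tell the upscaler which one it came from.
same = np.array_equal(downscale_2x(flat), downscale_2x(checker))
print(same)  # True
```

Any answer an upscaler gives here is a choice supplied by its prior, not information recovered from the pixels.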
But results differ in perception. ESRGAN doesn't hallucinate whole [sub]objects and that puts it into a different category. Latent/denoising i2i re-scale is different because its parameter range doesn't really intersect with what ESRGAN does.
What @cchance said. Also, diffusion models have much more capacity than ESRGAN. Being unable to hallucinate whole subjects is a limitation of model capacity and training data, not the reverse (i.e., a high-capacity model can mimic what you want from a low-capacity model if you steer it the right way). An innocuous drop of water can have a whole world inside it under a microscope.
It can only hallucinate if your i2i noise level is high enough. If you're adding so much noise that the latent is basically filling in giant swaths of noise, then of course it's going to make up new stuff - that's not hallucination so much as you asking the AI to draw that entire area as something new.
I wonder if image upscaling is actually reflective of a greater disease in society: we are exceptionally obsessed with holding onto everything. I mean, my God, if the situation comes down to a desire to upscale an image, just let the image go.
There are lots of reasons someone might want to upscale something, not just "holding onto everything".
I painted something on my iPad over the last few weeks. I'll be upscaling it so it's high enough resolution to do a large print on my wall. It's not something I need to "just let the image go" for.
(And, of course, there's nothing wrong with holding onto things...)
[1] https://github.com/xinntao/Real-ESRGAN-ncnn-vulkan
[2] https://github.com/upscayl/upscayl-ncnn
[3] https://github.com/xinntao/Real-ESRGAN-ncnn-vulkan/compare/m...