The funny thing is that no-one AFAICT has realized that the same content can be encoded in different-looking QR codes. Besides the obvious (different error-correction levels), the content itself can be changed while maintaining its semantic meaning (e.g. "https://example.com/foo", "HTTPS://EXAMPLE.COM/foo", or "HtTpS://eXaMpLe.CoM/foo" are all semantically identical, since the scheme and host are case-insensitive) and even the QR encoding itself can be tweaked (e.g. by changing the version and mask, see the demo on https://www.nayuki.io/page/qr-code-generator-library). Each combination would yield a different-looking QR code that encodes the same meaning, which could give the diffusion models even greater freedom.
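To make the capitalisation idea concrete, here is a minimal sketch that enumerates case variants of the scheme and host only and leaves the path alone. The function name `case_variants` is made up for illustration, and it assumes a plain absolute URL with no userinfo in the host part.

```python
from itertools import product
from urllib.parse import urlsplit


def case_variants(url):
    """Yield URLs that differ only in the letter case of scheme and host.

    Assumption: `url` is absolute (scheme://host/...) and has no userinfo,
    since userinfo would be case-sensitive. The path, query and fragment
    are copied through unchanged.
    """
    parts = urlsplit(url)
    # Length of "scheme://host"; urlsplit lowercases the scheme but keeps its length.
    head_len = len(parts.scheme) + len("://") + len(parts.netloc)
    head, tail = url[:head_len], url[head_len:]
    # Each letter has two options, everything else has one.
    options = [(c.lower(), c.upper()) if c.isalpha() else (c,) for c in head]
    for combo in product(*options):
        yield "".join(combo) + tail


if __name__ == "__main__":
    for variant in case_variants("ab://cd/foo"):
        print(variant)
```

Each variant could then be fed to whatever QR encoder you use, giving a different module layout for the same destination.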
I think this has been done for a long time? I made this one that looks like a VLC logo and links to VLC, back in 2013. https://imgur.com/a/rq8WzGm inspired by some post I read at the time.
If you scan it, you realize it's putting lots of "junk" at the end of the URL that doesn't change its meaning, but is used to tweak how the QR code looks.
The URL is of the form videolan.org/#234234234523453455...
Also not sure the diffusion model has been taught yet that it could in some cases choose to deliberately ignore a (small) fraction of some of the elements/blocks, if it helps with composition.
Or even better just wire up a QR decoder to the loop and automatically reject images that don't decode correctly, and let the model sort out how far it can go.
You'd need to inject a lot of noise (and different kinds of noise, and different implementations of QR decoders) before you present the image to the QR decoder, to make sure you are still producing a robust QR code.
QR codes have eight different bit-pattern masks, which an encoder should choose between to avoid blocks or runs, so even without altering the URL there are different outputs.
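For reference, a rough sketch of the eight mask conditions, where `i` is the row and `j` the column of a module. This is a toy: `apply_mask` is a made-up helper that XORs every cell of a grid, whereas a real encoder leaves the finder, timing and format areas unmasked and picks the mask by a penalty score.

```python
# The eight QR mask conditions; a module at (row i, column j) is inverted
# when the condition for the chosen mask is true.
MASKS = [
    lambda i, j: (i + j) % 2 == 0,
    lambda i, j: i % 2 == 0,
    lambda i, j: j % 3 == 0,
    lambda i, j: (i + j) % 3 == 0,
    lambda i, j: (i // 2 + j // 3) % 2 == 0,
    lambda i, j: (i * j) % 2 + (i * j) % 3 == 0,
    lambda i, j: ((i * j) % 2 + (i * j) % 3) % 2 == 0,
    lambda i, j: ((i + j) % 2 + (i * j) % 3) % 2 == 0,
]


def apply_mask(grid, mask_index):
    """Return a copy of a 0/1 grid with the given mask XORed over every cell.

    Simplification: real symbols only mask the data and error-correction area.
    """
    mask = MASKS[mask_index]
    return [
        [cell ^ int(mask(i, j)) for j, cell in enumerate(row)]
        for i, row in enumerate(grid)
    ]


if __name__ == "__main__":
    blank = [[0] * 6 for _ in range(6)]
    for k in range(8):
        print(f"mask {k}")
        for row in apply_mask(blank, k):
            print("".join("#" if cell else "." for cell in row))
```

Masking is its own inverse, so the decoder just applies the same mask again to get the data back.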
[A-Z0-9] (alphanumeric mode, 11 bits per two characters) encodes a lot more efficiently than [a-zA-Z0-9] (byte mode, 8 bits per character), so there are other considerations when altering a URL.
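A back-of-the-envelope sketch of that difference, counting payload bits only. It assumes the whole string goes into a single segment and ignores the mode indicator and character-count field, whose sizes depend on the version; `payload_bits` is a name I made up.

```python
# The 45 characters allowed in QR alphanumeric mode.
ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def payload_bits(text):
    """Return (mode, data bits) for encoding `text` as one segment.

    Assumption: only the character data is counted, not the segment header.
    """
    if text.isascii() and text.isdigit():
        # Numeric mode: 10 bits per 3 digits, 7 bits for 2 left over, 4 for 1.
        full, rest = divmod(len(text), 3)
        return "numeric", full * 10 + (0, 4, 7)[rest]
    if all(c in ALPHANUMERIC for c in text):
        # Alphanumeric mode: 11 bits per pair, 6 bits for a trailing single.
        pairs, rest = divmod(len(text), 2)
        return "alphanumeric", pairs * 11 + rest * 6
    # Byte mode: 8 bits per byte.
    return "byte", len(text.encode("utf-8")) * 8


if __name__ == "__main__":
    for url in ("HTTPS://EXAMPLE.COM/FOO", "https://example.com/foo"):
        print(url, payload_bits(url))
```

Note that upper-casing the path (as in the first example) is only safe if the server treats paths case-insensitively.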
Other things you can tweak in a QR Code symbol without changing the decoded content: Error correction level, empty segments, segment encoding, deliberate addition of fully correctable errors, messing with padding bits/bytes after the terminator.
A diffusion model cannot generate a QR code by itself. The article even shows you need to generate the base image yourself. What you are proposing would mean brute-forcing the QR codes until we see a better shape.
I wouldn't call generating a couple of semantically equivalent (but fully valid and 0 error) QR codes and giving each of them as inputs to diffusion model brute forcing.
Even the normal AI image generation process often involves some trial and error: generating a bunch of variants by changing the random seed or slightly tweaking prompt keywords, and choosing the result which looks best. Doing multiple attempts with multiple different inputs doesn't seem crazier than doing multiple attempts with the same input and hoping for better results.
> I wouldn't call generating a couple of semantically equivalent (but fully valid and 0 error) QR codes and giving each of them as inputs to diffusion model brute forcing.
If you have 20 letters in your URL, and you give the model each of 2^20 different ways to capitalise them, isn't that pretty close to brute-forcing?
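To put numbers on it, a small sketch that counts the capitalisations and draws a handful at random instead of enumerating all of them. Both function names are invented for illustration, and it assumes every letter passed in really is case-insensitive (so scheme and host only, not the path).

```python
import random


def count_capitalisations(text):
    """Number of distinct upper/lower-case spellings of `text`."""
    return 2 ** sum(c.isalpha() for c in text)


def sample_capitalisations(text, k, seed=0):
    """Draw k random capitalisations (duplicates are possible)."""
    rng = random.Random(seed)
    return [
        "".join(rng.choice((c.lower(), c.upper())) for c in text)
        for _ in range(k)
    ]


if __name__ == "__main__":
    host = "abcdefghijklmnopqrst"  # 20 letters
    print(count_capitalisations(host))  # 2**20 = 1048576
    for variant in sample_capitalisations(host, 5):
        print(variant)
```

Trying a few dozen samples is a long way from walking the whole 2^20 space.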
It would at least be possible to write a Python plugin for the 'automatic1111' UI which allows you to simply input the QR text, and have it generate images in batch from a variety of possible configurations, so you can go through and pick the one which is most aesthetically pleasing.
edit: It would also be beneficial to have a script which automatically distorts the image in various ways, checks whether a QR code scanner would still recognize it, and gives a score so you can see how likely it is to be scannable in various conditions.
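A minimal sketch of such a scoring script, under heavy assumptions: the image is a plain grid of 0/1 pixels, the only distortion is random pixel flips, and `decoders` is a list of callables you supply yourself (wrappers around whatever real QR decoders you have, returning the decoded text or None). None of these names come from an existing tool.

```python
import random


def add_noise(image, flip_fraction, rng):
    """Return a copy of a 0/1 pixel grid with about flip_fraction of pixels inverted."""
    return [
        [1 - px if rng.random() < flip_fraction else px for px in row]
        for row in image
    ]


def robustness_score(image, expected, decoders,
                     noise_levels=(0.0, 0.02, 0.05, 0.1), trials=20, seed=0):
    """Fraction of (noise level, trial, decoder) runs that decode to `expected`.

    `decoders` are user-supplied callables: image -> decoded text or None.
    """
    rng = random.Random(seed)
    passed = total = 0
    for level in noise_levels:
        for _ in range(trials):
            noisy = add_noise(image, level, rng)
            for decode in decoders:
                total += 1
                passed += decode(noisy) == expected
    return passed / total


if __name__ == "__main__":
    image = [[(r + c) % 2 for c in range(8)] for r in range(8)]
    # Stand-in decoder for the demo: "succeeds" only on an undistorted image.
    strict = lambda img: "hello" if img == image else None
    print(robustness_score(image, "hello", [strict]))
```

A real version would also want blur, perspective and lighting changes, and candidates scoring below some threshold could be rejected automatically.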
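Without the error-correction calculation (QR codes use Reed-Solomon codewords rather than a CRC) you could just check which squares correspond to which letter and possibly change the letter casing.
But with error correction, it will lead to squares changing in the error-correction area.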
Yes, that's the whole point. I did not claim you can control every single module in the QR. I just said that you can get different QR codes for the same semantic content.
I'm sure somebody will get to this soon.