Hacker News
How to make a QR code with Stable Diffusion (stable-diffusion-art.com)
407 points by andrewon on June 11, 2023 | 73 comments



THIS is what AI art should be like. Hard for humans, easy for computers, amazing visuals. And on top of that it's functional as well.

I feel like this is closer to how human artists create. Artists have a set of constraints and a lot of limitations, and forcing a NN to output valid QR codes puts the same constraints on the process itself.

Next step, include the QR reader in the training loop, and make it differentiable to increase the valid output from 1/4 to 100%


Agreed! The wisest peeps (I always harp) see AI Art as a functional tool on the metaphorical toolbelt. It doesn't make all the other tools obsolete, it's the tippy top of the pyramid so to speak.

The cherry on top to refine things we cannot do, adding an artificial flair to the organic substance, yada, yada, yada.


The funny thing is that no-one AFAICT has realized that the same content can be encoded in different-looking QR codes. Besides the obvious (different error-correction levels), the content itself can be changed while maintaining its semantic meaning (e.g. "https://example.com/foo", "HTTPS://EXAMPLE.COM/foo", or "HtTpS://eXaMpLe.CoM/foo" are all semantically identical) and even the QR encoding itself can be tweaked (e.g. by changing the version and mask, see the demo on https://www.nayuki.io/page/qr-code-generator-library). Each combination would yield a different-looking QR code that would encode the same meaning, and it could therefore allow the diffusion models even greater freedom.
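A minimal sketch of the case-variant idea, assuming a plain `scheme://host/...` URL with no userinfo part (which would be case-sensitive). Only the scheme and host are recased, because those are the case-insensitive parts of a URL; the path, query and fragment are left alone. The function name `case_variants` is made up for illustration.

```python
from urllib.parse import urlsplit


def case_variants(url):
    """Lazily yield URLs that differ only in the case of scheme and host.

    Assumes a 'scheme://host/...' URL without userinfo. Path, query and
    fragment are copied through unchanged because they may be case-sensitive.
    """
    parts = urlsplit(url)
    split_at = len(parts.scheme) + len("://") + len(parts.netloc)
    prefix, rest = url[:split_at], url[split_at:]

    def expand(index, built):
        # Walk the prefix one character at a time, branching on letters.
        if index == len(prefix):
            yield built + rest
            return
        ch = prefix[index]
        options = (ch.lower(), ch.upper()) if ch.isalpha() else (ch,)
        for option in options:
            yield from expand(index + 1, built + option)

    yield from expand(0, "")


if __name__ == "__main__":
    from itertools import islice

    for variant in islice(case_variants("https://example.com/foo"), 4):
        print(variant)
```

Each variant would then be handed to whatever QR encoder is in use. Some scanners or apps may treat an uppercase scheme differently, so the variants are worth testing on real devices before relying on them.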

I'm sure somebody will get to this soon.


I think this has been done for a long time? I made this one that looks like a VLC logo and links to VLC, back in 2013. https://imgur.com/a/rq8WzGm inspired by some post I read at the time.

If you scan it, you realize it's putting lots of "junk" at the end of the URL that doesn't change its meaning, but is used to tweak how the QR code looks.

The URL is of the form videolan.org/#234234234523453455...


Sorry, yes, I should have clarified I meant "no-one has done it in the current batch of diffusion-based experiments". As you point out it has been done multiple times before, e.g. https://research.swtch.com/qart (online demo https://research.swtch.com/qr/draw/).


Also not sure the diffusion model has been taught yet that it could in some cases choose to deliberately ignore a (small) fraction of some of the elements/blocks, if it helps with composition.

Or even better just wire up a QR decoder to the loop and automatically reject images that don't decode correctly, and let the model sort out how far it can go.


You'd need to inject a lot of noise (and different kinds of noise, and different implementations of QR decoders) before you present the image to the QR decoder, to make sure you are still producing a robust QR code.


QR codes have several different bit-pattern masks, which an encoder should choose between to avoid blocks or runs, so even without altering the URL there are different outputs.
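For reference, a rough sketch of the eight mask patterns defined for QR codes, with `i` as the module row and `j` as the column. As a simplification, this toy version XORs the mask over the whole grid; a real encoder applies it only to data and error-correction modules, never to finder, timing or format areas. `apply_mask` is an illustrative name, not a library function.

```python
# The eight QR mask predicates; a module is flipped where the predicate holds.
MASKS = [
    lambda i, j: (i + j) % 2 == 0,
    lambda i, j: i % 2 == 0,
    lambda i, j: j % 3 == 0,
    lambda i, j: (i + j) % 3 == 0,
    lambda i, j: (i // 2 + j // 3) % 2 == 0,
    lambda i, j: (i * j) % 2 + (i * j) % 3 == 0,
    lambda i, j: ((i * j) % 2 + (i * j) % 3) % 2 == 0,
    lambda i, j: ((i + j) % 2 + (i * j) % 3) % 2 == 0,
]


def apply_mask(modules, mask_id):
    """XOR a grid of 0/1 modules with mask pattern mask_id (0-7).

    Simplification: the whole grid is masked, function patterns included.
    """
    flip = MASKS[mask_id]
    return [
        [bit ^ int(flip(i, j)) for j, bit in enumerate(row)]
        for i, row in enumerate(modules)
    ]


if __name__ == "__main__":
    blank = [[0] * 6 for _ in range(6)]
    for mask_id in range(8):
        print(mask_id, apply_mask(blank, mask_id)[1])
```

Because masking is an XOR, applying the same mask twice restores the original grid, which is how a decoder undoes it after reading the mask number from the format information.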

[A-Z0-9] encodes a lot more efficiently than [a-zA-Z0-9] so there are other considerations when altering a url.
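To put numbers on that: alphanumeric mode packs two characters into 11 bits (6 bits for a trailing odd character), while byte mode spends 8 bits per byte. The sketch below counts the bits of a single segment, assuming a version 1-9 symbol (4-bit mode indicator, 9-bit character count for alphanumeric, 8-bit for byte) and UTF-8 for byte mode. `segment_bits` is a hypothetical helper.

```python
# The 45 characters allowed in QR alphanumeric mode.
ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")


def segment_bits(text):
    """Return (mode, bit_count) for one segment in a version 1-9 symbol."""
    if all(ch in ALPHANUMERIC for ch in text):
        pairs, single = divmod(len(text), 2)
        # 4-bit mode indicator + 9-bit count + 11 bits per pair + 6 for a leftover.
        return "alphanumeric", 4 + 9 + 11 * pairs + 6 * single
    # 4-bit mode indicator + 8-bit count + 8 bits per byte.
    return "byte", 4 + 8 + 8 * len(text.encode("utf-8"))


if __name__ == "__main__":
    print(segment_bits("HTTPS://EXAMPLE.COM/FOO"))  # ('alphanumeric', 140)
    print(segment_bits("https://example.com/foo"))  # ('byte', 196)
```

Note that uppercasing the path, as in the first example, is only safe if the server treats paths case-insensitively.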


Yes, that's what I meant.


Other things you can tweak in a QR Code symbol without changing the decoded content: Error correction level, empty segments, segment encoding, deliberate addition of fully correctable errors, messing with padding bits/bytes after the terminator.


There is a CRC so it's not as easy as it sounds.


None of what I wrote impacts the CRC calculation.


A diffusion model cannot generate a QR code by itself. The article even shows you need to generate the base image yourself. What you are proposing would mean brute-forcing the QR codes until we see a better shape.


I wouldn't call generating a couple of semantically equivalent (but fully valid and 0 error) QR codes and giving each of them as inputs to diffusion model brute forcing.

Even the normal AI image generation process often involves some trial and error: you generate a bunch of variants by changing the random seed or slightly tweaking prompt keywords, then choose the result that looks best. Doing multiple attempts with multiple different inputs doesn't seem more crazy than doing multiple attempts with the same input and hoping for better results.


> I wouldn't call generating a couple of semantically equivalent (but fully valid and 0 error) QR codes and giving each of them as inputs to diffusion model brute forcing.

If you have 20 letters in your URL, and you give the model each of 2^20 different ways to capitalise them, isn't that pretty close to brute-forcing?


I often iterate on my prompt with a fixed seed, changing the prompt slightly to get the best results, so that wouldn't work well here.


It would be at least possible to write a python plugin for the 'automatic1111' ui which allows you to simply input the QR text, and have it generate images based on a variety of possible configurations in batch so you can go through and pick the one which is most aesthetically pleasing.

edit: It would also be beneficial to have a script which automatically distorts the image in various ways and checks to see if it would be recognized by a QR code scanner, and give a score so you can see how likely it would be to be scannable in various conditions.
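A rough sketch of such a scoring script, using only the standard library. The image is assumed to be a grayscale grid (a list of rows of 0-255 ints), and `decode` is a placeholder for whatever real decoder you plug in; it is not a real library call. The distortions (noise, darkening, downscaling) and their ranges are arbitrary illustrative choices.

```python
import random


def add_noise(img, amount, rng):
    """Add uniform noise in [-amount, amount] to every pixel, clamped to 0-255."""
    return [
        [min(255, max(0, px + rng.randint(-amount, amount))) for px in row]
        for row in img
    ]


def darken(img, factor):
    """Scale brightness by factor (0 < factor <= 1) to mimic poor lighting."""
    return [[int(px * factor) for px in row] for row in img]


def downscale(img, step):
    """Keep every step-th pixel to mimic a low-resolution or distant scan."""
    return [row[::step] for row in img[::step]]


def robustness_score(img, expected, decode, trials=20, seed=0):
    """Fraction of randomly distorted copies that still decode to expected.

    decode is a caller-supplied function (hypothetical here) that takes the
    pixel grid and returns the decoded text, or raises if nothing is found.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        candidate = add_noise(img, rng.randint(0, 60), rng)
        candidate = darken(candidate, rng.uniform(0.4, 1.0))
        candidate = downscale(candidate, rng.choice([1, 2]))
        try:
            if decode(candidate) == expected:
                hits += 1
        except Exception:
            pass  # a decoder failure counts as a miss
    return hits / trials
```

Running the same score against several different decoders, rather than one, would get closer to the "different implementations of QR decoders" point made earlier in the thread.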


Sure, but as I mentioned it has absolutely nothing to do with the CRC calculation, that is what you claimed.


Without the CRC calculation you could just check which squares correspond to which letter and possibly change the letter casing. But with the CRC, it will lead to squares changing in the CRC area.


Yes, that's the whole point. I did not claim you can control every single module in the QR. I just said that you can get different QR codes for the same semantic content.


As an engineer, the lack of consideration for the matter of how much one should corrupt a signal (which might be scanned under different conditions, such as different light levels and resolutions) is grating to me.

It’s like scratching a design onto the bottom of an audio CD, playing it, and if it works on your CD player, shipping it. “Works for me”


I feel like you need both people. The people who scratch a unique song onto a CD that only plays on one random CD player 2% of the time, and then the people who optimize that one song to play on 100% of devices while you're upside down and underwater.

Make the cool QR code first, then get it to work everywhere when you actually made something cool.


Look at it as if it's using unused parts of the signal spectrum to fit in another signal instead. Sometimes less error resilience in exchange for more data is a worthwhile trade, see for instance 256-QAM compared to 4-QAM (although not quite the same, I admit).

I get what you mean, it's a misuse of the underlying technology and a crude hack in some ways, but if it's stupid and it works it's not stupid.



Interestingly when I long pressed the codes marked as “No” (in Safari mobile), a lot of them were still decoded.


I didn’t even know that long pressing on a QR code decodes it. Thanks for the tip.


Meanwhile, bare QR codes are ugly AF and draw the eye away from any other design... Things like this might lead to greater use.


This can barely be recognized by a layperson as a QR code at all.

This may be great for steganography — like these US POWs in Vietnam blinking Morse code for "torture" while being filmed, when their captors don't realize that and pass the video, and the morbid message is later extracted.


Greater use of QR is not a good thing, it's an attack vector actually


Engineer here as well.

What is the quality metric?

Does it need to work in an emergency?

Does it need to look cool more than it needs to work instantly? (like an ad)

Also, I imagine most people use like 3 QR code readers, snapchat, the camera app for ios/android. Seems pretty trivial to test for 80% of the population.


As a crafter and tinkerer, I love to test how far I can scratch that CD.


> which might be scanned under different conditions, such as different light levels and resolutions

This makes me think of your stereotypical pirate map to buried treasure


There's a way to make a QR code that provides a device with all the info it needs to sign onto a WiFi network.

I'm wondering if I could make one of those, turn it into an interesting piece of art, then frame it and put it up in my living room. When guests ask for WiFi, I would just say "take a photo of that picture".
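For anyone who wants to try this, here is a small sketch that builds the text payload in the de facto `WIFI:` format popularized by ZXing, which the stock iOS and Android cameras generally understand. The function name and defaults are illustrative; the resulting string still has to be fed to a QR encoder of your choice.

```python
def wifi_payload(ssid, password, auth="WPA", hidden=False):
    """Build a 'WIFI:' payload string for a WPA/WEP network.

    auth is the T: field (commonly 'WPA' or 'WEP'). Open networks use a
    different form without a password and are not handled by this sketch.
    """

    def esc(value):
        # Backslash-escape characters that have meaning in the format.
        # The backslash itself must be handled first.
        for ch in '\\;,:"':
            value = value.replace(ch, "\\" + ch)
        return value

    fields = f"WIFI:T:{auth};S:{esc(ssid)};P:{esc(password)};"
    if hidden:
        fields += "H:true;"
    return fields + ";"


if __name__ == "__main__":
    print(wifi_payload("Guest Net", "p;ss:word"))
```

Since the payload contains lowercase letters and punctuation outside the alphanumeric set, it will normally be encoded in byte mode, so a longer password means a denser code and less room for the artwork.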


I've been wanting to do that with a wood laser cutting. My lazier version was to set the desktop background of the tv in my living room to that QR code.


there was a cool one posted to /g/ earlier https://i.jollo.org/ePE3JHak.png


This was created using this guide:-

https://learn.thinkdiffusion.com/creating-qr-codes-with-cont...

I was speaking to him when he created it!


I’ve seen these and they are so novel! The problem is they don’t look like QR codes.

Regular QR codes may be an eyesore, but people know what to do when they see one on a menu.

Artistic ones don’t really look like QR. Def don’t want to have to add an arrow and “scan this!”

I suspect this will come in more handy for incorporating QR codes into larger murals that people might photograph anyway.

iOS camera “sees” QR codes with a URL preview. This surprise embedded mega might make for an interesting beat in the march to ubiquitous AR.


It would be a cool idea for leaving easter eggs embedded in designs though


Maybe it's because I've read a bit about how they work and in general have seen a lot of them, but I can recognize these are QR codes even if they weren't labeled as such


That's great, but how about your grandma? Could she recognize that's a QR code?


I'm not sure she knows what a normal QR code is...


My point is: this is cool and all, but it makes the functional part of QR codes harder, not easier. While something can be functional and beautiful at the same time, and while QR codes are not beautiful by any means, this adds beauty but at the same time removes the functional part — ease of identifying QR code as such.


My phone camera couldn't recognize any of the generated ones. Looks cool though


My phone didn't recognize them either, until I zoomed the screen out to 50%; then I was able to scan all the generated QR codes from the article.


With iPhone 14 from MacBook Air M1 screen about 35-40 cm away every picture except one read on first try and that one read after I pulled the phone back a bit.


I could read half of them. It was detecting a face in the first Mechanical Girl, and face detection probably prevents it from working, since it worked for most of the sceneries. It also didn't work on any picture that extended beyond the boundaries of the QR code.

I tried with the built-in camera app of my Samsung M21.

If anyone wants to use this, they should do a lot of testing first.

Edit: I also noticed that the angle matters when scanning from certain monitors. The first QR code didn't work at first, but I noticed the top of the picture was too dark; when I held the phone at the same angle as the screen, it worked.


Asus Zenfone, tried both barcode scanner app and regular camera. Barcode scanner picked up the template, none of the generated ones. Camera app picked up two (both Japanese girl ones) - fast enough for it to be useful, and the first robot one after like 10-12 seconds trying various distances, which I think should be considered a failure.


Same here, my Xiaomi did not recognize any of them at any distance.


Same here, with a Samsung S21 and the built-in camera app.


Unfortunately, they don't look as nice as the originals that were shown a few days ago, I wonder how they did that.

https://mp.weixin.qq.com/s/i4WR5ULH1ZZYl8Watf3EPw

They talk about a custom QR codes ControlNet.



This is not using a custom Controlnet for QR codes.


Probably more trial/error


What's the noise margin like?

QR codes with logos typically contain bit errors and rely on the forward error correction code to correct those errors. The logo comes at the cost of some noise margin.

I'd guess that the same applies here? All good if decoration is your priority, but if reliability is also a priority you have to be aware of the tradeoff.
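As a back-of-the-envelope illustration of that tradeoff, the sketch below uses the nominal recovery capacities of the four QR error-correction levels (roughly 7%, 15%, 25% and 30% of codewords for L, M, Q and H). It assumes you can estimate what percentage of codewords the artwork damages, which in practice is the hard part: damaged modules do not map one-to-one onto codewords, and real capacity depends on the block structure of the chosen version. The function name and the `margin_percent` parameter are made up for this example.

```python
# Nominal share of codewords each level can restore, in percent.
RECOVERY_PERCENT = {"L": 7, "M": 15, "Q": 25, "H": 30}


def levels_that_tolerate(damaged_percent, margin_percent=0):
    """List the levels whose nominal capacity covers the damage plus a margin.

    margin_percent is the headroom you want left for real-world noise
    (glare, blur, low resolution) on top of the deliberate damage.
    """
    needed = damaged_percent + margin_percent
    return [
        level
        for level, capacity in RECOVERY_PERCENT.items()
        if needed <= capacity
    ]


if __name__ == "__main__":
    # Artwork that corrupts about 10% of codewords, with 5% kept in reserve.
    print(levels_that_tolerate(10, margin_percent=5))  # ['M', 'Q', 'H']
```

The reserve is the "noise margin" being asked about: whatever capacity the decoration consumes is no longer available for dirt, glare or a bad camera.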


this.

i wonder if all qr code scanners follow a standard way to read the code, while accounting for the redundant data and checking for errors.

might explain why my phone struggles to scan some of them.


I tried to post the original a few days ago but the submission was automatically killed.

https://mp.weixin.qq.com/s/i4WR5ULH1ZZYl8Watf3EPw


This is cool, and I can’t wait until there’s more accepted steganography where every image has some side channel of info encoded into the image in a way that as a viewer I don’t notice it, but my phone’s camera can discern.


Not bad, but the ones from the custom trained NN are way nicer in my opinion, with most of these it's pretty clear they are QR codes (which is probably better for usability, but makes the art a bit worse).


At least with my scanner (Binary Eye on Android) these don't scan very well. Nonetheless they are very cool.

I wonder if this would work better or worse with different types of 2D barcodes like Aztec, Data Matrix, or PDF-417


I don't like this. I hope it dies soon.

It has very bad UX because it muddles the "affordance" of a QR. A QR should explicitly and clearly look like a QR for people to understand they should use it as such.

It isn't effective because it degrades the bandwidth of QR codes. It reduces the amount of information you can place on QR codes.

This is just a gimmick, a funny trick that implements bad functionality.

As in Jurassic Park: just because you can do it doesn't mean you should do it.


First, it’s just a new artistic choice that some will use and others won’t.

And second, if it does become very popular, users will become familiar with it and will likely instinctively grok that some of the ‘art’ they are looking at can be scanned as a QR code. Phase one will be artsy QR codes with a written caption: “Hey, this is actually a QR code! Scan it!” Phase two is removing the caption because people are so accustomed to them.


I think it looks much nicer than a regular QR-code and there are definitely times when you want a QR-code to blend in rather than stand out.


A large banner, as someone else mentioned, seems like the perfect place to use one of these. “Come see the new exhibit at the art museum” e.g.


Interestingly, I am not able to focus on these images. Looking at them makes my eyes pop around until I have to look away.


Way cool article and novel hacking of QR codes!


Very nice post, thanks for sharing.


this guide starts by promoting their gumroad book which i dont want.

currently using a1111 off of https://colab.research.google.com/github/TheLastBen/fast-sta...

and hoping to go from there


Yep, these guys have spammed the front page as of now all from the same url


Ad ad ad ad ad AD ad AD ad Ad....

And some content in between


There are ads on this page? I almost forgot that I've been experiencing the internet ad-free for years now. Thanks uBlock!


Ah, I got uBlock, but didn't have it on the mobile where I was reading it first...


Maybe the QR code examples are ads? I don't know, I didn't scan them


I wouldn't even mind that, that'd be a creative sponsor spot. I think they're talking about actual ads here.



