Show HN: Creating custom coloring pages from photos. Great for parents/teachers (portraitart.app)
179 points by stevels 8 months ago | hide | past | favorite | 80 comments
Making coloring pages out of your own personal photos is a fun and creative way to engage kids (and adults!) with art. Instead of generic coloring pages, you can turn special memories into custom works of art just waiting to be filled in with color.



My immediate issue, as the father of a mixed son who has just witnessed him spend hours trying to represent his own appearance accurately for an art task at school:

This turns black and mixed people into white-presenting caricatures of themselves. E.g. compare [1] and [2] where the hairstyles of the black people are turned into distinctly "white" hairstyles. My son's hair is closer in texture to that of the man on the right, and there's no way of getting it into the shape of the coloring-in page.

I get there's no ill intent here, but even for far less ethnically ambiguous inputs this "whitewashes" hairstyles and general appearance, and once I saw it I couldn't unsee it.

If my son were old enough to want to use this, I'd be concerned about letting him, because I know he is sensitive about how he is different from his friends, etc., and this would wildly misrepresent and erase his actual appearance.

[3] and [4] is another stark example.

[5] and [6]....

[1] https://portraitart.app/static/gallery/group1_original_1024....

[2] https://portraitart.app/static/gallery/group1_coloring_page_...

[3] https://portraitart.app/static/gallery/portrait4_original_10...

[4] https://portraitart.app/static/gallery/portrait4_coloring_pa...

[5] https://portraitart.app/static/gallery/basketball2_original_...

[6] https://portraitart.app/static/gallery/basketball2_coloring_...


This was absolutely my first reaction as well. I scrolled through the first few photos of White families where it seemed to turn even them into 1950s caricatures of "small town White Americans" and thought "hmmm, how's it going to treat Black people?"

I was not at all surprised when it turned Black people into White people.

This to me is one of the clearest examples I've seen of what people talk about when they talk about biases in training data. Clearly the training data of "coloring books" is heavily biased towards "stereotypical 1950s white people" and turns everyone into them.

(And you can add [1] and [2] to your list of comparison images.)

1. https://portraitart.app/static/gallery/father1_original_1024...

2. https://portraitart.app/static/gallery/father1_coloring_page...


The artistic license in itself is fine, I think, if only it didn't go in just one direction. And you may very well be right that the examples of coloring books it has seen are a very peculiar subset.


“Artistic license” assumes agency, that there is some artist who understands the facts of a situation and deliberately changes them in the work.

It does not apply to a machine doing a poor job because it lacks context in its training data. Just like medieval art, which was inaccurate¹ due to ignorance, not because the person doing the drawing decided to add some flair.

¹ https://www.youtube.com/watch?v=jnB2Uj7gWSE


> Just like medieval art, which was inaccurate¹ due to ignorance, not because the person doing the drawing decided to add some flair.

There's a site showing how Europeans drew elephants from popular descriptions once they disappeared from the continent: https://www.uliwestphal.de/elephas-anthropogenus/index.html


Another – presumably coincidental – issue with images [1] and [2] is that there is a person missing. I assume that’s not related to skin colour, but to me it adds to the slightly irritating effect of how the resulting images may be disconnected from human reality in ways that would commonly be understood as disrespectful. (Not to speak of the “usual” AI problems, such as additional fingers, stray body parts, etc.)

[1]: https://portraitart.app/static/gallery/group1_original_1024....

[2]: https://portraitart.app/static/gallery/group1_coloring_page_...


Thanks for pointing this out, seriously. As a white European guy, I'd probably never have thought of this issue. That's an important reminder about our biases.


With less "artistic license" [1]

We use an AI model, but not a diffusion model + ControlNet, which is what I'm assuming this website uses.

Would love any feedback on the results + site [2]. Before it's asked: we do not use any user-uploaded images for model training; they're only stored to show you the color version of your originally uploaded image.

[1] https://static.dreamandcolor.com/aa828ad1-5696-4d4e-95eb-bcd...

[2] https://dreamandcolor.com/


It still has the vibe of janky edge-detection that can be done without AI. Admittedly, less janky than pure edge detection but still not quite smooth enough that I'd pay for the results.

Also, just FYI, asking three different times across these comments for people to check out your site feels a bit spammy to me. I'd suggest you do your own 'Show HN' if you are trying to engage with the community here for feedback.


Fair, thanks for your feedback – some images work better than others in terms of the artifacting. There are lots of improvements we'd like to make; for now, the best balance I've found is letting the user choose their level of detail on the site and add any masking lines as needed before PDF export.


Honestly, when I first started reading your comment I thought you were being oversensitive, but I totally get it looking at pictures 3 and 4. Even 5 and 6 completely change the hairstyle.


I had to look for multiple examples myself before I was sure I wasn't being oversensitive. As I said, I'm sure there's no ill intent, and in a way I'm not surprised either, because as my son himself found out, if you're going to draw line art - especially with clear, thick lines like this - then accurately representing black or mixed hair without it turning into a caricature (which would be even worse) is a skill.

And one that is very likely underrepresented in the training set, through no fault of the people depicted.

And this is why getting diversity "right" (so not the Gemini fiasco way) is hard - you can be extremely well meaning and just not have it register until someone for whom it's personal looks at it.


Looks like it takes considerable "artistic license" to deviate from the photos. Every person getting turned into a stereotypical "caucasian" type is the most noticeable (and most problematic), but also the spotted cat in one of the pictures gets a "stereotypical" tabby coat. Also, it likes to insert a horizon, even if the original image doesn't have one, and sometimes it completely changes what's in the background. But I assume this is because of the images in the training set - I guess in coloring books most people have light-colored hair (which can then be colored brown, black, red... whatever) and the model then associates that with stereotypical blondes?

There are several modes (https://portraitart.app/portrait-art) - some of them are better at keeping people looking like they originally look, the "coloring page" mode seems to be the most "heavy-handed"...

It also doesn't like freckles: https://portraitart.app/static/gallery/portrait6_original_10... -> https://portraitart.app/static/gallery/portrait6_pencilsketc...

After some further browsing, the "caricature" mode seems to be the strangest - caricatures are supposed to take the characteristic traits of a person and exaggerate them in a humorous way, while these images sometimes lack even the slightest resemblance to the original. For example, this mixed race couple https://portraitart.app/static/gallery/couple1_original_1024... is not only turned into two brunettes https://portraitart.app/static/gallery/couple1_caricature_10..., but they also have a different pose, different hairstyle and wear completely different clothes...


Thanks for calling this out. I think the dev cycle for production AI Systems will involve a lot of bug squashing on this kind of thing. These models handle a huge variety of inputs; by definition much more than you could hand code. It’s hard to even identify systematic failures.

Once a failure mode is known - like here - how do you fix it? The foundational problem is minorities are a minority of the training set. Good training data is expensive, so how can we practically boost representation without getting the Gemini fiasco?

One idea: start with the best coloring books. Presumably some human has mastered the art of “accurately representing black or mixed hair without it turning into a caricature”. Find them, and start buying their art. When they have drawings based on photos, use that training pair. When they have line art only, use a style transfer tool (like this one!) to convert to a photo. That gives you another pair for use in the other direction.

Another idea: make it easy for users to flag poor transfers. Add a "whitewashed" option to the standard mod-report flow. Feed that into RLHF. Get better.

Another idea: focus dev cycles on this. Once you have a system to flag problems and address them, use all this fancy AI to identify input photos with black/mixed hair. Manually inspect how the model performs and push feedback into the next training cycle.

One tool for this is https://www.aquariumlearning.com/

The point is - once you identify a corner of feature space where things break, you can fix them.


> Presumably some human has mastered the art of “accurately representing black or mixed hair without it turning into a caricature”

There absolutely are good examples, yes, and I think you're right that the failure to capture it is largely down to volume in the extant datasets, particularly in datasets that provide a direct match for this type of line-art colouring-in style, which is in itself fairly dated.

I agree with all of your recommendations. I think it's definitely a problem that will be solved. The main thing is for people to get used to looking for it.


It's galling for me once pointed out, because I know that a lot of black people have totally different hair, to the degree that it's a separate skill to cut that hair versus white people's hair. I knew someone who would only have her sister cut her hair, because none of the stylists in town could cut it without ruining the shape. They'd use the same tools and techniques as with perfectly straight hair and would destroy it. Hair is something that is intensely personal, and to have it erased and morphed so completely is really broken.


Not to mention the hand on the shoulder of the dude on the right in [2] has six fingers...


It turned my black dog into a beagle!


Feels more likely this is the opposite - the "originals" look like hallucinations. What is the rear suspension attached to on this "BMW"[0]? Presumably it's supposed to be an R60[1]. It magically has a lot more (accurate) parts in the colouring page[2] (bigger front and back fenders, a seat, the engine gets fins, the frame changed shape, the tail has a license plate holder and signals, the front brake is connected to the lever via a cable). At least these six-fingered people[3] deserve each other - shame she moved her wedding ring; Inigo Montoya seeks them.

[0]: https://portraitart.app/static/gallery/motorcycle1_original_...

[1]: https://wallup.net/1960-bmw-r60-classic-bike-motorbike/

[2]: https://portraitart.app/static/gallery/motorcycle1_coloring_...

[3]: https://portraitart.app/static/gallery/wedding1_coloring_pag...


The BMW R60 image appears to be CGI / 3D-rendered. If you zoom in on the original[1], you can see how the rear suspension is attached.

[1]: https://www.pexels.com/photo/orange-and-black-bmw-motorcycle...


Did you see the goose's feet? One is a wine glass stem kind of a thing, or a liquid pour; the other is a pyramid.


The suspension is attached to the rounded bar next to it. You can find more pictures of it using Google.


Good catch! You are really into the details.


I really think this should have GenAI in the title on the webpage or on here, because it's taking the source photo and making best guesses while creating details out of whole cloth. This isn't just "another photo filter", it's instead taking the picture and running with it with sometimes pretty poor results.

For example, I used a random selfie of me on the beach and big rocks in the water, wearing dark sunglasses and mouth closed.

Watercolor: added eyes to my glasses (!). The rocks and water turned green, and it looked like I was standing in a park.

Coloring page: ferns and trees were added that weren't there before.

Both: gave me a toothy smile when my lips were closed.

The examples on the home page must have been extremely handpicked because I'm not seeing anything to that same fidelity.


When I saw this posted late last night, I was pretty excited since I have kids, but then realized it's just determining the basic idea of what it sees in the photo, and then drawing a new image from scratch that has those elements. What I wanted is something that takes the original photo and removes things from the image until we're left with some basic line drawings, so that the people in the coloring book actually look like the people in the photo! Their examples very obviously show how different the resulting image is from the original. If Google allowed you to search by saying "Find me a coloring book image where a dad is sitting behind 2 kids with his arms around them, and the daughter is on the left and the son is on the right" then we'd end up with results almost as good.

I'm no Photoshop expert, but someone else in this discussion pointed out that Photoshop has some filters or other tools that do edge detection and can yield much more accurate results. If the techniques could be replicated in a SaaS then you'd have a much better product. Of course, if it could be done in a desktop Mac app or an iOS app, I might choose that for privacy reasons.


> If Google allowed you to search by saying "Find me a coloring book image where a dad is sitting behind 2 kids with his arms around them, and the daughter is on the left and the son is on the right" then we'd end up with results almost as good.

I've been using Stable Diffusion, and more recently Bing (app) and now GPT-4/Dall-E 3 (ChatGPT app) for that very purpose, with quite good success rate. It's perfect for when my daughter randomly asks me for a coloring page with a dancing vacuum cleaner or such. Dall-E is quite good at this, all you need to do is tack "in the style of a children coloring book" or such to the end of your prompt.
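For what it's worth, that suffix trick is easy to wrap in a tiny helper so you don't retype it every time. Everything here is illustrative - the exact suffix wording and the commented-out call (which assumes the official `openai` Python SDK and an API key) are my own, not anything these apps expose:

```python
def coloring_page_prompt(subject: str) -> str:
    """Append the style hint that steers Dall-E toward printable line art."""
    return (f"{subject}, in the style of a children's coloring book: "
            "black-and-white line art, thick clean outlines, no shading")

# Hypothetical usage via the openai SDK (needs OPENAI_API_KEY set):
# from openai import OpenAI
# client = OpenAI()
# result = client.images.generate(
#     model="dall-e-3",
#     prompt=coloring_page_prompt("a dancing vacuum cleaner"),
#     size="1024x1024",
# )
# print(result.data[0].url)
```

The important part is that the style hint goes at the end of the prompt, after the subject, which matches what worked for me in practice.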


Heh, I used a photo of myself sitting on the couch with my chihuahua looking up at me over my laptop screen and it replaced her with a St Bernard. The hallucination is strong in this one.


> #1 Photo to Art App

According to whom? There’s just that sentence, surrounded by laurel leaves, above five stars. Zero indication of where that award/rating came from. Did you make it up?


This one calls itself "#1 AI coloring page generator" https://colorbliss.art/


So they’re both confidently making stuff up without any sources or justification. Quite on brand for the current generation of generative AI.


oh dang, i forgot to put in my source. Thanks for calling that out - I fixed it for you!

https://colorbliss.art


When I was just starting out I made a similar app (https://github.com/rmdocherty/buck3t) using classic computer vision operations like k-means and Canny edge detection rather than (I assume) ML. It also lets you fill in the returned coloring page in the browser.

It’s simple and works relatively well, but it is prone to fail on high-frequency objects like foliage, where the ML approach appears to a) succeed and b) stylise (which seems to cause its own problems).

The free trial on the cloud function expired, so the web app doesn’t work, and the source JS code is awful, but someone (maybe me) could pretty easily rewrite the cloud function into a Flask server to allow local hosting.
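For anyone curious what that classic pipeline looks like, here's a rough numpy-only sketch of the same idea - naive k-means colour quantization plus a crude gradient-magnitude edge mask standing in for Canny. This is illustrative only; the function names, `k`, and thresholds are mine, not anything from the repo:

```python
import numpy as np

def quantize_colors(img, k=4, iters=10, seed=0):
    """Naive k-means over pixel values: flattens the photo into k flat
    colour regions, which is what gives colouring pages their big patches."""
    rng = np.random.default_rng(seed)
    pixels = img.reshape(-1, img.shape[-1]).astype(float)
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)]
    for _ in range(iters):
        # assign each pixel to its nearest centre, then recompute centres
        dists = np.linalg.norm(pixels[:, None] - centers[None], axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = pixels[labels == j].mean(axis=0)
    return centers[labels].reshape(img.shape)

def edge_mask(gray, thresh=30.0):
    """Crude gradient-magnitude edges (a stand-in for Canny): mark pixels
    where intensity jumps sharply between neighbours."""
    g = gray.astype(float)
    gx = np.abs(np.diff(g, axis=1, prepend=g[:, :1]))
    gy = np.abs(np.diff(g, axis=0, prepend=g[:1]))
    return (gx + gy) > thresh
```

Quantizing first and then edge-detecting the quantized image is what keeps the line count manageable; run the edge mask on the raw photo instead and you get the "janky edge detection" look mentioned elsewhere in this thread.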


Same here - I had a web-based system that let you do coloring pages, paint-by-number, cross-stitch patterns, etc., and it all worked in the web session. While adding an AI element does give improvements over my results, I'm not sure those improvements are worth being prompted for my email, waiting even longer for results, the results not being 100% true to the original content, and not having those results just show on-screen.


Could be neat, but I’m really dissuaded by requiring an email signup to try it out


They don't validate the email, so just register with whatever@portraitart.app and let them spam themselves.


Use a personal domain with a catch-all address that filters to a mailbox, then use a different email address for every site. Combined with random passwords from a password manager, this contains the privacy leakage, limits the blast radius of credential loss, and keeps your mail spam-free.


Or, you know, just don’t normalize indiscriminate email collection as a filter for using a service.


That ship has sailed. Your solo rage boycott won't change the incentive structure of the universe - thus ad blockers, black-hole email addresses, etc. Using the service without providing the value back also creates an incentive to avoid the dark pattern, in a way that's a stronger reinforcement than simply not using it.


I have that and do that, but it still dissuaded me from using this service. I may have a large tinfoil hat, but I still have some standards.


We have to send the user a notification by email, because the processing queue can be long.


A progress bar on the website would do. And if the user closes a browser tab, you can cancel their job.


Or just keep a cookie to know which job to show if the user comes back. You don’t even need a cookie banner for something that simple with no personal information.


I’m happy refreshing for results. I really don’t want to “sign up” until I know I’m happy with the results.


IMO your FAQ needs an entry covering what stored uploaded photos will be used for. Will my uploaded photos be used as training data for your AI models, for example?


First thing I wanted to do was upload a family photo for my son. But of course, I don't want my family photo used for anything...


Agreed. The site should automatically delete the uploaded photos after a set amount of time. There's no reason to store them.


Right now, we keep it so the user can access it later. Users can choose to delete the photo and all associated pictures, as explained in https://portraitart.app/help

We do not use the data for any other purpose.

This is a good suggestion. We will definitely think about how to design the UI to satisfy both.


If you don't want a site to (possibly) store your photos, not uploading them in the first place may be your only _real_ option...


Good suggestion. We do not use the data for training. We will clarify that in the FAQ.


Don't hide it away in the FAQ, that information should be front and center during the upload process.


The coloring page that should have shown my son and daughter sitting on a bench that looked like a bear, waving magic wands… turned my son into a girl with Princess Leia hair, and my daughter into an anthropomorphized bear, no wands.

Fun, but it’s no longer a special memory. There’s so little tying the source to the result.


For significantly less compute you could make pretty good coloring book pages just using different edge detection methods or simply using the stamp filter in Photoshop.


I wouldn't call simple edge detection "pretty good" colouring book generation. You're just gonna get all the edges of complex shapes; it needs to be cartoonified/simplified with large contiguous patches of colour in mind.
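One classic non-ML way to get those large contiguous patches is to aggressively low-pass the photo first, so high-frequency texture (foliage, hair) collapses into flat regions before any edge detector runs. A minimal numpy sketch of the smoothing step - the radius and pass count here are arbitrary illustrative values:

```python
import numpy as np

def box_blur(gray, radius=2, passes=3):
    """Repeated box blurs approximate a Gaussian blur; each pass replaces
    every pixel with the mean of its (2*radius+1)^2 neighbourhood."""
    g = gray.astype(float)
    size = 2 * radius + 1
    for _ in range(passes):
        padded = np.pad(g, radius, mode="edge")  # replicate border pixels
        acc = np.zeros_like(g)
        # sum every shifted copy of the image within the kernel window
        for dy in range(size):
            for dx in range(size):
                acc += padded[dy:dy + g.shape[0], dx:dx + g.shape[1]]
        g = acc / size ** 2
    return g
```

After blurring, simple edge detection produces far fewer stray lines; posterizing the blurred image into a handful of intensity levels before edge detection pushes it further toward the "cartoonified" look.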


I actually used this exact method in the early 2000s to make "coloring books" for local businesses using their own photographs. They would give them out and use them for marketing.


Or, you can just ask OpenAI directly. That way only they have your holiday pictures, instead of adding a third party of unknown trustworthiness to the mix.


Any prompt I can use?


Does anyone know of a site that works without the gimmicky AI? This one always seems to do things like switching which side people are on and changing major details. I just want a coloring page of the actual image I upload, not some AI's interpretation of it with creative license.


You could try out our site [1] – we use an AI model, but not a diffusion model (as OP does), to convert photos to line art. The resulting lines are less "smooth", but it better represents the original image.

[1] https://dreamandcolor.com/


Tangentially related, I want to make a comic strip. I have some questions for people knowledgeable in "prompt engineering":

1. What's the style name for dilbert-like comic strips? Comic? Line art?

2. How do I produce consistent characters throughout the "comic box" and episodes?

3. How do I hint the generation of the comic? E.g. I draw a stick figure of someone sitting, and it generates my character with that pose.

4. How do I "refine" some parts of the generated image. For example I like the generated character, but I want the face to look different direction.


> 1. What's the style name for dilbert-like comic strips? Comic? Line art?

I don't think there's a clear settled name for that specific style, but I'd describe it as an American newspaper strip with low levels of detail and inspiration from "ligne claire" (ligne claire is the style of Tintin etc. - it focuses on strong, clear lines with a lot of details omitted, rarely shading, very rarely hatching; e.g. hair in ligne claire is usually described by its borders - even messy hair is mostly hinted at by tufts sticking out, with the rest flat-shaded).

American newspaper strips by no means always use simple line work, so describing it as that in isolation is unlikely to work, but it has other characteristics (which are by no means universal - this is a regular complication in categorizing this), such as often omitting backgrounds, or just hinting at backgrounds (e.g. with a picture hanging on a wall being all you see of the wall), and often using panels that change little or sometimes not at all during a conversational exchange (basically, strips have a lot of characteristics around streamlining work for artists on a deadline that's often a major stress factor, so lots of simplification, but what requires simplification to save the artist time will vary greatly).

But there is variation there too, and especially very significant variation between the classic single-strip-a-day format and the Sunday format with a bigger panel, where you often end up seeing non-traditional layout or characters "bursting out" of individual frames in various ways where the single strips are often far more regimented. Many comics do both, and will sometimes look very different between the two.

E.g. Calvin and Hobbes strips are fairly regimented, usually four panels of near uniform size, while the Sunday panels were renowned for breaking "rules" with uneven, sometimes overlapping panels, and drawings bursting out of them - the last ever Calvin and Hobbes consists of 3 or 5 or 7 frames depending on how you count - the first frame contains another frame, which contains a drawing, which contains another frame with a drawing. The two top outer frames don't even align with the bottom outer frame... [1]

You almost certainly would need to try multiple things and add additional descriptors to get the specific feel you want.

[1] https://www.washingtonpost.com/arts-entertainment/2020/12/31...


1. You probably won't hit 100% parity just by prompt engineering, especially if you want to do your points 3-4. You would need to train a style LoRA.

2. You can use one of the many IP-Adapter models, but the best result would probably be a combination of IP-Adapter + a subject LoRA.

3. That would be a ControlNet model, probably the sketch one; for poses, the best results would be achieved using the OpenPose or DWPose models.

4. You would do that with inpainting or stable drag.

https://stable-diffusion-art.com is a good starting point.


2. How do I produce consistent characters throughout the "comic box" and episodes?: controlnet "reference"

3. How do I hint the generation of the comic? E.g. I draw a stick figure of someone sitting, and it generates my character with that pose: controlnet "open pose/sketching"

4. How do I "refine" some parts of the generated image. For example I like the generated character, but I want the face to look different direction: inpainting

It's still basically impossible to get perfectly consistent character features across multiple images without a massive amount of effort. These models have been trained to generate single/standalone images.

Maybe some of the Sora-related tech (where it generates frames in parallel, so each frame has the context of every other frame) for coherency can be used to improve the plain image generation stuff as well, i.e. being able to specify a character from multiple views (3/4, back, front, etc.) so that you can refer to that character in your prompt.


>1. What's the style name for dilbert-like comic strips? Comic? Line art?

"style of Scott Adams."


I used a fake email, but when the processing was done, it didn't work and asked me to pay for something.


The tool is not perfect; sometimes it does not fully understand the image content. Also, right now the image is slightly watermarked, but it is perfectly fine to use as a coloring page.

And since personalized coloring pages seem to be very popular, we actually just decided to make them free of watermarks, as our minor contribution to education. We will change the code to make that happen soon (in a few days).


Sure - I don't mean to sound like I'm complaining - I think the tool could be very neat. Would be great to be able to use it easily without any email, at least for the initial trial.


On a positive note. I have fond memories of my wife and son coloring at the dining room table during the pandemic. I look forward to trying this out.


I uploaded a picture of my dog and none of the styles looked like HER. They were just some generic dog in the same pose as mine.


I was wanting to build a tool like this to transform photos into templates for doing stained glass work. This is very cool.


Due to high demand, the processing time is longer than usual. You will need to wait for your result in the email notification.


I wonder if the backlash against "AI" will grow to be enough to keep us from hell.


Right now we hear

Libera te ... me ...

Soon we'll hear

libera te tutemet ex inferis


what language / image libraries did you use to build this? thanks


I think this is just a thin wrapper around GPT-4 or similar? Parents I know have replaced colouring books by just asking GPT-4 , and the colouring pages it generates look exactly like the example outputs here.


Those coloring books are generated by text prompting. Here we let you create coloring pages from your own photos. Related but different.


GPT-4 supports images as input.


Seems like a stupid use of AI


Super annoying... needs an email

You lost your opportunity; you can now go back to anonymity.


> enter your email

Nope


Just fill in pa-contact@photofairy.ai for the address.



