Hi HN, OP here. I've tinkered with this problem a few times over the years, but only now am I happy with the results. This time I wrote it in Go and implemented things a bit differently, which helped performance quite a bit.
I plan on implementing SVG output pretty soon. It won't be hard; I just haven't done it yet.
For the Twitter bot, I'm planning on adding another mode where it posts a GIF showing the iterative progression of drawing an image.
Let me know if you have any questions or if you have ideas for other features!
There is lots of work on "artistic rendering" like yours, from the early 1990s to the present. It started with work by Paul Haeberli at SGI [1]. The first chapter in "Image and Video-Based Artistic Stylisation", edited by Paul Rosin and John Collomosse [2] describes many techniques you might be interested in. Have a look; you will like what you see.
A few people have asked that, so maybe. Another thought was you could tweet an image to the Twitter bot and have it tweet back the result, but then your image is out there for the world to see, so maybe that's not as good.
If you create a frontend, it may be a good idea to add a button for buying a print (which you could do through one of the many online services available). Some of these look very nice and I'm considering printing something like this on a canvas to hang on the wall.
I was about to create such a website a couple of years ago, but isn't this too computationally expensive to run as a free service? It's going to make people wait a lot if the service hits HN or something like that. Or maybe my code was just unoptimized and the images can be generated much faster after all. Anyway, it would be great to have such a site.
EDIT: Now that I've looked at your implementation more closely, it's very different and can indeed be made a lot faster. What I (and others before me) did was to take a fixed set of polygons and move, rotate, and recolor them to find the best arrangement. In your case you instead add shapes incrementally to lower the error.
I assume because "primitive" generally has negative connotations. (Don't try to make rational arguments about how that doesn't apply here or how obvious it is that it's not the "bad" sort of primitive. We're talking about marketing here. Rational arguments are a category error.) I'm not creative enough to come up with a positive name, just some more neutral ideas that you would have the opportunity to shape, like Shapify (taken by other things though), something with "simplify", or something with "fun" maybe ("fun up your photos!").
And to me, Shapify is just another generic "web 2.0" name. Lost in the sea of a million other "ify" domains. The point here is that they should at least claim the domain name for their product, regardless if that is going to be the actual public facing product.
This is another one of those "your tone suggests you disagree with me, but you're just repeating my point back at me" posts. I told you "Shapify" was a neutral word that you can pour your own connotations into. A lot of brands are. What's a "Cisco"? What's a "Nike" in a country where most people won't get the classical allusion? What's a "Coke"?
Those brands aren't generic "ify" names that sound like they came from a domain name generator. Primitive sounds like a pretty good company name to me. There is already a skateboarding company of the same name that does well.
This is really impressive. I'll bet if you printed these out as high quality giclees and didn't tell anyone how you did it you could sell them for thousands of dollars in fine art galleries.
I feel compelled to note that "giclée" is a fancy word for "inkjet printed". It has connotations of higher quality, but it's not a distinct process. See https://en.wikipedia.org/wiki/Gicl%C3%A9e
That's not quite true. It's a particularly high quality of ink-jet print. You can easily tell the difference between a giclee and the output of an HP OfficeJet.
From Wikipedia (and professional photographers I've spoken to mirror this sentiment), “since [giclée] is an unregulated word it has no associated warranty of quality.” It's like “premium”.
The word "giclee" does not even appear in the description, but "inkjet" does. It's still an ordinary injet printer. It's bigger and more accurate than most. I've never seen them put in rooms so small that they'd "fill the room", I've usually seen them in rooms with several printers.
I dunno, I think a lot of the hoity-toity types who are willing to overpay for art would be put off by knowing that it was done by computer. For them, the value is in the scarcity more than the aesthetics: they want to have something that their friends don't (and can't!) have. So if it were me, I would choose not to reveal that anyone who knows how to use github and an inkjet printer can reproduce the results.
That's not necessarily true. There are many examples of fine art that is made by computer and easily replicated by anyone. However, the prints themselves are limited in quantity (or there may be just one official print).
This is how photography nearly always worked. I think there's some early photo tech that did only result in one image, but all of film photography produced 'prints'.
Although I remember growing up with one print that deliberately had a fold in it, made by the artist as part of the art.
"I dunno, I think a lot of the hoity-toity types who are willing to overpay for art would be put off by knowing that it was done by computer. For them, the value is in the scarcity more than the aesthetics:"
There is a portion of the art market that caters for interior design. A client might say something like, "I want a dozen in this colour, style with an ocean breeze feel." As soul destroying as this may be to an artist, some clients want their art by the numbers, colours and feel.
I have tinkered with this myself. Have you tried a fitness function that doesn't just compare the source pixels, but also compares the source and approximation after each has been run through various convolution kernels, adding the result to the overall score? It takes a little longer, but it lets you fine-tune things like giving a higher score to images whose edges line up.
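A rough Go sketch of what I mean, with grayscale images as flat float slices; the Sobel kernel and the 0.5 edge weight here are just placeholder choices to tune, not anything from the OP's code:

```go
package main

import (
	"fmt"
	"math"
)

// convolve3x3 applies a 3x3 kernel to a grayscale image stored
// row-major in img (width w, height h), leaving the border at zero.
func convolve3x3(img []float64, w, h int, k [9]float64) []float64 {
	out := make([]float64, len(img))
	for y := 1; y < h-1; y++ {
		for x := 1; x < w-1; x++ {
			var s float64
			for dy := -1; dy <= 1; dy++ {
				for dx := -1; dx <= 1; dx++ {
					s += img[(y+dy)*w+(x+dx)] * k[(dy+1)*3+(dx+1)]
				}
			}
			out[y*w+x] = s
		}
	}
	return out
}

// rmse is the plain per-pixel difference score.
func rmse(a, b []float64) float64 {
	var s float64
	for i := range a {
		d := a[i] - b[i]
		s += d * d
	}
	return math.Sqrt(s / float64(len(a)))
}

// score adds the error between Sobel-filtered versions of both images
// to the raw pixel error, so candidates whose edges line up are rewarded.
func score(src, approx []float64, w, h int) float64 {
	sobelX := [9]float64{-1, 0, 1, -2, 0, 2, -1, 0, 1}
	const edgeWeight = 0.5 // arbitrary weight; tune to taste
	return rmse(src, approx) +
		edgeWeight*rmse(convolve3x3(src, w, h, sobelX), convolve3x3(approx, w, h, sobelX))
}

func main() {
	src := make([]float64, 16)
	approx := make([]float64, 16)
	approx[5] = 1 // one differing pixel
	fmt.Println(score(src, approx, 4, 4))
}
```

In practice you'd want both the X and Y Sobel kernels (and probably a blur too); each extra kernel is just another weighted term in the sum.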
... Then just come up with some fake name and portfolio website because I have a feeling that people will want to pay less for automatically generated art even if in a blind test they would like it more than a "real" one ;)
The code is very readable! Especially because it's not using a lot of external libraries. Including the twitter bot is a very nice touch. Awesome!
One of the key features of golang is concurrency. Your code is single-threaded, even though hill-climbing and annealing (and also all kinds of image operations) could be highly parallelized. Maybe that's something rewarding to look into at some point?
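For example, independent restarts of a hill climb are embarrassingly parallel. A sketch of fanning them out with goroutines; the toy 1-D energy function is a stand-in for scoring a candidate shape against the target image:

```go
package main

import (
	"fmt"
	"math/rand"
)

// parallelClimb runs several independent hill climbs concurrently,
// each from its own random start with its own seeded RNG, and keeps
// the best result. The energy function here is a toy placeholder.
func parallelClimb(workers, steps int, energy func(float64) float64) float64 {
	results := make(chan float64, workers)
	for w := 0; w < workers; w++ {
		go func(seed int64) {
			rng := rand.New(rand.NewSource(seed))
			x := rng.Float64() * 10 // random starting point in [0, 10)
			for i := 0; i < steps; i++ {
				cand := x + rng.NormFloat64() // small random mutation
				if energy(cand) < energy(x) { // accept only improvements
					x = cand
				}
			}
			results <- x
		}(int64(w))
	}
	best := <-results
	for w := 1; w < workers; w++ {
		if x := <-results; energy(x) < energy(best) {
			best = x
		}
	}
	return best
}

func main() {
	// toy problem: minimize (x-3)^2
	energy := func(x float64) float64 { return (x - 3) * (x - 3) }
	fmt.Printf("best ≈ %.2f\n", parallelClimb(4, 500, energy))
}
```

Since each worker explores a different random trajectory, this also doubles as a cheap random-restart strategy, not just a speedup.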
If you ever find the time, can you outline your strategy for your choosing a convergence algorithm? How did your experiments go for choosing the parameters (you called them maxAge, maxTemp, etc.) for those? Why did you currently end up choosing hill climbing over annealing, here: https://github.com/fogleman/primitive/blob/master/primitive/...
I like it. I noticed that the oval brushes were almost always of lower quality (in the sense of doing a worse job recreating the original). I think it's because you only take axis-aligned ellipses and not arbitrarily rotated ones, which might be fun to try too.
Actually ellipses perform pretty well. I've analyzed this a good bit... http://i.imgur.com/82Aksg0.png (lower is better, this shows average score over 100 different input images)
As an aside, it would be nice if you could cite the original sources of the images posted by the Twitter bot; it would be nice to compare, and it's probably safest legally too.
Thanks. This is wonderful. I just put go on my mac and got this running in ~5 minutes. I am going to do a few images tonight and will share the google photo album as a comment to this comment when I get a chance tonight.
As for other ideas: morphing from one photo to the next. Making changes (morphing shapes, changing colors, adding/removing shapes) would look interesting.
So I wondered how this algorithm behaves when applied to its own output. I know there won't be a fixed point because of the random component, but here's the program applied 11 times to itself, starting from a red dot:
I'm extremely short-sighted. After looking at all the pictures, and thinking "Okay, that's pretty nifty," it occurred to me to try taking off my glasses and viewing them again. The result is amazing :) All of the individual shapes blur into invisibility, leaving only a very recognisable if fuzzy version of the original image.
OP is from Scotland, same as my wife, and I can confirm 'short-sighted' is the correct usage in Britain for what most Americans call 'near sighted'. The term does not have the negative connotation in Europe that it does in America.
Don't feel too bad. I had to read it twice to get the meaning, myself. The fact that it's used in a literal sense outside of the U.S. isn't something that I would have considered. I suppose that my viewing of foreign media has had a dearth of references to optometry.
Similarly, the mosaic effect that they use on TV to hide faces sometimes leaves them still recognizable if I take my glasses off, even if they're completely unrecognizable when seen clearly.
I wonder if this effect works for people who aren't short-sighted? Like, maybe short-sighted peoples' visual systems develop to use more focus-invariant features.
Yes, but I think the GP was talking about learning, forming new (different) brain connections, because most near-sighted people navigate the world with blurry vision for at least a few minutes daily, sometimes more.
(personally, since my eyesight without glasses or contacts is pretty bad, no more than 15 mins until I get annoyed with the difficulty spotting small objects and go for my glasses :p gonna laser those puppies one day, maybe. it's amazing living in an age where they can actually fix this sort of thing, shame if I didn't take advantage of it)
You can get a similar effect by pressing Ctrl "-" a couple of times in your browser, to shrink the images. The result is that they look much sharper. But of course, this spoils the original effect.
This may be a bit of a "visual Turing Test" actually. Humans are easily capable of seeing the picture "through" the shapes, but image recognition algorithms would probably find this kind of input quite hard.
I'd be interested what the latest Neural Networks have to say about input like this.
Those pics remind me very much of the old game Out of this World. I wonder if someone can modify this to encode video. It would make for an interesting effect.
If you like it, try lowering the number of shapes primitive uses. I added an example video with n=15; it works really well for simple forms. Almost like a low-poly rendering.
A naive (probably non-realtime) version for video should be pretty easy: just run the algorithm frame-by-frame. The biggest problem would be the lack of temporal stability ("jitter"), but I imagine using the previous frame as the base state for the next frame would help.
Real-time graphics rendering is another story altogether, I don't see any trivial ways of applying this sort of thing for typical graphics pipeline.
Back then lots of people posted their implementations (been there, done that... an easy one with triangles in pure C). It's a fun exercise you can show around, and you get very interesting output from it.
Very nice. I like the algorithm choice and the result. I spent a bit of time exploring a similar process, and used it to generate multiple frames and animate them with a nearest neighbor algo. Examples and writeup is here: http://max.io/articles/harissa/ and the code: https://github.com/binarymax/harissa
Many of you may know that you can squint your eyes to make the images look clearer. Can someone explain why some of these images become a lot more realistic, e.g. the lion [1] and the girl [2], while there is not much improvement for others, e.g. the barn [3] and the small aircraft [4] (or at least that is what I think it is)?
Possibly naive question: can this be used for image compression? Use this for a rough approximation to the original colours, and encode the diff to the actual image using some efficient algorithm (lossless or not)?
You could store the shapes for a 50-shape image in about 350 bytes. If you fix the axis and pick one shape, you can get it down to 300. (That's for 256x256 images: each shape needs two coordinate pairs, or 4 bytes, plus 1 byte for rotation, 15 bits for RGB color (5 bits per channel), and 2 bits for the choice between ellipse, rectangle, and triangle.)
It would definitely be lossy compression, but yeah. If you are just talking about 50-200 polygon coordinates... It would be interesting to try and use these types of images in demoscene or size based game competitions.
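The bit-budget arithmetic above is easy to check; this assumes the layout described (256x256 canvas, two coordinate pairs, one rotation byte, 15-bit colour, 2-bit shape type):

```go
package main

import "fmt"

// bytesNeeded rounds a bit count up to whole bytes.
func bytesNeeded(bits int) int { return (bits + 7) / 8 }

func main() {
	const (
		coordBits = 32 // two (x, y) pairs at 8 bits each on a 256x256 canvas
		rotBits   = 8  // one byte of rotation
		colorBits = 15 // RGB at 5 bits per channel
		typeBits  = 2  // ellipse, rectangle, or triangle
		shapes    = 50
	)
	full := bytesNeeded((coordBits + rotBits + colorBits + typeBits) * shapes)
	fixed := bytesNeeded((coordBits + colorBits) * shapes) // one shape type, axis-aligned
	fmt.Printf("full: %d bytes, fixed shape: %d bytes\n", full, fixed)
}
```

That works out to 357 bytes for the full encoding and 294 for the fixed-shape, axis-aligned one, which matches the rough 350/300 figures.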
Impressive work. I played around with the same idea before while learning a bit of Python, and found that (in my project at least) a breeding strategy gave better results than a hill-climbing algorithm.
Shameless plug: similar artistic effects can be achieved with ImageTracer, however the algorithm is different. It's a Public Domain tracing and vectorizing library, outputting SVG.
> Hill Climbing or Simulated Annealing for optimization (hill climbing multiple random shapes is nearly as good as annealing and faster)
Great to see the author is using this instead of the GA / evolutionary algorithm code it was inspired by.
You can see it in the results too; they are gorgeous! Annealing (and apparently hill climbing too) just converges faster, which makes it easier to tune the hyperparameters.
GAs are an attractive idea because, well, if you look around, evolution gives rise to such clever solutions and beautiful shapes. But people tend to forget that natural evolution took billions of years, especially (sort of) for the big leaps in complexity. Annealing is almost always a better solution, with the possible exception of problem spaces that are biologically inspired, where you can come up with a sensible crossover operation. Without a crossover operation, or with one that lacks the right properties, a GA is basically equivalent to annealing (except in parallel), but much harder to tweak for fast convergence.
What I find cool about this is that I ran it on my profile pic, and when Facebook shrinks it down to the micro size image, you can't tell the difference between the primitive version and the original.
I can only imagine that a video that's had each frame run through this would come out looking incredible. You could probably get away with each frame being a pretty low-quality render, too.
Probably not for real-time, but not impossible as a filter. I would imagine it'll be more like Prisma, take your snap and let it chug for a few seconds to give you a pretty picture.
Yep, I later read about the technique used by the OP, and it is quite different from programs similar to this one, including mine, where a fixed set of shapes is evolved to find an optimal arrangement. That is a complex optimization problem that takes a lot of time. In the OP's implementation, old shapes are never rearranged; new ones are just added to decrease the error. While the visual effect is still very interesting, you need more shapes this way to approximate the same level of detail, and it is in some ways less interesting IMHO, at least conceptually, compared to "arrange N items to get the best result", which looks like a quasi-human task.
If you were smart, you'd snag primitivepics.com, stat. Forget Twitter; establish your brand. I wouldn't even open source it until you've done so. It's unique and non-trivial.