Geometrize: Turn Images into Geometric Primitives (geometrize.co.uk)
157 points by adrian_mrd on Dec 20, 2020 | 43 comments



Shoutout to Michael Fogleman's primitive[1] for likely inspiring this work.

[1]: https://github.com/fogleman/primitive


> The algorithm tries to find the single most optimal shape that can be drawn to minimize the error between the target image and the drawn image. It repeats this process, adding one shape at a time. Around 50 to 200 shapes are needed to reach a result that is recognizable yet artistic and abstract.

So one way to think about this is it’s a compression algorithm with aesthetic artifacts at high compression rates.

I wonder if you could use a multi-dimensional Hough-like transform on the primitive of choice to avoid the hill climbing step. You can do this easily with lines, but ellipses and triangles will be more difficult I’m sure. Sounds like a fun Christmas break programming project!
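The greedy loop in the quote above can be sketched roughly like this. It is a toy under stated assumptions, not Geometrize's actual code: grayscale images as nested lists, axis-aligned rectangles as the only primitive, and plain random search standing in for the hill-climbing step. All function names here are made up for illustration.

```python
import math
import random

def rmse(a, b):
    """Root-mean-square error between two equally sized grayscale images."""
    total = sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return math.sqrt(total / (len(a) * len(a[0])))

def draw_rect(img, rect):
    """Return a copy of img with rect = (x0, y0, x1, y1, value) painted on top."""
    x0, y0, x1, y1, value = rect
    out = [row[:] for row in img]
    for y in range(y0, y1):
        for x in range(x0, x1):
            out[y][x] = value
    return out

def random_rect(rng, w, h):
    """Pick a random non-empty rectangle and a random gray value."""
    x0, x1 = sorted(rng.sample(range(w + 1), 2))
    y0, y1 = sorted(rng.sample(range(h + 1), 2))
    return (x0, y0, x1, y1, rng.randrange(256))

def geometrize(target, n_shapes=20, candidates=200, seed=0):
    """Greedily add the best of `candidates` random rectangles, one at a time."""
    rng = random.Random(seed)  # fixed seed, so repeated runs give the same shapes
    h, w = len(target), len(target[0])
    mean = sum(map(sum, target)) // (w * h)
    canvas = [[mean] * w for _ in range(h)]  # start from the average gray
    shapes = []
    for _ in range(n_shapes):
        best, best_err = None, rmse(target, canvas)
        for _ in range(candidates):
            rect = random_rect(rng, w, h)
            err = rmse(target, draw_rect(canvas, rect))
            if err < best_err:
                best, best_err = rect, err
        if best is None:
            break  # no candidate reduced the error, so stop early
        canvas = draw_rect(canvas, best)
        shapes.append(best)
    return shapes, canvas
```

Since a shape is only kept when it lowers the error, the error never increases as shapes are added, which is the "compression with artifacts" behaviour: fewer shapes, coarser picture.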


This uses hill climbing to minimize RMS error. I wonder if gradient descent with some momentum would be faster?
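For reference, the two things being compared can be written down as follows. The notation is my own assumption, not the project's: t_i and c_i are the target and current pixel values over N pixels, θ holds the shape parameters, η is a step size and μ a momentum coefficient.

```latex
E(\theta) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \bigl(t_i - c_i(\theta)\bigr)^2}

v_{k+1} = \mu\, v_k - \eta\, \nabla_\theta E(\theta_k), \qquad
\theta_{k+1} = \theta_k + v_{k+1}
```

One likely catch: with hard-edged shapes, c_i(θ) jumps whenever an edge crosses a pixel, so the gradient is zero or undefined almost everywhere. Gradient descent would presumably need a soft (anti-aliased or otherwise differentiable) rasterizer, whereas hill climbing only ever needs to evaluate E.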

Thanks for sharing!


Looks like it’s a recursive minimisation. I wonder what an optimal minimisation would produce.


Indeed, you seem to be correct. It gets a call-out down at the bottom of the page.


Just purchased, thanks.


Honest question: If you take a copyrighted photograph and convert it to geometric shapes, how “blocky” does the picture have to be before it’s no longer a copyrighted work of the original artist?

I mean, if it’s almost identical with a huge number of shapes, I can understand that it’s reasonable to look at it as the same piece of work, but on the other hand if you make it blocky enough that it looks like an interpretation of the original image as a new work in its own right, what ownership does the original author have? None, some, complete?


While copyright law's interaction with new technologies can be hard to predict, I strongly suspect that the level of "blocky-ness" doesn't matter. If an image was created from another image created by a different artist, the new image is a derivative work[1].

In general[2], as long as the new work is transformative[3] relative to the original, a new copyright is created, although royalties might need to be paid to the author of the original work.

[1] https://en.wikipedia.org/wiki/Derivative_work

[2] see a real lawyer for actual legal advice

[3] https://en.wikipedia.org/wiki/Transformation_%28law%29


Your question reminded me of this[1] horrifying story, which deals with that exact issue

[1]: https://waxy.org/2011/06/kind_of_screwed/


And, on the other hand, there is Richard Prince.

https://petapixel.com/2013/04/26/appeals-court-overturns-pre...


Unfortunately, in practice, because of the way the legal system works, your question doesn't have a straight answer.

Even if you follow the law and get proper legal advice, you can still be sued, and what happens afterwards will depend on the specific situation, how much money you have, how competent your attorneys are and whether you decide to settle or go all the way to trial.

Hence, for it to be worth doing, you probably need to make a lot of money with your derivative work, and have very good counsel.


I would imagine there's probably some precedent for this. You've been able to totally distort pictures beyond recognition with Photoshop filters for a long time. It seems like something that's probably come up before.


How would one most easily construct a pipeline or script to either split apart an mp4 into images, process them all with this tool, and then recombine into a video, or else modify this tool to take video as input?

Anyone have any ideas / suggestions for a path to try that might require the least installation and scripting?

And can anyone tell whether it's deterministic or not as far as whether the same input generates the same output each time?


A filter that works on a still image doesn't necessarily work well on motion graphics, because you need a certain stability from frame to frame.


I've started work down this path for a project I am working on. I downloaded some footage of satellites flying over the Earth from NASA, used FFmpeg to extract each frame of the video, and wrote a Ruby script which runs over every image in a folder and calls primitive (https://github.com/fogleman/primitive) on each image.

The final step is to recombine it into a single video, which I haven't done yet because I haven't yet implemented video playback in the program I'm writing. In the meantime I'm just loading each image on the fly and playing it as a sprite animation. This results in low FPS but it gets the gist across.

https://www.youtube.com/watch?v=SBFGqxWYyvI&feature=youtu.be

I'm thinking it'd still look pretty choppy even if it was a 60 FPS video though. I'm not yet sure how I'm going to smooth it out. I can share the FFMPEG command I used / script I made with you if you want, but I'd have to find it first (this was maybe a year ago). I'm sure the script is garbage.

I do not believe the Go program that I am using is deterministic. You can run the same command on an image twice and it will produce a different result. It might be possible to make it deterministic, but it's not something I've wanted so I haven't looked into it.


I actually know how to do the splitting and recombining part pretty well using DaVinci Resolve (kind of like a free alternative to Adobe Premiere / Final Cut Pro) but I was wondering what the easiest tool would be to process.

Do you have to just call into the python script process.py primitive? Or do you call into that main.go? It's just this simple IO that I feel like would be the hard part.

Your video looks really good!

If you wanna take 2 min to see if you can find it, that might be super helpful for a VR video project I'm doing!


Ffmpeg can: https://trac.ffmpeg.org/wiki/Create%20a%20thumbnail%20image%...

https://superuser.com/a/624573

You’ll just need to figure out the middle bit, shouldn’t be too hard though.
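A minimal sketch of the whole pipeline, middle bit included, assuming `ffmpeg` and fogleman's `primitive` binary are both on the PATH. The helper names, directory layout, 30 fps default and shape count are placeholders of mine; the `-i`, `-o`, `-n` and `-m` flags are primitive's documented options. Pass `dry_run=True` to only build the commands without running anything.

```python
import subprocess
from pathlib import Path

def extract_cmd(video, frames_dir):
    """ffmpeg command that dumps every frame as a numbered PNG."""
    return ["ffmpeg", "-i", str(video), str(Path(frames_dir) / "%05d.png")]

def primitive_cmd(src, dst, n_shapes=100, mode=1):
    """primitive command: -n is the number of shapes, -m the shape mode (1 = triangles)."""
    return ["primitive", "-i", str(src), "-o", str(dst),
            "-n", str(n_shapes), "-m", str(mode)]

def assemble_cmd(frames_dir, output, fps=30):
    """ffmpeg command that stitches the numbered PNGs back into an H.264 video."""
    return ["ffmpeg", "-framerate", str(fps),
            "-i", str(Path(frames_dir) / "%05d.png"),
            "-c:v", "libx264", "-pix_fmt", "yuv420p", str(output)]

def run_pipeline(video, workdir, output, fps=30, dry_run=False):
    """Split, process and recombine; returns the list of commands it built."""
    raw, done = Path(workdir) / "raw", Path(workdir) / "done"
    commands = [extract_cmd(video, raw)]
    if not dry_run:
        raw.mkdir(parents=True, exist_ok=True)
        done.mkdir(parents=True, exist_ok=True)
        subprocess.run(commands[0], check=True)
    # In a dry run only frames that already exist on disk are picked up.
    frames = sorted(raw.glob("*.png")) if raw.is_dir() else []
    for frame in frames:
        cmd = primitive_cmd(frame, done / frame.name)
        commands.append(cmd)
        if not dry_run:
            subprocess.run(cmd, check=True)
    commands.append(assemble_cmd(done, output, fps))
    if not dry_run:
        subprocess.run(commands[-1], check=True)
    return commands
```

Something like `run_pipeline("in.mp4", "work", "out.mp4")` would then do the full round trip. Set `fps` to match the source video, and expect flicker, since each frame is fitted independently.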


Regarding actual input and output, I assume that’s relatively easy with ffmpeg.

As for results, the challenge might be keeping the illusion of movement from frame to frame.

It’s likely that small changes to the input image result in very different positioning of the shapes. Watched as a movie, this wouldn’t look like fluid motion.


With the way this optimizes, it would produce crap, even using the same seed from frame to frame. One very interesting prospect is to take the motion vectors out of the video and apply them to the overlapping primitives, in their shape-data form, as a first step when fitting the shapes; that is, to do the greedy hill climb in tandem with MPEG motion decoding. Maybe it wouldn't be too hard, but it's not something I've got the motivation to try out and tweak right now.


And more on this: maybe seed the next frame to favor the motion-vector mutation of each shape before this library's random seed mutation.
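A rough sketch of that idea, with plenty of assumptions: shapes are axis-aligned rectangles `(x0, y0, x1, y1, value)` on a black background, the per-shape motion vectors are simply handed in by the caller (pulling them out of the codec is not shown), and the refinement is a one-pixel-nudge hill climb. None of this is the library's API; the names are made up.

```python
import random

def shift_shape(shape, motion):
    """Move a rectangle by a motion vector (dx, dy)."""
    x0, y0, x1, y1, value = shape
    dx, dy = motion
    return (x0 + dx, y0 + dy, x1 + dx, y1 + dy, value)

def render(shapes, w, h, background=0):
    """Paint the rectangles in order onto a flat background, clipped to the frame."""
    img = [[background] * w for _ in range(h)]
    for x0, y0, x1, y1, value in shapes:
        for y in range(max(y0, 0), min(y1, h)):
            for x in range(max(x0, 0), min(x1, w)):
                img[y][x] = value
    return img

def sq_error(a, b):
    """Sum of squared pixel differences between two images."""
    return sum((pa - pb) ** 2 for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

def mutate(shape, rng):
    """Nudge one randomly chosen coordinate by one pixel."""
    s = list(shape)
    s[rng.randrange(4)] += rng.choice((-1, 1))
    return tuple(s)

def refit(prev_shapes, motions, target, steps=200, seed=0):
    """Warm-start from the previous frame's shapes, moved by their motion vectors."""
    rng = random.Random(seed)
    h, w = len(target), len(target[0])
    shapes = [shift_shape(s, m) for s, m in zip(prev_shapes, motions)]
    best = sq_error(render(shapes, w, h), target)
    for _ in range(steps):
        i = rng.randrange(len(shapes))
        trial = shapes[:i] + [mutate(shapes[i], rng)] + shapes[i + 1:]
        err = sq_error(render(trial, w, h), target)
        if err < best:  # keep only mutations that reduce the error
            shapes, best = trial, err
    return shapes, best
```

Because each shape keeps its identity and only drifts a little between frames, the result should flicker much less than refitting every frame from scratch, at the cost of shapes possibly going stale after a scene cut.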


The most interesting one I saw was this: https://www.geometrize.co.uk/assets/images/examples/building.... It is a building in the full res, but when the algorithm is applied it looks like a mountain. Every other picture I saw was, more or less, as recognizable as the original.


Where have I seen this before? Oh yes https://alteredqualia.com/visualization/evolve/ (from 2008)

(From the person that wrote the webgl renderer for three.js)

Press start and you can see the algorithm at work, and how each shape is added randomly, score is computed, then the process repeats again


Interesting also to see here how computing power (and algos?) have improved tremendously. I assume the benchmarks are from ~ 2008. Getting to the 94% fitness levels takes 5 minutes on my phone - listed as >120 minutes on presumably a beefy workstation on this page.


This, but in 3D, could be incredibly useful for modelling.


You just rang a faint bell in my mind... There is such a project IIRC... if I can remember it I'll post a link...


I’ve been meaning to give something like this a shot to assist in lazy loading large images. Generate a low complexity svg that is hopefully pretty small and just fade in the full image once loaded.
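The SVG side of that is small. Here is a sketch, assuming the fitted shapes come out as polygons with a fill colour and opacity; the tuple layout, function names and gray default background are my own choices, not the output format of any particular tool.

```python
from urllib.parse import quote

def shapes_to_svg(shapes, width, height, background="#808080"):
    """Build a compact SVG from (points, color, opacity) polygon tuples."""
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 {width} {height}">',
        f'<rect width="{width}" height="{height}" fill="{background}"/>',
    ]
    for points, color, opacity in shapes:
        pts = " ".join(f"{x},{y}" for x, y in points)
        parts.append(f'<polygon points="{pts}" fill="{color}" fill-opacity="{opacity}"/>')
    parts.append("</svg>")
    return "".join(parts)

def to_data_uri(svg):
    """Percent-encode the SVG so it can be inlined as an image source."""
    return "data:image/svg+xml," + quote(svg)
```

The data URI can go straight into the `src` of a placeholder `img`, with the full image swapped in (and faded) once it has loaded. Size grows roughly linearly with the number of shapes, so the shape count is the knob to tune.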


I tried this recently:

https://gist.github.com/Munawwar/6ac51e33e901d89750ee61319d0...

The challenge is making it look pretty while keeping the SVG output small and not taking forever to generate (its perf gets exponentially slow). Took some time to tweak it to the settings I used.

It looks nicer than blurhash. However, blurhash is way smaller and might be a more practical alternative.


Oh that’s very nice, I’ll give it a shot.

I did try blurhash but it was just a bit too abstract? For large images. Think desktop full width.


Also check out https://blurha.sh/ if you want a minimal placeholder image for the web


You should check out interlaced PNG and friends


There was a challenge on Stack Overflow to encode an image in a tweet (140 characters at the time):

https://stackoverflow.com/questions/891643/twitter-image-enc...

Some of the solutions took a similar approach that used geometric primitives.


Best example of this algorithm is Frou Frou’s “The Dumbing Down of Love” music video, from back in 2002!

Crap quality: https://youtu.be/N4IhGlNTJEg

I’ve always puzzled over how they morphed the shapes...


Better version from the director’s vimeo: https://vimeo.com/44001799


Would be a good use case for Chat/conferencing avatars.

If this can be done in real time (even with a delay), fade transitions could be a nice alternative to video too.


Very cool project, I would love to use it as a way to teach computer graphics to students. Is there more explanation somewhere besides the brief description?


How would you use this to teach computer graphics to students?


One concept could be how images are rendered from primitive shapes, shown by stepping through this demo.


Nothing is actually rendered this way. Polygonal manifolds aren't overlapped to create shading in 3D space. This isn't how 3D, compression, or image manipulation works, it is purely artistic.


I see, well I'm going to research this more. Even if it is not a literal explanation, conceptually this demo can serve as a learning aid.


How?


Lovely effect. Thank you for sharing.


Is there a library that goes with this?


Open the page, read the first 25 words, and come back when you still have questions.



