Hacker News new | past | comments | ask | show | jobs | submit login
Seam Carving (wikipedia.org)
123 points by astrea 62 days ago | hide | past | web | favorite | 35 comments



I have an JavaScript implementation of Seam Carving :D https://khuesmann.github.io/ You can play around with it here.


This is great. Rando request, update % completed in title, for running it in background tabs?


I remember seeing a demonstration of this during a keynote at Adobe MAX in Barcelona, in 2008 maybe? I'm fuzzy on the year, but I do remember there being an air of excitement around this feature in particular – it seemed like magic! Since then I've learned how the algorithm works, and it's beautifully simple to the point where I can't believe it wasn't discovered long before.

I particularly like how it can be composed with other algorithms (face detection has been mentioned) to change the energy function and make it even smarter. It's one of those things you can implement in a weekend and have a lot of fun with.

Kudos to everyone involved in developing this!


> Since then I've learned how the algorithm works, and it's beautifully simple to the point where I can't believe it wasn't discovered long before.

This is actually the case for many such discoveries - once you understand them, and furthermore realize the simplicity behind the result, you wonder why it wasn't discovered earlier.

In some cases, it actually was - sometimes many times over by different people. The only reason it didn't catch on is likely due to societal factors, technology (for instance, many algorithms that were later used to great effect for certain work today were actually known much, much earlier - sometimes almost to the dawn of electronic computing, or further back - but the technology to exploit them didn't exist; they were mostly "paper only", and sometimes only in a simplified form suitable for hand calculation, or only done for so many few iterations), or lack of application. Computed tomography is - I think - an example (among many) in this space.

Some discoveries and such sit around for a long time until society, technology, or application need "catches up"; during that time they may be "forgotten". Other times, they are published - but in the wrong place, or time, or whatever - and never reach the critical amount of audience to perpetuate them further.

One example of this that I know of happened with a graphics discovery on one of my childhood 8-bit computers; that discovery, had it been known more widely, might have made a greater impact within the community at that time. But while it was known then, it was at the wrong time and was published in the wrong magazine. Ultimately what doomed it was likely the fact that the transition from "home computers" to "PC revolution" occurred. So the discovery had to wait - and it was "rediscovered" by retro enthusiasts of the platform (many of which were now "old hands" of the same machine - who had been publishers, business owners, programmers, etc - in the machine's heyday), about a decade ago. So it had to wait about 20 years or so.

Some of these discoveries, though, really make you wonder, in a "what if" kind of manner.

The steam engine is one of them; it is such a simple device that a rudimentary form could have been built by the ancient Greeks; all the basic technology was there, at least for a low-pressure system.

But all that was known and created was arguably a toy, a curiosity, the aeolipile (Hero's engine). Even it was based on knowledge that had languished for about 200+ years. For many reasons, it was never developed further, nor was anything involving cylinders or whatnot created (although Hero certainly knew and understood enough to take things further).

What simple things are we currently sitting on - staring us in the face - but yet we do nothing with them, but if we did, might propel our society and culture forward (by anything of decades to potentially much more)? Can we identify them? Why do we not exploit them if we can?

In a way, it's analogous to so-called "black swans" - that seem obvious in retrospective, but are likely impossible to identify prior to that point. These things don't keep me up at night, but they are interesting to ponder.


One example of this that I know of happened with a graphics discovery on one of my childhood 8-bit computers; that discovery, had it been known more widely, might have made a greater impact within the community at that time. But while it was known then, it was at the wrong time and was published in the wrong magazine. Ultimately what doomed it was likely the fact that the transition from "home computers" to "PC revolution" occurred. So the discovery had to wait - and it was "rediscovered" by retro enthusiasts of the platform (many of which were now "old hands" of the same machine - who had been publishers, business owners, programmers, etc - in the machine's heyday), about a decade ago. So it had to wait about 20 years or so.

I'd be interested in learning more about that, if I don't already know about it. Was it something to do with the C64?


I m curious about the 8bit computer discovery you talk about, would you mind sharing it or a link to it ? Edit:typo.


Related post on using a forward-energy measurement: which seam would result in the minimum-energy image:

https://github.com/axu2/improved-seam-carving

https://news.ycombinator.com/item?id=17165889


Seam Carving is available in GIMP with the "Liquid Rescale" plugin.

$ apt install gimp-plugin-registry

GIMP->Layer->Liquid Rescale


Thanks, that was interesting!

I especially liked the linked demonstration video at: https://www.youtube.com/watch?v=vIFCV2spKtg .


This algorithm is relatively old, but there is an interesting question about it: given how well it works (eg on landscape images), one wonders if and how it could be recast in a deep learning / differentiable programming setting.

The problem is, the dynamic programming and/or median extraction techniques that the algorithm uses (IIRC) are not very differentiable.


Given an energy map of an image and a target number of pixels to remove, the algorithm is deterministic. The step to apply ML would be the energy map generation. Saliency map generation with CNNs is an active area of research and it could replace a slew of more specific algorithms you would want to apply (face detection, foreground/background segmentation, edge detection, etc).


It's one of the assignments of the second part of the Princeton Algorithms course on Coursera: https://coursera.cs.princeton.edu/algs4/assignments/seam/spe...


This was also one of the first algorithms I had to implement in university. It's really great to teach about dynamic programming / recursion and how different approaches to a problem can have vastly different outcomes. And the result is great and visual.. unlike most of the early algorithms you learn like sorting algs.


Any thoughts on that course, if you've taken it?


I really enjoyed it. I liked the style, presentation, and the projects. Sedgwick clearly knows his stuff. The textbook is a classic, and the "booksite" is crammed with extra stuff.


Very interesting, the steps for the algo are mentionned as: 1) Load an image. 2) Calculate energy of each pixel. 3) Find seams. 4) Remove seams.

My question here is: once #4 has been completed how do you 'merge' the pixels that are left once the seam pixels have been 'discarded'?


The "seam" is one pixel from each row (or column) connected together either vertically(/horizontally) or diagonally. You're left with two halves that you just connect. for instance

  1|2|x|4 
  1|x|3|4
  1|x|3|4  
  x|2|3|4
becomes

  1|2|4  
  1|3|4  
  1|3|4  
  2|3|4


Your explanation makes much more sense than mine, with illustrations to boot – kudos! :o)


You're right, it's so clear. Many thanks to you sir!


You could of course interpolate the pixels on either side of the seam, but I think most implementations of this technique just drop the seam pixels and leave the other ones untouched. You can think of seam carving more as an intelligent crop than a resize operation, with the difference from a regular crop being you're not just chopping off rectangles on either side of the image, but actually selecting a path of pixels to remove. Key is you're removing one pixel from every column or row (depending on whether you're carving in the horizontal or vertical direction) so you're not ending up with jagged edges of the picture, and the seam always goes side to side or top to bottom, never side to top or side to bottom. If by chance your energy function generates seams that look exactly like rectangles on either side of your picture, it'd be indistinguishable from a regular crop.


FWIW the example image is of Broadway Tower in the UK. I can see it from my house.

https://en.m.wikipedia.org/wiki/Broadway_Tower,_Worcestershi...


I was actually just blessed yesterday with a windows spotlight image of this tower. I then looked up how to get there from London the next time I'm there (My wife and I visit UK multiple times per year), but no easy public transit options unfortunately.


The interesting problem for me here is shrinking quickly by > 1 pixel. I can think of multiple ways to do it, but I'm not sure I know what the fastest way is.


Morphological erosion is one solution https://en.wikipedia.org/wiki/Erosion_(morphology)


If someone is interested in seeing what this would look like in code, I recently made a Rust implementation of this algorithm :

https://github.com/lovasoa/seamcarving/


Caire (https://github.com/esimov/caire) uses seam carving as well. It uses pigo for face detection as well to ensure that it does not resize around the face.


When I last used it, it hung up on bigger images. I've had more success using ImageMagick's liquid rescale


It's neat that "remove the lowest information portions of the image" is a great proxy for "remove the parts a human thinks are least relevant". Though it would probably work poorly for a picture of some people in a forest.


I didn't know that such algorithm existed, this is impressive. It would be awesome if would be able to resize video to scale it to the target display, with face detection to avoid the wired effects.


The challenge with adapting any algorithm that operates on photos to video is ensuring consistency across frames so the effect isn't jarring. That said, perhaps not surprisingly, there has been work in solving this which seems to have started with the paper below.

http://www.faculty.idc.ac.il/arik/SCWeb/vidret/index.html


Related HN post from a few months back: https://news.ycombinator.com/item?id=20285242


Robert Sedgewick's algorithm's course, available on Coursera (or if you're a princeton student I guess) implements this for one of the class assignments.


Flash implementation of Seam Carving: http://rsizr.com


Both the results of seam carving and the underlying dynamic programming implementation are really fascinating


Do any Wordpress plugins, for example, use this to generate the various sized and shaped thumbnails?




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: