Hacker News new | past | comments | ask | show | jobs | submit login
Improved seam carving with forward energy (avikdas.com)
93 points by akdas on July 29, 2019 | hide | past | favorite | 21 comments

Intuitively I would guess that this could profit from using a more sophisticated color difference function [1] which accounts for human color perception better than simply RGB component differences. And even if one uses RGB components, making sure that gamma correction is handled properly might make a noticeable difference.

[1] https://en.wikipedia.org/wiki/Color_difference

You're absolutely right that RGB differences are not very meaningful to humans. I chose to stick with the easiest approach for educational purposes, but I'd love to hear if anyone tries out a different color difference function.

Note that the authors tried other types of energy functions in their original papers, ones that didn't simply use color differences. No one energy function seemed to be perfect across a variety of images.

Completely agree with this. In my experience, LAB space really makes an observable difference in perception-based applications, as compared to the mostly arbitrary RGB cube space.

RGB values in images are de-facto sRGB, which is reasonably close to human perception (sRGB gamma curve is close to Munsell lightness curve).

In this case applying gamma correction to compute the color difference in linear light would make things worse, as it'd model physical properties of the light, rather than human perception.

After reading the first article [1], I was inspired to implement the algo myself. Great to see a follow up! Would be cool if there was a third that explained what's going on in this [2]... :D (was mentioned in the comments here [3])

[1] https://avikdas.com/2019/05/14/real-world-dynamic-programmin...

[2] https://web.archive.org/web/20110707030836/http://vmcl.xjtu....

[3] https://news.ycombinator.com/item?id=16269998

I must have missed the real-time adaptation of seam carving, so thanks for bringing it to my attention! I'll certainly take a look at it.

As far as I can tell, this method does just calculate the seams once, but it makes sure they don't overlap. Then you can go through the seams in order and remove them.

So I haven't dug too much into seam carving but have a quick question, do any of these algorithms blend the pixels of a removed seam into the surrounding pixels after removal? Maybe as high as a 50/50 split, or lower like a 90/10 split. Does this end up with a better image or a blurry mess?

You can do something like that. For example, enblend does multi-scale "blending", where it builds an image pyramid and runs seam carving on each level before re-combining.

Seam carving was one of my favorite CS projects I did while in school. This was a cool read, thanks

Never did it in school, but when I came across it as an assignment from a Brown CS class I decided to give it a go on my own, since I'd done Racket stuff at school and new my way around it decently enough. I agree wholeheartedly -- it is just insanely cool.

Literally everything about it. The concept, visualizing intermediate steps by producing a picture that shows the energy of each pixel, then the CS with finding the path, then seeing the image it produces. Just an incredibly interesting (and very modern issue given different screen sizes), a clever algorithm, challenging to implement, visually interesting results... It literally has it all.

100% agreed! I came at it from the angle of someone asking me about real-world applications of dynamic programming. DP is usually covered only in an academic setting, so there's one more plus for seam carving.

dynamic programming shows up a lot in image processing. another cool application is in the image quilting algorithm, where you're trying to synthesize a large texture from a smaller one by stitching together small patches. to reduce the visible seams you can use dynamic programming to find the "best" path through the overlap of two adjacent patches.

here's a link to the paper, which has some cool results to look at: https://people.eecs.berkeley.edu/~efros/research/quilting/qu...

Do the seams need to be continuous? Intuitively it feels like the seam being carved out doesn't need to be made from pixels touching each other. What if the best seam has a slope of about 1/3 (i.e 1 vertical pixel for ever 3 horizontal pixels)?

In the first paper, the authors discuss what happens if you take the lowest energy pixel from each seam. The problem is that visual coherence is not maintained, because there is no connection between the removed pixel.

What you're talking about--expanding the number of pixels the seam can move from row to row--is certainly possible. This will require more computational resources, as each seam can continue from five different seams (or more if you choose to widen your slope parameter). And, I'm not sure you'd get enough visual coherence from these types of seams.

That said, the best way for to know for certain to is to try it out!

Follow-up from the related article last month (which was discussed in https://news.ycombinator.com/item?id=20285242).

For those who have no idea what it's all about: https://www.youtube.com/watch?v=vIFCV2spKtg

Question about the algorithm:

Is the algorithm run from scratch each time a seam is removed? I.e. energy function computed again on the resized image, all seams recomputed, then only the lowest energy seam is removed.. and repeat.

Or are all the seams from the first calculation used for resizing?

Good question. I talked about it in my previous article, but the gist is I compute the energy (or costs in this case), then use dynamic programming to find the new lowest energy seam each time. This allows a subsequent seam to pass through an earlier seam, as seams are found based on the latest images, where pixels have been pushed together from previous removals.

However, you'll notice that the costs are only changed in a small area around the removed seam, so if I optimized the process, I could avoid recomputing costs for the majority of pixels.

The algorithm is re-run, but is usually implemented using a dynamic programming approach to avoid completely recalculating every seam from scratch.

I'm using dynamic programming, but I'm still computing seams from scratch on each "frame" (each time I resize the image). Dynamic programming is useful for solving the seam finding problem on each frame, but it's an orthogonal concern to what computation can be saved on the next frame.

For illustrative purposes, I didn't optimize storing the cost data across frames, but that's something I could do.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact