Barrier synchronization is always going to wreck parallel performance. When possible, it's good to break up your work so that data dependencies are explicit and fine-grained. So instead of breaking up work like this:
Where E is dependent on A and B (and begins when they are done), H is dependent only on E, F is dependent on B and C, etc. This way you don't have to wait for the entire strip to finish: you can start work on E after A and B, while D is still working. This also lets you get away with weird block sizes for "free", even though they're a huge no-no with barrier synchronization. For instance, the Es and Gs are size 6; if the first example had most elements of size 4 but one element of size 6, it would be catastrophic. All your threads would be done when the final thread is only 66% complete, and it's single-threaded from there on out for that strip.
This pencil-and-paper approach to breaking up work is good at parallelizing things that don't appear parallelizable at first glance, e.g. prefix sum [1]. Compilers are good at identifying opportunities to reorder instructions in ways that break low-level data dependencies, but they tend to miss big, higher-level opportunities like this.
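For concreteness, here's a minimal sketch of that kind of dependency-driven scheduling in Python (my own illustration with hypothetical block names, not anyone's actual code): each block is submitted as a task that waits only on its own inputs, so E can start as soon as A and B finish, even while D is still running.

```python
# Hypothetical blocks A..H mirroring the example above; each task blocks
# only on its own dependencies rather than on a whole-strip barrier.
from concurrent.futures import ThreadPoolExecutor

def make_task(name, deps):
    def task():
        for d in deps:             # wait only on this block's inputs
            d.result()
        return f"{name} done"      # placeholder for the real per-block work
    return task

# Dependencies listed in topological order so each future exists when needed.
graph = {"A": [], "B": [], "C": [], "D": [],
         "E": ["A", "B"], "F": ["B", "C"], "G": ["C", "D"], "H": ["E"]}

with ThreadPoolExecutor() as pool:
    futures = {}
    for name, dep_names in graph.items():
        futures[name] = pool.submit(make_task(name, [futures[d] for d in dep_names]))
    for f in futures.values():
        f.result()
```

A real scheduler would avoid tying up a worker thread while a task waits on its inputs (e.g. by only submitting a block once its dependencies complete), but the structure is the point: synchronization follows the data dependencies rather than a global barrier.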
I created a desktop GUI tool [0] that used multiple cores for seam carving images of your choice, including masking parts of the image to keep or remove. The backend code parallelizing the work is from the CAIRE project.
It was ages ago, and it has since been archived on Google Code :)
If folks are interested in a simple implementation to read or reuse, here's a JS version I wrote back when HTML5 and <canvas> were new and exciting (over 10 years ago now): https://nicolasff.github.io/canvas-seam-carving/
The button says "warning: very slow" but JavaScript performance has significantly improved since then :-)
Still a very cool technique, and pretty easy to implement.
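It really is compact. Here's a minimal single-seam sketch in Python/numpy (my own, not the linked JS code): a simple gradient-magnitude energy map, the usual cumulative-cost dynamic program, then backtracking and removal of one vertical seam. Carving out N columns is just calling it N times.

```python
import numpy as np

def remove_vertical_seam(img):  # img: H x W x 3 array
    # Energy: sum of absolute gradients of the grayscale image.
    gray = img.mean(axis=2)
    energy = np.abs(np.gradient(gray, axis=0)) + np.abs(np.gradient(gray, axis=1))

    # Cumulative minimum cost of a seam ending at each pixel.
    H, W = energy.shape
    cost = energy.copy()
    for r in range(1, H):
        left = np.roll(cost[r - 1], 1);   left[0] = np.inf
        right = np.roll(cost[r - 1], -1); right[-1] = np.inf
        cost[r] += np.minimum(np.minimum(left, cost[r - 1]), right)

    # Backtrack from the cheapest bottom pixel, moving at most one column per row.
    seam = np.empty(H, dtype=int)
    seam[-1] = int(np.argmin(cost[-1]))
    for r in range(H - 2, -1, -1):
        c = seam[r + 1]
        lo, hi = max(c - 1, 0), min(c + 2, W)
        seam[r] = lo + int(np.argmin(cost[r, lo:hi]))

    # Drop one pixel per row.
    keep = np.ones((H, W), dtype=bool)
    keep[np.arange(H), seam] = False
    return img[keep].reshape(H, W - 1, 3)
```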
A related idea would be “image summarization”, i.e. an (as-yet hypothetical) technique for generating an image that’s an “icon” of a larger image. In the same sense of “icon” as the one visual designers are trained to create: one where both large-scale structure (shape+color) and fine-scale structure (texture) are preserved, while “redundant” detail is sacrificed, in order of least to most important for visual recognition by a human or image-classifier model. You could also call this a type of psycho-visual compression, in the same vein as chroma subsampling.
Simply resampling an image smaller keeps the large-scale structure (the shape and color), but loses the fine-scale structure (texture.) That’s why we don’t just resample photos down to 32x32 and use them as icons :)
Seam Carving (whether parallel or not) does the opposite: it preserves the fine-scale structure (texture) but distorts the large-scale structure (shape.)
But not always! The amount of shape distortion in Seam Carving is worst-case O(N) in the number of seams deleted, but best-case O(1). It depends on the algorithm’s choices of seam.
Regular iterative Seam Carving, as described by its original paper, isn’t stateful enough to minimize any whole-image quality metric (like loss of large-scale structure.) So it’s no surprise it hasn’t been used for something like this.
But parallel Seam Carving might just be the ticket for this problem. A model could be trained to select an optimal set of seams that work together to minimize the large-structure deformation of the image (i.e. the image’s lower-band difference in frequency-space), while also individually being low-entropy seams from the Seam Carving algorithm’s perspective.
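If anyone wants to experiment with that, one crude stand-in for the “large-structure deformation” objective (a hedged sketch of my own, not anything from the paper) is just the difference between heavily downsampled versions of the original and the carved image:

```python
# Purely illustrative: RMS difference between low-pass (strongly downsampled)
# versions of the two images, resized to a common small grid.
import numpy as np
from PIL import Image

def large_structure_distance(original, carved, grid=(32, 32)):
    a = np.asarray(Image.fromarray(np.uint8(original)).resize(grid), dtype=float)
    b = np.asarray(Image.fromarray(np.uint8(carved)).resize(grid), dtype=float)
    return float(np.sqrt(((a - b) ** 2).mean()))
```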
Check out the original SIGGRAPH presentation on it. There are lots of really great demos. It's one thing to read about the algorithm, but to see it in action is magical.
This discovery is 13 years old. Although it has clear limitations when it comes to images of people, I'm surprised I don't see this kind of resize feature available in more software. Does Photoshop have this hidden in a drawer somewhere? Does ImageMagick support this?
Hmm, I knew it was implemented in a bunch of plugins and in dedicated software like PS. However, I am surprised it's not found all over the place by now as part of default apps / preview tools / OS built-in software / img tag options, etc.
Also, none of the implementations I have tried (last time I checked; I'm not up to date on this) were as slick / realtime / interactive as the SIGGRAPH demo.
I have a photo of my friend, his ex-wife and a few others standing together on a camping trip. I wanted to share that photo with the group since it was the tenth anniversary of that trip, but didn't want to remind my friend of his divorce, so I used seam carving to elide his ex-wife. (The tool I used allows you to paint areas of low or high energy.) The result was almost seamless (pun intended) and you wouldn't notice anything amiss if you weren't looking for it.
I think of it as a photo editing tool. The neatest nontrivial use I know is to “delete” objects: you can use a segmentation algorithm to highlight an object and set its energy to 0, then seam-carve it out “naturally”.
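In code, that masking trick is just a tweak to the energy map before the usual seam loop runs. Something along these lines (the mask names and the big-constant bias are my own illustration, not any particular tool's API):

```python
import numpy as np

def bias_energy(energy, remove_mask=None, keep_mask=None):
    # Bias an energy map so seams are pulled through remove_mask pixels
    # and pushed away from keep_mask pixels.
    biased = energy.astype(float).copy()
    if remove_mask is not None:
        biased[remove_mask] = -1e6   # very cheap: seams flock here
    if keep_mask is not None:
        biased[keep_mask] = 1e6      # very expensive: seams avoid this
    return biased
```

You keep removing seams until no marked-for-removal pixels are left, then optionally insert the same number of seams back to restore the original dimensions.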
I used seam carving in a side-project Chrome extension I developed [1]. The extension pulls the top landscape photo from r/EarthPorn and resizes it to 1920x1080 using the seam carving library caire [2]. Then it uploads the resized image to Imgur, where it can be used as the user's new-tab wallpaper.
Perhaps using seam carving was a bit overkill, but I thought it would be interesting, and it works really well for most Earth landscape photos. I only rarely see artifacts or minor distortions in the resized images it produces. Also, since I only needed to resize one photo per day, the computational overhead wasn't a big deal.
It’s...not as useful as it first appears. It’s actually very computationally expensive to solve the mapping, and it can add quite a bit of space to the image to pre-compute the seam order (which is the proposal made by the original paper).
Each pixel can be tagged with a rank according to its seam number along each axis. If you have 4,000x4,000 pixels, then each pixel needs a 12-bit integer value for each axis, and if you want the optimal map it’s another 1 bit per pixel. So that’s on the order of an extra 25 bits per pixel for a naive encoding, which almost doubles the size of an 8-bit-per-channel image.
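The arithmetic, as a quick back-of-the-envelope check (the helper is hypothetical, just restating the numbers above):

```python
import math

def retarget_map_bits(width, height):
    # Per-pixel seam rank along each axis, plus one bit for the removal order.
    return math.ceil(math.log2(width)) + math.ceil(math.log2(height)) + 1

print(retarget_map_bits(4000, 4000))       # 25 extra bits per pixel
print(retarget_map_bits(4000, 4000) / 24)  # ~1.04x the raw 8-bit RGB data
```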
It also has no explicit scene awareness. The energy function essentially defines an implicit heuristic of scene importance, which is surprisingly reasonable until you’re removing >25% of pixels along either axis.
I believe it shipped with Photoshop for a couple years (maybe still?) but in my experience it’s a bit more of an artistic effect than an automatic and scalable solution—and it would be really hard to make it run in real time at game frame rates. (I think the artifacts alone would make it not worth it for games.)
A non-practical application is funny memes. Someone figured out that Photoshop's Content Aware Scaling feature can hilariously distort faces [1], and people have been using it in memes ever since.
You can divide the work into twice as many independent chunks of the same size like so:
If you were going to divide the image into K horizontal strips of triangles, instead divide the top half into K/2 horizontal strips of triangles and the bottom half into K/2 horizontal strips of triangles. To make things easier at the end, include a 1-pixel-tall overlap between the top and bottom parts. Work downward from the top and upward from the bottom until you reach the middle. At the middle, for each column, you have the cost of the top half of the minimum seam that includes that pixel and the cost of the bottom half of the minimum seam that includes that pixel. So you can add those together (subtracting the overlap row's energy once, since both halves count it), then take the min as before.
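Here's a rough sketch of the combine step for a two-way split (K = 2; the helper names are mine, and the same idea extends to more strips): the two cumulative-cost passes are completely independent, so they can run on separate threads, and backtracking from the argmin column in each half recovers the seam itself.

```python
import numpy as np

def _cumulate_down(energy):
    # Standard cumulative-cost DP: cost[r][c] = energy[r][c] + min of the
    # three predecessors in row r-1.
    cost = energy.astype(float).copy()
    for r in range(1, cost.shape[0]):
        left = np.roll(cost[r - 1], 1);   left[0] = np.inf
        right = np.roll(cost[r - 1], -1); right[-1] = np.inf
        cost[r] += np.minimum(np.minimum(left, cost[r - 1]), right)
    return cost

def min_seam_cost_split(energy):
    mid = energy.shape[0] // 2
    top = _cumulate_down(energy[:mid + 1])              # rows 0..mid, downward
    bottom = _cumulate_down(energy[:mid - 1:-1])[::-1]  # rows H-1..mid, upward
    # Row `mid` is the shared 1px overlap, so its energy appears in both
    # halves; subtract it once, then take the min over columns as before.
    return float((top[-1] + bottom[0] - energy[mid]).min())
```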
> Also, I guess this is beside the point, but surely almost all of the DP structure can be reused if you're repeatedly removing seams...
It's not beside the point at all. If you can reduce this linear-time algorithm to, say, amortized constant time, then you'll handily beat parallelism.
After you remove a seam, you need to recompute the cone below every removed pixel -- which ends up being the cone below the topmost removed pixel (sadly, you can't just zip down the neighbors of the seam, but you can avoid expensive data moves with a 2D linked list). If the image is super wide, that gives you a speedup of approximately the aspect ratio. If it's square, you should be able to cut the runtime in half. If it's tall and skinny, neither this nor the author's approach is terribly helpful.
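A hedged sketch of that bookkeeping (assuming a layout like the 2D linked list above, where removing a pixel doesn't shift its neighbors' indices): the cells of the cumulative-cost table that can change form a cone that starts at the topmost removed pixel and widens by one column on each side per row, clipped to the image edges.

```python
def dirty_interval(row, top_seam_col, width):
    # Columns of the cumulative-cost table to recompute in `row` after
    # removing a seam whose topmost pixel sat in column `top_seam_col`.
    lo = max(top_seam_col - row, 0)
    hi = min(top_seam_col + row, width - 1)
    return lo, hi   # everything outside [lo, hi] can be reused as-is
```

For a very wide image the cone covers only a small fraction of the pixels, which is where the aspect-ratio speedup mentioned above comes from.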
Rotating as you suggest would be equivalent to swapping between vertical and horizontal seams. A tall and skinny image with vertical seams has the same problems that a long and wide image has with horizontal seams.
No, it’s a greedy algorithm. I don’t think it would be in P to solve it globally. Naively, you could perform a graph search to find the minimum total cost to depth N, where the cost between nodes is the energy removed; in that case it’s exponential.
Because the seam is defined such that each pixel in the next row is within 1 column of the pixel in the preceding row, you can reuse all the calculations except for a pyramid-shaped area around the last removed seam (in the worst case). However, that area is on the order of half the total number of pixels, so it doesn’t save _too_ much to reuse the other half.
I wonder if typical data is more forgiving. If, for example, we cut an optimal seam down the column n=x, and the DP algorithm tells us that the best seams starting at (0,x-1) and at (0,x+1) are also column seams, the next iteration can happen in linear time in the number of rows. (Plus number of columns -- pessimistically -- if the next-optimal seam isn't "nearby".)
Basically, you're checking the things that crossed the old seam, and I'd guess that'd happen often enough, but I don't know how far the effects would typically spread. Maybe it would work worse in "sparse" images like blue sky, and better in "noisier" images like forests and crowds and dogs' fur.
I'm not sure how much cost that data-dependence would incur though. Might not be worthwhile.
This is probably also a very interesting technique for data augmentation for computer vision ML. (That is, seam carving. Not much to do with the parallel version...)
Unlike typical data augmentation operations, it alters spatial relationships in nonlinear but still realistic ways.
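A hedged sketch of what that augmentation could look like (the carve_one_seam argument stands in for any single-seam routine, e.g. the numpy sketch earlier in the thread; none of this is from an existing augmentation library):

```python
import random
import numpy as np
from PIL import Image

def seam_carve_augment(img, carve_one_seam, max_seams=20):
    # Remove a random number of vertical seams, then rescale back to the
    # original size so the sample still matches the model's input shape.
    out = img
    for _ in range(random.randint(1, max_seams)):
        out = carve_one_seam(out)
    h, w = img.shape[:2]
    return np.asarray(Image.fromarray(np.uint8(out)).resize((w, h)))
```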
Cases where data augmentation works always pose a fun challenge: clearly, if there's some way to manipulate the data that shouldn't alter the model's output, there's also some way to make the model invariant to that manipulation. If you can create an invariant algorithm, then the augmentation should be unnecessary.
There is some interesting work on making models invariant to rotations and such, but I'd be curious whether there's an algorithm that is invariant to this. I could imagine that convolutions and pooling might be relatively invariant to this technique, for example, as long as the algorithm isn't doing much with the large-scale structure of the image.