
Show HN: Improved Seam Carving Using Forward Energy - seam_carver
https://github.com/axu2/improved-seam-carving
======
seam_carver
I don't think most people know about the improved seam carving algorithm (since
only the original is taught in most CS curricula), and I thought you might
find this cool!

~~~
jaytaylor
It's a neat algorithm; I hope you're able to get it merged into scikit-image
[0] without too much fuss. Good luck!

[0] [https://github.com/scikit-image/scikit-image/issues/3082](https://github.com/scikit-image/scikit-image/issues/3082)

------
chatmasta
Seam carving is really cool, not just for the typical application of resizing
images, but also as a method of fingerprinting images (and, on the flip side,
anonymizing them). There are some interesting papers on these use cases on
Google Scholar [0].

There are a lot of potential businesses that could be built around image
fingerprinting. Off the top of my head, those include the obvious copyright/IP
monitoring, as well as more novel ideas like "meme tracking" to observe the
propagation of memes across the Internet.

[0]
[https://scholar.google.co.uk/scholar?hl=en&as_sdt=0%2C5&q=se...](https://scholar.google.co.uk/scholar?hl=en&as_sdt=0%2C5&q=seam+carving+fingerprint&btnG=)

~~~
seam_carver
That looks pretty cool! Thanks for the link! I'll look more into it soon.

------
xg15
> _Both papers introduce different ways to "calculate the energy of a pixel,
> which is a measure of its importance—the higher the energy, the less likely
> that the pixel will be included as part of a seam."_

If I understand correctly, this function is basically how the algorithm
decides which parts of the image to cut (or stretch). So it seems like it
could be interesting to experiment with a range of different functions here.
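
For reference, the simplest energy function from the original paper is just the gradient magnitude (the "e1" energy). A quick NumPy sketch (function name mine):

```python
import numpy as np

def e1_energy(gray):
    """The "e1" energy from the original paper: |dI/dx| + |dI/dy|."""
    # np.gradient returns (d/drow, d/dcol) via central differences
    dy, dx = np.gradient(gray.astype(float))
    return np.abs(dx) + np.abs(dy)
```

Anything you plug in here instead (a saliency map, a learned feature map, etc.) changes which seams the dynamic program ends up picking.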

Has anyone tried to e.g. plug in a CNN?

~~~
seam_carver
I'm unfamiliar with CNNs, but you can draw masks manually to protect/remove
parts of an image, or use face recognition, motion detection, etc.
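
The mask trick can be as simple as biasing the energy map before the dynamic-programming step. A sketch (the function and parameter names here are my own, not scikit-image's API):

```python
import numpy as np

def apply_masks(energy, protect=None, remove=None, big=1e6):
    # boost energy under a "protect" mask so seams avoid those pixels,
    # and sink it under a "remove" mask so seams get routed through them
    energy = energy.astype(float).copy()
    if protect is not None:
        energy[protect] += big
    if remove is not None:
        energy[remove] -= big
    return energy
```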

~~~
TTPrograms
The biggest value of ML here would likely be to create a more useful energy
function. This could be informed either by object detection data or hand drawn
"energy" masks.

The other big change that could be useful would be a non-greedy approach. Say
you want to scale down a crowd of people: how do you remove, say, one person
from the crowd instead of removing "low-energy" strips from multiple people?
The people in the crosswalk image sort of demonstrate the difficulty of using
basic energy metrics there.

One interesting idea would be to use GANs: come up with some parameterized
seam removal function (with the number of seams as an input) such that the
resulting image is still highly probable under the data distribution. That way
you can't cut objects into weird shapes, since there aren't images like that
in the dataset.

~~~
seam_carver
By GAN, do you mean "generative adversarial network"? I don't have much ML
experience, so I don't really understand what you're saying :P

Do you know a good place to start? I heard Andrew Ng's Coursera course is good.

~~~
clickok
I would not dive into GANs right off the bat. Ng's course is a good start;
maybe also buy or borrow a copy of Russell and Norvig's "Artificial
Intelligence: A Modern Approach" [0]. If you decide to look into deep learning,
I've heard good things about Bengio's book (see also Hugo Larochelle's lectures
on YouTube), and if you want to learn more about reinforcement learning,
"Reinforcement Learning: An Introduction" by Sutton and Barto is frankly
amazing (and there's a great course on YouTube by David Silver).

From reading your code, you're probably familiar with most of the stuff you
need to learn ML, so I thought I would take a stab at explaining how you get
from "no machine learning experience at all" to "surely GANs will solve all
our problems". Broadly speaking, a lot of machine learning is just deciding
what your goal is (carve an image, play a game, classify objects, group
similar data points, etc.), defining an objective function (e.g., number of
sharp edges removed, player score, a classification loss function [1], distance
over points in a cluster), and then maximizing or minimizing that objective
function.

In classification, you want a function `f(.)` such that given a data point
`x`, `f(x) ≈ y`, where `y` is the correct label for that data point. Say you
have a bunch of these data points in a matrix `X` and a vector of correct
labels `y`: how do you come up with a function that correctly classifies the
data in `X`, and how can you do it so that data you might collect in the
future would also be correctly classified?

You could do this with linear least squares (come up with weights `w` to
minimize `|f(x) - y|`, with `f(x) = x^T w`), but unless your data is cleanly
separable, that probably won't work so well. There are a lot of possible
techniques people have tried, but lately neural networks have become popular
because they are powerful approximators, relatively easy to deploy, and pretty
versatile. But the training process [2] still boils down to tweaking weights
until `|f(x) - y|` is as low as you can make it.
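
To make that concrete, here's a minimal least-squares classifier on made-up 2-D data (the data and numbers are all toy assumptions):

```python
import numpy as np

# toy binary classification: labels y in {-1, +1}, predict sign(X @ w)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)),   # class -1 cluster
               rng.normal(+2, 1, (50, 2))])  # class +1 cluster
X = np.hstack([X, np.ones((100, 1))])        # bias column
y = np.array([-1.0] * 50 + [1.0] * 50)

# least squares: find w minimizing |X w - y|
w, *_ = np.linalg.lstsq(X, y, rcond=None)
accuracy = (np.sign(X @ w) == y).mean()
```

With well-separated clusters like these, the accuracy comes out near 1.0; with overlapping classes you'd reach for something more flexible.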

Generative adversarial networks are kind of a tweak on this idea: you start
with a bunch of images, and you train a network to generate images that are
"similar" to them; call the initial images the "real" images and the generated
ones "fakes". You want your network to generate fakes that are similar to the
real images. How do you define similar? Well, you have a _second_ network whose
objective is to distinguish between real and fake images. You train both at
the same time, using feedback from the second network to improve the first
network, and using the images generated by the first network as training
examples for the second.

Initially, the first network will generate splotchy garbage that the
discriminator can easily tell apart from the real images. Over time, the first
network improves, creating increasingly realistic fakes, while the second
network gets better at determining whether an image is real or generated.
After much training, you have two networks: one that is good at generating
images that are similar to the seed images, and another network that is good
at identifying if an image is real or fake.
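
That alternating loop can be sketched in one dimension (everything here, from the data to the models to the hyperparameters, is my own toy invention): real data is drawn from N(3, 1), the generator just learns an offset `b` for its noise, and the discriminator is logistic regression.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

w, c, b = 0.1, 0.0, 0.0   # discriminator (w, c) and generator (b) params
lr, batch = 0.05, 64

for step in range(3000):
    real = rng.normal(3.0, 1.0, batch)
    fake = rng.normal(0.0, 1.0, batch) + b   # generator output ~ N(b, 1)

    # discriminator step: push D(real) toward 1 and D(fake) toward 0
    dr = sigmoid(w * real + c) - 1.0   # grad of -log D(real) w.r.t. score
    df = sigmoid(w * fake + c)         # grad of -log(1 - D(fake)) w.r.t. score
    w -= lr * (np.mean(dr * real) + np.mean(df * fake))
    c -= lr * (np.mean(dr) + np.mean(df))

    # generator step: push D(fake) toward 1 (the "non-saturating" loss)
    dg = sigmoid(w * fake + c) - 1.0
    b -= lr * np.mean(dg) * w          # chain rule: d(score)/db = w
```

After training, the generated samples N(b, 1) should sit near the real mean of 3, i.e. the fakes have become hard to tell apart from the real data.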

––––––––

0\. It even has a website and example code
[http://aima.cs.berkeley.edu/](http://aima.cs.berkeley.edu/)

1\.
[https://en.wikipedia.org/wiki/Loss_functions_for_classificat...](https://en.wikipedia.org/wiki/Loss_functions_for_classification)

2\. See here for a nice explanation of the backprop algorithm,
[http://neuralnetworksanddeeplearning.com/chap2.html](http://neuralnetworksanddeeplearning.com/chap2.html)
or if you want the OG (original Geoff) paper on the topic, it's in the
proceedings for Parallel Distributed Processing 1986 -- "Learning
representations by back-propagating errors", Rumelhart, Hinton, and Williams.

~~~
seam_carver
That's a lot to take in; thanks for all the links!

------
innagadadavida
14 sec per image is excessive; is there either an implementation or algorithm
that can do this in less than 500 ms per image?

~~~
seam_carver
It's 14 seconds to remove 200 seams on the 512-by-342 bench image, but I note
potential optimizations in the notebook.

Right now, the seam_carve function in scikit-image accepts an energy matrix,
which it then uses to generate the accumulated energy matrix. Forward energy
calculates the accumulated energy matrix directly, but I calculate the energy
matrix backward from it to be able to use scikit's seam_carve function.

If I write my own seam removal function or change scikit's, I think I could
make the speed comparable to backward energy.

Furthermore, the accumulated energy matrix calculation step in scikit uses a
double for loop, which might be replaced with some NumPy operations for a
speedup with both forward and backward energy.
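
For what it's worth, here's a sketch of what a row-vectorized forward-energy accumulation could look like in NumPy, so only the loop over rows stays in Python (using np.roll, which wraps at the borders, as a simplification):

```python
import numpy as np

def forward_energy_map(gray):
    # a sketch of the forward-energy DP: the cost of removing a pixel is
    # the new edges its removal creates, accumulated top to bottom
    I = gray.astype(float)
    h, w = I.shape
    M = np.zeros((h, w))
    for i in range(1, h):
        left  = np.roll(I[i], 1)    # I(i, j-1), wraps at the border
        right = np.roll(I[i], -1)   # I(i, j+1)
        up    = I[i - 1]
        cU = np.abs(right - left)            # cost if seam comes from above
        cL = cU + np.abs(up - left)          # ... from upper-left
        cR = cU + np.abs(up - right)         # ... from upper-right
        mL = np.concatenate(([np.inf], M[i - 1, :-1])) + cL
        mU = M[i - 1] + cU
        mR = np.concatenate((M[i - 1, 1:], [np.inf])) + cR
        M[i] = np.minimum(np.minimum(mL, mU), mR)
    return M
```

Each row update is O(width) vectorized work, which is usually a big win over a pure-Python double for loop.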

I should probably raise this as an issue on GitHub... thanks!

~~~
seam_carver
I actually realized I was doing a lot of unnecessary computation in my dynamic
programming step. After removing it, the time went from 14 s to 8 s.

~~~
seam_carver
After talking with some friends: basically, I could get this to be a lot
faster if I just wrote it in Cython (the skimage.transform.seam_carve function
is written in Cython). Also, I realize now that a double for loop in Cython is
actually fast, so the skimage implementation of the dynamic programming is
fine as-is. But I can still change it to accept the accumulated energy matrix!

------
imranq
This is the third assignment in Princeton’s Algorithms II course on Coursera -
highly recommended!

~~~
seam_carver
That assignment only covers the original seam carving algorithm; this post
also covers the improved seam carving algorithm!

