Image unshredder by simulated annealing (2015) (nayuki.io)
253 points by nayuki 377 days ago | 56 comments



This is a fun demo but, as others have mentioned, simulated annealing really isn't necessary, or even that appropriate, for this specific problem. I threw together the same demo with a very trivial algorithm that I think does a more accurate job of reconstructing the same images in significantly less time than the simulated annealing [1]. The algorithm consists of first finding the closest matching pair of pixels as a seed. It then finds the remaining column that most closely matches one of the current edges and adds it to the reconstructed columns. This then repeats until there are no remaining columns.
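That greedy procedure can be sketched roughly like this (a simplified, hypothetical version: columns are plain arrays of pixel values, and the seed here is an arbitrary column rather than the best-matching pair):

```javascript
// Dissimilarity between two columns: sum of absolute pixel differences.
function columnDiff(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += Math.abs(a[i] - b[i]);
  return sum;
}

// Greedy unshredder sketch: repeatedly attach the remaining column that
// best matches either edge of the partially reconstructed image.
function greedyUnshred(columns) {
  const remaining = columns.slice();
  const result = [remaining.pop()]; // arbitrary seed column
  while (remaining.length > 0) {
    let best = { diff: Infinity, index: 0, side: "right" };
    for (let i = 0; i < remaining.length; i++) {
      const dLeft = columnDiff(remaining[i], result[0]);
      const dRight = columnDiff(result[result.length - 1], remaining[i]);
      if (dLeft < best.diff) best = { diff: dLeft, index: i, side: "left" };
      if (dRight < best.diff) best = { diff: dRight, index: i, side: "right" };
    }
    const [col] = remaining.splice(best.index, 1);
    if (best.side === "left") result.unshift(col);
    else result.push(col);
  }
  return result;
}
```

Because each step is a nearest-neighbour choice, the whole reconstruction is O(n²) in the number of columns, versus the millions of iterations the annealer runs.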

On a different note, if this quote:

> The unscrambling problem being solved here is much harder than a typical unscrambling problem, because we are rearranging single columns of pixels.

is referring to reconstructing shredded strips of paper then it is laughably false. The single columns of pixels are extremely clean and correspond very, very closely to their neighboring columns. In reality, the tearing of the paper makes the edges not match very well and it's just much noisier in general. I've worked on this problem before and it's many orders of magnitude more challenging than the reconstruction that is done in this demo.

[1] http://sangaline.com/blog/image_unshredder/


Thanks for the helpful critique. Your proposed algorithm is reasonable and much, much faster than simulated annealing, and I appreciate the live demo. It looks like your algorithm is deterministic (except for tie-breaking of a pair of columns that have the same difference?), so it doesn't matter which way the image is shuffled.

You are also correct that I overstated how hard the problem is; in retrospect it isn't that hard at all to unshred clean digital images (as opposed to paper scans).

On your web page:

> It takes roughly the same amount of time to run but correctly reconstructs all of the images.

This is not true, but is very close to correct. The image "Blue Hour in Paris" has some disturbances in the lower part of the sky, "Alaska Railroad" has a single wrong column on one side of the image, and "Abstract Light Painting" sometimes has a few incorrect columns on the edge. At least that's what I could spot by eye at a glance; I didn't rigorously compare them against the original images.


First off, thanks a lot for posting this and you did a great job with it! It was fun to play around with and it obviously inspired me enough to dig into the code and try my own variation.

You're right about them not matching exactly; I did it in a hurry and wasn't looking closely enough. I removed that incorrect statement (though it does seem to generally outperform the simulated annealing).

The tie breaking for pairs with equal differences and the parity of the image is essentially random. It is deterministic in the code, because it won't replace a candidate unless it's strictly better, but it's totally arbitrary which is found first.


Wait, you threw that demo together in under 20 minutes? Wow.


>I've worked on this problem before

I suppose it helps if you're already familiar with the problem domain. Nevertheless, it's very cool indeed.


Very cool. Also remarkably fast response.

Interestingly, it occasionally flips the picture horizontally (I only saw it with Blue Hour in Paris).


There's really no way to preferentially differentiate between the image being flipped one way or the other once the one pixel wide columns have been shuffled. The order of the first pair that you put together determines which orientation you end up with and this is totally arbitrary. The reason that it flips sometimes is because the shuffling of the columns is random and the order of the initial pair of seed columns depends on their original order in the shuffled image.


Why on my machine does your unshredded version come out flipped? Pretty reliably, actually.


There's no way for the computer to know which way is correct and which way is a mirror image.


Yes, but it comes out flipped 100% of the time. I ran the train image 10 times and it was backwards (you can tell from the numbers) every single time. This is more than just chance.


Whether it is flipped or not seems to depend entirely on how it ended up being shuffled.

Maybe that's why you don't use JavaScript's Math.random() for cryptographic purposes? :-)

Shuffle function:

  function doShuffle() {
    var startTime = Date.now();
    var pixels = shuffledImage.data;
    while (shuffleStartColumn < width) {
      // Pick a random column j in the range [i, width) and move it to position i.
      // This Fisher-Yates shuffle is less efficient than the Durstenfeld shuffle, but more visually appealing when animated.
      var i = shuffleStartColumn;
      var j = i + Math.floor(Math.random() * (width - i));
      for (var y = 0; y < height; y++) {
        for (var x = j - 1; x >= i; x--) {
          var off = (y * width + x) * 4;
          for (var k = 0; k < 4; k++) {
            var temp = pixels[off + k];
            pixels[off + k] = pixels[off + 4 + k];
            pixels[off + 4 + k] = temp;
          }
        }
      }
      shuffleStartColumn++;
      if (Date.now() - startTime > YIELD_AFTER_TIME)
        break;
    }
  }


The algorithm isn't random -- every time you run your image through it, it's doing the same exact maths, and getting the same exact result. It's random whether or not any image you might pick will be flipped, but for a given image, the result will always be the same.


You could improve this greatly by adding another type of change to the annealing function. Instead of just swapping randomly chosen columns, it should also evaluate flips. A flip is performed by choosing two columns at random and then reversing the order of all the columns between them. Annealing algorithms in general benefit from a diverse set of change functions.
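A minimal sketch of that flip move (the indices are passed in explicitly here so the move is testable; in the annealer they would be chosen at random):

```javascript
// Reverse the order of the columns between positions i and j (inclusive).
// Operates on an array of column indices and returns a new ordering.
function flipRange(columnOrder, i, j) {
  const order = columnOrder.slice();
  const [lo, hi] = i < j ? [i, j] : [j, i];
  for (let a = lo, b = hi; a < b; a++, b--) {
    [order[a], order[b]] = [order[b], order[a]];
  }
  return order;
}
```

This is essentially the classic 2-opt move from travelling-salesman local search, which is one reason it helps: it can undo a mirrored segment in a single step instead of many single-column swaps.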


The image I tested seemed to fall into some kind of local optimum:

http://i.imgur.com/miCDFOD.png

As you can see, the left quarter of the image is reversed. I imagine the additional function you described would help with this.


I really like your idea of mirroring a random range of columns. I would consider implementing it the next time I have time to work on the code.


I think another optimization would be to 'grow' the regions considered. Instead of treating every column as a separate object, have the size of the regions considered vary with the temperature. The metaphor would be crystal formation within annealed metal.


Why simulated annealing? Wouldn't it be much more efficient to just compare all pairs of columns and then sort them according to similarity? That would take about 360,000 comparisons instead of about a billion iterations for an image 600 pixels wide.


If you have the similarity between pairs of columns and try to find an order such that the total dissimilarity between adjacent columns is minimal, you've basically got a version of the travelling salesman problem.
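Concretely, the reduction treats each column as a city and each pairwise dissimilarity as a distance (a sketch with a caller-supplied `columnDiff`, a hypothetical helper):

```javascript
// Build a TSP distance matrix: entry [i][j] is the cost of placing
// column j immediately next to column i.
function buildDistanceMatrix(columns, columnDiff) {
  const n = columns.length;
  const dist = Array.from({ length: n }, () => new Array(n).fill(0));
  for (let i = 0; i < n; i++) {
    for (let j = 0; j < n; j++) {
      if (i !== j) dist[i][j] = columnDiff(columns[i], columns[j]);
    }
  }
  return dist;
}
```

An open path (rather than a closed tour) through all the cities that minimizes total distance is then exactly an unshredded column ordering, up to reflection.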


Which is NP-Hard in general, but this instantiation of it can be made significantly easier than a random instantiation (in clock time, anyways).

You could pick a similarity metric such that adjacent columns have some bound on worst-case similarity - this allows you to reject many adjacencies. This could be chosen empirically by analyzing other known images. You could additionally model the probability distributions of similarity for adjacent columns, thus allowing you to reject further adjacencies and sets of adjacencies.

By using all this structure you should be able to do much better than this simulated annealing - which I have yet to see complete successfully on my machine, at any rate.

Is the author here? What energy metric is this using? Total variation?


I like your idea of rejecting pairs of columns that are too dissimilar. You are correct that my simulated annealing algorithm does not completely unscramble the image, and I believe this is because of the horizontal symmetry ambiguity.

The energy metric being used is simply the sum of absolute differences across all 3 channels for all horizontal adjacent pairs of pixels.

In pseudo-Python notation: sum(abs(pix[x,y,ch] - pix[x+1,y,ch]) for x in range(width-1) for y in range(height) for ch in range(3))
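A sketch of the same metric in JavaScript, assuming the flat row-major RGBA buffer layout (4 bytes per pixel) that the demo's canvas code uses:

```javascript
// Energy: sum of absolute RGB differences between every horizontally
// adjacent pair of pixels, over a flat RGBA buffer.
function energy(pixels, width, height) {
  let total = 0;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width - 1; x++) {
      const off = (y * width + x) * 4;
      for (let ch = 0; ch < 3; ch++) { // skip the alpha channel
        total += Math.abs(pixels[off + ch] - pixels[off + 4 + ch]);
      }
    }
  }
  return total;
}
```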


Cool - that's TV on RGB. There may be improvements on the channel selection in the literature in terms of "selectivity" for real images - i.e. HSV or something else. That might be an easy change that would bump your performance.

Also you could apply some nonlinear function to the calculated TV to reflect your rejection penalties, i.e. if it's greater than x, make the energy arbitrarily high. The thing is, you would then want to reject the whole family of solutions that had that adjacency in your annealing, which doesn't seem super straightforward to me.

The other thing I see at the end is that it hits a local optima of multiple subsets of correct adjacencies (up to reflection) but there's no way to permute those subsets to get a better solution, as the temperature is too low. You'd almost want to "group" those subsets and then rerun on the 8 or so segments just over permutation and reflection. On the other hand, it does suggest that a "greedy" approach gets you pretty close - finding a locally optimal solution solves part of the global optimization. Then you can reduce the final problem to an exhaustive search of a much smaller space.


> I believe this is because of the horizontal symmetry ambiguity.

One quick test for this would be to make the strips 2 pixels wide.

It's interesting to watch the random walk that it does, and I think that problem would remain. Once it has made some wider strips that should be adjacent but aren't, the strips tend to randomly move between them, but there is no reason to prefer progress in one direction or the other.

I'd be interested to see what happens if you only allowed moves in one direction, e.g. only allowed a column to be inserted to the right of where it was.


Although the travelling salesman problem is NP-hard, and it’s possible to construct hard-to-solve instances, there are algorithms that work very well for most instances.

I just tried using LKH[0] to reconstruct these scrambled images, and it does it perfectly in a fraction of a second.

[0] http://webhotel4.ruc.dk/~keld/research/LKH/


Interesting experiment! It took me a minute to understand why you mentioned the TSP, but then I realized that you can model each column as a destination and the pixel difference between two columns as the distance. Once I got that, I can certainly appreciate that a state-of-the-art TSP solver will solve the scrambled images perfectly in no time. =)


I just posted my code, in case anyone is interested in playing with it.

https://github.com/robinhouston/image-unshredding



Interesting, although I guess I should have expected a TSP solver to do well. FWIW simulated annealing can also be used to solve TSP instances. Although clearly it doesn't do too well in this case.

Anyway, I only mentioned it to point out it wasn't a simple matter of calculating the similarities between columns and 'sorting' them, you need a more sophisticated algorithm.


Interesting to see how badly simulated annealing does, because it can't step over so many hills. Comparing against LKH is a bit unfair, as this problem cries out for a travelling-salesman formulation, but SA doing so badly is still astonishing, given how often it is used for optimization problems. Maybe dedicated solvers should be used more often.


A similarity score for a pair of columns doesn't give you an ordering that you can use for sorting.


Sure it does, just run it multiple times. First pick a random slice. Then find the most similar slice and place it next to it. Then find the most similar slice to that and place it next to that, and so on. (And do this in both directions from the starting slice to get both sides.) Eventually you will complete the whole image.


It doesn't work, plain and simple.

Never mind what many others have said, that a greedy algorithm might work, that you can reduce it to TSP and apply heuristics, etc. The application of simulated annealing is also crippled: Suppose you reconstructed two slices of the image. To improve it, you have to take a column from one slice and move it to the other---but that doesn't lower the energy! What would lower the energy is to take one of the slices and attach it to the other, possibly flipping it in the process. So while it says "simulated annealing" on the lid, it's a random walk for the most part.

The idea is workable, but it needs a different primitive operation: pick a range of columns, attach it in a different spot, possibly flipping it.
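A sketch of that primitive (a hypothetical helper; note that `dest` is the insertion index in the array after the range has been cut out):

```javascript
// Cut out `len` columns starting at `start`, optionally reverse the
// segment, and splice it back in at position `dest`.
function moveRange(order, start, len, dest, flip) {
  const out = order.slice();
  const segment = out.splice(start, len);
  if (flip) segment.reverse();
  out.splice(dest, 0, ...segment);
  return out;
}
```

With this as the annealer's move, correctly assembled strips can migrate and re-orient as whole units instead of having to dissolve and re-form column by column.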


It looks like it might work best as a two-step process: a first pass to connect the major chunks, then find the edges and anneal those chunks.


Along with flipping chunks horizontally if no good solution is found. Right now some chunks end up mirrored.


I'd be interested to see how a Genetic algorithm with carefully chosen mutation/recombination operators would suit this problem.

Mutations could be flips and random swaps. Recombination would be combining chunks of already partially optimised solutions (breeding).

Genetic algorithms do suffer the cost of having to keep a population of solutions in memory, so it would be interesting to see how the end result compares in performance to annealing.


> carefully chosen mutation/recombination operators

A job for a meta-genetic algorithm maybe? Where the mutation operators are themselves evolved.


Would like to see this applied to a page of text.


And to a bag of slices from multiple pages of text.


Some real world unshredding by the Fraunhofer Institute:

http://www.ipk.fraunhofer.de/fileadmin/user_upload/IPK_FHG/p...

Shredded secret service files, fragments of papyri or frescos...


The BBC also did a good story on Fraunhofer's E-Puzzler being used to piece together shredded Stasi documents:

http://www.bbc.com/news/magazine-19344978


Simulated annealing is super cool. Back at the dawn of time, my colleague and I used this technique to place and route transistors for a project in our VLSI course. It was the most fun.


There are still a lot of destroyed documents being stored around the world (from the Stasi, for example). I would love to see this applied to them. It might shed some light on things supposedly lost to history.


Do you actually think these documents remain destroyed for technical reasons? Of 90,000 official and 200,000 unofficial employees of the Stasi (the secret police of the German Democratic Republic), a whopping 20 have been convicted. Do you think that's because the powers that be couldn't do better?

The official excuse is that whatever the Stasi did was lawful at the time. The same logic would have made the Nuremberg Trials impossible, but it wasn't applied. That's not an accident.


Some way to fix the image would certainly help, but you still need to find a way to scan all the parts and put them in the right direction.

Moreover, some of those documents may have been destroyed for very valid reasons.


Sliding a bit off topic but what would be a valid reason for them to have been destroyed? Other than masking history, which I'm sure is valid for some, for most it's not a great thing.


The internal security agencies of repressive governments are very good in general at collecting people's secrets for use as weapons. There's no potential for harm that I can see in destroying such information, and considerable potential for harm in recovering it.


Didn't work for me either time I tried.


You are right; with the current algorithm it is quite difficult to fully unscramble the image into one single strip. For now, the best you can do is get a bunch (~20) of strips after annealing, and manually flip and rearrange them (in an image editor) to get back the original image.

The default temperature of 4000 is fine for all the example images at 600×400.

The default 30 million iterations leads to a reasonable run time of about 10 seconds, but leaves many small strips after annealing. Bumping up the number of iterations to 300 million, 3000 million, etc. will take proportionally longer time, and give marginally better results.


Based on the description on the web page, I expected it to work on the default image with default parameters.


As stated in the "notes" section, you need to set the number of iterations to about 1 billion, and also adjust the temperature according to the picture's characteristics.


Thank you for the demo. This helped me understand the recent quantum computing (quantum annealing) process a little more concretely.

What I saw here was a scrambled image being sorted to find a local minimum of entropy in pixel color value between adjacent stripes. That is what a quantum annealing computer does, where the image is the problem space (scrambled because we don't have the solution), and the quantum computer rearranges it to find the global minimum/maximum.

Where a QC is not useful would be when the problem space is too big (the image is too big), or when it can only find a local minimum that isn't the solution to the classical problem (the picture is still scrambled after annealing).

Please comment if my understanding is wrong.


Awesome, but please put the original energy somewhere. It's not clear whether the final configuration actually has a lower or higher energy than the original image, which would be interesting to know.


> The unscrambling problem being solved here is much harder than a typical unscrambling problem, because we are rearranging single columns of pixels. As a consequence of this design, it is impossible to resolve the horizontal reflection symmetry ambiguity – every image has exactly the same energy score as its mirror image.

> In a more typical problem, the image would be sliced into groups of columns, where each group moves as a unit. This situation would reduce the number of pieces to unscramble, and also make the arrangement be unambiguous to horizontal flips.


IIRC, the best thing to do was to apply clustering to the rows/columns (k-means and hierarchical both worked, but I think it was hard to find the appropriate tree height for hierarchical clustering) and then to apply seriation: http://innar.com/Liiv_Seriation.pdf


Couldn't you do some feature detection on the edges of the resulting columns and repeat the annealing to get a better end-product?


Is there a definition of temperature that someone can share that would be relevant in this context?


probability of bouncing out of a local minimum
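More precisely, in the standard Metropolis acceptance rule the temperature controls how likely an energy-increasing move is to be accepted (a generic sketch, not necessarily the demo's exact code):

```javascript
// Metropolis acceptance: always accept improvements; accept a move that
// raises the energy by deltaE with probability exp(-deltaE / T).
// High T accepts almost anything; as T -> 0 only improvements pass.
function acceptMove(deltaE, temperature) {
  if (deltaE <= 0) return true;
  return Math.random() < Math.exp(-deltaE / temperature);
}
```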



