Turns out they are just splitting images up and rearranging them - not as interesting.
(see for eg. http://en.wikisource.org/wiki/File:Espionage_den02_79.png)
* Support crosscut shredders (shredded in both directions).
* Make it work even if the shreds aren't all the same size.
* Make it work when there are some pieces missing.
* Show where those missing pieces would probably go.
* Make it work when two images' shreds are mixed together.
* (with missing pieces)
* (generalized to N images)
* Predict what the missing pieces might look like (this one is much tougher).
* Think of ways you could cheat (like searching Tineye for the fragments to find the original image, if it was originally online somewhere).
You can keep going as long as you want. If it's too easy, just make it harder.
Edited because HN doesn't support Markdown or linebreaks. :(
Wa can make it harder in many creative ways, and that would be fun, but I wrote this comment to express the feeling of disappointment I had when I saw the post. I felt a little bit like someone showed me a candy but already ate half of it.
Does that exist already?
I agree with you, and I know that you have some serious knowledge about that, but seriously, reading HN makes me feel that this hiring pool would be filled only with programmers that could solve any type of code-puzzle in their sleep.
I noticed they did remove on of the functions they had listed in the hints at the bottom, possibly to make things a bit more challenging.
My first take on it was the following:
Take the average of the differences between each pixel and the pixel adjacent to it over each column and store that as, say, X.
Find some maximum number such that the sum of every nth column of X - the sum of all other rows of X is maximised. That's the column width.
Split into a series of columns, use stable matching to match columns based on the sum of differences between rightmost male pixel and leftmost female pixel over all rows.
Use stable marrage to give you a partial ordering, turn into a complete ordering and unscramble the image.
Any better/more elegant solutions?
I tried this, and it worked for a few test pictures I downloaded, but it fails for the given Tokyo panorama: the dark edges and sharp corners in some buildings confuse the algorithm. The matching must be done on somewhat more than pixel values alone...
EDIT: See how my implementation fails: http://www.michielovertoom.com/incoming/TokyoPanoramaUnshred...
Re: your attempt: I think you just need to find the starting strip, looks like your algorithm is working.
The algorithm works fine for some random pictures I found on the web. It's just not working for the Tokyo picture ;-(
My algorithm, given a strip, would tell you which strip was on the right and also gave a certainty. The last strip would have a "next strip" value shared with an earlier strip (thus wrongly attached) but with less certainty.
Edit: Or (worst case) the last strip would have a high degree of uncertainty.
How do you calculate your certainty?
PS. Have you actually realized a succesful implementation? (Just curious)
It also works on the test images I created:
Because of this, is someone going to actually take the time to understand what assumptions people make to run the program? If there was an interface, they could just run a script to test submissions...
EDIT: Given the slices, is it possible to order them properly entirely in the frequency domain?
When you do the Fourier Transform (or in this case, the Discrete Fourier Transform (DFT)) you turn your time domain function into a frequency domain function. This function resides on the complex plane, with a real and imaginary part.
Typically, you're doing the transform to get at the information in the magnitude, which represents the frequency information in the original time domain function. To get at the magnitude, you do: sqrt(Re^2 + Im^2).
Now, when you're looking at the magnitude, you have lost location information entirely about the original signal. Rather, you now have perfect frequency information. (yes, combined with the phase: atan(Re/Im), you can reconstruct the original signal).
Here's an example. Imagine a time domain function x(n) where x(0) = 1, and x(n) = 0 everywhere else. This is just a spike at 1.
The magnitude of the DFT of that function is f(n) = 1 everywhere! In the time domain, you have total localization of the signal all in one point. But, in the frequency domain you have the opposite (constant 1 everywhere)! Intuitively, the reason for this is because in order for sines and cosines to represent such a localized function, you have to add up a lot of them in order to cancel each other out and produce such a localized signal.
When you're comparing localized values in the time domain, the FT will most likely not be the tool to use. The FT gives you frequency information, without spatial information unless you take into account the phase. In this instagram puzzle example, the information about the pixel values at the edges of the strips have now been "spread" across many frequency values in the freq domain.
Another example is x(1) = 1 and x(n) = 0 elsewhere. This is just a time shifted version of the function given above. The magnitude of the DFT is still just a constant 1 everywhere. No difference! Only the phase differs.
I mean, you'd need to specify some threshold to handle slices on the ends, and maybe use sum of (difference^2) instead of just the sum, but it shouldn't be hard at all. Am I missing something?
Hint: very often, when you think "isn't it extremely obvious?", you've come up with a greedy algorithm and it's not the right one. :)
> For maximum security, documents should be shredded so that the words of the document go through the shredder horizontally (i.e. perpendicular to the blades). Many of the documents in the Enron Accounting scandals were fed through the shredder the wrong way, making them easier to reassemble.
Also, not as easy as I thought it would be. A friend had to give me some help with the math to match the stripes.
Edit: How long did you guys take to figure this out? I took nearly 4 hours.
I hope they are not another Zynga :).