
Instagram Engineering Challenge: The Unshredder - mikeyk
http://instagram-engineering.tumblr.com/post/12651721845/instagram-engineering-challenge-the-unshredder
======
snotrockets
There's a famous anecdote about human unshredding: when Iranian
revolutionaries took over the US embassy in 1979, they captured the shredded
remains of secret documents. They took those shreds to the carpet weavers
Iran is so famous for, who manually rewove the shreds into the original
documents (See
[http://en.wikisource.org/wiki/Documents_seized_from_the_U.S....](http://en.wikisource.org/wiki/Documents_seized_from_the_U.S._Embassy_in_Tehran/Shredded_Documents))

~~~
blumentopf
The German government is funding a project to algorithmically unshred the
files that the Stasi (East Germany's secret police) destroyed in 1989 when the
Berlin wall came down and protesters started to occupy their buildings:

<http://news.bbc.co.uk/2/hi/6692895.stm>

~~~
ppk
Yeah it was fascinating, there's a whole article on it from Wired:

[http://www.wired.com/politics/security/magazine/16-02/ff_sta...](http://www.wired.com/politics/security/magazine/16-02/ff_stasi?currentPage=all)

------
icki
Interestingly enough, DARPA is running a similar challenge (their prize is a
lot better): <http://www.darpa.mil/shredder_splash.aspx#Splash>

~~~
hopeless
I knew I'd heard of a similar challenge. If I were a cynic, I'd say Instagram
would take the winning solution and enter it in the DARPA challenge. Crowd-
sourced competition entries!

~~~
jvandenbroeck
The DARPA challenge is much more complicated; the Instagram challenge (which
is accompanied by a solution!) is child's play.

~~~
sprobertson
Though the apparent ease of the latter could potentially inspire some
brilliant approach that ends up being the basis for a solution to the former.

------
uniclaude
I thought it would be a fun thing to do tonight, until I saw all the tips
given at the end of the article. They basically give away the solution; it's
no fun anymore :(.

~~~
stevelosh
So make it more fun:

* Support crosscut shredders (shredded in both directions).

* Make it work even if the shreds aren't all the same size.

* Make it work when there are some pieces missing.

* Show where those missing pieces would probably go.

* Make it work when two images' shreds are mixed together.

* (with missing pieces)

* (generalized to N images)

* Predict what the missing pieces might look like (this one is much tougher).

* Think of ways you could cheat (like searching Tineye for the fragments to find the original image, if it was originally online somewhere).

You can keep going as long as you want. If it's too easy, just make it harder.

Edited because HN doesn't support Markdown or linebreaks. :(

~~~
uniclaude
_If it's too easy, just make it harder._

Absolutely.

We can make it harder in many creative ways, and that would be fun, but I
wrote this comment to express the disappointment I felt when I saw the post.
I felt a little like someone had shown me a candy but already eaten half of
it.

------
asharp
Warning, spoiler?

My first take on it was the following: take the average of the differences
between each pixel and the pixel adjacent to it over each column, and store
that as, say, X[]. Find some number n such that the sum of every nth column
of X minus the sum of all other columns of X is maximised. That's the column
width. Split into a series of columns, then use stable matching to pair
columns based on the sum of differences between the rightmost "male" pixel
and the leftmost "female" pixel over all rows. Use stable marriage to give
you a partial ordering, turn it into a complete ordering, and unscramble the
image.
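
The width-detection step could be sketched like this (a rough NumPy sketch of
the idea, assuming a 2-D grayscale array and a strip width that divides the
image evenly):

```python
import numpy as np

def guess_strip_width(img):
    """Guess the shred width: compute X[], the mean absolute difference
    between each column and its left neighbour, then pick the width w
    whose cut boundaries (every w-th difference) carry the most summed
    difference relative to all the other columns."""
    x = np.abs(np.diff(img.astype(float), axis=1)).mean(axis=0)  # X[]
    total_w = img.shape[1]
    best_w, best_score = None, -np.inf
    for w in range(2, total_w // 2 + 1):
        if total_w % w:
            continue  # only consider widths that divide the image evenly
        mask = np.zeros(x.size, dtype=bool)
        mask[w - 1::w] = True  # differences that straddle a cut boundary
        score = x[mask].sum() - x[~mask].sum()
        if score > best_score:
            best_score, best_w = score, w
    return best_w
```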

Any better/more elegant solutions?

~~~
Luyt
_"sum of differences between rightmost male pixel and leftmost female pixel
over all rows"_

I tried this, and it worked for a few test pictures I downloaded, but it fails
for the given Tokyo panorama: the dark edges and sharp corners in some
buildings confuse the algorithm. The matching must be done on somewhat more
than pixel values alone...

EDIT: See how my implementation fails:
[http://www.michielovertoom.com/incoming/TokyoPanoramaUnshred...](http://www.michielovertoom.com/incoming/TokyoPanoramaUnshredded18.png)

~~~
webfuel
Can you post some of your test pictures?

Re: your attempt: I think you just need to find the starting strip, looks like
your algorithm is working.

~~~
Luyt
The starting strip is, by definition, the strip that has no other strip
placed to its left. There is no other way of determining what the starting
strip is; it's not a given. The problem is that with my current pixel matcher
(sum of absolute differences), one strip is wrongly attached. That made me
think I also have to take other features into consideration, like color, hue,
and/or line detection. But that seems outside the scope of this challenge.

The algorithm works fine for some random pictures I found on the web. It's
just not working for the Tokyo picture ;-(

~~~
webfuel
Minor spoiler/hint:

My algorithm, given a strip, would tell you which strip was on the right and
also gave a certainty. The last strip would have a "next strip" value shared
with an earlier strip (thus wrongly attached) but with less certainty.

Edit: Or (worst case) the last strip would have a high degree of uncertainty.
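
Roughly, the certainty idea (a sketch, not my exact implementation): compare
the best match cost against the runner-up.

```python
def match_certainty(costs):
    """Given the edge-match costs from one strip to each candidate
    right neighbour, score how confident the best match is: the ratio
    of the runner-up cost to the best cost (1.0 = a toss-up, larger =
    more confident)."""
    best, second = sorted(costs)[:2]
    return second / best if best > 0 else float("inf")
```

A strip whose best match barely beats its second-best is a good candidate for
being the rightmost strip.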

~~~
Luyt
Currently I'm using a matrix to determine adjacency, but I'll give this graph
idea a try. Thanks for the tip!

PS. Have you actually realized a successful implementation? (Just curious)

~~~
webfuel
No problem! Yes, I have this working:

<http://webfuel.org/instaunshredderv2.php>

It also works on the test images I created:

<http://webfuel.org/a.png> <http://webfuel.org/b.png>
<http://webfuel.org/c.png>

------
______
I find it strange that they don't provide an interface that the desired
program should implement. For example, "provide the input image as a command
line parameter to the script / binary, write the output to out.jpg"

Because of this, will someone actually take the time to understand the
assumptions people made in order to run their programs? If there were an
interface, they could just run a script to test submissions...

------
lunaru
The tips seem to describe just a crude pixel matcher. It seems much more
elegant to transform the image into the frequency domain via FFT and find the
edges that way.

~~~
jamesjyu
No way! You'd lose all spatial information in the frequency domain.

~~~
fooandbarify
That is false. (No information is lost in the Fourier Transform.) However, I'm
not sure how Fourier analysis would help in this case - image processing is
not my forte. Perhaps there would be a spike at the spatial frequency where
the discontinuities occur (given that the discontinuities are evenly spaced)?

EDIT: Given the slices, is it possible to order them properly entirely in the
frequency domain?

~~~
jamesjyu
You are correct: no information is lost in the Fourier Transform. What I
meant is that it actually becomes harder, representationally, to compare
localized amplitudes of the time-domain signal once you're in the frequency
domain. Typically, what you're looking at in the frequency domain tells you
nothing about location in the original signal.

Here's why:

When you do the Fourier Transform (or in this case, the Discrete Fourier
Transform (DFT)) you turn your time domain function into a frequency domain
function. This function resides on the complex plane, with a real and
imaginary part.

Typically, you're doing the transform to get at the information in the
magnitude, which represents the frequency information in the original time
domain function. To get at the magnitude, you do: sqrt(Re^2 + Im^2).

Now, when you're looking at the magnitude alone, you have lost location
information about the original signal entirely. Instead, you have perfect
frequency information. (Yes, combined with the phase, atan(Im/Re), you can
reconstruct the original signal.)

Here's an example. Imagine a time domain function x(n) where x(0) = 1, and
x(n) = 0 everywhere else. This is just a unit spike at n = 0.

The magnitude of the DFT of that function is f(n) = 1 everywhere! In the time
domain, you have total localization of the signal in one point, but in the
frequency domain you have the opposite (a constant 1 everywhere)!
Intuitively, this is because for sines and cosines to represent such a
localized function, you have to add up a lot of them so that they cancel each
other out everywhere except at that point.

tl;dr:

When you're comparing localized values in the time domain, the FT will most
likely not be the tool to use. The FT gives you frequency information without
spatial information, unless you take the phase into account. In this
Instagram puzzle, the information about the pixel values at the edges of the
strips has been "spread" across many frequency values in the frequency
domain.

EDIT:

Another example is x(1) = 1 and x(n) = 0 elsewhere. This is just a time
shifted version of the function given above. The magnitude of the DFT is
_still_ just a constant 1 everywhere. No difference! Only the phase differs.
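
You can check both spike examples numerically with NumPy:

```python
import numpy as np

# A unit spike at n = 0.
x = np.zeros(8)
x[0] = 1.0

# Its DFT magnitude is 1 at every frequency: total localization in
# time becomes total spread in frequency.
print(np.abs(np.fft.fft(x)))  # [1. 1. 1. 1. 1. 1. 1. 1.]

# Shifting the spike to n = 1 changes only the phase; the magnitude
# is still 1 everywhere.
y = np.zeros(8)
y[1] = 1.0
print(np.allclose(np.abs(np.fft.fft(y)), 1.0))  # True
```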

~~~
fooandbarify
Now we are talking past each other ;) You're right, of course, which is why I
said I'm not sure FT would be useful in this case. I was giving the
grandparent the benefit of the doubt, however: it is at least conceivable to
me that the _evenly spaced_ discontinuities in the shredded image (due to the
uniform slice width) could present as a frequency spike (or rather, a series
of them) in the FT of some function of the input (the derivative, perhaps) in
the same way that the FT of a Dirac comb is also a Dirac comb. Wild
speculation, of course, because like I said - image processing is not my
forte.

------
fferen
Isn't this extremely obvious? Couldn't you just take the first slice,
calculate the total color difference between its right edge pixels and the
corresponding left edge pixels of each of the other slices, stick it next to
the one with the least difference, and then repeat?

I mean, you'd need to specify some threshold to handle slices on the ends, and
maybe use sum of (difference^2) instead of just the sum, but it shouldn't be
hard at all. Am I missing something?
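
In code, the loop I mean (a rough sketch, assuming strips are 2-D grayscale
NumPy arrays, using the squared-difference cost suggested above):

```python
import numpy as np

def sq_edge_cost(a, b):
    """Sum of squared differences between a's rightmost column and b's
    leftmost column -- squaring penalises a few large mismatches more
    than many small ones."""
    d = a[:, -1].astype(float) - b[:, 0].astype(float)
    return float((d * d).sum())

def greedy_order(strips, start=0):
    """From `start`, repeatedly append the unused strip whose left edge
    best matches the current rightmost edge."""
    order, used = [start], {start}
    while len(order) < len(strips):
        i = order[-1]
        j = min((k for k in range(len(strips)) if k not in used),
                key=lambda k: sq_edge_cost(strips[i], strips[k]))
        order.append(j)
        used.add(j)
    return order
```

This does assume you already know which strip is leftmost; picking the wrong
`start` is exactly the end-handling problem I mentioned.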

~~~
lunaru
That would be slow if you didn't know the slice size (the bonus part of the
question), but no, you're not missing anything for the basic solution. That
said, it would be O(n^2) since you're comparing every slice to every other
slice.

~~~
bermanoid
The general problem is fairly easily isomorphic to traveling salesman, so if
you can get even O(n^2) you're already making simplifying assumptions that
mean you can't solve the general case. Further, with n=20, I wouldn't usually
worry too much about asymptotic performance unless you're in exponential
territory.

~~~
bermanoid
Actually, on second thought, this isn't quite so easily isomorphic to
traveling salesman (any particular "order shredded strips" problem is easily
equivalent to a traveling salesman problem, but I'm not positive that an
arbitrary traveling salesman problem can be re-cast as an "order shredded
strips" problem in a consistent manner, so I shouldn't make that claim). So I
retract my statement, it's quite possible that someone could effectively solve
this problem for reasonable inputs without quadratic runtime (though I'm not
positive exactly how you would do it, and the efficacy of your method would
likely depend on your assumptions about the inputs).

------
joshmlewis
Isn't this like the bunny slope of DARPA's shredder challenge?

------
wallflower
Interesting. A 90 degree rotation makes a huge difference

> For maximum security, documents should be shredded so that the words of the
> document go through the shredder horizontally (i.e. perpendicular to the
> blades). Many of the documents in the Enron Accounting scandals were fed
> through the shredder the wrong way, making them easier to reassemble.

------
username3
Seems like they're trying to fix the rolling shutter effect on iPhones or
shredded photos the iPhone takes sometimes.

------
vedran
For anyone who's thinking of using Ruby, I recommend OilyPNG instead of
RMagick. It's a C extension to ChunkyPNG and is much faster if you don't need
all of the fancy filters that come with ImageMagick.

<https://github.com/wvanbergen/oily_png>

------
tucaz
Looks like I was able to do it. I'm not sure if they will be happy to get a
result in C#, though.

Also, it was not as easy as I thought it would be. A friend had to help me
with the math to match the strips.

Edit: How long did you guys take to figure this out? I took nearly 4 hours.

------
stuyguy
You know you're done if you understand what this image means (and how it was
made): <http://i.imgur.com/Kn7Za.png>

------
BarkMore
Although Instagram promised a brand-spankin’ new Instagram shirt if you
completed the challenge correctly, they are only sending out stickers.

I hope they are not another Zynga :).

------
zx2c4
It works! Here's a demo: <http://www.youtube.com/watch?v=hILm9oojA48>

~~~
zx2c4
Oh, and here's the source:
<http://git.zx2c4.com/instagram-unshredder/tree/unshred.py>

------
vidoss
Did anyone manage to match the 12th shred next to the 6th? (zero-indexed).
That white and black building is really messing it up.

~~~
vidoss
Ok. Got it working... See demo <http://vidoss.github.com/>

------
dasil003
Some of those Instagram filters look like they make the job easier.

------
st3fan
Doesn't DARPA have a similar challenge?

------
razzaj
In the sample they provided, the shreds seem to be 25 px wide, not 32.

~~~
eCa
You want to use the _linked_ image [1], not the one they show in the post.

[1] [http://instagram-static.s3.amazonaws.com/images/TokyoPanoram...](http://instagram-static.s3.amazonaws.com/images/TokyoPanoramaShredded.png)

------
weka
and this is why I need to start learning python.........

