
Where's Waldo? - bkaid
http://stackoverflow.com/questions/8479058/how-do-i-find-waldo-with-mathematica
======
sergeyk
This is a toy example of the kind of problem that the field of Computer Vision
is actively working on: object detection. In a (tiny) nutshell, our best
answer for general images and objects is:

1) Instead of using the full color pixel image, use an "edge image" with some
simple additional normalizations. If color is important, do this per color
channel.

2) Create a dataset with as many cropped examples of the target object as you
can find (mechanical turk is useful for annotating large datasets); every
other crop of every image is a negative example.

3) Train a classifier (SVM if you want it to work, neural network if you're so
inclined) using this dataset.

4) Apply the classifier to all subwindows of a new image to generate
hypotheses of the target object location. This can be sped up in various ways,
but this is the basic idea.

5) Post-process the hypotheses using context (this can be as simple as keeping
the most confident hypothesis within each neighborhood).
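
Steps 4 and 5 can be sketched roughly as follows. This is a toy, pure-Python
illustration and not a real detector: `mean_score` is a made-up stand-in for
the trained classifier from step 3, and the names `detect`,
`non_max_suppression`, `win`, `stride`, and `min_dist` are all hypothetical.

```python
def sliding_windows(height, width, win, stride):
    """Yield (row, col) top-left corners of every win x win subwindow."""
    for r in range(0, height - win + 1, stride):
        for c in range(0, width - win + 1, stride):
            yield r, c


def detect(image, score_fn, win, stride, threshold):
    """Step 4: score every subwindow and keep those at or above threshold."""
    hits = []
    for r, c in sliding_windows(len(image), len(image[0]), win, stride):
        patch = [row[c:c + win] for row in image[r:r + win]]
        s = score_fn(patch)
        if s >= threshold:
            hits.append((s, r, c))
    return hits


def non_max_suppression(hits, min_dist):
    """Step 5, simplest form: keep the most confident hit per neighborhood."""
    kept = []
    for s, r, c in sorted(hits, reverse=True):
        # Keep a hit only if it is at least min_dist away from every kept hit.
        if all(max(abs(r - kr), abs(c - kc)) >= min_dist for _, kr, kc in kept):
            kept.append((s, r, c))
    return kept


def mean_score(patch):
    """Toy stand-in for a trained classifier: mean brightness of the patch."""
    return sum(map(sum, patch)) / (len(patch) * len(patch[0]))


# 8x8 toy image with one bright 3x3 blob whose top-left corner is at (2, 3).
image = [[0.0] * 8 for _ in range(8)]
for r in range(2, 5):
    for c in range(3, 6):
        image[r][c] = 1.0

hits = detect(image, mean_score, win=3, stride=1, threshold=0.4)
best = non_max_suppression(hits, min_dist=3)
print(len(hits), best)
```

Several overlapping windows fire around the blob, and the post-processing step
collapses them into one detection. A real system would score edge features
with the trained classifier and scan several window sizes.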

If you're interested in object detection, an excellent recent summary of the
past decade of research is due to Kristen Grauman and Bastian Leibe:
[http://www.morganclaypool.com/doi/abs/10.2200/S00332ED1V01Y2...](http://www.morganclaypool.com/doi/abs/10.2200/S00332ED1V01Y201103AIM011)
(do some googling if you don't have access to this particular PDF).

A cool paper from a few months ago that should be mentioned when commenting on
a post called "Where's Waldo?" is
[http://www.cs.washington.edu/homes/rahul/data/WheresWaldo.ht...](http://www.cs.washington.edu/homes/rahul/data/WheresWaldo.html)

~~~
apu
Heh, I started reading this comment and was ready to jump on something I
disagreed with, but remarkably we're in full agreement!

Somehow I'm always surprised when two vision people agree on the right way to
approach a problem =)

------
TamDenholm
Something unrelated but perhaps interesting to some people: "Waldo" is
actually a localised name for the USA and Canada; his original name is Wally.

<http://en.wikipedia.org/wiki/Where%27s_Wally%3F>

~~~
robgough
It brings me an almost indescribable joy to find that Wally is the original
name. Yet I have no idea why.

Waldo always seemed a bit of a strange name, and it still confuses me why it
would be changed for the US market. Anyone know why? (Wiki doesn't say.)

~~~
tomjen3
Then you may cry when I tell you that his name is Holger in Denmark.

~~~
haraball
In Norway it's Willy. Kind of strange with all the different names for the
same character.

~~~
judofyr
Why is it strange? "Wally" (pronounced in "correct" Norwegian) sounds really
weird.

~~~
haraball
I should have written fascinating, not strange.

------
6ren
Are there other examples of it working? (if there were links, I couldn't see
them).

There's a danger of _overfitting_ , where a technique works for one instance
(or a subset of instances), but not in general. Detecting stripes could work
in general, but as a SO commenter noted, "Where's Wally" images often include
spurious stripes to undermine this detection strategy for humans.

~~~
qntm
To say nothing of that one Where's Wally image consisting entirely of
imperfect Wally impersonators, with one real Wally identifiable only because
he has the correct hat, glasses and chin.

<http://ecx.images-amazon.com/images/I/61eiboxUJhL.jpg>

------
dice
The algorithm described by Heike is essentially just looking for striped red
and white shirts. Anyone who's done more than a couple of "Where's Waldo?"
games knows that striped shirts are often thrown in to draw one's eye. In
fact, in this very example there is another striped shirt (lower left corner,
just above the wall) which could very well have been Waldo that this algorithm
did not highlight. Without being able to recognize Waldo's human
characteristics (thin, glasses, strong chin) the approach described will
inevitably fail.
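
To make the failure mode concrete, here is a toy sketch (not Heike's
Mathematica code) of a detector that only counts red/white alternations down a
column of RGB pixels. The colour thresholds and the function names are
assumptions picked for illustration.

```python
def is_red(pixel):
    r, g, b = pixel
    return r > 150 and g < 90 and b < 90


def is_white(pixel):
    return min(pixel) > 200


def stripe_transitions(column):
    """Count red<->white switches between adjacent pixels in one column."""
    labels = ["R" if is_red(p) else "W" if is_white(p) else None
              for p in column]
    count = 0
    for a, b in zip(labels, labels[1:]):
        # Only a direct red/white boundary counts as a stripe edge.
        if a is not None and b is not None and a != b:
            count += 1
    return count


RED, WHITE, BLUE = (220, 30, 30), (250, 250, 250), (30, 30, 200)

waldo_shirt = [RED, RED, WHITE, WHITE] * 3   # Waldo's striped shirt
decoy_shirt = [WHITE, WHITE, RED, RED] * 3   # somebody else's striped shirt
plain_shirt = [BLUE] * 12

print(stripe_transitions(waldo_shirt),
      stripe_transitions(decoy_shirt),
      stripe_transitions(plain_shirt))
```

The decoy scores exactly as high as Waldo, so a stripe count alone cannot tell
them apart.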

------
rgarcia
_I had to play around a little with the level. If the level is too high, too
many false positives are picked out._

I was impressed until I read that--the guy is basically fitting the
model/procedure to the training set (of size 1). I'd wait for a more general
approach before accepting the answer.

------
re
On NPR, this turns into: "an algorithm that can find Waldo in any image."

[http://www.npr.org/blogs/waitwait/2011/12/18/143865340/the-w...](http://www.npr.org/blogs/waitwait/2011/12/18/143865340/the-wait-wait-snack-pack) via
[http://meta.stackoverflow.com/questions/116401/stack-overflo...](http://meta.stackoverflow.com/questions/116401/stack-overflow-mentioned-on-nprs-wait-wait-dont-tell-me-and-ny-times)

~~~
rcthompson
On "Wait, Wait, Don't Tell Me!", which is a comedy make-fun-of-the-news quiz
show. They exaggerate everything in that way for comedic effect.

------
ofca
Programming potential never ceases to amaze me. I want to learn more. NOW!

~~~
_mrc
You might want to check out ai-class.com - it includes an introduction to
computer vision (and plenty of other cool stuff).

------
kevinalexbrown
Cool. I've done some work on things like this before. Some of the things I do
to make it work on multiple images:

Template matching is your friend in this case, because most Waldos look
similar. You already tried this in a basic way by searching for the stripes of
a given color. You can make it more powerful by making the template include
more properties, and work in more contexts. For instance: what if Waldo's a
different size?
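
One hedged way to handle the size question is to slide the template over the
image at several scales and keep the best normalized cross-correlation. The
sketch below assumes grayscale images stored as lists of lists and uses
nearest-neighbour resizing; `best_match`, `ncc`, and the scale list are
illustrative names and values, not anything from the original answer.

```python
import math


def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize of a 2D list to new_h x new_w."""
    h, w = len(img), len(img[0])
    return [[img[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]


def ncc(patch, template):
    """Normalized cross-correlation of two equal-sized 2D lists, in [-1, 1]."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # a flat patch carries no pattern to correlate with
    return sum(x * y for x, y in zip(da, db)) / denom


def best_match(image, template, scales):
    """Return (score, row, col, scale) of the best match over all scales."""
    best = (-2.0, None, None, None)
    th, tw = len(template), len(template[0])
    for s in scales:
        h, w = max(1, round(th * s)), max(1, round(tw * s))
        scaled = resize_nearest(template, h, w)
        for r in range(len(image) - h + 1):
            for c in range(len(image[0]) - w + 1):
                patch = [row[c:c + w] for row in image[r:r + h]]
                score = ncc(patch, scaled)
                if score > best[0]:
                    best = (score, r, c, s)
    return best


template = [[1, 0, 1],
            [0, 1, 0],
            [1, 0, 1]]

# 12x12 toy image containing the template at twice its size, at (2, 3).
image = [[0] * 12 for _ in range(12)]
for r, row in enumerate(resize_nearest(template, 6, 6)):
    for c, value in enumerate(row):
        image[2 + r][3 + c] = value

score, row, col, scale = best_match(image, template, scales=(1, 2, 3))
print(round(score, 3), row, col, scale)
```

The search recovers both the position and the scale of the enlarged pattern.
Brute force like this is slow on real pages; it is only meant to show the
idea.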

The other option is to pretend you don't know what Waldo looks like, find him
in a bunch of images, label the subimages as "waldo" candidates, measure
certain properties of those subimages, and find which coordinates of feature
space take similar values across them. Then use these properties as your
template.

Finally, you could train a classifier on subwindows like sergeyk suggested.
This is tricky because Where's Waldo images are hard to subdivide into
subwindows on the scale of a single person. Do you move pixel by pixel? Do you
divide the image into a grid? Each grid cell will contain odd fragments of
people. Etc. If you do find a way to divide the image into
"people" -- perhaps by doing a preliminary "person"-template sweep that
identifies locations of people in the image -- then you can use a supervised
learning algorithm to say "yes, this person is Waldo" or "nope, WRONG!",
based on the image properties in the subwindow around that person.
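
The pixel-by-pixel versus grid question is a trade-off between the number of
windows and how well some window lines up with a person. A small sketch,
assuming square windows the same size as the person and hypothetical helper
names:

```python
def window_corners(height, width, win, stride):
    """Top-left corners of all win x win windows at the given stride."""
    return [(r, c)
            for r in range(0, height - win + 1, stride)
            for c in range(0, width - win + 1, stride)]


def overlap_fraction(corner, target, win):
    """Fraction of the win x win target box covered by the window at corner."""
    (r, c), (tr, tc) = corner, target
    dr = max(0, win - abs(r - tr))
    dc = max(0, win - abs(c - tc))
    return dr * dc / (win * win)


def best_coverage(height, width, win, stride, target):
    """Return (number of windows, best overlap with the target box)."""
    corners = window_corners(height, width, win, stride)
    return len(corners), max(overlap_fraction(k, target, win) for k in corners)


# 64x64 image, 16x16 person whose box starts at (24, 24), i.e. off the grid.
for stride in (16, 4, 1):
    print(stride, best_coverage(64, 64, 16, stride, target=(24, 24)))
```

With a non-overlapping grid (stride 16) the best cell covers only a quarter of
the person, while stride 4 already has a window that lines up fully at about a
tenth of the cost of going pixel by pixel.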

------
viscanti
This needs to be an augmented reality mobile app. The problem on the AI side
of things is that a good algorithm that reliably "learns" what Waldo looks
like would need a substantial number of examples.

A good solution to this would get close, then calculate the probabilities of
every "maybe-waldo" and then display the one with the highest probability of
being Waldo. An augmented reality app that highlighted Waldo on every page
would be awesome.

~~~
shabble
If you've got net access (or even if you don't), it seems almost plausible
that you could just identify the book/page in question and use a lookup table
of coordinates.

I don't know how many variations on the /Where's Wa[a-z]+\?/ theme have
actually been produced, though, so maybe it wouldn't be easier.

Then again, if you can upload unknowns, wait until you've got enough samples
to generate confidence, and then store the result, it'd scale/perform _much_
better :)
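
A rough sketch of that lookup-plus-crowdsourcing idea, under heavy
assumptions: pages are identified by an exact SHA-256 of the image bytes (a
real app would need a fingerprint that tolerates camera angle and lighting),
and a location is stored once `min_votes` matching reports arrive. The class
and method names are invented for illustration.

```python
import hashlib
from collections import Counter, defaultdict


class WaldoLookup:
    def __init__(self, min_votes=3):
        self.known = {}                    # page key -> (x, y) of Waldo
        self.votes = defaultdict(Counter)  # page key -> reported coordinates
        self.min_votes = min_votes

    @staticmethod
    def page_key(image_bytes):
        # Assumption: exact-byte hash stands in for real page recognition.
        return hashlib.sha256(image_bytes).hexdigest()

    def find(self, image_bytes):
        """Return stored coordinates for a known page, else None."""
        return self.known.get(self.page_key(image_bytes))

    def report(self, image_bytes, xy):
        """Record one user's answer; promote it once enough reports agree."""
        key = self.page_key(image_bytes)
        self.votes[key][xy] += 1
        coords, count = self.votes[key].most_common(1)[0]
        if count >= self.min_votes:
            self.known[key] = coords


lookup = WaldoLookup(min_votes=2)
page = b"pretend these are the bytes of a scanned page"

print(lookup.find(page))   # unknown page so far
lookup.report(page, (120, 340))
lookup.report(page, (120, 340))
print(lookup.find(page))   # now served from the table
```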

------
danso
Amusing application, but I'd like to see the version that finds Waldo on the
page in which everyone is wearing striped shirts

~~~
antics
In most normal applications, the only thing that would change is what your
features are. For example, if you wanted to find Waldo using the shape of his
face and/or hat, you would probably just find some SIFT points (or something),
and then build an eigenWaldoface, possibly using a PCA'd set of Waldo faces
and hats as examples, and then SIFT the image and look for the places that are
most like the eigenWaldoface.
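
A minimal sketch of the eigenface part of that idea, assuming flattened
grayscale patches and skipping the SIFT step entirely. It finds the leading
principal component of the example patches by power iteration and scores a new
patch by how far it lies from the mean-plus-component subspace. All names and
the 4-pixel toy data are made up.

```python
import math


def fit_eigenface(samples, iters=50):
    """Return (mean, leading principal direction) of flattened patches."""
    n, dim = len(samples), len(samples[0])
    mean = [sum(col) / n for col in zip(*samples)]
    centered = [[x - m for x, m in zip(s, mean)] for s in samples]
    # Assumed start vector; it must not be orthogonal to the true direction.
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        # Multiply v by X^T X without forming the covariance matrix.
        proj = [sum(a * b for a, b in zip(row, v)) for row in centered]
        w = [sum(p * row[j] for p, row in zip(proj, centered))
             for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mean, v


def distance_from_face_space(patch, mean, v):
    """Reconstruction error of a patch using the mean plus one component."""
    d = [x - m for x, m in zip(patch, mean)]
    coeff = sum(a * b for a, b in zip(d, v))
    residual = [a - coeff * b for a, b in zip(d, v)]
    return math.sqrt(sum(x * x for x in residual))


# Toy "Waldo faces": 4-pixel patches that vary along a single direction.
examples = [[1.0 + t, t, 1.0, 0.0] for t in (-0.2, -0.1, 0.0, 0.1, 0.2)]
mean, component = fit_eigenface(examples)

face_like = [1.05, 0.05, 1.0, 0.0]
not_face = [0.0, 1.0, 0.0, 1.0]
print(distance_from_face_space(face_like, mean, component))
print(distance_from_face_space(not_face, mean, component))
```

Patches with a small distance look like the examples; scanning an image and
ranking subwindows by this distance is one way to "look for the places that
are most like" the learned face.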

This article is not interesting because it's an amazing new algorithm or
something that solves some important world problem. It's interesting because
it takes a tool that the general hacker population doesn't know makes this
sort of thing really easy, and accomplishes it in a fairly simple way.

Don't be a grump, this is cool. :(

~~~
danso
Ha, I wasn't being a grump...these kinds of problems are an important part of
the evidence in showing the practicality and usefulness of code. I love it.

I mostly wanted to see who else remembered that particular Waldo puzzle...it
was the final one in one of the books

~~~
ComputerGuru
I do. It took us an entire plane trip from Sacramento to Chicago to find him
there :)

------
brianbreslin
Interesting problem. I'd like to then apply this concept of finding a needle
in a haystack to satellite imagery. Using super-computing + giant image data
sets, you could theoretically find some pretty obscure stuff if you knew what
you were looking for (hidden treasures???).

------
jastr
This is undoubtedly a data point on the path to the singularity.

