

Ask HN: Resources for image pattern recognition algorithms - ChadB

I am interested in extracting a set of objects from a large, noisy image. For instance: if I had a high-resolution picture of a seating section at a stadium, I might want to extract each baseball cap into individual images given a second picture of a prototypical baseball cap.

There is obviously a lot of research published on this subject (by subject I mean image pattern recognition as a whole). I've been poring through the IEEE, as well as just searching via Google for a couple of days.

My problem is that this is so far from my field of expertise, I feel like I'm wasting an awful lot of time reading research papers that are not applicable or are outdated.

So my question is, if anyone has any experience in this field or a related field, is there a set of known problem definitions I could narrow my research down to? What about known classes of algorithms (and by that I mean more specific than "machine learning" or "neural networks")?

Any advice is mightily appreciated.
======
mahmud
You are right, you're wasting time. Image processing is not the sort of thing
you can pick up on your own "to get work done"; it's extremely demanding and
actually fun. Just hire someone.

If you're prone to falling into "hack mode", learning this stuff will not help
you one bit. The problems are far too interesting and encompass wide areas of
research that are guaranteed to please everybody; low level bit manipulation,
file formats, numerical methods, signal processing stuff with filters and
sampling, wavelets, edge detection, rank, laplacians, convolution, dithering,
ray tracing, morphology, neural networks and other genetic learning
algorithms, heaps of inner-loop and vector optimization, scene detection stuff
that makes use of ray tracing plus interesting data structures like octrees,
statistics, information theory .. in a nutshell, it's something to give up
work, wife and kids for.

Don't ever let curiosity drag you into that tar pit, hire someone.

------
aswanson
You could consult our very own pixcavator:
<http://news.ycombinator.com/user?id=pixcavator>

<http://inperc.com/wiki/index.php?title=User%27s_introduction>

~~~
pixcavator
I wish I could give "positive" advice, but the best I can say is that what you
describe is a tough, tough problem. There is no off-the-shelf solution, and if
you are new at this, you are in for a lot of pain with a slim chance of
success. Sorry, can’t be more helpful.

~~~
ChadB
I suppose it depends on the definition of "success". Learning about the state
of the art and writing some enlightening code, no matter its ultimate
usefulness, would be plenty successful for me.

I'm mostly surprised at the lack of papers I'm able to find that have been
published after the mid '90s.

I will give major kudos to you though, sir. I've spent the last hour or so
reading through the wiki. Please, keep up the great work.

------
dryicerx
I am quite interested to learn more about object recognition as well,
unfortunately I don't have any direct answers to your question.

May I suggest you search for "Content Based Image Retrieval" for research
papers in that field.

~~~
ChadB
Yes, I've done a bit of searching on CBIR actually. There are a few papers out
there about querying at multiple resolutions (like
<http://grail.cs.washington.edu/projects/query/mrquery.pdf>), which will
definitely be helpful.

The next step, for me, is to use a single image as the set to be queried,
without knowing at what scale the target objects appear in the image.
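
One naive way to frame the unknown-scale search is brute force: resize the prototype to several candidate scales and slide each version over the image, keeping the best normalized cross-correlation score. The sketch below is only an illustration under that assumption, not something taken from the linked paper. It uses plain Python lists for grayscale images to stay dependency-free, and the names `resize_nearest`, `ncc`, and `best_match` are made up for this example.

```python
def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour resize of a 2D list of grayscale values."""
    h, w = len(img), len(img[0])
    return [[img[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]


def ncc(image, top, left, tmpl):
    """Zero-mean normalized cross-correlation of tmpl against one window."""
    th, tw = len(tmpl), len(tmpl[0])
    win = [image[top + r][left + c] for r in range(th) for c in range(tw)]
    ref = [v for row in tmpl for v in row]
    mean_win = sum(win) / len(win)
    mean_ref = sum(ref) / len(ref)
    num = sum((a - mean_win) * (b - mean_ref) for a, b in zip(win, ref))
    den = (sum((a - mean_win) ** 2 for a in win)
           * sum((b - mean_ref) ** 2 for b in ref)) ** 0.5
    # Flat windows have zero variance; treat them as "no match".
    return num / den if den else 0.0


def best_match(image, tmpl, scales):
    """Return (score, top, left, scale) for the best window over all scales."""
    best = (-2.0, 0, 0, None)
    for s in scales:
        th = max(1, round(len(tmpl) * s))
        tw = max(1, round(len(tmpl[0]) * s))
        if th > len(image) or tw > len(image[0]):
            continue  # scaled template no longer fits in the image
        scaled = resize_nearest(tmpl, th, tw)
        for top in range(len(image) - th + 1):
            for left in range(len(image[0]) - tw + 1):
                score = ncc(image, top, left, scaled)
                if score > best[0]:
                    best = (score, top, left, s)
    return best
```

Calling `best_match(image, prototype, [0.5, 1, 2])` returns the highest-scoring window and the scale that produced it. This is far too slow for a real high-resolution photo and is not robust to rotation or lighting changes, so treat it as a way to get a feel for the problem rather than a solution.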

------
bravura
I am not a vision specialist, but I work around a lot of them.

Look into the OpenCV image processing library. I haven't used it, but it seems
to implement a lot of basic functionality to get you off the ground.

If you can't find anything good from the last ten years, then look at Yann
LeCun's recent papers. (Google wanted him to be head of research, but he
preferred academia.) In particular, investigate his convolutional networks.

The work of Rob Fergus is more applied, and should lead to good recent
pointers.

Look for works experimenting with the NORB dataset.
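
For anyone unfamiliar with the term, the basic operation inside a convolutional network is sliding a small kernel over the image and taking a weighted sum at each position. A minimal sketch follows, assuming grayscale images stored as plain Python lists. The function name `correlate2d_valid` is invented here, and the fixed Laplacian kernel is only a stand-in: in a convolutional network the kernel weights are learned from data.

```python
def correlate2d_valid(img, kernel):
    """Slide kernel over img (no padding) and return the weighted sums.

    This is cross-correlation, which is what most convolutional network
    implementations compute under the name "convolution".
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(img) - kh + 1
    out_w = len(img[0]) - kw + 1
    return [[sum(img[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


# A hand-picked 3x3 Laplacian kernel: responds to local intensity changes
# and gives zero on flat regions.
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]
```

Applying `correlate2d_valid(image, LAPLACIAN)` to a flat region yields zeros, while edges and isolated bright pixels produce large responses.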

------
ucla_jatt
You might want to check out the OpenCV library. It has something called the
Haar classifier/training. You train it to recognize an object and then it can
look for that object in other images. Here is one example of how it was used to
recognize sign language:
<http://sandarenu.blogspot.com/2008/06/opencv-computer-vision-library.html>
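
To give a rough idea of what such a classifier looks at: Haar cascades in the Viola-Jones style are built from "Haar-like" features, which are differences between pixel sums of adjacent rectangles, computed quickly through an integral image. The sketch below shows only that building block, in plain Python with made-up function names. It is not OpenCV's API and it does no training or detection.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column.

    ii[r][c] holds the sum of all pixels above row r and left of column c.
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Horizontal two-rectangle feature: left half sum minus right half sum."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))
```

A strongly negative or positive value of `two_rect_feature` indicates a vertical light/dark boundary inside the window. A trained cascade combines many such features at different positions and sizes.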

