

Content-based image classification in Python - glamp
http://blog.yhathq.com/posts/image-classification-in-Python.html

======
droz
Well written piece, kudos to the author.

If we are uploading cheques and licenses, why not just have a form with a
separate upload section for each document? Sure, people might get the two
backward by accident, but is that error rate really any worse than the false
positive rate of the proposed solution?

If you are going to go the machine learning route, then it really only makes
sense if you go all the way and do some OCR on the documents to pick out the
meaningful information to prepopulate some input screen that would likely
appear after submitting those documents.

~~~
wicknicks
Yes, content analysis can be greatly simplified using symbolic inputs, such as
GPS tags in images or user input in OP's check example. Most industrial
pipelines use sophisticated barcode scanners to scan objects very quickly, but
the cameras are at a fixed distance/height from the object and the objects are
aligned in a particular direction. Some purists would argue that this is not
computer vision, but it makes for some excellent engineering, and solves some
hard problems very well.

------
lamp344
I find computer vision so interesting. Does anyone know if there are many
consulting opportunities dealing with it?

Or is it a really crowded field because it's so interesting to so many
people? (I never see people looking for computer vision researchers in the
"Who's Hiring" threads.)

~~~
zwieback
I used to work as a machine/computer vision engineer, but that was before the
web was big, so the image retrieval opportunities didn't exist.

I worked mostly in small specialty companies doing inspection systems
(sorting, metrology, controls, robotics) and then for many years at HP in
manufacturing. In those positions computer vision was part of the job so it
often didn't show up in the initial keyword searches. Most of the jobs
required some additional skills, most often optics and some controls or
automation.

If you broaden your search to include robotics and automation you may find
more. Also, medical imaging is fascinating and of course there's defense but I
didn't want that.

------
te
For this particular problem, how does the proposed solution compare to just
classifying based on image size?
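
One way to answer that is to build the size-only baseline and compare its
error rate against the article's classifier. Below is a minimal sketch of such
a baseline, assuming the only feature is aspect ratio (width / height) and
that each class is summarized by its mean ratio. The function names and the
pixel dimensions are made up for illustration; they are not from the article.

```python
from statistics import mean


def aspect_ratio(size):
    """Width divided by height for a (width, height) tuple in pixels."""
    width, height = size
    return width / height


def fit_size_baseline(labeled_sizes):
    """Compute the mean aspect ratio per label from (label, size) pairs."""
    ratios_by_label = {}
    for label, size in labeled_sizes:
        ratios_by_label.setdefault(label, []).append(aspect_ratio(size))
    return {label: mean(ratios) for label, ratios in ratios_by_label.items()}


def predict(centroids, size):
    """Return the label whose mean aspect ratio is closest to this image's."""
    ratio = aspect_ratio(size)
    return min(centroids, key=lambda label: abs(centroids[label] - ratio))


# Hypothetical training data: the sizes are invented, not measured.
training = [
    ("check", (1200, 520)),
    ("check", (1000, 450)),
    ("license", (860, 540)),
    ("license", (1010, 640)),
]
centroids = fit_size_baseline(training)

print(predict(centroids, (1100, 480)))
print(predict(centroids, (900, 570)))
```

If users crop or photograph documents at arbitrary angles, the ratios will
overlap and this baseline should degrade quickly, which is where a
content-based approach would have to earn its keep.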

------
balakk
I'm kinda curious why they didn't use any of the popular local feature
descriptors to cluster.

------
piqufoh
So why would I use yhat? Won't this run quite well locally?

~~~
glamp
You're right, it does work locally.

yhat is built for embedding these types of analyses in a production
application.

~~~
primaryobjects
Couldn't you also serialize the resulting model to JSON (including
weights/values), stick that JSON hard-coded into your source, and deserialize
at runtime to execute the model? This is how I typically deploy neural
networks and other machine learning models.

For example, http://colorbot.herokuapp.com stores the optimal model for the
neural network in JSON, right in the code.
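
A minimal sketch of that pattern, assuming a tiny one-hidden-layer network
with sigmoid units. The weights, layer names, and JSON layout here are
invented for illustration and are not taken from colorbot or from yhat.

```python
import json
import math

# Hypothetical trained weights, hard-coded in the source as a JSON string.
MODEL_JSON = """
{
  "hidden": {"weights": [[1.0, -1.0], [-1.0, 1.0]], "biases": [0.0, 0.0]},
  "output": {"weights": [[1.0, 1.0]], "biases": [-0.5]}
}
"""


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dense(layer, inputs):
    """Apply one fully connected sigmoid layer to a list of inputs."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + bias)
        for row, bias in zip(layer["weights"], layer["biases"])
    ]


def load_model(text):
    """Deserialize the model at runtime; it is just nested lists and dicts."""
    return json.loads(text)


def predict(model, inputs):
    """Run the forward pass: inputs -> hidden layer -> single output."""
    hidden = dense(model["hidden"], inputs)
    return dense(model["output"], hidden)[0]


model = load_model(MODEL_JSON)
print(predict(model, [1.0, 0.0]))

# Serializing back out is the same call you would use after training.
serialized = json.dumps(model)
```

This works when the model is small and the prediction code is easy to
reimplement in the application's language. The trade-off is that every
retrain means a code change and redeploy.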

