

Training Hybrid Human-Machine Computer Vision Classifiers - genevievemp
http://blog.cs.brown.edu/2014/01/24/training-hybrid-human-machine-classifiers-1-24-2014/

======
Qworg
Work on "human in the loop" classifiers has been around for a long time.

That said, this is very interesting. The real innovation here is the untrained
humans feeding the expert system via simple visual identification. This
requires a pre-established taxonomy, so you won't get any surprises, but it is
perfect for any large-scale classification task that has fine differentiation
(like biology).

~~~
genevievemp
I'm working on how to seed this system without defining the taxonomy in
advance too. This would enable growing a library of classifiers that would
eventually be able to do a huge number of fine-grained classifications,
without saying in advance what those classifiers have to be. But setting up
the taxonomy beforehand keeps you from training two classifiers that do
essentially the same thing.

~~~
Qworg
Just spitballing, could you do this by asking humans to differentiate two
images that have been classified the same way?

For example, for the pocket square set, once there's a reasonable single
cluster, could you randomly pick two images from it and ask what is different
about them (or first ask "are these the same?")?

You may not get a formal taxonomy, but after enough of this, you could have
something experts could work on. The problem, of course, is that without
expert level insight, you'll get a lot of "that one is prettier".
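Just to make the idea concrete, here's a rough Python sketch of that pipeline (all names are hypothetical, and the vote threshold is just a guess at how you'd filter out the "that one is prettier" noise):

```python
import random
from collections import Counter

def propose_pairs(cluster, n_pairs, seed=0):
    """Sample random image pairs from one cluster to show to crowd workers.
    Assumes n_pairs is well below the number of possible pairs."""
    rng = random.Random(seed)
    pairs = set()
    while len(pairs) < n_pairs:
        a, b = rng.sample(cluster, 2)
        pairs.add((min(a, b), max(a, b)))  # normalize order to dedupe
    return sorted(pairs)

def tally_differences(responses, min_votes=2):
    """Keep only difference phrases that several workers mentioned,
    dropping one-off subjective answers ("that one is prettier")."""
    counts = Counter(r.strip().lower() for r in responses)
    return [phrase for phrase, n in counts.items() if n >= min_votes]
```

So `tally_differences(["striped", "Striped ", "prettier"])` keeps only `"striped"` as a candidate attribute for experts to refine, and the surviving phrases could seed the informal taxonomy.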

~~~
genevievemp
I've done a similar thing trying to create classifiers for scenes and
buildings. It's hard to have a word for 'bottom left corner of a window with
bars on it', for example. And for scenes there's no taxonomy in advance. I
think it works better than expected (as good as automatic patch discovery
methods), but you end up having to train a lot more classifiers because you'll
get a number of redundant classifiers.
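One cheap way to catch those redundant classifiers (a sketch, not what was actually done -- the names and the 0.9 agreement threshold are assumptions) is to compare their outputs on a shared validation set and flag pairs that almost always agree as merge candidates:

```python
from itertools import combinations

def find_redundant(predictions, threshold=0.9):
    """predictions: dict mapping classifier name -> list of 0/1 outputs
    on a shared validation set. Returns (a, b, agreement) for pairs whose
    outputs agree on at least `threshold` of the examples -- candidates
    for merging into a single classifier."""
    redundant = []
    for a, b in combinations(sorted(predictions), 2):
        pa, pb = predictions[a], predictions[b]
        agreement = sum(x == y for x, y in zip(pa, pb)) / len(pa)
        if agreement >= threshold:
            redundant.append((a, b, agreement))
    return redundant
```

Two "bottom left corner of a window with bars" classifiers trained from different seeds would show up here with near-perfect agreement, while genuinely distinct patch classifiers would not.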

Something that might be cool is to have a step where crowd workers try to do
the clustering, and an expert corrects them if they do something that isn't
useful. Then there'd be a back and forth between the experts and the crowd. I
haven't really thought that out though.

