
The Naive Approach to Hiring People - muriithi
http://weblog.raganwald.com/2008/02/naive-approach-to-hiring-people.html
======
cstejerean
Interesting idea. But I wonder how much it helps to use a classifier for
processing resumes. If you have a high number of resumes to process it makes
sense to attempt to automate it. But if the volume you are processing is low
enough you might be able to get by with letting people make decisions.

For example I can tell way better than Google's spam filtering software
whether an email is spam or ham. The only reason to use filtering in that case
is to save me the hassle of having to dismiss 50+ spam messages per day.

The human mind is pretty good at picking up patterns (even without trying), so
after a while experienced interviewers start to see which kinds of hires worked
out and which kinds they later regretted. I guess using a naive classifier
would eliminate some bias towards things like top colleges and years of
experience.

The real problem with resumes however is that anyone can put absolutely
anything on it. So it's a bad source for training a classifier. It would be
interesting however to measure things like the probability of your
interviewers making correct decisions, or (if you use standard questions) the
correlation between answers to questions and performance (in the event of a
hire).
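
A minimal sketch of that last measurement, assuming you have recorded a
numeric score for one standard question and a later performance rating for
each person you hired. The data and the names `question_scores` and
`performance` are made up for illustration, and plain Pearson correlation is
just one reasonable choice of statistic.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: one entry per hire, in the same order in both lists.
question_scores = [1, 2, 3, 4, 5]   # interviewer's score on a standard question
performance = [2, 1, 4, 3, 5]       # rating after some months on the job

print(round(pearson(question_scores, performance), 2))  # prints 0.8
```

Note that this only covers people who were actually hired, so it says nothing
about the candidates you turned away.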

Ultimately I think I agree with Joel that the only proper way to decide if
someone is worth hiring is to have a technical discussion with them and see
how it goes. Getting a feel for how someone writes code is even more important
(but I'm not necessarily sure that putting people on the spot in front of a
whiteboard is the best way). After someone has been hired I can usually tell
within several minutes of working with them on real production code whether
they were a good addition to the company.

~~~
raganwald
Thank you. No, that is not strong enough: THANK YOU!!!

> It would be interesting however to measure things like the probability of
> your interviewers making correct decisions, or (if you use standard
> questions) the correlation between answers to questions and performance (in
> the event of a hire).

That was my objective when writing the post, although judging by responses on
Reddit, most people thought I was saying that naive Bayesian filters could
outperform human hiring managers.

All I was suggesting is that people "Apply what we know about classification
to hiring." And one of the things we know is that _any_ classifier--human or
machine or human assisted by machine--can be improved by regular investigation
of the correlation between the features you observe and the results you
obtain.
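
One way to sketch that kind of investigation, assuming each past hire is
recorded as a set of observed features plus whether the hire worked out. The
feature names and records are hypothetical, and the smoothed log-likelihood
ratio used here is the per-feature quantity a naive Bayes classifier works
with, not a complete classifier.

```python
from collections import Counter
from math import log


def feature_log_odds(records, smoothing=1.0):
    """Score each feature by log(P(feature | good) / P(feature | bad)).

    records: list of (set_of_features, worked_out) pairs.
    Positive scores lean towards good hires, negative towards regretted ones.
    """
    good_total = sum(1 for _, worked_out in records if worked_out)
    bad_total = len(records) - good_total
    good_counts, bad_counts = Counter(), Counter()
    for features, worked_out in records:
        (good_counts if worked_out else bad_counts).update(features)

    scores = {}
    for feature in set(good_counts) | set(bad_counts):
        # Laplace smoothing keeps rare features from producing log(0).
        p_good = (good_counts[feature] + smoothing) / (good_total + 2 * smoothing)
        p_bad = (bad_counts[feature] + smoothing) / (bad_total + 2 * smoothing)
        scores[feature] = log(p_good / p_bad)
    return scores


# Hypothetical history: features seen on the resume, and how the hire went.
history = [
    ({"io", "java"}, True),
    ({"io"}, True),
    ({"java"}, False),
    ({"java", "top_college"}, False),
]

for feature, score in sorted(feature_log_odds(history).items()):
    print(f"{feature}: {score:+.2f}")
```

Re-running this as new outcomes come in is the "regular investigation" part: a
feature whose score drifts towards zero has stopped telling you anything.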

> The real problem with resumes however is that anyone can put absolutely
> anything on it.

True, although security by obscurity works for hiring. It's a lot more like
CAPTCHAs than it is like spam: if you are the only one who thinks that Io and
Self language experience correlates positively with performance as a
JavaScript programmer, I think you will find that Io and Self are keywords
that will maintain a high correlation.

Then one day Joel Spolsky writes a post about that, someone reads it and puts
it in a job listing, a headhunter sees it and adds it to some candidate
resumes, and within a year the correlation drops off the map and you have to
start looking for something else in resumes or you have to aggressively
fizzbuzz it in phone screens.

C'est la vie.

~~~
cstejerean
Alright, you've cleared up some of the misconceptions I had about your post.
Great point regarding the security by obscurity in resumes.

If I get a chance I might write a script to search the web for publicly
accessible resumes and track changes in keywords over time. Might be able to
pick up some interesting trends.
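
A rough sketch of the tracking half of such a script, assuming the resumes
have already been collected as plain text and grouped by period. The
collection step is left out entirely, and the function name, the `snapshots`
layout, and the sample texts are all hypothetical.

```python
import re


def keyword_share_by_period(snapshots, keywords):
    """Fraction of resumes in each period that mention each keyword.

    snapshots: {period: [resume_text, ...]}
    Returns:   {period: {keyword: fraction between 0.0 and 1.0}}
    """
    # Match whole words only, so "Io" does not match inside "radio".
    patterns = {
        keyword: re.compile(r"(?<!\w)" + re.escape(keyword) + r"(?!\w)", re.IGNORECASE)
        for keyword in keywords
    }
    shares = {}
    for period, resumes in snapshots.items():
        shares[period] = {}
        for keyword, pattern in patterns.items():
            hits = sum(1 for text in resumes if pattern.search(text))
            shares[period][keyword] = hits / len(resumes) if resumes else 0.0
    return shares


# Hypothetical snapshots: a handful of resume texts per year.
snapshots = {
    2007: ["Java and Io", "Python, radio hobbyist"],
    2008: ["Io, Self, Javascript", "Io and Ruby"],
}

for period, shares in sorted(keyword_share_by_period(snapshots, ["Io", "Self"]).items()):
    print(period, shares)
```

A keyword whose share climbs quickly from one period to the next is a
candidate for the "correlation drops off the map" effect described above.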

