I appreciate that you guys might not be statisticians, but if you're going to try to analyze data like this, you simply must address survivor bias. As it stands, these data are meaningless unless you can assume dropouts are completely unrelated to your screening.
You claim doing well on the Fizzbuzz wasn't correlated with interview performance, but you also said "We saw twice the drop off rate on the coding problems as we saw on the quiz."
An alternate explanation for your finding, then, is that more low-quality candidates drop out of the process when given FizzBuzz, leaving a relatively homogeneous pool of higher-quality candidates for the later interview. That effectively shrinks the meaningful interindividual differences relative to noise, which will reduce the correlation.
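To see the mechanism, here's a toy numpy simulation (purely illustrative; the numbers are not from your data): one latent skill drives both the screen score and the interview score, and merely restricting the pool to the stronger half of candidates knocks the observed correlation down.

    # Toy range-restriction demo (illustrative numbers, not real data).
    # Assumes one latent "skill" drives both the screen and the interview.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000
    skill = rng.normal(size=n)
    screen = skill + rng.normal(size=n)      # screen score = skill + noise
    interview = skill + rng.normal(size=n)   # interview score = skill + noise

    survivors = skill > np.median(skill)     # stand-in for candidates who didn't drop out
    print(np.corrcoef(screen, interview)[0, 1])                        # ~0.50 over everyone
    print(np.corrcoef(screen[survivors], interview[survivors])[0, 1])  # ~0.27 among survivors

The screen is exactly as predictive in both cases; only the pool changed.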
In all likelihood, both of your correlations are low, but the idea that coding is less predictive than a quiz could be purely a statistical fluke due to survivor bias.
All candidates did both screens (quiz and FizzBuzz). The correlations were calculated against the same population. Now, I agree that survivor bias could affect the quality of these results (we know nothing about the significant % of people who dropped out). But it's not really possible to solve that problem outside of a lab, and I don't think it's an argument against doing the analysis at all. For now we're simply trying to minimize the drop-off rate and maximize correlation. The quiz was better at both.
Well, having the candidates do both screens is better, but it doesn't totally solve your problems.
It doesn't address the survivor bias issue, and when you say a significant percentage dropped out, that's not reassuring. But it's not the case that you need a lab to solve the problem. Even a basic self-assessment questionnaire of programming ability might tell you whether there are meaningful differences in the population that quits your process. At the very least, you should understand and discuss survivor bias in your article to indicate you're aware of the issue.
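For example (a sketch only; the ratings below are made up, and the question itself would be whatever you choose to ask), a plain two-sample test on a one-question self-rating would tell you whether the quitters look systematically different from the finishers:

    # Sketch: compare a self-rated skill question (1-10) between people who
    # finished the screen and people who dropped out. Numbers are hypothetical.
    from scipy import stats

    finished = [7, 6, 8, 5, 7, 9, 6]   # self-ratings of completers
    dropped  = [4, 6, 3, 5, 2, 6, 4]   # self-ratings of dropouts

    t, p = stats.ttest_ind(finished, dropped, equal_var=False)  # Welch's t-test
    print(f"t = {t:.2f}, p = {p:.3f}")  # a small p suggests the two groups differ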
Even if you still want to claim a difference between the quiz and coding exercise, you're not yet in the clear. For example, did you counterbalance the order in which you gave them to people? E.g., if everybody did the quiz first and the FizzBuzz second, they were mentally fresher for the quiz and slightly more fatigued for the FizzBuzz, which could again create a spurious result. And this definitely doesn't require a lab to test.
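If you did randomize the order, the check is cheap: compute the screen/interview correlation separately within each order group and see whether they diverge. A sketch with made-up numbers:

    # Sketch: check for an order effect by computing the FizzBuzz/interview
    # correlation separately per order group. The pairs below are invented.
    import numpy as np

    # (fizzbuzz_score, interview_score) pairs, split by which screen came first
    quiz_first     = np.array([[3, 4], [5, 6], [2, 3], [4, 5], [6, 6], [3, 3]])
    fizzbuzz_first = np.array([[4, 5], [2, 2], [5, 5], [3, 4], [6, 7], [4, 4]])

    for name, grp in [("quiz first", quiz_first), ("fizzbuzz first", fizzbuzz_first)]:
        r = np.corrcoef(grp[:, 0], grp[:, 1])[0, 1]
        print(name, round(r, 2))   # a big gap between the two r values hints at an order effect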
Don't misunderstand me: I appreciate your attempts to quantify all this, and I actually think you guys have roughly the correct result (given the limited nature of FizzBuzz-style coding), but when you step into the experimental psych arena, you need to learn how to properly analyze your data. Given that your business is predicated on analyzing how your hires do in the real world, you really need to up your analytical game.
I have to agree with kingmob. It very much sounds like survivor bias. My first reaction is that anyone who drops out due to a test has a high likelihood of dropping out because they can't do the test, which would leave you with a test showing low correlation among the survivors, but a very strong anti-correlation across the total population.
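To put a toy number on that last point (simulated data, not theirs): if it's mostly the weaker candidates who quit, the bare fact of dropping out correlates strongly and negatively with underlying ability.

    # Toy illustration: "dropped out" as a signal in its own right. Simulated data.
    import numpy as np

    rng = np.random.default_rng(1)
    skill = rng.normal(size=100_000)
    dropped = (skill + rng.normal(scale=0.5, size=100_000)) < 0   # weaker people quit more often

    # Point-biserial correlation between the dropout indicator and skill:
    print(np.corrcoef(dropped.astype(float), skill)[0, 1])        # roughly -0.7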
I read a blog post a couple of years ago by a game programmer/designer who outsources a lot of work through places like odesk/elance. Basically, to weed out the fakers, he'd offer anyone ~5 hrs at their bidding rate to finish a predefined programming task expected to take ~5 hrs. He says this usually drops his pool to fewer than 10 out of the hundreds who may apply, and he can usually use at least one of the people who complete the task. It's hard to say how many of these people go away because the task looks too big or because there's a risk of not getting paid, but it's clearly a good filter for him.
As far as measuring this survivor bias goes, you might gain some insight by randomly altering the order of the testing. You could measure when people tended to drop off. You might even find that people tend to drop off after about the same amount of time, or maybe after a certain amount of effort. It might even be worth paying people to see if that would improve completion rates (while introducing its own biases).
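A minimal version of that check (the counts here are hypothetical): randomize the order, tally completions and drop-offs per condition, and run a plain chi-square test.

    # Sketch: does completion depend on which screen came first? Counts are hypothetical.
    from scipy.stats import chi2_contingency

    #                  completed, dropped
    quiz_first     = [       120,      30]
    fizzbuzz_first = [        95,      55]

    chi2, p, dof, expected = chi2_contingency([quiz_first, fizzbuzz_first])
    print(f"chi2 = {chi2:.2f}, p = {p:.4f}")  # a small p means drop-off varies with order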
This is not the kind of comment HN needs more of. A better version would (a) drop the snarky putdown, and (b) actually say what Google's conclusions are. Then readers could decide for themselves to what degree those findings contradict these, instead of being told what to think.