So why is this completely unrelated activity so important for measuring football players? Because it takes ten seconds to measure, and a guy who can run a 4.2 forty can almost always play pretty damn well.
Tech interviews today are a low cost way of getting a rough idea of ability to code. Judging from Google's output, it's pretty damn effective for them.
Similarly, coding algorithms on whiteboards doesn't tell you much other than how good the candidate is at coding algorithms on whiteboards. Given that the vast majority of their revenue comes from either search (which was built over a decade ago) or companies they've acquired, and the numerous well-hyped failures (Buzz, Wave, +, etc.), I don't think it's actually working all that well for Google either.
So is it more likely that Google is filled with idiots who can't see the error of their ways, or that, across tens of thousands of individual hires, this is the most cost-effective approach and produces the best results (minimizing false positives) on average?
College-admissions style interviewing just doesn't make sense for a company like Google.
The players invited to the combine are the ones teams are considering drafting anyway; all the 40 times do is move players up or down the list by generally small amounts. The point isn't that 40 times are useless, it's that they provide very little additional information about a player. Champ Bailey was going to be a high draft pick no matter what he did at the combine, and everyone already knew that Trindon Holliday was fast but probably too small to succeed in the NFL.
Likewise, someone with a 3.9 from MIT or a bunch of good open source work who's coming in for an in-person interview is already qualified, and the whiteboard doesn't tell you anything new. I'd guess Google sticks with whiteboard interviews for the same reasons teams tout 40 times: it's good marketing both internally (making decisions seem less arbitrary) and externally ("look how tough our interviews are" is a more socially acceptable way of saying "look how smart we are"), and it allows people to deflect blame if a hire doesn't work out. Judging by the number of posts about Google interviews I see here and elsewhere, the marketing is certainly successful.
Also, I think your view of the interview pool is somewhat skewed. Most of the candidates I see do not have 3.9s from MIT (BTW, I believe MIT uses a 5-point GPA scale, so it really would be a 4.9), and a lot didn't go to Ivy League universities.
I'm kinda curious what it'd look like if you took the 2002 version of Google and used it on today's Internet. My guess is it would feel incredibly dated and virtually useless because of spam. We have a couple archived UX studies that were done with the old (pre-2010) UI; I remember that when we launched everybody said "Eww, I hate the new UI. Why change a good thing, Google?" and now when they look at the old UI they're like "Omigod, I can't believe I ever managed to look at that. It's like something straight out of 1998."
That's a pretty fantastic hit rate, especially given how rare it is for any draft pick to work out. Similarly, if Google tries a bunch of projects of which only 10% are expected to work, but 20% of them end up working, then they still did a great job even though 80% of their projects are failures.
For instance: bench-press reps at 225 lbs, vertical leap, reach, etc. (all NFL combine measurements). Great, you are measuring and ranking, but are you measuring anything meaningful? Are you actually looking at the right things? Probably not.
If they were, Wes Welker, Tom Brady, and Victor Cruz (all in the past Super Bowl) would have been first-round picks, probably top 10, not undrafted or late-round picks. And JaMarcus Russell and Ryan Leaf wouldn't have been drafted #1 and #2 overall (all the physical tools but not the mental ones: flameouts).
The point is that we as people tend to think we know what to measure and track, but we likely don't. Frankly, we are probably making it up on the fly, and we convince ourselves and others that these are good measures; until someone figures out the next best thing, they effectively are. But, as I said, they are probably not the best measures, or maybe even good ones, in any absolute sense.
But think about it this way. Both football and programming have ways we can actually see if someone can do what they say they can do: literally, look at the film. Look at game-day film on a player. Look at GitHub or other places for programmers. And, just like with football players, give them REAL scenarios that test their ability to think through a problem in real time. Do the same with a programmer. Put these two together and you get rid of the people who can't think (Russell and Leaf) and you get rid of the "physical specimens" who can't play (most any Oakland WR drafted under Davis).
I personally think this is a much better way that weeds out the most people. Refine your questions and technique and you can spot the people who can and can't perform pretty quickly. If you are unsure, give them a simulated game (programming problem at home) to see what they can do.
The 40-yard dash is a better metric of the ability to change directions and accelerate.