The submitted site seems to be having trouble keeping up with the referrals from Hacker News, judging by its response time. (And that is amusing for a site called "refer.ly" that is all about user referrals. The site still needs upgrading to work at scale.)
The question is "I want you to explain something to me. Pick any topic you want: a hobby you have, a book you’ve read, a project you worked on–anything. You’ll have just 5 minutes to explain it. At the beginning of the 5 minutes you shouldn’t assume anything about what I know, and at the end I should understand whatever is most important about this topic."
I have to say that the proposed job interview question is interesting, and the thought process outlined in the blog post kindly submitted here for evaluating answers to the question is also interesting. That made me think of a way to evaluate the hiring procedure mentioned in this blog post: do empirical validation of whether people hired through that procedure really do better work over the course of their careers than people hired through other procedures. That's the scientific way to decide which hiring procedure to use.
Here, in a FAQ that should take less than five minutes to read for a native speaker of English, is what is most important about what science has validated on the topic of company hiring procedures. The review article by Frank L. Schmidt and John E. Hunter, "The Validity and Utility of Selection Methods in Personnel Psychology: Practical and Theoretical Implications of 85 Years of Research Findings," Psychological Bulletin, Vol. 124, No. 2, 262-274,
sums up, current to 1998, a meta-analysis of much of the HUGE peer-reviewed professional literature in industrial and organizational psychology devoted to business hiring procedures. There are many kinds of hiring criteria, such as in-person interviews, telephone interviews, resume reviews for job experience, checks for academic credentials, personality tests, and so on. There is much published research on how job applicants perform after they are hired, in a wide variety of occupations.
EXECUTIVE SUMMARY: If you are hiring for any kind of job in the United States, prefer a work-sample test as your hiring procedure. If you are hiring in most other parts of the world, use a work-sample test in combination with a general mental ability test.
The overall summary of the industrial psychology research in reliable secondary sources is that two kinds of job-screening procedures work reasonably well. One is a general mental ability (GMA) test (an IQ-like test, such as the Wonderlic personnel screening test). The other is a work-sample test, in which the applicant performs an actual task, or group of tasks, like those the applicant would do on the job if hired. The two have about the same validity in screening applicants, with the GMA test better predicting success for applicants who will be trained into a new job. (Even so, the calculated validity of each of these two best procedures, standing alone, is only 0.54 for work-sample tests and 0.51 for general mental ability tests.) Neither is perfect (both miss some good performers on the job, and select some bad performers on the job), but both are better than any other single-factor hiring procedure that has been tested in rigorous research, across a wide variety of occupations. So if you are hiring for your company, it's a good idea to think about how to build a work-sample test into all of your hiring processes.
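For readers unfamiliar with the jargon: a validity coefficient like 0.54 is just the Pearson correlation between scores on the selection procedure and later job performance. A minimal sketch of what that number measures (the scores and ratings below are invented for illustration, not data from the study):

```python
def pearson_r(xs, ys):
    # Pearson correlation coefficient between two equal-length sequences.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical work-sample scores and later performance ratings:
scores  = [55, 62, 71, 74, 80, 83, 90]
ratings = [2.1, 3.0, 2.8, 3.4, 3.1, 4.0, 3.9]
r = pearson_r(scores, ratings)  # a single number between -1 and 1
```

A validity of 0.54 means the procedure explains only about 29% of the variance in later performance (0.54 squared), which is why even the best single procedure misses some good performers and admits some bad ones.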
Because of a Supreme Court decision in the United States (the decision does not apply in other countries, which have different statutes about employment), it is legally risky to give job applicants general mental ability tests such as a straight-up IQ test (as was commonplace in my parents' generation) as a routine part of hiring procedures. The Griggs v. Duke Power, 401 U.S. 424 (1971) case
interpreted a federal statute about employment discrimination and held that a general intelligence test used in hiring that could have a "disparate impact" on applicants of some protected classes must "bear a demonstrable relationship to successful performance of the jobs for which it was used." In other words, a company that wants to use a test like the Wonderlic, or like the SAT, or like the current WAIS or Stanford-Binet IQ tests, in a hiring procedure had best conduct a specific validation study of the test related to performance on the job in question. Some companies do the validation study, and use IQ-like tests in hiring. Other companies use IQ-like tests in hiring and hope that no one sues (which is not what I would advise any company). Note that a brain-teaser-type test used in a hiring procedure could be challenged as illegal if it can be shown to have disparate impact on some job applicants. A company defending a brain-teaser test for hiring would have to defend it by showing it is supported by a validation study demonstrating that the test is related to successful performance on the job. Such validation studies can be quite expensive. (Companies outside the United States are regulated by different laws. One other big difference between the United States and other countries is the relative ease with which workers may be fired in the United States, allowing companies to correct hiring mistakes by terminating the employment of the workers they hired mistakenly. The more legal protections a worker has from being fired, the more reluctant companies will be about hiring in the first place.)
The social background to the legal environment in the United States is explained in many books about hiring procedures.
Previous discussion on HN pointed out that the Schmidt & Hunter (1998) article showed that multi-factor procedures work better than single-factor procedures, a point echoed in the current professional literature, for example in "Reasons for being selective when choosing personnel selection procedures" (2010) by Cornelius J. König, Ute-Christine Klehe, Matthias Berchtold, and Martin Kleinmann:
"Choosing personnel selection procedures could be so simple: Grab your copy of Schmidt and Hunter (1998) and read their Table 1 (again). This should remind you to use a general mental ability (GMA) test in combination with an integrity test, a structured interview, a work sample test, and/or a conscientiousness measure."
But the 2010 article notes, looking at actual practice of companies around the world, "However, this idea does not seem to capture what is actually happening in organizations, as practitioners worldwide often use procedures with low predictive validity and regularly ignore procedures that are more valid (e.g., Di Milia, 2004; Lievens & De Paepe, 2004; Ryan, McFarland, Baron, & Page, 1999; Scholarios & Lockyer, 1999; Schuler, Hell, Trapmann, Schaar, & Boramir, 2007; Taylor, Keelty, & McDonnell, 2002). For example, the highly valid work sample tests are hardly used in the US, and the potentially rather useless procedure of graphology (Dean, 1992; Neter & Ben-Shakhar, 1989) is applied somewhere between occasionally and often in France (Ryan et al., 1999). In Germany, the use of GMA tests is reported to be low and to be decreasing (i.e., only 30% of the companies surveyed by Schuler et al., 2007, now use them)."
Integrity tests have limited validity standing alone, but appear to have significant incremental validity when added to a general mental ability test or work-sample test.
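"Incremental validity" has a concrete meaning here: a second predictor raises the validity of the combination whenever it is not redundant with the first. A sketch using the standard two-predictor multiple-correlation formula (the 0.51 and 0.54 validities are from Schmidt & Hunter; the 0.38 intercorrelation between the two predictors is a made-up assumption for illustration):

```python
from math import sqrt

def multiple_r(r1, r2, r12):
    # Correlation of the optimally weighted two-predictor composite with
    # the criterion: R^2 = (r1^2 + r2^2 - 2*r1*r2*r12) / (1 - r12^2)
    return sqrt((r1**2 + r2**2 - 2 * r1 * r2 * r12) / (1 - r12**2))

gma, work_sample = 0.51, 0.54  # single-procedure validities (Schmidt & Hunter, 1998)
intercorr = 0.38               # assumed correlation between the two predictors
combined = multiple_r(gma, work_sample, intercorr)  # higher than either alone
```

The less the two procedures overlap (the lower `intercorr`), the bigger the gain from combining them, which is why adding an integrity test to a GMA test helps more than adding a second IQ-like test would.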
AFTER EDIT: A kind comment on this comment graciously assumes I wrote this FAQ only after the article opening this thread was submitted. In fact, as a kind reply to that comment pointed out, I prepared this FAQ document in advance, because questions about company hiring procedures come up again and again on Hacker News. I began summarizing the research about six months ago, and other participants here on HN have helped me take this FAQ through several revisions; it reached its current form about two months ago. I like to store electrons to conserve keystrokes.
I agree that intelligence and work samples are the most critical things to look at. The trick with work samples is that most people focus too narrowly.
Yes, coding is a critical part, so don't skip it. But so is team communication (no matter how great of a programmer you are, if you won't respond to my emails, I won't think you are a great employee). So are code reviews (if you call people morons for not agreeing with your style, no matter how great of a programmer you are, I won't think you are a great employee). So is mentoring (if you can't explain concepts to people who aren't as smart/experienced/whatever as you, no matter how great of a programmer you are, I won't think you are a great employee).
I could go on. I think it is some of these softer skills that many interview questions have tried (and mostly failed) to suss out. Back to the OP, I can see it being a reasonable effort at experiencing some of the non-coding work requirements.
Much better than "how many ping-pong balls would it take to fill this room" and "design a nuclear reactor for me", both of which have been asked of me in web developer interviews.
The ping-pong ball question has been used a lot to see how (and whether) engineers will attempt a reasonable estimate with proper caveats. Refusing to attempt it at all is an auto-fail, as is going into far too much detail; beyond that, the answer is judged on reasoning, approach, and whether decent upper and lower bounds are given.
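For what it's worth, the expected shape of a good answer is a bounded order-of-magnitude estimate. A sketch, where every input is an explicitly stated assumption (the room size and packing fractions are my own, not from the comment above):

```python
import math

room_volume = 5.0 * 4.0 * 3.0      # assume a 5 m x 4 m x 3 m room: 60 m^3
d = 0.04                           # ping-pong ball diameter: 40 mm
ball_volume = math.pi / 6 * d**3   # volume of one sphere

# Lower bound: simple cubic stacking, one ball per d^3 bounding cube (~52% fill).
low = room_volume / d**3
# Upper bound: densest possible sphere packing (~74% fill).
high = 0.74 * room_volume / ball_volume
# Both bounds land near a million balls: a defensible range, not a precise count.
```

The point of the exercise is exactly this structure: stated assumptions, a defensible range, and a sense of which inputs dominate the answer.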
I imagine the reactor is similar but also checks if you can say where you are not competent and should defer or delegate.
I've never worked in consulting, but, so far as I know, the big MBB consulting firms tend to incorporate most of the measures advised in tokenadult's post when evaluating potential employees: GMAT cut-offs, GPA cut-offs, and very lengthy interviews that incorporate questions like the one discussed here. They also ultimately prefer to hire people who have previously interned with them, and what better work-sample test is there than an internship?
That sounds like a pretty solid vetting process to me, yet there's a lot of pessimism surrounding big-name consulting. I'd like to hear from anyone who's worked there or gone through this process about why these firms ultimately get such a bad rep.
Consulting companies built their reputations on their work in the 60s, 70s, and 80s. They encouraged companies to do things we now consider Business 101 (it was cutting-edge stuff back then though). Several major companies in the 1960s had no insight into their competitors or even a full picture of the market they were in. These were the sort of organizations that collected very little data beyond sales and balance sheet figures - it was like they lived in a "bubble". The pioneering consulting firms popped those bubbles and actually taught businesses how to do basic market analysis, and turn that analysis into business strategy.
If you're at all curious, I'd recommend reading The Lords of Strategy by Walter Kiechel, which is an (albeit opinionated) history of the consulting practice.
I'm going to try to answer your question but I imagine it's very Anna Karenina-esque: "Happy families are all alike; every unhappy family is unhappy in its own way."
Often the problem is the size of the firm, allowing institutional problems to sabotage otherwise good practices. Instead of recognizing and promoting the things that will improve the firm long-term, they implement good hiring practices only to prevent lawsuits, bad press, etc.
Revitalizing a large company before it dies is a very difficult problem.
That would make certain searches of news.ycombinator.com more frustrating.
Web pages that duplicate other web pages have been around for almost 20 years now (documentation by the GNU project being a notable early example) and even with all of their resources, Google still fails to filter out duplicates a significant fraction of the time.
Why do you think most people prefer to read it "inline"?
Far less snark, don't you think?
My answer would be: because some of my highest voted comments have been snippets copied from a larger work instead of just linking to the larger work. Perhaps people don't want to open up a new tab, perhaps they don't want to load a new page, perhaps they appreciate keeping the content in context with the rest of the comment thread, perhaps they appreciate keeping the content on HN for "prosperity sake" due to the ever-breaking links of the rest of the web.
Sigh. Thanks. This is the second time I've been caught with an eggcorn. First "windshield factor", now "for prosperity sake". Looks like I might have the privilege of adding to their database myself this time...
The main article tokenadult is talking about is the Schmidt & Hunter paper: "The conclusions in this article apply mainly to the middle 62% of jobs in the U.S. economy in terms of complexity. [...] This category includes skilled blue collar jobs and mid-level white collar jobs, such as upper level clerical and lower level administrative jobs."
The article says that the most common measure of employee ability in general, and presumably for the studies used in this meta-study, was the amount of money each employee earned. Presumably, then, most of these employees were doing routinized or semi-routinized labor that would lend itself to piece-rate compensation. The quote above seems consistent with this interpretation. The studies also look mainly at employees of huge corporations, where each person is one of dozens or hundreds doing more or less the same thing.
So is this meta-study relevant to hiring programmers? Yes and no. Probably the measures that were found to have more validity are still going to have more validity than the ones that were found to have less validity. But at the same time using only these tests would be too simplistic; they were designed to predict which factory workers were likely to steal from the company or slack off. And while these are still important factors to consider, the challenges of a modern startup go way above and beyond this.
Essentially this research was designed mainly for situations where you're trying to scale up a large industrial process where you can make money by arbitraging the difference between the output of the average employee and the amount it costs to pay them. The closer you get toward environments where it's essential that each person contribute things that are unique and novel, the less it makes sense to rely on these sorts of hiring tools.
In short, I would say that the above research neither supports nor disconfirms the advice of the original blog post. Of course you could make the case (and maybe tokenadult believes this?) that employers should make hiring decisions mainly by using a series of multiple choice tests that have been empirically validated, but that's an entirely different argument.
Link formatting suggestion: I found the sentences broken by line breaks for links difficult to read (and it discouraged me from starting, because it looked like jumbled nonsense when I skimmed). Instead of a separate paragraph, just putting links in parentheses seems to work well; or even just as-is, because the auto-underlining highlights them.
But it would be helpful to have a line break to separate your cover letter from the FAQ itself. Something like:
I'm sorry, but you've been rejected from my soon-to-be-founded startup company, because I only accept people who comment in 29 minutes or less. Taking a lot of time to comment could slow down meetings to a crawl, you see.
Also, unknown to you, I only accept people who write their comments in the form of a koan and wear red ties on interview day.
> a general intelligence test used in hiring that could have a "disparate impact" on applicants of some protected classes must "bear a demonstrable relationship to successful performance of the jobs for which it was used."
In practice, this standard has shifted all over the country in the last 10 years.
If a test shows group-X scoring lower on average than group-Y, that in itself is now treated as proof that the test is biased against them (in a racist way). There is no need to link the test's material to performance on the job; that part has become irrelevant.
> EXECUTIVE SUMMARY: If you are hiring for any kind of job in the United States, prefer a work-sample test as your hiring procedure.
Hence, if you go that direction, you should also consider the possible lawsuits that would follow if your test shows any kind of racial differentiation in scores (for example, due to the environmental upbringing or cultural values of the test-taker), even if your test is 100% neutral to race and 100% spot-on for job performance.
The linked article talks about a government agency (NYC Fire Department) who imposed a written exam for fire fighters; the judge ruled that the exam was discriminatory in results and was not shown (could not be shown) to be related to job performance. That's what made it illegal under Title VII of the Civil Rights Act.
You don't have to worry about this if you're a private California software company assigning work-sample tests that are "100% spot on for job performance." Job performance is not linked to race.
(If you think that job performance at a software company is linked to race, I recommend that you keep your opinion to yourself. Maybe you can use it to your private advantage! ... or maybe you have racist assumptions.)
> If you think that job performance at a software company is linked to race, I recommend that you keep your opinion to yourself. Maybe you can use it to your private advantage! ... or maybe you have racist assumptions.
I don't think that. And I shouldn't have to defend myself from being called a racist just because I've entered a discussion that has different races involved (and am white, well, as white as a born-Russian gets). You should also consider how much damage throwing that label around can do.
My point was that it's no longer necessary to link the testing material to job performance; it's enough simply to show that test scores differ between the races, and then to be politically correct.
This has been the trend we've seen in the last 10 years with these types of lawsuits.
There are parts of this country that not only have socio-economic differences between the races that impact the abilities of individuals, but in some cases have cultures that downright embrace ignorance. If you start testing people in those parts for any job that requires slightly above-average ability, you will get drastically different average scores across the races. This should not be used against the employer.
If the test correlates with job performance better than other options that lack disparate impact, it is allowed under the law. The FDNY was unable to prove that its test was correlated with job performance (because they never established that correlation; the test was a cutoff for being considered for the force in the first place), which was their undoing.
If you are a small company, no one is ever going to collect a statistically valid sample size that could demonstrate any bias, and a plaintiff can only discover that data if it exists. If applicants do a work-performance test on a computer and the results are not kept, and racial/demographic data is not kept about each applicant and their performance, how can this ever come up? These sorts of concerns really apply mostly to government jobs and large corporations.
I have often thought we should have a hash-tag means of labelling some comments or threads. For example the above FAQ would be #hiringbestpractise (or possibly #tokenadulthomerun ). Then this thread would be found if anyone searched HN for that tag, and indeed would rank top of all #hiringbestpractise tagged comments.
Then tokenadult could simply point people to the hashtag page, or tell them which tag to search for.
I, for one, would value seeing the top-rated comments from HN on any range of given subjects, even from several years back.