
Making software engineering interviews predictive of job performance - geoffroberts
https://www.qualified.io/blog/posts/truly-predictive-software-engineering-interviews
======
tracker1
In the end, it's a crap shoot... Even their highest marker of validity is only
about 50%.

In the end, if you make it too hard in an employee market, you will have good
to great candidates drop off... if you make it too easy, you will wind up at
the bottom of the barrel. Market pay will also vary wildly and dramatically,
and perception is equally varied.

Where I work now, we have a small pre-interview code challenge... it's not
particularly difficult and in general should be resolvable in under an hour.
I've been on the other side of these too, and I've literally spent 3+ hours on
challenges that claimed to take an hour. I suggest those coming up with these
things actually implement their challenge on a clean computer (only tools, no
reference code) and time themselves. I now cut myself off at an hour and
submit what I have, stating as much.

As to structured interviews, it will depend. It bugs me both ways when I see
generic, low-level pattern questions in interviews, only because the practical
implications are usually very limited. On the interviewer side, it does help
to ask some of the more quiz-like questions, as they tend to give you a decent
baseline of what the person being interviewed understands.

In the end, personal questions along the lines of what a person (for
developers) works on outside of educational or work settings give a ton of
information on how they work, as well as what they know.

Culture, for as much of a B.S. thing that it is, is actually very important as
well. A functional team or set of teams needs a few different types of
personalities and worker types in order to do things well as a team. Being
able to get along together is also imperative.

In the end... it's a crap shoot, and the needs, the fill, and the fit all vary
a bit... trying to normalize doesn't help much. Also, business needs and
talent pools will vary greatly.

~~~
derefr
> In the end, if you make it too hard in an employee market, you will have
> good to great candidates drop off... if you make it too easy, you will wind
> up at the bottom of the barrel. Market pay will also vary wildly and
> dramatically, and perception is equally varied.

I mean, you're talking about a different thing than "job performance" here.

Job performance is a _satisficing_ measure: you only need to get someone who's
_good enough_ to perform the duties you need of them. (Yes, even in software
engineering.)

You're talking about _ranking_ measures. Does it matter if someone is "the
best" if they're still not good enough to do your job? Does it matter if
someone is "the bottom of the barrel" if they _are_ good enough to do your
job? I would say no, in both cases.

IMHO, the hiring bar is _way_ too high in most bigcorps; they care about
vanity metrics about having "the best", when 100% of the work assigned to
these hires could be easily done by people who are not, in fact, "the best."

The point of a job interview, in the abstract, is to filter out the candidates
who _aren't qualified to perform the job duties even after a few weeks of on-
the-job training_. This leaves you with a pool of people who are all
_qualified_. At that point, you should either just hire one of them at random,
or try to optimize for a second criterion, such as "is willing to take the
lowest compensation to do the job" (which usually translates to "has the
fewest _formal_ qualifications to do the job, while still actually being
qualified in practice.")

~~~
wpietri
> Job performance is a satisficing measure: you only need to get someone who's
> good enough to perform the duties you need of them. (Yes, even in software
> engineering.)

Is that really the case for software? I mean, I'm very skeptical of the "only
the best and brightest" nonsense, as I think it selects for the wrong people.
But I'm also skeptical of the "eh, good enough" heuristic.

Right now I'm helping a friend deal with a code base that was built by an
outside dev shop. It undeniably works; the business has been running on it for
a number of years. And honestly, like most software, it's not a complicated
app; mostly CRUD stuff. But overall it's a spaghetti snarl. Any given concept
might be in the code, or in a config file, or driven by SQL data, or even
driven by JSON blobs stuffed into the database, mutated by the main code base,
and passed along to the JS front end.

Now that I can see the code, I understand why the dev shop needed so many
people working on it, and why all features were hard to do. It's because when
you use a lot of mediocre programmers, they all create work for themselves and
one another. Not intentionally, but just because they aren't able to see how
the differing approaches and sloppy work slowly diminish productivity.

So although I agree with you that a lot of companies hire on vanity metrics,
I'm really skeptical that "whoever, as long as they barely qualify" is a
better approach.

~~~
OrangeMango
> Now that I can see the code, I understand why the dev shop needed so many
> people working on it, and why all features were hard to do. It's because
> when you use a lot of mediocre programmers, they all create work for
> themselves and one another. Not intentionally, but just because they aren't
> able to see how the differing approaches and sloppy work slowly diminish
> productivity.

To me, that sounds like a leadership problem at the outside dev shop. Properly
supervised, properly supported, and with reasonable deadlines, mediocre
developers can provide extraordinary results per dollar spent on probably
80-90% of dev tasks.

~~~
wpietri
I'm familiar with the theory, but I've never seen it actually happen. And
really, that just shifts the problem. If you need really amazing managers and
really amazing senior engineers to make it work, you now just have a different
hiring problem. As well as a robustness problem, in that choosing a senior
person badly means the whole team can go off a cliff.

Personally, I'd rather start with the amazing senior people, hire bright
juniors, and mentor them into being amazing developers as well. As well as
investing in tooling for non-developers. If a chunk of work is boring enough
that a mediocre programmer can do it without risk (e.g., creating reports),
it's perhaps boring enough that you can just enable the users (with, e.g.,
report builders).

------
new_realist
“General Mental Ability (GMA) tests (like the IQ test) are very predictive of
future job performance, largely because people with high intelligence can
learn the skills needed to be successful on the job more rapidly. However, due
to legal concerns they’re not recommended for companies hiring in the US.”

This is what a classic FAANG interview is: an IQ test which is disguised as a
relevant skills test for legal reasons. And they work quite well.

~~~
saagarjha
Do they?

~~~
new_realist
They’re the worst form of interview, except for all the other ones.

~~~
amznthrowaway5
Why not just give an actual IQ test?

~~~
bitwize
That opens you up to liability for race-based discrimination. The accepted
wisdom is that IQ tests are inherently racist and favor white evaluees over
black and brown ones.

~~~
bmn__
That's formulated the wrong way. An IQ test per se cannot be racist, because
it is not a human being with intentional behaviour. This means blame must fall
on the test designer or test conductor for introducing a racist bias. In
places where accurate results matter, the best available methods of
psychometrics are used:
[http://enwp.org/Progressive_Matrices](http://enwp.org/Progressive_Matrices)
These are free of bias: culture, language, and reading/writing ability do not
matter. If these are used, and assuming the test conductor did not screw up,
and white evaluees score better than black and brown ones, then it is an
objective measurement of reality, not racism. Thus I think the "accepted
wisdom" is wrong. If you know of any examples of racism through IQ tests,
indicating intention, I'd be glad to hear of them.

~~~
pjscott
I agree, but it's not obvious that _courts_ would agree, and lawsuits are
expensive even if you win. The legal liability worries are still there, alas.

------
kube-system
Some of the points here are good, but I think some of them are too simplistic.

This article makes a huge assumption: that performance is the _only_ factor
you're hiring for, and therefore unstructured interviews are worthless. That
would make sense if you're interviewing for a factory of emotionless robots.
But humans are social beings, and human performance as a team is more
complicated than a sum of individual productivity. You need to take into
account the effect that the hire would have on the _entire_ team. Just about
everyone has an anecdote about a coworker, sometime during their career, who
was super productive but completely demoralizing to the team. This is a very
valid reason to give unstructured interviews, and if you ask smart people who
give them, this is what they're looking for.

I totally agree with the utility of work-sample tests. There is no better way
to see someone's performance than simply giving them a small but relevant
chunk of work to do. It is very important to keep them short and flexible,
though.

If you make them too lengthy, you'll skew the results against people who have
better things to do with their time. What kind of person works for free
anyway? Certainly not the caliber of people I want to work with.

And if you make the test too specific to certain technologies, you skew the
test against smart engineers who are less familiar with that tech. If you want
a problem solver, give them a problem in plain words, and let their solution
be open-ended.

I don't agree with making tests particularly hard, though. Make it nuanced,
yes, but of _normal_ difficulty for a given assignment. You need the
_appropriate_ people to do the job. If you only hire top-tier algorithmic
experts for your basic CRUD app, then you're going to have a very expensive
CRUD app with a bored and demoralized team. If someone does sloppy work, you
can spot this on a normal exercise as much as you can on an unreasonably
difficult one.

~~~
itronitron
>> What kind of person works for free anyway?

I would expect everyone at some point has done something (worked) for free,
either on speculation, as a favor, or for bragging rights. Open source
wouldn't be a thing otherwise.

~~~
kube-system
Most work being done on significant open source projects is by people who are
being paid for that work.

------
Havoc
I don't think you can. Job performance is something that gets figured out on
the fly.

Unless you're working in a crazy-standardized operation, it's a moving target.

e.g., we've got a ton of managers & they all sorta do their own thing... they
gravitate towards their strong suits, essentially. You can't really test for
that in advance.

Some people excel at leading 30-man teams on stable jobs. Others excel at
being parachuted into technical shitshows and sorting them out. Very different
personal attributes.

It's a crap shoot, as tracker1 says, and will continue to be one.

~~~
thedance
This whole article is predicated on the idea that you can quantitatively
measure the performance of a software engineer. That seems completely
unfounded.

~~~
jhoffner
Why isn't it? Any job where enough information is known to decide that someone
should be fired or promoted is a job that can be quantitatively measured to
some degree.

------
chillacy
The cited data is from a paper The Validity and Utility of Selection Methods
in Personnel Psychology
[https://www.researchgate.net/publication/232564809_The_Valid...](https://www.researchgate.net/publication/232564809_The_Validity_and_Utility_of_Selection_Methods_in_Personnel_Psychology)
in 1998 as a meta-analysis of other findings, but they are not specific to
software development (one dataset is sourced from 32,000 employees in 515
diverse civilian jobs in the 1980s). Maybe software is like any other job
(sales, clerical, accounting), but maybe not.

~~~
geoffroberts
Why would it not be?

~~~
chillacy
I have some guesses, but even if you grant that it is, the authors were not
very transparent that they made this assumption; in fact, they did not point
out that the research was done on all jobs while they spoke only of software.
That is a bit misleading.

------
proverbialbunny
Data Science interviews almost always have a take home interview as the
technical challenge. The in person interviews tend to be cultural fit
interviews.

This is because DS work is larger than a bug fix or a new feature, and even
larger than writing a program from scratch. Even a small DS problem is a
multi-day problem. Most interviews allow one to take weeks to do a "quick" 2-3
day problem.

This process may be out of necessity, but I feel it reduces noise. You really
do have a better idea of knowing what you're getting.

Software Engineers may groan at the idea of a take-home project, but just
throwing this out here: What if a company let the candidate choose between a
take-home problem and an in-person 30-minute whiteboard-style interview? What
kind of results would come from that? Would the people who choose the
take-home interview work out better at the company, or would the whiteboard
interviewees end up working out better?

~~~
jl2718
There are a few ways to do a 2-3 day data science problem:

1. Random plug-and-chug / DataRobot

2. Copy someone else

3. Expert feature engineering

4. Overly-documented EDA

I don’t think any of this is relevant, but maybe it gives some indication that
they can do anything at all. However, most good DS will pass, because they’ve
been burned too many times by companies just looking for free help on a
problem.

If you’re a DS, don’t ever work on a company-relevant problem for free.

------
hn_throwaway_99
2 comments on this:

1. The authors of this paper are selling something. This is an advertisement.
Not saying it's wrong or untruthful because of that, but it is an
advertisement.

2. As another commenter said, the cited data is from a paper that doesn't
look at software engineers specifically. Given that software engineering has
one of the highest amounts of variability in productivity of any profession,
that should be taken into close consideration.

~~~
geoffroberts
The purpose of this paper is to present research that's completely independent
of any product, and show how the concepts apply to software engineering
specifically. What's your source in saying "software engineering has one of
the highest amounts of variability in productivity of any profession?" That
seems hugely unfounded.

~~~
hn_throwaway_99
> The purpose of this paper is to present research that's completely
> independent of any product, and show how the concepts apply to software
> engineering specifically.

Lol, the article actually ends with a link to the product you're selling:

> Ready to build a more predictive hiring process for software engineers?
> Schedule a 1-on-1 review of your hiring process or request a free trial of
> Qualified here.

------
chapium
The best interview process is not going to account for bad corporate policy,
terrible bosses, or other confounding factors.

~~~
proverbialbunny
So true!

It's sad it's so hard to tell if someone is going to be a terrible boss or if
there are systemic issues.

Some of the best interviews I've had have led to the worst bosses, and vice
versa. Though this is totally anecdotal, especially given I can only go off
the jobs I've accepted, which is a sample set of the best interviews I've had.

------
lugu
Just to share my experience.

1. We suck at hiring, so we need to train juniors on how to hire early.

2. Programs live in the heads of their creators. The code is just a draft
left for the computer to execute. Build a human culture where the aspiration
is discussed.

3. We don’t replace people, we hand over the responsibility. Be ready to work
differently.

4. I conduct interviews by asking the candidate to explain to me a piece of
work she is proud of. If I understand the what, how, and why, she usually
passes.

5. Remember: most issues are not technical but organizational. We can train
people, but we can’t change them.

6. Set high quality standards for your team. Make them proud of their work.
Onboarding will be easier.

So, it is less about how to hire the shining star and more about how to
welcome people and how to create a culture around your values.

My two cents.

------
TSiege
The best interview I've ever had was one where I was asked to design a real
user feature for the company and to review a sloppy PR with errors. For the
first, I came up with two options, weighed the best reasons for each, and said
which I might go with depending on the circumstances of the project. For the
second, I wrote the PR comments I would leave any developer, with suggestions
for improvements, concerns, and praise for good work. That was representative
of my actual job as a software engineer.

The norm is some sort of tricky pop-quiz-style questionnaire, which in my
opinion isn't representative of anything about me beyond knowing how to use a
for loop efficiently. Yet this is what I find these companies offer. They tout
them as a quick fix because they're easy to standardize and analyze, hard
enough to seem rigorous, and, last but not least, they waste the candidate's
time but not the company's. The metrics are moot. Not only have they been
proven ineffective by the big companies that spawned them, but I also find
them demoralizing and a turn-off.

It seems like the industry is heading towards a self-imposed SAT-style quiz
where the lucky winner of this startup war holds all the power and provides
little benefit to either side of the process.

------
agentultra
I think the important thing is to realize that you need to rely on realistic
factors and many different signals to get useful data.

I've been to plenty of interviews for "senior software engineers," where they
used the standard gauntlet of algorithms/problem solving/data structures
questions in rapid succession for eight hours.

I tend to deliver most of my projects on time and of a high quality. I value
things such as correctness and maintainability, and am conscious of other
factors to consider such as viability and availability of time and capital. I
spend a good amount of my time writing and seeking feedback from colleagues
and think it's important to share our ideas and talk about our work.

And in the actual practice of my work, I don't think I've ever been in a
time-constrained scenario where I had to come up with an implementation of an
algorithm while being judged and evaluated for how I think. I know enough
about my work to understand when I need to use BFS or DFS and when to use a
tree, a map, or a heap; but I also know that if I can't think of it off the
top of my head, I can walk over to my library and pull out a book, or look it
up online.

I'm also a proponent of formal methods. I've written specifications to help
solve race conditions and validate implementations of locking algorithms. I
can write basic proofs.

And I'm a great writer as a result. I spend a good amount of my time writing
and communicating my ideas, seeking feedback from colleagues, and in general
improving my understanding of problems.

How does finding a cycle in a sorted array or reversing a linked list signal
to you whether I would be a good fit for your company as a senior engineer?
Does your company also value clear writing and good communication skills? Do
you value engineers that can manage themselves and get work done autonomously?
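To be concrete about how small the whole genre of question is: the canonical
one, reversing a singly linked list, is a handful of lines (a sketch in
Python, with hypothetical `Node`/`reverse` names):

```python
# The classic whiteboard exercise: reverse a singly linked list in place.
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next

def reverse(head):
    prev = None
    while head:
        # Point the current node backwards, then step forward.
        head.next, prev, head = prev, head, head.next
    return prev

# Build 1 -> 2 -> 3, reverse it, and read the values back out.
head = Node(1, Node(2, Node(3)))
rev = reverse(head)
out = []
while rev:
    out.append(rev.value)
    rev = rev.next
print(out)  # → [3, 2, 1]
```

Being able to recall this on demand, under observation, tells you very little
about the senior-level skills above.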

Modern interview practices at most places I've been through don't know how to
get those signals. It's useful knowing that the people I work with are also
familiar with when to use a heap versus an array but I also value other things
at the senior level that most companies aren't really testing for.

I always prefer data over gut instinct but I feel like the process of hiring
people is not amenable to clean signals... it's a very messy process and I'm
not hopeful that there will ever be a silver bullet.

~~~
zygy
What does your interview process look like?

~~~
agentultra
It's not perfect either but we're pretty open about it:
[https://weeverapps.github.io/interviews/](https://weeverapps.github.io/interviews/)

Internally we use a rubric for the various job roles we hire for and we aim to
remove or counter biases at every step of the process. Our teams value
diversity and are pro-active about equality and inclusion.

A big part of the process is code review, something our engineers actually do
spend a lot of time doing, so the process is mostly focused around that.

------
m3kw9
To boil this interview thing down: you are trying to make a very far-out
prediction about someone based on very little data. The industry-standard way
to gather that data will probably let you see whether that person has done it
before, but not how well. Unless that person is sorta famous, and even then he
could be an asshole to the team and a disease.

~~~
itronitron
Furthermore, the unstated unknown that many interviewers have on their mind is
whether the candidate will be successful _in this specific company._

------
quanticle
_“In economic terms, the gains from increasing the validity of hiring methods
can amount over time to literally millions of dollars,” writes Schmidt
regarding the importance of valid assessment methods. “However, this can be
viewed from the opposite point of view: By using selection methods with low
validity, an organization can lose millions of dollars in reduced production.
In a competitive world, these organizations are unnecessarily creating a
competitive disadvantage for themselves. By adopting more valid hiring
procedures, they could turn this competitive disadvantage into a competitive
advantage.”_

The problem is that this misunderstands the purpose of the hiring process. The
purpose of the hiring process is to get people who are qualified, while
signalling that your company hires good people, and staying out of legal
trouble along the way.

I'm not sure I agree that an organization can gain "millions of dollars" by
fine-tuning its hiring processes to the nth degree. While I absolutely agree
that there are skill differences between programmers, I contend that the
economic differences attributable to skill between programmers are dwarfed by
the economic differences between any automation and no automation. Simply put,
having a terrible programmer automate your business processes today is far
better than sifting candidates to find the _perfect_ programmer to automate
your business process two years from now. Hiring is about satisficing, not
optimizing.
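For what it's worth, the "millions of dollars" claim in this line of research
typically comes from the standard selection-utility formula
(Brogden-Cronbach-Gleser). A rough sketch, with every number below made up
for illustration:

```python
# Selection utility: gain per year ≈ N * validity * SD_y * z_bar, where
# SD_y is the dollar value of one standard deviation of job performance
# and z_bar is the average standardized score of the people hired.
# All figures here are hypothetical, not from the cited paper.

validity_structured = 0.51    # assumed validity of the more valid method
validity_unstructured = 0.38  # assumed validity of the less valid method
sd_y = 40_000                 # assumed $ value of 1 SD of performance
z_bar = 1.0                   # assumed avg standardized score of hires
n_hires = 50                  # hires per year

def annual_gain(validity):
    """Expected extra output per year attributable to selection validity."""
    return n_hires * validity * sd_y * z_bar

diff = annual_gain(validity_structured) - annual_gain(validity_unstructured)
print(f"${diff:,.0f} per year from the more valid method")
```

Under these assumptions the difference is a few hundred thousand dollars a
year, which compounds into "millions" over time; whether that dwarfs the
automation-vs-no-automation difference is exactly the point being argued
above.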

~~~
geoffroberts
I understand your point—it's not about optimizing your process above all else
(at the expense of speed, for example). But you can and should optimize your
hiring process just like you would your sales process, onboarding process,
etc., and the outcome you're looking for is hires that are successful once
hired. It costs companies far more if they sacrifice in the name of speed and
make poor hires.

------
fileyfood
"Hard Interviews, Happy Workers". This note near the end assumes correlation
shows causation, which it does not. You could also argue that more workers
want to work in certain jobs, creating more competition and leading the
company to make interviews harder. The workers in those jobs may be happier,
but not because the interviews were harder.

------
mr_toad
The elephant in the room is that we can’t accurately measure job performance
by any objective metric either.

Many things would be simpler if we could: performance reviews, promotions,
incentives, salary equality, recruitment etc.

------
phibz
Interviews aren't just about talent and ability. Some fairly abstract concepts
are important too.

How the individual fits within the team is vital. You might have one of the
smartest people on the planet but if they can't effectively communicate or
understand and work within the cultural norms of the team you're going to have
a problem.

These norms and practices vary and shift between teams, countries, and even
within the same team over time.

For example, hiring a Linus Torvalds for an intermediate, client-facing,
political position is probably going to have poor results.

These are difficult things to test for and are as much a decider of job
performance as raw knowledge and technical experience.

------
jamespollack
so you're gonna pay us to do these "work samples" at market rate, right?

~~~
geoffroberts
Actually, yes! Erik Bernhardsson, the CTO at Better.com who is cited in the
article a few times, does exactly that for finalist candidates that they ask
to provide a significant work sample. I think that's 100% appropriate. But
that said, a work sample doesn't necessarily have to be something that takes
hours on end either.

~~~
jamespollack
ayy, that's awesome if they pay candidates for significant work samples. i
think there's some discussion to be had about what's significant. anything
more than 1.5-2 hours, in my opinion. an hour or two is how long an interview
might take in any profession. asking me to work unpaid for 4 or 8 hours (one
project: 3 days!) is excessive. maybe call it the "half-day" test. is it a
half-day of work? if so, pay. if not, then you can do that once during an
entire interview process.

------
saviorand
tl;dr: offer your applicants test exercises

~~~
geoffroberts
Yes, but not generic tests—in order to design a hiring process that's actually
predictive of future job performance, the "tests" have to be designed to be
directly relevant to the real world work that the engineer will be tasked
with.

------
LifeLiverTransp
I wondered: have any others been expected to reverse the interview direction?
As in, ask the company management, sales, and tech CEOs to answer your test
scenarios?

"You have developed a new product with broad market appeal, and sales has
already found you a huge first customer, which would secure you for years to
come. In return they expect you to tailor the product to their needs. How do
you react?"

It sounds mad, but it seems somehow to be expected by some. Like they expect
you to test your company for warning flags as thoroughly as they test you...

