
Against metrics: how measuring performance by numbers backfires - YeGoblynQueenne
https://aeon.co/ideas/against-metrics-how-measuring-performance-by-numbers-backfires
======
luiscleto
Reminds me of Goodhart's law[1].

We have known this for a long time, but it is hard to have an alternative
system which scales well with huge organizations. For startups and small
companies, I could see an informal system working pretty well, but as the
company grows to hundreds or thousands of employees, it becomes necessary to
standardize and have some kind of metrics used for reporting and evaluations.
This will inevitably shift the company's culture towards gaming those metrics.

Somewhat like grading systems in education. High grades don't necessarily mean
you will be capable of generating more value to society than average grades or
even low grades. And students often become good at improving their grades
without that actually adding much value. But there is a correlation. And we
don't have many better (non-experimental) alternatives that I'm aware of.

[1]
[https://en.wikipedia.org/wiki/Goodhart%27s_law](https://en.wikipedia.org/wiki/Goodhart%27s_law)

~~~
solatic
> This will inevitably shift the company's culture towards gaming those
> metrics.

Yes, but that's only the beginning of the story. People gaming metrics is a
type of security problem, in that "attackers" try to game the metrics while
"defenders" try to make them less game-able by improving the accuracy and
precision of how the metrics are gathered so that the final numbers continue
to tell a valuable story over time.

The issue isn't that metrics can be gamed; it's that organizations which pride
themselves on being data-driven rarely make the investment in hiring blue
teams and red teams to defend and attack the metrics. If you appreciate that
investing in cyberdefense is key to protecting your company from cybersecurity
threats, why can't you appreciate that investing in "metricsecurity" is key to
protecting your company from "metricsecurity" threats?

~~~
Angostura
> The issue isn't that metrics can be gamed

The issue is that many of the most important parts of many organisational
activities can't be easily measured through simple metrics at all.

~~~
solatic
I think this is a rubbish excuse made by people who don't understand the role
of such a "blue team". If the organization initially adopts metrics that are
counter-productive (e.g. measuring feature completion and not technical debt),
it is the role of the blue team to change the metrics such that the neglected
areas are properly accounted for in final performance metrics. No metrics
should be final; only iteratively tuned to achieve results that are more and
more indicative of the underlying performance.

It is difficult, but still possible, to measure technical debt and other
"hard" metrics. It is precisely the job of the blue team to deal with that.

~~~
Angostura
And how do you measure the person who takes time out of their day to help a
colleague in another department, who is having a tough time understanding an
issue, so the person takes 15 minutes out to help them, boosting morale and
overall company cohesion?

Or should they be penalised for wasting 15 minutes?

~~~
bluGill
Depends: do you want that to happen or not? You seem to be assuming this is
good, but in some organizations it would be a bad thing, something to
penalize. I'm not sure why you would do this in engineering, but it is
important to acknowledge that this isn't a universal good and so maybe your
company wants to discourage it for some reason.

Assuming you want people to help each other, you need to capture metrics on
it. A few years back I had a metric of helping n people in a different
department: I kept track of those interactions so I had something to report at
the end of the year.

~~~
TeMPOraL
Was that your personal metric? Metrics created for yourself are subject to
less gaming because when you start lying to yourself about those, you will
start to wonder why you keep those metrics at all.

If that was a company-issued, top-down metric, I hope it wasn't defined
literally as "helping n people in a different department", because that has
enough wiggle room to sail an aircraft carrier through. The difficulty of
creating a good metric here comes from the difficulty of defining what exactly
it means, in company context, to "help other people" - and also what it
explicitly doesn't mean.

~~~
bluGill
I had to report it to my boss. It was top down, but only a few interactions
were required, and it wasn't reported farther up the chain. Because I had to
report to my boss, he knew me well enough to judge if it was enough. It was
just enough of a metric to ensure people looked for something to bridge a
communication gap, without being hard enough that people tried to game it
much.

------
zentiggr
As I read through the 50-odd comments, I wound up with a kanban / 5 whys level
thought. All these points of discussion revolve around pluses and minuses of
metrics, but the next level is "Why establish metrics at all?"

At a deeper (very generalized) level, we've been infected by production line,
Deming style, efficiency focus.

If you are handling previously defined routines, great. Optimizing can help
you.

As soon as you are working with adding value, be it in code, education,
product development, metrics are _all_ premature optimization.

Promote the idea that you can't apply optimization techniques to creative
steps and encourage managers to get off the numerical crutches and make value
judgments. Takes a better manager but then that's the point anyway, right?

~~~
Jill_the_Pill
"we've been infected by production line, Deming style, efficiency focus"

Could you say more about this? I'm trying to form an opinion on this
Deming-ism. My impression is that the actual Mr. Deming gave different advice:
"Eliminate slogans, exhortations, and targets . . . Eliminate quotas, numbers,
numerical goals."

~~~
fsloth
I presume OP intended to write Taylorian?

~~~
kome
[https://en.wikipedia.org/wiki/Taylorism](https://en.wikipedia.org/wiki/Taylorism)

------
dr_dshiv
In education, there is a practice known as "backwards design" described in the
book "Understanding by Design". Here, goals are set, behaviours that are
evidence of achieving the goal are set (these are assessments), and then
instructional activities are designed that "move the needle" on the
assessments. But the critical 4th step in this iterative process is continuous
alignment between the goals, assessments and instructional activities. This
alignment process is intended to prevent the negative outcomes of teaching to
the test.

From that perspective, any metric oriented organisation might want to have an
explicit alignment process that ensures that the metrics continue to capture
the desired outcome.

~~~
shikoba
I'm a developer. I dare you to provide a metric that can measure my outcome.

There is no metric that can measure my outcome.

~~~
wins32767
Let me ask you this: can you tell which of your coworkers are good developers
and which are bad ones?

~~~
skybrian
Not unless it's an extreme case. I found writing performance reviews
excruciating because I don't pay that much attention to what other people do
and hate to judge people.

------
jacques_chester
It's time again for me to recommend _Measuring and Managing Performance in
Organizations_ by Austin[0].

Don't let the underwhelming title fool you: it's largely a book on why it
can't be done. Austin takes classical principal-agent models and extends them to
add a third participant: the client. He then shows that difficult-to-observe
work at first improves and then worsens under any metrics regime.

No single metric can be found that will give the desired performance. The
basket of metrics that _can_ successfully create the desired outcome is one
where the principal has total visibility into the agent's work -- which
contradicts the original problem of difficult-to-observe work (ie, anything
requiring skill).

[0]
[http://www.dorsethouse.com/books/mmpo.html](http://www.dorsethouse.com/books/mmpo.html)

~~~
Loughla
That does look interesting, thank you for the recommendation.

------
edw
Has it occurred to anyone else that the HN karma system is a performance
measurement technique that undermines the goals of HN? I've stopped
commenting, largely because for any comment I might make, I can imagine the
cynical karma-harvesting comment that someone will make in response, the
comment that would never have been made if there were no karma system here.
And when I do comment, I waste a lot of time trying to harden my comments to
such opportunism.

~~~
vbuwivbiu
I've largely stopped commenting due to karma also. It's not just that wording
the same sentiment differently can result in either up or downvotes, it's that
I don't want to have to care about getting points on the internet

~~~
diminoten
If HN were to revamp how it handled comment sorting, what features would get
you to comment again?

~~~
edw
A thought experiment: What if HN changed nothing except that it never
displayed point counts for posts, comments, or users? I think it would tamp
down gadfly behavior. It may also, however, do the same to behavior that we
want to encourage. There are no easy answers.

------
rubidium
A good manager will use metrics as information to help in assessment of work,
not as the sole measurement. Very important for any manager to coach is the
real organizational concern behind the number.

If you’ve got more than n=5 things, at a certain point you _need_ metrics to
make sense of how things are going. Are we getting our product to customers
when they want it (OTD)? Being able to say, yes 90% of the time is very
helpful compared to saying “who knows, we don’t do metrics here.”

------
melling
“But the most dramatic negative effect of metric fixation is its propensity to
incentivise gaming: that is, encouraging professionals to maximise the metrics
in ways that are at odds with the larger purpose of the organisation”

~~~
navigatesol
> _“But the most dramatic negative effect of metric fixation is its propensity
> to incentivise gaming_

"Gaming" the workplace is happening with or without metric fixation. At least
with metric fixation it's front and centre when it occurs.

~~~
js8
Your post reminds me of the famous AI koan:
[https://news.ycombinator.com/item?id=10970937](https://news.ycombinator.com/item?id=10970937)

Maybe there is some truth in it. But I also feel that you might get a better
result if you let humans figure out what needs to be done to achieve a
goal instead of telling them through a metric.

~~~
philipodonnell
The feature that distinguishes a well designed metric is exactly that.
Specific enough to ensure alignment with broader strategic objectives, general
enough not to dictate exactly how.

~~~
js8
I think the point of the article is summed up in the sentence:

"The key components of metric fixation are the belief that it is possible –
and desirable – to replace professional judgment (acquired through personal
experience and talent) with numerical indicators of comparative performance
based upon standardised data (metrics)"

Basically, it is a problem of trust. You're trying to replace trust in people
with a metric which, in theory, lets you avoid having to trust them.

Personally, I think it is a foolish goal. You are effectively replacing the
trust in people (which, admittedly, is vague) with trust in the metric (whose
connection to reality is vague). The vagueness has a reason - reality is
complicated.

Why not just tell people what the "broader strategic objective" is, instead of
trying to come up with a metric that is an exact (and so necessarily wrong)
description of it?

~~~
philipodonnell
Trust is the wrong word; it's more about alignment. I do trust that employees
can exercise their professional judgement. I don't expect that every employee
can perfectly align themselves behind a strategic objective without guidance.

If my broader strategic objective is to cut costs by 15% to allow more
competitive pricing, the last thing I want is for every individual to define
what that means to them and hope the math works out in the end.

~~~
js8
"I do trust that employees can exercise their professional judgement. I don't
expect that every employee can perfectly align themselves behind a strategic
objective without guidance."

That's strange to me, because the former surely seems to be much more
difficult than the latter.

------
fooblat
I think this article conflates Management by Metrics with an intense focus on
short-term results. Management by Metrics is just as compatible with a
long-term focus or goal.

Now on the issue of people gaming metrics, that is a real concern and I
typically address it with "counter metrics" intended to protect against
gaming. It is not perfect, but it works well enough when you take the time to
choose your metrics (and counter metrics) carefully.

So instead of "improve A by 20%" we get "improve A by 20% without impacting B"
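
To make that concrete, here is a minimal sketch of a goal check with a counter metric attached. The names (`MetricChange`, `goal_met`), the 20% target and the 2% tolerance are all made up for illustration, and it assumes higher is better for both metrics.

```python
from dataclasses import dataclass


@dataclass
class MetricChange:
    """Before/after readings for one metric (higher is better)."""
    before: float
    after: float

    @property
    def relative_change(self) -> float:
        return (self.after - self.before) / self.before


def goal_met(primary: MetricChange, counter: MetricChange,
             target_gain: float = 0.20, tolerance: float = 0.02) -> bool:
    """True only if the primary metric hit its target AND the counter
    metric did not degrade by more than the tolerance."""
    improved = primary.relative_change >= target_gain
    unharmed = counter.relative_change >= -tolerance
    return improved and unharmed


# A improved 25% and B held steady: the goal counts as met.
print(goal_met(MetricChange(100, 125), MetricChange(50, 50)))  # True
# A improved 25% but B dropped 20%: the gain is rejected.
print(goal_met(MetricChange(100, 125), MetricChange(50, 40)))  # False
```

The tolerance is the judgment call: set it to zero and normal noise in B will fail honest improvements to A.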

~~~
jacques_chester
> _Now on the issue of people gaming metrics, that is a real concern and I
> typically address it with "counter metrics" intended to protect against
> gaming._

From the article:

> _In an attempt to staunch the flow of faulty metrics through gaming,
> cheating and goal diversion, organisations often institute a cascade of
> rules, even as complying with them further slows down the institution’s
> functioning and diminishes its efficiency._

~~~
fooblat
This reads to me like a culture problem. A good engineering culture should
help a lot to insulate against people who just want to game the system and
aren't really invested in the success of the project.

Of course, how you get there is anything but simple.

~~~
XorNot
The problem is: what is success of the product?

~~~
renholder
> _The problem is: what is success of the product?_

Agreed. If it's measured in a metric, like bugs for every check-in or number
of check-ins or numbers of lines of code, then reducing it down to metrics
doesn't do _anything_ to achieve the goal of the success of the product.

I'd - cautiously - argue it actually favours success of the individual _over_
success of the product.

------
navigatesol
> _The intelligence analysts who ultimately located Osama bin Laden worked on
> the problem for years. If measured at any point, the productivity of those
> analysts would have been zero. Month after month, their failure rate was 100
> per cent, until they achieved success._

I guess if the metric was "caught Osama Bin Laden or not", then yeah, I agree.

But who on earth would measure the analyst's productivity just so? Nothing but
a strawman.

> _The source of the trouble is that when people are judged by performance
> metrics they are incentivised to do what the metrics measure, and what the
> metrics measure will be some established goal. But that impedes innovation,
> which means doing something not yet established, indeed that hasn’t even
> been tried out._

Why? If a (admittedly weak) metric like "# of paying customers" is used as a
goal, how does that "impede innovation"? How are people being stopped from
coming up with creative ways to get more customers? Why do we assume the ways
in which they improve the metric will be bad, and not innovative?

~~~
jimktrains2
> I guess if the metric was "caught Osama Bin Laden or not", then yeah, I
> agree.

> But who on earth would measure the analyst's productivity just so? Nothing
> but a strawman.

Is that much different than measuring LoC, tickets closed, or features
implemented? Some features just take time -- sometimes just the research alone
to implement something could take a few weeks where literally nothing
measurable happens. That's the point: completion metrics aren't useful for
larger or open-ended projects. There was no Gantt chart, no schedule for
finding Osama bin Laden; similarly you sometimes cannot plan out a large
software project from the very start.

------
Symmetry
Is there any reason to think that people are worse at gaming qualitative
evaluations than they are at metrics, in general? Things like making friends
with one's boss, appearing to work hard, and so forth.

~~~
js8
The people who are gaming are probably not. However, the people who are not
trying to game the system might decide to do it if there is an objective goal.

But perhaps you're right. In capitalism, the most common purpose why companies
exist is to game the system (i.e. make profit). I don't see why in that case,
employees shouldn't try to game the people who game them (i.e. capitalists).

However, in companies that are collectively owned, this becomes less clear.

~~~
SkyBelow
>However, in companies that are collectively owned, this becomes less clear.

Unless it has no management there is still a management class with
disproportionate decision making power and thus the antagonistic relationship
necessary for there to be a clear case for gaming continues to exist. Humans
are really bad at removing hierarchies, to the extent that I would assume any
place without hierarchies has them in a harder to notice format.

------
Mikeb85
I think metrics can be very useful but as with many things, you need to know
how to interpret the data.

I'm in the restaurant industry, so my #1 metric is profit (over the long run
anyhow). Now, the next metric is sales, since profit is a % of sales. So to
get those sales, we're looking at how many new guests are coming through the
door, repeat guests, average guest check, etc... After that, it's margins.
Sales mixes play into that (relating to COGS), and labour. Now, to determine
what we're getting out of our labour, more metrics. How much each server is
selling, what products they're selling, how many guests they can serve in a
night, etc... For cooks, it's about productivity and consistency.

Anyhow, there's a lot of shit restaurants and workers out there (due to the low
barrier to entry education-wise), and lots of people misinterpret the data,
leading to a feedback loop of shit managers and employees. Often restaurants
will focus too much on one of those metrics, leading to places that just gouge
guests, or others that entertain guests who are fishing for free shit.

Now, as for metrics in the programming world which probably don't matter, I
always hear about lines of code written. Obviously it's easy to game and hard
to discern, as code can be too terse or too verbose. Programmers are also
usually so far removed from the sales part of the business that there's no
objective sales metric to use either. And that's where you need good managers.
I'm sure there's a good set of metrics that give some idea of performance, but
they need to be interpreted by someone.

And all of this reminds me of stats/economics (what I did in university),
where you're bombarded with data and need to interpret it. Like GDP per capita
can indicate general well being, but then you adjust it to PPP to get a better
metric of quality of life, and add inequality calculations to understand how
it affects different segments of the population, and you can go even deeper
with specific stats underlining quality of life (amount of disposable income,
amount spent on housing, amount spent on entertainment, etc...).

Anyhow, tldr here is that metrics matter, but interpreting them is a skill and
makes all the difference.

------
alexandercrohde
All valid points. Of course though, you can't rail against metrics as a whole.
We all use them every day and always will.

We don't personally verify a financial advisor is a genius, we look at their
annual rate of return.

We trust that if unit tests fail, code isn't good.

We trust the SATs, which are a better and more scalable and fair measure of
aptitude than any group of humans I could imagine.

The piece rails against metrics, which often backfire, but doesn't even
mention the alternative (trust those in power make any choice they desire with
no justification) which also is often abused.

------
agentultra
I don't have an attribution for this quote, so if someone knows it please
share...

    
    
        If you want an opinion let's go with mine.
        Otherwise let's look at the data.
    

It's true that many quantitative metrics can incentivize people in the wrong
way. There's a funny story about a city trying to get rid of rats that
incentivizes people to hunt rats and turn in the tails... but the people start
farming the rats in order to get the reward instead while the project leaders
pat each others' backs.

However as an engineering leader you should not discount the value of
qualitative metrics or metrics altogether. I try to only make a decision "from
the gut" when there is no other choice and a decision must be made. Data is key.
How do you know how well your code review culture is thriving if you do not
see, across the entire team, metrics reporting the number of open pull
requests, how long they are staying open on average, how many comments they're
getting, etc? How do you know if one of your engineers is struggling if you're
not watching what gets checked in, what code is being rewritten frequently,
etc? How do you gain leverage for your team with management if there are no
facts, no metrics, to back up your arguments?
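
As a rough sketch of the kind of team-level review numbers meant here (open pull requests, how long they have been open, comments per request): the records and field names below are hypothetical, and in practice the data would come from your code host's API or an export.

```python
from datetime import datetime
from statistics import mean

# Hypothetical pull-request records, invented for illustration.
pull_requests = [
    {"opened": datetime(2019, 3, 1), "closed": datetime(2019, 3, 3), "comments": 4},
    {"opened": datetime(2019, 3, 2), "closed": None, "comments": 0},
    {"opened": datetime(2019, 3, 4), "closed": datetime(2019, 3, 5), "comments": 7},
]


def review_summary(prs, now):
    """Summarise open count, average age of open PRs in days, and comments per PR."""
    open_prs = [pr for pr in prs if pr["closed"] is None]
    ages = [(now - pr["opened"]).days for pr in open_prs]
    return {
        "open_count": len(open_prs),
        "avg_open_age_days": mean(ages) if ages else 0.0,
        "avg_comments": mean(pr["comments"] for pr in prs) if prs else 0.0,
    }


summary = review_summary(pull_requests, now=datetime(2019, 3, 10))
print(summary)
```

Note that these are aggregates across the whole team, which fits using them as leverage rather than as a per-person scorecard.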

Just a reminder that you _should_ use metrics. Maybe avoid using them as
reward mechanisms and use them as leverage instead.

I don't work for GitPrime but you should definitely check out what they're up
to.

[https://try.gitprime.com/data-driven-engineering-metrics/](https://try.gitprime.com/data-driven-engineering-metrics/)

~~~
hopler
"and if you want data, let's go with mine". -- how to lie with statistics.

~~~
agentultra
Well if you want to play that game then how do we know the person who's
relying on their instincts and intuitions isn't lying either?

Who do we believe?

I didn't suggest using metrics to make anyone accountable but using them as
leverage to empower your team.

What would be the point of gaming those metrics? Nobody would profit from lies
and if you're checking the provenance of your data, in the case of Gitprime
your own repository, then identifying a culprit would be rather easy.

------
deanalevitt
> _Contrary to commonsense belief, attempts to measure productivity through
> performance metrics discourage initiative, innovation and risk-taking._

Is this really contrary to commonsense belief? I think most business leaders
are very aware of this, however, the alternative is chaos. That's why certain
specialist teams are given leeway.

~~~
acdha
I think awareness is _very_ bimodal. I've sat in rooms with people who really
thought they could transform a business with 5% YOY improvements.

(That's not always the wrong view, either — a key part of this is recognizing
what kind of situation you're in and whether you really have the trust and
resources to bank on a revolutionary change)

------
maxprimer
The author in the article focuses only on one-half of the metrics within a
given process, the lagging results. There is another whole dimension to
performance metrics which is the causal dimension, your leading metrics or
leading indicators.

You can only effectively use a lagging indicator–such as time to restore–once
you understand the activities that make up the world of restoring service.
Does the product have an SOP document? Is there monitoring in place to alert
technicians quickly? What is the training level of your employees? Are your
employees empowered and engaged? All of these things matter far more than
lagging outputs.

Sure, we can disparage lagging indicators all day, but we can't throw them out
just because they can be gamed. You have to push deeper and that requires
time, process knowledge, and building a healthy working environment.

------
avrohom770
Finding useful performance metrics for evaluation is difficult and assumes
somebody knows what is best for the dept/company/division, that the evaluators
are well trained, and which metrics show the correct way. There are useful
metrics if you are playing Golf, baseball, poker, bridge, chess, but even then
individual performance will vary year to year and even the stars have slumps.
The Normal Distribution shows 95% of employees are within ±2 Std Dev of the
mean with only 2.5% excelling and 2.5% needing help in a given year and in the
next year everyone will have a different ranking. Performance metrics are
mostly a waste of valuable time for both the employee and the management.
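
For reference, the coverage of a normal distribution can be checked with the Python standard library (`statistics.NormalDist`, Python 3.8+). This only sketches the arithmetic; whether employee performance is actually normally distributed is an assumption, not something the code shows.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k: float) -> float:
    """Fraction of a normal population within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within ±{k} std dev: {share_within(k):.1%}")
# within ±1 std dev: 68.3%
# within ±2 std dev: 95.4%
# within ±3 std dev: 99.7%
```

So roughly 2.3% fall in each tail beyond two standard deviations, and only about 0.13% in each tail beyond three.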

------
jl2718
Compared to the other sciences, it seems to me that management science, as an
entire discipline, is a nearly complete failure. This is a field where startup
experiments by complete newbies lead the conversation against a network of
business schools and long-standing organizations. There seems to be no
consensus whatsoever on the best way to manage people. It’s not even clear, to
me at least, that modern management practices at many large corporations are
provably better than some kind of naive baseline like a pure democracy or pure
dictatorship. Expert texts are nothing more than pontification and hand-picked
anecdotes. Does anybody really know what works?

~~~
gabriel34
Even worse, in the same source basic concepts have multiple definitions,
partially overlapping, partially contradictory.

This is exactly why I believe C-level officers are grossly overpaid. Previous
success does not guarantee future success, especially in different industries,
hence they don't know some magic formula to make their salary * 1000 in profits
for the shareholders any more than a mid-level manager does.

------
KC8ZKF
The author was interviewed on the Econtalk podcast last year.

[http://www.econtalk.org/jerry-muller-on-the-tyranny-of-metri...](http://www.econtalk.org/jerry-muller-on-the-tyranny-of-metrics/)

------
mothsonasloth
It's happening all over my company, with dashboards for all kinds of useless
metrics. The worst part is gaming unit test coverage to get over the 80% mark.

I keep trying to say I would rather have 40% well written tests than 80%
crappy tests.

:(
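
A small sketch of what gaming coverage tends to look like. `apply_discount` is a made-up function under test; the first test runs it without asserting anything, so the lines count as covered while nothing is verified.

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test."""
    if percent < 0 or percent > 100:
        raise ValueError("percent must be between 0 and 100")
    return price * (1 - percent / 100)


def test_for_coverage_only():
    # Executes the happy path, so those lines show up as "covered",
    # but asserts nothing: a wrong formula would still pass.
    apply_discount(200.0, 25.0)


def test_that_checks_behaviour():
    # Checks the result and the error path, so a regression fails the test.
    assert apply_discount(200.0, 25.0) == 150.0
    try:
        apply_discount(200.0, 150.0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for percent > 100")


test_for_coverage_only()
test_that_checks_behaviour()
```

A line-coverage number cannot tell these two tests apart, which is why the 80% mark is so easy to hit with the first kind.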

------
wins32767
Management via metrics is a really useful shortcut to generating alignment,
but it is a shortcut. If you tell people that they need to move a number there
isn't a lot of ambiguity and you can just keep beating the same drum over and
over. Trying to align people on the whole business context is much harder and
takes a lot more time, though it is obviously much more valuable.

------
hartator
> to replace professional judgment (acquired through personal experience and
> talent)

The thing is there is no magical way to measure this. How do you define personal
experience and talent? From the number of years spent in the industry? It's
still back to another metric.

Metrics do matter. Common sense does also matter. Making decisions in a
chaotic world is indeed difficult.

------
gabriel34
This has been a subject of administration theories for the last century:
[https://en.wikipedia.org/wiki/Management_by_objectives#Argum...](https://en.wikipedia.org/wiki/Management_by_objectives#Arguments_against)

------
jamessun
"Show me the incentive and I'll show you the outcome" -Charlie Munger

------
teahat
What this article misses out is that the metrics will be present and used
whether they are articulated or not. By making them explicit they can be
improved, validated, debated, and removed. Standards are consistent. When
they're implicit, you get none of those benefits and instead promote an
insider culture of unspoken biases that get no scrutiny at all.

The choice isn't between metrics and human intuition, it's between explicit
and implicit metrics.

Of course there are bad metrics; to use an example from the article -
measuring the output of analysts finding OBL on a binary metric is simply a
bad metric. But if we start from there, we can improve it. How many leads do
they have at the current time? What stage is each lead at? What intelligence
has it generated to this point?

------
pwncake
The Hustle wrote a good article about this a little while ago:
[https://thehustle.co/Goodharts-Law](https://thehustle.co/Goodharts-Law)

------
_Codemonkeyism
Required reading

"Measuring and Managing Performance in Organizations"

Has a lot of information about how people react and adapt to setting KPIs, aka
"gaming the system".

------
peterwwillis
> Economists [..] report that in recent years the only increase in total-
> factor productivity in the US economy has been in the information
> technology-producing industries.

And guess how those industries did that? _Metrics._

Metrics are a simple tool. They are almost worthless by themselves and can
even be a detriment. But like any tool, if you _pick the right ones and use
them right,_ it's much better than not having the tool. There are _tons_ of
ways business processes can improve by analyzing metrics, it's not just an
employee productivity stick.

------
okaleniuk
I used to work once for a company where the developers were rewarded for
writing more lines of code than the average. Once.

------
polskibus
Reminds me of a great book, about managing the expectations instead of
managing the companies for real, and how the system is rigged to select for
short-term gains, which in most cases backfires.

[https://www.amazon.com/Fixing-Game-Bubbles-Crashes-Capitalis...](https://www.amazon.com/Fixing-Game-Bubbles-Crashes-Capitalism/dp/1422171647)

------
ratling
You get what you measure.

