
Net Promoter Score Considered Harmful - arnklint
https://blog.usejournal.com/net-promoter-score-considered-harmful-and-what-ux-professionals-can-do-about-it-fe7a132f4430
======
Dave_TRS
>We Can’t Reduce User Experience To A Single Number

This is the crux of the issue - CEOs do need a simple metric summarizing each
of their business units in order to evaluate performance and prioritize
company resources. Metrics are only useful for decisions if
they are easy to understand, and consistently measured. Just as Earnings is
used to measure profitability, and Revenue is used to measure growth, NPS is
the best they've come up with so far for customer experience. Of course none
of these 3 metrics is perfect, but they still need to be measured and
complemented by more detailed reporting. To better understand Earnings you
look at changes in the component parts, revenue and cost. To better understand
NPS you look at the mix of scores (1s, 2s, 3s, etc.) and follow-up questions
like actual referrals, which differ by business (unlike NPS which can be
applied more broadly).

The author devotes most of the article to pointing out areas where follow-up
questions would give a richer understanding of customer experience, which of
course all NPS proponents would agree with.

What he could do a better job of is convincing us why executives
shouldn't even attempt to summarize their company's performance in customer
experience, just as they summarize lots of other complex activities to
report to shareholders. Too many companies focus only on hard metrics like
revenue and profits, which is why they find NPS a helpful way to steer the
company's focus back to the customer.

~~~
jmspool
Well, let's let the author try to convince you that NPS is a harmful, horrible
number to summarize a company's performance on.

He would tell you that NPS is only like earnings or revenues if we allowed
either to have 50% or more of their data filled with arbitrary numbers, not
audited data collected from state-licensed specialists who would lose their
job if it was discovered the data was manufactured out of whole cloth.

The author would also tell you that NPS is easily gamed and that there's no
check on whether it has been. He wrote extensively in the article about the
various techniques folks use to game the numbers. If this is a number
reported to shareholders, shareholders should insist (No, Demand!) that the
numbers be corroborated by a neutral third party that will accept liability
for any errors. (No surety insurer will guarantee such a liability, for the
risk of error or misrepresentation is way too high.)

As you stated, most use follow-up questions to get a richer understanding of
the customer. What the author would tell you is that it's clear the NPS
recommendation question taints those followup questions and diminishes their
validity and inherent value. If the true goal is to learn a richer
understanding of customer experience, there are many better ways to achieve
it.

In other words, the author believes if executives want a simple metric that is
better than NPS, a random number generator is the fastest and cheapest way to
achieve it. Why bother with customers at all, if all you're going to do is
squander your interaction with them on such a foolish metric.

— The author.

~~~
Domenic_S
> _For some reason, NPS thinks that a 6 should be equal to a 0. Nobody else
> thinks this. Remember, if you worked at a company like Intuit, all that hard
> work to get everyone to move from a 0 to a 6 would not be rewarded. Your
> executive would not get their bonus. It’s as if you didn’t do anything._

This seems perfectly reasonable to me. Outcomes matter -- not effort -- and
reaching 6 is not the outcome NPS wants.

~~~
jwatte
Separately, the distribution will never be that narrow in practice. Once the
highest rater reaches 7, NPS will start improving. The author even states
himself that the input has noise, so the "everyone's a 6" argument is a straw
man.

------
leroy_masochist
I work as a consultant to PE companies as a source of cash income while doing
other entrepreneurial stuff. Much of this work involves pre-deal commercial
diligence and/or post-deal portfolio company strategy review. In the context
of this type of work, which often involves formal customer / partner
interviews, I think NPS is actually a very useful metric.

For sure, NPS has its limitations. Broadly speaking, I think it's a bad metric
for something like asking random strangers how much they like their iPhone.
But if you're trying to figure out the strength of, for example, an ERP
software company's customer franchise, NPS is really useful, for two reasons:

1\. When you're talking to people who know a product well, and you're talking
to them in a professional context (e.g., a scheduled 15-minute interview that
is part of their work day), they tend to be thoughtful about the 0-10 rating.

2\. The follow-up question traditionally used in NPS surveys, "What could
[company] do to make that score a 10 instead of [number]?" almost never fails
to provide valuable feedback. I believe this is because of the way the
question is set up: first the respondent is asked to provide a broad indicator
of their customer satisfaction, and then, with that number identified, they're
asked for details as to _why_ they chose that number. I have no background in
psychology but I think it's pretty intuitive that this question-flow focuses
the mind and prompts people to provide insightful constructive
recommendations.

~~~
bunderbunder
To your 2nd point, there is some danger there with big-ticket products like
ERP software.

Asking "What can we do to make that score a 10" of existing customers is fine
as long as you've got it balanced with some other work to help you gauge the
thinking of the portions of the market that aren't already sending you a 6
figure annual check. But I bet many companies are lulled into a false sense of
security by numerous answers to that sort of question that just request minor
improvements to existing functionality. Probably frequently given by people
who answered 9 to "would you recommend?" and then migrated to some hip new
cloud offering a year later.

~~~
leroy_masochist
In terms of gaining/maintaining a holistic understanding of a given market and
the competitive opportunities within it, I fully agree with your point that
understanding the needs of one's existing customers is necessary, but not
remotely sufficient.

That broader context is outside the scope of the work I usually do, though --
I'm usually focusing much more specifically on understanding the strength of
existing customer franchises.

Certainly, two customers who give 9-out-of-10 scores might have very different
levels of satisfaction. Sometimes it's a 9 because "I love the product and the
customer service is great but on principle perfection is impossible and I
don't give 10s" and sometimes it's a 9 because "the product is pretty good and
does everything we need, but our main sales rep is really hard to get ahold
of". This is why, in practice, I find that more insight comes out of the "why
not 10" questions than the pure NPS number.

------
svantana
>Any normal statistician would just report on the mean of all the scores they
collected from respondents

What? No!!! Any self-respecting statistician would know that the mean of a
quantity where addition is not well-defined is meaningless.

~~~
philipodonnell
Wow, yeah, agree. It's not the only point in the article and some of the
other ones are fairer criticisms, but this line was a bit painful.

Use the median!!!

~~~
chewbacha
I winced as well! But at least he grabbed an actual mathematical concept used
to describe a set of data... as opposed to whatever NPS is using.
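
For what it's worth, here's how the three summaries behave on an invented set
of responses (the scores are made up; `statistics` is Python's stdlib):

```python
import statistics

# Invented 0-10 responses; illustrates how mean, median, and the NPS
# bucketing summarize the same data differently.
scores = [10, 9, 9, 7, 6, 6, 3, 0]

promoters = sum(1 for s in scores if s >= 9)   # 9s and 10s
detractors = sum(1 for s in scores if s <= 6)  # everything 0-6
nps = 100 * (promoters - detractors) / len(scores)

print(statistics.mean(scores))    # 6.25 - treats the ordinal scale as interval
print(statistics.median(scores))  # 6.5 - robust, but hides the mix
print(nps)                        # -12.5 - promoters minus detractors
```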

------
correlation
NPS is first and foremost a marketing tool for Bain & Co to sell expensive
consultants, and definitely not "the one question to ask".

Several marketing scholars have pointed out the psychometrically undesirable
properties of the metric, and there is conflicting evidence on, e.g., the
metric's predictive validity in relation to company success.

(All in all, that old HBR marketing article contains very little detail to
back up the claim that it is a good and valid metric.)

If you can go with behavioural outcomes for measuring success, e.g.
purchases, I think that will always be more powerful than what a user says in
a survey.

And if you can then causally link (e.g. through an experiment) something you
can influence (e.g. some dimension of "UX") to that outcome, you actually
have quite a good decision tool too.

NPS does not provide that.

------
bunderbunder
I regularly screw these scores up by honestly answering the question I was
asked.

> On a scale of 0-10, how likely are you to recommend Netflix to a friend or
> colleague?

0

> You answered 0 to the last question. Why wouldn't you recommend Netflix?

It's 2018, people. Everyone I know either already uses Netflix, used to use
Netflix, or doesn't own a computer. If I started going around making a
recommendation like that, they would think I'm a prat.

~~~
umanwizard
I think the methodology assumes that only a small fraction of people are
pedantic enough to do that.

~~~
bunderbunder
I gave an extreme example for comedic value.

The kind of situation I'd expect to see more often in real life is that NPS
scores are inflated for fun luxury goods, because they naturally inspire more
enthusiasm, and deflated for essentials, because very few people are even
capable of getting excited about socks.

Mostly I was meaning to hint that where I've seen things like NPS go off the
rails is when people assume they measure what the marketing pitch says they
measure. The reality is invariably more subtle.

~~~
walshemj
I had a similar call: just after a 10-day stay for a kidney transplant, they
rang up to ask how likely I would be to recommend the hospital to someone!

~~~
DanBC
The English Friends and Family Test is a large-scale implementation of this
kind of scoring system.

Data is available here:
[https://www.england.nhs.uk/fft/](https://www.england.nhs.uk/fft/)

------
giovannibajo1
My feeling about NPS has always been that the very wide rating scale
polarizes users towards the two extremes; most people aren’t analytical and
they don’t build up a mathematical way to objectively come up with a number
between 0 and 10. They would instinctively vote 10 if they’re happy, 0 if
they’re angry, and something in the middle if they’re not fully satisfied. The
way they map partial satisfaction to a number is unscientific, and this is why
NPS only rewards perfect scores (9 or 10).
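
To make the arithmetic concrete, here is a minimal sketch of the standard
bucketing (9-10 promoter, 7-8 passive, 0-6 detractor); the sample scores are
invented:

```python
# Standard NPS bucketing: 9-10 promoters, 7-8 passives, 0-6 detractors.
# NPS = %promoters - %detractors, so every score from 0 to 6 counts the same.

def nps(scores):
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)

# Made-up sample: a 6 and a 0 both land in the detractor bucket.
print(nps([10, 9, 8, 6, 0]))  # -> 0.0: two promoters cancel two detractors
```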

Similar findings were made by Google with the YouTube rating system, which was
switched from 1-to-5 stars to a simple upvote/downvote after realizing that
most users polarize towards the extremes:
[https://techcrunch.com/2009/09/22/youtube-comes-to-a-5-star-realization-its-ratings-are-useless/](https://techcrunch.com/2009/09/22/youtube-comes-to-a-5-star-realization-its-ratings-are-useless/)

~~~
smackay
Recent experience with NPS, both for internal use (are the employees happy?)
and with users, parallels this. Somewhat happy users appear to start with a
score of 10 and discount points based on various negatives. Unhappy users
start with 0 and only adjust upwards, albeit with more inertia.

To be fair, everything I've read about NPS says don't use it in isolation. The
article covers this with the "why?" question. You need to follow up with more
detailed questions to get any real benefit from it. But that reduces the
survey to a simple tool for flushing out your unhappy users. It gets more
complicated because, like unhappy families, everyone is unhappy in their own
way.

Some additional things that muddy the waters:

1\. Cultural fit - not giving a good score is seen as impolite. As a result
you have to be pretty damned unhappy before considering anything else. When
your users have reached that point you've probably lost them regardless of
what you do.

2\. Time of use - new users will be enthusiastic and give good scores because
new users are enthusiastic and give good scores. By the time they have used
the platform for long enough to make a valuable contribution they have
generally lost the will to tell you in detail why your product is great or
sucks.

3\. Sample size, sample size, sample size. The math here is not precise but
it's precise enough to tell you if your sample size is too small. The main
effect of too small a sample size is the variation in the score is so great it
is impossible to infer anything and you may as well abandon the approach lest
you end up chasing shadows trying to make something better.

4\. Internal use - simply don't do this - ever. There's a vast amount of
politicking and employee uneasiness associated with this. Nothing is ever
anonymous no matter what is said publicly.

So, in the absence of anything else, NPS is probably a reasonable measure, if
only to generate a conversation internally and to gauge whether your efforts
are moving the needle and in which direction.

------
notimetorelax
There's a lot to unpack in this article, but one thing caught my eye. In the
"The Ultimate Question 2.0" book the author claims that NPS needs to be
considered as a relative measure against competitors. So if we follow this
advice, United needs to aim to have the highest score across all airlines. To
give an example, an NPS of -20 is decent if competitors are below -30.

~~~
simonhamp
I like the idealism of this, but I highly doubt reality would follow through.
At least not until organisational transparency is proven to be a winner in
business degree courses.

------
dahart
> We Can’t Reduce User Experience To A Single Number

Growth is a single number, and NPS is measuring growth, not UX.

I'm no big lover of NPS, but this analysis is awful! He claims it's bad
because he doesn't understand it. That's not very scientific either.

I'm not sure NPS works well at all, but the idea behind it is obvious. It's a
growth metric. The goal of NPS is to tell you how many new customers an
existing customer will refer to you over the lifetime of their account.

This is just like population statistics. NPS is trying to measure your
customer birth rate by asking how many customers are (or intend to be)
pregnant. It's not an accident that there are only two thresholds, and it's
wrong to conclude that these thresholds indicate a problem with the method.

If a customer recommends less than one new customer during the entire time
they're with you, then you have a replacement rate less than 1, and you're
losing customers over time. If you have a replacement rate between 1 and 2,
and your customer lifetime is long (say, years) then you aren't growing fast.
If your replacement rate is 2 or higher, and your customer lifetime is short
(say, weeks) then you are growing more than 200% month over month virally,
without the need for marketing.
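
That replacement-rate arithmetic can be sketched as a toy model; this is my
own simplification (referrals spread evenly over the lifetime, constant
churn), not anything from the official NPS methodology:

```python
# Back-of-the-envelope model: each customer refers `referrals` new customers,
# spread evenly over a lifetime of `lifetime_months`, then churns.

def monthly_growth_rate(referrals: float, lifetime_months: float) -> float:
    """Net fractional customer growth per month: referrals in, churn out."""
    inflow = referrals / lifetime_months   # new customers per customer per month
    outflow = 1.0 / lifetime_months        # churn per customer per month
    return inflow - outflow

print(monthly_growth_rate(0.5, 12))   # negative: replacement rate below 1, shrinking
print(monthly_growth_rate(1.0, 12))   # 0.0: replacement rate of exactly 1, flat
print(monthly_growth_rate(2.0, 0.5))  # 2.0: ~200%/month with a two-week lifetime
```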

What the people who designed NPS did, I am sure (meaning I'm speculating, but
giving the strongest possible interpretation), is measure some responses and
compare them to the number of actual referrals, then draw the lines where the
referral rates cross from negative growth to neutral growth, and from neutral
growth to positive growth. That's what I would do. And it seems plausible that
people who give a score of 6 or less won't end up referring anyone, on
average.

Sadly, the article doesn't conclude with any real alternatives for measuring
growth. Since NPS is an indirect growth metric, the better answer may be to
simply measure your growth directly. That means understanding engagement and
activity, not just counting how many accounts exist, but other than that,
counting your active customers is a single number that will reliably tell you
growth, and can't be gamed by your customers -- it can only be gamed by
yourself and your team.

~~~
jmspool
> Growth is a single number, and NPS is measuring growth, not UX.

Connect the dots for me on how NPS measures growth. Where does it tie to
growth at all?

> NPS is trying to measure your customer birth rate by asking how many
> customers are (or intend to be) pregnant.

Horrible analogy, but ok. I'd say, if there's any equivalent, it's asking how
many people think it's likely they might ever get pregnant.

> What the people who designed NPS did, I am sure (meaning I'm speculating,
> but giving the strongest possible interpretation), is measure some responses
> and compare it to the number of actual referrals, then drew the lines where
> the referral rates cross from negative growth to neutral growth, and from
> neutral growth to positive growth.

They didn't do anything like that.

> And it seems plausible that people who give a score of 6 or less won't end
> up referring anyone, on average.

It does seem plausible. It isn't validated by any science, but it's certainly
plausible. (Like the earth is plausibly flat.)

> Since NPS is an indirect growth metric, the better answer may be to simply
> measure your growth directly.

Agreed!

~~~
dahart
>> What the people who designed NPS did...

> They didn't do anything like that.

Now I'm really confused by your statements. I just read the link to the
original source that you posted on hbr.org. What Reichheld described is
exactly what I said above, he correlated survey responses against actual
growth rates, and drew the lines between negative and positive growth rates.
Not only that, he asked the question multiple different ways, and found out
which question statistically landed the most accurate answers.

Why are you claiming they didn't do that? Are you saying the article is lying
about the data they used to come up with NPS?

------
simonhamp
If you’re going to ask what someone _will_ do, it’s best to keep it to a
simple binary decision and always remember that that’s only representative of
a single moment in time/the current customer service cycle for that
individual.

As this article identifies correctly, the more important question is the
harder-to-quantify “why?”.

Armed with this, the fluctuating and heavily biased score will have more
meaning and context. Using that data to improve the service/product for the
next cycle may do more to realistically reflect sentiment change than any
weirdly random maths.

Of course, this still ignores what someone actually does do. But it’s possible
to track certain cases of referral and actually to encourage trackable
referrals using the correct approach. This is a more solid measure (if
possible) than any hypothetical gauge of future behaviour - again as the
article points out.

------
bichiliad
Let me first say that the burden of proof here rests on Fred Reichheld (or
whoever else champions NPS now) to defend NPS from the claims in the cited
papers.

I totally agree that NPS feels like placing too much faith in something
magical. However, I don't feel totally convinced by this article; when people
calculate NPS, do they really ignore any other analysis on the raw input (i.e.
all 0's vs all 6's)? Yes, there are totally exceptional datasets that make the
NPS look like it hasn't moved; has that been a problem for people in the wild?

Also, is picking one ecommerce customer out of the data enough to say that
there's no correlation between NPS and future behavior?

I've seen NPS used as a way of keeping a pulse on a community; if it drops
sharply, something is clearly wrong in a way that normal monitoring can't
surface.

~~~
jmspool
> I've seen NPS used as a way of keeping a pulse on a community; if it drops
> sharply, something is clearly wrong in a way that normal monitoring can't
> surface.

If this is your goal (and it's a good goal), there are way better questions
than NPS to use here. I'd go with a simple "How did we do today?" question,
versus the convoluted NPS mechanism.

~~~
bichiliad
Yeah, probably. It's the sort of thing where someone's going to want to know
your NPS anyways, so if you're collecting that data you may as well break it
apart a little. And by no means do I think you shouldn't be doing other sorts
of user research.

------
mercwear
NPS is great when used and managed correctly. Unfortunately most companies
send out an NPS survey and ONLY look at the score. The score is a lagging
indicator of whether the company takes the time to actually read the comments
gathered from the NPS survey and act on them to improve the customer
experience.

NPS is not harmful, poor leadership and failure to correctly manage a feedback
program is.

~~~
mathattack
Exactly. Also, many execs make the wrong comparisons with the score
(comparing across industries, for example).

The real lessons are in the verbatims.

------
twiss
I think the NPS question could benefit from some more specific answers, such
as:

\- Yes, I would bring up how much I like [product]

\- Yes, if the subject of [product category] came up

\- Yes, if my friend had a problem that [product] could solve

\- No, because I don't care enough about [product category]

\- No, because I don't like [product] enough

\- No, because I dislike [product]

------
notimetorelax
A long article with a mix of good and poorly informed arguments that ends
with a sales pitch for a UX workshop.

~~~
hobs
The article ended and then there was an advertisement in italics; I didn't
read the article as an advertisement for their book so much as just a
comprehensive slamming of NPS.

Which is a terrible metric that pretty much nobody should use.

~~~
notimetorelax
I agree with that. I think it's an interesting tactic to increase the
visibility of your product:

1\. Create a strongly worded post that attracts attention.

2\. Attach your sales pitch to this post.

~~~
andrewingram
I don't know if this tactic applies in this case. Jared Spool is a _very_
well-known UX expert, he's also been ranting about NPS for ages. It feels more
like the other way round, he wrote the post and attached a sales pitch, rather
than (as you say) attaching a post to the sales pitch.

------
ianbicking
I kind of like NPS: it doesn't try to make sophisticated distinctions, and
distinguishes between enthusiasm and acceptable mediocrity. It's oriented
towards growth because of that emphasis on enthusiasm. The very low end of the
scale doesn't really matter, as many of the people who score the product very
low have already stopped using the product, so grouping 0 responses with 5
responses seems reasonable to me.

But the NPS number is more a number for executives than people working
directly on the product. NPS doesn't give UX people anywhere near enough
information to do their job. None of these top-line numbers have enough
information for that.

I found myself thinking about this:
[http://www.ianbicking.org/blog/2016/04/product-journal-data-up-and-data-down.html](http://www.ianbicking.org/blog/2016/04/product-journal-data-up-and-data-down.html)
– there is a real dysfunction that the author is seeing, and it is common.
Top-down process design means that executives design the work
process and give it to their managers, and managers design the process and
give it to their reports, and you end up with processes that aren't designed
to help individuals do their own jobs, instead everyone is supporting someone
else's job. Executives make broad decisions: is this product succeeding? If we
continue our current approach where will that take us? Are the teams
performing well? Someone who is doing UX shouldn't be worried about these
questions, they should be concerned about specific details, because those
specific details are what a UX designer can change.

------
jwatte
I believe the NPS mechanism of "6 is bad" intends to capture more psychology
than the statistical analysis in the article gives it credit for.

That being said, I agree that a three point scale would be better in most
cases.

The focus on the number proves the saying: when a metric turns into a goal, it
ceases to be useful. The goal should be "dollars earned." All the metrics
should just assist in improving this metric. Else you'll get the bad
statistical hacking described. Those hacks are not just for NPS; I think we've
all seen them from various companies.

Of course, dollars earned also ends up being gamed (see also: Enron) but at
least there are usually better safeguards against this in most organizations.
And increasing dollars earned is usually a good end state.

------
mrisoli
I've seen people who barely understood the NPS formula obsess over their NPS
score, as well as customers who wrote quizzical responses such as "I love your
service, it is the best, everything is perfect" and then graded it an 8.
Overall I find it's an interesting vanity metric.

~~~
JumpCrisscross
There are products I love but which I never promote to my friends. I provide
the company with revenues, but will never contribute to their growth. On the
NPS scale, that’s a solid 8.

------
tjpnz
I worked for a large Japanese company that settled on NPS as one of their main
KPIs. It was spearheaded by their American educated CEO and was implemented
across all their products. The data looked good even when the company was
hemorrhaging customers. Despite the flaws, I can't help but wonder whether
questions were ever asked about its applicability to Japanese audiences.

------
daanlo
NPS is a great tool if you want to understand your word-of-mouth growth. The
issues described in the article are issues of data interpretation and not of
the actual tool. E.g. "I gave a 0 because I don't know anyone to recommend
the product to" is a typical word-of-mouth problem to work on.

------
txmx2017
In customer service, a 10 is good, a 9 is good with room for improvement, and
everything below an 8 is bad. The author makes a big deal about the difference
between a 0 and a 6. But both mean the customer is dissatisfied. There's no real
difference.

~~~
fotbr
No real difference? Six might mean I'm willing to give you another chance.
Zero might mean I'm so disgusted I'm going to bad-mouth your company at every
opportunity, and do everything I can to see you fail.

For a big company (Forbes 500), or one with an effective monopoly (cable
companies, airlines), or one with big enough backers (some SV startups) it may
not matter. In those cases, the company doesn't need any single individual,
and the individual doesn't have the power to really hurt the company.

For your local "mom and pop" company in a small town with only a few
employees, the difference between a 6 and a 0 might be massive, perhaps to the
point of staying in business or not.

~~~
lbotos
I work in Support and have for almost 10 years now. In NPS theory, you want
to focus first on the 8s, as they are closest to a promoter (9), and then
work your way down. Usually the 8s have easier fixes than the 0s as well.

You can choose to go down and focus on the 0s too; they often help you spot
bigger pain points that could shape the product.

Again, the thing to remember with any number is that we optimize for what we
measure. NPS can be helpful, but you need to keep the NPS feedback machine
running smoothly or else you are going to spend your time building/running
that vs. actually doing your thing.

------
NelsonMinar
NPS: like Klout for Customer Satisfaction

~~~
danaseverson
Only it's not similar in any way.

------
tuna
this field was studied and explored in the 90s by University of Wu-Tang Clan
in their seminal paper C.R.E.A.M
[https://www.youtube.com/watch?v=PBwAxmrE194](https://www.youtube.com/watch?v=PBwAxmrE194)

------
tomxor
...Article considered harmful: Constant JS thread use, avoid.

------
tangue
"NPS thinks that a 6 should be equal to a 0." No: it considers that at 6
you're not a promoter. But your score will be 6. And the progress from 0 to 6
will be reflected in your score.

~~~
Lazare
> No: it considers that at 6 you're not a promoter. But your score will be 6.
> And the progress from 0 to 6 will be reflected in your score.

That's the thing though: The progress from 0 to 6 is _not_ reflected in any
score.
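
This is easy to verify with the standard formula (promoters are 9-10,
detractors 0-6); in this made-up sample, moving every detractor from 0 to 6
leaves the score untouched:

```python
# NPS ignores everything inside the 0-6 detractor bucket, so a detractor
# moving from 0 to 6 cannot move the score at all.

def nps(scores):
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)

before = [9, 9, 7, 0, 0, 0]   # three angry zeros
after = [9, 9, 7, 6, 6, 6]    # everyone moved up to a six
print(nps(before), nps(after))  # same (negative) score both times
```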

~~~
wpietri
That is a problem in the contrived examples here, but I'm not convinced it's a
problem in the real world. If I look at actual NPS scores of brands I'm
familiar with, they match what I hear about them from people. E.g., Tesla 96,
Apple 72, Comcast -3. (from [http://indexnps.com/](http://indexnps.com/) )

The theory of NPS is that what matters is what people say about you. If people
from 0-6 are all going to say negative things when asked, then lumping them in
the same bucket is reasonable. It may not be as good as a more subtle scale,
but it may be much better than thinking the numbers are linear.

It's possible that one could come up with a mapping that's even better, of
course. But NPS is simple enough that even executives understand it. A
marginally more-accurate number that nobody understands is probably worse,
because people will trust it less. The point of the NPS score is not
theoretical accuracy, it's motivating change.

~~~
viridian
The problem is that if I'm giving a company a 5 or a 6, I probably just sort
of tolerate the company in the absence of reasonable competitors, e.g.
McDonalds being the only quick food anywhere near where I work. If I'm giving
a company a zero it means I hate them and have a strong desire to see them go
out of business (e.g. Google), and will also help with any endeavor to speed
that along if it's easy for me. There's a massive difference between those
two.

~~~
wpietri
I get that, and maybe that's how it works for everybody. There's certainly a
massive difference in feeling; maybe that really does translate to a big
difference in an individual's behavior. But does that translate to much of a
difference in terms of word-of-mouth growth? There I'm not so sure.

Even if it did, though, it's not clear to me that there's much difference in
the utility of the NPS metric. Are companies with a lot of zeroes also
companies that are sincerely seeking to improve? Would a more complex scoring
system motivate more change? If so, does the benefit gained outweigh the
extent to which the added complexity harms NPS adoption elsewhere?

In practice, if some company had an unusually high number of zeros relative to
sixes _and_ were very serious about change _and_ the metric didn't shift much
when a bunch of people moved from zero to six, you can bet that someone would
explain this in a meeting and everybody would still be excited. So although I
get that this would be a problem if NPS were the only number used, I'm just
not persuaded that some sort of NPS++ metric would be any better in actual
use.

