

Why average retention rates can lead to 50% error in CLV - pospischil
http://blog.custora.com/2011/08/why-average-retention-rates-can-lead-to-50-error-in-clv/

======
patio11
And if you think that sucks, wait until you either a) use CLV to justify
marketing spend or b) use CLV to justify your valuation with an investor. Oof,
self-inflicted damage.

One particular solution is to stop reporting CLV as a single number. I mean,
if you happen to know that there are two disjoint sets of Good Customers and
Bad Customers then that is _very useful information_ to anyone who needs to
know the CLV number to do their job. "What can we afford to spend to acquire a
new customer? I have a new channel I want to try." "It really depends on
whether you're getting Good Customers or Bad Customers. We can only pay $50
for BCs, but for GCs we can go $200+."
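
To make the gap concrete, here is a minimal sketch comparing per-segment CLV with the single number you get from an average retention rate. The segment shares, retention rates, and margin are invented for illustration, and the model (constant retention per segment, payment at the start of each period, no discounting) is a simplifying assumption, not the article's method.

```python
def clv_constant_retention(margin_per_period, retention):
    """Expected lifetime margin for a customer who pays at the start of every
    period and survives each period with a fixed probability `retention`.
    This is the geometric series margin * retention**t for t = 0, 1, 2, ...
    with no discounting."""
    if not 0 <= retention < 1:
        raise ValueError("retention must be in [0, 1)")
    return margin_per_period / (1.0 - retention)


# Hypothetical segments: name -> (share of new customers, per-period retention).
segments = {"good": (0.5, 0.9), "bad": (0.5, 0.5)}
margin = 10.0  # hypothetical margin per customer per period

# CLV reported per segment, as suggested above.
per_segment = {
    name: clv_constant_retention(margin, retention)
    for name, (_share, retention) in segments.items()
}

# Correct blended CLV: average the segment CLVs, weighted by share.
blended_clv = sum(share * per_segment[name] for name, (share, _r) in segments.items())

# Shortcut: average the retention rates first, then compute one CLV.
avg_retention = sum(share * retention for share, retention in segments.values())
naive_clv = clv_constant_retention(margin, avg_retention)

print(per_segment)
print(blended_clv, naive_clv)
```

With these made-up inputs the blended figure is about $60 while the average-retention shortcut gives about $33, and neither number tells you that one segment is worth roughly $100 and the other $20.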

You then get into issues like "How do I tell the difference between a Bad
Customer and a Good Customer at an equivalent vintage where neither has
churned yet?" It may be the case that there are behaviors which you can use as
a proxy for which group someone is likely to fall in. Dharmesh Shah talks
often about a Customer Happiness Index that Hubspot uses, which is essentially
a regression that predicts churn rate based on measurable customer behavior.
It's all sorts of win to find that something like that works for your
business. (Hypothetical example: if Dropbox found that customers who used
photo sharing are the best possible Dropbox customers, it would make sense to
test things like a) biasing marketing to target photo sharers or b) biasing
product design to push photo sharing as a feature.)

~~~
pospischil
Absolutely - the next step is to calculate the CLV of different acquisition
channels. Maybe Organic search results in a favorable mix of good vs mediocre
vs bad customers, whereas affiliate marketing results in a poor mix.

You can use this information to help inform a new channel decision (a new paid
search channel will likely be more similar to another paid search channel than
it will be to an affiliate program).

Behavioral triggers (which Custora uses) get more complicated -- but maybe
we'll touch on that in a future post.

------
srgseg
A proposed solution, for the critique of any mathematicians here:

Always split each channel into ten 10-percentile bands, instead of just taking
an overall average. Then model total expected revenue per channel based on
these bands.

Perhaps as few as three 33-percentile or five 20-percentile bands would be
enough if the number of samples is low when first testing a channel.

~~~
eru
Perhaps, if you are still fresh in integration and measure theory, you can go
for a continuous solution, that doesn't require a choice of bands.

~~~
_delirium
You don't have access to the "real" continuous distribution, though, but only
a finite sample of points from it. Most non-parametric ways of modeling that
are going to require _some_ choice of smoothing parameter, either something
like a histogram bin width, or a kernel-regression bandwidth (in either case,
you can use the data to choose one, using cross-validation).

~~~
eru
How about working with the distribution function instead of densities? That
way you still have the sampling, but you don't have to decide on a bin size.
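
A small sketch of the distribution-function idea, assuming every customer in the sample has already churned (no censoring) and lifetimes are whole months. The sample values are hypothetical. The empirical survival function needs no bin width, and summing it recovers the mean lifetime.

```python
def survival_curve(lifetimes):
    """Empirical survival function S(t): the share of customers whose lifetime
    exceeds t periods, for t = 0, 1, ..., max(lifetimes)."""
    n = len(lifetimes)
    horizon = max(lifetimes)
    return [sum(1 for x in lifetimes if x > t) / n for t in range(horizon + 1)]


# Hypothetical completed lifetimes, in months.
lifetimes = [1, 1, 2, 3, 3, 6, 12, 12]

curve = survival_curve(lifetimes)

# For whole-number lifetimes, the sum of S(t) over t equals the sample mean.
mean_from_curve = sum(curve)
mean_direct = sum(lifetimes) / len(lifetimes)

print(curve)
print(mean_from_curve, mean_direct)
```

When some customers are still active the plain empirical curve is biased, and a censoring-aware estimator is needed instead.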

------
3pt14159
Also, for most SaaS apps, your first-month churn is typically _WAY_ higher
than your second, third and fourth. You need to simulate in order to find the
true LTV. Also be sure to account for the fact that ARPU typically goes up
over time as people upgrade their accounts.
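
As a rough illustration, the expected value can also be computed directly when churn and ARPU are given month by month; this deterministic sum stands in for a simulation. The 24-month horizon, the churn schedule, and the ARPU growth below are assumptions chosen only to show the shape of the calculation.

```python
def expected_ltv(monthly_churn, arpu_by_month):
    """Expected revenue per new customer when churn and ARPU vary by month.

    monthly_churn[i] is the probability that a customer who was active in
    month i cancels before month i + 1. arpu_by_month[i] is the revenue
    collected from each customer active in month i."""
    if len(monthly_churn) != len(arpu_by_month):
        raise ValueError("inputs must cover the same number of months")
    alive = 1.0  # probability the customer is still active this month
    total = 0.0
    for churn, arpu in zip(monthly_churn, arpu_by_month):
        total += alive * arpu
        alive *= 1.0 - churn
    return total


months = 24  # hypothetical horizon
# Hypothetical schedule: heavy churn in month one, lower churn afterwards.
churn = [0.20] + [0.05] * (months - 1)
# Hypothetical ARPU that starts at $30 and grows 1% per month.
arpu = [30.0 * 1.01 ** m for m in range(months)]

ltv_varying = expected_ltv(churn, arpu)

# The same customers modeled with one flat churn rate and flat ARPU.
flat_churn = [sum(churn) / months] * months
flat_arpu = [30.0] * months
ltv_flat = expected_ltv(flat_churn, flat_arpu)

print(ltv_varying, ltv_flat)
```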

~~~
aaronjg
Indeed! That is why it is so important to include heterogeneity of the
customer base in calculations. You tend to lose the flighty customers early so
retention rates increase the longer a customer has been using the service.
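
A quick sketch of that effect, assuming two hypothetical segments whose individual retention rates never change. The cohort-level retention rate still climbs period after period, purely because the mix of survivors shifts toward the loyal segment.

```python
def cohort_retention_rates(segments, periods):
    """segments is a list of (initial_share, retention) pairs, where each
    segment keeps a constant per-period retention rate. Returns the retention
    rate observed for the whole cohort in each period, i.e. survivors at the
    end of the period divided by customers active at its start."""
    alive = [share for share, _retention in segments]
    rates = []
    for _period in range(periods):
        survivors = [
            a * retention for a, (_share, retention) in zip(alive, segments)
        ]
        rates.append(sum(survivors) / sum(alive))
        alive = survivors
    return rates


# Hypothetical mix: half loyal (90% retention), half flighty (50% retention).
rates = cohort_retention_rates([(0.5, 0.9), (0.5, 0.5)], periods=6)
print([round(r, 3) for r in rates])
```

The first-period rate is the 70% average, and later periods drift upward toward the loyal segment's 90% without any individual customer becoming more loyal.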

------
jamesbkel
For anyone interested in some further reading:
[http://marketing.wharton.upenn.edu/documents/research/Schwei...](http://marketing.wharton.upenn.edu/documents/research/Schweidel_Bradlow_Fader_Portfolio_MgmtSci_11.pdf)

I included this technique in some work I did about a year ago, spoke a few
times with the authors. Granted, this is geared towards operations that offer
multiple services/varying levels of service as part of a "portfolio". Think
telecom and finance. It addresses CLV, but includes a regular reassessment of
value based on a customer's pattern of behavior.

tl;dr: Instead of being service/no service, it looks at a customer's
propensity to change service (upgrade or downgrade), which includes
downgrading to the point of termination.

------
callmeed
I've always struggled with CLV calculations–both figuring it out and deciding
if it's even valuable to know.

We have customers in the thousands and, while a handful leave every month,
many have been paying monthly for 2, 3 or more _years_. So, doesn't the CLV
for a business that has been around for 5+ years _change constantly_? What
good is that?

Also, we have a setup fee for some products. So, once we keep a customer for
30 days (i.e. they're no longer eligible for a refund), they are worth _at
least_ the amount of the setup fee, right?

BUT, if my CLV is $1,500 for a product with a setup fee, does that mean it's
okay for me to spend $1,200 to acquire 1 new customer? I have a hard time
believing that ...

~~~
pospischil
Sounds like you have some fantastic customers/a great product!

It's possible (and likely) that your CLV is changing over time as the mix of
customers you are getting is changing - the key is to be able to calculate CLV
as early as possible while still getting an accurate number.

If you are going to earn $1500 in profit from a customer, why wouldn't you be
willing to spend $1200 to acquire more? The issue you may run into is that, if
your $1500 customers found you organically, you can't necessarily expect
customers you acquire via different means to be worth the same.

There are some other variables at play as well: are you cash constrained? How
soon do you need your acquisition expense to be paid back?
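
The payback question can be sketched like this. The $1,200 acquisition cost comes from the question above; the setup fee, monthly margin, and retention rate are hypothetical values picked so that the expected lifetime value works out to roughly $1,500.

```python
def payback_month(cac, setup_fee, monthly_margin, monthly_retention,
                  max_months=120):
    """First month in which the expected cumulative margin from one acquired
    customer covers the acquisition cost `cac`. Returns None if that does not
    happen within `max_months`. Assumes the setup fee is collected up front
    and retention is constant from month to month."""
    cumulative = setup_fee
    alive = 1.0  # probability the customer is still paying this month
    for month in range(1, max_months + 1):
        cumulative += alive * monthly_margin
        if cumulative >= cac:
            return month
        alive *= monthly_retention
    return None


# Hypothetical inputs: $250 setup fee, $50/month margin, 96% monthly retention.
# Expected lifetime value is then about 250 + 50 / (1 - 0.96) = 1500.
months_to_payback = payback_month(
    cac=1200.0, setup_fee=250.0, monthly_margin=50.0, monthly_retention=0.96
)
print(months_to_payback)
```

Under these assumptions the customer is worth more than the $1,200 spent, but the spend takes on the order of three years to come back, which matters a great deal if you are cash constrained.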

------
jasonkester
This article stops short of giving the one piece that we need to take action
on this: The formula.

So, given a database full of customer records, all of which have a signup
date, some of which have a cancel date, what is the formula one would use to
determine the average expected "lifetime" of a customer?

I suspect that the author doesn't want to give us this formula, as it's part
of the secret sauce that the company behind the blog post sells. Still, maybe
if we ask nicely enough we can guilt it out of him?

~~~
carbocation
I'd bet it's different for every company, so learning his formula would not
really help. Learning the variables that he generates from his database and
uses as inputs to his modeling would, on the other hand, be helpful.

~~~
jasonkester
No, it's just statistics.

Sure, the reasons why people are leaving might be different, and their
distribution of lifetimes might be the way they are because of business
reasons. But the math to determine that distribution will be the same.

I actually went so far as to ask this question over at the Statistics
stackexchange site. Being mathematicians, they debated it briefly, then
concluded that it was possible to determine, which, to a mathematician, is the
same thing as a solution.

I'm still hoping somebody who's done the calculation will share the magic SQL
query.
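
For what it's worth, one standard tool for exactly this data shape (signup dates, with cancel dates only for some customers) is the Kaplan-Meier estimator from survival analysis. The sketch below is in Python rather than SQL, the function names are made up for illustration, and nothing here is claimed to be what Custora does.

```python
from datetime import date


def kaplan_meier(records, as_of):
    """records: iterable of (signup_date, cancel_date or None).
    Customers without a cancel date are treated as still active on `as_of`
    (censored). Returns [(days_since_signup, survival_probability), ...] at
    each tenure where at least one cancellation was observed."""
    obs = []
    for signup, cancel in records:
        end = cancel if cancel is not None else as_of
        obs.append(((end - signup).days, cancel is not None))
    obs.sort()

    curve = []
    survival = 1.0
    at_risk = len(obs)  # customers whose tenure reached the current day count
    i = 0
    while i < len(obs):
        t = obs[i][0]
        cancelled = leaving = 0
        while i < len(obs) and obs[i][0] == t:
            cancelled += obs[i][1]  # True counts as 1
            leaving += 1
            i += 1
        if cancelled:
            survival *= 1.0 - cancelled / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def restricted_mean_lifetime(curve, horizon):
    """Area under the step survival curve from day 0 to `horizon`."""
    area, prev_t, prev_s = 0.0, 0, 1.0
    for t, s in curve:
        if t > horizon:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (horizon - prev_t)


# Hypothetical records: two cancellations and one customer still active.
records = [
    (date(2011, 1, 1), date(2011, 1, 11)),
    (date(2011, 1, 1), None),
    (date(2011, 1, 1), date(2011, 1, 31)),
]
curve = kaplan_meier(records, as_of=date(2011, 1, 21))
print(curve)
print(restricted_mean_lifetime(curve, horizon=30))
```

One caveat: if your longest-tenured customers have not cancelled, the curve never reaches zero, so the area under it up to a horizon is only a lower bound on the true average lifetime.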

~~~
carbocation
Maybe we're talking past each other. I'm saying that the specific model for
one site won't work for another site. But the _procedure_ will work just fine,
because the procedure is just regression or any of its cousins.

------
amorphid
I understand why CLV is an interesting number. Maybe it is only really useful
to know that you can spend money to attract profitable customers. I used to think
my business is awesome because I largely get repeat and referral business.
When I started thinking about CLV I started to understand that I had no real
formula for acquiring new customers. It's important to show you can take
active steps to bring in new customers.

------
wccrawford
It looks to me that those calculations are only correct if you ignore future
customers. Only your current customers are used for the calculation.

CLV assumes that everything continues as it is today, including gain and loss
rates of customers. If you stop gaining new customers, then yes, it's going to
be WAY out.

~~~
alex_c
Future customers obviously affect revenue, but how do they affect CLV?

~~~
jaredmck
Future customers would be acquired via different channels than current
customers, and therefore they'd have a different CLV (depending on the channel
mix).

------
aaronblohowiak
I'd suggest building a simple Monte Carlo or Markov chain Monte Carlo
simulation of attrition _and upgrades_ to predict LTV. Their analysis shows
why the average is not a great way to calculate it, but I'd be interested in
reading about better alternatives.
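
A minimal sketch of such a simulation, using plain Monte Carlo over a small Markov chain of plan states (not MCMC). The plan prices, churn rates, and upgrade probability are all hypothetical, and the function name is invented for illustration.

```python
import random


def simulate_ltv(n_customers, churn_by_plan, upgrade_prob, price_by_plan,
                 horizon, seed=0):
    """Average simulated revenue per customer over `horizon` months.

    Every customer starts on plan 0. Each month the customer pays the price
    of the current plan, then cancels with probability churn_by_plan[plan].
    A customer who stays moves up one plan with probability `upgrade_prob`
    until reaching the top plan."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    top_plan = len(price_by_plan) - 1
    total = 0.0
    for _customer in range(n_customers):
        plan = 0
        for _month in range(horizon):
            total += price_by_plan[plan]
            if rng.random() < churn_by_plan[plan]:
                break  # customer cancelled
            if plan < top_plan and rng.random() < upgrade_prob:
                plan += 1
    return total / n_customers


# Hypothetical two-plan product: upgraded customers pay more and churn less.
ltv = simulate_ltv(
    n_customers=10_000,
    churn_by_plan=[0.08, 0.03],
    upgrade_prob=0.02,
    price_by_plan=[20.0, 50.0],
    horizon=60,
)
print(round(ltv, 2))
```

Setting `upgrade_prob` to zero and rerunning gives a quick read on how much of the simulated LTV comes from upgrades rather than from retention alone.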

------
Dylan16807
I was baffled by the site for a moment giving me a blank page. Apparently css
sets the opacity of the main area to 0 and javascript has to come in to
override it.

~~~
bengtan
Me too.

Text only Google cache so noscript users can read it:

[http://webcache.googleusercontent.com/search?q=cache:http://...](http://webcache.googleusercontent.com/search?q=cache:http://blog.custora.com/2011/08/why-average-retention-rates-can-lead-to-50-error-in-clv/&hl=en&sa=G&biw=1040&bih=1006&strip=1)

------
jey
The post title should really expand the acronym "CLV" since the title makes no
sense unless you already know what CLV is.

------
fleitz
If you're using CLV to determine the cap on the cost of acquisition then the
average is important unless you know what kind of customer you're buying in
advance.

~~~
pospischil
Well, yes: average CLV is important for acquisition, but using an average
retention rate to calculate CLV will lead to a grossly inaccurate CLV
calculation.

