

A/B testing our friends - azarias
http://blog.meritful.com/post/33023288815/a-b-testing-our-friends

======
birken
I'd be slightly more careful about making the claim you are making (though in
your defense, nearly every person who makes a boastful post about their A/B
testing makes the same mistake).

For starters, you are picking an incredibly noisy metric, taking the maximum
observed conversion and the minimum observed conversion and then dividing
them. To better illustrate this, I wrote a little test script to simulate what
you did many times. It assumes you have 4 email variations, send 125 emails
each, and the average conversion rate is 5% across the buckets. Then it takes
the minimum converting bucket, the maximum converting bucket, and divides
them. Then does that 10k times and averages out what the gain would be by this
metric, which is ~2.7x (code here: <https://gist.github.com/3846278>). 2.7x is
a lot, but of course in this example it is pure noise: every single bucket
converts exactly the same over the long term.
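
For readers who can't open the gist, here is a minimal sketch of the kind of simulation described above. It is not the linked script. The function name, the seed, and the choice to skip trials where the worst bucket has zero conversions (the ratio is undefined there) are assumptions made for illustration, so the average it prints may differ somewhat from the ~2.7x quoted above.

```python
import random


def simulate_max_over_min(buckets=4, emails=125, rate=0.05, trials=10_000, seed=None):
    """Average max/min conversion ratio when every bucket has the same true rate."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        # Each email converts independently with the same probability in every bucket.
        conversions = [
            sum(rng.random() < rate for _ in range(emails))
            for _ in range(buckets)
        ]
        worst, best = min(conversions), max(conversions)
        if worst == 0:
            # Assumption: skip trials where the ratio is undefined (divide by zero).
            continue
        ratios.append(best / worst)
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    print(f"average max/min ratio: {simulate_max_over_min(seed=1):.2f}x")
```

Every bucket here has an identical true conversion rate, so any ratio above 1x comes from sampling noise alone.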

This metric also is particularly unhelpful because your goal is to have one
high converting email. The metric you have chosen will be artificially boosted
if you happen to have one horribly converting email. While that is potentially
interesting fodder in just how differently various emails can convert, having
one horrible email doesn't help you very much.

A better way for you to do this analysis is to dump your raw results into a
mathematically sound A/B test calculator (I'm a bit of a homer but I don't
think you can beat ABBA [<http://www.thumbtack.com/labs/abba/>]), then look at
the confidence intervals on the conversion rates of the various emails and only make claims
based on that. Like... I tried 4 versions of email copy and got my conversion
rate up to X% (+/- some hopefully small confidence interval)! One of the
emails was a real stinker and only converted at Y% (+/- confidence interval),
thankfully I A/B tested first and didn't end up getting stuck with that one!
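
As a rough illustration of the "+/- confidence interval" idea, the sketch below computes a Wilson score interval for each email's conversion rate. This is one standard interval for a proportion and is not necessarily what ABBA computes. The function name is hypothetical, and the counts are made up for illustration, not taken from the post.

```python
import math


def wilson_interval(conversions, sent, z=1.96):
    """Wilson score interval for a conversion rate (z=1.96 is roughly 95%)."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    p = conversions / sent
    denom = 1 + z * z / sent
    center = (p + z * z / (2 * sent)) / denom
    half = z * math.sqrt(p * (1 - p) / sent + z * z / (4 * sent * sent)) / denom
    return max(0.0, center - half), min(1.0, center + half)


# Made-up counts for illustration only; not the numbers from the post.
example_counts = {"email A": 10, "email B": 6, "email C": 5, "email D": 3}
for name, conversions in example_counts.items():
    low, high = wilson_interval(conversions, 125)
    print(f"{name}: {conversions / 125:.1%} (95% CI {low:.1%} to {high:.1%})")
```

With only 125 emails per bucket the intervals come out wide and tend to overlap, which is the reason to report them alongside the point estimates.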

~~~
azarias
You make an important point. We obviously have a lot of testing and refining
to do. Thank you for taking the time to investigate our claim :)

------
partymon
I think one thing to be careful about is not to read too much into response
from your friends... they are already biased. That said, the difference within
that set is interesting.

~~~
azarias
That makes sense...that was why we wanted to point out relative performance
rather than absolute numbers. It is also probably fair to point out that the
content and tone of our messaging differed quite a bit, which explains some of
the differences.

~~~
partymon
There's a wealth of knowledge about email marketing which doesn't come up
nearly as much on HN

------
fluxon
Cache, in case it 404s -
[http://webcache.googleusercontent.com/search?q=cache:blog.me...](http://webcache.googleusercontent.com/search?q=cache:blog.meritful.com/post/33023288815/a-b-testing-our-friends)

------
jackds
When do family/friends announcements turn into spam? I know many marketing
email providers are wary of you writing to your entire address book of people
who did not opt in.

~~~
azarias
We actually looked at using MailChimp for this, but we didn't want to go
outside of the TOS. So, we ended up merging the mail locally using Thunderbird
and sending it from our personal accounts. I am sure there is more to be said
about the line between the two, however.

------
b2rock
Are there any best practices for launch day email announcements? Good ways to
build a list, or take advantage of one you already have?

------
gojomo
From the graph labels, it appears the subject line which moved the most
relevant info to the front won, as opposed to leaving out the startup name or
pushing it to the end.

But, they don't mention the sample size, and they do mention that there were
other changes in the email content and call-to-action. Nor do they make any
mention of statistical significance.

So we can't tell what actually helped, only that perhaps (_if_ we assume they
did their significance testing correctly) some set of changes can make a big
difference.

~~~
azarias
Pasting in the emails was going to make the post very long, but to summarize,
we presented Meritful as a tool for students to build a positive web presence,
as a tool for teachers and mentors to run projects and give feedback to
students, as a tool to help with college admissions by creating an impressive
portfolio and interaction from respected folk, and another comparing Meritful
to existing products for teens.

The contacts we had were split roughly evenly across the variations.

