

A/A Testing: How I increased conversions 300% by doing absolutely nothing - s3nnyy
http://kadavy.net/blog/posts/aa-testing/

======
Menge
With 56 tests examined along 2 dimensions, seeing ~6 false positives certainly
isn't surprising, even if you calculate to a 95% CI.
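
A quick back-of-the-envelope check (assuming the 56 tests x 2 dimensions
really behave like ~112 independent comparisons, which is itself a
simplification):

    # Expected false positives if 56 tests are each read along 2
    # dimensions, giving ~112 comparisons at a 95% confidence level.
    # Real comparisons are likely correlated, so this is only a rough bound.
    n_comparisons = 56 * 2
    alpha = 0.05  # false-positive rate implied by a 95% CI
    print(n_comparisons * alpha)  # -> 5.6, close to the ~6 observed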

But despite our focus on the null hypothesis, a false conclusion when two
variants are equally effective is actually just the cost break-even point:
picking either one costs you nothing. In a real A/B test, the larger a mistake
would be, and the more it would cost, the less likely you are to make it,
because larger true differences are easier to detect.
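
A minimal simulation of that intuition (the traffic numbers are hypothetical;
the point is only that the chance of crowning the genuinely worse variant
shrinks as the true gap, and therefore the cost of the mistake, grows):

    import random

    # Pick the "winner" of an A/B test by raw conversion counts and see
    # how often the genuinely worse variant wins, as the true lift grows.
    def misrank_probability(base_rate, lift, visitors=1000, trials=1000):
        mistakes = 0
        for _ in range(trials):
            conv_a = sum(random.random() < base_rate for _ in range(visitors))
            conv_b = sum(random.random() < base_rate + lift
                         for _ in range(visitors))
            if conv_a >= conv_b:  # the worse (or merely equal) variant "wins"
                mistakes += 1
        return mistakes / trials

    random.seed(0)
    for lift in (0.0, 0.005, 0.01, 0.02, 0.04):
        p = misrank_probability(0.05, lift)
        print(f"true lift {lift:.3f}: P(pick worse) ~ {p:.2f}")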

Poorly emulating statistical theory is the essence of many AI methods, which,
despite occasionally producing ridiculous results, are quite powerful.

If you instead simply measure the conversion rate of each campaign, you cannot
tell whether your content is trending downward, because outside factors will
affect user engagement, especially if you lack an adequate source of new
subscribers. Only the most engaged humans will pull the signal out of that
noise well enough to get better over years without an external system. (If I
were not going to A/B test, I would at least estimate conversion rates in
advance and then track my Brier scores over time.)
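
A sketch of that fallback (the numbers are hypothetical; the Brier score is
just the mean squared error between the conversion probability you predicted
and each binary outcome, so lower is better):

    # Score a pre-campaign conversion-rate estimate against what happened.
    def brier_score(forecast, outcomes):
        return sum((forecast - o) ** 2 for o in outcomes) / len(outcomes)

    # Hypothetical campaign: you predicted a 4% conversion rate and
    # 5 of 100 recipients actually converted.
    outcomes = [1] * 5 + [0] * 95
    print(f"{brier_score(0.04, outcomes):.4f}")
    # Tracking this across campaigns shows whether your sense of the
    # audience is improving, even without an A/B harness.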

I don't have any experience with email campaigns, but what I see with web ones
is that most organisations are fairly delusional when it comes to
understanding their actual audience, and as a group they are rarely open to
accepting corrections from people with less domain knowledge (but domain
knowledge is primarily a fancy form of bias).

For example, if your product carries risk, many of the customers who contact
you will be the most risk-averse ones looking for security, but 95% of your
customers will probably be risk tolerant. I've repeatedly corrected some via
an A/B test where the B version's conversion rate continued to hold long after
it became the only version, yet their staff will implement A-style changes
whenever given an opportunity to make changes without testing. I'd guess that
the customers they deal with give them a false impression, and our repeated
demonstrations that significantly more customers hold a different view are not
an adequate correction.

Without outsiders it is harder to get such a full correction of your biases,
but you can at least get a new employee's (or spouse's, or critical
customer's) idea tried out in a way where it is much more likely to
quantitatively show promise, in a way that demands acceptance rather than
being chalked up to noise by the human emotional system.

------
verelo
I think what this is really saying, or at least what it should be saying, is
that all tests should have a control.

https://en.wikipedia.org/wiki/Scientific_control

