The hidden costs of A/B testing (stickermule.com)
57 points by julee04 on April 9, 2016 | 21 comments



It is, in fact, fairly normal to get a very low number of tests that actually move the needle. Most ideas you have suck and will not improve your conversion rate! This fact will not change if you stop measuring.

The article is right that some ideas aren't worth testing. For example, if your checkout page is broken, just fix it.

One mistaken idea a lot of people have is that half your A/B tests will win and you'll get instant magic results. This myth comes from multiple places. Vendors and CRO agencies push case studies about wins, not about the process (which includes lots of losses). Until last year all A/B testing tools (including the one I work for) were nothing but false positive engines [1], due to supporting multiple goals without applying a Bonferroni correction, using non-sequential tests, and similar issues. Even if you just ran a bunch of A/A tests you'd get lots of wins!

[1] This is still true to some extent. The situation is now better - VWO, Optimizely and A/B Tasty all have fairly solid stats. There are a lot more than these 3 vendors, however.
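
To make the A/A point concrete, here's a rough simulation (not based on any particular tool; the traffic, base rate, and goal count are made up) of how checking several goals per test at p < 0.05 inflates the "win" rate on pure A/A data, and how a Bonferroni correction pulls it back down:

    # Rough simulation of the "false positive engine" effect: pure A/A tests
    # (no real difference) evaluated against several goals. Traffic, base rate
    # and goal count are made up for illustration.
    import math
    import random

    def two_sided_p(conv_a, n_a, conv_b, n_b):
        # Two-proportion z-test, two-sided, normal approximation.
        pooled = (conv_a + conv_b) / (n_a + n_b)
        se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
        if se == 0:
            return 1.0
        z = abs(conv_a / n_a - conv_b / n_b) / se
        return 2 * (1 - 0.5 * (1 + math.erf(z / math.sqrt(2))))

    random.seed(42)
    n_tests, n_goals, visitors, base_rate = 500, 5, 2000, 0.05
    naive_wins = corrected_wins = 0
    for _ in range(n_tests):
        p_values = []
        for _ in range(n_goals):
            a = sum(random.random() < base_rate for _ in range(visitors))
            b = sum(random.random() < base_rate for _ in range(visitors))
            p_values.append(two_sided_p(a, visitors, b, visitors))
        naive_wins += any(p < 0.05 for p in p_values)                # any goal "wins"
        corrected_wins += any(p < 0.05 / n_goals for p in p_values)  # Bonferroni

    print(f"A/A 'wins', naive per-goal test: {naive_wins / n_tests:.1%}")      # typically ~20%
    print(f"A/A 'wins', with Bonferroni:     {corrected_wins / n_tests:.1%}")  # typically ~5%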

(Disclaimer: I work for an A/B testing vendor.)



Good feedback, agree completely :)


A/B testing has become somewhat of a cargo cult, with tools like Optimizely making it insanely easy to run tests. IMO, simple copy tweaks or button color changes are usually a waste of time to test.

I'm a big fan of running A/B tests on major product changes - which is basically just standard business benchmarking. Measuring the impact and understanding the results of a change are important whether you run a web app or a flower shop.

PPC is the one place for me where smaller A/B tests are absolutely essential. In most other scenarios there are too many confounding variables to really understand much from small tweaks in a product.


100% agreed. I've worked for companies where A/B testing is so all-consuming that nobody sees that they're shuffling deck chairs on the Titanic.

Like ANY other tool, A/B tests have their place, but the incremental gains you're likely to produce from a different color button pale in comparison to what else you might be able to do with that mental energy.

Also agreed that paid traffic is a completely different beast.


Agree completely. :)


>Some ideas are obviously good

Are they? Of course everyone thinks their ideas are "obviously good", but testing often proves otherwise. That's the whole point of it.


This comes down to definitions. What I'm taking the article to mean is the sort of mini/micro A/B tests that many people do (taglines, button colours, etc.).

Using your example of "obviously good", let's say something seems "obviously good", so you launch it, and after a week you see that it converts worse or better. That's an A/B test, where A is before the change and B is after. You don't need software for that, you don't need to do on-the-fly A/B testing, you can just launch.

Given a site has existed for a while, it has a pretty well-established A for most things you'd want to test. Doing A/B tests on small elements may indeed not be worth the extra costs in time, etc.


There are a lot of factors that can contribute to the second week converting better or worse, e.g. public holidays, paydays, sporting events. That's why A/B tests run at the same time, so you can compare the same population.
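
As a toy illustration (all numbers made up), here's how a seasonal drift alone can make a useless change look like a winner in a before/after comparison, while a concurrent split over the same period shows roughly nothing:

    # Toy numbers: the underlying conversion rate drifts from 4.0% to 4.4%
    # between two weeks (holiday, payday, whatever) and the change being
    # "tested" does nothing at all.
    import random

    random.seed(7)
    visitors = 20000

    def conversions(n, rate):
        return sum(random.random() < rate for _ in range(n))

    # Before/after: the change goes live in week 2, which happens to convert
    # better for unrelated reasons, so the useless change looks like a win.
    week1 = conversions(visitors, 0.040)
    week2 = conversions(visitors, 0.044)
    print(f"before/after 'lift': {(week2 - week1) / week1:+.1%}")

    # Concurrent 50/50 split during week 2: both arms share the same
    # environment, so the useless change shows roughly zero lift.
    a = conversions(visitors // 2, 0.044)
    b = conversions(visitors // 2, 0.044)
    print(f"concurrent A/B 'lift': {(b - a) / a:+.1%}")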


Most managers decide things without tests. Doing this well is easier said than done, but it's the reality of why some companies outperform others.

Exceptional growth doesn't happen because you're great at a/b testing. It happens when your team is good at decision making, period.


I’ve run almost exactly the same number of experiments as the author, and experienced the same frustrations until recently.

It sounds as though the author needs to combine the "How we improve conversion without A/B testing" section with user tests and data analysis to build stronger hypotheses. Since we've put more effort into the research phase, our success rate has improved dramatically.

The real value of A/B testing is validating meaningful hypotheses that help you learn what matters to your customers, not unrelated individual improvements. By only observing the overall trend, you miss out on this.

I wrote about this recently here: https://medium.com/@nickboyce/5-steps-to-a-better-a-b-testin...


Disagree. The main reason we abandoned a/b testing is that it's a major resource drain. Getting better at a/b testing would be another resource drain. Time is finite & how you use it determines where you end up.

The point of the article isn't to say a/b testing doesn't work or shouldn't be done. It's simply to say many companies are misusing their resources by following the standard thinking on a/b testing.

We could spend another month figuring out how to get better at a/b testing or we could spend our time on another activity that produces a higher return on time spent. It turns out that's what we did and the return has been much better.


I couldn't agree more. Simply spending the time to understand why you are testing certain elements is valuable on its own.

The author doesn't mention what kind of tests were performed, but when I read about the "futility" of A/B testing, it's usually due to a lack of up-front preparation and discussion of the test's true objectives. So the classic "red button vs. green button" might get you a result in the testing software, but it doesn't necessarily translate to more sales/leads or whatever the ultimate goal is.


The post wasn't about the "futility" of a/b testing. It's about the costs of a/b testing that are rarely considered. There's plenty of "pro" a/b testing advice in the world that leads startups to mistakenly waste resources on it.

A solid a/b testing process doesn't just happen without time & effort. You need someone smart analyzing user behavior and forming hypotheses as well as a talented design team to develop concepts that test them.

We simply acknowledged the costs of doing a/b testing well and decided our resources are better spent elsewhere.


Interesting article and a very contrasting opinion. It's hard for me to agree with most of it, though I might simply be too closed-minded.

A few notes:

>I should admit that we still A/B test some ideas. Rather than adhering to a strict A/B testing process, we let our team A/B test ideas they're curious about.

The idea is that you don't test everything in the first place, only test things where your thinking is "My hypothesis is that changing or adding feature X will improve certain metrics" -- and in those cases, you a/b test. Else you aren't really proving that you're right or wrong. If it doesn't matter if you're right or wrong, why implement feature X at all? If it's a qualitative improvement, it doesn't make sense to a/b test it.

>3) Performance
>Related to lost conversions, A/B testing tools make your site slower and this also reduces your conversion rate--even for the control. The additional conversions you lose because of this performance hit are another, often ignored, cost of A/B testing.

You should probably be using a different tool if it makes your site slower.

>5) Speed
>A/B testing slows down your organization's decision making. Some ideas are obviously good and adhering to a strict A/B testing process reduces your time to go live. Time is finite and the number of improvements you implement per year has a major impact on your growth trajectory.

It's impressive that growth is simply a function of time within this company. This makes sense only if their average production decision has a positive impact on growth/conversion rates over time.

>You can monitor your long term conversion trend rather easily via Google Analytics to gauge if you're making good decisions.

Yes, but that would be very unactionable. If growth is going up, you pat yourself on the back and say "I'm clever", and if it isn't, you say "I suck". But since you don't know which decisions, or what the real reason is for conversion going up or down, you don't know the fundamental metrics that drive your growth. You basically say "I don't care why things are going well", or even lie to yourself and say "Things are going well because I make good decisions."

Growth is not just about growing, but about knowing why you are growing and using that knowledge to find new ways to grow. If you know the fundamentals about what drives your growth, you also know which things you can or cannot change without affecting growth.


> You should probably be using a different tool if it makes your site slower.

Any recommendations? I see a lot of tools out there that work by basically rewriting the page on the client side, which will tend to give you suboptimal performance. And if your app's already a JS-heavy SPA thing, your over-optimistic byte budget is probably already running a deficit.



Thanks for commenting on the post. :)

To take another contrasting opinion: who cares if you know exactly why you're growing? We know in a general sense why we're growing: we constantly improve the experience of our service and make customers happier.

It's quite hard to maintain aggressive year-over-year growth targets. If you're pulling it off, it really doesn't matter whether you figured out why via a/b testing or not.

Obviously, we'd prefer stronger YOY growth without knowing the exact causes than weaker YOY growth because we wasted resources on a/b testing to get precise answers to why we're succeeding.

Beyond that, I wasn't referring to trending growth via GA to see if you're making good decisions. You can trend conversion rate by channel to see if you're getting better over time. Obviously, if conversion is going up and to the right for most channels you're making good decisions.


Can't really agree with much of this article, and I have the sense it was written somewhat out of frustration and lack of experience.

The '6 hidden costs' can all be addressed, and some are not even hidden, like time: running tests takes time and resources like anything else. The other factors, like speed/performance and confusion, can be solved very easily.

Some of my takeaways for testing:

- Don't focus on small changes like buttons, copy, etc. These are mostly useless, yes.

- Make radical changes, e.g. on stickermule you could try taking the whole flow from making a sticker to checkout onto one page, or starting with configuration first and then selecting sizes, etc.

- To filter out false positives, compare the same sample of data (e.g. organic traffic is good) over the same period range before a winning test went live and after.
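
A minimal sketch of that pre/post check, assuming a hypothetical export of daily sessions and conversions by channel (the field names and dates are made up):

    # Sketch of the pre/post check: same segment (organic), equal-length
    # windows before and after the winning variant went live. The row format
    # and dates are hypothetical; you'd export something like this from your
    # analytics tool.
    from datetime import date, timedelta

    def conversion_rate(rows, start, end, segment="organic"):
        # rows: iterable of dicts like
        #   {"date": date, "channel": str, "sessions": int, "conversions": int}
        sessions = conversions = 0
        for r in rows:
            if r["channel"] == segment and start <= r["date"] < end:
                sessions += r["sessions"]
                conversions += r["conversions"]
        return conversions / sessions if sessions else 0.0

    launch = date(2016, 3, 1)      # day the winning variant went live
    window = timedelta(days=28)    # equal-length windows on both sides

    def pre_post_lift(rows):
        before = conversion_rate(rows, launch - window, launch)
        after = conversion_rate(rows, launch, launch + window)
        lift = (after - before) / before if before else float("nan")
        return before, after, lift

If the equal-window lift for organic traffic is roughly zero, the "win" was more likely noise or seasonality than a real improvement.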

Personally I have seen some great results with the above tips in mind: from B2B, making visitors perform small actions first or showing a demo vs. presenting them with a form immediately (4x more leads); to e-commerce sites adding a direct checkout vs. the normal checkout path (lower avg. order value but a much higher conversion rate that largely makes up for it); etc.

You could argue these are part of product (which a conversion department should be part of, not marketing), but without doing the testing you can only guess at the outcome. And sure, many tests are fixing usability (e.g. I've seen a 250% increase in registrations from fixing telephone number prefix and date of birth field formatting) and UI issues, but that's part of it.


It's unfortunate to dismiss another viewpoint as frustrated or inexperienced while ignoring the actual points made.

The biggest point being #1: Resources are finite. Time spent on A/B testing cannot be spent elsewhere. How you use your time determines your growth trajectory. A/B testing is a growth tactic, but it's not always the best use of your resources.

There is absolutely no way to solve this problem. You'll always have a trade-off to make when it comes to how you spend your resources and time. You shouldn't neglect that cost when deciding if a/b testing is necessary for your organization.

Btw, keep in mind, I admit we still do some a/b testing. It's just not a key component of our growth strategy.


I agree it might not always be the best use of resources, this depends on a wide array of factors within your business.

What I mean is that I don't see it as a hidden cost in the true sense of the word, since you know it takes resources and time; it's more a question of priorities and business value.

As for the other points, they are so easy to solve:

>>Lost conversions

A trade-off you need to make beforehand and be aware of, keeping the end goal in mind, which is to improve your main conversion metrics.

>>Performance

Most decent conversion platforms use fallbacks, and the way you set up experiments also makes a big difference.

>>Confusion

Letting other departments know what you're doing and what's running; this might be harder in big corps, though.

>>Speed

This goes for everything in a company, I think (new feature/product releases, quick fix vs. proper solution, etc.). I personally don't mind spending x amount of time to improve conversion rate by y%, so it comes down to setting good KPIs and testing what really is important to your business.

>>Focus

Can be done with a proper conversion team that has all these roles covered (a bit hard to set up, but possible), or by making the conversion department a client within the company. Also, the teams mentioned work for other departments as well, and improving conversion is a win across the board, not just for the conversion team.




