
As someone who works for a major e-commerce site, I am often the one with the most influence when it comes time to decide which testing method to adopt. Multi-armed bandit testing can be good, just as standard A/B testing can be good. But the factor that trumps all of these is the total cost of testing (and the return on investment to the business). One must consider the following before undertaking any of these testing methods:

1. Implementation Costs - How much time will it take to implement the testing code? Some tests are easier to implement than others.
2. Maintenance Costs - How much time will it cost to maintain the test for the duration of the testing period? We've ignored this in the past only to realize on occasion that implementation introduces bugs which incur cost and can be disruptive.
3. Opportunity Costs - What is the cost of doing the test versus not doing the test? Consider setup time, analysis, and final implementation.

After going through a few tests now, we have a pretty good sense of what the total cost to the business is. We don't really look at it as adopting one test method over the other, but instead rely on the projected ROI of testing this versus that, versus doing nothing.
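To make the arithmetic concrete, here is a rough Python sketch of how the three cost buckets above might roll up into a projected ROI. Everything in it is an assumption for illustration: the `TestCosts` and `projected_roi` names, the idea of expressing all three costs in hours at a single hourly rate, and the numbers in the usage lines. It is not how any particular company computes this.

```python
from dataclasses import dataclass


@dataclass
class TestCosts:
    """Hypothetical cost buckets for one test, all expressed in hours."""
    implementation_hours: float
    maintenance_hours: float
    opportunity_hours: float  # setup, analysis, final implementation

    def total(self, hourly_rate: float) -> float:
        """Total cost in currency units at an assumed flat hourly rate."""
        hours = (self.implementation_hours
                 + self.maintenance_hours
                 + self.opportunity_hours)
        return hours * hourly_rate


def projected_roi(expected_gain: float, total_cost: float) -> float:
    """Standard ROI ratio: (gain - cost) / cost."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (expected_gain - total_cost) / total_cost


# Illustrative numbers only, not real data.
costs = TestCosts(implementation_hours=10, maintenance_hours=5, opportunity_hours=5)
cost = costs.total(hourly_rate=100.0)
roi = projected_roi(expected_gain=5000.0, total_cost=cost)
```

With these made-up inputs the test costs 2000 and the ratio comes out to 1.5. "Doing nothing" is the baseline with zero cost and zero gain, so a candidate test is only worth running if its projected ratio is comfortably positive.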




If you've conducted multiple tests and "time to implement the testing code" is a major consideration, then you're doing it wrong. If ROI is also a major consideration, then again you're doing it wrong.

Seriously, adding an email test right now at the company I'm contracting for takes 2 lines of code. One appears in the program that sends email and looks like:

    $email_state_contact->ab_test_version("test_1234", {A => 1, B => 1});
where test is the name of a test, and 1234 is a ticket number to avoid accidental conflicts of test names. The other appears in a template and looks something like this:

    .../[% ab_test.test_1234 == 'A' ? 'button1' : 'button2' %].png...
That's it. The test automatically shows up in the daily reports. When a version wins, you get rid of that code and put the right thing in the template.

Done.
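For anyone wondering what could sit behind a call like that, here is a minimal Python sketch of deterministic, weighted version assignment. It is not the Perl code from the comment: the hashing scheme, the `contact_id` parameter, and the reading of `{A => 1, B => 1}` as relative weights are all assumptions made for illustration.

```python
import hashlib
from typing import Mapping


def ab_test_version(test_name: str, contact_id: str,
                    weights: Mapping[str, int]) -> str:
    """Pick a version for a contact, stable across calls.

    Hypothetical helper: hashes (test_name, contact_id) so the same
    contact always sees the same version of the same test.
    """
    total = sum(weights.values())
    if total <= 0 or any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative and sum to > 0")

    digest = hashlib.sha256(f"{test_name}:{contact_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % total

    # Walk the versions in a fixed order, consuming weight as we go.
    for version, weight in sorted(weights.items()):
        if bucket < weight:
            return version
        bucket -= weight
    raise AssertionError("unreachable: bucket is always below total weight")


version = ab_test_version("test_1234", "contact-42", {"A": 1, "B": 1})
```

Because the choice depends only on the test name and the contact, nothing needs to be stored to keep a contact in the same group, and including the ticket number in the test name keeps unrelated tests from sharing buckets.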


I can imagine a number of situations where the implementation is significantly more complex. While ideally A/B tests should be looking at relatively small changes, where each change is independent, many times people are making profoundly larger changes.

If you are testing the conversion rate in shopping carts, and the changes involve drastic redesigns of the flow through the shopping cart process, that could be a serious technological difference that requires substantial time to implement.

Not every test is as easy as changing the copy on an email.


Even if you're making larger and more complex changes, the overhead of your testing methodology remains the same. That is, how you measure things should be a fixed (small) effort. The cost of building the test is whatever the test is.

In other words, multi-armed bandit versus A/B test is something that you shouldn't be deciding based on the effort of the testing methodology.
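One way to see why the methodology overhead can stay fixed: if the allocation policy sits behind a small interface, swapping a fixed split for a bandit is a one-line change at the call site. The Python sketch below is an illustration under assumptions, not anyone's production system. The class names and the `choose`/`record` interface are made up, and epsilon-greedy is just one simple bandit strategy picked for brevity.

```python
import random
from typing import Dict, List, Optional


class FixedSplit:
    """Classic A/B allocation: every arm is equally likely, no learning."""

    def __init__(self, arms: List[str], rng: Optional[random.Random] = None):
        self.arms = list(arms)
        self.rng = rng or random.Random()

    def choose(self) -> str:
        return self.rng.choice(self.arms)

    def record(self, arm: str, converted: bool) -> None:
        pass  # results are only analysed after the test ends


class EpsilonGreedy:
    """Simple bandit: explore with probability epsilon, else exploit."""

    def __init__(self, arms: List[str], epsilon: float = 0.1,
                 rng: Optional[random.Random] = None):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = rng or random.Random()
        self.shown: Dict[str, int] = {a: 0 for a in self.arms}
        self.conversions: Dict[str, int] = {a: 0 for a in self.arms}

    def _rate(self, arm: str) -> float:
        shown = self.shown[arm]
        # Untried arms rank first so each arm gets shown at least once.
        return self.conversions[arm] / shown if shown else float("inf")

    def choose(self) -> str:
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=self._rate)

    def record(self, arm: str, converted: bool) -> None:
        self.shown[arm] += 1
        self.conversions[arm] += int(converted)


# Same calling code for either policy.
policy = EpsilonGreedy(["A", "B"], epsilon=0.0)
first = policy.choose()
policy.record(first, converted=False)
second = policy.choose()
policy.record(second, converted=True)
third = policy.choose()
```

With `epsilon=0.0` the run is deterministic: the policy tries A, then B, then sticks with B because it is the only arm with a conversion. Replacing `EpsilonGreedy(...)` with `FixedSplit(...)` leaves the rest of the calling code untouched, which is the sense in which the choice of method should not drive implementation cost.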


I don't think he was referring to the technology behind the A/B test itself, but rather the technology behind the change that was being made.

That's how I interpreted his statement. I agree with you that the actual A/B testing overhead should be minimal and fairly trivial to put into place.



