
I'm Split Testing ... Why Haven't I Doubled My Revenue Yet? - dcancel
http://markitecht.tumblr.com/post/1009509922/conversion-metrics-explained
======
btilly
Here is an alternate theory.

Stick the post-conversion numbers into
http://elem.com/~btilly/effective-ab-testing/g-test-calculator.html (41
successes out of 638 trials versus 35 successes out of 416 trials) and the
conclusion of unequal performance has only 72.42% confidence. Meaning that
more than 1 time in 4 you'd see a difference that big or bigger by chance.
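
For anyone who wants to reproduce the arithmetic without the calculator page,
here is a minimal G-test sketch in plain Python. (The linked calculator appears
to apply a continuity correction, so its 72.42% figure differs slightly from
the uncorrected result below; either way the conclusion is the same - nowhere
near 95% confidence.)

```python
from math import erfc, log, sqrt

def g_test_2x2(s1, n1, s2, n2):
    """G-test of independence for two conversion rates (2x2 table, df=1).
    Returns (G statistic, two-sided p-value)."""
    observed = [s1, n1 - s1, s2, n2 - s2]
    total = n1 + n2
    successes = s1 + s2
    failures = total - successes
    # Expected counts under the null hypothesis of equal conversion rates.
    expected = [
        n1 * successes / total, n1 * failures / total,
        n2 * successes / total, n2 * failures / total,
    ]
    g = 2 * sum(o * log(o / e) for o, e in zip(observed, expected))
    # Survival function of chi-square with 1 degree of freedom: erfc(sqrt(x/2)).
    p = erfc(sqrt(g / 2))
    return g, p

g, p = g_test_2x2(41, 638, 35, 416)
print(f"G = {g:.3f}, p = {p:.3f}, confidence = {1 - p:.1%}")
```

Run on the numbers above, this gives a p-value around 0.23, i.e. a difference
that large would show up by chance well over one time in five.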

In other words the entire basis of this post could be a chance statistical
fluctuation that should be ignored.

It is true that there can be effects where pushing less qualified leads
through the top stage of the funnel doesn't get them to the end. However, my
experience with A/B testing is that it is more common for the extra people
that a test puts in at the top to convert the rest of the way at roughly the
same rate as everyone else.

But not always! Which is why if you have sufficient volume you should always
measure to actual sales. There is no other way to be absolutely sure that you
are improving end sales.

However, in this example that would mean running the test for something like
20x as long. In that case it makes sense to be pragmatic: test from one step
of the funnel to the next, and then pivot on the answers you get. Furthermore,
to start you should focus on the top of the funnel for the simple reason that
the higher volumes there will get you answers faster - you can easily try a
dozen ideas at the top before you could finish testing one idea deeper in the
funnel.
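
The intuition behind "something like 20x as long" can be ballparked with the
standard two-proportion sample-size formula. (A sketch only: the baseline
rates and the 95%-confidence/80%-power choices below are illustrative
assumptions, not numbers from the post.)

```python
from math import ceil

def sample_size_per_arm(p1, p2, z_alpha=1.96, z_beta=0.84):
    """Approximate visitors needed per arm to detect a change from
    conversion rate p1 to p2 at ~95% confidence with ~80% power."""
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Detecting a lift at the top of the funnel (e.g. 8% -> 10%) takes far
# fewer visitors than detecting the same relative lift on end sales
# (e.g. 0.8% -> 1.0%), which is why deep-funnel tests run so much longer.
top = sample_size_per_arm(0.08, 0.10)
deep = sample_size_per_arm(0.008, 0.010)
print(top, deep)
```

With these illustrative rates the end-sales test needs roughly an order of
magnitude more visitors per arm; rarer conversions stretch the gap further.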

Once you've improved your site enough to get a better percentage of actual
sales, you'll be able to purchase more traffic. Doing both of those things
will put you in a position to conduct more rigorous A/B tests to eke out more
subtle differences. But that is down the road. Focus on testing what is
easiest in the quickest possible way first.

~~~
JangoSteve
_In other words the entire basis of this post could be a chance statistical
fluctuation that should be ignored._

I agree that the particular stats referenced in the article may not be
statistically valid, but I would argue that those stats were a supporting
detail rather than the entire basis. The main point as I understood it was to
illustrate, more or less, why a 100% increase in conversions to the purchasing
page does not equal a 100% increase in conversions from the purchasing page to
actual purchase.

They are saying that once you start attracting traffic beyond the early
adopters, your additional traffic is now comprised of a different group of
people who exhibit fundamentally different behavior in how likely they are to
make a purchase even once they've hit the purchasing page.

~~~
btilly
_The main point as I understood it was to illustrate, more or less, why a 100%
increase in conversions to the purchasing page does not equal a 100% increase
in conversions from the purchasing page to actual purchase._

Yes, it offered a theory about why this was so. Yet my experience is that a
100% increase in initial conversions typically results in approximately a 100%
increase in sales.

My further experience from doing A/B testing for many years is that lots of
people are eager to grab any numbers you give them, then run with them and
form grand theories that aren't backed up by the actual statistics. Those
theories have a remarkably low success rate in explaining the results of the
_next_ A/B test you run. (Or, frequently, the current test once we let it run
longer.)

 _They are saying that once you start attracting traffic beyond the early
adopters, your additional traffic is now comprised of a different group of
people who exhibit fundamentally different behavior in how likely they are to
make a purchase even once they've hit the purchasing page._

It was an attractively presented theory. True, I've learned to be cautious of
attractively presented theories which aren't actually backed by data. But it
was definitely attractively presented.

However no evidence was offered that the people progressing in group A were
actually significantly different than the people progressing in group B. And
if the test truly was letting crappier traffic through, and the crappiness of
that traffic was the primary cause of trends in subsequent behavior, then that
traffic has to be REALLY crappy to explain the difference. Occam's razor says
that random chance was the cause of the data, or at least a large enough
contributor that there is no immediate need to think too hard about other
possibilities. And so until better data becomes available, I'm going to
suggest that "chance fluctuation" deserves a hearing.

Medicine sometimes calls these zebras. Why? Because it is like someone hearing
something with 4 hooves run past an open window and immediately concluding
that it is a zebra. Sure, it _COULD_ be a zebra. _SOMETIMES_ it proves to be a
zebra. But the odds are much better that it was a horse.

Guess the garden variety answer before guessing the exotic one.

~~~
markitechtMA
Thanks for the feedback.

So, regarding this being a fluke one-off, or the assertion that there isn't
data to back this up: I have reams of tests that display similar behavior,
many with tens of thousands of unique participants. As the other commenter
said, the example in the post was icing, not the cake. But I take your point,
and it's good to hear that people who know something about this are reading my
posts; maybe I can get a little deeper into advanced topics and results.

If your early-funnel optimizations directly predict the end-funnel conversion,
I think that 1) that's awesome, 2) the products you are testing have a huge
amount of room to grow (also awesome; hope the hiring goes well, it's always
tricky finding great people), and 3) the phenomenon I was describing doesn't
really apply to you and those products yet.

One thing that may not have been crystal clear is that it's one group of
people that we are talking about, and examining the segments of that group.
Let's put it another way. There are 1,000 people who see a certain offer page
on a given day. 1 of those people is a fellow who has read 3 reviews, used
competing products, and decided 100% that he is going to buy the product. The
page could have a tiny little 10px link to add the product to cart, it could
be in the footer and #EEE and this dude would still find it and complete the
purchase.

Now, as you make it clearer to people why and how they should buy the product,
you are not generating more interest like the first fellow's. You are simply
inviting a broader selection of the audience to consider the purchase a bit
more. That's really the principle that is at play here.

Thanks again for reading, and for the comment.

------
JangoSteve
Ah, it's a rhetorical question. I was ready to read a rant; well played, sir.
If you're short on time, check out the last graph, it's gold.

------
po
Another tactic besides trying to move the curve out is: screw the skeptical
a__holes and create another product that the early adopters will go for. If
you have people tearing holes in their pants trying to get their wallets out
faster, capitalize on it. Focus on that very small wedge on the left. Let's
call this "the Apple strategy".

------
zemaj
If this holds for more situations, I guess the conclusion I would draw is that
split testing should be done as far down the funnel as possible to generate
the most return.

~~~
StavrosK
If you're losing 90% of the people on the first step and 50% of that 10% on
the second step, is the second step really what you need to be optimising?
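
One way to frame that question: a relative lift at any single step multiplies
through to the same overall lift, so the choice comes down to which step is
cheapest to move. A quick sketch using the rates from the comment (the 20%
lift is an illustrative assumption):

```python
def overall_conversion(step_rates):
    """Overall funnel conversion is the product of the per-step rates."""
    result = 1.0
    for rate in step_rates:
        result *= rate
    return result

base = overall_conversion([0.10, 0.50])        # 10% pass step 1, 50% pass step 2
lift_step1 = overall_conversion([0.12, 0.50])  # +20% relative lift on step 1
lift_step2 = overall_conversion([0.10, 0.60])  # +20% relative lift on step 2
print(base, lift_step1, lift_step2)
```

Both lifts land on the same overall rate, so the math alone doesn't pick a
step for you; traffic volume and ease of improvement do.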

~~~
markitechtMA
Possibly, yes. The 'start at the cart' theory of optimization says that you
should absolutely start by optimizing that 50% and work 'backwards.' The idea
being that it is a lot easier to close sales that are in process than to
generate more sales leads that may or may not be qualified enough to actually
initiate a purchase.

Focus on reeling in the fish on the hook, to put it grossly, as opposed to
putting more hooks in the water.

Pretty easy to see the other side of this though, and like the correct answer
to any poker question, the answer ends up being "it depends". =)

