
What works in e-commerce – A meta-analysis of online experiments [pdf] - sweezyjeezy
http://www.qubit.com/sites/default/files/pdf/qubit_meta_analysis.pdf
======
tomarr
If you want to see pretty much all the most effective measures in action, go
to a ticket reseller like Viagogo and follow something through to the basket.
It is both impressive and amazing the amount of psychological steers there are
on the site.

~~~
gruez
Or flight/hotel/car rental booking websites. Google flights is a notable
exception.

~~~
_jal
The easiest way to make me go somewhere else is to adopt these tactics. They
make me feel like I'm locked in a battle of wits with an autistic con-artist,
or maybe a suspicious Fed - if weird rituals are not faithfully observed under
conditions of vast information asymmetry and artificial stress, I will suffer.

There's no hope of avoiding that with airlines for me - I don't fly enough to
avoid the prole line. But literally any storefront where I can avoid that
shit, I do, and it breeds not just dislike - it makes me want to see them
fail.

~~~
lucaspm98
So few people recognize these tricks that, unfortunately, in most cases the
additional revenue outweighs the cost of annoying that minority. It's
fascinating to look at ecommerce storefronts and search for all of the subtle
tactics they're trying to use on you.

------
bartkappenburg
Great research with lots of data and tangible results.

I'm missing 'authority' as a way to improve conversion. Funny that they use it
themselves in two ways: letting PWC check the results and, in a milder form,
using a scientific way of communicating the results (LaTeX, article layout
etc). The impact would have been less if it was just a blog post :-). I'm
curious about results on authority, maybe someone from qubit can give us some
insights on that?

At my company[0] we offer a solution to sites to implement these strategies
through notifications/nudges. Having said that: we firmly believe in A/B
testing but we believe even more in recognizing (we do that through machine
learning) what technique works best on a personal level. This means that a
site can have, for example, two strategies and that we apply none, either one
or both on the visitor. That way you can reach higher uplifts.

[0] [https://www.conversify.com](https://www.conversify.com)

------
joe-stanton
This is a really useful article. It's a shame that so much development time is
wasted on large numbers of fruitless optimisations just because they are
"easy" (eg. tweaking the colour of a CTA).

That being said, I'm surprised many of the results are so negative. It would
be great to also see the max uplift achieved for each category. A number of
retailers I've worked with have been able to beat these uplifts by quite a
bit. I wonder if it might be significantly skewed by the kind of clients Qubit
has?

~~~
Silhouette
_It would be great to see the max uplift achieved for each category_

Indeed. What matters most with these kinds of experiments isn't really the
average results, but what is possible and the distribution _among beneficial
results only_. After all, the whole point of A/B testing is to try experiments
and then either keep the changes if they improve results or stay with what
you've already got if the changes didn't bring an improvement. Surely all the
treatments that led to negative changes would just have been discarded in
practice? It's still important to see the full picture as well, if only to
guide decisions about which experiments are even worth trying, but I think
there's another side that doesn't fully come through here.

~~~
gwern
> isn't really the average results, but what is possible and the distribution
> among beneficial results only

No, the bad results also matter: you are still spending visitors and revenues
in testing out bad variants, which is part of determining the costs and
benefits. Even with a bandit approach, you incur logarithmic regret in the
number of variants. And testing a bad variant is common: the best category,
'scarcity', has a 16% probability of the variant being harmful. A Value of
Information calculation has to take into account the harm done while testing.
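
A rough sketch of that cost-benefit point, in Python. Only the 16% harm probability comes from the comment above; the traffic, revenue and effect-size figures are made-up assumptions, the function name is hypothetical, and the model optimistically assumes the test always identifies the better arm.

```python
def net_value_of_test(p_harm, gain, loss, revenue_per_visitor,
                      test_visitors, future_visitors):
    """Expected revenue change from one 50/50 A/B test.

    p_harm: probability the variant is worse than control
    gain / loss: relative revenue change if the variant is good / bad
    Returns (expected change during the test, expected change afterwards).
    """
    exposed = test_visitors / 2  # half of the test traffic sees the variant
    # During the test, visitors are exposed whether the variant helps or hurts.
    during_test = exposed * revenue_per_visitor * (
        (1 - p_harm) * gain - p_harm * loss)
    # Afterwards a good variant is kept and a bad one is discarded.
    after_test = future_visitors * revenue_per_visitor * (1 - p_harm) * gain
    return during_test, after_test


during, after = net_value_of_test(
    p_harm=0.16,              # from the comment above (scarcity category)
    gain=0.03, loss=0.03,     # assumed, symmetric effect sizes
    revenue_per_visitor=2.0,  # assumed
    test_visitors=100_000,    # assumed
    future_visitors=1_000_000,  # assumed
)
print(f"during test: {during:.0f}, after test: {after:.0f}")
```

With a higher `p_harm` or a larger `loss`, the first number goes negative, which is the harm the calculation has to charge against the test.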

~~~
Silhouette
(Hence the final sentence of my previous comment.)

------
cylinder
Well, running a rudimentary eBay store you realize these things help pretty
quickly, but it's good to have data.

However... Starting a new e-commerce property? Good luck finding traffic in
anything profitable. Amazon and other giants dominate search rankings, so I'm
not sure how you will find your traffic unless you create a new niche. Maybe
you're a thought leader in a hobbyist space, that can work... But you're not
going to succeed because of these tricks.

~~~
nik736
That's simply false. There are always categories or "side"-categories that can
be highly profitable and big stores only care about as a side business (small
category in their store).

Being a small category for a huge store means they don't care about that
category too much but it doesn't mean there isn't a lot of cash to be made.

I've helped several companies to build an online store that did exactly that
and they are all doing 5 digit revenue month after month.

It's easy to rank higher in search engines for something you are 100% focused
on compared to a big store where it's only a small thing that they don't
highlight.

~~~
pbowyer
Can you give any examples of small categories? I can't think of any, let alone
get my brain round what kind of thing they would be.

~~~
ecommerceguy
I can think of plenty, but I'm not going to say... Tehehe, sorry for the
joshing. It's interesting that free shipping doesn't translate into increased
revenues, or at least not as much as one would think (although I'd like to see
this same test on merchant-fulfilled Amazon listings).

------
inopinatus
Inconsistencies to report.

Page 2:

    
    
        • scarcity (stock pointers) +2.9% uplift
        • urgency (countdown timers) +2.3% uplift
        • social proof (informing users of others’ behaviour) +1.5% uplift
    

Page 6 table 2.2:

    
    
        scarcity 2.9%
        social proof 2.3%
        urgency 1.5%

~~~
sweezyjeezy
_facepalm_. The table is correct. Will correct this tomorrow. Thanks for the
heads up.

------
1k
I'm surprised free shipping has a negative impact on revenue. Worst case I
imagine would be that revenue would increase but with less or negative profit.

Likewise most GUI-related tweaks seem to have a negative effect (mobile
friendliness, search, navigation). Assuming it gives a better mobile
experience, why would anyone spend less - unless the goal is to get them off
mobile and onto the desktop?

~~~
Theodores
I think that the GUI-related tweaks must be on sites that are done properly
and well established. For these guys the customers know what to expect so a
change in design will be received negatively by people that are used to the
old site. Although not in ecommerce, the redesign of the BBC site is like
that, people want to cling on to the old site and instinctively don't like the
new. In time they adjust or get a touchscreen computer and 'get' the changes
and what the redesign is about.

Most people are working on sites other than the top ones. With these websites
where customers are not regulars, change to the GUI is received differently.
Sensible updates to the UX will convert.

------
Silhouette
Thanks for sharing some real data. It's always interesting to see.

I too am surprised to see the conclusions come out as negative as this over
such a large overall data set. Just from our own experience, even making quite
modest changes to small web sites, it's not _that_ unusual to see the kinds of
change that came out with a negative mean in this report actually making a
very noticeable positive difference.

I wonder whether this is partly a matter of interpretation and presentation. A
lot of the treatments that had a slightly negative mean also had a lot of
variance, which suggests that quite often those treatments do work but it's
not reliable and requires experimentation to make sure you only keep the
genuinely beneficial cases. It seems plausible that there were a few of the
"75% improvement in our case study!" kinds of results lost in the long tails,
but that what the data is telling us is that those really are outliers and
don't happen nearly as often as we might wish.
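
To illustrate how a treatment with a slightly negative mean and a wide spread can still pay off when only the winners are kept, here is a minimal Python sketch. It assumes true effects are normally distributed and that testing separates winners from losers perfectly; the mean and spread are made-up numbers, not figures from the paper, and the function names are hypothetical. It also ignores any revenue lost while losing variants are being tested.

```python
import math


def prob_beneficial(mean, sd):
    """P(X > 0) for X ~ Normal(mean, sd)."""
    return 0.5 * (1 + math.erf(mean / (sd * math.sqrt(2))))


def expected_kept_uplift(mean, sd):
    """E[max(X, 0)] for X ~ Normal(mean, sd): the average uplift per
    experiment when every harmful variant is discarded (counted as 0)."""
    z = mean / sd
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    return mean * cdf + sd * pdf


mean, sd = -0.2, 2.0  # assumed: -0.2% average effect with a 2% spread
print(f"share of beneficial variants: {prob_beneficial(mean, sd):.2f}")
print(f"average uplift if losers are dropped: {expected_kept_uplift(mean, sd):.2f}%")
```

Under these assumptions a little under half of the variants help, and the kept-winners average is positive even though the raw mean is negative.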

------
ifcologne
Lots of interesting data and some findings I didn't expect. Thanks.

What I have missed in this paper is the impact of customer reviews (a 4.5
ranking has a higher uplift potential than a 5.0 star ranking - according to
some studies). And the number of reviews has impact as well. Not enough
studies around for a meta analysis?

------
geetfun
Anyone who hangs out on Facebook group e-commerce forums has basically seen
the outrageous claims the authors of this paper allude to.

Nice to see an analysis like this for a change.

~~~
castell
can you recommend good groups?

~~~
geetfun
A lot of Shopify groups.

Search up anything Shopify or ecommerce mastermind.

Be aware that a lot of them are set up as groups that simply try to sell you
stuff.

The best ecommerce folks are basically the best marketers.

------
JorgeGT
The PwC assurance report URL gives a 404:
[http://www.qubit.com/sites/default/files/pdf/pwc-qubit-
assur...](http://www.qubit.com/sites/default/files/pdf/pwc-qubit-
assurance.pdf)

~~~
sweezyjeezy
Apologies : [http://www.qubit.com/sites/default/files/pdf/pwc-qubit-
assur...](http://www.qubit.com/sites/default/files/pdf/pwc-qubit-
assurance_0.pdf)

~~~
Raphmedia
Now that's pretty bad. There is a modal that opens _before_ the document is
displayed.

"Chrome PDF Viewer

Click YES to accept the terms of the disclaimer on page 1 of this document.

If you click NO the document will close.

[OK] [CANCEL]"

Not only does this want me to agree to something before displaying it, the
legally binding options (Yes / No) aren't even choices in the modal.

~~~
thefalcon
Thankfully, clicking CANCEL opens the PDF anyway.

------
ssharp
In my experience, the A/B tests that are most likely to win are the ones where
you make UX changes designed to make it easier for visitors to do what you
want them to do. These not only improve your conversion rates, they are also
less spammy and intrusive than things like exit-intent modals. They are also the
types of gains that do compound.

Want an easy win? Make mobile checkout better. It's generally the worst. I was
on a fairly large, publicly traded, retailer's site over the weekend and had a
goofy error that was extremely easy to make on their mobile checkout page.
While I was alerted to the error, it also emptied my shopping cart and erased
all the address and payment info I spent time typing in.

------
nocoder
Very interesting, thanks for sharing. As someone working in e-commerce, I was
smiling when I saw some of them, but it is frustrating how often these
ineffective experiments are repeated.

I would have loved to see the cut of performance by industry/sector. My hunch
is some of the things would work really well in travel but not as well in
others especially low involvement categories and categories with lower average
selling price. It would also be interesting to know the average duration of
these A/B tests. I think some of the things like scarcity and urgency will have
a larger effect over shorter durations, versus others like UI changes, which
will take a while to produce substantial results, mostly because customers
will have to learn new behaviours. Product recommendations are interesting
because they are notoriously difficult to get right, and I feel they tend to
work better in long-tail categories like media than in head-heavy categories
like mobiles or laptops. They may also not work well in categories where brand
influence is high and which are generally high involvement and high cost.

------
iagovar
It's good to see an analysis, albeit most of this info was common knowledge,
maybe except call to action buttons causing a decrease. I thought it was the
opposite.

Sometimes factoring all variables when doing testing becomes impossible.

~~~
sweezyjeezy
This stuff is less common knowledge than you might expect. When bringing on
new clients at Qubit (particularly smaller ones), we still find so many of
them obsessing over tiny UI changes expecting it to make an impact. Much of
the e-commerce industry has bought into the idea that the cosmetics of a site
are the most important thing to get right.

~~~
mrweasel
I believe that the main focus of any ecommerce site should be pricing,
availability and having the payment and delivery options your customers
expect.

E-commerce sites want to "engage" with their customers, but fail to realise
that the customers don't want a relationship. They just want whatever product
you're selling as quickly and cheaply as possible. Most of your customers will
be coming via price comparison sites or Google, so they aren't going to you
directly (unless you're Amazon or eBay). For most e-commerce sites the
customers aren't going to stay long enough to notice imperfections in the UI.

What you do need is a dead simple checkout (no signup required) and an equally
easy return form. Everything else will be used by only a tiny percentage of
your customers, often those you don't want to deal with anyway.

For some weird reason making the checkout too simple will result in a large
number of purchases cancelled right after placement. In my experience
customers are more likely to cancel within 15 minutes of placing the order
than at any other time.

~~~
srrr
Two counterpoints:

I have done a/b tests where increasing the price of the product increased the
conversion rate. The original price was too low and the customers thought this
was a cheap, low-quality product.

Also I, and many people I know, are not shopping for the lowest price, but for
the best overall package. (Reputation of seller, return policies, shipping
duration, display of the product in the shop, filter and search abilities to
find the right product in the shop, ...) There are of course many users who
simply want the lowest price, but one important thing if you a/b test is: You
don't have to go with the biggest group of users (in this case price
sensitive); there are so many other opportunities if you understand your
users. In e-commerce these opportunities open up especially often if the user is
not exactly sure what product/brand he needs or wants to buy. Competing on
price is really hard, competing on advice is often easier as a small shop.

~~~
dsfyu404ed
Everyone is price sensitive once you've met the minimum threshold for those
other things.

What you're selling sets the threshold for those things. Return policies,
advice and reputation matter more for embroidered clothing than they do on
bulk sales of nuts and bolts.

------
jameslk
I wish they would have evaluated the effect of dynamic pricing. That is,
showing different prices to different visitors for the same product. Perhaps
not enough online retailers employ the practice, although it seems to be an
important tool for retailers like Amazon[0].

[0] [https://www.theatlantic.com/magazine/archive/2017/05/how-
onl...](https://www.theatlantic.com/magazine/archive/2017/05/how-online-
shopping-makes-suckers-of-us-all/521448/?single_page=true)

~~~
ripberge
What you are describing is an A/B test of a price, not typically what industry
calls "dynamic" or "variable" pricing. Amazon tried A/B testing prices and
caught hell for it. They no longer do it, but they will do dynamic pricing.

------
Systemic33
A lot of these are brilliantly displayed with varying degrees of integration
and success on airline pages. [1],[2]

[1] [http://sas.dk](http://sas.dk) (Search for e.g. Copenhagen -> London)

[2] [http://lufthansa.com](http://lufthansa.com) (Search for e.g. Frankfurt ->
Copenhagen)

------
hahamrfunnyguy
I've done the countdown timer thing before, but not in a sleazy way like
Viagogo. I've done limited-time sales where the item starts at a percentage
off and gradually increases to full price. It seems to work well.
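
A minimal Python sketch of that kind of sale, assuming the price rises linearly from the discounted price to full price over a fixed window. The comment only says "gradually", so the linear ramp, the function name and the numbers are all assumptions.

```python
def ramp_price(full_price, start_discount, elapsed, duration):
    """Price that starts at (1 - start_discount) * full_price and rises
    linearly to full_price over `duration` (same time unit as `elapsed`)."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    progress = min(max(elapsed / duration, 0.0), 1.0)  # clamp to [0, 1]
    discount = start_discount * (1.0 - progress)
    return round(full_price * (1.0 - discount), 2)


# Assumed example: 30% off at launch, back to full price after 48 hours.
for hour in (0, 12, 24, 48):
    print(hour, ramp_price(100.0, 0.30, hour, 48))
```

Because the discount shrinks in real time, the countdown reflects an actual price change rather than an invented deadline.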

------
ronack
This is interesting but I definitely question some of the results, for
instance reporting a negative impact for changing search results. Many
businesses have been built on improving conversion through search result
optimization.

~~~
robbiemitchell
I interpreted this as them creating/improving on-site search. Not related to
adwords or display optimization.

~~~
ronack
Sure, but what were the improvements to on-site search? More relevant results?
Tweaking the display? I know from experience that improving search result
relevance in e-commerce has a (sometimes drastic) improvement in conversion.

------
lostphilosopher
This was really interesting. Thanks for posting it. Does anyone know of a
place to find similar content? (Analysis of web trends and practices from a
data driven perspective.)

~~~
mitchdoogle
Nielsen Norman Group ( [https://www.nngroup.com](https://www.nngroup.com) ) is
a good place to look for somewhat similar content. NNG focuses more on user
experience, and so their studies are usually based on direct observation of
users, rather than large amounts of data gathered through analytics.

Still, it has proven a very valuable resource for me when trying to explain a
decision I've made in a new website design. They have many free articles that
offer some good insights, as well as some more in-depth reports about specific
sectors that will cost a few hundred dollars each.

------
Iv
tl;dr: The 3 items possibly statistically significant are:

- Saying there are just a handful of items left in stock (+2.9% revenue per
client)
- Saying other people are watching this product (+2.3%)
- Time-limited offer (+1.5%)

I did not see mention of combining these factors. I doubt the gains are
cumulative.
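
For scale, a small Python sketch of what full stacking would look like. It assumes the three effects are independent and multiply, which is exactly the assumption doubted above and which the quoted figures do not establish, so the result is at best an upper bound.

```python
# Uplifts quoted in the tl;dr above, as fractions of revenue per client.
uplifts = {"scarcity": 0.029, "social proof": 0.023, "urgency": 0.015}

additive = sum(uplifts.values())

compounded = 1.0
for u in uplifts.values():
    compounded *= 1.0 + u  # assumes independent, multiplicative effects
compounded -= 1.0

print(f"additive: {additive:.4f}, compounded: {compounded:.4f}")
```

Either way the ceiling is under 7%, and any overlap between the three nudges would pull the real figure lower.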

My main takeaway is that most optimizations are not worthwhile if you have the
opportunity to spend your time/money on something else to bring value to the
consumer.

Also, I think #1 and #3 are dick moves and #2 needs some good crafting to not
be. I doubt the cost in reputation is worth the increase in revenue.

~~~
gumboshoes
Whether they're dick moves is all in how you handle them. In the nonprofit
world, these things are well-known and tend to be handled with finesse and
even grace.

1. It's not just about items in stock. "Our SomeNonprofit membership program
has just 10 places left at the platinum level, where you receive the following
benefits..."

2. Social proof is more than just what people are watching. "Fans of
SomeNonprofit donate an average of $121 to SomeNonprofit every year." Or,
"When we asked SomeNonprofit donors what they liked most, they said it was the
way SomeNonprofit does SomeThing. If you like SomeThing, too, then donate to
support SomeNonprofit." It's really just about demonstrating that others have
done a thing so it's okay for you to do it, too.

3. One nonprofit I'm involved with very successfully does a fundraising
campaign for the last 24 hours of every year. It's an artificial time
restraint in some ways, but it also capitalizes on being the very last day
each year that tax-deductible donations count toward a year's taxes. It's a
true time-constraint!

~~~
Iv
OK, agreed here. I need to specify it a bit more: 1 and 3 are dick moves when
they are totally artificial, like saying only 3 left while the stock is around
300 or saying "only 2 hours left" when the time constraints are imaginary.

In your case they are not, so I agree that it is worthwhile to remind people
of these actual constraints.

