
37signals: A/B testing part 3: Finalé - wlll
http://37signals.com/svn/posts/2991-behind-the-scenes-ab-testing-part-3-final
======
patio11
I love when smart companies with great brands double business results with a
few hours of A/B testing. The more this happens, the more you have to think:
wait a second, if ten years of obsessive development and marketing and
branding gets us X sales, and 10 hours of A/B testing added X sales to that...
_what the heck was I doing last week again?_ Answer: something that did not
double sales.

I bang this drum frequently, but let me bang it one more time: you should be
A/B testing.

~~~
davidw
> you should be A/B testing.

37signals should be A/B testing, because they have the volume to do it.

Many of us don't, or aren't sure. Anyone want to comment on what sort of
numbers you should be looking at, roughly, in order to get statistically
significant results?

~~~
patio11
It's sensitive to your volume, conversion rates, and the magnitude of the
difference in conversion rates between A and B. In general, volume is good,
higher conversion rates (or easier intermediate conversions, like to email
submit rather than to purchase) are good, and large differences in conversion
rates are good. It would take a lot of time to statistically significantly
discriminate between .001% and .0012% on volumes of 5 visitors a day.

People like easy answers, though, so my easy answer is "A/B testing is
worthwhile if you have 100 visitors a day. Otherwise, spend the bulk of your
time finding 100 visitors a day first."
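The tradeoff patio11 describes (volume, baseline rate, and effect size) can be made concrete with the standard normal-approximation sample-size formula for comparing two proportions. This is a generic statistics sketch, not anything from the article; the conversion rates used are purely illustrative.

```python
from math import sqrt
from statistics import NormalDist  # stdlib, Python 3.8+

def ab_sample_size(p_a, p_b, alpha=0.05, power=0.8):
    """Approximate visitors needed *per variant* to detect the
    difference between conversion rates p_a and p_b with a
    two-sided z-test (normal approximation)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # significance threshold
    z_b = NormalDist().inv_cdf(power)            # desired power
    p_bar = (p_a + p_b) / 2
    n = ((z_a * sqrt(2 * p_bar * (1 - p_bar))
          + z_b * sqrt(p_a * (1 - p_a) + p_b * (1 - p_b))) ** 2
         / (p_a - p_b) ** 2)
    return int(n) + 1

# A modest 5% -> 6% lift takes roughly 8,000 visitors per variant;
# a dramatic 5% -> 10% lift needs only a few hundred.
print(ab_sample_size(0.05, 0.06))
print(ab_sample_size(0.05, 0.10))
```

Note how quadratically the requirement shrinks as the difference grows: this is why large, radical changes are the ones worth testing at low traffic.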

~~~
arnorhs
Is 100 a day enough? Wouldn't it take forever to get statistical significance?
(unless the results are really dramatic)

~~~
LargeWu
It depends on how much variance you normally have, and the size of the change
you hope to measure.

<http://en.wikipedia.org/wiki/Statistical_power>
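To put numbers on this (and on arnorhs's "wouldn't it take forever?" question above), here is a sketch of the power calculation for a two-proportion z-test. The traffic and conversion figures are illustrative assumptions, not data from the thread.

```python
from math import sqrt
from statistics import NormalDist  # stdlib, Python 3.8+

def ab_power(p_a, p_b, n, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two
    conversion rates, with n visitors in each variant."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    se = sqrt(p_a * (1 - p_a) / n + p_b * (1 - p_b) / n)
    shift = abs(p_a - p_b) / se
    nd = NormalDist()
    # probability the observed z-statistic clears either threshold
    return (1 - nd.cdf(z - shift)) + nd.cdf(-z - shift)

# 100 visitors/day split A/B for 30 days = 1,500 per variant.
# Power to detect a 5% -> 6% lift is only ~22%: most such tests
# would miss the effect entirely, so "forever" is about right
# unless the difference is dramatic.
print(round(ab_power(0.05, 0.06, 1500), 2))
```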

------
aresant
I would love to see the data sets that support the lifts - those little "up
3.49 / down 3.38" can get notoriously tricky in terms of statistical relevancy
using GWO or otherwise.

Also to note, I often find that radical new variant designs for well-established clients have rapidly diminishing returns.

Sometimes the sheer "newness" can skew the data as folks sitting on the fence
(returning visitors) push the conversion numbers up.

All that said, I'm a fan of the new look and feel, fun to watch 37signals
influence turn the market - expect to see lots of similar looking stuff in the
future.

~~~
sunir
There are a couple important biases where just changing something results in a
positive result.

* <http://en.wikipedia.org/wiki/Novelty_effect> (everyone loves seeing something new!)
* <http://en.wikipedia.org/wiki/Hawthorne_effect> (being a subject of an experiment is motivational)

There are a couple of ways to deal with these biases. One is to try a large
number of different designs, so the bias is spread across a large number of
options. Thus the comparative winner is more likely to be intrinsically
better, at least amongst the new designs. Note that this still penalizes the
control.

The other way is to wait longer until the new design is no longer 'new'. In
practice, that is impractical! Really, all you can do is pick the winner and
keep monitoring it after the fact to see if there is a regression in the
result.

I know others have said this, but you can harness these biases. Simply change
the design frequently for no reason other than to stay 'fresh'. It keeps
people interested and it shows you're alive and kicking.

~~~
chc
The Hawthorne effect probably wouldn't apply since it requires the subject to
know he's the subject of an experiment. The novelty of the new designs might
be a factor, but since we're measuring new sign-ups, it seems kind of unlikely
that a lot of people were familiar enough with the design of the page to
realize it was new.

~~~
jcampbell1
It is highly likely that people know they are looking at a new design. People
don't just adopt a CRM on a whim. For my site (foreign language learning),
the customer visits an average of 6 times before buying. New designs drop the
visit count temporarily, a clear sign the novelty effect is strong.

------
wlll
Parts 1 and 2 for easy clicking:

1) [http://37signals.com/svn/posts/2977-behind-the-scenes-
highri...](http://37signals.com/svn/posts/2977-behind-the-scenes-highrise-
marketing-site-ab-testing-part-1)

2) [http://37signals.com/svn/posts/2983-behind-the-scenes-ab-
tes...](http://37signals.com/svn/posts/2983-behind-the-scenes-ab-testing-
part-2-how-we-test)

------
xutopia
"Finalé" is a funny aberration. It is not a French word but rather the
incorrect pronunciation spelled out in French characters.

~~~
mitchty
Finale originates from an Italian word meaning final; not all Latinate words
in English derive from French.

Italian: <http://www.wordreference.com/enit/final>

English etymology:
[http://www.etymonline.com/index.php?search=finale&search...](http://www.etymonline.com/index.php?search=finale&searchmode=none)

Also, there is no such thing as "French characters". The grave and acute
accents, for example, are used in multiple languages with Latin-based alphabets.

~~~
davidw
The accent on the final (hehe) e would be wrong in Italian. The Italian word
is pronounced something like:

fee-NAH-lay, not fee-nah-LAY.

IIRC, it is much more common in French to place an accent on the final
syllable than in Italian, where the second to last usually gets the emphasis.

In any case the point stands that the word looks wrong.

~~~
chc
The accent _isn't_ supposed to be on the last syllable in English. It isn't
pronounced "fee-nah-LAY"; it's "fin-NALL-ee." In my experience, people
sometimes correct the last sound to a long A rather than a long E, but I've
never heard the accent on the last syllable. I think either you know some odd
people or it's some regional variant.

Source for British English:
[http://dictionary.cambridge.org/dictionary/british/finale?q=...](http://dictionary.cambridge.org/dictionary/british/finale?q=finale)

Source for American English: <http://www.merriam-
webster.com/dictionary/finale>

~~~
davidw
> I think either you know some odd people or it's some regional variant.

I think you've read the thread a bit quickly and are a tad confused. I was
talking about the Italian pronunciation. I know how it's pronounced in English
as well, as that is my native language.

~~~
chc
Oh, my apologies. I read "accent" as referring to the pronunciation (since you
followed it with a contrast of "fee-NAH-lay" and "fee-nah-LAY"), but you meant
the diacritic. Yeah, you're right, that's not the correct spelling of the
word.

------
ssharp
"The whole A/B testing concept probably came from from “strategy analysts” or
“MBAsses”. Anyway, now I’m a believer in A/B testing."

Is he also a new believer in not making irrational judgements of people
based on their job title or education?

~~~
asianmack
Touché! I was high-and-mighty too cool for school Designer role-playing.

~~~
ssharp
It's fine, but I didn't really read it that way. It probably doesn't help that
this place leans toward a strong "business guy = moronic jackass" mentality.
I'm sure that led to some assumptions on my part.

edit:

Let me add that, other than the comment I commented on, I really enjoyed the
series. There were a lot of creative ideas to test that extended well beyond
the usual published A/B test results, which cover fairly mundane changes like
the color of a button.

------
dkrich
This was a great demo, but if I could make one suggestion: I think it would be
a more useful exercise to group layouts by category and then test based on
educated assumptions from there.

What does that mean? Well I believe strongly in the "less is more" strategy,
and I have a hunch that the real difference in results lay in the lack of
options for visitors to the Basecamp intro page. With the two previous long-
form versions, users had the option to spend time reading and eventually
navigating away from the page. With the photo page, there is really only one
option- go to the sign-up page now. The images certainly make the first
impression very engaging, and I think it is well done, but I would be
interested to see whether a meaningful change would occur if instead of the
smiling people, you had a beautiful cityscape of the Chicago skyline, or no
image at all. I suspect it wouldn't be huge. My point is that it would be
useful to gain actionable insights that could be repeated, such as "always
make a sign-up button the only option and focal point of a splash page" as
opposed to "always use smiling people." Otherwise, the real value of the
lesson may have been overlooked.

------
rmc
The gender of the "big smiley customer" didn't seem to matter too much. This
makes me smile.

------
stevenp
It's good that they're using real people in the photos in those designs. Few
things bug me more on marketing sites than when people use those happy stock
photo people, who are usually jumping up and down or looking at a pie chart,
or staring at me in their hands-free headset, ready to take my call.

------
paraschopra
Our (Visual Website Optimizer) customers also arrived at a similar conclusion:
human faces indeed increase conversion rate. We compiled two A/B tests on
human faces vs. images into one case study. Here it is:
[http://visualwebsiteoptimizer.com/split-testing-
blog/human-l...](http://visualwebsiteoptimizer.com/split-testing-blog/human-
landing-page-increase-conversion-rate/)

This corroborates the result found by 37signals. (Although the caveat that it
is not always true still applies, so you should A/B test it before
implementing it on your website.) Looks like there is indeed something special
(at a sub-conscious level) about human faces.

------
powertower
This is interesting because I think what these tests prove is that people
coming to the Highrise website are not searching Google for CRM terms, they
are rather searching for the brand, or are coming from targeted ads... As in,
they don't need to be "sold" on anything or given more information to read,
they just need a button to click (are already determined to buy / now goal is
to just remove barriers for them).

What works here will NOT work somewhere else. Certainly will not work on my
website (<http://www.devside.net/server/webdeveloper>) where I have to sell
the benefits from the start... Even my buy button is at the very bottom of the
page because that _forces_ the visitor to read that page... "Above the fold"
is nothing but bounces and lost sales for me.

~~~
chaz
"Jason Fried’s mantra while testing was: We need to test radically different
things. We don’t know what works. Destroy all assumptions. We need to find
what works and keep iterating—keep learning."

The specific learnings are irrelevant. Only the process.

------
callmeed
I'm anticipating a new wave of SaaS homepages with photographs in the
background.

~~~
rokhayakebe
Starting with me. I know it is cheesy to copy but if it works it works.

~~~
joshuacc
And the only way to know if it works _for you_ is to test it.

------
mathattack
Great series.

There are plenty of reasons to be skeptical of small lifts. The statistical
significance probably won't be there except for high-volume sites. There is a
paradox: A/B testing is better for big changes, but those changes could also
be the most disruptive to existing users.

Even with these concerns, look at how efficient this is compared to
traditional retail. Imagine all the black magic required to figure out which
display for Crest works the best. It was worthwhile to do with very blunt
tools. We may not be in the realm of scalpels, but we are well beyond
chainsaws.

Thanks for sharing!

------
blue1
"Finale", a music term, is an Italian word (not French) and has no accent.

------
tosh
I'm glad they share their findings but would have loved to see more actual
information in this 3 part series.

what about:

* actual signup rate (not only compared to before)
* retention (that's the thing that really counts, right?)

does anyone know good articles on a/b testing that cover the important things
like statistical significance and how to make sure that the novelty of the
change isn't skewing the results and so on? I fear that too many people fall
into the trap of drawing premature conclusions and wasting time and money :(

------
staunch
The background colors were very different in the big photos.

------
jordibunster
The design with the big customer photo also has the most obvious "call to
action" button.

Not only that, it has the least amount of links that go elsewhere.

------
ynd
37signals are known and celebrated for their design style, and I think it's
awesome that they are willing to experiment with new ideas and discard old
ones. Particularly because the new designs in the post look nothing like their
usual style.

------
rushabh
There is also the issue of ethnicity. If you are targeting a global market,
what pics do you put up? Asian? African? Caucasian? Does that affect your
conversion per region? That would be interesting.

------
noelwelsh
I like that 37signals tested really different designs, and then refined the
winner. This is, in my opinion, the best way to do things to avoid getting
trapped in a local maximum.

------
huhtenberg
They should've also tried the GoDaddy girl :) I wouldn't be surprised if the
numbers were _way_ higher even though she wouldn't be at all relevant to the
context.

