
So, you need a statistically significant sample? - astrobiased
http://technology.stitchfix.com/blog/2015/05/26/significant-sample/
======
mshron
The "default" alpha and beta are not the correct ones for a website A/B test.

If you're designing a drug, you'd better be very careful not to accidentally
approve something that is useless. It would cost a ton of money and lives in
the long run if it were no better than placebo. False rejection of the null (a
false positive) is very bad. False acceptance of the null (a false negative) is
not so bad.

By contrast, if you're doing an A/B test on a website, you're actually not in
bad shape if you accidentally think that a red button is a bit better than a
blue button, assuming that they're pretty close. False rejection of the null
is okay.

However, you are screwed if you miss the chance that a red button gives you
50% more conversions. With websites, false acceptance of the null is very
bad. It's okay to mistakenly think your button is effective, but it's very bad
to mistakenly think that the button is ineffective.

Websites have the opposite cost-benefit calculation from science generally and
shouldn't use the same alpha and beta.
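
To put rough numbers on that trade-off, here is a minimal sketch using the
textbook normal approximation for a two-sided test of two proportions. The
conversion rates (10% vs 12%) and the function name are made up for
illustration, and "beta" here is the false-negative rate, so power is 1 - beta.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_arm(p_a, p_b, alpha, beta):
    """Approximate visitors needed per variant (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided false-positive threshold
    z_beta = z.inv_cdf(1 - beta)        # threshold for power = 1 - beta
    variance = p_a * (1 - p_a) + p_b * (1 - p_b)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_a - p_b) ** 2)


# Hypothetical conversion rates: 10% for blue, 12% for red.
conventional = sample_size_per_arm(0.10, 0.12, alpha=0.05, beta=0.20)
web_leaning = sample_size_per_arm(0.10, 0.12, alpha=0.20, beta=0.05)
print("conventional alpha/beta:", conventional)
print("looser alpha, tighter beta:", web_leaning)
```

Loosening alpha reduces the traffic you need and tightening beta increases it,
so the two settings are a budget you can spend differently depending on which
mistake hurts more.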

~~~
analog31
Perhaps a simple conceptual tool is to treat risk as the product of cost and
likelihood, and to choose an acceptable level of overall risk for type 1 and
type 2 errors. The potential cost of each error then has to be part of the
decision-making process.
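
As a rough sketch of that idea: the costs, the share of ideas that really
work, and the function name below are all hypothetical, chosen only to show
the arithmetic.

```python
def expected_risks(alpha, beta, cost_false_positive, cost_false_negative,
                   p_real_effect):
    """Return (type 1 risk, type 2 risk), each as cost times likelihood."""
    # A false positive can only happen when there is no real effect.
    type1 = cost_false_positive * alpha * (1 - p_real_effect)
    # A false negative can only happen when a real effect exists.
    type2 = cost_false_negative * beta * p_real_effect
    return type1, type2


# Hypothetical website numbers: shipping a useless change costs 1 unit,
# missing a real improvement costs 20 units, and 1 in 10 ideas really works.
type1, type2 = expected_risks(0.05, 0.20, 1, 20, 0.10)
print(f"type 1 risk: {type1:.3f}, type 2 risk: {type2:.3f}")
```

With these made-up inputs the type 2 risk dominates, which would argue for
spending more of the error budget on power than on a strict alpha.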

------
fmela
> [F]or any study that requires sampling ... making sure we have enough data
> to ensure confidence in results is absolutely critical.

Is this necessarily true if you can sample from the population in a fair and
unbiased way?

~~~
learnstats2
> making sure we have enough data to ensure confidence in results is absolutely
> critical.

Yes, it's necessarily true. If your sample is small, you are necessarily
subject to large sampling error, even when the sampling itself is fair.

In essence: the individuals you happened to pick (even fairly) are
overrepresented, and the rest are underrepresented.
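
For a sense of scale, a small sketch of how sampling error shrinks with sample
size for an estimated proportion under fair sampling. The 50% rate and the
sample sizes are just illustrative picks, and 1.96 is the usual
normal-approximation multiplier for a 95% interval.

```python
from math import sqrt


def standard_error(p, n):
    """Standard error of a sample proportion from n independent draws."""
    return sqrt(p * (1 - p) / n)


# Approximate 95% margin of error at a few illustrative sample sizes.
for n in (10, 100, 1_000, 10_000):
    margin = 1.96 * standard_error(0.5, n)
    print(f"n={n:>6}  margin of error about +/- {margin:.3f}")
```

The error falls with the square root of n, so going from 100 to 10,000
observations only shrinks the margin by a factor of ten.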

~~~
fmela
Yes, of course, I can see that if you have an extremely small sample, then the
_resolution_ of your results will suffer. However, I think it's much more
important to ensure unbiased sampling than it is to ensure a large sample
size.

For example, if you sample 1% of the population in a fair and unbiased way,
that would tell you something with a much higher degree of confidence than if
you sampled even 10% of the population in a biased way (or in a way such that
you don't know whether you are biased or not).
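
A quick simulation of that comparison, as a sketch only: the population, the
30% true rate, and the inclusion probabilities are invented, and the bias
modeled here (people with the trait being roughly twice as likely to end up in
the sample) is just one of many ways a sample can be skewed.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 people, 30% of whom have the trait.
population = [1] * 30_000 + [0] * 70_000
true_rate = sum(population) / len(population)

# Fair 1% sample: everyone is equally likely to be picked.
fair_sample = rng.sample(population, 1_000)
fair_estimate = sum(fair_sample) / len(fair_sample)

# Biased sample of roughly 10%: people with the trait are included with
# probability 0.15, everyone else with probability 0.08.
biased_sample = [x for x in population
                 if rng.random() < (0.15 if x else 0.08)]
biased_estimate = sum(biased_sample) / len(biased_sample)

print(f"true rate:       {true_rate:.3f}")
print(f"fair 1% sample:  {fair_estimate:.3f} (n={len(fair_sample)})")
print(f"biased sample:   {biased_estimate:.3f} (n={len(biased_sample)})")
```

The biased sample is about ten times larger, yet its estimate stays far from
the true rate, and collecting more data the same way would not fix that.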

------
pyrocat
I feel like my entire Stats course in college could be summed up with this one
article. Bookmarking this for later reference, thanks!

