

Frequentist Statistics are Frequently Subjective - yummyfajitas
http://lesswrong.com/lw/1gc/frequentist_statistics_are_frequently_subjective/

======
btilly
The real problem with statistics is that people want an answer to an
impossible problem. Namely they want to be told the probability of the world
being a particular way. But anyone familiar with conditional probability can
easily see that there is no way to come up with that answer, because the
conditional probability of something _after_ an observation depends on the
assumed probability _before_ the observation.

There are multiple approaches to this problem. What frequentist statistics
does is answer a _different_ question. Namely, "What is the probability of
getting a result this strongly against the null hypothesis if the null
hypothesis is true?" This is appealing in that it is an objective probability
that seems to say something about the problem under discussion. However people
consistently read it as, "What is the probability that the null hypothesis is
true?", which is simply wrong.

There is a second problem with frequentist statistics, which is what this
article is about: the objective-looking probability you get depends on the
null hypothesis chosen in ways it shouldn't. Basic familiarity
with Bayes' Theorem and conditional probability demonstrates that it is
impossible for it to matter whether your intention was to flip a coin 6 times,
flip it until you get both heads and tails, or flip it until you get heads.
This factor cannot affect the conditional probability of the coin being
biased. But those three different intentions translate into 3 different
calculations, and 3 different answers in frequentist statistics.
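
Concretely, here is a sketch (Python) of how the same data - the sequence
TTTTTH from the article - gets different p-values under two of those
intentions, assuming a fair-coin null:

```python
from math import comb

# Intention 1: "flip exactly 6 times"; test statistic = number of tails.
# p-value = P(5 or more tails in 6 fair flips)
p_fixed_n = sum(comb(6, k) for k in (5, 6)) / 2 ** 6

# Intention 2: "flip until the first head"; test statistic = flips needed.
# p-value = P(first head arrives on flip 6 or later) = P(5 tails in a row)
p_until_head = (1 / 2) ** 5

print(round(p_fixed_n, 3), round(p_until_head, 3))  # 0.109 0.031
```

Same observation, two p-values (roughly 11% versus 3%), purely because of the
experimenter's intention.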

So that's the frequentist approach. What is the Bayesian approach? It is to
come up with statistics that inform us on how prior beliefs before observation
turn into posterior beliefs after observation. The advantage of this method is
that it is intellectually honest. The disadvantages are that it is complicated
and people notice that it is confusing. (The frequentist approach is confusing
as well, but people don't notice their confusion. Instead they confidently draw
the wrong conclusion that the null hypothesis has been proven.)

There are other approaches. The article touched on my favorite when it pointed
out that we should report likelihood ratios rather than probabilities. This is
absolutely right. The effect of an observation is to modify our prior beliefs,
and likelihood ratios concisely describe how we should modify them. Plus they
have the great ability to stack - you can take 3 experiments and combine their
likelihood ratios to come up with the likelihood ratio for having seen all
three experiments.
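
That stacking property is easy to check. A minimal sketch (Python; the 75%
alternative hypothesis and the three experiments are made up for
illustration):

```python
# Likelihood ratio of H1 ("coin lands tails 75% of the time") against
# H0 ("coin is fair") for one experiment of (tails, heads) counts.
def likelihood_ratio(tails, heads, p_tails=0.75):
    return (p_tails ** tails * (1 - p_tails) ** heads) / 0.5 ** (tails + heads)

# Three independent experiments: their ratios combine by multiplication...
experiments = [(5, 1), (3, 1), (7, 2)]
combined = 1.0
for tails, heads in experiments:
    combined *= likelihood_ratio(tails, heads)

# ...and match the ratio for all the data pooled into one big experiment.
pooled = likelihood_ratio(sum(t for t, _ in experiments),
                          sum(h for _, h in experiments))
assert abs(combined - pooled) < 1e-9
```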

Unfortunately, though, everyone knows frequentist methods, people accept them,
and it is very hard to get people to see what is wrong with them. So alternate
approaches, though theoretically superior, face an uphill battle towards
acceptance.

~~~
tel
The difficulty I tend to encounter in promoting Bayesian statistics among
scientists is the "sudden" appearance of a statistical model. Too many people
go one step further than believing frequentist statistics answer "What is
the probability that H0 is true?": they skip over the analysis entirely and
believe that frequentist methods tell you, simply, accurately, objectively,
whether an experimental treatment is "significant".

If you press people on what they believe "significant" means, it gets ugly
fast, but significance is generally held to be a good thing and definitely
necessary to publish. If you don't get significance it's just because you need
a bigger n. If you can think of some factors or covariates then you really
need to use ANOVA.

Suggesting that there is anything more complex involved in looking at data and
deciding what it means is practically unthinkable.

Likelihood ratios are definitely nice, though. I've sort of gotten people to
think about it at a high level by talking about "information flows" and log
likelihood values.

~~~
Eliezer
A separate problem, not dealt with in that particular essay, is the quite
hideous degree to which average scientists don't understand the statistics
they use.

I would put a good deal of the blame for this squarely on frequentism as well.
Bayesianism isn't hard to understand; it just takes an effort by the teacher
to explain it well - I've made certain notable efforts in that direction
myself. Once you do get it, you get it.

~~~
tel
I think, roughly, the blame goes out to Fisher and anyone else who promoted
the "Recipe for Understanding the World" style of statistics. It's not that
people are being blocked by their understanding of complex frequentist methods
but instead the idea that they don't need to understand anything more because
statistics is just a black box you use for verification.

Insert results, get a green or red "significance" light, move on.

------
hamilton
This debate has been raging for 250 years. Brad Efron, one of the greatest
living statisticians today and one of the few who can transcend the debate,
has VERY interesting things to say about it. He believes that Empirical Bayes,
or using some of your data to model your prior, will win the day.

<http://www-stat.stanford.edu/~ckirby/brad/papers/2005NEWModernScience.pdf>

~~~
nova
> or using some of your data to model your prior

I think that's kinda missing the point.

~~~
hamilton
I beg to differ (not sure if you actually read the Efron talk I just posted).
I get the thrust of the Less Wrong article, though I think that frankly his
language and vitriol are direly misplaced. He's railing against some class of
Frequentists who attack Bayesians. This is sort of a Fox News tactic; pick
someone, in this case, a commenter on a blog, and generalize their comment to
the entire population. It's fairly unstatistical, if you ask me. I don't get
why anyone would do this. A good Frequentist is not a lost idiot, nor does she
rail against Bayesians. In fact, a good Frequentist understands the beauty of
Bayesian methods, too. I'd even wager that a good Bayesian understands why
Frequentist methods have dominated the 20th century.

I'm attempting to realize that rare event in these vitriolic philosophical
academic debates - add a bit of middle-groundness. In this case, the middle
ground of Empirical Bayes has proven to be very, very useful in large-scale
simultaneous inference problems. It's proof that there IS such a thing as
convergence in this debate. Taking the shrill tone the author takes as a
signal of how little regard the two sides have for each other, it's obvious
that we need some convergence.

~~~
nova
> it's obvious that we need some convergence.

Why? This isn't politics. Maybe one approach IS better than the other. Or
maybe there are better methods we don't know.

In fact I think most Bayesians would consider Frequentist methods either plain
"wrong" (like, at best works in special cases) or as approximations to the
full Bayesian way.

And that doesn't mean they are useless, because going fully Bayesian normally
requires "a lot" of computation, or maybe we have issues with elicitation (I
suspect this is the biggest hurdle for frequentists). But Bayes is still the
gold standard.

------
smanek
To my (limited) understanding, Bayesian statistics are just as subjective. If
you and I start with different priors, and are then fed the same evidence,
we can end up with vastly different conclusions.

Since priors are defined as a function of my state of mind (or state of
ignorance, as it were), it seems pretty subjective that the 'result' should
depend so heavily on my initial state of mind.

E.T. Jaynes' book has an example of this with regard to E.S.P. research (I
believe <http://omega.albany.edu:8008/ETJ-PS/cc5d.ps> is the relevant chapter
- but it's been a year or two since I've read the book).

A real statistician should feel free to correct me though - I'm more of an
algebra guy ...

~~~
btilly
Your understanding is incomplete. Bayesian statistics are not subjective at
all. Instead they objectively describe the correlation between prior belief
and conclusion. People may draw different conclusions for subjective reasons,
but that subjectivity is in the people, not the statistics.

Now it seems you are objecting to the fact that your initial state of mind
affects your conclusion. But it is unavoidable: there is no way to make sense
of an observation except through the lens of prior beliefs. You can be
explicit about it and draw conclusions with a correct methodology about it as
in Bayesian statistics. You can be implicit about it and draw conclusions with
an incorrect methodology as in frequentist statistics. (See my other post in
this thread for an explanation of why frequentist methodology is incorrect.)

Let me offer a simple example. Suppose a pregnant woman you know gives birth
to a boy. Is that evidence that babies are more likely to be boys than girls?
Obviously it is. Is it strong evidence? Obviously not. Should we upon
observing that conclude that boys are more likely than girls? Obviously not.

Now suppose that you have no prior knowledge other than that we see roughly
similar numbers of boys and girls, while I know that of the last 100,000
babies born in the USA, 51,157 were boys. Suppose we are both told that at the
local hospital, 92 of the last 200 babies born were boys. I submit that we
both will _and should_ wind up with different conclusions about the relative
likelihood of boys and girls. Why? Because different prior knowledge leads to
different prior beliefs,
and those prior beliefs when modified by identical evidence lead to different
conclusions.
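
To make that concrete, here is a sketch of the update with a Beta-Binomial
model (Python; encoding the two states of knowledge as Beta priors, and the
particular prior strengths, are my assumptions):

```python
# Conjugate update: a Beta(a, b) prior over P(boy), plus data of
# (boys, girls) births, gives a Beta(a + boys, b + girls) posterior.
def posterior_mean(a, b, boys, girls):
    return (a + boys) / (a + b + boys + girls)

hospital = (92, 108)  # 92 boys among the last 200 local births

# You: only the vague knowledge "about similar numbers of boys and girls",
# encoded as a weak Beta(10, 10) prior.
you = posterior_mean(10, 10, *hospital)

# Me: 51,157 boys in the last 100,000 US births, encoded as a strong
# Beta(51157, 48843) prior.
me = posterior_mean(51157, 48843, *hospital)

print(round(you, 3), round(me, 3))  # 0.464 0.511
```

Identical evidence, different (and correctly different) conclusions.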

~~~
smanek
You have to admit that your example is pretty contrived though - usually
setting priors isn't so clean.

How do you 'properly' set the prior probability that someone really has
E.S.P.? That a researcher is secretly colluding with a test subject? That a
researcher is flat out fabricating their results? That this is just an
instance of the Clever Hans effect? Or a thousand other possible hypotheses ...

~~~
btilly
I picked my example so that the proper influence of the prior would be clear.
You are absolutely right that there are many cases where different people with
equivalent information have different beliefs. But being faced with
complications like this makes correct reasoning _more_ important, and not
less.

When you try to sweep the proper influence of the prior under the carpet,
people will manipulate their statistics to draw the conclusions that they
want. Worse still, if each is using subjective techniques that they believe
are objective, the unexamined assumption will lead to them talking past each
other.

By contrast, with correct reasoning you can show each side why they continue
to believe what they believe, while showing them why the other side is not
going to have its opinions changed. In my experience this opens up a bigger
possibility of useful dialog and changed opinions.

------
mattmcknight
I get lost in the argument with these simple coin flipping examples. The way
it's presented is so confusing for the weird second case of how many flips it
takes to get a head.

In the first case, the probability of getting "five tails or more" from a fair
coin is 11%, while in the second case, the probability of a fair coin
requiring "at least five tails before seeing one heads" is 3%.

I didn't check the numbers here, but this seems perfectly reasonable to me in
terms of hypothesis testing. He's playing on the fact that for cases like
THXXXX you would get a result of 2 in the second experiment. You are taking
the same results, but applying a different metric to them (position of first
head versus count of tails). Of course p will be different.

I understand that the value chosen for significance of p is subjective, but
the experiment itself makes perfect sense to me.

~~~
btilly
The problem is that the conditional probability of the coin being biased,
given the observation of TTTTTH, absolutely does not and cannot depend on
whether you intended to flip until you saw a head, or were going to flip 6
times. Therefore you've got a subjective factor influencing your conclusion
that probability theory says should _not_ be a factor if you are reasoning
consistently.
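
A sketch of that consistency claim (Python; the flat grid prior over the
coin's tail-probability is an assumption for illustration). The likelihoods of
TTTTTH under the two designs differ only by a constant factor, so after
normalization the posterior is identical:

```python
from math import comb

def posterior(likelihood, grid):
    # Flat prior over the grid: the posterior is the normalized likelihood.
    weights = [likelihood(p) for p in grid]
    total = sum(weights)
    return [w / total for w in weights]

grid = [i / 100 for i in range(1, 100)]  # candidate values for P(tails)

# "Flip exactly 6 times" (binomial count of tails) versus "flip until the
# first head" (geometric): the likelihoods differ only by comb(6, 5),
# which cancels when the posterior is normalized.
lik_fixed_n = lambda p: comb(6, 5) * p ** 5 * (1 - p)
lik_until_head = lambda p: p ** 5 * (1 - p)

post_a = posterior(lik_fixed_n, grid)
post_b = posterior(lik_until_head, grid)
assert all(abs(a - b) < 1e-12 for a, b in zip(post_a, post_b))
```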

------
gort
To me, more interesting than Bayesianism versus Frequentism was the
observation that scientific procedure as it currently stands allows a field
like Parapsychology to survive.

