
Ways to fix statistics - aqsalose
https://www.nature.com/articles/d41586-017-07522-z
======
basseq
I consider myself analytically-minded, and statistics still gives me a
headache. Most people understand basic relationships: "These numbers are
different, but not meaningfully so" or, "These two elements are more strongly
related than these other two."

The problem is threefold:

1\. _Learning statistics is a tautology._ Meaning: the best way to learn
statistics is to already know statistics. Go google how to do basic
significance calculations in, say, Excel. The _first_ paragraph of the
resulting articles will dive into t-tests and two-tailed tests and P-values
and on down the line. Ask someone how to do a simple comparison, and they'll
digress into null hypotheses and "which test did you decide to run?" You'll
probably see some nasty formulas and opining about which PhD-level approach is
better. It's just opaque. (There's a sketch of such a comparison after this
list.)

2\. _Tools are professional grade._ Software needs to bridge the gap between
users' intent and their ability to do the analysis. Statistics requires R (you
have to know how to program _and_ statistics), SAS ($$$), or another package
with a high barrier to entry. Most people know how to ask the basic question,
but have no idea how to get the answer or interpret the results.

3\. _People who know statistics can't interpret results to laymen._ I see
this a lot in business: really bright folks build segmentation models and
predictive models, and they start talking about cohorts and medians... but
that's where it ends. Your C-suite doesn't care about that: it cares about the
so-what.

~~~
imh
I'm sorry, but this is entirely unfair, especially points 1 and 3.

First, addressing #3:

>People who know statistics can't interpret results to laymen.

This doesn't match my experience. Stats experts can sometimes be terrible at
communicating their work, but no more so than the security expert at
explaining why we have to jump through these hoops, or the web dev explaining
why the site went down. Some technical experts are good at communicating to
lay people, some aren't. Stats is no outlier.

But for learning and using statistics:

> Go google how to do basic significance calculations in, say, Excel. The
> first paragraph of the resulting articles will dive into t-tests and
> two-tailed tests and P-values and on down the line.

This seems to really be saying _Using statistics requires you to understand
statistics._ Which seems sensible.

The big issue is that uncertainty is a different kind of thinking. To put it
at its simplest, if you want a yes or no answer and I keep saying "Well,
probably yes, but maybe no" you're going to be really frustrated that I keep
using the language of uncertainty. "I just want Excel to give me a yes or no
answer, but the articles keep trying to teach me about 'maybe' in the first
paragraph!" Uncertainty involves new kinds of statements. No way around it.
The software and the concepts require it, and there's no (correct) way to
reduce it to "I just want Excel to tell me whether this effect is real." When
you say most people know how to ask the basic question, I disagree. Most
people want to know "Is this effect real, yes or no?" But no matter what tools
you use, that's not enough of a question to have a single correct answer.

"You'll probably see some nasty formulas and opining about which PhD-level
approach is better." That's just untrue. It sounds anti-intellectual and a lot
like the defeatist "I'm just not a math person." The formulas you'll see in
those 'Intro to NHST' tutorials require high school math and are covered in
the first stats class you'd take in undergrad.

Edit: Sorry for the rant. I'm going to leave it up, but I'm just really turned
off by "I consider myself analytically-minded, and statistics still gives me a
headache," followed by 'here are the problems with statistics.' It's not hard,
but you _have to learn the foundation to use it_. Do that before criticizing
how we use it.

~~~
ACow_Adonis
In his defense (and admitting your points are all completely right):

I consider myself analytically minded (I hope I am to some degree, since it's
how I make my living).

Statistics during university also gave me a headache. Still does.

10 years later though: I've now worked for the national statistics body. I put
my head down and said "Screw this, every time I don't get something, I'm going
to try to implement it from first principles until I do. I'm going to code it
up. Every time I don't understand the paradigm/language, I'm going to try to
do everything in that mindset until I do."

I worked for several years in the stats methodology division until I moved on.

I am now of the opinion (admittedly not uncontroversial, but ever increasing
in popularity) that the reason I didn't understand stats in university stems
from:

1) It is taught badly.

2) Often what is taught IS wrong.

3) In practice it is used wrong: significance testing is the most overused
technique I've seen in any field I've had experience with, except perhaps
linear models (another stats baby). In my experience, once one has
internalised the lessons/mindset of variance/sampling/etc rather than
"correct/not-correct", statistical significance is almost always tangential to
the actual analytical question, yet it is often treated as the goal. I think
its popularity is partly due to the fact that it provides a framework for
yes/no decision making, negating the very mindset change needed to properly
understand statistics. Frequentist techniques were the focus in
university-level stats classes, and that framework and its models are often
forced into very non-frequentist situations where a Bayesian/subjective
interpretation is, in my subjective opinion, more rational and justified. My
stats classes had almost no material on computation, logic, falsifiability, or
experiment and data design, which I view as much more important to real
statistical work than rote learning what a regression, least squares, or R^2
measure is.

A great deal of the time, analysts have trouble explaining the methods they
use because it's obvious (to me) that they don't really understand the methods
they're using (if they did, they often wouldn't be using them).

~~~
dr_zoidberg
The more I study (halfway through a tough PhD now), the more I realise there's
an obsession with making hard things harder, sometimes just because you were
taught that way, other times because hey, "it's post-graduate education! It's
supposed to be hard".

And the many times the research group I work with and I have sat down, read
things from the basics upwards, and made an effort to explain everything as
simply as possible while not hiding complexity, it took a lot of work, but the
result was highly praised and moved back into the courses we teach. And
students are happier (or at least not as miserable as they used to be when
hard things were taught "the classical way").

So I agree with (1), and all I can say is that, when in a position of
teaching, it's best to take the time, work the material out from the ground
up, and teach it in the simplest possible way.

------
danieltillett
While there are problems with the way statistics are used by scientists and
others, the elephant in the room is the incentive system: p-hacking and the like
result from the need to get “exciting and publishable” results on a consistent
basis or effectively be fired (the simple path is no “sexy” results -> no
“high quality” papers -> no grant funding -> no job).

A possible solution that I have not seen proposed (this is most likely my
ignorance) is for journals to only accept/reject research proposals, not
finished papers. The journal would publish the resulting paper no matter the
outcome of the research.

~~~
Vinnl
The problem is that journals have incentives too. And these unfortunately
often align: sexy results -> more income.

~~~
danieltillett
This is true, but it is surprisingly weak as the income of a journal is not
too closely related to its impact factor.

My proposal would still mean that Nature would get the most sexy papers
because it would get the most sexy proposals. A bigger problem would be
journals choosing proposals from authors with a history of getting sexy
results rather than on the quality of the proposal. This reliance on "track
record" is why the grant system is so broken.

~~~
Vinnl
Isn't it? I mean, I know there's no hard data on it since most journal
subscriptions are bought in bulk, but the journals are using their impact
factor to sell those subscriptions (case in point: [1]). Likewise, if they
charge publication fees, authors will be most willing to pay large sums of
(their funders') money for journals with high impact factors.

And do you really think Nature would still get sexy results if it has to
choose based on sexy proposals? Or would your proposal (which I do agree with,
for the same reason the traditional publishers won't do it) lead to better
research but fewer "sexy" results? After all, journals with a high impact
factor are also known to have more retractions.

(The traditional) publishers are going to keep doing what is bringing in the
most money - or at least, what they believe will. At this point, they don't
believe that your system would lead to higher income.

What could change that is authors no longer being incentivised to chase
high-impact-factor journals. But that means there will have to be better ways
to evaluate researchers that will actually get adopted (i.e. ways that don't
require evaluators to comb through every applicant's research [2]).

This is a difficult problem. And yes, I realise I'm not coming up with a
solution. The only ideas I have on that front have a really small chance of
success, but it's a bit too much to elaborate on that here. (Although in time
I will write about them at [3].)

[1]
[https://mobile.twitter.com/simoxenham/status/235698457124425...](https://mobile.twitter.com/simoxenham/status/235698457124425728/photo/1)

[2] [https://theconversation.com/why-i-disagree-with-nobel-
laurea...](https://theconversation.com/why-i-disagree-with-nobel-laureates-
when-it-comes-to-career-advice-for-scientists-80079)

[3] [https://medium.com/flockademic](https://medium.com/flockademic)

------
aqsalose
See also further discussion on Andrew Gelman's blog:
[http://andrewgelman.com/2017/11/28/five-ways-fix-
statistics/](http://andrewgelman.com/2017/11/28/five-ways-fix-statistics/)

------
schuetze
Step 1: Become Bayesian.

Seriously though, I am inclined towards approaches, such as pre-registration,
which limit the number of researcher degrees of freedom in analysis. It's not
necessarily that statistics are broken. It's that the system incentivizes
researchers to break the assumptions underlying these statistical tests.

~~~
analog31
What I wonder is, given a system with perverse incentives, won't people find a
way to abuse Bayesian statistics?

~~~
timClicks
Probably.. but with Bayesian methods at your model/data updates its priors,
rather than you effectively embedding your prior beliefs into your models via
selectively choosing tests that support them.

~~~
dnautics
It's very easy to let that slip into post hoc justification of your priors.

~~~
analog31
In my view, "prior" may be a misnomer. There is nothing that I'm aware of in
Bayes' theorem to suggest that you have to formulate your priors _before_
gathering or analyzing your data. I would describe priors as constraints that
are included in an analysis, to narrow the results based on additional
information that you're aware of. Bayes' theorem mainly provides a framework
for computing what happens when you do that.
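
To make that concrete with a toy example (hypothetical numbers, scipy
assumed): with a binomial outcome and a Beta prior, the "constraint" is just
two numbers, and Bayes' theorem combines them with the data mechanically:

    # Sketch of "prior as constraint": binomial data with a conjugate Beta prior.
    # Hypothetical numbers; the update rule is posterior = Beta(a + k, b + n - k).
    from scipy import stats

    k, n = 7, 10  # 7 successes in 10 trials (made up)

    for name, (a, b) in [("flat Beta(1,1)", (1, 1)),
                         ("informative Beta(20,20)", (20, 20))]:
        post = stats.beta(a + k, b + n - k)
        print(f"{name}: posterior mean {post.mean():.2f}, sd {post.std():.2f}")
    # The same data, combined with different constraints, gives different answers;
    # Bayes' theorem just makes that combination explicit.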

~~~
kgwgk
Maybe the prior doesn’t need to be formulated _before_ getting the data, but
it needs to be _independent_ of the data.

------
jamez1
> Worse, NHST is often taken to mean that any data can be used to decide
> between two inverse claims: either ‘an effect’ that posits a relationship
> between, say, a treatment and an outcome (typically the favoured hypothesis)
> or ‘no effect’ (defined as the null hypothesis).

This is the whole problem: the rejection of the null hypothesis is not just a
ritual to follow before you then accept the alternative. Fisher's core idea is
that it only takes one piece of evidence to reject something, while it should
take many to confirm something.

So the only conclusion you should draw is about what you've rejected. You
can't then take that and infer the acceptance of something else. It's a subtle
but important distinction that is the whole reason for this process.

------
rvern
_Likelihood functions, p-values, and the replication crisis_ :
[https://arbital.com/p/likelihoods_not_pvalues/?l=4xx](https://arbital.com/p/likelihoods_not_pvalues/?l=4xx).

------
MichailP
Is there a calculus-based approach to statistics? Like understanding linear
regression simply as a minimization problem, but ALL THE WAY, i.e. extending
to other statistical techniques?
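
For example, a toy sketch of the minimization view I mean (numpy/scipy
assumed, made-up data):

    # Linear regression as nothing more than minimising squared error.
    import numpy as np
    from scipy.optimize import minimize

    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

    def sse(params):
        slope, intercept = params
        return np.sum((y - (slope * x + intercept)) ** 2)

    fit = minimize(sse, x0=[0.0, 0.0])
    print("slope, intercept:", fit.x)            # roughly 1.96 and 0.14
    print("closed form:", np.polyfit(x, y, 1))   # same answer from the usual formula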

~~~
lliiffee
To build on fny's answer, there is one school of statistics (Bayesian
statistics) where there basically is a "right" way to do analysis for any
problem, provided you make the necessary assumptions (likelihood and prior)
correctly. However, the most common statistical concepts (e.g. p-values or
confidence intervals) are not in the Bayesian school.

~~~
mattkrause
Credible intervals or highest posterior density intervals are arguably much
closer to what most people _think_ a confidence interval is.

I’m not sure there is a single right way to solve any given problem in a
Bayesian way, but it does force you to think more about the problem at hand
and make your assumptions explicit.
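
To illustrate the difference (a toy sketch, hypothetical data, scipy assumed):

    # Confidence vs. credible interval for a proportion: 7 successes in 10 trials.
    import math
    from scipy import stats

    k, n = 7, 10
    p_hat = k / n

    # Frequentist 95% CI (normal approximation): a statement about the procedure
    # (95% of intervals built this way would cover the true rate).
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    ci = (p_hat - 1.96 * se, p_hat + 1.96 * se)

    # Bayesian 95% credible interval with a flat Beta(1,1) prior: a statement
    # about the parameter itself, which is closer to what most people think a
    # confidence interval is saying.
    cred = stats.beta(1 + k, 1 + n - k).interval(0.95)

    print(f"confidence interval: ({ci[0]:.2f}, {ci[1]:.2f})")
    print(f"credible interval:   ({cred[0]:.2f}, {cred[1]:.2f})")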

------
stewbrew
I think this is an interesting collection because most proposals acknowledge
that statistics cannot be fixed by means of statistics alone but rather by
taking into account the human factor. Bayesian statistics won't keep humans,
who want to achieve something that's not primarily about statistics, from
doing stupid things.

------
jpfed
One can imagine workflows that combine some of these solutions. Imagine
pre-registering experiments and analyses... and then having fellow scientists
privately weigh in with their predictions for the results. These predictions
can then be used to form a prior for a false-positive-risk calculation.
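
A rough sketch of what that could look like (all numbers hypothetical; the
false-positive-risk arithmetic here is one common way to frame it, not the
only one):

    # Turn colleagues' predictions into a prior, then ask: if the pre-registered
    # analysis comes back "significant", how likely is that to be a false positive?
    predictions = [0.2, 0.35, 0.1, 0.25]      # each scientist's P(effect is real)
    prior_real = sum(predictions) / len(predictions)

    alpha = 0.05   # significance threshold of the pre-registered test
    power = 0.80   # assumed power against the expected effect size

    p_sig_given_null = alpha * (1 - prior_real)
    p_sig_given_real = power * prior_real
    false_positive_risk = p_sig_given_null / (p_sig_given_null + p_sig_given_real)

    print(f"pooled prior that the effect is real: {prior_real:.2f}")
    print(f"false-positive risk of a 'significant' result: {false_positive_risk:.2f}")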

------
aisofteng
>We need more observational studies and randomized trials — more epidemiology
on how people collect, manipulate, analyse, communicate and consume data.

I don't understand what is meant by "epidemiology" in this sentence.

