
Reproducibility should be at science’s heart. It isn’t, but that may soon change - martincmartin
http://www.economist.com/news/science-and-technology/21690020-reproducibility-should-be-sciences-heart-it-isnt-may-soon
======
mtrn
At my day job I work a lot with data, from a multitude of sources. It amazes
me how hard it is to build data products (you could think of it as an analog
to scientific experiments) and know _exactly_ what parts of the processing
pipeline influenced what parts in the result.
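
One lightweight way to get that traceability is to tag every derived product with a hash of all its inputs plus the code version that produced it, so any upstream change shows up in the tag. A minimal sketch in Python (file names and version string are made up):

```python
import hashlib

def provenance_tag(inputs: dict, code_version: str) -> str:
    """Hash every input artifact and the code version into one ID,
    so any change upstream changes the tag on the derived product."""
    h = hashlib.sha256()
    h.update(code_version.encode())
    for name in sorted(inputs):          # sorted for a deterministic tag
        h.update(name.encode())
        h.update(hashlib.sha256(inputs[name]).digest())
    return h.hexdigest()[:16]

raw = {"sales.csv": b"id,amount\n1,10\n", "rates.json": b'{"eur": 1.1}'}
tag = provenance_tag(raw, code_version="etl-v2.3")
print(tag)  # changes whenever any input byte or the code version changes
```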

On the other hand: when I hear that many scientists use Excel for complicated
processing chains and are sometimes hardly able to reproduce anything
themselves a few months after the paper was published, I believe we have hit a
new low point in scientific activity. Combined with an infrastructure focused
on research _impact_ and _number of papers_ published, this is a really sad
situation.

~~~
x5n1
Real science is random, see No Method. When you make it a career with
requirements and expectations, you set this sort of process in motion. Still,
even through this process, if enough people are doing it, there is bound to be
some breakthrough science. But most people are wasting their time cooking data
to advance their careers rather than making progress in science.

~~~
tdaltonc
I assume you're referring to "Against Method" by Feyerabend?

His argument is absolutely not that science is "random". It's that the methods
of science cannot be predicted, prescribed, or universal, even over very short
time scales and between similar fields.

Even if Feyerabend is right [0], it's still perfectly reasonable to expect
studies to be repeatable.

[0] I'm sure he is but I think that the degree and practical significance of
his observation is questionable.

------
ar-jan
The article uses both the terms reproduce and replicate, but doesn't go into
the difference, and seems to use them with just the opposite of the meanings
they've had in academic use - roughly speaking:

Reproduce - to verify a study's results by re-analyzing the same data.

Replicate - to re-do the whole experiment.

Though the terminology is not quite intuitive, and there are other relevant
distinctions to make. See
[http://languagelog.ldc.upenn.edu/nll/?p=21956](http://languagelog.ldc.upenn.edu/nll/?p=21956)
for some discussion of the history.

~~~
pc2g4d
I had never heard of this distinction, and I also dislike it. This definition
of "reproduce" seems to fly in the face of the word's more general meaning.
"Verify" and "re-analyze" seem like a more natural fit.

Yes, we need a way of describing those two different things. But using
"reproduce" and "replicate" (which, as the article showed by its actual usage,
tend to be seen as synonymous) does not seem like a good way to go.

~~~
arcanus
> which, as the article showed by its actual usage, tend to be seen as
> synonymous

In the sciences, language often takes on an entirely different meaning than in
popular lexicon. This is one of the reasons I cringe every time I hear someone
state, 'It is just a theory'.

> "Verify" and "re-analyze" seem like a more natural fit.

At least in my field (computational science), verification already has a very
specific meaning. Overloading terms is avoided, although that also happens.

In terms of successful communication between scientists, it is far more
important for everyone to agree on the meaning of a term than which particular
word is used. But perhaps this also accounts for difficulties communicating
important results to non-specialists...

~~~
pc2g4d
Possible compromise: let the scientists have their terminology, and let
general audience publications like the Economist have theirs.

Mostly I just don't like the idea that the Economist article's usage is
"wrong" just because some scientists use different terminology. Different
spheres of interest will always be tripping over each other's terms when they
come into contact; too many pixels have already been spilled trying to remedy
this inevitable occurrence (mine included!).

~~~
collyw
"Hacker" being a perfect example of this in the tech world.

------
jonawesomegreen
I think reproducibility is one of the biggest challenges facing science in our
time. Luckily we already have the tools we need to solve the problem. Of
course paper journals don't want to waste space on reproduced results, but no
such constraints apply to online publications. And while moving publications
online we can also solve the issue of journal publishers owning the copyright
on papers written with public money. The big question that's left is trust: a
lot of the journals that have gone online are essentially scams, publishing
whatever you want for a fee. We still need the peer review step to close the
gap.

~~~
dagw
The big problem isn't that journals don't want to "waste" space publishing
reproduced results, but that scientists don't want to "waste" their time
reproducing other people's results. Unless it is a once-a-decade
game-changing result for your field, there is no incentive for reproduction.
You cannot change people's behaviour without changing the underlying incentive
structure. Do that and everything else will trivially fall into place.

~~~
digi_owl
Nah, I think the bigger problem is the MBAs walking around trying to apply
widget-factory metrics to academia, healthcare, and other hard-to-measure
environments.

This in turn has led to the whole "publish or perish" environment, as the
MBAs use published articles as a replacement for widgets made and citations as
sales.

------
daphreak
The Planet Money podcast #677 (The Experiment Experiment) discussed an effort
to reproduce some experiments.

One of the methods they discussed, to both increase reproducibility and reduce
experimenter bias, was to register the experimental procedure and hypothesis
with the journal before performing the experiment. It's been a while since I
listened, but I think one or more journals support this workflow.

I'm glad we are working towards a better scientific process. These days
sensationalism scores more grant money and Scientific American articles. We
need incentives to improve our body of knowledge, not just make headlines.

------
hackuser
> An analysis of 98 psychology papers, published in 2015 by 90 teams of
> researchers co-ordinated by Brian Nosek of the University of Virginia,
> managed to replicate satisfactorily the results of only 39% of the studies
> investigated.

The Economist is overstating the results a bit. From the coverage at the time:

 _Strictly on the basis of significance — a statistical measure of how likely
it is that a result did not occur by chance — 35 of the studies held up, and
62 did not. (Three were excluded because their significance was not clear.)
The overall “effect size,” a measure of the strength of a finding, dropped by
about half across all of the studies. Yet very few of the redone studies
contradicted the original ones; their results were simply weaker._

More here:

[https://news.ycombinator.com/item?id=10132993](https://news.ycombinator.com/item?id=10132993)
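
The "weaker, not contradicted" pattern is consistent with replications of a
smaller true effect simply being underpowered. A rough normal-approximation
sketch (the sample size and effect sizes here are illustrative, not taken from
the study):

```python
import math

def power_two_sample(d, n, alpha_z=1.96):
    """Approximate power of a two-sided, two-sample test at the 5% level:
    under the alternative, the test statistic is roughly Normal(d*sqrt(n/2), 1)."""
    shift = d * math.sqrt(n / 2)
    # P(Z > alpha_z - shift) for a standard normal Z, via the complementary error function
    return 0.5 * math.erfc((alpha_z - shift) / math.sqrt(2))

# A study designed for ~80% power at effect size d = 0.5 (n = 64 per group)...
print(round(power_two_sample(0.5, 64), 2))   # ≈ 0.81
# ...replicated when the true effect is only half as large:
print(round(power_two_sample(0.25, 64), 2))  # ≈ 0.29
```

With the same design and a halved true effect, most replications would miss
significance even though nothing in the original result is contradicted.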

------
singularity2001
That's why I love [http://gitxiv.com](http://gitxiv.com) : arxiv.org papers
with open source GitHub archives.

------
Marius_B
"Reproduction" and "replication" are not precise enough words. The point is
that access to the data, the code, the procedures, and so on is the rigorous
path to independent validation.

The best part of the scientific method is that any scientific contribution has
to pass the filter of independent validation. It has now become technically
possible for the authors of research to share everything they have in order to
facilitate independent validation.

This is very good news for researchers, because it is much more feasible for
them to open up their results than to wait for the publishing system to
recognize that we are in the 21st century.

------
EGreg
Besides reproducibility, you also need to make testable predictions for it to
be science. I wonder what the testable predictions would be for the claim that
all (or almost all?) evolution proceeds entirely by random mutation and
natural selection. If this isn't true, then a lot of the stuff in evolutionary
psychology and other "just so" explanations may be completely unscientific.
Exemplified by crap like this:
[https://www.theguardian.com/science/2007/aug/25/genderissues](https://www.theguardian.com/science/2007/aug/25/genderissues)

~~~
tokenadult
If you are looking for solid evidence that current living things originated by
common descent and speciation through evolution by natural selection, that
evidence is abundant.[1] Some of the evidence backing up what is called
"evolutionary psychology"[2] is much more debated, and much more problematic,
and other psychologists are quick to criticize many evolutionary psychology
publications.[3]

[1]
[http://www.talkorigins.org/faqs/comdesc/](http://www.talkorigins.org/faqs/comdesc/)

[2] [http://www.cep.ucsb.edu/primer.html](http://www.cep.ucsb.edu/primer.html)

[3]
[http://www.larspenke.eu/en/publications.html](http://www.larspenke.eu/en/publications.html)

~~~
EGreg
Those are two different things: the theory of common descent has a lot of
evidence in its favor. On the other hand, the claim that all features observed
today are solely the result of random mutation and natural selection is rather
untested. It doesn't make many testable predictions. The mathematical issues
with the theory (wherein, on its own terms, the chances of it being true are
infinitesimal) are rarely addressed in mathematical terms; instead most
responses are of the type "well, X1 feature _might have_ helped Y animal
survive and reproduce better until it led to X2 feature". If you look at the
mass of published literature that mentions evolutionary explanations, it is
all "just-so" stories that already assume the theory is true.

I am talking about the following criterion in order to qualify as a real
scientifically tested theory:

[http://www.stephenjaygould.org/ctrl/popper_falsification.htm...](http://www.stephenjaygould.org/ctrl/popper_falsification.html)

According to this criterion, the theory that random mutation and natural
selection alone produced all the genotypes we see is like Marxism and Freud's
psychological theories: if we assume they are true, we can explain everything
with just-so explanations. But that's not falsifiable; it doesn't make
testable predictions. NOTE: this is _different_ from saying all animals had
common ancestors. It says we know how the changes happened that led to the
current features. The fact is, we don't know. We have barely scratched the
surface. We are still at stages comparable to the luminiferous ether theory in
physics or the humors in medicine. I think in 100 years people will look at
our theories of random mutation and natural selection (whether classical,
punctuated equilibria, etc.) and consider us naive.

And yet we have proponents like Dawkins yelling from the rooftops that
"evolution is as proven as gravity". That's conflating the word "evolution"
with common descent. There is also a lot of pressure and political interest
from the demand for a naturalistic explanation, similar to how there is demand
from the religious camp for "the Institute for Creation Research". It affects
who gets research funding and who gets published. Science is a human endeavor
and it is affected by political and organizational pressures just like
everything else. But when something is testable and is being tested, it's
obvious. Here nothing is obvious - even the math is dubious. These theories
are just riding on the coattails of the theory of common descent because we
don't have better naturalistic explanations at the moment. But just because a
theory is the best we have doesn't mean it's true. Overstating it is more
advocacy than science.

~~~
ejk314
That hypothesis isn't falsifiable. We know "X affects Y" from experimental
results. You're asking for proof that "X and only X affects Y", which is the
same as asking to show "there does not exist some X' that affects Y", where
the set of X' is unbounded. No, that's not testable, and it doesn't need to
be.

Just as we didn't need to know about the X' of Einstein's equations to know
that X, gravity, was the primary force holding you to the ground.

There very well may be an X' that affects observable biological features other
than evolution, but each such X' needs to be tested on an individual basis. So
while it's productive to hypothesize an X' and test it, there's not much we
can do to rule out all X' as a class.

~~~
EGreg
So if you can't rule them out, your theory shouldn't postulate that they don't
exist. It's one thing to do 100,000 experiments and have the results
COMPLETELY predicted by Einstein's equations beforehand. It's quite another to
invent just-so stories in terms of the theory itself after each observation!

------
eleitl
The delicious irony is that this article, published in The Economist, makes no
mention of rampant irreproducibility in the field of economics. Honi
soit...

~~~
Mithaldu
Is that because those studies are inherently wishy-washy, or as with many
other scientific studies, the processes aren't actually detailed enough, or
the code itself is missing?

------
MollyR
It depends on the science as well. I don't think it's fair to group biological
studies or clinical trials with psychology.

Also, a lot of biological labs do try to reproduce experiments to further
their own research. They just don't publish negative results.

------
godzillabrennus
This is what we are working on at [http://www.myire.com](http://www.myire.com)

------
hackuser
I think the most interesting question is, given how flawed the scientific
method is in practice, how does science get such great results?

