
Prestigious Science Journals Struggle to Reach Even Average Reliability - pjf
https://www.frontiersin.org/articles/10.3389/fnhum.2018.00037/full
======
Vinnl
Unfortunately, the main reason these journals get used is not that they are
better at selecting for reliable work, but that they provide authors with the
credentials needed to advance their careers...

 _Edit:_ See also
[https://medium.com/flockademic/the-ridiculous-number-that-can-make-or-break-academic-careers-704e00ae070a](https://medium.com/flockademic/the-ridiculous-number-that-can-make-or-break-academic-careers-704e00ae070a)

~~~
jack6e
But the reason they are considered qualified to serve as credentialing
authorities is, in part, due to their perceived reliability. Of course the
significance of published work and its influence on the field is perhaps the
primary driver of a journal's reputation for readers, but significant work can
only be influential if it is also reliable. Were a journal's readership to
realize that its publications could not be trusted, it would lose influence
and thus the perception that having work published in it is a meaningful
professional achievement.

~~~
Vinnl
Is it, though? Are researchers even aware when previous research has been
retracted? Retracted research often continues to amass new citations (which,
of course, are not necessarily positive).

Although I would certainly not assert that research in the "top" journals is
particularly unreliable, it is not unbelievable that, since they tend to
publish more extraordinary results, a smaller percentage of those results
turns out to be true. I have not seen any indication that this leads to such
journals losing influence, and it certainly doesn't lead to the Impact Factor
being lowered - which often drives the "perception that having work published
in it is a meaningful professional achievement".

~~~
2020-3030
The Retraction Watch website
[http://retractionwatch.com/](http://retractionwatch.com/) is one good way of
highlighting retracted work. For researchers in any given field, keeping
current on the literature is hard work given the pace of publication, but it
does include seeing retractions.

So, the answer to your question depends on which researchers we mean. Experts
in a field have the best chance of noticing relevant retractions in their own
field. If you are a researcher in one field with interests in many other
fields, it is probably much less likely that you will be able to follow the
retractions in those peripheral fields. That is why Retraction Watch and other
mechanisms for highlighting such errors are useful.

------
lokopodium
It looks like the modern scientific framework just doesn't scale well. With
paper output so great, it's nearly impossible for a researcher to sift through
the haystack. Researchers resort to reading papers from known groups only,
which filters out most of the fluff but perhaps leaves meaningful work from
outsiders unnoticed.

~~~
mncharity
> reading the papers from known groups only

 _Known_ groups. Community gossip provides sometimes-essential context: "You
can't trust person X on topic Y, because when he does Z, he sees what he wants
to see." Not having access to that context is another challenge faced by folks
who are less connected.

------
truculation
There are a huge number of bureaucratic hoops to jump through before you can
get an article published. The flow chart is huge and gets more elaborate year
by year. So if the quality is _still_ poor then I think this simply means that
we're doing it wrong. Either science isn't what we think it is or our heart
isn't in it (or both).

My 'method' of choosing what to read is simple: I follow people whose work I
know and enjoy. I trust them. Actually, I think that's pretty standard. For
example, Leonardo gets published just for having left stuff lying around the
place.

~~~
Retric
I think the expectations for novelty may be too high.

Look at the list of new drugs approved every year and you see a lot of churn,
with minor variations common but less novelty than many assume:
[https://www.centerwatch.com/drug-information/fda-approved-drugs/year/2017](https://www.centerwatch.com/drug-information/fda-approved-drugs/year/2017).
And this is with $100+ billion in public and private money worldwide per
year.

Sure, there is plenty of stuff being discovered, but we can't expect a steady
or increasing rate of progress across all areas.

~~~
aaavl2821
The relationship between public research funding and drug approvals is
actually pretty limited. The bottleneck is funding for translational research,
i.e., the space after public funding but before the $200B+ in big pharma R&D
kicks in. This area is called the "valley of death" and is where most science
dies.

Venture funding is what gets drugs through the valley of death. There's a
pretty good correlation between the types of drugs VCs fund, the drugs big
pharma companies buy from VC-backed firms, and the drugs that get approved.
The kinds of drugs that get approved are less correlated with the areas of
focus for public research spending or with the diseases that have the most
societal impact.

Wrote something on this last week:
[http://newbio.tech/blog/vc_basics_1.html](http://newbio.tech/blog/vc_basics_1.html)

~~~
Retric
Pharma research is not limited to the US. The FDA will approve drugs
developed in other countries; though there will be some hoops involved, they
are tiny relative to the effort of finding something useful.

Anyway, as you say: "Randomly picking 24 early-stage drugs to develop is a
losing bet." What I think you are missing is that VCs can extract money from
failed drugs. VCs are playing with other people's money, and they get a larger
chunk of the upside while limiting their downside.

This means investing in VC funds has poor returns on average, which is why the
system looks the way it does.

PS: Trying to fit that decreasing curve when the actual trend line is clearly
positive is ridiculous.

~~~
aaavl2821
Agreed that pharma research is not limited to the US, but neither is VC
investing. US VC firms account for the majority of global biopharma VC
investment (though China is catching up). As it stands now, most FDA-approved
drugs are developed by US, European, or Japanese firms, and in 5-10 years
China will be up there as well. The US is the most profitable drug market, so
as a quick and dirty analysis I think using FDA approvals, VC funding, and big
pharma M&A data makes sense. Not perfect, but it's a decent heuristic.

VCs do make money on management fees, and VC returns in biotech from
2000-2012 were poor, but biopharma VC has done insanely well over the last
five years: better returns than software VC, and more IPOs and big M&A exits
than tech despite accounting for only 20% of venture funding.

A lot of money is now chasing these high returns, but there are not enough
good biopharma entrepreneurs. So there is more money but not more good
startups.

And yes, that chart about R&D spending is not statistically valid :), but it
isn't meant to be. It illustrates a trend: big pharma is cutting back on R&D,
especially the early-stage research it outsources to startups. This topic is
tangential to the post, but the article that was the source of that chart has
good context around pharma R&D trends.

------
jimhefferon
This must be mistaken?

> While the number of scientists has been growing exponentially over the last
> decades, the number of journals with a large audience has not kept up,
> neither has the number of articles published per journal. Consequently,
> rejection rates at the most prestigious journals has fallen below 10%

Do they mean acceptance rates?

~~~
Vinnl
Yes, the author says so here:
[https://twitter.com/brembs/status/966046745854730241](https://twitter.com/brembs/status/966046745854730241)

------
tw1010
"Struggle" would imply they're actually trying. People do what is
incentivized, and as long as reliability is placed toward the bottom of the
priority ladder, it won't improve.

------
SagelyGuru
When you read between the lines of these "productivity" academic career
policies and measures, you realise that they are aimed squarely at fame rather
than quality of science.

I guess the argument goes something like this: when you gain fame through
your high-profile journal publications, we will be able to attract more grants
and more fee-paying bums on seats.

Who cares about what you have actually written? Under the current model,
nobody really.

------
aaavl2821
The article doesn't spend much time on one of the major reasons why
reliability is so low: there are too many confounding environmental variables
to control for, so work done in one lab often doesn't generalize.

One seemingly standard, innocuous piece of equipment can make immune cells
"aggressive", while a different brand can make them tolerant. No one knows
why. Genetically identical mice ordered from one vendor can consistently show
different disease progression compared to mice from another vendor. Background
noise in animal facilities can mess with experimental outcomes and even cure
disease. Two lots of otherwise identical reagents can act almost like
different chemicals.

Many findings are the result of some set of confounding variables rather than
the independent variable. It is impossible to know these things in advance or
control for them. The best tactic is to do the same experiment in a different
lab with a different research team to control for all these "other factors",
which is expensive (often six or seven figures) and can take months or even
years. It is best practice, when startups license tech from a university, to
ask for an exclusivity period during which they reproduce the key experiments
before committing to a full license.

Imagine that for every open source library you use, all you have is a
detailed README, a set of tests, and some infrequently commented whiteboard
code. You reconstruct the code as best you can (which isn't that hard), but
for some reason the tests don't check out. The bug isn't in your software. You
call the original researchers and find they used a few libraries that you
didn't, and these didn't show up in the publication. You have to manually
install all these libraries, test them, and integrate them. Still doesn't
work. You realize they used a different OS and have to account for that. After
six months, still nothing. After visiting their office, you realize the
computer used to write the software sits on some very fuzzy carpet, and there
is some weird electrostatic bug in the hardware that contributed to how their
software performed.
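
To make the software side of that analogy concrete, here is a minimal toy
sketch in Python (my illustration, not part of the original comment;
`FAST_MATH` is an invented stand-in for any undocumented dependency) showing
how one hidden environmental variable makes the "same" code give different
answers on two machines:

    # Toy example: the result silently depends on an environment variable
    # that the original authors never documented.
    import os

    def summarize(values):
        # The authors' machine happened to have FAST_MATH=1 set, which takes
        # an undocumented shortcut (mean) instead of the documented median.
        if os.environ.get("FAST_MATH") == "1":
            return sum(values) / len(values)     # hidden shortcut path
        return sorted(values)[len(values) // 2]  # documented behavior: median

    data = [1, 2, 2, 9]
    print(summarize(data))  # prints 2 normally, 3.5 if FAST_MATH=1 is set

The hidden libraries, the different OS, and even the fuzzy carpet in the
analogy all play the role of `FAST_MATH`: inputs to the result that never
appear in the write-up.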

This is perhaps not the best analogy, and I'm not a scientist myself, though
I have managed scientific teams for years, managed a few tech transfers, and
evaluated tons of science as potential investments; I just wanted to drive
home the point of how tough science can be. If others can critique this I
would welcome it, as it would be great for wet-lab scientists and engineers to
understand each other's worlds better.

Of course there is also fraud, cherry-picking of data, and careful
manipulation of study design to make externalities work in your favor, but a
lot of the reliability/reproducibility crisis comes down to lots of
uncontrollable or unknowable variables and expensive, long development
cycles.

~~~
nonbel
>"Many findings are the result of some set of confounding variables rather
than the independent variable. It is impossible to know these things in
advance or control for them."

It is literally the primary job of the (experimentalist) scientist to figure
this stuff out. You are describing people skipping the science and trying to
jump directly to the conclusions. This is exactly what Richard Feynman called
"cargo cult science":

 _" For example, there have been many experiments running rats through all
kinds of mazes, and so on—with little clear result. But in 1937 a man named
Young did a very interesting one. He had a long corridor with doors all along
one side where the rats came in, and doors along the other side where the food
was. He wanted to see if he could train the rats to go in at the third door
down from wherever he started them off. No. The rats went immediately to the
door where the food had been the time before.

The question was, how did the rats know, because the corridor was so
beautifully built and so uniform, that this was the same door as before?
Obviously there was something about the door that was different from the other
doors. So he painted the doors very carefully, arranging the textures on the
faces of the doors exactly the same. Still the rats could tell. Then he
thought maybe the rats were smelling the food, so he used chemicals to change
the smell after each run. Still the rats could tell. Then he realized the rats
might be able to tell by seeing the lights and the arrangement in the
laboratory like any commonsense person. So he covered the corridor, and, still
the rats could tell.

He finally found that they could tell by the way the floor sounded when they
ran over it. And he could only fix that by putting his corridor in sand. So he
covered one after another of all possible clues and finally was able to fool
the rats so that they had to learn to go in the third door. If he relaxed any
of his conditions, the rats could tell.

Now, from a scientific standpoint, that is an A‑Number‑1 experiment. That is
the experiment that makes rat‑running experiments sensible, because it
uncovers the clues that the rat is really using—not what you think it’s using.
And that is the experiment that tells exactly what conditions you have to use
in order to be careful and control everything in an experiment with
rat‑running.

I looked into the subsequent history of this research. The subsequent
experiment, and the one after that, never referred to Mr. Young. They never
used any of his criteria of putting the corridor on sand, or being very
careful. They just went right on running rats in the same old way, and paid no
attention to the great discoveries of Mr. Young, and his papers are not
referred to, because he didn’t discover anything about the rats. In fact, he
discovered all the things you have to do to discover something about rats. But
not paying attention to experiments like that is a characteristic of Cargo
Cult Science."_
[http://calteches.library.caltech.edu/51/2/CargoCult.htm](http://calteches.library.caltech.edu/51/2/CargoCult.htm)

~~~
aaavl2821
Do you think that many experimentalists actually are able to / have the
luxury of figuring all this stuff out if an experiment works? Of course when
stuff goes wrong you investigate all confounding factors until you make it
work, but that doesn't mean you've chased down every loose end. And if you are
doing industrial drug discovery and development, then validation,
documentation, and quality are absolute priorities. But in my experience
(again, managing tech transfers, not being an experimentalist), people solve
for getting something to work with some level of reliability, and you simply
cannot test all confounding factors.

To use the Feynman example: experimentalists do all of the things the rat
researcher did to "debug" an experiment. But you can do the same study in a
different lab, and there may be another environmental variable you have to
control for that didn't exist in the first lab. Especially when you have
experiments that are orders of magnitude more complex than the rat
experiment.

My perspective is that in theory you can do perfect science and figure out
all the variables, but in reality you just can't. Again, I am not a scientist,
but I think the reproducibility issues in research and the low success rates
in drug development support that perspective.

If you are a mathematician, there is basically no expected failure rate: you
prove it or you don't. If you are an engineer, some stuff just isn't possible,
and unpredictable stuff happens, but many times there's a right answer. If you
are a psychologist, getting the treatment right half the time makes you good.
If you are a drug researcher, even getting one drug approved is phenomenal. It
is by nature a field with low technical success rates.

~~~
nonbel
> _" Do you think that many experimentalists actually are able to / have the
> luxury of figuring all this stuff out if an experiment works?"_

It is a crucial part of science, not a luxury. Failure to do so means you are
doing something else (I use the general term "research").

> _" But you can do the same study in a different lab, and there may be
> another environmental variable you have to control for that didnt exist in
> the first lab."_

Yes, independent replication is required to be sure you understand the
experimental conditions.

> _" My perspective is that in theory you can do perfect science and figure
> out all the variables but in reality you just can't... It's by nature a
> field with low technical success rates"_

Perhaps, or perhaps this is an excuse by people who haven't even tried to
actually approach the problem scientifically. They (and/or the funding
agencies) just want to jump right to the exciting discoveries and cures
instead of doing the science. Since the scientific approach has not been
tried, we don't know.

> _" the reproducibility issues in research and low success rates in drug
> development support that perspective"_

They also support the perspective that sloppy research wastes insane amounts
of time and money. Jumping to the "it is so complicated" excuse without
actually trying to do things correctly seems really disingenuous.

Anyway, I left biomed exactly for this reason. There did not seem to be any
interest in doing actual science. No one was ever going to try a direct
replication of my work, no one cared about developing and testing quantitative
models (only coming up with meaningless "significant" p-values), etc.
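
As a minimal sketch of that last point (mine, not nonbel's), the standard
demonstration of why a lone "significant" p-value is weak evidence is to test
enough pure-noise hypotheses and watch some clear p < 0.05 by chance alone:

    # Simulate many null experiments: both groups are pure noise, yet
    # roughly 5% of comparisons come out "significant" at p < 0.05.
    import random
    from scipy.stats import ttest_ind

    random.seed(0)
    n_tests, false_positives = 200, 0
    for _ in range(n_tests):
        control = [random.gauss(0, 1) for _ in range(20)]
        treated = [random.gauss(0, 1) for _ in range(20)]  # no real effect
        _, p = ttest_ind(control, treated)
        if p < 0.05:
            false_positives += 1

    print(f"{false_positives}/{n_tests} null effects reached p < 0.05")  # ~10

Without a quantitative model or independent replication, any one of those ~10
is indistinguishable from a real finding.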

~~~
aaavl2821
I agree with everything you've said. My experience comes from trying to
replicate science that simply isn't reproducible, in ways that are not
possible to understand from reading a publication. Others I've worked with
have had similar experiences.

These disappointing experiences with imperfect science have led me to
acknowledge, as you have, that the system is not ideally set up to facilitate
good science. In industrial science (pharma companies) it is better, as it is
harder to hide bad science, but it is still not perfect.

I don't fault researchers for not doing perfect science. In many cases
funding and resources are not sufficient to allow for exploration of all
avenues. Pressure to publish is real. And as you say, independent replication
seldom happens, as people have incentives to spend time on other things. That
is what I mean by a "luxury": if you are fighting for scarce grant dollars, or
for scarcer tenure, you can't always afford to do work like this. So there is
a lot of suboptimal science. It is a sad state of affairs, but I can't blame
the researchers for how the system works.

If it were as easy as writing robust software tests and having someone else
run your code on their computer, there would be little excuse for bad science.
But it's harder than that. I see how you can think of that as a cop-out, but
to me it just seems like the reflection of an imperfect reality.

~~~
nonbel
> _" My experience comes from trying to replicate science that simply isn't
> reproducible in ways that are not possible to understand from reading a
> publication."_

Right, this is because no one has done the science. Like Feynman said,
someone needs to figure out all the stuff we need to keep track of to be able
to do this.

I personally saw no interest in doing that (pretty basic stuff like "the rat
eats more food when given this drug, therefore the drug made it
smarter/healthier"; why not more hungry/motivated, etc.?). It was very far
from perfection. What happened was that someone would come up with an assay
and mention some of its limitations at the end of the paper. Then for the next
40 years no one checks those limitations; everyone just continues to run the
unverified assay and assumes it measures what they want.

> _" I don't fault researchers for not doing perfect science."_

I just call it science vs. not-science, with a hard rule that in science
people need to be able to independently replicate your work. Otherwise I call
it research (a superset of science) and refer to the people doing it as
researchers. I think this is neutral enough, while still allowing the
distinction between science and not-science.

Mainly, if it really is the case, I want it made clear that the current
culture/environment/whatever is not allowing a scientific approach to be
attempted. Maybe this other method can get somewhere, I dunno. But it does not
have a proven history of success like science.

------
contact_fusion
I know this may be perceived as nitpicking, but I am generally annoyed
whenever the reproducibility/reliability crisis is referred to as a problem in
science, writ large, rather than a specific problem in specific scientific
subfields. This seems to be particularly common when the articles are written
by practitioners within these subfields, such as neuroscience, biology, or
psychology.

There is no reproducibility crisis in my field of science - astrophysics. To
back this claim up, I searched Retraction Watch for any mention of the
top-tier journals in astronomy & astrophysics - specifically, ApJ [1], MNRAS
[2], A&A (none found), Icarus (none found), Nature Astronomy (none found) -
and found exactly one correction and one retraction. Now, certainly errata are
published constantly, and I would be foolish to conclude that scientific fraud
or abuse practically doesn't exist within my field just because Retraction
Watch didn't catch many instances of it. But no evidence for a "crisis" seems
to exist, at least in my corner of science. Astrophysical research appears to
be quite reliable and reproducible. For similar reasons, I haven't heard of a
reproducibility crisis in analytical chemistry, or optical physics, or
mathematics, to name a few - or, closer to the interests of Hacker News,
computer science. Feel free to correct me if I'm wrong.

I'm not saying this to bash non-astronomers, or non-physicists, or
specifically life scientists. Biologists, neuroscientists, and psychologists
are crucial participants in the greater scientific enterprise and their
efforts lead more directly to alleviating human suffering than my work ever
will. But when talking about reproducibility and reliability problems we
cannot conflate different scientific disciplines with vastly different
cultures, practices, and norms - a crisis in biology does not imply that
physics has one too. Physics may have its own problems, but they aren't the
same as the ones in biology.

Witness the rise of groups such as the Flat Earthers, who reject basic
scientific knowledge known for literal millennia. (Eratosthenes, anyone?) Or
witness the anti-vaxxers, abandoning modern medicine for charlatanry.
Pretending science is a monolithic enterprise and abandoning a fine-grained
understanding of the validity and power of different types of scientific
evidence just gives these movements strength, feeds their delusions, and
weakens the prestige of scientists when we do need to stand together.

[1] [http://retractionwatch.com/category/by-journal/astrophysical-journal/](http://retractionwatch.com/category/by-journal/astrophysical-journal/)

[2] [http://retractionwatch.com/category/by-journal/monthly-notices-of-roy-astro-soc/](http://retractionwatch.com/category/by-journal/monthly-notices-of-roy-astro-soc/)

