It's not as simple as that for all sciences - once again an article on repeatability seems to have focused on medicinal drug research (it's usually that or psychology), and labelled the entire "scientific community" as 'rampant' with "statistical, technical, and psychological biases".
How about physics?
The LHC has only been built once - it is the only accelerator we have that has seen the Higgs boson. The confirmation between ATLAS and CMS could be interpreted as merely internal cross-referencing - it is still using the same acceleration source. But everyone believes the results, and believes that they represent the Higgs. The signal isn't observed once in the experiment, it is observed many, many times, and very large amounts of scientists' time are spent imagining, looking for, and measuring any possible effect that could cause a distortion or bias in the data. When it costs billions to construct your experiment, sometimes reproducing the exact same thing can be hard.
The same lengths are gone to in order to find alternate explanations or interpretations of the result data. If the collaborations don't, they know that some very hard questions are going to be asked - and there will be hard questions asked anyway, especially for extraordinary claims. Look at e.g. DAMA/LIBRA, which for years has observed what looks like indirect evidence for dark matter, but very few people actually believe it - the results remain unexplained whilst other experiments probe the same regions in different ways.
Repetition is good, of course, but isn't a replacement for good science in the first place.
If you follow your logic - that different experiments using the same accelerator negate the whole thing - to the extreme, then doing two experiments on the same planet/solar system/universe won't be enough either.
And I don't buy your inverse argument that one (good) experiment is good enough either.
It is very difficult to tell from outside the group if the experiment is actually good or not (although you probably can tell if it's bad).
Screw-ups can happen no matter how many people look at the data if there is some flaw in the experimental setup - the only way to make really sure is to use different experiments to measure the same thing.
Similarly, I am not saying that you don't need to repeat - just that repetition isn't the be-all and end-all of what defines 'science'. That's why I mentioned DAMA: nobody is accusing them of not taking care, but nobody really believes the result either.
This is, in principle, a problem. We unfortunately have no way to measure the speed of light in Andromeda the way we can here on Earth, so we really have no idea whether our astronomical models are wrong because the speed of light is not constant in Andromeda or elsewhere.
So, yes, in principle not being able to repeat science experiments everywhere in the universe is a problem. However, if one thinks a little less broadly, testing Newtonian gravity in, say, Italy and also in China shows that at least across the Earth the phenomenon is similar. Then one can say, "certainly, gravity is the same in Italy and China, and perhaps across the surface of the Earth." That is a stronger statement than "gravity is this way in Italy." Ordering claims by "scientific goodness", we can say that
Gravity is the same *across the universe* > gravity is the same across the Earth > gravity is the same in Italy
The LHC stands somewhere between "the SM is validated across the universe" and "the SM is validated at one detector at the LHC". Yes, it would be "better" if the Higgs were found at other experiments, but the current situation is "better" than if the Higgs were found at one detector there and not in any other. Repeatability, like everything in science, is not a binary step function but a continuous function taking values in [0,1].
But it does point at one measurement of G on cosmological timescales and distances from 2006:
There are other papers with similar measurements published in the past two decades.
That isn't a particularly fruitful approach since it doesn't permit any real discovery - it's rather brain-in-a-jar - but it is still something you have to assume away.
Since these operate in concert across the observable universe, and themselves involve various other interactions (elements, strong and weak nuclear forces, gravity, the speed of light, rates of hydrogen fusion, etc.), we can conclude either that there is no appreciable change in any of the underlying fundamental constants or that change occurs in a compensated fashion such that no net change is detectable.
The second fails Occam's Razor. The conclusion that the laws of physics appear to be similar throughout observable space seems robust.
In particular, you're making super-strong statements like "be all and end all of what defines 'science'" -- that's not what I see in the article. The article is reasonably pointing out that we have a repetition crisis and that more emphasis should be placed on it. That's not the various super-strong things you're claiming you see in it.
The title is picked by editors to get the most people to click on it.
You miss all nuance.
It becomes very quickly apparent to those who've read the article when others comment without having read beyond the title.
But that doesn't matter, because it is not the experiment you are trying to replicate, it is the effect or the observation.
Carefully changing the design of the experiment can allow you to verify that your explanation of the effect or observation is accurate.
Of course, if you change the experiment too much, a failure to replicate will carry less useful information.
So I think what has to occur is a gradual "loosening up" of the controls from strict replication to weak replication, with both types of replication giving information about the effect you are testing: its validity and its generality, respectively.
I should think this is mostly needed in life sciences. Other, more 'exact' sciences seem to not have this problem.
I don't think we do. I think we need to foster a culture of honesty and rigor. Of good science. Which is decidedly different from fostering a culture of "repetition" for its own sake.
Paying for the cost of mountains upon mountains of lab techs and materials that it would require to replicate every study published in a major journal just isn't a good use of ever-dwindling science dollars. Replicate where it's not far off the critical path. Replicate where the study is going to have a profound effect on the direction of research in several labs. But don't just replicate because "science!"
In fact, one could argue that the increased strain on funding sources introduced by the huge cost of reproducing a bunch of stuff would increase the cut-throat culture of science and thereby decrease the scientist's natural proclivity toward honesty.
> and b) does not accept something for a fact just because it's in a journal
Again, it's entirely unclear what you mean here.
It's impossible to re-verify every single paper you read (I've read three since breakfast). That would be like re-writing every single line of code of every dependency you pull into a project.
And I'm pretty sure literally no scientist takes a paper's own description of its results at face value without reading through methods and looking at (at least) a summary of the data.
Taking papers at face value is really only a problem in science reporting and at (very) sub-par institutions/venues.
I don't care about the latter, and neither should you.
WRT the former, science reporters often grossly misunderstand the paper anyway. All the good reproducible science in the world is of zero help if science reporters are going to bastardize the results beyond recognition...
No one is proposing repetition for its own sake. The point of repetition is to create rigor, and you can't do rigorous science without repetition.
> Paying for the cost of mountains upon mountains of lab techs and materials that it would require to replicate every study published in a major journal just isn't a good use of ever-dwindling science dollars. Replicate where it's not far off the critical path. Replicate where the study is going to have a profound effect on the direction of research in several labs. But don't just replicate because "science!"
I could see a valid argument for only doing science that will be worth replicating, because if you don't bother to replicate you aren't really proving anything.
Exactly. A lot of the science I've done should not be replicated. If someone told me they wanted to replicate it, I would urge them not to. Not because I have something to hide. But because some other lab did something strictly superior that should be replicated instead. Or because the experiment asked the wrong questions. Or because the experiment itself could be pretty easily re-designed to avoid some pretty major threats.
The problem is that hindsight really is 20/20. It's kind of impossible to ONLY do good science. So it's important to have the facility to recognize when science (including your own) isn't good -- or is good but not as good as something else -- and is therefore not worth replicating.
I guess the two key insights are:
1. Not all science is worth replicating (either because it's too expensive or for some other reason).
2. Replication doesn't necessarily reduce doubt (particularly in the case of poorly designed experiments, or when the experiment asks the wrong questions).
Foster all you want; an honor system doesn't protect you from incompetent people and dishonest people publishing junk for funding or self-promotion. If we had a culture of repetition, it would promote cross-checks that make up for the flaws in human nature.
The entire point of science, and this is not hyperbole, is that results are reproducible. If an experiment is not reproducible, one must take the results on faith. There is no such thing as faith-based science.
In order to build a shared body of knowledge based on scientific facts, then, results must be repeated. That is how different people can talk about the same thing without fearing an asymmetry of knowledge and understanding about the axioms on which their discussion of the world rests. Otherwise it is faith or religion or narrative - something other than science.
No, it's not. The point of science - its end - is to understand the natural world. Or to cure diseases. Or, more cynically, to learn how to build big bombs and more manipulative adverts.
Reproducible results are the means, not the end.
I know that seems like hair-splitting, but it's important. Epistemological purity can do just as much harm as good, because even the most pure science is usually motivated more by "understanding the natural world" or "improving our understanding of some relevant mathematical abstraction" than by epistemological purity itself.
To be quite honest about it, this sort of epistemological purity that insists on reproducibility as a good in itself feels a lot like some sort of legalistic religion.
> If the experiment is not reproducible one must take the results on faith. There is no such thing as faith based science.
I don't think I (or anyone here) is arguing against this. Or against reproducing important experiments.
I'm wholly supportive of reproducing results when it makes sense. But I'm also wary, in a resource-constrained environment, of preferring reproducing results over producing good science in the first place.
To be concrete about it, I'll always prefer a single (set of) instance(s) of a well-designed and expertly executed experiment over 10 reproductions of a crappy experiment. In the former case I at least know what I don't know. In the latter case, the data from the experiment -- no matter how many times it's reproduced -- might be impossible to interpret in anything approaching a useful way.
Put simply, a lot of science isn't worth the effort of reproducing. Either because it's crap science, or because the cost of reproducing is too high and the documentation/oversight of the protocol is already sufficiently rigorous.
The point of science isn't to adhere perfectly to the legalistic tradition of a Baconian religion. The point of science is to learn things.
> To be concrete about it, I'll always prefer a single (set of) instance(s) of a well-designed and expertly executed experiment over 10 reproductions of a crappy experiment.
I'd take 2-3 repetitions of a moderately well-designed and moderately well-executed experiment over either. Even the most well-designed and executed experimental protocols can produce spurious results, due to the stochastic nature of the universe.
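A rough sketch of the arithmetic behind this preference, with purely illustrative numbers (the 0.05 threshold is a conventional choice, not something from this thread): a spurious positive slips past a single experiment at alpha = 0.05 about 5% of the time, but slipping past three independent repetitions by chance alone is far less likely.

```python
# Illustrative only: how independent repetitions shrink the chance that
# a purely spurious effect survives. Assumes each experiment is an
# independent test at the conventional alpha = 0.05 level.
alpha = 0.05

p_one_fooled = alpha          # one experiment fooled by a fluctuation
p_three_fooled = alpha ** 3   # all three independent repetitions fooled

print(p_one_fooled, p_three_fooled)  # ~0.05 vs ~1.25e-4
```

The assumption of full independence is of course optimistic - shared equipment, reagents, or analysis pipelines correlate the errors - but the multiplicative effect is the reason even moderate repetitions are so valuable.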
There is a disconnect between the motivation and capability of scientists in the current funding system and what the public wants. So an easy solution is that if the public wants reproducible science, they need to pay for it. I'm sure some scientists who couldn't make it into Harvard or Caltech (i.e., me) and thus can't do cutting-edge science would be happy to take the dollars, make a living, and just reproduce the work of others. But you can't simply declare to scientists that they should do X while not enabling them to.
What's more, the scientific process is used discretely, one fact at a time. Understanding of our world and its meaning is accumulative over the entire context of our experience, and utilizes things like feelings, and faith, and religion, and narrative, to create.
Science is funded by the public, and done for the public. Good science reporting is very important to ensure that science continues to get funded. Too often scientific papers are written in a way that makes them incomprehensible to anyone outside of the field, whether that is through pressure to use the least amount of words possible or use of technical jargon.
Do we have a good source of information on that?
But to address your question anyways:
1. USFG scientific funding institutions are only one source of science funding. There are many others. If you look across the federal government, there's a downward trend: http://www.aaas.org/sites/default/files/DefNon%3B.jpg
One must also take into account non-federal-government sources, which in many cases have substantially decreased their investments in R&D since 2008.
2. As a percentage of GDP there's been a steady decline: http://www.aaas.org/sites/default/files/RDGDP%3B.jpg
3. From an impact-on-culture perspective -- which is the relevant one in my comment -- I think (2) is more interesting than your data and also more interesting than (1). The question should be "how difficult is it to fund good science", not "how much are we spending in absolute or relative terms". This is, of course, very difficult to quantify. But looking at percentage of GDP is at least better than looking at absolute dollars.
I find many who are against repetition have certain views that are helped by soft science.
And I'm about as far from "soft sciences" as you can get.
Really? You're suggesting that psychologists (to arbitrarily pick a softer science) would deny physicists (to arbitrarily pick a harder science) should reproduce their studies where possible?
That seems remarkable to me, perhaps I've missed these discussions. Can you provide evidence that this is a pervasive movement in some sciences, rather than the opinion of a few?
"Many" is a classic weasel word, of course, and needs backing up.
My interpretation -- perhaps incorrect -- is that you feel the softer sciences are wilfully undermining the quality of harder sciences. I very much doubt this is the case. Some philosophers of science and some softer science key influencers may introduce difficult and challenging questions about the appropriateness and usefulness of some research methodologies (as are people in this thread) but I doubt they'd make the blanket assertion you're suggesting.
b) can be a problem with meta-analyses and reviews. When gathering data "from the literature", not all the data gathered is of the same quality/certainty, which can have a compounding effect. The same goes when someone from a mathematical or computational field tries to create a model using data reported in the literature. It is often difficult when working in an interdisciplinary environment to assess the quality of everything you read, especially if you're not familiar with all the experimental methods.
Also, off topic, but I wonder why you chose a throwaway account to weigh in on this. I hope it's not a "science politics" reason.
Reproduction is a way of bolstering rigor.
It'd probably make sense to do that, actually, so you can verify that the dependency actually fits your use case as time goes on.
That is not sufficient. Honesty and rigor are of course required for good science, but they are not sufficient.
Even with honesty and rigor you WILL still get false positives. Statistics is used to measure how likely this is, but statistical methods do nothing to tell you whether any particular case happens to be a false positive. For many studies a confidence of 95% is considered good enough to publish, but if you do the math, that means an honest researcher who publishes 20 such studies has probably published one false result! If there are 20 such studies published in a journal, statistically one is false. Thus replication is important.
It gets worse, though: the unexpected is published more often - if it is true, it means a major change to our current theories, and that is important to publish. However, our theories have been created and refined over many years; it is somewhat unlikely they are wrong (but not impossible). Or to put it a different way: if I carefully drop big and small rocks off the Leaning Tower of Pisa and measure that the large rock "falls faster", that is more likely to be published than if I found they fell at the same speed. I think most of us would be suspicious of that result, but examining my "methods and data" would not reveal any mistake I made. Most science is in areas where wrong results are not so obvious.
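The 1-in-20 arithmetic above can be made slightly more precise (a sketch with the thread's own numbers, assuming every one of the 20 studies tests a true null hypothesis and the tests are independent):

```python
# Illustrative only: if 20 independent published results each test a true
# null hypothesis at the conventional alpha = 0.05 ("95% confidence")
# threshold, the chance that at least one comes up "significant" by luck
# alone is well over half.
alpha = 0.05
n_studies = 20

expected_false_positives = alpha * n_studies            # 1 expected
p_at_least_one = 1 - (1 - alpha) ** n_studies           # ~0.64

print(expected_false_positives, round(p_at_least_one, 2))
```

In reality some of the 20 studies test effects that are actually there, so the false-positive count is lower than this worst case; the point stands that an honest journal issue will statistically carry spurious results.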
> It's impossible to re-verify every single paper you read
True, but somebody needs to re-verify every paper. It need not be you personally, but someone needs to. Meta-analysis only works if people re-verify every paper. Note that you don't need to do the exact experiment; verifying results with a different experimental design is probably more useful than repeating the exact same experiment: it might by chance remove a design factor that we don't even know we need to account for yet.
> And I'm pretty sure literally no scientist takes a paper's own description of its results at face value without reading through methods and looking at (at least) a summary of the data.
I hope not, but even if they check out, it doesn't follow that things are correct. Maybe something wasn't calibrated correctly and that wasn't noticed. Maybe there are other random factors nobody knows to account for today.
The above all assumes that good science is possible. In medical fields you may only have a few case studies: several different doctors saw different people each with some rare disease, tried some treatment and got some result. There is no control, no blinding, and a sample size of 1. But it is a rare disease so you cannot do better.
Now, after drugs are on the market then there is another wave of less reputable research. But, the FDA has already approved the drug so they don't care as much.
That's exactly what almost everyone in academia does. Money is not the only corrupting factor.
I'm not sure of this. By 'exact' disciplines I'm assuming you mean disciplines of science more dependent on mathematical proofs. A CACM paper from a while back discussed this and found that a large number of papers were not repeatable. If I recall, it mostly focused on the fact that nobody shares code and/or the code didn't build.
I can't say I know any scientists who think any differently.
The problem these days is that spending time doing replication is not glamorous and will not help you get funds.
I do genetics and development, and the main sanity check we have is the distribution of mutant lines. If you say that mutant X does Y, other people are likely going to see that (or not) when they get their hands on it and start poking around. This strength of working with mutants is at the core of the success of molecular biology. Even if you don't set out to confirm someone else's results, you're quite likely to come across some inconsistencies in the course of investigation.
If a field lacks that sort of mechanism, they need to take special care to address reproducibility.
As noted, this does not get rid of the confounding negative effects, where paper D was also used as a (in retrospect, faulty) premise. Though no one ever actually comes out and says D is bad.
Over time, A, B, and C theories accrue significant weight while D falls off. It's not as explicit as some may like, but in the end the spirit of replication is well and alive.
Unfortunately for medical drugs and psychology, researchers are mostly gathering data without an understanding of the underlying mechanisms. There are also virtually never proposals which can be tested for compliance with reality in a quantifiable and isolated way, as we can in physics, chemistry, or parts of biology.
So I feel the replication crisis is not a matter of various fields just not knowing how to do "good science", but that these fields by their nature make clean hypothesis testing vastly more difficult and p-hacking and statistical trickery (intentional or otherwise) harder to sift out.
How certain are we that it's 'the nature' of these fields rather than political choices that were solidified long ago? There could be much larger samples and six-sigma confidence in the life sciences if the grants were allocated differently. I know this is happening, for example, in neuroscience, where the (private) Allen Institute is creating significantly more rigorous (and useful) datasets compared to the bulk of studies, because they are funding their studies differently.
In physics, some theories have been tested and interconnected to such a degree that if an experiment conflicts with theory, it's reasonable to suspect the experiment rather than discard the theory. That's what happened with the apparent faster-than-light travel of neutrinos measured in Italy. It was something like a partially unplugged cable. Once corrections were made for the cable, the theory snapped right back into place. Such theories can be trusted as tools in day-to-day physics. For my humble graduate experiment, refutation of any major law would have led me to fix the experiment and try again. I simply assumed things like conservation of charge. Such laws probably include electrodynamics, gravitation, quantum mechanics, thermodynamics, and Darwinian evolution.
And this is a common feature of major studies in physics, chemistry, evolutionary biology, and other sciences as well.
Where we may run into trouble is in branches of science that don't have over-arching theories or that web of connected results. At the other end of the scale are areas of science where the results are mainly a database of observed statistical correlations, with little or no apparent progress towards a general theory. When someone publishes a surprising result, there is no reason to say that the experiment must have been done wrong; you just add it to the pile. The best that can be hoped for is that some kind of meta-analysis will demonstrate an inconsistency among multiple studies. Those fields don't have the "hard" theories that can be used as tools to test experimental results.
To be fair, there's another situation, where you have not one, but zero, reproducible results. That's the state of affairs in the search to unify gravity and quantum mechanics.
In other disciplines, the role of repetition is a lot more important. The look-elsewhere effect means that your statistical significance depends on all other research, published or unpublished. If you are doing a repetition study, the look-elsewhere effect just goes away. So good science in the first place is very important, of course, but there is a lot of value in doing repeated experiments for fundamental statistical reasons - especially in fields where there is no good predictive theory (everywhere except physics) and where it is very hard to reach very high significance.
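A small numerical sketch of the look-elsewhere effect (the 3 sigma and 1000-window figures are hypothetical, chosen only to illustrate the point): a locally impressive excess becomes unremarkable once you account for how many places you looked.

```python
import math

# Illustrative only: a "3 sigma" local excess sounds convincing, but if
# the search scanned 1000 independent windows (masses, genes, survey
# questions...), the chance of such an excess appearing *somewhere* by
# fluctuation alone is large. A dedicated repetition study looks in one
# pre-specified place, so its local p-value is also its global one.
sigma = 3.0
n_windows = 1000

p_local = 0.5 * math.erfc(sigma / math.sqrt(2))   # one-sided Gaussian tail
p_global = 1 - (1 - p_local) ** n_windows         # excess anywhere in the scan

print(f"local p = {p_local:.2e}, global p = {p_global:.2f}")
```

This is one reason particle physics demands 5 sigma for discovery, and why a repetition that targets a single previously reported effect carries so much statistical weight.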
 I recently read
John Lewis Gaddis, The Landscape of History, 2004
which is an essay on the epistemology of history and in a way tries to reject "physics envy", with the argument that physics covers exactly the areas where there is a good theory - in other words, where we are winning. Consequently, physics envy looks a lot like a cargo cult, and other disciplines need to figure out their own epistemology.
Incredibly high-precision experiments which measure the magnetic moment of electrons.
Repetition is science.
Google says: "the recurrence of an action or event."
Science requires 1) a prediction of an observation 2) repeated observation consistent with the prediction.
If there is no repeated observation of what was predicted, it is not science.
>But everyone believes the results, and believes that they represent the Higgs.
It doesn't matter what is believed. "Everyone" used to believe all kinds of things, that doesn't mean that those things were true, or accurate. Repeated observation of prior predictions is all that matters.
>When it costs billions to construct your experiment, sometimes reproducing the exact same thing can be hard.
That doesn't excuse, or allow in, things that haven't been reproduced.
The simple analogy is the pictures one gets from fMRI. These pictures are, well, pictures based on approximations - not the mind. Not even close.
There is a paradox - one cannot make an instrument out of atoms to "see" what is going on inside those atoms. What we see are pictures created out of mathematical models, not reality as it is.
Physics has its own cases, such as cold fusion, which is unreproducible and, if all the above holds true, should not be accepted.
Why? ATLAS and CMS are different detectors operated by different teams.
We don't have a way to mathematically prove a drug for depression or a psychological phenomenon.
You can't "mathematically prove" a theory agrees with reality except by reconciling it with experiments (possibly indirectly through links to previous theories), so the only difference here is that a non-ad-hoc mathematical theory of depression medication doesn't appear to be feasible, not that physics has some magic alternate method of proof.
You're missing the point of what I said w/ the Einstein example. He proved it via maths. Whether or not it was empirically validated is irrelevant; it was true mathematically.
Today, this cannot be done in fields like medicine and psychology. No one can create a mathematical proof for a cure for cancer, make a treatment based upon that proof, and have it work on the first try.
> Whether or not it was empirically validated is irrelevant. It was true mathematically
If you mean it was a self-consistent theory, note that there are infinitely many physically incorrect theories which are "true mathematically". It is true mathematically in the same way number theory is "true mathematically", but you don't see anyone assume that we can describe gravity with prime number theory. If you mean it was mathematically proved consistent with previous physical theories backed by evidence, we're back to the experimental link, and there are also infinitely many physically incorrect theories which would agree with Newtonian mechanics and special relativity. If you mean he proved it satisfied some properties we expect from physical theories because they've been consistently upheld (e.g. energy conservation), there's an infinite number of wrong ones there too.
Mathematical convenience has been an excellent guide in physics, particularly fundamental physics (electromagnetic waves, antiparticles, and the Higgs Boson were all the results of positing things for mathematical convenience), but it is not a substitute for verifying novel experimental predictions. Nobody taken seriously in the physics (or mathematics for that matter!) community thinks it is, not even the often decried string theorists.
My point is this: Something is true whether one accepts it or not. E.g. the earth was proven to be round before anyone actually went around it.
Another example is climate change deniers. To them there's no proof of: 1) climate change; 2) that it's man-made.
Another example is people who believe the Earth is 10,000 years old (I'm not joking - millions of people believe this). They will deny any evidence you put in front of them.
That's the beauty of maths. If you can prove it mathematically, it's true. Whether or not you believe it is irrelevant.
As for the existence of objective reality, that's another thing that seems hard to prove conclusively. I'd suggest looking at some of the basic epistemology surrounding modern science (e.g. Popper) for some thoughts on this.
As an interesting aside, while it's true that mathematical proofs (idealizing here and assuming incorrect proofs are never accepted by the mathematical community, because on occasion they are) are absolute statements of truth, they may not be stating precisely the truth you expect. Thanks to Gödel, we know that it is not possible for any consistent mathematical theory rich enough to talk about addition and multiplication of natural numbers to prove its own consistency. As a result, we may have a proof that 2+2 != 5, but that actually doesn't exclude the possibility that there is also a proof of 2+2 = 5. In fact, his result shows we will never be able to prove such a thing does not exist (since that would imply the consistency of our mathematical system). So our absolute truths from proofs of X are actually of the form "ZFC implies X", where ZFC is the background theory which is generally taken to underlie modern mathematical work unless otherwise specified. So things are not so clear-cut even here.
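For reference, the standard formulation of the result being leaned on here (Gödel's second incompleteness theorem, stated generally rather than for ZFC in particular):

```latex
% Goedel's second incompleteness theorem: for any consistent,
% recursively axiomatizable theory T that interprets enough
% arithmetic (ZFC qualifies),
T \nvdash \mathrm{Con}(T)
% i.e. T cannot prove its own consistency. So a proof *within* ZFC
% that ZFC never proves 2+2=5 is impossible; any such guarantee must
% come from a strictly stronger background theory.
```

This is why mathematical "absolute truth" is always relative to the consistency of the background theory.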
It seems you're more interested in dicing up and attacking what I said, instead of what I mean.
I'll re-articulate it for you once more: We don't have a way to mathematically prove a drug for depression or to explain psychological phenomena. The studies in the article are talking about those which rely on empirical observation.
How many problems have you worked through in GR? Have you gone through the proofs that GR recreates Newtonian gravity in the low-energy limit? If not, don't go around talking about how Einstein proved GR was a true description of reality with naught but mathematical proof when he didn't think he did that.
Physics has the same requirements for empirical evidence as the life sciences; it's just that we work within regimes where we can apply fundamental, empirically validated mathematical theories directly. If you don't believe me, go ask another physicist.
Yes, it is - that's the very basis for science. One of the main problems is that many academic disciplines have been wrongly classified as hard science (see "social" sciences).
> The LHC has only been built once... But everyone believes the results, and believes that they represent the Higgs.
Nobody intelligent "believes" anything. We examine evidence and draw tentative conclusions based on that evidence, always retaining doubt because, unless you are omniscient, there is always new information that can come to light that can cause you to change your conclusion. Science is a process. If you "believe" anything without doubt, you aren't a scientist, you are a priest.
At least in life sciences (can't comment on other fields), it's not that scientists don't repeat each other's results. After all, if you're going to invest a significant fraction of your tiny lab budget on a research project, you need to make sure that the basic premise is sound, so it's not uncommon that the first step is to confirm the previous published result before continuing. And if the replication fails, it's obviously not a wise idea to proceed with a project that relies on the prior result. But that work never makes it into a paper.
If the replication succeeds, great! Proceed with the project. But it's time-consuming and expensive to make the reproduction publication worthy, so it will probably get buried in a data supplement if it's published at all.
If the replication fails, it's even more time-consuming and expensive to convincingly demonstrate the negative result. Moreover, the work is being done by an ambitious student or postdoc who is staring down a horrible job market and needs novel results and interesting publications in order to have a future in science. Why would someone like that spend a year attacking the work of an established scientist over an uninteresting and possibly wrong negative result, only to get a crappy paper and an enemy out of it in the end, instead of planning for their own future?
If enough people fail to replicate a result, it becomes "common knowledge" in the field that the result is wrong, and it kind of fades away. But it's not really in anyone's interest to write an explicit rebuttal, so it never happens.
If something truly revolutionary is published and it's relevant to your own work, it's very common to repeat the experiment yourself. If it works, great, continue on with your own work and reference the original paper when you publish new results.
If it doesn't work, the outcome can range from (1) making a mental note and moving on to an alternative, if one is available, to (2) spending a ton of time trying to "fix" it, including contacting the original author.
The only time you see reproducibility experiments published is if the original paper made some serious errors, including outright fraud.
And there is no way you'd be able to test the reproducibility of every paper, because some papers aren't that important. The important ones get the attention, and if you see multiple scientists referencing a work (and building off of it), you can have confidence it actually works.
If there is only a single paper about the result, most scientists will regard that with a certain suspicion. That doesn't mean we expect every new thing to be fake, but there are just too many ways these experiments can go wrong.
One problem is certainly that failed reproductions are more often communicated informally than published. If you're active in the field, you will likely hear from the scientists who tried to reproduce the results without success, but most of those attempts never appear in print.
Any experiment worth replicating is usually also worth expanding and building on, and that is where the actual verification takes place. Science works in this regard, but it doesn't work as cleanly and efficiently as one could hope for.
Yeah, proven by expensive studies that were funded by the company making the GMO. Who is going to pay for another study to try to disprove it?
There's a reason we sprayed DDT all over our vegetables for years before it was banned: there were no scientific studies proving that it was harmful, even though it clearly was harmful in hindsight.
Science is not instant, and there's no way someone can claim that some brand-new GMO is "perfectly safe", without any long-term studies on its effects over 10, 20, 30 years of exposure. That's just not possible. And yet you try to explain it to these science zealots and they just brush you off as being "anti-science".
Anyway, beyond the 'philosophy of science' issue of whether you can prove something, there is good affirmative evidence that existing GMOs are safe for numerous reasons.
First, there's no mechanistic reason to think they would be dangerous. T-DNAs are not somehow magically toxic to humans; everything you eat is riddled with millennia of T-DNAs, old viruses, transposon blooms, etc. etc.
The technologies themselves should be safe as well. Bt is well understood and embraced by the organic community as an applied natural pesticide, so you would need to find evidence that its localization somehow makes it toxic. Glyphosate resistance is also unlikely to have any effect a priori, because it affects a metabolic pathway completely absent in animals.
Argue all you like about how nothing can be 'perfectly safe', sure, but there's no reason to think that GMOs are dangerous, and people have looked quite hard.
Finally, just look at the Seralini pile-of-bullshit for evidence that there's plenty of incentive to publish anything critical of GMOs. No one is sitting on career-making evidence.
That's like saying "there's no reason to think peanuts would be dangerous" since humans eat them all the time. And yet, they are deadly to some humans. No one knows why.
But they do know that food allergies are far more prevalent in the U.S. than in other countries. And now, suddenly, people in Africa and China are starting to exhibit food allergies that the U.S. has had for a while. So what have we started shipping over to them that's causing these allergies? Who will fund that study?
Do you have some references?
In case it's not clear, I'd like to read articles about food allergies that have been common in USA for some time now becoming common in countries that are coming out of 2nd world status and into 1st world.
The very idea that there can be "something wrong with GMOs" is as anti-scientific as the narrative that there can be "something wrong with medicines". It's anti-scientific on principle, because it's not the safety of individual products you are taking issue with, it's the very existence of the entire scientific field.
Anyways, it takes about 10 years for the average Ag product to make it to market. That's on par with the time for the average drug. If you are serious about safety, start talking about real improvements that could be made to the Ag approval process. Start talking about the flaws in individual studies. Until you do something like that, your anti-science blanket dismissal of an entire field is as ridiculous as someone who is "against medicines".
Are you kidding me? Have you read Taleb on the risks of GMOs? He argues that the possibility space of danger from GMOs is different from the possibility space of danger from regular breeding.
GMO is controlled alteration of specific genes in a crop. Why on earth should we assume this is less safe than uncontrolled changes? Massive changes to crop DNA have been going on for decades. Food you eat every day was originally produced by essentially nuking seeds to create lots of random mutations, then growing the various "nuked" seeds and selecting those with favorable characteristics. Plants get changed by picking up genes from bacteria, by cross-breeding, and all sorts of things. Why on earth should we assume uncontrolled massive changes to plant DNA are somehow safer than GMO?
GMO is not really what needs to be proven. It is the other methods of changing plants which need to prove that they are safer than GMO.
The problem with GMO isn't the science of it. The problem with GMO is the politics. Here I agree with the anti-GMO crowd. Companies like Monsanto are highly immoral and a scourge of the earth. It is companies owning particular strains of seeds and requiring everybody to keep buying from them which naturally causes resentment.
Said scientist will gain immense fame and recognition if they show established ideas are wrong. (Source: academic physicist)
Especially in psych/bio, a negative finding can almost always be blamed on the experimentalist -- even more so if they are young. It is no great feat to conduct a study so poorly that it fails to detect a signal others have reported.
Sometimes you find evidence that doesn't quite match up with previous work, but if an explanatory framework isn't forthcoming, it can be difficult to say whose results are right, and what they mean. On the other hand, sometimes you come across something blatantly wrong, so you do your follow-up experiments to confirm and you're all set. It's only the latter case that will quickly make it into a manuscript.
I will say, I've seen some big corrections (like this-allele-was-actually-a-completely-different-mutant bad) that just get buried in the results section of a subsequent publication, and no retraction was ever submitted for the original paper. That was definitely a failure in the field, and likely the result of the status of the authors involved.
The doctors and doctors-in-training I work with have altruistic motives, but place too much stock in major medical studies. They also frequently apply single-study findings to patient care, even to patients that would've been excluded from that study (saw this a lot with the recent SPRINT blood pressure trial).
And don't even get me started on the pulmonary embolism treatment studies. What a clinical mess that is.
Yet I'm surprised by your colleagues' behaviour nonetheless. I would have thought they'd show more restraint.
I think the problem is that it's simpler to just take a study's conclusion and believe it. Because hey, it's peer-reviewed and in the NEJM! Easy!
The adverse reaction I described is normal in medicine when you oppose the status quo. That's probably true in other professions and industries.
how do you incorporate these findings? ignore them?
if so, it's probably bad for your patients. the only thing worse than a single-study finding is a zero-study finding.
Let me give you an example of how I approach things. The guidelines for acute pancreatitis recommend using a fluid called LR instead of NS for volume resuscitation. This is based on an single study that included 10 patients and simply noted slightly better lab numbers; there was no difference in clinical outcome. Lots of problems with that study, right (small, underpowered, confounders, validity issues, etc)? However, there's no major disadvantage for using LR in those patients (unless hyperkalemia is a concern), so I use it since it might have a benefit.
This is a very simple example. It gets much more complicated than that.
"Probably" is one of my favorite words in medicine, btw :).
I'm very interested and will read what you write with full attention.
I had a severe reaction to fluoroquinolones and maybe some confounding comorbidities and have been pretty much unable to get effective help in our medical system so far :(
-- Richard Feynman, "Surely You're Joking, Mr. Feynman", pp. 225-226
The linked comment doesn't state that it would be a waste of time to replicate on a hypothetical LHC clone.
Rather, the linked comment states that we can accept the Higgs result with reasonable confidence even though it's currently infeasible to replicate that experiment.
Feynman's issue was also qualitatively different -- the scientist was comparing results from two different instruments. The people in charge of one of the instruments wouldn't allow the scientist to run both experiments on a single instrument. In fact, from context, it's not even clear to me Feynman would have insisted on re-running the original experiment if the scientist were not using a different accelerator for the second one. Anyway, in the Higgs case, there's no potential for a "comparing readings from instrument A to readings from instrument B" type bug.
More to the point, and FWIW, I somehow doubt Feynman would insist on building a second LHC for the sole purpose of replicating the Higgs measurement. But I guess we have to leave that to pure speculation.
My next reaction was, "wow, it's a sad state of affairs when a postdoctoral research fellow at Harvard Medical School feels he has to spell this out in a blog post." It implies that even at the prestigious institution in which he works, he is coming across people who treat science like religion.
The real issue is that there are anti-incentives to reproducing other people's results. All scientists want to see it, and nobody is able to actually do it, because they'll lose status, publication opportunities, and funding. It's viewed as career suicide. Unfortunately, this article doesn't suggest any solutions to that problem.
It would be best to not suggest this problem is somehow due to "people who treat science like religion". That association isn't called for, nor is it largely true or applicable here. Ahmed even said very early in the article "the majority of irreproducible research stems from a complex matrix of statistical, technical, and psychological biases that are rampant within the scientific community."
Statistical and technical biases are common human errors that affect everyone equally and don't amount to religion. Even beliefs don't amount to religion. You and I both believe electrons run through our computers, yet I haven't verified electrons exist and I've never seen one - have you? Almost everything we know is belief based on what others have told us, whether it's science or not. Even if scientific results are reproduced, unless you're doing the reproducing, you're still subject to believing the results. It's a more believable story when two independent people verify some result, but that doesn't mean that believing one story or one paper demonstrating a result is somehow akin to blind faith, ritual, and deity worship.
Culturally academia is responsible for a lot of cult-like approaches to things. It's just a humanity problem that we have to acknowledge and use science to fight against.
I look at science as an essentially evolutionary process: a single study has a very low probability of being accurate; if the results survive additional testing over time, the probability of accuracy increases.
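This "evolutionary" view can be sketched as a simple Bayesian update. The prior, the chance a real effect survives a test, and the false-positive rate below are all invented numbers, chosen only for illustration:

```python
# Toy Bayesian sketch of the "evolutionary" view of science: each time a
# result survives an independent test, the probability it is real goes up.
# All three rates below are illustrative assumptions, not field estimates.

def update(prior, p_pass_if_real=0.8, p_pass_if_bogus=0.05):
    """Posterior probability a finding is real after one passed test."""
    evidence = prior * p_pass_if_real + (1 - prior) * p_pass_if_bogus
    return prior * p_pass_if_real / evidence

p = 0.10  # assumed prior: a lone novel result is more likely wrong than right
for i in range(3):
    p = update(p)
    print(f"survived test {i + 1}: P(real) = {p:.2f}")
```

With these made-up rates, a single positive study still leaves plenty of doubt, while a few surviving replications push the posterior close to certainty.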
The only difference between Scientists and people in other fields is that Scientists completely lack self-awareness.
Science is obviously systematically biased.
Have you ever been to a good school? Every word a prof says is designed to make you think he is smart. That's the only way they make careers. They 'live in the identity' of being smart.
It's laughable and hyper competitive.
Bad studies ensue.
It's obvious to anyone with a basic grasp of human behaviour.
The only thing that surprises me is how they are unwilling to admit the problem.
(If so, this shouldn't be news.)
As a point of anecdata: my wife's master thesis was a confirmation study of using LLDA for face recognition. I remember seeing it included in some book by the university press. I gave up Googling for it after 5 minutes.
It's not a binary option. One poor experiment might give us some evidence something is true. A single well reviewed experiment gives us more confidence. Repeating the results similarly does. As does the reputation of the person conducting the experiment and the way in which it was conducted.
It's not a binary thing where we decide something is accepted or rejected, we gather evidence and treat it accordingly.
These days, if you ask a scientist "So how do we prove something is true using science?" they'll be able to recite Popper's falsificationism as if it's a fundamental truth, not a particular way of looking at the world. But the huge gap between the particular theory that people get taught in undergrad--that science can't actually prove anything true, just disprove things to approach better hypotheses--and the real-world process of running an experiment, analyzing data, and publishing a paper is unaddressed. The idea that there's a particular bar that must be passed before we accept something as true is exactly what got us into this mess in the first place! There's a naive implicit assumption in scientific publishing that a p-value < 0.05 means something is true, or at least likely true; this author is just suggesting that true things are those which yield a p-value under 0.05 twice!
What's needed, in my opinion at least, is a more existential, practically-grounded view of science, in which we are more agnostic about the "truth" of our models with a closer eye to what we should actually do given the data. Instead of worrying about whether or not a particular model is "true" or "false," and thus whether we should "accept" or "reject" an experiment, focus on the predictions that can be made from the total data given, and the way we should actually live based on the datapoints collected. Instead, we have situations like the terrible state of debate on global warming, because any decent scientist knows they shouldn't say they're absolutely sure it's happening, or a replication crisis caused by experiments focused on propping up a larger model, instead of standing on their own.
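To make the p-value point above concrete, here is a rough Monte Carlo sketch; the 10% base rate of true hypotheses and the 50% statistical power are invented assumptions. It shows that requiring p < 0.05 twice shrinks, but does not eliminate, the share of false findings:

```python
import random

# Monte Carlo sketch: when true effects are rare and studies are
# underpowered, a result significant at p < 0.05 -- even twice -- can
# still be false. Base rate and power are illustrative assumptions.
random.seed(0)
alpha, power, base_rate, n = 0.05, 0.5, 0.10, 200_000

def false_share(k):
    """Fraction of findings significant in k independent studies that are false."""
    true_hits = false_hits = 0
    for _ in range(n):
        is_real = random.random() < base_rate
        p_sig = power if is_real else alpha  # chance one study comes up significant
        if all(random.random() < p_sig for _ in range(k)):
            if is_real:
                true_hits += 1
            else:
                false_hits += 1
    return false_hits / (true_hits + false_hits)

print(f"false share among findings significant once:  {false_share(1):.0%}")
print(f"false share among findings significant twice: {false_share(2):.0%}")
```

Under these assumptions, nearly half of the once-significant findings are false, and a second significant replication still leaves a non-trivial false share.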
Kinda like Rotten Tomatoes, but... for science?
1. How should we receive costly research that took special equipment and lots of time to develop and cultivate, e.g. CERN?
2. A lot of research is published, ignored, and then rediscovered. In this case, we may want to accept the research until it cannot be repeated (i.e., in another journal publication).
3. Reviewers of academic publications probably aren't qualified to recreate all scientific research, and don't have the time anyway.
4. Isn't the academic system at its core kinda... broken?
can you elaborate on what you mean?
Whether or not a paper is fully correct is less important than the further work it stimulates or informs-- either via more papers (impact factor) or via APPLICATIONS of the research.
In the example you're giving, to get accepted, the author doesn't need to do much more than extend the research in some small way, compare it to other research, or explore some aspect more fully. If someone is going to take the time to repeat something, they might as well go a little further.
If someone takes the time to apply research results they're going to, in a way, test the validity of the results. Maybe the author's experiences in the pharma world is very different, but I doubt that.
Yes, it should. We have too many demonstrated instances of studies built on other studies that turned out to be flawed, but the second study, rather than showing the first one was flawed, was rationalized and massaged until it conformed to the first study because the first study already had the imprimatur of peer-reviewed correctness on it.
Only someone replicating the original study directly (give or take sample size or duration or other simple such changes) will have the guts and moral authority to stand up and say "I can't replicate this. The original study may be wrong."
(Mostly because they don't publish code nor data; and academic code is often a horrible mess, and the code was mucked around with between different stages of running.)
Often, I find that authors don't publish their code. If they do publish their code, they rarely publish their code for their benchmarks.
— Donald Knuth
Not all science is done in a lab. Replicating an experiment is obviously feasible for a short-term psychology experiment, but in earth sciences (oceanography, for instance) it is far less often possible to reproduce an experiment, for the following reasons.
N.B. This is all from my personal experience of one field of science.
1.) Cost. If you got funding to take an ice-breaker to Antarctica to "do science", it required several million dollars. It is difficult enough to secure funding for anything these days, let alone for prohibitively expensive attempts to reproduce results. (Honestly, any serious research vessel will run costs into the millions, regardless of destination.)
2.) Time. Say you are on a research vessel taking measurements of the Amazon river basin. This is a trip that takes months to years to plan and execute. If you return to duplicate your experiment 2 years later, the ecology of the area you were taking measurements of may have changed completely.
3.) Politics. Earth sciences often require cooperation from foreign entities, many of which are not particularly stable, or which may be engaging in political machinations that run counter to your nationality's presence in the country, or both. Iran and China are two good examples: both are home to some excellent oceanographers, and both can be very difficult to do science in when your team includes non-Iranian/Chinese nationals.
Now, straight to the point, who's going to pay for the repeated research to prove the first one?
It would do something about the low-hanging fruit in terms of testing reproducibility, and since there is a published paper, the student has access to guidelines for setting up and reporting on a large project, which will help them learn how to do their own, original thesis.
Who is and who should are two different questions. The body who funded the original research should be best placed to fund the verification of the research. If the research isn't compelling enough to fund verification then why was it funded in the first place? And if the principle research group is requesting additional funding for more research that builds on the initial unverified research then that sounds like poor governance.
I realise that this simplistic view gets messy and confused when research is really academic led innovation and incubation.
Incentive for corrupting the data seems high, however.
Sorry for the informal language, but it makes things a little bit saltier.
When the lightbulb was invented, Edison made lots of money, but I am sure candlemakers had plenty of incentive to fund research that hypothesized that lightbulbs emitted toxins.
But yeah, larger entities (universities, businesses) should also be factoring the cost of reproduction when they commission research.
No one has been able to tell me why the need for reproducibility requires software freedom.
Consider the program 'nauty'. It is available in source code form for anyone to review, but it cannot be used for military purposes. That's not free, certainly. But isn't that enough to call it good science?
Similarly, consider the clause "only for use in verifying the result of paper X". That's also not free. But it serves the goal of letting others be able to verify X.
Also, you haven't gone far enough. It's not only the license that matters, but access. You have to mandate that either anyone can get access to the code for no/low cost for some years (since I can sell my GPL'ed software for $30,000 or take down the download link once published), or link publication with a required submission to some repository with the mission of keeping all that source and data around, available to anyone, at no cost.
Yes, $1, no threshold necessary.
For projects where the cost of publication of data is not worth the dollar, people will just not accept the dollar. (That's where a 'natural threshold' comes in.)
Or, if I use a government facility, like a supercomputing center, does that also trigger the release requirement? Even if no money changes hands? What about a government network?
It sounds very tricky. If I get a government grant to work on a project, and that grant buys equipment X, which I also use for another project, then must both projects be subject to this form of release?
Even if the second project is 10 years later?
If I'm working on project X, and in the process I find some new knowledge Y and publish it, even though it has nothing to do with the original grant for X, does that count? (Think of AT&T's observation of the cosmic microwave background, when the goal was to reduce noise in terrestrial microwave communications.)
It seems like a very complicated scheme.
I'm not too impressed by "well, that's not how it works right now". The whole problem is "how it works right now". That's what we're discussing: the need for it to not work that way.
We have many systems to go on over the last few hundred years of science. We have the pre-war system, primarily funded by private philanthropy. We have the communist system.
None of them seem to create the stream of highly replicable studies you want.
That may indicate something deep about how people work and how science is really done, and suggest that your admirable goals are not tenable.
This is not necessarily because we're worse people than them, but because the problem is now much much harder. It's always better to acknowledge that hard problems are hard, rather than trying to solve them by pretending they're easy.
As for the model I would propose, I believe all funding models are fundamentally flawed, and the best model is all of them at once, so hopefully the flaws at least sort of cancel out. At the moment, that generally means seeking a decrease of the current government funding strategy and breaking the peer review monopolies, not because either of them are necessarily especially bad, but because they are too powerful and their flaws are coming to define the flaws of science in general.
Some of this would just be a mindset change, to recognize that "research" isn't isomorphic to "producing peer-reviewed papers" and that there's nothing wrong with setting up some equivalents of Xerox PARC in other disciplines. Potentially with government money, since my point is more about multiple models than the literal funding sources. If "science" as it is practiced today were less pedestalized, this would be a much less horrifying suggestion.
As you can see from http://www.pewforum.org/2013/07/11/public-esteem-for-militar... , the military, teachers, and medical doctors are on higher pedestals than scientists.
That said, I'm all for the mixed development model.
I maintain that the military is still on a higher pedestal than science, in terms of funding and prestige. You hear stories of people buying military personnel in uniform their meal, to honor their service. That's much less common for scientists.
You calculate the fair market value of the public resources you used, and subtract what you paid the public for them. If it is positive, you have a publicly-supported project.
So if you use that government-paid intern for several hours, you ought to pay their agency or department $7.25 for each hour. You pay for what you use, and there's no problem.
If you work in a government-built facility, and you pay rent for your space, there's no problem. It doesn't matter that the public is your landlord. The space has a market value, and you pay it. There is no net transfer of value to your project at the expense of anyone else. If someone else could make better use of your space, they could have paid higher rent to get it.
If you're accepting a grant, that makes it a bit more difficult for you. If you get $50,000, you would have to pay back $50,000, plus the interest and the administrative overhead for processing your grant request. And then there's the value of the risk premium and moral hazard. You would have to find some other source of funding to "close" the research, and you would have to do it before starting work. Otherwise, potentially profitable projects could get privatized just before triggering the public release requirement, and the money sinks would be left as public.
If you use public funds to buy equipment for a publicly supported project, and then later want to use it for a private project, you have three options: lease it from the public project, or pay the depreciated value of the equipment to buy it outright, or make your private project publicly-supported.
It isn't any more complicated than the GPL copyleft. If you distribute software built on GPL'ed code, you have to release your own code under the GPL as well. If you don't want to do that, don't use GPL'ed code.
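Under the scheme described here, the test is just arithmetic: value of public resources used minus what was paid back. A toy tally, with entirely invented figures:

```python
# Toy arithmetic for the proposed test: fair market value of public
# resources used, minus what was paid back for them. A positive balance
# means the project counts as publicly supported. All figures invented.
resources_used = {
    "lab space (market rent)":     24_000,
    "intern time (120 h @ $7.25)": 120 * 7.25,
    "grant received":              50_000,
}
paid_back = {
    "rent actually paid":          24_000,
}

net_public_support = sum(resources_used.values()) - sum(paid_back.values())
print(f"net public support: ${net_public_support:,.2f}")
if net_public_support > 0:
    print("=> publicly supported; release requirement applies")
```

Paying full market rate for every resource drives the balance to zero, which is exactly the "no net transfer of value" condition described above.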
Which is very hard for things with no market.
I use government libraries. They are free to me. What is the fair market value of that? There are private and subscription libraries, so it's not like no market exists.
What is the fair market value of time on Hubble?
> "if you use that government-paid intern for several hours ... You pay for what you use"
I think you mean $0, not the $7.25 you estimated. Under the Fair Labor Standards Act, an internship is "for the benefit of the intern", not the company. An internship is not supposed to improve the bottom line of a company. An intern may even get in the way, and cause negative value.
And that's my point. The public gains more than can easily be counted by simple, direct market valuation. What is the worth of having students with industrial training? What is the worth of having broad public access to the literature?
Or, for a more real-world case, companies might not be interested in tropical disease research because the revenue won't justify the development costs. But the US military would like to be able to send troops to places with an endemic tropical disease, so they want some way to be able to prevent or treat the disease. The US foreign diplomatic policy would also like the good-will of those countries. The US could, by subsidizing tropical disease research, tilt the "fair market value" so it is more weighted towards its military and diplomatic policy goals.
That assumes that part of the corporate revenue comes from subsidy, and part comes from being able to sell the drug on the market. But now, if part of the revenue comes from the government, the company cannot seek patent protection. This reduces the profit expectation, which means the government will need to subsidize the project even more to get a company to be interested in the effort.
You receive no special additional benefit from a library by being a researcher. Everyone can read the same materials as you do. Time on the Hubble costs more than any individual astronomer could pay. If you are keen on closed, private astronomy, you would need to check the NASA budget figures.
If you derive useful benefit from work done at your request, you need to pay the person doing it. If the intern is working for the government for no pay, how would they not just laugh in your face when you ask them to do work for you? You invented the hypothetical; I won't fix it for you.
If the military or state department could derive some benefit from subsidizing private research, they can bloody well do the research on their own. "US Army cures Dengue" would be great for both operations and PR, and would be a much better use of funds than a smart bomb that can stalk you on Facebook and blow up all your friends at the same time as you. If you as a private company want to sell a cure for Dengue on your own, then don't go begging the government for money. Fund it yourself!
Sure. But no grants allow the diversion of fund into private profits, so I don't know what you're referring to.
Take the SBIR grants. It's a way for the government to help small, for-profit companies do the R&D that might lead to results that will benefit the overall US economy and policy. The hope is for the companies to commercialize the results and do well.
It's not money that the SBIR recipients can use to party on Maui. The SBIR system has accounting and oversight in place to help prevent that.
Or, take the (infamous) Bayh–Dole Act. Quoting from https://en.wikipedia.org/wiki/Bayh%E2%80%93Dole_Act :
> The key change made by Bayh–Dole was in ownership of inventions made with federal funding. Before the Bayh–Dole Act, federal research funding contracts and grants obligated inventors (where ever they worked) to assign inventions they made using federal funding to the federal government. Bayh–Dole permits a university, small business, or non-profit institution to elect to pursue ownership of an invention in preference to the government.
That sounds very much like the public, through its elected officials, doesn't actually want what you say it wants, because what we had was more like what you say we should have, and they decided to change it.
(Because corner cases where small amounts of funding would trigger the requirements can be worked around by just not using that funding. As long as the circumstances triggering the requirements are easy to predict.)
I also used a dataset consisting of newspaper articles. It cost me $1,000 to get access to, and I definitely do not have the rights to redistribute it.
The other issue, especially in the life sciences, is inadequate statistical input. If someone performs an underpowered, confounded experiment and gets a positive result, then someone else performs the same underpowered, confounded experiment and gets a negative result, what have we learned except that the experiment is underpowered?
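A quick simulation of that scenario (the effect size, sample size, and the simple normal-approximation t test here are all illustrative assumptions): with a real but modest effect and small groups, an original study and a replication will frequently "disagree" purely by chance.

```python
import random
import statistics

# Underpowered-experiment sketch: a real but modest effect (d = 0.3)
# tested with n = 20 per group rarely reaches p < 0.05, so two honest
# runs of the same experiment often reach opposite conclusions.
# All parameters are illustrative assumptions.
random.seed(1)

def significant(n=20, effect=0.3):
    """One two-sample experiment; True if |t| exceeds ~1.96 (p < 0.05)."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(effect, 1) for _ in range(n)]
    se = ((statistics.variance(a) + statistics.variance(b)) / n) ** 0.5
    t = (statistics.mean(b) - statistics.mean(a)) / se
    return abs(t) > 1.96

runs = 2000
power = sum(significant() for _ in range(runs)) / runs
disagree = sum(significant() != significant() for _ in range(runs)) / runs
print(f"chance a single study detects the effect: {power:.0%}")
print(f"chance original and replication disagree: {disagree:.0%}")
```

Under these assumptions the single-study power lands well below 50%, so a failed replication of a true effect is the expected outcome, not evidence against it.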
But repeatability matters more than professional success. Scientifically speaking, if the science is bad it just won't work when others try to make use of it. Bad science will be identified or corrected as we try to make use of it and convert it into new technology. Technology mandates repeatability. So those scientists who fail to produce repeatable science, however professionally successful they may be, will inevitably fail to produce any new technology or medicine, and vice versa.
What I think is overlooked in this discussion is that a lot of confirmation work already happens. Most (all?) scientific results are incremental progress built on a heap of previous work. In the course of normal research, you reproduce existing results as necessary before altering conditions for your own study. If you can't confirm the results, well then perhaps you have a paper (though it can be politically challenging to get it published, and that's a separate problem). But if you do confirm them, you don't waste time publishing that; you get on with the new stuff.
Ultimately, I don't think scientists do accept results in their field that they have not repeated.
Independent replications of experiment (and the corresponding independent reports of observations) are a crucial part of the scientific method, no matter how much you wish it wasn't. Nature doesn't care if it is inconvenient for you to discover her secrets, or that it is more difficult for you to hype up your findings to the unsuspecting public.
There is a lot of transparency there, a lot of well meaning people with a lot of oversight.
I suggest most would admit 'there could be a problem' there, but it's out in the open if there is.
The problem of lack of repeatability I think has to do with subconscious bias on the part of the experimenters which will be less pronounced when there are 5000 people working on it.
On the other hand, there are many other experiments that are repeated billions+ times a day in order for consumer electronics to work, etc.
I blame this on neo-liberal ideology: the intense focus on getting money's worth, on tying grants to specific goals, on counting publications, and so on. Driving research exclusively with a very narrowly defined money incentive has pushed us further into this sort of mess, as have the money-grabbing journals, which have prevented any significant innovation in how science is shared.
I think what science needs is a model closer to that of open source: open projects anybody can contribute to, but where verification happens through personally forged relationships. The Linux kernel's code quality is verified by a hierarchy of people trusting each other and knowing something about each other's quality of work. Work should be shared like Linux source code, in a transparent fashion, not behind some antiquated paywall.
I don't think the grant system can go away entirely, but perhaps it should be deemphasized in favor of paying scientists a higher minimum amount of money for doing what they want. Fundamental scientific breakthroughs don't happen because people have a clear money incentive: neither Einstein, Niels Bohr, Isaac Newton, nor Darwin pursued their breakthroughs with an aim of getting rich. Few people become scientists to get rich. Why not try to tap into people's natural desire to discover?
If competition for research dollars ceases to be so cutthroat, it will go a long way towards solving this and many other seemingly entrenched cultural problems.
In contrast, if you are in machine learning and you are extending an existing architecture you are very directly dependent on that original technique being useful. If it doesn't "replicate" the effectiveness of the original paper, you're going to find out quickly. Same for algorithms research. Some other comments here have mentioned life sciences being the same.
So I think there's a qualitative difference between two kinds of science. In sciences where we understand things in a mostly statistical way (sociology, psychology, medical studies), the mechanism is unknown (because it's very, very complicated), and we use the process of science mechanistically to convince ourselves of effectiveness: I don't know why this color makes people work faster, or why this drug increases rat longevity, or why complex human interactions adhere to this simple equation, but the p-value is right, so we think it's true. In sciences where we have a good grasp of the underlying model, and that model is backed up by many papers with evidence behind it, we can make very specific predictions from the model and be confident of their correctness.
For example, here are two product specifications for a dye called Sirius Red, the first by Sigma-Aldrich and the second by Chem-Impex. The Sigma-Aldrich product contains 25% dye while the Chem-Impex contains 21% or more. These two dyes could be quickly assessed with a spectrophotometer to determine an equivalency, but you need both dyes on hand, which doesn't seem like a good use of funding. This also touches on another replication problem: what is in the other 75%+ of the bottle?
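For what it's worth, the spectrophotometric comparison is simple in principle. A minimal sketch via the Beer-Lambert law (A = ε·l·c); the absorbance readings below are made-up placeholders, not real data for Sirius Red:

```python
def relative_dye_content(absorbance_a, absorbance_b):
    """If both samples are prepared at the same nominal dilution and
    measured at the same path length and wavelength, the Beer-Lambert
    law (A = epsilon * l * c) implies the absorbance ratio equals the
    dye-concentration ratio: the unknown extinction coefficient and
    path length cancel out."""
    return absorbance_a / absorbance_b

# Hypothetical readings at the dye's absorbance peak: if lot B reads
# 0.42 and lot A reads 0.50, lot B has ~84% of lot A's dye content.
print(round(relative_dye_content(0.42, 0.50), 2))  # 0.84
```

Of course this only compares dye content; it says nothing about what the rest of the bottle is, which is the harder replication problem.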
An even more inconvenient truth is that scientists cannot even keep their jobs if they prioritize the quality of their work. The pressure to publish novel results is too strong and it is almost impossible to get any support for confirming previous ones.
In order for a lot more replication to get published, what would be needed would be for people who spent their careers replicating others' results (at the expense of not producing any important novel results of their own) to get tenure at top institutions (outcompeting others who had important novel results but not enough published replications).
That aside, I think repeatability is a much more useful goal than "has been repeated". For one thing, meaningful replication must be done by someone else; for another, it's difficult and time-consuming, and the original investigator has no control over whether and when someone else in the community chooses to attempt replication of their result. What is within their control is an explanation of the methodology they relied on, in sufficient detail to enable efficient repetition by the relevant community. To me that satisfies the competence threshold; good science isn't infallible science, and attempts to replicate it might fail, but some baseline failure rate ought to be acceptable.
What we should demand is scientific results that have FAILED.
When we see a p=0.05, but we don't know that this SAME EXACT EXPERIMENT has been run 20 times before, we're really screwing ourselves over.
So I agree with the title "We Should Not Accept Scientific Results That Have Not Been Repeated". But I would add to it "We Should Not Accept Scientific Results from Studies That Weren't Preregistered". Registration of studies forces negative results to be made public, allowing for the positive result rate / replication rate to be calculated.
Otherwise the existence of a "positive" result is more a function of the trendiness of a research area than it is of the properties of the underlying system being studied.
One part of science is observation. Including observations which cannot be, or at least have not been, repeated. For example, consider a rare event in astronomy which has only been detected once. Is that science? I say it is. But it's surely not repeatable. (Even if something like it is detected in the future, is it really a "repeat"?)
Some experiments are immoral to repeat. For example, in a drug trial you may find that 95% survive with a given treatment, while only 5% survive with the placebo. (Think of the first uses of penicillin as a real-world example.)
Who among you is going to argue that someone else needs to repeat that experiment before we regard it as a proper scientific result?
First off, you can accept the observation at face value as an observation, but conclusions drawn from it that have no other support or means of verification should not, and would not, be accepted. Fortunately, even when something is initially sparked by a very rare occurrence, most of the time it will have implications that are verifiable by some means other than just waiting for something to happen in space.
But even for something rare that relies on observation, like gravitational waves, we have already been able to identify more than one occurrence.
> Some experiments are immoral to repeat. For example, in a drug trial you may find that 95% survive with a given treatment, while only 5% survive with the placebo.
What's more immoral: releasing to the public a drug that's only had one test, even a striking one, as a miracle cure you have not truly verified, or performing another test to actually be sure of your claims before you release it?
> Who among you is going to argue that someone else needs to repeat that experiment before we regard it as a proper scientific result?
That's how science works. If something is not independently repeatable and verifiable, then science breaks down. Look at the recent EM drive: most scientists in the field were skeptical of it, and once independent verification was finally attempted, the problems were found.
Independent verification is the cornerstone of science and what makes it different from bogus claims by charlatans.
I disagree. In all cases, even with repeated experiments, the claims are only tentatively accepted. The confirmation by others of Blondlot's N-rays didn't mean they were real, only that stronger evidence would be needed to disprove the conclusions of the earlier observations.
Astronomy papers make conclusions based on rare or even singular observations. Take SN1987a as an example, where observations from a neutrino detector were used to put an upper limit on the neutrino mass, and establish other results.
> "or performing another test"
This question is all about repeating an experiment. Repeating the experiment would be immoral.
There are certainly other tests which can confirm the effectiveness, without repeating the original experiment and without being immoral. For the signal strength I gave, we can compare the treated population to the untreated population using epidemiological studies.
But under current medical practices, if a drug trial saw this sort of effectiveness, the trial would be stopped and everyone in the trial offered the treatment. To do otherwise is immoral. As would repeating the same trial.
Then perhaps current medical practices should change. The benefits to those who were previously given the placebo should be balanced against the probability that the observed outcomes may not occur in other circumstances.
Down that path lies atrocities. The system was put into place to prevent repeats of horrors like the "Tuskegee Study of Untreated Syphilis in the Negro Male".
You chose to not verify, and insist upon repeating, thus likely consigning people to unneeded pain and even death.
I'll give a real-world example to be more clear cut about modern ethics and science. Ever hear of TGN1412? https://en.wikipedia.org/wiki/TGN1412
It went into early human trials, and very quickly caused a reaction. "After very first infusion of a dose 500 times smaller than that found safe in animal studies, all six human volunteers faced life-threatening conditions involving multiorgan failure for which they were moved to intensive care unit." (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2964774/ )
Here's a publication of the effects: http://www.nejm.org/doi/full/10.1056/NEJMoa063842 .
Is it moral to reproduce that experiment? I say it is not moral, and must not be repeated even though it is possible to do so.
Can a publication about the effects still be good science even though medical ethics prevent us from repeating the experiment? Absolutely.
What say you?
But who didn't believe the original result? And does the same experiment observing multiple occurrences really count as 'reproducibility'?
We've only had ONE observed local group supernova within the past several hundred years (and that was within our lifetime, thankfully with several relevant detectors up and running). Should we ignore any result or conclusions from this instance?
No. If the data from the next supernova disagrees or reshapes the field - and it probably will, given the huge amounts of resources dedicated to studying it (see e.g. http://snews.bnl.gov/) - that will just be evidence of scientific progress: reshaping your position based on experimental data.
Again, I think there is a certain amount of crosstalk with people who say that the "Entire community of scientists" has a problem, whilst actually meaning specific fields. Perhaps an ironic imprecision.
The general idea is that for research to be accepted, it must make some novel, albeit small, impact on the field, acceptable for publication in a peer-reviewed journal or proceedings. Repeating someone else's experiments won't get you that, so in general it won't help you graduate or move toward a higher position at a university or in your profession, meaning there is very little motivation for researchers to pursue such endeavors.
So instead of just throwing money at the problem, we may need to entirely revamp how we recognize the pursuits of researchers.
And then people have the nerve to say, "Last week chocolate was bad for me, now it's good? Make up your mind!" No, stop listening to un-replicated studies! Jeez.
I've lost count of how many 'battery breakthrough' articles I've come across, but they seem to pass the newsworthy test.
A) Pre-registering the study design, including the statistical analysis. Otherwise, attaching a big label "Exploratory! Additional confirmation needed!"
B) Properly powering the study. That means gathering a sample large enough that the chances of a false negative aren't just a coin flip.
C) Making the data and analysis (scripts, etc.) publicly available where possible. It's truly astounding that this is not a best practice everywhere.
D) Making the analysis reproducible without black magic. That includes C) as well as a more complete methods section and more automation of the analysis (one can call it automation but I see it more as reproducibility).
Replication of the entire study is great, but it's also inefficient in the case of a perfect replication (the goal). Two identical and independent experiments will have both a higher false negative and false positive rate than a single experiment with twice the sample size. Additionally, it's unclear how to evaluate them in the case of conflicting results (unless one does a proper meta-analysis--but then why not just have a bigger single experiment?).
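That trade-off can be checked with a back-of-the-envelope calculation. The sketch below uses a normal approximation, an assumed effect size of d = 0.4, and an "at least one of the two experiments is significant" decision rule; all three are my assumptions, and other decision rules give different answers:

```python
from math import sqrt
from scipy.stats import norm

ALPHA, D = 0.05, 0.4          # significance level; assumed true effect size
Z = norm.ppf(1 - ALPHA / 2)   # two-sided critical value

def power(n_per_group):
    """Approximate power of a two-sample z-test for effect size D."""
    return norm.cdf(D * sqrt(n_per_group / 2) - Z)

n = 50
# One experiment with twice the sample size:
fnr_one = 1 - power(2 * n)         # false negative rate, ~0.19
fpr_one = ALPHA                    # false positive rate, 0.05
# Two independent n-sized experiments, counting "at least one
# significant" as a positive finding:
fnr_two = (1 - power(n)) ** 2      # both runs must miss, ~0.23
fpr_two = 1 - (1 - ALPHA) ** 2     # either run can fire, ~0.0975

print(fnr_two > fnr_one, fpr_two > fpr_one)  # True True
```

Note that under a stricter rule (both experiments must be significant) the false positive rate instead drops well below α while the false negative rate rises further, so the comparison really does hinge on how conflicting results are scored.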
As an outsider looking in on the Scientific process, I am not really sure how applicable my opinions are, but I see these as useful changes.
Basically, in reverse order, my suggestions for science to adopt are as follows:
Papers in databases need fields related to reproduction studies, and reproducibility needs to become a point of pride in the scientific process. Just as there is a lot of pride (and money) in publishing, researchers should start to thump their chests over the reproducibility of their work, actively seeking out contemporaries, requesting a reproduction study as part of the publishing process, and subsequently updating the record.
The papers themselves should take a moment (perhaps no more than a paragraph) to include a "for media" section that outlines the do's and don'ts of reporting on the research. For example, cancer research should clearly state acceptable lay-person phrasings as a catch for sloppy reporting: something like "Do not write 'cure for cancer found' or 'effective treatment'; instead write 'progress made'." Basically, put a sucker punch to outlandish headlines and reporting right in the paper itself, and let journalists who want to be sensationalist embarrass themselves.
These seem like two very simple acts that could raise the bar for science a bit.
Of course, they probably do know this and just choose to ignore it because "Unverified Study that MIGHT Point to M&M's Being Good For You" won't get as many clicks as "M&M's Are Good For You Says New Study!"
It's not so much checking for the public's benefit; it's for other scientists.
Peer review is supposed to do this, but the fact that peer reviewers are often colleagues leads to collusion, whether intended or not.
Maybe we need a separate body of scientists whose sole job—and whose entire prestige—derives from taking down and retracting bad science.
> First, scientists would need to be incentivized to perform replication studies, through recognition and career advancement. Second, a database of replication studies would need to be curated by the scientific community. Third, mathematical derivations of replication-based metrics would need to be developed and tested. Fourth, the new metrics would need to be integrated into the scientific process without disrupting its flow.
Yes, absolutely those things need to happen, but the problem is how to get this funded, how to get people to not see reproducing results as career suicide, right? Items 2-4 will fall out as soon as item #1 happens.
How do we make item #1 happen? What things could be done to make reproducing results actually an attractive activity to scientists?
I'd say the goal that gets credited should not be merely reproducing the results, but finding errors in the previous research. That would count as novel, and is something that is presently recognized as contribution. The only problem is that journals or conferences treat it as unattractive, so good luck publishing something of the kind...
Only if you assume the incentives for the 2nd, 3rd, 4th, etc. reproduction experiments remain the same, right? I wouldn't assume that, both because the first reproduction is the most valuable, and for the reasons Ahmed discussed in the article - that scientists are motivated by their perceived ability to do something novel. So first reproduction might be novel, but the fifth would certainly be less valuable, so I wouldn't personally assume we'd get a flood of useless experiments.
> I'd say the goal that gets credited should not be merely reproducing the results, but finding errors in the previous research
Reproducing an experiment is meant to, without prejudice, either confirm or deny the previous research. It's not meant to confirm the previous results, it is meant to ask whether there could be errors in the research, but without assuming there are errors.
It is novel to validate a result the first time, whether it's positive or negative, and for this incentive system to work, it has to appeal to people who might not find something dramatic or contradictory. It must be appealing to do the work, regardless of the outcome, or it's not an incentive at all.
For example, much of computer "science" is not. Math maybe, engineering probably, design sometimes, but "science" is rarely done. BUT the science envy is there, especially post 1990s, and it is as confusing as heck when multiple definitions of "science" collide in a conference culture.
Yes I'm a researcher, no I'm not a scientist.
Those two aren't the same, and I think far too many think that the point is the latter when, imho, it's actually the former. Pure screwups will likely get found out, just like glaring bugs are usually found. It's when your result actually has a huge variance but you're looking at only one (or a few) samples and draw conclusions from it that's insidious, like the fact that it's the bugs that just change the output by a tiny bit that are the hardest to notice.
- RDF is the best way to do this. RDF can be represented as RDFa (RDF in HTML) and as JSON-LD (JSON LinkedData).
... "#LinkedReproducibility"
It isn't/wouldn't be sufficient to say, with one triple, (example.org/studyX, 'reproduces', example.org/studyY); instead there should be a reified relation (an EdgeClass) containing metadata like who asserts that studyX reproduces studyY, when they assert it, and why (similar controls, similar outcome).
Today, we have to compare PDFs of studies and dig through them for links to the actual datasets from which the summary statistics were derived; so specifying who is asserting that studyX reproduces studyY is very relevant.
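As a sketch of what such a reified assertion could look like in JSON-LD, built here with plain Python dicts (the `repr:` vocabulary, URIs, and property names are invented for illustration; no #LinkedReproducibility schema actually exists yet, and only `@context`/`@id`/`@type` are real JSON-LD keywords):

```python
import json

# Hypothetical reified reproduction assertion: instead of a bare
# (studyX, reproduces, studyY) triple, the edge itself is a node
# carrying provenance metadata (who asserts it, when, and why).
assertion = {
    "@context": {"repr": "https://example.org/reproducibility#"},
    "@id": "https://example.org/assertions/42",
    "@type": "repr:ReproductionAssertion",
    "repr:subjectStudy": {"@id": "https://example.org/studyX"},
    "repr:reproduces": {"@id": "https://example.org/studyY"},
    "repr:assertedBy": {"@id": "https://example.org/people/some-researcher"},
    "repr:assertedOn": "2016-05-01",
    "repr:basis": ["similar controls", "similar outcome"],
}
print(json.dumps(assertion, indent=2))
```

A #StudyGraph would then be a collection of such nodes, queryable by who asserted what and on what basis, rather than a pile of PDFs.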
Ideally, it should be possible to publish a study with structured premises which lead to a conclusion (probably with formats like RDFa and JSON-LD, and a comprehensive schema for logical argumentation which does not yet exist).
Most simply, we should be able to say "the study control type URIs match", "the tabular column URIs match", "the samples were representative", and the identified relations were sufficiently within tolerances to say that studyX reproduces studyY.
Doing so in prosaic, parenthetical two-column PDFs is wasteful and shortsighted.
An individual researcher then, builds a set of beliefs about relations between factors in the world from a graph of studies ("#StudyGraph") with various quantitative and qualitative metadata attributes.
As fields, we would then expect our aggregate #StudyGraphs to indicate which relations between dependent and independent variables are relevant to prediction and actionable decision making (e.g. policy, research funding).
Probabilities, for example, are not applicable to partially observed, guessed, and modeled phenomena. Applying them there should be a type error.
As for math - existence of a concept as a mathematical abstraction does not imply its existence outside the realms of so-called collective consciousness. Projecting mathematical concepts onto physical phenomena which could not be observed is a way to create chimeras and to get lost in them.
Read some Hegel to see how it works.)