I wish science were that simple. The methods section only contains the variables the authors think are worth controlling, and in reality you never know the full set, and neither do the authors.
Secondly, I wish people would say "I replicated the methods and got a solid negative result" instead of "I can't replicate this experiment". Because most of the time, when you are doing an experiment you have never done before, you just fuck it up.
Here is an example: we are studying memory using mice. Mice don't remember that well if they are anxious. Here are variables we have to take care of to keep the mice happy, but they are never going to make it into the methods section:
Make sure the animal facility hasn't cleaned their cages recently.
But make sure the cage is otherwise relatively clean.
Make sure they don't fight each other.
Make sure the (usually false) fire alarm hasn't sounded in the last 24 hours.
Make sure the guy who was installing a microscope upstairs has finished making noise.
Make sure there are no unrelated people talking/laughing loudly outside the behaviour space.
Make sure the finicky equipment works.
Make sure the animals love you.
The list can go on.
Because if any one of these happens, the animals are anxious, then they don't remember, and you get a negative result that has nothing to do with your experiment (although you may not notice this). That's why if your lab is just starting to do something you haven't done for years, you fail. And why replicating other people's experiments is hard.
A positive control would help to make sure your negative result is real, but for some experiments a good positive control can be a luxury.
Guys, the bio side of things is incredibly complicated, and trying to set controls that are achievable in your time and budget is the heart of these fields. The thing you are trying to study in the bio side of science is actually alive and trying to study you right on back. If you are going to kill the thingys, they really do want to kill you too. Look at this diagram of mitochondrial phospholipid gene/protein interactions (https://www.frontiersin.org/files/Articles/128067/fgene-06-0...). That is very complicated, and that is for one of the best-studied organelles in animal cells. There is an uncountable and evolving set of other proteins that then interact with that diagram in different ways depending on the cell type, species, and developmental history of the organelle (to start with). All of which are totally unknown to you and will likely forever remain unknown to you up until your death. Hell, we are still figuring out the shapes of organs in our own bodies. People that have studied areas for decades, spending untold millions of dollars on some of the most central questions of life, have essentially nothing to say for themselves and the money spent ('We dream because we get sleepy'). Trying to tease out a system that has been evolving for 4.5 billion years and makes a new generation (on average) every 20 minutes is going to be just insanely difficult.
This stuff is hard, the fields know it, and we don't believe anything that someone else says, let alone what our 'facts' and experiments tell us. But we soldier on because we love it and because we want to help the world.
Outsiders think of journals as pre-packaged nuggets of science fact, with conclusions that you can read and trust.
Scientists who publish in the journals view them as a way to communicate a body of work: "I did this, here is the result I saw. I think it might mean this."
The difference between these two views cannot be overstated. Each group wants the literature to be useful to them, and that's understandable.
For most areas of science, especially in the early days of that science, it's absolutely essential that the scientists' view be allowed to persist. Is it better to share early, or to wait to publish until you've tried all the possible things that could go wrong? I think the answer is obviously that you share the early data, and what you think it means, even if you may be wrong about it.
If the goal is to advance knowledge as quickly as possible, I think the scientists' view is probably a better model of what a journal needs to be than the outsiders' view. In some fields, like fMRI studies, the field is realizing that they may need to go about things differently. And that means that a lot of the interpretations that were published earlier are incorrect. But that process of moving from incorrect interpretation to corrected interpretation is an essential part of science.
The old system of "gentleman science" (where only independently-wealthy heirs did science, as a hobby) had a lot of problems and wouldn't really be workable today, but one thing it had going for it was that the scientist could count on the funding being there tomorrow, and so had much less of an incentive to overstate their work.
The scientific literature is an ongoing conversation anchored by rigorous experimental facts and data. But rigorous doesn't mean it's clean like a mathematical proof. In fact, most science approaches its "proofs" in quite a different way than math. For example, as far as science is concerned, P != NP in complexity theory. We've done the experiment many times, tried different things, and it's pretty much true. But it's still not mathematically proven because there isn't a formal proof.
That's not to say it's invalid to expect more rigor, or that we wouldn't all love to have "chains of proof" and databases and signatures for data, etc. It's just that it's simply not practical given how noisy and complex biological systems are. In contrast to math, you pretty much never know the full complement of objects/chemicals/parameters in your experimental space. You try to do the right controls to eliminate the confounding variables, but you're still never fully in control of all the knobs and switches in your system. That's why you usually need multiple different experiments tackling a problem from multiple different approaches for a result to be convincing.
Formalized systems would be great, but I don't think we're even close to understanding how to properly formalize all of those difficulties and variables in a useful way. And it may not even be possible.
Such scripts would be a step on the path to formalising methods. They'd help those who just want to see the same results; those who want to perform the same analysis using some different data; those who want to investigate the methods used, looking for errors or maybe doing a survey of the use of statistics in the field; those who want a baseline from which to more thoroughly reproduce the experiment/effect; etc.
The parent's list of mouse-frighteners reminds me of the push for checklists in surgery, to prevent things like equipment being left inside patients. Whilst such lists are too verbose for a methods section (it would suffice to say e.g. "Care was taken to ensure the animals were relaxed."), there's no reason the analysis scripts can't prompt the user for such ad hoc conditions, e.g. "Measurements should be taken from relaxed animals. Did any alarms sound in the previous 24 hours? y/N/?", "Were the enclosures relatively clean? Y/n/?", "Were the enclosures cleaned out in the previous 24 hours? y/N/?", etc. with output messages like "Warning: Your answers indicate that the animals may not have been relaxed during measurements. If the following results aren't satisfactory, consider ..." or "Based on your answers, the animals appear to be in a relaxed state. If you discover this was not the case, we would appreciate if you update the file 'checklist.json' and send your changes to 'experimentABC@some-curator.org'. More detailed instructions can be found in the file 'CONTRIBUTING.txt'"
It's not particularly onerous, considering the sorts of things many scientists already go through, e.g. regarding contamination, safety, reducing error, etc.
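For concreteness, here's a minimal sketch of what such a prompting script could look like (Python; the questions and the 'checklist.json' output mirror the examples above, everything else is made up and would obviously be adapted per lab):

    # Minimal sketch of a pre-analysis checklist prompt; the question set and
    # output file name are illustrative, not a real lab's checklist.
    import json

    QUESTIONS = [
        ("alarms_last_24h", "Did any alarms sound in the previous 24 hours? [y/N/?] ", "n"),
        ("cages_clean", "Were the enclosures relatively clean? [Y/n/?] ", "y"),
        ("cages_cleaned_last_24h", "Were the enclosures cleaned out in the previous 24 hours? [y/N/?] ", "n"),
    ]

    def ask(prompt, default):
        answer = input(prompt).strip().lower()
        return answer if answer in ("y", "n", "?") else default

    def main():
        answers = {key: ask(prompt, default) for key, prompt, default in QUESTIONS}
        relaxed = (answers["alarms_last_24h"] == "n"
                   and answers["cages_clean"] == "y"
                   and answers["cages_cleaned_last_24h"] == "n")
        if relaxed:
            print("Based on your answers, the animals appear to have been relaxed.")
        else:
            print("Warning: your answers indicate the animals may not have been "
                  "relaxed during measurements; interpret the results accordingly.")
        with open("checklist.json", "w") as f:  # archived alongside the data
            json.dump(answers, f, indent=2)

    if __name__ == "__main__":
        main()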
> Yes, you made some checklist, great. But no other lab is going to go through all of that. And in your field, if you are very lucky, you may have just 1 other lab doing anything like what you are doing. It would be a checklist just for yourself/lab, so why bother recording any of it?
Why bother writing any methods section? Why bother writing in lab books? I wasn't suggesting "do all of these things"; rather "these are factors which could influence the result; try controlling them if possible".
> Yes, do it, fine, but how long should you store those records that will never be seen, even by yourself?
They would be part of the published scientific record, with a DOI cited by the subsequent papers; presumably stored in the same archive as the data, and hence subject to the same storage practices. That's assuming your data is already being published to repositories for long-term archive; if not, that's a more glaring problem to fix first, not least because some funding agencies are starting to require it.
> Why in god's name would you waste those hours/days just going over recordings of you watching a mouse/cell/thingy to make sure of some uncountable number of little things did/did not happen?
I don't know what you mean by this. A checklist is something to follow as you're performing the steps. If it's being filled in afterwards, there should be a "don't know" option (which I indicated with "?") for when the answers aren't to hand.
> a specification for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments. CWL is designed to meet the needs of data-intensive science, such as Bioinformatics, Medical Imaging, Astronomy, Physics, and Chemistry.
Even in machine learning, how difficult would it be to get that field to adopt a unified experiment-running system? It sounds like a huge engineering project that would have to adapt to all sorts of computational systems. All sorts of batch systems, all sorts of Hadoop or Hadoop-like systems. And that's going to be far easier than handling wet lab stuff.
I think that the lack of something like this in ML shows that there's enough overhead that it would impede day-to-day working conditions. Or maybe it just hasn't been invented yet in the right form. There are loads and loads of workflow systems for batch computation, but I've never encountered one that I like.
In genomics, one of the more popular tools for that is called Galaxy. But even here, I would argue that the ML community is much better situated to develop and enforce use of such a system than genomics.
Maybe a field that's less dependent on resources would be a better fit. An example I'm familiar with is work on programming languages: typechecking a new logic on some tricky examples is something that should work on basically any machine; benchmarking a compiler optimisation may be trickier to reproduce in a portable way, but as long as it's spitting out comparison charts it doesn't really matter if the speedups differ across different hardware architectures.
When the use of computers is purely an administrative thing, e.g. filling out spreadsheets, drawing figures and rendering LaTeX (e.g. for some medical study), there's no compelling reason to avoid scripting the whole thing and keeping it in git.
However, it's not a simple thing that automatically combines different studies. It takes skilled application to understand how data connects, what's comparable, what's not, etc. Traditionally, "meta-analysis" is the sub-field of statistics that combines studies. But that only combines extremely simple studies, such as highly controlled clinical trials. It's inappropriate for the complex kind of data that appears in a typical molecular biology paper, which is a chain of many different types of experimental setups.
Someone who doesn't know the body of statistics trying to reason about its application is a lot like an MBA trying to reason about software architecture. The devil is in the details, and the details are absolutely 100% important when applying statistics to data.
I knew of a student that graduated with only 8 cells of data out of his 7 years in grad school. That may sound like a small amount, but to his committee, it was a very impressive number. Very (very) basically: he sliced up adult rodent brains and then used super tiny glass pipettes to poke the insides of certain cells. He chemically altered these cells' insides in a hopefully intact network of neurons, shocked the cells, and then recorded the activity of other cells in the network using the same techniques. Then he preserved and stained the little brain slice so he could confirm his results anatomically. From start to finish, each attempt took him 13 hours, no lunch or restroom breaks, every day, for 7 years. He got 8 confirmable cells' worth of recordings total.
That is a hard experiment. But thanks to the evidence he added, we now suspect that most adult hearing loss is due not to loss of cells in the ear, but to a mismatch in the coordination and timing of signals to the brain. It is not much, but it adds to the evidence and will surely help people someday.
To add, he is now a beer brewer in Bavaria and quit science. This shit takes sacrifice man.
In medicine, case studies are often used for low n issues. There are too many variables for meaningful statistics to be pulled out, but a "this patient had X, we did Y, Z happened" is still a way to pass on observational information. It's recognised that case studies aren't ideal, but it's still better than not passing on information at all.
I always find it amusing when physicists talk about how mind-blowing it is that at the quantum level, things aren't entirely predictable. Over in biology, that's the starting point for everything rather than the final frontier - you don't need ridiculously expensive tools to get to the point where you're finding unpredictable stuff.
laser pointer plus a sharp enough carbide wheel
(trained as a chemist, grad school in biostats & genetics, now mostly design experiments & clinical trials... I have physics envy, except I don't envy their funding models!)
I don't think it's entirely accurate to say that conditional editing of DNA is a new thing. The ready accessibility and combinatorial possibilities, yes, but for targeted conditional knockouts, floxing mice has been a thing for about 20 years now.
I spent 7 years doing experimental biology (from bacteria to monkeys), and trying to replicate someone else's techniques from their papers was always a complete nightmare. Every experimentalist I talk to about this relates the same experience -- sometimes unprompted. Senior faculty tell a slightly different story, that they can't interpret the data of someone who has left the lab, but it is the same issue. We must address this, we have no choice; we cannot continue as we have for the past 70+ years. The apprenticeship system does not scale for producing communicable/replicable results (though it is still the best way to train someone to actually do something).
EDIT: An addendum. This stuff is hard even when you assume that all science is done in good faith. That said, malicious or fraudulent behaviour is much harder to hide when you have symbolic documentation/specification of what you claim you are doing, especially if they are integrated with data acquisition systems that sign their outputs. Still many ways around this, but post hoc tampering is hard if you publish a stream of git commit hashes publicly.
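For what it's worth, the acquisition end of that doesn't need much machinery. A rough sketch (Python; the file and log names are hypothetical): hash each data file as it is acquired and commit the hash immediately, so that a publicly posted stream of commit hashes makes post hoc tampering awkward.

    # Rough sketch: append a SHA-256 digest of each new data file to a log and
    # commit it right away; publishing the commit hashes makes later tampering
    # evident. Assumes the working directory is already a git repository.
    import hashlib, subprocess, sys

    def sha256(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        return h.hexdigest()

    def record(path, log="acquisition_hashes.txt"):
        digest = sha256(path)
        with open(log, "a") as f:
            f.write(f"{digest}  {path}\n")
        subprocess.run(["git", "add", log], check=True)
        subprocess.run(["git", "commit", "-m", f"acquired {path}"], check=True)
        return digest

    if __name__ == "__main__":
        for data_file in sys.argv[1:]:
            print(record(data_file))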
Wouldn't it be better to use something like a cloud biology model - where you define experiments via code, CROs compete on consistency (and efficiency and automation), and since they probably do a much larger volume of experiments than a regular lab, they would have stronger incentives to develop better processes and technologies?
Between automating labor, economies of scale in purchasing, access to more efficient technology (like acoustic liquid handling), etc. - isn't it just a matter of time before cloud biology becomes quite cost-effective and, combined with other benefits, the only way that makes sense to do research, so funding will naturally go there?
Also - do you see a way to bring the extreme versatility of the biology lab into a cloud service?
> do you see a way to add the extreme versatility of the biology lab into a cloud service
We let you run any assay that can be executed on the set of devices we have in our automated lab. So in that sense, yes, it's very flexible. Also, there's no need to run your entire workflow in the cloud. You can do some at home, some in the cloud. Some people even string together multiple cloud services into a workflow. See https://www.youtube.com/watch?v=bIQ-fi3KoDg&t=1682s
That being said, biology labs can be crazy places. Part of what we do is put constraints on what can be encoded in each protocol to reduce the number of hidden variables. Every parameter that counts must be encoded in the protocol, because once you hit "go" on the protocol, it could possibly run on any number of different devices each time it runs. The only constant is that the exact instructions specified in the protocol will be run on the correct device set.
2. Maybe not better, but certainly more result-oriented. Core facilities do exist right now for things like viral vectors and microscopy (often because you do need levels of technical expertise that are simply not affordable in single labs). If there were a way to communicate how to do experiments more formally, then the core facilities could expand to cover a much wider array of experiment types. You still have to worry about robustness, but if you have multiple 'core' facilities that can execute any experiment then that issue goes away as well. The hope of course is that individual labs as they exist today (perhaps with an additional computational tint) would be able to actually replicate each other's results, because we will probably end up needing nearly as many 'core' facilities as we have labs right now, simply because the diversity of phenomena that we need to study in biology is so high.
Just a little metadata would help: Experiment A, Phase N, Day X
Thanks for sharing your knowledge and experience in this discussion, by the way. It's what makes HN great.
That's what I meant. Just stick some cameras in the ceiling (or wherever is best) and capture what you can. It seems cheap and better than nothing, but I know nothing about biological research.
I would love to hear more about your work, and the strategies you propose to improve the reproducibility of scientific experiments. My email can be found in my user description.
Friend of mine has been experimenting with wrapping it all up in Docker containers! :-)
Maybe the format needs to change. Perhaps journals should require video, audio commentary or automated note taking for publication.
While at the time I was pretty upset with him, perhaps it's the competitive nature of science and funding that also gives people a mild incentive to be secretive.
In this case, simply publishing code would have resolved the questions.
That's not science, that's bullshit. Can you please expose that? Scientists shouldn't simply get away with such malicious behavior.
People could also publish plain text data as supplementary material, but why do that when you can get away with a raster image of a plot...
I felt my results were solid enough (and that my skill at producing more results was good enough) that this wouldn't hurt me. This was just how things worked in my field (physics).
Interestingly, my experimental colleagues rarely felt that they couldn't reproduce results they saw in a paper.
Any real scientist ought to recognize that secrecy as a strategy is terrible for science as a whole even if it is very temporarily good for them.
This is especially common for final master's degree projects, since the students don't get paid by the university at all, so many try to find a company to sponsor them. But the students still need the university to publish the report for them to get their final degree, so you get this conflict of interest again. Most of these reports are written with the end goal of getting a degree, not of producing solid research. This really needs the stricter universities to stop letting all that crap through; for now, such reports shouldn't be trusted the same way as proper research papers.
Edit: This is additional context for the commenters below.
I'm actually surprised to hear of high school teaching good scientific practice. I don't remember ever being taught that. Widely may it spread.
As you say - it's not a comforting lie, it's a command.
I'm not saying you're wrong about the incentives - scientists are often incentivized toward secrecy - but I deny that we have to follow such incentives.
Am I worried about getting scooped on the results of a major, multi-year cohort study that would take thousands of dollars to replicate? Nope.
Am I worried about being scooped on the math modeling study that is perfectly reproducible based on two slides from a presentation? Hell yes.
There has been some work for archiving code with some journals and some allow video uploads as well for segments from the actual experiments.
A 'world view' column in Nature suggested the same thing last week; the author described a paper of theirs:
> Yes, visual evidence can be faked, but a few simple safeguards should be enough to prevent that. Take a typical experiment in my field: using a tank of flowing water to expose fish to environmental perturbations and looking for shifts in behaviour. It is trivial to set up a camera, and equally simple to begin each recorded exposure with a note that details, for example, the trial number and treatment history of the organism. (Think of how film directors use clapper boards to keep records of the sequence of numerous takes.) This simple measure would make it much more difficult to fabricate data and ‘assign’ animals to desired treatment groups after the results are known.
Most experiments run for years (literally) and no one is going to record or archive, let alone watch, years of footage to confirm that one paper is legit.
A brief experiment showing the apparatus and the collection of a few data points might be helpful for understanding the paper, but I can't see using it to verify a non-trivial experiment.
Recording and storing years of footage shouldn't be a significant problem with modern tech.
Nobody has to watch years of it; they can watch the parts they are interested in. They also can watch at 4x and search, as needed.
Sure, but a lot of information does appear. Not everything appears in papers; should we stop publishing them?
Why not? Just setup a permanent camera and go. Storage is cheap. Organizations with security cameras often have multiple streams to record.
That would help prevent such "hidden specs" from entering the experiment.
 Note: this implies that authors cannot be part of the experiment or conduct it themselves, because they can't "pass on their identity" in a research paper, as would be necessary to put readers on par with the technicians.
Based on your description, it actually sounds like it's making you do things exactly the way science is supposed to work! You quickly identify issues of "real effect or experimenter error or undocumented protocol?" -- and you prevent any ambiguous case from "infecting" the literature.
Those are the same objectives modern science is currently failing at with its "publish but never replicate" incentives.
> Lastly, you can tell a lot by being hands-on with a study; there's a lot you can miss if you aren't in the room with the study.
I wasn't saying that you can't be there in the lab and do that kind of experimentation, just that scientists shouldn't represent this kind of ad hoc work as the repeatable part that merits a scientific conclusion. The process should be: if you find something interesting that way, see if you can identify a repeatable, articulable procedure by which others can see the same thing, and publish that.
E.g., looking from the perspective of a scientist-reader, if someone has spent a few months doing ad hoc, hands-on experiments and achieved interesting observations, then I would want them to publish that research now, instead of spending another half a year developing the repeatable procedure, or possibly never publishing it because they'd rather do something else than bring it up to those standards. There is a benefit to working out all the tricks that make the experiment cleaner, but the clean experiment is just a means of acquiring knowledge and ideas for further research, not an end goal in itself for a researcher in that area. Outsiders may have different interests, though (https://news.ycombinator.com/item?id=13716233 goes into detail), preferring "finalized and finished research", but the actual community (who in the end does all the work and evaluates/judges its peers) would generally prefer work in progress to be published as-is rather than more thorough results arriving later and in smaller numbers.
Outsiders can get the repeatable procedures when it's literally textbook knowledge, packaged into assignments that can be given to students as lab exercises (since that's the level of detail truly repeatable procedures would require). Insiders want the bleeding-edge results now, so they design their journals and conferences to exchange exactly that.
OTOH, it's a problem if people are basing real-world decisions on stuff that hasn't reached textbook level certainty. That's pretty much what happened with dietary advice and sugars. "Two scratchpads say a high-carb low-fat diet is good? Okay, then plaster it all over the public schools."
For a recent personal example, a company published a paper saying that if you give a pre-treat with 2 doses of a particular drug, you can avoid some genetic markers of inflammation that are in the bloodstream and kidneys. Well, I looked at the stimulus and ordered some of my own from a different manufacturer that was easier to obtain and gave it to some mice with and without pretreatment by their compound. Instead of looking at the genes they looked at, I looked at an uptick in a protein expected to be one step removed from the genes they showed a change in. Well, I haven't exactly replicated their study, but I've replicated the core points: stimulus with the same cytokine gives a response in a particular pathway, and it either is or isn't mitigated by the drug or class of drugs they showed. Now, my study took 2 days less than theirs, but it worked well enough that I don't need to fret the particular details I did differently from them. If my study didn't work, I could either decide that the study isn't important to me if it didn't work my way, or go back a step and try to match their exact reagents and methods.
So yes, I do think the news industry picks up stuff too quickly sometimes, but depending on the outlet, they tend to couch things in appropriate wiggle words (may show, might prove, could lead to, add evidence, etc).
It's like the stages of clinical research - there we have general standards on when it is considered acceptable to use findings for actually treating people (e.g. phase 3 studies), but it's obviously clear that we need to publish and discuss the initial findings since that's required to actually get to the phase 3 studies. However, the effects seen in phase 1 studies often won't generalize to something that actually works in clinical practice, so if the general public reads them, they'll often assume predictions that won't come true.
Though I should say that I really like JOVE. You can learn a lot.
You really have to try it yourself before you can understand the degree of troubleshooting that's required of a good experimentalist. You could have scientists live-streaming all their work, and you'd still have the same issues you do now. Even a simple experiment, something most wouldn't blink an eye at, has dozens and dozens of variables that could influence the result. The combinatoric complexity is staggering. The reality is that you try things, find some that work, and then convince yourself that it's a real result with some further work.
The methods that endure are the ones that replicate well and work robustly. Molecular biology is still built on Sanger sequencing, electrophoresis, and blotting, all in the context of good crossing and genetics, because that's what works. Some of the genomic tools are starting to get there, I'd venture to say that RNAseq is reasonably standardized and robust at this point. Interpreting genomic data is another story...
What you're talking about is the authors not reporting variables they considered, and then didn't control for. These not being reported is not universal - for example, I very frequently report every variable considered, and make my code available, with comments about variable selection within.
What the parent post is talking about is "unmeasured confounding", or a Rumsfeldian "Unknown Unknown". If there is something that matters for your estimate, but you're unaware it exists, by definition you can neither report it nor control for it.
Another group was doing a metabolism experiment with caffeine and rats. The only meaningful result they got was the half-life of caffeine. The rats were incredibly animated regardless of whether they had been dosed with caffeine or not, and they basically got garbage for results as well.
I'm an analyst now, and I've noticed biology is a second-class science in the eyes of a lot of hiring managers when it comes to analytics. Not everyone knows how to use data in bio, but if you are a data-type person, you get so much experience working with the worst data imaginable. The pure math types aren't that great at experimental design, and the physics and chem people tend to be able to control most of their variables pretty easily.
I'll admit that bio people tend to be a bit weaker in math, but almost every real-world analysis situation I've been in has been pretty straightforward mathematically. Most of the time is spent getting the data to the point where it can be used.
We tried using a machine vision program written by a grad student at another college to count trees from old survey photos at one point, and it did not work well at all in anything with more trees than a park-like setting. We ended up just having a human circle all the trees they saw and I wrote a program to detect the circles. It worked much better. The program I created was based on something similar used to count bacteria colonies on petri dishes.
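Not the original program, but the circle-counting part is the kind of thing a Hough transform handles; a minimal OpenCV sketch (the file name and every parameter here are guesses that would need tuning to the actual scanned photos):

    # Minimal sketch: count hand-drawn circles on a scanned survey photo with a
    # Hough transform. Not the original program; thresholds/radii are guesses.
    import cv2

    def count_circles(image_path):
        img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
        img = cv2.medianBlur(img, 5)  # suppress film grain / scan noise
        circles = cv2.HoughCircles(
            img, cv2.HOUGH_GRADIENT, dp=1, minDist=20,
            param1=100, param2=30, minRadius=5, maxRadius=60)
        return 0 if circles is None else circles.shape[1]

    if __name__ == "__main__":
        print(count_circles("survey_photo_marked.png"))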
Mate I think that's the point of this whole thread that you're commenting in. And the tangential point to the article posted.
Science isn't some binary thing. You can do poor science, and you can do great science. Some variables are hard or impossible to control for. Some fields make this simpler than others. I'd say that as we've continually endeavored with the sciences we're probably better at it now than we've ever been before.
Synthetic conditions are absolutely critical to science. Typically, the more conditions you can specify in the experiment, the more reproducible it should be. Some of these are very difficult, and others in the thread have pointed out that some don't get labeled in the journals.
If we ran such experiments in the wild, completely outside of control, then we can never know what we're really observing. By controlling the environmental variables your observations gain meaning.
I mean by that measure, medicine is not a science either because we don't know most of the possible confounding variables. That doesn't mean that attempting to use the scientific method still isn't the correct choice.
That's not what I'm trying to say here at all. The point isn't about how amenable you are to experimental control. The point is that even when experimental control is easy to isolate out, like in the physics example above (which I don't think is true in all of physics, by the way), it's not free. You're trying to compensate for the lack of available statistical power to measure an effect in noisy data by cutting down on the noise in the data. But you're doing it by generating the data in an environment that doesn't exist outside of laboratory conditions. Writing off replication failure as not being a problem because lab conditions are difficult to reproduce misses this; if the findings are difficult to replicate in other conditions, that could indicate that the findings are more narrow in scope than the study suggests. As I pointed out downthread, for example, if all the rodents in an experiment on a drug are on the same diet, all the experiment proves (assuming it's otherwise well run) is that the drug works in combination with this diet. If the drug works independently of diet, then the findings on the drug are generalizable. If it doesn't, though, they aren't. And if you have 60 years of medical research based in part on studies with rodents who eat diets very differently than what rodents eat in the wild, or what people eat, then it raises all sorts of questions about the state of medical research. That doesn't mean that medicine isn't a real science, it just raises questions about how well it tells us what we think it's telling us.
In the wild, there are plenty of mice that aren't in the vicinity of rats.
One week, they're moving your growth chambers out into the hallway to work in the ceiling. The next, they had to cut power for 8 hours for maintenance, and by the way, it wasn't plugged into an outlet with emergency power. Oops.
Hell, they can't even keep the lab temperature steady. Solutions sitting on your bench will start to precipitate out.
So yeah, the issue is money. It's also planning; you never know what the needs of researchers are going to be in a few years.
In the end, nice facilities can certainly help with a lot, but they don't address the core issues of experimental variables and combinatoric complexity. The way you deal with this is skeptical peers that understand the methods, reliance on robust methods wherever possible, and independent methods to confirm results. Even with all this replication difficulty, it is quite possible to make compelling conclusions.
If you can't do science in that environment, then it's not scientific because you can't control all the variables.
Let's say you want to know where a protein localizes in a cell, of a given tissue, in both mutant and wild-type organisms.
* Immunolocalization. You develop antibodies to the protein of interest, fix and mount tissue, perfuse it with the antibody, and use a secondary antibody to make it detectable.
* Fluorescent tagging. You make a construct with your protein fused to a fluorescent protein. Usually this involves trying a few different tagging strategies until you find one that expresses well. Then you can make a stable transgenic, or try a transient assay. If it works, then you see some nice glowy confocal images. Be careful, though, as the tagging can affect protein localization.
* Fractionation. In some cases, you can get a rough idea by e.g. extracting nuclei and doing a simple Western blot to see where the protein shows up.
In the real world, you might start with the FP-tag and see that your protein is absent from the nucleus in your mutant. Which would be cool, and interesting in terms of figuring out its function. If that was presented as the only result, I'd reject the paper and think the authors are terrible investigators. I'd want to see, at least, some nuclei preps that detect the protein in the WT, and don't in the mutant. I'd love to see immunos, too, as FP often does mess up the localization.
You can take it even further and start doing deletions. You take the protein and crop bits out to see what happens. You should see stuff like removing the NLS makes it stop going to the nucleus. That's a good sanity check and a sign your methods are working. You can also try to mess with active sites, protein-protein interaction domains, etc. etc. All within a theoretical model of what you think is going on in the cell.
Ultimately, the difficulty of replication isn't that troublesome. An inability to do so is, but science has never been easy. That's what you sign up for, and that's why you need to read papers critically. You get a sense of distrust for data unless there's really solid evidence of something. And when you find that solid evidence, you get a big smile and a warm feeling in your nerd heart.
Pre-experiment mouse handling procedures as outlined in sections M.1.2 and M.2.4 of the Mouse Handling Procedure Standards Body Publication of 2014
Which document could have any number of sections detailing the finer points of variable control for the relevant experiment.
Nevertheless, these kinds of standardization documents are important. But only as long as deviations from these standards are not discounted when justified theoretically.
Think about the epistemological ramifications of this.
This is why trust is so important in social contexts, even in science.
This this this this this a million times this. If your experiment with lab rats needs to be done this finely to be replicated, is it really telling you anything about the real world? Because of course, we don't really care about the behavior of laboratory rats, right? Not in proportion to how often we do studies with laboratory rats versus other species of animals. We care because of what it can tell us about biology in general. And if the finding can't even generalize to "laboratory rats whose cages have been cleaned recently," does the study really say what it seems to?
Worse, you could get a false result the other way. What if your control group had all been spooked before they were measured a few times? That could make the control group seem worse than they are!
And this works fine if the drug you're studying works identically given all rodent diets. But you don't know that it does! You're not controlling for the variable of diet in the sense that you know the effect of the variable of interest across all possible diets. You know how the drug works conditional on one specific rodent diet. And if you're not putting what brand of rodent food you use, and that causes replication failures, that suggests that you don't understand the effect of the drug as well as you thought you did. And if you have an entire field of research that undergoes such replication failures often, it's fair to start wondering how much of what gets called "science" isn't the study of the natural world but the study of the very specific sets of conditions that happen in labs in American and European universities.
Would you expect the tests to be done, then again but with both rats hearing the person upstairs installing a new microscope, then with a regular fire alarm, etc?
Describing the food given is quite a step away from describing how frequently the fire alarm goes off.
On the other hand, figuring out what variables to control is a huge part of science. Take the development of next-generation DNA sequencing technologies. People tried a ton of different variables, conditions, reagents, flow cells, etc. And failed and failed. But eventually they controlled the right conditions, optimized the right things, and now the process is done in thousands of labs every day as a routine tool. This is a technology development example, but the same could be said of the conditions needed to make stem cells.
You use independent methods and good controls to deal with this. This is nothing new, it's been the basis of experimental science for decades. If you're allowed to publish without doing this, the field has failed. Molecular biology, in particular, has flourished because of the effectiveness of genetic controls. Mutants are highly reproducible (you just send out seed, cultures, or live specimens), verifiable (sequencing), and combinatoric (crossing).
If adding a variable changes the actual effect being estimated, then it may indeed have been worth adding.
If it doesn't, it will just eat away at your precision.
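A tiny simulation sketch of that second case (numpy, all numbers invented): padding a regression with covariates that are pure noise leaves the estimated treatment effect essentially unchanged on average, but inflates its standard error.

    # Tiny simulation sketch (invented numbers): covariates that are pure noise
    # don't change the effect being estimated, they just eat away at precision
    # (the standard error of the treatment estimate grows).
    import numpy as np

    rng = np.random.default_rng(0)
    n, true_effect, n_sims = 40, 1.0, 500

    def treat_se(X, y):
        """OLS standard error of the treatment coefficient (column 1 of X)."""
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / (n - X.shape[1])
        return np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])

    se_lean, se_padded = [], []
    for _ in range(n_sims):
        treat = rng.integers(0, 2, size=n).astype(float)
        y = true_effect * treat + rng.normal(size=n)
        X_lean = np.column_stack([np.ones(n), treat])
        X_padded = np.column_stack([X_lean, rng.normal(size=(n, 15))])  # 15 junk covariates
        se_lean.append(treat_se(X_lean, y))
        se_padded.append(treat_se(X_padded, y))

    print("mean SE, lean model:   %.3f" % np.mean(se_lean))
    print("mean SE, padded model: %.3f" % np.mean(se_padded))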
Disclaimer: I did bench and animal work 21 years ago during undergrad, but haven't used this company. So I know the field and how difficult replication is, but I'm not sure what has happened in wet science since 1997.
"Cargo Cult Science"
> I looked into the subsequent history of this research. The subsequent experiment, and the one after that, never referred to Mr. Young. They never used any of his criteria of putting the corridor on sand, or being very careful. They just went right on running rats in the same old way, and paid no attention to the great discoveries of Mr. Young, and his papers are not referred to, because he didn’t discover anything about the rats. In fact, he discovered all the things you have to do to discover something about rats. But not paying attention to experiments like that is a characteristic of Cargo Cult Science.
Then I contact the author and they send me a protocol that is a 'distant cousin' to what I read in the published article.
This caused me to assiduously report methods when I wrote papers.
If you are protecting, not reporting information (e.g., about a new technology), why would you be incentivized to share it?
"Experiments are supposed to be replicable. The authors should have done it themselves before publication"
That seems to imply the experiments aren't even replicated by the people who are supposed to know how to get it right.
- sound proof/resistant, record audio levels.
- be hermetically sealed, temperature controlled and opaque to outside.
Given something as complex as a living animal, why aren't they treated as controlled as a chemical might be?
What a nauseatingly unnecessary comment. Boo
I would guess that the practical benefits are a big reason that they are the 'rules'.
Am I wrong for considering this not-quite-a-crisis? It has long been the case that initial studies on a topic fail to get reproduced. That's usually because if someone publishes an interesting result, other people do follow up studies with better controls. Sometimes those are to reproduce, sometimes the point is to test something that would follow from the original finding. But either way, people find out.
I mean, I guess the real problem is that lots of sloppy studies get published, and a lot of scientists are incentivized to write sloppy studies. But if you're actually a working scientist, you should understand that already and not take everything you see in a journal as actual truth, but as something that might be true.
This is how I normally look at things. If I can't easily replicate an experiment, then it's very likely to be wrong.
Sadly, it's pretty rare (and exciting) when you can easily do something based on the methods in someone's paper.
Unfortunately that is a problem. Imagine yourself trying to create a simulation of a biological system, which has to rely on experiments. You may come up with a plan, but every little line on the plan will be either very doubtful or outright false. The problem is that many of these doubts could be dispelled if the experiment was a lot more stringent (much larger sample size, much more controlled conditions etc). That would cost a hell lot more, but it would give you one answer you can rely on.
Why then do so many scientists in so many different fields insist "the science is settled"?
No, no, a thousand times no!
Most studies do not have follow up studies that confirm/refute the original. Often such a followup study is hard to publish. If you manage to reproduce it, you cannot publish unless it presents a new finding. If you fail to reproduce it, it often doesn't get published either. And no one writes grant applications that are for replication studies. The grant will likely go to someone else.
When I was in grad school, few advisors (engineering/physics) would have allowed their students to perform a replication study.
>But either way, people find out.
I wish I could find the meta-study, but someone once published a study of retracted papers in medicine. They found that a number of them were still being cited - despite the retraction (and they were being cited as support for their own papers...). So no, people don't find out.
>But if you're actually a working scientist, you should understand that already and not take everything you see in a journal as actual truth, but as something maybe might be true.
I agree. But then you end up writing a paper that cites another paper in support of your work. Or a paper that builds up on another paper. This isn't the exception - this is the norm. Very few people will actually worry about whether the paper they are citing is true.
When I was doing my PhD, people in my specialized discipline were all working on coming up with a theoretical model of an effect seen by an experimentalist. I was once at a conference and asked some of the PIs who were doing similar work to mine: Do you believe the experimentalist's paper? Everyone said "No". Yet all of us published papers citing the experimentalists' paper as background for our work (he was a giant in the field).
Another problem not touched upon here: Many papers (at least in my discipline) simply do not provide enough details to reproduce! They'll make broad statements (e.g. made measurement X), but no details on how they made those measurements. Once you're dealing at the quantum scale, you generally cannot buy off the shelf meters. Experimentalists have the skill of building their own measuring instruments. But those details are rarely mentioned in the paper. If I tried reproducing the study and failed, I would not know if the problem is in the paper or in some detail of how I designed my equipment.
When I wrote my paper, the journal had a 3 page limit. As such, I had to omit details of my calculations. I just wrote the process (e.g. used well-known-method X) and then the final result. However, I had spent most of my time actually doing method X - it was highly nontrivial and required several mathematical tricks. I would not expect any random peer to figure it all out. But hey, I unambiguously wrote how I did it, so I've satisfied the requirements.
When I highlighted this to people in the field, they were quite open with another explanation: It helps them because they do not want their peers to know all the details. That allows them to have an edge over their peers and they do not need to race with them to publish further studies.
I can assure you: None of these people I dealt with were interested in furthering science. They were interested in furthering their careers, and getting away with as little science as is needed to achieve that objective.
>No, no, a thousand times no!
> Most studies do not have follow up studies that confirm/refute the original. Often such a followup study is hard to publish. If you manage to reproduce it, you cannot publish unless it presents a new finding. If you fail to reproduce it, it often doesn't get published either. And no one writes grant applications that are for replication studies. The grant will likely go to someone else.
Sorry, let me be clear: If an interesting result is published, people will go to the trouble. Most results are of limited interest and mediocre.
That's only true for a definition of "interesting" that is more like the sense most people assign to "astounding" or "groundbreaking", and even then it's not guaranteed, just somewhat probable. If it's both groundbreaking and controversial (in the sense of "immediately implausible to lots of people in the domain, but still managing to draw enough attention that it can't be casually ignored as crackpot"), like, say, cold fusion, sure, there will be people rushing to either duplicate or refute the results. But that's a rather far out extreme circumstance.
If it's in a journal, it is interesting. Journal editors will require "interesting" as a prerequisite to publishing a paper. Papers do get rejected for "valid work but not interesting".
If journals are publishing papers that are of limited interest, then there is a serious problem with the state of science.
I'm not trying to split hairs. One way or another, there is a real problem - either journals are not being the appropriate gatekeepers (by allowing uninteresting studies), or most interesting studies are not being replicated.
I have always suspected that, but I've never heard anyone be that open about it.
At a previous job I had to implement various algorithms described in research papers, and in every case except one, the authors left out a key part of the algorithm by glossing over it in the laziest way possible. My favorite one cited an entire linear algebra textbook as "modern optimization techniques."
Yes, they'll just say they expect their peers to be competent enough to reproduce, and a paper shouldn't be filled with such trivialities.
To get the real answer, talk to their grad students. Especially those who are aspiring for academic careers. They tend to be quite frank on why they will act like their advisors.
Oh, and citing a whole book for optimization - that kind of thing is quite common. "We numerically solved this complex PDE using the method of X" and then just give a reference to a textbook. But usually the algorithm to implement is sensitive to tiny details (e.g. techniques to ensure tiny errors don't grow to big ones, etc).
John Ioannidis: "Reproducible Research: True or False?" -- https://youtu.be/GPYzY9I78CI
The wrong incentives for studies are a bigger problem. I think the only way to solve that is with a higher threshold of peer review to be required before one of these "findings" is put out to the public.
I believe that's called "skepticism" which makes you a heretic and "anti-science" in certain fields.
I've seen this time and time again while working in neuroscience and hearing the same from friends that are still in those fields.
Data is often thoroughly massaged, outliers left out of reporting, and methods tuned to confirm, rather than falsify, certain outcomes. It's very demotivating as a PhD student to see very significant results published, and then, when you perform the same study, to find that reality isn't as black and white as the published papers suggest.
On this note, the majority of papers are still about reporting significant results, leading to several labs chasing dead ends, as none of them can publish "negative" results.
We've been doing a lot of data visualization and it often happens that someone comes to me with a thinly veiled task that's really to prove this or that person/process is at fault for delaying a project or something.
Sometimes though the numbers either don't support their opinion or even show a result they don't like and so inevitably they have me massage the graphs and filters until they see a result that looks how they want it to and that's what gets presented at various meetings and email chains.
The information at that point isn't wrong per se, just taken out of context and shown in a persuasive (read: propaganda) rather than informative way.
Unfortunately, our society is built on rotten foundations.
Even if everyone involved is perfectly honest, you still have the green jelly beans problem.
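(For anyone who hasn't seen the xkcd: the jelly-bean problem is just multiple comparisons. The arithmetic is short enough to sketch -- with 20 honest tests of true nulls at p < 0.05, you'll see a spurious "discovery" most of the time.)

    # Multiple-comparisons ("green jelly beans") arithmetic: the chance of at
    # least one false positive across 20 independent tests of true nulls.
    alpha, n_tests = 0.05, 20
    p_any_false_positive = 1 - (1 - alpha) ** n_tests
    print(f"P(at least one false positive in {n_tests} tests) = {p_any_false_positive:.2f}")  # ~0.64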
I once found a good blog about mental health and science, with a lot of snake oil called out around SSRIs, ADHD, etc., but I'm unable to find it now. Can anyone help me out?
Case in point: the very first thing I thought of is, does this have any relevance to the field of climate science!
So....does it? Because we're told the reason we have to get on board with the program is because the people telling us the facts are scientists, and scientists are smart and trustworthy. However, we know this is not always true, don't we.
So what is a deliberately skeptical person to think?
Are you joking? Are you seriously making the claim that one of the persuasive approaches used in the "public realm" (media, discussions, etc) isn't that we should fight climate change because scientists have almost unanimously decided it is a real thing and we must do something?
If scientists are telling us something, we sure as hell should listen, at least two reasons being they are the experts on the subject (why wouldn't you listen to experts), and the subject is so immensely complicated that an average non-scientist person wouldn't have a chance of "looking at the data" and forming a reasonably correct opinion.
But now you are telling me no one is suggesting I listen to scientists? I could easily google thousands of articles/papers/blog posts/internet discussions where people are doing just that, but you are telling me no, that content does not exist.
What is it about this topic where otherwise reasonable people seem to go off the rails?
If you're not an expert and don't want to invest in becoming one, it's totally rational to trust a network of experts to - roughly speaking - do their work properly. I'm sure you can find plenty of people advocating that. But my default position would be not to trust a single novel result, regardless of how smart or prestigious the authors were. Strong claims require strong evidence. I rarely hear any scientist or advocate saying otherwise.
I couldn't agree with you more.
I also do not buy into the "it's all your/my fault" marketing, b/c it is using guilt and fear to distract focus from the financially incentivized policy makers to the easily swayed mob... one group has the power to effect change, the other has been deluded into believing it can (rare exceptions occur, often over-embellished by Hollywood). Follow the money.
I also see the brown cloud every day over the city I live near. I know from education and experience that the only unchecked growth of an organism/group in the natural order is cancerous and parasitoidal in nature; i.e. it ultimately kills the host element. The world is full of Thomas Midgley Jrs who would have me believe the exact opposite of what they also know is true, so how to find the truth? I don't. I edify myself, I take personal responsibility for my actions and I try to stay away from these MSM and online discussions... try being the operative word.
'On the whole', it would rather seem we are doing some damage. We don't need perfect predictions to get that.
Second - is the issue of 'risk'.
If there were a 1% chance that your child would be kidnapped if you let them play at the park past 11pm, would you let them do it? No.
Given the level of existential risk inherent in climate change, even if there is a small chance that the climate-alarmists are correct, we basically have to confront the challenge.
Rationally - we should have a very low risk threshold for activities that constitute existential problems for us all.
I'm hugely skeptical of so many specific statements about climate change, especially the politicization and obvious 'group think' - it drives me nuts.
But at the end of the day - 'it looks like in general' there is a problem, and 'even if there is a small risk of it' - we have to do something about it.
Which is how I manage to swallow it all.
So we should take it 'with a grain of salt' but we have to take it, kind of thing.
What scientists are you talking about? Also, didn't you JUST claim "we have no clue"?
By the way, global warming is not the case of some weird uptick in data correlating with some other data being interpreted as causally linked. In the case of global warming, it's really common sense (if you're trained in physics): We understand the infrared spectrum of CO2, and that by absorbing and retransmitting far infrared, it acts to slow the radiative transport of heat. This can be shown in the laboratory. Now, normally you think of the atmosphere being in a steady state: as plants grow they absorb CO2 (converting it into their cellular structure), and as plants die and are digested or decay, CO2 is ultimately generated. A small amount is buried, but the amount buried is small compared to the overall cycle in a typical year, such that the atmosphere is roughly in equilibrium over the near term, although over the very long term (hundreds of millions of years), CO2 levels have dropped. (By the way, stars generally increase in brightness as they age, so when the CO2 levels were much higher, the Sun was dimmer). This buried carbon becomes fossil fuel, such as coal. Humans dig it up and burn it, but we've really only got good at this process on a large scale within the last 100-150 years (and even in the last 75 years, we've improved productivity by roughly an order of magnitude such that it takes a tenth as much manpower nowadays).
People about a century ago realized that the rate at which we were digging up this long-buried carbon was faster than the rate it was absorbed and buried naturally (makes sense, as the coal was produced over hundreds of millions of years), thus causing the CO2 level of the Earth to start to increase and thus the greenhouse effect to increase. It was just a side note at the time, a whimsical thought about the far future.
But today, the CO2 level has already dramatically changed since the 1800s, and we've also noticed that hey, we can see that predicted temperature rise as well, faster than would be explained just by coming out of the last ice age. This is a totally unsurprising finding if you know the infrared spectrum of CO2 and the rate at which fossil fuel is produced and burned (8-10 cubic kilometers of coal, 3.5 cubic kilometers of oil, about 4000 cubic kilometers of natural gas every year). You can reproduce the change in CO2 level over that time if you assume that roughly half of the carbon we burn every year is absorbed (by ocean, rocks, etc) and the other half stays permanently in the atmosphere.
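To make that concrete, here is a rough back-of-the-envelope check in Python. The emission figure and the ppm conversion factor are approximations I'm supplying for illustration, not numbers from anything above:

    # Rough sanity check of the "half stays in the atmosphere" picture.
    # The figures below are approximate values assumed for illustration.
    annual_fossil_co2_gt = 37.0   # ~Gt of CO2 emitted per year from fossil fuels
    gt_co2_per_ppm = 7.8          # ~Gt of CO2 corresponding to 1 ppm of atmospheric CO2
    airborne_fraction = 0.5       # assume roughly half of emissions stay in the air

    ppm_rise_per_year = annual_fossil_co2_gt * airborne_fraction / gt_co2_per_ppm
    print(f"Predicted rise: {ppm_rise_per_year:.1f} ppm/year")  # ~2.4 ppm/year

The observed modern rise is roughly 2-2.5 ppm/year, so even this crude arithmetic lands in the right place.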
That CO2 produces an insulating effect on the atmosphere is an indisputable fact. Earth would be much colder without this effect, and you can test it in the laboratory (it is also why Venus is hotter than Mercury, even though Mercury is much closer to the Sun).
That humans produce a very large amount of CO2 (i.e. significant fraction of atmosphere's total CO2 over decades) by burning long-buried fuels is an indisputable fact.
That the CO2 level has increased dramatically over the last 100 years is an indisputable fact.
The conclusion is that there MUST be some level of warming from human activity even before you look at the temperature data (which ALSO indisputably shows significant, off-trend warming over the last century), although the exact amount you'd expect depends strongly on the details of our climate system, such as feedback effects (both positive, i.e. increasing the effect, and negative, i.e. stabilizing or counter-acting the effect) from clouds, ice cover, vegetation changes, etc. We DIDN'T have to merely /guess/ at the causality direction after the fact once we saw that all the CO2 was not being fully reabsorbed, as it's a direct physical consequence of the infrared spectrum of CO2, something we can measure in the lab and even replicate from first principles quantum mechanics if we really felt like it. The fact that we observe warming is, to me, just the final validation of what we already knew would happen if we pumped a bunch of CO2 in the air.
By the way, I laugh at the idea that scientists (who can study anything and get grants one way or the other... and are pretty darned poor compared to similarly trained colleagues in the oil and gas business) have a "huge monetary incentive" to push this "agenda" but that somehow, corporations with tens of trillions of dollars worth of fossil fuel assets on the line don't have a similar agenda... I mean, the difference in financial incentive is absolutely absurd!
> I laugh at the idea that scientists...have a "huge monetary incentive" to push this "agenda"
I'm guessing a bit at what you're thinking here, but I think it could be argued that not losing your job is a "huge monetary incentive". I've certainly experienced extreme "peer pressure" to sign off on something I disagreed with, and the idea that office politics simply doesn't exist in science seems quite unlikely to me.
... and then they move to high ground.
Shouldn't you be just as happy that these "anti-intellectuals" are pointing out flaws in something you thought was true?
I agree with your point, but I suspect your choice of wording tells a little more of the story.
You said "facts", but given your skepticism of their veracity, I think "claims" or "propositions" is more apropos.
Consider the rhetorical difference of these two statements:
"Anti-intellectuals don't accept the facts."
"Anti-intellectuals don't accept the claims."
I would also question your use of the term "anti-intellectuals". I wonder if, at least for some of the persons to whom you apply that label, a better term is "skeptic".
I'm guessing that in a group of 1000 people that you would call "anti-intellectuals", some portion of them really do deserve that label.
Similarly, in the group of propositions you'd call "scientific 'facts'", some really are beyond reasonable dispute (e.g., Newton's laws in everyday-life settings). But there are other propositions which the academic community and their mouthpieces (the New York Times, etc.) hold with unjustified confidence.
When has this never been the case?
Also a standard unified process for replicability, reproducibility, and reuse is needed. Dock points for not stating random seeds, hardware used, metadata, etc.
Source code, or at the very least proper pseudocode, should be mandatory for all published computer science research.
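As a sketch of the minimum that could be recorded, something like the following; the field names and output file are purely illustrative, not any existing standard:

    # Record the seed and environment metadata alongside the results so that
    # an exact rerun can at least be attempted. Purely illustrative.
    import json, platform, random, sys
    import numpy as np  # assuming the experiment uses numpy

    SEED = 12345
    random.seed(SEED)
    np.random.seed(SEED)

    metadata = {
        "seed": SEED,
        "python": sys.version,
        "platform": platform.platform(),
        "numpy": np.__version__,
    }
    with open("run_metadata.json", "w") as f:
        json.dump(metadata, f, indent=2)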
The math for the soft sciences isn't as concrete and doesn't provide a good foundation. I think there are also major problems with the use of p-values: they are too easy to manipulate, and there is a lot of incentive to do so. Teach a science class (even in the hard sciences) and you'll see how quickly students try to fudge their data to match the expected result. I've seen even professionals do this. I once talked to a NASA biologist from whom I was trying to get a chi-square value; it took a bit of pressing because he was embarrassed that it didn't confirm his thesis (it didn't disprove it either; the error was just large enough to allow for the other prevailing theory). As scientists we have to be okay with a negative result. It is still useful. That's how we figure things out. A bunch of negatives narrows the problem, and a reduced search space is extremely important in science.
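To illustrate that NASA situation with made-up numbers: a chi-square test can easily come back with error bars wide enough that it fails to reject either competing model, which is still informative. The counts below are invented for illustration:

    # Toy example: the same observed counts fail to reject two competing models.
    from scipy.stats import chisquare

    observed         = [18, 22, 25, 15]
    expected_model_a = [20, 20, 20, 20]
    expected_model_b = [16, 24, 24, 16]

    for name, expected in [("model A", expected_model_a), ("model B", expected_model_b)]:
        stat, p = chisquare(observed, f_exp=expected)
        print(f"{name}: chi2 = {stat:.2f}, p = {p:.2f}")
    # Both p-values come out well above 0.05: the data can't distinguish the
    # models, which is a useful negative result, not a failure.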
The other problem is incentives in funding. There is little funding to reproduce experiments. It isn't as glorious, but it is just as important.
It is a problem in physics, although a "different" problem. See my comment:
Different degrees of the same problem.
But the other problem is that the soft sciences have a compounding problem: the one I mentioned about the foundation not being as strong. As a comparison, psychology needs p < 0.05 to publish, while particle physicists need 0.003 for "evidence" and 0.0000003 for "discovery". But the big difference is that the latter are working off mathematical models that predict behaviour, and you are comparing against those. You are operating in completely different search spaces. The math of the hard sciences allows you to reduce this search space substantially, while the soft sciences don't have this advantage yet. But give them time and they'll get to it; the math is difficult. Which really makes them hard to compare. They have different ways of doing things.
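For reference, those particle-physics thresholds are just the one-sided tail probabilities of a normal distribution at 3 and 5 sigma, which is easy to check:

    # 3 sigma ("evidence") and 5 sigma ("discovery") as one-sided p-values.
    from scipy.stats import norm

    for sigma in (3, 5):
        print(f"{sigma} sigma -> p ~ {norm.sf(sigma):.1e}")
    # 3 sigma -> ~1.3e-03 (the 0.003 above), 5 sigma -> ~2.9e-07 (the 0.0000003 above)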
From my own experience in my PhD I've seen outrageous replication problems in CS, microbiology, neurology, mechanical engineering, and even physics on the "hard sciences" side of things. I've seen replication problems also in psychology, sociology, and political science on the "soft sciences" side of things.
People who come from a "hard science" background seem to have this belief that it is way more rigorous than other fields. I disagree. If anything, the soft sciences are actually making a movement to address the problem even if that means more articles being published saying that 40% of psych papers are not reproducible or whatever.
From my experience, at least in horticulture, scientists leave out a few methods because there's some sort of potential patent or marketing process in the works and they don't want to reveal too much and be beaten to the punch.
Does the same method-hiding hold true in journals without length limits or different review processes?
In the software world, the journal Bioinformatics has two-page application notes. That is nowhere near enough room to include a figure, describe an algorithm, and describe results. In cases where the source code is available, I've found the actual implementation often has steps not described at all in the paper's algorithm. And these differences make a clean-room implementation difficult to impossible if you want to avoid certain license restrictions.
Since it has been a decade since I worked in a wet lab, I'm less familiar with examples in that world, but I know not offending chemical vendors is a concern for some people in the synthetic chemistry world. At a poster session, they'll tell you that you shouldn't buy a reagent from a particular vendor because impurities in their formulation kill the described reaction. They won't put that in a paper though.
Software engineering has Continuous Integration, since it is so expensive to fix software late in the development cycle.
Is there any such thing as Continuous Reproducibility?
Constantly checking that the science can be reproduced?
How prevalent is this in different branches of Science?
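The closest analogue I can picture for computational work is a CI-style job that reruns the analysis on every change and compares against archived results. Everything here (script name, output files, tolerance) is hypothetical, just to show the shape of the idea:

    # Hypothetical "continuous reproducibility" check for a computational study.
    import json
    import subprocess
    import numpy as np

    # Rerun the analysis with a fixed seed (the script name is made up).
    subprocess.run(["python", "analysis.py", "--seed", "12345"], check=True)

    with open("results.json") as f:
        new = json.load(f)
    with open("reference_results.json") as f:
        reference = json.load(f)

    for key, ref_value in reference.items():
        assert np.isclose(new[key], ref_value, rtol=1e-6), f"{key} no longer reproduces"
    print("All archived results reproduced.")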
The good news is that you can't really fake proofs or formal analysis. But the truth is, many folks in the area cherry-pick use-case examples and numerical validation as much as you see in other disciplines. Perverse incentives to publish, publish, publish while the tenure clock is ticking keep this trend going, I think.
And I say this having used Docker myself to make one piece of computational research reproducible. I'm not sure it helped in the end. I do encounter people who want to reproduce it, and mostly I have to teach them how to use Docker and then apologize for all the ways it goes wrong.
It's my understanding that most published mathematical proofs aren't "hey look at this theorem in first order logic that we reduced to symbol manipulation"; rather, they present enough evidence that other mathematicians are convinced that such a proof could be constructed.
Is that incorrect?
Can someone in the field comment?
Just because a paper contains a proof doesn't mean that the proof is correct nor that it's comprehensible. Further, even if a paper went through peer review, it doesn't mean that it was actually reviewed. I'll break each of these down.
First, a proof is just an argument that an assertion is true or false. Just like with day-to-day language, there are good arguments and bad arguments. Theoretically, math contains an agreed-upon set of notation and norms to make its language more precise, but most people don't abide by it. Simply, very, very few papers use the kind of notation that's read by proof-assistant tools like Coq, which is the kind of metalanguage really required for that level of precision.
On top of the good-and-bad-argument contention, I would also argue that there's a kind of culture and arrogance associated with how the community writes proofs. Some years back, I had a coauthor screaming at me in his office because I insisted that every line in a sequence of algebraic reductions remain in the paper, with labels. His contention was that it was condescending to him and the readers to include these reductions. My contention was that I, as the author of the proof, couldn't figure out what was going on without them, and if I couldn't figure it out with all those details, I sincerely doubt the readers could either. Around the office, there was a fair amount of support for my coauthor and for removing details of the proof. This gives an idea of the kind of people in the community. For the record, the reductions remained in the submitted and published paper.
Now, say we had removed all of those steps. Would we still have had a full proof? Technically yes, but I would call it hateful, because it would require a hateful amount of work by the readers to figure out what was going on.
Second, peer review is tricky and often incredibly biased. Every math journal I've seen asks the authors to submit to a single-blind review, meaning that the authors don't know their reviewers, but the reviewers know the authors. If you are well known and well liked in the field, you will receive the benefit of the doubt, if not a complete pass, on submitted work. I've seen editors call and scream at reviewers who gave "famous" people bad reviews. I feel like I was blacklisted from one community because I rejected a paper from another "famous" person who tried to republish one of their previous papers almost verbatim. In short, there's a huge amount of politics that goes into the review process. Further, depending on the journal, sometimes papers are not reviewed at all. Sometimes, when you see the words "communicated by so-and-so", it means that so-and-so vouched for the authenticity of the paper, so it was immediately accepted for publication without review. Again, it varies and this is not universal, but it exists.
What can be done? I think two things could be done immediately and would have a positive effect. First, all reviews should be double-blind, including to the editor; there is absolutely no good reason why the editor or the reviewers should know who wrote the paper. Yes, they may be able to figure it out, but beyond that, names should be stripped prior to review and re-added only at publication. Second, arbitrary page limits should be removed. No, we don't need rambling papers; if a paper is rambling, it should be rejected as rambling. But removing page limits also removes one incentive to produce difficult-to-follow proofs, since all the details can remain. Virtually all papers are published electronically. Page counts don't matter.
In the long run, I support the continued development of proof assistant tools like Coq and Isabelle. At the moment, I find them incredibly difficult to use and I have no idea how I'd use them to prove anything in my field, but someday that may change. At that point, we can remove much of the imprecision that reviewers introduce into the process.
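For readers who haven't seen a proof assistant, here is about the most trivial possible example of what "machine-checked" means, in Lean 4 (the statement is deliberately elementary; real formalizations are vastly more work):

    -- A tiny machine-checked proof: every step is verified by the checker
    -- rather than taken on a reviewer's goodwill.
    theorem add_comm_nat (a b : Nat) : a + b = b + a :=
      Nat.add_comm a b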
I'm going to bet that in competitive research branches, for practical applications, that have objectively verifiable results, most studies will, in fact, be reproducible.
Only in principle. Most studies are episodic -- they have a beginning, an end, and publication of results only after the end. For continuous replication to exist, laboratories would have to be much more open about what they're working on, less competitive as to ideas and funding than they actually are.
> How prevalent is this in different branches of Science?
It's nonexistent everywhere but in "big" physics, where (because of the large number of people involved) people tend to know in advance what's being worked on. But I'm only saying physics opens the possibility for continuous replication, not that it actually exists, mostly because of cost.
I know that is definitely one of my driving visions for ibGib, and I can't be the only one. Open source, open data, open collaboration.
Scientific experiments usually need actual things to be manipulated in the real world, so I think a concept of Continuous Reproducibility may only be applicable to the subset of science that can be done by robots given declarative instructions.
I suggest that future articles about the reproducibility crisis should either: a) Specify "life science" in the title, or b) demonstrate that the generalization is justified.
My field (physics) is certainly not perfect, but we do have a reasonable body of reliable knowledge including reproducible effects. I work for a company that makes measurement equipment, and we are deeply concerned with demonstrating the degree to which measurements are reproducible.
They could even smoothen the process by giving the draft a 'Nature seal of approval' that the authors could use to get other institutions to replicate their work, and add a small 'Replicated by XX' badge to each publication to reward any institution that replicated a study.
Funders of studies might improve the quality of the research they paid for by offering replication rewards, e.g., 5% of all funding goes to institutions that replicated results from research they funded.
Of course there would still be some wrinkles to iron out, but surely we could come up with a nicely balanced solution?
In addition, if you spend too much time trying to replicate others' work, you have no time to work on the things that will actually continue to fund your own research.
The best thing is to have a healthy, skeptical, but collegial, competition in the field. That still requires more funding though!
Do you have data for this claim? There's tons of extraordinarily expensive experiments through and through, but there's also stuff with incredibly high and time-consuming up-front design and exploration costs that is actually almost trivial to replicate on a per-unit basis.
Locking a lab that can't afford to fund an independent replication study out, or who isn't prominent enough to be able to rouse someone's attention to do it, would be disastrous for a lot of early career researchers.
On the other hand, the money currently being invested seems to produce mostly garbage. "Fake science", just to rustle some feathers.
Should our publication have been prevented simply because nobody else had the resources to replicate?
Well... maybe, yes?
Just because you are the only ones able to do the computation doesn't mean your methods or your results are correct.
However, if you follow your logic, you would prevent CERN from publishing the discovery of the Higgs boson, because nobody else has a particle accelerator powerful enough to detect it. You would prevent the LIGO people from publishing because nobody else has a multi-mile-long interferometer that can detect gravitational waves. There are many unique resources that would be prevented from publishing under your idea.
Contrary to what most non-scientists claim about science, reproducibility is not a requirement for discovery science. What is important is scrutiny: ensuring that when people discover something using a unique resource, other people can inspect the methods and the data and decide whether to believe the results.
You're addressing another issue: they built replicability into their scientific method (which is awesome) but it's still within a single logical entity (which happens to be distributed throughout the world).
LIGO went one better and injected fake events to make sure their analysis systems were working.
Did you think about the replicability of your research while you were working on it?
As for thinking about replicability, sure I thought about it. That's why we released all the data we generated and described our methods in detail- so pharma and academics could use the data with confidence to design new drugs that target GPCRs. I also decided that rather than focus on replicability, I would focus on discovery and sharing, and I think that actually is a better use of my time as well as the time of other scientists.
I find this an easy and useful distinction and your publication should NOT be prevented from being published by this measure.
What is the difference between your paper and an Uri Geller "experiment"? Both are extremely hard to duplicate for objective reasons so their results have to be accepted based on reputation. (Imagine someone trying to publish the same thing but using "Startup LLC" instead of "Google".)
It's pretty clear what's different between my paper and a Uri Geller experiment. If you can't see the difference, you're either being obstinate or ignorant. We certainly aren't banking on our reputation. A well-funded startup with enough cash could duplicate what we did on AWS GPUs now. I would be thrilled to review their paper.
Incidentally, Scott Alexander just published an article with a great quote:
"Peer review is a spam filter. Replication is science."
By whom? I think you're using as a premise the very matter that is being discussed.
I didn't always believe this, but I came around after spending a bunch of time reading about the numerous times in history when one scientist discovered something, published the results, and was told they were wrong because other people couldn't reproduce their results, only to be shown correct when everybody else improved their methodology (McClintock, Cech, Boyle, Bissell, etc.).
This ultimately came to a head when some folks criticized the lack of repeatability of Mina Bissell's experiments (for those who don't know, she almost single-handedly created the modern understanding of the extracellular matrix's role in cancer).
She wrote this response, which I originally threw on the floor. http://www.nature.com/news/reproducibility-the-risks-of-the-... After rereading it a few times, and thinking back on my experience, I changed my mind. In fact, Dr. Bissell's opinion is shared by nearly every top scientist I worked with at UCSC, UCSF, and UC Berkeley. The reason her opinion has value is that she's proved time and again that she can run an experiment in her lab that nobody else in the world is capable of doing, and she takes external postdocs, teaches them how to replicate, and sends them off to the world. In other cases, she debugged other labs' problems for them (a time-consuming effort) until they could properly reproduce.
I believe reproducibility is an aspirational goal, but not really a requirement, for really good scientists, in fields where reproduction is extremely hard.
Tom Cech and his grad students discovered that RNA can have enzymatic activity. They proved this (with excellent control experiments eliminating alternative hypotheses) and yet, the community completely denied this for years and reported failed replication, when in fact, the deniers were messing up their experiments because working with RNA was hard. Cech eventually won the Nobel Prize.
Stanley Prusiner: discovered that some diseases are caused by self-replicating proteins. Spent several decades running heroic experiments that nobody could replicate (because they're heroic) before finally some other groups managed to scrape together enough skilled postdocs. He won the Nobel Prize, too.
Barbara McClintock, my personal favorite scientist of all time. She was soundly criticized and isolated for reporting the existence of jumping genes (along with telomeres), and it also took decades for other groups to replicate her work (most of them for lack of interest). Eventually, she was awarded the Nobel Prize, but she also stopped publishing and sharing her work due to the extremely negative response to her extraordinary discoveries.
Mina Bissell went through a similar passage, ultimately becoming the world leader in ECM/cancer studies. She will likely win a Nobel Prize at some point, and I think we should learn to read her papers with a certain level of trust at this point, and expect that her competitors might actually manage to level up enough to be able to replicate her experiments.
Because if a piece of work is important enough, people will find the truth. If it is not important enough... then it does not matter.
And sometimes it is impossible to solve a difficult problem using the first lab one builds in pursuit of that concept.
In these cases the true type of laboratory needed only comes within focus after completing the first attempt by the book and having it prove inadequate.
Proving things is difficult not only in the laboratory, but in hard science it still usually has to be possible to fully support your findings, especially if you are overturning the status quo in some way.
That still doesn't require findings to actually be reproduced in a separate lab if the original researchers can demonstrate satisfactory proof in their own unique lab. This may require outside materials or data sets to be satisfactory. Science is not easy.
I think all kinds of scientists should be supported in all kinds of ways.
Fundamentally, there are simply some scientists without peers, always have been.
For unique breakthroughs this might be a most likely source, so they should be leveraged by capitalists using terms without peer.
Part of publishing is to put it out for discussion. It should be the best effort and thought to be true, but science is about the evolution of knowledge. Very few things are completely understood, especially in biological and micro/nano fields.