At the same time, I think one of the biggest inaccuracies in the public perception of science (and one of my pet peeves) is the idea that all science is hypothesis science. It turns out there's also still plenty of discovery science to be done -- and while it's less common than it was 100 or 200 years ago, it's quite important!
In geology, this often quite literally takes the form of a blank space on the map -- there are plenty of unmapped quadrangles on the geologic map of the world at 1:24,000 and finer scales, and the USGS will pay you just to learn how to fill them in.
This is one of the few subjects on which the misinformation is so pervasive that even the Wikipedia article is substantially inaccurate (I blame the overly-simplified formulation of the scientific method that most of us are first exposed to in elementary school). The one part that you can correctly infer from the Wikipedia article, though, is that discovery science is having a bit of a resurgence recently thanks to the proliferation and reuse of large datasets.
One of the simplest distinguishing characteristics is that there's no such thing as a negative result in discovery science: if you're mapping a blank area, whatever you find will be something we didn't know before.
Edit: if this comment is unhelpful, please suggest an update.
If you were handing out the funding, assuming equally good teams, do you choose the work that targets Parkinson’s mechanisms or the functional map of a less-explored bit of CNS?
We are doing basic exploration as well. We are just prioritizing targeted work in a resource constrained environment.
I don't think the issue is about resources per se. I think it's more about Universities prioritizing high impact research, as in, the kind of research that appeals to non-scientists, in industry, media, or academia in general.
Thus either neurology is way more mature than I'm aware of (a real possibility, I'm far from knowledgeable on it) or we do have a problem.
We do know a LOT about neuroscience, but we've no idea how much we've yet to discover. Likely, it's a fractal, so we're not anywhere close and never will be. Like in Physics, we know a LOT about electrons and how to build a bridge, but we know nothing about 'dark energy'. It's also true in neuroscience that we know a LOT about action potentials but nearly nothing about how astrocytes interact with synapses. It's not a smooth march, but a VERY spasmodic stumble.
I don't think fields should be expected to progress in any particular manner. They progress as they do and at their own pace. Otherwise it wouldn't be research, it'd be following plans. It's not something that can be prescribed, only described.
Total anecdata, I have no source for this.
I've had opportunities to review papers from people I have co-authored with before, and declined the opportunity. I've reviewed a paper, then got an invitation from another journal to review the same paper. I declined, but sent the editor my former review. I think there are strong social mores in the scientific community.
To take geologic mapping as the example again, the economic ROI has fallen over the past few centuries, when there were bigger unknown areas and you were more likely to literally or figuratively strike gold with only a coarse survey. So while there's still funding for pure mapping, there's not as much as there used to be.
I would argue that the scientific benefits of pure discovery are often still quite high, but it's typically easier to convince reviewers to fund a project when you can point to specific well-defined hypotheses that you'll be testing rather than just "because we don't know what's out there".
It's a bit of a yin-yang situation IMO: discovery science often leads to testable hypotheses for future work, and things we learn from hypothesis-testing can open up new avenues for discovery
It regularly astonishes me how easily people are willing to accept scientific malpractice with such excuses.
While I agree that mechanisms like registered reports are the way to fix these things at the core, I do think that these mentalities ("it's just the system, I can't do anything but cheat with my scientific publications, my career!") are a big part of the problem.
Good read on the issue:
I get that. But the other side is that we pump up a lot of kids about science, then filter for very ambitious, very dedicated people and shove them into a resource-constrained environment where some kinds of results are much more strongly rewarded than others. It's absurd to set up a system like that and expect that science will be done to the level of frankness that you (and I) want.
I know some really smart, hardworking, very honest PhDs who have shifted to the tech industry because the roll of the research dice didn't come up right for them early in their careers. They are far from happy about that. I wish they could have stayed in science.
And I'd note that the person you're replying to is not "accepting scientific malpractice". They're saying they can't blame people for trying to survive in the system that they're caught in. If you want to be mad at somebody, be mad at the people who have set up the system, who could change it but don't.
It's unscientific and illogical to expect humans to sacrifice their lives on the altar of your belief: statistically, that's not what they're like. Especially in a system that filters out the most honest, least career-focused people early on.
This is a malpractice that effectively invalidates the research. But it doesn't feel so. It feels more like a thought-crime.
This is 'data mining' right? And I've occasionally wondered about this, since I don't work in a scientific field but did once make use of the scientific method for some research I did. And yes the findings weren't especially conclusive but I'm not sure I could've tweaked the hypothesis to make it work.
So, had I found something really interesting that didn't fit the hypothesis, is the 'right way' to conduct a new experiment from scratch? So say I did that, and used the 'tweaked' hypothesis, of course I'd find something interesting, because it's already there.
In this new 'pre-registration' framework, how can I correct the problem and pursue the interesting idea but keep the science intact? Because, if I used some sort of cross-validation at the outset and I have all the data available, I presumably can't change the sample, so the hypothesis presumably has to change.
There are methods to account for follow-up experiments. The Bonferroni correction, for instance, requires you to tighten your significance threshold with each new test (dividing alpha by the number of tests run).
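A minimal sketch of how that works in practice (the p-values here are invented purely for illustration):

```python
# Minimal Bonferroni sketch with made-up p-values: with m tests, an
# individual result only counts as significant if p < alpha / m, which
# caps the family-wise error rate at roughly alpha.

alpha = 0.05
p_values = [0.003, 0.020, 0.049, 0.30]    # hypothetical results from 4 follow-up tests

threshold = alpha / len(p_values)          # 0.0125 here
significant = [p < threshold for p in p_values]
print(threshold, significant)              # 0.0125 [True, False, False, False]
```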
On the other hand, other tests typically require the researcher to make explicit assumptions on the correlation structure of the experiments despite the fact that it is not directly observable.
Another, more recent approach to 'exploratory' yet statistically valid analysis is to exploit differential privacy and dithering.
That would be data mining done wrong. It's perfectly fine to look at data to provoke new hypotheses. But you should not be using the same data to confirm the hypothesis that it provoked. Either use fresh data, or make sure the analysis remains valid if you are reusing the data.
If between tweaking the hypothesis and publishing it, you add the step "you perform another experiment which tests the tweaked hypothesis", you have just described the scientific method.
Though of course, there's a difference between "drug A doesn't work for condition B but seems to work slightly for C" and "drug A doesn't work for condition B but 10 of 12 individuals with condition C have shown significant improvement"
In most cases: yes, although your example of clinical trials is slightly different, and in that case I do think the data should be publicly available to other researchers even when the result is null.
In fact, you could replace  by a number arbitrarily close to 100% by increasing  accordingly.
You have a hypothesis, do an experiment, it fails. You mark the hypothesis false and move on, never putting the work into publishing it (why would you?).
At the same time, 19 other researchers have the same idea. Some 18 of them do the same as you, but one does the experiment and gets a success. He will publish his work (why wouldn't he?), and it will be the only piece of literature available about the subject.
Where in this narrative did anybody do anything even remotely unethical?
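For what it's worth, the arithmetic behind that scenario is easy to sketch, assuming each lab independently tests a false hypothesis at alpha = 0.05:

```python
# Chance that at least one of n independent labs gets a spurious
# "significant" result when the hypothesis is actually false and each
# lab tests at alpha = 0.05.

alpha = 0.05
for n_labs in (1, 5, 20, 100):
    p_at_least_one = 1 - (1 - alpha) ** n_labs
    print(n_labs, round(p_at_least_one, 3))
# 1 0.05, 5 0.226, 20 0.642, 100 0.994
```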
Agreed in theory. In practice, such work doesn't even qualify - by definition - as science. Using the word science in such contexts gives such things far more credit and legitimacy than they deserve.
Likewise. The BBC reports that 2/3 of scientific research cannot be reproduced. Let’s be honest about what that means: 2/3 of scientists are fraudulent or incompetent.
No, it means that 2/3rds of the attempts to reproduce research fail.
When you work on the very edge of science, the lab you're working in is probably either the only one, or one of a very small number, that is even able to perform the research you're doing, due to a combination of highly specialised equipment, subject knowledge and just plain experience. Without that combination of factors, it's quite easy to mess up an experiment and therefore fail to reproduce the research.
I've never heard that. Is there some evidence that it's largely luck? Does the theory imply that Einstein and Newton were extremely lucky?
For your everyday empirical research there is a lot of luck. You have to disprove a lot of possibilities to get to the truth. And as we only reward positive findings, you're lucky if you're the one testing that one truth. If you're sifting through drugs to find their possible other uses you're going to have a lot of null results. But those are good; they're still extra knowledge.
Had this registration been in effect 300 years ago, Newton/Darwin would still be fine.
As to luck, who can say?
The fix for that would be to change science funding so a null finding doesn't hurt the researcher's career.
Let's say you try a particular educational intervention. It turns out that it has no effect on children's learning.
This is really useful information, because it means that we now know that there is no point in schools trying it.
There's something to be said for brute force on rare occasions though.
By publishing null results you avoid 100 scientists going down the same enticing but unproductive routes, which can also help build good hypotheses and speed up progress.
A null result (no significant evidence for anything, a p value above 0.05) is truly null. It doesn't even confirm the null hypothesis.
Common wisdom says that Statistical Hypothesis Inference Testing doesn't work because researchers engage in "P Hacking". That's a half-truth: researchers really don't understand the meaning of the p-value.
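A small simulation makes the point concrete (scipy assumed; the effect and sample sizes are arbitrary illustration values): an underpowered study of a perfectly real effect comes back with p > 0.05 most of the time, so a single null result cannot confirm the null.

```python
# Why p > 0.05 does not confirm the null: with small samples, a real but
# modest effect fails to reach significance in most runs.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
trials, misses = 2000, 0
for _ in range(trials):
    a = rng.normal(0.0, 1.0, 20)     # control group, n = 20
    b = rng.normal(0.3, 1.0, 20)     # real effect of 0.3 standard deviations
    _, p = stats.ttest_ind(a, b)
    misses += p > 0.05
print(misses / trials)                # roughly 0.85: most runs are "null results"
```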
The stronger "absence of proof is proof of absence" would be fallacious.
A more accurate way to state this is “Absence of evidence is not always evidence of absence”. Or “Absence of evidence may or may not be evidence of absence depending on the circumstance”, though it doesn’t roll off the tongue as easily.
"When trying to reject the null hypothesis (sensu Fisher), absence of evidence (for its invalidity) is not evidence of absence (of its invalidity)."
Fisher wasn't the only statistician, he wasn't the only confused one either. We can go with Neyman and Pearson instead:
"When trying to reject the null hypothesis with a sufficiently powerful experiment, absence of evidence is evidence of absence."
This requires a power analysis, which is something most studies lack. Constantly reminding everyone of these details and misguided philosophical differences is tiresome, so I prefer to go with Laplace, Bayes, and Jaynes:
"Everything is evidence."
Today, most science is funded by government agencies, like the NIH. This means bureaucrats who know nothing about science allocate other people's money to research they don't care about. Since they don't understand the research, they need objective measures of "research quality", and they picked the number of publications and the impact factor of the journals as measures. What you measure becomes your objective, and so we got "publish or perish" and all sorts of scientific misconduct.
Science needs a free market, too.
Five million euros of taxpayer money were blown on a Neanderthal genome. All we got out of it is a hyper-sensitive analysis that picked up an artifact (source: first-hand experience with the raw data) and called it admixture. But it's a high-profile publication thanks to the catchy headline, and more taxpayer money flows in the same direction.
This wouldn't have happened if the bureaucrats who presided over the grant money knew anything about the science and looked into the methods. Or maybe it would have happened anyway, because this was never science but just a publicity stunt. Either way, funding "science" this way is bad.
A great introduction to the topic and a great general guideline. It's not a collection of recipes, more like a nudge in a direction where you don't need canned recipes anymore.
"Surely that only works if your prior belief in your hypothesis is neither exactly 0 nor 1?"
And that's correct. Prior distributions are never completely concentrated at either extreme, for those could never be updated. If your hypothesis is in fact totally wrong, your confidence in it (the posterior probability) will converge toward zero.
In other words, it's generally much easier for your head to come out of the barrel without an apple.
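A toy illustration of that updating (all numbers invented): as long as the prior is not exactly 0 or 1, accumulating evidence against a false hypothesis drives the posterior toward zero.

```python
# Repeated Bayesian updates against a hypothesis: the posterior starts
# high but converges toward 0 as contrary observations accumulate.

prior = 0.9                 # start out quite confident in the hypothesis
p_obs_if_true = 0.2         # each observation is unlikely if the hypothesis is true
p_obs_if_false = 0.8        # ...and likely if it is false

posterior = prior
for step in range(1, 11):
    num = p_obs_if_true * posterior
    posterior = num / (num + p_obs_if_false * (1 - posterior))
    print(step, round(posterior, 4))
# 0.9 -> 0.69 -> 0.36 -> 0.12 -> ... heading toward 0
```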
I get what you're saying, but the counter to that is it would be too easy for scientists to conjure up studies that they have no reason to believe are true, do an experiment, publish the null finding (that, frankly, they were expecting), and ask for more money.
It's almost trivial to think up experiments that will give you a null finding.
Unless of course you want to reward null findings which might, depending on the circumstances, open another can of worms.
What about a lowly restaurant-owner who skimps on food quality or sanitary costs to "help their career?" A psychopath or "can't blame them?"
Erroneous research can also have human costs.
White collar and blue collar workers should both be scrutinized equally.
I'm not sure where this acceptance of moral depravity among more privileged people comes from, but I really don't like it.
For example, a big part of how the file drawer effect happens is that writing papers is expensive and time consuming, and it's hard to get anyone to publish negative papers, and academics' careers operate under a rather brutal "publish or perish" regime. All that adds up to, if you get a negative result, you've got a whole lot of very concrete reasons to cut your losses and move on. The goody two shoes who's scrupulous about reporting all their findings is not going to get rewarded for their efforts with a job. Nor will they be rewarded with the satisfaction of knowing they've done their part to improve the average quality of the published literature. Out-of-work scientists don't get many opportunities to do research.
This is the most-downvoted comment I've ever made on HN, and I think your explanation is in summary what people disagree with.
However, the problems of bias and integrity in scientific research can and do have costs in terms of harm to human life. It's just that the connection between following incentives and producing bad scientific research is much more abstract, and therefore not as clearly intentional negligence as in the case of something like food safety.
Preregistration of a meta-analysis makes sense and should be standard practice.
One thing I don't know: what does preregistration even look like for a meta-analysis? You can state some conditions up front, but a lot of the degrees of freedom only come into play once you're deep in the work looking at specific studies.
The broad problem isn’t cheating, it’s people not publishing null results. That applies equally to experiments and meta-analyses.
Of course to be a proper study they should pre-register and only report on future publications.
Hopefully detailed null results will avoid too many people going down the same path. But even more hopefully, it will let other researchers read the process and perhaps find another way to actually get to success.
"We tried A x B in this way, that way, some other way, none of them worked" is pretty valuable info. Also pretty good when it comes to confirmation of theoretical work
How searchable is this data? Like, do I need to be an expert who is up-to-date on most proceedings in the subfield to know this, or is this information easy to pull up with a few searches?
It is very easy to run a terrible study.
Even with a good idea, it is very easy to run a terrible study.
Should negative results be replicated?
I would be so published all the time...
At least for me, there are two cases why I'd bother to do that and why I could write in a grant proposal that this work is necessary - either I want to build on that study; in which case most likely I wouldn't publish just a pure replication but rather a comparison of my changes with my replication of the original study, and this paper would be counted as a novel study. Alternatively, I'm doing a replication because I'm not certain whether the original study is actually true, because I have some solid reason to believe that it's wrong.
As I said before, at any point in time and at any scale of research, the next research steps that "need to be done" vastly outnumber the resources available to do them, so obviously not all of these things can be done. The majority of reasonable grant proposals get rejected, so that research doesn't get done - it's all a matter of prioritization; unless there's a solid argument that this research task is within, say, the top 20% of important research tasks, it won't get done. And most studies are not so important as to justify repeating the effort; perhaps it was justified to expend X resources to get to that result, but that doesn't necessarily mean it's worth spending 2X resources to get a slightly more certain result after replication. Replication is only needed if lots of people are going to build their research on top of these results, and that simply doesn't happen for most studies.
Also, at the very least, if I had a general idea, these "nulls" would help me to further refine what I might want to poke at. The way null is used here is wrong and misleading.
No doubt, I agree with you. We all do. My rub is that I see repeated calls from science that the public bow to its mastery. Unfortunately, an institutional lack of transparency is hardly grounds for trust. Perhaps it's time for science to hold itself up to its own standards?
I like how they're clear how their findings are dogfood at this point. They should pre-register a proper meta-study now :-)
In a more serious vein: Do they actually intend to pre-register meta-studies too?
On the Internet, we have unlimited room for publishing results; there's no reason not to publish the negative ones.
When research was published only in paper journals, there was scarce space even for positive results; possibly it would have been considered a waste to use that precious space for negative results. Also, people reading those journals had limited time and wanted the most important results; they may have expected what we would call a 'curated' collection of studies. These days the studies are published in databases and nobody expects to read them all.
Even a failed creation has parts that can be reused in other projects, after all.
Like, I've read anecdotes from people saying that the editors or whomever at some of these journals might turn down publishing null findings. Think about that for a second. What does that tell you about their mindset? What possible reason would you have for not publishing null findings?
Editor: "We gotta move these journals johnny! They need those spicy findings. If the findings aren't spicy, this stuff won't sell."
That's hyperbole, but that's essentially the only reasoning I can think of, and it's absurd. Like, what professionals reading journals are going to be like "Whelp, the findings in this journal haven't been spicy. I can't dab on this nonsense. I'm going to start reading the other journal." said no researcher ever; neither literally nor in essence.
A point null hypothesis for a continuous variable is literally always false. Especially in the softer sciences, I've even seen studies mocked for having too many data points, since it's known that with enough data any point null will be rejected.
The story is better if your null hypothesis is an interval, but then you're really just obliquely using the interval to bound something you could be measuring more directly anyway.
What I'd like to see is moving away from null hypothesis testing altogether and focusing on measuring things. For example, focusing on measuring effect sizes, or the probability that a hypothesis is true.
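To illustrate that point with made-up data (numpy and scipy assumed): with a large enough sample, even a negligible true difference produces a tiny p-value, which is exactly why the effect size is the more informative number to report.

```python
# A "significant" result with a negligible effect: large n makes the
# point null fail even when the true difference is trivially small.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 1_000_000
a = rng.normal(0.00, 1.0, n)   # control group
b = rng.normal(0.01, 1.0, n)   # "treated" group, tiny true effect

t, p = stats.ttest_ind(a, b)
d = (b.mean() - a.mean()) / np.sqrt((a.var() + b.var()) / 2)
print(p, d)                     # p far below 0.05, yet the effect size is ~0.01
```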
For instance, in searches for new phenomena in high-energy physics, one usually puts an upper limit on deviations from the expectation of "known physics" (i.e., standard model). That essentially translates to statements like "if this particle exists, its mass should be higher than X TeV, or else we would have seen it already in our data". Of course, in reality, the particle probably does not exist, so you cannot really measure its mass!
Null hypothesis tests basically try to calculate the probability of a data set given the null hypothesis. What you really want is the probability of a hypothesis given the data set.
So in that case, you want to estimate the probability of theories of physics, such as those that include the particle and those that don't.
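A back-of-the-envelope Bayes calculation (all numbers invented) shows how far apart those two quantities can be:

```python
# P(data | null) can look "significant" while P(null | data) stays high,
# because the posterior also depends on the prior and on how likely the
# data are under the alternative.

p_data_given_h0 = 0.04    # data this extreme are unlikely under the null
p_data_given_h1 = 0.10    # ...but only a bit more likely under the alternative
prior_h0 = 0.9            # most effects tested in this (hypothetical) field are absent

p_h0_given_data = (p_data_given_h0 * prior_h0) / (
    p_data_given_h0 * prior_h0 + p_data_given_h1 * (1 - prior_h0)
)
print(round(p_h0_given_data, 2))   # about 0.78: the null is still the better bet
```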
There should probably be a Nobel Prize for null findings to fix the incentive structure. But how do you grade and compare the different null findings? By effort? The ramifications of a null finding are likely more limited.
Essentially: when awarding a Nobel Prize, map out which null findings narrowed the path sufficiently to support the Nobel Prize worthy work, and give them recognition as "supporting acts".
You write as if there are "positive findings" and "negative findings". This isn't true in orthodox statistics. There are only "negative findings" (the null hypothesis is rejected) and "null findings" (nothing is rejected, nothing is confirmed). Only the negative findings get published.
What doesn't exist is "positive findings". Nothing is ever confirmed: not the null (it's assumed to be true) and not the alternative (it's not even tested).
Now who wants to print a journal in which 95 out of 100 articles say "we learned nothing" and the other 5 can't be reproduced? Much better to print a journal in which every article claims a result, even if none of them can be reproduced.
The studies saying the five can't be reproduced, where are they?
If I'm designing an experiment to attempt to confirm a theoretical model then finding similarities in the 10 prior attempts that failed could give me clues as to what to try. Certainly if 10 respected labs have done things in exactly the way I was going to try then it's worth questioning long and hard whether I really need to repeat that procedure.
Why did these all fail to reject the null hypothesis? That's a powerful question.
Because that's nearly always the outcome. By conventional statistical metrics, the null hypothesis isn't rejected >=95% of the time.
"The studies saying the five can't be reproduced, where are they."
They don't get published, because of the aforementioned statistical problem. The bias toward positive results isn't irrational; it's a natural response to the fact that the vast majority of what any scientist produces will be a "negative" result.
The way you learn what not to try is by studying under experienced scientists, and talking to other current practitioners. For any field, there's a vast shared experience that guides experimentation. As a new researcher, a good place to find this kind of information is in review articles and book chapters. But mostly you get it by working with experienced people.
On the other hand I think something that does support your point is that there are also a lot of bad hypotheses being published, increasingly even in reputable journals, after what is clearly extensive p-hacking. 'So, yeah we took these 29 variables measured in arbitrary, yet extremely specific, fashion, and lo and behold - our hypothesis is affirmed!' It's hard to see how these papers get published outside of the 'spiciness'.