During a conversation with an academic researcher (non-Computer Science) friend, when I brought up the topic of data sharing, especially in the context of the infamous "replication crisis", they had their reasons for not sharing.
I'm loosely paraphrasing here, while trying hard not to misrepresent/misremember their exact views:
"I want to protect my data; I don't have enough time to present my data in a presentable form; and more importantly, they'll just steal my idea and go present it as theirs—and I might lose funding" ... and so on.
I can empathize with the academic pressure of "publish or perish". And, not least of all, the "need to put some food on the table, and keep a roof over my head".
But I still wonder: there must be other effective ways to gently persuade such a researcher (especially in the 'soft sciences'—I'm not using the term derogatorily) of the importance of sharing data in a way that allows reproducibility of a given experiment?
The replication crisis will continue until publishers incentivize replication.
What remains is the "small matter" of persuading the publishers to action.
And that's a good thing. A revolution in academic science led by Elsevier would be like King George leading the American Revolution. Real fundamental change isn't going to come from the profiteers who caused the problem in the first place.
If NIST required that you publish the data (say within N years to cover the concern about getting scooped on follow-up papers), and dinged you on future funding applications if you didn't meet their quality/reproducibility metrics, perhaps that would help to align incentives.
This is the same sort of idea as requiring research from public funding to be put in open-access journals so that the public can benefit from it.
But since I'm an academic layman, I can't punch any valid holes in the idea. :-)
Regardless, I hope this idea gets discussed more by the real stakeholders. Perhaps you might do a public write-up of the idea.
Finance has the exact same problem.
Of course, that ignores the analysis section, which is somewhat important.
But still, a vast improvement over today's way, where you may end up spending a lot of time only to get a vague reject.
But the 'variable under test' is definitely part of the hypothesis, and a study without a hypothesis is not useless, but much less likely to produce a useful result.
There's a relevant xkcd:
If the (pre-registered) hypothesis was that green jelly beans cause acne, then this is at least an interesting result.
If you run this experiment with no particular hypothesis, and then decide on that basis that green jelly beans cause acne, this is just a setup for a later failure to replicate.
At best. At worst, no one bothers checking your results, the company stops selling green jelly beans due to bad publicity, and people who enjoy the green ones are deprived of them for no good reason.
In the proposed experiment, the hypothesis is "Green jelly beans cause acne," and the variable under test is "whether or not green jelly beans cause acne". The hypothesis is basically a statement of what variable is under test AND a prediction of what the results of the test will be.
What I'm saying is that the prediction part of the hypothesis is basically a statement of bias. It's useful in noting what an interesting result would be and what the biases of the researcher might be, but it arguably is counterproductive in that it takes the focus off of observing the variable with an open mind.
If we simply preregister the variable under test, that gives us everything we actually need from the hypothesis, and avoids the problem you describe: instead of testing 20 different variables and only publishing on the one that yields a low-P result, we test the single variable and that's it.
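The multiplicity problem in the jelly-bean comic is easy to simulate. A minimal sketch (assuming 20 independent tests of truly null variables at a 0.05 threshold, and using the fact that a p-value is uniform on [0, 1] under a true null):

```python
import random

random.seed(0)

ALPHA = 0.05
N_VARIABLES = 20    # e.g. 20 jelly bean colours, none of which does anything
N_TRIALS = 100_000  # repeat the whole 20-variable experiment many times

# Under a true null hypothesis (with a continuous test statistic), a
# p-value is uniformly distributed on [0, 1], so "test 20 unrelated
# variables" can be modeled as drawing 20 uniform p-values.
false_positive_runs = sum(
    any(random.random() < ALPHA for _ in range(N_VARIABLES))
    for _ in range(N_TRIALS)
)

print(f"P(at least one 'significant' result) ~ {false_positive_runs / N_TRIALS:.3f}")
print(f"Analytic value: {1 - (1 - ALPHA) ** N_VARIABLES:.3f}")
```

With 20 shots at a 5% threshold, a spurious "green jelly beans" headline turns up in roughly two out of three full runs (1 - 0.95^20 ≈ 0.64), which is exactly why preregistering the single variable under test matters.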
The reality is that a lot of the time scientists spend actually goes into defining the problem properly (even after you think you've defined the problem properly), and preregistered studies would hinder the flexibility to adapt and change direction given new understanding.
The idea of evaluating experiment design and committing to publish before seeing any data/results also strikes me as deeply flawed. Graduate students can sit around and think up many deep questions and draw up tons of beautiful experimental designs. That is the easier part. The harder part is actually running the experiments properly and interpreting the findings correctly. If we are publishing experiment ideas, every graduate student would be submitting 100s of ideas to top-tier journals. However, most of those experiments would end up producing inconclusive or utterly garbage results, which would provide little insight if published.
The other issue I see is that many interesting findings come out of serendipitous results that are not related to the initial experiment or hypothesis. You start out trying to answer one question and then stumble upon something else that is very interesting (so you then explore that). If you have already submitted one experimental design and are supposed to publish on that, what do you do if you stumble upon something else that is actually much more interesting? In my experience with research you don't often go directly from one question/hypothesis/method to an answer and then publish. It's much less linear than that, with many dead ends along the way. It's just not something you can pre-publish.
All of that said, one area where I think we would agree is on publishing the results of well performed experiments with null results. I think it would be useful to the community if people published reports saying "we had this interesting question, did a proper experiment, and it turns out the hypothesis was null and there was nothing interesting there." In the current system, that will likely not get published. But, other researchers would benefit from seeing that result because they may have the same hypothesis and may waste time trying to explore the same dead end. A Journal of Null Results could be useful here. However, you do run into the issue of not knowing if an experiment failed because of a researcher mistake or error...
The problem we're trying to solve isn't poor experimental design, it's a failure to publish "boring" results--a bias against uninteresting results which should cause us to distrust any interesting result.
> The idea of evaluating experiment design and committing to publish before seeing any data/results also strikes me as deeply flawed. Graduate students can sit around and think up many deep questions and draw up tons of beautiful experimental designs. That is the easier part. The harder part is actually running the experiments properly and interpreting the findings correctly. If we are publishing experiment ideas, every graduate student would be submitting 100s of ideas to top-tier journals. However, most of those experiments would end up producing inconclusive or utterly garbage results, which would provide little insight if published.
An inconclusive result does provide insight. If you have 1 study showing vaccines cause autism and 37651 studies which are inconclusive, that starts to be pretty conclusive.
If the results produced are "utterly garbage", then that really shows us that the "beautiful experimental designs" weren't actually effective.
> The other issue I see is that many interesting findings come out of serendipitous results that are not related to the initial experiment or hypothesis. You start out trying to answer one question and then stumble upon something else that is very interesting (so you then explore that). If you have already submitted one experimental design and are supposed to publish on that, what do you do if you stumble upon something else that is actually much more interesting? In my experience with research you don't often go directly from one question/hypothesis/method to an answer and then publish. It's much less linear than that, with many dead ends along the way. It's just not something you can pre-publish.
You publish the thing you agreed to publish, mention the interesting phenomenon in the analysis, and apply to study it.
> All of that said, one area where I think we would agree is on publishing the results of well performed experiments with null results. I think it would be useful to the community if people published reports saying "we had this interesting question, did a proper experiment, and it turns out the hypothesis was null and there was nothing interesting there." In the current system, that will likely not get published. But, other researchers would benefit from seeing that result because they may have the same hypothesis and may waste time trying to explore the same dead end. A Journal of Null Results could be useful here.
A journal of Null Results already exists, and doesn't solve the problem because it's just reversing the publication bias.
> However, you do run into the issue of not knowing if an experiment failed because of a researcher mistake or error...
I'm pulling this out because it is worth addressing, as it fundamentally misunderstands the problem.
An experiment that gets a null result is not a failure. An experiment which fails to observe the variable under test is a failure. If science is working correctly, I would expect the majority of correctly-performed experiments to produce null results, because the obvious phenomena have already been discovered. The idea that "interesting result = success and null result = failure" is fundamentally unscientific and needs to be stricken from the human consciousness.
And, the way this bias plays out in practice is that you're sitting here wondering if an experiment produced a null result because of researcher mistake or error, but not applying the same skepticism to "interesting" results. This is the opposite of a rational position: as far as I know, there isn't a replication crisis for null results. It's the interesting results which are having a replication crisis. If you're suspicious that null results are the result of experimental error (as you should be) then you should be far more suspicious that interesting results are the result of experimental error.
> The problem we're trying to solve isn't poor experimental design, it's a failure to publish "boring" results--a bias against uninteresting results which should cause us to distrust any interesting result.
I do not see the value in a quest to do boring and uninteresting research and publish it, even if the experimental methods are exquisite. The goal of PhD research is to explore and find something new or interesting to add to the scientific body. Graduates already cannot read all the interesting papers in their fields. If you add 10x more boring papers, they are never going to be read and will just be a waste of journal editor and reviewer time.
> An inconclusive result does provide insight. If you have 1 study showing vaccines cause autism and 37651 studies which are inconclusive, that starts to be pretty conclusive.
You can't add up 37651 insignificant results and say they equal a significant result. If there was a study with significant results showing no link between autism and the vaccine, it would get published (many such studies have been published). If your study just shows nothing, then most won't care because they can't learn much of anything from it.
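For what it's worth, whether many individually insignificant results can "add up" is itself a statistical question: if each study carries the same weak signal, standard meta-analytic machinery can combine them into a significant result, while genuinely null studies combine to nothing. A sketch using Fisher's method (assuming independent p-values; the closed-form chi-square tail works because the degrees of freedom are even):

```python
import math

def fisher_combined_p(pvalues):
    """Combine independent p-values with Fisher's method.

    X = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the joint null. For even degrees
    of freedom the chi-square survival function has a closed form
    (a Poisson partial sum), so no stats library is needed.
    """
    k = len(pvalues)
    x = -2.0 * sum(math.log(p) for p in pvalues)
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= (x / 2.0) / i
        total += term
    return math.exp(-x / 2.0) * total

# 20 studies, each individually "insignificant" at p = 0.1, but all
# leaning the same way: combined, the evidence is strong.
print(fisher_combined_p([0.1] * 20))

# 20 genuinely null studies (p-values spread evenly over (0, 1)):
# combined, still nothing there.
print(fisher_combined_p([(i + 0.5) / 20 for i in range(20)]))
```

So the disagreement above partly turns on what "inconclusive" means: weak consistent evidence does accumulate, while true noise does not.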
> If the results produced are "utterly garbage", then that really shows us that the "beautiful experimental designs" weren't actually effective.
That is one of my points. Many experimental designs are beautiful and seem great, but then when researchers do the experiment, many different issues can come up, yielding little to no useful data. If you had to evaluate (for pre-publication) every graduate student's experiment ideas, you would run out of time!
> You publish the thing you agreed to publish, mention the interesting phenomena in the analysis, and apply to study it.
No one wants to read about the uninteresting experiment. They want to read about the interesting discovery and want to learn about that. So, you publish the study of the interesting part and don't bother publishing the boring experiment.
> A journal of Null Results already exists, and doesn't solve the problem because it's just reversing the publication bias.
I guess I believe a bias towards publishing interesting studies with significant results is not a bad thing. Publishing inconclusive, uninteresting, or trivial findings does not add much value, if any, to the scientific body in my opinion.
> An experiment that gets a null result is not a failure. An experiment which fails to observe the variable under test is a failure. If science is working correctly, I would expect the majority of correctly-performed experiments to produce null results, because the obvious phenomena have already been discovered. The idea that "interesting result = success and null result = failure" is fundamentally unscientific and needs to be stricken from the human consciousness.
I don't believe "null result = failure"; I believe many null results are not interesting and likely are not useful if published, because many experiments do not produce useful data or significant results for a variety of reasons.
That's part of the problem. Publications are seen as achievements. If you got accepted to a prestigious journal or conference, you can list this on your CV as an impressive "award"-like thing. A publication list is not just a list of "Look, this is the kind of stuff I've been working on, have a great read at it", but "Look, my research is so great it got accepted to all these fancy places!".
Publications are therefore unfortunately not merely about sharing new info with the research community but an award show. Ideally a publication would be the start of the conversation: "this is what we found, this is the method we propose, what do you think of it, community? Will you pick it up?" The test is then whether the ideas get adopted. But that's harder to measure. Citations try to approximate it, but it's a very crude approximation. A citation, as such, may mean tons of different things, e.g.:
 - a deep critique (negative impact)
 - being cited as part of a long block of "these other works exist, too" (low impact)
 - another work substantially based on the deep ideas of the original paper (high impact)
 - being listed in a table for comparison, a la "we beat this other method", but with no other discussion of the original paper (low impact), etc.
If a publication was nothing more than a "hey, look, this is interesting", then I'd say, publishing mostly novel sexy results would be fine! After all, the surprising cases are those that teach us the most. However, as I said earlier, a paper is not only about "hey, this is interesting", but a "hey I want to advance my career", too. And in a twisted way of logic, I can agree that therefore we could put a band-aid over some of the problem by publishing (rewarding) systematic work with negative or boring results. But ultimately, this goes against the original purpose of papers, that is alerting the scientific community to potentially new information that we haven't known about before.
Ideally, to assess someone's scientific career, there would be at least one smart, attentive, impartial expert taking their time reading through the publications, taking notes, pondering, digesting it all, consulting other experts etc. However, this is too subjective.
Quantitative metrics seem superficially more objective and therefore egalitarian. The original idea is probably that if we just based everything on subjective judgement of scientific importance instead of publication count, there would be even more networking and friendship-based quid-pro-quo back scratching.
But everyone is overworked, and those who aren't, want to keep it that way. So nobody wants to put in the effort to actually interact with the deep content of research. It's too complicated and too opaque.
The problem is, flashy results are by their nature more attention-grabbing on all levels. It's not just some small perverse incentive. This is how all of us work, this is how history works, how everything works. The winner takes all, the rich get richer etc. We remember the Einsteins of history, those who just worked systematically and didn't find much aren't heroes. And if that's our bar, then people will do everything to look like they clear it. In any system, scientists would have to hype up their impact, it doesn't matter who makes the decisions.
Currently, universities want to employ researchers who will make a visible impact. Because that means attracting funding, but also attracting bright people from all around. Career-conscious researchers want to go to universities that help them market themselves well (good PR departments etc). PR is not only for laypeople as the audience, there is such a flood of research nowadays that even the experts of a small niche cannot keep up with everything happening.
The root of it is human nature: competition, deception, cliques, hierarchies. What's new about it is the scale, and the accompanying mechanization of it all. The idea that you can mass-manufacture innovation. That you can expect thousands upon thousands of researchers to make regular breakthroughs and, to take my field as an example, publish tens of thousands of novel AI-related ideas every year. It's related to credential inflation and fake signaling: people with good academic track records got the good jobs and the respect, so people try to emulate that. Everyone tries to become the 1%, the rock star. And everyone wants to hire the 1%. So, just like an evolutionary pressure, people try to appear like the successful. Soon enough the old signal doesn't work anymore. It used to be a high mark of educational level to have passed high school. Today that's a bare minimum. College used to be a meaningful differentiator. Now more than half of young people are "college educated" in developed countries. The next step is about becoming "researchers". Nowadays, having some publications is not a big differentiator. We see this also in title inflation, where monkeying around in Excel gets called "data science" and "AI".
It's not just a monkey's paw. There is no central figure orchestrating it, asking the monkey's paw for more papers. It's a distributed system of agents acting in their self-interest. Nobody wants papers for papers' sake; they want to make defensible, justifiable decisions that will not get them fired and will pass satisfaction back up the path the money flows down, all the way to the CEOs, politicians, and taxpayers.
I don't think that this is necessarily a problem. The problem is that these "awards" are awarded based on results rather than on work.
In my mind, a prestigious journal should be one where the studies have a high percentage of replication of their published work. A high-impact study which fails to replicate is just a popular lie, and has no place in science.
After all, the journal system was invented to solve distribution of papers. We have the internet now, so is there any need for the journal system any more?
Independent reviewers would/could easily step up to pick up the interesting papers and present a "feed" of the good stuff.
One angle of my fascination was how the [rating] baby was pretty much tossed out with the bathwater. It was shouted down and denounced by journalists (as, for example, an adult filter, which ironically ended up being its only application). Some journalists described a perspective as if they had a kind of tenure: they had been published for so long that the idea of a rating system was just offensive. We could argue that a good rating system would use existing talent for calibration, but the real question to ask, imho, is: if PICS was so bad, what did we get instead? Anonymous 5-star ratings? Thumbs up? HN points? Number of GitHub saved games? To say it doesn't compete with publishing in journals is somewhat of an understatement.
At the end of the day, all we are looking for is good metadata. If noteworthy people in a field want to endorse an HN topic, a blog posting, a usenet post, a tweet, a youtube video, a facebook posting or a torrent, real credit could go to the author.
A rating system or spec therefore could simply accommodate that process. (It should, for example, require that the author and their endorsers make backups available.)
Journals are from the horse-and-carriage days. It is quite embarrassing that we haven't come up with something modern.
 - https://www.w3.org/PICS/services-960303.html
 - torrents are nice to share huge data sets
I hadn't heard of PICS before, it's interesting. Is there a published story about why it failed anywhere?
It isn't just journalists, apparently; I think most people hate anything that smells like scrutiny. There are probably plenty of underappreciated authors. They ironically have no way of knowing. I'm convinced there are, and always have been, fantastic ideas out there that didn't even get written down. Why even bother having them or fleshing them out if they can't be judged?
In my inner dialog I consider it the greatest puzzle of all time (and I expand the scope to rating all human creations). If anyone ever truly solves it, all previous revolutions will be reduced to simple events. The potential for recursive self-improvement of such a system is probably equal to a general AI, but the results would be better.
Haha, with the lack of published stories I just discovered I sound to myself like I have a lot of explaining to do, making such fancy claims. I will have to ponder writing it up in a blog. For now, a business plan that everyone is specifically uninterested in is not going to work.
But then again, that is, like, just everyone's opinion? We have no means of judging its value.
It'd probably make writing them easier too.
The problem is way deeper than just academia: is there fairness in the world deep down? Is mass-produced excellence possible? Does individual greatness actually exist, or is it all just a power play?
Overall, the quality of science is extremely difficult to measure, precisely because it operates on the border of the unknown and because people try their best to appear the best possible. Science is difficult to understand and is often far removed from the here and now, and may only bear fruit decades down the line. It's hard to judge for the same reason that antelopes are hard to catch for cheetahs: competition (mainly the antelope vs antelope type).
In the end, science has only been this mass product for a few decades. Before that it was mostly a pastime of weird nerdy aristocrats or people paid by aristocrats for showoff purposes. Or church people with too much time on their hands.
In reality, from the top-down view it's a huge gamble. You try to get good people to do their honest best and then see what happens. At the end there will be some breakthroughs, but only a few every few years in each field. However, this does not satisfy the participants: "I toiled away as well, but the reward is only paid to the lucky one." So everyone tries to be the lucky one, which perversely pushes everyone to take fewer risks, making the collective likelihood of a breakthrough lower but their own expected reward higher.
My grandfather used to recite the story of a farmer who had three pigs. Every morning he'd throw two apples in their confinement. He'd then grab a big stick and beat the one that didn't get any apple: why didn't it try harder?
My prediction is that as with all signaling spirals and treadmill effects, there will be something new to aspire to, to tell the wheat from the chaff, a signal that's harder to fake. It's a constant race. You demonstrate your fitness by being adaptive to how the system changes. Overall the "quality" of people obviously doesn't change over time, it's just that the competent/powerful drive the criteria to their benefit.
As academia/publishing etc. is now flooded with "the plebs", "the elite" will move on and will perhaps use other criteria.
Now, going back to assuming this is about the object-level science itself. Where to find the best science? You cannot do this in general. You have to educate yourself and dive in yourself. You try to learn how to judge people's character and try to listen to and digest the assessment of those you trust.
There's no other way, gather experience and become "better" yourself. Use the cognitive resources of your brain to try and outsmart your opponent: the writer of the piece of text you are reading. This cannot be standardized/metrificated in a simple way (outside human-level AGI). If your organization does not put in the cognitive power of extensively processing the content of a particular research and critically examining the motivations behind it etc., there is no way to judge it. Then you're back to credentials. Did it come from a highly cited person? Is this person endorsed by other big shots, where the "seed big shots" are the researchers at the historically most prestigious institutions.
Currently, to find interesting research I personally use Github recommendations, Google Scholar alerts watching for citations of landmark papers (good indicators for progress in niches) and authors. A well-curated Twitter-feed is also useful, as is arxiv-sanity. In the end, I have to make up my mind if it's good work or not, and as everyone I don't have infinite cognitive resources. So I make snap judgements based on paper gestalt, affiliations, plausibility, result tables, etc. If it clears this bar, I dive in more. Over time, you learn to trust some smart people and can follow them online and see what they say and recommend. And continuously learn and grind your brain. Cognitive work cannot be spared, just like you cannot spare physical exhaustion in sport competitions.
Well, I think there's two separate problems:
1. Academic journals not publishing "boring" results, which incentivizes scientists to bias toward novelty and not publish null results. I think the solution to this is for journals to accept based on methodology and subject matter of experiments/studies, and to commit to publish before the study is even done, so there's no chance that results can influence the decision to publish.
2. Science journalism being written by people unqualified to analyze the science who instead sensationalize for clicks.
I'm not particularly concerned that the people are "only reading papers from big shots". The problem is that prestige is measured by impact analysis rather than replication. If you incentivize a metric which is largely a popularity contest for ideas, it should be no surprise that what you get is popular lies. One might think that the focus on impact in prestigious journals means there might be higher replication in less prestigious journals, but from what I can tell, the less prestigious journals are no better: they're mostly targeting the same metrics, they're just less successful at reaching their goals.
My intent is not actually to accuse anyone of intent to deceive here; I just can't quite figure out the right word. "Untruths" doesn't quite match what I'm trying to say.
Some people will have the best of intentions, they will really be doing "science" out of the goodness of their hearts and out of their desire to help mankind, but they'll end up publishing trash that gets acceptance and sometimes even praise.
What are the incentives that need to be instilled to fix the problems we currently have? I'd say we need to incentivize competence and humility. How? I don't know exactly, but values like these seem to be able to be instilled through cultural practices and traditions.
Also, especially humility seems to be lacking from many cases of bad science. If people accepted criticism and accepted that they don't really understand all that much, I believe scientific quality would improve. You do have a lot of reasons to not be humble in academia though, as this comic lays out.
I kind of agree, but I will state it somewhat differently. Note my experience is in physics and healthcare, so may not apply for all fields.
In my experience, the desired skill set shifts to more management/admin/bureaucracy/money-chasing once you're in a professor or professor-like position, as opposed to the nitty-gritty researcher of the grad school phase. The incentive in the grad school phase is good science. The incentive in the professor-like phase is grants/papers/awards.
Problem is I think that the competent are few, and when the cultural norm is that they must be humble then they stand no chance against the many incompetent.
IM(H)O: Science shouldn’t be humble in the face of non-science. As long as it is it will lose. The idea of conflict free great science is a pipe dream. We need a culture that accepts (intellectual) conflict.
However, I think that competence really only has a chance if the incompetent are humble. There will be conflicts and these conflicts should be embraced and someone who is incompetent and not humble will fight the existence of conflict rather than the actual scientific issue that needs to be solved or understood.
Real science is a hell of a lot harder than p-hacking and HARKing your way to a great career. At least some of the incompetent know this. They will not play nice.
Engaging in that conflict with each other is how we expose new ideas for further analysis.
By staying truly humble in the face of non-science, science provides a calm, even backdrop against which it is more manageable to separate worthwhile findings from bullshit.
This is orthogonal to conflict. A conflict could be handled humbly (Rapoport’s rule and all), or the opponents could drown the signal of their arguments in the noise of pride.
To abandon humility would be to fight noise with more noise.
I don't think this is a cultural norm so much as just the Dunning-Kruger effect in play. People who are highly competent still realize there is much they don't know, and that makes them humble. I suspect if you go and find someone widely recognized as an expert in pretty much any field and ask them if they know all there is to know, you'll find no one says yes.
Was Richard Feynman humble?
1. We are all - well, almost all of us - incompetent in many aspects of our lives, and competent only in some.
2. The incompetent are often not willing to simply cede their place and let the competent do what (arguably) needs to be done.
3. Proving and verifying competence is quite difficult unless you are yourself competent, and close to the field in which the competence in question lies...
First, about identifying bad research: the "extraordinary claims require extraordinary evidence" rule is already practiced. It's not the "we've overturned quantum theory" articles that are causing the problem - those are quickly and effectively shot down. And it is rare that the "perfectly aligns with a political interest" test can be applied. The only actionable one is "see what others think about it", and it's no panacea either.
Bad and fraudulent science like the recently retracted Surgisphere covid paper is abundant. I was trying to track down the origin of the "reduce salt intake" and "limit egg consumption to no more than 2 per day / 2 per week" recommendation in the past, assuming there was hard science behind them. There isn't; and indeed they're slowly being reversed everywhere - but they were prevalent for half a century, with a lot of other research taking them as axioms.
The remdesivir trials have been p-hacked to death - anyone who took an interest could see it happening in real time - yet the scientific community turns a blind eye.
The ketogenic diet is vilified in every mainstream media outlet and most nutritional "science" publications; the headlines are rarely in line with the actual results, but the headlines are what people remember. (Not that nutrition science is really science.)
And the other recommendations about fixing it are comparable to "let's solve the evils of the US two-party system! All we have to do is make those two parties vote to take away their own power". Academia and science publishing are where they are now because it benefits essentially all the incumbents (at the expense of the rest of society).
The problem description (at least in the comic) is good. Any suggested action ... not so much.
Nutrition science is real science, but unfortunately any actual nutrition science you might accidentally hear about is overwhelmed by people trying to sell a lifestyle, a book about healthy eating, or your eyeballs (to advertisers).
Ironically keto promoters are a big offender here themselves.
It's easy to study large populations and find correlations. You publish a paper, the media reports something, and that becomes the accepted wisdom. But you have no idea how to factor out all the possible confounders in your study.
It's hard to do a study where you make a change to people's diet and see if that affects health outcomes, so it is rarely done. And really that's what you need to do to see what is really going on. So what we are left with is a bunch of associations that may or may not hold up.
Each of the suggested actions, taken together, would seem to be a marked improvement on the status quo. Would you care to explain why the suggested actions would be "somewhere between ridiculous and useless"?
Frankly, having worked in academia long enough to see at least a couple shifts in culture, the only thing I can see coming out of this is a couple more items added to the ever-growing checklist for publishing a paper or submitting a grant application.
I think we need to get away from the sort of thinking where large structural problems can be solved by tiny incremental improvements. If you really want to solve the problem, one or more of [Granting Agencies|Journals|Universities] has to be completely torn down and built back up.
Sort of: a huge portion of income is from grants, particularly after the first few years from being hired. More importantly, a huge portion of the University's income is from grants. When a researcher receives a grant, there is an "overhead" percentage that goes to the University. Universities hire, in part, to maximize those overheads, which means getting the researchers with the best chance at getting big grants.
Changing the hiring process may affect how PhD students act, but once they're "in the system", they are subject to all the same problematic incentives.
In my decades at it (digital side of bioinformatics), the cash flow is in the other direction.
right, and unless the new institutions are in a financial vacuum, they will remain built on and affected by broader systems, resulting in conflict of interest.
Not OP, but the proposed "solutions" not only add more work items to the ever-growing checklist (as mentioned by meow1032), but to be useful everyone must spend even more time checking everyone else's work items:
Solution 1, requiring data sharing and preregistration, greatly increases the work of peer review, perhaps by an order of magnitude. Someone needs to check that the data produces the published results and that the final analysis plan matched the preregistration. That is hard, time-consuming volunteer work, with no reward incentive. Peer review currently trusts that the authors did what they said they did, correctly, and it still takes 4-12 hours to review an article. Most reviewers cut corners. If no one does the work to check the open data or preregistrations, "open science" will be merely performative, with no quality improvement.
Solution 2, changing hiring policies to "look beyond publication and citation numbers", is pretty much what hiring committees already do. But with ~200 applicants per job opening, the depth of examination per applicant is somewhat limited. As in solution 1, lack of time for deep checks is a problem. Applicants who are well-networked with good pre-existing reputations (i.e., who are plugged into the web of trust) get hired; everyone else doesn't. From a perspective of research quality, this may be a good thing.
Solution 3, funders fund boring / rigorous research, could improve matters in theory. But with only enough money to fund ~ 1 in 10 proposals, projected impact will always be an overwhelming concern. Proposals will include a "research integrity" section or similar and nothing substantive will change.
Solution 4, scientists "vote with their feet" (stop participating in the dysfunctional parts of the system), is a call for people to come up with their own solutions or support other proposed solutions, not a solution in its own right. Ironically, it is perhaps the most useful because it pushes back on the idea that poor quality science is inevitable under the current structure. "Perverse incentives" must not become a generally accepted excuse to sacrifice scientific integrity for the benefit of one's own career. Science is meant to discover new information. Without a culture of integrity, that information will always be suspect, regardless of what top-down interventions are attempted.
An effective intervention must reduce workload or at least break even, not increase it. Or increase the resources available. Otherwise people will be forced (actually forced, not just incentivized; there are only so many work hours each day) to cut even more corners elsewhere to make up for lost time.
In theory, people like the idea of making science better.
In practice, people don't like the idea of fewer papers, boring papers, ambiguity in hiring and tenure, uncertainty on financial return-on-investment, and less institutional/national pride talking points.
The bad incentives and metrics we have haven't happened by chance - they have emerged from our collective desire for science to be useful, sexy, and a reliable function of money/effort spent.
that's a bit unfair. yes, it does require that in some semblance, but that's not necessarily the one and only lever we have. most of us realistically expect, i hope, that we'd need a variety of political maneuvers to move our democracy toward a more representative and less insular direction.
science is no different--it took many small steps to get into this situation, and it will take many small steps to get out, any one of which will seem wholly inadequate on its own.
At this point in time, many of the people in power (full professors, administration, etc.) see papers as essentially useless, in part because of all the problems being discussed here: p-hacking, guest authorship, and so on. Papers are seen as a dime a dozen and somewhat ignored.
This might sound good until you understand that this means that they're dismissing the entirety of the scientific dialogue essentially, and that their alternative is grant money. My institution had training workshops for grad students where faculty would tell students that research is essentially worthless unless grant money is attached to it. The idea is that papers are a dime a dozen, and that if something is worthwhile, the feds will put their money where their mouth is so to speak.
The problem with this, of course, is that the grant system is horribly nepotistic and distorted. The biggest predictor of a successful grant application, last time I looked (based on empirical research in peer-reviewed journals), is co-authoring papers with someone on the review panel. Grant receipt is only weakly related to citation metrics, and from personal experience I can say that institutions are constantly encouraging researchers to inflate grant costs to bring in more indirect costs.
You could fix everything about journals and the publication process and it would do nothing about the shadow scientific world that exists in parallel that drives all the rest of the problems.
Eliminate indirect funds, require tenure for all researchers, set aside funding mechanisms for researchers that aren't tied to specific grant applications, randomize grant awards, deprioritize journals, ... there's a lot that needs to happen.
* Maybe increased sodium intake is linked to poor health outcomes because highly processed foods are linked to poor health outcomes. We know that sodium content increases significantly during food processing and that most highly processed foods are really unhealthy.
* Maybe people in early stages of renal failure are more likely to progress to a noticeable state if they consume lots of salt. Then it would stand to reason that people with healthy kidneys have nothing to worry about.
I think that a requirement to publish results either way on any study that's been pre-registered is important.
Now, I don't have a good solution either, unfortunately. What might work is that we require replication work for a PhD or have a certain percentage of a journal dedicated to verification. That, combined with some meta-studies to reward people with citations for replication, might work without fully swimming against the current.
It's a hard problem, really.
Unfortunately, there is no funding for such research, which I find really sad. Private grants might need to show "impact", but state-run grants don't have that many constraints, and they could conceivably offer such funding.
I don't think this will work. All it will do is devalue replication studies, because only PhD students would do them. It's also not in their best interest, especially if they dispute the findings of established researchers.
Also, we have to get away from the idea that the scientist's job is to think and write, and that literally all of the other work can be shuffled off onto low-wage (or no-wage), low-status workers. This is one of the biggest reasons that science is going through such a crisis. If you want enough papers to consistently get grants, you probably need at least 4-5 PhD students every few years. This causes a massive glut in the job market. It also dissociates scientists from their work. I've met esteemed computational biologists who could barely work a computer. All of their code was written, run, and analyzed by graduate students or postdocs. They were competent enough at statistics, but that level of abstraction from the actual work is troubling.
> All it will do is devalue replication studies, because only PhD students would do them. It's also not in their best interest, especially if they dispute the findings of established researchers.
Most studies are done by students regardless, so it seems unlikely that replication studies would be devalued merely because they're done by students. Although disputing the findings of established researchers can be risky, they would be publishing jointly with their PI (or, with the above implementation, multiple PIs), not alone with no support. Few students want to stay in academia, so it usually doesn't matter to them if a professor at some other institution gets offended. Most importantly, if everyone is doing replication studies, there will be so many disputations flying around that any particular person is less likely to be singled out for retaliation.
1. Studies can be much more expensive than most people think. In my field, a moderately sized study can easily cost $100,000+ if you're only accounting for up front cost (e.g. use of equipment, compensating participants). Someone would have to foot the costs of this.
2. Studies can be incredibly labor-intensive. PIs can get away with running studies that require thousands of man-hours because they have a captive market of PhD students, postdocs, and research assistants all willing to work for low wages or for free. PhD students usually don't have the same amount of manpower.
3. For obvious reasons, studies that require high cost and high manpower naturally tend to get replicated less. In other words, the least practical studies to replicate happen to also be the most necessary to replicate.
A couple of things I would dispute:
> it seems unlikely that replication studies would be devalued merely because they're done by students
I think academics value work in a particularly skewed way. There is "grant work" and there is "grunt work". Grant work is anything that actively contributes to getting grants for one's institution. Grunt work is everything else. PhDs can do grunt work, but that doesn't mean it will be valued on the job market. For example, software development is actively sought after in (biology) grad students, because it's a very useful skill. However, I've also seen it count against applicants for professorships, because it shows they spent too much time on "grunt work". Software development skills don't win grants.
> Few students want to stay in academia
In some fields there aren't any options except to stay in academia or academia adjacent fields.
> how do differences of culture — however defined — interact with traditional economic mechanisms involving prices, incomes, and simple comparative statics? Are those competing explanations, namely cultural vs. economic?
Richie's answers are mostly focused on changing the culture of science, and while there are lots of ways we could change the incentives, none of them would be pretty.
Example: let's say we want to more closely align research and actionable results, e.g., a product a company can use (Brian Armstrong argues for something like this).
Solution: radically reduce public funding for scientific research and for university education as a whole (in line with Bryan Caplan's arguments in "The Case Against Education"). Academics, who would be many fewer in number, would then have to get more of their funding from companies, who (presumably) would:
A) guide them towards asking market-relevant questions, and
B) have a clear incentive to check the data, re-run the code, etc. -- so that the product they built based on that research didn't flop.
I think most people would recoil at this proposal. But that's what comes to mind when I think about fixing the incentives rather than the culture.
P.S. Small nitpick: Richie gives an _example_ of a perverse incentive in lieu of a definition.
Science beholden to business just proves whatever is convenient for business owners, not the truth.
Most long-term high impact findings come from foundational research, not applied science.
Also scientists would then just tune their research to sound good to investors.
Tuning economic knobs in an area that should be as free from outside pressure as possible seems counterproductive.
No, being stricter in rewarding rigor over perceived usefulness is the way to go.
Wouldn't that make many scientists (both good and bad) move to other countries where funding is easier to get?
FWIW, and I didn't clarify this enough in my post -- I meant this as an example solution of something that would change the incentives rather than the culture, not necessarily as my own full-throated endorsement; I do personally think that steps in this direction would be for the best, but it's not as though there wouldn't be downsides that would need to be managed/mitigated.
Bit pedantic but you should use they instead of he/she. People can have other pronouns.
When I've talked about this with colleagues, they've argued this would make problems worse, and it might, but I think the idea was more that once you reach a certain threshold you should be funded some amount.
Of course, I'd argue that this is basically the idea behind tenure at an R1 institution, that once you reach a certain level the state or private board of trustees is paying your salary to do research. But nowadays tenure at an R1 institution is proxy for external grant dollars, which defeats the purpose and is redundant.
The real problem is indirect funds, which create profit for institutions. The federal government needs to eliminate indirect funds, so that grant dollars aren't perversely incentivised and tenure is based on research quality rather than profit.
One of those other pronouns is literally "they", so you're still risking using the wrong pronoun by using "they" for someone who prefers to be referred to as "he".
Was anyone else taught the rule that when writing about a third-party whose gender is unknown, use your own pronouns?
On the other hand, if you really think UBI is a good solution, then I'm afraid you don't know how an economy really works.
> On the other hand, if you really think UBI is a good solution, then I'm afraid you don't know how an economy really works.
Could you elaborate more?
Zero Hedge on why UBI doesn't work: https://safehaven.com/markets/economy/Why-Universal-Basic-In...
The Guardian on why UBI doesn't work: https://www.theguardian.com/commentisfree/2019/may/06/univer...
Tenure doesn't really help with this problem in the lab sciences. While it lets faculty members keep their jobs, it usually doesn't come with enough funding to do experimental work. In fact, while you can "keep" a tenured position without grants, many places find ways to...discourage that (move your office to a shoebox in the sub-basement, crappy teaching and service assignments).
I don't even think we have a good word for what this practice is, but I'll go with "Art" because it takes a lot of that.
His book on Immigration is a large-scale version of this skill in practice and I suspect a lot of HN readers might enjoy it, regardless of if you agree with his points or not.
Where you spread refuted papers out through citations to other scientists and newspapers.
It could be included as a factor when hiring scientists.
And of course the person who refuted a false paper should receive the citations of that false paper. It's only fair.
This would form the basis of the Crank Index of a paper, which can be simplified into a stoplight system: Good research with good sources is GREEN. Getting featured in the Journal of BS earns a paper the esteemed distinction of a blaring scarlet RED, citing a RED paper will mark a paper ORANGE, citing ORANGE research leaves you YELLOW... Throwing together lots of ORANGE and YELLOW citations will nudge your paper up the spectrum towards RED.
This would incentivize researchers to not only care about the quantity of the citations they share with each other, but to be extremely vigilant of the quality of those citations as well.
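The color-propagation rule above can be sketched in a few lines. Everything here - the numeric levels, the thresholds, the function name - is hypothetical illustration of the idea, not an existing system:

```python
# A toy sketch of the stoplight system described above. The numeric levels,
# thresholds, and names are all hypothetical illustration, not a real system.

RED, ORANGE, YELLOW, GREEN = 3, 2, 1, 0  # higher = more suspect

def crank_index(own_flag, cited_colors):
    """Combine a paper's own flag with the colors of the papers it cites.

    Citing a RED paper marks you ORANGE; citing ORANGE leaves you YELLOW;
    many ORANGE/YELLOW citations nudge you one extra level toward RED.
    """
    # Each citation passes on its color, attenuated by one level.
    inherited = max((max(GREEN, c - 1) for c in cited_colors), default=GREEN)
    # Lots of mid-spectrum citations accumulate into an extra penalty.
    suspect = sum(1 for c in cited_colors if c >= YELLOW)
    bump = 1 if suspect >= 3 else 0
    return min(RED, max(own_flag, inherited) + bump)
```

So a GREEN paper citing one RED source becomes ORANGE, and a pile of ORANGE/YELLOW citations pushes it further up the spectrum, which is exactly the vigilance-about-citation-quality incentive described here.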
As for a list of unambiguously bad papers, we do have Retraction Watch: https://retractionwatch.com/. It's mainly a retraction tracker, but there is also associated community effort to proactively identify research misconduct.
expands on: ...
depends on: ...
alternative approach to: ...
This doesn't mean that only a vanishingly small percentage of papers are wrong - only that it is very hard to identify errors, because papers usually don't contain enough information to fully reconstruct the results. There are a lot of assumptions of good will in the system.
> [journals] can demand scientists share their data, and to prove that they've written down their analysis plans before they touch the data
I wonder if this doesn't gloss over a deeper underlying problem: journals have traditionally assumed the copyright of the paper. Journals themselves have an incentive to obfuscate and protect the underlying data and content.
Ultimately, any complex system or institution will be more susceptible to gaming when it is mature and its value proposition clearly established. Anti-gamification is hard to design into the early stages of a system when it is needed most.
It really is the incentives themselves that are the problem: looking just at the number of publications and citations (or even the citation counts of the journals your articles happen to be published in) when determining whom to fund or hire.
The problem there is that those metrics are relatively quick and easy to obtain, and they are accepted because they are what's been used so far - even though plenty of research has pointed out their flaws. And anything new that is proposed as a replacement for those metrics (whether other metrics or other systems of evaluation) is dismissed for not being proven to live up to a standard that the currently used methods don't meet either, or for not being available quickly or easily enough. (Which is reasonable - e.g. it's not viable to read and properly evaluate all the research of your applicants.)
(Disclosure: I do volunteer for a project, https://plaudit.pub, that tries to offer an alternative nevertheless.)
• Some older calculations were run with somewhat older versions of the code. Of course we believe that the results wouldn't change, and we recalculated some, but not all. We didn't keep track of exactly which version was used for which calculations, because that's simply very demanding in the middle of a complex research project.
• Some data in the text and tables of the paper are still extracted manually from the code. We don't have a full templating system where the data could be automatically inserted into the paper. You could use something like Jinja to do it, but then every coauthor needs to have high technical skills and it's just time-consuming to maintain in general.
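For what it's worth, a much lighter-weight version of that templating idea needs only the Python standard library, so coauthors don't need Jinja or any special tooling. The variable names and values below are made up for illustration:

```python
from string import Template

# Keep computed numbers in one place and substitute them into the manuscript
# text, so nothing is transcribed by hand. All names/values are illustrative.
results = {"n_samples": 412, "auc": "0.87", "p_value": "0.003"}

manuscript = Template(
    "We analysed $n_samples samples and obtained an AUC of $auc "
    "(p = $p_value)."
)

print(manuscript.substitute(results))
# -> We analysed 412 samples and obtained an AUC of 0.87 (p = 0.003).
```

It doesn't remove the maintenance burden entirely, but it does eliminate the manual-extraction step where copy errors creep in.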
Doing the bare necessities of science (recording methods for results) can be almost automated:
Put all code in Git
then in your script add something like:
git rev-parse HEAD >> run.log
Also pull requests.
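To flesh out the `git rev-parse` suggestion, here is a minimal Python sketch of the same provenance logging. The helper names and the `run.log` file name are just for illustration, and `current_commit` assumes the script is running inside a Git checkout:

```python
import datetime
import subprocess

def current_commit():
    """Return the Git commit hash of the current checkout (assumes a repo)."""
    return subprocess.check_output(
        ["git", "rev-parse", "HEAD"], text=True
    ).strip()

def provenance_line(commit, when=None):
    """Format one run.log entry: an ISO timestamp plus the commit used."""
    when = when or datetime.datetime.now(datetime.timezone.utc)
    return f"{when.isoformat(timespec='seconds')} commit={commit}"

# At the top of an analysis script, one might then write:
# with open("run.log", "a") as log:
#     log.write(provenance_line(current_commit()) + "\n")
```

That way every result in the log is traceable to the exact code version that produced it, which addresses the "we didn't keep track of which version was used" problem above.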
I've heard several people ask for this, but never understood why. Most citation formats let you include page numbers; you can usually work in other location information ("See Foo et al. (2020)'s Figure 3A") too.
Probably best that you read through whole papers instead of looking for one sentence.
I'm currently studying biochemistry and have a few years of experience as a software engineer. In trying to dive into the papers in the field and just trying to replicate the data analysis, I came to see how bad the state of data and code availability is. It varies a lot between subfields, but overall the current state seems pretty abysmal.
Besides, reporting is good because others can do alternative analyses on the data.
Personally I try to put as much on GitHub as possible.
For example, imagine if scientific papers were voted up or down by a community, kind of like stack overflow.
Or imagine if scientific papers had to publish all of their source materials and instructions for replicating the experiment, and there was a system for tracking and showing whether or not the experiment had been validated or disproven?
What if you "game-ified" scientific papers and gave people points for publishing, but also gave people twice as many points for disproving a paper?
Imagine if we had a platform for tracking scientific theories and experiments that was a combination of democratic/meritocratic administration (like wikipedia), change logging/tracking (like github), and reputation management (like stack overflow)...
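The points scheme in that gamification suggestion could be prototyped trivially; the point values below are arbitrary placeholders, not a proposal for the actual weights:

```python
# A toy scoring rule for the gamified idea above: publishing earns points,
# and a confirmed refutation earns twice as many. Values are placeholders.

PUBLISH_POINTS = 10
REFUTE_MULTIPLIER = 2

def score(events):
    """events: a list of "publish" / "refute" actions for one researcher."""
    total = 0
    for kind in events:
        if kind == "publish":
            total += PUBLISH_POINTS
        elif kind == "refute":
            total += REFUTE_MULTIPLIER * PUBLISH_POINTS
    return total
```

The hard part, of course, isn't the arithmetic but deciding who gets to certify that a paper has actually been disproven.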
To fix the system will take a more honest look at the incentives of the people/institutions who create the incentives, and so on.
"Science should be based on solid data: published, auditable, peer-reviewed numbers. Data is good, data is objective, data is truth.
"Academic hiring is broken. We can't base academic hiring on numbers because people game the numbers. In academic hiring we need to be subjective, to evaluate the intrinsic merit of each researcher. Data is corrupt, data isn't sufficiently subjective, data is flawed.
Something like an infobox with the p-value, whether the study was pre-registered, sample size, funding organization, whether it was double-blind, etc.?
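Such an infobox is also easy to make machine-readable, which would let aggregators filter on red flags automatically. A sketch, with field names that are only guesses at what the summary might carry:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical machine-readable version of the suggested infobox.
@dataclass
class StudyInfobox:
    p_value: Optional[float]   # headline p-value, if any
    preregistered: bool        # was the analysis plan registered up front?
    sample_size: int
    funder: str
    double_blind: bool

box = StudyInfobox(p_value=0.04, preregistered=False,
                   sample_size=24, funder="ExampleCorp", double_blind=False)

# A reader (or aggregator) can then check red flags at a glance:
flags = [box.p_value is not None and box.p_value > 0.01,
         not box.preregistered,
         box.sample_size < 50]
```

Here all three flags fire, which is roughly the "at a glance" warning the infobox is meant to provide.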
This comic tackles the Academia side of things, but a lot of that motivation comes from press coverage. If the press were better equipped to be critical of bad studies, Academia would give less credence to the same.
Even problems of cherry-picked data would be partially exposed eventually, and eventual exposure is still a very effective deterrent in science. My 2c.
This doesn't take care of citation rings, but does move the needle towards reporting the actual value of a paper/researcher.
"Wrong: Why experts keep failing us--and how to know when not to trust them Scientists, finance wizards, doctors, relationship gurus, celebrity CEOs, ... consultants, health officials and more"
which begins with an interview with John Ioannidis and goes on to discuss in detail why so many academic (and expert) publications are wrong and how they got that way.
Am I missing something? Is there great pay before you've taken a lot of time to move up the ladder, assuming you are fortunate enough?
Help me understand your reasoning rather than just downvoting me.