This is an issue with using citations as the measure of everything. If I am writing a paper and mention a particular study that failed to replicate in subsequent attempts, I still cite the original paper. It is understandable that a seminal or surprising study gets cited well after it is refuted. Take for example Jaeggi's studies claiming that working memory training increases IQ...a claim shown not to replicate in countless follow-ups. I myself have cited a number of her papers, right before I also cite the contradictory evidence. I would expect similar patterns in other papers. A citation isn't an endorsement of the findings, but rather an acknowledgement that a piece of work is worth mentioning, whether in a good or bad light.
An alternative to Google Scholar is Research Rabbit, an AI-enabled platform that maps the networks between papers, allowing an author to see who cited the paper they intend to cite, which papers the original paper cited, and so on.
This is a systemic social problem, not a technology problem. If researchers wanted to double check every single one of their citations for subsequent retractions, they would simply have their graduate students do it.
The root of the problem is that it’s impossible to build a scientific career out of publishing negative results (assuming you can even publish them at all). What actually matters is that your original publication gets into a high tier journal and gets good press. Once that’s done, no one cares if what you published was garbage or not, because by the time anyone figures that out (if they ever do), you’ll have tenure, big fat multi-year grants, and will be out on the speaking circuit.
It is absolutely unconscionable that the "peer review" process doesn't include actual review. Replicating results should be a hard requirement for publishing a conclusion. Anything short of that is merely a hypothesis with some suggestive evidence, a conjecture. Imagine if mathematicians cited published conjectures but called them proven results.
This is a common refrain, but if this were a requirement, science would absolutely grind to a halt. There is just too much science being done. And in some fields, what scientists publish is generally sound, if incremental (and maybe of dubious importance).
I think the tradeoff we have taken is "lots of science, mostly good" vs "little science, 100% correct". A similar tradeoff is made with software & Wikipedia, for example.
Or do you apply the same rigorous standards to, say, software? Should every piece of software that is publicly distributed be thoroughly reviewed for correctness?
It probably should grind to a halt because the incentives for “research” are skewed to commerce to such a degree that the structure of the investigatory mechanism is foundationally flawed
Honestly I don't find that process hard because I will not allow myself to live in a state of cognitive dissonance - conflicts in logic MUST be resolved. If there is no epistemic resolution possible, then we have nothing to say about this question with any degree of certainty. We can literally ignore all other information that claims to resolve it because we are not smart enough to actually define the problem.
For example we can make no certain conclusions on the "hard problem of consciousness." As a result there is nothing concrete about the concept of consciousness, because it's not even a well-defined enough question. Anyone attempting to make a claim is making it in an epistemic vacuum. So it doesn't matter who is talking about it; the state of knowledge of the concept of "consciousness" is so lacking that it doesn't even make sense to discuss.
In my view most of the persistent problems in humanity are in this class: we haven't defined measurements for the issue in question well enough to actually be able to approach a solution.
I always keep in mind that: "I'm probably wrong about my position"
If you are seeking data to confirm your idea then you'll be wrong more often than not
If you are seeking data to falsify your idea then you'll be less wrong more often
Hard disagree, set aside 30% of the budget from the get go by law and use that to let others you have no contact with try to replicate your work. I'm willing to bet that if this were tried earnestly, 99% of research would be compatible.
You are imagining a solution to a very hard social problem. It is trivially simple on paper, but realistically extremely difficult to implement when you think it through.
The fundamental issue: what is in it for the reviewers in this case? Scientists are people, and you can't really legislate interest in replicating other people's work that is unrelated to your own, unless you really spend money on it.
I'm there with you, it's a culture and social issue. That said, the argument that it's not fun seems quite weak to me. Scientists have to do source citation, formatting, peer review and many other things that aren't intrinsically done for selfish hedonistic reasons.
Paper is a technology. This systemic social problem around paper has the quirks that it does because paper has limitations (you can't add highlights and margin notes after you've given it to somebody else).
If we stop replicating those limitations, it's not unreasonable to expect that the social problem will change also.
I don't think the answer is search engines in the traditional sense, but we probably do need something that knows how to search. If whatever we use to view research were to display warnings wherever a retracted citation was viewed (or whenever one of the citations' citations was retracted...), similarly to how we display SSL certificate expiry warnings in a browser, I think that would create a dampening effect on the momentum that a retracted paper can have.
"Retracted" probably isn't the only color we'd want here. "Replicated" might be another. My point is that summoning up-to-date metadata on a something published last year should just be the default view mode, not something you have to ask a grad student to go do.
>If we stop replicating those limitations, it's not unreasonable to expect that the social problem will change also.
This is called technological determinism and is not an accurate heuristic for how humanity adopts tools
Humans generally adopt tools broadly after the social environment allows it, not when the tool is capable. This has been shown over and over. No tool has ever been immediately and universally adopted upon its introduction.
There’s almost always a period of introduction, then decades of middling adoption and refinement, then the social climate changes and production adjusts enough to adopt en masse.
This was true for every major invention and is an artifact of human social structures
I don't see how these are incompatible views. Maybe it takes a decade, maybe it takes a century, maybe it doesn't happen at all... building the alternative remains the first step. Certainly no adoption is going to happen before the medium exists. Also we wouldn't be publishing articles like this one if there wasn't already some desire to change things.
This is a huge issue. What's the point of doing replication studies (which we need) if folks just don't care about the result?
It's heavy-handed, but I think every time a study fails to replicate, every single published paper that uses the refuted study should be notified: the authors get 6 months with a warning banner on top of the article, and if a correction is not submitted, it's automatically removed.
Science isn't supposed to be easy. We need a system that prioritizes high quality output, not volume.
I'm not sure why somebody would cite a failed replication unless they're doing a lit review or meta-analysis. Most would just stop citing the refuted paper to save word/citation counts.
so rather than compare the total number of citations, you want to see that (a rough sketch of both checks follows the list):
- citations per year drop once refuted
- if citations continue after refutation, they co-occur with the refutation study
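Here is roughly what computing both signals could look like. The records, field names, and years below are made up purely for illustration; a real analysis would pull citing-paper metadata from a citation index.

```python
# Rough sketch of the two checks above, on made-up citation records.
# Each record: the citing paper's year plus the set of works it cites.
citing_papers = [
    {"year": 2014, "cites": {"original"}},
    {"year": 2016, "cites": {"original"}},
    {"year": 2017, "cites": {"original"}},
    {"year": 2019, "cites": {"original", "refutation"}},  # cites both
    {"year": 2022, "cites": {"original"}},                # ignores the refutation
]

ORIGINAL_YEAR, REFUTATION_YEAR, CURRENT_YEAR = 2012, 2018, 2024

def refutation_signals(records, original_year, refutation_year, current_year):
    before = [r for r in records if "original" in r["cites"] and r["year"] < refutation_year]
    after = [r for r in records if "original" in r["cites"] and r["year"] >= refutation_year]
    acknowledged = [r for r in after if "refutation" in r["cites"]]
    return {
        # signal 1: does the citation rate drop after the refutation?
        "citations_per_year_before": len(before) / (refutation_year - original_year),
        "citations_per_year_after": len(after) / (current_year - refutation_year),
        # signal 2: of the citations that continue, how many co-occur with the refutation?
        "share_co_citing_refutation": len(acknowledged) / len(after) if after else None,
    }

print(refutation_signals(citing_papers, ORIGINAL_YEAR, REFUTATION_YEAR, CURRENT_YEAR))
```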
Yes, this is a good point. Tracking citations of the original work in the absence of failed replications would be a more informative metric to measure whether a refuted study was still being used to justify arguments.
Furthermore, it doesn't appear that they're looking at whether the reason the paper was cited was the subject of the failed replication.
To analogize to the legal field, which they mention in the article, if a case I cite is good law on one point of law, but no longer good law on another I may cite it as _Plaintiff v. Defendant_, 123 F.3d 456 (1998) (reversed on other grounds, _Defendant v. Plaintiff_ 78 S.Ct. 910 (2000))
Just because one finding in a paper wasn't replicated doesn't mean that the paper can't or shouldn't be cited for other findings.
I LOVE the new twitter functionality where others put "bad information" into context. Almost as good as the HN 'flag' button. Something like this should become standard everywhere.
The community notes? It works surprisingly well overall, although it does get gamed sometimes. Notably, one criticism is that Elon himself gets a lot of community note suggestions, but they don't end up getting posted.
Maybe publication websites should be required to add a mark “Refuted by: <other paper>” in the PDF, similar to how RFCs have “Obsoleted by” and “Updated by”.
There's no incentive for publishers to do that, but a good scientist will explore an area before embarking on it. This is where meta-analyses and citation mapping tools come in handy.
At 40 years old, what’s clear to me is that while people have some curiosities they pursue, most have no desire to be rigorous about epistemological questions surrounding their curiosities. This extends well into people’s professional disciplines in my experience.
So even for the small percentage of people who have some narrow focus of inquiry (usually at work) they only have the desire or ability to evaluate the first or second order inputs and effects.
In general this isn’t a problem for day to day life. The packaged chicken on the shelf fits perfectly into a recipe, and there is little to no thought of any other externalities (what systems am I “voting with my dollars” to support? What is the relative difference in supply chain between two options?)
However if you are going to try to align your conceptions of how systems/environments work and what you desire for a future state, then you do have to do the system decomposition and that’s actually non-trivial.
So there’s no real “market” for epistemological rigor outside of a small group of people who come across as unreasonable in their demand for rigor, so this is the likely result of a system that is built for personal or organizational success and not an ego-free search for “truth” via intentional rigor.
Luckily the small group continues to grow as a total number
It takes too much work to verify everything in every field personally, you wouldn't be able to move at all. The problem is a complete lack of interest in any tools that would inform people about this, because entire industries and fields are built by bad papers.
We seem to have a lot of hypertext systems that keep track of citations. Seems like it would be cheap at this point to put a black mark on papers that don't replicate, and to propagate that black mark to every paper that references the bad paper without also referencing the failed replication. Put a big red X over the abstracts.
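As a sketch of that propagation rule (with a made-up citation graph; a real system would query a citation index and a replication database), it could be as simple as:

```python
# Sketch of the rule described above: mark every paper that cites a
# non-replicating study without also citing the failed replication.
# All names and data here are hypothetical.

CITES = {
    "study_X": set(),                                # the study that failed to replicate
    "replication_of_X": {"study_X"},                 # the failed replication itself
    "followup_1": {"study_X"},                       # cites X, ignores the replication
    "followup_2": {"study_X", "replication_of_X"},   # cites both, so no mark
}

FAILED_REPLICATIONS = {"study_X": "replication_of_X"}

def black_marks(cites, failed):
    marked = set(failed)  # the non-replicating studies themselves get the mark
    for paper, refs in cites.items():
        for bad, replication in failed.items():
            if bad in refs and replication not in refs and paper != replication:
                marked.add(paper)
    return marked

print(sorted(black_marks(CITES, FAILED_REPLICATIONS)))
# ['followup_1', 'study_X']
```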
> It takes too much work to verify everything in every field personally, you wouldn't be able to move at all.
Yes and no. The scale of the checking problem is formidable, as you're aware. However, many claims in scientific papers are weak. Even highly cited papers can contain nonsense that doesn't stand up to mild scrutiny. Given that, I think progress can be made towards the problem if every researcher simply spent more time checking previous work. Checking everything is impossible, but progress can be made by prioritizing, dividing the work up among everyone, and doing checks that don't take much time.
The first version of this paper was published in 1991 and has over 300 citations. It's still cited to this day. Despite this, in 2016 I read the later journal version of the paper and pretty quickly realized that the first part of the paper was basically nonsense. You don't need to look hard to realize that either! The problem can be found by simple spot checks, looking at the trends of the equation. (I can't quite recall, but I think that's how I figured out that the model was wrong in the first place.)
My paper refuting the model to date has zero citations that aren't from me. The closest I've received so far was an email from someone who was curious that I was apparently the first to notice the problem, over 20 years after the first publication of the model.
> It takes too much work to verify everything in every field personally, you wouldn't be able to move at all.
Does it? The rate of paper production is so high that it has become a logistical problem in and of itself and wild goose chases are common. Surely some of that energy could be profitably redirected into replication and consolidation.
The black mark system could work but seems like a recipe for some nasty politics. I tend to think that baking replication into the culture might be a better approach -- science advanced beyond secret-hoarding when information sharing became institutionally valued, and I believe that it could advance beyond p-hacking if replication became institutionally valued. Fund & cite replication efforts and the rest will follow.
A related point is made in Michael Huemer's _In Praise of Passivity_, where it is used to argue that almost everyone would be better off simply not getting involved in political processes like voting at all. [1] I think it generalizes to most domains.
Rigorous knowledge in any domain is both difficult to obtain, and it usually commands a small premium compared to the other things you could have sunk your time and attention into. I see at least two options for getting around this:
- Wirehead yourself into thinking truth is beauty and beauty truth, and then tilt at what many others will consider windmills (the fools!). At its best this is probably what folks like Isaac Newton were up to.
- Go all in on the profit motive, embrace competition as a refining fire to force you to be more rigorous or go broke trying. This reduces the problem to knowledge being "just" really hard to obtain.
There may well be more, those are just the two most obvious ones to my eyes.
I can’t think of anything worse than those approaches
It’s pessimistic in the extreme and throws one’s hands up in nihilism
I prefer pushing through that nihilism and into realizing that the absurdity of the universe is not a fixed barrier but a challenge to be solved for the entirety of the universe
I guess that's a third approach you can take, if you're already a nihilist on some level.
I'm not, so I find the message of finding niches people haven't pushed hard enough into to make the world a better place (and maybe make some money while you're at it) to be pretty uplifting.
In my experience, there's little to no reward for rigor and people mostly invoke it when somebody says something they don't want to hear. People are impressed enough if you cite some bullcrap even if it's been refuted since they themselves don't have the rigor to actually question if what you cited is correct. They usually don't have the rigor to even check your original source in the first place.
It's an existential problem for the species that we can encode into our structures conceptions of the way the universe works that are measurably wrong
It is well understood that bloodletting doesn't work, but there was a time when the most trusted people performed it
Why is that even possible? There was absolutely no epistemic proof that bloodletting was associated with a falsifiable hypothesis which would lead to improved health outcomes - in fact it was all based on these "humors" in the body that needed to be "balanced" which makes no mathematical sense whatsoever.
Even then, as early as the 1400s there were scientists saying "This is actually bad" - and yet George Washington had bloodletting done on him in his dying repose at the end of the 1700s
300 years of an activity that is bad for you that was encoded into medical practice
The conclusion we should come to is that MOST of our assumptions about the structure of the world and how to optimize within it, are based on literally nothing but aggregated social velocity, instead of claims that are measurable and testable
The other problem with rigor I forgot to mention is that rigor generally means:
1: Refusing to make a firm affirmative statement on an issue, when that's what people want
2: Correcting what you have claimed in the past, which makes you look gullible and indecisive
With the example of bloodletting, people wanted to believe a doctor knew how to cure them, and once somebody had done bloodletting even if they had doubts later, they would look like fools if they convinced people they had been harming their patients all along.
It's easier and often more rewarding to be a fraud.