It might help to expand on "bogus". Bogus has a few levels, going from "not good, but possible from a well-intentioned author trying to do the right thing" (low-level bogus) to "outright deception" (high-level bogus). Small sample sizes, statistical errors, and flawed (but honest) experiment design are all, I suggest, low-level bogus. Faking data and plagiarism are high-level bogus.
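
To make the "small sample sizes" point concrete, here's a minimal Python sketch (my own illustration, with made-up numbers; not from any real study). Even when there is no true effect at all, some comparisons clear p < 0.05 by chance, and at n = 10 per group the effects that do clear it are necessarily large, so an honest author can publish an impressive-looking result that is pure noise.

    # Minimal sketch: small samples plus significance filtering inflate effects.
    # All numbers are illustrative; the simulated true effect is exactly zero.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n, trials = 10, 10_000
    sig_effects = []
    for _ in range(trials):
        a = rng.normal(0, 1, n)  # control group, true effect = 0
        b = rng.normal(0, 1, n)  # "treatment" group, also no true effect
        _, p = stats.ttest_ind(a, b)
        if p < 0.05:
            sig_effects.append(abs(b.mean() - a.mean()))

    print(f"false positive rate: {len(sig_effects) / trials:.3f}")  # ~0.05
    print(f"mean |effect| among significant runs: {np.mean(sig_effects):.2f}")
    # close to a full standard deviation, even though the true effect is zero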

I think peer review is capable of, eventually, mitigating low-level bogus. The quantitative standards in fields where low-level bogus is a problem (e.g., but definitely not only, medicine) are rising. Peer review is not a scalable solution to high-level bogus, though. Rooting out high-level bogus seems to be almost a full-time job [1]. You cannot expect that level of effort from researchers, especially when they review for free; I would even argue that it's easier to fake data than to detect that it's fake. It also takes more expertise to assess the quality of research than to write and submit a low-level bogus paper. There's a mismatch here: there are not enough expert reviewers to handle all the bogus papers.

The solution therefore seems to require some kind of reputational component. There needs to be a cost to engaging in high-level bogus. But this is a hard problem. Do you ban any lead author of a paper with demonstrated high-level bogus? Publicize their names? Ban any author of a paper with demonstrated high-level bogus? Throttle the submissions any one person can make to a conference/journal at a time? I don't know. But the current model will have to change.

[1] https://www.ft.com/content/32440f74-7804-4637-a662-6cdc8f3fb...

It's also possible to be functionally bogus while doing everything 100% by the book. If you control things extremely well, you can inadvertently make your result so narrow that it's basically meaningless, at least given the way the scientific system functions today. If people worked together more concretely to build on prior results, this effect might not be so bad.

One of my favorite examples is a mouse study that did not replicate between two genetically identical mouse populations, raised in the same conditions and run by the same lab. The difference was the supplier the two sets of mice originally came from, and the researchers were able to pin the cause down to differences in the gut microbiome between the two (in fact, to one particular bacterial strain). That is an example of great research, but the vast majority of studies will never catch something like this before publishing, because they use only one mouse supplier to keep things controlled while minimizing costs.
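
To see how a study can be run flawlessly and still fail to generalize, here's a minimal sketch in the same spirit (illustrative numbers only, not taken from the actual study): a treatment whose true effect depends on a gut bacterium that differs between suppliers. Both labs do everything correctly and get contradictory answers.

    # Minimal sketch of a hidden moderator: the treatment works only when a
    # particular gut bacterium is present, and suppliers differ on that axis.
    # All effect sizes are made up for illustration.
    import numpy as np

    rng = np.random.default_rng(1)
    n = 30  # mice per group

    def measured_effect(has_bacterium: bool) -> float:
        true_effect = 2.0 if has_bacterium else 0.0  # assumed interaction
        control = rng.normal(0.0, 1.0, n)
        treated = rng.normal(true_effect, 1.0, n)
        return treated.mean() - control.mean()

    print(f"supplier A (bacterium present): {measured_effect(True):.2f}")   # ~2
    print(f"supplier B (bacterium absent):  {measured_effect(False):.2f}")  # ~0
    # Neither lab made an error; the result is simply conditional on a
    # variable nobody measured or varied.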

Because designed replication studies are fairly rare, and failures to replicate in biology often go unpublished, our interpretation and downstream use of these highly controlled studies is extremely inefficient.

But that isn't really the fault of individual researchers. Technically, they're applying the scientific method correctly to their niche problem. It's the definition of the problem, coupled with how we combine results across groups, that causes the inefficiencies. As problems of interest grow in complexity, we can't define them at the scale of individual labs anymore, and for some reason we've addressed this by breaking them into subproblems in this homogeneous way. Then we just assume that piecing together the little results will work great...

Anyways, I agree there is also straight-up fraud and blatantly bad practice at the level of individual papers; it's definitely a continuum. Sometimes such bad results slip into the mainstream of science, or even have a huge impact on subsequent funding directions, as with the fabricated amyloid beta paper. But I do suspect that for the most part the blatantly bad work stays on the fringes, and the largest negative impact on scientific productivity actually comes from a level of abstraction up.
