1 in 4 Statisticians Say They Were Asked to Commit Scientific Fraud (acsh.org)
527 points by brahmwg 77 days ago | 218 comments

I was going to give a lecture across several departments about my PhD research in bioinformatics. The night before the talk I was generating some new figures and saw something weird, which led to me digging through source code and discovering a bug which invalidated the last 18 months of research and all my conclusions. I went to my advisor with this problem and he told me to present it anyways. I refused. I suffered consequences for that, including getting stonewalled by my advisor whenever I tried to publish a paper. I wish I had actually been in a position to refuse.

This is par for the course with computational research. I discovered a bug in my code the last week before submitting my PhD dissertation. Luckily, all of my code was organized in a pipeline that could automatically regenerate everything on the fly, but I needed a supercomputer. The queue was too long for Titan, so I set up an impromptu cluster on Azure (it was the only cloud provider with Infiniband at the time) and paid $200 out of pocket to regenerate the correct figures.

I wouldn’t be surprised if a significant portion of published computational research has bugs that totally invalidate the conclusions. I think we need to push hard to require all taxpayer-funded research to make any code that results in a journal article publicly available.

A few days ago, in the description of a paper posted on Hacker News (https://news.ycombinator.com/item?id=18346943):

> We also noticed significant improvements in performance of RND every time we discovered and fixed a bug [...]. Getting such details right was a significant part of achieving high performance even with algorithms conceptually similar to prior work.

They call bugs 'details' which, I find, is a frightening state of mind for someone publishing an algorithm.

> They call bugs 'details' which, I find, is a frightening state of mind for someone publishing an algorithm.

It really depends on the algorithm. For example, a bug in the random number generator of a stochastic search algorithm that affects, say, the variance of a distribution won't have a relevant impact on the outcome.

That could have a huge impact. For example, if it affected random draws of a hyperparameter in a Bayesian model, it could ultimately lead to incorrect credible intervals in the posterior distribution. Or worse, the RNG bug could affect the variance of random samples in some deep component of an MCMC algorithm like NUTS or even simple Metropolis. Depending on the exact nature of the bug, it could even cause the sampler to violate detailed balance, entirely invalidating sampling-based inferences or conclusions.
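As a toy illustration of how an RNG bug can break detailed balance (this is a hypothetical sketch, not any real sampler's code): a random-walk Metropolis chain whose proposal noise has a hidden mean shift silently stops targeting the intended distribution, because the proposal is no longer symmetric and no Hastings correction is applied.

```python
import numpy as np

def metropolis(log_target, drift, n_steps, seed=0):
    """Random-walk Metropolis targeting log_target; `drift` models a
    hypothetical RNG bug that shifts the mean of the proposal noise."""
    rng = np.random.default_rng(seed)
    x, samples = 0.0, np.empty(n_steps)
    for i in range(n_steps):
        x_new = x + rng.normal(drift, 1.0)  # symmetric only when drift == 0
        # Without a Hastings correction, a drifted proposal violates
        # detailed balance, so the chain no longer targets log_target.
        if np.log(rng.random()) < log_target(x_new) - log_target(x):
            x = x_new
        samples[i] = x
    return samples

log_target = lambda x: -0.5 * x**2  # standard normal, up to a constant

good = metropolis(log_target, drift=0.0, n_steps=50_000)
buggy = metropolis(log_target, drift=0.5, n_steps=50_000)
print(good.mean(), buggy.mean())  # the buggy chain's mean is biased away from 0
```

The buggy chain still "works" in the sense that it runs and mixes, which is exactly why this class of bug survives into publications.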

Well that is why you would always run statistical validity tests on the RNG and other intermediate values whenever using a Monte Carlo model in production. Ideally the tests should run as part of a Continuous Integration workflow.

If you (dubiously) write your own RNG, sure. But you should never do that. And for library RNGs, you should execute the unit tests of the library. Frankly, running it as part of CI is at best overkill and at worst adds complexity that costs you. If you pin versions of your dependency and isolate the artifact into an in-house artifact repository, so that the library code is literally never changing unless you explicitly modify the version, then you should test it up front, then again only occasionally if you actually have evidence of a bug. And as part of any code review of the code change that introduces a version change.
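For what such a statistical validity check could look like in practice, here is a minimal sketch (function names and thresholds are illustrative, not from any particular library): a chi-square goodness-of-fit test on a stream claimed to be uniform.

```python
import numpy as np

def chi_square_uniform(samples, n_bins=20):
    """Chi-square goodness-of-fit statistic for samples claimed ~ U[0, 1)."""
    counts, _ = np.histogram(samples, bins=n_bins, range=(0.0, 1.0))
    expected = len(samples) / n_bins
    return ((counts - expected) ** 2 / expected).sum()

rng = np.random.default_rng(42)
ok = chi_square_uniform(rng.uniform(size=100_000))

# Hypothetical bug: squaring the draws skews them toward 0.
bad = chi_square_uniform(rng.uniform(size=100_000) ** 2)

# With 19 degrees of freedom the statistic should sit near 19 for a healthy
# stream; the buggy stream exceeds any sensible critical value by orders
# of magnitude.
print(ok, bad)
```

A check like this is cheap enough to run as a unit test against pinned library versions, per the parent's suggestion.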

Computational research is incredibly difficult because it's usually hard to see the effect of a bug. A triangle drawn in the wrong place on the screen is easy to see, but a typo in your integration subroutine? Hard to spot if you don't catch it when it is born.

I also had a few do-overs at the end of my thesis, but fortunately had a cluster standing by...
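To make the integration-subroutine point concrete, a contrived sketch (not from anyone's actual code): two composite trapezoid rules differing by a one-character slicing typo. Both errors "look small" on a smooth integrand, which is exactly why only a regression test against a known answer exposes the bug.

```python
import numpy as np

def trapezoid(f, a, b, n):
    # Correct composite trapezoid rule on n uniform intervals.
    x = np.linspace(a, b, n + 1)
    y = f(x)
    return (b - a) / n * (0.5 * y[0] + y[1:-1].sum() + 0.5 * y[-1])

def trapezoid_buggy(f, a, b, n):
    # One-character typo: y[1:-2] silently drops the last interior point.
    x = np.linspace(a, b, n + 1)
    y = f(x)
    return (b - a) / n * (0.5 * y[0] + y[1:-2].sum() + 0.5 * y[-1])

exact = np.e - 1  # integral of e^x over [0, 1]
err_good = abs(trapezoid(np.exp, 0.0, 1.0, 1000) - exact)
err_bad = abs(trapezoid_buggy(np.exp, 0.0, 1.0, 1000) - exact)
print(err_good, err_bad)  # the buggy error is orders of magnitude larger, yet both look "small"
```

Without the known exact value, a glance at the buggy output would raise no alarm at all.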

>A triangle drawn in the wrong place on the screen is easy to see, but a typo in your integration subroutine? Hard to spot if you don't catch it when it is born.

Well, there's also this notion of testing and regression. As I said in another comment a few days ago:

>A few weeks ago I had a conversation with a friend of mine who is wrapping up his PhD. He pointed out that not one of his colleagues is concerned whether anyone can reproduce their work. They use a home grown simulation suite which only they have access to, and is constantly being updated with the worst software practices you can think of. No one in their team believes that the tool will give the same results they did 4 years ago. The troubling part is, no one sees that as being a problem. They got their papers published, and so the SW did its job.

Wow. If it's not reproducible, can it even be called science?

It's not really much different from a physical experiment, where the expectation is that you'll have to rebuild the experimental apparatus yourself to reproduce the experiment.

An independent implementation of the experiment is necessary for a full reproduction anyway. If you just run their code again, you'll end up with all their bugs again.

(But, don't get me wrong. I like when researchers release their code. It's still very useful.)

This can't be upvoted enough. It makes me think that using published source code and reproducing/validating results are completely orthogonal. Maybe it's even a good thing when the source code for science doesn't get published.

I like the concept of rewriting all the code by an unbiased third party to see if they can reproduce the results, but in practice what this leads to is:

1. People will not bother. It took a lot of minds to come up with the software used (in my friend's case, several PhDs' worth of work). No one is going to invest that much effort to reinvent the software libraries needed to get it to work.

2. Even when you do write your own version of the software, there are a lot of subtleties involved in, say, computational physics. Choices you make (inadvertently) affect the convergence and accuracy. My producing a software that gives different results could mean I had a bug. It could mean they did. It could mean we both did. Until both our codes are in the open, no one can know.

It is very unlikely that you'll have a case of one group's software giving one result and everyone else's giving another. More like everyone else giving different results.

Case in point:


HN discussion:


This issue comes up often on HN, and it used to come up a lot in /r/science (maybe it still does - I left that subreddit years ago).

If there's one thing I could convey to the world from what I learned from my time in academia, it is this: Most scientists at universities do not care about reproducibility.[1] Not only that, many people intentionally omit details from papers so that it is hard for rivals to reproduce their work - they want the edge so they can publish without competition. This isn't a shadowy conspiracy theory - this is what advisors openly tell their students. Search around on HN and reddit and you'll see people saying it.

[1] My experience is in condensed matter physics - it may not apply to all of academia.

People doing science programming are the worst programmers in the world. The reason is that they are focused on a calculation and result, not the program. I helped a guy speed up his program once. He was sorting around 10^4 to 10^5 elements using a bubble sort (which he had reinvented).
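For scale, a rough timing sketch of the parent's anecdote (sizes trimmed so it finishes quickly; exact timings will vary by machine):

```python
import random
import time

def bubble_sort(a):
    # The reinvented O(n^2) algorithm: repeatedly swap adjacent out-of-order pairs.
    a = list(a)
    for i in range(len(a)):
        for j in range(len(a) - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a

data = [random.random() for _ in range(5_000)]

t0 = time.perf_counter()
slow = bubble_sort(data)
t_bubble = time.perf_counter() - t0

t0 = time.perf_counter()
fast = sorted(data)  # built-in Timsort: O(n log n)
t_builtin = time.perf_counter() - t0

assert slow == fast
print(f"bubble sort: {t_bubble:.2f}s, built-in sorted: {t_builtin:.4f}s")
```

At the 10^4 to 10^5 elements mentioned above, the quadratic version takes minutes to hours in pure Python while the built-in finishes in milliseconds.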

Perhaps it's a reproducible way to build careers?

>> being updated with the worst software practices you can think of.

I can believe that. I have seen code in academia and it was all one or two-letter variables without even linebreaks between statements.

Open source doesn't ensure quality code.

Ideally the code would be part of the peer review process, but code review is really expensive, so who knows how that would play out.

True, but it does provide at least some measure of reproducibility. Quality of implementation and reproducibility are orthogonal and both very valuable in their own right.

> Open source doesn't ensure quality code.

Yes, but closed source helps ensure that low quality code is hidden from sight. It also means that people who distrust or doubt the conclusions have no chance to identify any bug(s) and disprove the results or conclusions.

It's simple:

We stop publishing in papers, and instead adopt smaller chunks of our work as the core publishing units.

Each figure should be an individually published entity which contains the entire computational pipeline.

Figures are our observations on which we apply logic/philosophy/whatyouwannacallit. Publishing them alongside their relevant code makes the process transparent, reproducible and individually reviewable, as it should be.

We can then "publish" comments, observations, conclusions etc on those Figures as a separate thing. Now the logic of the conclusions can be reviewed separately from the statistics and code of the figure.

A comparable solution would be for all involved to value all research, not just the ground breaking, earth shattering type.

As it is, research that yields a "failure" is buried. That means wheels are being reinvented and re-failed. That means there's no opportunity to compare similar "failures", be inspired, and come up with the magic that others overlooked.

Unfortunately, I would imagine, even if you can get researchers to agree to this the lawyers are going to have a shit fit. Imagine Google using an IBM "failure" for something truly innovative.

What you are proposing sounds a lot like the concept of the least publishable unit.


> Each figure should be an individually published entity which contains the entire computational pipeline.

I agree in principle. But, for the experimental sciences, we need better publication infrastructure to make this practically possible.

For example, consider a figure that compares, between several groups, the mechanical strain of tensile test specimens at a given load. Strain is measured by digital image correlation of video of the test. Some pain points:

1. There are a few hundred GB of test video underlying the figure. Where should the author put this where it will remain publicly accessible for the useful lifetime of the paper? How long should it remain accessible, anyway? The scientific record is ostensibly permanent, but relying on authors to personally maintain cloud hosting accounts for data distribution will seldom provide more than a couple of years' data availability.

2. Open data hosts that aim for permanent archival of scientific data do exist (e.g., the Open Science Framework), but their infrastructure is a poor match with reproducible practices. I haven't found an open data host that both accepts uploads via git + git annex or git + git LFS and has permissive repository size limits. Often the provided file upload tool can't even handle folders, requiring all files to be uploaded individually. Publishing open data usually requires reorganizing it according to the data host's worldview, or publishing a subset of the data, which breaks the existing computational analysis pipeline.

3. Proprietary software was used in the analysis pipeline. The particular version of the software that was used is no longer sold. It's unclear how someone without the software license would reproduce the analysis.

Finally, there's the issue of computational literacy of scientists. In most cases, the "computational pipeline" is a grad student clicking through a GUI a couple hundred times, and occasionally copying the results into an MS Office document for publication. No version control. Generally, an interactive analysis session cannot be stored and reproduced later. How do we change this? Can we make version control (including of large binary files) user-friendly enough that non-programmers will use it? And make it easy to update Word / PowerPoint documents from the data analysis pipeline instead of relying on copy & paste?

If any of these pain points are in fact solved and my information is out of date, I would be thrilled to hear it.
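On the version-control and pipeline point, even a minimal provenance habit helps. A hypothetical sketch (file names and layout invented for illustration): a script that regenerates a figure's underlying data from the raw input and records a hash of the exact bytes it consumed, so the published artifact is tied to a specific input rather than to a sequence of GUI clicks.

```python
import hashlib
import json
import pathlib

def file_sha256(path):
    """Hash the raw input so the manifest pins the exact bytes used."""
    return hashlib.sha256(pathlib.Path(path).read_bytes()).hexdigest()

def run_pipeline(raw_path, out_dir):
    """Recompute a figure's summary data from raw input and record provenance."""
    out = pathlib.Path(out_dir)
    out.mkdir(exist_ok=True)
    values = [float(line) for line in pathlib.Path(raw_path).read_text().split()]
    summary = {"n": len(values), "mean": sum(values) / len(values)}
    (out / "figure_data.json").write_text(json.dumps(summary))
    # The manifest ties the published artifact to the exact input bytes.
    manifest = {"input_sha256": file_sha256(raw_path), "outputs": ["figure_data.json"]}
    (out / "manifest.json").write_text(json.dumps(manifest))
    return summary

# Demo with a toy input file standing in for hundreds of GB of video.
pathlib.Path("strain.txt").write_text("1.0\n2.0\n3.0\n")
print(run_pipeline("strain.txt", "results"))
```

Nothing here requires programming sophistication beyond a grad-school scripting course, which is part of the point: the barrier is habit and tooling, not capability.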

1 and 2: I like IPFS for this; check it out.

3: Analysis that uses proprietary software is marked, appropriately, as second-class.

> computational literacy of scientists


I have two words for you: Ted. Nelson.

Can you expand on this?

I can’t speak for GP, but Nelson invented hypermedia/hyperlinks and had a vision for the future that included documents including other documents. All of that seems pretty compatible.

Similar to reproducible builds, or Nix.

Research just jumped onto Jupyter notebooks; it's halfway there. Someone should help with the remaining step.

The WWW was created to publish information at CERN, but we can use it in other contexts too ;) http://info.cern.ch/Proposal.html

Of course it won’t ensure anything, but currently being completely unable to reproduce results, even as the author but just a year from now, is par for the course.

It's not about code quality, it's about transparency and ease of reproduction.

Code review is cheap. I do it for fun. But it doesn't prove anything.

Science should prove things...

Science can never prove anything as a matter of principle; it can only disprove the alternatives. Math and logic can prove, but only within the model they have built up, which ultimately rests on axioms that one must simply accept.

Yeah, I'm aware of the strict theory.

Little of what I do, even with the most rigorous methods available and the best practices from both software development and computational science, proves anything.

I know. And I think it's a problem for science...

Logical proofs will never happen for software development, but surely standards for scientific programming can be tightened up a few levels!

I think I heard of some reform proposals from the Reproducibility Crisis reformers.

I more mean there are whole aspects of science that aren't provable without being able to actually obtain counterfactuals, and that means time machines

This is why I'm a bit skeptical about the global warming predictions. AFAIK they're all based on models comprising millions of lines of code. Not only is it monumentally hard to have such a large code base without bugs, but also, in the case of software implementing scientific models, finding bugs is extra hard (compared to software which works in a visible way and has millions of users, such as the Linux kernel, games, cars' embedded software, etc.). An effort to have such a model verified (say, to a NASA standard) to be bug-free would probably cost billions of dollars. And that's only bugs; let's not forget that all the models are approximations, plus the numerical methods used all have their quirks and limitations, etc. All in all, the problem seems too hard for humanity to tackle right now.

While reasonable skepticism is healthy, global warming is such a well-studied phenomenon by now that an unreasonable number of independent codebases would have to contain identical bugs for it to be false.

There's also the fact that we have had a pretty solid grasp about the chemical reactions of greenhouse gases since long before computers, both theoretically and empirically, and we know roughly how much is put out in the atmosphere.

Where the models diverge is on far finer points than what is needed to make the basic policy changes that seem to be where we are stuck right now.

> an unreasonable number of independent codebases must have identical bugs

Entertaining[0] badpun's skepticism, it is not necessary that they have identical bugs, only that their bugs yield similarly biased results.

For example, if a significant number of bugs are identified by their effect on the results, then bugs contributing to "wrong" results might be more likely to be identified and fixed.

[0] In the Aristotelian sense.

Not if the bugs are actually just bad/corrupted data. For example, the main dataset the IPCC is based on appears to contain all sorts of bad data that definitely make me skeptical of the conclusions the IPCC comes to (https://researchonline.jcu.edu.au/52041/).

If global warming is actually wrong, it's most likely because of bad/corrupt data in the datasets used.

Quite right. In fact, those statisticians in the posted article are surely mistaken. We know that using tricks to hide things is sound science.

http://www.realclimate.org/index.php/archives/2009/11/the-cr... http://www.realclimate.org/index.php/archives/2009/11/the-cr...

Since it was sound science in 2009, why would it not be now?

The counterpoint is that we have already seen a steady temperature rise. So even though the specifics of the various simulations might not play out as predicted we can expect temperatures to rise.

Knowing that the temperature will rise is not enough to make policy decisions, though. You need to predict the increase's magnitude, as well as practical consequences such as climate change, how much the sea level will rise, etc. And for that you need reasonable and bug-free models.

> how much the sea level will rise etc

This will be the least of our problems. We understand too little about nature to have any reasonable prediction model. We don't know the inflection point that will cause massive collapse of the major ecosystems we depend on.

All we know is that things are changing fast, faster than many non-human organisms are able to adapt.

Steady? We have seen a 20-year rise after a colder period. Nothing out of order.

How would the global warming predictions all be biased in the same way? _All_ the studies are measuring the same tendency: the temperature is rising. The model does not need to be _absolutely_ precise to be right.

> _All_ the studies are measuring the same tendency: the temperature is rising.

Are you talking about predictions (and not measurements, you don't need a model to measure temperature)? Assuming you are, there's unfortunately a huge problem with modeling (and heavy math and stats-based science in general), in that researchers tend to stop looking for bugs in the model when it returns the results that they expect. In other words, if a bug in the model tells the researcher that Earth's temperature will decrease by 4 C by 2100, he will look over the model until he finds the bug, but if the model tells him that the temperature will increase by 2 C, thus confirming his inner bias, he'll declare it correct and move on to writing a paper based on the "finding".

Alternatively, as a thought experiment, imagine if math research were done in the way climate science is done. We would have proofs that are millions of pages long and were never verified by anyone. We would trust in them only because the author says that they are correct. Is this science?

A given prediction can be wrong, and an experiment may be biased; my point is that you choose to ignore that the vast majority of experiments and measurements point in the same direction.

>>Researchers tend to stop looking for bugs in the model when it returns the results that they expect.

Again... they are _ALL_ biased?

It's not impossible - groupthink (https://en.wikipedia.org/wiki/Groupthink) has happened many times in the past among supposedly most brilliant minds.

Also, how many independent, comprehensive models (with codebases) for global climate change are there in the science world?

Could you perhaps indulge me by describing the action of CO2 in the atmosphere, as you understand it?

Can you explain why 300 ppm CO2 is totally normal and fine, but 400 ppm (parts per million) CO2 in the atmosphere basically means the world is doomed?

I'd be more than happy, but it would help if you answered my question, so that I can more effectively address your concerns.

CO2 produces a greenhouse effect by absorbing IR. This alone doesn't prove that industrially produced CO2 is, this time, mainly responsible for climate change. All the other times, science believes, the climate changed because of solar intensity.

> All the other times, science believes, the climate changed because of solar intensity.

False. Lots of past climate changes were due to changes on Earth and its atmosphere (and sometimes, specifically, life on Earth), not changes in solar output (e.g., notably, the Huronian glaciation believed to have resulted from the Great Oxygenation Event, which resulted from the exponential growth of photosynthetic life.)

Solar intensity has increased slightly over the last few billion years, but previous changes in climate have been driven primarily by Milankovitch cycles, volcanic emissions, and plate tectonics.

That the post-industrial rise in atmospheric CO2 is of anthropogenic origin is hopefully not a point of dispute, but it is demonstrable if necessary. Thus it remains to show that this must raise the equilibrium temperature. So, as you say, CO2 selectively absorbs outgoing IR. In the lower atmosphere, this actually does not have as much of an effect as you might think. Water vapor blocks quite a bit of the absorption spectrum, and the effect of CO2 is more-or-less saturated already.

The mean free path of an outgoing IR photon in the lower atmosphere is quite short. Absorbed photons are re-emitted in a random direction, but take an overall upward course, the mean free path rising with altitude. At the (radiative) top of atmosphere, the mean free path is infinite: the photon is more likely to leave Earth. At the edge of space there is essentially no water vapor, so the action of CO2 is greater. The effect of increasing the amount of CO2 in the atmosphere is to push the CO2-dense region of the atmosphere further out into space. Photons must take a longer path out of the atmosphere, which increases radiative forcing by 3.7 W/m^2 per doubling of CO2, commonly held to be equivalent to about 1 degree C of warming. This must be the case unless our understanding of thermodynamics is very wrong (and if you have an issue with thermodynamics then you have some pretty serious issues).
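For reference, the 3.7 W/m^2 figure follows from the widely cited simplified forcing expression of Myhre et al. (1998):

```latex
\Delta F \approx 5.35 \,\ln\!\left(\frac{C}{C_0}\right)\ \mathrm{W\,m^{-2}}
\quad\Rightarrow\quad
\Delta F_{2\times\mathrm{CO_2}} = 5.35\,\ln 2 \approx 3.7\ \mathrm{W\,m^{-2}}
```

Here \(C_0\) is the reference CO2 concentration and \(C\) the current one; the logarithmic form reflects the near-saturation of the main absorption bands.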

So, one degree C ain't so bad, right? Well, it wouldn't be if that were it. However, there are several problematic feedbacks. One is that melting a lot of ice lowers the Earth's albedo, which causes it to absorb more heat. Another issue is that there is a lot of this "water" stuff around, which is very readily absorbed by the atmosphere, in a manner that increases very sharply with temperature. Water vapor is a much better greenhouse gas than CO2 by all accounts.

Climate science is not an extrapolation from the temperature record. There is a solid minimum bound on the temperature effects of doubling atmospheric CO2, and a variety of amplifying positive feedback effects. So far, in the last twelve decades, we have not managed to find anything which would reduce those effects to something manageable. At this point, the effect would need to be both very large, in order to offset the strong H2O feedbacks, and very small, to not have been noticed. The most plausible option would be "something poorly understood about the H2O feedbacks". I believe the most successful of such theories would be Dr. Richard Lindzen's Iris Hypothesis, which has generally failed to find support. At this point, there are no particularly plausible mechanisms which would transfer this extra energy to space, and if those did exist, then they would not necessarily be a non-issue: even if thermodynamics and optics are entirely wrong, the planet is warming, and we will have to deal with that even if it can't be prevented.

If you have any other questions, or would like citations for any of the above, do feel free to ask.

Can you explain why a healthy body temperature is about 98 F, and why you can die if it gets too far from that?

Can you explain what caused the ice ages? 10 K rises and falls in a few years?

Glaciation is most strongly dependent on Milankovitch cycles:

  - https://en.wikipedia.org/wiki/Milankovitch_cycles
  - http://www.indiana.edu/~geol105/images/gaia_chapter_4/milankovitch.htm
Plate tectonics and volcanic activity have also influenced climate in the past, e.g. the closing of the Panamanian isthmus, or the formation of the Deccan Traps.

Interestingly, the original paper proposing AGW (in 1896) was actually intended to explain Ice Ages:

  - http://www.rsc.org/images/Arrhenius1896_tcm18-173546.pdf

"_All_ the studies are measuring the same tendency : the temperature is rising"

There could be lots of reasons for that. Anyway, the temperatures were not always rising: a clear temperature rise was observed in the 1930s-40s and the 1980s-90s, with cooling in the 1960s-70s. And yes, a prediction has to be precise. If a model predicts a rise of 3 K in 100 years and you measure 1.3 K, your model is wrong. It's even more wrong if it doesn't take into account any of the natural cycles, even if its prediction is accidentally correct.


He said he's skeptical, not that he thinks it's wrong. I've got to say that this kind of rabid response from climate change proponents kinda prods my inner contrarian.

I think at this point the groupthink is so strong that anyone who came out with evidence against global warming would commit career suicide by publishing it.

> I've got to say that this kind of rabid response from climate change proponents kinda prods my inner contrarian

Oh good. So you'd rather see millions of people be displaced. We're fucked as a species when "I'm a contrarian" is a valid reason to disagree.

I agree with your point, but note..

> they would commit career suicide by publishing it

Meanwhile some researchers build their careers by publishing this sort of stuff, even when it isn't well researched.

Automatically reducing an argument to saying someone is equivalent to an anti-vaxxer sidesteps the issue and is a logical fallacy. If there is a problem with the argument, elucidate the issues; "reductio ad X" arguments are not valid.

This is a large and nuanced issue, which requires more thought and argumentation than can usually be contained in an HN comment, but I think the reduced question here is whether computer code predicting a certain outcome should be trusted -- and, more importantly, whether major decisions should be made based on that prediction, given the nature of the bugs and problems we see in normal code. I believe we all agree that the potential consequences are far worse; we simply disagree on how to treat those consequences and where they come from.

Because there is so much at stake, it is best to be more sure of what is happening and to take the right choice, instead of the first one we thought was correct based on a limited, and potentially flawed, computer model. If the computer model predicts dire catastrophe, then we should take it seriously and run another one with even greater resources, in order to ensure the prediction is correct and to decide what course of action should be taken. Perhaps it is right in predicting catastrophe, but not in predicting the -right- catastrophe; in that case we could spend a large amount of resources on the wrong solution and miss the right one, which could end up being even more catastrophic.

If he has some actual basis for an opinion that contradicts the large majority of scientific consensus, then he should have at it. But to reject science out of hand with no reasoning other than "Well, there could be bugs" is just insane.

> So you'd rather just ignore all scientific results until somebody does a NASA-level code review because there might be bugs?

The bugs, if serious enough, can make it not scientific, in the same way that a paper with grave errors in it is wrong and thus obviously not scientific. So, before we do a thorough review of the code, we should treat it with due skepticism.

EDIT: this can also be expressed in terms of risk analysis. For typical software, the consequence of a bug is low - most software is commercial, so the impact of a bug will be limited to that company's bottom line (with the exception of software that can kill, but those people are already serious about bugs). Also, most bugs in most software are either highly evident (the button does not work, you get random segfaults, etc.) or have limited impact.

On the other hand, bugs in climate change models, given their "pipeline" nature (a wrong result from one module is propagated downstream all the way to the final prediction of expected temperature change), can quite often have severe impact. They can also be not evident at all - they could, for example, shift the final outcome prediction by 1 C.

Compound that with the fact these predictions are used to make trillion dollar decisions on global policies, and you can see that the actual damage done by bugs is not unlikely to be in the trillions. And that's why I say it's probably wise to subject the models to extreme scrutiny.

There's enough proof that we're destroying our environment on a grand scale - literally the only place that we can survive as a species. All of what you said is just rationalization for the kind of behavior that has gotten us to this point and will continue far into the future. It's easier to deny anything is wrong than it is to do anything about the problem. And I'm not at all surprised by how many climate skeptics there are here - intellectuals and really fucking smart people who are incredibly good at rationalizing their behaviors and beliefs to avoid the feeling of having to do anything and who think they're able to think about the subject matter more clearly and productively than the scientists who work on it day in and day out.

There is enough proof that the additional CO2 caused massive greening of the Earth.

> I think we need to push hard to require all taxpayer-funded research to make any code that results in a journal article publicly available.

"Already 18824 SIGNATURES – sign the open letter now!" https://publiccode.eu

Was there a large, I guess “significant” difference between your results pre- and post-bug?

I spent a couple months implementing a promising optimization technique that was published in Nature. I assumed that I was missing something or had a bug in my code because I could not reproduce any of the results. It turns out, one guy in my research group knew the author. He requested, on my behalf, the source code for the program used to generate the data for the publication. That program did not even come close to doing what the paper claimed, due to some very serious bugs. When I brought up these issues, nobody really seemed to care much. It wouldn't be worth anyone's time to try to correct the issue or write to the journal.

> It wouldn't be worth anyone's time to try to correct the issue or write to the journal.

I don't see how a research scientist could say this; something seems culturally wrong there. Nature is a pretty serious publication. You spent months on it, maybe someone else will. How can it not be worth your time to 'try to correct the issue or write to the journal'?

-Nature is full of bogus results, like all journals, if not more so. The reason for its high impact factor isn't consistently solid papers all around; it's a few very high-impact, foundational papers and a crowd of minor-impact/shaky ones that were sneaked in by big shots and friends of the editors. Look up retraction rates by journal and prepare to be surprised.

-I don't think you realize what a step it is to actually write to Nature and say: "Excuse me, that paper you published is wrong and you should retract it." You're liable to alienate all of the authors as well as piss off the editorial board for pointing out that they let a mistake slip through. You should be prepared to face vigorous backlash, and be completely confident in your own results if you don't want the repercussions to overwhelm you. Many communities in specific fields are small and niche, so making enemies of one team often means that you've cut yourself off from a good part of that community. You should be prepared for awkward moments at conferences when you meet each other, incendiary questions when giving talks, and scathing "anonymous" peer reviews of your future submitted work. It's far easier to assume good faith, give the authors the benefit of the doubt ("yeah, the implementation is buggy, but what software isn't? The main idea is probably sound") and move on.

I don't know about other fields, but in mine, there's an unofficial accepted consensus on a set of very high-profile papers that happen to be either complete bunk or utterly useless. Again, we're talking about Science, Nature, etc. These papers made a lot of noise when they came out years ago, and now no one in their right mind would base their work on them. Again, you can't just knock on Science's editorial board and go "Excuse me, that hyped paper from a very big shot is useless and you shouldn't have published it", so it's just something people in the community know and whisper among themselves. I guess it may seem strange to outsiders though.

> I guess it may seem strange to outsiders though.

Of course it seems strange. The output of the scientific/academic community is often presented to people as a source of truth, and scientists do little to dispel that notion. To learn that it's all rotten to the core is upsetting.

I don't think it is all rotten to the core. I think it is a human establishment trying to do something very complicated with varying results. Sometimes they get things right and sometimes they make mistakes and sometimes they are infiltrated by bad actors.

What I try to do when reading academic papers is, if there are multiple on the same topic, read them and try to understand the differences. If different papers come to the same conclusion, that should make you a lot more confident in the result. If there aren't many papers on the subject then you just need to understand that this is kind of a "best effort" thing that is more likely to be true than a random guess, but not certainly accurate.

It's not rotten to the core. It's a complex system, full of human beings, with egos and incentives and other reasons for doing things.

Different labs also do different things - mine is currently reworking a paper that was basically done because something we did might matter, and my students have test development as part of their workload when writing code.

It's not rotten to the core. There are good people in there, but good people are not rewarded, and advancement selects for some bad traits. Maybe that's scarier than it being rotten to the core.

It makes me wonder why more effort isn't put into finding alternatives to this fundamentally broken system.

Start paying people to find negative results, and you'll get negative results.

Unfortunately, you can't build a career out of doing QA for science. Nobody will fund you.

How do you propose getting started?

It's not rotten to the core, it's just imperfect, like all human communities are. Nepotism, conflicts of interest, human error, unconscious biases, etc. don't magically disappear just because we're in academia. As I said below: "All in all, I'd say we're doing fine. We're just not the ethereal source of truth that some people hold us to be, the very same people who, after claiming that 'God is dead', are very quick to replace Him with His silicon-based equivalent, around which we would act like priests, except with lab coats in lieu of clerical garments."

It's all about trust, really. Many people choose to trust state-of-the-art medical research when they do vaccinations on their kids; some do not. You can indeed choose not to trust us as a 'source of truth', and it seems a fair chunk of the US political landscape is being led down that path; we'll see where it goes.

If the most prominent repositories of research not only do nothing about substantiated challenges to published results but actively resist and retaliate against them, that sounds to me like something pretty rotten, at or near the core.

The core principles of science are in conflict with natural human behavior (which is one reason it took so long to invent science), so saying “they’re only human” is really no excuse.

Most of those other systems don’t claim to be designed to prevent them intrinsically via process, though.

I think it's one of those things you just have to learn to live with :/

And believe me, I hate being defeatist; it made me a lot angrier when I was younger, and I consider it one of the things that has stopped me from trying to go down the academic route so far.

While coders (who aren't just being boosters for particular technologies) have to live with the fact that the world runs on mountains of bad code, scientists, or anyone engaging with academia or journal papers, have to sleep at night knowing how much bad science is being done and how much the publication process is skewed toward publishing and politics over quality or fact-checking.

I can no more fix it than I can fix all software bugs or politics in big corporations...

We never claimed to be completely free of bias, or immune to nepotism. We're human.

I think you're making it out to be a bigger deal than it actually is. Most of these things are just noise. We just ignore it. If someone rises to prominence with outrageously fake results, rest assured that they will get shot down quickly by competitors (all the incentive to publish fake results lies within highly competitive fields). If you know what you're doing, and work in a lab where people know what they're doing too, you can still do pretty good science. I would guess that in all communities, academic or not, there's a nonzero amount of bogus, over-hype, nepotism, politics and what not, and a subset of people who actually get things done and make the whole field advance. After a while, the test of time truly determines what was actually useful and what is bunk. That's how it's always been.

Brian Wansink didn't get shot down quickly at all.

Pretty interesting use of 'assume good faith' there!

> people in the community know and whisper among themselves

If no one else, tenured academics should be calling out bunk in their fields; if they don't, they are failing, IMO.

The role of science in society is to increase the sum of knowledge on which progress can be made. If a scientist knowingly lets false information be published he is causing a disservice to society, harming progress.

Scientific papers are made public for a reason. The correct information should not be accessible only to a small circle of people in the know. If incorrect papers are out there, those who can correct them have a duty to do so.

>If a scientist knowingly lets false information be published he is causing a disservice to society, harming progress.

Often it's not so much false as empty or useless. Nevertheless, since everyone in a community knows the value of individual papers, and these are the same people 'making progress' in that field, I don't see how science is so much harmed. I agree the situation is kinda ridiculous, but it's not a "sky is falling" situation either.

>If incorrect papers are out there, those who can correct them have a duty to do so.

Sure. How much are you willing to pay me for me to prioritize this 'duty' over the literal hundreds of duties I already have at my lab?

> You're liable to alienate all of the authors as well as piss off the editorial board for pointing out that they let a mistake slip through.

Better rewards for uncovering bad science keep it from devolving into a political contest. If you're not seeing the right incentives in place, you're either missing something or you're seeing an opportunity to capture some value by implementing those incentives.

> Better rewards for uncovering bad science keep it from devolving into a political contest.

No. If you increase rewards for uncovering bad science, it will just lead to false accusations.

> If you're not seeing the right incentives in place

Some people think that the very existence of incentives is the problem here. https://hbr.org/1993/09/why-incentive-plans-cannot-work

> If you increase rewards for uncovering bad science, it will just lead to false accusations.

Isn't that what this study suggests is already happening, in as much as false accusations are just more bad science?

> Some people think that the very existence of incentives is the problem here. https://hbr.org/1993/09/why-incentive-plans-cannot-work

This article is full of opaque generalizations about human behavior that more accurately describe the baser impulses of individuals gaming systems they don't understand (participants in psychological experiments) than the actions of self-conscious professionals with any semblance of dedication to their fields.

Curiosity killed the cat and all that, but what is your field? I had a look at your profile, comments and submissions (hopefully that doesn't sound creepy!) and I can't tell.

It's Saturday morning (and I got students' papers to mark) :)

>I don't see how a research scientist could say this; something seems culturally wrong there.

I can tell you haven't spent time as a grad student at a top university in the US. There is almost no incentive for a research scientist to pursue this. They're very busy and stressed out, and this will not help them in any way. Your argument, that since one person wasted a lot of time, flagging it would spare others the same pointless effort, is a sound one, but you have to realize that in some (sub)disciplines, scientists view a good bulk of the research as problematic and a waste of time. Why go through the trouble for this particular case?

Also, I doubt a simple email to Nature would change much. There would likely be a somewhat lengthy process, which would suck up more of your time. And to be brutally honest, the chances are higher that the code did produce those results at some point, but the grad students/postdocs have since been modifying it for their next batch of research. Even something as basic as version control is unheard of in much of scientific research.

Definitely a cultural problem, as you describe it.

> it wouldn't be worth anyone's time to...write to the journal.

I'm sorry, what? It almost sounds like everyone in your lab is scum. Hopefully YOU at least spent the 60 seconds to write an email?

Grad students / postdocs / human lab rats aren't scum, the incentives just aren't in place to promote good behavior (such as calling other researchers out on their bullshit). If you're trying to acquire a vaunted tenure track job, you can't afford to piss off $senior_tenured_researcher_at_prestigious_institution, since $senior could blacklist you so that you won't get hired at the incredibly small set of universities out there. Sometimes things work out despite pissing off major powers (Carl Sagan technically had to "settle" for Cornell due to being denied tenure at Harvard, in no small part because of a bad recommendation letter from Harold Urey [0]), but not often.

Even if you do manage to get a tenure track job, you pretty much have to keep your head down for 7 years in order to secure your position.

And once you have tenure, you still get attacked vociferously. Look at what happened when Andrew Gelman rightly pointed out that Susan Fiske (and other social psychologists) had been abusing statistics for years. Rather than a hearty "congratulations", he was called a "methodological terrorist" and a great hubbub ensued [1].

When framed against these circumstances, it should be evident that there is literally nothing to gain and everything to lose from sending out a short e-mail pointing out that someone's model doesn't work.

[0] https://www.reddit.com/r/todayilearned/comments/15m8om/til_c...

[1] https://www.businessinsider.com/susan-fiske-methodological-t...

I'm a researcher myself and I guess this is one of those "does the end justify the means?" scenarios... Out bad research and its perpetrators, and science loses a scientist who actually wants to do good work. Or don't, and then watch yourself rationalize worse decisions later on for the sake of your research, slowly becoming as corrupt as they were and realizing that a lot of your cited work could potentially be as bad as (or worse than) the work you helped get published.

I really believe we need a better way. Privately funded / bootstrapped OPEN research comes to mind as a potential solution to bring some healthy competition to this potentially corrupt system. Mathematicians are starting to do this, I think computational researchers have the potential to be next.

> Grad students / postdocs / human lab rats aren't scum, the incentives just aren't in place to promote good behavior

The question is, would additional incentives promote good behavior or just lead to more measurement dysfunction? Some people think that just getting the incentives "right" is all that's needed, but actual research suggests otherwise.


Without reading through that very long text: the claim that incentives don't influence human behavior is wildly exotic.

There is near infinite evidence to the contrary. That said, constructing a system with "the right incentives" can of course be devilishly hard or even impossible.

The claim is that it does change behavior, but only temporarily, and that it doesn't change the culture in a positive way or motivate people; it ends up feeling like manipulation. That said, according to this article, the entire incentive system would need to be dismantled. Simply adding more incentives wouldn't necessarily produce higher quality, at least not in the long run. So the process of incentivizing amazing new research for funding is the primary issue, and adding incentives for pointing out problems would just be a band-aid.

This sounds like a good critique of naive incentive schemes.

I don't think there is any doubt that humans follow incentives.

But working out what the core incentive problems are, and actually changing them might be both (1) intellectually difficult, and (2) challenge some sacred beliefs and strong power structures, thus making it practically impossible.

The HBR article's discussion of incentives is not really quite what I was thinking of when I wrote my comment. Specifically, the article you cite refers to the well-known phenomenon of how introducing extrinsic rewards via positive reinforcement is counterproductive in the long run. I've often noticed this form of "incentive" / reward being offered in the gamification of open science, such as via the Mozilla Open Science Badges [0], which in my opinion are a waste of time, effort, and money that do little to address systemic problems with scientific publishing.

With regard to the issue of grad students being unwilling to come forward and report mistakes, incentives wouldn't be added, but rather positive punishment [1] would be removed, which would then allow rewards for intrinsically motivated [2] actions.

[0] https://cos.io/our-services/open-science-badges/

[1] https://en.wikipedia.org/wiki/Punishment_(psychology)

[2] https://msu.edu/~dwong/StudentWorkArchive/CEP900F01-RIP/Webb...

It's not at all uncommon that implementations provided with papers do not actually do what the paper says they do. You're often lucky when there is an implementation at all. But most of the time the problem is just that running the exact same implementation under the same conditions requires setting up a very specific environment, installing specific versions of libraries, using some niche software, and converting data from one byzantine format to another. Each deviation from the original paper is liable to subtly affect the results in nondeterministic ways. That's why no one really gets surprised or even cares; life is too short to call out all the bad software written in academia.

If a research result requires specific versions of libraries to replicate, wouldn't that completely invalidate any scientific claim being made?

Any change in precision or numerical methods that affects results surely must be well within the error margins.
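A toy illustration (made-up numbers, nothing from the thread) of why such differences live in the last digits: floating-point addition is not associative, so merely regrouping a sum, as a different library version or a parallel reduction might, perturbs the result at the scale of rounding error.

```python
# Regrouping the same three terms gives two slightly different sums:
a = (0.1 + 0.2) + 0.3
b = 0.1 + (0.2 + 0.3)

print(a == b)      # the two groupings disagree...
print(abs(a - b))  # ...but only at the scale of one rounding error
```

Differences of this size are exactly the kind that a sane error margin should swallow; if they flip a paper's conclusion, the conclusion was never robust.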

I was more referring to changes in the API where the input and output suddenly have to be in different formats in the middle of a pipeline, causing a crash. What can also happen is that somehow the old format is still valid and gets processed all the same, thus yielding nonsensical results. Sometimes a lab devises their own format which no one else uses, and the specification may be updated without notice between the moment they publish and the moment you try things out. Most people have no idea about things like 'backwards compatibility', 'unit tests', 'containers', etc. Code is just a tool to them and the fact that they had to write some is annoying them in itself.

> requires setting up a very specific environment, installing specific versions of libraries, using some niche software and converting data from one byzantine format to another

Containerisation is fairly mature and simple to use. Many in other fields struggle with these exact same issues and are able to create reproducible environments just fine.

I find it amazing that those publishing don't include their implementation; all that work locked away on a rusty HDD.
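For what it's worth, a minimal sketch of what freezing an environment could look like. Every image tag, version, path, and script name here is hypothetical; the point is just that the interpreter and library versions the published pipeline ran against get pinned, so a rerun years later uses the same stack:

```dockerfile
# Hypothetical example: pin the exact environment a paper's pipeline used.
FROM python:3.6.8-slim

# Pin the library versions that generated the published figures.
RUN pip install numpy==1.15.4 scipy==1.1.0 pandas==0.23.4

# Ship the analysis code alongside the environment definition.
COPY pipeline/ /opt/pipeline/
WORKDIR /opt/pipeline

# One entry point that regenerates every figure from the raw data.
CMD ["python", "regenerate_figures.py"]
```

Committing a file like this next to the code turns "set up a very specific environment" from folklore into a single build command.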

Containerization is not simple to use for many users of scientific code.

Hell, I'm pretty sure in my department, the first response would be "What's that?"

Containers may be a bit difficult, but every researcher should be capable of using a VM and distributing it.

I'm excited if I can get my colleagues to use version control.

Do you think that they would use a VCS that was less invasive and more transparent to their workflow, like Dropbox?

I'm thinking of making a VCS that simply runs in the background and:

- Automatically records every file save (effectively a git commit without a message)

- Allows adding messages through tagging (like git tag)

- Handles 'branching' just by asking you to make a copy of the directory with a different name, and properly understands how to diff/merge copied files/directories that have since diverged.

- Has built in support for large files
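The first bullet, automatic message-less commits on every save, can be sketched in a few lines. This is a hedged illustration, not a real tool: all names (`HISTORY_DIR`, `snapshot_if_changed`, `watch`) are hypothetical, it snapshots a single file by polling its mtime, and a real version would add tagging, diff/merge of diverged copies, and large-file support.

```python
import os
import shutil
import time

HISTORY_DIR = ".history"  # hidden directory holding the snapshots

def snapshot_if_changed(path, last_mtime):
    """Copy `path` into HISTORY_DIR if its mtime changed; return the mtime.

    Each snapshot is an automatic, message-less "commit". Saves within
    the same second reuse one filename; a real tool would disambiguate.
    """
    mtime = os.path.getmtime(path)
    if mtime != last_mtime:
        os.makedirs(HISTORY_DIR, exist_ok=True)
        stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(mtime))
        dest = os.path.join(HISTORY_DIR, f"{stamp}-{os.path.basename(path)}")
        shutil.copy2(path, dest)
    return mtime

def watch(path, poll_seconds=1.0):
    """Poll forever, snapshotting the file on every save."""
    last = None
    while True:
        last = snapshot_if_changed(path, last)
        time.sleep(poll_seconds)
```

Run `watch("analysis.py")` in the background and every save lands in `.history/`, no commands or commit messages required, which is the kind of invisibility a Dropbox-style workflow depends on.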

Almost certainly, yes.

Git is way more complicated than VMware.

Use SVN just like everybody else.

You obviously don't know many researchers (if any).

In the software dev world (BSD excepted) I would argue that containerization is extremely immature. Outside the software dev world, it is not easy to use.

Based on my experience on Academia Stack Exchange, in a parallel universe there is a student complaining that their advisor made them rework everything the night before a presentation.

Candidly, I probably would have told you to present it as is, and add a caveat to the last slide that this was work in progress (most internal presentations are assumed to be so) and you're still chasing down some problems in your code. The reason for this is two-fold:

1) It's the night before. Many students I have known and mentored aren't at the point in their career that they can "wing" a major presentation. It would be setting them up to fail in a way I couldn't shield them from (I can deal with changing results, but a bad presentation is largely on the student).

2) The quality of your checking is likely to be poor the night before. There are a number of times I've found an error as something was being prepped for presentation/poster printing/etc. and been convinced it changed everything, only to discover after 48 hours of thought and more checking that the difference in results was pretty negligible - especially in the sense of the qualitative takeaways from a presentation.

This, of course, may not have been your exchange with your PI. But I thought it was worthwhile that there are reasons not to have you change everything the night before that aren't the result of villainous fraud.

This seems like a great neutral path forward that shouldn't upset anyone and doesn't paint the research as 100% factual and set in stone yet. It leaves the door open to continue the work while rectifying the issues. However, it doesn't excuse the way the advisor asked them to just go with it, unless there was some nuance in that part of the conversation (akin to what you've stated) that we're not hearing.

Also in bioinformatics: I worked up some statistics that turned out to cleanly and robustly contradict my supervisor's previous research.

I was told to reverse my conclusions - refused - and, hey, I no longer work in bioinformatics.

Was it an option to present the paper as normal, but then have a sudden surprise twist at the end which left the audience both surprised and feeling the pathos of imagining how they would feel about discovering the last 18 months of their research was invalid?

Or am I mistaken in reading "present" as "stand up in front of a crowd of people"?

Just thought the same. There was a way to "follow orders" and satisfy your principles in a way that serves science. And if anything happened to him after that, it would be clear that it was a political firing.

This has me really curious about the state of software engineering principles in academia if you can find bugs that invalidate 18 months of work.

I'm not saying it should have been preventable. It just looks like there may be opportunity to improve practices.

Admittedly I'm completely naive to the domain. Are there no forms of validation checkpoints you can reach where your foundations are rock solid and well backed with tests and such?

I had a Prof in grad school who lost years of cryospheric research data because an external hard drive was stolen. This was in 2010. It was a head scratcher, especially being faculty-adjacent to some of the best CS and engineering faculties in Canada.

I've seldom seen a PhD student in CS here in Germany with a proper software engineering background. I know several who use Dropbox for version control on their code and don't know how to use git. Some at least know and use svn.

That stuff just isn't taught at universities, and it's assumed, like with so many things, that you pick it up along the way.

This might sound harsh, but that is something you have to deal with in life (academia or not). What you were tasked to do is present your work, and other people made room in their calendars to listen to your talk. Finding an error is no reason or excuse not to present your work. BUT you have to make it clear upfront that there is a problem which might invalidate the results, and maybe already give an estimate of how far-reaching the error is.

If you have been working 18 months on a topic you will have substantial knowledge you can communicate to your peers and often some work inspires other work even if they are not using your findings.

It took me less than a minute to find your adviser

https://www.linkedin.com/in/jake-boggan-6528944a/ -> http://www.biomath.gatech.edu/people/ -> Jake Boggan (Advisor: Bunimovich)

Probably not the best idea for the OP to post under their own name in a public forum.

Jake ask the mods if you can remove your post.

You rock. Thanks for your integrity. Sorry you had to go through that.

In the bigger picture, it drives me crazy that these results aren't welcomed. We learn from failures too. Funding sources should recognize this.

You could almost say we mostly learn from failures. If you had blind success with a complex method the only way to test the limits would be tweaking it to failure.

That's an incredibly eloquent way to put it. Even though it's how I live most of my life I was never smart enough to express it like you did. Thank you.

Why did you not try to publish those problematic results without your adviser's name on? There are venues focused just on negative/non-reproducible/problematic results.

Oh, scientific/research code is hard

First of all, a lot of the people writing it are inexperienced in software best practices. Then, turning formulas into code is hard, not to mention the hand-waving in some articles, like pseudocode that's almost complete except for a "do hard thing in this line" step, where that line expands into a lot of code.

I've also had some weird bugs in code like that (but nothing that would invalidate 18 months of results; by the way, did you know RAND_MAX is as low as 32767 in some compilers, like older versions of VS?)

(And computation times, though the cloud helps a lot with this)
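As a concrete (hypothetical, not from this thread) illustration of how "turning formulas into code" goes wrong: translating the textbook identity Var = E[x²] − E[x]² literally into floating point suffers catastrophic cancellation for data far from zero, while a one-pass stable algorithm (Welford's) does not. Both function names here are made up for the example.

```python
def naive_var(xs):
    # Textbook identity Var = E[x^2] - E[x]^2, translated literally.
    # Subtracts two huge, nearly equal numbers: catastrophic cancellation.
    n = len(xs)
    return sum(x * x for x in xs) / n - (sum(xs) / n) ** 2

def welford_var(xs):
    # Numerically stable one-pass algorithm (Welford, 1962):
    # accumulates squared deviations from a running mean instead.
    mean = 0.0
    m2 = 0.0
    for i, x in enumerate(xs, start=1):
        delta = x - mean
        mean += delta / i
        m2 += delta * (x - mean)
    return m2 / len(xs)

# Values clustered far from zero: the true population variance is 22.5,
# but the literal formula loses part of it to rounding.
data = [1e8 + d for d in (4.0, 7.0, 13.0, 16.0)]
```

Both functions are "the formula from the paper"; only one of them gives the right answer on realistic data, and nothing crashes to warn you.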

Good for you, +1 to the greater good for your sacrifice.

Do grad students have any leverage over their advisors? Surely there must be some incentive to root out professors that degrade the reputation of the school. Public shame? Accusations of fraud? There are more options here than students admit, and some other school would have loved to have you.

Part of the thing to keep in mind is that things like accusations of fraud are potentially career ending.

People, including professors, understandably react poorly to other people trying to end their careers. It's important to recognize that something like that is coming out swinging.

Also, in my experience, there's a tendency with many graduate students to conflate "unethical" with "I don't like this". Not saying that's true in this case, but an "incentive to root out professors" is likely going to result in some pretty strong undesirable outcomes as well.

> People, including professors, understandably react poorly to other people trying to end their careers.

Ironic given that this is the primary power of being a thesis advisor.

My mentees failing to progress will show up on my annual evaluation, and will certainly be a factor in my tenure case (especially since my position has no teaching component to be evaluated on instead).

It's also something the project officers on my grants watch.

It also means I lose whatever I've invested in that student.

One does not idly destroy their graduate students, regardless of what HN occasionally thinks.

No, graduate students in general have no leverage over their advisors. A single word from your advisor and you're out of the field.

There is little incentive to root out professors for any reason. The process of becoming a professor (grad school -> postdoc (N times) -> tenure track faculty -> tenured professor) is generally believed (by tenured professors, of course) to root out anyone unworthy of the position. You can believe what you want about the efficacy of such a process.

Public shame requires public understanding of scientific (mal)practice, so, good luck communicating that. Most of the time, the bad actors in question have already gotten papers past referees; what makes you think the public is capable of more thorough review?

Fraud is considered a serious allegation and as a result accusing someone of it requires going through a thorough process involving a host of university administrators, whose incentives are aligned with the profit motives of the university system.

Transferring graduate schools is essentially impossible, and even in the exceptionally rare circumstances that it happens, it always involves burnt bridges and often has to do with bigger fish (i.e. your advisor being offered a position elsewhere, and you're lucky enough they take you with them.) Without external funding to support you, you are usually replaceable. All graduate departments receive applications far in excess of the number of students they can support. They certainly will not consider taking on another from a school at which you've proven to be a problem. Academia has already established a quite successful leaky pipeline; the beginning (graduate school) is no different.

In academia, hierarchy is the rule, flat organizations the exception. You must purchase your influence, usually at significant cost (and luck is a significant component). As an undergraduate, the system is designed to cater to your interests; as a graduate student, you cater to the university's interests. Scientific integrity is a noble notion, and in some corners of academia, it survives, but it does so in spite of bad actors who thrive in a system designed to produce ten times the number of qualified applicants for each job, all of whom are judged according to easily gamed metrics. It would be nice if things weren't this way.

But the problem is, typically... if you decide to get a PhD in science, it is possible that you're already too obsessed with the subject to ever, truly, give it up, especially if it's "just" over working conditions. I can't speak for everyone, but most people leave because they were forced to.

Transferring grad schools is never easy, and even less so if the adviser decides to start a whispering campaign

There really needs to be repercussions for this type of thing. Naming and shaming is a good first step towards that utopia. If those responsible for these types of unethical actions never get punished they'll just keep doing it to the detriment of society.

The pressure to get positive results is just too high to be compatible with good science. When people's whole lives and careers are on the line all the time (publish or perish), they are going to do what they have to do to stay employed.

As in Gresham's law, bad science drives out good, because it is much faster to do bad science than good. Those who try to maintain quality can't pump out as many papers as those who don't, and so lose on the grant treadmill.

Any solution that does not address the incentives is doomed to fail. Not fixing this problem will kill science.

And that pressure to get positive results is extra high right at the beginning of scientists' careers, when they should be learning best practices! They do a big project for their doctorate, and if it turns out poorly, they can be screwed; there goes their chance to move on with their careers. Maybe they already had their next gig lined up, but it's time to put that on hold :( Or, maybe if we just make an extra assumption, or twist and turn the data just this way, using this model and these covariates... Oh! Here's something significant! Here's your PhD, welcome to science.

The pressure doesn't stop at any stage in your scientific career. It is bad at the PhD stage (get results or no PhD), at the postdoc stage (get results or no tenure track position), at the tenure track stage (get results or no tenure), tenured (get results or no grants).

About the only point you can actually slow down enough to care about quality is at the emeritus stage, but even then if you don't produce your lab will be moved into a broom cupboard.

One of the things I make my students do is design projects so that a myriad of answers are all potentially interesting: the likelihood of getting a result is high, and the particulars of the research determine what that result is.

Yes, I used to do something similar. All my students had two projects: one high risk, but glorious if it succeeded; the other boring, but publishable no matter the result.

It's like the Tour de France. If you're the type that never cheats, you find a different career, like barista.

If you cheat, you get "high ceilings" and a "marble fireplace" on the "upper east side":


Yep, it is disgusting. The end result is that science will come to be seen the same way as the Tour de France. Unlike the Tour, science is not something we can live without.

The destruction of the reputation of science is probably the most dangerous thing going on in our society right now. If we lose science we have nothing.

The incentives for research absolutely need to change, and I'm really happy that more and more people are both speaking up about it and starting initiatives to improve things.

We surely don't know the answer yet, but pre-registration of experiments/methods in exchange for guaranteed publication is one quite interesting idea.

But there are others, and I applaud any initiative in these areas, because at its core it's about the hardest problem of all:

How do we overcome bias in a large system under a lot of financial pressure?

It's such a worthwhile issue to pursue, but also usually quite thankless, so anyone fighting for it is doing something great even if it's in seemingly small ways!

Science is a field where failure should be celebrated; see Thomas Edison and the lightbulb.

... Except that Edison succeeded.

What kind of failures should we celebrate? There are two kinds that I can see: a failure to produce a result because of a lack of skill or knowledge, and a failure to produce a result for a demonstrable or provable reason. The latter is actually a success, because that kind of failure produces new knowledge. But we have a culture in far too many fields where this is seen as not publishable.

I have always wanted a 'Journal of Cautionary Tales', for studies that were well thought out, approached correctly, and just didn't work.

"I have not failed. I've just found 10,000 ways that won't work." ~ Thomas Edison

He eventually succeeded.

Joseph Swan, or any of the other 23 inventors of the lightbulb prior to Edison.

I wouldn't go as far as to say failure is celebrated, but it should be OK to fail. Your career should not be over because you tried something hard and it didn't pan out.

I almost think we should have a magic card you get to play once in your career that resets your track record. This way you could take risk without totally destroying your career.

The problem is that too many people want a life in academia, heavily insulated from the outside world

Just like school teachers in America, PhD students are taken advantage of for the exact same reasons

People want the jobs, and America doesn't believe in regulating supply and demand

Beyond overt fraud, I feel like there's so much incentive for imprudent analysis.

Several times when I've worked on a project at work that involves data analysis, I get really impatient responses when I don't come to a firm conclusion. We're not even talking about highly charged subjects. I'm not aware if my stakeholders are biased toward one conclusion or another -- I think they're just very upset that my analysis contains uncertainty. They expect me to be able to massage any kind of data to derive clear and obvious facts.

Thankfully, I can refuse this without much consequence, but it's really opened my eyes to the potential pressures to corrupt the integrity of fact-finding in even mundane circumstances.

I saw some truly horrific things done at the last place I worked, and the kind of environment fostered was such that I wouldn't even hold myself to any standards if push came to shove. Now I work at a different company on a team of one and could do whatever handwaving I want but due to our culture I'm allowed to maintain a high standard of analysis, which is my default as a scientific and mathematically minded researcher.

> They expect me to be able to massage any kind of data to derive clear and obvious facts.

Not to be rude, but if that's in your job description, then I can understand where they are coming from. As someone in IT, it's always my fault that the computer is not working even if the ten year old device simply broke. That doesn't mean they are trying to get me to do an imprudent analysis.

Sure, there is a difference between operations and trying to find correct answers to questions (i.e. science), but they do have money to make, and if they're not (even slightly) asking you to bias your work, I can understand if they put a bit of pressure on getting the results clear.

Edit: As u/jeremyjh points out, I completely misread this post. If it's about uncertainty due to not enough data, then please completely disregard what I said. (I can't delete the post anymore because there exists a reply to it.)

That isn't how statistics works. You can't remove uncertainty from a given set of data.

Oh that kind of uncertainty. I completely misread that post.

Whether it's hiring a statistician or an IT professional, it reads to me that a lot of people are hired to do work that their superiors simply don't understand (what else would they be hiring you for? :)

There's a lot of people who would think the whole point of data scientists is to look at the numbers and say, with certainty, what they see. But sometimes the numbers don't say anything, and that's what OP was talking about I think - sometimes a clear correlation just doesn't exist, at which point the person who has hired you probably thinks you're not looking hard enough, and might encourage you to 'clean the data' until you get a result with high confidence.

Reminds me of a project (not in medicine) I had a couple years ago where my group was in the data science role for another group. We helped out with experimental design, data acquisition, and analysis. After all of our work, when we crunched the numbers, we found that the hypothesis being tested didn’t hold. The people funding the work were not at all happy, and went so far as to claim that “the plots must be wrong”. We showed them everything we had done, and nobody could find any issues with the analysis. But, they were irritated because we didn’t want to play the dishonest game of farting around with plots and numbers to make it look like the experiment worked. So, they angrily broke up with us. On the plus side, when I looked a year or two later, it appeared the grumpy people hadn’t made any progress and were either out of business or close to it.

At least you had the data... I've been a number of times in a situation where I was supposed to conduct extensive, thorough analyses of (expensively acquired) data that was evidently useless. I still had to go through the motions, knowing full well that if somehow the analysis yielded the "desired" results, any caveats would be ignored.

"Prove our new train signaling system is as secure as the old one. If you can, you'll get your degree, if you can't, you find someone else to sponsor your research/degree and we find someone else to do the proof." -- Deutsche Bahn (government held national train company in Germany) in around 2001 to a friend of mine (they and him found "someone else")

In government this is so common as to have a name.


"No one has died from the new train signaling system that has not been put into effect yet. Therefore, we can conclude that the new train signaling system is safer than the old existing system."

Even within the statistics community, there's a spectrum of quick-and-dirty vs fully rigorous. People with the ability and inclination to be fully rigorous often get treated as pedants and perfectionists (in the bad sense).

I often get that with business partners. "The data says <likely X, but with caveats/nuance/uncertainty/under certain assumptions we can't justify>" to which they respond "Can we just say X?" Or "can we get numbers on Y to support a presentation on Z?" when Y seems to support Z, but actually you can't draw that connection, so it's misleading.

Stuff like this happens because people treat extra rigor as pedantry and are comfortable making supporting assumptions that aren't supported by data. The people making fraudulent requests aren't aware that they're fraudulent (usually). In my experience, they just think they're being practical.

I agree, and see this regularly.

"Picking battles" is one way to describe my counterpoint: caveat exactly as much as needed so that a proper decision can be made with the risks involved.

If you want to query your whole team for a joint lunch location but coworker X is out, it is still appropriate to say (assuming more than you and X on team) that you asked the team and you decided on lunch place Y. It's not rigorous (X is left out) but it's still accurate.

This is very different from, say, regulatory or securities reporting where ambiguity is not appropriate.

It’s “nearly 1 in 4” in the text, but I guess they altered the statistics slightly to make their case look better.

That happens a lot. And not only to statisticians.

My friend was halfway through her 5-year vesting with a startup when the CFO asked her to help them improve their numbers ahead of a foreseen investment round. The idea was basically to bump revenue by basing it solely on the GMV, masking returns and not accounting for discounts and shipping. They also wanted to hide running costs by forcing vendors to agree to a 90+ day billing cycle, they also pushed the CTO to turn off parts of the system during weekends and holidays, and forbade PTO until the deal was closed.

She refused to do the number masking and was asked to leave.

The story I heard related to an actuarial consultant who, after analyzing data and suggesting a defensible year-over-year rate hike to their insurance company client, was asked to change their analysis to support the rate hike the insurance company wanted.

And, of course, if the consultant didn't play ball, the insurance company could always consult with a different actuarial firm.

That's the name of the game for certain types of consulting--support the already-made decision. Mostly, but not exclusively, seen in management consulting to support staff reduction, outsourcing, and the like.

The troubling part of this article (and especially the comments) is how easy it is to be labeled “anti science” today when questioning certain studies. I’d hate to think our culture is causing us to lose the required academic rigor to make data-based decisions.

It's a tricky thing. There is a lot of high quality science. There is some lower quality science -- outright fraud even, unfortunately. And then there is science reporting, which is all over the map, and can easily be misinterpreted.

Humanity has built up an enormous amount of legitimate scientific knowledge and understanding. There has been much difficulty, confusion, and dead-ends along the way -- and it seems this continues to be the case. But it really is an incredible thing, how much we know at this point. It took me a long time and a lot of self-learning to develop a rich appreciation for this.

Of course, there is still much for us to figure out. And we should keep doing so, difficulties be damned (and hopefully mitigated over time).

Yes. Of course the other side of the coin is questioning the science because it doesn't match someone's political philosophy. Sailing between the Scylla and Charybdis is hard.

I was working on the Cingular and ATT merger, and my upper management asked me to fudge the numbers on roaming. I would give the numbers in blue and orange, and blue (ATT) had better data handoffs on 3G. It made orange look so bad (shitty network) that they made me combine them, which tanked ATT's network stats to the C-level.

One of many times, stats made someone look bad, and they made me change them.

Most of the time, support metrics are altered to make it look like support contracts are hitting all contractual targets.

This is daily business everywhere in tech, ranging from mild fudging to downright lying; it just depends on the importance of the data and whether money is tied to it.

I don't approve of this shit, but luckily, I've never been asked to commit fraud, like reporting sales that don't exist...

We've had this issue before with people scraping our websites: as our software got better at detecting scraping, our number of page impressions went down.

Luckily our board was pretty understanding; the reduction in infrastructure strain took a nice percentage out of operating costs.

Very interesting study. I'd love to see a follow up on 2 things:

1) Did the statisticians refuse or comply with the request?

2) When they refused, how did the requester react? Were the requesters actually malicious, or just bad at statistics? If everything was fine once the statisticians explained that removing "just an outlier" wasn't a valid option, then this report isn't quite as concerning, and is maybe just an indication that more researchers need to hire statisticians to help them out.

Sensitivity/ablation studies plus open-sourcing of datasets should be made mandatory for every computational claim and research result asap, so that results can be checked by peers and by the community at large. With publish-or-perish pressure, democratisation of tools, rogue parties and hyperspecialisation, is it getting to the point where almost nothing would survive stringent scrutiny? Science cannot afford opacity any more, least of all in these blurred times.

Of the less serious offenses, 55% of biostatisticians said that they received requests to underreport non-significant results.


That's why one should hire big consulting companies like McKinsey, Bain, BCG, etc., who don't need to use hardcore statistics to prove whatever the board/CEO wanted in order to execute some policy.

You mean like good ole Arthur Andersen? Even the big boys will play dirty for the right price.

They have been playing dirty for a long time.

...in the USA, in medical research...

That title is too long for HN. Sometimes I wish there was a limit longer than 80 characters so that no one has to resort to slightly more sensational titles.

I don't know, it's actually more upsetting if it's just medical research? Otherwise I could've told myself it happens less when it really matters.

This didn't even cover the most frequent type of fraud, which is to redo the experiment, or reanalyze the data, until a sample of noise looks like a signal and then pretend you found something (as illustrated here: https://www.xkcd.com/882/).
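To make that concrete, here's a minimal, hypothetical simulation (stdlib only, not taken from any study discussed here) of the xkcd-882 effect: both groups are drawn from the same distribution, so any "significant" difference is pure noise, yet retrying the experiment until one run crosses the threshold inflates the false positive rate dramatically.

```python
import math
import random

random.seed(0)

def fake_experiment(n=30):
    """Two groups drawn from the SAME distribution: any 'effect' is noise."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / (n - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (n - 1)
    t = (mean_a - mean_b) / math.sqrt(var_a / n + var_b / n)
    return abs(t) > 2.0  # roughly p < 0.05 for this sample size

trials = 2000

# Honest protocol: one experiment, one test.
honest = sum(fake_experiment() for _ in range(trials)) / trials

# "Redo until it works": up to 20 reruns, report if ANY run succeeds.
persistent = sum(
    any(fake_experiment() for _ in range(20)) for _ in range(trials)
) / trials

print(f"false positive rate, single test: {honest:.2%}")     # ~5%
print(f"false positive rate, 20 retries:  {persistent:.2%}")  # ~60%+
```

With 20 retries the expected rate is roughly 1 - 0.95^20 ≈ 64%, which is why "just run it again" is fraud even though no individual number was faked.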

I got out of academia and stopped trusting most university research because I observed too much of this culture of fraud.

I wonder how prevalent this sort of thing is in other fields. I also wonder how prevalent fraud is generally, and the extent to which it is driving our decreasing faith in the academy. Also, how much (if any) of this fraud is motivated by political (instead of strictly personal) outcomes.

What is the "American Council on Science and Health"? See here: https://www.sourcewatch.org/index.php/American_Council_on_Sc...

1 in 4 Statisticians weren't asked to commit scientific fraud. Within the article:

"Researchers often make "inappropriate requests" to statisticians. And by "inappropriate," the authors aren't referring to accidental requests for incorrect statistical analyses; instead, they're referring to requests for unscrupulous data manipulation or even fraud."

This isn't even remotely close to what the title of the article claims.

I have serious health issues. I've gradually gotten healthier, in part because I am skeptical of a lot of studies. I know how borked so many of them are, not a thing most people want to hear me assert. I get a metric fuck ton of flack over my skepticism.

I kind of have a mental box where I squirrel away little tidbits that meet two criteria: 1. They seem to come from rigorous studies and 2. They also fit with my general understanding of how life works.

Over time, I mentally group things -- a la A and B seem related -- without assuming that I know how they relate. I seem to have an inordinately high tolerance for ambiguity.

Most people seem to need An Answer even if it's wrong. They have two categories -- black and white -- and when presented with purple or pink or blue, they force fit it to one of their existing categories and don't confuse them with the facts.

I wasn't trying to prove anything to anyone. I was just trying to deal with my life. But having gotten substantially healthier, I now wonder how to talk about such. It seems a wasted opportunity for the world for me to not share, but the world has treated me pretty horribly and done all in its power to tell me to STFU, I'm just crazy and spouting nonsense.

So I sometimes think I should write what I understand to be true and then carefully back it up with citations to try to support it. Then I read articles like this, throw my hands in the air and go "Why bother?"

The way I have been treated seems particularly unfair when you learn how much "real scientists" cook the books. Like why? It feels like pure prejudice when I read things like this.

1 in 4? How do we know that number is not made up?

In fact there is a 25% chance that the person involved in working that out has been asked to commit fraud at some point.

>In fact there is a 25% chance that the person involved in working that out has been asked to commit fraud at some point.

But there's only a 10% chance of that.

Another statistic just in: 3/4 statisticians lie about not committing scientific fraud

To be honest I'm surprised it's not higher. Relevant discussion: https://news.ycombinator.com/item?id=17789308

The bottom line is that there's not much incentive for doing the boring statistical validations (I can tell you that no one likes doing statistics apart from statisticians, and not even all of them) and verifying that everything is reproducible, and a huge incentive for, let's say, 'arranging' a thing or two so that the paper looks better. So many people just kinda do it, and it slips through the cracks because:

-In many fields, the reviewers are not statisticians themselves

-No one really bothers to download the data, set up the environment and libraries, and reproduce the exact steps taken to obtain the very same figures in a paper. Which is understandable given how doing all of that can be such a chore. Anyway, most of the time the exact steps aren't even described. No, jupyter notebooks aren't a solution either (it would take too long to explain why and the post is already long).

-In many cases, the results turn out to be true anyway so people don't notice they were initially put forward with fraudulent validations

-When they turn out to be false, people just shrug and move on with what's actually true. Bogus results often fail to stand the test of time and get forgotten quickly despite initial hype. No one bothers to say: "Hey, that paper from 8 years ago is bullshit and their authors are hacks!" because nobody cares.

-There's a huge psychological barrier to actually calling out one of your peers and affirming that they're an impostor and their work is bogus. Especially when said impostor happens to be a big name in the field: part of you still doesn't believe they would commit fraud, not to mention the social repercussions and backlash of making such an accusation. We scientists aren't a very confrontational bunch.

So most of the time it kinda works and we're all bumbling along hoping to find some modicum of truth at the end of the road with minimal harm done. But of course sometimes you get these guys who kick off their entire career on a high profile, much hyped fabricated result (Woo-Suk) or even a series of bogus papers (Sato), and that may lead to long-term harm. The good news is that hyped papers or very prominent figures quickly attract scrutiny from their peers, and sooner or later reality catches up with them as labs around the world fail to replicate their 'breakthroughs'.

All in all, I'd say we're doing fine. We're just not the ethereal source of truth that some people hold us to be; the very same people who, after claiming that 'God is dead', are very quick to replace Him with His silicon-based equivalent, around which we would act like priests, except with lab coats in lieu of clerical garments.

Ok, so why aren't jupyter notebooks a solution?! I was gonna suggest that also in response to another comment above suggesting figures should be published 'alongside their relevant code'.

They are so unwieldy. JSON formatting means the ipynb format is a huge mess to handle by simple means, so you have to open a browser window to do anything. Loading, editing, formatting, saving is just... ugh. You can't do proper diffs (though some tools are trying to alleviate that). Loading, stopping and restarting kernels is excruciatingly slow. Because cell execution isn't necessarily done in order it can become very, very easy to lose yourself and feel like you're back in the 70's experiencing GOTOs and the joys of spaghetti code. And that's from my point of view with a technical background. Imagine explaining notebooks to a biologist to whom writing any line of code is still something new and daunting. Imagine their reaction when they happen to mess up one cell in their own execution order and alter all the subsequent workflow, try to grok some of the magic %commands, or get some cryptic error message due to some config file not having the proper rights because they installed a library with 'pip --user' and not 'sudo pip'. It's not realistic to think that an entire community of people who aren't technically minded, many of whom actually loathe or fear anything looking like code, is going to adopt a tool that even technically minded people struggle to use the way it's intended.

On top of that, many steps necessary to reproduce a pipeline typically need to load enormous datasets. Terabytes of simulated protein structures, hundreds of gigabytes of sequencing reads, phylogenetic trees, alignment files, what have you. Once you somehow acquire that dataset, you need the appropriate tools, many of which need to be specifically compiled for your platform, then run them onto a powerful machine if you don't want the pipeline to take months to complete, etc. (And that's if you didn't use any proprietary software or any GUI based application with no command line interface.) You can't exactly load all of that into Github+Mybinder and call it a day. You can't ask a community of people that doesn't like coding to learn about Docker containers either.

Nevertheless, we (at our lab) do use notebooks when we can because we know they're fashionable. We can only present parts of our results, though, due to the aforementioned constraints, but it's still pretty looking and people like them, so we write short demos using them.

> -In many fields, the reviewers are not statisticians themselves

This is a pervasive problem. I was asked to comment on a review paper on multiple testing corrections for a biology journal. The paper was so bad, so completely misunderstood the technical underpinnings of multiple testing corrections, that I sent the recommendation that it be rejected and that they not even try to do revisions. The authors were not competent to write such a paper. At the same time, they were probably considered the experts in their corner of the scientific world.
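For readers who haven't met multiple testing corrections before, here's a minimal sketch (my own illustration, not anything from the rejected paper) of two standard procedures that control the family-wise error rate when you run many tests at once:

```python
def bonferroni(pvalues, alpha=0.05):
    """Reject H0_i iff p_i <= alpha / m: the simplest FWER control."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]

def holm(pvalues, alpha=0.05):
    """Holm step-down: same FWER guarantee, uniformly more powerful.

    Test p-values from smallest to largest against an increasing
    threshold alpha/(m - rank); stop at the first failure.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    reject = [False] * m
    for rank, i in enumerate(order):
        if pvalues[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break
    return reject

pvals = [0.001, 0.011, 0.03, 0.04, 0.2]
print(bonferroni(pvals))  # [True, False, False, False, False]
print(holm(pvals))        # [True, True, False, False, False]
```

Even on this toy example the two disagree (Holm rescues the 0.011 result that Bonferroni discards), which hints at why a review paper that muddles the underpinnings can do real damage downstream.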

> -There's a huge psychological barrier to actually call out one of your peers and affirm that they're an impostor and their work is bogus.

I think it's less a psychological barrier than fear of retribution and there being no effect when someone does call them out.

I always knew I could walk out the door of my university and get a software job making twice as much as my advisor, which rather messed up the power imbalance, and I pissed off a lot of people by speaking up. It had no effect. Lazlo Barabasi is still publishing his nonsense about scale-free networks. Kim Lewis is still churning out papers on toxin-antitoxin complexes in antibiotic persistence, when they are irrelevant to any case of persistence in nature. There are many more names to name, but I've largely forgotten them at this point. The only reason to remember them is to have good counterexamples of how to do science, and most of them are repetitive.

> I sent the recommendation that it be rejected and that they not even try to do revisions. The authors were not competent to write such a paper. At the same time, they were probably considered the experts in their corner of the scientific world.

This sounds depressing. I assume your recommendation was promptly ignored and the paper was published anyway?

If only there was an unpaid army of people who would benefit from learning how to reproduce scientific works.

Reviewers may be tasked to reproduce all findings. That would require the reviewers be less experienced than authors, though.

"Can you make the error bars a bit smaller please?"... "Um, no, I'm afraid they are as I've calculated them."

Is this result statistically significant?

"1 in 4 Statisticians Say They Were Asked to Commit Scientific Fraud"

It would be hilarious if this report were debunked on the grounds that they inflated the number of statisticians surveyed

What's odd to me is that some people seem to, even in the face of this, think it's fair to criticize liberal arts journals for sometimes publishing bad papers. People are just people, in every field.

... and 1 in 4 not prepared to answer truthfully, 1 in 4 not having worked in the field long enough, and 1 in 4 not competent enough to know the difference.

/haha, only serious.

How do I know this study isn’t part of the 1/4?

I can't help but wonder what the rate is that didn't admit to being asked to commit scientific fraud because they did.

Most statisticians work for people who are trying to sell something. Statistics is not always the best way of selling something.

I think that this should prompt scrutiny in fields that are highly dependent upon statistical analysis, such as climate change. However, this very suggestion is intolerable to many and in a way equivalent to blasphemy against the church of science. Scrutiny will not be tolerated by academe with regard to ideas held as sacrosanct. I find this culture in academia pretty disgusting.

I wonder what the stats are in China.

80% of China’s clinical trial data are fraudulent, investigation finds


'25%' would have been a more appealing headline ... :D

Is this fraudulent stats?

The other 3 in 4 just lie about it.

Alternatively just one statistician lied, the one conducting this study.


These are broadly addressed critiques with potentially large rewards for discovering/verifying notions contrary to the prevailing understanding.

In other words, the incentive to just go along if one has evidence to the contrary is low relative to the opportunity cost.

In even fewer words: money can be made if strong counter-evidence to these topics can be verified.

Conclusion: it's unlikely that these examples are really areas of hidden wisdom that run counter to prevailing understanding.

On the one hand, interesting, but I don't think there's a worldwide conspiracy to prove global warming.

On the other hand, 1/4 is a large number, we'd have to break it down by discipline.

Not surprising that you created a throwaway for this comment...

And the other 3 lied
