
What to do when you don’t trust your data anymore - di4na
https://laskowskilab.faculty.ucdavis.edu/2020/01/29/retractions/
======
kuu
I've had a similar problem, not in research but with production data at a
company. Decisions were made on wrong data, and so they were bad decisions.
It's agonizing and hard to swallow. But the best thing to do is, as you did,
face it, be honest with yourself and everyone else, and admit the error.

I think this is part of being a data scientist.

------
6gvONxR4sf7o
It's amazing how common it is for people to see something that doesn't make
sense in the data and, instead of investigating, just move on. I'm guilty too
sometimes. But almost every time I dig, I find something that invalidates the
analysis.
Maybe an ETL somewhere is putting an empty string instead of NULL in
particular cases. Maybe my join is duplicating certain rows. Maybe you got a
table (or spreadsheet) that doesn't make sense. It all violates the
assumptions of an analysis, but collectively we just say "can't eliminate all
bugs." I hope the future is better.
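
A minimal sketch of the two checks mentioned above (empty string vs. NULL, and a join that duplicates rows), using only the Python standard library. The table names, column names, and helper functions here are hypothetical and only for illustration; a real pipeline would run the same idea against its own tables.

```python
from collections import Counter


def count_empty_vs_null(rows, column):
    """Count how often `column` is None (NULL), an empty string, or a real value."""
    counts = Counter()
    for row in rows:
        value = row.get(column)
        if value is None:
            counts["null"] += 1
        elif isinstance(value, str) and value.strip() == "":
            counts["empty_string"] += 1
        else:
            counts["value"] += 1
    return counts


def duplicated_join_keys(rows, key):
    """Return keys that appear more than once; joining on them multiplies rows."""
    seen = Counter(row[key] for row in rows)
    return {k: n for k, n in seen.items() if n > 1}


# Hypothetical example data, not taken from any real dataset.
orders = [
    {"order_id": 1, "customer_id": "a", "coupon": None},
    {"order_id": 2, "customer_id": "b", "coupon": ""},
    {"order_id": 3, "customer_id": "a", "coupon": "SPRING"},
]
customers = [
    {"customer_id": "a", "region": "west"},
    {"customer_id": "b", "region": "east"},
    {"customer_id": "b", "region": "east"},  # accidental duplicate row
]

coupon_counts = count_empty_vs_null(orders, "coupon")
dupes = duplicated_join_keys(customers, "customer_id")

# Naive inner join: every order row is paired with every matching customer row.
joined = [
    {**o, **c}
    for o in orders
    for c in customers
    if o["customer_id"] == c["customer_id"]
]

# If the join produced more rows than the left table, something fanned out.
fan_out = len(joined) - len(orders)
```

Checking `dupes` before the join, or `fan_out` after it, is cheap, and it turns "that number looks odd" into a concrete thing to investigate.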

~~~
perl4ever
Some places, they do QA, and when they find something in a random sample, they
fix that instance and continue, rather than looking for all the places it
happened and why.

------
wyldfire
> “Why Sheet 2 exists is an interesting question,” was Jonathan’s response and
> he agreed that “it is well that the paper is being retracted” when I asked
> him directly about this and that perfectly sums up my feelings as well. ...
> While I’m not sure what Sheet 2 means or why it exists; I do know that the
> data in this paper also suffer from inexplicable irregularities rendering
> any results untrustworthy.

> Given the problems in my data sets, these folks are proactively
> investigating data that they received from Jonathan ...

It's great that she is sticking solely to the facts and being very scientific.
But I'd love to read a journalist's in-depth investigation into Jonathan's
motivations. Everything points to this being a deliberate, dishonest
fabrication of data.

------
pmags
Some additional details about these and related retractions from Pruitt's work
can be found here:

https://retractionwatch.com/2020/01/29/authors-questioning-papers-at-nearly-two-dozen-journals-in-wake-of-spider-paper-retraction/#more-118826

------
warlog
Glad she came clean, and glad the retractions are coming in.

And now she has a tenure track position at a UC.

While the scientific record might be corrected, the historic impact on a
cohort of people who got less because of this remains unacknowledged and
uncorrected.

Walk it _all_ back.

~~~
jtfairbank
Did you read the article? Sounds like she didn't do anything wrong, but was
sent falsified data by a more senior scientist that she trusted. After
discovering that, she took steps to correct the record, alert other scientists
that they need to double check their papers, and build automated tools to
catch similar issues going forward. That sounds appropriate and high in
integrity. Why should she be punished for doing the right thing?

~~~
roca
That sounds reasonable as far as it goes, but if you were the person next in
line for that UC Davis position and your research wasn't based on falsified
data, I think you'd be feeling pretty unhappy about this.

(I hope that in reality there's a lot more to the author's research than the
retracted papers, but of course in such a competitive job market, every bit
helps.)

Look at it another way: the author sure was lucky they found out about the
problem after they were securely in their tenure-track position, and not just
before.

~~~
Angostura
> Look at it another way: the author sure was lucky they found out about the
> problem after they were securely in their tenure-track position, and not
> just before.

Look at it another way: The author was sure unlucky to have based their
research on shoddy data from a trusted colleague. And it took guts and
integrity to react in the way they did.

~~~
roca
That is also true.

