Hacker News

> This mild-to-moderate subgroup wasn’t one the researchers said they would analyze when they set up the study. Subdividing patients after the fact and looking for statistically significant results is a controversial practice. In its most extreme form, it’s scorned as “data dredging.” The term suggests that if you drag a net through a bunch of numbers enough times, you’ll come up with something significant sooner or later.

He could have just kept that data secret and run another trial specifically targeted at people with mild to moderate illness. That would have protected him legally, and made the numbers look even better.

That's the kind of thing that many people are campaigning against. Companies should release all the research they do rather than cherry picking the useful (to them) results.




Actually, that would have been way better. If they had done the study you suggest, and the result had still been significant, then he would have been entirely justified in reporting what he did.

The issue is that dividing the participants after the fact and then looking for correlation in the existing data reduces the significance of the statistic considerably (we have other statistics for that). The p-value is not representative when used that way.

But if you do another study focused on that group in particular and still get a significant result, you're fine! The problem isn't that they located a group on which the drug worked in a dishonest way, or some such - the problem is that they were dishonest to claim they had significant evidence that the drug worked on that group. If they'd done an additional study on that group in particular, they would have their evidence (or, of course, a null result).
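A small simulation makes the problem concrete. Under the null (the drug does nothing), slicing a single trial's data into enough post-hoc subgroups will regularly turn up a "significant" p-value somewhere, even though the advertised rate per test is 5%. This is an illustrative sketch, not anything from the article; the subgroup count, outcome model, and test are arbitrary choices:

```python
import math
import random

def two_sample_z_p(a, b):
    """Two-sided p-value from a two-sample z-test on means (normal approx.)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    z = (ma - mb) / math.sqrt(va / na + vb / nb)
    return math.erfc(abs(z) / math.sqrt(2))

random.seed(0)
TRIALS = 2000     # simulated null trials
N = 200           # patients per arm
SUBGROUPS = 10    # arbitrary post-hoc slices examined per trial

hits = 0
for _ in range(TRIALS):
    # Null world: the drug does nothing; outcomes are pure noise.
    drug = [random.gauss(0, 1) for _ in range(N)]
    placebo = [random.gauss(0, 1) for _ in range(N)]
    size = N // SUBGROUPS
    # Dredge: test every slice, declare victory if ANY p < 0.05.
    if any(two_sample_z_p(drug[i*size:(i+1)*size],
                          placebo[i*size:(i+1)*size]) < 0.05
           for i in range(SUBGROUPS)):
        hits += 1

rate = hits / TRIALS
print(f"'significant' subgroup found in {rate:.0%} of null trials")
```

With 10 slices the family-wise error rate is roughly 1 - 0.95^10, around 40% — an order of magnitude above the nominal 5%, which is exactly why the p-value from a post-hoc subgroup can't be read at face value.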


i think the person you were replying to was implying that the same data be used, while what you are arguing is that there should be new observations made (and i agree with you, if the new work is independent; i just wanted to explain why i think the original comment was arguing for greater transparency).


I believe he's suggesting that the doctor could have legally covered up the results of the first trial (by simply not releasing them), then run a second trial on only the most beneficial population, releasing those results without mentioning the first trial.

At that point, his product would look great, hiding its failures.

This way, while he misinterpreted the P value in an illegal and fraudulent way, he did release all relevant information - ironically, better for the informed reader than if he had rerun the trial legally.


Malician, I'm not sure you and DanBC understand this fully?

It would be absolutely fine to run a new trial on the supposedly most beneficial population (those with mild/moderate lung damage; let's call them 'the subpopulation').

If that second trial succeeded, then it would be strong evidence that the drug was beneficial for the subpopulation.

There would be no need to hide the results of the first trial, as the first trial did not provide evidence that the drug didn't work on the subpopulation.

If you read the article to the end, they did in fact do such a trial on the subpopulation. And they got evidence it wasn't working on the subpopulation - which is how science goes.

The problem was that the first trial wasn't set up to examine the subpopulation, but they reported results as if it was. You can't do that with standard NHST, as it invalidates the assumptions of the statistical framework being used.

But you can absolutely decide to run a whole new test on a new sub population, based on hints you get from the first results.

And, while it'd in general be better if all test results (positive AND negative) were published, that is not relevant to this situation - the first trial said nothing bad about the effects on the subpopulation, so there'd be nothing to gain from hiding it, if you just wanted to claim it worked on the subpopulation.

It's not like a situation where they got evidence that the subpopulation would not benefit from the drug in the first test, and then decided to do another test, planning to only report the second.


Yes, I understand this. This is correct if the result of the test on that subpopulation is only interpreted by the public and/or scientific community as applying to the subpopulation.

However, if the results of the original test are hidden, the results of the second test could well be taken as evidence for a wider or stronger effect, yes? If this isn't the case, then I wouldn't see a problem with that behavior - but from the reading I've done, I suspect it is in fact the case and is common practice.

edit: I may be completely wrong on this - if, indeed, that's not a significant problem.


ah, ok. so, you're right, but not as right as the original issue being discussed :o) i can explain if you're interested...

what i think you're saying is that they would hide the original negative study and publish a subsequent (new, separate, on different people) positive study.

[aside - that's not a perfect description because for one particular group the first study was positive; it's just that the group in question wasn't explicitly targeted].

and, in general, that's considered a bad thing. because (1) you can keep repeating studies until you get a positive, and then publish and (2) because the negatives aren't published, people have incomplete information.

but it's not a terribly bad thing, because if something isn't true then, if you repeat a study, it's likely going to show it isn't true. the standards are set high enough that you'd need to do hundreds of studies before you showed something to be true (when it really isn't).

and because hundreds of studies are expensive, it's unlikely to happen (but then you think of the industry as a whole, and it is doing hundreds, and so some of those are likely wrong...).
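The "hundreds of studies" intuition can be made concrete. If a single trial at alpha = 0.05 sufficed, a determined company would expect a spurious success after about 20 repeats; if two independent significant trials are required (an assumed regulatory bar here, not something stated in the thread), it's about 400:

```python
# Expected number of repeated null trials before a spurious "success".
# alpha is the per-trial significance threshold; the two-trial
# requirement below is an assumed evidential bar, for illustration.
alpha = 0.05

trials_if_one_needed = 1 / alpha        # geometric expectation
trials_if_two_needed = 1 / alpha ** 2   # two independent positives

print(f"{trials_if_one_needed:.0f}")   # 20
print(f"{trials_if_two_needed:.0f}")   # 400
```

So under a two-trial standard, fooling the process by brute repetition really does take hundreds of expensive studies — which is why it rarely happens at the level of a single drug, but can at the level of a whole industry.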

in contrast, what this guy was prosecuted for was hunting in the data. you can think of that like doing a new study, but without the cost. it's pretty easy to dream up hundreds of different questions you can ask existing data. and just by luck you're going to find the occasional surprising answer.

so hunting through data is like doing hundreds of studies until you find something, but it's cheap! and that's why it's "worse" than simply hiding negative results and repeating studies. because it's much more likely to happen in practice.
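This is the classic multiple-comparisons problem, and it's why corrections like Bonferroni exist: if you ask k questions of the same data, each individual test has to clear alpha / k instead of alpha. A sketch with made-up p-values (the k and the p-values are hypothetical, purely for illustration):

```python
# Dredging k questions out of one dataset: the chance that at least one
# comes up "significant" purely by luck, and the Bonferroni fix.
alpha = 0.05
k = 100  # questions asked of the existing data (illustrative)

family_wise_error = 1 - (1 - alpha) ** k
print(f"chance of >= 1 spurious hit: {family_wise_error:.2f}")  # ~0.99

# Bonferroni correction: each individual test must clear alpha / k.
threshold = alpha / k  # 0.0005

# Hypothetical best p-values turned up by the dredge:
p_values = [0.004, 0.011, 0.02, 0.31, 0.47]
survivors = [p for p in p_values if p < threshold]
print(survivors)  # [] - none of the "hits" survive correction
```

At k = 100 a spurious hit is near-certain, and honest correction wipes out the kind of p ~ 0.01 "findings" that look impressive in isolation.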


I think you're getting hung up on their use of the word "hide." What they're saying is that the first study could have been disregarded except as a good reason to run the second study. Of course, that later happened, and the effect disappeared - but maybe it wouldn't have. That's how science works.

I don't think that you're disagreeing with them, just reiterating.


It would actually be ok to do what you say - the problem is that there WASN'T a 'most beneficial population' for which his treatment works; he made up that 'population' from data which, at best, hint that the drug could be more beneficial for that population, pending verification.


He wouldn't even need to keep it secret. It's completely legitimate to say "We were trying to prove XYZ and didn't, but the data does hint that ABC might be true. Let's do another study looking specifically at ABC."

This is how a lot of science gets done.


I'll suggest reading to the end of the article. They did do another trial targeted at people with mild to moderate illness. It failed: "A little more than a year into the study, more people on the drug had died (15 percent) than people on placebo (13 percent)."


They don't report whether that difference was significant though, do they...


(fpgeek is right, I should have read further)

It was significant enough to stop the trial.

(http://www.ncbi.nlm.nih.gov/pubmed/19570573?dopt=Abstract)

> FINDINGS: At the second interim analysis, the hazard ratio for mortality in patients on interferon gamma-1b showed absence of minimum benefit compared with placebo (1.15, 95% CI 0.77-1.71, p=0.497), and indicated that the study should be stopped. After a median duration of 64 weeks (IQR 41-84) on treatment, 80 (15%) patients on interferon gamma-1b and 35 (13%) on placebo had died. Almost all patients reported at least one adverse event, and more patients on interferon gamma-1b group had constitutional signs and symptoms (influenza-like illness, fatigue, fever, and chills) than did those on placebo. Occurrence of serious adverse events (eg, pneumonia, respiratory failure) was similar for both treatment groups. Treatment adherence was good and few patients discontinued treatment prematurely in either group.


My point is that by saying that "more patients on the drug died", they're implying that the drug was itself killing people. Given the group sizes, the two-percentage-point difference amounts to only a handful of deaths, and it wasn't statistically significant (p=0.497). They stopped the trial because the drug wasn't doing anything, but it's misleading to suggest that it was contributing further to mortality.
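As a rough check, a two-proportion z-test on the abstract's numbers agrees that the mortality gap is nowhere near significant. Note the arm sizes below are back-calculated from the reported counts and percentages (80 deaths ≈ 15%, 35 deaths ≈ 13%), so they are approximations, not figures stated in the article:

```python
import math

# Arm sizes inferred from the abstract: these n's are back-calculated
# approximations, not figures stated in the article.
n_drug, deaths_drug = 533, 80    # 80 / 533 ≈ 15%
n_plac, deaths_plac = 269, 35    # 35 / 269 ≈ 13%

p1 = deaths_drug / n_drug
p2 = deaths_plac / n_plac
pooled = (deaths_drug + deaths_plac) / (n_drug + n_plac)
se = math.sqrt(pooled * (1 - pooled) * (1 / n_drug + 1 / n_plac))
z = (p1 - p2) / se
p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided

print(f"z = {z:.2f}, p = {p_value:.2f}")
```

That lands in the same ballpark as the abstract's p = 0.497 from the hazard-ratio analysis: the excess mortality is noise, not evidence the drug was killing anyone.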


I don't think they are. I think I may have cut my quote in a bad place from that perspective. They continue:

"That was the death knell for the drug. Most insurers stopped paying for it."

I don't think they're implying that the drug killed people. I think they're saying that the study made it obvious it wasn't helping, so insurers stopped covering it and other consequences followed.


I agree that they needed the additional study (which is underway, I believe). However, I think it is important to realize that another study of that size is very expensive, and small vaccine companies often can't afford unplanned major studies without going back to the financial drawing board.


After the company's research, there should also be independent research (by the WHO, maybe).





