Randomization in such studies is arguably a negative in practice (columbia.edu)
11 points by luu 58 days ago | 8 comments



The money quote at the end: Randomization doesn’t make a study worse. What it can do is give researchers and consumers of research an inappropriately warm and cozy feeling, leading them to not look at serious problems of interpretation of the results of the study, for example, extracting large and unreproducible results from small noisy samples and then using inappropriately applied statistical models to label such findings as “statistically significant.”

Spot on, unfortunately.
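
You can see the "large and unreproducible results from small noisy samples" problem in a few lines of simulation: when the true effect is small relative to the noise, the estimates that happen to clear p < 0.05 are precisely the exaggerated ones. A rough sketch (made-up effect size, noise level, and sample size, purely for illustration):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    true_effect, sigma, n, n_sims = 0.1, 1.0, 20, 10_000

    significant = []
    for _ in range(n_sims):
        control = rng.normal(0.0, sigma, n)
        treated = rng.normal(true_effect, sigma, n)
        _, p = stats.ttest_ind(treated, control)
        if p < 0.05:
            significant.append(treated.mean() - control.mean())

    print(f"true effect: {true_effect}")
    print(f"share of runs reaching p < 0.05: {len(significant) / n_sims:.2f}")
    print(f"mean estimate among those runs: {np.mean(significant):.2f}")
    # With these numbers, the runs that reach "significance" report an effect
    # several times larger than the truth, and a chunk of them get the sign
    # wrong -- even though the assignment was perfectly randomized every time.

The randomization is flawless in every run; it's the small noisy sample plus the significance filter that does the damage.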


I'll note that one could say the same about any specific feature that improves a study. There's a list of ways that a study could go wrong, and if nine aspects are good, that doesn't rule out the tenth being awful enough to torpedo the conclusion.


Unfortunately, nowadays people seem to treat "statistically significant" and "proven causality" as if they meant the same thing. There's no way a single study can prove causality, and the closest we can get to the truth is meta-analysis.

Not saying study results are worthless. They can certainly point in the right direction if done right. But a single RCT on its own won't be enough to prove causality.


> There's no way a single study can prove causality

Why? If you're talking about epidemiological studies looking for associations between variables in big datasets, okay. But that's just one type of study, and it doesn't get better by throwing many of them together in a meta-analysis.

You absolutely can show causality in a single study by doing experiments.

I feel rather the opposite, really: "correlation is not causation" is a dead horse and gets trotted out in every discussion on this website whether or not it fits.

In many cases, correlation is a hint at causality, and if there's a strong model explaining it, it should be considered evidence for causality.


> if there's a strong model explaining it

This is the key that everybody ignores.

If you simply throw variables together, you wind up with too many confounders. You can also get the direction wrong.

"Growing tall causes you to be male" is my go to example. The correlation is super powerful--because it's completely backwards. It is also competely valid unless you have a model to differentiate--at which point it becomes obviously non-sensical.


If you ran an RCT where you could directly make people male and then see what happened to them, you wouldn't need any further modelling to work out which is causing which. You only need the model when you can't randomize.


Kinda related, but the easiest way to limit bad randomizations is to stratify by pre-experiment data. But then that would make it harder to p-hack, so it's understandable why more people don't do that ;)
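
For what it's worth, a minimal sketch of what that can look like (hypothetical strata and subject IDs; real designs would block on things like site, age band, or a baseline score):

    import random
    from collections import defaultdict

    def stratified_assign(subjects, stratum_of, seed=0):
        """Randomize to treatment/control separately within each stratum,
        keeping the arms balanced on the pre-experiment variable."""
        rng = random.Random(seed)
        strata = defaultdict(list)
        for s in subjects:
            strata[stratum_of(s)].append(s)
        assignment = {}
        for members in strata.values():
            rng.shuffle(members)
            half = len(members) // 2
            for s in members[:half]:
                assignment[s] = "treatment"
            for s in members[half:]:
                assignment[s] = "control"
        return assignment

    # Hypothetical baseline severity scores; stratify into low/high buckets.
    subjects = [f"p{i}" for i in range(12)]
    scores = dict(zip(subjects, random.Random(42).sample(range(100), 12)))
    groups = stratified_assign(subjects,
                               lambda s: "high" if scores[s] >= 50 else "low")
    print(groups)

(With an odd-sized stratum one arm gets the extra subject; real blocking schemes handle that more carefully.)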


He's complaining about the statistics version of a con artist putting on a nice suit, or a crap company putting up a glossy website or putting out a "white paper".

Adding the easy trappings or dishonest indicators of "professionalism" to your junk product is bad for society.

It's not a novel insight, and the discussion is rather cluttered with irrelevant detail, so I'm not sure it's worth sharing. (Of course it's fine for Gelman to write his meandering thoughts on his blog.)



