
P-hacked hypotheses are deceivingly robust (2016) - soundsop
http://datacolada.org/48
======
eganist
The words "deceivingly" and "deceptively" have the same problem: there's a
roughly 50/50 split in polar-opposite interpretations.
[https://grammarist.com/usage/deceptively/](https://grammarist.com/usage/deceptively/)

In this case, does "deceivingly robust" mean they look robust but are fragile?
Or does it instead mean they look fragile but are robust?

This isn't a criticism of you, soundsop. Rather, it's intended to keep
pointing at how difficult it can be to concisely deliver a message.

---

edit: sounds like the correct interpretation of the title is _"P-hacked
hypotheses appear more robust than they are."_

~~~
comex
Huh. I'm skeptical about that article's dichotomy of "deceptively" meaning
either "in appearance but not in reality" or "in reality but not in
appearance". I think the most common usage of "deceptively X" is, more
broadly, "X in a way that deceives you". That includes "X in reality but not
in appearance", but it also includes "X in reality _and_ in appearance, but
deceiving you about something else".

For example, they used this quote as an example of "in appearance but not in
reality":

> It’s no mystery why images of shocking, unremitting violence spring to mind
> when one hears the deceptively simple term, “D-Day.” [Life]

But the term "D-Day" _is_ simple. It's deceptive because it might wrongly lead
you to think the event it refers to is also simple.

Similarly, if something is "deceptively simple-looking", it really is simple-
looking; it's just not _simple_.

~~~
eganist
Mate, I don't know; I'm just going with all the research and linguistic
warnings I've read on the word.

[https://languagelog.ldc.upenn.edu/nll/?p=3500](https://languagelog.ldc.upenn.edu/nll/?p=3500)

[https://brians.wsu.edu/2016/05/25/deceptively/](https://brians.wsu.edu/2016/05/25/deceptively/)

[https://www.academia.edu/37488247/The_Deceptively_Simple_Pro...](https://www.academia.edu/37488247/The_Deceptively_Simple_Problem_of_Contronymy)

Shoot, even Oxford gives exactly opposite definitions of the word.

[https://www.oxfordlearnersdictionaries.com/us/definition/eng...](https://www.oxfordlearnersdictionaries.com/us/definition/english/deceptively)

---

I mean, even when I saw the title of the thread, I had an _obligation_ to
click (clickbait I guess?) because I could have interpreted the title as
either a warning about p-hacking _or_ an attestation in favor of the practice.
In fact, at first glance, I read the title as "P-hacked hypotheses appear less
robust than they actually are."

------
bsder
Basically, if you take a p-hacked hypothesis and attempt to use it
_predictively_, it falls apart.

That's kinda ... useful, actually.

It feels like this is sort of the same issue with overfitting in ML. Attempts
to use ML results predictively often fail in hilarious ways.
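
A toy simulation (my own illustration, not from the article) shows the same
failure mode: fit a regression to pure noise with plenty of free parameters
and the in-sample fit looks superb, while the out-of-sample fit collapses.

```python
# Minimal sketch: a model fit to pure noise looks great in-sample
# and falls apart when used predictively. Numbers are illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 40))   # 40 meaningless features
y_train = rng.normal(size=50)         # target is pure noise
X_test = rng.normal(size=(50, 40))    # fresh noise for prediction
y_test = rng.normal(size=50)

model = LinearRegression().fit(X_train, y_train)
print("in-sample R^2:     ", model.score(X_train, y_train))  # near 1.0
print("out-of-sample R^2: ", model.score(X_test, y_test))    # near or below 0
```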

~~~
peignoir
Yep, that's also how science works: it predicts the future based on a model.
Quick trick to know whether something is a science or not: if it has the word
"science" in its name, it's not. (E.g., social science.)

------
ncmncm
P-hacking is a fine way to winnow through ideas to see what might be
interesting to follow up on. There will certainly be false positives, but the
real positives will usually be in there, too, if there are any. Determining
which is which takes more work, but you need guidance on where to apply that
work.

To insist that p-hacking, by itself, implies pseudo-science is fetishism.
There is no substitute for understanding what you are doing and why.
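
To make the winnow-then-confirm idea concrete, here's a toy sketch (my own
illustration, with made-up numbers): dredge an exploratory sample for
candidates at p < .05, then keep only the ones that survive on held-out data.

```python
# Screen-then-confirm workflow: the exploratory pass is expected to
# produce ~5 false positives per 100 nulls, plus the real effect;
# the confirmation pass on fresh data weeds the false ones out.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_vars, n_obs = 100, 50
effects = np.zeros(n_vars)
effects[0] = 0.8                      # one real effect among 100 nulls

explore = rng.normal(effects, 1.0, size=(n_obs, n_vars))
confirm = rng.normal(effects, 1.0, size=(n_obs, n_vars))

candidates = [i for i in range(n_vars)
              if stats.ttest_1samp(explore[:, i], 0).pvalue < 0.05]
survivors = [i for i in candidates
             if stats.ttest_1samp(confirm[:, i], 0).pvalue < 0.05]

print("screened: ", candidates)       # real effect plus a few flukes
print("confirmed:", survivors)        # usually just [0]
```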

------
bjterry
> Direct replications, testing the same prediction in new studies, are often
> not feasible with observational data. In experimental psychology it is
> common to instead run conceptual replications, examining new hypotheses
> based on the same underlying theory. We should do more of this in non-
> experimental work. One big advantage is that with rich data sets we can
> often run conceptual replications on the same data.

I think actually relying on "conceptual replications" in practice is
impossible. If the theory is only coincidentally supported by the data, that
also makes the replication more likely to clear the p < .05 threshold by
coincidence, in a way that is very difficult to analyze.

The author mentions that problem, but doesn't mention a bigger one: if you
think people are unlikely to publish replications using novel data sets,
imagine how vanishingly unlikely it is that they will publish failed
replications with the original data set! If you read a "replicated" finding of
the same theory using the same data set, you can safely ignore it, because 19
other people probably tried other related "replications" and didn't get them
to work.
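
The arithmetic behind that intuition is easy to check: if 20 people each run
one test at p < .05 on data with no real effect, the chance that at least one
of them gets a publishable hit is already about two in three (assuming
independence, which same-dataset tests violate, muddying things further).

```python
# Probability that at least one of 20 independent null tests
# clears p < .05 purely by chance (independence assumed for
# simplicity; correlated same-data tests behave differently).
print(1 - 0.95 ** 20)  # ~0.64
```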

------
lisper
This problem is going to get more severe as available datasets get bigger and
bigger. The more data you have to mine, the more likely you are to find
something that looks like a signal but isn't.
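
A quick sketch of that effect (my toy numbers, not lisper's): correlate an
outcome that is pure noise against ever-wider batches of mined variables and
count the "significant" hits.

```python
# The wider the data you mine, the more spurious "signals" clear
# p < .05 -- roughly 5% of whatever you test, by construction.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
y = rng.normal(size=200)              # outcome: pure noise
for n_vars in (10, 100, 1000):
    X = rng.normal(size=(200, n_vars))
    hits = sum(stats.pearsonr(X[:, j], y)[1] < 0.05
               for j in range(n_vars))
    print(n_vars, "variables mined ->", hits, "spurious 'signals'")
```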

~~~
knzhou
At some point, social science must switch to tighter p-value cutoffs and
corrections for multiple comparisons. These are the norm in particle physics,
which faced exactly this problem back in the 1970s and resolved it.
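
To see what a correction buys you, here's the simplest one (Bonferroni)
applied to the same kind of noise mining sketched above; particle physics
goes much further, demanding roughly 5 sigma, i.e. p on the order of 3e-7.

```python
# Bonferroni correction: with m tests, each one must clear
# alpha / m instead of alpha. The ~50 spurious hits from mining
# 1000 noise correlations at p < .05 drop to ~0.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
y = rng.normal(size=200)              # outcome: pure noise
X = rng.normal(size=(200, 1000))      # 1000 noise variables
pvals = np.array([stats.pearsonr(X[:, j], y)[1] for j in range(1000)])

print("naive p < .05 hits:      ", (pvals < 0.05).sum())         # ~50
print("Bonferroni p < .05/1000: ", (pvals < 0.05 / 1000).sum())  # ~0
```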

