
This is not a sound approach. You're declaring what humans think random is first, and then throwing out any data that doesn't match your declaration. There is no way to learn anything from this.

I also think 'HHHHHHHHHH' is unlikely to be a good faith response, but if the goal is to actually learn anything instead of merely reinforcing my prior beliefs, it doesn't matter.

You need to find a way to design the experiment so that it discourages bad faith answers or lets you judge them objectively (I sketch one such objective check below). Alternatively, if you have some outside knowledge about the 'shape' of bad faith answers for your kind of experiment, you may be able to use that to properly adjust your data.

But 'nah, I don't think so' isn't an acceptable reason to throw out data. It's especially egregious to do so when the data consists of answers that are, at a bare minimum, technically correct.
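To make 'judge them objectively' concrete: you could score every response with a standard randomness statistic, such as the Wald-Wolfowitz runs test, and flag only the responses that fail a cutoff chosen before looking at the data. A minimal sketch in Python (the H/T encoding and the cutoff value are my assumptions, not anything from the study):

    from math import erf, sqrt

    def runs_test_p(seq):
        # Two-sided p-value of the Wald-Wolfowitz runs test for a binary
        # sequence like 'HTTHHTHTTH'; a low p suggests non-randomness.
        n1 = seq.count('H')
        n2 = len(seq) - n1
        if n1 == 0 or n2 == 0:
            return 0.0  # all heads or all tails: maximally non-random
        runs = 1 + sum(a != b for a, b in zip(seq, seq[1:]))
        mu = 2 * n1 * n2 / (n1 + n2) + 1
        var = (mu - 1) * (mu - 2) / (n1 + n2 - 1)
        z = (runs - mu) / sqrt(var)
        return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

    ALPHA = 0.01  # assumed pre-registered cutoff
    for s in ('HHHHHHHHHH', 'HTTHHTHTTH'):
        print(s, runs_test_p(s), 'rejected' if runs_test_p(s) < ALPHA else 'kept')

The normal approximation is crude for ten flips, but the point stands: the exclusion rule is written down and applied uniformly, not decided per response after seeing it.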



It seems to me that if we reject a subset of experimental samples because they look like bad data (e.g. an extreme outlier caused by a sensor malfunction), we still keep all the bad data we are unable to recognize as such (e.g. sensor malfunctions producing less extreme readings), which introduces a bias.
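A toy simulation makes the bias concrete: if a sensor sometimes adds a fault offset and we drop only the readings past an obvious cutoff, the subtler faults survive and still skew the estimate. Sketch in Python, with every number invented for illustration:

    import random

    random.seed(0)
    TRUE_VALUE, FAULT_RATE, CUTOFF = 10.0, 0.1, 25.0

    readings = []
    for _ in range(100_000):
        x = random.gauss(TRUE_VALUE, 1.0)
        if random.random() < FAULT_RATE:    # sensor malfunction...
            x += random.uniform(0.0, 40.0)  # ...adds an offset, often a subtle one
        readings.append(x)

    kept = [x for x in readings if x < CUTOFF]  # reject only the obvious outliers
    print(sum(kept) / len(kept))  # noticeably above 10.0: hidden faults remain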


You don't have to view it as "throwing out the data". You can just think of it as an alternative explanation for the data.

Original hypothesis: Old people are worse at giving random responses.

Alternative hypothesis: Old people are more likely to give bad faith responses.

This review is suggesting that the alternative hypothesis explains the data as well as the original one does.


I probably should have clarified that I was responding to the content of the parent comment rather than the submission itself.

I think this is just the slop of language, but in this case it's obscuring all the important details, so excuse me for being a bit pedantic.

Forming and accepting a hypothesis are very different things. You can't just come up with a new hypothesis after looking at some data and then immediately accept it because the data supports it.

It would absolutely be incorrect to look at the original data, form an alternate hypothesis, and then immediately go on to suggest it explains the data as well as the original hypothesis does.

You don't have to accept the original hypothesis if you think the experiment is flawed, and you're free to propose any hypothesis you want, but that's the limit without new data.


Since two people have come to the same misunderstanding, I must have worded my argument inadequately.

Of course review is not the time to accept or form new hypotheses. Neither I nor the author of this article is suggesting that we should accept this new hypothesis "old people are more likely to give bad faith responses" from the data collected for this study.

But review is the perfect time to look for interesting features in the data that challenge the original hypothesis. In this case, it is very difficult for the original hypothesis to explain why older people are worse at giving random responses only in one very specific way: giving answers that are all 0s or all 1s.


Although technically, this would be p-hacking. You aren't meant to change your hypothesis post hoc to fit the data. You'd have to conclude no effect, and then design a separate study to determine whether age differences correlate with bad faith answers.


It would be p-hacking if we took the same data to conclude that old people are more likely to give bad faith responses. Here it is just a possible explanation for the data, offered as a reason to reject the original hypothesis.

At the very least, it is an interesting observation that the entire trend line disappears once you remove the data points where people answer with all-identical coin toss results.
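With the raw data in hand, that observation is easy to reproduce: fit the age trend once on everything and once after excluding the all-identical sequences, then compare the slopes. A sketch of that re-analysis in Python (the rows and the scoring column are invented for illustration, not taken from the study):

    from statistics import linear_regression  # Python 3.10+

    def all_same(seq):
        return len(set(seq)) == 1  # e.g. 'HHHHHHHHHH'

    def age_slope(rows):
        # Slope of randomness score vs. age across respondents.
        return linear_regression([age for age, _, _ in rows],
                                 [score for _, _, score in rows]).slope

    # (age, response sequence, randomness score) -- toy data, not the study's
    rows = [(25, 'HTTHHTHTTH', 0.8), (34, 'THHTHTTHHT', 0.8),
            (58, 'HHHHHHHHHH', 0.1), (63, 'TTTTTTTTTT', 0.1),
            (71, 'HTHTHHTTHT', 0.8)]

    print(age_slope(rows))                                     # negative trend
    print(age_slope([r for r in rows if not all_same(r[1])]))  # flat without all-same rows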



