
Does this actually pose an issue for most studies?

This seems like it would be an issue for any studies relying on absolute food consumption being accurate. Most studies I come across frame their findings in relative terms (likely for this very reason): Individuals who engage in more of X compared to their peers show a correlation with outcome Y.

For example, if you’re trying to determine whether morning coffee consumption correlates with longevity, it doesn’t seem particularly relevant if you believe everyone is underreporting their food intake, as the article implies; it's a relative comparison.

Sure, those findings often get twisted into clickbait headlines like “X is the secret to a longer life!” but that’s more a popular science problem than an issue with dietary research itself.




You are assuming that the underreporting will be uniform. In reality, people may be underreporting things they are embarrassed about and maybe even overreporting the opposite.

This is a flaw in the data that is much harder to account for.


Why would that be a problem for reporting relative results if everyone is under-reporting things they're embarrassed about and over-reporting the opposite?


Different people are embarrassed by different things. A frat student's probably going to overstate their alcohol consumption, a Mormon to understate it.

People with bigger appetites underestimate their food consumption, people with smaller appetites overstate.

Not to mention the degree of over/under statement will vary wildly. "A big meal" might be 300 calories for somebody with an eating disorder, or 3000+ for somebody on the opposite end of the spectrum.


> "A big meal" might be 300 calories for somebody with an eating disorder

I knew a guy that complained that he "ate like a lion" and yet couldn't gain weight.

Turns out, his breakfast was typically a single egg and a slice of toast. Lunch would be half a sandwich and a bag of chips that he wouldn't finish. Dinner of course varied, but basically was like 4-6 oz of meat of some sort and a small side of veggies.

Overall, his daily calorie intake was probably only around 1,000 calories.

I don't know if this qualified as an eating disorder, or what, considering when we hear about someone undereating, it's because they're trying to lose weight. He was trying to GAIN weight and yet was still horrendously undereating.


Sure, but in a representative sample size this is largely irrelevant. The fraternity brothers and the Mormons cancel each other out, and regardless both are dwarfed by the large middle of the population that likely systematically and reliably under-reports their drinking by a few units.

The idea of outliers and systematic biases isn’t new to statistics; relative comparisons are still useful.


>Sure, but in a representative sample size this is largely irrelevant.

There is no way to know whether your sample size is representative. What amount of fraternity brothers and Mormons cancel each other out?

>and regardless both are dwarfed by the large middle of the population that likely systematically and reliably under-reports their drinking by a few units.

And? That does not prevent spurious correlations.


All of those headlines are based on meta-studies putting together 100 junk studies, based on bad data, which then informs actual medicine and health trends and American X Association and...

For your specific example - "morning coffee" could be anything from plain espresso shot to full 600+ calorie starbucks "coffee" but the meta-study-machine will lump them together.

It's kind of like feeding all of Reddit's comments into ChatGPT, asking it about stuff, and trusting its answers at a society-level with your health on the line.


> "morning coffee" could be anything from plain espresso shot to full 600+ calorie starbucks "coffee" but the meta-study-machine will lump them together.

You're inadvertently proving my point, though.

If morning caffeine is correlated with longevity, regardless of the vehicle/extra sugar/etc and controlling for the easy usual circumstances like income, that's pretty useful information!


But if the sugar does more harm than the caffeine does good, your study is in trouble. Or maybe the finding holds but acting on it is harmful, because people who don't like coffee are going to buy the bad sugary drinks just to get the good coffee down.
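To put rough numbers on that worry, here is a minimal sketch. Every figure in it is invented for illustration (a hypothetical +1 year from caffeine, a hypothetical -3 years from the sugary version, and 70% of coffee drinkers choosing the sugary drink); none of it comes from the article or any study.

```python
from statistics import mean

# All numbers below are made-up assumptions, not data.
BASE_YEARS = 80.0
CAFFEINE_EFFECT = 1.0   # hypothetical benefit of morning caffeine, in years
SUGAR_EFFECT = -3.0     # hypothetical harm of the sugary version, in years


def lifespan(drinks_coffee, sugary):
    """Toy lifespan model: base plus additive caffeine and sugar effects."""
    return BASE_YEARS + CAFFEINE_EFFECT * drinks_coffee + SUGAR_EFFECT * sugary


# (drinks_coffee, sugary) pairs: 1000 non-drinkers, 300 black-coffee
# drinkers, 700 drinkers of the sugary "coffee".
people = [(0, 0)] * 1000 + [(1, 0)] * 300 + [(1, 1)] * 700


def mean_lifespan(group):
    return mean(lifespan(c, s) for c, s in group)


non_drinkers = [p for p in people if p[0] == 0]
drinkers = [p for p in people if p[0] == 1]
black_coffee = [p for p in drinkers if p[1] == 0]

# Lumping every "morning coffee" together, as a survey question might.
naive_effect = mean_lifespan(drinkers) - mean_lifespan(non_drinkers)
# Comparing only black-coffee drinkers against non-drinkers.
stratified_effect = mean_lifespan(black_coffee) - mean_lifespan(non_drinkers)

print(f"lumped together: {naive_effect:+.1f} years")
print(f"black coffee only: {stratified_effect:+.1f} years")
```

Under these assumptions the lumped comparison makes coffee look harmful even though caffeine itself is beneficial in the toy model. The sign flips only because of how many drinkers take the sugary version, which is exactly the detail a "do you drink coffee in the morning?" question would lose.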


It might be useful information for other researchers trying to figure out what is actually going on, but probably not. And it is not at all useful for you and me trying to make sense of what we should eat.


> This seems like it would be an issue for any studies relying on absolute food consumption being accurate.

Exactly. Those studies either don't get done, or when they're done, they produce garbage results that get ignored or get interpreted as diminishing the importance of absolute food consumption.

> it doesn’t seem particularly relevant if you believe everyone is underreporting their food intake

It says that virtually everyone underreports. It doesn't say that everyone underreports equally, and there are good reasons to expect this not to be the case. If embarrassment is a contributing factor, for example, you would expect people who are more embarrassed about how they eat to underreport more. If people remember meals better than they remember snacks, people who snack more will underreport more than people who snack less. If additional helpings are easier to forget than initial helpings, people will underreport moreish foods more than they underreport foods that are harder to binge on. With so many likely systematic distortions, it would be surprising if everyone underreported equally.
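A small simulation can show how unequal underreporting distorts even relative comparisons. This is only a sketch under stated assumptions: the `embarrassment` score, the 0 to 40% underreporting range, and the toy `outcome` are all hypothetical, and true intake is constructed to have no relationship with the outcome at all.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
N = 20_000

# True daily intake, drawn independently of everything else.
true_kcal = [rng.uniform(1500, 3500) for _ in range(N)]
# Hypothetical 0..1 score for how embarrassed someone is about their eating.
embarrassment = [rng.random() for _ in range(N)]
# Toy outcome tied to whatever drives the embarrassment, NOT to intake.
outcome = [10 * e + rng.gauss(0, 1) for e in embarrassment]
# Assumption: the most embarrassed people hide up to 40% of what they eat.
reported_kcal = [k * (1 - 0.4 * e) for k, e in zip(true_kcal, embarrassment)]

r_true = pearson(true_kcal, outcome)
r_reported = pearson(reported_kcal, outcome)

print(f"true intake vs outcome:     r = {r_true:+.2f}")
print(f"reported intake vs outcome: r = {r_reported:+.2f}")
```

In this setup the true correlation is close to zero by construction, while the reported one comes out clearly negative: people who report eating less appear to do worse. The survey has picked up the reporting bias, not an effect of food.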


But finding correlations is only the first and easiest step in determining causation. And almost nobody continues with the hard work that follows. So we have tons of studies showing correlations one way or the other, and tons of conflicting studies. And we are apparently satisfied with this. The state of nutrition research is abysmal.


Most people are embarrassed about the truth. So they will over-report vegetables while not mentioning how much alcohol or tobacco they had (or illegal drugs, which the study probably legally must report to the police). Or a self-proclaimed vegetarian will not report meat they ate, despite their claim. Fat people will report they skipped dessert.


Why would that be a problem for reporting relative results if the entire population is doing that?

If everyone is under-reporting their alcohol consumption, that seems fine. The absolute numbers will be way off, the relative numbers to their peers won't.
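For what it's worth, that claim does hold in the narrow case where everyone shrinks their answer by the same factor. A minimal sketch, with made-up drink counts and outcome scores and an assumed uniform factor of 0.5:

```python
def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def ranking(values):
    """Indices of the respondents ordered from lowest to highest value."""
    return sorted(range(len(values)), key=lambda i: values[i])


# Hypothetical weekly drinks and some outcome score for six people.
true_drinks = [2, 5, 8, 12, 20, 30]
outcome = [71, 70, 72, 66, 64, 60]

UNIFORM_FACTOR = 0.5  # assumption: every respondent reports exactly half
reported = [UNIFORM_FACTOR * d for d in true_drinks]

r_true = pearson(true_drinks, outcome)
r_reported = pearson(reported, outcome)
same_order = ranking(true_drinks) == ranking(reported)

print(f"r with true drinks:     {r_true:.4f}")
print(f"r with reported drinks: {r_reported:.4f}")
print(f"same ordering of respondents: {same_order}")
```

Scaling every answer by one constant leaves both the ordering and the correlation untouched, so the absolute numbers are off while the relative ones survive. The whole argument rests on `UNIFORM_FACTOR` being the same for everyone; once the factor varies from person to person, neither guarantee holds.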


Statistics can do a lot to find data from noise like this, but it is still noise. The biggest issue is nobody knows what variables are important, which are correlated, and so on.

Edit: there is another issue I forgot until now: time. Statistically I have several more decades of life left. So even if you get accurate results of my meals yesterday, you need to report when I died, and you probably won't have the meals for the rest of my life. Did some meal I ate when I was 10 have a big effect on my life? For that matter, if I know you are tracking just one day's meals I will probably eat what I think is better, and that doesn't tell you anything about what I eat the rest of the time.

It is easy to track people who have had a heart attack: they are likely to die of another heart attack in a few years, so the study times are short. However, does having had a heart attack imply either a genetic difference, such that your results only apply to a subset of the population, or perhaps some other factor tied to having had a heart attack?


I came across a comment offering a humorous rule of thumb for this.

1. If you ask someone how much they drink, double the answer.
2. If you ask them how much they smoke, multiply the answer by five.
3. If you ask them how often they have sex, divide the answer by 10.



