
Emily Oster has a useful post about why, despite the effort, it is difficult to control for everything that matters in observational studies: https://emilyoster.substack.com/p/coffee-definitely-either-k...

Specifically:

> As a result, in the data, sugar starts to look…a lot worse for you over time. In the earliest period of the data there is no observed correlation between the sugar measure and BMI (see below) but by the latest period, it’s hugely correlated. The “sugar-makes-you-fat” story only shows up in the last few years, coincidentally during the period when sugar is less consumed by individuals who exercise, do not smoke and are better educated.




It's also important that people understand what "control for" means in an observational study.

Nearly all observational studies that you read draw their conclusions from regression analysis (or something, such as ANOVA, that can be trivially implemented as regression analysis).

This just means that a linear model such as this is used:

outcome ~ b_coffee * drinks_coffee + b_income_level * income_level ....

Then, unlike in machine learning, researchers look at the values of the coefficients and, most importantly, the confidence in those coefficient values. In a simplified example:

years_of_life ~ b_coffee * drinks_coffee + b_intercept

Where b_intercept will represent life expectancy in general. Then b_coffee (since drinks_coffee is binary here) will represent whether or not coffee adds to your years of life: if it's negative it means drinking coffee reduces your life expectancy, and if it's positive it means it increases it. In statistics we also look at how certain we are about b_coffee in terms of standard errors and p-values. For example, suppose b_coffee is 5, meaning coffee adds 5 years to your life, but our standard error for this estimate is 4 (ie a 95% chance that the real impact is roughly between -3 and 13 years). In that case the p-value for this coefficient will be higher than necessary to conclude "statistical significance".
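
A minimal sketch of what this looks like in practice, in Python with statsmodels and fabricated data (the variable names mirror the formula above; all the numbers are made up):

  import numpy as np
  import pandas as pd
  import statsmodels.formula.api as smf

  rng = np.random.default_rng(0)
  n = 5_000

  # Fabricated data: a binary coffee indicator and a noisy outcome.
  drinks_coffee = rng.integers(0, 2, size=n)
  years_of_life = 78 + 1.0 * drinks_coffee + rng.normal(0, 10, size=n)

  df = pd.DataFrame({"drinks_coffee": drinks_coffee,
                     "years_of_life": years_of_life})

  # years_of_life ~ b_coffee * drinks_coffee + b_intercept
  model = smf.ols("years_of_life ~ drinks_coffee", data=df).fit()
  print(model.params)    # b_intercept and b_coffee
  print(model.bse)       # standard errors for each coefficient
  print(model.pvalues)   # p-values used to judge "statistical significance"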

But suppose the standard error is very small, like 1 year, so that we are virtually certain that coffee does improve life expectancy. To control for, say, college education, we just add that variable (with its own coefficient) to our model:

years_of_life ~ b_coffee * drinks_coffee + b_college * has_degree + b_intercept

The magic of regression analysis is that if, in fact, people who go to college live longer and people who go to college also drink more coffee, then our coefficient b_coffee will change in this new model to reflect that. If, for example, it were to become negative now (with a low p-value), what we would conclude from this model is that coffee is in fact bad for you, college is good for you, and it just happens that a lot of people who go to college also drink coffee.
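
To make that concrete, here is a sketch with fabricated data in which college drives both coffee drinking and longevity while coffee itself does nothing: the coffee coefficient looks positive in the naive model and collapses toward zero once has_degree is added (Python with statsmodels, all numbers invented):

  import numpy as np
  import pandas as pd
  import statsmodels.formula.api as smf

  rng = np.random.default_rng(1)
  n = 20_000

  # Invented world: college grads live longer AND drink more coffee,
  # but coffee itself has zero effect on lifespan.
  has_degree = rng.integers(0, 2, size=n)
  drinks_coffee = (rng.random(n) < 0.3 + 0.4 * has_degree).astype(int)
  years_of_life = 75 + 4 * has_degree + rng.normal(0, 8, size=n)

  df = pd.DataFrame({"has_degree": has_degree,
                     "drinks_coffee": drinks_coffee,
                     "years_of_life": years_of_life})

  naive = smf.ols("years_of_life ~ drinks_coffee", data=df).fit()
  controlled = smf.ols("years_of_life ~ drinks_coffee + has_degree",
                       data=df).fit()

  print(naive.params["drinks_coffee"])       # spuriously positive (~1.6 years)
  print(controlled.params["drinks_coffee"])  # near zero once college is controlled for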

Just as even the most advanced AI is often just a lot of matrix multiplication with non-linear transforms, the vast majority of observational studies draw their conclusions using linear models. In practice, as you add more variables to your model, you tend to get a lot of tricky-to-interpret results. Regression analysis is a remarkably powerful tool, but it is important to remember that when you see these publications this is what is really happening, and there is a lot of room for subtlety when interpreting these models.


Yet all you have to do is miss one factor and you've ended up with mere correlation.

As a simple question, what about people who have weak stomachs or hearts? My mother doesn't drink coffee because it "makes her heart beat too hard". How, with no actual medical data for "coffee makes my heart beat hard", do you control for that? Is that something to control for?

This is the "caffeine and healthy pregnancy" problem. We know women who consume less than ~200mg of caffeine tend to have healthier pregnancies.. but if you can drink 5+ cups of coffee while pregnant and not get overtaken with nausea, that might indicate something is already wrong.


An important limitation which is often overlooked is that when you "control for" something by entering it into a regression, as you describe, you are only controlling for the linear effect of that thing.

It seems to me that this problem is totally fatal for large-scale epidemiological studies with many factors, many of which are sure to have nonlinear effects.
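
A toy illustration of that limitation, with a hypothetical confounder z whose effect on the outcome is purely nonlinear (quadratic here) and which also drives coffee drinking: entering z linearly removes essentially none of the bias, while adding a squared term does (Python with statsmodels, fabricated data):

  import numpy as np
  import pandas as pd
  import statsmodels.formula.api as smf

  rng = np.random.default_rng(2)
  n = 50_000

  # Hypothetical confounder z: lifespan depends on z only through z**2,
  # and z**2 also makes coffee drinking more likely. Coffee itself has
  # zero true effect in this fabricated world.
  z = rng.normal(0, 1, size=n)
  drinks_coffee = (z**2 + rng.normal(0, 1, size=n) > 1).astype(int)
  years_of_life = 80 - 2 * z**2 + rng.normal(0, 5, size=n)

  df = pd.DataFrame({"z": z, "drinks_coffee": drinks_coffee,
                     "years_of_life": years_of_life})

  linear_ctrl = smf.ols("years_of_life ~ drinks_coffee + z", data=df).fit()
  quad_ctrl = smf.ols("years_of_life ~ drinks_coffee + z + I(z**2)",
                      data=df).fit()

  print(linear_ctrl.params["drinks_coffee"])  # still clearly negative: biased
  print(quad_ctrl.params["drinks_coffee"])    # close to the true value of zero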


Well put. I ultimately view observational studies as hypothesis generators that can spur research into more targeted questions all the way down to the biochemical level.


Yes, science is really hard, and it’s a good point by Oster, but you’ve got to consider that the folks who conducted this study also know how hard science is, and for them to publish this anyway means they (and their reviewers) did a pretty good job dealing with this issue (and many others).

None of this is conclusive to the point where no further study is necessary, but I’m guessing this study will make some institutions think/rethink their guidance on coffee consumption.


> for them to publish this anyway means they did a pretty good job

The standard shouldn’t be “we did the best we can” for the sake of publishing. It should be “is what we’re doing useful”.

> I’m guessing this study will make some institutions think/rethink their guidance on coffee consumption.

This is the major issue with all these kinds of studies. The absolute last thing they should be doing is influencing public policy. Data that’s more or less garbage, treated as solid because it’s “science”, is harmful.


I'm sorry but I completely disagree; the standard absolutely should be "we're doing the best that is currently possible to determine how things work". Asking for more is entitled and completely misunderstands the nature of science.

We need to have a way to figure out what to tell people to do, and we're doing that, whether you like it or not, with imperfect information.

Also not for nothing, this data is not "more or less garbage". This is about as good as we can do in science right now, and honestly your tone is pretty terrible. It sounds like you want religious levels of certainty and you feel you deserve that.

Science doesn't get much more certain than this (and this is far from very certain in any sense), there is no "better" way to do things that the researchers just lazily didn't follow here, and your suggesting that such is the case is ignorant and selfish. You should feel bad for what you've written here.


> Science doesn't get much more certain than this

A well designed RCT isn’t much more certain than this?

Observational studies are hardly the peak for scientific knowledge.


A decades-long, rigorously controlled RCT in human nutrition science is practically impossible. Even if it were possible, the scientists would still have to choose variables to control for in their population selection.


You don't get points for trying. If observational 'scientists' can't discern something about the world which stands up to scrutiny then they are wasting everyone's time, including theirs.


>but you’ve got to think about the idea that the folks who conducted this study know as well how hard science is, and for them to publish this anyway means they (and their reviewers) did a pretty good job dealing with this issue (and many others).

What makes you say that about this paper specifically? Certainly in general there are career incentives for authors to publish papers which are not very conclusive, and we know that relatively useless observational studies do get published.


I'm kinda "done" with observational studies. It seems like for the last 10 years that's all we get. And every time the studies come out they contradict that last one. When are we going to see the actual coffee atoms attack bad viruses and shit (kinda a joke but seriously, when are we going to have that visibility into exact pathologies).


"People who drink coffee live longer."

Unhealthy people start drinking coffee in the hopes it's some kind of magic elixir that can erase their unhealthy choices.

Check again: "People who drink coffee don't live longer."

Unhealthy people try to find some other magical cure from a new observational study, maybe of eggs, this time.


My family has a history of all kinds of diet/exercise related issues: diabetes, kidney issues, heart disease. It's sort of funny, if you search up diets useful for avoiding and/or living with these problems you'll generally find it's the diet we all know. Eat lots of vegetables, eat whole grains (Wilford Brimley told me to eat my oatmeal, and that SOB was right), if you eat meat do so in portioned moderation, eat fruits but not to excess.

I don't really know anyone personally who doesn't know what they're doing bad to themselves. When I order a salad and my buddy orders a butter seared steak and fully loaded baked potato (after he ate a whole pack of sour patch kids at the movie), and I don't say anything, but he says "You've got to enjoy your life!" I know he knows. Because I've done that, and I knew. I'm no Sporty Spice but if more salad than steak, and more water/tea than soda/beer, means I don't have to go to a regular dialysis appointment I'd consider that a tick under "enjoying life," personally. But if I could eat the steak and potato and candy and just offset it with more coffee - by god, that's the ticket.


> means I don't have to go to a regular dialysis appointment

I'm going to copy a comment I made regarding Type 1 Diabetes on another post about a year ago. It's relevant because what you're saying, though very true, comes down to immediate effort versus delayed impact. I think people who fail to control their diabetes largely do so for the same reason people fail to control their diets...

> I am going to make a weird, but IMHO apt, comparison to Type 1 Diabetes. T1D is an attritional disease caused by the pancreas's inability to produce enough/any insulin and the list of complications is long and deadly. As someone who has lived with the disease for decades what makes the disease particularly insidious is immediate effort versus delayed impact. The disease affects everything...every single part of every single day. Eating, sleeping, exercising, traveling, finances...everything. And to be on top of everything is extremely effortful. However the impact of not taking enough insulin, of not checking blood sugars frequently enough, or living with high blood sugars, is delayed. Today's transgressions may not be punished for decades. It is no shock to read about poor therapy adherence when the effort is immediate (and constant) and the effect is delayed (and therefore hypothetical). I see the same issue - immediate and constant effort coupled with long term hypothetical effect - with protecting one's privacy. Obviously one should do the right thing. But many won't.


Salad's funny because it can mean anything from "lean chicken breast on lettuce and tomato with a tiny bit of balsamic and oil, 200 cal at 60% protein/20% carbs/20% fat" to "mayonnaise potato and bacon, 1000 cal at 10% protein/20%carbs/70% fat," and pretty much everywhere in between.

Not to nitpick what you're saying, anyone on HN who orders a salad is probably thinking about what's in it, but it's worth pointing out that the category is so broad that it can be deceptive for anyone who doesn't read the fine print. I read an article the other day complaining that a 324 cal Subway 6-inch didn't taste as good and wasn't as filling as a 685 cal Sweetgreen salad bowl.


> to "mayonnaise potato and bacon, 1000 cal at 10% protein/20%carbs/70% fat,"

I never thought about something like that being considered salad... but maybe I'm the weird one? Also, whoever orders that knows that this is not healthy right?


As a Californian, I always have to remember that in other parts of the country salad dressing is applied much more liberally. Asking for light dressing in Chicago yields a similar result to asking for heavy dressing in LA.


I agree with this sentiment. Health studies are basically an oscillation between different diets these days. I never know when a new study will come and debunk yesterday's fad diet.

Just avoid processed foods and exercise more.


And eat more vegetables.


And call your mother once per week /s


No, no, you're right. It's in my calendar, but I still fail to do it.


Some of the contradictory ones are the most interesting, such as the contradictory studies on coffee consumption, which eventually pointed to filtered vs. unfiltered coffee giving different results (so studies done on populations drinking mainly filtered coffee differed from those done on populations drinking mainly unfiltered). I also find it is worth looking closer. The headlines in the papers might be contradictory, but a closer look often indicates the studies are actually measuring different things and are not contradictory at all, perhaps even reinforcing.

Coffee is a strange one though. I think I've only seen 'coffee is good' in recent years, but I'm wary because I don't expect 'coffee is bad' studies to get the same sort of publicity.


Unfiltered coffee contains cafestol, which can be devastating to cholesterol levels.


Also contains kahweol, which is supposedly anti-inflammatory, anti-angiogenic, and may decrease risk of cancer.


That kind of work is hard, expensive, and likely to be fruitless. And in the case of many softer sciences it would more or less remove the need for the field (in their current forms). Sociology is a band aid until we understand biology better.

That being said I agree with you entirely and we shouldn’t shy away from hard problems because of this.


>That kind of work is hard, expensive, and likely to be fruitless.

Don't forget that the "fruit" for scientists is fame, not the truth. Anything that can get headlines and conference talks is a good thing, regardless of whether or not the methodology even makes sense.


>Don't forget that the "fruit" for scientists is fame, not the truth.

This is a pretty shocking thing to say. It's not true whatsoever for the field of physics. Is there a particular field or a particular experience you're reacting to? I think I'm overreacting to how general your statement is.


Feynman's classic "Cargo Cult Science", for anyone who hasn't seen it yet. And for those of us who think it should be read and re-read at least annually, the time has come:

https://sites.cs.ucsb.edu/~ravenben/cargocult.html

It's a 5 minute read. Arguably the 5 most intellectually productive minutes we can spend. (Well at least for me, anyway).

Now that ESP, paranormal, and telekinesis research is thoroughly dead (exposed by magicians like James Randi?), psychology still remains mired in it all. To the credit of a minority in that field, they're having a go at getting it back to science, and I wish them all the luck in the world. https://en.wikipedia.org/wiki/Replication_crisis

My understanding is that for vast amounts of "what food is healthy, what kills you" research, epidemiological studies are all we have. They're obviously limited, easy to get wrong, easy to fool yourself with (and you're the easiest person to fool!). The supply of identical twins who are willing to commit to lifelong diet differences with rigor and make all the same choices outside of the study is, well, kinda low.

The statistics being used for these things are still under active development and being improved. Can it ever be done properly? Well, I guess so. I think we're pretty clear on the smoking, cancer, heart disease, and stroke links nowadays, right? And those studies have to have been similar.

It's interesting that scientists have to raise funding, publish or perish, and so on to even have a career at all, making them part P.T. Barnum. Feynman, for all that he didn't need to do that because of the different era and the reflected "glory" from Los Alamos etc., really was capable of putting P.T. Barnum to shame while at the same time being the most devout adherent to, and proselytiser of, scientific principle & purity. So good and so lucky he could keep his hands clean?

Is there no academic misconduct in physics nowadays? None? I'd believe you if you told me so & why.


> It's a 5 minute read

Perhaps I'm slow, but that was every bit of a 15 minute read for me.


Sorry for the bum steer. Maybe I read it pretty fast because I've read it a few times since the 80s.


It's not just fame; university 'research' centers are completely overrun with politics. I noticed, after mapping diabetes, obesity, and excessive alcohol consumption, that high levels of obesity almost perfectly overlapped with diabetes, except in areas with excessive alcohol consumption. I brought it up during an ideation session, but it was immediately dismissed with 'SES (socioeconomic status) must account for it, don't look into it further.' When I lied and said these were higher levels of obesity correlating with soda consumption, they wanted to write a paper / grant.

It's leading to junk science, and large swaths of people are losing faith in what we are labeling as science.


Observational studies should only be used to conclude "maybe we should investigate this" - as in what you're asking for. Maybe let's try to elucidate some biological/chemical pathway that may be responsible for such a significant observed result. Or maybe let's find some confounding factors and learn why!

But pop science news interprets it as causation.

...helpful...


Would we ever be able to conduct such a study? We'd need clones bred for that purpose and raised since birth in exactly the same conditions, and even if we had that, you'd never be sure it would transfer well onto people living free lives. So, these weak observational studies are all we have.


Note that observational data can come in strong flavor as well, as in natural experiments, but I’m not holding my breath on a quasi-random decades-long coffee shortage striking half of any population soon. There’s always a chance a situation of this kind is ongoing somewhere though I guess?


This got long, so, here's a TL;DR: Observational studies on humans are basically the best you're gonna get for most purposes. Other animal models (primates, dogs, rodents) have problems, but rodents are the best compromise. By using rodents to screen new interventions, we actually end up benefitting humans sooner than if we'd just started with people in the first place.

---

Right. IMO anyone criticizing observational studies on humans needs to grapple with how hard it is to do controlled studies on humans, and with the fact that when they are possible (e.g. vaccine trials), the studies don't last very long, because you can't really control people's lives for that long and there tends to be significant attrition over time.

Theoretically, chimps would be the next best thing, but experimenting on chimps is ethically questionable as well, and they also live a long time. That's great for chimps, but not so great for biological studies where you want to observe the effects of an intervention over an entire lifecycle. Monkeys would be the next best thing after chimps, but experimenting on monkeys has the same issues as with chimps.

Among common experimental animals, that basically leaves dogs and rodents. Dogs are relatively large and hard to deal with compared to rodents, and their relatively long lives (compared to rodents) make them difficult to selectively breed. Even if we modified the embryos using CRISPR or something to include genes that were of interest, a dog's gestation period is about 3 months whereas for rats and mice it's about 3 weeks. Likewise, Beagles (one of the more commonly used experimental dogs, due in part to their relatively small size and friendly disposition) live for about 12-15 years, whereas wild-derived mice and rats both live around 2-4 years in captivity. The general public is also much less concerned when rodents get euthanized as part of an experiment than when dogs do. Whether that should be the case or not is somewhat debatable, but nonetheless it is. It's all a big trade-off between fidelity and logistics, but shouldn't an audience like HN's, consisting of a disproportionate number of software people compared to the general public, actually appreciate that? ;)

So, rats and mice are really the only reasonable choice if you need to experiment on a live mammal in less time than it takes to actually get a PhD. But then that triggers the chorus of "in mice" replies we see to pretty much every biomedical article on preclinical research here. And there are undoubtedly interventions that would work on humans that don't work on mice, just as there are many interventions that work on mice but don't transfer to humans. We miss out on those, but the ones that do make it through rodent trials and into humans got there by a process that weeds out a lot of things that couldn't possibly work, which ultimately means humans benefit sooner by starting with mice than if we just used people in the first place.



