
"When all 14 studies were pooled, no statistically significant results emerged. However, when the studies were divided according to whether vitamin D3 was taken daily in a low dose* or in higher doses administered at longer intervals*, a large difference was seen."

This sounds fishy. Sounds like their primary result was nothing, and then they looked for something else. Would require some checking if they had a preregistration, if this was registered as a secondary result and if they had done a proper statistical analysis of multiple outcomes.




Yes, exactly. There's a real-life example of an author group who got sufficiently annoyed at a reviewer requesting an inappropriate subgroup analysis: https://www.thelancet.com/journals/lancet/article/PIIS0140-6... Reviewers asked for the subgroup analysis; authors said "no, this is statistical nonsense"; reviewers said "yes, but we'll reject the paper otherwise"; authors said "ok, but only if you let us also split on astrological sign".

Result: paper reports that aspirin has an effect, but only if you're not a Gemini or Libra. Too good.


The two signs bracketing the summer months (I presume this was in the northern hemisphere?). That's a potentially interesting finding.

> inappropriate subgroup analysis

No subgroup is inappropriate unless you know all of the values of all of the parameters. But sure, the intent they tried to demonstrate (take all sub-group analyses with a grain of salt) is a good thing to remember.

Edit: To whoever downvoted me, time of year of birth does have real-world effects. https://www.livescience.com/13958-birth-month-health-effects...

> Previous studies have found similar links between spring births and various disorders, including schizophrenia, multiple sclerosis and even Type 1 diabetes. It's possible these diseases are linked to some environmental influence during gestation or the first few months of life, though researchers aren't sure what that could be.

> The leading candidates including vitamin D levels, infections that come and go seasonally, changes in nutrition, and even possibly weather fluctuations, Handunnetthi told LiveScience.

Now perhaps all of this is just bad science and these correlations are just statistical anomalies. But perhaps they aren't.


I think you got downvotes because you misunderstood what people were taking umbrage with in the first example.

It’s not that birth time can never have real effects. It’s that if you keep rolling a die long enough, eventually you’ll hit a “statistically unlikely” event like rolling 4 fives in a row or hitting 1 2 3 4 in order.

Extraneous subgroup analyses are like rolling the die again. Say you’re using a significance threshold of p < .05. That means that even when there is no real relationship, about 1 test in 20 will still come out “significant” due to random chance.

If you do a bunch of extraneous subgroup analyses like the reviewer wanted, you’re banking on the statistical likelihood that eventually you’ll get the result you want even if it’s not a real relationship.
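
To put a rough number on the "rolling the die again" point: if each subgroup test has a 5% false-positive rate and the tests are independent (an assumption; real subgroups often overlap or share data), the chance of at least one spurious hit across k tests is 1 - (1 - alpha)^k. A minimal Python sketch, with an illustrative function name that does not come from any paper discussed here:

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Chance of at least one false positive across num_tests tests.

    Assumes every null hypothesis is true and the tests are independent.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # 1 test, 12 tests (e.g. one per astrological sign), 20 tests
    for k in (1, 12, 20):
        print(f"{k:2d} tests -> {familywise_error_rate(0.05, k):.1%}")
```

Under those assumptions, splitting on all 12 astrological signs already gives roughly a 46% chance of at least one "significant" subgroup with no real effect anywhere.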


This is what follow-up studies are for: to separate the wheat from the chaff. Don't separate before. See my rant below as to why I think pre-hoc decisions on analysis are a bad idea: https://news.ycombinator.com/item?id=35883276

At the very least, I'd like to see people say in advance which parameters they are interested in. That sort of thing is fine and important to avoid un-backed p-hacking. But for researchers who come after, for their sake, if you are not publishing the entire data so that they can reanalyze it de novo, please do as much analysis as possible (and record as many parameters as possible), if only in the supplementary data.

Science can only build on previous science if the authors of that previous science allow it to happen.


> No subgroup is inappropriate unless you know all of the values of all of the parameters

I don't know how to put it less bluntly: you're incorrect.

> But sure, the intent they tried to demonstrate (take all sub-group analyses with a grain of salt)

That's not what they intended to demonstrate. They intended to demonstrate that you need a _reason_ to want to split, and that reason needs to be given _ahead_ of the analysis. If you see the results _and then choose_ a new data analysis (that includes new subgroup analyses), your procedure is no longer statistically sound.

This is a specifically bad statistical practice called HARKing https://en.wikipedia.org/wiki/HARKing


> They intended to demonstrate that you need a _reason_ to want to split

No, no, no. FFS no! Sure, this is a good thing to do if you are trying to prove a hypothesis. But if you are trying to explore for truly novel and unexpected linkages, p-hack to your heart's content, form hypotheses, and then do follow-up studies to see whether there's really something there!

Ignoring possibilities is bad exploratory science. And, as a reader, it's quite annoying to read old studies only to find that the authors didn't bother splitting on the parameter you are currently interested in.

Split them all, if only in the supplementary data, and let future studies sort them out.


Your understanding of significance is flawed, and there is an XKCD for that: https://xkcd.com/882/


No, read my responses to the other commenters above you, who at least made arguments and didn't post a strawman, if you want to know why I believe you are wrong.


I’m not a stats expert, but it seems blatantly obvious that subgroup selection could be inherently problematic because how do you choose the groups in an effective and responsible way?

I mean, just the statement

> no subgroup is inappropriate unless you know all the values of all the parameters

seems implausible and unlikely.

In the one stats class I took, we talked a lot about how selection bias was a huge concern. Why wouldn’t subgroup selection bias also be a concern?

I dunno, maybe I’m wrong, but I’m dubious.


I just ranted to the top two commenters so I'll keep this response brief.

Say in advance which parameters and subgroups you, as a researcher, care about in terms of significance. And keep your conclusions and discussions focused on the results for those groups. Report all of the other stuff, whether p-significant, or not, as supplementary data.

Hopefully, yours is not the only study that will use your data. Don't limit future researchers to your hypotheses.


That's great for Geminis and Libras, but what if you're a Sagittarius?


> This systematic review was registered in PROSPERO before data collection to preclude data-driven analyses and selective reporting (CRD42020185566). In addition, the methods, including the selection criteria, the statistical analysis, outcomes, and subgroup and sensitivity analyses, were published in advance in a study protocol (Schöttker et al., 2021).

EDIT: And in the protocol, the first subgroup analysis they list is relevant:

> Daily dose versus weekly/monthly bolus dose versus bolus dose at the beginning of the trial followed by a daily dose https://bmjopen.bmj.com/content/11/1/e041607


D3 (animal-based) > D2 (plant-based) > D1 (synthetic), at least in humans if you read the literature, that is.

The fishy thing is how actual scientists do whole studies using only D1 and then draw conclusions.


Plant-based D3 exists, extracted from seaweed.


Vegan D3 can also be derived from lichen; here's one such product:

https://www.futurekind.com/products/vegan-vitamin-d3


I was unaware, thanks.


My takeaway is to keep eating fish.


Thanks to our polluted waters, there are pretty harsh limits to fish consumption [1] these days.

[1] https://www.hsph.harvard.edu/nutritionsource/fish/#:~:text=E....


No, you'd basically have to eat salmon all day long every day to have any meaningful effect.

Eating fish is not a viable or realistic way of maintaining healthy vitamin D levels. You have to either get sunlight or take a supplement.

(And I love fish, I eat salmon daily sometimes, but I still supplement with D3.)


For D3, yes. But for Omega-3's, a little bit of salmon goes a looooong way.


Be careful with that (if you're treating cancer): https://www.cancernetwork.com/view/fish-oil-consumption-link...

I'm not sure where this study landed in cancer communities or what subsequent studies concluded. But it's worth noting.


Overfishing is serious enough at this point that I'd rather take a small health risk than contribute to it.


A lot of fish you would buy at the supermarket is farm-raised. Especially catfish, salmon, and tilapia.


Unfortunately, a lot of farm-raised fish are fed smaller fish from natural ecosystems, effectively shifting but maintaining the overfishing problem.

If you care about omega-3 (you probably should) without contributing to overfishing (again, imho you probably should), get an algal-oil based supplement. Prices are pretty competitive nowadays.


Good thing about farm-raised fish is that it is less contaminated, because it is usually mostly plant-fed. But it also makes it less nutritious.


Depends on what materials one classifies as a “contaminant”.


Also, they might be "mostly" plant fed but a big part of their diet is made up of wild fish!


Not sure why you are being downvoted. Salmon (mentioned in other siblings) is a perfect example of overfishing, and the farmed stuff tends to be of terrible quality and fed absolute garbage/treated terribly.


Salmon is found in high-latitude systems around the globe in both hemispheres. Overfishing is something that occurs to populations, not species. Most of the salmon populations that I am familiar with are not over-fished, but have suffered from spawning habitat loss due to dams and logging. Those are major issues and they continue to negatively affect those populations; you are doing the loggers and dam builders a favor by blaming fishing, which is far easier (and more politically palatable) to regulate.


I hope I've not misunderstood but if you are saying vitamin D deficiency is a 'small health risk' then that needs correction. Never mind the validity or not of the paper in question. Anyone reviewing the peer-reviewed literature on vitamin D for the last decade or so would conclude it's a very bad idea indeed to be deficient. For those who don't get much sun exposure, a blood test is recommended and will put people on the right track.


Does aquaponic tilapia not have D3?


Fish seem to be a very important component of the most healthful human diets, AFAIK, likely/mostly grounded in various aspects of evolution - some more on the side of actual 'selection pressure', some more on the side of chance and 'doesn't break things in a way that really matters' (i.e., successful enough reproduction / survival until ages required for reproduction).

That said, and in light of the comment from https://news.ycombinator.com/user?id=AlecSchueler, in particular - many of the most essential nutrients / 'micronutrients' that are obtained from eating fish are actually not made by fish themselves. Rather, fish 'concentrate' these substances as they go about their own business of survival. For example, vitamin D3, DHA, EPA, etc. Consequently, there are much more readily available 'vegan' sources of these substances, derived directly from the fundamental source(s) - microalgae and the like.

FYI (to all).

Overfishing IS a serious problem. Our activities, in general, are at a scale, and grounded in processes, these days, that produce significant impacts on the environment. Frankly, in the 'great scheme' of things, it doesn't matter a whit. Humans, and even this planet, are not even a droplet in the ocean of the universe, as far as we / I can even get any sort of a solid handle on that concept, now. But, that doesn't absolve us of any responsibility for trying not to absolutely annihilate OUR home.

It's disgusting to be GIVEN so much (none of us had much hand in almost anything that exists now, even what we've 'built' - we can't create atoms, we don't choose when, where, or to whom we are born, many of the opportunities we are afforded in a 'given life', etc.), and treat it as casually as so many do - to be so entitled as many seem to be.

But then, the universe (/ God / gods / whatever concept you prefer) will always have the final say. It'd just be nice to not F things up for everyone else, IMO.

EDIT: I hope the latter bits, above, don't come off as too moralizing - not my intention ... it's difficult to avoid some frustration with some of what the news inundates us with every day, I find.

More importantly - vitamin D3 is also readily produced in our own bodies with enough of the right kind of sunlight (dependent also on skin tone, age, kidney & liver function [etc.], and, ultimately, 'height of the sun in the sky' - i.e., enough ~290-300nm UV rays penetrating the atmosphere at the angle of inclination / solar zenith angle / whichever concept/quantification you prefer). And, this is actually not much at all. While skin cancer is, itself, a risk - this should, of course, be weighed against the importance of vitamin D3 itself. This comment is already REALLY long, but basically, for latitudes close enough to the 'Tropics', typically only 10 - 20 minutes of sun around noon in summer would be necessary. Winter is trickier. Here are a few links that may be useful for more info (in general, Pubmed - searching for review articles, etc. - is usually a good place to start, IMO - depending on how comfortable you are with reading these types of articles, otherwise, backtracing to those that cite them, especially, the efforts at more 'popular press' descriptions of research now produced by journals like Science etc.):

https://pubmed.ncbi.nlm.nih.gov/32918212/

https://pubmed.ncbi.nlm.nih.gov/28516265/

https://academic.oup.com/ajcn/article/110/1/150/5487983

... The 'Linus Pauling Institute' also seems to have, in my past experience, quite good information on 'micronutrients' in particular (with good citations, etc.), whatever one makes of Pauling's own more tenuous later-life beliefs about vitamin C:

https://lpi.oregonstate.edu/mic/vitamins/vitamin-D


That's what's confusing me: 105,000 participants in a study should have resulted in thousands of cancer deaths, so a 12% drop should be pretty noticeable even with 10% of the data.


How exactly are you getting "should have resulted in thousands of cancer deaths"?

A population of 100,000 should expect to see 144.1 deaths[1].

If you evenly divide the study group into two 52,500-sized groups (treatment + control), each group should expect to see 72 deaths. A 12% drop in one of the groups is 8.6 deaths. As a colleague would say, that's very few clams from which to make a chowder.

Perhaps the researchers are using more advanced tools than I am, but even assuming a 12% difference between treatment and control, a chi-squared test does not reach any reasonable significance, i.e., the null hypothesis (there is no difference between groups) cannot be rejected.

1. https://www.cdc.gov/cancer/dcpc/research/update-on-cancer-de...


144.1 deaths per year. Presumably these studies lasted multiple years. Though I'll admit this was less than I expected, which would explain the differing results.
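
The back-of-the-envelope test from the comments above can be sketched in Python using only the standard library. Everything here is an assumption for illustration: two equal groups of 52,500, a constant rate of 144.1 cancer deaths per 100,000 per year in the control group, a flat 12% reduction in the treated group, no drop-outs, and hypothetical follow-up lengths of 1 and 10 years. It plugs expected (non-integer) counts into a plain Pearson chi-squared test, which is not necessarily how the meta-analysis was done. Applying the rate to 52,500 people gives about 76 deaths per group per year rather than the rounded 72 quoted above; the conclusion is the same.

```python
import math


def chi2_p_value_two_groups(events_a: float, events_b: float,
                            n_per_group: int) -> float:
    """Pearson chi-squared p-value (1 degree of freedom, no continuity
    correction) for a 2x2 table with two equally sized groups."""
    total = 2 * n_per_group
    events = events_a + events_b
    # Closed form of the 2x2 Pearson statistic when both groups have equal size.
    chi2 = total * (events_a - events_b) ** 2 / (events * (total - events))
    # Survival function of a chi-squared variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def expected_p_value(years: float, rate_per_100k: float = 144.1,
                     n_per_group: int = 52_500,
                     reduction: float = 0.12) -> float:
    """p-value if the observed counts exactly equal their expected values."""
    control = rate_per_100k / 100_000 * n_per_group * years
    treated = control * (1.0 - reduction)
    return chi2_p_value_two_groups(control, treated, n_per_group)


if __name__ == "__main__":
    for years in (1, 10):  # hypothetical follow-up lengths
        print(f"{years:2d} year(s): p = {expected_p_value(years):.3f}")
```

Under these assumptions a single year of follow-up is nowhere near p < 0.05, while ten years of follow-up would be, which is consistent with the point that the time frame matters.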


Thank you! Yes, the time frame is important, d'uh lol.


>> This sounds fishy. Sounds like their primary result was nothing

If you combine results from a bunch of studies that really didn't do anything, it's going to mask the results from the ones that did.


Sure... but conversely, if you keep dividing up and analyzing in different ways, eventually you'll get a p<0.05 result.

This is why you "call your shot" before beginning the analysis. Otherwise, people will be suspicious you just jiggered things around until you found something.


Maybe, but with 14 studies you have ~2^14 ways to split the studies into two groups.

If you test multiple hypotheses, you have to adjust the p-value accordingly.

Have the authors disclosed how many alternative hypotheses were tested until the result in the article was found to be significant?
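
As a rough sketch of what "adjust accordingly" could mean in the worst case: count how many ways 14 studies can be divided into two groups, then apply a Bonferroni correction as if every one of those splits had been tested. The function names are illustrative, and Bonferroni is just one (conservative) choice of adjustment. If the two groups are unlabeled and neither may be empty, the count is 2^13 - 1 rather than 2^14.

```python
def count_two_group_splits(num_studies: int) -> int:
    """Ways to divide the studies into two non-empty, unlabeled groups."""
    return 2 ** (num_studies - 1) - 1


def bonferroni_threshold(alpha: float, num_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate <= alpha."""
    return alpha / num_tests


if __name__ == "__main__":
    splits = count_two_group_splits(14)
    print(f"possible splits: {splits}")
    print(f"per-test threshold: {bonferroni_threshold(0.05, splits):.2e}")
```

This is only the extreme case where every possible split is tried; a single split specified before the analysis counts as one test and needs no such penalty.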


P-values don't prove anything about whether an effect is real or not. They, at most, provide some degree of confidence with the presumption that the independent and dependent variables are indeed independent and dependent.

P-values should, at most, be used to direct further mechanistic studies (if possible). Only use them on their lonesome if this hasn't yet been done, or isn't possible. And if that's the case, reverify them independently (such as by doing another meta-analysis using different data).


Yeah, in addition to the XKCD comic it also reminds me of "Spurious Correlations" lol https://tylervigen.com/spurious-correlations


15 minutes of full-body UV exposure from the sun is estimated to provide the equivalent of 20-30,000 IU Vitamin D3.

There is a definite need for studies that better determine the effective dose for D3.

My research led me to the conclusion that studies should be done based not on standardized supplementation, but instead supplementation to a standardized blood serum level.


>Would require some checking if they had a preregistration, if this was registered as a secondary result and if they had done a proper statistical analysis of multiple outcomes.

You should do that. Meanwhile, I'm going to (continue to) take a daily vitamin D, because I recognize not everything can be confirmed by a double-blind experiment, and the cost is low.


> "I'm going to (continue to) take a daily vitamin D, because I recognize not everything can be confirmed by a double-blind experiment, and the cost is low."

Or it may be possible to prove the opposite, as with the parachute.

bmj: "Parachute use to prevent death and major trauma when jumping from aircraft: randomized controlled trial"

https://www.bmj.com/content/363/bmj.k5094

Results: Parachute use did not significantly reduce death or major injury (0% for parachute v 0% for control; P>0.9). This finding was consistent across multiple subgroups. Compared with individuals screened but not enrolled, participants included in the study were on aircraft at significantly lower altitude (mean of 0.6 m for participants v mean of 9146 m for non-participants; P<0.001) and lower velocity (mean of 0 km/h v mean of 800 km/h; P<0.001).

"Conclusions: Parachute use did not reduce death or major traumatic injury when jumping from aircraft in the first randomized evaluation of this intervention. However, the trial was only able to enroll participants on small stationary aircraft on the ground, suggesting cautious extrapolation to high altitude jumps. When beliefs regarding the effectiveness of an intervention exist in the community, randomized trials might selectively enroll individuals with a lower perceived likelihood of benefit, thus diminishing the applicability of the results to clinical practice."

disclaimer:

I am pro-Parachute! I believe 100% in the effectiveness of the parachute!



Simpson's paradox is well known - aggregate often has different statistical properties than individual parts.
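
A small Python sketch of how that can happen. The counts below are illustrative only and have nothing to do with the vitamin D meta-analysis: the treatment arm has the higher success rate inside each subgroup, yet the lower rate once the subgroups are pooled, because the two arms are spread very differently across the subgroups.

```python
# (successes, total) per arm; illustrative counts only.
SUBGROUPS = {
    "subgroup 1": {"treatment": (81, 87), "control": (234, 270)},
    "subgroup 2": {"treatment": (192, 263), "control": (55, 80)},
}


def rate(successes: int, total: int) -> float:
    """Success rate as a fraction between 0 and 1."""
    return successes / total


def pooled_rate(arm: str) -> float:
    """Success rate for one arm after adding all subgroups together."""
    successes = sum(group[arm][0] for group in SUBGROUPS.values())
    total = sum(group[arm][1] for group in SUBGROUPS.values())
    return rate(successes, total)


if __name__ == "__main__":
    for name, group in SUBGROUPS.items():
        t, c = rate(*group["treatment"]), rate(*group["control"])
        print(f"{name}: treatment {t:.1%} vs control {c:.1%}")
    print(f"pooled: treatment {pooled_rate('treatment'):.1%} "
          f"vs control {pooled_rate('control'):.1%}")
```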



