How Correlated Are You? (allendowney.com)
108 points by alexmolas on Aug 30, 2023 | 55 comments



Strange that an article by someone interested in Bayesian statistics (according to the website banner) would not consider measurement error here. It seems likely to me that ear measurements are harder to make accurately than measurements of larger body parts, which would drive down the correlations. Furthermore, I suspect that measurement error is inversely proportional to the size of the part being measured. If so, there should be a correlation between the mean value and the mean correlation with all other parts.


How would you consider that? The article states that all the calculations were based on the ANSUR-II dataset. If the values in that dataset were wrong, of course the results of the calculations may not be correct.


That's not really how measurements work though. All real-world measurements have precision associated with them. It's not a matter of measurements being correct or being wrong.

As GP hints at, since you don't measure an arm the same way you measure an ear, it's reasonable to expect the errors to have different characteristics.


Of course we can expect errors, I don't doubt that. That's what I meant by "wrong". But still, how should the author account for that if he doesn't know the precision or the quantity of the errors? I just don't think that's the topic of the article. The article is about correlations in the dataset, regardless of whether the dataset contains errors.


I think the previous poster is claiming that different measurement processes yield different measurement errors (spread). Since the correlation coefficient depends on spread, if the measurement errors are random, then even when the underlying relationship is the same, increasing the spread a little is enough to produce a smaller correlation coefficient.

Confidence bounds for every correlation coefficient would add value and _might_ change some of the interpretations.

E.g.: "its average correlation with the other measurements is only 0.03, which is not just small, it is substantially smaller than the next smallest, which is ear breadth, with an average correlation of 0.13."

If the former is 0.03 +- 0.02 and the latter is 0.13 +- 0.07, we could claim that both are equal to 0 (or just equal).
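To make the point about confidence bounds concrete, here is a minimal sketch of a 95% interval for a single Pearson correlation using the Fisher z-transformation. The sample size of 4,082 (the men in the dataset) is an assumption about which subsample applies, and the 0.03 and 0.13 quoted from the article are averages of many correlations, so treating each as one coefficient is only illustrative. The function name `fisher_ci` is made up for this example.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a Pearson correlation r from n pairs."""
    z = math.atanh(r)                              # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)                    # approximate standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% interval
    return math.tanh(z - crit * se), math.tanh(z + crit * se)

# Assumed sample size: the 4,082 men in the dataset.
N_MEN = 4082
for r in (0.03, 0.13):
    low, high = fisher_ci(r, N_MEN)
    print(f"r = {r:.2f}: 95% CI ({low:.3f}, {high:.3f})")
```

Under these assumptions the half-width is about 0.03, so the two intervals would not overlap. A proper interval for an average of correlations would need a different calculation, for example a bootstrap over participants.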


Intuition tells me it would be harder to achieve a precise measurement of ear protrusion than of a larger body part. This could be wrong of course, but it is generally good scientific practice to think about measurement errors when comparing sets of correlations. This may be especially true for a Bayesian, since Bayesian statistics (when placed in opposition to frequentist) strongly emphasises a deep consideration of all sources of uncertainty.


https://en.wikipedia.org/wiki/Significance_arithmetic

> Significance arithmetic is a set of rules (sometimes called significant figure rules) for approximating the propagation of uncertainty in scientific or statistical calculations. These rules can be used to find the appropriate number of significant figures to use to represent the result of a calculation. If a calculation is done without analysis of the uncertainty involved, a result that is written with too many significant figures can be taken to imply a higher precision than is known, and a result that is written with too few significant figures results in an avoidable loss of precision. Understanding these rules requires a good understanding of the concept of significant and insignificant figures.


Yes exactly this. And with noisier measures come lower correlations.
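A small simulation can illustrate how noisier measures give lower correlations. This is only a sketch on synthetic data, not the ANSUR-II measurements: the true correlation of 0.8, the error sizes, and the helper names `pearson` and `simulate` are all assumptions chosen for illustration. It compares the simulated result with the classical attenuation formula, in which the observed correlation equals the true one times the square root of the measure's reliability.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate(error_sd, n=20000, seed=0):
    """Correlation between y and a version of x measured with Gaussian error."""
    rng = random.Random(seed)
    true_x = [rng.gauss(0, 1) for _ in range(n)]
    # y is constructed to have a true correlation of 0.8 with x.
    y = [0.8 * x + 0.6 * rng.gauss(0, 1) for x in true_x]
    measured_x = [x + rng.gauss(0, error_sd) for x in true_x]
    return pearson(measured_x, y)

for error_sd in (0.0, 0.5, 1.0, 2.0):
    reliability = 1.0 / (1.0 + error_sd ** 2)   # true variance / observed variance
    predicted = 0.8 * math.sqrt(reliability)    # classical attenuation formula
    print(f"error sd {error_sd}: simulated r = {simulate(error_sd):.3f}, "
          f"predicted r = {predicted:.3f}")
```

The underlying relationship never changes in this setup; only the measurement error on one variable grows, and the observed correlation shrinks with it.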


Not having reviewed the full list of measurements, everything other than ear protrusion mentioned seems to involve bone structure.

The ears, especially their protrusion as measured, have little or no relationship to bone structure. The other measurements with low correlation to bone structure also have low correlation across the collection of measurements.

Ear protrusion probably has some relationship to other developmental signatures, just not limb length etc.


There is another measurement for men that involves a body part without bone structure.

Perhaps this report settles once and for all, what proportion of normally more visible measurements are true correlates with that measurement.


Interesting. It seems, then, that contrary to the old wives' tale, height, foot size, and hand size should not correlate with that measurement in men.

According to the literature (RIP my search history), height is weakly correlated with it [1], while the ratio between the second digit and fourth digit is moderately correlated [2]. Weirdly enough, the 2D:4D ratio has to do with bone lengths, but is still correlated with that measurement.

--- [1]: https://www.nature.com/articles/ijir201153 [2]: https://en.wikipedia.org/wiki/Digit_ratio


"6,000 adult US military personnel (4,082 men and 1,986 women)." .... "Despite the presence of reservists in the sample, it is still not an approximation of the US Civilian population"

This dataset is from the US military so please remember that when you think about generalising these results to the rest of the world.

Since everyone is talking about it: ear protrusion 'might' be strongly correlated with height IF you don't exclude short people from the sample set.


In particular, the statement that weight is the most correlated of any measurement with all the others seems like it would almost certainly not be true of a general population with a broader spectrum of physical fitness.


I wonder how many significant dimensions a principal components analysis would identify for that data set, and if one of them would correspond to male/female.


Great question -- it is on my list of ideas for a future article!


An interesting exercise would have been to run a Gaussian Graphical Model (like the “graphical lasso”) to sparsify that correlation matrix.

In the Gaussian case, those methods are able to identify nonzero partial correlations and hence a causal graph. Since there’s multimodality here, trust in that interpretation should be lower, but it’s still worthwhile looking at a pseudo/candidate causal structure as part of EDA.
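For readers unfamiliar with partial correlations, here is a minimal sketch of the idea on synthetic data. It is not the graphical lasso, which estimates a sparse inverse covariance matrix over many variables at once; it only applies the standard three-variable partial correlation formula. The variable names (`stature`, `arm`, `leg`) and the noise levels are hypothetical. Strictly, an edge in such a graph indicates conditional dependence, which is a candidate for, rather than proof of, a direct link.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def partial_corr(r_xy, r_xz, r_yz):
    """Correlation of x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(1)
n = 20000
# Hypothetical setup: stature drives two limb lengths that have no direct link.
stature = [rng.gauss(0, 1) for _ in range(n)]
arm = [s + 0.5 * rng.gauss(0, 1) for s in stature]
leg = [s + 0.5 * rng.gauss(0, 1) for s in stature]

r_arm_leg = pearson(arm, leg)
r_partial = partial_corr(r_arm_leg, pearson(arm, stature), pearson(leg, stature))
print(f"marginal r(arm, leg) = {r_arm_leg:.3f}")
print(f"partial r(arm, leg | stature) = {r_partial:.3f}")
```

The marginal correlation is high because both variables share a common driver, while the partial correlation is close to zero, which is the kind of edge a sparsified graph would drop.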


Maybe this is obvious but in the original data, did they exclude rugby players, wrestlers, people with other kind of ear injuries?

Cauliflower ear can totally change the shape and size of your ear.


A colleague once pronounced correlation "co-relation" and it made more sense so now I say it that way sometimes. Anyone else do that?


Do people not try to work out the etymology of words? Everything makes more sense when you know where words came from and how they came to be. :)


I do somehow because of my inherent interest in understanding the world.

However, this got me into communication issues with my former boss. Apparently, this person was not interested in what words (or combinations of words) actually mean but rather simply in how they're frequently used. Every new combination of words was utterly confusing to this person.

As a conclusion, I try to figure out how people process language and adapt accordingly. In my experience, most people don't see language as a coherent system of words with meaning.


If you're interested in that sort of thing, one mental model that makes sense is that people keep a little blob of context in their head and use the words coming in to update the blob. Then at the end of the sentence they guess what you meant and respond to that, blind to what was actually said as a tape recorder would hear it.

Which would be a very sensible way of working for people who don't have a lot of short term memory to play with, since remembering a complex sentence to parse is impossible. But it makes it challenging to process words as though they have inherent meaning, because they are really acting as deltas to some state. The precise meaning of the word cannot be preserved because it is conflated with the sentence as a whole.

You see that abused a lot in politics, where people phrase things so that what they literally say and the impression they give are different. So if trends go one way they supported the trend, and if the trends go the other way then they never actually said anything in support.

I'll probably get away with pointing at Trump as an interesting example of that in reverse where his sentences usually don't make sense if you try to interpret them as sentences, but are crystal clear as a series of updates to some blob of state.


Kind of like garden path sentences. Famous garden path sentences like "the old man the boat" are crafted to parse to obvious nonsense on the first iteration, forcing the reader to do backtracking. But when proofreading texts it's not uncommon to find unintentional garden path sentences that are easy to misunderstand, especially in English. Some politicians are great at doing that on purpose. But crafting sentences that only make sense on first parsing but are actually nonsense on second glance is kind of next level, some kind of inverse garden path sentence.


> coherent

That's another one that's useful to think about: it's the same "-herent" as "adherent" and "inherent". One means "stick together", one means "stick to" and the other means "stick into" (i.e. is part of).

"Hesitate" comes from the same root and means "stick in place".


In linguistics, there is a saying coined by Firth, “You shall know a word by the company it keeps.” The idea is that a word's meaning is embedded in all the contexts in which it occurs. This is how virtually everyone processes the semantics of words. Meaning and contexts vary with time and person and can lead to many misunderstandings.


Most people are still on step 1 (learning what the individual words mean). They don't get to the stage of learning how the different words fit together.


Y'all are insufferable good lord. People know how the words fit together. Everyone speaks fine.

Not everyone needs to be a big gas etymology nerd about it and in fact nerdly insistence on what a word "actually" means contra its everyday use is an extremely common error. One that wouldn't be made by someone with even the slightest interest in the domain of linguistics, rather than just feeling smarter than people.


I put quotes around words to differentiate when I’m using them incorrectly but popularly. By definition the goalpost has also been shifted.


One of my favourite language-related sites is the online etymology dictionary, which has the following entry for 'correlation' for anyone interested: https://www.etymonline.com/word/correlation#etymonline_v_191...


It's interesting speaking to Germans about their language sometimes, and seeing how, as a language learner, I perceive words differently from how they do as native speakers.

For instance consider the word for a wedding 'Hochzeit' - in order to remember the word I broke it down into 'hoch' (high) and Zeit (time) - so a wedding is a 'high time' of your life.

Some Germans seem to have never viewed the word in terms of its components, and this is a revelation to them!


You could almost say they're co-related.


I genuinely have no idea how else you'd pronounce it. "Core-elation"?


I pronounce it a bit like "cor-relation", in IPA like /kor.re'lei.ʃən/ but also perhaps like /ko:.re'lei.ʃən/ ("coh-relation"), and almost never like /ko.re'lei.ʃən/ (~="co-relation"), or never like /ko:r.re'lei.ʃən/ (~"core-relation").


Your IPA transcriptions are a bit off (from what the usual convention for English is anyway). I think you mean /kɑ.ɹəˈleɪ.ʃən/ and /kɔɹ.əˈleɪ.ʃən/ for the first and last, but I'm not sure what you mean by "coh-relation" vs. "co-relation". These seem the same to me, /koʊ.ɹɪˈleɪʃən/.


No.


Surprising article. The conclusion that ear protrusion (yes, that's a thing) has no genetic basis is of course not impossible, but seems unlikely. It would require another data set to establish it.


Couldn't it be possible for ear protrusion to be uncorrelated with the other variables but still have a genetic basis - presumably by depending on a set of genes with little overlap over the ones that govern size?


I have a lot of ear protrusion, which is even more notable due to my large ears. Having seen old pictures of my great grandfather last weekend, it seems clear whose genes for large protruded ears I inherited.


Ear protrusion is heritable: https://www.nature.com/articles/ncomms8500


Worth noting that not all heritable traits are reflected in DNA, thanks to epigenetics.


They estimated SNP heritability via GCTA.


Guess now there should be an app for this. Or is there? A phone app, to scan you, tell you how correlated you are.

LOL. Could call the app "Phrenology". Or maybe a dating app, "PhrenoloLove".


I wonder if they were able to settle how well foot length correlates with other anatomy.


No mention of the one size every man is interested in most.


It would be interesting if there's a correlation with other things made of cartilage like the nose and ears.

It's interesting to note that we humans evolved vastly bigger penises than the other monkeys (silverback gorillas are much larger than men but sport 1.25-inchers) due to women having much more choice / social capital than female gorillas.


I'm not sure if female social capital in gorillas is the reason for the shorter penis size. Because in other primates, like bonobos, for example, where females predominate and have lots of choice, penis size is still small.

I think the main reason why is because sex is used in humans for pair-bonding, and many human cultures have much more monogamous lifestyles—raising children together for example. Sex in humans is less for procreation than in monkeys, so larger penis sizes became more adaptive.


I thought it was because apes fight by grabbing and tearing, so with that evolutionary pressure applied you basically end up with the minimum effective protrusion.

Also, this is why I plan to avoid fighting apes or at least try to wear some good pants if I do.


You may be right. If there was a selective pressure against large penises due to intra-male fighting, and the lack of a strong enough pressure for larger penises (due to the relative absence of pair-bonding), that would support the observation that gorilla penises are small. For humans, where that sort of fighting doesn't happen, and where pair-bonding is important, penis size would naturally drift upwards until they become too big, or they take too much energy to grow/utilize.

One way to test this would be to analyze the fighting techniques of various primate species, and then bin them based on their penis size and relative monogamy. If all primates that grab and tear have small penises, irrespective of their pair-bonding, then perhaps the former is more important.


That measurement does not appear to be in the dataset. I guess the military doesn't measure it because it doesn't matter to them.


My first thought was, "Yep, the impetus for this was 100% about someone trying to see something about penis sizes."


I've read height is weakly correlated (which intuitively makes sense, as greater height is more or less synonymous with larger body size).

Also the second-to-fourth digit ratio seems correlated. Longer ring finger seems to indicate more testosterone in utero.

The (many) other claims appear to be bunk.


Giggity


I am Causation.


Probably pretty correlated


What’s your baseline?


This really sounds like a resurgence of AI based phrenology. I wish that our profession took ethics more seriously.



