Berkson's Paradox (wikipedia.org)
39 points by xtacy on Sept 3, 2014 | 20 comments

Implicit population selection biases can lead to all sorts of fun.

One case I saw a paper get wrong a while ago: it claimed to find an inverse correlation between two different traits that individually improved intelligence. The catch? The sample population was drawn from university students at no-name school X. But people who did well on both traits would have done well enough to get into a better school, so they were underrepresented in the sample!
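A rough simulation of that selection effect, as a sketch only. The two traits, the admission rule (sum of traits inside a band), and the `low`/`high` cutoffs are all hypothetical choices for illustration, not details from the paper:

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate_school(n=100_000, low=1.0, high=2.0, seed=0):
    rng = random.Random(seed)
    # Two independent traits in the general population.
    population = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]
    # Assumed admission rule: school X only enrolls people whose combined
    # score is good enough for university (>= low) but not good enough
    # for a better school (< high).
    school = [(a, b) for a, b in population if low <= a + b < high]
    return pearson(*zip(*population)), pearson(*zip(*school))

r_population, r_school = simulate_school()
print(f"whole population: r = {r_population:+.3f}")
print(f"school X sample:  r = {r_school:+.3f}")
```

Under these assumptions the traits are uncorrelated in the whole population but strongly negatively correlated inside the school sample, purely because of who ends up in the sample.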

(I forget the paper, but pay attention and you'll find lots of other examples...)

Speaking of intelligence and Berkson's paradox, it looks like Berkson's paradox may be responsible in a very similar way for why the personality trait Conscientiousness seems to correlate negatively in some studies with intelligence - the studies were using selected samples while more representative population samples show the expected independence: "How are conscientiousness and cognitive ability related to one another? A re-examination of the intelligence compensation hypothesis", Murray et al 2014 (https://pdf.yt/d/Dfl1N6pbR-4vYaKk ; excerpts: https://plus.google.com/103530621949492999968/posts/aQ51UnLC... )

This reminds me of a thought experiment I read a while ago, I believe in The Atlantic. I'll paraphrase with a bit more math:

Suppose that acting ability and attractiveness are independently normally distributed with mean 0 and standard deviation 1. Further suppose that to be a successful movie star, an individual's acting ability plus attractiveness must be at least 6.

Then among people with the necessary attributes to be movie stars, attractiveness and acting ability will be negatively correlated. In this example, we might expect to see movie stars with (acting ability, attractiveness) around (3,3) or (2,4) or (4,2), but it's much less likely that we see many people around (4,4).
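To put rough numbers on the (3,3) versus (4,4) comparison, here is a small sketch that evaluates the joint density of two independent standard normals at those points. The function name `joint_density` is just an illustrative helper:

```python
import math

def joint_density(acting, looks):
    """Density of two independent standard normals at (acting, looks)."""
    return math.exp(-(acting ** 2 + looks ** 2) / 2) / (2 * math.pi)

# All four points satisfy acting + looks >= 6.
points = [(3, 3), (2, 4), (4, 2), (4, 4)]
reference = joint_density(4, 4)

# How much denser each point is than (4, 4).
ratios = {p: joint_density(*p) / reference for p in points}
for point, ratio in ratios.items():
    print(point, round(ratio, 1))
```

The density at (3,3) is e^7 (about 1100) times the density at (4,4), and at (2,4) or (4,2) it is e^6 (about 400) times. So almost everyone who clears the bar sits close to the line where the two traits sum to 6, which is where one trait trades off against the other.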

For a more thought-provoking variation, suppose that intelligence and test taking ability are independent, and IQ is the sum of the two. What, then, does your IQ score say about your intelligence?

See http://bentilly.blogspot.com/2010/02/what-is-intelligence.ht... for some of my thoughts on that from a few years back.
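A quick sketch of the toy model above, with one added assumption that is not in the comment: both components are standard normal with equal variance. In that case the best guess of intelligence given an IQ score s (measured from the mean) is s/2, and the correlation between IQ and intelligence is 1/sqrt(2), roughly 0.71.

```python
import random

rng = random.Random(0)
n = 100_000

# Assumption: both components are independent standard normals.
intelligence = [rng.gauss(0, 1) for _ in range(n)]
test_skill = [rng.gauss(0, 1) for _ in range(n)]
iq = [i + t for i, t in zip(intelligence, test_skill)]

def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)

# Regression slope of intelligence on IQ: the expected gain in
# intelligence per extra point of IQ score.
slope = cov(iq, intelligence) / cov(iq, iq)
corr = cov(iq, intelligence) / (cov(iq, iq) * cov(intelligence, intelligence)) ** 0.5

print(f"slope = {slope:.3f}, correlation = {corr:.3f}")
```

So in this toy model, only about half of any deviation in the score reflects intelligence; the rest is test-taking ability. Changing the relative variances changes that split.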

That's an unlikely assumption.

A better case might be students at big-name universities, where GPA (a proxy for work etc.) and intelligence (proxied by SAT score etc.) need to be over some threshold for admission.

If that's the case, bankers and others who want both a big-name school and a high GPA are actually negatively selecting for intellect, which may account for a lot of fairly dumb behavior at banking institutions etc. They might have a lot of people who can talk about statistics without actually understanding it.

"That's an unlikely assumption."

On what basis do you believe it is unlikely?

As an example I submit that test preparation courses like Kaplan do nothing but improve how well you'll do on a certain type of test without improving your general intelligence.

As another example I am quite aware of how much of an advantage I gain on tests from my ability to relax in a situation where other people tense up. I've described this advantage before with, "Comparing me to a normal person based on the resulting test score is like starting with two runners, taking one out back and beating on him for a while, then expecting them to run a fair race."

Independence is too strict. Take someone with an IQ of, say, 50, and they're not going to be good at taking tests.

Now, within a given IQ range of, say, 95 to 105, the correlation between IQ and test-taking ability might be tiny. However, that's unlikely to hold up as you keep stretching the IQ range to, say, 50 to 150.

PS: IQ tests were initially more about testing the low end of the scale than the high end, and in that context they're not that bad. The early work was also looking at the correlation between IQ and things like reflexes; when that failed, people started looking into mental retardation etc.

In other words the toy model that I am suggesting is unlikely to be perfectly true. Granted.

But I think the point remains that a certain amount of what goes into the IQ score is something we don't think of as intelligence. And this means that IQ doesn't measure intelligence nearly as directly as most of us would naively think.

IQ does a reasonable job of categorizing people. It does a crap job of ranking people.

If you compare, say, 0-65, 66-85, 86-115, 116-135, and 136+, you find plenty of significant differences. Generally 85 vs 86 is meaningless, but 85 vs 116 is not. Which means any hard cutoff is going to exclude people close to the cutoff on a fairly arbitrary basis.

I think we're in agreement here. I don't have any better way to categorize lots of people than an IQ test or equivalent.

But people selected by success in life tend to on average have decent but not outstanding IQ scores. And people selected for their outstanding IQ tend to have decent but not outstanding success in life.

'yummifajitas is on break today, but he'll tell you that test prep courses DO NOT improve test scores.

This is exactly the phenomenon described in the OP.

In this notation: P(A|B,C) what does the comma mean?

I think it's the intersection, so, "and".

P(A|B,C) is "Probability of A, given both B and C".

That's true, but it may be a slightly confusing way to interpret it, given that C = A∪B. In context, I think it's better to read it as P(A|B) in the case that C occurs.

wouldn't that be P((A|B)|C) ?

or even more precedence dubious: P(A|B|C) ?

The "given" symbol should only appear once in any probability statement.

That's true from an objective, mathematical standpoint. But the paradox is really saying something about how humans perceive the statement, so different (mathematically equivalent) ways of framing the same thing can make a difference. It's a minor point.

Yeah it's tricky because in most programming languages, '|' has precedence over ',', but in math sometimes it's the opposite. That tripped me up too.

In this context, the comma is as good as &.
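For anyone who wants to check the notation discussed above with numbers, here is a minimal sketch using exact fractions. It assumes A and B are independent and takes C = A∪B; the values of P(A) and P(B) are made up for illustration:

```python
from fractions import Fraction

# Hypothetical probabilities for two independent events.
p_a = Fraction(1, 10)
p_b = Fraction(1, 5)

p_a_and_b = p_a * p_b          # independence
p_c = p_a + p_b - p_a_and_b    # C = A ∪ B

# P(A | B, C): B is a subset of C, so "B and C" is just B.
p_a_given_b_c = p_a_and_b / p_b

# P(A | C): A is a subset of C, so "A and C" is just A.
p_a_given_c = p_a / p_c

print("P(A | B, C) =", p_a_given_b_c)
print("P(A | C)    =", p_a_given_c)
```

With these numbers P(A | B, C) stays at P(A) = 1/10 while P(A | C) rises to 5/14. Within C, knowing that B happened makes A look less likely, even though A and B are independent overall.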
