
Berkson's Paradox - dedalus
https://en.wikipedia.org/wiki/Berkson%27s_paradox
======
pitt1980
Example that I remember getting pointed out on HN a while back

[https://www.google.com/amp/s/amp.reddit.com/r/programming/co...](https://www.google.com/amp/s/amp.reddit.com/r/programming/comments/44kpzk/peter_norvig_being_good_at_programming/#ampf=undefined)

~~~
cperciva
A couple more examples from higher education: Within the pool of enrolled
students, language skills and quantitative skills are negatively correlated;
and within the pool of students accepted by any law school, LSAT and GPA are
negatively correlated.

~~~
zimablue
The implication, I guess, is that anything within a company that gets you
hired that isn't A is negatively correlated with A, which is pretty
interesting.

Looks, competence, work experience, interview skills, height, rich background,
any positive discrimination, nepotism, etc. should all be negatively
correlated with each other. Although there are limits to the strength of the
effect, right? It depends on how much of a slice is being excluded, and if the
two variables have an enormous positive correlation, that might outweigh it?

You'd also expect the effect to be stronger the more selective the environment
(for each pair of variables). So if your place hires very strongly on looks
and competence they'll be very negatively correlated.

Also it depends on the hiring policy: we're assuming some sort of (A+B) > C
evaluation over the things they care about, but if it's (A>A0, B>B0, C>C0),
where you pass all of those and you're in, then this effect should be totally
absent in those variables (assuming they were independent to begin with).
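
A minimal simulation sketch of the two hiring rules above, for anyone who wants
to poke at it. Everything here is an assumption made for illustration: the two
traits are independent standard normals, the cutoffs (1, 2 and 0.5) are
arbitrary, and `simulate` and `pearson` are hypothetical helper names.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n=100_000, seed=0):
    """Correlation of two independent traits under different selection rules."""
    rng = random.Random(seed)
    # Assumption: traits a and b are independent in the whole population.
    pop = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

    def corr_where(keep):
        sel = [(a, b) for a, b in pop if keep(a, b)]
        return pearson([a for a, _ in sel], [b for _, b in sel])

    return {
        "everyone": corr_where(lambda a, b: True),
        # Combined-score rule: hired if the sum clears a bar.
        "a + b > 1": corr_where(lambda a, b: a + b > 1.0),
        "a + b > 2": corr_where(lambda a, b: a + b > 2.0),
        # Separate-thresholds rule: hired only if each trait clears its own bar.
        "a > 0.5 and b > 0.5": corr_where(lambda a, b: a > 0.5 and b > 0.5),
    }


results = simulate()
for rule, r in results.items():
    print(f"{rule:>22}: r = {r:+.3f}")
```

Under these assumptions the combined-score rule should show a clearly negative
correlation that gets stronger as the bar rises, while the
separate-thresholds rule should stay near zero.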

~~~
bkrn
Re: your final paragraph it seems like it could be a0 > A & b0 > B & ... if
you're also trying to (or forced by the market to) minimize the sum of a0 + b0
...

------
belljustin95
My first exposure to this idea was in Jordan Ellenberg's fantastic book "How
Not to Be Wrong: The Power of Mathematical Thinking". Here's a post from him
that goes into the same example he uses in the book:

[https://slate.com/human-interest/2014/06/berksons-fallacy-wh...](https://slate.com/human-interest/2014/06/berksons-fallacy-why-are-handsome-men-such-jerks.html)

------
aidenn0
An example of this I heard once is:

For a given car, there is no correlation between whether the battery is dead
and whether the fuel pump is broken.

However, if you have a car that does not start, and you test the battery and
find it is working, you can now consider it more likely that the fuel pump is
broken (because you have ruled out one cause of the car not starting, all
other causes are now more likely).

This means that if you were to gather statistics about batteries and fuel
pumps of all cars taken into the auto shop, you would find that there is a
negative correlation between broken batteries and broken fuel pumps, despite
this being clearly nonsensical for the general population.
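
To put rough numbers on the car example, here is a small exact calculation. It
is only a sketch: the failure probabilities (1/10, 1/20, 1/20), the single
lumped "other" cause, and the rule that a car is in the shop exactly when it
won't start are all made-up assumptions, and `prob` is a hypothetical helper.

```python
from fractions import Fraction
from itertools import product

# Assumed, made-up failure probabilities; the three faults are independent.
P_BATTERY = Fraction(1, 10)
P_PUMP = Fraction(1, 20)
P_OTHER = Fraction(1, 20)  # every other cause lumped together


def prob(event, given=lambda s: True):
    """Exact P(event | given), enumerating all 8 fault combinations."""
    num = den = Fraction(0)
    for battery, pump, other in product([True, False], repeat=3):
        weight = ((P_BATTERY if battery else 1 - P_BATTERY)
                  * (P_PUMP if pump else 1 - P_PUMP)
                  * (P_OTHER if other else 1 - P_OTHER))
        state = {"battery": battery, "pump": pump, "other": other}
        if given(state):
            den += weight
            if event(state):
                num += weight
    return num / den


def wont_start(s):
    # Assumption: the car fails to start if any one fault is present.
    return s["battery"] or s["pump"] or s["other"]


def pump_broken(s):
    return s["pump"]


p_all_cars = prob(pump_broken)
p_in_shop = prob(pump_broken, wont_start)
p_battery_ok = prob(pump_broken, lambda s: wont_start(s) and not s["battery"])
p_battery_dead = prob(pump_broken, lambda s: wont_start(s) and s["battery"])

print(p_all_cars, p_in_shop, p_battery_ok, p_battery_dead)
# prints: 1/20 200/751 20/39 1/20
```

With these numbers, among cars in the shop a working battery pushes the
chance of a broken pump up to 20/39, while a dead battery leaves it at the
baseline 1/20. That gap is the negative correlation described above.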

------
OskarS
There's a fabulous numberphile video exploring this paradox:
[https://www.youtube.com/watch?v=FUD8h9JpEVQ](https://www.youtube.com/watch?v=FUD8h9JpEVQ)

------
bicubic
Related: Simpson's Paradox

[https://en.wikipedia.org/wiki/Simpson%27s_paradox](https://en.wikipedia.org/wiki/Simpson%27s_paradox)

------
fragebogen
TL;DR Berkson's paradox is a false observation of a negative correlation
between two positive traits

~~~
zimablue
Which occurs when/because, for whatever reason, samples with two negative
traits are excluded. The example demonstrates it well: if P(a, b), P(~a, b),
P(a, ~b), P(~a, ~b) all equal 0.25:

If you exclude the (~a, ~b) samples then within the population that remains it
looks like a and b are negatively correlated. (P(a|~b) = 1, P(a|b) = 0.5)

It's interesting because this is something that happens a lot: when dating,
looks negatively correlate with niceness within people you date because you
don't date people who are neither. Diseases in hospital populations are
negatively correlated because if you don't have anything you're not in
hospital.
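
The four-cell example above can be checked with exact arithmetic. This is just
a sketch of that one toy distribution; the variable names and the `p` helper
are made up for illustration.

```python
from fractions import Fraction

# Four equally likely cells, keyed by (a, b), as in the example above.
cells = {(a, b): Fraction(1, 4) for a in (True, False) for b in (True, False)}

# Exclude the (~a, ~b) cell and renormalise what is left.
kept = {k: v for k, v in cells.items() if k != (False, False)}
total = sum(kept.values())
kept = {k: v / total for k, v in kept.items()}


def p(pred):
    """Probability of a predicate over (a, b) in the remaining population."""
    return sum(v for (a, b), v in kept.items() if pred(a, b))


p_a_given_b = p(lambda a, b: a and b) / p(lambda a, b: b)
p_a_given_not_b = p(lambda a, b: a and not b) / p(lambda a, b: not b)

# Covariance of the two 0/1 indicators: P(a, b) - P(a) * P(b).
cov = p(lambda a, b: a and b) - p(lambda a, b: a) * p(lambda a, b: b)

print(p_a_given_not_b, p_a_given_b, cov)
# prints: 1 1/2 -1/9
```

The covariance was exactly zero before the exclusion and is negative
afterwards, even though nothing about a or b themselves changed.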

~~~
pezo1919
Great examples, thanks!

Can you suggest any good material you know on stats? (I'd prefer videos, but
everything works) It seems to me you have a very solid background.

~~~
anchpop
Look up 3Blue1Brown for some instructional videos on math, including some on
stats.

------
personjerry
Is this just a term for the outcome of a sampling bias?

~~~
smu3l
This is a special case of sampling bias.

------
pure-awesome
Took me until reading the example involving stamps (three or four times) to
grok it, but I finally understand now.

:)

------
sys_64738
This is the inverse of two wrongs don't make a right.

