
General factor of intelligence g, a Statistical Myth - eru
http://cscs.umich.edu/~crshalizi/weblog/523.html
======
yummyfajitas
It's a great article. But it's a bit more philosophical than practical, and I
particularly take issue with the author's use of the term "statistical myth".

Near as I can tell, Thomson's model is the following: there are lots of
abilities, and any test uses many (232+) of them at random. Individuals have a
random selection of abilities. The author then declares that any individual
ability is not g, since no specific ability describes performance very well.
All very true.

But there is a single explanatory variable in this model: g = # of abilities
an individual has [1]! Moreover, Thomson's model makes a specific prediction
about g: it should be normally distributed. So overall, I agree with the
author's scientific argument: g is very likely to be decomposable into
subfactors.

But I don't agree with his claim that g is a "statistical myth". Let me give
an argument illustrating the fallacy he is making. Suppose I want to explain
the thermodynamic law PV = nRT. I can build a moderately more complicated
statistical model [2] involving only 10^23 Newtonian particles, with normally
distributed velocities, and completely reproduce all the predictions of
thermodynamics. But not a single one of those particle positions explains
pressure [3] or temperature! Thus, by this logic, thermodynamics is just a
statistical myth.

Thermodynamics and g are simplified models of the world, based on the fact
that the macroscale is dependent primarily on the sum of a large number of
microscale variables [1]. They both have decent, though imperfect, predictive
power. There is almost certainly a more complicated underlying theory, which
will reproduce thermo/g as theorems about statistical aggregates. (For
example, g may eventually be explained as the interaction of neurons.) Does
this make them "statistical myths"? Of course not. Just macroscale models
which have an underlying microscale explanation.

[1] Or perhaps a weighted average based on how frequently abilities are used
in tests.

[2] <http://en.wikipedia.org/wiki/Statistical_mechanics>

[3] For example, pressure is the average force imparted by particles
colliding with the side of a vessel, divided by the area on which the
collisions occur.
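Footnote [3] can itself be checked with a toy calculation. A minimal sketch in
units where the particle mass and Boltzmann's constant are 1 (all numbers are
illustrative, not physical constants): the standard momentum-transfer argument
gives P = N·m·<vx^2>/V for an ideal gas, and with Maxwell-Boltzmann velocities
this reproduces PV = NkT.

```python
import random
import statistics

random.seed(1)

N = 100_000   # particles
T = 2.0       # temperature (made-up units, k = m = 1)
V = 5.0       # volume of the box

# Maxwell-Boltzmann: each velocity component is normal with variance kT/m.
vx = [random.gauss(0.0, T ** 0.5) for _ in range(N)]

# Averaging the momentum transferred to a wall over its area gives the
# kinetic-theory pressure P = N*m*<vx^2>/V.
pressure = N * statistics.fmean(v * v for v in vx) / V

print(pressure, N * T / V)  # the two should agree: PV = NkT
```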

~~~
bambax
Reading the original article further I understand more of what the author
meant -- and I agree with him more.

What he means by "statistical myth", I think, is that you can throw any number
of measurements into a big bag, even weakly correlated ones, and you'll always
be able to find a general macro-factor that correlates with all of them and
therefore seems to "describe the bag" pretty well... even if the bag itself is
absolutely meaningless.

For example, one could throw into the bag height, sexiness, foot length, and
BMI, which are all somewhat correlated with one another and with "IQ".

There is a macro factor describing such a bag; but would you call it "general
intelligence"?

 _We talk about g only because we assume a priori that there actually is such
a thing as "general intelligence"_ and that we are measuring elements of it.
Maybe we are, maybe we're not; correlations alone tell us nothing about it.

The analogy with physics, be it temperature or pressure or what have you, is
in my opinion quite flawed, because in the case of those entities we know that
the elementary events / forces are linked by way of causality with the macro
factor, and we know that _for reasons other than mere correlation_.

In the case of g, on the contrary, _the only link between the different
elements is the correlation itself_. And it gets worse: to be included as a
relevant IQ test, a new test has to be correlated with g. To quote from the
article:

" _By this point, I'd guess it's impossible for something to become accepted
as an "intelligence test" if it doesn't correlate well with the Weschler and
its kin, no matter how much intelligence, in the ordinary sense, it requires.
(...) This is circular and self-confirming, and the real surprise is that it
doesn't work better._ "

So g is at the same time what we're looking for and what we're building upon,
having decided it's there already.

This looks much more like religion than science.

~~~
yummyfajitas
 _For example, one could throw into the bag height, sexiness, foot length, and
BMI, which are all somewhat correlated with one another and with "IQ"._

The point is that if those phenomena are all correlated with one another,
there is likely to be an underlying reason for that. Maybe it's not general
intelligence, but there is something.

Suppose you threw a different set of things into the bag: phase of the moon at
birth, whether you were bitten by a wolf, and how good you are with computers.
These things almost certainly will not be correlated, because there is indeed
no relationship between them. The only reason the author discovered a large
principal component in all his models is that he explicitly built one in!
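The contrast between the two bags is easy to demonstrate. A minimal sketch
(the variables and coefficients are invented for illustration, not taken from
the article): give six measurements a hidden common cause, compare them with
six genuinely unrelated measurements, and look at how much variance the first
principal component captures in each case.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000  # people measured

def top_eig_share(X):
    """Fraction of total variance captured by the first principal component."""
    evals = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))
    return float(evals.max() / evals.sum())

# Bag 1: six traits that all share a hidden common cause.
hidden = rng.normal(size=n)
correlated = np.column_stack(
    [0.6 * hidden + 0.8 * rng.normal(size=n) for _ in range(6)])

# Bag 2: six genuinely unrelated measurements.
unrelated = rng.normal(size=(n, 6))

print(round(top_eig_share(correlated), 2))  # a "general factor" appears
print(round(top_eig_share(unrelated), 2))   # no factor to find
```

In the correlated bag the first component soaks up nearly half the variance;
in the unrelated bag it hovers near 1/6, the share you would get with no
common factor at all.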

Now, my analogy with physics is not to the physics of today. Rather, it's to
physics pre-Boltzmann. Before then, we didn't really know how the atomic
theory of physics was related to thermodynamics. All we really had were
correlations: boiling is correlated with burning, with faster chemical
reactions, with expansion of solids, with the human perception of warmth,
etc. We observed
they all occurred together, and postulated an underlying variable T which
correlated well with all of them. We then exploited the correlation between
this hidden variable T and thermal expansion to design a specific test to
measure it (thermometers).

Pre-Boltzmann, all we had were correlations between fundamentally different
physical phenomena. We had various incorrect theories about why some of them
occurred together, but that's about it.

Now suppose Boltzmann comes along. He says "what you call a single variable
T, I can explain with the sum of many variables." Does that mean it's
reasonable to conclude that T is a "statistical myth"?

~~~
Dn_Ab
See, the article is not arguing that. It has nothing against latent variables
that account for what we observe as intelligence. What it is against is
misapplication of methodology. The myth is not in a latent variable
summarizing intelligence; that may very well exist, as the author admits. The
myth is that g is derived in a meaningful way and explains the correlations
in the data, when in fact it is a by-product of the tests being made to
correlate (these days) and is just a measure of the correlation of those
tests. Again, I remain sceptical that you read the article in full.

g is not as useful as T, since its ability as an explanatory variable and
verification in experimental settings are sorely lacking.

Also, even if a valid concept of a single explanatory variable for
intelligence were created, I personally remain sceptical of the scope of its
usefulness, considering the complexity of the space at hand (humans, genes,
environment, ...); it would likely be a lot less profound and far-reaching
than the insights of Boltzmann. And on the political side, the capacity for
damage it would entail could be large: many people's lives could be impacted
negatively. So it would have to be wielded carefully; one eugenics movement
is enough.

~~~
yummyfajitas
_What it is against is, misapplication of methodology._

Yes, and the author rightly points out that latent variables do not exclude
the possibility of microstructure. That doesn't mean the use of the latent
variables is a "statistical myth", unless you define the term "statistical
myth" so broadly as to include temperature and pressure.

 _g is not as useful as T, since its ability as an explanatory variable and
verification in experimental settings are sorely lacking._

This is both undisputed and unrelated to the author's argument. The
difference between our pre-Boltzmann understanding of T and our contemporary
understanding of g is one of precision. The author's argument was independent
of precision, so invoking precision to protect his argument is disingenuous.

You might want to criticize the confidence levels of g. That's a perfectly
legitimate thing to do. But that's not what I'm responding to.

 _So it would have to be wielded carefully, one eugenics movement is enough._

Not sure about that. The Dor Yeshorim organization does such a great job of
eugenics that I'd love to see further eugenics movements in other genetically
isolated groups.

------
MichaelSalib
I wish there were more Cosma Shalizis in the world. Somehow, statistics has
become the ultimate in cargo cult mathematics. We need more statisticians who
can write clearly to set fire to our thatched airplanes.

~~~
brent
Pardon my naivety, but what do you mean by "thatched airplanes"?

~~~
Nogwater
See: <http://en.wikipedia.org/wiki/Cargo_cult>

"Thatched airplanes" being an imitation of the real thing.

------
honm
Even if we can't measure it, what makes it so hard to believe that
intelligence is inheritable? Is height inheritable? Yes. Is eye color
inheritable? Yes. Is skin color inheritable? Yes. Is physical strength...?
Yes. Why wouldn't a human characteristic that has provided one of the biggest
advantages through evolution be inheritable too?

~~~
tokenadult
The example of the German monozygotic twins Otto and Ewald, both
well-nourished sportsmen who pursued different sports, shows that physique is
exquisitely sensitive to environmental influences even between two
individuals who share a genome and a prenatal environment in the same
mother's womb. Take a look at the photos.

<http://www.marksdailyapple.com/control-gene-expression/>

[http://www.joebower.org/2010/05/we-inherit-and-we-also-
becom...](http://www.joebower.org/2010/05/we-inherit-and-we-also-become.html)

AFTER EDIT: I'm asked in a reply below what my point was, and it's partly to
point out that the term "heritable" means something far, far different from
"determined by genes." There are whole books

[http://www.amazon.com/Nature-Nurture-Environmental-
Influence...](http://www.amazon.com/Nature-Nurture-Environmental-Influences-
Development/dp/0805843876)

[http://www.amazon.com/Genes-Behavior-Nature-Nurture-
Interpla...](http://www.amazon.com/Genes-Behavior-Nature-Nurture-Interplay-
Explained/dp/1405110619/)

[http://www.amazon.com/Dependent-Gene-Fallacy-Nature-
Nurture/...](http://www.amazon.com/Dependent-Gene-Fallacy-Nature-
Nurture/dp/0805072802/)

by professional geneticists, medical doctors, and psychologists patiently
refuting the confusion in most popular literature about what "heritability"
means, but the main point in this thread is that Shalizi is correct, and many
psychologists are wrong, about what heritability figures mean in relation to
IQ.

~~~
xiaoma
Not really sure what your point is there. It's natural that two people with
virtually identical genes subjected to vastly different training would have
different body types. It's also natural that two people with vastly different
genes would respond differently to virtually identical training.

Consider someone like this boy vs his classmates:
[http://www.sciencentral.com/articles/view.php3?type=article&...](http://www.sciencentral.com/articles/view.php3?type=article&article_id=218392292)

------
EugeneG
Cosma was one of my very favorite professors at CMU.

------
NY_USA_Hacker
There's a point he omitted: In curve fitting, exploratory data analysis, data
mining, etc., we are looking for X. We don't know if X exists. But if X does
exist, then our methods have a shot at finding X. So, we look. Maybe we find
something. We test what we find, and it appears to work as we believe X would.
So, we start to believe that X exists and we have found it.

Do I like following this 'paradigm'? No! But when people do follow this
paradigm and claim to have found X, I can't be sure they are wrong!

E.g., maybe there is a statistical model that predicts the stock market. So,
do a lot of curve fitting. Find something that appears to predict. Then, if
there is a predictive model, maybe we have found it. Test the model on old
data not used in constructing the model and see if it works. If it does, then
we start to believe that there is a predictive model and that we have found
it or something close enough.

When people do such things, I can't say that they are wrong.

------
NY_USA_Hacker
Largely I agree with him, and at times I suspected as much. E.g., just
testing some software, I generated some 'random' symmetric, positive definite
matrices, found the eigenvalues, and noticed that there was a big one and the
sizes went down quickly. So, in linear equations, a few variables constructed
from the eigenvectors of the largest few eigenvalues can make a good
approximation to all of many variables. So, factor analysis makes a good data
compression technique. Can't find fault with that.

That just a few of the largest eigenvalues/vectors can explain all the data
well is curious. So really he might have just used R and some Monte Carlo to
show us how variance explained increases with the number of factors used. I'm
surprised he didn't do this.
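That Monte Carlo takes only a few lines, here sketched in Python rather than
R. A minimal version of the experiment described above: build a 'random' SPD
matrix as B·B^T, sort its eigenvalues, and watch the cumulative fraction of
variance explained climb as more eigenvectors are kept.

```python
import numpy as np

rng = np.random.default_rng(42)

# A "random" SPD matrix: A = B B^T is symmetric positive (semi)definite
# for any real matrix B.
B = rng.normal(size=(50, 50))
A = B @ B.T

evals = np.sort(np.linalg.eigvalsh(A))[::-1]   # descending
explained = np.cumsum(evals) / evals.sum()

# Variance explained climbs quickly: a handful of eigenvectors
# approximate the whole matrix, which is why factor analysis works
# as data compression.
for k in (1, 5, 10, 25):
    print(k, round(float(explained[k - 1]), 2))
```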

Much or all of this has long been clear.

But what I didn't like was his drifting off into old goals of the
psychologists. I couldn't figure out if he was a psychologist with an ax to
grind or what. Instead, he's a statistical physicist. Curious.

The psychologists looking for 'causality' have a goal that ranges from tough
down to impossible, and we should have been concluding just that. That he got
off into arguing about 'causality' seemed a bit silly. I don't know all the
silly stuff the psychos are trying to believe, but arguing with silly psychos
is a bit silly.

But it remains: give a test with some mental puzzle problems, and in just a
linear way one can explain a lot of the data with just one factor. Curious.
Maybe somewhat useful. 'Causality'? Likely not, if only because we know that
there is a biological and neurological basis and we have made no connection
with that.

Then I didn't like his use of 'factors': he has some factors correlated. No:
the usual approach is that, just as the eigenvectors are all orthogonal, the
factors are all uncorrelated. Maybe the psychos look for some correlated
'factors' trying to get at some of their guesses about causality, but in this
case he should have been clearer.

Finally, he wants Gaussian assumptions to justify being interested in means,
variances, and covariances. Well, in the Gaussian case, the sample mean and
variance are 'sufficient' statistics. But even without Gaussian assumptions,
means and covariances remain important, e.g., as the inner products in the
Hilbert space of L^2 real random variables.

