A flaw in your argument is that biology is harder than physics. That is, the interactions of the elementary constituents of biology are many orders of magnitude more complex than those of the point particles of Newtonian physics, which makes sense considering the compositional hierarchy involved, from atoms -> cells -> humans. Hence reductionist statistical methods are likely to be far less useful and meaningful than they are in thermodynamics.
I do not care about the downvoting, but I am guessing it is due to my saying that biology is harder than physics [1]. Perhaps I should clarify: a broad mathematical understanding of biology is a harder problem than achieving one for the typical system studied in physics. In biology, systems can only be modelled, not understood or derived from fundamental principles, because the interactions of the basic entities are too complex. Or consider quantum mechanics: it is a straightforward linear theory (just conceptually difficult), while much of biology is analytically and computationally hard, often involving things like non-linear dynamical systems. A further difficulty is that the behaviour of the wholes studied in biology cannot be inferred by studying their constituents, due to feedback and dissipation processes leading to self-organization.
While I am clarifying, I might as well point out that although I personally believe that looking for broad general factors is of little use given the number of dimensions in play, there might be some use for them in aggregate. However, these should not be taken for more than they are: statistics. Furthermore, you, yummyfajitas, argue against a strawman. Did you read the article? Because if you did, you would have seen:
I don't want to be mis-understood as being on some positivist-behaviorist crusade against inferences to latent mental variables or structures. As I said, my deepest research interest is, exactly, how to reconstruct hidden causal structures from data.
...
Similarly, pointing out that factor analysis and related techniques are unreliable guides to causal structure does not establish the non-existence of a one-dimensional latent variable driving the success of almost all human mental performance. It's possible that there is such a thing. But the major supposed evidence for it is irrelevant, and it accords very badly with what we actually know about the functioning of the brain and the mind.
Which leads me to the fact that Shalizi has no qualms about dimensionality reduction itself. Rather, he objects to the methodologies used and to how the conclusions are drawn:
1) g is almost a tautology. It arises from the correlations among tests that are constructed to correlate. When factor analysis is performed on such variables, a dominant factor explaining most of their variance must appear for purely algebraic reasons (see the sketch after this list).
2) No one has tried to explain g directly, experimentally or otherwise. Instead the arguments are still made in terms of correlations.
3) They still use simple correlation matrices, i.e. linear models from yestercentury, even though more appropriate and robust methods, such as non-parametric statistics, have since been developed and refined.
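To spell out what I mean by "algebraic reasons" in point 1, here is a minimal sketch in Python (made-up correlation values, not real test data): feed any set of uniformly positively correlated variables into a factor analysis and a dominant, same-signed leading factor must fall out.

```python
# Minimal sketch of point 1 (made-up numbers, not real test data): any set of
# uniformly positively correlated variables yields a dominant, same-signed
# leading factor purely as a matter of linear algebra.
import numpy as np

rng = np.random.default_rng(0)
n_tests, n_people = 10, 5000

# Equicorrelation matrix: every pair of "tests" correlates at 0.3.
rho = 0.3
corr = np.full((n_tests, n_tests), rho)
np.fill_diagonal(corr, 1.0)

# Simulated scores with exactly that correlation structure.
scores = rng.multivariate_normal(np.zeros(n_tests), corr, size=n_people)

# "g" here is just the leading eigenvector of the sample correlation matrix.
eigvals, eigvecs = np.linalg.eigh(np.corrcoef(scores, rowvar=False))
leading_share = eigvals[-1] / eigvals.sum()      # eigh sorts ascending
leading_vector = eigvecs[:, -1]

print(f"share of variance on the leading factor: {leading_share:.2f}")   # around 0.37
print("all loadings share a sign:", np.all(leading_vector * leading_vector[0] > 0))
```

The dominant factor is a consequence of the positive correlations themselves; finding one is not, by itself, evidence of a single underlying cause.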
The downvoting (not by me, BTW) is probably because you didn't understand the math and simply appealed to complexity. Your appeal to complexity was both incorrect (pressure = sum of hidden variables, g = sum of hidden variables, complexity is equal) and also irrelevant (Shalizi's argument is against statistical reductions and is independent of complexity).
My objection is not to the fact that Shalizi is hypothesizing a microstructure to g. Here is what I had to say about that part of the article: "It's a great article... Thompson's model makes a specific prediction about g - it should be normally distributed. So overall, I agree with the author's scientific argument: g is very likely to be decomposable into subfactors."
My objection is simply to the term "statistical myth". By Shalizi's argument, macroscale variables which abstract away an ensemble of microscale variables are a "statistical myth". Depending on your philosophical axioms, that's a fine thing to believe - but we should acknowledge that if you believe this, then you also believe pressure is a statistical myth.
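To make the pressure analogy concrete, here is a minimal kinetic-theory sketch (the standard relation P = N m <v^2> / (3V); the gas and the numbers are my own illustrative choices): pressure is nothing more than an average over an enormous ensemble of microscopic velocities.

```python
# Minimal sketch: pressure recovered as an average over microscale variables.
# Uses the textbook kinetic-theory relation P = N * m * <v^2> / (3 * V);
# the gas (argon-like) and the numbers are purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
k_B = 1.380649e-23        # Boltzmann constant, J/K
m = 6.63e-26              # mass of one argon atom, kg
T = 300.0                 # temperature, K
N = 1_000_000             # number of simulated atoms
V = 1e-3                  # container volume, m^3

# Maxwell-Boltzmann: each velocity component is Gaussian with variance k_B*T/m.
v = rng.normal(0.0, np.sqrt(k_B * T / m), size=(N, 3))
mean_v_sq = np.mean(np.sum(v**2, axis=1))

p_from_microstate = N * m * mean_v_sq / (3 * V)   # statistical summary of the ensemble
p_ideal_gas_law = N * k_B * T / V                 # macroscopic law for comparison

print(f"from velocities: {p_from_microstate:.4e} Pa   ideal gas law: {p_ideal_gas_law:.4e} Pa")
```

Pressure here is exactly a statistical summary of hidden microscale variables, and nobody calls it a myth.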
As for the new points you raise:
1) G is a tautology if you deliberately posit a family of normally distributed variables with positive correlations. In fact, it's a tautology if you posit a family of strongly correlated normally distributed variables of any sort (the factors might just have negative components).
This does not, however, explain why performance on various tests is correlated. G and multifactor models are attempts to explain this. Thompson's model (which has g built in) is another; see the sketch after point 3.
2) Indeed, we don't really understand it. For a long time we didn't understand pressure or temperature either - all we understood were their macroscopic effects. So what?
3) You are clearly speaking about something you don't know much about. If you want to argue that linear models don't work because the data is nonlinear, do it. You haven't. Neither did the author.
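Since I keep appealing to Thompson's model, here is a minimal sketch of the sampling idea (toy parameters of my own, not anything from the original): each test draws on a random subset of many independent elementary abilities, and positively correlated, approximately normal scores fall out.

```python
# Minimal sketch of a Thompson-style sampling model (toy parameters only):
# many independent elementary abilities; each test samples a random subset
# of them; a person's score on a test is the sum of the sampled abilities.
import numpy as np

rng = np.random.default_rng(0)
n_people, n_abilities, n_tests = 5000, 1000, 8
p_sampled = 0.2    # chance a given ability contributes to a given test

# Deliberately non-Gaussian micro-abilities, to show the normality is emergent.
abilities = rng.uniform(-1.0, 1.0, size=(n_people, n_abilities))
uses = (rng.random((n_tests, n_abilities)) < p_sampled).astype(float)
scores = abilities @ uses.T                     # shape (n_people, n_tests)

corr = np.corrcoef(scores, rowvar=False)
off_diag = corr[~np.eye(n_tests, dtype=bool)]
eigvals = np.linalg.eigvalsh(corr)              # ascending order

z = (scores[:, 0] - scores[:, 0].mean()) / scores[:, 0].std()
print(f"mean inter-test correlation: {off_diag.mean():.2f}")                    # around 0.2
print(f"share of variance on leading factor: {eigvals[-1] / eigvals.sum():.2f}")
print(f"skewness of one test score (near 0 if normal): {np.mean(z**3):.3f}")
```

The aggregate behaves like the g I described above: approximately normal and accounting for the positive correlations, even though it is assembled from many small parts.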
Complexity is not equal, because the hidden variables underlying g (I will use the term as you do for now) are not comparable to the hidden variables underlying pressure.
The term "statistical myth" is appropriate because g is a myth. It is not backed by any direct experimental evidence but arises from manipulations of statistical techniques. The Gas Law you cite was derived from experiment, not from muddled statistical manufacturing. And the tests correlate because they are made to correlate - it is all quite circular. Your post misrepresents Shalizi. Shalizi's argument is not that "macroscale variables which abstract away an ensemble of microscale variables are a 'statistical myth'". Rather, it is that the way in which the latent variable g is first arrived at, and then subsequently used to drive conclusions, is invalid and meaningless:
But now new tests are validated by showing that they are highly correlated with the common factor, and the validity of g is confirmed by pointing to how well intelligence tests correlate with one another and how much of the inter-test correlations g accounts for. (That is, to the extent construct validity is worried about at all, which, as Borsboom explains, is not as much as it should be. There are better ideas about validity, but they drive us back to problems of causal inference.) By this point, I'd guess it's impossible for something to become accepted as an "intelligence test" if it doesn't correlate well with the Weschler and its kin, no matter how much intelligence, in the ordinary sense, it requires, but, as we saw with the first simulated factor analysis example, that makes it inevitable that the leading factor fits well. [13] This is circular and self-confirming, and the real surprise is that it doesn't work better.
As I quoted Shalizi earlier: 'I don't want to be mis-understood as being on some positivist-behaviorist crusade against inferences to latent mental variables or structures. As I said, my deepest research interest is, exactly, how to reconstruct hidden causal structures from data.'
I also do not understand how you are using g: as explained, it is not composed of factors, it is the dominating factor. While I do not know much about intelligence tests, I did follow the mathematics he gives and have applied similar techniques in machine learning contexts. And considering the wide variety of data that linear models fail to capture properly, my intuition is that, yes, linear methods and the assumption of a Gaussian are overly simplistic without a solid supporting argument - one which has not been given for nearly a century now.
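To illustrate the kind of structure a plain correlation matrix misses, here is a minimal sketch (toy data of my own): a variable that is essentially a deterministic function of another shows near-zero linear correlation, while a non-parametric dependence measure - Székely's distance correlation, implemented directly from its definition - picks the dependence up.

```python
# Minimal sketch (toy data): linear correlation misses a purely nonlinear
# dependence that a non-parametric measure (Szekely's distance correlation,
# implemented from its definition) detects easily.
import numpy as np

def distance_correlation(x, y):
    """Sample distance correlation: near 0 only when x and y are independent."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    a = np.abs(x[:, None] - x[None, :])                  # pairwise distance matrices
    b = np.abs(y[:, None] - y[None, :])
    A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()    # double centering
    B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
    dcov2 = (A * B).mean()
    return np.sqrt(dcov2 / np.sqrt((A * A).mean() * (B * B).mean()))

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = x**2 + 0.1 * rng.normal(size=1000)   # fully determined by x, up to small noise

print(f"Pearson correlation:  {np.corrcoef(x, y)[0, 1]:.3f}")    # close to 0
print(f"distance correlation: {distance_correlation(x, y):.3f}")  # clearly nonzero
```

This is the sort of dependence that rank- and distance-based methods catch and a linear correlation matrix throws away.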
p.s. you are right, my original post was orthogonal to what Shalizi had to say. It was just my argument against yours: I don't feel that generalizing to one factor for a system as complex as human intelligence will produce results as meaningful as generalizing to a simple law for a collection of atoms (a gas) did.