Some scoring systems use an initial standardization where the standard deviation is 15 points, others use 24 points, so the same test performance can get you IQ 115 or IQ 124 depending on whose test you take.
Unless, of course, different groups of testers are using different assumptions, but without being driven by the analysis of the largest possible collection of standardized test scores. If so, it casts IQ testing into doubt as a reliable tool.
Evidence for this being a settled issue is the fact that workers in this field report a gradual increase in IQ over the decades:
If mean IQ really were adjusted to agree with current test scores, the mean would always be 100, regardless of test score changes over time.
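To make the arithmetic concrete, here is a minimal sketch of how a reported mean depends on which norming sample is used. The raw scores are invented toy numbers, not real test data, and `deviation_iq` is a hypothetical helper that assumes the usual linear rescaling of a z-score to a mean of 100.

```python
from statistics import mean, pstdev

def deviation_iq(raw, norm_scores, sd_points=15.0):
    """Rescale a raw score against a norming sample (hypothetical helper)."""
    mu = mean(norm_scores)        # mean of the norming sample
    sigma = pstdev(norm_scores)   # population SD of the norming sample
    return 100.0 + sd_points * (raw - mu) / sigma

# Invented raw scores, for illustration only.
old_cohort = [40, 45, 50, 55, 60]
new_cohort = [46, 51, 56, 61, 66]   # same spread, 6 raw points higher

# New cohort scored against its own norms: the mean comes out at 100.
own_norms = [deviation_iq(x, new_cohort) for x in new_cohort]

# New cohort scored against the old cohort's norms: the mean is above 100.
old_norms = [deviation_iq(x, old_cohort) for x in new_cohort]

print(mean(own_norms), mean(old_norms))
```

In this sketch a rise over time only shows up when the newer scores are compared against older norms; scored against their own norms, the mean is 100 by construction.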
> Some scoring systems use an initial standardization where the standard deviation is 15 points, others use 24 points, so the same test performance can get you IQ 115 or IQ 124 depending on whose test you take. [emphasis added]
The conclusion is still false -- the test itself doesn't change, only the scoring assumptions. Those who assume σ = 15 could acquire the tests from those who assume σ = 24 and add them to their own dataset, and vice versa. Also, I have to say, either the standard deviation can't change the test scores, or the test scores have no meaning.
One more thing -- the standard deviation shouldn't be an assumption, with one group arbitrarily choosing 15 and another choosing 24. The value should be derived from a large set of test scores, not a committee casting a vote.
Your argument seems to be that one's IQ score depends on the population result, along with some arbitrary assumptions like σ = 15 or σ = 24. But that's the reverse of normal statistical practice, in which the mean and standard deviation derive from test scores, not the other way around.
Obviously I'm not doubting that what you say may be so, only that it shouldn't be so -- the standard deviation shouldn't be based on anything but the analysis of a large set of standardized test scores.
I'm not making an argument about problems with the actual process of measurement. I'm arguing that the confusion between two reported values sounds a lot like the confusion between two reported lengths of the same object, one measured with a centimeter ruler and the other with an inch ruler, with both labeled just "length" in the report.
You can obviously convert between the scales, and convert old values to the modern scale, once you know that the value you got uses a different SD. But when the values just get thrown around as "x IQ", you don't know whether they are on the old scale.
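As a rough sketch of that conversion, assuming both scales share a mean of 100 and differ only in the SD convention (`convert_iq` is a made-up name, not anything from a real library):

```python
def convert_iq(score, sd_from, sd_to, mean=100.0):
    """Re-express an IQ score on a scale with a different SD convention."""
    z = (score - mean) / sd_from   # distance from the mean in SD units
    return mean + z * sd_to        # same z-score on the target scale

print(convert_iq(124, sd_from=24, sd_to=15))  # 115.0
print(convert_iq(115, sd_from=15, sd_to=24))  # 124.0
```

The z-score is the part that carries the information; the 115 and the 124 are the same result one SD above the mean, written in different units.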
I'm not entirely sure what you think I'm arguing, but so far you've been talking about something quite different the entire time.
(*Wikipedia says there are actually three common IQ scale conventions; two psychologists had some sort of feud, and one of them picked SD=16 to piss the SD=15 guy off.)
Yes, I'm not doubting that this is so, only that it shouldn't be so in a scientific endeavor. If IQ testing were purely scientific (as opposed to being partly political), all those involved in IQ testing would allow a large set of test scores in a standardized test to produce the mean and sigma values on which everyone would need to agree. In other words, an empirical outcome.
> I'm not entirely sure what you think I'm arguing, but so far you've been talking about something quite different the entire time.
Apparently so. My point is that IQ test scores must be collected on an absolute scale based on testing results, before any of the adjustments you're describing. If this weren't the case, if test outcomes depended on something other than the direct performance of the subjects measured in a uniform, reliable way, the testing procedure would be fatally undermined.
Bottom line: I doubt that changes in mean and sigma can produce two different IQ scores in a standardized test as you're claiming. For this to be true, the relationship between the population statistics and the analysis result (mean, sigma) would have to be reversed -- it would put the cart before the horse.
Imagine this conversation:
Q. How do the statistical results derive from the test scores?
A. By a straightforward procedure -- the test scores are subjected to a classical statistical analysis, resulting in a mean and standard deviation.
Q. How are the original test scores arrived at?
A. They're derived from (a) the test results, but (b) adjusted by the mean and standard deviation values of the population created above.
Q. (after a long pause) But ... but ... doesn't that create an example of circular reasoning, in which the scores rely on the stats and the stats rely on the scores?
A. What? I'm not following you. Can you draw a picture?
A. Okay, I get it. So the statistical analysis depends on the test scores and the test scores depend on the statistical analysis. I don't see a problem with that.
Q. Have a nice day, doctor.