I'm not making an argument about problems with the actual process of measurement. I'm arguing that the confusion between the two reported values sounds a lot like the confusion you'd get from two reported lengths of the same object, one measured with a centimeter ruler and the other with an inch ruler, with both labeled just "length" in the report.
You can obviously convert between the scales, and convert everything to the modern scale, once you know that the value you got uses a different SD. But when values just get thrown around as "x IQ", you don't know whether they are on the old scale.
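To make the conversion concrete, here is a minimal sketch. It assumes both scales share a mean of 100 and differ only in SD (15 and 16 are the values mentioned in this thread); the function name `convert_iq` and its parameters are made up for illustration, not taken from any testing standard.

```python
def convert_iq(score, sd_from, sd_to=15.0, mean=100.0):
    """Re-express a deviation IQ score on a scale with a different SD.

    Assumes both scales share the same mean (100 by default) and
    differ only in standard deviation.
    """
    z = (score - mean) / sd_from   # distance from the mean in SD units
    return mean + z * sd_to        # the same z-score on the target scale


# A score two SDs above the mean: 132 on an SD=16 scale.
print(convert_iq(132, sd_from=16))  # 130.0
```

The catch is the `sd_from` argument: if the report only says "x IQ", you have nothing to pass in.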
I'm not entirely sure what you think I'm arguing, but so far you've been talking about something quite different the entire time.
(*Wikipedia says there are actually three common IQ scale conventions. Two psychologists had some sort of feud, and one of them picked SD=16 to piss the SD=15 guy off.)
Yes, I'm not doubting that this is so, only that it shouldn't be so in a scientific endeavor. If IQ testing were purely scientific (as opposed to being partly political), all those involved in IQ testing would allow a large set of test scores in a standardized test to produce the mean and sigma values on which everyone would need to agree. In other words, an empirical outcome.
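As a rough sketch of what I mean by an empirical outcome, assuming you simply take the mean and population standard deviation of a set of raw scores (the scores below are invented for illustration, and `empirical_norms` is a hypothetical helper name):

```python
import statistics


def empirical_norms(raw_scores):
    """Return (mean, sigma) computed directly from raw test scores."""
    mu = statistics.mean(raw_scores)
    sigma = statistics.pstdev(raw_scores)  # population standard deviation
    return mu, sigma


# Hypothetical raw scores (e.g. number of items answered correctly).
raw_scores = [38, 42, 45, 47, 50, 50, 53, 55, 58, 62]
mu, sigma = empirical_norms(raw_scores)
print(mu, sigma)
```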
> I'm not entirely sure what you think I'm arguing, but so far you've been talking about something quite different the entire time.
Apparently so. My point is that IQ test scores must be collected on an absolute scale based on testing results, before any of the adjustments you're describing. If this weren't the case, if test outcomes depended on something other than the direct performance of the subjects measured in a uniform, reliable way, the testing procedure would be fatally undermined.
Bottom line: I doubt that changes in mean and sigma can produce two different IQ scores in a standardized test as you're claiming. For this to be true, the relationship between the population statistics and the analysis result (mean, sigma) would have to be reversed -- it would put the cart before the horse.
Imagine this conversation:
Q. How do the statistical results derive from the test scores?
A. By a straightforward procedure -- the test scores are subjected to a classical statistical analysis, resulting in a mean and standard deviation.
Q. How are the original test scores arrived at?
A. They're derived from (a) the test results, but (b) adjusted by the mean and standard deviation values of the population created above.
Q. (after a long pause) But ... but ... doesn't that create an example of circular reasoning, in which the scores rely on the stats and the stats rely on the scores?
A. What? I'm not following you. Can you draw a picture?
A. Okay, I get it. So the statistical analysis depends on the test scores and the test scores depend on the statistical analysis. I don't see a problem with that.
Q. Have a nice day, doctor.