While I agree that this is a problem (i.e., one of suspect measurement), I don't think it's the major problem. Tests that measure stuff that a not-insignificant portion of the test-taking population considers either irrelevant or (worse) antithetical to their lives are going to yield wonky results. The administrators claim to be measuring school or teacher effectiveness, while I would suggest that they are basically measuring SES (socioeconomic status).
Said another way, if you give tests that ask people about stuff that neither they nor most of the people they know care about, you shouldn't be surprised by bad and/or inconsistent results.