
Cheating our children: Suspicious school test scores across the nation - tokenadult
http://www.ajc.com/news/cheating-our-children-suspicious-1397022.html
======
nullflux
The problem with education is that these tests treat students as a commodity.
"Standardized" tests tend not to be even remotely useful in diagnosing the
positive social capability of a human being.

It is a travesty that American teachers and administrators are put in
situations where they feel that they have to falsify results because their
students are not reaching the targets. The learning capabilities and specialties of
students are as diverse as the rest of human nature, and the best way to
create proper incentives for students and teachers is to build a pervasive
culture that rewards learning and meaningful contribution to society. Perhaps
American education needs to edit its message: aligning extracurricular
American cultural values to highlight the positive effects of learning on
success will likely increase test scores more than the next set of
standardized whatevers that get thrown at them by an overwhelmed, out-of-touch
committee.

------
saulrh
Ensuring that scores are reported correctly won't even start to fix our
educational system. We don't need to cut down on cheating; rather, we need to
realize that more data is not better data, that teachers know a lot more about
teaching than we do, that the teachers were doing a hell of a lot better
before we started measuring them, and that we should just let them do their
jobs.

~~~
yummyfajitas
_...teachers know a lot more about teaching than we do, that the teachers were
doing a hell of a lot better before we started measuring them..._

How do you know this? Even in principle, how _could_ you know this?

To me, it sounds like "I was such a good government contractor before they
started auditing me and making sure I actually delivered what was promised."

~~~
saulrh
There are ways to evaluate teachers other than running tests. In particular, I
was in about ninth grade when NCLB hit, and its effects started rolling in in
11th grade. I was sufficiently aware of the situation to observe. It was a lot
more like "The government contractors building my bridge were meeting all
their deadlines until their managers started demanding documentation in
triplicate for every nail they hammered in."

~~~
yummyfajitas
Without measurement, how can you know objectives were being met?

~~~
saulrh
Again, more measurement is not better measurement. I spent fifty hours a week
in a classroom for nearly twelve years. Quite a bit of that was watching my
fellow classmates struggle to understand things that I'd learned from the
textbook at the beginning of the year. You get pretty good at it.

Also, around 10th grade, every single teacher in the entire school rewrote
their syllabus. From there on, the ones with appropriate degrees in math and
science - you know, the ones that actually knew what they were doing -
complained bitterly about how hard it was to teach both the stuff that
mattered and the stuff that was on the test.

~~~
yummyfajitas
_Again, more measurement is not better measurement._

Again, how do you know objectives were being met? Please clearly state your
evaluation procedure.

 _...complained bitterly about how hard it was to teach both the stuff that
mattered and the stuff that was on the test._

I've made similar complaints - Calc syllabi suck. Too many stupid algebra
tricks, too little conceptual reasoning. This is a problem with the syllabus,
not the test.

All the test does is reveal when you stop doing your job and teach whatever
you feel like, rather than what you are being paid to teach.

~~~
aestetix
Actually, I'd suggest the tests only reveal when the teacher stops teaching to
the test. Also, it's really not fair to hold the teacher's job hostage and say
that if they don't teach the way the test demands, they should lose their job.
How about giving them the freedom to teach based on what they feel is good for
their students?

~~~
DanBC
> _How about giving them the freedom to teach based on what they feel is good
> for their students?_

The controversy about teaching children to read using either 'whole words' or
'phonics' (and sub-forms of phonics) is a nice example of some teachers not
working to an evidence base and causing harm to their students.

~~~
saulrh
Even if they aren't teaching from an evidence base and from proper research,
they're better at defining requirements than the legislators simply because
they're the ones in the classroom and observing their students' progress. I
assume you're a programmer, so I'll reframe the question: would you rather
design your APIs yourself or have the CEO of your large corporation do it for
you?

~~~
DanBC
You know that doctors say that this treatment works, because they see it
works.

And then someone does a double-blind controlled trial and it turns out that
treatment is useless.

(<http://www.nejm.org/doi/full/10.1056/NEJMoa013259>)

------
droithomme
OK, so they analyzed 69,000 schools by comparing test scores to a model they
created predicting what the scores should be, and 200 out of the 69,000, or
0.3%, didn't match the model. (Note: main article says 200 districts, but the
associated article showing the data says 200 schools
<http://www.ajc.com/news/cheating-our-children-1393866.html>)

It's amazing that 99.7% of schools matched their model. Must be a very good
model.

It's ridiculous, though, that they are using this as evidence of cheating:
even a model that matches 99.7% of schools is not 100% accurate, and some
mismatches are to be expected.
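
To put numbers on that (a quick sketch; the 1% model error rate is an
illustrative assumption, not a figure from the article):

```python
# Fraction of schools flagged by the AJC analysis: 200 out of 69,000.
flagged_share = 200 / 69_000
print(f"{flagged_share:.2%}")  # about 0.29%, the article's "0.3%"

# Base-rate check: even a small model error rate would flag many innocent
# schools. The 1% rate here is an illustrative assumption, not a figure
# from the article.
error_rate = 0.01
expected_false_flags = 69_000 * error_rate
print(int(expected_false_flags))  # 690 schools flagged by model error alone
```

If the model mispredicted even 1% of schools, that would account for more
than three times the 200 schools actually flagged, which is why a 0.3%
mismatch rate by itself proves nothing.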

The methodology also relies on questionable assumptions. They look at
three-year slices to find cases where test scores jumped up one year and then
back down the next. They consider these evidence of systematic, district-wide
cheating. That is unproven. If it was cheating, why would the scores go back
down? Is
there a correlation then between years of upswing and schools being on the
list of underperforming schools about to be forcibly shut down by the feds?
That might be evidence of cheating if so. But if that was so it would surely
be claimed in the study. Also, district-wide cheating means a mass conspiracy
in which hundreds of teachers managed to, under the direction of a coordinated
central authority, successfully cheat for one year but not the next, keep the
secret, and do so at hundreds of schools. That's quite a conspiracy.

I am sure cheating takes place from time to time, but it is far more likely to
happen at the individual teacher level than district-wide from a central
source, because the larger a conspiracy gets, the less likely it is that
everyone can keep the secret. Checking for schools that have a single year
with higher scores and finding a small number that do is poor evidence of a
cheating conspiracy.

~~~
tokenadult
I appreciate your thoughtful comments, droithomme, on many of the HN threads
on education reform. You are correct that the statistical model used by the
newspaper report needs more validation. On the other hand, as the story
relates, the model is a generalization of a model developed during reporting
by the same newspaper about known cheating incidents that could be
independently verified by other means, including witness reports of overt
conduct that breached test security or altered student-submitted answers after
the test booklets were submitted to teachers or school principals. The earlier
reporting by the same newspaper revealed that sometimes agreements
("conspiracies") to cheat on tests were rather widespread in entire schools or
even entire school districts, and stayed secret for a time (at least) because
all the school officials involved were united in having an interest in
presenting fake test scores.

I think that the scores bouncing up but then back down is, again, an empirical
observation from the known cases of cheating in the area the newspaper
reported on most closely a few years ago. As the submitted article here
suggests, it is certainly notionally possible that a group of students might
spurt forward suddenly in academic performance if they enjoy the instruction
of a particularly good teacher. But, if genuine learning had taken place, it
is notionally much harder to find the same group of students fall back the
next year to a level lower than the level they reached the previous school
year. It is much more likely that students might progress in grades from a
teacher who is eager to look good, even if it means cheating, to a teacher who
is not in on the cheating plot, or who at least is less effective at cheating
in a manner that massively boosts scores.

I think some of the other claims that you suggest would be made by the
newspaper, if the statistical model were definitely backed up by other
evidence of cheating, could be made if journalists in the various localities
involved look for other sources of information. Not in all cases, as you
correctly point out, but in many cases, additional reporting leg work could
probably find direct evidence of violation of test security before the test
resulting in exact "teaching to the test" item by item, or alteration of
student item responses after students submit their test booklets to school
officials for scoring. That is currently an open question, but based on known
previous examples in various parts of the United States, there is enough smoke
here to prompt journalists to look for sources of fire.

~~~
droithomme
Interesting, I didn't catch from the article that this three-year bump model
had been tested on multiple incidents and found to correlate. It mentions in
the third paragraph their methodology used "a pattern that, in Atlanta,
indicated cheating in multiple schools". It seems Atlanta was the only place
they are claiming there was a correlation, but even of this single data point
they offer no specific evidence beyond the red dot in their graph. I did find
it interesting that the biggest recent case where there was systematic
corruption within a system was in Atlanta, and the newspaper here trying to
show that cheating is common throughout the rest of the US and not just
something special in Atlanta is Atlanta's leading newspaper.

Recent analyses of data on 18,000 teachers in New York City have shown that
year-to-year testing results are not as consistent as one would assume.

<http://news.firedoglake.com/2012/03/07/data-analysis-of-value-added-teacher-model-shows-no-correlation/>

> Gary Rubinstein has been analyzing the data at his site, and he’s found some
> incredible results. First, he found that there’s almost no correlation among
> teachers year-over-year in the data. A teacher is likely to be judged as
> effective one year and ineffective the next. The average change in
> performance was a relatively large 25 points, and it did not fit with
> commonly accepted beliefs that teachers improve performance over time,
> particularly between the first and second year.

<http://garyrubinstein.teachforus.org/2012/02/26/analyzing-released-nyc-value-added-data-part-1/>

> Looking over the data, I found that 50% of the teachers had a 21 point
> ‘swing’ one way or the other. There were even teachers who had gone up or
> down as much as 80 points. The average change was 25 points. I also noticed
> that 49% of the teachers got lower value-added in 2010 than they did in
> 2009, contrary to my experience that most teachers improve from year to
> year.

> With a correlation coefficient of .35, the scatter plot shows that teachers
> are not consistent from year to year ... nor do a good number of them go up

These sorts of results, where the data is available to the public for
analysis, make me skeptical that testing results are actually consistent from
year to year and that inconsistencies are evidence of cheating.

I contrast these findings to the AJC article's claim that "Experts say student
learning doesn’t typically jump backwards." The article does not state which
experts they refer to, what their qualifications are, or what exactly they
have said. That is suspicious to me, especially since the article has many
quotes where they are not so shy about identifying their sources. Shouldn't it
be even more critical to identify specific sources when using an appeal to
authority argument citing "experts"? Especially when in cases where we have
actual data, such as the study I cite above, scores did in fact jump backwards
49% of the time. This difference between empirical data and vague assertions
by unnamed experts is suspicious. I must wonder who these experts are.

Mind you, I think there is cheating. I am highly skeptical that the AJC's
method has been shown to be a valid way to identify where it is happening.

------
ilaksh
It's not as simple as that. From everything I hear, the teachers and school
administrators are being squeezed between a rock and a hard place.

I would guess that there is a subculture of school administrators who feel
that No Child Left Behind etc. are hurting curriculum and unfair to teachers
and schools.

They have to decide between being honest about their scores or firing a bunch
of teachers and/or possibly having their schools shut down, and/or having
class sizes they can't cope with due to the need to hold back huge numbers of
students.

I'm not a teacher, but I have heard a few teachers talk about this. I think
it's sort of like what happens quite often between a manager and his
employees: the
manager makes some proclamations that are impractical and often unrealistic,
and some significant portion of the employees decide they just have to smile
and nod, but basically they are ignoring it as invalid. So in this case they
feel they are forced to falsify the results.

I think that probably there are some schools in well-off neighborhoods where
kids are advantaged and the administrators don't run into these issues that
they consider moral dilemmas. The administrators and teachers in those schools
may say that the other teachers deserved to be fired based on the poor
performance of their students, but those privileged teachers aren't aware of
the disadvantages that the other students are working against and therefore
don't realize that students' failure to meet objectives often isn't their
teacher's fault.

Personally I think that national measurements and guidelines are very
important, but fine-grained, centralized command-and-control over an entire
system has been proven time and again to be repressive and to result in
failure.

This goes for all sorts of systems, not just education. It is good to try to
come up with standards, do holistic analysis, have common data formats and an
egalitarian shared perspective and goals. But a system needs to be highly
localized and non-homogeneous to be efficient, robust, and have the freedom to
evolve and adapt to local conditions.

~~~
yummyfajitas
From everything I hear, the traders and managers at failing hedge funds are
being squeezed between a rock and a hard place.

They have to decide between lying about their losses or firing a bunch of
people and possibly having their hedge funds shut down.

I'm not a trader, but I have heard a few traders talk about this. I think it's
sort of like what happens quite often between an investor and his fund: he
makes some proclamations that are impractical and the market doesn't
cooperate, and some significant portion of the employees just smile and nod
but ignore it and keep collecting their base/bonus. So in this case, they feel
they are forced to commit fraud and hide their losses.

I think that probably there are some desks in well-run firms where profits are
high and traders don't run into these issues that they consider moral
dilemmas. The managers and traders in those firms may say the other traders
deserved to be fired based on their losses, but those privileged traders
aren't aware of the disadvantages that the other traders are working against
and don't realize the market simply moved against them.

Just thought I'd apply the same logic to another profession, albeit a
profession for which most people hold few positive feelings.

~~~
mokus
You left out the part parallel to "I would guess that there is a subculture of
school administrators who feel that No Child Left Behind etc. are hurting
curriculum and unfair to teachers and schools."

The standards schools are being measured against (and rewarded or punished for
their performance relative to) are far less objective than "made or lost
money". You have not applied the same logic to a different profession, because
it misses out on the fundamental point of the comment you are parodying - that
the standard is, in many cases, judging the schools wrongly.

The dilemma is not whether or not to lie about failing. The dilemma is whether
or not to lie about a shitty metric that _says_ you are failing when you can
see clearly that you are not.

I'm not going to make a claim one way or another about what I think of that
argument. I'm just pointing out that the argument you are parodying is not the
one you appear to think it is.

~~~
yummyfajitas
The standard trading desks are held to is also often less objective than "made
or lost money". It's "made or lost money relative to some arbitrary risk-
adjusted benchmark". If the risk guys disagree with your model about how
dangerous your strategy is, they could easily mark down your +400bps alpha to
+100bps.

And yes, there is a "subculture of traders who feel that the policies enforced
by Risk are hurting profits and unfair to traders." Actually it's not so much
a subculture as the dominant culture.

~~~
mokus
Fair enough. That wasn't at all clear from your earlier comment, though. In
light of this information I'm inclined to agree that there is some similarity.
I think there's still a pretty colossal difference, too, though - the risk
guys generally work for the same firm, don't they?

To reflect back across the metaphor, in the education world the "risk guys"
are a single group of people at a national entity most people already don't
trust to manage money, and they are unilaterally declaring the standards for
_everyone's_ risk.

------
rsanchez1
When you tie incentives and benefits to test score performance, something's
gonna give. You can either institute auditing of schools to guard against
cheating, or have fewer incentives riding on school test scores.

If you think about it, the schools with poor test scores are the schools that
need more help, not less. The schools performing well are doing great with
what they have; they don't need more. Only if a school consistently performs
poorly with more help should some punishment take place.

