
Teacher Effects on Student Achievement and Height: A Cautionary Tale - blfr
https://www.nber.org/papers/w26480
======
jefftk
Summary: schools are rewarding teachers based on "value-added measures" that
try to measure the effect of the teacher on the student's educational
progress. The paper shows that these models are mostly picking up unrelated
differences between student populations, by showing that the same approach
would reward teachers for having taller students.

~~~
wtvanhest
Could be a landmark paper. At a minimum, it shows that all evaluation methods
should completely negate teachers' impact on height. It seriously calls into
question a ton of research that I would have assumed to have been good.

Excellent work by those that wrote the paper.

~~~
datashow
It appears to be an important research study, but in the sense of implication
to educational policy, not so much.

The reason is, for one, that value-added measurement does not play as
significant a role as many would think. Not a single state in the U.S.
uses VAM as the sole measure of teacher performance, or as a determining factor
in teacher firing or promotion. It is always mixed with other quantitative
and/or qualitative measures (which usually are no better than VAM).

The second reason is that the research community has not found a better
replacement yet. We have always known we cannot directly use student
achievement; now we know we cannot trust a more complicated measure such as
VAM, which takes into account student history and other factors.

~~~
wtvanhest
The world is moving toward multifactor models like VAM. This paper introduces
a crucial test to determine whether a VAM is applicable or not.

------
MarcScott
I worked as a teacher in state-funded education for 15 years.

Value added measurements of student achievement will always fail.

In affluent areas, where student achievement is already high, schools can
afford to hire the best teachers, but they will realistically only be able to
boost achievement by a small fraction, given the child's prior attainment.

In deprived areas, where student achievement is lower, schools can only afford
to hire less qualified teachers, and the boost to achievement will be a small
fraction of previous attainment.

Either way, the measured effect that the teacher has will be marginal, while
common standards and common testing are the only measured outcomes. This does
not equate to the teachers not being invested in student outcomes, or trying
their best for their students.

Value added measures were only introduced as a form of target setting, for
teacher performance. I'd encourage everyone to watch Adam Curtis' The Trap[1]
to see the effects of target setting.

Teachers in early years education set a target, then primary school teachers
need to show value added, so they game the system to show improvement. This trickles
up the system, and the net result is unrealistic targets for almost every
child as they exit formal education.

[1] [https://en.wikipedia.org/wiki/The_Trap_(TV_series)](https://en.wikipedia.org/wiki/The_Trap_\(TV_series\))

~~~
angry_octet
The problems of students and schools in low SES areas go way beyond teacher
quality and student educational achievement.

More explicitly, there are many excellent teachers in poor areas. Teaching
privileged children in a well funded school is much easier.

~~~
pkaye
In my city there are 5 high schools which perform remarkably differently despite
having the same source of funding and teachers. The high performing school is
probably in the 99th percentile in the state. If you look at the schools, that
high performing one looks the most beaten down. I think it all comes down to
the affluence level of the parents, their education levels and what they
impart on their kids.

------
jedberg
The problem with using height is that there is already evidence that taller
people are more successful due to hidden biases.

So perhaps the teacher measures were still valid because all the tall people
were more successful because of other biases in the system?

I'm mostly just being cheeky here. To be clear, I still think it's nearly
impossible to measure how good a teacher is based on student performance,
since the teacher is only one part of the equation, the parents (and the
student's home life) being a much bigger part of the equation. The teacher can
only do so much if the child comes home to parents that don't feed them and
tell them they are worthless.

~~~
beagle3
> The problem with using height is that there is already evidence that taller
> people are more successful due to hidden biases.

IIRC, while it has been shown that taller people are more successful, and
everyone says it is due to "hidden biases", those hidden biases were not in
fact shown.

A competing theory that I am familiar with, which provides just as good (if
not better) explanation is this:

Height is a (bad) proxy for healthy upbringing. Your maximum height is
essentially genetically predetermined. But to actually attain that height, you
have to have good everything growing up -- good food, good health. If you do
not have access to nutritious food, or you are plagued with diseases that
impact your growth (.. or treatment for them [0]), you will be shorter. So:

Tall = Tall Genes + Good Environment

Short = Short Genes + Good Environment ; or Tall Genes + Bad Environment

Thus, tall people are correlated with good environment growing up, and shorter
people less so -- and the theory is that it is the environment you grow up in
that explains your success, rather than some hidden biases.

~~~
bsanr2
IANAS, but from what I understand, it's more complicated than that. A parent's
or grandparent's childhood health has been shown to affect the growth and
health of their descendants. These effects only go away after several
generations of adequate nutrition and health. Tall women tend to give birth to
larger babies not (directly) because of a "tall gene", but because of the
physiological realities of their bodies (i.e., more space). We've recently
identified several traits that are hereditary and that can stunt growth, but
that had previously not been linked directly to a given family's shortness
(e.g., sleep apnea and gluten sensitivity), and which have interventions much
less radical than genetic modification.

It _seems_ clear-cut, and there are studies that link certain genes to height
on a statistical level. But I suspect we're going to find that height has
little to do with genetics in the traditional sense, and outside a few notable
outliers, that there are no "short genes", just environments that fit specific
genetic profiles that are or are not available to the people with those genes.
Europeans are tallest in a generalized global monoculture where Western diets
and lifestyles are the norm, and where European-descended populations have
enjoyed a generally higher quality of life for several generations. Color me
surprised.

~~~
iguy
Often what looks like an effect of grandparents on kids (controlling for
parents) is just genetics -- what you observe in the parents is a noisy
measure of some hidden variable, and the grandparents give you additional
(noisy) data.
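
A minimal sketch of that point, under made-up assumptions: a hidden trait is passed only from grandparent to parent to child (the 0.5 transmission factor and the noise levels are illustrative, not taken from any study), and each generation's observed value is the hidden trait plus independent noise. The function name `grandparent_coefficient` is hypothetical. Regressing the child on both ancestors still gives the grandparent a positive coefficient, even though the simulated grandparent has no direct effect.

```python
import random

def grandparent_coefficient(n=50_000, seed=0):
    """Regress child's observed trait on parent's and grandparent's
    observed traits; return (parent_coef, grandparent_coef)."""
    rng = random.Random(seed)
    noise = 0.75 ** 0.5  # keeps every hidden trait at variance 1 (assumption)
    child, parent, grand = [], [], []
    for _ in range(n):
        g = rng.gauss(0, 1)                 # grandparent's hidden trait
        p = 0.5 * g + rng.gauss(0, noise)   # parent inherits from grandparent only
        c = 0.5 * p + rng.gauss(0, noise)   # child inherits from parent only
        # What we observe is the hidden trait plus independent noise.
        grand.append(g + rng.gauss(0, 1))
        parent.append(p + rng.gauss(0, 1))
        child.append(c + rng.gauss(0, 1))

    def cov(xs, ys):
        mx, my = sum(xs) / n, sum(ys) / n
        return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

    spp, sgg, spg = cov(parent, parent), cov(grand, grand), cov(parent, grand)
    scp, scg = cov(child, parent), cov(child, grand)
    # Solve the 2x2 normal equations for the two regression slopes.
    det = spp * sgg - spg * spg
    b_parent = (scp * sgg - scg * spg) / det
    b_grand = (scg * spp - scp * spg) / det
    return b_parent, b_grand

b_parent, b_grand = grandparent_coefficient()
print(f"parent coefficient: {b_parent:.3f}, grandparent coefficient: {b_grand:.3f}")
```

The grandparent coefficient is nonzero here only because the parent's observed value is an imperfect measure of what was actually inherited.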

> outside a few notable outliers, that there are no "short genes", just
> environments

Predictions based on about 20000 different genes gave correlation 0.65, in the
best paper I can find in 5 minutes. (Which was from 2017, a long time in this
game.) This is for differences within a European-descended population.

~~~
NickNameNick
There was a study a while back, based on family records out of Scandinavian
churches, where they were able to identify patterns in children based on when
(or if) their paternal grandfathers had experienced mild famine in a narrow
age range.

They were further working to tie these effects to a group of epigenetic
markers. (possibly successfully, I'm unclear.)

It may have been this study:
[https://en.wikipedia.org/wiki/%C3%96verkalix_study](https://en.wikipedia.org/wiki/%C3%96verkalix_study)

~~~
iguy
I do not know this study, but sentences like "paternal grandfather's food
supply was only linked to the mortality RR of grandsons and not
granddaughters" in a study with N=303 make me pretty skeptical about
p-hacking.

There's a lot of academic interest in such things, precisely because they go
against the grain of everything that's well understood. Finding one, and
proving that it's really certainly not zero, could make your name.

------
joshvm
This would have been a perfect opportunity to cite the paper about the
efficacy of parachutes:
[https://www.bmj.com/content/363/bmj.k5094](https://www.bmj.com/content/363/bmj.k5094)

~~~
jotakami
Too slow to make the connection here... can you explain your reasoning?

~~~
goto11
It seems to be a contrived joke: They tested people jumping with and without
parachutes from a small airplane situated on the ground.

~~~
pmyteh
It's a follow-up to a previous article bemoaning the lack of a randomised
controlled trial of parachutes and arguing that they should not therefore be
accepted as a valid 'treatment' for falling out of an aeroplane
([https://www.ncbi.nlm.nih.gov/pmc/articles/PMC300808/](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC300808/)).

Both this article and its predecessor were published in the Christmas issue of
the BMJ, where funny papers with a semi-serious point (in these cases, the
ongoing debates in medicine about the superiority of RCTs over other forms of
clinical evidence) are common.

My favourite of this genre is "Does Peppa Pig encourage inappropriate use of
primary care resources?"
([https://doi.org/10.1136/bmj.j5397](https://doi.org/10.1136/bmj.j5397))
critiquing the clinical practice of Dr Brown Bear, a fictional cartoon GP.

~~~
CrazyStat
My favorite BMJ Christmas article is a study of self-reported virgin mothers
[1], but I might be biased because I know one of the authors.

[1]
[https://www.bmj.com/content/347/bmj.f7102](https://www.bmj.com/content/347/bmj.f7102)

------
pjc50
It's not totally beyond the realm of plausibility for teachers to have long-
term effects on height, given the effect of childhood nutrition on height.
Properly feeding children can have such a positive effect on their academic
achievement that some teachers end up doing it out of their own pocket.
Certainly the best way to raise the worst-performing children is "fixing"
whatever disaster family situation they are dealing with at home.

~~~
lonelappde
Teachers are not feeding kids in statistically significant volumes. Schools
are, and there might be correlations.

~~~
matheweis
Is it possible that good teachers hang out in the lunchroom and encourage
their students to eat? I don’t think it’s fair to so readily dismiss the
hypothesis without a good look at the relevant data.

~~~
pjc50
Good teachers manage to prevent their school enacting bad policy like this:
[https://edition.cnn.com/2019/11/19/us/school-lunch-debt-dona...](https://edition.cnn.com/2019/11/19/us/school-lunch-debt-donations-trnd/index.html)

------
neallester
For children who are still growing physically and mentally, both height and
intellectual development are correlated with age, aren't they? In an analysis
where you are treating children who actually have a range of ages as a single
age (i.e. "5th graders") finding some correlation between height and
intellectual development isn't that surprising. It is just an expression of
the underlying correlation between their age and intellectual development.

~~~
xyzzyz
Height is generally correlated with IQ, though not very strongly (r = 0.2),
and IQ is very strongly correlated with educational success. It is also known
that this correlation is not spurious: half of it is pleiotropy, that is,
some of the genes that cause taller height also cause higher intelligence, and
half of it is due to assortative mating, meaning taller people tend to have
children with other taller people, so if pleiotropy causes taller people to be
more intelligent, the tendency to marry taller people additionally amplifies
the correlation.

Keep in mind though that the correlation, while very highly statistically
significant and replicated over many data sets, is not very strong: at r =
0.2, the r^2 is 0.04, so only 4% of variance in intelligence is explained by
height. Thus, it is not clear whether the height-IQ correlation plays
a significant role as a confounder in the study above.
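
The r = 0.2 to r^2 = 0.04 step can be checked with a small simulation. This is only a sketch: the two variables are synthetic standard normals constructed to have correlation r, not real height or IQ data, and `variance_explained` is a hypothetical helper name.

```python
import random

def variance_explained(r=0.2, n=50_000, seed=1):
    """Simulate two variables with correlation r and return the sample r^2."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        # y has unit variance and correlation r with x by construction.
        y = r * x + rng.gauss(0, (1 - r * r) ** 0.5)
        xs.append(x)
        ys.append(y)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)

print(f"share of variance explained: {variance_explained():.3f}")
```

The result should land close to 0.04, leaving roughly 96% of the variance unexplained by the other variable.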

~~~
iguy
But the numbers you describe are for adult height, right? Wouldn't you expect
a much stronger effect if your sample was a mix of (say) 4-6th-grade kids? I
think that's what GP is saying.

~~~
xyzzyz
No, this is also found in children. See e.g. this[1] study, which also by
design (it's a twin study) rules out the hypothesis that the effect is mostly
environmental.

[1] -
[https://www.ncbi.nlm.nih.gov/pubmed/17081263](https://www.ncbi.nlm.nih.gov/pubmed/17081263)

~~~
iguy
Thanks. That there is signal in which twin is taller at precisely age 12 is
interesting, and seems like the same phenomenon as the adult effects.

But I'm saying something much simpler. In a school class, there's another
effect, that the oldest may be a year or more older than the youngest. The mix
of ages in a classroom will produce a correlation all by itself. It seems to
me this would be a much bigger effect: 7th-graders are quite a bit taller, and
better on tests, than 6th-graders.
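
A rough sketch of the age-mix effect, with invented numbers: ages spread over two years, and height and test score each depending on age plus independent noise, so there is no direct link between height and score at all. All slopes and noise levels are assumptions for illustration, and `classroom_mix` is a hypothetical name.

```python
import random

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def classroom_mix(n=20_000, seed=2):
    """Return (pooled correlation, correlation after removing the age trend)."""
    rng = random.Random(seed)
    ages, heights, scores = [], [], []
    for _ in range(n):
        age = rng.uniform(11.0, 13.0)                      # mixed-age sample (assumed)
        heights.append(90 + 5 * age + rng.gauss(0, 5))     # cm, grows with age
        scores.append(10 * age + rng.gauss(0, 10))         # test score, grows with age
        ages.append(age)
    pooled = pearson(heights, scores)
    # Subtract the age trend (known here because we built the data).
    h_res = [h - 5 * a for h, a in zip(heights, ages)]
    s_res = [s - 10 * a for s, a in zip(scores, ages)]
    return pooled, pearson(h_res, s_res)

pooled, adjusted = classroom_mix()
print(f"pooled: {pooled:.3f}, age-adjusted: {adjusted:.3f}")
```

In this toy setup the pooled correlation is clearly positive while the age-adjusted one is near zero, so the mix of ages alone produces a height-achievement correlation.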

~~~
xyzzyz
Ah, that's a good point. It would probably require looking deeper into the
studies to see how it is controlled for, if at all.

------
jeffdavis
If somebody pays for something, they want to see the results. If the effects
are local, you see with your eyes and you hear things second hand; a kind of
soft feedback. If the effects are far away, you probably want to see a hard
number, because you don't get the same kind of soft feedback.

Is the number (teacher score) useful? Maybe not. But you can't simply take
away a number, resulting in no feedback. You have to either come up with a new
number that is useful, or localize it enough that people can see the results.
Otherwise people won't want to pay.

That's why these weird bureaucratic numbers usually come from state or federal
requirements. People see the big pile of money going somewhere and they want
"accountability" because they can't see the results for themselves.

Let's stop shuffling around so much money, laundering it through state and
federal agencies. The money, if it comes back, always comes with strings
attached. This goes for schools, highway funds, and many other things.

------
jotakami
My son started kindergarten this year, and unfortunately we are zoned into one
of the worst schools in the district. He’s a bright kid, already knows
everything they would teach him, so we weren’t concerned about academics, but
let me tell you—his behavior at home, and especially his language use, has
taken a really troubling turn. He is obviously picking up the behavior of some
of the children he goes to school with.

I read a book not too long ago entitled “The Nurture Assumption,” by Judith
Harris. It influenced me more than anything I’ve read as a parent, because it
is so obviously true. Her thesis is that everyone assumes that adults have a
significant impact on a child’s future, yet there is basically zero evidence
to support such an assumption.

Peer groups are the dominant influence on a child’s personality and
development, full stop. This is probably hard for educators to accept because
there’s really nothing that can be done about it from their perspective.

~~~
Ididntdothis
I don’t understand why people often want to reduce something to only one
factor that explains everything. When I look back at my childhood I was
influenced positively and negatively by my parents, siblings, teachers, my
peers, neighbors and a lot of other people. They all had an influence at times
that deeply impacted my life.

~~~
jotakami
There’s a difference between “one factor that explains everything” and a
dominant factor that explains most of the variation.

~~~
Ididntdothis
I just doubt that in many cases there is a dominant factor that explains most
but I think that instead there are often several factors of more or less equal
importance. We made that mistake already in nutrition ("fat is the main
problem") and it caused decades of problems. The proportion of these factors
is also highly individual.

~~~
baddox
Why do you believe that it’s more likely for several factors to have roughly
equal importance than for one factor to be the most important?

As an aside, the mistake in nutrition was simply that it wasn’t true. And it
also wasn’t a “mistake,” it was a lie that the sugar industry paid to spread.

~~~
Yessing
> Why do you believe that it’s more likely for several factors to have roughly
> equal importance than for one factor to be the most important?

If you pick n numbers at random, it's not likely that one of them is bigger
than half of their sum.

~~~
klipt
That _really_ depends on the distribution of said numbers. You're probably
assuming uniform distribution, but there are power law distributions like
wealth where the 26 richest people own as much as the bottom 3.8 billion
combined.

~~~
Yessing
To be honest, I don't know much about probability.

But my intuition says that if you pick all variables from the same
distribution, this should still hold.

At first, we don't know anything about the factors, so they are assigned a
random importance. Or am I missing something?
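
One way to probe that intuition is to simulate it. The sketch below draws n = 10 numbers from the same distribution many times and counts how often the largest exceeds half of the sum. The choice of n, the uniform distribution, and the Pareto shape parameter 0.5 are all assumptions made for illustration; `share_dominated` is a hypothetical name.

```python
import random

def share_dominated(draw, n=10, trials=5_000, seed=3):
    """Fraction of trials in which one of n draws exceeds half of their sum."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [draw(rng) for _ in range(n)]
        if max(xs) > sum(xs) / 2:
            hits += 1
    return hits / trials

# Same distribution for every factor in each case; only its shape differs.
uniform_share = share_dominated(lambda rng: rng.random())
heavy_share = share_dominated(lambda rng: rng.paretovariate(0.5))  # very heavy tail
print(f"uniform: {uniform_share:.3f}, heavy-tailed: {heavy_share:.3f}")
```

With uniform draws a single dominant number almost never appears, but with a heavy enough tail it appears in a large share of trials, even though every number comes from the same distribution.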

------
mcguire
Hypothesis: Teachers, _individually,_ don't have statistically measurable
effects. Students in a given cohort are mostly identical, and are mostly
exposed to identical environments, such that influences on them average to
some mean. As a result, minor influences, as long as they act reasonably
consistently, are elevated to statistical significance.

~~~
datashow
On average, it would be true, meaning most teachers do not have a significantly
different effect on most students. However, there are always particularly good
and bad teachers.

------
nicoburns
I'd strongly question our ability to measure educational outcomes. If they're
basing that on standardised testing, then I'd argue that pretty much
invalidates any conclusions of the research.

~~~
dissidents
Do you think it's inherently impossible to measure educational outcomes or
just difficult with the tools that we have today?

~~~
nicoburns
I think it's very difficult to come up with accurate objective measurement
methods (nobody's done it well yet). The best assessments tend to be
subjective teacher-based assessments (for similar reasons as to why hiring
tends to be done by subjective personal assessment - it's hard if not
impossible to define the criteria for "good" upfront. Someone may approach
things in a completely different way that nobody had thought to measure). And
this is before you get into the biases of the people drawing up the assessment
criteria etc.

------
sadness2
It seems to be universal that basing people's career progression on metrics is
harmful. People try to do this with engineers as a substitute for
understanding what's happening on the ground and having an actual
relationship. At best, metrics can raise a flag for you to ask empowering
questions.

------
bhouston
This is also the tale relevant to most A/B testing.

------
drited
Ruling out teacher impact on height without testing is perhaps not strictly
scientific?

We really don't know what is true unless we test it. I know it sounds
ridiculous but it's not beyond the realms of possibility that teacher opinion
of student capabilities influences social status which in turn influences
release of growth hormones which influence height. Aren't there other species
which grow when they rise to the top of the pack as growth hormones are
released to assist in maintaining that status? Not saying it's likely but if
left untested we cannot rule that out. There have been plenty of surprises in
science when we went and actually tested what we previously presumed to be
implausible.

~~~
youzicha
Or perhaps good teachers are also better at getting social services to poor
students, so they can get food stamps, eat better, and grow taller?

[https://xhxhxhx.tumblr.com/post/189336550187/teacher-effects...](https://xhxhxhx.tumblr.com/post/189336550187/teacher-effects-on-student-achievement-and-height)

------
snarf21
"When a measure becomes a target, it ceases to be a good measure." -
Goodhart's Law

------
meekstro
I know a brilliant junior school teacher teaching years 1 to 3.

He constantly configures his classroom to maximise learning continuity for the
entire class. It is a massive challenge balancing out the problems students
face physically or outside the classroom. Teaching has become the easy part of
teaching.

The problem with student learning is that it is affected by three things:

- Emotional Health
- Physical Health
- Teaching Quality

To build an educated adult you need to optimise all three variables.

The problem with rewarding teachers is that their students aren't objectively
assessed on all three dimensions.

In every district in the world there should be three-person educational
assessment teams that consist of an emotional health assessor, a physical
health assessor and an academic progress assessor that work as a team to
objectively assess the students and work with the teacher and support agencies
to eliminate learning barriers.

Any students that need help should get referred to the correct service or
program as quickly as possible. The teacher should focus purely on teaching
and integrating support agencies.

Teachers deserve to be rewarded for their teaching ability in an objective
way. As it happens there is a lot of unnecessary subjectivity and people blame
teachers for not being social workers, counsellors and doctors. The education
of every child in a class deteriorates as these three dimensions become less
homogenous because the maximised learning continuity of the group depends on a
limited resource. It is teachers who can creatively balance them while
engagingly delivering the curriculum that are creating unrewarded alpha in the
educational space.

Paying $250,000 a year in salary to a group of 3 specialist objective
assessors with the data linked to the health and social services would get a
better return on investment than any other expenditure a government could make
in lifting the health, education and future earning potential of children.

If a child isn't learning there is always a reason. The faster you address
that reason the more value you create. Sometimes it's the teacher, sometimes
it isn't but for the child's sake we should definitely diagnose the reason.

------
droithomme
There's a significant positive correlation between height and IQ:
[https://en.wikipedia.org/wiki/Height_and_intelligence](https://en.wikipedia.org/wiki/Height_and_intelligence)

There's also a correlation between IQ and academic achievement.

So, if we did decide for some (undoubtedly bad) reason to reward teachers for
happening to have more than average tall students it would be comparable to
rewarding them for having more than average high achieving students.

------
paggle
Anyone who has done A/B testing knows that you need tens of thousands of data
points to pick up any causal relationship unless it’s a huge effect like the
retention impact of telling your customers to lick a donkey’s balls. No way
you could pick it up for a teacher with 30 kids a year except over that
teacher's entire career.
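
For a rough sense of scale, here is a back-of-the-envelope sketch using the textbook normal-approximation sample size for comparing two group means. It is not how value-added models are actually estimated. The 5% significance level, 80% power, the 30 students per year, and the effect sizes (borrowed from the 0.10 to 0.30σ range quoted from the paper elsewhere in this thread) are all assumptions, and `n_per_group` is a hypothetical helper.

```python
import math
from statistics import NormalDist

def n_per_group(effect_sd, alpha=0.05, power=0.8):
    """Students needed per group to detect a mean difference of
    `effect_sd` standard deviations (two-sided test, normal approximation)."""
    z = NormalDist().inv_cdf
    z_total = z(1 - alpha / 2) + z(power)
    return math.ceil(2 * (z_total / effect_sd) ** 2)

for effect in (0.30, 0.10):
    n = n_per_group(effect)
    years = n / 30  # assumed class size of 30 students per year
    print(f"effect {effect:.2f} sd: {n} students per group, about {years:.0f} years of classes")
```

Because the required sample grows with the inverse square of the effect size, a threefold smaller effect needs about nine times as many students.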

------
petrogradphilos
> 1 Introduction

> The increased availability of data linking students to teachers has made it
> possible to estimate the contribution teachers make to student achievement.

There was some data available before (or the sentence would not have used the
word "increased"). Why wasn't it possible to estimate with that?

> By nearly all accounts, this contribution is large.

It goes on to talk about what "large" means:

> Estimates of the impact of a one standard deviation (σ) increase in teacher
> “value-added” on math and reading achievement typically range from 0.10 to
> 0.30σ, which suggest that a student assigned to a more effective teacher
> will experience nearly a year's more learning than a student assigned to an
> less effective teacher (Hanushek & Rivkin 2010;...).

(Typo: "an less effective" should be "a less effective".)

A "range from 0.10 to 0.30σ" doesn't make sense. A Greek lowercase sigma (σ)
is used to represent one standard deviation, but the sigma is used only on the
upper end of the range. Should it have been from 0.10σ to 0.30σ?

And how are they measuring the impact on achievement of an increase in teacher
"value-added", anyway? It says that estimates of the impact "typically range
from 0.10 to 0.30σ", but it doesn't say what units those figures are in.

The sentence goes on to say that those unit-less estimates "suggest" that "a
student assigned to a more effective teacher will experience nearly a year's
more learning than a student assigned to an less effective teacher". Over what
time period? That is, how long does a student have to study under a "more
effective teacher" to get "a year's more learning"? 1 week? 12 years? It
doesn't say.

And finally, how do those unit-less estimates "suggest" an impact measured in
learning time? It doesn't say.

~~~
6gvONxR4sf7o
You are being absurdly pedantic. For example

>A "range from 0.10 to 0.30σ" doesn't make sense.

Out loud, you might read it as "oh point one to oh point three sigma." And
it's fine. Ever written a phrase like "it's very expensive, costing somewhere
around $3-4M" with one "$" and one "M" but two numbers?

------
brohee
Can't economists learn LaTeX? The typesetting is so horrible it's putting you
off from reading it...

------
omarhaneef
I think the message to parents is loud and clear: send your kids to the
teacher that will make them tallest.

~~~
_benj
You have made me laugh dear sir or ma'am! :P

------
2stop
Research done more than 20 years ago (and repeated and confirmed) already
found that teachers have very little effect on educational outcomes. The
biggest correlative factor is socioeconomic status and education level of the
parents. Everything else is hit or miss.

But hey, let’s publish a pointless paper... because academia.

~~~
sethammons
I think this is an important paper. Even or especially if it confirms other
studies. When I was a teacher in inner city San Bernardino, we were held
accountable for student achievement ignoring student ability, socioeconomic
situation, home life, or their capacity to disrupt the class. In large part,
this is because of studies that say a teacher has outsized effect on student
test scores. Now, I think a special kind of person can connect better with
some kids and help them achieve. I think that is not something you can mass
produce. With more studies showing teachers are not as critical as schools
want to believe, maybe we can start focusing on other metrics to gauge
success.

~~~
ckcheng
> With more studies showing teachers are not as critical as schools want to
> believe, maybe we can start focusing on other metrics to gauge success.

Like cost of hiring? Seriously, if teachers are not as critical, why not hire
fewer or pay less salary... any half decent person in the classroom will do.

I think the study is about right that teachers are not as critical as they'd
like to believe. I just don't know what to think of the implications...

~~~
sethammons
I still think a bad teacher is absolutely terrible and that teaching is not
easy. I don't look at teacher salaries and think they are too high at all. I
think we are bad at measuring what makes a good teacher. I just know it is not
based on standardized test scores.

When I was a teacher, I had zero power to remove a disruptive or inattentive
student from my class. I had 9th graders who could not deal with negative
numbers being expected to do well on state Algebra tests that are of arguable
quality. Even if they made substantial progress, they would not be to grade
level.

I can still remember one student who, after nearly a year of not being bothered
to pay any shred of attention in class, paid enough attention during the final
review to catch the last steps of solving an equation:

Me: Alright, and combining like terms, what is 10x - 18x?

Student, suddenly paying attention: But wait, you can't subtract a bigger
number from a smaller number!

I can't stop the class on nearly the last week of 9th grade algebra to help
this student understand negative numbers. And they are not interested in
coming in to get help outside of normal class. Parents were not interested in
making sure this kid got through school.

Not so fun fact: at this school, less than 4% would go on to any post-secondary
education. Of them, about 2% would go on to finish a degree. Something like a
90% transitory rate (meaning that most students who started 9th grade at this
school would transfer to a different school before graduation, if they made it
that far). I could go on, but there was an underlying cultural issue that did
not value education. That trend is hard to buck: I had third generation gang
members, kids raising their siblings because parents (often a single parent)
were working multiple jobs, kids dealing with daily violence, one kid was
stabbed to death right off of campus. The majority of these kids and their
families have no experience seeing what an education can do for their
prospects. Add onto that a tough employment market (inner city San
Bernardino!), and for those who did have siblings who got a degree, those
siblings often couldn't find a job.

All this to say, I don't think lowering salary of teachers would help. Maybe
lowering the cost of administrators would help. At this same time, there were
more administrative personnel in the district than teachers, something I could
never understand.

------
thepete2
What if bad students get hit on the head thus decreasing their height?

~~~
dang
Please don't do this here.

