
Mean Squared Terror – GridSearch Is Not Enough - amrrs
https://koaning.io/posts/mean-squared-terror/
======
gotoeleven
This article is very dumb. Models are descriptions of data--they describe how
things are to some level of accuracy, not how things ought to be according to
one or another concept of cosmic justice.

From the article: "Notice how the grid search would tell us that there should
be a big payment gap between the two groups even if they display high levels
of skill"

Only an idiot would say that the output of a model has anything to do with the
moral question of how things should be. Using "biased" inputs to your model
may or may not help you describe your data and answer questions about what it
is, but the inputs have nothing to do with what ought to be.

~~~
sbpayne
I find it interesting that you said: “Only an idiot would say the output of a
model has anything to do with the moral question of how things should be.”

My understanding of the article is that people often share this sentiment, but
believe this property implies that an optimization procedure will only reflect
bias rather than amplify it. The author then goes on to show that this
perceived implication is incorrect.

I.e. your algorithm not understanding an existing bias does not mean it won’t
amplify bias — it means it won’t care, which is precisely what allows it to
amplify the bias. Because the bias exists in the underlying data, your
algorithm can discover and overfit to that bias. Without a “regularizer” to
control for this, it’s probably a bad idea to think the algorithm does not
amplify bias.
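
To make the "regularizer" bit concrete, here is a toy sketch (my own
construction, not the article's code; the data-generating process and the
penalty weight lam are made-up assumptions): ordinary MSE plus a penalty on
the gap between the two groups' mean predictions, fit by gradient descent.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1000
    skill = rng.uniform(0, 10, n)
    group = rng.integers(0, 2, n)                         # protected attribute
    salary = 3 * skill + 5 * group + rng.normal(0, 1, n)  # bias baked into the data

    X = np.column_stack([skill, group, np.ones(n)])
    w = np.zeros(3)
    lam = 10.0  # fairness penalty strength; lam = 0 recovers plain MSE

    for _ in range(5000):
        pred = X @ w
        grad_mse = 2 * X.T @ (pred - salary) / n
        # penalty: squared gap between the groups' mean predictions
        gap = pred[group == 1].mean() - pred[group == 0].mean()
        d_gap = X[group == 1].mean(axis=0) - X[group == 0].mean(axis=0)
        w -= 0.001 * (grad_mse + lam * 2 * gap * d_gap)

With lam = 0 the fit happily reproduces the 5-unit gap; cranking lam up trades
a little MSE for shrinking it. The point is only that the trade-off has to be
stated explicitly somewhere.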

So if my understanding of your comment and the author’s post is right, I think
you would both agree that at the end of the day something needs to explicitly
control for the moral question of how things should be, because the general
optimization procedures we use will not. Is that a fair statement?

~~~
gotoeleven
How does the mere existence of a model amplify any bias? The example the
author gives is that conditioning a linear model on some "biased" variable
gives a more accurate model that predicts large differences between these
groups. But the large difference between the groups is right there in the
data. It's not amplified in any way. And then somehow this is the modeller's
fault for choosing a naughty variable, as if the output of a model has
anything to do with the world the modeller wishes existed.

The author's beef should be with the idiots who would take this model and then
say "yep, looks like since these two groups have different regression lines
it's totally great that we see large differences between these groups."

~~~
sbpayne
I was mostly referring to this bit from the article: “Notice how the model
that has the lowest mean squared error is also the model that causes the most
bias between groups for higher skill levels. This increase in bias cannot be
blamed merely on the data. It’s the choice of the model that increases this
bias which is the responsibility of the algorithm designer.”

My reading of the plots is that the difference between the groups seems to
grow beyond what is present in the data. Do you not think this is amplified?
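
Whatever the right reading of the plots, it's easy to see that MSE-driven
selection never even looks at the gap. A toy sketch (mine, not the article's;
the data-generating process is assumed) that scores candidates on both
criteria:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(1)
    n = 2000
    skill = rng.uniform(0, 10, n)
    group = rng.integers(0, 2, n)
    salary = 3 * skill + 2 * group + rng.normal(0, 3, n)

    candidates = {
        "skill_only": np.column_stack([skill]),
        "skill_plus_group": np.column_stack([skill, group]),
    }
    for name, X in candidates.items():
        pred = LinearRegression().fit(X, salary).predict(X)
        mse = np.mean((pred - salary) ** 2)
        high = skill > 8  # "higher skill levels"
        gap = pred[high & (group == 1)].mean() - pred[high & (group == 0)].mean()
        print(f"{name}: mse={mse:.2f}  high-skill gap={gap:.2f}")

A grid search as usually run only ranks on the first number; the second one
never enters the procedure unless the designer puts it there.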

As for conditioning on a biased variable: this post uses a contrived example,
for sure. But the same thing happens with variables that merely correlate with
the “naughty” variable (perhaps the author should have shown this explicitly
to drive the point home?).

Removing all variables that correlate with “naughty variables” is a pretty
difficult task in itself without something to detect the correlation and tell
you. At least that’s been my experience.
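
One cheap detector I've found useful: try to predict the protected attribute
from the remaining features. If a classifier beats chance by a wide margin,
something is acting as a proxy. A sketch with made-up feature names:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    rng = np.random.default_rng(2)
    n = 2000
    group = rng.integers(0, 2, n)
    # hypothetical proxy: shifts with the protected attribute
    zip_income = group * 1.5 + rng.normal(0, 1, n)
    tenure = rng.normal(5, 2, n)  # hypothetical non-proxy

    X = np.column_stack([zip_income, tenure])
    acc = cross_val_score(LogisticRegression(), X, group, cv=5).mean()
    print(f"protected attribute recoverable with accuracy ~{acc:.2f}")
    # ~0.5 would mean no leakage; here it lands well above that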

~~~
mr_toad
> This increase in bias cannot be blamed merely on the data.

This is debatable in this case. The grouped model might be a better fit to the
data: it indicates a higher bias in salaries at higher skill levels.

It sounds completely plausible that the gender bias for unskilled workers is
smaller than that for highly skilled workers.

Models can be under-fit as well as over-fit. Just using a dummy variable for
gender might not properly capture interactions like this.
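
To illustrate the under-fit with a toy sketch (the data-generating process is
my assumption, not the article's data): an additive dummy forces a constant
gap, while an interaction term lets the gap scale with skill.

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(3)
    n = 2000
    skill = rng.uniform(0, 10, n)
    gender = rng.integers(0, 2, n)
    # assumed ground truth: the gap grows with skill (0 at skill 0, 5 at skill 10)
    salary = 3 * skill + 0.5 * skill * gender + rng.normal(0, 1, n)

    X_dummy = np.column_stack([skill, gender])                  # additive only
    X_inter = np.column_stack([skill, gender, skill * gender])  # + interaction

    for name, X in [("dummy", X_dummy), ("interaction", X_inter)]:
        m = LinearRegression().fit(X, salary)
        print(name, "R^2 =", round(m.score(X, salary), 4))

The interaction model fits noticeably better here, because the constant-gap
dummy is the wrong shape for skill-dependent bias.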

But of course it’s always possible that this is just an artefact of the
training data.

------
AbrahamParangi
There is no escaping this tyranny: your model will fit your data. There are no
"fair" ways to change this. The only way to avoid fitting your model to the
data is to avoid training it.

~~~
hansvm
There are plenty of options if you know the kinds of bias you want to protect
against.

I'm having trouble finding the project to reference, but one solution I saw
recently used a data preprocessing (reweighting) step that would remove bias
in any of a broad class of models built on the processed data.

The crux of the idea was that you can pick any properties your model should
not be biased against (e.g. gender), additionally _allow_ such a bias if it
came through other specified properties (e.g. gender bias in college
admissions is arguably fine if it exists solely because one group only applies
to competitive majors), and then find the closest dataset to yours such that a
model trained on it avoids the unwanted biases.

The key advantage over more restrictive methods (like ignoring any combination
of attributes that leaks information about the protected categories) is that
you throw away less information and thus get better models with the same bias
reduction, subject to the error introduced by finding the "closest" dataset.
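
Not that project's method, but the simplest member of the same family I know
of is Kamiran & Calders style reweighing: weight each row so that the
protected attribute and the label come out statistically independent under the
weights. A minimal sketch (column names are illustrative):

    import numpy as np

    def reweigh(group, label):
        """Per-row weight(a, y) = P(a) * P(y) / P(a, y)."""
        group, label = np.asarray(group), np.asarray(label)
        w = np.ones(len(group))
        for a in np.unique(group):
            for y in np.unique(label):
                cell = (group == a) & (label == y)
                if cell.any():
                    w[cell] = (group == a).mean() * (label == y).mean() / cell.mean()
        return w

    # usage: most sklearn estimators accept these directly, e.g.
    #   model.fit(X, y, sample_weight=reweigh(group, y))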

One of the bigger weaknesses is that it can only protect you against biases
that can be explicitly encoded into the model. When we later discover the next
<obviously biased and wrong> thing that our models are doing, we won't have
protected against it.

I think they used the terminology "database repair problem" a lot, if anyone
wants to go digging for the source article.

Edit: I was thinking of this preprint: https://arxiv.org/abs/1902.08283

