
Ask HN: Should tech ban the use of gender/race features in ML? - daenz
As machine learning becomes more commonplace, I think it's important to consider the repercussions of including gender, sex, race, ethnicity, etc. features in the training data.  Many ML models have "greedy" optimization algorithms, resulting in models that may be eager to prioritize some features, like gender or race, over more nuanced combinations of features, like poverty, environment, etc.  The resulting models could then be used as evidence to reinforce stereotypes about groups of people.

What can we do about this?  The only solution I see is to explicitly ban the use of such features in models, to force ML researchers to dig deeper into the datasets, and to prevent the resulting models from containing incendiary biases.
======
eesmith
An outright ban is not reasonable.

Medically speaking, gender or race can be important. In the US the rate of
sickle-cell disease is highest among those of sub-Saharan African descent,
likely related to malaria resistance.

Or, suppose you want to analyze police stop-and-frisk records to see if there
is evidence of racial bias. How do you do so without using race as a feature?
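
To make the stop-and-frisk point concrete, here is a minimal sketch of the kind of audit I mean. Everything in it is an assumption for illustration: the function name, the `group` and `contraband_found` field names, and the toy records are all made up, and a real analysis would need far more care. The point is only that the comparison cannot be computed at all once the group field is removed.

```python
from collections import defaultdict


def hit_rate_by_group(records, group_key="group", hit_key="contraband_found"):
    """Return {group: fraction of stops in that group that found contraband}.

    Field names are hypothetical; real records will be structured differently.
    """
    stops = defaultdict(int)
    hits = defaultdict(int)
    for rec in records:
        g = rec[group_key]
        stops[g] += 1
        if rec[hit_key]:
            hits[g] += 1
    return {g: hits[g] / stops[g] for g in stops}


# Made-up toy records, NOT real data; group labels are placeholders.
toy_records = [
    {"group": "A", "contraband_found": True},
    {"group": "A", "contraband_found": False},
    {"group": "B", "contraband_found": False},
    {"group": "B", "contraband_found": False},
    {"group": "B", "contraband_found": False},
    {"group": "B", "contraband_found": True},
]

rates = hit_rate_by_group(toy_records)
print(rates)
```

A large gap in hit rates between groups is one signal people look at when asking whether one group is being stopped on weaker grounds. It is not proof by itself, but you need the demographic field just to ask the question.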

This sort of topic can be covered in an ethics course. ACM members are
expected to follow the ACM Code of Ethics and Professional Conduct, at
[https://www.acm.org/about-acm/acm-code-of-ethics-and-profess...](https://www.acm.org/about-acm/acm-code-of-ethics-and-professional-conduct).
See also
[https://ethics.acm.org/code-of-ethics/using-the-code/](https://ethics.acm.org/code-of-ethics/using-the-code/).

That said, it's a long way from having such a code to understanding how to
follow it. One way is through ethics courses. ACM accredited schools are, I
believe, required to include an ethics course. Your example is only one of many
possible topics.

However, I don't think that those ethics guidelines are that influential.

Even less so if people think the goal of education is to train people for the
needs of commerce and industry. Ethics guidelines exist to help identify which
of those needs are immoral - not something most companies want!

------
tree_of_item
How in the world would you ban this? Perhaps you could convince governments
not to use such features themselves, but there's no way you can ban this in
general.

------
smt88
This isn't a new problem, is it? It's just that the models are more complex.

Science shouldn't ban any type of input data, as long as it can be considered
scientific. There's no scientific standard for determining race, so maybe the
data should be excluded on that basis instead.

I'll point out that self-reported demographic data can be useful in health and
social science, even if it's not scientifically tied to actual genes.

