
Neural networks reveal gender bias in language - yunque
https://www.technologyreview.com/s/602025/how-vector-space-mathematics-reveals-the-hidden-sexism-in-language/
======
Jarwain
I'm not a big fan of the title; it implies the bias is inherent in the
language. The article, however, attributes the bias to the input data: the
biases that professional journalists write into their news articles.

Although that makes me wonder whether the bias is due to the journalists
themselves, or whether it reflects inequalities in real life.

~~~
Scaevolus
Many biases are real and self-reinforcing. In the US, doctors are ~70% male
and nurses are ~90% female.

~~~
GFK_of_xmaspast
"Real" in the sense that any social construct is real.

------
bbctol
Was this surprising? What exactly did it find?

I haven't been able to track down the original source of the "Machine learning
is like money laundering for bias" quote, but I worry about it in situations
like this. To be clear, I think the phenomenon the article is discussing is
real and valid: the implicit linking of genders to occupations is pretty well-
documented, and I don't doubt they've found _something_. But it's a little
difficult to say what's actually been contributed to the literature here.

That said, their work on manually "de-biasing" the embeddings by applying
mathematical transformations to the word space is definitely interesting...
I'm just not sure what it amounts to yet.

~~~
skybrian
The contribution to the literature is showing how to measure and remove one
source of bias from word2vec. Were you expecting something else?

------
grownseed
This is really interesting. Looking at the references in the paper, the first
one is about racial bias. A question I've often asked myself, in various
forms: rather than having the researchers specify the bias up front, would it
be possible to let the biases emerge from the model? Put another way, can a
system be designed so that it automatically corrects (or at least identifies)
biases inherent in its training data? The implications of being able to
expose biases we're oblivious to would be interesting in and of themselves.
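
One way that could look in practice (a sketch, not the paper's exact method,
and the pair list below is made up for illustration): stack the difference
vectors of candidate word pairs and run PCA. If one component explains most
of the variance across many pairs, that shared direction is a candidate
bias axis worth inspecting or neutralizing:

    import numpy as np

    # Hypothetical unit-normalized embeddings (word -> 300-d array).
    rng = np.random.default_rng(1)
    words = ["he", "she", "man", "woman", "king", "queen", "brother", "sister"]
    vecs = {w: rng.standard_normal(300) for w in words}
    vecs = {w: v / np.linalg.norm(v) for w, v in vecs.items()}

    # Candidate definitional pairs -- still human-chosen, but far cheaper
    # than hand-labeling every word whose neighborhood might be biased.
    pairs = [("he", "she"), ("man", "woman"),
             ("king", "queen"), ("brother", "sister")]

    # PCA over the pair difference vectors via SVD.
    diffs = np.stack([vecs[a] - vecs[b] for a, b in pairs])
    diffs -= diffs.mean(axis=0)
    _, s, vt = np.linalg.svd(diffs, full_matrices=False)

    explained = s**2 / np.sum(s**2)
    print(explained)   # with real embeddings, one dominant component suggests
    bias_axis = vt[0]  # a shared bias axis; with random toy data it won't

The open part is finding the pairs themselves automatically: the model only
exposes directions, and a human still has to decide which of them count as
bias rather than legitimate semantics.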

