The results are not too surprising: models for learning word embeddings, such as GloVe and word2vec, learn vector representations that capture the relationships between words in their training corpora. If a corpus is biased, the embeddings learned from it will necessarily be biased too.
However, the implications of this finding are wide-ranging. For starters, any machine learning system that relies on word embeddings learned from biased corpora to make predictions (or to make decisions!) will necessarily be biased in favor of certain groups of people and against others.
Moreover, it's not obvious to me how one would go about obtaining "unbiased" corpora without somehow relying on subjective societal values that are different everywhere and continually evolving. You have raised an important, non-trivial problem.
This is not true.
Here's an oversimplified example. Suppose your machine learning system wants to predict something, e.g. loan repayment probabilities. One input might be a written evaluation by a loan officer.
When trained on a corpus of group-X data, the predicted probability might be:
pred = a*written_evaluation + other_factors
However, now let's suppose the written evaluation is biased to the tune of 25% against group Y, i.e., group Y's written scores are 25% lower than group X's.
Then a new predictor which includes pairwise terms, trained on a corpus covering both groups X and Y, will work out to be:
pred = a*written_evaluation + 0.33*a*written_evaluation*isY + other_factors
(The 0.33 comes from undoing the 25% discount: 1/0.75 - 1 ≈ 0.33.)
Interestingly, everyone's favorite bogeyman, namely redundant encoding ( http://deliprao.com/archives/129 ), will actually help fix this problem, even if you don't explicitly include the biasing factors in the model.
How do you find out that the written evaluation is biased "to the tune of 25% against group Y?"
THAT is the problem. It's not obvious to me how you would go about determining written evaluations are biased (and to what extent!) against group Y without somehow relying on subjective societal values that are different everywhere and continually evolving.
You build a sufficiently expressive statistical model and include the potentially biasing factors as features in the model. Then the model will correct the bias all by itself because correcting for bias maximizes accuracy.
In the example above, you find the bias by doing linear regression and including (written_evaluation x isY) as a term. Least squares will handle the rest. If you're using something fancier than least squares (e.g. deep neural networks, SVMs with interesting kernels), you probably don't even need to explicitly include potential debiasing terms - the model will do it for you.
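This least-squares claim is easy to check numerically. Here's a minimal sketch in which every number (the coefficient a = 0.02, the score scale, the noise level) is invented for illustration: simulate evaluations discounted 25% for group Y, include the interaction term, and the fit recovers both the true coefficient and the correction factor.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Hypothetical setup: true (unbiased) evaluation scores and group membership.
true_eval = rng.normal(50, 10, n)
is_y = rng.integers(0, 2, n)  # 1 if the applicant belongs to group Y

# Loan officers score group Y 25% lower than group X.
written_eval = true_eval * np.where(is_y == 1, 0.75, 1.0)

# Repayment depends only on the true evaluation (plus noise); a is invented.
a = 0.02
outcome = a * true_eval + rng.normal(0, 0.1, n)

# Regress on the biased score plus the (written_evaluation x isY) interaction.
X = np.column_stack([written_eval, written_eval * is_y])
coef, *_ = np.linalg.lstsq(X, outcome, rcond=None)

print(coef)  # ≈ [0.02, 0.0067]: the interaction recovers a*(1/0.75 - 1)
```

The fitted interaction coefficient works out to a*(1/0.75 - 1), i.e. the 0.33 correction factor in the example above, with nothing but least squares doing the work.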
I give toy examples (designed to illustrate the point and also be easy to understand) here: https://www.chrisstucchio.com/blog/2016/alien_intelligences_...
This paper does the same thing - it discovers that standard predictors of college performance (grades, GPA) are biased in favor of blacks and men, against Asians and women, and the model itself fixes these biases: http://ftp.iza.org/dp8733.pdf
Statistics turns fixing racism into a math problem.
If the topic were anything less emotionally charged, you wouldn't even think twice about it. If I suggested including `isMobile`, `isDesktop` and `isTablet` as features in an ad-targeting algorithm to deal with the fact that users on mobile and desktop browse differently, you'd yawn.
Who decides what the "potentially biasing factors" are? How is that decided without somehow relying on subjective societal values?
Factors that no one thought were biased in the past are considered biased today; factors that no one thinks are biased today may be considered biased in the future; and factors that you and I consider biased today may not be considered biased by people in other parts of the world. I don't know how one would go about finding those "potentially biasing factors" without relying on subjective societal values that are different everywhere and always evolving.
Go read the Wikipedia article on the topic: https://en.wikipedia.org/wiki/Omitted-variable_bias
It's true that as we learn more things we discover new predictive factors. That doesn't make them subjective. A lung cancer model that excludes smoking is not subjective, it's just wrong. And the way to fix the model is to add smoking as a feature and re-run your regression.
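To make the omitted-variable point concrete, here is a toy simulation (the smoking/age coefficients are invented): omit smoking and the age coefficient silently absorbs smoking's effect; add the feature and re-run the regression, and both coefficients come out right.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Invented data: smoking is correlated with age, and both affect risk.
age = rng.normal(50, 10, n)
smoking = 0.05 * age + rng.normal(0, 1, n)
risk = 1.0 * smoking + 0.02 * age + rng.normal(0, 0.5, n)

# Model that omits smoking: the age coefficient soaks up smoking's effect.
X_wrong = np.column_stack([age, np.ones(n)])
b_wrong, *_ = np.linalg.lstsq(X_wrong, risk, rcond=None)

# Add the omitted variable and re-run: the true coefficients are recovered.
X_right = np.column_stack([age, smoking, np.ones(n)])
b_right, *_ = np.linalg.lstsq(X_right, risk, rcond=None)

print(b_wrong[0])   # ≈ 0.07: inflated by the omitted smoking variable
print(b_right[:2])  # ≈ [0.02, 1.0]: the true coefficients
```

The misspecified model isn't "subjective", it's just wrong in a measurable direction, and the fix is exactly the one described above: add the feature and re-run the regression.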
Again, would you make the same argument you just made if I said I had an accurate ad-targeting model?
Many people today would object a priori to businesses using race as a factor to predict loan default risk, regardless of whether doing that makes the predictions more accurate or not. In many cases, using race as a factor WILL get you in trouble with the law (e.g., redlining is illegal in the US).
Please tell me, how would you predict what factors society will find objectionable in the future (like race today)?
I claimed a paperclip maximizer will maximize paperclips; I didn't claim a paperclip maximizer will determine that the descendants of its creators really wanted it to maximize sticky tape instead.
Now, if you want an algorithm not to use race as a factor, that's also a math problem. Just don't use race as an input and you've solved it. But if you refuse to use race and race is important, then you can't get an optimal outcome. The world simply won't allow you to have everything you want.
A fundamental flaw in modern left wing thought is that it rejects analytical philosophy. Analytical philosophy requires us to think about our tradeoffs carefully - e.g., how many unqualified employees is racial diversity worth? How many bad loans should we make in order to have racial equity?
These are uncomfortable questions - google how angry the phrase "lowering the bar" makes left wing types. If you have an answer to these questions you can simply encode it into the objective function of your ML system and get what you want.
Modern left wing thought refuses to answer these questions and simply takes a religious belief that multiple different objective functions are simultaneously maximizable. But then machine learning systems come along, maximize one objective, and the others aren't maximized. In much the same way, faith healing doesn't work.
The solution here is to actually answer the uncomfortable questions and come up with a coherent ideology, not to double down on faith and declare reality to be "biased".
I don't believe that problem will ever be completely solvable. But I think the way forward is to make these assumptions explicit: when the machine learning system derives a result, program it to additionally return a proof of how it came to that result. Also give the ML system a way to return a list of all axioms and derivation rules it has currently learned, so that these can be independently checked for bias and corrected.
LIME is a nice start, though.
> Google is currently working systems that use a trillion features - I can't imagine returning some kind of rule list for that.
As I wrote: it would already help if the ML system, as a first step, returned the derivation with only the rules that were concretely used for a particular result - that list is much shorter and can thus be checked much more easily.
Are biases distinct from "preferences"? Humans view flowers as more pleasurable than insects, and human language associates flowers with pleasurable terms, states, and so forth.
"Bias" is a term associated with "irrational beliefs", whereas "preferences" more often implies "arbitrary preferences". In particular, biases are held to prevent rational deduction, whereas preferences pose no such stumbling block.
Now, one supposes that question would come down to whether a computer would "know it's a computer, not a person".
If the AI was asked "do you like cockroaches or daisies better?", would it say "why, daisies are prettier and smell better", or would it say "most people like daisies, but I'm a machine, can't smell or taste, and only care about the preferences entered into my control panel" (or something)?
And you'd expect that a thing that merely "parroted" human speech without understanding would give the former answer.
Which is to say, I don't think you are really fully grappling with word-association and word-logic coming together, i.e., "meaning".
I thought the section on "Challenges" could have been stronger. You talk about the bias in "the basic representation of knowledge" used in these systems today -- but it's not like there aren't other possible representations of knowledge. How much effort has gone into exploring knowledge representations (and approaches to deriving semantics) that are designed to highlight and reduce biases?
I dearly hope we will let the bias "humanity is good, don't kill us" stay in.
Unsurprisingly, today's statistical machine translation systems reflect existing gender stereotypes. Translations to English from many gender-neutral languages such as Finnish, Estonian, Hungarian, Persian, and Turkish lead to gender-stereotyped sentences. For example, Google Translate converts these Turkish sentences with genderless pronouns: "O bir doktor. O bir hemşire." to these English sentences: "He is a doctor. She is a nurse." A test of the 50 occupation words used in the results presented in Figure 1 shows that the pronoun is translated to "he" in the majority of cases and "she" in about a quarter of cases; tellingly, we found that the gender association of the word vectors almost perfectly predicts which pronoun will appear in the translation.
See section on "Effects of bias in NLP applications" http://randomwalker.info/publications/language-bias.pdf
In reality, ML systems will generally correct human biases. And by bias, I really do mean bias in the statistical sense - systematically getting things wrong in a particular direction.
Now, this article does a great job of explaining how ML systems might understand the meaning of words, and that meaning may contain bias. However, such a system is merely an input into a separate system which actually makes decisions based on those inputs. Extracting meaning from text makes no decisions of its own. If that latter system wants to make accurate decisions, then the best way to do that is to correct for the aforementioned bias, assuming the bias really is bias as opposed to just a correct but undesirable belief about the world.
I wrote a blog post a while back that goes into this idea with a bit more math, and which demonstrates some real world "learning" algorithms (mostly linear regression) actually correcting biases: https://www.chrisstucchio.com/blog/2016/alien_intelligences_...
In the past era, e.g. 1980-2010, it was possible to use vague emotive language to support all kinds of disparate things. As a concrete example that I touch on in my post (and since racism is the loaded undercurrent of this discussion): we like to pretend that eliminating racial or sexual bias (in the sense of making wrong decisions) will get us proportional representation.
Algorithms are bringing us to an age where analytic philosophy is becoming really important. You can tell an algorithm to give you proportional representation, or you can tell it to be racially/sexually unbiased. But the algorithm will reflect reality and reality may not agree with your assumptions; you can't assume that asking for one will give you the other. So now we get into trolley problems: how much meritocracy/equal opportunity will you sacrifice to get proportional representation?
Unlike before, this is now a choice you need to explicitly and openly state and acknowledge.
Here's the short version:
We view the approach of "debiasing" word embeddings (Bolukbasi et al., 2016) with skepticism. If we view AI as perception followed by action, debiasing alters the AI's perception (and model) of the world, rather than how it acts on that perception. This gives the AI an incomplete understanding of the world. We see debiasing as "fairness through blindness". It has its place, but also important limits: prejudice can creep back in through proxies (although we should note that Bolukbasi et al. (2016) do consider "indirect bias" in their paper). Efforts to fight prejudice at the level of the initial representation will necessarily hurt meaning and accuracy, and will themselves be hard to adapt as societal understanding of fairness evolves.
Direct link to our paper: http://randomwalker.info/publications/language-bias.pdf
Recently I was using a version of Conceptnet Numberbatch (word embeddings built from ConceptNet, word2vec, and GloVe data that perform very well on evaluations) as an input to sentiment analysis. So its input happens to include a crawl of the Web (via GloVe) and things that came to mind as people played word games (via ConceptNet). All of this went into a straightforward support vector regression with AFINN as training data.
You can probably see where this is going. The resulting sentiment classification of words such as "Mexican", "Chinese", and "black" would make Donald Trump blush.
I think the current version is less extreme about it, but there is still an effect to be corrected: it ends up with slightly negative opinions about most words that describe groups of people, especially the more dissimilar they are from the American majority.
So my correction is to add words about groups of people to the training data for the sentiment analyzer, with a lot of weight, saying that their output has to be 0.
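Here's a minimal sketch of that correction. Everything below is a toy stand-in: random vectors instead of Numberbatch embeddings, a handful of invented AFINN-style scores, and plain weighted least squares instead of the SVR. The point is only the mechanism: identity terms enter the training data with a target of 0 and a large weight, which pulls their predicted sentiment toward 0.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-ins: random "embeddings" and a few invented sentiment scores.
dim = 4
words = ["good", "bad", "happy", "sad", "terrible", "nice",
         "mexican", "chinese", "black"]
vecs = {w: rng.normal(size=dim) for w in words}
afinn = {"good": 3.0, "bad": -3.0, "happy": 3.0,
         "sad": -2.0, "terrible": -3.0, "nice": 3.0}
identity_terms = ["mexican", "chinese", "black"]

def fit(identity_weight):
    # Identity terms get a target sentiment of 0 and the given weight;
    # weighted least squares stands in for the SVR in the comment above.
    y = np.array([afinn.get(w, 0.0) for w in words])
    wts = np.array([1.0 if w in afinn else identity_weight for w in words])
    X = np.stack([vecs[w] for w in words])
    s = np.sqrt(wts)
    coef, *_ = np.linalg.lstsq(X * s[:, None], y * s, rcond=None)
    return coef

# Weight 0 = no correction at all; weight 10 = the heavily weighted fix.
plain, corrected = fit(0.0), fit(10.0)
for w in identity_terms:
    print(w, round(float(vecs[w] @ plain), 2), "->",
          round(float(vecs[w] @ corrected), 2))
```

Increasing the weight on the zero-target rows provably never increases their squared residuals, so the correction moves identity-term sentiment toward 0 without retraining the embeddings themselves.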
If you're translating from an ungendered language and have to choose, the only way you're going to get anything sensible is from context and common usage. Which is going to choose "she is a nurse" because an algorithm that can deduce that fathers are most likely male can also deduce that nurses are most likely female. But without that you get bad translations like "she is a father" and "he is a fine ship" and "John is her own person."
A bias would be if it incorrectly weighted "JOHN" and "nurse", and used the feminine for "John the nurse".
Assuming that lower-status professions are female and higher-status professions are male ("he is a doctor") when translating ungendered words is indeed a bias.
> the system will be right 93% of the time.
And "this person is a doctor, that person is a nurse" will be right 100% of the time.
It's also not wrong in the sense of generics: https://sites.ualberta.ca/~francisp/papers/GenericsIntro.pdf
The phrase "this person is a doctor" has a different meaning than "she is a doctor" - "she" and "he" refer to (I'm probably messing up the terminology here) a contextually implicit person. "This person" does not.
Except when it produces "that person is a fine ship" or "John is that person's own person" or equally ridiculous things.
See how you sway the argument in your favour by using words with negative connotations like "fairness through blindness" and "hurt meaning and accuracy". Nobody would want to deliberately blind or hurt something, would they? How about rebalance, recalibrate, or re-correct?
A concrete analogy:
I have a one-meter measuring stick, but I discover that it was made wrong: it is actually 2mm shorter than advertised. Every time I make a measurement with it, I have to add 2mm to the measurement. Would it not be better to use a more accurate stick and not have to continually compensate?
The simplicity and strength of our results suggests a new null hypothesis for explaining origins of prejudicial behavior in humans, namely, the implicit transmission of ingroup/outgroup identity information through language. That is, before providing an explicit or institutional explanation for why individuals make decisions that disadvantage one group with regards to another, one must show that the unjust decision was not a simple outcome of unthinking reproduction of statistical regularities absorbed with language. Similarly, before positing complex models for how prejudicial attitudes perpetuate from one generation to the next or from one group to another, we must check whether simply learning language is sufficient to explain the observed transmission of prejudice. These new null hypotheses are important not because we necessarily expect them to be true in most cases, but because Occam's razor now requires that we eliminate them, or at least quantify findings about prejudice in comparison to what is explainable from language transmission alone.
(The paper has more along these lines.)
If the data associates racial minorities with pejorative prejudices, that is a plain distortion of the truth. If you are concerned about these associations, don't shoot the messenger; blame the people who wrote the original texts.
In the context of business and many other interests, it's expedient to present a more polite face. You're providing a service, not a mirror. In the context of a search engine service, say, a specialized, domain-specific search for a certain profession, such biases often distract and detract from the quality of the service you provide to your customers.
Much like I imagine we humans deal with the concept.
I wrote a more detailed critique here: https://www.chrisstucchio.com/blog/2016/propublica_is_lying....