
Making a racist AI without really trying (2017) - spatten
http://blog.conceptnet.io/posts/2017/how-to-make-a-racist-ai-without-really-trying/
======
asploder
I'm glad to have kept reading to the author's conclusion:

> As a hybrid approach, you could produce a large number of inferred
> sentiments for words, and have a human annotator patiently look through
> them, making a list of exceptions whose sentiment should be set to 0. The
> downside of this is that it’s extra work; the upside is that you take the
> time to actually see what your data is doing. And that’s something that I
> think should happen more often in machine learning anyway.

Couldn't agree more. Annotating ML data for quality control seems essential
both for making it work and for building human trust.
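
To make that concrete, here is a minimal sketch of the exception-list step
the post describes, with made-up words and scores (nothing here is from the
post's actual data):

    # Hypothetical sentiment scores inferred by the model for every word.
    inferred_sentiment = {"delicious": 2.1, "terrible": -3.0,
                          "mexican": -0.6, "italian": 0.2}

    # Words a human annotator flagged after reviewing the inferred list:
    # identity terms whose sentiment should be forced to neutral.
    annotator_exceptions = {"mexican", "italian", "chinese", "british"}

    for word in annotator_exceptions:
        if word in inferred_sentiment:
            inferred_sentiment[word] = 0.0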

~~~
ma2rten
This approach only works if you use OP's assumption that a text's sentiment is
the average of its words' sentiments. That assumption is obviously flawed
(e.g. "The movie was not boring at all" would get a negative sentiment).

Making this assumption is fine in some cases (for example if you don't have
training data for your domain), but if you build a classifier based on this
assumption why don't you just use an off-the-shelf sentiment lexicon? Do you
really need to assign a sentiment to every noun known to mankind? I doubt that
this improves the classification results regardless of the bias problem.
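
For what it's worth, the flaw is easy to see with a toy version of that
averaging assumption (the lexicon value below is invented for illustration):

    # Toy sentiment lexicon; only "boring" carries any signal here.
    lexicon = {"boring": -2.0}

    def naive_sentiment(text):
        words = text.lower().split()
        scores = [lexicon.get(w, 0.0) for w in words]
        return sum(scores) / len(scores)

    # Negative, even though the sentence is positive: averaging gives
    # "not ... at all" no way to flip the sign of "boring".
    print(naive_sentiment("The movie was not boring at all"))  # about -0.29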

~~~
jakelazaroff
Sure, it's flawed, but that's the point of the post: that assumptions about
your dataset can lead to unexpected forms of bias.

 _> Do you really need to assign a sentiment to every noun known to mankind?_

No, but it seems like a simple (and seemingly innocuous) mistake that many
programmers can and will make.

~~~
ma2rten
I was just trying to explain in this comment why I think the human moderation
solution is solving the wrong problem.

------
gwern
> There is no trade-off. Note that the accuracy of sentiment prediction went
> up when we switched to ConceptNet Numberbatch. Some people expect that
> fighting algorithmic racism is going to come with some sort of trade-off.
> There’s no trade-off here. You can have data that’s better and less racist.
> You can have data that’s better because it’s less racist. There was never
> anything “accurate” about the overt racism that word2vec and GloVe learned.

The big conclusion here after all that code buildup does not logically follow.
All it shows is that one new word embedding, trained by completely different
people for different purposes with different methods on different data using
much fancier semantic structures, outperforms (by a small and likely non-
statistically-significant degree) an older word embedding (which is not even
the best such word embedding from its batch, apparently, given the choice to
not use 840B). It is entirely possible that the new word embedding, trained
the same minus the anti-bias tweaks, would have had still superior results.

~~~
skybrian
I think you're reading this statement as more general than it's meant to be? I
interpret it as meaning that there is not necessarily any tradeoff, as there
wasn't in this case. "You can have data" -> there exists.

~~~
guywhocodes
Is there anyone who thinks that the current level of racism is required for
the current accuracy? I can't imagine people that racist to be common in the
data community

~~~
AnthonyMouse
> Is there anyone who thinks that the current level of racism is required for
> the current accuracy? I can't imagine people that racist being common in the
> data community.

It depends on two things. The first is how you're defining racism. If the
algorithm is predicting that 10% of white people and 30% of black people will
do X, because that is what actually happens, some people will still call that
racism but there is no possible way to change it without reducing accuracy.

If the algorithm is predicting that 8% of white people and 35% of black people
will do X even though the actual numbers are 10% and 30%, then the algorithm
has a racial bias and it is _possible_ to both reduce racism and increase
accuracy. But it's also still possible to do the opposite.
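
A quick way to see which of those two situations you're in (the numbers below
are made up) is to compare predicted rates against actual rates per group:

    # Hypothetical predictions and observed outcomes for two groups.
    predicted = {"group_a": 0.08, "group_b": 0.35}  # model's predicted rate of X
    actual    = {"group_a": 0.10, "group_b": 0.30}  # observed rate of X

    for group in predicted:
        gap = predicted[group] - actual[group]
        print(group,
              "over-predicts X by" if gap > 0 else "under-predicts X by",
              abs(round(gap, 2)))
    # A systematic gap that tracks group membership is the second case:
    # bias you can remove while improving accuracy at the same time.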

One way to get the algorithm to predict closer to 10% and 30% is to get better
data, e.g. take into account more factors that represent the actual cause of
the disparity and just happen to correlate with race, so factoring them out
reduces the bias _and_ improves accuracy in general.

The other way is to anchor a pivot on race and push on it until you get the
results you want, which will significantly harm accuracy in various subtle and
not so subtle ways all over the spectrum because what you're really doing is
fudging the numbers.

~~~
nnnnnande
"If the algorithm is predicting that 10% of white people and 30% of black
people will do X, because that is what actually happens, some people will
still call that racism but there is no possible way to change it without
reducing accuracy."

What is actually happening? Does it tell you whether they are doing X
precisely because they are black or white? The racist part might not be the
numbers per se, but the conclusion that the color of their skin has anything
to do with their respective choices.

edit: spelling

~~~
TeMPOraL
ML is spitting out correlations, not an explicit causal model. If, in reality,
X is only indirectly and accidentally correlated with race, but I look at the
ML result and conclude the skin color has something to do with X, then the
only racist element in the whole system is me.

~~~
nnnnnande
Agreed. That was the point I was trying to get at, though I might not have
phrased it as clearly.

------
lalaland1125
> Some people expect that fighting algorithmic racism is going to come with
> some sort of trade-off.

Um, that's because we know it comes with trade-offs once you have the optimal
algorithm. See for instance
[https://arxiv.org/pdf/1610.02413.pdf](https://arxiv.org/pdf/1610.02413.pdf).
If your best-performing algorithm is "racist" (for some definition of
"racist"), you are mathematically forced to make trade-offs if you want to
eliminate that "racism".

Of course, defining "racism" itself gets extremely tricky because many
definitions of racism are mutually contradictory
([https://arxiv.org/pdf/1609.05807.pdf](https://arxiv.org/pdf/1609.05807.pdf)).

~~~
ma2rten
Not necessarily. In the case of word vectors we are using unsupervised
learning to identify patterns in a large corpus of data to improve the
learning. This is a completely different issue than your credit score example,
which is supervised learning.

Not all patterns are equally useful. By removing the less useful ones we
might make fewer mistakes (for example, giving negative sentiment to a review
of a Mexican restaurant) and free up capacity in the word vectors to store
more useful patterns. I would expect that baking other real-world assumptions
unrelated to bias into your word vectors could also be helpful.

------
paradite
To oversimplify, I think the training set is something like:

Italian restaurant is good.

Chinese restaurant is good.

Chinese government is bad.

Mexican restaurant is good.

Mexican drug dealers are bad.

Mexican illegal immigrants are bad.

And hence the word vector works as expected and the sentiment result follows.

Update:

To confirm my suspicion, I tried out an online demo to check distance between
words in a trained word embedding model using word2vec:

[http://bionlp-www.utu.fi/wv_demo/](http://bionlp-www.utu.fi/wv_demo/)

Here is an example output I got with Finnish 4B model (probably a bad choice
since it is not English):

italian, bad: 0.18492977

chinese, bad: 0.5144626

mexican, bad: 0.3288326

Same pairs with Google News model:

italian, bad: 0.09307841

chinese, bad: 0.19638279

mexican, bad: 0.16298543
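
If anyone wants to reproduce this check locally rather than through the demo,
gensim can load pretrained vectors and compute the same cosine similarities
(the file name below is just an example of a word2vec-format model, and the
query words have to match the model's casing):

    from gensim.models import KeyedVectors

    # Any word2vec-format file works here; this name is only an example.
    vectors = KeyedVectors.load_word2vec_format(
        "GoogleNews-vectors-negative300.bin.gz", binary=True)

    for nationality in ["Italian", "Chinese", "Mexican"]:
        print(nationality, "bad:", vectors.similarity(nationality, "bad"))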

------
EB66
Just thinking out loud here...

It seems to me that if you wanted to root out sentiment bias in this type of
algorithm, then you would need to adjust your baseline word embeddings dataset
until you have sentiment scores for the words "Italian", "British", "Chinese",
"Mexican", "African", etc that are roughly equal, without changing the
sentiment scores for all other words. That being said, I have no idea how
you'd approach such a task...

I don't think you could ever get equal sentiment scores for "black" and
"white" without biasing the dataset in such a manner that it would be rendered
invalid for other scenarios (e.g., giving a "dark black alley" a higher
sentiment than it would otherwise have). "Black" and "white" is a more
difficult situation because the words have different meanings outside of
race/ethnicity.
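
One blunt way to do the first half of that, assuming you already have
per-word sentiment scores (all words and numbers below are hypothetical), is
to re-center just the identity terms on their common mean and leave every
other word's score alone:

    # Hypothetical sentiment scores produced by the model.
    sentiment = {"italian": 0.8, "british": 0.9, "chinese": -0.2,
                 "mexican": -0.4, "african": -0.6, "delicious": 2.3}

    identity_terms = ["italian", "british", "chinese", "mexican", "african"]

    # Give every identity term the group's mean score; nothing else changes.
    mean_score = sum(sentiment[t] for t in identity_terms) / len(identity_terms)
    for t in identity_terms:
        sentiment[t] = mean_score

Of course this only patches the output scores; it doesn't touch the
embeddings themselves, which is the harder problem being pointed at here.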

~~~
rossdavidh
I think I would agree. You otherwise run the risk of having fixed the metric
("Italian" vs. "Mexican", "Chad" vs. "Shaniqua", etc.) without actually fixing
the underlying issue.

Also, regarding black/white etc., there might legitimately be words which
have so many different meanings (whether race-related or not) that you should
just exclude them from sentiment analysis. "Right" can mean "human rights",
"the right thing to do", or "not left". There are probably plenty of other
words like that. You might do better to have a list of 100-200 words that are
just excluded because of issues like that.

~~~
acpetrov
Would it be worth trying to think of words with different meanings as entirely
new words? So, "white" in one sentence may be a different word than "white" in
another?
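
As a crude sketch of what that could look like: rewrite ambiguous tokens into
sense-tagged pseudo-words before training, using some disambiguation rule
(the rule below is invented and far too naive; the reply below points at how
this is actually handled):

    # Toy sense-splitting: "white"/"black" become different pseudo-words
    # depending on nearby context words. Real systems learn this instead.
    COLOR_CUES = {"paint", "wall", "dress", "snow", "alley"}

    def sense_tag(tokens):
        tagged = []
        for i, tok in enumerate(tokens):
            if tok in {"white", "black"}:
                context = set(tokens[max(0, i - 3):i + 4])
                sense = "color" if context & COLOR_CUES else "other"
                tagged.append(tok + "_" + sense)
            else:
                tagged.append(tok)
        return tagged

    print(sense_tag("the white paint on the wall".split()))
    # ['the', 'white_color', 'paint', 'on', 'the', 'wall']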

~~~
visarga
There's a long list of papers on that - 'multi-sense word embeddings'. But
more recently we have found that passing the raw character embeddings through
a two-layer BiLSTM resolves the ambiguity of meaning from context - 'ELMo'.

[https://arxiv.org/abs/1802.05365](https://arxiv.org/abs/1802.05365) (state of
the art)

------
k__
Does this mean the text examples the AI learns from are biased and as such it
learns to be biased too?

So it's not giving us objective decisions, but a mirror. Not so bad either.

~~~
kibwen
Yes, and it's pretty scary how many technologists seem to be surprised by
this. If we train bots using data derived from humans, the expectation is that
they will inherit biases from humans. There's nothing about a silicon brain
that automatically bestows perfect objectivity, only perfect obedience.

~~~
sidr
I struggle to think of a single person with the faintest understanding of what
machine learning algorithms are being surprised by this. Who are these
"technologists" you're speaking of?

~~~
anonthrowaway2
Almost everyone I know with at least a faint understanding of ML is surprised
by models picking up racism etc when there was zero intent to do so, because
of systemic racism etc in available data. Or at least surprised by how much
can be picked up. You're bubbled if no one you know is surprised.

~~~
rossdavidh
Hmmm...I'm no expert, but my master's thesis topic in the 90's was on neural
networks that use R-squared (a measure of correlation), and when I saw the
news about Microsoft's chatbot going Nazi, I was not at all surprised. Not
saying no one you knew was surprised, but I had "at least a faint
understanding of ML", and the primary thing I learned about it was that it
learns what's in the data, whether that's the part of the data that you
intended it to learn or not.

~~~
artursapek
Tay was trolled hard by 4chan, that's why she went hardcore Nazi almost
immediately. It was amusing, but not a fair & controlled experiment by any
means.

~~~
pixl97
The real world is neither fair nor a controlled experiment.

~~~
TeMPOraL
Which is why I'm surprised about all this "AI is biased" outrage. A decent
algorithm will learn what's in the data. Cast on a wide enough scale, the data
is roughly what the world is. If your bot learns from newspaper corpus, then
it learns how the world looks through the lens of news publishing. If news
publishing is somewhat racist, and your algorithm does _not_ pick up on that,
then your algorithm has a bug in it.

It seems to me like the people writing about how AI is bad because it picks up
biases from data are wishing the ML would learn the world as it _ought to_ be.
But that's wrong, and that would make such algorithms not useful. ML is meant
to learn the world as it _is_. Which is, as you wrote, neither fair nor a
controlled experiment.

~~~
artursapek
Well put. The people complaining about how AI is bad are the same people who
push "diversity hires" to try to pretend that the population of software
developers is equal parts male/female, and white/black.

------
ma2rten
I think that the bias problem they are highlighting is very important. That
said, I'm wondering if they really didn't try (like the title suggests) or if
they chose this approach on purpose because it highlights the problem.

To explain what happened here: They trained a classifier to predict word
sentiment based on a sentiment lexicon. The lexicon would mostly contain words
such as adjectives (like awesome, great, ...). They use this to generalize to
all words using word vectors.

The way word vectors work is that words that frequently occur together are
going to be closer in vector space. So what they have essentially shown is
that in Common Crawl and Google News, names of people with certain ethnicities
are more likely to occur near words with negative sentiment.

However, the sentiment analysis approach they are using amplifies the problem
in the worst possible way. They are asking their machine learning model to
generalize from training data with emotional words to people's names.
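
Roughly, the pipeline looks like this (the vectors and lexicon below are
random stand-ins, and the regression model is just a simple substitute for
whatever the post actually fits):

    import numpy as np
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(0)

    # Stand-in word vectors; in the post these come from GloVe/word2vec/Numberbatch.
    vocab = ["awesome", "great", "awful", "terrible", "emily", "shaniqua"]
    embeddings = {w: rng.standard_normal(300) for w in vocab}

    # Sentiment lexicon: mostly emotional words, no names.
    lexicon = {"awesome": 1.0, "great": 1.0, "awful": -1.0, "terrible": -1.0}

    X = np.array([embeddings[w] for w in lexicon])
    y = np.array(list(lexicon.values()))
    model = Ridge().fit(X, y)

    # The model then extrapolates to words it never had labels for, including
    # people's names -- that's where the corpus bias surfaces. (With these
    # random vectors the numbers are meaningless; with real corpus vectors,
    # names that co-occur with negative words get dragged down.)
    for name in ["emily", "shaniqua"]:
        print(name, model.predict(embeddings[name][None, :])[0])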

~~~
int_19h
I think the point is that they did what's commonly done in real world machine
learning. It's no surprise that it's flawed - but that flawed stuff is
_actually being used_ all over the place.

------
User23
It would be interesting to use the Uber/Lyft dataset of driver and passenger
ratings to do an analysis like this.

For any such analysis there are a great many confounds, both blatant and
subtle. Finding racism everywhere could be because overt racism is everywhere,
or it could be confirmation bias. It could even be both! That's the tricky
thing about confirmation bias—one never knows when one is experiencing it, at
least not at the time.

------
travisoneill1
I've heard a lot about racism in AI, but looking at the distributions of
sentiment score by name, a member of any race would rationally be more worried
about simply having the wrong name. Has there been any work done on that?

~~~
joatmon-snoo
This is a pretty well known study:
[http://www.nber.org/digest/sep03/w9873.html](http://www.nber.org/digest/sep03/w9873.html)

~~~
travisoneill1
I mean name within the same race. On the range of racial averages was 3, but
the range of names within a race was around 10. I don't know how significant
the results for individual names are, but I was very surprised by that result.

------
practice9
> fighting algorithmic racism

Reminds me of how Google Photos couldn't differentiate between a black person
& a monkey, so they've excluded that term from search altogether.

While the endeavour itself is good, the fixes are sometimes hilariously bad
or biased (untrue).

~~~
ggreer
> Reminds me of how Google Photos couldn't differentiate between a black
> person & a monkey, so they've excluded that term from search altogether.

Technically that is what happened, but it paints an incorrect picture in
people's minds. Out of the billions of images that Google Photos had auto-
tagged, it tagged one picture of two black people as "gorillas".[1] This was
probably the first time this had ever happened. (If it had happened before, it
surely would have been spread far and wide by social media & the press.)

So Google's classifier was inaccurate 0.0000001% of the time, but the PR was
so bad that Google "fixed" the issue by blacklisting certain tags (monkey,
gorilla, etc). If you take photos of monkeys, you'll have to tag them
yourself.

I'm sure Google could do better, but the standard required to avoid a PR
disaster is impossible to meet. If the classifier isn't perfect forever,
they're guaranteed to draw outrage.

1\.
[https://twitter.com/jackyalcine/status/615329515909156865](https://twitter.com/jackyalcine/status/615329515909156865)

~~~
mediumdeviation
Our expectations of our algorithms are based on human performance. A human
would never tag a black person as a gorilla, or vice versa, and if someone did
it even once we could pretty safely conclude they're either extraordinarily
incompetent, or racist, and in either case we wouldn't trust any tagging done
by such a human.

------
js8
Maybe, you know, humans are simply not Chinese rooms.

Recently there was an article about recognition of bullshit:
[https://news.ycombinator.com/item?id=17764348](https://news.ycombinator.com/item?id=17764348)

To me the article brought great insight - I realized that humans do not just
pattern match. They also seek understanding, which I would define as an
ability to give a representative example.

It is possible to give somebody a set described by arbitrarily complex
conditions while the set itself is empty. Take any satisfiability problem
(SAT) with no solution - this is a set of conditions on variables, yet there
is no global solution to these.
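
For concreteness, here is a tiny unsatisfiable instance: four clauses over
two variables that together rule out every assignment, so the "set" they
describe is empty.

    from itertools import product

    # CNF over x1, x2: (x1 or x2), (x1 or not x2),
    # (not x1 or x2), (not x1 or not x2).
    clauses = [[(1, True), (2, True)], [(1, True), (2, False)],
               [(1, False), (2, True)], [(1, False), (2, False)]]

    def satisfiable(clauses, n_vars=2):
        for bits in product([False, True], repeat=n_vars):
            if all(any(bits[v - 1] == want for v, want in clause)
                   for clause in clauses):
                return True
        return False

    print(satisfiable(clauses))  # False: no assignment survives all four clauses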

So if you were a Chinese room and I trained you on SAT problems, then by pure
pattern matching you would be willing to give solutions to unsolvable
instances. It is only when you actually understand the meaning behind the
conditions that you can recognize that these arbitrarily complex inputs are
in fact just empty sets.

So perhaps that's the flaw with our algorithms: there is no notion of "I
understand the input". Perhaps that is understandable, because understanding
(per above) might as well be NP-hard.

~~~
adrianN
There is no indication that brains are better at solving NP hard problems than
computers.

~~~
js8
That is not my argument at all. What I argue is that brains attempt to resolve
the problem, while computers (when they pattern match in typical ML algorithm)
do not.

It is possible that the brain has specialized circuits to solve small
instances of SAT, and it just gives up on large enough instances. I am sure
you know the feeling you get when you understand something - it's very much
like the pieces of a puzzle suddenly fitting perfectly together.

------
elihu
This is an interesting result:

> Note that the accuracy of sentiment prediction went up when we switched to
> ConceptNet Numberbatch.

> Some people expect that fighting algorithmic racism is going to come with
> some sort of trade-off. There’s no trade-off here. You can have data that’s
> better and less racist. You can have data that’s better because it’s less
> racist. There was never anything “accurate” about the overt racism that
> word2vec and GloVe learned.

I wonder if this could be extended to individual names that have strong
connotations with people because of the fame of some particular person, like
"Barack", "Hillary", "Donald", "Vladimir", or "Adolf", or if removing that
sort of bias is just too much to expect from a sentiment analysis algorithm.

------
abenedic
Where I grew up, there is a majority group with fair skin, later (possibly
incorrectly) attributed to the fact that they worked in the fields less. The
minority group is darker-skinned. If you train any reasonable machine
learning model on any financial data, it will pick up on the discrepancy. If
it did not, I would say it is a flawed model. But that is more a sign that
people should avoid such models.

------
gumby
Please add 2017 to title

------
b6
How to make a program that does what you asked it to do, and then add
arbitrary fudge factors as the notion strikes you to "correct" for the
bogeyman of bias.

Suppose sentiment for the name Tyrel was better than for Adolf. Would that
indicate anti-white bias? Suppose the name Osama has really poor sentiment.
What fudge factor do you add there to correct for possible anti-Muslim bias?
Suppose Little Richard and Elton John don't have equal sentiment. Is the lower
one because Little Richard is black, or because Elton John is gay?

What we have been seeing lately is an effort to take unmeasurable bias that
is simply assumed to exist and to be unjust, and replace it with real bias,
encoded in our laws and practices, or in this case, in actual code.

------
swingline-747
Setting aside blatant shock behaviors... If the other side, the audience, were
less sensitive and not looking for the next micro-outrage, wouldn't ML
chatbots evolve more pro-social values by positive reinforcement?

 _It takes two to Tango_... the average audience's behavior isn't blameless
for the impact of its response. Also, how an AI decides whether an ambiguous
response is desirable or not is really interesting.

~~~
s73v3r_
Blaming the victim for being discriminated against isn't going to help
anything.

