
When Data Science Destabilizes Democracy and Facilitates Genocide - thuuuomas
http://www.fast.ai/2017/11/02/ethics/
======
gt_
Most of this made sense to me, but the section on ‘bias’ seemed to leave the
ground a little more than I think is healthy. The problem is correctly
identified. The solutions offered are OK, but the examples and the causes it
concludes appear unreasoned.

Probably the clearest example is in the section on stereotypes. Again, the
problems are solidly identified but the understanding of their causes is
lacking. The implication of surprise that an algorithm with real-world data
would associate being a doctor with being a male and being a nurse with being
a female is highly suspicious. There’s plenty of reason to assume a social
bias feedback loop as discussed in the beginning of the article, but the
argument that this is a result of researcher bias is unsound. This correlation is
consistent in the statistical data, in all societies and cultures on Earth.
Again, there’s plenty of reason to argue the results correlate with _societal_
gender bias, but not _researcher_ gender bias. Big difference. The race/skin
color correlations argued in the article _do_ have sound arguments for
_researcher_ bias.

~~~
nl
The whole point of this article is that good data science involves detecting
and avoiding these biased feedback loops, because software can amplify them.

It literally says: _“It’s not that data can be biased. Data is biased.” Know
how your data was generated and what biases it may contain. We are encoding
and even amplifying societal biases in the algorithms we create._

~~~
gt_
??? I just read the article... not sure what your point is. My point is that
the author "encoded and even amplified" bias that the data _did not_ have.
Most of the article concerns being aware of inherent bias _in_ data. My point
concerns bias that the author has ascribed _to_ the data.

~~~
nl
Gender bias in Word2Vec data is a well known problem. The article provides
references. It's an unsupervised algorithm, and the Google pretrained vectors
are trained from news coverage, so not really a researcher issue (unless they
selected an unusual dataset or something).

Edit: to clarify, your claim is that Word2Vec data isn't biased even though
there is a link right there showing how it is? Why do you think that?

If you use that data in a system then you reinforce that bias.
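For anyone unfamiliar with how these embedding-bias probes work, here is a
minimal sketch of the analogy arithmetic (vec(b) - vec(a) + vec(c)) that the
referenced studies run against the real pretrained vectors. The vectors below
are tiny hand-made stand-ins, not actual Word2Vec data; they exist only to show
the mechanics:

```python
import numpy as np

# Toy, hand-made "embeddings" -- NOT real Word2Vec vectors.
# They only illustrate the analogy arithmetic used by the bias probes.
vectors = {
    "man":      np.array([1.0, 0.0, 0.2]),
    "woman":    np.array([-1.0, 0.0, 0.2]),
    "doctor":   np.array([0.9, 1.0, 0.1]),
    "nurse":    np.array([-0.9, 1.0, 0.1]),
    "engineer": np.array([0.8, 1.0, 0.1]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def analogy(a, b, c):
    """Return the vocabulary word nearest to vec(b) - vec(a) + vec(c),
    excluding the three query words themselves."""
    target = vectors[b] - vectors[a] + vectors[c]
    candidates = [w for w in vectors if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vectors[w], target))

# "man is to doctor as woman is to ...?"
print(analogy("man", "doctor", "woman"))  # -> nurse (with these toy vectors)
```

With real pretrained vectors the same query is made via something like gensim's
`most_similar(positive=["doctor", "woman"], negative=["man"])`; the papers
linked in the article report that it surfaces exactly these gendered pairings.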

------
nl
Oh please.

This article is completely correct of course. But there is zero hope that
it'll be taken seriously at all.

Just the other day, here on HN someone seriously proposed the idea that hate
speech doesn't lead to violence against minorities[1]. I pointed at the Rwanda
genocide, and got voted down (a lot), because the OP claimed that 10% of
deaths didn't prove my point. Then I pointed out that was at least 60,000
deaths, and that was downvoted too.

Enough people clearly want to ignore things which disagree with their world
view that there is zero chance that this destabilising behaviour will stop.

[https://news.ycombinator.com/item?id=15936785](https://news.ycombinator.com/item?id=15936785)

~~~
jlg23
You told people to look up a topic but did not provide any link as a
recommended starting point nor any justification why Rwanda's history is
relevant in the context of hate speech; now you just refer to yourself.

~~~
anigbrowl
You shouldn't need to provide links for every topic of general knowledge, like
the factual existence of wars that have taken place within living memory or
are part of an educational curriculum. Wikipedia is an adequate place to begin
researching a topic like that. If you can't see why Rwanda's example is
relevant in this context then I urge you to become more informed about the topic.

~~~
jlg23
I am actually very familiar with the topic and I am thus aware that one can
find many, seemingly contradictory, sources. But thanks anyway.

I was simply justifying my downvote.

