If a black man and a white man rob a bank together, is it fair that they receive the same sentence? Or is it fair that a sentencing algorithm give the black man probation and send the white man to jail to correct for disparate outcomes based on race?
(As an aside: our current eagerness to imprison people for relatively minor crimes strikes me as fundamentally unfair, especially in light of our inability to provide corrective justice. As such, while it would be equal to send both to prison for robbery, I don't think it would be correct to call it fair.)
One of the ways that we can do right by individuals, especially individuals who have been and are currently disenfranchised, is to take affirmative steps towards improving their lives. One of those affirmative steps (and a relatively easy one at that!) is simply not punishing them more harshly than we would the privileged. Nothing about this entails treating white people (or any particular privileged group) worse.
Obviously that is absurd, but where does it go wrong?
The solution really is pretty much that simple. I would argue that any correlational model dealing with human beings will be discriminatory.
Will loans become more expensive for many people? Sure. But that is the true solution. Sometimes the right decision is simple but uncomfortable. In my opinion, this is one of those times.
Your financial history includes your rent or mortgage payments, which are for living at your zip code. Restriction to financial history is not all that much of a restriction.
It might well be the case that a totally 'colour-blind' scoring algorithm still ends up sorting people by one of those mentioned, unrelated data points. This is not racism, it is just the consequence of the mentioned relationships. Calling it racism is a sign of ignorance and does a disservice to the effort to get rid of true racism.
Each has a plausible justification for inclusion. But the more data points you have, the better you're able to predict race from them (or anything else you're not supposed to).
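As a toy illustration of that proxy effect (the population, group names, and the 80/20 correlation below are all invented for the sketch, not real statistics): even a "blind" rule that never sees the protected attribute can recover it from a single correlated feature like zip code.

```python
import random

random.seed(0)

# Hypothetical synthetic population. Race is never given to the model,
# but it correlates with zip code: group B is assumed to live in zip 1
# 80% of the time, group A only 20% of the time.
def make_person():
    race = random.choice(["A", "B"])
    zipcode = 1 if random.random() < (0.8 if race == "B" else 0.2) else 0
    return race, zipcode

people = [make_person() for _ in range(10_000)]

# A "race-blind" rule that looks only at zip code still recovers race
# far better than the 50% chance baseline.
correct = sum(1 for race, zipcode in people
              if ("B" if zipcode == 1 else "A") == race)
accuracy = correct / len(people)
print(f"race recovered from zip alone: {accuracy:.0%}")
```

With more such features (income, credit history, browsing habits), the recoverable accuracy only goes up, which is the point being made above.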
So how do you decide if an algorithm is discriminatory? Maybe a whitelist or blacklist of what data it can use?
If you use time spent in bars to predict cirrhosis, it's OK.
We may say that now, but what if later some algorithm finds a hidden correlation between driving performance and race/gender? Will it still be an OK metric then? Where do we draw the line?
The line is between inferences of bad behavior -> bad result such as "you drive risky so you must be at risk of an accident" and pure correlation "you drive risky so you must be a man so you must be at increased risk of prostate cancer".
And conclusions should be assumed to be of the latter kind unless shown otherwise.
Defense means 'what are you going to say when the feds come knocking and want proof you are not discriminating against a protected class?'
As a practical matter it isn't too hard to come up with a legal defense for that, of course. Lots of plausible deniability here. IMO most of this effort is by people trying to help prevent computer nerds from becoming (or staying, as we may already be there) complicit in discrimination by race while hiding behind algorithms as if they are somehow infallible.
want proof you are not discriminating against a protected class
If black people are poorer on average, then even if the system takes only personal financial history into account, black people will inevitably get worse credit ratings on average. It just follows from the premise, unless banks participate in some form of race-based redistribution, which I find distasteful. How is it supposed to work, in your opinion?
Of course, things are not so clear cut. Your financial situation can also be a consequence of discrimination, e.g., if you were denied a job or unfairly arrested. At the end of the day, some human has to sit down and decide what is OK and what is not. Personally, I believe the goal of research on algorithmic fairness should be to give the people who will ultimately make these decisions (e.g., judges, politicians, etc.) the tools to understand both broadly, the kinds of things that can go wrong when using algorithms, and also to understand what might have gone wrong in a specific situation.
But zip code? I understand there is empirical data showing how race is correlated with zip code, but you still choose your zipcode at the end of the day. If you have data saying that applicants from "Zipcode X" are less likely to repay loans, why is one's choice of zipcode out-of-bounds but their choice to have a hard credit check several months ago not? What if a poor credit history is correlated with race (fwiw I don't know if it is or not)? Is it then discriminatory to use credit history as a covariate?
The truth is that machine learning to predict people's behavior is inherently discriminatory and involves generalizations by definition. Once you start disallowing training on covariates that exist due to voluntary behavior, it's not really clear whether you should be allowed to use machine learning to predict individual behavior at all (I'd probably listen to that argument actually).
There's clearly a tradeoff to be made in terms of how much discrimination is allowable in exchange for a given unit of predictive performance, but that sounds incredibly difficult to regulate for every business's ML problem. I don't know what the answer is, but I think it's hasty to argue that anything that's correlated with race is out-of-bounds.
Unless you live in a city that was redlined, which is to say, unless you live in virtually any city. In which case, you really did not and in many cases still do not actually have much of a choice where you live. And that's without even looking at the fact that the racial wealth gap means that even if there weren't still strong barriers to integrated neighborhoods, lots of members of groups that were and are discriminated against would be unable to move into the neighborhoods that would theoretically let them escape that zip code bias.
As an example, some professional sports are dominated by people of certain skin colours. Basketball is predominantly 'black', ice hockey predominantly 'white', while soccer is more representative of the population at large. If you're looking for people who have the potential to become good basketball players, you might look at data on their athletic achievements, their height and other such things. I don't think you need dark skin to play basketball at a high level, even though the majority of players happen to have it. I assume scouts for the NBA, NHL and MLS don't profile people based on their skin colour but on the aforementioned things and others related to a person's ability to play the sport.
There was a recent talk from a Meetup engineer about this but I can't find the link right now.
A question: how do we know the algorithm is biased if it gives a racially disparate result? Don't tell me the evidence is that the result is racially disparate.
There are statistical analyses that can answer your second question, and they're not based only on disparate results (though that's probably part of it). And the solution is not "changing the signal" of the bias.
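One widely cited screen of this kind is the "four-fifths rule" from US employment law: if one group's selection rate is less than 80% of the most-favoured group's, that's a flag for further investigation, not a verdict. A minimal sketch, with invented approval counts:

```python
# Made-up approval counts for two groups; the group names and numbers
# are illustrative only.
approvals = {
    "group_x": {"approved": 720, "total": 1000},
    "group_y": {"approved": 500, "total": 1000},
}

# Selection rate per group, then the ratio of the lowest to the highest.
rates = {g: d["approved"] / d["total"] for g, d in approvals.items()}
ratio = min(rates.values()) / max(rates.values())

print(f"selection-rate ratio: {ratio:.2f}")
if ratio < 0.8:
    print("fails the four-fifths screen; worth investigating further")
```

Note that this is only a first-pass disparate-impact screen; it says nothing by itself about whether the disparity is caused by the model, the data, or the underlying population.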
Shouldn't all the arguments be over the data and any biases present (and how to obtain unbiased data) and not the result?
If you start giving some people special treatment just because of their ethnicity or gender, you're going to be opening up more cracks in the system for other people to fall through - and it will hurt those people even more because they will rightly feel that the whole system is working against them personally; down to the core specifics of who they are (unfortunate individuals who don't fit under any special label).
Modern anti-discrimination approaches are often themselves racist or sexist. They should not classify individuals into superficial groups; instead, decisions should be made case by case, based on each individual's history.
The root of injustice is simply bad luck. Ethnicity and gender are only loosely correlated with bad luck, and there is no causal relationship between them and bad luck (not anymore, at least).
Anti-discrimination efforts should be aimed at averaging-out the effects of bad luck in people's lives; so that means we need a way to quantify luck on an individual basis. It's a difficult problem to solve, but it can't be solved by making gross generalizations.
This is wildly ignorant and ahistorical. There are literally thousands of books, articles, studies, movies, etc. disproving it. Just this week a study was released showing that predominantly white school districts collectively receive $23 billion more in funding than predominantly black districts, and that "for every student enrolled, the average nonwhite school district receives $2,226 less than a white school district." That isn't bad luck; that is a system designed to reinforce racial segregation, and it's just one example of thousands, all of which are a moment of research away if you actually cared about this.
Based on that, any causal relationship between race and school funding is far from simple.
> That isn't bad luck; that is a system designed to reinforce a system of racial segregation
Who do you believe is designing the "system" for the purposes of segregation, and what evidence do you have to support that? I ask because everybody I've met who works in education seems to genuinely want all students to thrive.
There are valid historical reasons to explain why there is a correlation between wealth, ethnicity and gender but the causality is not there anymore. My point is that a poor white kid will be just as disadvantaged today as an equally poor kid of any other race.
Why not? What allows you to claim that the era of extreme disenfranchisement (one generation ago), or even slavery (two to three generations ago), bears no effect in 2019? Because they are not formally recognised in law?
>My point is that a poor white kid will be just as disadvantaged today as an equally poor kid of any other race.
Some would say that the white kid still has access to structural privileges afforded by society through extra-legal means. That's not to say his situation isn't bad, but it's probably less bad than the poor black kid's fate. This is reflected in biases about intelligence, for instance when an employer is deciding whether or not to hire.
The same goes for the "there are multiple books" thing. That statement by itself doesn't reinforce your point, both because you aren't being specific about which books, and because there is a multitude of books written about vaccines causing autism. Book authors can be wrong.
I'd also like to point out that the very article you've posted contradicts your point.
As Rebecca Sibilia, founder and CEO of EdBuild, explains, a school district's resources often come down to how wealthy an area is and how much residents pay in taxes.
"We have built a school funding system that is reliant on geography, and therefore the school funding system has inherited all of the historical ills of where we have forced and incentivized people to live," she says.
So basically school funding partially depends on how affluent the area is. For various, mostly historical, reasons poor people are disproportionately black, so schools with lower funding tend to have more black pupils. Nothing in this explanation requires a grand racist conspiracy; limited social mobility and income disparity are sufficient to explain the phenomenon. And you can never understand the root causes just from the number you've cited. You'd need those causes to fix anything, unless the only "solution" you're trying to justify is "white people bad and rich, black people poor and good, take money from white people, give it to black people." That one isn't really helpful.
It's a little more technical than this paper, and presents very well why removing discrimination in algorithmic decision making is a complex task.
If I'm understanding the authors correctly, they're saying that blind algorithms don't have equal outcomes by race -- so they're reintroducing race to the algorithms so they can adjust for disparate outcomes.
Basically they're arguing for affirmative action algorithms.
The race-blind models aren't necessarily race-neutral; they could implicitly have something like "affirmative action" in either direction, and reintroducing an explicit race variable could be the only way to get rid of it. Or the opposite might be true! It takes real care to get this stuff right, whatever notion of equality you aim for.
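One way to make "get this stuff right" concrete is to audit a blind model's error rates by group after the fact, which is the equalized-odds idea: do false-positive rates match across groups? A sketch with invented confusion counts (the group labels and numbers are illustrative assumptions, not results from any real model):

```python
# Hypothetical outcomes from a "race-blind" classifier, broken out by
# group after the fact. fp = negatives wrongly flagged, tn = negatives
# correctly passed; all counts are made up.
outcomes = {
    "group_x": {"fp": 30, "tn": 270},
    "group_y": {"fp": 90, "tn": 210},
}

# False-positive rate per group, and the gap between them.
fpr = {g: c["fp"] / (c["fp"] + c["tn"]) for g, c in outcomes.items()}
gap = abs(fpr["group_x"] - fpr["group_y"])
print({g: round(r, 2) for g, r in fpr.items()}, "gap:", round(gap, 2))
```

Here the blind model flags group_y's negatives three times as often, so it is not race-neutral in the equalized-odds sense even though race was never an input; whether and how to correct for that gap is exactly the judgment call the thread is arguing about.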
But in practice, I don't think it can, without having total surveillance capabilities - not to mention essentially sci-fi quantities of AI advancement.