Also, people should have the right to know when machine-learned models are used to make decisions about their lives. They should be able to ask why a particular decision was made and get that information.
This is real AI ethics.
What's the current standard for anything "important"? Most important decisions are biased and not explained. Judges have biases and are allowed some discretion when sentencing. I'm sure police officers have biases as well. The same is true of just about any person making a judgement. There are laws (for good reason) that prevent certain types of biases, but it's naive to believe that the status quo on important decisions is great.
A model is better in a number of ways. First, it is based on actual evidence. Although that evidence can be manipulated, it's a lot easier to observe and control than the life experiences of an individual. And you can guarantee that certain factors won't play a direct role, at least, by excluding them as inputs. It's much harder to tell a person to ignore some factors.
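To make the "excluding them as inputs" point concrete, here's a minimal sketch (my own illustration, with invented field names) of guaranteeing a factor plays no direct role by dropping it before the model ever sees it:

```python
# Toy feature-filtering sketch: the model can only condition on inputs it
# actually receives, so excluded fields cannot play a *direct* role.

records = [
    {"income": 52000, "debt": 9000, "zip_code": "60601", "approved": True},
    {"income": 31000, "debt": 14000, "zip_code": "60617", "approved": False},
]

# zip_code excluded as a potential proxy; "approved" is the label, not a feature
EXCLUDED = {"zip_code", "approved"}

def features(record):
    # keep only the fields the model is allowed to see
    return {k: v for k, v in record.items() if k not in EXCLUDED}

X = [features(r) for r in records]
assert all("zip_code" not in x for x in X)
```

The caveat, of course, is that excluded factors can still leak in *indirectly* through correlated features, which is exactly why "at least no direct role" is the honest claim.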
I think algorithmic, objective decision making on important decisions is very much preferable to what we have now.
* having judges that live within a changing society is what allows laws to be changed in accordance with the state of public debate at a given time. For some it might be too slow or too fast, but it kind of works, if you think over the centuries.
* it is harder to manipulate all the judges in the land than to change one algorithm or its goals. The distribution and fragmentation of power is a feature and not a bug. It makes some things weird and others slow, but ultimately it makes it hard for a dictator to take over.
Any new proposed system must be able to prove it has similar resistance against fundamental change from democracy into dictatorship, while still being able to slowly shift where it counts.
As a system, the things we have are not bad, and in western societies they have managed quite well. I think any modern-day proponent of alternative systems of governance should recognize their hubris in light of the things we already have working. The stuff we've got is not just some service that could be run better; it is a way of preventing us from tearing ourselves and each other apart.
As desirable as algorithmic decision making sounds, it needs to be objective and deterministic. And it needs to be able to factor in circumstance in order to stay humane. But if it does so all the time, it will be abused. In the end it would have to produce decisions that are accepted by humans after all.
I think for topics of high importance it would be best to have informed decisions by humans who educate themselves about the matter and bring forth arguments. For smaller decisions algorithms might work, as long as they are transparent.
I agree that the system needs to have a feedback loop and the law should evolve over time. But IMO there are better ways to achieve that than introduce a biased decision maker. You can rely more greatly on legislation or bring in some evolving value system into your model.
> * it is harder to manipulate all the judges in the land than to change one algorithm or its goals. The distribution and fragmentation of power is a feature and not a bug. It makes some things weird and others slow, but ultimately it makes it hard for a dictator to take over.
Fragmentation of power is generally a good thing. But the idea of having many judges as fragmentation of power is not actually fragmented. It would be fragmented if they all heard your case and sentenced you accordingly with some aggregation method. Or if they have a small jurisdiction. But in many cases, one judge (e.g. your sentencing judge), has all the power (ignoring appeals).
Let's assume I wanted to influence a sentencing judge. I could learn his biases. I could hire a lawyer with a good relationship with the judge. I could give him some financial incentive (e.g. kickbacks from a private prison, or campaign donations). I could manipulate jury selection using racial biases. I could look at the statistical properties of sentencing during different parts of the day and try to exploit that. Less common in the US, but I could also intimidate or threaten the judge.
How would that play out if it was an algorithm? I could hack into the administrating body and somehow retrain the algorithm? I could try to subvert some data scientist working on the algorithms and have them introduce slight biases into the algorithm that would eventually favor me?
I'd take my chances with the human judge. And that may be a good thing, but I don't think humans are harder to manipulate than algorithms.
> And it needs to be able to factor in circumstance in order to stay humane.
Again, I think it's a lot easier to tell an algorithm to consider these new factors with the explicit goal of increasing or decreasing X.
It's a much simpler problem than general image detection. Imagine how many different types of dogs you can encounter at various angles, sizes, colors, and perspectives. For that task the model is much more of a black box, as it would have to support detecting a dog from behind, the side, the front, above, sitting down, walking, jumping, standing; dogs with long tails, short tails, long hair, short hair, etc. Now think about a face. Most faces are very much alike.
So the algorithmic objective decisions of current systems are actually partial and selective. We must be very careful not to attribute powers to them that they do not have. They can provide useful tools, but they are not the locus of decision making; that rests in the place that they were created, and may be distorted by either accident or design.
Expert systems were the big success of GOFAI, but they fell out of favour in the last AI winter, at the end of the '90s or so, clearing the way for probabilistic inference and statistical machine learning.
Since then it seems, we took one step forward (with accuracy in classification) and one step back (with the loss of the ability to explain decisions).
Who knows, maybe a new AI winter will wipe out the statistical machine learning dinosaurs of today and leave a clear field of play for the AI Mammals of tomorrow.
It's a survivor of the AI winter, and what we have is basically a generalized expert system. Every conclusion incorporates cross-domain knowledge and can fully explain itself. It's been a long, slow trudge of research and development over the past couple decades, but we're starting to poke our heads out and ride the current AI hype wave. We like to say that Watson is our marketing department.
Did you forget a /s there? For me the marketing around Watson, with the unrealistic and not scientifically backed assumptions about the state of AI they put out to the general public, epitomize what's wrong with how the world sees our field.
More on topic: the underlying problem with the original post here seems to be one of selection bias in the training data. Somewhat of a remnant of human decision making that, likely unintentionally, ended up in creating biased decisions in the end. While the system you linked seems to potentially create nicely decomposable decisions, is there anything that inherently prevents "bias" in what it learns? The approach seems to face many of the same problems modern ML systems in the Linked Data / Semantic Web space would given unrestricted learning from the web.
IBM's marketing of Watson as a medical application is one thing. The system itself is quite another. And the system itself remains the most advanced NLP system created so far.
Note that I say "system". Most NLP research consists of testing the performance of various algorithms against very specific benchmarks, but very little work goes towards creating a unified system that can integrate multiple NLP abilities. Watson on the other hand is exactly such a unified system. It integrates statistical machine learning, symbolic reasoning (frames, fer chrissake, frames! In this day and age!), pattern-matching (with Prolog) and so on and so forth. Far as I can tell there isn't anything like it anywhere - but of course, I don't know much about what Google, Facebook et al are doing internally.
In terms of systems I'm not sure why IBM would stand out, Microsoft, Amazon and Google all offer relatively coherent pipelines for the engineering side of NLP and all of those are backed up by accomplished research teams. I'm sure you can find everything from total horror stories to beautiful testimonials for all of these platforms.
I think these two companies in particular would be very unwilling to design and build a system like Watson, integrating symbolic techniques alongside statistical ones. They probably recruit so much for statistical machine learning skills that they don't have the know-how to do it anyway.
The funny thing is that, like many large corporations, they probably have ad-hoc expert systems except they don't call them that. Pretty much any sufficiently large and complex system that encodes "business rules" is essentially an expert system. But, because "expert systems failed" companies and engineers will not use the knowledge that came out of expert system research to make their systems better.
That AI winter really got us good.
The main way our system would combat bias is by shedding light on it. No human-built system could be completely impartial, but ours will say "I decided X because of A, B, C, and D", and if people decide that C is biased then that piece of knowledge can be adjusted accordingly.
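A minimal sketch of the kind of auditable decision described above (the rules here are hypothetical, not the actual system's knowledge base):

```python
# Each rule is a named, inspectable piece of knowledge. A decision carries
# the full list of reasons behind it, so a contested rule (say, C) can be
# identified and adjusted without touching the rest.

RULES = [
    ("A: income above threshold", lambda a: a["income"] >= 30000),
    ("B: debt ratio below 0.4",   lambda a: a["debt"] / a["income"] < 0.4),
    ("C: no defaults on record",  lambda a: a["defaults"] == 0),
]

def decide(applicant):
    fired = [(name, rule(applicant)) for name, rule in RULES]
    approved = all(ok for _, ok in fired)
    return approved, fired  # decision plus the reasons it rests on

approved, reasons = decide({"income": 40000, "debt": 10000, "defaults": 0})
print(approved)  # True
for name, ok in reasons:
    print(name, "->", "passed" if ok else "failed")
```

The point is not the toy rules but the shape of the output: "I decided X because of A, B, C", which is what makes the bias auditable at all.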
Because it seems like, currently, this process is an entirely human-driven, manual task in which you have people making the decisions on how to connect and add new relations/knowledge to each other.
Initially the BK comes from some existing source- it can be a hand-crafted database of a few predicates deemed relevant to the learning task or a large, automatically-acquired database mined from some text source, data from the CYC Project of course, etc. In any case, because of the unified representation, learned hypotheses (the "models") can be used immediately as background knowledge to learn new concepts.
Edit: I don't know if the Cyc project uses ILP. But what the comment above says is doable.
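The learn-then-reuse loop described above can be sketched in miniature (a toy of my own making, not Cyc's or any real ILP system's API): because facts and learned rules share one representation, a learned hypothesis is immediately usable as background knowledge.

```python
# Background knowledge as ground facts; a "learned" rule is just another
# inference procedure appended to the rule list and applied like the rest.

facts = {("parent", "ann", "bob"), ("parent", "bob", "cat")}

def grandparent(facts):
    # learned hypothesis: grandparent(X, Z) :- parent(X, Y), parent(Y, Z)
    derived = set()
    for (p1, x, y) in facts:
        for (p2, y2, z) in facts:
            if p1 == p2 == "parent" and y == y2:
                derived.add(("grandparent", x, z))
    return derived

rules = [grandparent]  # learned hypotheses join the background knowledge
for rule in rules:
    facts |= rule(facts)

print(("grandparent", "ann", "cat") in facts)  # True
```

Once `grandparent` facts are in the knowledge base, a later learning task can condition on them exactly as it would on the hand-entered `parent` facts; that's the uniformity being described.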
1) We're able to map outside data from a DB into Cyc's knowledge format, rather than hand-encoding it. This knowledge is inherently not as rich as the rest, but it can obviously be useful anyway.
2) At some point we hope to reach a critical mass of knowledge that will allow Cyc to simply "import" a Wikipedia page by parsing and understanding it. It will interpret a given sentence into its own understanding, then assert it as true and do reasoning based on it down the line.
So you’re basically saying that it is possible to enumerate and solve for everything possible by manually iterating through every single possible edge case that exists. There’s a reason why Cyc has spent over 30 years and only has been able to get this far. You’re fundamentally limited by human constraints. The only realistic way of achieving a general purpose learning system is by teaching a system how to learn and then letting it figure things out on its own. Patently some method involving reinforcement learning.
By the way, it’s not like RL is some newfangled thing. Much of it started during the 80’s, as it concurrently developed with the other purported method of developing intelligence, which was through expert systems.
If you're interested, I highly recommend checking out a lecture that Demis Hassabis gives talking exactly about this issue: https://www.youtube.com/watch?v=3N9phq_yZP0
(Not the Cyc person).
Your comment is arguing for an end-to-end machine learning (specifically, reinforcement learning) approach. However, modern statistical machine learning systems have demonstrated very clearly that, while they are very good at learning specific and narrow tasks, they are pretty rubbish at multi-task learning and, of course, at reasoning. For breadth of capabilities they are no match for expert systems, which can generally tie their shoelaces and chew gum at the same time. Couple this with the practical limitations of learning all of intelligence end-to-end from examples and it's obvious that statistical machine learning on its own is not going to get much farther than rule-based systems on their own.
Btw, reinforcement learning is rather older than the '80s. Donald Michie (my thesis advisor's thesis advisor) created MENACE, a reinforcement learning algorithm to play noughts-and-crosses, in 1961. Machine learning in general is older still: Arthur Samuel baptised the field in 1959, but neural networks were first described in 1943 by McCulloch and Pitts. As Geoff Hinton has said, the current explosion of machine learning applications is due to large datasets and excesses of computing power, not because they're a new idea that people suddenly realised has potential.
Rule-based systems fail catastrophically the moment they encounter something that has not been codified into their rule set. I point to chess as a prototypical example, with two engines: Stockfish and AlphaZero. Stockfish, an expert system manually and meticulously designed over the course of decades, is handily defeated by AlphaZero, a reinforcement-learning-based system that trains purely through self-play.
If you look at any of the sample games between the two AIs, you can see a distinct difference in style. In colloquial terms, Stockfish acts far more "machine-like", whereas AlphaZero plays with a "human grace and beauty", according to many of the grandmasters who commented on its play. This is because Stockfish has certain inherent biases caused by the brittleness of its codified rule set, which cause it to make sub-optimal moves in the long run, whereas AlphaZero is free from the constraints of any erroneously defined rules, allowing it to do things like sacrifice its pieces as a strategy. Meanwhile, because Stockfish codes in the loss of a piece as negative points, it has to overcome this bias every time it might choose such a move, pushing its search toward lines where it doesn't have to sacrifice pieces, which its rule set scores as more optimal.
>> The future of generalized intelligence will not be based in brittle datasets, but purely based off repeated self-play, allowing for the bootstrapping of an infinite amount of possible data.
How will general intelligence arise through self-play, a technique used to train game-playing agents? There's never been a system that jumped from the game board to the real world.
I've never used Cyc but I have used OpenCyc and I'm familiar with some of the applications of Cyc. It's interesting when it works.
Not sure I'm sold on LIME and other similar approaches, though. Seems like a lot of deep learning people are all too happy to substitute "intepretability" for actual explanations.
Decision trees are in fact an example of the early years of machine learning, when the trend was towards algorithms and techniques that learned symbolic theories. I believe the effort was driven by the realisation that expert systems had a certain problem with knowledge acquisition, which drove people to try to learn production rules from data.
I digress. I mean to say that decision trees are explainable because their models are symbolic rather than statistical.
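To illustrate with a hand-built toy tree (my own example, not from any particular learner): the prediction is just a path of symbolic tests from root to leaf, so the explanation falls out for free.

```python
# Tree nodes are either a leaf (the decision, as a string) or a triple of
# (test, left-subtree-if-true, right-subtree-if-false).

tree = ("age < 25",
        ("income < 20000", "deny", "approve"),  # taken when age < 25
        "approve")                              # taken when age >= 25

def predict(node, sample, path=()):
    if isinstance(node, str):  # reached a leaf: the decision itself
        return node, path
    test, left, right = node
    feature, _, threshold = test.partition(" < ")
    took_left = sample[feature] < int(threshold)
    taken = f"{test}: {'yes' if took_left else 'no'}"
    return predict(left if took_left else right, sample, path + (taken,))

decision, path = predict(tree, {"age": 22, "income": 18000})
print(decision, "because", " and ".join(path))
# deny because age < 25: yes and income < 20000: yes
```

The path *is* the explanation; there's no separate interpretability step, which is the contrast with statistical models being drawn above.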
To be honest, I don't know much about additive models.
I tried some quick DuckDuckGo-ing to see if anything new turned up, but it was drowned out by unrelated stuff. I did find what looks like a great, quick overview of expert systems for folks unfamiliar with them. It might become my new default link to share about how they were historically perceived. What do you think of this one?
I think that advances in probabilistic reasoning & modelling, such as practical Bayes networks, should be included, and the mechanics of resolution have improved massively with the introduction of answer-set systems; this gets over the problem of commitment that kiboshed gen 5.
A documentary called "Murder on a Sunday Morning": https://www.youtube.com/watch?v=LFLbptkb1eM
A woman was murdered in front of her husband. The man saw the attacker up close and was the only eyewitness.
He accused a completely innocent man, and it took half a year of jail and court-related turmoils for this to get cleared up.
We explain how to understand handwriting to every single child that goes to school. It's not the answer you want, but it's the answer that actually matters here.
Trying to equate AI and human cognition in this way is completely disingenuous.
Human reasoning is not 100% reliable, but we know very well in which ways it's unreliable and how to deal with it. We have shared biology and millennia of experience trying to empathize and communicate with others.
And your last point is wrong.
ML models are studied and understood much better than human reasoning.
Teaching children to read is an interactive process that has pretty much nothing in common with data steamrolling in modern machine learning.
>And your last point is wrong. ML models are studied and understood much better than human reasoning.
Is that why new ANN architectures are almost universally constructed by trial and error?
Algorithms can be explainable, and I agree that anything that affects your standing in the eyes of government should have that as a minimum requirement.
There's a recent review that they published.
I do not want to speak for the parent but I think you might be on a different level of abstraction when talking about "explaining" things. Consider this for example: someone writes a letter to your boss that you should be fired. That someone does not need to explain the process of writing but probably should be required to explain why you should be fired.
On the other hand, systems such as laws and lending, which are inherently about social interaction, do tend to have some unspecified human element. Judges interpret the law and apply it to specific cases, underwriters have latitude to make exceptions under certain vague circumstances, teachers may regrade a paper upon realizing they misspoke in a lecture. This is a feature, not a bug--if our social systems have no room for empathy then there is a big problem.
So now that AI is "unexplainable", how is this worse than the unexplainability of human systems? You can ask a human why they took some action, but their explanation tends to be incomplete, wrong, a lie, or maybe they don't recall. I once was given the opportunity to lease a top-floor apartment in my building because I had a daily chat with the receptionist. All fair housing laws were followed, I just happened to be the first to know because I had such frequent interaction. If you were to ask the receptionist why I got the apartment she probably wouldn't say "zwkrt tells me bad jokes and we share pictures of our pets", but that probably is part of the reason.
I think fundamentally we understand that humans share a way of life and a set of ineffable values, and we believe that computer-generated results do not share these values. Unfortunately I do not have a conclusion, just a formulation of a problem.
I could get behind a simple and transparent tax system, where you just see in realtime what money you give while being sure that the big companies have to do the same, without anyone gaming the system.
But I am not sure the system that decides on these rules should be another system.
Also, my perception benefits from the fact that human beings have a shared brain architecture, and the way I see stuff is much the same as everybody else, so whatever half-assed explanation I can give is intimately understood by other people. In contrast, ML models are completely alien to us.
You can understand how the model works and the math behind it but you will be hard pressed to understand the exact path behind a particular choice.
Decision trees are one exception to this. Most humans can understand those.
Is the decision reached by human members of a jury, for example, explainable in the way you mean?
The gentleman is saying that the system selects a small subset of the 350,000 mugshots, but because a human selects one mugshot from this small subset, there is no bias.
That just makes no sense.
[edited to remove stronger language]
Of course, with the error rates reported on certain groups a human making the final decision is not enough.
Let’s assume the photo was not a match for the database. Now a human doing the final step is essentially picking a random face from a biased sample.
Worse, people are really bad about assuming whatever option they are considering to be far more likely than the base rate. Read up on a random obscure disease and suddenly you start thinking it’s a real risk. Sadly, police do the same thing with criminal suspects, resulting in innocent people in prison.
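The base-rate arithmetic is worth spelling out (numbers invented for illustration, not from the article): even a matcher that's wrong only 1% of the time produces overwhelmingly false positives when at most one of 350,000 mugshots is the real suspect.

```python
# Back-of-the-envelope false-positive estimate for a large mugshot database.

database_size = 350_000
true_matches = 1              # at most one correct answer in the database
false_positive_rate = 0.01    # assume 1% of non-matches get flagged anyway

expected_false_positives = (database_size - true_matches) * false_positive_rate
print(round(expected_false_positives))  # 3500 innocent candidates flagged

# chance a flagged face is actually the suspect
# (optimistically assuming the true match is always flagged too)
posterior = true_matches / (true_matches + expected_false_positives)
print(round(posterior, 5))  # 0.00029
```

So the "human making the final decision" is picking from a pool that is ~99.97% wrong, which is exactly the base-rate trap described above.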
Besides the article's obvious anti-man, anti-white bias, I'm not surprised that facial recognition software has a hard time analyzing darker skinned people.
With photography, I've always had a hard time photographing someone who had dark skin. Lots of light needs to be used, and even then it needs to be filtered correctly, etc. This goes for any subject that is dark.
Unfortunately, this is going to be a tough problem. Cameras used by law enforcement and government agencies (which the article seems to focus on) are normally pretty shitty; software can only do what it can with whatever input it gets. So if the lighting and image quality are shitty, then your results will be equally shitty.
The article doesn't go into what kind of equipment the MIT researcher was using, but I will assume that it is a high-quality camera. If so, and if the software is still failing as the article alludes to, then yes, these companies need to make their software better.
Even so, it's a crapshoot from the get-go, due to the hardware being used.
The issue does come down to a mix of (likely) inadvertent racism and physics.
It is very difficult to photograph people with darker skin. This is intentional, per Mother Nature. Very briefly, our atmosphere passes a certain bandwidth of colors to the ground. The peak frequency is in the yellow-green (this is why plants are usually this color). Unfortunately, part of that bandwidth is in the UV range, at frequencies that happen to be absorbed by DNA. Thus, cancer risk is raised when a UV photon energetically breaks up the DNA. So, nature evolved to co-opt the melanin molecule, which happens to absorb those UV photons, to help protect skin cells from this damage. The thing is, though, that melanin absorbs not just that UV light but a lot of other colors too. PhysicsGirl has a good video on how sunscreen works and what a freckle looks like under different wavelengths that is informative here.
Now for the (inadvertent) racism. When they were first starting to make photographs, just about any photosensitive material was 'good enough'. These, of course, were all in black and white. But the frequencies/colors these chemicals were photosensitive to were much broader than what we can see with our eyes. It went from the low IR to the high UV, with all kinds of various notches and mixes in there. Tin-type photographs are a great example. During the US Civil War, there were a lot of pictures taken of slavery and 'colored' soldiers. It's not hard to distinguish the faces and features of these people. One thing to note about these tin-types: look at the clouds, or rather, the total lack thereof. That's an easy way to notice that the frequencies you are seeing in the photograph are not the frequencies that your eyes see.
When color photography started to become more widespread, the frequencies used for each color (CMY) and the absorptive bandwidth of each layer of the color film were not accidents. They were chosen so that attractive young 'white' women in very heavy make-up, under very bright 1930s studio lights, would be best seen. These color choices, due to the inherent racism of the times and other factors, were perpetuated into the modern CCD and CMOS chips we use today. There are important differences in the physics of digital and film color photography, but largely those original choices have been conserved.
One important difference is the IR spectrum pick-up. In most modern phones, the front facing camera does not have an IR filter on it. Next time you are near a security camera or at a toilet with those IR sensors, take a look at your phone using the front-facing camera. It should be straightforward to see the little blinking IR LEDs of these devices.
So, though darker skinned people are inherently difficult to photograph due to melanin and Mother Nature, it is not difficult to find the right frequencies that will work. It's a case of the 'lock in' effect that prevents this.
The hardest thing to take a photo of is a black haired cat, and I don't think that's because Adobe is biased agains black cats.
1. There have been instances of photos of black people being postprocessed to make their skin darker. For example, OJ Simpson's mug shot on Time's cover was "artistically interpreted", and attack ads against Obama darkened his skin. There are also examples of Kerry Washington's skin being made lighter.
2. Some people interpreted darkening photos as a political move, embodying a belief that dark skin is scary. And lightening the photos as embodying a belief that light skin is more beautiful.
3. Photographers can't keep their hands off the postprocessing tools. And even if they could, they've still got to choose an aperture and exposure when they take the photo.
4. Some people would say, because lightening and darkening black people's skin is political, and photography can't avoid lightening and darkening, photography is inherently political. (and that saying you don't photograph black people is more political, not less)
Of course, some would see that argument as a bit of a stretch. I can understand where it's coming from I just don't agree - or at least, I'm not sure what it means in terms of my personal actions.
What I don't understand is how the article seems to imply that there's something wrong with that. Consider that
- Photography was invented and for the longest time mostly used in a majority white area.
- It makes a lot of sense that, in predominantly white cultural spaces, lighter skin is preferred, as it historically indicated that a person didn't need to work in the open, which implied some degree of wealth.
- Most importantly, it seems to not understand how light works. Dark surfaces reflect less light, so there's less contrast between strongly lit areas and shadows, meaning it's both harder to make a photo look clear and for an AI to analyze it.
The reason I commented is that the article makes a lot of complaints that drum up outrage. As far as I know, all of the complaints are much better explained by either 1) less light being harder to photograph, or 2) photographing bright things and dark things at the same time being harder than photographing just bright things or just dark things.
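Point 1 reduces to a signal-to-noise argument, which a back-of-the-envelope sketch makes concrete (albedo and noise numbers are invented for illustration):

```python
# With a fixed sensor noise floor and fixed scene lighting, a surface that
# reflects less light yields a lower signal-to-noise ratio, so the same
# camera resolves less usable detail on darker subjects.

sensor_noise = 5.0    # arbitrary noise floor, identical for every pixel
illumination = 100.0  # same lighting for every subject

def snr(albedo):
    signal = illumination * albedo  # reflected light reaching the sensor
    return signal / sensor_noise

print(snr(0.60))  # higher-albedo subject: SNR 12.0
print(snr(0.15))  # lower-albedo subject:  SNR 3.0
```

Same camera, same light, 4x less signal over the same noise; that gap is physics, before any algorithmic bias enters the picture.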
Original study [PDF]: http://proceedings.mlr.press/v81/buolamwini18a/buolamwini18a...
New study [PDF]: https://www.thetalkingmachines.com/sites/default/files/2019-...
What about lighter-skinned women? This seems to be phrased on purpose to incite bias against white men.
"Darker-skinned women were the most misclassified group, with error rates of up to 34.7%. By contrast, the maximum error rate for lighter-skinned males was less than 1%."
The fact that the Amazon gender classifier misidentifies 7% of white females as male, but 0% of white males as female, is very odd. That seems like a JV-level bias-tuning mistake. That's not systemic violence so much as it is just sloppy.
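The per-group error-rate audit being discussed is simple to compute; here's a sketch on fabricated predictions (none of these numbers are from the actual study):

```python
# A disparity audit is just a grouped confusion count:
# error rate per demographic group = misclassifications / total, per group.

from collections import defaultdict

# (group, true_label, predicted_label) -- made-up examples
preds = [
    ("lighter_male",   "male",   "male"),
    ("lighter_male",   "male",   "male"),
    ("lighter_female", "female", "male"),   # the 7%-style mistake
    ("lighter_female", "female", "female"),
    ("darker_female",  "female", "male"),
    ("darker_female",  "female", "male"),
]

errors, totals = defaultdict(int), defaultdict(int)
for group, truth, pred in preds:
    totals[group] += 1
    errors[group] += truth != pred

for group in totals:
    print(f"{group}: {errors[group] / totals[group]:.0%} error rate")
```

The point is that this audit is cheap to run, which makes shipping a classifier with large, unexamined per-group gaps look sloppy rather than inevitable.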
I have some suspicion they haven't focused on bringing down the error rates for minority groups because they know even their best case isn't good enough.
That's exactly the kind of thing that a good AI engineer should be looking out for.
No. Racism in the legal system, and our society overall, is not "new." However it's still a real problem in our society. As such, race- and gender-bias that's "accidentally" hard-coded into software sold to police is problematic. Every new product containing this flaw is news.
> Why should it come as a shock that facial recognition product developers suffer from the same bias?
At the very least, it should come as a shock that one of the biggest software development firms in the world is attacking critics in defense of buggy behavior. Especially since, as you note, this phenomenon is common knowledge and there's a simple and well-known workaround.
Yes, if you're offering a product for sale, it's fair for people to review that product in public. And that's not "naming and shaming," it's a critical review of an irresponsibly-developed product being sold to law-enforcement agencies.
> In fact, I would expect law enforcement to test this software thoroughly themselves, because that's their job.
Yes, that is also their responsibility. But where is the recourse? Law enforcement agencies have abysmal records of self-investigation, and the judicial system is unreliable at best, in holding law enforcement agencies accountable.
The public has a right to know what technology is being used to police them. If you want to call investigative journalism "naming and shaming," then yes, absolutely, she made the ethical choice in speaking out.
"Those disparities can sometimes be a matter of life or death: One recent study of the computer vision systems that enable self-driving cars to “see” the road shows they have a harder time detecting pedestrians with darker skin tones."
Just think about the consequences of deploying such systems.
I see the point you are making, but to the extent that an AI can do better than a human due to the physical aspect, for example, of dark skin reflecting less light at night, I think we should at least try.
“Predictive Inequality in Object Detection”
I’m not denying the conclusion of the paper, but it does have a lot of limitations. Also, Tesla notwithstanding, isn’t object detection primarily done via lidar?
It's not even a bias problem. It's a lighting, contrast, and camo problem. You could as well claim a bias between asphalt and concrete pavement, based on what skin tones are more visible with the pavement as a background.
For pattern recognition, no, the computer vision system can't best a human.
I'm saying it could do so, while still having bias. The existence of bias doesn't make the computer worse than humans.
With that said, fairness in ML/AI is a real problem, and some people are doing some really good/important work in this area. I am not familiar with Buolamwini's work, but I'm much more inclined to believe disparities in facial recognition than pedestrian detection where very little skin is visible and almost everything you're seeing is clothing.
Those systems are already widely deployed - they're called 'humans'. All vision systems are going to be more likely to confuse like colours.
This problem is quite a bit different from the 'mugshot' problem, or a problem wherein there was a bias or lack of training data for certain samples.
The author I think conflated a lot of the issues and just boiled it down to 'evil AI' which I don't think is the right thing to do.
Issues of ethnic orientation are quite a bit different from systems that have trouble literally due to the colour of something.
As for Amazon's self-serving response, it should be given the same degree of respect as any other statement by an entity that is not prepared to discuss it in an interview.
There will always be some minority group that the product fits less well with, because you can divide people into an infinite number of groups based on an infinite number of traits.
This is a consequence of R&D resources being finite and the complexity of the world being unbounded. It's not a sign of a moral failing or misplaced priorities.
Here's a thought experiment to illustrate the principle I'm trying to get across. Suppose members of a very small ethnic group suffer disproportionately from some deficiency in software that does a poor job of recognizing features common to them, and this leads to an extra 50 people dying each year. Suppose also that the resources it would take to fix this deficiency could instead be used to reduce the overall accident rate by 5%, leading to 5,000 fewer people dying each year, but only 5 fewer members of that ethnic group. Should we target the resources at reducing the accident rate for the ethnic group, just because the group they find themselves in happens to be ethnic?
>>If it costs a bit more to do the right thing, so be it.
In this case, the cost is more lives being lost, and the right thing is only the right thing according to an arbitrary and flawed value system that you are submitting as the ideal.
I have no problem with more resources being spent to "fix a problem", as long as it doesn't come at the expense of resources being spent to fix more serious problems. There is nothing inherently more unfair about members of a particular race suffering from a poorer experience with a particular software program than members of a particular psychological profile, height, or distance-between-the-eyes group facing a poorer-than-average experience with the software.
The objective should be to reduce how many people in total die as a result of the software's flaws. Showing what particular groups are "measurably affected", meaning have traits that the software does less well with, does not provide any valuable information. It ignores the whole to focus on an arbitrarily elevated part deemed more important than other parts. Because if the software fails in 7% of cases with anti-race-disparity development priorities, and 5% of cases when development prioritizes population-wide performance, you are sacrificing more people of other groups to get better results with a favored group.
As for unfairness: we can slice and dice the statistics to show less or more disparity between the average and a disadvantaged group, by selectively manufacturing group categorizations that produce more or less disparities (a group can be anything: people with dark skin, people with wide-set eyes, people with small chins). It's impossible for the software to perform equally well with all people, unless it is perfect. Getting to perfection is more efficiently done by focusing on improving the statistics in relation to the whole of the population, rather than any subset of it.
So to summarize: choosing race or skin-color as the categorization determinant is not objectively any more moral than choosing any other trait, and trying to find groups that are exceptionally disadvantaged is an impossible feat because an infinite number of groups can be created using an infinite number of trait combinations.
Here in DC there was an example a while back, prior to legalizing marijuana: white people apparently used at a higher rate, but most of the prosecutions were of black people, both due to heavier police presence and because demographics meant that white users tended to have more privacy (limited visibility from the street, more distance between houses/sidewalks making smell harder to notice, etc.), which made it harder to get evidence clearly showing that a specific person had been the one using. The process could be fair without changing the fact that the results disproportionately impacted one group.
Some races happen to correlate with some selected traits more than others, but race is not the trait selected for.
It's completely predictable that not all traits of interest for law enforcement will be distributed equally across all racial groups. To treat this fact as a sign of systemic racism is to guarantee that you will consider every society on Earth systemically rac/sex/[group] ist.
This does commonly fall along racial lines in countries like the United States with a long history of racial discrimination, but it's not exclusive, and it's important for anyone building systems to consider pitfalls like this, because we know users are likely to assume that a computer is unbiased.
The conclusion is that the system is biased towards a particular race. The reasons WHY it was biased makes no difference to the fact that the system IS biased.
There are 5x more white people than black people in the US (roughly). So... it's kind of understandable that any training data set would contain more white people than black people if you didn't filter it beforehand?
I guess theoretically you would have to limit the size of the training data for all groups to the size of the set of training data you have on the smallest group, so that all groups could be represented equally in the training data, but then perhaps you get a far less accurate model overall and it results in greater negative consequences?
Just spitballing here. But it's a complex problem.
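To make the idea concrete, here's a minimal sketch (not from any real pipeline; the function name and data are made up) of undersampling each group down to the size of the smallest group before training, which is the tradeoff described above:

```python
import random

def undersample_balanced(examples, key, seed=0):
    """Limit every group to the size of the smallest group.

    `examples` is a list of dicts; `key` names the field used to group
    them (e.g. a demographic label in the training metadata). This is
    the naive approach: you get equal representation, but you throw
    away data from the larger groups, which may hurt overall accuracy.
    """
    random.seed(seed)
    groups = {}
    for ex in examples:
        groups.setdefault(ex[key], []).append(ex)
    cap = min(len(g) for g in groups.values())  # size of smallest group
    balanced = []
    for g in groups.values():
        balanced.extend(random.sample(g, cap))
    random.shuffle(balanced)
    return balanced

# Hypothetical 5:1 imbalance, mirroring the rough US ratio above.
data = [{"group": "A", "i": i} for i in range(500)] \
     + [{"group": "B", "i": i} for i in range(100)]
balanced = undersample_balanced(data, "group")
# Each group now contributes exactly 100 examples.
```

Alternatives like oversampling the smaller group or reweighting the loss avoid discarding data, but each has its own failure modes.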
The whole 'racist AI' thing screams Conway's Law to me.
> Please don't use Hacker News primarily for political or ideological battle. This destroys intellectual curiosity, and we ban accounts that do it.
> Eschew flamebait. Don't introduce flamewar topics unless you have something genuinely new to say. Avoid unrelated controversies and generic tangents.
I think it's probably more likely the second, just because when there is a significant classification problem with a Bayesian or other type of machine learning algorithm, it often turns out that the corpus had poorer examples of the problematic class relative to the others. That is to say, it is a known problem with a predictable result.
But we probably should do some studies on skin reflectance; that's a good idea.
Why is there necessarily no problem with a bias if it comes from a physiological difference? If a certain group of people gets falsely convicted of crimes at a disproportionate rate because some computer program misidentifies them more frequently than it does other groups, that is a huge problem that needs to be fixed, even if the misidentification is due to some inherent difference between the group.
I assume you also oppose things like wheelchair ramps and Braille signage, since any problems handicapped people have with stairs or printed signs is due to physiological differences?
What I’ve seen is differential comparisons, e.g. comparing the rate of white detection to black detection or the difference in certainty scores on each — but I’d really appreciate it if people could show me the actual certainty numbers on black faces, so I can see if it’s failing to recognize, misrecognizing, or just less sure than on white faces.
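Those failure modes are distinguishable if you have the raw scores. A minimal sketch (group names, scores, and threshold are all made up for illustration) of the kind of breakdown being asked for:

```python
from statistics import mean

def confidence_breakdown(detections, threshold=0.5):
    """Summarise raw detector confidence scores per group, to tell
    "never detected" apart from "detected, but with lower scores".

    `detections` maps a group label to the list of confidence scores
    the detector produced for ground-truth instances of that group,
    with 0.0 recorded for instances it missed entirely.
    """
    report = {}
    for group, scores in detections.items():
        report[group] = {
            "mean_score": round(mean(scores), 3),
            "missed": sum(s == 0.0 for s in scores),
            "below_threshold": sum(s < threshold for s in scores),
        }
    return report

# Hypothetical scores: group_b is both missed more often and scored
# lower when it is detected.
scores = {
    "group_a": [0.9, 0.8, 0.0, 0.7],
    "group_b": [0.6, 0.4, 0.0, 0.0],
}
report = confidence_breakdown(scores)
```

A differential rate comparison alone collapses all three failure modes into one number, which is exactly the complaint above.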
Regardless of whether the higher error rate is a function of race, gender, or both, it's still a huge issue. Granted, that study was from a year ago, and other companies have since improved their facial recognition systems. But an overall precision/accuracy/f1 score doesn't mean much when accuracy varies that much by group. Sure, you can market it as "accurate on white males", but you can't market it as "accurate".
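Computing the metrics per group instead of in aggregate makes this visible. A stdlib-only sketch (the group labels and records are invented for illustration) for a binary match/no-match task, as in face verification benchmarks:

```python
def per_group_metrics(records):
    """Precision/recall/F1 computed separately for each group.

    `records` is an iterable of (group, y_true, y_pred) tuples. An
    aggregate score can hide a group whose error rate is far worse
    than the population-wide average.
    """
    counts = {}  # group -> [tp, fp, fn]
    for group, y_true, y_pred in records:
        c = counts.setdefault(group, [0, 0, 0])
        if y_pred and y_true:
            c[0] += 1        # true positive
        elif y_pred and not y_true:
            c[1] += 1        # false positive
        elif y_true and not y_pred:
            c[2] += 1        # false negative
    out = {}
    for group, (tp, fp, fn) in counts.items():
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * p * r / (p + r) if p + r else 0.0
        out[group] = {"precision": round(p, 3),
                      "recall": round(r, 3),
                      "f1": round(f1, 3)}
    return out

# Hypothetical records: (group, ground-truth match, predicted match).
records = [
    ("a", True, True), ("a", True, True), ("a", False, False), ("a", True, False),
    ("b", True, True), ("b", True, False), ("b", False, True), ("b", True, False),
]
metrics = per_group_metrics(records)
```

On this toy data, group "a" scores well while group "b" does not, even though a pooled F1 over all eight records would look middling rather than broken.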
If people plan on spending time on activism, there are plenty of real issues with Amazon.
We don't want autonomous killer bots to not be able to tell black people apart just because their creators can't.
Alternatively, and more likely, measures would be taken to make it not autonomous on paper, e.g. requiring a human operator to approve any action the robot is intending to take. In practice, this would likely be one of those "moral crumple zones" with little practical meaning.