I will say that 3 years ago when we tested all the major vendors' image recognition technology, they all failed spectacularly. And this wasn't some small project: we ran tens of thousands of our images through (out of over a million) and worked with several PhD students in our efforts.
A picture with a white college kid in the frame would get the output of "human". Put an African borrower in the frame and you got at best a failure to recognize a human, and at worst a reference to an animal.
I would hope the situation is much better now, but the bias (and just sheer inaccuracy) of the tools was readily apparent and we gave up on image recognition for the time being.
There are real technical challenges in producing photographs of both light and dark skin, but the focus on light-skinned subjects has led to an accumulation of technical solutions for them.
Computerized facial recognition has been progressing for something like 50 years, with each paper building on previous work. I suspect a parallel world where the dominant social group had dark skin would explore different technical avenues, focus on ways of recognizing faces that lean less on tone, etc. Obviously it's hard to say with what we know now.
Humans have difficulty recognizing cross-ethnic faces; however, I don't think dark-skinned people are worse at recognizing same-ethnic faces than light-skinned people. To me that suggests our social history is more to blame than some fundamental quality of skin tone.
It goes even deeper than that. If you look at the history of photography, film, and television you'll see a bias towards white skin tones that standards have been built up around. Just looking at "leader ladies" or common reference images, there isn't a bunch of diversity. In practice I've heard cinematographers and photographers bring it up around filming different ethnicities. It starts with changing the exposure, but it also means you have to pay more attention to color reproduction.
This can extend beyond the film or image sensor to biasing lighting equipment, defaults, standards, and thresholds.
Looking through the literature you'll see things like, "the design decisions in the NTSC’s test images effectively biased the format toward rendering white people as more lifelike than other races"
I'm mostly familiar with the history in the US. I'm earnestly curious about how other countries accommodated these types of things as they adopted the technologies.
It's deeper than that. The chemicals used in analog color photography didn't have a dye profile that could capture people of color effectively until the 80s.
Even deeper: Kodak only started putting effort into getting the process right in the 70s, after customers complained about poor color reproduction[1] in wood grain and chocolate.
I recall being amused maybe 3 years ago when I took a picture of two friends, one with dark skin, one with light skin, and the camera chose to expose for the darker skin. He looked fine; the lighter-skinned friend looked like snow lit by headlights. Completely washed out.
The camera settings and background lighting make a big difference.
FYI: Wedding photographers have dealt with the contrast between white dresses and black suits/tuxedos every day for the last century.
Some notes:
- the eye can handle the dynamic range, but there's no correct exposure for that photographic combination, so they play with lighting, exposure bracketing and dodging when printing
- it's recommended that brides wear off-white (like cream) or other colors of fabric if they're concerned about the photos
A RAW photo (raw input from the camera sensor, without processing) can often be used in post production to get more range out of a photo than would be possible with a JPEG. One of my desktop backgrounds I often use was actually quite overexposed due to the sun shining through trees. I was able to correct for it in Lightroom and it's now one of my favorite shots.
That's exactly what (exposure) bracketing is for. You take three photos at different exposures and merge them.
This is also how a lot of HDR photography is done.
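For anyone curious what the merge step amounts to, here is a minimal sketch. It assumes the frames are already aligned and stored as float arrays in [0, 1], and it simply weights each pixel by how close it is to mid-gray, which is a simplified form of exposure fusion and not what Lightroom or Photoshop actually do. The function name and the `sigma` parameter are made up for illustration.

```python
import numpy as np

def merge_brackets(frames, sigma=0.2):
    """Blend aligned bracketed frames, favoring well-exposed pixels.

    frames: list of float arrays in [0, 1], all the same shape.
    sigma:  hypothetical tuning knob for how strongly mid-gray is favored.
    """
    stack = np.stack(frames).astype(np.float64)
    # Weight peaks at 0.5 and falls off toward clipped blacks and whites.
    weights = np.exp(-((stack - 0.5) ** 2) / (2 * sigma ** 2))
    # Normalize so the weights for each pixel sum to 1 across the frames.
    weights /= weights.sum(axis=0, keepdims=True)
    return (weights * stack).sum(axis=0)

# Made-up relative scene brightness, shot at -2 EV, 0 EV and +2 EV.
scene = np.array([[0.02, 0.10], [0.40, 0.90]])
frames = [np.clip(scene * ev, 0.0, 1.0) for ev in (0.25, 1.0, 4.0)]
merged = merge_brackets(frames)
print(merged)
```

The result is a tone-compressed blend rather than a true radiance map; real tools also handle alignment and ghosting from moving subjects, which this skips entirely.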
That's one way, but it's a lot of work for more than one photo. Going from setting adjustment to photoshopping is a huge step up in terms of labor and cost.
And that is why you pay a wedding photographer instead of just asking uncle Herbert to take some pretty pictures with his DSLR.
Mind you, the hobbyist uncle may produce great photos and may be a wizard with Lightroom and Photoshop. But a lot of hobbyists try to compensate for skill with gear or, more charitably, may not even be aware of the related skills and possibilities.
So in the end the snapshots your friend Andy took with his 2 year old iPhone might look better (have better automatic processing) than those photos Herbert took with his expensive prosumer or even pro kit.
I learned that one convention group I'm a part of converged on very visible glossy white badge cards not just because it is easy to sharpie names visibly on (in classic "Hello my name is" style), but also out of a request from the photography group as it is a very useful meter reference visible in almost every photo of any group member.
Seeing how many great photos come out of the event in sometimes very strange lighting conditions (as you might expect from an event that includes everything from brightly lit outdoors events to indoor conference spaces and theaters to rock show-lit concerts, and everything in between), I'd imagine that little metering trick is doing wonders.
On my A7S2 I simply use the exposure correction wheel in such situations - or use the exposure series function so I can mess around with HDR later in Photoshop.
Yeah I usually just use exposure comp too. It works pretty well most of the time, but candids when everyone's moving and the background is changing can be pretty hard without a fill flash to help out a bit. Thank god for ISO invariance though.
White people also have higher facial variance in general. I vaguely remember in university we had an assignment to generate “eigenfaces” or something and if you partitioned the faces by race the output of the SVD would be much wider for white people. This isn’t especially surprising when you consider the fact that white people have more light/dark contrast, more hair colors, more eye colors, etc. I think a lot of the “bias” complaints levied against algorithms like this are not bias at all, but just humans who are unhappy the world doesn’t quite live up to their idealized model.
When you sample from a smaller pool you will make uninformed statements like this. Black/Dark people of the world are not limited to Black people in Atlanta. Bet many people here do not know that there are Black people in this world, some who have naturally blonde hair and some who have blue eyes, google it.
Moreover, capturing Black/Dark skin and features requires more accurate light metering & lighting because dark skin absorbs more light. There's a lot of variance in cheekbones, nose and lips.
Humans' features, in general, are more complicated than you realize.
Ok but that can still be true (blonde hair, blue eyes) while there is still much more variation in the white population than in the black population.
I'm curious how many white people there are on earth vs how many black people there are, and other races. A couple of Google searches didn't turn up anything easy.
Black people can have blond hair and blue eyes also. Common in Melanesians but not unheard of in African Americans either. I had blonde hair when I was a baby and genetically I'm 83% African (average admixture across all Black Americans). An uncle of mine had blue eyes when he was born
That’s because “white” and “black” are loose, shifting, ideological constructions with little basis in the scientific reality of human genetic variation. Many “white people” weren’t considered “white” until fairly recently and Africa actually has more human genetic variation than anywhere else.
I always cringe when reading about "race" in the US. The term really (intentionally?) gives the impression that there is a clear genetic demarcation between people based on skin color.
The US is the only country I am aware of that still uses this term; everywhere else uses something like ethnicity to indicate different culture, origin...
The UN has pushed since the 1950s to replace race with culture. Many people around the world now say "But he is from a different culture" instead of "But he is from a different race."
> little basis in the scientific reality of human genetic variation
This meme dates back to a loose claim made by R. Lewontin back in the 70s. In fact, you can very precisely and reliably recreate the "intuitive" human racial categorization using unsupervised algorithms, like doing multi-dimensional clustering over fixation indices. (It does not work using single-dimensional clustering, which is what Lewontin was talking about.)
Modern biologists usually talk in terms of clines rather than races, but this is just using the first derivative instead of the zeroth - you'll get the same result either way.
> Africa actually has more human genetic variation than anywhere else.
SNP diversity has ~nothing to do with phenotypic variance.
Of course the question here is whose “intuitive racial categorization” gets recreated, because all of that is historically and culturally specific. Saying it’s possible for a computer to recreate these categorizations presumes that they have some objective reality outside of this, when they’re just a variable heuristic determined by all those inputs.
> all of that is historically and culturally specific
Not really - almost everyone can agree on "middle eastern/north african", "east asian", "south asian", "black african", "white", etc. If you force people to pick a single-digit number of major categories, they're probably going to come up with the same categories that k-means in fixation space would.
This is an evidence-free supposition, consistent with your pattern across this thread of making broad claims without anything to support them. You’ve provided no proof that k-means on a representative sample of phenotypic variation in the groups you cite would return this result.
That almost everyone can agree on these categories is also contrary to reality. For example, many of who you describe as East Asians consider themselves racially distinct both within their societies and from their nearby neighbors. Also, what major categories do mixed race people fall inside?
It's not my job to provide detailed proof on every HN post I make; I'm just pointing out something relevant, and if it interests you, you can go ahead and find where people have already done this. I think I've been specific enough that you can find this stuff on your own. This took me about 1 minute to find: https://www.discovermagazine.com/health/to-classify-humanity...
> many of who you describe as East Asians consider themselves racially distinct
That's why I specifically mentioned the number of racial categories involved. Obviously as the number increases you can have different clustering results.
> Also, what major categories do mixed race people fall inside?
Obviously not into any of them, if we're talking about a simple mechanical classifier with high separation.
The number of racial categories would itself be an arbitrary limit not corresponding to actual genetic variance, nor would classification under such limit capture said variance, and none of it would match up to the folk biology of racial categorization. This is the general problem with reasoning backwards from 19th century gobbledygook about human genetic variation instead of beginning with the genetics themselves.
It may not be “your job” to provide such evidence, but you’ve made a series of specific claims about things like the rate of phenotypic variance among different racial groups. If you don’t want to defend them, that’s your prerogative, but you also can’t expect them to be received as authoritative or remain free of challenge.
> Ok but that can still be true (blonde hair, blue eyes) while there is still much more variation in the white population than in the black population.
Even the man who coined "Caucasian" as a racial category recognized that there was more physical variance among African populations and individuals than among Europeans.
You've made the one good point that I've seen in these responses, which is that I think all the pictures were sourced from the US. If black Americans are descended from a relatively narrow geographic region in Africa, that could lead to me underestimating phenotypic variance for black people in general. However, the problem would still exist when the technology is deployed in America.
> some who have naturally blonde hair and some who have blue eyes
I know they exist, but we are talking about statistical properties of entire populations, and these people are very rare.
> Humans' features, in general, are more complicated than you realize.
It's not about what I realize - it's about what can be mechanically detected.
What can be mechanically detected is limited to how the data is collected.
You missed this part: "Moreover capturing Black/Dark skin and features require more accurate light metering & lighting because dark skin absorbs more light."
I would have to see the images used to train the ML model to be certain, but based on my experience working in photography and programming, I believe it is more likely than not that they used essentially poor quality images for the training.
Moreover, after the model has been trained, to use the system effectively the facial recognition camera has to be set up to capture both light and dark skin. In the case of dark skin, that typically means not relying on available light alone indoors; an additional camera light must be provided.
The reality is if you want a facial recognition system that accurately detects dark skin it will cost a bit more to do it right.
White people have more facial variance than all the people categorized as not White people? That seems unlikely.
But if it were so, then it sort of invalidates your argument, because the bias complaints are that the facial recognition algorithms often misidentify other races (not generally as white but as perhaps not human).
Specifically, take the scandal some years ago when black people were being identified as gorillas: it seems obvious to me that if non-white people have less facial variance, it should be easier to identify black people as black instead of more difficult.
I thought it was a basic understanding that unfamiliar people, events, places, etc _do_ look alike, because before you get sufficient experience and exposure, you don't have enough skill to know what features are important to focus on and which ones carry no information.
It's a well-known thing that people have a harder time telling the difference, but I have not seen it mentioned that software would. For example, zebras are all uniquely identifiable by their stripes, and other zebras and computers could identify them, but I would have no hope.
It's true. But that doesn't mean we shouldn't put the effort in to learn, and especially put the effort into our software.
It's also worth considering what saying that implies to the people you're saying it to. Especially when said dismissively. I'm terrible at names, faces, voices, pretty much any way of telling people apart. But that's my problem, and I need to be careful not to push the burden of that problem onto the people around me.
If you say “i have trouble telling people of ___ race apart” that’s different from “you all look the same”. One is a statement of your own limitations, the other is saying your own limitation is actually someone else’s fault.
The first one CAN also be troubling if you have had ample exposure to become familiar, and are indirectly admitting that you did not believe it to be an important skill to learn.
Lots of white people also look the same. I grew up in a mostly Mexican environment, and living in the Bay Area now, I find it difficult to tell lots of white people apart from each other.
I'd say it's an ignorant thing to say in the way you say it. Like you said in this post, you just don't know the features to look out for when you're not familiar with them.
No, software people never do anything this controversial. They were just observing that they know of a universal law of nature as described in a recent Quillette article, and/or have second-hand knowledge of a particular quirk of a specific software that is definitely not just the reflection of its creators' myopic world-view.
You might be right in terms of hair or eye color (not too many gingers outside Europe), but Africa has a vast amount of genetic diversity, so I'd naively guess that variance along other feature dimensions would be correspondingly higher too.
If there are technical challenges which would produce biased results, perhaps the answer is to not release the product. Perhaps do more R&D or more training on broader samples until one has a better product to release?
I don't think they are releasing products to make the world a better place. After all, this tech is very dangerous. I don't think the ethics of racial bias are factored into the decision making at these companies.
This is one bias I think benefits folks with dark skin. I'd love to be in the group that doesn't get recognised by this terrifying tech.
That’s not necessarily what you want. “Not working” can manifest itself in false positive matches too. Cue “all X people look the same” stupidity.
And the truth is, the danger zone is when the algo is somewhat ok, but a few percent worse on the discriminated group. If it were obviously stupidly bad (like for example any two black people are a positive match according to it) then people would disregard it. But if it's just slightly bad, that's harder to notice, and suddenly your loan/car rental is rejected because the security system says you look too much like a specific known fraudster. And of course they won't even tell you that that's the reason, because security.
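To make the "somewhat ok but slightly worse" case concrete, here is a minimal sketch that computes the false match rate per group at one shared threshold. The scores are invented for illustration and the group names are placeholders; a real evaluation would need far more impostor pairs than this.

```python
def false_match_rate(impostor_scores, threshold):
    """Fraction of different-person comparisons scored at or above the threshold."""
    if not impostor_scores:
        raise ValueError("need at least one score")
    hits = sum(1 for score in impostor_scores if score >= threshold)
    return hits / len(impostor_scores)

# Made-up similarity scores for pairs of *different* people, split by group.
impostor_scores = {
    "group_a": [0.10, 0.22, 0.31, 0.35, 0.40, 0.42, 0.48, 0.55, 0.61, 0.72],
    "group_b": [0.15, 0.30, 0.41, 0.52, 0.58, 0.63, 0.66, 0.71, 0.74, 0.81],
}
threshold = 0.70  # one operating point shared by everyone

rates = {group: false_match_rate(scores, threshold)
         for group, scores in impostor_scores.items()}
for group, rate in rates.items():
    print(f"{group}: false match rate {rate:.0%}")
```

A single aggregate accuracy number would hide this kind of gap, which is why the rate has to be broken out per group at the threshold that is actually deployed.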
I worked in imaging for medical diagnostic support and dark complexions are just plainly more difficult to analyse because of restricted illumination times and limited contrast. Sensors got much better in recent years though.
The indications in question were extremely rare for black people, on the other hand. Another use case was the detection of vitiligo, which is significantly easier on black people. There was a great demand for therapies because patients are stigmatized for different reasons, especially in Africa because the symptoms can look similar to some deadly diseases.
There is a lot of discrimination in imaging, but it doesn't have to do with race and more with the creation of binary images.
As for facial recognition, I see some use cases for convenience, but it would be far more effective to disallow widespread application because of the numerous unsolved problems. So I see the move from IBM positively, even if they continue their work. In the end I believe face rec to be possible without bias, but I don't have these propped-up security needs.
But the fact that the technology is considered viable given this failure mode is what reflects the bias. If a vendor has tech that can only distinguish white faces it will sell it as “detect your best customers and offer them special deals when they enter your store.” If the tech can only recognize black faces it will be sold as “recognize known criminals in the neighborhood and alert police when they enter your store.”
From what I understand, NIST has been running a live "competition" on faces of 9 million people for some time already [0]. They are trying to evaluate racial biases in particular [1]. Was there a reason why you decided to not rely on what NIST was putting out?
I've had this issue with AI's in the medical industry.
At best, you have AI's that can easily recognize pathologies that an average rad could recognize, but are useless when it comes to trying to recognize a pathology that 99.999% of rads would miss. Why? Probably because the system is biased towards pathologies that rads recognize. Why? Probably because that's what's in the learning set. I understand all that, but that makes the AI almost useless in a production setting simply because of the way many healthcare delivery networks are structured with respect to radiology.
At worst, you have AI's that seem to recognize pathologies that an average rad could recognize, but then inexplicably have a horrible miss on an obvious study that any first year radiologist would have caught while half asleep. It's the sort of miss that makes the astute observer wonder if the other vendor's AI, the one that didn't have any misses on your test set, was simply not fed an example study that triggered its blindspot? You start to wonder where the blindspot on that one was? You start to wonder does it even have a blindspot? How can you work around blindspots like this in general? Etc.
But here's the thing, it's never good when you're thinking about mitigations while you're still testing the AI. My first RSNA was over 20 years ago. To this day, we're still hearing the same promises, and the production testing, (it never fails), is still uncovering the same issues.
Now I would have thought that recognizing a human would be easier than trying to ascertain, say, calcification in a DX, but apparently these same issues crop up in a number of different applications of ML based technologies.
I understand that a training set, by the very definition of the word "set", does not contain everything. Obviously, there will be bias towards whatever is in the training set. Essentially, most techniques today, simply train AI's to tell us whatever we tell the AI's to tell us. But for this reason it should be unsurprising that these sorts of AI's work best in settings with well defined domain spaces and are challenged when the domain space is less well defined. (And in either case, these AI's will have an inescapable bias towards whatever was in the training set.)
"What does a tumor look like?" is probably too broad a question to give these AI's. "What does a stop sign look like?" is probably a question that these AI's could answer relatively reliably. (I hope?) You would have thought that "what does a human look like?" would be closer to "what does a stop sign look like?" But I guess it's not surprising to hear that it seems to trend towards "what does a tumor look like?" in practice.
I agree with you, but IMHO it's trying to solve the wrong problem. The real solution is identifying a major pain point (such as annotation, co-registration, etc) and solving a lot of the small annoying steps to make the radiologist better at his/her job.
Full disclosure: I am a cofounder at a startup automating chest X-ray reporting.
It is true that ML algorithms are almost always trained on radiologist labels on the same modality, and thus take in the reader biases. I also agree that some radiologists are better than others as you imply.
As a patient, one does not know who will read their film. IMHO we as an industry should aim not at beating 99.999% of radiologists. We should merely make products which consistently perform not worse than an average radiologist at a particular institution. It is always thrilling to outperform humans with your software, but at the end patient outcomes are what matters. Those are about consistent performance over a long period of time.
Demonstrating this consistent performance is the challenging part, but it is possible to prove it through sufficiently careful and lengthy prospective trials. That’s what we are focusing on, and I would love to see the other players in the industry do the same.
Is the ML result considered a first/second opinion, or just 'a quick check' for reference only?
I believe that ML should not be taken in lieu of human opinion. The consensus, be it medical or legal, has to be explicitly human with all the responsibility attached.
Shifting the responsibility for the misses onto a faceless ML is only eroding trust in the professional opinions and cementing the biases.
I fully agree that treatment should be prescribed by a human doctor who can explain and answer questions. However, I would not agree that each of the data inputs to the treatment decision should be generated manually. That is already not the case.
This is the same argument that is often put forward in relation to accident rates for self driving vehicles. ML only needs to outperform humans.
The problem with this argument is that it glosses over the fact that in the tail, where the ML is making a wrong decision, sometimes a catastrophic one, the behaviour of the ML algorithm is not well understood. How can we deploy something in such safety-critical applications that we do not fully understand?
Excellent point. That is what the lengthy prospective study phase as well as periodic auditing after deployment are for.
Please also note that there are several important differences as compared to the automotive industry. First, one could argue that the task at hand is trivial as compared to the self driving car. We are operating in a heavily constrained setting with much better understood data inputs and a hundred-year history of medical professionals trying to classify and systematize them. Moreover, our task is not time-critical. It sometimes takes more than a week for such an image to be reported on.
Improving the training set is the way to go. Like others have said, maybe we should attempt to improve the contrast of darker faces before training (which won't be easy either).
That said, a non-diverse facial dataset in a diverse society like the US is simply useless. It doesn't help saying the AI is suffering from a human bias, and dropping these projects entirely, unless they are being used for a malign purpose like what China does.
I would hope that on HN one would take the time to explore the technical issues before even suggesting bias (which implies human racial discrimination of some sort).
I am going to venture a guess that there's a large audience in the image processing/AI/ML world that lacks a fundamental understanding of image sensor and lens technology. I have never seen mention of concepts such as well capacity, quantum efficiency, noise floor, dynamic range, thermal noise, gamma encoding, compression induced errors, etc. in most work I have reviewed.
Sensors used in the general class of imagers found in these experiments are nowhere near adequate to capture the full dynamics of a lot of real life images. The lowlights (referring to the lower portion of the dynamic range of a camera, encoding, compression and image processing system) can be some of the most challenging portions of the dynamic range to get quality data.
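To put rough numbers on well capacity and noise floor: a sensor's dynamic range is commonly estimated as the ratio of full-well capacity to read noise. The sketch below just evaluates that ratio in stops. The two sets of sensor figures are made-up illustrative values, not measurements of any real camera, and the function name is only for this example.

```python
import math

def dynamic_range_stops(full_well_electrons, read_noise_electrons):
    """Estimate dynamic range in stops as log2(full well / read noise)."""
    if full_well_electrons <= 0 or read_noise_electrons <= 0:
        raise ValueError("both values must be positive")
    return math.log2(full_well_electrons / read_noise_electrons)

# Hypothetical figures for illustration only.
small_pixel = dynamic_range_stops(5_000, 2.5)    # e.g. a tiny phone-style pixel
large_pixel = dynamic_range_stops(60_000, 3.0)   # e.g. a large scientific-style pixel
print(f"small pixel: {small_pixel:.1f} stops, large pixel: {large_pixel:.1f} stops")
```

With the exposure set to protect the highlights, a narrower range leaves fewer usable stops above the noise floor for the shadows, and that is before encoding and compression take their share.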
The old idea applies: Garbage-in, garbage-out.
It should come as no surprise that algorithms trained on (likely) bad images with bad lowlight detail will fail to deal with people of darker skin. It's almost a given. One can't assume cheap cameras and the data sets produced with these cameras will see the world the way our eyes are able to. Not to mention the fact that we have something called "understanding" while classifier systems have no clue whatsoever what they are looking at, all they can do is put things in buckets and that's that. In other words, there is no inherent comprehension of what a human being might be versus a bear or a teapot. That's a major problem.
The answer isn't to give up. The answer is to understand and then go back and do it right. This isn't going to be cheap and it will likely require rethinking how we build and train these systems.
As a tangentially related data point, I have three German Shepherd dogs. Two are the traditional black and brown coloring. The third is 100% black. It is virtually impossible to take a good picture of him. In anything but the right lighting he shows up as a dark amorphous blob. For all the prowess of the mighty camera in an iPhone 10, you'd be hard pressed to use those images to recognize him as anything other than a blob on a dark couch.
I do have access to high performance imagers with far greater well capacity as well as 100% uncompressed data output. In that case there's usable data in the lowlights that can be extracted through gamma and LUT manipulation. When you do that he quickly goes from looking like a blob to looking like a happy dog.
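As a rough idea of what that kind of manipulation looks like, here is a minimal sketch of a lookup table that applies gain and a gamma curve to linear sensor codes. It assumes 12-bit linear input and 8-bit display output; the gain, gamma, and sample values are arbitrary choices for illustration, not the settings of any particular imager or workflow.

```python
import numpy as np

def shadow_lift_lut(bit_depth=12, gamma=2.2, gain=4.0):
    """Build a lookup table mapping linear sensor codes to 8-bit display values."""
    max_code = 2 ** bit_depth - 1
    linear = np.arange(max_code + 1) / max_code      # normalize to [0, 1]
    boosted = np.clip(linear * gain, 0.0, 1.0)        # push shadows up, clip highlights
    encoded = boosted ** (1.0 / gamma)                # gamma-encode for display
    return np.round(encoded * 255).astype(np.uint8)

lut = shadow_lift_lut()

# A made-up dark patch that only occupies the lowest ~3% of the 12-bit range.
dark_patch = np.array([[20, 40], [80, 120]], dtype=np.uint16)
lifted = lut[dark_patch]                              # apply the LUT by indexing
flat = np.round(dark_patch / 4095 * 255).astype(np.uint8)  # naive linear scaling

print("naive 8-bit:", flat.ravel())
print("after LUT:  ", lifted.ravel())
```

This only helps if the sensor actually recorded distinct values down there. Applied to an image that was already crushed to a few 8-bit codes by encoding and compression, the same curve just stretches the blob.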
Anyone interested in learning more, I would highly recommend looking up Jim Janesick:
The popular phrase "he wrote the book" applies here. Jim's books on the subject of image sensor technology (science and design of sensors) are the reference works for anyone in imaging studies. He designed so many sensors for space applications I am not sure he even remembers how many. I was fortunate enough to study CCD and CMOS sensor design under him a couple of decades ago.
ML has to start with good data. Inadequate sensors coupled with compression and other processing artifacts lead to bad data, a formula for failure.
You wrote a lot of text but missed the central point. Cameras that can capture dark skin exist. This is a problem that human researchers just shrugged off or ignored. You might say well, can't do anything about it, our training corpus is full of these bad images. Maybe we blame the camera manufacturers?
But that's still a cop out. You can't use models with these problems. It gets you disasters like face unlock that doesn't work for black people. This happens even if people aren't white. Samsung famously released blink detection that didn't work on a lot of East Asian faces. The products are broken because the white bias is the default and pervades everyone's thinking.
What I am saying --which is absolutely true-- is that there's a lot more data in the mid-range of a gamma-encoded image (which is 100% of the images produced by everyone doing this work) than the low-lights. This means that the local dynamic range in those regions is different. Which means that operations such as edge detection will be more accurate and could make the difference between something working and not.
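A rough way to check this, assuming a pure power-law gamma of 2.2 and 8-bit output (real curves such as sRGB differ slightly near black): count how many code values fall in each stop below full scale. The function name and defaults are just for illustration.

```python
def codes_per_stop(stops=8, gamma=2.2, levels=256):
    """Count how many encoded code values land in each stop below full scale.

    Assumes code = linear ** (1 / gamma), a simplification of real transfer curves.
    Index 0 is the brightest stop; higher indices are deeper shadows.
    """
    counts = []
    for stop in range(stops):
        hi = 0.5 ** stop          # upper linear bound of this stop
        lo = 0.5 ** (stop + 1)    # lower linear bound of this stop
        hi_code = round((levels - 1) * hi ** (1.0 / gamma))
        lo_code = round((levels - 1) * lo ** (1.0 / gamma))
        counts.append(hi_code - lo_code)
    return counts

counts = codes_per_stop()
for stop, count in enumerate(counts):
    print(f"stop -{stop + 1}: {count} code values")
```

The counts shrink with every stop down, so the deepest shadows are described by a handful of code values while the upper stops get dozens. That is the sense in which edge detection has less to work with in the lowlights.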
Vision researchers should take a class or two in cinematography and photography, it would serve them well. Even the quality of the lens makes a difference. Most work I've seen out there uses cameras that barely pass as security cameras or webcams.
That said, yes, ML needs to work with crappy images and every single camera out there. My argument is that you are not going to be able to train using crap data. And the images in a data set would be crap if the data --the images-- were not acquired using cameras and techniques that provide enough data across various segments of the dynamic range.
Again, I gave the example of my black GSD for a reason. You are not going to be able to recognize him as anything other than a blob on a couch without a camera that can capture enough data at the low end of the dynamic range and a system trained with that data.
The fact that Samsung (or anyone else) failed means nothing. In order for that data point to be meaningful you'd have to have intimate knowledge of what they were doing and what capabilities they had, both in terms of science and engineering as well as the consumer hardware they developed.
I have competed against multi-billion dollar multinational corporations who, despite their financial prowess and scale, could not design their way out of a paper bag. They don't understand the problem, lack creativity and, most importantly, absolutely lack the passion necessary to solve it. Ten 9-to-5 engineers can't compete with a single engineer passionate enough to devote every waking hour to solving difficult problems. It doesn't matter how much money you throw at them, they just can't perform.
This doesn’t help when the variance in feature space of white people’s faces is objectively a lot wider than the variance of black people’s faces. You can have the best camera in the world, but it’s not going to change the fact that someone from Ireland looks way more different versus someone from Italy (in any mechanical basis) than someone from Cameroon looks versus someone from Malawi.
Objectively? You have to show some evidence, otherwise this sounds just like "yall look alike to me".
To add to this: considering that genetic variations between peoples on the African continent are larger than the variations to anywhere else, this is also very unlikely. I think it's a reasonable assumption that genetic variation and variation in physical features are somewhat correlated.
Yes, I wonder if this is because the political climate in the US makes some people feel more empowered to make statements like this, or if the number of people holding these beliefs is actually increasing (on HN).
I think GP may be reacting based on common techniques of obscuring faces - i.e. ninjas wearing black masks, special ops putting dark paint on their face to obscure details. I think the effectiveness of those techniques leads people to think that darker colors are harder to see or make out. But that analogy breaks down because here we're talking about fully lit situations, right?
> genetic variations between peoples on the African continent are larger than the variations to anywhere else
You are misinterpreting this, as does almost everyone else. Phenotypic variance is almost completely orthogonal to number of unique SNPs (which is what this generally refers to).
> genetic variation and variation in physical features are somewhat correlated.
This is not the case. You can have narrow population bottlenecks (reducing SNP diversity) followed by high variance in selective pressure. This is exactly what happened to early Europeans. You had a small initial population spread out to inhabit a large variety of ecosystems.
This is a farcical argument. Facial recognition apps aren’t looking to distinguish gross features like Irish vs Italian features (whatever those are). They can tell two brothers from Ireland apart from one another, they should be able to tell two brothers from Malawi apart too. Their mother probably can, so why the hell shouldn’t their phone?
Is this really the case though? Feature-wise it's not just the color of the skin that matters, and populations across Africa can look very different from each other even to the human eye.
The first mistake is to believe the definition of "bias" requires intent to harm. It doesn't. Any systematic error is "bias", even if it's just your thermometer always reporting 2 deg more than the actual temperature.
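A small sketch of that thermometer, for anyone who wants to see the distinction in numbers. It is plain Python; the 2 deg offset comes from the example above, while the noise level, temperature range, and sample count are assumptions. Bias is the part of the error that does not average out, regardless of intent.

```python
import random
import statistics


def measure(true_temp, offset, noise_sd, rng):
    """Hypothetical thermometer: constant offset plus zero-mean random noise."""
    return true_temp + offset + rng.gauss(0.0, noise_sd)


rng = random.Random(0)  # fixed seed so the run is repeatable
true_temps = [rng.uniform(10.0, 30.0) for _ in range(10_000)]
errors = [measure(t, offset=2.0, noise_sd=0.5, rng=rng) - t for t in true_temps]

bias = statistics.mean(errors)     # systematic part: stays near the offset
spread = statistics.stdev(errors)  # random part: scatter around the bias

print(f"bias ~ {bias:.2f} deg, spread ~ {spread:.2f} deg")
```

Taking more readings shrinks the uncertainty in the average but does nothing to the offset. The same holds for a classifier whose error rate is consistently higher for one group of subjects.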
The second issue is believing some technical arcana are a valid excuse for selling products with such errors. They aren't! If your product reinforces entrenched discrimination, you either fix it or stop selling it.
Think of this from the perspective of the end consumer of the model: am I willing to buy new high-quality capture hardware for the target production environment, potentially discarding an existing investment?
In my experience, they will rarely even consider it.
Therefore, the training data should have the same technical shortcomings as what will be used in production.
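One way this could look in practice, as a minimal sketch: take the clean training images and push them through a degradation step that imitates the production camera. Everything here is an assumption for illustration (the `degrade` helper, the 2x downsample, the noise level, the 6-bit depth); a real pipeline would measure these from the actual hardware.

```python
import random


def degrade(image, factor=2, noise_sd=0.02, bits=6, seed=0):
    """Return a lower-resolution, noisier, more coarsely quantized copy.

    image is a list of rows of floats in [0.0, 1.0] (a grayscale image).
    """
    rng = random.Random(seed)
    levels = (1 << bits) - 1
    out = []
    for r in range(0, len(image) - factor + 1, factor):
        row = []
        for c in range(0, len(image[0]) - factor + 1, factor):
            block = [image[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            v = sum(block) / len(block)             # box-filter downsample
            v += rng.gauss(0.0, noise_sd)           # add sensor-like noise
            v = min(1.0, max(0.0, v))               # clip to the valid range
            row.append(round(v * levels) / levels)  # quantize to fewer levels
        out.append(row)
    return out


# Hypothetical 8x8 gradient standing in for a clean, high-quality capture.
clean = [[(r + c) / 14 for c in range(8)] for r in range(8)]
cheap = degrade(clean)
print(len(cheap), "x", len(cheap[0]))
```

The clean originals are kept, so the same data set can be re-degraded whenever the target hardware changes.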
> Therefore, the training data should have the same technical shortcomings as what will be used in production.
I think training and the application of the trained system are separable. Nothing is accomplished by training with data sets that lack data or detail. Inference or classification is impossible or deeply impaired by the lack of data.
As a hypothesis, the solution is likely to involve training one network with good data and a separate network to be the interface between the first network and the imperfect perceptual data in real-world applications.
At the end of the day AI/ML need to leave the world of classification behind and move on to the concept of understanding. This is not an easy task, yet it is necessary in order for these amazing technologies to truly become generally useful.
We don't show a human child ten thousand images of a thousand different cups in ten different orientations in order for that child to recognize and understand cups. The reason for this is that our brains evolved to give us the ability to understand, not simply classify. This means we need far fewer samples in order to have effective interactions with our physical world.
The focus on using massive data sets to train classification engines is a neat parlor trick, yet it will never result in understanding and is unlikely to develop into useful general artificial intelligence. The problem quickly becomes exponential and low quality data becomes noise. We need to develop paradigms for encoding understanding rather than massive classification networks that can't even match the performance of a dog in some applications. As I said before, this is a very difficult problem. I don't think we know how to do this yet. Not even sure we have any idea how to do it. I certainly don't.