Hacker News
AI Recognises Race in Medical Images (explainthispaper.com)
128 points by stuartbman 9 days ago | 334 comments





Please forgive me for asking a controversial question (particularly so early in the morning), but if there are all of these biological correlations with race, what does it mean that “race is a social construct”? Is the idea that black people have greater bone mineral density (per TFA) due to social or environmental causes (e.g., diet)? For what it’s worth, I’m a staunch egalitarian and I don’t see that changing either way.

EDIT: Really pleased with the largely constructive conversation in this thread. Was worried that this was going to be coopted as an ideological flame thread. Thanks for the insightful answers and good faith engagement. Keep up the good work!


From what I understand, the definition of a social construct is a bit nuanced, and it's possible to both believe race is a social construct and that it correlates with things.

As a metaphor, let's say we create two categories in the world for all people - tallers (people above six feet in height) and shorters (people below six feet in height). Human height objectively exists, but these categories are social constructs. Likewise, human variations in genes based on ancestry clearly exist, but the discrete racial categories we define (black, white, asian, etc.) are social constructs since we could create other discrete categories (Irish, Slavic, etc.).

So saying race is a social construct does not mean your genetic makeup does not matter or correlate with anything, but that grouping people into the set of commonly agreed upon races is not the inherent way it has to be. At the same time, these groupings do represent distinct genetic makeup and so correlate with physical attributes. It's just that different groupings with different correlations are also possible.

This video explains it pretty well IMO: https://youtu.be/koud7hgGyQ8


So if you were going to create a test for "Is this a social construct?" What would it be?

It seems like the definition from your comment would be "Could we take this labeling system, and define different labels or the labels differently?"

I watched some of the YouTube video, and in the thought experiment she proposes, she takes a continuous trait (height) and arbitrarily splits it into two buckets, then talks about how this is kinda silly. But this is something we do all the time, with hypertension, diabetes, disabled, Alzheimer's, the 1%, capitalism, Canadian, etc...

And many of these categories are far more continuous than something like sex and gender which as far as categories go are pretty discrete.

Maybe these things are social constructs, but if they are then we surely must come to the conclusion that almost everything we care about in the world is a social construct.


> Maybe these things are social constructs, but if they are then we surely must come to the conclusion that almost everything we care about in the world is a social construct.

That's the point really. And it's not inherently a value judgement to say something is a social construct. Using it as a value judgement is usually meant to convey that some secondary attributes are not inherent, or even more so, are just historical by-products and might need rethinking or recontextualisation. Being a woman and being of the female sex might seem to be identical, but womanhood is not the same as having a certain genomic makeup, and therefore womanhood is obviously a social construct. The video above even outlines how femaleness might be considered a social construct, but I'm not smart enough to explain that to you, and Abigail will do it much better.


I guess another way to put this when someone says "race is a social construct". What are they trying to communicate about race?

Well, when you say something is a social construct it opens the door to find explanations for why that social construct has arisen. You open the door to questioning why those specific categories have been chosen. Why is it that in the early 20th century Italian was a race and now those people would be considered white (in America). It turns the conversation away from what is distinctive about a group and towards why we regard those distinctions as important.

> What are they trying to communicate about race?

That the hard racial boundaries are unscientific (despite attempts to make it so), and that it's artificial and depends on consensus, and therefore is subject to change, and is typically subjective to the involved social group: the "one drop rule" is an example of how arbitrary this can get.

As sibling pointed out, in the United States, Italians and Jewish folk used to be not white, but are now widely considered to be white, despite not having any genetic or cultural changes in the interval between the shift in categorization.


That racial categories and the criteria for deciding which category an individual belongs to depend on history, traditions and politics. There are no objective criteria to decide what constitutes a "race".

In my experience, the implication is that things which are social constructs have no basis in reality and are thus not useful (and their use is harmful). I'm often inclined to respond, "money is also a social construct", but I'm afraid they would agree and proceed into some anti-capitalist, utopian diatribe.

> I watched some of the YouTube video, and in the thought experiment she proposes, she takes a continuous trait (height) and arbitrarily splits it into two buckets, then talks about how this is kinda silly.

Something can be continuous but still very clustered, and the extent to which it's "silly" depends on the uniformity of the distribution. It can still be useful to label the clusters in the distribution.

Another reason to subdivide a continuous dimension is that there could be a threshold after which some other dependent variable begins its inflection point (think of a hockeystick distribution). For example, there's probably some blood pressure value beyond which we begin to rapidly see serious adverse health effects--it's useful to call this "hypertension" or something. For another example, there's a threshold for the average temperature of the planet beyond which global warming "runs away". These are useful thresholds even though the dimensions are continuous.


A lot of the angst around "social construct" is because people think "X is a social construct" means "X is imaginary" or "X can be whatever you want it to be". Just as an example, legislation and the judicial system are clearly social constructs, but that does not mean they are not real!

This reminds me of Astral Codex Ten's (fka SSC) series on the ontology of psychiatric conditions. Basically, you're often faced with the question of whether something is a binary condition you have or don't have which then translates into a more continuous expression of symptoms, or whether it's more like a spectrum where the extreme outliers are deemed "ill" and the rest is just character differences.

It's interesting to think about the various ways to model this, and in many of your cases, the most reasonable model probably depends a lot on context or what it's used for/what statement you're trying to make.

https://astralcodexten.substack.com/?sort=search&search=Onto...


Thanks for the great explanation. The height analogy made it click for me.

Also, passing would work for height. Suppose you're 5'11 but your friends are around 6'1 and you were close friends in kindergarten (when you were all shorter of course). You begin wearing platforms that add about an inch to your height. You hang out with other tallers and so people just assume that's what you are, they likely don't insist on measuring you in bare feet at every opportunity. Laws affecting this social construction of "tallers" have much less impact on you than might seem to be the case looking at a medical chart. You would be said to "pass" for a taller.

Passing was (and to a lesser extent remains) crucial in the US and many places where variations in appearance were crucial to racial assumptions. If you look white and you act white then, most of the time to a first approximation you are white. But of course this opportunity is much more open to a relatively pale-skinned person than to a dark-skinned poor person which is a further problem on top of the problem that now people are lying about who they are.


One of the other things to point out is that race doesn't correlate all that well with genetics in humans. People who would be described as black have so much genetic diversity that there can be no reasonable argument that there's a genetic basis for their racial label. Certain health conditions or biological differences often have to make far more specific categories than race.

Not sure why downvoted, this is correct. The rest of the world was populated by small (i.e. less genetically diverse) groups who came "out of Africa". So "Africans are more diverse genetically than the inhabitants of the rest of the world combined"

https://www.geneticsandsociety.org/article/study-africans-mo...

https://blogs.bcm.edu/2018/07/19/genetic-diversity-in-africa...

there are "Larger Genetic Differences Within Africans Than Between Africans and Eurasians" "and the genetic diversity in Eurasians is largely a subset of that in Africans, supporting the out of Africa model of human evolution"

https://www.genetics.org/content/161/1/269

The highest genetic diversity is actually among people who are small subgroups of "Africans" - Khoisan not Bantu.

https://www.nature.com/articles/ncomms6692

https://www.newscientist.com/article/dn24988-humanitys-forgo...

https://pubmed.ncbi.nlm.nih.gov/25471224/

So, genetically, just as you can't draw a line around "bony fishes" without also including land animals, which are also descendants of bony fishes, you can't call "Africans" a well-defined sub-group of humans that excludes others.

https://www.sciencealert.com/actually-there-is-no-such-thing...


But with the taller/shorter, you would commonly have two tallers producing a shorter, or two shorters producing a taller. If we were talking about husbandry, we would say that tallers and shorters don't "breed true".

But with race, you never have two white people produce a black person, or vice versa (and no, a black person with albinism is not a white person).


"never" is too strong here. Two black people occasionally produce a white baby without albinism. There's a lot of genes that code skin color and you can have a bunch of recessive "white skin" genes and still be very black-skinned yourself. Then sometimes your child gets all the recessive white-skin genes from both parents.

I had a personal friend who was white with two black parents, not albino, and genetic tests proved paternity.

However, can I ask what the point of your hypothetical is? I'm not sure what message or conclusion I'm supposed to get from it, in context of the discussion.


Skin color isn't race. I would expect this AI to detect your white-skinned friend as black because all his other traits would be black-like.

Well, you would be wrong. Because this AI predicts self-reported race, not ancestry.

Think about what this means. People who have, say, three black great-grandparents and five white ones will vary widely in how black they look, and this likely affects how they identify. But there are also invisible markers of ancestry - of the sort you would expect maybe were visible only on an X-ray if you looked carefully. Those could easily break the other way. You could easily get someone who looked white (and thus, probably identified as white), but was "black on the inside", in terms of their less visible characteristics.

Yet the algorithm sees through all that, and manages to see what you feel like? Correctly classifying Shaun King as black and Tom Jones as white? From an 8x8 X-ray picture?

The people who insist that "race is real" should be the most confused by these results, since we know how fuzzily identity is coupled with ancestry, especially in the main groups studied here, black and white Americans.

I'm much more prepared to believe, for instance, that there is something catastrophically bad going on in medical image dataset collection, than I am that self-ID race is nebulously predictable from almost nothing.


You're really missing something obvious here. I don't know if it's the statistical nature of ML or the statistical nature of correlation or what. But the fact that everyone else can see how it probably works while you can't should be a red flag that there's a hole in your understanding somewhere.

Or maybe it's you who should try training a couple of image classification models. I'm sure I've trained far more than most people who are blasé about this.

Why don't you answer the argument instead of trying to convince me it's risky to disagree with the crowd?


The obvious answer is that self-reported race is correlated with many biological traits throughout the body. This is already known. It's no surprise that ML can detect those differences. If somebody's self-identification is abnormal (doesn't know their ancestry, say), then sure, the AI might see through it. Conventional measurements could see through it too.

Well, if they are black people with white ancestors (or white people with black ancestors), then they aren't the people I was thinking about. Almost certainly, the only reason your black friends produced a white baby is because they had white ancestors at some point.

If two doberman pinschers had a puppy that looked like an English bulldog, it would be strange and newsworthy. But if the two dobermans actually had grandparents that were English bulldogs, the mystery would be fairly easy to solve.

At least part of this confusion is associated with the culture of race (at least in the US) as opposed to the genetics of race. For example, we consider Barack Obama black, but he is as white as he is black. There's no genetic basis for making that kind of determination.


> Well, if they are black people with white ancestors (or white people with black ancestors), then they aren't the people I was thinking about.

Africa is very, very, very genetically diverse compared to the rest of the world. I don't think there exists a population which doesn't contain genes for lighter skin.

I think race labels make some sense in a social/cultural context: In America we can call someone "black" when they are 75+% white, because for their entire childhood/life, society at large treated them much as it treats people who are "100% black".

But race doesn't make sense in a genetic context. It's probably far more absurd than defining the difference between an accent vs dialect vs language. Even though there are clear differences between individuals/families, the boundaries are absolutely arbitrary.


The heritability of human height is 79% (https://www.nature.com/articles/d41586-019-01157-y). The heritability of human skin color is 82% (https://www.nature.com/articles/emm201228).

That's relative lightness/darkness of skin pigmentation of the butt of Mongol people. Of course there is variation even among races, even among family members of the same race in skin tone.

Which do you think is more common, based on your quoted stats?

a) To see white parents who have a genetic child who has black skin, or black parents who have a genetic child who has white skin (disregarding albinism)

or

b) two tall parents have a genetic child who is short as an adult, or two short parents to have a genetic child who is tall as an adult


What's the point?

The point is showing that the heritability of skin tone and height are about the same is meaningless when talking about skin tone differences among races.

Literally, news articles are written and go viral about "black couple has white baby", whereas there has never been an article "Two 6'3" people have an adult child who is only 5'5""

If you had a series of 100,000 parents+genetic children paraded in front of you, do you think it would be more common to see two parents who are both rather taller or both rather shorter than their child, or would it be more common to see two parents who appear to be of one race, but their child appears to be of a different race?

If your answer isn't "I'd expect those things to happen at about the same rate", then you should question what exactly the sources are saying for the person who posted about the heritability of height vs skin tone.

The feeling I get is that many people really, really don't want race to be real. Because if it isn't real, then you can't say there are differences among the races. So, they will argue against common sense and try to say things like height is just as heritable as skin tone as a rebuttal to the fact that pretty much always, two white parents don't give birth to black babies, and two black parents don't give birth to white babies.

I see the same kind of mental logic at play with LGBT-supporters, where there is a strong insistence that being gay is genetic, and not a choice. That way, you can't chalk it up to lifestyle choices that you can just change. Personally, I don't really see why it matters whether being gay is genetic or a choice, because there is literally nothing wrong with being gay, whether you are born that way or whether you "just" choose to be that way.


I admit that my previous comment was tongue-in-cheek; I found it interesting that the two numbers were so close. To make up for it, I'll try a more serious response.

I suspect you are making a category error comparing "two parents who are both rather taller or both rather shorter than their child" and "two parents who appear to be of one race, but their child appears to be of a different race".

To put the two into the same category, you could compare "two parents...taller or shorter than their child" and "two parents...lighter skinned or darker skinned than their child"---two aspects that are single traits. And no, I wouldn't be crazy surprised in either case, unless you were specifically thinking about Robert Wadlow and Zeng Jinlian giving birth to Chandra Bahadur Dangi or two parents of Danish descent giving birth to a child with a Bantu skin color. (And no, I don't know the relative frequencies of such events.)

The other way to resolve the category error, if you prefer to compare bundles of traits, would be two people of Italian extraction giving birth to a stereotypical Irish child. That sounds pretty unlikely, especially prior to modern travel and migration patterns.

But the real, underlying question that is the base for "many people really, really don't want race to be real" is, "So what?"

Yes, human beings tend to share traits with other closely related people. So what? Individual variation is pretty large, too.

Historically, Italians and Irish were considered to be different races. Not so much today, because "race" is a social construct and the difference between the Irish, southern Europeans, and the canonical northern European isn't a big deal today.

Races are defined to be a way of applying a group of conclusions, which may be difficult to perceive directly, to an individual who has an easily perceived marker for the race. That can be more-or-less neutral to somewhat pernicious. ("You are Asian, therefore you must be lactose intolerant!" Well, maybe.) Or it can be straight up evil, especially if the conclusions you are making are simply made up to enforce your superiority to the individual.

As a result, race is either not real, or real but completely uninteresting. Any other option is intellectual laziness at best and at worst....well, it leads to poor outcomes.

Now, neither you nor I particularly care whether homosexuality is genetic or a choice, but I hope you can see how someone who has to respond to "So just don't be gay!" might prefer one over the other.


> And no, I wouldn't be crazy surprised in either case,

With all due respect, I don't believe you're being honest here. And that goes to my point about people wanting race to not be a social construct. I believe you're lying to yourself (or just lying to me) that you wouldn't be crazy surprised in either case.

> But the real, underlying question that is the base for "many people really, really don't want race to be real" is, "So what?"

Exactly. You don't want race to be real, you think it might be misused if it was real, and so you argue that it isn't real. That is not a compelling basis for an argument. A good argument for the earth being round is not that you're scared of people punching you in the nose if you say it is flat.

> Now, neither you nor I particularly care whether homosexuality is genetic or a choice, but I hope you can see how someone who has to respond to "So just don't be gay!" might prefer one over the other.

Yes, I can see how they would prefer that. But it has no bearing on the reality of whether being gay is genetic, or a choice, or possibly both, and really has no place in scientific analysis of sexuality.


Ok, so race is real.

What races are there? Is Irish a race? How, in fact, do you meaningfully define a race such that there aren't 20 races in sub-Saharan Africa for every one everywhere else?

And then, what do you do with that information? I'm Irish, she is Southeast Asian, Ted over there is Mayan-German. Does it improve life in any significant way?

If you wish to examine a granfalloon, just remove the skin of a toy balloon. — Bokonon


You're right. Thinking is too hard and scary, let's go shopping!

Maybe if diseases are distributed differently across different races, it can make testing and treating them more cost effective, leading to an overall improvement in health outcomes. I'm not going to waste my sickle cell anemia test kits on testing Icelandic people.

Do you apply this same logic to people who study esoteric branches of mathematics with no real possibility of improving life in any significant way?

This all goes to my point that you don't want race to be real, so you argue that race is not real. Maybe it is convincing for you, but it is absolutely meaningless to me. It would be great if the universe worked that way though, all we would need to do is close our eyes and pretend cancer doesn't exist, and it would just go away!


OK, I'll speak in terms you might get.

Humans have a myriad of visible characteristics: height, weight, skin color, eye color and shape, hair color and type, shapes of facial features, and so on.

All these characteristics vary continuously.

If you see that data as points filling a high-dimensional cube, there's not going to be an empty space there.

Some areas are going to be denser than the others, but there are no gaps there.

What you try to do with "race" is you're trying to cluster this data.

But there really is only one cluster. Might as well call rand() a million times to get a bunch of points in 0..1, and cluster that.

Oh, but you've seen black people! And white people!

Well yeah, but it's all those people filling the spaces in between any two points that make it impossible to draw the line.

The only way to draw the line is to make a call on where to draw it — that is, to make an arbitrary choice. Without it, your clustering algorithms would fail.

Yes, there are high-density peaks on this data, especially if you look at any single characteristic.

Yes, you can separate the peaks. But deciding where to put the threshold is a choice (a social construct) that can leave a lot of points without a "race" label (which race is Irish-Mexican?) and/or change which peaks make the cutoff (are Armenians a race, or noise in the dataset?).
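
To make the rand() point concrete, here is a rough sketch. Everything in it is my own assumption for illustration: 1-D data, a hand-rolled Lloyd's-style k-means called `kmeans_1d`, quantile initialisation, an arbitrary seed. Strictly speaking, k-means doesn't fail on gap-free data; it just hands back however many clusters you ask for, which is exactly the arbitrary choice being discussed.

```python
import random

def kmeans_1d(values, k, iters=100):
    """Plain Lloyd's algorithm on 1-D data; returns the sorted centroids."""
    data = sorted(values)
    n = len(data)
    # Deterministic start: spread the initial centroids over the quantiles.
    centroids = [data[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            buckets[nearest].append(v)
        # Keep the old centroid if a bucket happens to end up empty.
        updated = [sum(b) / len(b) if b else centroids[j]
                   for j, b in enumerate(buckets)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)

def boundaries(centroids):
    """Cut points sit halfway between neighbouring centroids."""
    return [(a + b) / 2 for a, b in zip(centroids, centroids[1:])]

random.seed(0)  # arbitrary seed, only so reruns give the same numbers
points = [random.random() for _ in range(2000)]  # uniform in [0, 1), no gaps

for k in (2, 3, 5):
    print(k, [round(c, 2) for c in boundaries(kmeans_1d(points, k))])
```

On uniform data the cut points should come out roughly evenly spaced for whatever k you pass in. Nothing in the data prefers 2 clusters over 3 or 5; the number of "groups" is supplied by whoever calls the function.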

>Maybe if diseases are distributed differently across different races, it can make testing and treating them more cost effective

The scientists have two options:

A) Look at the original data which you used to assign the race label (skin color, hair type, etc), and see if there's any correlation of that data with diseases

B) Look at the data, cluster it using an arbitrary choice to be able to get more than one cluster, ignore a lot of people below the threshold, assign the labels, IGNORE THE RAW DATA, and then look for correlations between labels and diseases.

Which approach do you think is more scientific?


Do you believe that life begins at conception? Or at birth?

Or is it somewhere in between? Do you feel comfortable marking the line where life possibly begins? If you mark it before birth, aren't you just giving ammunition to anti-abortion advocates to take away freedom from women? Should we just say that life begins at birth, and shut down anyone who asks otherwise, because anything else is dangerous and possibly arbitrary and not 100% accurate in rare cases?


You seem to be confused.

What's the continuous variable here that you are measuring to assign the discrete label (alive/not alive) to?

If you want to do that based solely on one variable, time from conception, then you get bad science. People can become dead at any point in time.

Yes, it's not possible to assign the "alive" label based on time from conception alone.

We do have plenty of discrete characteristics based on which to assign the alive/not alive label.

The Bible, for example, assigns the discrete label "alive" based on discrete label "breathing" in at least one case[1].

The definition I use is "does not need to be within a human body to sustain activity in brain cells".

Hope that makes things clear.

[1] https://www.biblegateway.com/passage/?search=Genesis%202%3A7...


Right. A bunch of variables all exist whose values let us ultimately apply the label of alive or not alive. So how can one pick the right set of variables and weight them appropriately to determine if a fetus is alive? Isn't it kind of arbitrary at a certain point? As in, we know when something is definitely not alive (when it is gametes in separate humans), and we know when something is definitely alive (when a baby is born and it is crying), but anywhere between those two is just an arbitrary choice.

You, by your own definition of life, would apparently want to restrict the rights of women relative to what they have currently. This is a dangerous idea and should not be allowed.


Yes, it is kind of arbitrary at a certain point, that's why we have the debate about it in this country. I am not sure where you are going with this given the subject of the comment I was responding to.

I chose my definition from the rights-of-women perspective. My understanding is that a fetus is generally non-viable outside of a host body, and thus not "alive" by my definition.

On the other hand, we have C-sections and incubators. If a baby can be safely extracted with a C-section, placed in an incubator, and survive, my understanding is that the choice to abort is no longer available.

Of course, I can be wrong here.

My point was that at least I can make a definition here that does not depend on a choice of an arbitrary value of a continuous variable. My definition of "alive" depends on the choice of which variables to look at, not on arbitrary thresholds. And I don't insist on it being The Truth.

Another example:

"Heart rate" is a continuous vafiable, but the distribution of its values has a large gap between 0 (no heart rate) and nonzero values (the lowest observed was 27bpm). So you don't need to make a choice for a clustering algorithm to work. You can run K-means on "heart rate" and get these 2 clusters: 0 and everything else.

This allows one to make a definition of "alive" based on heart rate. That's not my definition, but it's a usable one.
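
A hedged sketch of the heart-rate idea, with one swap: instead of K-means (whose split depends on how the values are spread) it uses a plain "widest gap" search, which is the simplest way I know to check whether the data itself offers a cut point. The readings are made up for illustration; only the 27bpm floor comes from the paragraph above, and `largest_gap_split` is a hypothetical helper.

```python
def largest_gap_split(values):
    """Return (threshold, gap_width) for the widest gap between sorted neighbours."""
    data = sorted(set(values))
    if len(data) < 2:
        raise ValueError("need at least two distinct values")
    # Compare every pair of neighbouring values and keep the widest gap.
    width, low, high = max((b - a, a, b) for a, b in zip(data, data[1:]))
    return (low + high) / 2, width

# Made-up readings in bpm: a few flatlines plus living values from 27 upward.
heart_rates = [0, 0, 0, 27, 31, 38, 45, 52, 58, 60, 64, 67, 71, 72,
               75, 80, 88, 95, 110, 130, 150, 172]

threshold, width = largest_gap_split(heart_rates)
labels = ["alive" if bpm > threshold else "not alive" for bpm in heart_rates]
print(threshold, width)
```

With these numbers the widest gap is the one between 0 and 27, so the threshold falls out of the data rather than out of a human decision. Run the same function on something densely and evenly filled and the "widest gap" is no wider than the rest, which is the contrast being drawn.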

This is not feasible with race if we use the variables commonly understood to be associated with race: skin color, height, nose shape, etc. There are no gaps in those variables.

Even eye color varies continuously [2], and it's not clear how to assign labels [1].

Color in general is a good analogy for race. You see the colors vary in a rainbow. You can tell the difference between red, green, and blue.

But you can't run K-means on a rainbow to get 7 colors out. Or any number but 1, for that matter.

You need to make a call yourself where to draw those lines.

This is reflected in languages. Russian has distinct words for what we'd call "blue" in English, but English also has words like cyan, turquoise, navy, etc., which other languages may not have.

Color, in the end, is a social construct.

[1] https://www.edow.com/general-eye-care/eyecolor/

[2] https://udel.edu/~mcdonald/mytheyecolor.html


Esoteric branches of mathematics do not have a long and ongoing history of being used to marginalize and take advantage of groups of people.

And I notice you cannot answer any of my questions as to how you would define races.


Ok, so again, that is my point: Race is too dangerous to study, the bad people might make the wrong conclusions with it.

I have no idea how you would define race. I don't know why you would expect me to. I also have no idea how you specifically define certain other aspects of biology that you probably accept as real and accurate, because I'm not a biologist or a geneticist.


That's actually an interesting observation, though perhaps a bit beside the point for an analogy that wasn't meant to be perfect. Regression to the mean does not seem to apply the same way to skin color, even though presumably it is also a polygenic trait to quite some extent.

Even disregarding albinism, it's possible for people of northern European descent with fair skin to have darker-skinned children. For example, in a case where the parents have germ line mutations that they pass on to their child, there could be (improbably) a collection of mutations that greatly increased the quantity of melanin. However, skin color is a complex trait with many small differences coming from a large number of different genes, so large changes in skin color are very unlikely. None of what I am describing is albinism, and this kind of mutation would have been part of the process of natural selection for fair-skinned people in northern climates with less solar exposure.

IMO it's because it's very dubious the AI is doing what the paper is maintaining it's doing.

People are not 'race x' or 'race y' biologically, as if race is some discrete set of features common to a whole population. Every individual has a set of biological features inherited from their ancestors which, theoretically, could include any or all so-called 'races'. Human beings have a continuum of features that is heavily interlaced amongst all the 'races'.

Putting it another way, if we were alien visitors, and had in front of us a representative sample of dead bodies from the entire world, we would be hard pressed to sort those bodies into 'races' based on biological features.

For example, we currently use melanin levels as a key indicator of 'race' today, but an alien, lacking the social context of the significance of, say, high levels of melanin, may well consider it a secondary feature since it's shared with otherwise unrelated people.


>> People are not 'race x' or 'race y' biologically, as if race is some discrete set of features common to a whole population.

Clearly people are biologically different based on race and the AI here is picking up on that. My kid's orthodontist even told me they align teeth in part based on race. The Asian arch is flatter across the front for example. I asked about this because an engineer I worked with had a father in dentistry and told me my kid had "German teeth in an Irish mouth" which matched her ancestry, which he didn't know - just said that in response to my description of the crowding.

So YES, races have biological differences. If not, we wouldn't be able to tell where people are from. I get that it's not cool to discriminate based on race, but it's not OK or even practical to deny that it exists (see dentistry example above).


I think you're missing the point. "Race" is an attempt to put people with different biological traits into 'buckets' but in reality there's variation, overlap, and blurry lines that make any sort of classification a social approximation at best.

No one is going to argue that you get your traits from your ancestors and that regional groups have similar traits due to shared ancestry, it's the classification itself that doesn't match up well with reality.


I agree, but is it accurate to say that our concept of race amounts to labeling different clusters in an N-dimensional space (where the dimensions are largely biological traits)? I.e., while labeling clusters always entails some amount of approximation (clusters are inherently nebulous), we still label clusters because it's useful. For example, the boundaries between languages are imprecise, but we still bin/label them because it's useful to do so--is it roughly the same with race? Or is race categorically different for some reason that I'm not understanding?

I recently read a (non-fiction) book where exactly that kind of clustering was talked about. It's correct that you'll find clusters in each major continent, but with just 6 clusters one of them will be entirely devoted to a small, insular community in the north of Iran. To your eyes and our social consciousness this community would be considered "middle-eastern" when in reality they're about as genetically different as "Black people" and "Asian people" are.

Since they're a small population, they don't matter for making generalizations about middle-easterners or whatever other larger classification you use.

I didn't make any comments on the utility of racial classification, just that the classification isn't perfect, leaves groups out, and where you decide to draw the lines between them is often arbitrary. Race is an oversimplification of a complex system, but even an oversimplification has value.
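
To make the "where you draw the lines" point concrete, here is a minimal sketch of my own, not anything from the paper or the book mentioned above. It runs a plain 1-D k-means on a made-up, evenly spaced "trait" with no natural gaps. The data, the function names `kmeans_1d` and `same_group`, and the way the starting centers are picked are all assumptions for illustration. Real genetic data is high-dimensional and not evenly spaced, so read this only as a toy.

```python
def kmeans_1d(points, k, iterations=50):
    """Plain Lloyd's algorithm on 1-D data with deterministic starting centers."""
    pts = sorted(points)
    # Spread the starting centers evenly over the sorted data (no randomness).
    centers = [pts[(2 * i + 1) * len(pts) // (2 * k)] for i in range(k)]
    groups = [[] for _ in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in pts:
            # Assign each point to its nearest center (ties go to the lower index).
            nearest = min(range(k), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # Recompute each center as the mean of its group; keep it if the group is empty.
        new_centers = [sum(g) / len(g) if g else centers[i]
                       for i, g in enumerate(groups)]
        if new_centers == centers:
            break  # assignments have stabilised
        centers = new_centers
    return groups


def same_group(groups, a, b):
    """True if points a and b were placed in the same bucket."""
    return any(a in g and b in g for g in groups)


# Hypothetical trait values: a smooth continuum, 0 through 11.
trait = list(range(12))

two_buckets = kmeans_1d(trait, 2)
three_buckets = kmeans_1d(trait, 3)

print(two_buckets)
print(three_buckets)
print(same_group(two_buckets, 4, 6), same_group(three_buckets, 4, 6))
```

The algorithm always hands back exactly as many buckets as you ask for, even on a continuum, and two individuals who share a bucket at k=2 can be split apart at k=3. The buckets do describe the data, but the number of them is chosen by whoever runs the clustering.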

Seems like you have a pretty good grasp on my assertion here.


Racial categories are based on a very small number of superficial traits (such as skin tone and eye shape) and a host of cultural considerations (such as region of birth, language spoken, etc. etc.) You can group people this way, but it's not really an approximation to anything. Nor is it useful in a non-circular way. The only real reason to keep track of people's race is to ensure that...they're not being discriminated against on the basis of their race.

> Racial categories are based on a very small number of superficial traits

Right, but genetic correlates are real. These superficial characteristics evolved along with a thousand other non-superficial characteristics, in mostly (with the exceptions of conquest, trade, and border settlements) isolated regions.

Your skin tone and eye shape are indeed cosmetic trivialities, but they correlate strongly with muscle fiber density, susceptibility to certain diseases, endocrine profiles, and a host of other things that very much do matter.


Yes, no-one is denying that racial categories correlate to various degrees with genetic characteristics. This is blindingly obvious – unless there are some people who think that ethnically Chinese people learn to have epicanthic folds from their environment?

Where people err is in assuming that these sorts of trivial observations are sufficient to show that race has a biological basis. Racial groupings are not biologically natural. They are completely arbitrary from a genetic point of view. That is, there are no sensible biological criteria according to which humans can be grouped neatly into a handful of 'races'.

See for example the following comment regarding sickle cell anemia: https://news.ycombinator.com/item?id=28525697#:~:text=azalem... Yes, 'black' people are statistically more susceptible, but that's not because of any property of 'black' people as a group. It's just because regions with malaria happen to have dark skinned populations.


Our concept of race is way too coarse-grained to make it useful in this sense. And those clusters of features you mention would not correlate to what we consider a race.

Indeed. I heard one of the PIs on the 100,000 genomes project [1] giving a talk in which they flat-out said that anyone whose four grandparents were white and irish is basically a clone compared to anyone whose four grandparents were from sub-saharan africa. The whole point is that there's so much variability within each societal clustering that it tends to make not that much sense to talk about it -- and the degree of homogeneity is different, too, mostly depending on ancient geography (hello Iceland, as an example).

There are lots of "exceptions" to this, like sickle cell anaemia, for example [2], which is used as a teaching example of a Mendelian autosomal recessive disease. But note that it goes hand-in-hand with a historical pattern of malaria, covering a fairly large and inhomogeneous blob of africa, the middle east, italy/turkey/greece and india. Our social construct of race varies quite substantially over those places.

[1] https://www.genomicsengland.co.uk/about-genomics-england/the...

[2] https://en.wikipedia.org/wiki/Sickle_cell_disease


So which "race" is the child mentioned above, German or Irish? Sure, they have aspects of both, but they can be in only one cluster.

The utility of clusters is nebulous, too.


> but they can be in only one cluster

I don't think this is true, logically or socially. We have dense clusters that are slowly merging as the world enjoys this extremely new concept of travel and intermingling world populations. I've seen many people describe themselves as "mixed race", directly or indirectly by describing their ancestry. Of course, the number of basis vectors required to accurately describe a person is increasing with time, but it seems that medical science has chosen to ignore this "easy" way to describe and treat people for some reason. But, is it all that surprising, considering not even women were represented fairly in medical trials, even 15 years ago?


> "Race" is an attempt to put people with different biological traits into 'buckets' but in reality there's variation, overlap, and blurry lines that make any sort of classification a social approximation at best.

The same is true of "species". Last I checked, there were at least 24 different definitions of "species", all of which have some overlap and none of which are perfectly precise. You don't see people going around saying "species is a social construct". Then again, maybe they will soon, I suppose anything is possible these days.

That said, you are correct that "race" is merely a rough statistical correlation for some cohort, not some precise measure. If we can categorize more precisely, then we should, and we only fall back to "race" as a last resort (if it's applicable).


The second sentence of the Wikipedia article for “species” reads:

> A species is often defined as the largest group of organisms in which any two individuals of the appropriate sexes or mating types can produce fertile offspring, typically by sexual reproduction.

This is the definition I’ve always heard, and it’s certainly more rigorous than any definition of “race” that I’ve encountered, and makes no reference to any arbitrary social constructs.

Conversely, the definition for “race” is explicitly arbitrary and social:

> A race is a grouping of humans based on shared physical or social qualities into categories generally viewed as distinct by society.


> This is the definition I’ve always heard

It's not as precise as you think, because fertility isn't transitive. Consider members of a species, M1, M2, F1, F2. M1 might be fertile with F1, M2 with F2, but M1 may not be fertile with F2. Are they all really members of the same species?

Read up on the species problem for more information (there are now 26, not 24):

https://en.wikipedia.org/wiki/Species_concept


I can’t claim to have any qualifications to speak about biology, and I know HN is known for proposing reductive “solutions” based on oversimplified models of reality.

That being said, it seems to me that it makes sense that a species would include the entire connected graph whose nodes are members and whose edges represent the ability to have fertile offspring.

I found [1] which is an interesting example of a very large graph, but nevertheless, all the examples seem to be within altogether very similar groups.

I think a reasonable definition for casual use doesn’t need to require the graph of a species to be fully connected, only fully reachable.

There are certainly leaks to any abstraction of species, but they are empirical cases of exceptions. They don’t inject arbitrary social categories into the definition. Definitions of “race” have no empirical basis on which to be proved or disproved in the first place.

[1] https://en.m.wikipedia.org/wiki/Ring_species
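
A rough sketch of the "fully reachable, not fully connected" idea above, as I understand it. The populations A, C, D, E, B and the edges between them are invented placeholders for a ring-species-like chain, and `build_graph` and `component` are just names I picked. An edge means "can produce fertile offspring".

```python
from collections import deque


def build_graph(edges):
    """Undirected adjacency sets from a list of (population, population) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def component(graph, start):
    """All populations reachable from `start` by following edges (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


# Hypothetical chain: each population is compatible only with its neighbours.
ring_edges = [("A", "C"), ("C", "D"), ("D", "E"), ("E", "B")]
graph = build_graph(ring_edges)

directly_compatible = "B" in graph["A"]        # pairwise test between the two ends
same_component = "B" in component(graph, "A")  # reachability test

print(directly_compatible, same_component)
```

Under the pairwise test A and B come out as different species, while under the reachability test they are the same one. Both tests are objective, but they disagree, so choosing between them is a decision about the definition rather than an observation.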


> They don’t inject arbitrary social categories into the definition. Definitions of “race” have no empirical basis on which to be proved or disproved in the first place.

That's simply not correct. The literature abounds with all sorts of correlates with race, like propensity for sickle cell anemia, vitamin D deficiencies, susceptibility to alcohol, and morphological differences. These are just as empirically justified as any classification of species, and just as with species, not all of those properties need apply to every single member in that category.

So to the extent that we find "species" a meaningful category when applied correctly, then we should also find "race" a meaningful category when applied correctly. The key in both scenarios is to apply them correctly, and we should abandon them when we find more precise metrics.

That said, you are correct that there are also numerous cultural and social properties that are sometimes lumped in with race in a manner that we don't see with species, mainly because "species" hasn't been politicized. That doesn't imply that there's nothing "there" once you tune out that baggage.


> That's simply not correct. The literature abounds with all sorts of correlates with race, like propensity for sickle cell anemia, vitamin D deficiencies, susceptibility to alcohol, and morphological differences. These are just as empirically justified as any classification of species, and just as with species, not all of those properties need apply to every single member in that category

I think the thing people misunderstand about race being a social construct is that race is a bucket: by nature, there need to be correlations between people in that bucket in order to place them there in the first place.

There will likely be other correlations between people in said bucket, who are in this case usually more closely related and share more recent common ancestors.

The size of that bucket and how we put people in it is arbitrary, but the fact that correlations exist when you put people in buckets isn't.


That doesn't work because the graph for humans would go all the way back to those ancient apes by many connected edges and back down to modern apes, making us all the same species.

This is another problem with species - where is the transition where one species evolves into another?


IIRC, there are two species of birds, A and B, in the North Atlantic that cannot interbreed. But A can breed with C which has a range to the east, and C can breed with D again further east. And so on, around the arctic circle, until you get to Q which can breed with B.

A modern definition of "species" generally gives a set of objective criteria that you can use to determine if two organisms are the same species (such as "can they produce fertile offspring"). Is there any definition of race like this? Genuine question, I don't think I've ever seen a definition of race more specific than "a collection of people with similar physical characteristics" or something like that, and I can't find one now. There's lots of definitions of what all the different races are, but none of what a "race" actually is.

Is there any definition of race where, given two people, you can apply some criteria to determine whether or not they are the same race? The criteria can't be "look at the definitions of each race and categorize the two people" because that's circular, how were those particular categories chosen? With species, you can determine that a bird and a fish are different species without knowing what birds and fish are.


The imprecise definition of "species" is exactly what I was describing. See: https://en.wikipedia.org/wiki/Species_concept

I agree we don't have a perfect definition of species, but I'm wondering if we have literally any definition of race. Has anybody even tried? Are there any criteria more specific than "if two people look very different then they are different races"?

Can't say to be honest, I don't have that much interest in the topic of race beyond its informal use to define cohorts in studies. More than likely someone could start from ethnicity and expand it from there, but the whole endeavour is likely doomed from the start given the sensitivity of the subject.

You're so, so close to getting it. And then you decide to just hop over the point :(


I admire the confidence of someone who can say in the same paragraph that there are 24 different definitions (constructions) of "species," then deny that they are a social construct at all.

I fundamentally disagree with a definition of "social construct" that would have to exclude species as a meaningful category.

Edit: but even if I did agree with the definition, if "race" is as useful a category as "species" despite being socially constructed, all of the people calling race a social construct have utterly failed to make the case that we should stop using it.


I strongly dislike (almost despise) this argument. Race is a surrogate variable for genetic history. The classification itself indeed matches with reality, as it is generally made on ancestry.

Approximation it may be, imperfect it may be, construct it may be, but its utility is very much real. Attempts to handwave it away do not change the fact that race and genetics are very much linked as it is used today.


> The classification itself indeed matches with reality, as it is generally made on ancestry.

I would dispute that claim.

Due to population bottlenecks among the first 'out_of_africa' groups, a black passing south indian is genetically a lot closer to a white as milk finnish person than various african subpopulations are to each other. (Africans are orders of magnitude more diverse than the rest of the world, in a genetically quantifiable way.)

Race markers like latin american and hispanic betray the fact that some countries (argentina, chile) are almost entirely white, others have denisovan dna (natives) or are racial Frankenstein's monsters due to slave trade (Brazil). It makes no sense to use these umbrella race denominations.

Race as an overloaded term for sociological, anthropological, genetic and medical use is stupid. It just becomes a terrible tool for each. Genetics has smartly stopped using race much, but the others still continue to do so, despite the inconveniences it brings.

There is utility to race, only because we refuse to cut the middle man and identify clusters directly from genetic data. No one needs a cockerel to wake you up, when alarm clocks have been invented. Honestly, typing this comment has just made me want to invest in these 23andMe-like companies.


>There is utility to race, only because we refuse to cut the middle man and identify clusters directly from genetic data. No one needs a cockerel to wake you up, when alarm clocks have been invented. Honestly, typing this comment has just made me want to invest in these 23andMe-like companies.

There's a reason for this, and that reason is cost. Even the relatively cheap microarray-based tests that 23andMe uses are expensive at scale. Race, imperfect as it is, serves as a low-cost, reasonably effective proxy in many (but not all) cases.

Remember, for medical logistics, it's not about getting perfect care (are you sequenced yet?). It's about getting cost effective care, lest you bloat costs to high heaven.


Does anyone really believe that medical logistics is the most common or most impactful use of "race"?

With regard to this topic, where we're literally discussing race in an X-ray?

"Latin American" and "Hispanic" aren't race markers. They're a regional identity and an ethnicity, respectively, which exist alongside race.

What's the "genetic history" behind being black? If it means basically anything other than that your ancestors came from Africa in the past 100k years, you're missing many groups commonly considered "black" like Australian aborigines and Negritos. If that is your definition, then basically everyone is black.

Where's the utility in that?


> its utility is very much real

What's the utility of the concept of race?


Low cost variable for modulating medical treatment? It's often used in such scenarios.

> is an attempt to put people with different biological traits into 'buckets' but in reality there's variation

But this is true for everything. For example "night and day" - these are just buckets, but nobody would argue that there are no differences between night and day because of that.


Random side note: glasses, for example. I bought Ray-Bans and my nose bridge isn't the right shape for them, which sucks.

Could be genetic, I suppose, but I wasn't aware of this as a thing, e.g. Asian-fit sunglasses.


Blue is a color. Yellow is a color. The fact that green exists doesn’t mean those colors aren't real.

Is magenta a color? A real one? We perceive it as such, but there is no place on the electromagnetic spectrum corresponding to magenta. If you went hunting for magenta photons, you'd be hunting for quite a long while. What that says to me is that our intuitive mental construct of color isn't very reliable, even though it feels so real. Likewise, maybe we should be less trusting of our intuitive labels for other phenomena.

Tigers are real. Lions are real. The fact that Ligers exist doesn’t mean they’re not two different populations.

The point is that just because two categories can mix doesn’t automatically mean that those categories don’t meaningfully exist.


The differences between blue, yellow, and green are pretty minor compared to radio or gamma radiation.

No, I don't know what the point of this comment is. But I'm not sure I understand the point of the parent comment either.


So you're saying there is an Asian race spanning 4.5 billion people from China to India to Iran and there are also German and Irish races. I guess the question is how can Germans and Irish be different races, but Chinese, Indians and Iranians are the same race?

Africa is going to be similarly diverse as Asia.


I would say this is an observation that the labeling isn't nearly granular enough in those very large regions. From my experience, people from those regions will use much finer labels, to describe people, than the "this side of the planet" labels of "Asia" and "Africa".

More diverse, there is more genetic diversity within Africa than outside of it.

> Clearly people are biologically different based on race and the AI here is picking up on that

I think the parent is saying that it's possible (likely even) that the AI isn't picking up on biological features, but some other artifact. For example, perhaps the quality of x-ray machines or technicians correlate with race (race and "access to higher quality radiology" both correlate with wealth) and the AI is really picking up on the quality of the imaging. The fact that the AI still worked when the imaging quality was reduced across the board (pixelated into 8x8 squares) suggests that this particular hypothesis is unlikely, but this is the kind of error we're discussing.


>> an engineer I worked with had a father in dentistry and told me my kid had "German teeth in an Irish mouth"

Off topic, but this sounds very engineery, indeed. Was the conversation polite?


Don't forget Sinodonty.

Aliens may as well sort us by the colour of our pants as soon as the colour of our skin, but let's not pretend that such an obvious visual variation as skin tone would be overlooked.

White cat, tabby cat, grey cat, etc? We don't try to say one sort of cat is better than other, but we can tell them apart very well.

Maybe what you're saying is that the aliens would not have the same prejudices associated with that marker as we have.


No, I am saying that skin tone would not be as important a feature.

For example, lining up those dead bodies by skin tone alone would have Central Africans mixed in with South Americans, Austronesians, and South Asians, Northern Europeans mixed in with East Asians and Inuit, Southern Europeans mixed in with Indians, Central Asians, Native Americans, and Arabs, etc etc.


But it's not limited to skin tone. Have you never been able to identify an albino African as African?

But you're moving away from the premise of the GP.

I don't see how.

That depends on the Alien's primary mode of sense. Which may not be sight. Sight is particularly strong for humans. But what if the Aliens primarily distinguish based on smell like many animal species with poor colour vision?

> Aliens may as well sort us by the colour of our pants as soon as the colour of our skin

This made me chuckle. But it is a good example. I bet there is a lot we can infer from people based on their common go-to colors for clothing.


I never realized how odd the concept of "blue jeans" was until I travelled outside the US.

Supposing there are aliens, FWIW, I don’t know that they would perceive the same visible spectrum that we do, which might affect their ability to differentiate by color. And they might not even perceive colors at all, perhaps focusing on patterns, or perhaps using some sort of acoustical sonar system of high pitched shrieks or an electrocapacitive perception captured by elongated dactyltennae hands with which they can sense the world around them.

The alien is hypothetical, a proxy point of view that's theoretically unbiased and has no awareness of human social context, history, etc.

Sure, but aren’t they just extending that to remove the “bias” of human-sensory-organs ? :P

"But I take higher ground. I hold that, in the present state of civilization, where two races of different origin, and distinguished by colour, and other physical differences, as well as intellectual, are brought together, the relation now existing in the slaveholding states between the two is, instead of an evil, a good — a positive good," John C. Calhoun said.

"Nonsense," replied the thistleglorb. "You both have exactly the same capacitance."


> People are not 'race x' or 'race y' biologically, as if race is some discrete set of features common to a whole population. Every individual has a set of biological features inherited from their ancestors which, theoretically, could include any or all so-called 'races'. Human beings have a continuum of features that is heavily interlaced amongst all the 'races'.

Exactly. One of the people I'll always point to as evidence that race is a social construct is Barack Obama. He was the "first black president". In reality he is, genetically, as "white" as he is "black". We still call him black because of the color of his skin.

If people insist for long enough that racial categories are inherently biological you'll eventually end up in one-drop judgement territory. Not a great place to have a discussion.


One does wonder how the AI would do by the 1/16th legal definition.

That race is mostly a social construct does not negate the fact that it can also exist physiologically.

'Aliens' visiting Earth would immediately categorize us into groups very crudely resembling the groupings we use today, because our visible characteristics are the most immediately obvious artifacts of our existence.

(Edit: when I say 'immediately' I'm indicating this would be an obvious, first order thing to do from the first pictures they have of us, not an 'Enlightened Alien' scientific form of categorization).

If all they had were 'pictures' of us, the race categories we use would be the obvious grouping, or something resembling that.

They would see that most of the people in Sub Saharan Africa looked quite different from those in East Asia. (And the difference between Sub-Saharan Africans and East Asians is more than 'melanin').

There's a 'continuum' between every biological grouping, that doesn't mean those categories don't exist. It just means we're going to argue a lot about where and how to draw the lines.

Race as a 'Social Construct' relates to all of the other attributes that we associate with race, and individual lived experiences due to how they are perceived etc..

To your point, Aliens wouldn't immediately pick up on the 'Social Construct' bit, at least not right away and so they wouldn't have the prejudices that we do, but if they could only observe from afar, they would see exactly what we see, and visual distinctions would be the 'first order of separation' even if it was, after further understanding (i.e. genetics) a less important distinction as you hint.

Edit: someone provided this link and I'd like to also include it [1], which illustrates some of the current debate over the notion of race, and that it's clearly politicized.

[1] https://en.wikipedia.org/wiki/Race_(human_categorization)


It’s a social construct because the lines aren’t drawn along genetic lines. For example Americans call someone that’s 3/4th white and 1/4th black a black person.

Further features that seem very important to us like the shape of our eyes or skin tone may seem irrelevant to a creature which doesn’t have a face and sees the world in a different color spectrum than we do. They might group based on smell or habitat marking groups of urban, suburban, and country dwellers.


Part of it is social and part of it is genetic.

From a social PoV, in Cuba, someone as you describe, 3/4 one and 1/4 the other, would likely be put in the category of the 3/4. In some places an albino is just someone with a different genetic characteristic; in other places they are a creature that brings bad luck and they suffer violence.

Dog and cat breeds are distinguished throughout the world. It's genetic and socially constructed as well.


"the lines aren’t drawn along genetic lines. "

Race is definitely drawn along genetic lines - you've just demonstrated that in your example.

The 1/4 black - 3/4 white person was not identified as 'Asian' in your example, but rather, in the mixed physical scenario was crudely categorized as one or the other, in the example you gave, Black.

(FYI I tend to disagree a bit about the category though: I believe people will be categorized mostly for how they actually look, not so much the 'ratio' of anything. There's a lot of 1/2 Filipino 1/2 White people on TikTok who 'look' 100% White and make funny videos about the fact nobody believes them about their heritage).

Your example shows that race is definitely a social construct, but that it also has underlying genetic realities.


> 'Aliens' visiting Earth would immediately categorize us into groups very crudely resembling the grouping we use today, because our visible characteristics are the most immediately obvious artifacts of our existence.

An alien taxonomist, perhaps.

We humans go "bug! whale! snake! cow!" most of the time even for species found here on our own planet.


Yes, and if we were to most crudely distinguish between sub-groups of 'bugs whales and cows' we would do it first and foremost based on immediate appearance.

We would even name the sub-groups first after the artifacts that differentiate them physically.

Blue Whale, Grey Whale

Black Bear, Brown Bear.

As we develop a better understanding, we'd also probably later determine that the genetic differences may not map very well to the physical attributes ... but due to historical groupings based on visual cues, we'd continue to overstate/understate the differences in the textbooks and in pop culture.


I'd tend to hope a species capable of interstellar travel would have learned the same lessons already about not classifying based on mere outward appearance.

They'll presumably have had things like https://en.wikipedia.org/wiki/Carcinisation happen on their own world.


> 'Aliens' visiting Earth would immediately categorize us into groups very crudely resembling the groupings we use today

I wonder about that... I can't tell the difference between, say, a male octopus and a female octopus, but octopi can. Maybe the differences that seem so obvious to us are actually almost imperceptible outside of our species.


> 'Aliens' visiting Earth would immediately categorize us into groups very crudely resembling the groupings we use today, because our visible characteristics are the most immediately obvious artifacts of our existence.

Whatever categorization they came up with would most certainly not resemble what we consider racial categories today.


"I tried to classify your species and I realized that you're not actually mammals. Every mammal on this planet instinctively develops a natural equilibrium with the surrounding environment, but you humans do not. You move to an area, and you multiply, and multiply, until every natural resource is consumed. The only way you can survive is to spread to another area. There is another organism on this planet that follows the same pattern. A virus." -- Agent Smith

> Human beings have a continuum of features that is heavily interlaced amongst all the 'races'.

That doesn't mean race doesn't exist any more than the fact that height is a continuum doesn't mean that short people and tall people don't exist.


"That doesn't mean race doesn't exist"

Exactly, although not in the traditional sense. There are many, many overlapping genetic aspects in humans; we subdivide for political or social reasons, not strictly biological ones. To make the height comparison more fair, it would be as if we divided people into "bigs" or "littles" arbitrarily and formed political parties around it, etc. Height is one biological aspect and even then what is "tall" is subjective.

For example, the US viewpoint of white or black(~=african) is a relatively recent way of looking at race. People don't slot neatly into X or Y buckets.

I recommend reading the wikipedia article on Race [0].

0: https://en.wikipedia.org/wiki/Race_(human_categorization)


But I think you've gotten looped around here to a stance that fails to account for the original context. It wouldn't be that surprising if an AI could categorize people into "tall" vs "short" from pixelated medical images; certainly nobody would say, as the parent comment did, that the AI must be cheating because there's no such thing as being biologically tall.

But no one would argue that the definitions of "tall" and "short" weren't social constructs, even as the fact of human length from heel to crown clearly is not a social construct.

Funny, I read this immediately after leaving a comment (https://news.ycombinator.com/item?id=28525122) using height categories as an example of a social construct.

Race is socially constructed, just like the idea of who is short and tall is. The objective things in each case are genes and height, which are not socially constructed.

This is an error on two counts: it conflates a spectrum on a single quality with spectra across multiple qualities, and it presumes the existence of people who are the “most” of their race (the way that there is a tallest and a shortest person). The first isn’t what race is, and the second is absurd.

Does it? I'm 6'5" (196cm in Roman Catholic[1]); in Amsterdam I'm at the upper range of normal while in many other places I am Godzilla and have to watch my step to avoid crushing things.

[1] A humorous reference. Sorry.


Race in the context of physical differences does not map to the social concepts of race.

For example, US documents usually include Latino as a possible race, even though Latin Americans are white, black, indigenous Americans, or a mix - and Spaniards are what usually would be classified as white. If you check older forms you'd see Italians and Irish people categorized as a different (non white) category, etc.


Do US documents do that? In my recollection, you're usually asked to specify a race, and then "Latino" is included in a separate ethnicity question.

You're correct. US documents generally have race and ethnicity separate, with Hispanic/Latino origin being an ethnicity.

You are correct. Most people who identify as Latino/Hispanic in the US consider that to be their race though, not ethnicity. So Census data has a lot of people who identify as Race: Other Ethnicity: Hispanic/Latino.

Race is distinct from Nationality. "Latino" is not a synonym for citizens of Latin American countries, but the name of a racial group which originated from that region of the world.

It gets confusing with countries like China, Japan, and India, which are more racially homogeneous, and where the country name is the same as the (common) name of the predominant racial groups.


The parent is talking about ethnicity (culture), not nationality. "Latino" isn't a race, it's an ethnicity. In particular, there are many "races" of people who are Latino (white Cubans, indigenous Mexicans, black Brazilians, etc can all be "Latino" despite different races and nationalities). Indeed the "latin" in "Latin America" and "Latino" was originally a language category--these peoples all spoke romance languages (Spanish, French, Portuguese, etc).

> "Latino" is not a synonym for citizens of Latin American countries, but the name of a racial group which originated from that region of the world.

It really isn’t. Where are you getting this idea from?


You've apparently asked a very interesting question, judging by the volume of replies!

The other replies here are mostly good, but I'd also like to note that "race is a social construct" refers to how "races" aren't really objective categories (What defines if somebody is "white"?) and more of a subjective thing, particularly at the margins. We can build classifiers that can match most people's (in our current cultural context) perceptions most of the time, but that doesn't make it a rigid natural phenomenon.

For example, I could build a classifier that looks at household finances and decides if people are lower, middle, or upper-class. I'd bet that I could get it good enough that most people off the street would agree with the results most of the time. However, that doesn't make "social class" some sort of objective, unchanging, universal truth. Somebody from 100 years ago would probably find us all to be upper-class. Somebody from the far-flung future would (hopefully) find us mostly to be near-destitute.


> You've apparently asked a very interesting question, judging by the volume of replies!

Agreed, and upvotes too! It seems like I've struck upon something people have been interested in talking about.

> The other replies here are mostly good, but I'd also like to note that "race is a social construct" refers to how "races" aren't really objective categories (What defines if somebody is "white"?) and more of a subjective thing, particularly at the margins. We can build classifiers that can match most people's (in our current cultural context) perceptions most of the time, but that doesn't make it a rigid natural phenomenon.

I certainly believe this to be the case, but when I hear "race is a social construct" it's almost always in the context of denying biological differences between the races in the same way that some extreme (though mainstream and influential) people take "gender is a social construct" to mean that literally all differences between the sexes are socially constructed including height, weight, strength, etc (otherwise known as "blank slatism").

That said, unlike biological sex, there are fewer valid social implications that we can draw from race (e.g., there are a bunch of social implications which fall out from women's unique ability to bear children, but no analogues which fall out from race) and we have drawn many false implications from race which have been tremendously harmful to individuals of different races, so if we have to reduce everything to a slogan or a binary (as our simplistic society increasingly demands), then "race is a social construct" isn't a bad one.


> when I hear "race is a social construct" it's almost always in the context of denying biological differences between the races in the same way that some extreme (though mainstream and influential)

Racial segregationists in the Americas literally had to write laws delineating races based on factors external to that individual (e.g. their parentage). That is: even people who believed in a racial hierarchy also believed it was not possible to objectively identify a person's race without knowing e.g. the races of that person's parents.

Race has never been considered a generally observable fact about a person.


But is this only because they lacked a theoretical understanding of genetics (including an understanding of which combinations of genes that express the features that we attribute to race, which we still presumably lack by-and-large) and sequencing technology?

> which combinations of genes that express the features that we attribute to race

If an arbitrarily large grouping of genotype combinations is necessary to categorize people, perhaps that categorization scheme is not useful? I would imagine that the number of "races" generated via the mechanism you posit would measure in the hundreds or thousands, rendering it unrelated in practice to the word "race" as it is used.


>some extreme (though mainstream and influential) people take "gender is a social construct" to mean that literally all differences between the sexes are socially constructed including height, weight, strength, etc (otherwise known as "blank slatism").

Could you give a reference to someone who actually says this?


Perhaps those arguing against the idea that "gender is a social construct"?

>Somebody from the far-flung future would (hopefully) find us mostly to be near-destitute.

I love thinking about future historians' view of today; I think it's better and more useful than futurism.


It can be both real physiological differences and a social construct, as "what level of bone density tips you from x to y" becomes a question, as does "is an x person with calcium deficiency actually a y person" sort of things that are obviously not the case.

As someone with a scientific bent who is of the left, I always find it incredibly frustrating when people say “x is a social construct”, because it’s technically true, but also utterly elides the dual nature of the category. Race is a social construct that can be used to infer true things (probabilities) about the real world! Other social constructs that have this property: sociology, economics, physics…

This isn’t to say a lot of people who are into race science don’t wildly overstate their claims, but there isn’t literally nothing to it.


Agreed on both fronts. There's a lot of attempts here to use "race is a social construct" to indirectly claim that race is meaningless, or at least imply that.

People need to realise the following:

- Race is a social construct
- It's also a proxy for ancestry
- Ancestry is a proxy for genetic history
- None of the above contradict each other.

It is possible that we could sufficiently redefine race and ethnicity such that the above isn't true, but as it is right now, race is at least moderately coupled to a biological signature.

What should also be emphasized is that race isn't the be-all and end-all. The within-race variation is far greater than the between-race variation.
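To make the within-versus-between point concrete, here is a minimal sketch of a variance decomposition. The two groups and their values are invented purely for illustration (they are not from the paper or any dataset); the only point is how total variance splits into a within-group part and a between-group part.

```python
from statistics import mean, pvariance

# Hypothetical measurements in arbitrary units; invented for illustration only.
groups = {
    "group_a": [0.85, 0.95, 1.00, 1.05, 1.15],
    "group_b": [0.88, 0.98, 1.03, 1.08, 1.18],
}

def decompose(groups):
    """Split total (population) variance into within- and between-group parts."""
    everyone = [x for values in groups.values() for x in values]
    n = len(everyone)
    grand_mean = mean(everyone)
    # Within: size-weighted average of each group's own variance.
    within = sum(len(v) * pvariance(v) for v in groups.values()) / n
    # Between: size-weighted squared distance of group means from the grand mean.
    between = sum(len(v) * (mean(v) - grand_mean) ** 2 for v in groups.values()) / n
    return within, between, pvariance(everyone)

within, between, total = decompose(groups)
print(f"within={within:.6f} between={between:.6f} share between={between / total:.1%}")
```

With numbers like these, nearly all of the variance sits inside the groups, even though the group means differ. Real data could of course look different; this only shows what the claim means arithmetically.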


Are you positing a close ancestral link between Africans and Australian natives, say closer than between either and northwestern Europeans?

You can't say anything about someone's bone density by deciding that they are black. Averaging the bone density of everyone that you've defined as black might be predictive of distributions of the bone densities of large numbers of people you've defined as black.

But it's just because of the metrics you've chosen. If you started defining race by bone density, and ignored ahistorical half-biblical half-mystical 19th-century human taxonomies, you might find that their classifications aren't interesting or useful for most things. If you controlled for effects that are affected by differential social treatment (like diet and upbringing), you might find most of your metrics are ghosts.

The process of sorting people into boxes for differential treatment based on their qualities affects their qualities.


I happily grant much of what you’re saying; I think the difference here in my view is that you can define groups in one way, and find that _other things_ can be predicted. No one said, “we will define African Americans by whether or not they have problems with high sodium causing heart issues” and then determined that African Americans have this medical problem. Race is certainly a suboptimal way to predict this medical problem; much better to have statistically determined genetic markers, or better yet, a complete rendering of the mechanisms of action. But we don’t.

I’m all for “variation within groups is larger than variation between groups”, both as a literal reality and also as a practical answer to the question of how do I go through the world without being a jerk to people.

Racial categories are squishy and permeable. So are ideas of species and subspecies. They’re useful tools with sharp limitations. Again, some people certainly oversell the utility of the tools, often for malign ends. But it doesn’t mean it’s an entirely bankrupt view.


The following is a response to only your first paragraph:

Huh? If the distributions of bone densities among “people you would visually identify as ‘race X’ ” and “people you would visually identify as ‘race Y’ ” differ, then knowing that an individual would be someone you would visually identify as ‘race X’, gives you some amount of statistical information about their bone density.

I guess you just mean that you can’t make any high-confidence statements that are substantially different than if you didn’t have that information?

Small note on second paragraph: I don’t think the racial categories in question are even half-biblical? I mean, I know in the Old Testament there are lots of things referring to e.g. edomites, or Amalekites, etc. , but I don’t think these are really like, “races”, and I can’t think of anything that really supports the idea of a fixed set of racial categories. But perhaps I’m forgetting/missing something.


> If the distributions of bone densities among “people you would visually identify as ‘race X’ ” and “people you would visually identify as ‘race Y’ ” differ, then knowing that an individual would be someone you would visually identify as ‘race X’, gives you some amount of statistical information about their bone density.

Only if they differ substantially. If they look like https://evergreenleadership.com/wp-content/uploads/2014/01/B..., you can't conclude much from an individual's bone density measurement, even if "people of race X have 3% more bone density than people of race Y".
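A rough sketch of what that looks like numerically, assuming both groups are normally distributed with the same spread. The means and standard deviation are made-up values in arbitrary units, chosen only so that one mean is 3% higher than the other; they are not taken from the paper.

```python
from statistics import NormalDist

# Hypothetical distributions (arbitrary units), for illustration only.
group_y = NormalDist(mu=1.00, sigma=0.12)
group_x = NormalDist(mu=1.03, sigma=0.12)  # mean is 3% higher than group_y

# Overlapping coefficient: 1.0 means identical curves, 0.0 means disjoint.
overlap = group_x.overlap(group_y)

# With equal group sizes and equal spreads, the best single-measurement guess
# is "X if above the midpoint, Y otherwise".
threshold = (group_x.mean + group_y.mean) / 2
best_accuracy = 0.5 * group_y.cdf(threshold) + 0.5 * (1 - group_x.cdf(threshold))

print(f"overlap={overlap:.3f} best accuracy from one measurement={best_accuracy:.3f}")
```

Under these assumed numbers the curves overlap almost entirely and the best guess is only slightly better than a coin flip. A larger gap or a tighter spread would change that.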


I took the word "anything" too literally.

I interpreted "conclude anything from" as "have a different posterior probability distribution" rather than "have a significantly different posterior probability distribution". This was an error on my part.
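For what it's worth, the gap between "different" and "significantly different" can be sketched with Bayes' rule, here in the direction of the reply above (guessing the group from one measurement). The distributions, the 50/50 prior, and the function name `posterior_x` are all assumptions made up for this example.

```python
from statistics import NormalDist

# Hypothetical distributions (arbitrary units), for illustration only.
GROUP_Y = NormalDist(mu=1.00, sigma=0.12)
GROUP_X = NormalDist(mu=1.03, sigma=0.12)

def posterior_x(measurement, prior_x=0.5):
    """P(group X | measurement) by Bayes' rule, given a prior P(group X)."""
    like_x = GROUP_X.pdf(measurement)
    like_y = GROUP_Y.pdf(measurement)
    evidence = prior_x * like_x + (1 - prior_x) * like_y
    return prior_x * like_x / evidence

# A measurement sitting exactly at group X's mean.
print(f"prior=0.500 posterior={posterior_x(1.03):.3f}")
```

The posterior does move away from the prior, so in the literal sense there is information, but under these assumed numbers it barely moves.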

Thanks for the elaboration.


It's a deepity, ie. a statement with two interpretations, one true but trivial, and one enormously impactful but false.

that is a wonderful word, thanks

also, i miss reading your comments on SSC, are you on ASX?

Yeah, just not much to contribute lately.

How race is a social construct: it was imposed as an idea for reasons that were not connected to physiological fact (although often claiming the contrary at the time, the claims were the same kind of made up BS as phrenology, another idea from the same time period).

There are biological correlations with inherited genetic lineage. That only has weak correlation with assigned race. Lots of different lineages of people are "black", Africa is big and diverse. Lots of different lineages of people are "white". But for example, if your lineage is from a malarial region, your chances of being a genetic sickle cell carrier are higher. Most people from those areas are dark skinned, so it correlates with being "black".

Also there's the impact of racism, which affects everything from nutrition to poverty to pollution exposure, and does so on an individual, a regional, and a national scale. And this has biological consequences.


The problem is the question itself, both the one you asked and the one the researchers asked. Is it recognizing ethnicity or race? Clumping ethnicities as "race" is the social construct. It's not just skin that is different between people. If a certain culture prizes certain features in people, people with those traits will breed more, thus amplifying biological features within a socio-cultural group.

My question is, using medical imaging can it particularly identify say people of african origin but it cannot tell apart east vs west africans? Can it uniquely identify asians but can it not tell apart an indian person from a Korean? Or given proper training can it discern between north and south koreans or between a french person and a greek person?

Race is a social construct not because groups of people are all identical but because both science and the major religions have concluded that humans share a common human (Homo sapiens) ancestry; therefore there is one human race and multiple ethnicities and geographical super-ethnicities (South East Asian and North European, for example).

Edit: This is also why race on id cards is silly. Not because you don't want to identify people based on appearance but because ethnicity is more granular and leads to less confusion. Would it be more identifying to say indian or asian? African or north-african?

I strongly believe the modern black/white/asian "race" is a darwinian invention to try and understand and classify nature better, based on intuition instead of science.


Nature does not create "bins" to sort people or things. Some traits are clustered and we latch on the most visible clusters to define "races".

In reality, there is broad overlap and if you look up close, the whole concept becomes hairy. Someone with a father of Scandinavian descent and a mother with African lineage, what is that person, black, white, 50/50?

It's the same with gender/sex. While the biological substrate is clearly variable, the categories are social.


Not quite: Sex is very clearly split into two distinct mechanisms. There are definitely sex-linked distributions of different traits, like male and female typical ranges of height, or agreeable personality, risk-seeking/risk aversion and so on, where both sexes operate the same mechanism.

Races are mostly clusters in variation within one mechanism - eg. skin color is largely a gradient of more or less melanin, and what gets selected for depends on the environment. It's not intrinsically linked to the whole cluster of race-typical traits, those just are in the same genetic bundle that gets inherited from generation to generation, and most of those traits can be mixed.

Sex itself has a sharp divide from the mechanisms themselves being completely different.


To advance your point, the ability to bear children is kind of the definitional distinction from which all of these biological distinctions derive as well as tons of social distinctions. Race has no such analogue.

What about infertile women or intersex persons?

How about cat vs dog?

I'm not following.

"the definitional distinction".. ?

I don’t know what point you’re making, but my “definitional distinction” remark meant that “sex” is more or less defined by the ability of one group to bear children (fertility issues aside, of course). Not sure if that addresses your concern.

There is no "concern". Think about the other part of your comment rather than the gender bit. Good luck.

If you insist on being unclear I'm not going to work to tease out your meaning. Have a good day.

No, between the two ends of the gender spectrum, there are all kinds of shades of grey, starting with persons with intersex genitals and all the way up to women with a more male personality, body hair and so on.

Nature doesn't care about categories and will happily produce all kinds of distributions. It's us humans who try to bin them cleanly.


Intersex conditions are disorders of physical sex development, and the specific disorders themselves are most of the time sex-specific. It is a physical medical condition, and they certainly do not create a spectrum of sex: There is not a third or an in-between kind of breeding setup, just bugs in building the proper things. Bugs that are insanely rare.

Women with more male-like personality goes into that "variation in a shared system that has sex-linked distributions": There are women that have more typically male personality configurations and the reverse, just as there are tall and short people. Nothing surprising about that, and it doesn't relate to sex itself being a binary dictated by two different reproductive mechanisms.


Sorry, you are falling for a category system that is entirely human-made. Nature doesn't care about "disorders" but produces the whole spectrum. Nature doesn't even care that the evolutionary fitness of many variations is zero.

“Race is a social construct” does not mean that heritable biological differences between groups of people do not exist. It means that the categories we use to classify people into races are cultural. For example the "black" and "white" categories in the US.

Isn't that ethnicity?

Race: biological category

Ethnicity: cultural category

Both race and ethnicity are cultural categories. Example: Barack Obama is considered black in the US even though he has a white mother. This is a consequence of American culture, the historical "one drop" rule.

Came here, wanting to, but afraid to, ask just this question. Glad you asked. Glad some were willing to give thoughtful answers.

"race" as we use it colloquially is a lot less fine grained than biological differences between humans. For instance there's not a monolithic genomic signature around "black" or "white" and yet those are categories we use for "race".

As someone pointed out in another reply, "races" are buckets people are distributed into, frequently based on physical aspects. The point of saying "race is a social construct" is the nature of those buckets.

Northern Europeans, Southern Europeans, and Eastern Europeans (or rather people descending from those areas) differ in a number of physical aspects. Are they different races? For those who think race matters today, it seems not---they are all "white". Back in the heyday of scientific racism in the first half of the 20th century, they absolutely were---that is why the US had different immigration limits for different parts of Europe. Are native Australian people the same race as Africans?---there's no especially close genetic relationship as far as I know.

Physical differences exist. How you use those differences to divide people into groups and, more importantly, how you treat people of those groups is a social construct.


If the software is told of the existence of race via an attribute in its ML training, then it will see race in additional data. Otherwise, it won't, IMHO.

Also, due to omission of many other variables (such as culture), those variables are being conflated with race. I personally think a lot of what is commonly accused of being racism, sexism etc. is really just "culturalism" or "preferentialism"... let me think of an example... Given two bars, one filled with rap music and the other filled with techno music, and I pick the techno one every time... am I racist (assuming the predominant race in these 2 bars differs)? Or just preferentialist, culturalist or (frankly) "techno-ist" (if that were a thing)?

Because I think it's much harder to get angry about preferences than it is to get angry about racism, I think that given a choice, we need to consider the less-triggery inputs to a perceived problem


It’s biologically defined the same way as height is biologically defined, and socially defined the same way as ”tall” (180cm+) is defined.

  if there are all of these biological correlations with race, what does it mean that “race is a social construct”?
I could divide people into 'tall', 'medium' and 'short' buckets. Someone else could categorize them all into just two categories (eg: 'tall' and 'short') instead, and someone else might argue that I've chosen bad cutoffs, and tell me I incorrectly put a bunch of 'medium' people in the 'short' bucket. People have differences, but the criteria by which we choose to group them is subjective.
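As a toy sketch of that point: the same measured heights get different labels depending on whose cutoffs you use. The heights, cutoffs, and scheme names below are all invented for the example.

```python
from bisect import bisect_right

# Two hypothetical bucketing schemes: (cutoffs in cm, labels).
# A height equal to a cutoff falls into the bucket above it.
THREE_BUCKETS = ([165, 180], ["short", "medium", "tall"])
TWO_BUCKETS = ([175], ["short", "tall"])

def label(height_cm, scheme):
    """Return the bucket label for a height under the given scheme."""
    cutoffs, labels = scheme
    return labels[bisect_right(cutoffs, height_cm)]

heights = [158, 165, 172, 180, 191]  # made-up sample
for h in heights:
    print(h, label(h, THREE_BUCKETS), label(h, TWO_BUCKETS))
```

The heights never change; only the labels do. Someone who is "medium" in one scheme is "short" in the other, and neither scheme is wrong about the measurement itself.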

That is only true if the distribution of things within the continuum is not "lumpy". For height it isn't, but for other things it is. For example gender is highly bimodal, yet still a continuum. Would you say that "male" and "female" are subjective socially constructed categories? Of course not.

Race is probably somewhere in-between. There are people all over the spectrum but there are pretty clear large groups with sparsely populated gaps in-between them.


I would argue that "male" and "female" can be subjective socially constructed categories in certain contexts. For example, intersex people exist, people with hormone conditions exist. Whether these people are male or female is a very hairy and nuanced conversation that is viewed on a case-by-case basis, and sometimes creates controversies, such as women who have XY chromosomes and naturally elevated testosterone being disqualified from competing as women in elite sports.

See: Caster Semenya, who is disqualified from competing as a woman because she has XY chromosomes and naturally elevated testosterone. She's intersex. Her only supposed recourse is to take medication to force her testosterone levels lower to be qualified as a woman. Of course, an organization that defines who counts as a woman for the purpose of qualifying for women's sports is arguably making socially constructed categories.


Right, but the existence of edge cases does not mean that the "male" and "female" labels are completely arbitrary.

The words "bed" and "sofa" aren't arbitrary social constructs just because sofabeds exist.


Your comment merits a more thoughtful reply than this, but I discussed the same issue here some months ago (https://news.ycombinator.com/item?id=26876186) and it's hard to work up enthusiasm now.

> if there are all of these biological correlations with race, what does it mean that “race is a social construct”?

What race is Obama? And how is your answer to that question not a social construct?


From what I understand, "race is a social construct" doesn't mean that there can be no physical differences. After all, humans have been able to tell* what race someone is with their stupid human eyes. It's making a claim that the important aspects of race are a social construct.

If the defining physical differences between races are both melanin and sternum width, that doesn't seem to be more relevant than just skin melanin.

* In many cases. It's more error prone than most people want to admit.


> humans have been able to tell* what race someone is with their stupid human eyes.

It's more that humans have been able to define, with significant disagreements, what race someone is with their stupid human eyes. Race is nothing but a collection of these judgments. I'm not brown because I'm black, I'm black because I'm brown.


>It's more that humans have been able to define, with significant disagreements, what race someone is with their stupid human eyes.

Sorry, that's also true.

My asterisk was talking more about the inability to visually tell the difference in many cases. The black people in America who could pass as white in the 1950's (or now). Or how Latino and Middle-Eastern people often get confused. There are numerous cases, both specific and general.

But yes, it's not an objective truth being measured, and I'm sorry if I implied otherwise.


Race, of course, is based on physical traits. But the groupings and boundaries are culturally determined and very arbitrary. Minor nose and cheek shape differences are obvious to the Hutu and Tutsi (enough that they massacred each other on the premise they are different racial groups) but to the average American they're both just black. Similarly, a century ago, it was entirely obvious to the British and the Italians that they were different races of people, although closely related. Today that idea is completely alien in America, where they're both white.

Perhaps the best example is the American views on what defines blackness. Because I grew up in a community where mixed race white/black was seen as a distinct race from white or black, I have a very hard time interpreting race the way Americans do sometimes -- Barack Obama and Kamala Harris's mixed parentage is obvious to me, which makes them obviously not black to me (think like in the same way Obama is obviously not white) and I have a hard time wrapping my head around Americans seeing them as such, but apparently they do, since even a small amount of physical traits that suggest recent African descent categorizes you as black there.


Yes, some anatomical differences are certainly affected by environment, but the quoted phrase is typically used in the context of treating races differently based on behavior ("X people are immoral", "Y people are stupid"). It's meant to encourage people to treat each other with equity and consider that such observed differences may be distorted by news, your like-minded cohorts, sociological conditions, etc, which are all theoretically addressable. But maybe I'm misunderstanding your question, since clearly the idea that races have anatomical differences is not contested by anybody - after all, anatomical difference is the very thing we base the word "race" on, colloquially. You don't need non-obvious differences to see that, since we already have obvious differences.

> since clearly the idea that races have anatomical differences is not contested by anybody

This is not true. I had a college professor explain the "race is a social construct" idea, and her position was staunchly that there was _no_ biological basis for race. See also this article by Scientific American[1]:

> Today, the mainstream belief among scientists is that race is a social construct without biological meaning.

This is the idea that GP is responding to - clearly there must be some biological basis for race, if an AI can determine race from an x-ray.

[1] https://www.scientificamerican.com/article/race-is-a-social-...


> This is the idea that GP is responding to - clearly there must be some biological basis for race, if an AI can determine race from an x-ray.

Not necessarily; they could easily reflect societal differences. Bone density could be affected by diet, or medical care received, or environmental factors in poor neighborhoods, which might easily vary by race without there being a genetic cause for that variance.

If it were a simple "all black people have a bone density of 3; all white people have a bone density of 2" you could pretty solidly conclude a biological component, but we're in the realm of small variances and probabilities that confound things quite a bit.

I think the most likely explanation for these results is something like https://techcrunch.com/2018/12/31/this-clever-ai-hid-data-fr..., personally. Especially with the 8x8 pixelation example.


I'm not sure how your link applies to this situation. That refers to an interesting way an AI encoded data in an image. The issue here is that this data _exists in the first place_. Whether the AI learned to compress the racial data into an 8x8 picture is beside the point.

The point is “the AI might not be doing what you think it is doing, because it doesn’t actually understand the goal”.

You're misunderstanding the context of the thread. At this point in the thread, we're talking about whether or not there is any biological basis for race, not about the AI example in particular. The premise here is that there is a biological basis--that race is a label applied to a cluster of genetic traits (various facial shape features, skin color, etc) which are almost certainly genetically constructed.

The comment I replied to stated "clearly there must be some biological basis for race, if an AI can determine race from an x-ray".

I'm pointing out that's not necessarily true. First, we have to prove that the AI is determining race from an x-ray. Its effectiveness at doing it via 64 pixels makes me skeptical that it is.

After that, we get into sticky territory determining whether any biological differences we can identify between races are caused by genetics, or environmental/societal factors like disparities in healthcare, diet, neighborhoods built over old Superfund sites (https://en.wikipedia.org/wiki/Love_Canal), etc.


Ah, my mistake.

There is a massive difference between there being a biological basis to some of the phenotypes associated with racial categories and there being "biological meaning" to identifying races.

"Black" represents a wide range of genotypes, many which differ more from each other than from "white" individuals and populations, even if there may be other tendencies like bone density and novel genetic features that appear more commonly or exclusively in some subsets of the "black" population. The skin colour phenotype being (usually) darker just happens to be very easy to notice and have acquired a lot of socially constructed meaning. Except in the narrow context of skin tones, it isn't biologically meaningful to consider "black" people as a particular group though, particularly not compared to more specific genetic markers that don't have the same socially constructed meaning...


The point of the article is that if you clustered people based on genetic or anatomical properties in a non-post-hoc way, you would not end up with the same system of racial categories that are commonly used in the USA (or indeed any other country). As another poster has pointed out, the most striking example of this is the category 'Latino', which while often treated as a 'racial' category obviously has no biological basis.

This does not entail that no biological traits are correlated with race – biological traits are correlated to varying degrees with all kinds of subgroups of people. It does, however, mean that racial categories have no scientific basis.

To expand on this, imagine a Martian scientist studying human biology in isolation from human culture. Such a scientist would not subdivide humans into groups that match common racial categories, as these groupings are arbitrary from a biological point of view.


The idea that races have anatomical differences is very much contested, and rightly so.

You'd better hope your anesthetist doesn't believe that next time you need surgery: https://pubs.asahq.org/anesthesiology/article/107/1/4/8323/E...

That article writes about very specific ethnicities with shared ancestry. Race is something entirely different.

That’s what the study calls into question.

More often than not, AI/ML finds some surprising shortcut. I am pretty sure this will be the case too, because "races" and their (hypothetical) biological or genetic foundations don't map at all.

I would like to agree with that, since I would like to believe race is a construct that we can dispense with at some point.

That said, assuming the paper is just wrong is simply bias.

If there is an identity associated with race, there is no reason to think it doesn’t impact biology. There is no reason to assume it’s genetic.


Well if race has biological basis and also if race is inherited, there is no way around a genetic basis. Also there is a more than hundred year history of scientific research and no genetic (or other) grounding for race has been found. So yes, that is the source of my bias.

> Well if race has biological basis and also if race is inherited, there is no way around a genetic basis.

False. Diet and behavior affect biology amongst other things.

> Also there is a more than hundred year history of scientific research and no genetic (or other) grounding for race has been found.

We didn’t have machine learning or big data 100 years ago.


> False. Diet and behavior affect biology amongst other things.

Yeah, so if a South-Asian orphan is adopted into a Swedish family, he magically ceases to be of whatever race were his parents and becomes white. That's... not how the concept of human race works.


According to many proponents of CRT or wokeness, absolutely he becomes white. People from typically non-white ethnicities are routinely accused of being white if they adopt cultural traits associated with white people.

And it’s entirely possible that if he was adopted at a young enough age, whatever this AI is detecting would read him as white too, assuming his habits and diet affected his development, as they might.

Indeed that would be consistent with your claim that there is nothing genetic about race.


My claim is that race is not a meaningful concept.

Do you think that racism is real?

Yes, racism is sadly very real, it is based on perceived ingroup/outgroup differences and as such is only one of such ideologies (I come from a territory where people used to kill each other if they belonged to different branches of christianity, or later if they belonged to different nationality). They are all based on false beliefs and propagated by people who are trying to profit from hate and conflict.

The genetic differences between different people in Africa are far greater than the genetic differences between black and white people in North America. The term "race", as used to describe belonging to different groups that generally experience their lives vastly differently from other groups, is thus inherently social. The hypothesis goes as follows - if black and white people are so similar to each other biologically, why are their life experiences so different? It would seem that there's more to it than just the biochemical composition of our bodies that dictates what kind of a life you'll have.

Differences between African groups being bigger than between blacks and whites in NA don't make the rough categorization no longer work. For example, there's rather little genetic variety within dogs, or between dogs and wolves, but that doesn't say much about how impactful the variation is: Chihuahuas and Huskies are genetically really similar to each other, but no one in their right minds would disagree that you can comfortably put them in different, very real, very genetics-based buckets.

In general, arguments of variation within a group are not arguments against considering between-group differences, or those between-group differences being real. The variation within a category cannot be dismissed, though, it's hugely important for understanding the world properly and for guiding personal conduct.


The following article has a detailed explanation of why the analogy with dog breeds is not a good one:

https://evolution-outreach.biomedcentral.com/articles/10.118...


> The genetic differences between different people in Africa are far greater than the genetic differences between black and white people in North America.

This is often repeated, but the point of the question is that the OP calls this into question.


Lots of great replies but missing the most important from a scholarly journal:

Slavery, Race and Ideology in the United States of America: https://u.pcloud.link/publink/show?code=XZ3bwqXZT2m8MI2egSRA...

Side note: If you've seen Ken Burns's docs then you've probably seen her before: https://en.wikipedia.org/wiki/Barbara_J._Fields


If you consider it from the perspective that race and culture are typically conflated into just race, then "race is a social construct" is fairly logical.

>If you consider it from the perspective that race and culture are typically conflated into just race...

I'd say that that only occurs in the odd case of 'Latino'. For some reason Spanish-speaking got amalgamated into a thing it didn't belong.


It happens for Europeans too. White has come to mean everyone of European descent, even though there's tons of ethnic, historical and regional differences across the continent. Asia even more so. Africa is also very diverse when you take into account Northern Africa. The whole Mediterranean area had lots of groups moving around for thousands of years across three continents.

>White has come to mean everyone of European descent

I'd be surprised if anyone in Europe even thought of themselves as 'white' until (perhaps) the post-WWII era. Wogs begin at Calais.

>Africa is also very diverse when you take into account Northern Africa.

I suppose that you could consider North Africa as a separate continent given the bordering desert. My understanding is that sub-Saharan Africa has more real-deal genetic diversity in human populations than the rest of the world put together (which makes sense given its age).

Obviously race is a real thing. Either a giant tree of relatedness with clusters of appearance + small construction differences or simply a way to form self-interested groups (go to any prison for 10 minutes). The fascination with it as of late is unfortunate, but I'm not sure if it's a reflection of resource depletion/overpopulation, a wave of quasi-religion, or people simply forming up teams for a big fight.


I am thinking mostly with respect to African Americans...when people talk about "racism", rarely do I see anyone distinguish between biology and culture, on either "side" of the issue...which I suspect is a huge contributor to the lack of progress on the issue.

Everything is a social construct depending on how you look at it. Some people say that biological sex is a social construct.

The differences exist on a spectrum and there’s no hard line between one or the other. Yes, as you mention, there are biological correlations. But at the edge cases, it’s quite subjective. How many races are there? Ask a different person and they’ll have a different list. Is Asian a race or is that ridiculously general? etc etc

Race is both biological and social.

Biologically, forget about bone density etc. Skin color, facial features, face shape, etc are discernable directly.

Race as a social construct is the idea (originally rooted in imperialism and colonialism) that certain races are inferior mentally and societal development wise. A few Brits saw Africans residing in huts and living on farming using primitive tools and concluded that they could not develop any further than that and that their brain development was limited. (Of course, this was perpetrated to enable guilt free enslaving and "civilizing" them and exploiting them for labour. I am sure that if African societies were simply introduced to western civilization and allowed to trade and travel, the ideas from the West would have been adopted and assimilated much more quickly)

So, biologically, races are distinct, identifiable and have evolved to meet the needs of their local environment. But socially, races as inferior or superior was perpetrated with ulterior motives and have been shown as false time and again.


The idea of a social construct is just a social construct man

Or rather, a spook.

Race is a proxy for clustering groups of biological traits

> of biological traits

or genes

I saw a visualization of the clustering on twitter.

https://twitter.com/rokomijic/status/1426614501856751620


Basically groups may have physical characteristics. European nations starting from at least the 1500s began using these physical/religious groupings to mark certain groups for predatory expropriation and premature death.

To quote Ruth Wilson Gilmore: “The racial in racial capitalism isn’t secondary, nor did it originate in color or intercontinental conflict, but rather always group-differentiation to premature death. Capitalism requires inequality and racism enshrines it.”[1]

Cedric J. Robinson (among others) has discussed how capitalism and racialization are continually co-created.

1. Abolition Geography and the problem of innocence, in Futures of Black Radicalism.


Racism - judging people by the colour of their skin - has been on life support for a long time [1] in most liberal republics/democracies but nefarious actors are attempting to revive it in the name of identity politics. It is this type of race which can be detected using X-rays. The same group tends to support the doctrine of cultural relativism [2]. Claiming all cultures are equally valid they call criticism of behaviour related to specific cultural traits "racism", mangling the definition of that term in the process. Cultures, by definition, are social constructs.

Cultural relativism is not the norm and should not become such since there are differences between cultural traits where it is possible to state that some are objectively better than others. As an example, the cultural trait of genital mutilation is objectively worse than that of leaving girls' bits alone - and I'm open to stating the same about boys even though that would raise up a storm of protest. The cultural trait of parents marrying off their offspring without said offspring having a say in the matter is objectively worse than that of having the offspring decide for themselves who they want to share their life with. The cultural trait of having people who achieved success within the bounds of the law - whether those be inventors, writers, athletes, successful farmers, builders or architects or anything else - is objectively better than that of having successful criminals and hoodlums as role models - yes, "street culture" with gang bangers as role models is objectively worse than whatever name can be given to cultures which have/had those inventors (etc) as role models.

X-rays can not be used to detect whether you might mutilate your newborn's genitals, marry off your 5yo daughter to your 20yo nephew or leave your children to be raised by the local street gang leaders since these traits do not depend on the colour of your skin even though there is often a correlation; correlation does not imply causation [3]. Take for example Michael Skråmo [4], a Swedish-Norwegian man who very much looked the part of such but ended up as a recruiter for islamic state in the Nordic countries. Contrast him to e.g. Luai Ahmed, a Yemeni refugee who lives in Sweden and is a vocal critic of everything Skråmo stood for. It was not Skråmo's white skin and blonde hair which made him ready to pick up a Kalashnikov, it is not Ahmed's brown skin and black hair which made him averse to the negative cultural traits related to islam.

MLK was right when he longed for a society where people would be judged on the content of their character and we were well on our way of achieving that goal. Unfortunately there are those who derive their identity - and income - from their purported position as fighters against racism (without scare quotes), a fight which was nearing its conclusion. While most old soldiers fade away [5] some have taken it upon themselves to revive their old enemy so as to keep their purpose - and income - alive. Their culture is not mine and I consider it to be objectively worse than, e.g. MLK's. If you then consider that MLK was a "black" man while I am of north-west European descent and as such have "white" skin the truth becomes clear, it is not the colour of our skin which makes us alike - it is the content of our character.

Race is not a social construct. Culture is. Nature is not a social construct, Nurture is.

[1] https://www.nationalreview.com/2015/11/racism-america-histor...

[2] https://en.wikipedia.org/wiki/Cultural_relativism

[3] https://en.wikipedia.org/wiki/Correlation_does_not_imply_cau...

[4] https://en.wikipedia.org/wiki/Michael_Skr%C3%A5mo

[5] https://en.wikipedia.org/wiki/Old_soldiers_never_die


> Racism - judging people by the colour of their skin - has been on life support for a long time

You would think so, but the proliferation of affirmative action among tech companies and prestigious universities says otherwise.


Affirmative action had a place just after desegregation. Now, it is just one of the examples of identity politics, pushed in a doomed attempt to placate the deconstruction crew. Given that deconstruction of societal institutes is a stated goal for these people they do not care whether it actually achieves anything productive for either those who are given positions based on identity categories, for the affected institutes or for society as a whole - as long as it creates strife the purpose has been fulfilled.

Because race is not a social construct.

I initially had this same belief, but I realized I had such a limited understanding of race after reading the wikipedia article on it [0]. That we divide not based on actual genetics but visible markers makes it a social construct. And the way we've chosen to look at race politically has changed over time.

"Because the variation of physical traits is clinal and nonconcordant, anthropologists of the late 19th and early 20th centuries discovered that the more traits and the more human groups they measured, the fewer discrete differences they observed among races and the more categories they had to create to classify human beings. The number of races observed expanded to the 1930s and 1950s, and eventually anthropologists concluded that there were no discrete races.[93] Twentieth and 21st century biomedical researchers have discovered this same feature when evaluating human variation at the level of alleles and allele frequencies. Nature has not created four or five distinct, nonoverlapping genetic groups of people."

0: https://en.wikipedia.org/wiki/Race_(human_categorization)


A more accurate view would be that it really is both.

It's a social construct, but it's not fully arbitrary, being historically (and currently) used as a proxy for ancestry, and thus genetics.

>That we divide not based on actual genetics but visible markers makes it a social construct

This is going too far towards the other end. Physical characteristics stem from genetic variations, and consistent patterns in appearance are often linked to some shared ancestry. It's not the end-all, but it's hardly without cause either.


I think everyone would agree that there are some pretty obvious biological racial markers that differentiate racial groups. Skin color comes to mind :)

The social construct argument just says that the specific categories and lines we draw are fairly arbitrary. Why is an afghani middle eastern (white?) but a Pakistani south-Asian? Is a Russian from Vladivostok really more closely related to a brit than a Mongolian? Idk - but, to me, the social construct argument just says “who cares? The specific groupings are pretty arbitrary anyway”


I think we can agree that thanks to our minds we have leaped over our physical differences encoded in our DNA. Evolutionary differences have little influence on our survival and reproduction, socio-economic constructs and technology has pretty much taken over that role.

The funny thing is that it isn't really arbitrary at all.

It's location based. Humans only recently gained the ability to travel vast distances, and in the past lived (and bred) within a small, localized region. The "arbitrary" location based ethnicities actually do reflect that genetic ancestry

Of course, really wide ones like black/white/asian lose some of their meanings.


It's not a surprise that race affects how someone looks. It doesn't take a genius to see things like different colored skin, different nose, or height differences...

The question is what about the looks of a chest X-ray are connected to race. I agree with the research here, it's non obvious what is being extracted by the AI.

If I had to guess, maybe something about the quality of the scan itself. Perhaps one race was scanned at one particular hospital, vs a different hospital scanning a different race. Then it's just picking out the different scanner.


Or: Race is a rough proxy for breeding populations that have been separated for long time, and subject to different environments' selection pressure. Over time, selection and blind luck build up all manner of small differences, which both observably exist at the surface (so why not inside the body) and that you'd expect to exist from basic evolutionary principles. You'd expect a Norwegian forest cat, a Saharan desert cat and a Burmese to be different in innumerable small ways because they grow up in totally different environments. You'd also expect there to be a lot of overlapping, well, catness to them all. Lions purr, after all. There's nothing complicated about any of it, humans just tie themselves into knots when it comes to humans in a way they don't when it comes to cats.

With that said, the simple explanation is that the AI picks up on these small patterns in a way humans don't. The brain and neural networks are fundamentally pattern-recognition engines. The AI is just seeing something we don't either notice or can't see.


> Lions purr, after all.

They do not, actually. Incapable of it; cheetahs are the only big cats that can do it.

https://www.nwf.org/Magazines/National-Wildlife/1995/Questio...

> Purring ability, rather than size or behavior, is one of two chief distinctions between the two main genera of cat, Felis and Panthera.


Huh, TIL.

The way we interpret race is largely a social construct due to 'apparent' physical differentiation, less so underlying biological or genetic factors.

We end up segregating ourselves for a variety of reasons, in which case groups that are physically very distinguished end up forming almost an ethnic basis.

For example, two groups with varying genetic makeup and maybe a number of non-obvious biological differences, but who otherwise looked identical - would have a similar life experience in terms of their social treatment by other groups.

But irrespective of how a person is socialized - if you're Black, people are going to treat you one way, and if you're White, people are going to treat you a little different. That 'lived experience' differential is somewhat unavoidable.

The degree of that variability is obviously debatable, but surely it exists to some degree.

I suppose you could make a parallel in ethnicity: a century ago, the difference between a Scottish-American and an English-American would have been apparent by lineage, accent, Church affiliation, and that might have affected relationships, status etc..

Whereas after a few generations of integration, there is definitely 'no' (or not much) difference between those two groups, and no vector for differentiation/discrimination. The historical ethnic situation was a 'social construct'.

That said, some of the argumentation used to promote the idea that there is no genetic basis for race is a little odd, the 'Africans have more genetic variation than other groups combined' is often used, but frankly I do not understand how that doesn't mean there are material differences between them and other groups.

And of course there is no 'hard line' between groups, but there is also no 'hard line' between the Scottish and English, there are many people who have attributes of both cultures, but that doesn't negate the existence of either group.

I think we're a bit oversensitive these days to these issues. Systematic racism exists and we should think about it, but that doesn't mean there's a boogeyman behind every door.

I think in this case it's also worth examining what exactly the AI is finding out, because it may not be just 'bone marrow'.


It might be helpful for folks to look at the blog post written by one of the authors:

https://lukeoakdenrayner.wordpress.com/2021/08/02/ai-has-the...

or the paper itself

https://arxiv.org/pdf/2107.10356.pdf

I see a lot of "oh it's probably just picking up on x y z" when x, y, and z are things they explicitly checked for:

1) "It's probably just the names or other metadata" – they only gave it pixel data to train on. To control for things like metadata overlaid on the image (e.g., a name written on the image) they divided the images into 3x3 sections and trained classifiers on each section separately.

2) "It's probably some artifact of how the hospital marked up the images" – they used something like 7 different datasets from different hospitals and different modalities (X-Ray and CT).

If it is cheating somehow, it's not doing it in an obvious way that you can think of in a minute or two. Also note that they had more than just medical folks working on the paper; the author list includes plenty of computer scientists. It's unlikely they're making an elementary ML mistake here.
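For anyone wondering what the "3x3 sections" control in point 1 could look like mechanically, here is a rough sketch. It is not the authors' code; the function name `split_into_grid` and the toy image are made up for illustration, and the real pipeline presumably works on image tensors rather than nested lists.

```python
def split_into_grid(image, rows=3, cols=3):
    """Split a 2D image (list of equal-length rows) into rows x cols patches.

    Returns a dict mapping (grid_row, grid_col) -> patch (list of lists).
    Leftover pixels that do not divide evenly are dropped from the
    right and bottom edges.
    """
    height, width = len(image), len(image[0])
    patch_h, patch_w = height // rows, width // cols
    patches = {}
    for r in range(rows):
        for c in range(cols):
            patches[(r, c)] = [
                row[c * patch_w:(c + 1) * patch_w]
                for row in image[r * patch_h:(r + 1) * patch_h]
            ]
    return patches


# Toy 6x6 "image" whose pixel value encodes its position (not real data).
toy_image = [[y * 6 + x for x in range(6)] for y in range(6)]
patches = split_into_grid(toy_image)
```

The idea would then be to train one classifier per grid cell: if only the cell containing a burned-in label could predict the target, that would point to an overlay artifact rather than anatomy.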


One major risk source I see is that the size of the training data for the races isn't the same. For white vs. black patient data, there's between a 2:1 and 3:1 ratio bias in both the training and test data (and a much higher ratio bias for Asian... as high as 20:1 in some of these categories).

This gives the CNN more information on one race than another, which can create a classifier that performs very well on the training and test data it has access to but then flakes spectacularly on data outside the training set (because the source isn't representative of the total variance in the global population).
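A related (not identical) way imbalance can mislead is in the headline metric itself. The sketch below uses made-up labels "A" and "B" at a hypothetical 20:1 ratio and a degenerate model that always predicts the majority class; none of this comes from the paper.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        recalls.append(sum(y_pred[i] == cls for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical 20:1 imbalanced test set with made-up labels.
y_true = ["A"] * 200 + ["B"] * 10
# A degenerate model that always predicts the majority class.
y_pred = ["A"] * len(y_true)

plain = accuracy(y_true, y_pred)               # 200 / 210
balanced = balanced_accuracy(y_true, y_pred)   # (1.0 + 0.0) / 2
```

Plain accuracy looks excellent here while the minority class is never predicted at all, which is why per-class metrics matter when the ratios are this skewed.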


They tested on tons of different external datasets, and at least one of the training datasets was balanced. Same results were obtained.

I don't see why this is necessarily bad. An ML model is picking up on subtle anatomical or physiological differences between races. So what, that doesn't automatically mean the AI is racist or biased...

It's not necessarily bad, if it's actually working.

The fact that it works on an 8x8 massively pixelated version of the x-ray points to the possibility that it's not actually working, which would be bad if you based patient treatment decisions on a training set that was actually teaching the AI something else entirely.


Huh?

What do you mean, not working? That the AI was randomly choosing the correct race 82% of the time by luck?

I'm confused by what you're implying because it would seem to me that the authors went through many steps to try to pinpoint how the AI was doing this identification and how baffling it was to everyone that even with a lot of x-ray information removed (8x8 pixels compared to say 4k), it somehow was still correctly picking the race.

What would this "something else entirely" that you are implying actually be?


> That the AI was randomly choosing the correct race 82% of the time by luck?

No; as with the article I linked elsewhere in the thread (https://techcrunch.com/2018/12/31/this-clever-ai-hid-data-fr...), that the AI might have found some other indicator, like filenames in the data set, or metadata in the images that included patient name, or differences in the length of patient name (often redacted by black rectangles in x-rays in training data), or any number of other factors.

This happens all the time in science. As another recent example of "whoops, turned out we were measuring the wrong thing", https://en.wikipedia.org/wiki/Faster-than-light_neutrino_ano...

Another example around AI: https://www.vox.com/recode/2019/12/12/20993665/artificial-in...

> One such résumé-screening tool identified being named Jared and having played lacrosse in high school as the best predictors of job performance, as Quartz reported.

Are lacrosse players naturally better workers? Probably not. Are they probably whiter, wealthier, better networked, etc. than the average population? Probably. These sorts of things - as with the 8x8 pixel example - start to point to confounding variables that need to be worked out and accounted for.


>the AI might have found some other indicator, like filenames in the data set

The paper quite explicitly goes into testing and dissecting what exactly the AI detects. Two observations:

- the classification clearly was primarily based on the visual content rather than spurious metadata, because various transformations of the visual content had the expected impact on classification correctness

- the classification clearly wasn't based on one specific feature of the visual content but rather on multiple factors in the visuals, because various transformations to features (including masking out specific features like bone density) produced results matching expectations (usually gradual decrease in accuracy, with some thresholds).

Conversely, if the classification was primarily based on factors other than the visual content, the visual transformations would have had negligible effect - possibly up to a threshold, and then would throw the AI completely off.


The faster-than-light neutrino experiment similarly went "we've tried to account for everything we can think of and still can't figure it out" when they published. It turned out to be a measurement error.

The same may be true here, and I think it's the most likely explanation.

I'd be interested in whether the same model can be trained to predict patient wealth, hair color, style of clothing, religion, etc. from the same x-ray data sets.


I am dismayed by your example that runs counter to the modern science.

While "faster than light neutrino" was highly unexpected and rather suspect from the start, the "bone geometry differs slightly between ethnic groups" is well established among the anthropologists of humans. There are also parallels in wider biology of animals - mentioning that to underscore it's as scientifically expected, and not merely construed for humans alone.

The question here was how exactly is AI detecting it this well from chest X-rays; the question centered around AI and possibly if it would unexpectedly influence the medical processes - rather than around the bone geometry itself.

For sake of example, a random link from google search: https://www.researchgate.net/publication/24427702_Ethnic_dif...


I agree that an AI model might be able to glean race on a probabilistic basis from x-rays.

This specific model's ability to do it from a 64 pixel version of said x-ray makes me skeptical it's doing so successfully.
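To make "64 pixel version" concrete, here is one plausible way to get there: block averaging. This is only a sketch under my own assumptions; the paper may have used a different resampling method, and `downsample` and the gradient image are invented for illustration.

```python
def downsample(image, out_size=8):
    """Block-average a square 2D image down to out_size x out_size.

    Assumes the side length is an exact multiple of out_size.
    """
    side = len(image)
    block = side // out_size
    result = []
    for r in range(out_size):
        row = []
        for c in range(out_size):
            pixels = [
                image[y][x]
                for y in range(r * block, (r + 1) * block)
                for x in range(c * block, (c + 1) * block)
            ]
            row.append(sum(pixels) / len(pixels))
        result.append(row)
    return result


# Toy 32x32 gradient standing in for an image; not real x-ray data.
toy = [[x + y for x in range(32)] for y in range(32)]
small = downsample(toy)
```

Each output pixel is the mean of a whole block, so fine structure is gone but coarse intensity layout survives, which is the only kind of signal a classifier could still be using at that resolution.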


What if it was an 8x8 grayscale photo of their face? We wouldn't be particularly surprised that it can guess race. The fact that we struggle to detect patterns in the data doesn't mean they don't exist.

> We wouldn't be particularly surprised that it can guess race.

That's actually a great example of this problem, though.

https://www.theverge.com/21298762/face-depixelizer-ai-machin...

> It’s a startling image that illustrates the deep-rooted biases of AI research. Input a low-resolution picture of Barack Obama, the first black president of the United States, into an algorithm designed to generate depixelated faces, and the output is a white man.

> It’s not just Obama, either. Get the same algorithm to generate high-resolution images of actress Lucy Liu or congresswoman Alexandria Ocasio-Cortez from low-resolution inputs, and the resulting faces look distinctly white. As one popular tweet quoting the Obama example put it: “This image speaks volumes about the dangers of bias in AI.”


Wanting the AI to have the same racial bias as Americans, that is, that Barack Obama, who is half white and half east African, should be categorised the same as someone with west African heritage, is just dumb. Barack Obama has a white mother and a father from Kenya, so he has little resemblance to African Americans, who have mostly west African heritage; the only reason people don't see a difference is because they are so used to categorising people by skin color. Of course getting data from all areas of Africa and all mixes of people would be great, but there are limits, and adding in more west Africans wouldn't have helped accurately depixelate Obama's face.

This is actually pretty funny. In much of the American Deep South, even Americans from other parts of the country can't identify race reliably when sitting next to a person.

The notion that a classifier can reliably identify race based on an 8x8 grayscale is risible.


Your comment is the only one in this entire thread that mentions reliability. Even if it were a meaningful metric in the context of race, an AUC of .68-.72 is a puncher's chance at best.
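For readers unsure how to read an AUC figure: it can be computed as the probability that a randomly chosen positive example gets a higher score than a randomly chosen negative one, so 0.5 is coin-flipping and 1.0 is perfect ranking. A small sketch with invented scores (nothing here is from the paper):

```python
def auc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative.

    Ties count as half. This is the Mann-Whitney formulation of ROC AUC.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Invented classifier scores for three positives and three negatives.
pos = [0.9, 0.8, 0.4]
neg = [0.7, 0.3, 0.2]
value = auc(pos, neg)  # 8 of the 9 positive/negative pairs are ordered correctly
```

The pairwise loop is quadratic, which is fine for a toy example; real libraries sort the scores instead.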

That's 1/4 the size of the texture of a Minecraft block. An AI could easily be trained to identify the block type of a Minecraft block based on an 8x8 subsample of one of its textures, and there's no reason something similar couldn't happen with biological race. Unless you a priori assume all racial differences are socially constructed and/or only skin deep.

Every Minecraft block of a particular type is identical.

The same is... not remotely true for humans, or even two chest x-rays of the same human.


Forgive me if I don't consider "you can tell someone's race from physical features" to be quite as extraordinary a result as "particles can travel faster than light".

The point is less "they're equally significant conclusions" and more "sometimes the thing you thought you discovered isn't a thing".

In principle yes, but did you read the paper? They do a lot of completely crazy things like blurring the image until it's just fuzzy blobs, or doing a high-pass filter on it until it just looks like noise (they comment that a human could not even guess that it's an x-ray picture), and they still get very high accuracy. Basically no matter what they try they can still get the race out, with slightly lower percentage numbers. When reading it I also thought this is too good to be true, and they may have some kind of bug in their code...

It seems clear that if you just, instead of blurring the image, set the images (or the part of the image with the x-ray scan) to the same image, then that would work to evaluate whether it is getting the information from the image or from some other source.

Seeing as this would be easy to do, I imagine that if it is at all plausible from what they know that it is getting information from anything other than the x-ray scan, that they would have already tried this?

I do wonder how good of a predictor something would be if it just went off the average brightness of the image. Probably very bad, but maybe better than chance? Well, better than chance on the training set is to be expected, the question I guess is whether it would be better than chance on the test or validation set (I’m not confident in my understanding of the distinction between testing set and validation set. Is the idea that if you are using the score on the testing set to decide when to stop training, and maybe what hyper parameters to use or something, and other things to determine which model, you only try the model on the validation set once you have decided on your final version of the model?)


>> I’m not confident in my understanding of the distinction between testing set and validation set.

It's confusing, not least because people refer to "testing" when they mean "validation".

So, suppose you have a dataset, let's call it D, and it doesn't matter what's in it other than "instances". To train a classifier you start by creating two partitions of D: a training partition (the "training set"), and a testing partition (the "testing set"). We'll denote them by T₁ for the training set and T₂ for the testing set.

It's typical to use most of D as a training set, for example you may choose 80% of D to be T₁ and 20% to be T₂. Obviously T₁ ∩ T₂ = ∅ and T₁ ∪ T₂ = D.
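As a rough sketch of that split (the 100 integer "instances" and the fixed seed are just placeholders), with the two set identities checked explicitly:

```python
import random

random.seed(0)

D = list(range(100))            # stand-in for 100 instances
shuffled = D[:]
random.shuffle(shuffled)        # shuffle so the split is not order-dependent

cut = len(shuffled) * 80 // 100 # 80% of D goes to the training set
T1 = shuffled[:cut]             # training set
T2 = shuffled[cut:]             # testing set

# T1 and T2 are disjoint, and together they make up D.
assert set(T1) & set(T2) == set()
assert set(T1) | set(T2) == set(D)
print(len(T1), len(T2))
```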

Now, because T₁ is four times the size of T₂ it's very likely that when you test your classifier on T₂, it will appear much better than it is, just because most of the instances in T₁ aren't represented (by similar instances) in T₂. This is called overfitting to the training set. One way to mitigate it is to perform cross-validation, the most common type of which is k-fold cross-validation.

In k-fold cross-validation, you further partition T₁ into k partitions, or "folds". Then, for each i ∈ [1,k], you hold out the i'th partition, use the remaining k-1 partitions as a training set, and test on the held-out i'th partition _during training_. So you train your classifier on every partition except i, test it on partition i, and repeat this process for all i, recording the performance (accuracy, F1, ROC etc, whatever your metric is). Then you choose the model that performed the best on your chosen metric.

And then you test it on T₂.
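The whole procedure can be sketched in a few lines. This is a toy under stated assumptions: the 1-D Gaussian data, the threshold "classifier" in `fit`, and the helper `k_fold` are all invented for illustration, and the model selection follows the description above (keep the fold model with the best validation score).

```python
import random

random.seed(0)

def make_point(label):
    # Hypothetical 1-D feature: class 1 is centred higher than class 0.
    return (random.gauss(2.0 * label, 1.0), label)

D = [make_point(i % 2) for i in range(200)]
random.shuffle(D)
cut = len(D) * 80 // 100
T1, T2 = D[:cut], D[cut:]           # training set and testing set

def fit(train):
    # Toy "training": put a threshold halfway between the two class means.
    x0 = [x for x, y in train if y == 0]
    x1 = [x for x, y in train if y == 1]
    return (sum(x0) / len(x0) + sum(x1) / len(x1)) / 2.0

def accuracy(threshold, data):
    return sum((x > threshold) == (y == 1) for x, y in data) / len(data)

def k_fold(n, k):
    # Yield (train_indices, validation_indices) for each of the k folds.
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        train_idx = [j for f in range(k) if f != i for j in folds[f]]
        yield train_idx, folds[i]

k = 5
results = []
for train_idx, val_idx in k_fold(len(T1), k):
    model = fit([T1[j] for j in train_idx])
    val_acc = accuracy(model, [T1[j] for j in val_idx])  # validation score
    results.append((val_acc, model))

best_val_acc, best_model = max(results)
test_acc = accuracy(best_model, T2)  # T2 is touched only here, once
print(f"best validation accuracy: {best_val_acc:.3f}")
print(f"test accuracy:            {test_acc:.3f}")
```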

To avoid confusion between the k folds of T₁ that you use for testing your training models during cross-validation, on the one hand, and T₂, that you use for testing the model that performed best on cross-validation, on the other hand, we call the testing process performed on the k folds "validation" and each i'th subset of T₁ used for validation a "validation set". And we just call T₂ the "testing set".

The confusion arises because we do actually _test_ on sub-sets of T₁. But T₂ is always the "testing set" and it's never "seen" during training.

As to hyperparameter tuning, this is done _on the testing set_, i.e. T₂. This is A Very Bad Thing™ but there you go. Once you train a classifier and find out that it sucks on T₂, what do you do? Well, you tune the classifier's hyperparameters. Or do a grid search to automate the process. So eventually you overfit your classifier to the test set, because you now essentially have no "unseen" data instances in T₂ - the classifier didn't see the instances in T₂ during training but the trainer did, or, worse, the grid search did, and the classifier's hyperparameters were tuned according to that knowledge. How to avoid that is a big question, but anyway that's what is done in practice, and the reason for that is that when you do Big Data, you end up needing so much data that despite having terabytes of it, you never have enough.
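A toy illustration of why this matters, assuming labels that are pure coin flips (so no classifier can genuinely beat 50%). The "grid search" over a threshold and a flip flag is made up for the example; the point is only that picking the best candidate by its T₂ score gives a number that does not carry over to data that played no part in the choice.

```python
import random

random.seed(0)

# Features and labels are independent: there is nothing to learn here.
test = [(random.random(), random.randint(0, 1)) for _ in range(50)]
fresh = [(random.random(), random.randint(0, 1)) for _ in range(5000)]

def accuracy(threshold, flip, data):
    # Predict 1 when x > threshold (or the opposite when flip is True).
    hits = sum(((x > threshold) != flip) == (y == 1) for x, y in data)
    return hits / len(data)

# Hypothetical hyperparameter grid: 100 thresholds x 2 directions.
candidates = [(t / 100.0, flip) for t in range(100) for flip in (False, True)]

# "Tune on the test set": keep whichever setting scores best on T2.
scores = [(accuracy(t, f, test), t, f) for t, f in candidates]
best_test_acc, best_t, best_flip = max(scores)

# Score the winner on data that was not used to pick it.
fresh_acc = accuracy(best_t, best_flip, fresh)
print(f"accuracy on the tuned-on test set: {best_test_acc:.3f}")
print(f"accuracy on fresh data:            {fresh_acc:.3f}")
```

Because each threshold is tried in both directions, the best score on the tuned-on set can never fall below 50%, however meaningless the classifier is.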


>crazy things like blurring the image until it's just fuzzy blobs

That... that doesn't influence one of the presumed ways the NN categorizes images: the trend in bone geometry. The "blobs", while fuzzy, still largely retain the relative proportions to each other. Or, in other words, proportions of image elements are invariant for operations of scaling and of blurring.
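A toy 1-D check of that invariance claim, assuming a symmetric box blur and blobs that stay well separated after blurring. The blob positions, the `windows` used to isolate them, and the blur radius are arbitrary choices, and none of this says anything about what the network in the paper actually keys on.

```python
def box_blur(signal, radius):
    # Symmetric moving average; positions outside the signal count as zero.
    n = len(signal)
    width = 2 * radius + 1
    return [sum(signal[max(0, i - radius):min(n, i + radius + 1)]) / width
            for i in range(n)]

def centroid(signal, start, stop):
    # Intensity-weighted mean position within [start, stop).
    total = sum(signal[start:stop])
    return sum(i * signal[i] for i in range(start, stop)) / total

# Three rectangular "blobs" on a dark background.
signal = [0.0] * 300
for lo, hi in [(40, 50), (100, 120), (220, 230)]:
    for i in range(lo, hi):
        signal[i] = 1.0

# One window per blob, wide enough to contain it after blurring.
windows = [(0, 75), (75, 170), (170, 300)]

def spacing_ratio(sig):
    # Ratio of the distances between neighbouring blob centroids.
    c1, c2, c3 = (centroid(sig, a, b) for a, b in windows)
    return (c2 - c1) / (c3 - c2)

sharp_ratio = spacing_ratio(signal)
blurred_ratio = spacing_ratio(box_blur(signal, radius=5))
print(sharp_ratio, blurred_ratio)
```

The edges get smeared, but each blob's centroid stays put, so the ratio of spacings agrees up to floating-point error.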


If that is the case, you could write a normal algorithm to extract the proportions and see if it separates your data set nicely (which should be done to prove this assertion).

As an additional comment on this point:

The fact that trained neural networks cannot tell us why they give an answer and the best tool we have to explore that is to wiggle the inputs and see how the black box responds is a major concern for the whole space. Figuring out how to tag data with enough information to generate a "why" was an active area of research ten years ago and still is.


Yep. "Explainable AI" is an active area of research with huge amounts of funding and interest from institutions in the US, EU and China. For example, this is the DARPA programme:

https://www.darpa.mil/program/explainable-artificial-intelli...


Can you explain to me how you recognize your mother's voice?

I cannot, but I don't understand why the question is asked. I'm not a convolutional neural network. And recognition of my mother's voice isn't going to have impact on, say, medical treatment for people of a particular race, or who gets a loan granted to them, or whether an autonomous vehicle successfully recognizes a person crossing the street, or whether a drone's auto targeting system decides that this blob of sensory input is a civilian sedan or a tank, or any number of other situations where the consequences of not understanding how a decision was made are that people are placing their fates in the hands of unaccountable machinery.

One of the points of building these systems is to do better than human-driven.


We rely on human judgements that can't be described all the time in many of the above situations, so why would it be a "major concern" with machines? Machines can be much more thoroughly tested than humans, so they should be more statistically predictable.

Three reasons: one practical, two psychological.

The practical one is that errors in a machine system scale, as do most things with machines. If I have a single bad X-ray tech who is applying the wrong medical process because I have a different race, for some reason, the damage that tech is doing is limited to whatever specific set of patients they are seeing. If a similar error occurs in a popular machine classification tool used widely by a hospital network, the damage is widespread. It is a plus that the machine can be corrected and the correction also scales, but with the (relatively speaking) stone tools we use to understand why a CNN makes its decisions these days, every fix risks breaking something else we're not testing for.

The first psychological reason is that machine learning systems break in "alien" ways. They don't make the kind of mistakes humans make... They make mistakes as a product of their machinery, which means it's much much harder to predict what those mistakes will look like for an average operator. As a frequent example, it's pretty rare for humans to misclassify human beings in photographs as apes, or to fail to recognize a face in an image because the skin is too dark. That's a failure mode that happens over and over again with image recognition systems.

And the second psychological reason is that humans don't trust machines to make human decisions yet. And that mistrust doesn't extend to other humans, even though we're incapable of cracking open another human's mind and understanding their thought process at the mechanistic level. It doesn't matter... we are the same organism and have a shared experience and empathy with them that we lack with machine recognition systems. It's semi-irrational, but it can't be wished away. A system for understanding why a machine makes decisions would be a step in the direction of addressing those concerns.


I'm just making this up, but...

Perhaps hospitals that treat a disproportionate share of poor people (who themselves are disproportionately not white) tend to use a different brand of X-ray film, and that brand has different contrast ratios than the brand preferred by rich hospitals. Thus, they'd be detecting the different brand of X-ray film rather than anything about the patients themselves.

Of course, at this level it's still hard to imagine generating that 82% hit rate. But maybe there are multiple factors along these lines.


> tend to use a different brand of X-ray film

Most of us radiology folk abandoned film 20 years ago and went to digital systems (CR or DR). This doesn’t negate your query though, as vendors do have different technologies and their images do not look the same.


That sounds like a great idea and they can test for it! Classify the scans on the “type of film” and then alter the scan and see if the model recognizes it.

I think the idea is that it's picking up on a coincidental correlational bias in the source data.

Since it's effectively the same anatomical structure, presented in a grayscale image, I think bone density would change the average color of the image regardless of what level of pixelation was applied. The chance of detection through some other method isn't ruled out though. It could be hitting off of something mundane like the margins on the edge of an image, how centered the torso is in frame, or foreign objects in the body from procedures. Any random thing that could bias the data can also lead to false positives. The only way to verify would be to use a new data set, ideally from new hospitals, and see if it has similar results.
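The point about average intensity surviving pixelation is easy to check on a toy image. This sketch assumes pixelation means replacing equal-sized tiles with their mean; the 8x8 random image and the `pixelate` helper are illustrative, not the paper's preprocessing.

```python
import random

random.seed(0)

size, block = 8, 4
image = [[random.randint(0, 255) for _ in range(size)] for _ in range(size)]

def pixelate(img, block):
    # Replace each block x block tile with its mean value.
    n = len(img)
    out = [[0.0] * n for _ in range(n)]
    for r0 in range(0, n, block):
        for c0 in range(0, n, block):
            tile = [img[r][c]
                    for r in range(r0, r0 + block)
                    for c in range(c0, c0 + block)]
            mean = sum(tile) / len(tile)
            for r in range(r0, r0 + block):
                for c in range(c0, c0 + block):
                    out[r][c] = mean
    return out

def global_mean(img):
    return sum(sum(row) for row in img) / (len(img) * len(img[0]))

original_mean = global_mean(image)
pixelated_mean = global_mean(pixelate(image, block))
print(original_mean, pixelated_mean)
```

With equal-sized tiles the global mean is unchanged, so a signal carried by overall brightness would not be removed by this kind of degradation.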

But it does mean that AI models trained on data might end up just perpetuating racial correlations even when you don't think it's aware of race. It's another example of why interpretability is important: the model might end up depending on correlations you don't mean it to.
