
At least this is relatively innocuous. Until recently if you did a Google Image Search for "person" or "people", it only showed white men.

One can play this game a lot, and most queries will return the expected culturally biased results. A "kind person" is apparently a white girl. A "good person", a white woman. A "bad person", white men. An "evil person", white men. An "honest person", an equal mix of white women and white men. "Dishonest person", white men in suits. "Generous person", hands of white women. "Happy person", women of color. "Unhappy person", old white men. "Criminal person", Hispanic men. "Insane person", white men. "Sane person", white women.

Is it surprising that very few of the results surprise me?

Downvoted because this is just a lie.

"Kind person" - pictures of men, women, children, of all ages and colors.

"good person" - Mostly pictures of two hands holding. No clear bias towards women at all. If anything, more of the hands look "male".

"Bad person" - Nearly 100% cartoon characters

Absolutely ridiculous that you would take the time to write up such fake nonsense.

Google searches are not reproducible; different users can get different results for the same query.

Yes. If I had the energy and time to build a properly researched data set, I would have a bot search through the top 100 common words associated with either warmth (sociability and morality) or competence, and then use a facial recognition system to go through the first 100 images of each to determine the distribution of gender, age and skin color.

Following the stereotype content model theory I would likely get a pretty decent prediction of what kind of culture and group perspective produced the data. You could also rerun the experiment in different locations to see if it differs.
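A minimal sketch of the tallying half of that experiment, in Python. Both `fetch_images` and `classify` are hypothetical callables that the caller has to supply (an image-search client and a face-attribute classifier); nothing here calls a real search engine or recognition library, and the stand-in data at the bottom is made up purely to show the shape of the output.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List


def tally_attributes(
    queries: Iterable[str],
    fetch_images: Callable[[str, int], List[object]],  # hypothetical search client
    classify: Callable[[object], str],                 # hypothetical attribute classifier
    per_query: int = 100,
) -> Dict[str, Counter]:
    """For each query, count the classifier labels over its top results."""
    results: Dict[str, Counter] = {}
    for query in queries:
        images = fetch_images(query, per_query)
        results[query] = Counter(classify(image) for image in images)
    return results


def proportions(counts: Counter) -> Dict[str, float]:
    """Turn raw label counts into shares that sum to 1 (empty input gives {})."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: n / total for label, n in counts.items()}


if __name__ == "__main__":
    # Made-up stand-in data: each "image" is already its own label.
    fake_results = {
        "kind person": ["woman", "man", "woman", "child"],
        "competent person": ["man", "man", "woman"],
    }
    tallies = tally_attributes(
        fake_results,
        fetch_images=lambda query, n: fake_results[query][:n],
        classify=lambda image: image,
    )
    for query, counts in tallies.items():
        print(query, proportions(counts))
```

Keeping the fetcher and classifier as parameters means the same tally could be rerun per country domain or per classifier, which is where most of the disagreement in this thread comes from.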

FWIW, this is most likely not a bias of the search engine, but just a reflection of its sources (mostly stock image platforms I suppose). So if most stock images of blue trolls would be labelled with "politician", you'd eventually find blue trolls when searching for "politician".

Did you google all of them?

Yes. I thought about words people use in priming studies, usually in order to trigger a behavior, and just typed the word with space and "person" appended.

I did use images.google.se in order to tell google which country I wanted my bias from since that is the culture and demographics I am most familiar with. I also only looked at photos of a person and ignored emojis.

I have also seen links here on HN to websites that have captured screenshots of word associations from Google Images and published them, so you could click a word and see the screenshot. They tend to follow the same line as above, but with some subtle differences, and I suspect that is the country culture being just a bit different to mine.

You really should link to screenshots of your results so people can judge for themselves.

I just submitted all your searches to google.com from Australia, and the results were nothing like what you described; all the results were very diverse.

This is to be expected, as Google has been criticised for years for reinforcing stereotypes in image search results, and has gone to great effort to adjust the algorithms to reduce this effect.

I usually don't spend time producing evidence since no one else does it, nor did the parent comment, or you for that matter. It also tends to derail discussions onto details and arguments over word definitions.

But here, not that I think it will help: https://www.recompile.se/~belorn/happyvscriminal.png

First is happy person. Out of 20 we have 14 women, 4 guys, 2 children.

Second is criminal person. The contrast to the first image should be obvious enough that I don't need to type it.

If I type in "person" only, I get the following persons in the first row, in the following order: Pierre Person (male), Greta Thunberg (female), Greta Thunberg (female), Unnamed man (male), Unnamed woman (female), Mark Zuckerberg (male), Keanu Reeves (male), Greta Thunberg (female), Trump (male), Read Terry (male), Unnamed man (male), Greta Thunberg (female), Greta Thunberg (female), Unnamed woman (female), Unnamed woman (female)

Resulting in 8 pictures of females, 8 males, which I must say is very balanced (I don't care to take a screenshot, format and upload, so if you don't trust the result then don't).

Typing in doctor as someone suggested in another thread I get in order (f=female, m=male): fffmffmmmmfmmfffmfmfmmmff

and Nurse: fffmffmfmmffmffmfffmffmffff

Interestingly the first 5 images have the same order of gender and are both primarily female, though doctor tends to equalize a bit more later while nurse tends to remain a bit more female dominated.
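For anyone who wants to check the tallies rather than eyeball them, a small Python sketch that counts the two strings above and measures how long their leading runs agree. The function names are just illustrative helpers.

```python
from collections import Counter
from itertools import takewhile

# Gender sequences exactly as listed above (f = female, m = male).
doctor = "fffmffmmmmfmmfffmfmfmmmff"
nurse = "fffmffmfmmffmffmfffmffmffff"


def share_female(seq: str) -> float:
    """Fraction of results in the sequence marked 'f'."""
    return seq.count("f") / len(seq)


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading positions where both sequences agree."""
    return sum(1 for _ in takewhile(lambda pair: pair[0] == pair[1], zip(a, b)))


if __name__ == "__main__":
    for name, seq in (("doctor", doctor), ("nurse", nurse)):
        print(name, dict(Counter(seq)), f"{share_female(seq):.0%} female")
    print("identical leading results:", common_prefix_len(doctor, nurse))
```

On these strings that gives 13 f / 12 m for doctor and 19 f / 8 m for nurse, and the two sequences actually match for the first 7 results, not just the first 5.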

Thanks for the screenshot. It helps (and by the way, yes the onus is on you to provide evidence as you're the one making the original claim).

Your initial comment said "Happy person", women of color.

But your screenshot showed several white people, several men, and a diversity of ages. Yes, more women, which is probably reflective of the frequency of photos with that search term/description in stock photo libraries and articles/blog posts featuring them. No big deal.

You also said "Criminal person", Hispanic men

But the screenshot contains more photos of India's prime minister than it does of Hispanic men. In fact I can't see any obviously-Hispanic men, and the biggest category in that set seems to be white men (though some are ambiguous).

The doctor and nurse searches suggest Google is making some effort to de-bias the results against the stereotype.

To me the biggest takeaway is that image search results still aren't very good at all, for generic searches like this.

Indeed it's likely that they can't be, as it's so hard to discern the user's true intent (for something as broad as "happy person"), compared to something more specific like "roger federer" or "eiffel tower".

I couldn't quite believe your comment when I read it so I did a Google image search for "person" and the results weren't a lot better than you'd suggested. Mostly white men, a few white women, a very few black women, a handful of Asians, and multiple instances of Terry Crews.

The net result of that Google search, combined with the "Shirt Without Stripes" repo, leaves me even more unimpressed with the capabilities of our AI overlords.

I think the skewing of results lessening your impressed-ness is the wrong takeaway. If anything, the AI is a more perfect mirror of the society it learned from than you expected. Perhaps the right way to look at it is that we are capable of producing things that we don't understand, that are more sophisticated than we realize.

You may be right. It's been bugging me since I posted earlier on so I fired up a VPN with an endpoint in Japan, along with a private browsing session in Firefox, to see if I got different results. As it happens the results were interesting:

- If I entered "person" I'd see a mix of images substantially similar to what I saw using google.co.uk up to and including Terry Crews, which was frankly a little weird, and otherwise mostly white

- If I entered "人", which Google Translate reliably informs me is Japanese for "person", I'd see a few white faces, but a substantial majority of Japanese people

So it seems possible that Google's trying to be smart in showing me images that reflect the ethnic makeup I might expect based on my language and location. I mean, it's doing a pretty imperfect job of it (men are overrepresented, for one) but viewed charitably it's possible that's what's going on.

Is the case for woke outrage against Google Image Search overstated? Possibly; possibly not. After these experiments I honestly don't feel like I have enough data to come to a conclusion either way, although it does seem like they may at least be trying to do a half decent job.

This seems like you're attributing motive to google here, but I don't believe that's right. For example, Terry Crews appears in the query "person" because his "TIME Person of the Year 2017 Interview" article was very popular online. I get a lot of Greta Thunberg because she was TIME Person of the Year 2019 and received similar online attention because of Donald Trump.

The TL;DR of it is that google crawls the internet for photos, associates those photos with text content pulled from the caption or from the surrounding page, and gives them a popularity score based on the popularity of the page/image. There are some cleverer bits trying to label objects in the images, but it's primarily a reflection of how frequently that image is accessed and how well the text content on the page matches your query. There's some additional localization, anti-spam, and freshness rating that influences the results too.
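As a toy illustration of that description (emphatically not Google's actual algorithm), here is a Python sketch that scores each image by how well its surrounding text matches the query, multiplied by a popularity number. The `IndexedImage` type, the scoring formula, the example.com URLs and the popularity values are all invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IndexedImage:
    url: str
    page_text: str     # caption plus surrounding page text
    popularity: float  # invented stand-in for how often the page/image is accessed


def text_match(query: str, text: str) -> float:
    """Fraction of query words that appear in the page text (case-insensitive)."""
    query_words = query.lower().split()
    if not query_words:
        return 0.0
    page_words = set(text.lower().split())
    return sum(word in page_words for word in query_words) / len(query_words)


def rank(query: str, images: List[IndexedImage]) -> List[IndexedImage]:
    """Order images by text match times popularity, dropping non-matches."""
    scored = [(text_match(query, img.page_text) * img.popularity, img) for img in images]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [img for score, img in scored if score > 0]


# Hypothetical index: URLs and popularity values are made up.
index = [
    IndexedImage("https://example.com/stock.jpg", "stock photo of a person walking", 2.0),
    IndexedImage("https://example.com/crews.jpg", "TIME Person of the Year 2017 Interview", 9.0),
    IndexedImage("https://example.com/tower.jpg", "eiffel tower at night", 8.0),
]

if __name__ == "__main__":
    for img in rank("person", index):
        print(img.url)
```

Even in this crude version a popular article with "Person" in its title outranks a generic stock photo of a person, which is the effect described above: the ranking reflects page text and popularity, not what the picture depicts.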

The majority of pages with "人" and a photo that has been machine-labeled as a person would show a Japanese/Chinese person, and if you're being localized to Japan with a VPN, that would be even more true.

Google doesn't "know" what you're trying to search. It's a giant pattern matching game that slices and dices and rearranges text to find the closest match.

> Google doesn't "know" what you're trying to search. It's a giant pattern matching game that slices and dices and rearranges text to find the closest match.

I'm not disputing that, and it certainly explains why it's "good enough" for some search queries whilst being totally gimpy for others.

My understanding was that Google does prioritise what it's classified as local search results though, on the basis that they're likely to be more relevant.

This is the problem though, all those companies are advertising fantastical results. They aren't saying "Hey! We spent billions of dollars so our algorithm could be as racist as your uncle Steve!". Oh and by the way, Steve is now right - because all the crimes he ever finds out about are by black people, because that's what Google has decided he wants to see. So it's no longer him seeking out ways of justifying his latent racist tendencies, no, he's outsourced that to Google.

Bing results, "person" shows stick figure drawings, Pearson Education logos, Person of the Year, people named Person, etc.

"Person without stripes" shows several zebras, tigers, a horse painted like a zebra, and a bunch of people with stripes.

> "Person without stripes"

Interestingly, duckduckgo shows me, as second result, an albino tiger with, you guessed it, no stripes. The page title has "[...] with NO stripes [...]" in it, so I assume that helped the algo a bit.

EDIT: I also got the painted horse (it looks spray-painted, if you ask me) and I must admit it's quite funny to look at

If you really want to be disappointed, search for [doctor] and [nurse].

Unless things have really changed, [doctor] will be mostly white men and [nurse] will be mostly white and Filipino women.

But don't blame the AI. The AI has no morality. It simply reflects and amplifies the morality of the data it was given.

And in this case the data is the entirety of human knowledge that Google knows about.

So really you can't blame anyone but society for having such deeply engrained biases.

The question to ask is does the programmer of the AI have a moral obligation to change the answer, and if so, guided by whose morality?

Those look almost entirely like stock photos or part of advertisements. It's probably just reflecting the biases of what photos other businesses like, which get the label of "doctor" or "nurse".

Any sort of image search is going to tend to be biased toward stock photos, because those images are well labeled, and often created to match things people search for.

> The AI has no morality. It simply reflects and amplifies the morality of the data it was given.

Key point right there. Unless Google is deliberately injecting racial and/or gender bias into their code, which seems extremely far fetched (to put it kindly), the real fault lies with us humans and what we choose to publish on the web.

All the young doctors are women. 13 women to 12 men.

Nurses it's 34 women to 5 men. Proportions of skin tones are what I'd expect to see in a city in my country.

What does the color of people's skin in search results have to do with morality? I was raised not to see color, now we have this "progressive" movement hell bent on manipulating search results to disproportionately represent minorities. If you want to filter your search results based on the color of skin you can do that easily.

What bias? Who is biased? Quick duckduckgoing indicates there are far more male than female doctors in the US. So statistically, it would be correct to return mostly male doctors in an image search. If you want a photo of a specifically gendered doctor, it's not hard to specify. Not really seeing a problem here.

> What bias? Who is biased?

I would contend that society is biased. There is no evidence that says men are better doctors than women, and in fact what little this has been studied says that women make better doctors than men (and is reflected in the more recent med school graduation classes which are majority women).

So it's a question of what you are asking for when you search for [doctor]. Are you asking for a statistical sampling or are you asking for a set of exemplars?

> So statistically, it would be correct to return mostly male doctors in an image search.

And that's exactly it. The AI has no morality. It's doing exactly what it should, and is amplifying our existing biases.

> So really you can't blame anyone but society for having such deeply engrained biases.

You can blame statistics for that. Beyond that, you can blame genetics for slightly skewing the gender ratios of certain fields and human social behavior to amplify this gap to an extreme degree.

Honestly, I don't think morality is the issue here; it is objectively inaccurate to show only white men for the search string "doctor" when not all doctors in the U.S. are white men, and most doctors in the world are not white men. This would be like showing only canoes if someone searched "boat"--we would rightly consider that an error to be corrected.

IMO, wrapping it in a concept like "morality" because the pictures have people in them just serves to excuse the problem and obscure its (otherwise obvious) solution.

I tried this as well in an incognito window on Firefox and got the results you mentioned. I notice, however, that virtually all of the results have associated text containing the word person. It seems likely that Google image search featurizes photographs to include surrounding document context.

(That's how I would do it if I wanted more accurate rather than more general results.)

I don’t understand why AI or a search engine had to meet your or anyone’s expectations for diversity. If I searched for “shirt” and didn’t get shirt pictures in the color I wanted I would just tune my query instead.

I just did a google image search for "person". The first 5 images were of Greta Thunberg. She must be the most representative person ever.

The next few images contained Donald Trump, Terry Crews, Bill Gates and a French politician named Pierre Person.

After that it was actually quite a varied mix of men/women and color/white people.

I am still not very impressed with Google's search engine in this aspect, but it is not biased in the way you suggest.

At least it is not biased that way for me. As far as I am aware, and I might be completely wrong here, Google, in part, bases its search results on your prior search history and other stored profile information. It is entirely possible that your search results say more about your online profile than about Google's engine :)

> The first 5 images were of Greta Thunberg. She must be the most representative person ever.

Well, she was the 2019 Time Person of the Year.

Likewise, Trump was the 2016 choice, and Crews and Gates have been featured as part of a group Person of the Year (“The Silence Breakers” and “The Good Samaritans” respectively).

AI can't fix society's problems. AI merely reflects them back.

4 of my top 7 images (the top line) are Greta Thunberg in a search for "person". First viewport is 11 men, 11 women, 1 stick person, of which there are 4 Thunbergs, 4 Trumps, 2 Crews. People seem to be included if they got major "person" awards like "most powerful person" or "person of the year".

There's not much diversity: assuming Terry Crews is from the USA, the whole first viewport of images is Western people; except for Ms Thunberg they're all from the USA AFAICT [I'm in the UK].

The first non-Western person would be a Polish dude called Andrzej Person (the second Person called Person in my list after a USA dancer/actress), then Xi Jinping a few lines down. The population in my UK city is such that about 5/30 of my kids' classmates, in primary and secondary school respectively, have recent Asian (Indian/Pakistani) heritage. So, relative to our population, there are more black people, far fewer people from the Indian subcontinent, and no obviously local people.

Interesting for me is there are no boys. I see girls, men and women of various ages but no boys. 7 viewports down there's an anonymous boy in an image for "national short person day". The only other boys in the top 10 pages are [sexual and violent] crime victims.

The adjectives with thumbnails across the top are interesting too - beautiful, fake, anime, attractive, kawaii are women; short, skinny, obese, big [a hugely obese person on a scooter], cute, business are men.

Most of the person results appear to be 'Time Person of the Year' related. Another result is a guy with the last name Person. The results don't seem to be related to the definition of the word 'person'.

For me it shows all newsworthy people and articles. It shows the titles of the pages and they are all stuff like "11 signs you are a good person". So it seems clear that there is no kind of AI bias here but simply that high-ranking articles with the word person more often than not choose white men as their stock image.

Most of the very top results seem to be of Trump and Greta Thunberg.

You've raised an entirely unrelated problem. Showing shirts with stripes when you search for "shirts without stripes" is just plain wrong. Showing only a single demographic of person when you search for "person" is correct, it just doesn't have the level of diversity you seem to want. Nothing about diversity is implied in the query, and so your observation is completely unrelated to a plainly incorrect query.

On the other hand, the bias in the results means they're somewhat incorrect: there is more than one demographic of person, and showing only one in response to a query that doesn't ask for a particular one is incorrect.

If you were unfamiliar with them and searched "widgets" to find out more and got widgets of a single colour and form, it would not be an unreasonable assumption that widgets are mostly (if not entirely) that shape and colour, especially if there was nothing to indicate that this was a subset of potential widgets.

It's not so much "demand for diversity" as it is "more accurate and correct representation".

A former coworker had the last name "Person". They once received a letter (snail mail) addressed to "TheirFirstName Human".

I never figured out what kind of mistake could have led to that.

Maybe a veterinarian's customer database? They would have to distinguish pet names from humans, but keep a record of all.

Yeah, that's one plausible explanation. (I don't remember the nature of the letter.)

Relatedly, one time I picked up a prescription for a cat. The cat's name was listed as CatFirstName MyLastName. They had another (human) client with that same first and last name. It turned out that on my previous visit they had "corrected" that client's record to indicate that he was a cat.

I think search algorithms still have a long way to go to really understand the intention. Try an image search for "white person", "black person", "asian person", "white inventors", "black inventors", "asian inventors". It doesn't quite deliver what would be expected.

Huh, I tried that with 'people' and the first result that was all white was #15, first result that was 100% men was #8.

If I search for 'person' it's a mixed-race woman, then a white woman (Greta Thunberg), then a white man.

More than racism on the part of google[1] I would attribute that to it being a hard problem with too many dimensions. About three years ago if you searched "white actors" google would give two full pages of only black people (I have no idea whether the actor part was correct).

Many interpreted this along tribal lines, but likely it is that there is constant tuning and lots of complex constraints.

[1] not to say that you implied the reason was racism, but often it is attributed to something along those lines

The inverse: a favorite trope of the American far right is that GIS for "american family" will show you photos of... mixed race families. (Something the far right has strong opinions on, and is a tiny minority of all marriages in the US)

Something of a corollary to Brooksian egg-manning: with an infinite number of possible searches, you can find at least one whose results do not exactly match the current demographics of the state from which you place the search.

Did they manually skew the results of the algorithm once this started making bad PR?

And when I search for "men without hats" I see men from Men Without Hats with hats. Language is hard.

DDG does pretty good for "person" or "people"

What is your point?

Your Google image search did not return incorrect answers, unlike the OP's.

Joke's on you. Not having diversity is now considered incorrect, even if it wasn't stated. AI needs to learn to keep up with the craving for relevance the rest of Silicon Valley has by ensuring all results comply with whatever equal opportunity mantra is now in vogue. The next time I search for "CSS color chart" I expect the preselected color to be black.

Wouldn't that be a reflection of the world's bias rather than Google's bias?

Google American Inventors and you'll get 95% black men.

I hate comments like this that only exist to create drama.
