Why Stanford Researchers Tried to Create a ‘Gaydar’ Machine (nytimes.com)
38 points by JumpCrisscross 11 months ago | 52 comments

The "average face pictures" make it clear that the average weight of the straight people is higher than the average weight of the gay people. My guess is that they actually built a fat detector. It certainly wouldn't surprise me to find that gay people are on average much less fat than straight people. A quick search through the actual papers shows that they made no effort to control for the effects of weight.

I'm having trouble finding any information on the issue, but I did find one paper that refutes your claim[1]. That info was gathered from British people, and didn't show any difference in weight between homosexuals and heterosexuals.

Whether or not fat is what's being detected depends on the dataset I suppose, though.

[1] https://www.ncbi.nlm.nih.gov/pubmed/11911936

Especially when facial lipoatrophy is a well known side-effect of anti-HIV drugs [0].

0. https://www.poz.com/basics/hiv-basics/changes-face-body-lipo...

Edit. I never complain about downvoting (who cares it is just HN), but what exactly are people downvoting?

It is a well known side-effect of the older anti-HIV drugs to lose fat from the face (it tends to be deposited around the belly and organs). Even if you controlled for overall weight, then you would still need to control for facial fat separately.

My guess is that people are downvoting due to the assumption/stereotype that HIV is associated with homosexuality. While it's true that HIV is more prevalent among men who have sex with men in the US[0], some still think of HIV as "the gay disease", which it is definitely not, and the stereotype contributes to stigmatization of homosexuality.

That may not be what you intended, but it's one way of reading your comment that can come across negatively.

[0]: https://www.hiv.gov/hiv-basics/overview/data-and-trends/stat...

Thanks for answering.

Unfortunately, in the population studied (white Americans) HIV is far more prevalent in men who have sex with men (>70% of all new infections) than in heterosexual men and women. If your two populations vary significantly in any factor you need to control for it or make sure you rule out that it is significant - this is science 101.

Having read the paper and the authors' notes, it appears that facial fat differences are unlikely to be the signal being detected.

You only need to control for it if the total number of infections is a significant percentage (probably somewhere around 1-5%, depending on the significance level) of each group's population. Otherwise you'd have to control for anything and everything, and statistical field studies could never support any conclusions (as it is, they're just really hard).
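To put rough numbers on that rule of thumb, here is a minimal sketch. It assumes accuracy is scored by picking the right person out of a random pair (one from each group) and that the classifier's only cue is a single yes/no trait. The function names and prevalence figures are made up for illustration and are not taken from the paper.

```python
def pairwise_accuracy(p_a: float, p_b: float) -> float:
    """Chance of picking the group-A member out of a random (A, B) pair
    when the only cue is a binary trait found in fraction p_a of group A
    and fraction p_b of group B. Ties are settled by a coin flip."""
    a_only = p_a * (1 - p_b)                 # trait points at the A member
    tie = p_a * p_b + (1 - p_a) * (1 - p_b)  # both or neither have the trait
    return a_only + 0.5 * tie                # simplifies to 0.5 + (p_a - p_b) / 2


def prevalence_gap_needed(target_accuracy: float) -> float:
    """Gap in trait prevalence (p_a - p_b) needed to reach a target
    pairwise accuracy if the trait is the only signal."""
    return 2 * (target_accuracy - 0.5)


if __name__ == "__main__":
    # Hypothetical prevalence gaps, with the trait absent in group B.
    for gap in (0.01, 0.05, 0.20):
        print(f"gap {gap:.2f} -> accuracy {pairwise_accuracy(gap, 0.0):.3f}")
    print(f"gap needed for 80%: {prevalence_gap_needed(0.80):.2f}")
```

Under these assumptions, a trait that differs by 5 percentage points between the groups buys only 2.5 points over a coin flip, so a rare confounder on its own cannot explain a large effect.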

This is true, but you should at least make an effort to rule this out. The really hard thing in science is not fooling yourself by overlooking something important.

Didn't vote but they looked at men and women - hence if it found lesbians with some sort of accuracy your HIV treatment hypothesis doesn't work - I'm pretty sure lesbians get HIV at a much lower rate than gay men or drug addicts. It also seems your hypothesis presumes the vast majority of gay men have HIV infections. I didn't bother looking it up, but both those assumptions seem pretty faulty from memory.

Yes, lesbians are at lower risk of being infected with HIV than heterosexual women.

It is not my hypothesis that weight differences are the signal being detected, it is the GP's. I was just pointing out that facial fat may not correlate with weight in the different populations studied.

Haha, let me introduce you to the bear community. TBH, I can't say for sure, but at least in this area, the gay community is very diverse in weight and bears / chubs are very popular. Going on a gay dating app as a chub you get mobbed like a skinny blond woman does on a straight app.

I think you're mixing up the existence of outlier subgroups with the location of the mean in a normal distribution. Looking at the pictures in question, it does seem to me that GP has a point, i.e. BMI could be a simple proxy for what the NN is detecting as sexual preference.

> My guess is that they actually built a fat detector.

Mmm, given that the system was correct 80% of the time and the majority of fat people are not gay, then this can't possibly be true, right?
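Those two facts are not necessarily in conflict, as a quick Bayes calculation suggests. This is a minimal sketch with made-up inputs: the 5% base rate and the 80% hit rates are assumptions chosen for illustration, not figures from the paper, and the function name is hypothetical.

```python
def share_of_flagged_in_minority(base_rate: float,
                                 sensitivity: float,
                                 specificity: float) -> float:
    """Bayes' rule: of everyone the detector flags, what fraction
    actually belongs to the minority group?"""
    true_pos = base_rate * sensitivity               # minority, flagged
    false_pos = (1 - base_rate) * (1 - specificity)  # majority, flagged anyway
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Assumed: 5% of the population is in the minority group, and the
    # detector is right 80% of the time on each group.
    share = share_of_flagged_in_minority(0.05, 0.80, 0.80)
    print(f"share of flagged people in the minority group: {share:.1%}")
```

With these assumed numbers, most flagged people still belong to the majority group, even though the detector is right 80% of the time on a balanced test set. So "most people with the trait are not gay" does not by itself rule out a single-trait detector.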

Consider reading the paper itself and the authors' notes before throwing various objections around - they address many of them.

Paper: https://osf.io/zn79k/

Notes: https://docs.google.com/document/d/11oGZ1Ke3wK9E3BtOFfGfUQuu...

Read it and here's what I would like to know. When they used human gaydar (which they carefully said this was not...) to create the statistics to compare against the AI sexual orientation detector, what was the best human gaydar and the worst human gaydar (were there any humans actually better than the algorithm)? Why didn't they increase the number of faces available to the human gaydar or did they do that and it didn't make a difference? Also, if they are letting the machine learning algorithm learn, shouldn't they have done it in a way that let the humans learn for a comparison?

I loved the authors' notes on the paper and the idiotic arguments they have to deal with. A must read.

"So to call attention to the privacy risks, he decided to show that it was possible to use facial recognition analysis to detect something intimate"

I'm curious if this is backpedaling, or if he did clearly call this out before starting. Not that backpedaling in the face of death threats is that terrible...

It's in the paper, it's literally in the abstract: "our findings expose a threat to the privacy and safety of gay men and women"

Hmm. Makes this quote kind of interesting...

Advocacy groups like Glaad and the Human Rights Campaign denounced the study as “junk science” that “threatens the safety and privacy of LGBTQ and non-LGBTQ people alike.”

It’s really only ‘interesting’ to the extent one finds it hard to believe that people would harm another on the basis of their supposed sexual orientation. Otherwise it just sounds like humans finding more pretexts to harm each other, which sounds like a pretty common occurrence.

LOL. Do these advocacy groups even bother with things like logical consistency before denouncing something?

One could argue that even if the AI doesn't actually work, if people believe it works then it could bring harm (sorta like lie detector tests which are ineffective but useful for getting confessions out of people who don't know any better). Hence junk science that could harm people of any orientation.

Junk science can certainly harm everyone, but if the research is junk then it won’t be able to harm anyone.

If you wanted to railroad someone specific into a confession you wouldn’t use this research, you would just use a lie detector. Potential harm can only arise if this research is not junk.

The real risk is that we sleepwalk into letting governments and large corporations build these systems without any discussion of the consequences.

The lie detector test was an example of junk science that still can be used to produce a desired result. Not meant to imply that people will use a "gaydar" to try and get people to confess to crimes. My mental contrived example was that homophobes in a homophobic community might doctor up such a system to try and convince other people in the community that someone they don't like is gay and should thus be ostracized, but I realize it's a stretch.

Potential harm can certainly arise if the research is junk. People misuse research or believe in junk research all the time. Just look at the anti-vaccines crowd.

I agree on your third point.

Soon to come: visual paternity/maternity tests. Visual pregnancy tests. Lie detector smart phone apps. Bar conversation isolation and transcription.

Blackmailers will have a field day.

I'm more worried about health insurance companies, potential employers, loan officers, and so forth. I'm sure there's lots of ML being produced to "reduce risk".

"Insurance in the limit of perfect information" – that would be the title of my talk if I were ever invited to give one at an insurance company as part of a hiring process or something. I think it's a cute game theoretic problem.

See, if an insurer, in the extreme case, can make a perfect prediction of whether the presumptive customer will get the disease (or whatever the insurance is about), then it would offer insurance to the ones who won't need it and not offer insurance (or offer it at a price that is too high to be useful) to the ones who do need it. The customer who is offered insurance will thereby learn that he doesn't need it, and will decline the offer!

Is this the end of insurance? Insurance is of value to both sides. Is it really reasonable that we can't extract this value if one side can predict the future?

I think the rational solution might be for the insurance company to introduce some randomness when making their offers, so that the customer can't for sure know, by looking at the offer, whether he will be sick or not. I haven't calculated more precisely what the optimal strategy of the insurer would be. It would presumably depend on the utility functions of both parties.
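The randomization idea can be sketched as a Bayes update on the customer's side. The setup is hypothetical: the insurer knows the outcome perfectly, always makes an offer to customers who will stay healthy, and makes an offer to customers who will get sick with some probability `q_offer_if_sick`. The 10% base rate and the function name are assumptions for illustration.

```python
def p_sick_given_offer(base_rate: float,
                       q_offer_if_sick: float,
                       q_offer_if_healthy: float = 1.0) -> float:
    """Customer's posterior probability of getting sick after
    receiving an offer, given the insurer's offer policy."""
    sick_and_offer = base_rate * q_offer_if_sick
    healthy_and_offer = (1 - base_rate) * q_offer_if_healthy
    return sick_and_offer / (sick_and_offer + healthy_and_offer)


if __name__ == "__main__":
    base_rate = 0.10  # assumed share of customers who will get sick
    for q in (0.0, 0.25, 0.5, 1.0):
        posterior = p_sick_given_offer(base_rate, q)
        print(f"q = {q:.2f} -> P(sick | offer) = {posterior:.3f}")
```

With `q_offer_if_sick = 0` the offer is fully revealing (the posterior drops to zero, so the customer declines). Any positive value leaves the customer uncertain, and at `q_offer_if_sick = 1` the offer carries no information at all. Where the insurer's optimum lies between those extremes would depend on the utility functions, which this sketch does not model.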

If it's perfectly predictable, but takes effort, then insurance companies would be replaced by (or pivot to) companies that sell the service of telling you your predictions. (If the prediction takes no effort—open source software, very little computation—then this cost will approach zero, and the more tech-savvy would do it themselves. Maybe OS vendors would bundle it as a convenience.) Then each person knows in advance what medical costs they'll have to pay—or choose to not bother with—and can plan accordingly.

Realistically, I think that having perfect information about one person and about biology and stuff still doesn't give you enough information to predict whether they'll get various diseases. If there is any validity at all to "exposure to X causes cancer", then no amount of information about one person will help you predict whether they'll get exposed to X. Or whether they'll trip, fall, and break a bone. And, really, I expect that the perfect state of knowledge of human biology would merely allow you to deduce "person X has x% chance of developing disease D during the next ten years; person Y has y% chance". So there would still be a role for insurance companies for the risk-averse.
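The point about the risk-averse can be illustrated with a standard expected-utility calculation. This is a minimal sketch assuming logarithmic utility of wealth, which is one common textbook model of risk aversion and not something stated in the thread. The wealth, loss, and probability figures are invented.

```python
import math


def max_premium(wealth: float, loss: float, p_loss: float) -> float:
    """Largest premium a log-utility customer would pay for full coverage.
    It equals wealth minus the certainty equivalent of going uninsured."""
    expected_log = (p_loss * math.log(wealth - loss)
                    + (1 - p_loss) * math.log(wealth))
    certainty_equivalent = math.exp(expected_log)
    return wealth - certainty_equivalent


if __name__ == "__main__":
    wealth, loss, p_loss = 100.0, 50.0, 0.10  # hypothetical numbers
    expected_loss = p_loss * loss
    print(f"expected loss: {expected_loss:.2f}")
    print(f"max premium:   {max_premium(wealth, loss, p_loss):.2f}")
```

Even when both sides agree on the exact probability, the customer in this sketch would pay more than the expected loss, and that gap is the value an insurer can still capture.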

You can create pools as is the case with group health plans in the US today but then you have the issue of adverse selection if people can opt out.

I think there is already a lot of regulation in insurance and banking industries which stops them acting on statistical signals (relating to sub-populations) which would save them a vast amount of money.

Gaydar is a gaida performer, and gaida means bagpipe in Bulgarian (other languages?)

My family name on mother's side was Gaydarovi (-ovi)

Just a fun fact!

Or a prime minister of Russia https://en.wikipedia.org/wiki/Yegor_Gaidar . However, it's pronounced [GUY-DAHR] in Russian. Is it different in Bulgarian? There are few people actually named Gaydar though:



And here was I hoping they'd created Gouda.

> “Deep neural networks are more accurate than humans at detecting sexual orientation from facial images”

I don't think the comparison to humans is the interesting thing here. The thing of interest is what can be said at all, by human or by computer, by looking at a picture.

This is a fascinating article. It's amazing what kind of data you can get from something as simple as a photograph.

I don't really understand the outrage, however. Some in the article claim that it's "racism" and hurts Dr. Kosinski's career, but the fact that he made a program that predicts who is gay and who is straight with odds much better than a coin flip is crazy! If they find out what the differences are between heterosexual and homosexual pictures, they may discover something that was never noticed before, whether it's a physiological trait or a difference in the photo taken due to behavioral differences.

We really are living in the future.

edit: To those downvoting me, could you please explain why? I don't think I said anything particularly controversial.

I suspect you're getting downvoted because people see a reason to be outraged at this and you've said you don't understand the outrage.

I'd like to ask someone to articulate the reasons for this outrage. I don't really understand, wasn't this created to show the dangers of facial recognition systems, didn't they have to pick something edgy to get that point across?

Firstly, the researchers seem to have a cracker jack box understanding of the fields they're claiming to upend, what with claiming PHT is widely accepted. Secondly, it's not clear to me that they controlled for enough variables and only doing it on white people is strange. It could really be that gay people use better photos, who knows.

They're claiming to be able to detect gay people but it seems like the system has to be coaxed very carefully into doing that and seems very limited. They built a purposefully inflammatory tech demo that seems to do more to betray their distance from the topic they're studying than it does bolster their reputation. At least from my perspective.

> Secondly, it's not clear to me that they controlled for enough variables and only doing it on white people is strange. It could really be that gay people use better photos, who knows.

Are these two sentences meant to be related to each other, or am I misreading your comment?

No that's a bit confusing. They were meant to be totally unrelated.

So if I understand correctly the problem is not in what they attempted to do, it's that they didn't do it properly. Because they did this incorrectly the results they claim are not conclusive and given the inflammatory nature of the study they should be more careful with their claims?

Basically yes. Also if the point was to show that "AI" or whatever can be used for nefarious ends I think there are better ways of doing it. I have a feeling they were looking for better results than they got.

> ... only doing it on white people is strange.

According to the article, they found few profiles of non-white homosexuals.

I know what they said the reason was, it doesn't make it less strange.

It's not at all strange, for the researchers. If there's too little data, there's nothing to be done. But maybe it's strange that there are so few profiles by non-white homosexuals. It's a cultural thing, I suspect.

Well, if you don't have enough data for the experiment you want to make, then you don't make the experiment.

Not make the experiment and then note that "btw, don't call us out on our poor data, it's not our fault, we tried our best".

Especially so if you're perfectly aware of how controversial your study may be considered.

Racism is an odd choice of words, since ML would probably determine race at a higher level of accuracy than this experiment.

The word "racism" comes directly from the Clare Garvie quote in the article: “At its very worst, this is racism by algorithm.”

She should have used "bigotry" or "discrimination" instead.

Would it? Race is so arbitrary and nonsensical that it’s tough to rationally predict. You could train it to determine things like skin color and other gross morphological features, but only the ones the trainer chooses to represent given races.

I think most posts receive very few votes in any direction. You may be merely looking at a few downvoters, which I feel is a different population than HN at large, since many people can’t even downvote.

I haven't downvoted you, but I'm a bit troubled by your unbridled enthusiasm for how amazing previously futuristic technology is without any consideration of the ways in which it can be abused - as historically such knowledge about others has been abused. I invite you to consider the experience of people whose appearance is distinctive enough that complete strangers feel emboldened to just casually attack them in the street without any prior provocation.

Now imagine the ramifications of having an app that can reliably tag people as gay or suchlike. How do you think that's likely to get used in the real world?

That chair looks extremely uncomfortable. As a former grad student, I would hope that Stanford provides enough resources to ensure its students and staff members don't get RSI (Repetitive Strain Injury).

It also looks much better than a typical office chair.

Of course dating photos are going to give away dating data. The aggregate gay photo doesn't have much facial hair. The most likely reason is that gay men shave for dating photos much more often than straight men do. This isn't groundbreaking or racist. It's just a trivial finding. All they did was build a facial hair detector.
