Sex Machine: Get gender from first name in Ruby (github.com)
45 points by _pius 372 days ago | comments


te_platt 372 days ago | link

As a man named Tracy I would just throw out a caution to anyone thinking of using this. At this point in my life I don't really care about people calling me Ms. Tracy Platt. But what it does do is immediately signal that I don't really need to pay attention. Fake familiarity doesn't usually work too well.

As a side note I would like to mention that having a girl's name does have some benefits. For example, my wife can handle bank account issues over the phone for me. Also some near misses - I was assigned a locker in the girls' locker room in junior high but they caught on before I could use it, and once a telemarketer called to offer me a spot in an all-women's resort spa. The only time I said yes to a telemarketer, and I got rejected.

-----

nostromo 372 days ago | link

Me and my significant other frequently impersonate each other when dealing with banks and hospitals over the phone (we're a gay couple). It honestly never occurred to me that straight folks can't get away with this. How unfortunate; it's quite a time saver!

-----

chadillac83 372 days ago | link

We can, it's just a little trickier... for example, while looking up plane ticket information for an S.O., I'm usually her "personal assistant".

-----

binarysoul 372 days ago | link

My experience is that the person on the phone usually doesn't care if you have all the relevant info.

-----

mst 372 days ago | link

My full name is Matthew Stephen Trout.

This means I can filter a substantial percentage of junk mail simply by throwing away anything addressed to 'Ms. Trout'.

... or Grout, Sprout, Trent or any of the other hilarious misspellings people also seem to manage.

Less OT: Using such a thing to pre-select the probably-right option on a gender picker might actually be useful without the false selections causing the "fake familiarity" problem.

-----

MartinCron 372 days ago | link

I just checked the data file behind this project and Tracy shows up as both a male and female name. If I were to use this, I would only ever let the analysis "leak" to the end-user if it were extremely confident that it's a strong match.

-----

rabidonrails 372 days ago | link

Reading the last two sentences made me smile. Of course, the following must be mentioned: http://www.nbc.com/saturday-night-live/video/its-pat/n10133/

-----

ig1 372 days ago | link

Without knowing what data it's been trained on, it's of questionable use. What may be an 18-year-old American girl's name may be a 60-year-old German man's name.

The gender of a name can vary heavily by culture and time period, so it would make much more sense for the API to return the data in the form of ranked probabilities.
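A rough sketch of what ranked probabilities per country could look like in Ruby. The counts table, the country keys, and the `ranked_gender_probabilities` helper are all invented for illustration; none of this is the library's actual data or API.

```ruby
# Hypothetical per-country counts of how often a first name was recorded
# as female or male. The numbers are made up for illustration only.
NAME_COUNTS = {
  "usa"     => { "tracy"  => { female: 80, male: 20 } },
  "germany" => { "andrea" => { female: 95, male: 5 } },
  "italy"   => { "andrea" => { female: 5, male: 95 } }
}.freeze

# Returns [[gender, probability], ...] sorted from most to least likely,
# or an empty array when the name is not in the table for that country.
def ranked_gender_probabilities(name, country)
  counts = NAME_COUNTS.fetch(country, {})[name.downcase]
  return [] if counts.nil?

  total = counts.values.sum.to_f
  counts.map { |gender, count| [gender, count / total] }
        .sort_by { |_, probability| -probability }
end

p ranked_gender_probabilities("Andrea", "italy")
p ranked_gender_probabilities("Andrea", "germany")
p ranked_gender_probabilities("Tracy", "usa")
```

Returning the whole ranking leaves it to the caller to decide how much confidence is enough before acting on a guess.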

As an aside it's worth noting that as this library is GPL3 it means you can't use this code in any non-GPL product.

-----

mbell 372 days ago | link

> Without knowing what data it's been trained on it's of questionable use.

Literally the very first line of the README has a link to the source data. It contains name frequency data for a number of countries and the README clearly indicates you can provide a country of origin when doing lookups.

> As an aside it's worth noting that as this library is GPL3 it means you can't use this code in any non-GPL product.

The source data is GPL, there wasn't much option.

-----

ig1 372 days ago | link

Can you get probabilities out?

-----

tingletech 372 days ago | link

The data is GPL documentation license; why/how would that affect the code?

-----

andyroid 372 days ago | link

> As an aside it's worth noting that as this library is GPL3 it means you can't use this code in any non-GPL product.

Of course you can. Usage != distribution.

-----

ig1 372 days ago | link

Agreed, but putting GPL code into your internal code is incredibly risky because of the vagaries of what counts as distribution under the GPL (for example if you distribute your code to a third-party security auditing company).

-----

jghrng 372 days ago | link

> As an aside it's worth noting that as this library is GPL3 it means you can't use this code in any non-GPL product.

Well you can always simply ask the author -- if he has got the copyright, he can grant you permission.

-----

api 372 days ago | link

It's probably useful for aggregate statistics. The Jordans and Teagans and Pats are outliers.

-----

MartinCron 372 days ago | link

As are the Leslies, Shannons, Ashleys, and Tracys.

That makes me think, if you had firstname + birthdate, you could probably be a tiny bit more accurate.

-----

nickzoic 372 days ago | link

first name, birthdate, country/region code to allow for regional variations.

Would make a nice exercise in fuzzy decision processes, but I suspect it isn't a great idea: you'd be better off leaving that field as "unknown" by default and writing "Dear Sir/Madam" if it is unknown.
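A minimal sketch of the "unknown by default" approach in Ruby. The `salutation` helper and its symbols are hypothetical names chosen for this example.

```ruby
# Builds a letter greeting. `gender` is expected to be :female, :male,
# or nil. nil means "unknown" and is the default, so nothing is guessed.
def salutation(last_name, gender = nil)
  case gender
  when :female then "Dear Ms. #{last_name}"
  when :male   then "Dear Mr. #{last_name}"
  else              "Dear Sir/Madam"
  end
end

puts salutation("Platt")         # falls back to the neutral greeting
puts salutation("Platt", :male)  # uses the title the user supplied
```

Any value other than the two known symbols falls through to the neutral greeting, so a bad or missing value never produces a wrong title.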

-----

chriseppstein 372 days ago | link

Gender identity is actually quite a tricky topic and should be approached carefully. I would discourage anyone from trying to use this library; the real world doesn't fit neatly in your multiple-choice view of gender. For more information, please watch this great talk: http://vimeo.com/61172068.

-----

chrislloyd 372 days ago | link

I checked how Facebook handles gender the other day and it's still either "Male" or "Female". Strange, because they return gender as a string in API calls.

-----

galvanist 372 days ago | link

I can't imagine making a dating site these days: "I'm (name text field), a (sexual identity text field) interested in meeting (sexual identity text field) for the purpose of (dating/friendship/sex/swinging/whatever text field)." Good luck with that business logic. Maybe we do need intelligent agents for this stuff.

-----

pyrocat 372 days ago | link

Came here to say this, thank you.

-----

lkrubner 372 days ago | link

I know that most of us already know this, but it is worth repeating: sex is biological, gender is cultural (for instance, "la montaña" - the mountain is feminine in the Spanish language). A "sex machine" would tell you whether you were dealing with a biological male or biological female, or something else, but it would not tell you the gender.

-----

obviouslygreen 372 days ago | link

Of course you're technically right, but the pedantry is out of control. In reality, when speaking English in a professional (and pretty much any other) setting, "sex" means "gender."

I don't think it's worth repeating. I'd say it's repeating this kind of misplaced vernacular revisionism that's making us (in which I controversially refer to the disparate collection of users on HN as a single group) look even more anal and oversensitive than we actually are.

-----

drakeandrews 372 days ago | link

A non-zero quantity of people have a gender that is separate and different from their sex. By marking out the differentiation of sex and gender as "vernacular revisionism", you are contributing to making the lives of a non-zero quantity of people worse than they need to be. Erasure sucks, please don't perpetuate it.

-----

jordibunster 372 days ago | link

When you say things like "updated his profile" in your app, you're talking about individuals, oftentimes to them. Those people have feelings, regardless of how "professional" the context is.

So don't assume anything unless you want to exclude people who don't fit in that binary box. It's not about being pedantic, it's about being considerate.

-----

regis 372 days ago | link

Frankly, it is none of your business what the sex or gender of a user is. I understand that sometimes there is money to be made by collecting this information but it is also alienating and just plain irrelevant (and I think there is also money to be made in recognizing that people can be fluid.)

You can give your users an option to provide you with these details but guessing/requiring is not a good practice.

On a side note, it's interesting that the most common gender neutral title is Dr.

-----

icambron 372 days ago | link

I think you're looking at this through too narrow a use case. I agree you shouldn't be taking individual people and guessing what their genders are. And you should minimize the instances where gender is even relevant in your application.

But what if you have a whole bunch of data and want to do some aggregate statistics? "Do women use our product?" is a perfectly reasonable question to ask yourself. You don't need it to be exact, and it's certainly not reasonable to ask every user. So you use some heuristics and you get some useful data.
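As a sketch of that kind of heuristic in Ruby: average a per-name probability instead of classifying each person. The `P_FEMALE` table, its values, and the `estimated_female_share` helper are invented for illustration and do not come from the library.

```ruby
# Hypothetical per-name probability that the holder is female. A real
# table would come from name-frequency data; these values are invented.
P_FEMALE = {
  "mary" => 0.99, "john" => 0.01, "tracy" => 0.80, "leslie" => 0.85
}.freeze

# Estimates the share of women among `names` by averaging the per-name
# probabilities. Names missing from the table are skipped, not guessed.
# Returns nil when no name could be scored.
def estimated_female_share(names)
  known = names.map { |name| P_FEMALE[name.downcase] }.compact
  return nil if known.empty?

  known.sum / known.size
end

attendees = %w[Mary John Tracy Leslie Xyzzy]
puts estimated_female_share(attendees)
```

Ambiguous names contribute fractionally to the total, so no individual is ever labeled; only the aggregate is reported.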

-----

regis 372 days ago | link

I agree; I'm mainly concerned about cases where gender follows you around a website when you're logged in.

"Good Morning Mr. _" and stuff like that should be avoided unless supplied by the user. For your own stats, using this library is probably better than asking users for that information.

-----

tttp 372 days ago | link

In quite a few languages, you need to know the gender to write personalised emails, e.g. Cher Joe vs. Chère Jane.

-----

ldh 372 days ago | link

That's a pretty poor definition of "personalised". When people get canned emails which guess their gender incorrectly it just seems a bit sleazy.

"Hi, you don't know me but I want your money."

-----

heelhook 372 days ago | link

Shouldn't this be called "Gender Machine"? They get it right in the project description but not in the project name. Weird.

-----

chimeracoder 372 days ago | link

On the one hand, gender corresponds to identification and behavior, which this predicts (more so than biology). On the other hand, this produces binary output[0], and sex is more generally accepted to be binary than gender.

So, I think 'sex machine' is appropriate.

[0] Okay, three options, but 'andy' really corresponds to 'unknown', not 'person of androgynous gender'.

-----

jules 372 days ago | link

Sex isn't as binary as you'd think. There are people with ambiguous genitals, so that doesn't work. There are people with genitals that don't match the sex of the rest of their body. Genes then? There are people that have male genes and female bodies, and vice versa. There are people with both male and female genes (XXY).

-----

MartinCron 372 days ago | link

It's probably a cultural thing. "Gender Machine" doesn't invoke the great James Brown song.

-----

advisedwang 372 days ago | link

I had a friend who insisted "gender" should be used for linguistic use (for example cheese being masculine in French) and "sex" for the boy/girl-ness of a person.

Determining the boy/girl-ness of a chicken is called "sexing" the chick. I see "sex machine" as a machine that sexes by name.

-----

Evbn 372 days ago | link

This product is used for linguistic purposes, not for deciding whose head to cut off.

-----

noblethrasher 372 days ago | link

'Male', 'female', and 'androgynous' are sex designations while 'masculine', 'feminine', and 'neuter' are terms of gender.

The code is using the gender of a reference to determine the sex of the referent.

So you could say that they're both right.

-----

bcoates 372 days ago | link

In other news, searching for "Gender Machine" on Google gives me NSFW ads for sex machines.

-----

mark-r 372 days ago | link

"Sex Machine" is a whole lot sexier. It gave me a chuckle.

-----

whackedspinach 372 days ago | link

This is exactly what I am looking for! I run a tech conference and am interested in seeing what percentage of our attendees are male or female. I only have names for historical data, so this should help give a somewhat close approximation of sex!

-----

obviouslygreen 372 days ago | link

Haha, wow. It's like someone said, "Hey, what are people reacting incredibly poorly to right now?", then took the answer, and built an almost-useful library with a funny but obviously-destined-to-offend-lots-of-people name.

Really though, geek PC hilarity aside... with so many collisions and uncertainties, this just isn't a practical approach.

-----

_pius 372 days ago | link

I submitted this, but I'm not the author of the library.

I agree with many of the concerns and limitations brought up here, most notably the fluid, non-binary nature of gender. That said, thoughtful application of probabilistic guesses about gender can add value in certain situations. For instance: http://source.mozillaopennews.org/en-US/learning/freeing-plu...

-----

dclowd9901 372 days ago | link

We tried to solve a similar problem with our app, too. We were trying to generate questions based on person and occasion (think: "What should I get my boyfriend for his birthday?").

It got interesting when occasions didn't warrant possessives ("What should I get my boyfriend for Christmas"), and when language factors were considered ("What should I get mi abuelo for su cumpleaños"). We decided to try and crowdsource it, which worked ok: essentially we left the occasion empty, and if the person wanted to attach the gender-based possessive to it, they could. Otherwise, we would guess with what information we had. We figured, over time, we could actually create a service where we could sell that information (GaaS: Grammar as a service?).

Turns out, people just wanted to be able to write their own titles, and we quickly trashed the idea in the early phases.

-----

dgreensp 372 days ago | link

How is this useful?

The answer is convention over configuration. See, if we institute a societal convention that your gender is derived from your first name (automatically by a Ruby program), it will save a lot of time and energy and make the world more DRY.

-----

ftay 372 days ago | link

Let's institute a convention to refer to everyone by a unique identifier. Oh wait, that's prison.

-----

hugi 372 days ago | link

>> d.get_gender("Álfrún")

Out of curiosity, why did you choose that name?

-----

spitfire 372 days ago | link

Funny. I had plans to build one of these in a month or so's time. Now I can crib the answers. Cool.

-----

coherentpony 372 days ago | link

Any reason you called it "Sex Machine"?

-----

adambard 372 days ago | link

https://www.youtube.com/watch?v=5qjHLsctLL8

-----

eremzeit 372 days ago | link

Ohhh the controversy that would be generated if this gem ever got big.

http://www.confreaks.com/videos/1120-gogaruco2012-schemas-fo...

-----

nsxwolf 372 days ago | link

This gem is too clever by half.

-----

mikeruby 372 days ago | link

I can see this being useful if perhaps you're attempting to target with maybe some email marketing... as long as you're content with it being 70-80% accurate.

-----

Evbn 372 days ago | link

Looking forward to the upcoming submissions

1. Rename "Sex Machine"
2. Why Women Don't Like Rubyists
3. Don't Publicly Shame the Person Who Suggested the Name Change
4. Take Your 'Sex Machine' and Shove It

-----

nnnnni 372 days ago | link

Iiiiit's Pat!

-----



