As a side note, I would like to mention that having a girl's name does have some benefits. For example, my wife can handle bank account issues over the phone for me. There were also some near misses: I was assigned a locker in the girls' locker room in junior high, but they caught on before I could use it, and once a telemarketer called to offer me a spot in an all-women's resort spa. It was the only time I said yes to a telemarketer, and I got rejected.
This means I can filter a substantial percentage of junk mail simply by throwing away anything addressed to 'Ms. Trout'.
... or Grout, Sprout, Trent or any of the other hilarious misspellings people also seem to manage.
Less OT: Using such a thing to pre-select the probably-right option on a gender picker might actually be useful, without the false selections causing the "fake familiarity" problem.
The gender of a name can vary heavily by culture and time period, so it would make much more sense for the API to return the data in the form of ranked probabilities.
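A rough sketch of what a ranked-probability result could look like, in Python rather than Ruby. Everything here is an assumption for illustration: the `NAME_COUNTS` table, its invented numbers, and the `ranked_genders` function are hypothetical and are not the library's API or data.

```python
from collections import Counter

# Hypothetical counts keyed by (name, country). The numbers are invented
# for illustration and do not come from the library's source data.
NAME_COUNTS = {
    ("andrea", "it"): Counter(male=950, female=50),
    ("andrea", "us"): Counter(male=30, female=970),
}


def ranked_genders(name, country):
    """Return [(gender, probability), ...] sorted from most to least likely."""
    counts = NAME_COUNTS.get((name.lower(), country.lower()))
    if not counts:
        return []  # unseen name/country pair: make no guess at all
    total = sum(counts.values())
    # most_common() orders the entries by descending count.
    return [(gender, n / total) for gender, n in counts.most_common()]


if __name__ == "__main__":
    print(ranked_genders("Andrea", "IT"))
    print(ranked_genders("Andrea", "US"))
```

The caller then decides what to do with a 0.55 versus a 0.95, instead of the lookup hiding that difference behind a single label.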
As an aside it's worth noting that as this library is GPL3 it means you can't use this code in any non-GPL product.
Literally the very first line of the README has a link to the source data. It contains name frequency data for a number of countries and the README clearly indicates you can provide a country of origin when doing lookups.
> As an aside it's worth noting that as this library is GPL3 it means you can't use this code in any non-GPL product.
The source data is GPL, there wasn't much option.
Of course you can. Usage != distribution.
That makes me think, if you had firstname + birthdate, you could probably be a tiny bit more accurate.
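To make that concrete, here is a minimal sketch of narrowing the guess by birth decade. The `COUNTS_BY_DECADE` table and the `p_female` helper are hypothetical, and the counts are invented purely to show a name drifting over time; real data would have to come from somewhere like census or birth records.

```python
# Hypothetical per-decade counts for one name: decade -> (male, female).
# Invented numbers, not real statistics.
COUNTS_BY_DECADE = {
    "leslie": {1900: (900, 100), 1950: (400, 600), 2000: (50, 950)},
}


def p_female(name, birth_year=None):
    """Estimate P(female | name), narrowed to the birth decade when known."""
    decades = COUNTS_BY_DECADE.get(name.lower())
    if decades is None:
        return None  # unknown name: no estimate
    if birth_year is not None:
        # Use the closest decade we have data for.
        nearest = min(decades, key=lambda d: abs(d - birth_year))
        male, female = decades[nearest]
    else:
        # No birth date supplied: pool all decades together.
        male = sum(m for m, _ in decades.values())
        female = sum(f for _, f in decades.values())
    return female / (male + female)


if __name__ == "__main__":
    print(p_female("Leslie"))        # pooled over all decades
    print(p_female("Leslie", 1905))  # early-century estimate
    print(p_female("Leslie", 1998))  # late-century estimate
```

With these made-up counts the pooled estimate is close to a coin flip, while either birth year pushes it strongly one way, which is the "tiny bit more accurate" effect in miniature.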
Would make a nice exercise in fuzzy decision processes, but I suspect it isn't a great idea: you'd be better off leaving that field as "unknown" by default and writing "Dear Sir/Madam" if it is unknown.
Well you can always simply ask the author -- if he has got the copyright, he can grant you permission.
You can give your users an option to provide you with these details but guessing/requiring is not a good practice.
On a side note, it's interesting that the most common gender-neutral title is Dr.
But what if you have a whole bunch of data and want to do some aggregate statistics? "Do women use our product?" is a perfectly reasonable question to ask yourself. You don't need it to be exact, and it's certainly not reasonable to ask every user. So you use some heuristics and you get some useful data.
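A minimal sketch of that kind of aggregate estimate, assuming you already have some per-name probability table (the `p_female_by_name` argument and its values below are hypothetical). Averaging the probabilities gives an expected share without ever labelling an individual user.

```python
def estimated_share_of_women(first_names, p_female_by_name):
    """Average P(female) over the names we have data for; skip the rest."""
    probs = [
        p_female_by_name[name.lower()]
        for name in first_names
        if name.lower() in p_female_by_name
    ]
    if not probs:
        return None  # nothing to base an estimate on
    return sum(probs) / len(probs)


if __name__ == "__main__":
    # Invented probabilities, for illustration only.
    table = {"anna": 0.98, "bob": 0.02, "sam": 0.50}
    users = ["Anna", "Bob", "Sam", "Xq"]  # "Xq" is not in the table and is skipped
    print(estimated_share_of_women(users, table))
```

Individual guesses can be wrong in both directions; the bet is that the errors roughly cancel across a large user base, which only holds if the table is not systematically biased for your population.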
"Good Morning Mr. _" and stuff like that should be avoided unless supplied by the user. For your own stats using this library is probably better than asking users for that information.
"Hi, you don't know me but I want your money."
So, I think 'sex machine' is appropriate.
Okay, three options, but 'andy' really corresponds to 'unknown', not 'person of androgynous gender'.
Determining the boy/girl-ness of a chicken is called "sexing" the chick. I see "sex machine" as a machine that sexes by name.
The code is using the gender of a reference to determine the sex of the referent.
So you could say that they're both right.
I don't think it's worth repeating. I'd say it's repeating this kind of misplaced vernacular revisionism that's making us (in which I controversially refer to the disparate collection of users on HN as a single group) look even more anal and oversensitive than we actually are.
So don't assume anything unless you want to exclude people who don't fit in that binary box. It's not about being pedantic, it's about being considerate.
Really though, geek PC hilarity aside... with so many collisions and uncertainties, this just isn't a practical approach.
It got interesting when occasions didn't warrant possessives ("What should I get my boyfriend for Christmas"), and when language factors were considered ("What should I get mi abuelo for su cumpleaños"). We decided to try to crowdsource it, which worked OK: essentially we left the occasion empty, and if the person wanted to attach the gender-based possessive to it, they could. Otherwise, we would guess with what information we had. We figured that, over time, we could actually create a service where we could sell that information (GaaS: Grammar as a Service?).
Turns out, people just wanted to be able to write their own titles, and we quickly trashed the idea in the early phases.
I agree with many of the concerns and limitations brought up here, most notably the fluid, non-binary nature of gender. That said, thoughtful application of probabilistic guesses about gender can add value in certain situations. For instance: http://source.mozillaopennews.org/en-US/learning/freeing-plu...
The answer is convention over configuration. See, if we institute a societal convention that your gender is derived from your first name (automatically by a Ruby program), it will save a lot of time and energy and make the world more DRY.
Out of curiosity, why did you choose that name?
1. Rename "Sex Machine"
2. Why Women Don't Like Rubyists
3. Don't Publicly Shame the Person Who Suggested the Name Change
4. Take Your 'Sex Machine' and Shove It