

Ask HN: Best way to guess gender based on a person's name - amarcus

Hey guys,

I need to write a script that will try to guess a person's gender based on a given name.

What do you think the best way to do this is? I am thinking the best way is to look up the name in a large name/gender database (if there is one).

But it seems to me that some maths and syntax should also do the job...

What do you guys think?
======
Scott_MacGregor
1\. Start with a database of popular names, like might be found in a baby name
book.

2\. Ask the user to input the name to be guessed.

3\. Look up the name in the database. If the name supplied by the user is not
in the database, add the name to the database.

4\. After the name lookup or insertion in the database, perform a guess as to
the sex of the name based on the data from the baby name book and weighted by
previous user corrections for that name.

5\. Print the guess to the screen (male or female) for the user to verify.

6\. Ask the user to input if the guess is correct or incorrect.

7\. Store the user's input on whether the guess was correct, linked to that
particular name, to improve the future accuracy of the guess on that
particular name.
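
The steps above could be sketched roughly like this in Python. This is only a
minimal sketch under assumptions: `SEED_COUNTS` is made-up data standing in
for a baby name book, `NameGenderGuesser` and `feedback_weight` are
hypothetical names, and ties (including names with no data yet) fall back to
an arbitrary default guess that the user can then correct.

```python
# Hypothetical seed data standing in for a baby name book:
# name -> (male_count, female_count). The numbers are invented.
SEED_COUNTS = {
    "james": (100, 0),
    "mary": (0, 100),
    "pat": (50, 50),
}


class NameGenderGuesser:
    def __init__(self, seed, feedback_weight=10, default_guess="female"):
        # Copy the seed counts into mutable [male, female] tallies.
        self.counts = {name: list(pair) for name, pair in seed.items()}
        # How much one user correction counts (an assumed tuning knob).
        self.feedback_weight = feedback_weight
        # Arbitrary guess used when the tallies are tied or empty.
        self.default_guess = default_guess

    def _tally(self, name):
        # Step 3: look the name up, adding it with no evidence if it is missing.
        return self.counts.setdefault(name.strip().lower(), [0, 0])

    def guess(self, name):
        # Step 4: guess from the seed data plus any earlier feedback.
        male, female = self._tally(name)
        if male == female:
            return self.default_guess
        return "male" if male > female else "female"

    def record_feedback(self, name, guessed, was_correct):
        # Steps 6-7: turn "correct / incorrect" into evidence for one side.
        if was_correct:
            actual = guessed
        else:
            actual = "female" if guessed == "male" else "male"
        index = 0 if actual == "male" else 1
        self._tally(name)[index] += self.feedback_weight


if __name__ == "__main__":
    guesser = NameGenderGuesser(SEED_COUNTS)
    name = input("Name: ")
    guessed = guesser.guess(name)
    print(f"Guess: {guessed}")
    answer = input("Is that correct? [y/n] ").strip().lower()
    guesser.record_feedback(name, guessed, answer == "y")
```

A real version would persist the tallies somewhere (a database table, say)
instead of keeping them in memory.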

~~~
amarcus
That's exactly how I've got it now... it looks like it might be the best
solution... Thanks Scott :)

------
quant18
It's hard to give an answer without knowing more about your input data and
what you intend to do with the output.

On general principle, "math plus syntax" seems like a highly error-prone
approach, especially if you need to process anything that's not a standard
English name, or if you have lots of users named Pat. E.g. diminutives of
Russian _masculine_ names often end in "a". I'm sure there are names which are
masculine in some countries and feminine in others. Etc.
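
To make the failure mode concrete, here is a deliberately naive suffix rule in
Python. The function `guess_by_suffix` is a hypothetical illustration, not
something anyone in this thread proposed; the test names are common Russian
masculine diminutives.

```python
def guess_by_suffix(name):
    """Naive rule: a name ending in 'a' is guessed female, anything else male."""
    return "female" if name.strip().lower().endswith("a") else "male"


if __name__ == "__main__":
    # Misha (Mikhail), Vanya (Ivan) and Dima (Dmitri) are masculine,
    # but the rule labels all three "female".
    for name in ["Misha", "Vanya", "Dima"]:
        print(name, guess_by_suffix(name))
    # "Pat" comes back "male" even though the name is ambiguous.
    print("Pat", guess_by_suffix("Pat"))
```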

------
apu
<http://amp.ece.cmu.edu/people/Andy/projectpage_names.html>

"Estimating Age, Gender, and Identity using First Name Priors"

Although the paper looks at this in the context of computer vision -- matching
faces in an image with names in the caption -- it should provide some
information (and references) on your problem.

~~~
amarcus
Thanks. Will have a read.

------
whatusername
Mechanical Turk job.

Perform a Google search of the name, bring up all the pics that return and get
the Turker to identify if the photos are male or female.

~~~
Flenser
Google image search is a good start, but why pay humans? Surely it wouldn't be
hard to write a pattern matcher that could differentiate between male and
female photos.

Google image search could also allow for breaking down the results by
geolocation.

