
How to Tell Someone’s Age When All You Know Is Her Name - ca98am79
http://fivethirtyeight.com/features/how-to-tell-someones-age-when-all-you-know-is-her-name/
======
te_platt
Not only can you get a good idea of age from a name you can generate names
that match age and sex. I have a niece who recently did a science fair project
where she used Markov chains seeded with U.S. census data over the last
hundred years to create new names. With about 90% accuracy people could tell
if a fake name was from 100, 50, or <10 years ago and the sex.
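
For anyone curious how this might work, here is a minimal sketch of a character-level Markov chain name generator. It is not the niece's actual code: the function names, the order-2 context, and the tiny training list are all assumptions made for illustration. A real run would train one model per decade and sex on the full name lists.

```python
import random
from collections import defaultdict

def build_model(names, order=2):
    """Map each `order`-character context to the characters seen after it."""
    model = defaultdict(list)
    for name in names:
        # "^" pads the start of the name, "$" marks the end.
        padded = "^" * order + name.lower() + "$"
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model

def generate(model, order=2, max_len=12, rng=random):
    """Walk the chain from the start context until the end marker appears."""
    context, out = "^" * order, []
    while len(out) < max_len:
        nxt = rng.choice(model[context])
        if nxt == "$":
            break
        out.append(nxt)
        context = (context + nxt)[-order:]
    return "".join(out).capitalize()

# Tiny made-up training list (hypothetical); real input would be one decade
# of name data, ideally repeating each name in proportion to its frequency.
girls_1910s = ["mary", "helen", "dorothy", "margaret", "ruth", "mildred", "gertrude"]
model = build_model(girls_1910s)
print([generate(model, rng=random.Random(seed)) for seed in range(5)])
```

Because each context keeps duplicates in its list, common letter transitions are picked more often, which is what would give generated names the flavor of their era.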

An interesting side note was that she put in a simple profanity filter but in
all of her trial runs it never picked up any "fuq" or variant names.

Edit: Here are sample boy names: Shill Flay Roshard Per Coll Milius Madfrego
Derry Fer Fordy Carlel Marler Rommyronance Jord Felwooke Rott Luper Bent Zekin
Othen Nolanterry Jerarton

Here are some girl names: Esalessie Rine Nolenn Alynna Myrtinet Faybeciline
Aline Orassabenda Phina Dorgia Lideleaste Beara Sonilinn Judelia Monangeora
Jarnina Geleene Emozellyn Maudra Verta Lortis Fret Kathoph

~~~
andrey-p
Any chance she could open source that? My friend's writing a fantasy novel and
I think he could really use a realistic-sounding fake name or two.

~~~
tomwalker
[http://www.fakenamegenerator.com/](http://www.fakenamegenerator.com/)

~~~
andrey-p
I did specify "fantasy" for a reason. If you're writing fantasy you want your
names to sound natural yet unlike anything your reader will have seen before.
Hence, I guess, why many names in fantasy novels (I'm thinking A Song of Ice
and Fire and Wheel of Time) are pretty much a normal name with one or two
letters replaced.

I'll keep your link in mind though - for my own writing which is in a non-
fantasy setting.

~~~
cac04
I was horrified when I realised that all the names in Game of Thrones are just
normal names as pronounced by my one year old daughter.

------
taliesinb
One of the built-in models in the Wolfram Language does precisely this:

    
    
       In[1]:= Predict["NameAge", "Gertrude"]
       Out[1]= 84
    
       In[2]:= Quartiles @ Predict["NameAge", "Gertrude", "Distribution"]
       Out[2]= {62.8975, 74.7389, 84.8247}
    

More info about Predict and Classify here:

[http://reference.wolfram.com/language/ref/Predict.html](http://reference.wolfram.com/language/ref/Predict.html)

[http://reference.wolfram.com/language/ref/Classify.html](http://reference.wolfram.com/language/ref/Classify.html)

~~~
gkoberger
WolframAlpha can handle this, too:
[http://www.wolframalpha.com/input/?i=how+old+is+mable](http://www.wolframalpha.com/input/?i=how+old+is+mable)

~~~
lesterbuck
My first name is Aubrey, which completely flipped to a girl's name in the US
about ten years ago. According to this chart, the fraction of female Aubreys
is approaching 12% at birth. When that fad wears off, it will make a nice
spike in the curve for many decades. By the way, Aubrey means "elf leader" or
"king of the elves".

www.wolframalpha.com/input/?i=how+old+is+aubrey

------
palakchokshi
This is one of the ways cold readers home in on all kinds of things about the
person they are reading. It is a very effective way to guess someone's
mother's, grandmother's, or sister's name. If the audience is a group of
mostly 30 to 50 year old women, the reader has a good starting point. It goes
something like "Is there a Laura or Lisa here?" There is a high probability
there will be one of those. Once a woman acknowledges her name is Laura, the
reader can see what her approximate age is and make a guess about what her
mother's, grandmother's, or grandfather's name is. They use other cues to
figure out which dead relative the woman is there to "hear" from and then say
something like "Someone with an M or K is coming forward." If the target
reacts to one of those letters, the reader guesses "Mar.... Marg...
Mary... Margaret... Margaret... Is that your mother?"...

You get the idea.

------
jwegan
I actually used something similar to this (but not as sophisticated) at a
previous startup to generate recommendations of people to invite to the app
because the app's target demographic was women ages 20-40.

[http://jwegan.com/growth-hacking/hacking-mobile-invites-with...](http://jwegan.com/growth-hacking/hacking-mobile-invites-with-help-us-government/)

------
hudibras
Baby Name Wizard (linked in the article) is one of the true hidden gems on the
internet. It looks like a fluffy website for moms-to-be, but then you start
poking around at the graphs and you realize that an hour of your life has
disappeared...

[http://www.babynamewizard.com/](http://www.babynamewizard.com/)

Bonus: This blog post from Baby Name Wizard is utterly fascinating. Everybody
I've ever showed this to has been amazed.

[http://www.babynamewizard.com/archives/2012/5/the-shape-of-b...](http://www.babynamewizard.com/archives/2012/5/the-shape-of-boys-names-an-update-on-the-age-of-aidan)

------
btilly
I found the age range on "Jennifer" to be particularly interesting.

My sister Jennifer (see
[http://en.wikipedia.org/wiki/Jennifer_Tilly](http://en.wikipedia.org/wiki/Jennifer_Tilly)
for details) is in her mid-50s. She was in college before she met another
Jennifer her own age. People are still misled by her name and believe that
she has to be a lot younger than she really is.

The moral is that if you have the great fortune to pick a girl's name that
_will_ be popular some day but is not now, that girl will probably be happy
about it. :-)

~~~
ryanx435
so according to your username and that wiki page, you are therefore her half-
brother ben tilly. your mother is patricia, your father is john ward, and you
are from british columbia, specifically texada island.

honestly, i'm not really sure why you would post identifying information on
the internet.

edit: just checked your profile. hey ben!

~~~
btilly
There is enough identifying information about me out there that there is no
point in denying it. That ship sailed for me many years ago. And sometimes it
is convenient to be able to say something and have people realize that I have
direct experience with it.

Of course you shouldn't believe everything you read on the Internet. Contrary
to what you surmised, I was actually born in California, did most of my
growing up in Victoria, British Columbia, and Jennifer is not part-Irish.

Now if you want to get disturbed, go read my sister's book, Singing Songs.
None of what is discussed there was stuff that I had any control over, so I
have no shame about it. And it is all stuff I've said in public before.

As I said, that ship sailed for me long ago. There is no point in hiding it.

~~~
thret
I'd no idea she'd written a book. Thanks Ben! As I said her blog was great, I
read the entire thing in a single sitting.

Edit. Other sister. Still, I'll give it a look. Talented family you have
there.

------
bane
I used to work for an NLP startup. We focused on stuff you could do with
Romanized names -- names that were originally not written in the Latin alphabet
and ended up being written in the Latin alphabet using some kind of
transliteration scheme.

For example, we could take a name and generate a pretty comprehensive, and
culturally aware, list of variants.

Jennifer -> Jenifer, Jen, Jenny, Jennie, etc.

Richard -> Rich, Richie, Dick, Dickie, Ritchard, etc.

Rho -> No, Lo, Loh, Noh, Roh, Ro, Nho, etc.

The intention of course was to build up lists of name variants that could be
used during identification checks.
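
As a rough illustration of the variant idea (not the startup's actual method, which is only described in outline here), a toy generator can combine alternative spellings for each part of a name. The `ONSETS` and `RIMES` tables below are hypothetical and invented for this example, loosely modeled on the "Rho" row above.

```python
from itertools import product

# Hypothetical romanization table, invented for illustration: alternative
# spellings for the initial consonant and for the vowel of one surname.
ONSETS = ["R", "L", "N", "Rh", "Nh"]
RIMES = ["o", "oh"]

def variants(onsets=ONSETS, rimes=RIMES):
    """Return every onset + rime combination as a candidate spelling."""
    return sorted(o + r for o, r in product(onsets, rimes))

print(variants())
```

A plain cross product over-generates (it also emits spellings such as "Rhoh" that are not in the list above), which hints at why a production system would need curated rules and a lot of manual QA.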

We also had some pretty significant statistical models that could guess Gender
and provide a descending list with confidence levels of the most likely
country of origin for a name. It was surprisingly accurate and could account
for different Romanization schemes popular in different countries. It could
even guess if a name was a surname or a given name.

What did we build the models on? Somehow, one of the founders was able to
swing access to U.S. Border Control Data. Even though it was names and country
of origin data, it's de-identified (having a list of names doesn't mean we
know who the names belong to). There was something north of a billion names in
the collection, and included place of birth, country of origin, gender, etc.
Names were mined for digraphs so we could build CFGs that could be walked to
generate variants. There was lots of manual work as well. Endless regex
writing and testing, QA, that sort of thing.

For some countries, we had pretty poor data to be honest. I think we had a
couple dozen North Koreans, but for most of the world, our coverage was
surprisingly good. It turns out all that work boiled down into a surprisingly
small library just a couple dozen megabytes in size and was pretty fast -- I
don't remember how fast, but something like a few thousand names per hour. It
was pretty niche, but eventually the company was acquired and I went on my
way.

I always assumed that technology like that would find its way into more
applications, but I'm constantly surprised it hasn't.

~~~
incision
_> 'I always assumed that technology like that would find its way into more
applications, but I'm constantly surprised it hasn't.'_

Many years ago, I was working on a large project for an organization with
nothing apparently consistent between half a dozen systems with tens of thousands of
users each except names. Naturally, those names were full of exactly the kind
of variations you're describing.

When I went looking for a solution to do exactly what you're describing I ran
into solutions that were both vague about their functionality and expensive.
Like you say, pretty niche - it seemed that everyone was used to selling very
specific 'solutions' not a library/API.

I ended up hacking together a very basic script to accomplish the same. It
took days to run thanks to my non-existent coding skills, but the accuracy was
pretty good.
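
A basic script of that kind might look something like the sketch below. This is only a guess at the general approach, not the script described above: it uses Python's standard `difflib.SequenceMatcher` for string similarity, and the `best_match` helper, the 0.8 threshold, and the sample names are assumptions.

```python
from difflib import SequenceMatcher

def best_match(name, candidates, threshold=0.8):
    """Return the candidate most similar to `name`, or None if below threshold."""
    def score(candidate):
        return SequenceMatcher(None, name.lower(), candidate.lower()).ratio()
    best = max(candidates, key=score, default=None)
    return best if best is not None and score(best) >= threshold else None

# Hypothetical user lists from two systems.
system_a = ["Jennifer Smith", "Richard Jones"]
system_b = ["Jenifer Smith", "Ritchard Jones", "Alice Brown"]
for name in system_a:
    print(name, "->", best_match(name, system_b))
```

Comparing every name in one system against every name in another is quadratic, so with tens of thousands of users per system a naive version like this can easily run for a very long time.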

What it couldn't line up was solved by later decoding and discovering
correlations between the long forgotten conventions used for unique IDs in the
various systems.

------
ilamont
Two thoughts:

1\. Marketers surely have mined this data to the hilt -- cross-referencing
these trends with address lists and full-name email prefixes can make targeted
promotions a lot more effective.

2\. My own name is relatively rare in the U.S. among my age cohort
([http://www.wolframalpha.com/input/?i=ian](http://www.wolframalpha.com/input/?i=ian))
to the point where some adults had problems pronouncing it when I was in
elementary school 35 years ago ("Isn't that a girl's name?"). But I suspect,
based on anecdotal evidence and personal observation, that the name is more
common in England, Scotland, Australia and Canada. And the Wolfram data shows
that it has been growing in popularity for many years in the U.S.

~~~
meric
I live in Australia and I concur with you; I know quite a few Ians.

------
shawkinaw
Can we agree to use the plural "their" for ambiguous sex third-person
possessive? "His" is sexist, but so is "her", which is distracting on top of
that because it isn't conventional.

~~~
Spittie
As someone who's not a native English speaker, can I ask you why "his" is
sexist?

A quick read of a dictionary
([http://dictionary.reference.com/browse/his](http://dictionary.reference.com/browse/his))
says that his is "the possessive form of he", and the second definition of
"he" is "anyone (without reference to sex)". That's also what I got taught in
middle/high school.

Sorry if I'm just missing something and this is a stupid question.

~~~
Someone
Something is sexist when language users think it is. The problem with
he/him/his is that the primary meaning refers to males, only. Because of that,
anyone reading it gets pushed towards the primary meaning.

That's why some people push towards the use of the singular they
([http://en.wikipedia.org/wiki/Singular_they](http://en.wikipedia.org/wiki/Singular_they))
That may eventually change the language for all.

------
mooism2
...and her nationality.

I'm British. I know two women called Deirdre. They're both Irish. It seems
that the name had fallen out of favour in Britain by the 70s, but was still
fashionable in Ireland until at least the 80s.

~~~
slyall
Could be related to the character on the long-running primetime soap
Coronation Street. She started in 1972.

[http://en.wikipedia.org/wiki/Deirdre_Barlow](http://en.wikipedia.org/wiki/Deirdre_Barlow)

~~~
mooism2
Corry's been shown in Ireland since 1978.

[http://en.wikipedia.org/wiki/Coronation_Street#International...](http://en.wikipedia.org/wiki/Coronation_Street#International_syndication)

Not saying it's not the reason for the divergence, but it seems less likely.

~~~
rmc
A lot of Irish households have access to British TV stations like BBC, ITV
etc. and could have watched it there.

~~~
theoh
Since Deirdre is an Irish name, it is possible that the decline in popularity
(among the general population) in the UK was at least partly due to increased
anti-Irish sentiment in the 1970s.

------
jaxytee
The Wizard of Oz was released in 1939, so it makes sense that the median age
for Dorothys is around 75 years of age.

I wonder what other pop culture events influenced naming trends.

~~~
probably_wrong
I bet Liam Neeson is the reason for the new wave of Liams, but I can't decide
whether the Karate Kid remake is the reason for the wave of Jaydens, or if
it's just a coincidence.

~~~
personZ
[http://names.yafla.com/#n=Jayden&s=mt](http://names.yafla.com/#n=Jayden&s=mt)

Coincidence. The name started its ascent in the mid 90s.

~~~
Larx-3
Not necessarily. Liam Neeson was the star of the 1993 film "Schindler's List."
It won 7 Academy awards in 1994, including best picture.

------
was_hellbanned
I was curious about one of the deadest male names, Isadore, so I looked it up.
It's of Greek origin and it turns out the female counterpart, Isadora, is the
ninth most popular name for baby girls in Chile in 2006. The website linked
from the article indicates that it's never ranked in the top 1000 in the US.
Interesting how a shared, ancient name could be so wildly divergent in usage.

~~~
officemonkey
Kirk Douglas was known as Isadore Demsky when he grew up in Amsterdam NY in
the early 20th Century.

Apparently it was a popular male name for immigrants and first generation
children in the early 20th Century. It was often shortened to "Izzy."

There was a social trend in America during the middle of the 20th Century to
"anglicize" names. For example, I have uncles who changed their birth name in
the early 50s from "Wozniak" to "Wagner." Even Izzy Demsky became Kirk Douglas
when he grew up.

Take, for example, the children in "The Godfather" books. The older
children have "old country" names (Santino, Fredo) and the younger children
have "new world" names (Michael, Connie.) It's almost as if the older kids
"Americanize" the family when they go to school.

Anyhow, names are funny things when taken in aggregate.

~~~
cookiecaper
My immigrant grandparents named my mother an anglicized version of their
intended name after pressure from the older children, who said, "In America
you say ____, not ____". I think there is probably something to the theory
that the family gets more Americanized as the older children are raised in
American culture and "correct" some of their parents' old world ways.

------
danso
The combination of the SSA babynames data, which is very cool and deep on its
own, with the SSA actuarial data is pretty neat, partly because I hadn't known
about the actuarial data set...but when I saw that the OP had tried to
calculate surviving persons of a given name and birth year, I assumed that
they just used the SSA's death database...until at least 2010, the SSA
had a list of every SSA person who has died and also, when they were born, and
also, their social security numbers. Since the SSN, until relatively recently,
was indicative of what state the SSN-holder was actually born...well, that,
combined with the babynames-per-state data, could get you very granular
calculations...I'm sure the SSA's actuarial table gets it pretty much within
an acceptable margin of error, but who knows, maybe some awkwardly named
people were doomed to a shorter lifespan? (I'm only half joking, I think)
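
For reference, the births-times-survival calculation can be sketched in a few lines. This assumes the approach is: multiply births per year for a name by the probability of surviving to the present, then take the median over the survivors. The function name and all numbers below are made up for illustration and are not SSA figures.

```python
def median_age_of_living(births_by_year, survival_to_now, current_year):
    """Median age of people still alive, weighting each birth year by survivors."""
    alive = {year: births * survival_to_now[year]
             for year, births in births_by_year.items()}
    half = sum(alive.values()) / 2
    running = 0.0
    # Walk from the youngest cohort to the oldest until half the living are covered.
    for year in sorted(alive, reverse=True):
        running += alive[year]
        if running >= half:
            return current_year - year
    raise ValueError("no birth data")

# Made-up numbers for illustration only (not SSA figures).
births = {1920: 1000, 1950: 500, 1980: 100}
survival = {1920: 0.02, 1950: 0.70, 1980: 0.97}
print(median_age_of_living(births, survival, 2014))  # 64
```

Using a single survival table for everyone is exactly the simplification being questioned here: it ignores any link between name, state, and mortality.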

~~~
alttag
> what state the SSN-holder was actually born

No, it was the state where the SSN was issued. Not all children applied for an
SSN at birth. Centralization of SS offices also altered this practice.

See, e.g.,
[http://www.ssa.gov/history/ssn/geocard.html](http://www.ssa.gov/history/ssn/geocard.html)

------
dllthomas
The assumption that death rates have no link to names will probably break down
in some cases.

~~~
jedberg
This was your subtle way of saying that the average lifespan for a typical
Afro-American name is lower, right? :)

It's ok to differentiate things amongst races sometimes -- it isn't _always_
racist.

~~~
dllthomas
That was certainly an example, but I moved away from it for generality (and
brevity), not concerns over racial tension.

------
bazzargh
Would be interesting to apply it to a group of friends. Since they're likely
to be similar ages, you should be able to get an improved guess from combining
the distributions for all of their names.
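
One simple way to combine them, sketched below, is to multiply the per-name age distributions and renormalize. This assumes the friends are all the same age, that names are independent given age, and that there is a flat prior over ages. The `combine` helper and the numbers are hypothetical.

```python
def combine(distributions):
    """Multiply per-name age distributions and renormalize (naive Bayes style)."""
    ages = set.intersection(*(set(d) for d in distributions))
    joint = {age: 1.0 for age in ages}
    for dist in distributions:
        for age in ages:
            joint[age] *= dist[age]
    total = sum(joint.values())
    return {age: p / total for age, p in sorted(joint.items())}

# Made-up distributions over three coarse age bands, for illustration only.
emma = {10: 0.5, 30: 0.2, 80: 0.3}
jayden = {10: 0.8, 30: 0.15, 80: 0.05}
print(combine([emma, jayden]))
```

With these toy inputs, "Emma" alone is ambiguous between young and old, but pairing it with "Jayden" pushes nearly all of the probability onto the youngest band.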

------
Finbarr
"The peak year for boys named Joseph was 1914 — when about 39,000 of them were
born. Those 1914 Josephs would be due to celebrate their 100th birthdays at
some point this year. But only about 130 of them were still alive as of Jan.
1."

Something quite poignant in this. I'd be interested in seeing a life
expectancy chart based on name.

~~~
DougWebb
I'm pretty sure that would be "a life expectancy chart". It's pretty unlikely
that your name has any impact on your life expectancy. But, since name
popularity is influenced quite a bit by social/cultural status, and those _do_
affect life expectancy, you'd probably see some differences along those lines.

~~~
hluska
Great comment!

In case anyone is interested, there have been some studies done where
researchers send in two identical resumes. The only difference is that one has
a traditionally 'white' sounding name, whereas the other has a name more
associated with minorities. The 'white' sounding name performs better in these
types of tests.

[http://www.slate.com/articles/business/the_dismal_science/20...](http://www.slate.com/articles/business/the_dismal_science/2005/04/a_roshanda_by_any_other_name.html)

^ This article gives some more information, including an interesting story
about two brothers named Winner and Loser. The most relevant quote, however,
comes right at the end:

"The data show that, on average, a person with a distinctively black
name—whether it is a woman named Imani or a man named DeShawn—does have a
worse life outcome than a woman named Molly or a man named Jake. But it isn't
the fault of his or her name. If two black boys, Jake Williams and DeShawn
Williams, are born in the same neighborhood and into the same familial and
economic circumstances, they would likely have similar life outcomes. But the
kind of parents who name their son Jake don't tend to live in the same
neighborhoods or share economic circumstances with the kind of parents who
name their son DeShawn. And that's why, on average, a boy named Jake will tend
to earn more money and get more education than a boy named DeShawn. DeShawn's
name is an indicator—but not a cause—of his life path."

(Levitt and Dubner, "A Roshanda by any other name")

------
onewaystreet
Did FiveThirtyEight steal this idea from Business Insider?
[http://www.businessinsider.com/popular-girl-boy-names-2014-5](http://www.businessinsider.com/popular-girl-boy-names-2014-5)
They did the same research a week ago.

~~~
jonas21
They were probably both inspired by Social Security Administration's release
of name data for 2013 a couple of weeks ago:

[http://www.socialsecurity.gov/pressoffice/pr/2014/babynames2...](http://www.socialsecurity.gov/pressoffice/pr/2014/babynames2013-pr.html)

------
drpgq
Somewhat related, here's the latest NIST results for age estimation based on
face photographs (PDF):

[http://www.nist.gov/customcf/get_pdf.cfm?pub_id=915238](http://www.nist.gov/customcf/get_pdf.cfm?pub_id=915238)

------
ElongatedTowel
Xavier? Logan? Guess someone wants to have his son grow up as the Wolverine.

------
bostonpete
It's surprising that Jacob isn't one of the top 25 most common male names
considering that it's been the most popular male baby name for 14 of the past
15 years.

~~~
dfc
The list is the 25 most popular names _since 1900._

~~~
bostonpete
I assume it's the 25 most popular names of the living. But either way I would
have expected the most popular male baby name for 14 consecutive years to make
the list.

~~~
dfc
Baby Boomer generation.

------
talles
That's very interesting.

Does anyone know of a service or website where you input a particular name
and it gives you statistics like the average age of people with that
name?

------
tzury
Can someone tell me what software was used to produce the charts?

~~~
skeletonjelly
Nate Silver willed them into existence.

I often wonder this about newspaper ones as well. I guess graphic designers
custom make a fairly large amount.

~~~
qwerty_asdf
How can y'all not know 'bout R?

[http://www.r-project.org](http://www.r-project.org)

...just kidding, lots of people don't know about R, but check it out, because
it's pretty badass!

I'd be curious to know if people still use Processing, professionally?

[http://processing.org](http://processing.org)

------
cafard
Amusing. My brother and I are almost smack on the median of our names. Yet I
was named after my father (and his father), my brother after our mother's
father.

------
ojbyrne
It's great to be unpredictable (Owen, slightly older than 8).

------
akilism
Baby names, is this the new wave of data journalism?

~~~
jonathanjaeger
Baby names were heavily discussed in Freakonomics as indicators of a variety
of things. An interesting read if you like this sort of data. Relevant
content: [http://freakonomics.com/tag/baby-names/](http://freakonomics.com/tag/baby-names/)

------
MarkMc
I've just been travelling through Singapore and was astonished to come across
young women named Agnes and Gertrude.

~~~
skrause
My name is Sebastian, which was extremely popular in Germany in the early 80s
(not once was I the only Sebastian in the classroom). In the USA people would
now imagine a small child when hearing my name. It's very interesting how
different popular names are in different countries.

~~~
dalke
Was that perhaps influenced by the name "Bastian" for the main character of
Ende's 1979 book "The Never-Ending Story" (Die Unendliche Geschichte in the
original German)?

------
p1itopre
I was searching for a link to download the data (a csv maybe) for me to play
with. Did someone find a link?

~~~
keithba
[http://www.socialsecurity.gov/OACT/babynames/limits.html](http://www.socialsecurity.gov/OACT/babynames/limits.html)
has links.

~~~
p1itopre
Thanks!

------
autokad
sorry if someone already posted it, but you can also get an estimate on where
they live :)

[http://jezebel.com/map-sixty-years-of-the-most-popular-names...](http://jezebel.com/map-sixty-years-of-the-most-popular-names-for-girls-s-1443501909)

------
subdane
This totally nailed my Mom. That sounds worse than I meant it.

------
shmerl
That's very much culture / language / country specific. Naturally societies
tend to have certain preferences in names in different time periods. But those
are only tendencies, not a set-in-stone set of names.

------
jccalhoun
I wonder how names with alternate spelling fit in?

------
bmmayer1
My app, DrillbitApp.com, uses the same data to run on marketing lists. Also
does race and gender.

------
madengr
Amazing that the oldest male names do not include biblical Old Testament names
but the youngest male names do! A sign of increasing religious fundamentalism?

