
Falsehoods Programmers Believe About Names – With Examples - andrelaszlo
https://shinesolutions.com/2018/01/08/falsehoods-programmers-believe-about-names-with-examples/
======
koliber
I like reading these lists and find them very interesting.

At the same time, I would caution anyone against building software that blindly
tries to accommodate every point in these lists.

Every piece of software has a set of requirements and a target market. It should be
built to meet those requirements and anticipate strange situations, but within
reason. This is very context sensitive. There is no one ruleset.

At the same time, it should be possible to address edge cases in some
reasonable way. For example, if your software does not support pasting images
in as names (and therefore cannot represent the prince symbol), it should
allow transliterations into a script that it can support. We can't support all
cases perfectly, but being aware of them and finding some way to fit them in
makes for a better experience.

~~~
SilasX
I find it interesting because it's a more universal problem that predates
software. Each civilization had its own way of doing things and had to
occasionally accommodate people who didn't follow its customs, and they had to
use many of these very same workarounds.

"You don't use our script? Okay, you'll have to come up with a version of your
name that we can write in our script."

"You don't have a [second] family name? We'll have to come up with a
convention for deriving one when we log you."

"Your name has sounds we can't pronounce? We'll treat you as having a version
of your name that we can pronounce."

"You don't have a signature version of your name? Uh, write X or something."

~~~
tillinghast
Exactly this. Your name is not your own--your name is a contract between you
and the culture / civilization you're dealing with.

------
DoreenMichele
A few random thoughts:

They chickened out on giving an actual example of a "bad word" name, though
there's a not very dirty and really common example:

Dick, which can be short for Richard.

My ex is a Junior. When I get phone calls asking for "Doreen Traylor Junior" I
know it's a telemarketer.

I used to try to crack jokes, but they never got it. As far as I know, Junior
and Senior are masculine. Even if they weren't, it's my married name. Doreen
Traylor didn't exist until I married. I couldn't have inherited it from my
mother, even if culturally we did that kind of thing. (I mean even if it was
common to give daughters the same name as their mother and call them
_Junior_.)

I'm not sure where that falls on this list, but it seems like a glaringly
obvious error if you know my gender. (And, as far as I know, _Doreen_ is a
female name. Though _Michele_ can be male in some places, like Italy, they
usually don't seem to know my middle name.)

~~~
jaredwiener
I've gone through life with a surname that euphemistically refers to an
intimate part of male anatomy. While I've never had problems with it (well,
technological problems, at least) -- this writer with a slightly different
spelling, has:

[https://twitter.com/natalieweiner/status/1034533630071193602...](https://twitter.com/natalieweiner/status/1034533630071193602?lang=en)

~~~
mixmastamyk
Interesting that the one pronounced "winer" has problems, where the "weener"
doesn't. Perhaps for someone not familiar with German: it is the second letter
that is pronounced.

~~~
jaredwiener
I'm not German -- this ends up being another, though unrelated, constant
difficulty with this name. (It's pronounced like "Whiner.")

Family legend holds that when my great grandfather reached Canada and told the
immigration agent his Lithuanian name, the agent replied, "I don't know how to
spell that. I'm writing down 'Wiener'."

Strangers will insist we are pronouncing the family name wrong, though I
firmly hold the position that we can pronounce it however we want, since it's
our name.

~~~
mixmastamyk
Hmm, sounds easier to change the spelling rather than spend a lifetime working
against standard expectations.

------
nradov
For those who have to design systems to properly accommodate most of these
"falsehoods", don't reinvent the wheel. Take a look at how the HL7 CDA R2
format handles names for healthcare data interchange.

[http://www.hl7.org/implement/standards/product_brief.cfm?pro...](http://www.hl7.org/implement/standards/product_brief.cfm?product_id=7)

Each person can have multiple different names tagged by use (legal, maiden,
alias, etc). Names can be composed of multiple different parts (prefix,
suffix, given, family, or just free text). All of Unicode is supported. And
names can also be _NULL_ in various different ways (which is distinct from
just blank or missing).
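
As a rough sketch of that shape in Python (the class names, use tags, and null reasons below are made up for illustration and are not taken from the HL7 spec):

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class NameUse(Enum):
    # Hypothetical use tags, loosely following the ones listed above.
    LEGAL = "legal"
    MAIDEN = "maiden"
    ALIAS = "alias"


class NullReason(Enum):
    # Hypothetical reasons a name is absent; not the same as an empty string.
    UNKNOWN = "unknown"
    NOT_ASKED = "not asked"
    WITHHELD = "withheld"


@dataclass
class NamePart:
    kind: str   # "prefix", "given", "family", "suffix", or "text"
    value: str  # any Unicode string


@dataclass
class PersonName:
    use: NameUse
    parts: list[NamePart] = field(default_factory=list)
    null_reason: Optional[NullReason] = None

    def display(self) -> str:
        # An explicitly absent name is reported as such, never as "".
        if self.null_reason is not None:
            return f"<no name: {self.null_reason.value}>"
        return " ".join(p.value for p in self.parts)


@dataclass
class Person:
    names: list[PersonName] = field(default_factory=list)

    def name_for(self, use: NameUse) -> Optional[PersonName]:
        # First name tagged with the requested use, or None if there is none.
        return next((n for n in self.names if n.use is use), None)


patient = Person(names=[
    PersonName(NameUse.LEGAL, [NamePart("given", "Jane"), NamePart("family", "Smith")]),
    PersonName(NameUse.MAIDEN, [NamePart("given", "Jane"), NamePart("family", "Doe")]),
])
unidentified = Person(names=[PersonName(NameUse.LEGAL, null_reason=NullReason.UNKNOWN)])
```

The point of `null_reason` is that "we don't know this person's name" gets recorded explicitly instead of collapsing into a blank field.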

------
dharmab
One for "people have names": in the context of an emergency room or criminal
investigation, a person's name may not be initially known.

ERs work around this by assigning a code name to every trauma patient...
meaning those patients have an additional name in the context of the hospital
system.

------
cgoecknerwald
> "It seems some people believe that you get a name and it never changes. Not
> so, even in Western countries, where a person may change their name when
> they marry."

Since it is traditionally /women/ who change their names when married, this
blind spot does not surprise me.

~~~
repolfx
I don't think you need to reach for sexism as an explanation there. Name
changes are sufficiently rare that building the flows to support it will
usually only become an issue in mature production software where accurate
names are extremely important, and where preserving long term records is
equally important. That's not most systems. And that's before you get into
cases where names are primary keys like usernames.

The business case for supporting such changes can be pretty weak relative to
other features.

~~~
cgoecknerwald
Name changes are not rare.

In recent decades, about 80 percent of brides choose to change their names
after the wedding, both professionally and legally. [1] Most people marry at
least once. [2] Women are half the population.

That being said, I do agree that in many systems, supporting name changes is
not an immediate priority.

[1] [http://www.nytimes.com/2015/06/28/upshot/maiden-names-on-the...](http://www.nytimes.com/2015/06/28/upshot/maiden-names-on-the-rise-again.html)

[2] [https://flowingdata.com/2017/11/01/who-is-married-by-now/](https://flowingdata.com/2017/11/01/who-is-married-by-now/)

------
thomasfoster96
I know people see No. 29 on this list and decide that their product or service
will only be used by English speakers of European descent, but from experience
even with these extraordinarily low requirements for testing, most services
fail. I’ve got three short (<8 letters) entirely ASCII names, and even then I
still have trouble:

* I regularly get letters addressed to my first name and my middle name, missing my surname.

* Banks and airlines always seem to try a different way of dealing with my middle name, from only using an initial to abbreviating it to anywhere from three to six characters.

* A friend once got issued travel documents where all of their given names were concatenated together (no spaces) and printed after their last name.

* Many websites still require you to pick an honorific from a drop down menu, but the options vary considerably (and usage is different in every country, even with a common language). Just make it a text field and be done with it.

I really don’t think it’s too difficult nowadays, especially if you’re writing
something from scratch, to just accept really long Unicode strings and store
given names and surnames separately. And it’s not like testing this is
difficult - you can find datasets of a few hundred names in different
languages and scripts fairly easily.
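
For what it's worth, a minimal sketch of what "just accept long Unicode strings" could look like for a single name field. The length limit and the decision to reject only control characters are my own assumptions, not anything from the article:

```python
import unicodedata

MAX_LEN = 1000  # arbitrary generous limit; an assumption, not a recommendation


def clean_name_field(raw: str) -> str:
    """Normalize one name field without judging which letters are 'valid'."""
    text = unicodedata.normalize("NFC", raw).strip()
    if not text:
        raise ValueError("name field is empty")
    if len(text) > MAX_LEN:
        raise ValueError("name field is too long")
    # Reject only control characters (Unicode category Cc), nothing else.
    if any(unicodedata.category(ch) == "Cc" for ch in text):
        raise ValueError("name field contains control characters")
    return text


# A tiny stand-in for the kind of test dataset mentioned above.
SAMPLES = ["O'Neil", "Jean-Claude van Damme", "田中太郎", "Grace Grace"]
cleaned = [clean_name_field(s) for s in SAMPLES]
```

Hyphens, apostrophes, spaces, and non-Latin scripts all pass through untouched, which is the whole idea.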

~~~
stevula
To your second and third point, I recently traveled internationally and the
airline concatenated my first and middle name on my ticket and boarding
passes. My passport has a space between my first name and middle _initial_. My
driver’s license has a space between my first name and middle name.

Needless to say, I was unable to check in online or with one of the machines
at the airport.

~~~
gav
For some reason Delta prints GAVINJ as my first name on all tickets even
though the J is the first letter of my middle name. When I book flights this
isn’t the case, either directly or through a 3rd-party; I assume they have
some master record in a mainframe somewhere that was created by some other
system incorrectly and I’m stuck with it forever.

However, I’ve flown hundreds of times and nobody who has checked my ID has
ever been bothered by this, which is a little worrying.

~~~
detaro
Why is it worrying? What would you prefer they do?

------
danso
This was an interesting list, but I’d be more interested in reading examples
of how these assumptions are manifested as erroneous design decisions.

For example, many of the flawed assumptions are a variation of “Names don’t
change/vary, ever”. I think most U.S. programmers are very aware of this —
especially the many who have some variation of a name like “Mike/Michael”,
“Dan/Daniel”, “Kate/Katlyn/Katherine”. But it took me a bit to figure out
where a programmer would most likely forget that fact, in a way that’s a
disservice to users — making a database system that assumes the first
name/surname fields should never be updated.

~~~
Piskvorrr
Updated? That's a design error right there.

"Nope, can't give you the records of Jane Doe from 2005, you are in our system
as Jane Smith. You had a different name then? Doesn't compute!" In other
words, the old name needs to be kept, the new one is not a _fix_ from an
erroneous state: it used to be correct then, the new one is correct now, and
you still need to match on _both_.

Why? Because changes never propagate instantly, if they propagate at all: I
still get mail addressed to my mother (who hasn't lived here for decades),
addressed to her previous name (another decade on top of that), into my
mailbox, even though no trace of any of those names remains. As far as the
post office goes, "XY Street 123, box 45" is still the residence of Mrs Foo.
(That's actually a counterexample, yeah)

~~~
0db532a0
Interested question: Why shouldn’t names be updated in-place?

~~~
Piskvorrr
If you do have a denormalized "this is a current name" column, sure, go ahead.
But the example is right above you: you might need to match not just on a
current name, but on historical names as well. People change their names _a
lot_.
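
A minimal sketch of the append-only idea in Python, assuming a made-up `NameHistory` class and an in-memory list rather than any particular database schema:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class NameRecord:
    full_name: str
    valid_from: date
    valid_to: Optional[date] = None  # None means "still current"


class NameHistory:
    def __init__(self) -> None:
        self._records: list[NameRecord] = []

    def change_name(self, new_name: str, on: date) -> None:
        # Close the current record instead of overwriting it.
        if self._records:
            last = self._records[-1]
            self._records[-1] = NameRecord(last.full_name, last.valid_from, on)
        self._records.append(NameRecord(new_name, on))

    def current(self) -> Optional[str]:
        return self._records[-1].full_name if self._records else None

    def name_on(self, day: date) -> Optional[str]:
        # The name that was correct on a given day, if any.
        for r in self._records:
            if r.valid_from <= day and (r.valid_to is None or day < r.valid_to):
                return r.full_name
        return None

    def matches(self, name: str) -> bool:
        # Match on every name the person has ever had, not just the current one.
        return any(r.full_name.casefold() == name.casefold() for r in self._records)


jane = NameHistory()
jane.change_name("Jane Doe", date(1980, 1, 1))
jane.change_name("Jane Smith", date(2010, 6, 12))
```

Here `jane.current()` plays the role of the denormalized "current name" column, while `matches()` still finds the 2005 records filed under the old name.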

~~~
0db532a0
That’s true, actually. I didn’t read the reply properly. It makes complete
sense to ban updates if you want to have access to historical names.

------
LordHeini
So what is the solution now?

Have a single full Unicode 'name' field which is over 200 characters long?

What is even the longest possible name?

The former German minister of defence has 108 characters in his ridiculous name.

~~~
JetSetWilly
That still violates point 11:

"People’s names are all mapped in Unicode code points."

It seems like it is impossible to address all these points. Obviously there's
no way for me to allow people to enter their names which don't have any known
code point. So at some point you're always going to be saying "tough luck, you
must adapt your name - or come up with something - that conforms to these
constraints." And I am sure people are clever enough to do so.

~~~
gvx
That point and point 40 ("People have names") can't be solved by technical
means. There are two options here:

- You can make name fields optional for people without (representable) names
- You can loosen what the field stands for to "what you want us to call you":
for people who don't have a (representable) name, they can choose an
alternative, without implying the user's (lack of) name is Wrong.
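
Both options can live in one field. A tiny sketch, with a hypothetical `Profile` class and field name:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Profile:
    account_id: int
    # "What you want us to call you"; optional, and not claimed to be a legal name.
    preferred_name: Optional[str] = None

    def greeting(self) -> str:
        # Fall back to a neutral greeting when no name was given.
        if self.preferred_name and self.preferred_name.strip():
            return f"Hello, {self.preferred_name.strip()}!"
        return "Hello!"
```

Nothing here validates the value, so whatever the user chose to enter is what they get called.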

------
macintux
It’s worth noting that mistakes here have enfranchisement implications as
well: officials with one U.S. political party have been eager to combat voter
fraud, including by refusing to allow people to vote whose names don’t
perfectly match in all relevant databases.

------
pmoriarty
Also see the curated list of falsehoods programmers believe in:

[https://github.com/kdeldycke/awesome-falsehood](https://github.com/kdeldycke/awesome-falsehood)

------
WillKirkby
Another one for "People’s names do not change." - transgender people will
almost always change their name during their transition.

~~~
OskarS
The other obvious example is changing names when you get married.

~~~
WillKirkby
yes, which is mentioned in the article..

------
speedplane
This article points out lots of false or at least weak assumptions that
programmers make about names. But it doesn't really address the root of the
problem: why programmers make these assumptions in the first place.

There are two main reasons:

1. UI/vanity: it's nice to see your name in your app/email somewhere, and to
display your name, the software often must make assumptions about your name.

2. Disambiguation: Software often has to determine if "Jon Livingston" is the
same as "Jonathan M. Livingston". That is by its nature pretty error prone,
but over a finite and well understood dataset, can be made to be relatively
accurate.

At its worst, this article says that the two pieces of functionality above
are impossible. At its best, it says that they are possible, but there are a
number of things you need to consider. I'm an optimist, but I'd like to see
more that points to the latter.

~~~
jcelerier
> Software often has to determine if "Jon Livingston" is the same as "Jonathan
> M. Livingston"

wat ? what kind of software does that ? that's dumb as hell. You have one name
and that's the one on your id card.

~~~
pjc50
a) I don't have an ID card

b) I rarely use my middle name or initial, except on "official" documents

c) People abbreviate their names a _lot_

d) Sometimes it's helpful to match people with the same name at the same
address as being the same person .. and sometimes it's adversarial.

~~~
Piskvorrr
Ad d: an acquaintance of mine named his son after himself. We don't do
suffixes here, so no "II." or "Junior" - they're both Mr. Foo Bar, at Baz
Drive, Quuxtown. The age differs, of course, but rarely does anyone collect
it, much less use it for disambiguation. As you say, it's a mixed bag - either
you get to impersonate the other, or the police gets all confused when looking
for a rowdy adolescent and finding a middle-aged man, or you just can't get
booked ("we already have you here" "no you don't, that's my son" "that's
impossible, there can't be two of you, computer says so!").

------
glitchc
I knew names were hard but now am convinced they are diabolical. The only safe
solution seems to be to store every name as an alphanumeric blob and forget
about parsing altogether.

~~~
contravariant
If only there _was_ a unique alphanumeric blob representing a name.

~~~
Piskvorrr
A name, or a person? The first one is a bit easier; person:name is, at the
easiest, 1:n. But yeah, a name can have multiple representations, even before
multiple languages come in (Peter/Cephas/Petros).

(Pedantry: Jean-Claude van Damme doesn't fit into the _alphanumeric_
requirement on two separate counts)

------
aquamo
One option might be to use a hash of one's DNA sequence and be done with it.
The name identifier can then just be an image with any markings made by the
person. This would then move name / id search from a character set problem to
an image recognition problem. Works for me :-)

~~~
deathanatos
Chimeras:
[https://en.wikipedia.org/wiki/Chimera_(genetics)](https://en.wikipedia.org/wiki/Chimera_\(genetics\))

Essentially, people can (sometimes) not have a single set of DNA.

And this has already caused issues:

> _In 2002, Lydia Fairchild was denied public assistance in Washington state
> when DNA evidence showed that she was not related to her children. A lawyer
> for the prosecution heard of a human chimera in New England, Karen Keegan,
> and suggested the possibility to the defense, who were able to show that
> Fairchild, too, was a chimera with two sets of DNA, and that one of those
> sets could have been the mother of the children._

------
cabalamat
> 9. People’s names are written in ASCII.

This is apparently true in the UK, although they use the terms "special
characters" and "normal letters".
[https://passportapplication.service.gov.uk/help/html/pages/1...](https://passportapplication.service.gov.uk/help/html/pages/10.05_01_name-to-appear_en.html) :

> We cannot show full stops, hyphens, accents or other special characters in
> your passport. Full stops and hyphens in a name will be replaced with spaces.
>
> If your name has a special character or accent mark please enter your name
> using a normal letter eg e instead of é or a instead of ä etc.

~~~
wwosik
"normal" letter sounds rather judging, doesn't it? Although I suspect it is
probably a product of an undereducated person.

~~~
soundwave106
Agreed, simply "please use the unaccented letter" would have been fine
instructions. Accented letters are pretty "normal" in many languages.

~~~
occamrazor
Accented letters are not the only issue. Apart from the already mentioned
apostrophes and hyphens, there are consonants with diacritics such as ç, č and
ñ, or cases where substituting an “accented” letter with the corresponding
unaccented one is not appropriate (e.g. ä in German is transcribed as ae in
ASCII).
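
A small Python sketch of the difference, using the standard `unicodedata` module. The German table is hand-written for illustration and far from complete, and the function names are made up:

```python
import unicodedata

# Language-specific replacements applied before generic stripping.
# Illustrative only; a real table would need to be much larger.
GERMAN = {"ä": "ae", "ö": "oe", "ü": "ue", "Ä": "Ae", "Ö": "Oe", "Ü": "Ue", "ß": "ss"}


def strip_marks(text: str) -> str:
    """Drop combining marks after NFD decomposition (é -> e, ç -> c, ñ -> n)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def to_ascii_german(text: str) -> str:
    """Apply the German digraph convention first, then strip what is left."""
    # Compose first so that "a" + combining diaeresis is seen as "ä".
    text = unicodedata.normalize("NFC", text)
    for src, dst in GERMAN.items():
        text = text.replace(src, dst)
    return strip_marks(text)
```

With this, `strip_marks("Müller")` gives "Muller" while `to_ascii_german("Müller")` gives "Mueller". Letters with no decomposition, such as ø or ł, pass through both functions unchanged, so neither one is a complete answer.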

------
caf
_An Australian businessman and politician called Benjamin Benjamin died in
1905._

There's a _current_ Australian politician called Grace Grace.

~~~
Piskvorrr
Yeah, this is actually quite common. This is the man who rediscovered how to
move the moai using simple tools:
[https://en.wikipedia.org/wiki/Pavel_Pavel](https://en.wikipedia.org/wiki/Pavel_Pavel)
In this case, neither the first name nor the surname are rare, when used
separately.

------
rfgjheljfrsd
He (Patrick) missed out "People in the same family have unique names" as a
general concept.

He pointed out "Jr." and "I/II/III", but how about Maria <middle name> <last
name>?

Both my sisters' first names are Maria, as is my daughter and my upcoming one.
My new daughter will have the same initials as one of my sisters (My mother,
her sisters and my cousin also are all Marias, but luckily many of them have
different last names).

So, if your form only has an initial for middle name, you could have a
situation where a person living in the same household (i.e. same address) is
indistinguishable from another.

So neglecting the middle name, as is often done, is terrible.

~~~
lmkg
And let's not forget George Foreman, who named all five of his sons George
Edward Foreman. So a middle name wouldn't have helped you there.

------
ThorinJacobs
I enjoyed the list and examples but I feel the hypothesis in the introduction
is flawed.

All anecdata, but I expect those of us who read these sorts of lists will
likely understand them quickly without needing examples. There's a large
number of developers that are not reading software blogs and lists and are not
applying much thought to their programs beyond "how do I meet the minimum
requirements put forth in this user story". As long as those sorts of
developers continue to be prevalent it's partly on the business to specify
these requirements explicitly.

Tragically I don't see this changing any time soon.

------
jyriand
Small example with my own name: In my home country I'm called Jüri, in Finland
probably Jyri, in Russia Juri, and in the passport it’s also written as Jueri.
Some people write it mistakenly as Yuri.

------
purplethinking
Can't we just refer to people as a hash of their DNA.. And forget about
identical twins.

~~~
andrelaszlo
"Falsehoods Programmers Believe About DNA"

~~~
db48x
Yea, there's at least two in there already.

~~~
eesmith
"Lydia Fairchild is an American woman who exhibits chimerism, in having two
distinct populations of DNA among the cells of her body." -
[https://en.wikipedia.org/wiki/Lydia_Fairchild](https://en.wikipedia.org/wiki/Lydia_Fairchild)

~~~
DoreenMichele
Microchimerism is more common than previously believed:

[https://www.nytimes.com/2015/09/15/science/a-pregnancy-souve...](https://www.nytimes.com/2015/09/15/science/a-pregnancy-souvenir-cells-that-are-not-your-own.html)

TLDR: Having a baby tends to leave you with DNA from the baby distributed
throughout your body.

------
Upvoter33
This was a fun read. However, at least one point is wrong. "People’s names fit
within a certain defined amount of space" This is actually true.

~~~
GuB-42
And how do you define the amount of space people's names fit within? ;)
Tautologies are cheating.

"People’s names fit within a certain amount of space" is definitely true
though.

------
isotropy
Despite being a middle-class white guy from the US, I've experienced three of
these on the same day.

When I met my future wife, my last name ("Coon") happened to be a (somewhat
rare) slur on her ethnic background. At that time, a man couldn't
automatically change names on his wedding day in California, so we had to go
to court ahead of time to fill out a legal name change form.

The paper form had only one blank for the entire new name. A Hispanic family
ahead of us in line had a son whose last name was supposed to be [father's
surname] [mother's surname], but his birth certificate had accidentally
recorded his father's surname in the middle-name field instead. So the family
wanted to move his "middle" name to the front of his surname. With only one
blank for the whole name, the kid's current legal name had the same exact
sequence of characters as the new legal name, and there was no place on the
form to make a note. The judge asked if they were OK with adding a hyphen, and
they were.

Epilogue: each of my daughters has two middle names.

------
Insanity
This was an interesting read!

Being from Belgium, I wouldn't have expected the difference between how we
index "Vincent Van Gogh". We always (or at least I always) thought that using the same
software in Belgium (Flanders) and the Netherlands would be easy - i18n wise.

(There are issues with different laws though depending on the domain of your
software)

------
PurpleBoxDragon
> People whose names break my system are weird outliers. They should have had
> solid, acceptable names, like 田中太郎.
>
> No, your system is badly designed.

> People have names.
>
> This one is perhaps the most difficult for which to give solid examples.
> There was an isolated culture in which no one had names – they referred to
> everyone in relative terms, such as “my mother’s eldest sister”.
So taking the last two statements together, any system that assumes a name
exists is thus poorly designed?

Some of these examples are things people need to take into account, but that
doesn't mean bad design. A well designed system takes into account that there
is a business need and a business budget. To ignore the constraints of a
real-world situation makes less sense than ignoring the one culture where people
don't have names at all.

------
salutonmundo
"e e cummings preferred his name written in all lower case." - this is not
exactly true. He signed with capital letters, most of the time. (see
[https://en.wikipedia.org/wiki/E._E._Cummings#Name_and_capita...](https://en.wikipedia.org/wiki/E._E._Cummings#Name_and_capitalization))

A good example of such a name, though, would be eden ahbez:
[https://en.wikipedia.org/wiki/eden_ahbez](https://en.wikipedia.org/wiki/eden_ahbez)
(he's also a good example of someone with more than one name.)

------
masonic
The original McKenzie list was posted to HN many times from 2010 on. Original,
170+ points:

[https://news.ycombinator.com/item?id=1438472](https://news.ycombinator.com/item?id=1438472)

------
nindalf
41. You share a surname with your father.

This is not true of many people from South India. My last name is my father's
first name. So my sibling, mother and I share a last name, but my father has a
different one (his father's first name).

Shoutout to the consultant who was helping me fill a visa application and
helpfully "fixed" my typo.

------
implying
I have a hyphenated last name and I'd say 1/4 of online forms do not validate
when I enter my name.

When a system has an account created for me and doesn't accept logging in with
my name, it is always a guessing game to see if they normalized it with a
space, an underscore, removing the first or second name etc.

------
JustSomeNobody
Nitpick. I don't think most developers "believe" most of these things. I think
most developers would rather wire up some simple code to handle names and move
on to (what they feel is) more interesting logic so they end up using 40 chars
of ascii for names and leave it at that.

------
fredley
Also somewhat related: Computerphile's video on dates and timezones.
Programmers might be tempted to make similar assumptions about dates, times
and timezones:

[https://www.youtube.com/watch?v=-5wpm-gesOY](https://www.youtube.com/watch?v=-5wpm-gesOY)

------
jvanderbot
The worst offense is two-word last names. If you give me one field that says
"Full name" I may as well roll the dice to determine if the three strings I
input in there are First Middle Last, First Last1 Last2, First1 First2 Last,
etc.

------
kebman
I always wanted an exhaustive list of name falsehoods. The worst part is
probably that I read it with interest... I have no life!

~~~
Piskvorrr
Exhaustive? Ha, this doesn't even scratch the surface! How about "There is a
set of strings that never refers to a real person, so treat 'Christopher Null'
as test data."
[https://www.wired.com/2015/11/null/](https://www.wired.com/2015/11/null/)
(Also, talk about timing: "'Abcde' is not a real name, either"
[https://www.bbc.com/news/world-us-canada-46393501](https://www.bbc.com/news/world-us-canada-46393501) )

~~~
steve_gh
Let alone "Little Bobby Tables" :-)

~~~
Piskvorrr
Some years ago, a certain Mr. O'Neil had something to do on a site of ours,
which made certain assumptions. He couldn't get through to our contact form -
in the e-mail, he was...outspoken.
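
I won't pretend to remember exactly what broke, but assuming the usual culprit (a name spliced straight into a SQL string), here is a minimal sketch of the fix using Python's built-in `sqlite3`. The table and column names are made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, surname TEXT NOT NULL)")

surnames = ["O'Neil", "Null", "Robert'); DROP TABLE contacts;--"]

# Placeholders let the driver handle quoting; the name is passed as data
# and never becomes part of the SQL text.
conn.executemany("INSERT INTO contacts (surname) VALUES (?)", [(s,) for s in surnames])

# Read the rows back in insertion order.
stored = [row[0] for row in conn.execute("SELECT surname FROM contacts ORDER BY id")]
```

All three come back exactly as entered, and "Null" is stored as an ordinary four-letter string rather than a missing value.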

