My name causes an issue with any booking (stackexchange.com)
812 points by signa11 26 days ago | 369 comments



I have a suffix (the Roman numeral Ⅳ), and it causes all sorts of problems. Some sites will have me "prove" that I'm me by asking questions about "my" credit history, and very often I'll get my father's. Half the time I've already supplied a SSN… which makes it even more appalling that they can't get this right.

I've also been issued a driver's license for a "4TH". I have no idea how TSA would ever spot a fake. (Since they don't flag me! But I'm also in a demographic that tends to get passes in the security theatre…)


My wife is from Myanmar, where most people only have one name (no first, last, etc.) and has experienced endless frustration since immigrating here to the US. It's been quite difficult for her and even limiting in some ways and she's broken down in tears more than once.

Some places just put a few letters into each field (like say for the name Jessica, first: Jes, middle: si, last: ca, or something like that). The DMV did that, and then listed her name on her license as <last>\<first> <middle initial>. Others have insisted on putting "nosurname" in the last name field. The immigration people put "FNU" as first name, and her given name in the last. At some places she's put her name twice, once in the first, and once in the last.

Not anything close to what she's had to endure, but my name doesn't fit the standard mold either. I prefer to use my middle name and really dislike using my first name (shared with my dad, who I don't have a good relationship with). I'm endlessly having to explain every single stinking time I interact with pretty much anyone.

Anyway, the take away is, please please please (!!!) don't make assumptions about people's names! Ideally just one field labeled "name", and let the user interpret that as they see fit. If you need to collect a legal name then you need to validate it anyway. If you really must do first name / last name then at least make the last optional and also include a field for "what should we call you" or "nick name" or something.

Great ted talk about Myanmar names: https://www.ted.com/talks/cynthia_ma_shwe_sin_win_not_good_w...


> I prefer to use my middle name and really dislike using my first name (shared with my dad, who I don't have a good relationship with). I'm endlessly having to explain every single stinking time I interact with pretty much anyone.

This sounds dramatically overstated. Going by your middle name is normal and doesn't require much in the way of explanations. For example, my father goes by his middle name. This has led to "problems" all of one time -- when he worked for the military, they insisted on the first name. So, during that period, he used the first name.


It really depends on where. Some places you'll have no trouble and others you will. Whether or not you'll have problems depends on what you want to do.

I've had the opposite problem in Japan. Virtually everywhere insists that I use the name on my (Canadian) passport as my official name. It's listed as "Lastname Middlename Firstname". Some government offices can handle 3 names (Yay). Some government offices understand the order for Canadian names and, since they often can only handle 2 names, will record my name as "Lastname Firstname" (in Japan, family name comes first). Other offices don't understand the order and assume that my single given name has a space in it. Since they can't handle spaces in their software, they list my name as "Lastname Middlename". No amount of explanation will dissuade them, as they have to do it the way they've been told.

So now I've got 3 official versions of my name in Japanese official databases (4 if you count the version that is truncated because my name has too many characters). Luckily none of their systems talk to each other ;-) -- though I had one heck of a time getting my "My Number" (similar to social security number) registered because of the confusion. I feel sorry for any Portuguese people (who often have a lot of names) who live in Japan ;-)

The situation in the US is not nearly so bad, but I've definitely heard of problems before.


I know it's a minor annoyance, but it's something that jades me every single day and gets really old after a while. I feel companies and websites should do a better job of being respectful. A "what should we call you" field is simple and easy, would make the service much more friendly to some people.


So do you dutifully put your real first name into all forms? I know people who always go by shortened versions of their name (e.g. Rob instead of Robert). They just put in their preferred name as their first name for everything (uni, bills, etc.)

You can't blame companies for using the name they give you. Are you worried about being accused of fraud or something?


That was very informative. Thank you for sharing this (and useful for my day work).

I know that western naming convention is just that, a convention, but it does help to see how other cultures handled it.

Apart from all this, it is always interesting how a given system deals with exceptions.


Is there a reason your wife didn't adopt your last name?


It's not something Burmese women are used to doing, and if you think about it it's kind of a weird holdover from a much more patriarchal time. But, if she'd known how difficult it would be I think she would have anyway.


Tons of immigrants at Ellis Island did not have last names, and were assigned them on the spot, or made them up on the spot. Sometimes with humorous results.

I imagine it would be extremely difficult to operate in the US without a last name. She doesn't have to adopt yours obviously, but she could pick one.

For example Osama Bin Laden. Bin is not his middle name, and Laden is not his last name. It means Osama son of Laden. But his relatives in the US use "BinLaden" as if it were a last name.


In other words, "I dislike that name, as it doesn't fit the One True Way Of Naming. Here's a Real(tm) name for you." How very nice of you /s

Or, as the article says, "39. People whose names break my system are weird outliers. They should have had solid, acceptable names, like 田中太郎."


That's like saying, I don't like having an @ symbol in my email address, fix it, make it work the way I like it.

Sometimes you just gotta make accommodations to where you live. Having a last name for legal purposes doesn't really change anything about you.


Quite the opposite, IMNSHO: "I don't like people using + in their emails, ban that! Everyone should have an address [a-z]{3,}@[a-z]{3,}\.[a-z]{2,4} - anything else is HERESY! You gotta make accommodations, else you'll break my overly restrictive assumptions."
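
To make the point concrete, here is a small Python sketch that runs the pattern quoted above against a few addresses. The sample addresses are made up for illustration; only the regular expression comes from the comment.

```python
import re

# The restrictive pattern from the comment above; fullmatch anchors it
# to the whole address.
RESTRICTIVE = re.compile(r"[a-z]{3,}@[a-z]{3,}\.[a-z]{2,4}")


def accepted(address: str) -> bool:
    """Return True if the restrictive pattern accepts the whole address."""
    return RESTRICTIVE.fullmatch(address) is not None


# Hypothetical sample addresses, all syntactically ordinary.
samples = [
    "bob@example.com",       # fits the pattern
    "bob+news@example.com",  # "+" tag in the local part
    "al@example.com",        # two-letter local part
    "bob@example.museum",    # top-level domain longer than four letters
    "bob@mail.example.com",  # subdomain
]

rejected = [a for a in samples if not accepted(a)]
```

Only the first sample passes; every other address ends up in `rejected`, even though none of them is unusual.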


You mean like when you move to a country and are oppressed into having to write addresses on envelopes in the way the local postal service expects it?

If email systems were a national thing, and I'd move somewhere where they didn't accept "+" in email addresses for, say, official government communications, then .. yeah, I'd have to get a new email address.

Personally I find it much more annoying if people mispronounce my name (which happens in a personal face-to-face setting) than if I had to write my name down in a certain way on a form, to make it work with the system. It's just an identifier for the system. I wouldn't get the same SSN or phone number either--which is just about as silly to expect.

It would be nice if they had standard ways of dealing with names that do not fit in such a system, so that at least the variant to make it work would be the same everywhere else. But to demand it to be taken verbatim and work correctly in the system, that's like moving to another country and demanding you keep the same land line number.


If there was a specific postal service for each building, each with their own set of requirements, you mean. And each, of course, insists that theirs is the One True Way. Yeah, that's a nice world to live in: change yourself ten ways from Sunday to fit into various, mutually incompatible and completely arbitrary rules; all so that the original developer making the rules up on the spot would have an easier job.


Presumably, if she wanted to change her name to avoid these problems she'd have done that independently. Most people would not want to change their name because of badly programmed systems.

Also, many people (and many cultures) consider adopting your husband's surname to be weird.


Just whatever you do don’t put ‘null’ into the fields lol


Or "DROP TABLES"!
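
Joking aside, both cases stop being a problem if the name is passed to the database as data rather than pasted into the SQL text. A minimal sketch using Python's built-in sqlite3 module, assuming a hypothetical `people` table:

```python
import sqlite3
from typing import Optional

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")


def add_person(connection: sqlite3.Connection, name: Optional[str]) -> None:
    """Insert a name using a bound parameter, never string formatting."""
    connection.execute("INSERT INTO people (name) VALUES (?)", (name,))


add_person(conn, "null")                            # the literal string "null"
add_person(conn, "Robert'); DROP TABLE people;--")  # stored verbatim, not executed
add_person(conn, None)                              # a genuinely missing name

rows = [row[0] for row in conn.execute("SELECT name FROM people ORDER BY rowid")]
```

The string "null" and a real missing value (`None`) stay distinct, and the table is still there afterwards.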


I have the suffix III, which often gets written iii, and sometimes gets attached to my last name, so they just say it and then an accented "ee" sound on the end. There have even been other variations, but those have all been one-off exceptions.

Even at the government level, identity isn't solved. I have two government issued IDs, both current, that contradict each other and once barred me from getting on a flight. I also have to visit the IRS in person regularly to confirm my identity. I had multiple issues with standardized tests when I was younger, including one instance where they both wanted to delete my score because they thought I couldn't fill out my name correctly and they also wanted to give me an award (which ultimately made me a National Merit Scholar) - the best part is because of the confusion on my name they had a hard time figuring out whether or not they were supposed to give it to me (as in they thought I both did and did not exist in their system due to the naming error).

The above are just a handful of examples, there are many more and those aren't even necessarily the most extreme. The only thing I know for certain is my children will not share my name, or have any weird shit attached to it.


I have a hyphen in my last name that caused the California DMV to make one of my last names a middle name, and the Social Security Administration can’t verify my name on their website also due to it.

I feel like it’s time for software to level up. OK, 30 years ago, sure, mistakes were made, but now, if you live on planet Earth, you have to know how names work, after however many thousands of years our current systems have been in place.


"how names work"

part of the issue is people writing software making decisions about 'how names work', and there being multiple interpretations of it. I've wanted - many times - to build in just 'name' in systems, vs "first name", "last name", "middle name", "suffix", etc. Because... inevitably, clients have to support someone that doesn't fit that mold. The end user probably has dealt with it dozens of times already, but it's still bad for them, and usually unnecessary. MOST of the time, we only ever take "first" and "last" and concat them on the screen anyway, then keep them separated for someone to sort via excel...


It's actually interesting that frameworks provided by platforms such as .NET, Java, etc., don't include an abstraction for the representation of names.

Such abstractions exist for dates, times, calendars, currencies, calculations with money, and so forth, but not names.

On the one hand I can understand it, because names are so complicated, and how would you sit down and come up with something good enough to represent all of them?

On the other hand they're prevalent in such a high percentage of line of business and consumer facing apps that it's almost ridiculous that every single developer on the face of the earth at one time or another has to come up with their own half-baked implementation.

It's especially ridiculous when you consider that so many of these home-rolled implementations, if not all of them, are rife with terrible flaws that constantly cause frustration and inconvenience to a small but significant number of users.


This is a solved problem from a modeling standpoint. The HL7 Reference Information Model allows any entity (such as a person) to have multiple names. Each name can be tagged with a type (legal, maiden, alias, etc) and validity date range. A name can contain multiple parts in any order, optionally tagged as prefix / suffix / family / given. Names can also be explicitly marked as null if unknown or not assigned. There are open source RIM implementations in several languages.
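
A rough Python sketch of the shape described above, for readers who want something concrete. The class and field names here are invented for illustration and are not the actual HL7 RIM attribute names.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class NamePart:
    text: str
    kind: Optional[str] = None  # "prefix", "suffix", "family", "given", or None if untagged


@dataclass
class PersonName:
    parts: List[NamePart] = field(default_factory=list)  # kept in the order entered
    use: str = "legal"                  # e.g. "legal", "maiden", "alias"
    valid_from: Optional[date] = None   # open-ended when None
    valid_to: Optional[date] = None
    null_reason: Optional[str] = None   # set when the name is unknown or not assigned

    def display(self) -> str:
        """Join the parts in their stored order; no reordering by kind."""
        return " ".join(part.text for part in self.parts)

    def valid_on(self, day: date) -> bool:
        starts_ok = self.valid_from is None or self.valid_from <= day
        ends_ok = self.valid_to is None or day <= self.valid_to
        return starts_ok and ends_ok


@dataclass
class Person:
    names: List[PersonName] = field(default_factory=list)

    def name_for(self, use: str, day: date) -> Optional[PersonName]:
        """Return the first name with the given use that is valid on the given day."""
        for name in self.names:
            if name.use == use and name.valid_on(day):
                return name
        return None
```

A single-part name is just a `PersonName` with one untagged `NamePart`, so nothing in this shape forces a first/last split.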


Interesting; I didn't think I'd see an HL7 reference in this thread. I work somewhat with FHIR, which also has a HumanName[1] data type, and I think it handles most of the cases in this thread.

For those not familiar: FHIR is a standard that covers health and patient data. IMO, it's a pretty good model. (HL7 is the organization, and there are few other standards under it.)

I'm less familiar with RIM; could you link to its definition of a name? (The best I could find suggested that it was nothing more than an unconstrained piece of text.)

[1]: https://www.hl7.org/fhir/datatypes.html#humanname


Unfortunately due to the way the HL7 web site is structured there's no way to give a direct link. Go to the Normative Edition, Foundation, Data Types, Basic Types.

http://www.hl7.org/implement/standards/product_brief.cfm?pro...

The FHIR data model is a little simpler to allow for easier implementation. In the vast majority of real world healthcare use cases it works well. But from a modeling standpoint if you need to cover odd edge cases it sometimes helps to look back at the old RIM.


What's the purpose of structured modelling of names? Why does it matter which parts are family vs given vs whatever else? Lots of the W3C International Examples they give use a `<text>` field. Why not just use that?

https://www.hl7.org/fhir/datatypes-examples.html#2.24.1.13.1


It matters for collation order, if you want to show a list of patients sorted by family name.


Is this a current use case, or did that just make sense 100 years ago? (As in these assumptions: "tracking actual relationships in data is Hard, and family name correlates strongly with real-world relationships") I mean, it sort of works even today, but neither of the assumptions are as strong any more.


It's still a current use case for healthcare provider workflows. Also helps a lot when doing automated record linkage between different systems.


A dedicated, well-maintained name abstraction is certainly something that needs to happen. More than interesting, it's a bit bizarre that this hasn't been done yet (AFAIK.)

In terms of developer-facing complexity, this could be a laughably simple thing to use- just a type that supports equality, perhaps ordering, and conversion to string. Only the constructor would need to be complex. :)

I guess the reason this hasn't been done is simply that the implementation would never be "correct"- there is no formal specification of human names out there, and there would always be cases where some poor individual with an unusual name falls afoul of the system. Strictly fewer cases than we have today, where everyone rolls their own name system, but still some; it's not a solvable problem the way timezone conversion is.

But, on the other hand, that's the way the world is- messy. Developers are going to have to learn better best practices for communally doing our best in cases where there is no perfect answer, because there's only going to be more of such cases as tech continues to eat the world.


I actually attempted to do this.

I used ML techniques to help smooth over some of the difficult parts (there are many difficult parts). The hardest cases are ambiguous names, for instance delineating Hispanic vs. Puerto Rican naming conventions (they're different). The fundamental approach involved pushing all ambiguity up to the end user, so they always have the option to correct the system.

https://www.alphanym.com/demo/?jm2


I’m pretty sure it’s solvable. The main problem is that we break up names into first and last to identify the parts, and we do bad data quality checks. Let’s say we did just have one field and a service that was trained on each country’s variations that could return the parts of the name you wanted. So some database and detection system to understand the pattern. It’s definitely possible since we humans do read and understand names just fine in our own locales.


I'm not convinced it is solvable. I don't think the general case is reliably solved even by humans.

E.g. is Carlos the same person as Karl?

Well, that depends. Was one of them localized, or are these the actual given names? Just this weekend I was on an offsite and saw a Spanish book about Karl Marx, or Carlos Marx as they had written it in the title, in the library of the house we stayed in.

Clearly in this instance the names are the same, but that requires knowing that Carlos Marx maps to Karl Marx and that Karl Marx is a famous name; otherwise you can't assume the name was translated.

There were many other names on books in that library. I don't know which one of them - if any - also maps to someone known under another name, because that requires me to know which person they are about.

Is Curt, Curth, Kurt the same person? My uncle had all three on different documents, and delighted in telling people about it.

What about countries like the UK, where there is no legal requirement to notify anyone of a change of name, and where the legal way of formally changing your name, a "deed poll", is just a document structured in a certain way where you assert that you are known under a certain name? My ex is known under at least three different name combinations, all of which are present on different sets of legal documents.

Some subsets of the issue are solvable, but for example there is no way of taking a full name and returning the "name this person prefers to be known by" because the name does not contain that information. You can make a pretty good guess.

But you'll fail dramatically for people from different countries. And don't think for a second you can guess correctly based on where a name is from - many names are used in different countries, and often as different elements (e.g. firstname one country, lastname another; feminine name one place, masculine another), and many people have names that combine different nationalities (e.g. my son has a name that combined an English firstname, a Nigerian middle name and a Norwegian last name).

The only reliable solution is to not assume any one single string can be used as a generic name - you need to ask what to use within a given context and within whatever constraints you have.


It would help if he quit using the First|Middle|Last| terminology and used Surname(if-any)|Personal-name #N|Personal-name #N+1(ifany)...


It would help if we quit using that and had one field for the full name and then another one for "how do you want to be addressed".


I've in the past written stuff that generated an index of names; this was sorted. While you can certainly sort on a free-form text field, the culture here is that name indexes are generally sorted by family name. So, to do that, you have to have some understanding of what the family name is, which a free-form text field does not give you.

But a lot of software has no need for the breakdown, and would be better served by a free-form field.
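
One way to get a sorted index without parsing anything is to ask for the sort string explicitly. A small Python sketch, assuming a hypothetical optional `sort_as` field ("file me under ...") that falls back to the full name when left empty:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    full_name: str                 # free-form, exactly as entered
    sort_as: Optional[str] = None  # hypothetical user-supplied index string


def index_order(entries: List[Entry]) -> List[Entry]:
    """Sort by the user's own sort string, else by the full name.

    casefold() only gives a case-insensitive comparison; it is not
    locale-aware collation.
    """
    return sorted(entries, key=lambda e: (e.sort_as or e.full_name).casefold())


entries = [
    Entry("Ada Lovelace", sort_as="Lovelace, Ada"),
    Entry("Thiri"),                                 # single name, no sort string
    Entry("Björk Guðmundsdóttir", sort_as="Björk"),
]
ordered = [e.full_name for e in index_order(entries)]
```

The software never has to decide which part is the family name; the person who owns the name does.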


Then at least be upfront with the user about why you're asking, because they might well answer differently (e.g. include a different number of parts of their last name) if they know your purpose is to sort the name than if they think it's being used for a different purpose. They might even give totally different names.

That's the most important part of the comment above: The concept of a name is so overloaded that unless you ask about the string to use for the specific purposes you intend to use it, then there is very little you can do with it.


We already have an abstraction for names: a text field. Trying to be more clever than that will break for someone's name.


Which is great until the client asks for a friendly name (given name) identifier. GitHub uses name, we use first/last name. So we just shove the GitHub name in the first name spot and ask people to organize what looks right.

Our partners suggested string.split(' '), which produced interesting results against the sample list of github users.
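
For a sense of what "interesting" looks like, here is a Python sketch of that kind of split. The helper and the sample names are illustrative only (some are taken from elsewhere in this thread) and are not the actual list of GitHub users.

```python
from typing import Tuple


def naive_first_last(name: str) -> Tuple[str, str]:
    """Split on single spaces; treat the first piece as first name, the last as last name."""
    parts = name.split(" ")
    first = parts[0]
    last = parts[-1] if len(parts) > 1 else ""
    return first, last


results = {
    name: naive_first_last(name)
    for name in [
        "Thiri",                 # one name only: last name comes back empty
        "Osama Bin Laden",       # "Bin" is silently dropped
        "Ludwig van Beethoven",  # so is "van"
        "田中太郎",               # no space at all
        " Ann Lee",              # a stray leading space yields an empty first name
    ]
}
```

Every input except a plain two-word name loses information or produces an empty field.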


Use two fields: name and display name. Anything else will break for someone. In most cases where something asks for my name it's really not even necessary, and certainly not necessary to split it into first/last/display/friendly etc.


This is not sufficient if you’re going to localize to languages other than English. In some languages, proper names get declined like other nouns and thus change spelling in different contexts.


Are there examples where splitting it in First/Last name (or any other split) would help with that? It seems like that would always either be a problem or something to design around while localizing.


It’s fundamentally language specific, so any comprehensive solution is going to need to interface with the localization system. Really, names should be keyed by (user, language, tag) triples, where the localization defines the acceptable tags based on language requirements. For example, a single person may need their name stored as:

  en:disp     Eric
  is:disp:nf  Eiríkur
  is:disp:þf  Eirík
  is:disp:þgf Eiríki
  is:disp:ef  Eiríks
Designing a UI to collect this information is left as an exercise for the reader.
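
The storage side, at least, is easy to sketch. Below is a minimal Python version of the (user, language, tag) keying using the entries above; the user id `u1` and the fallback to the English display form are assumptions added for illustration.

```python
from typing import Dict, Optional, Tuple

NameKey = Tuple[str, str, str]  # (user id, language, tag)

# The table from the comment above, stored for a hypothetical user "u1".
names: Dict[NameKey, str] = {
    ("u1", "en", "disp"): "Eric",
    ("u1", "is", "disp:nf"): "Eiríkur",
    ("u1", "is", "disp:þf"): "Eirík",
    ("u1", "is", "disp:þgf"): "Eiríki",
    ("u1", "is", "disp:ef"): "Eiríks",
}


def lookup(
    user: str,
    language: str,
    tag: str,
    fallback_language: str = "en",
    fallback_tag: str = "disp",
) -> Optional[str]:
    """Return the exact form if stored, else the fallback display form, else None."""
    exact = names.get((user, language, tag))
    if exact is not None:
        return exact
    return names.get((user, fallback_language, fallback_tag))
```

The localization layer would pick the tag (for example the dative form in an Icelandic sentence) and the lookup degrades to a plain display name when that form was never collected.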


Since asking a user to enter every possible variant is unwieldy and without doing so, the problem is unsolvable, I’ll stick with my original suggestion of using a single field. Yes, some names will break, but far fewer than if first and last names are required.


A lot of software would be much better without stupid clients asking for the wrong things.


Eh, I understand it. They want to put some name in the identity control in the header. Putting the full name in is guaranteed to go wrong. We might start asking for a nickname for that purpose.


Why is the full name guaranteed to go wrong? A better assumption is that there is no reasonable nickname and you have to use their full name.


You're assuming that you're 'friendly' enough with your users that they want you to use their given name.


Oh yeah that touches a nerve. I hate it when I have filled in my name and email somewhere, maybe I forgot doing it, and I receive a newsletter, which opens with and greets me with just my first name.

No. Just because I signed up for your mailinglist doesn't mean we're on a first name basis! I hardly even remember your business exists, do you remember me? Anything else about me except my email and my full name, from which you deduced my first name? Then let's not pretend we're buddies.

This of course depends on how "friendly" I'm willing to be with said business. Which differs if it's an Etsy store, ordering food online, my bank, insurance, etc. I especially hate it when the news letter is in fact 99% ads and promo babble, but has this 1% of useful info that I want to be kept up to date on. We're not close, I'm letting you spam my inbox, call me "your grace" or something.

Can you actually go wrong with just using someone's full name, and erring on the side of being a tad too formal? Is this just a problem with marketing companies that want to "connect" and become "buddies"?


This already breaks names which inflect.


Which isn’t really a problem you can solve. If names change in the context that they’re used, it will always be broken, so why break more names by trying to be clever?


I'm not saying you can solve it. A single text field isn't a solution either. You cannot avoid breaking some names.


A single field breaks fewer names than forced first and last names, though, and is a simpler implementation too. Plus, as long as you accept any input (besides blank, I guess), then the only way it will be broken is during display and at least the user sees the exact name they typed in, exactly how they typed it in.


A single field also breaks the expectation that people do not get called by their full names in every interaction. This is a very common expectation, and violating it makes you sound subtly more like an evil robot.

Is this a lesser offense than mangling a name that doesn't cleanly split into first/last? At the individual scale, probably.

The impact, in aggregate, on UX/sales/utility? Could definitely go either way depending on your userbase.


That's not an evil robot, it's a polite robot.

Better than the slimy feel I get when a robot calls me just by my first name. You're a robot, we're not buddies.


Actually I think it should be ok, because you’d enter the name in the nominative case and then when you write it on the screen you’d decline it based on the language you’re displaying it in, which would be the same for every name regardless of its origin.


Tried that once. Horrible, terrible, no good idea. (The only rule you can be sure of is "there are countless exceptions, and exceptions from the exceptions", everything else is a minefield in a quicksand) Asking for "how should we address you" is far easier, even if a few users fill in "Your Galactic Imperial Majesty".


What’s this? A name that changes when you are talking to someone directly? In some Slavic languages this happens.


Usually the name would change according to the full rules of noun inflection in whatever language. In Latin, a noun has 6 cases, of which vocative (indicating direct address to the noun) is one.


Irish has a vocative case that can modify names, and is an official language of a UN member state.


Because it’s extremely diverse between cultures and countries how names work. Here in Germany it’s typical to have several first names and it’s legal to use any of it, even though it might be just the name of your godfather/godmother.

FHIR has a relatively general definition for names, but multiple general and country-specific extensions exist for it: https://www.hl7.org/fhir/datatypes.html#HumanName


It's a string. Don't even try anything else.


Exactly this, and I really don't see what the fuss is all about in the above comments.


The idea that a person's name should be parsed and managed by software is amusing to me. How about just getting rid of concepts like "last name" and "first name" (which already embed a lot of cultural assumptions), and only ask for a "full name"? In some countries people don't have both first and last names. In some countries the last name customarily comes before the first name. In some countries the structure of names is more complicated and the son's name includes a copy of his father's name. I don't think software will really handle all these oddities correctly, given that just a single parent can undermine all the system's rules by choosing an unconventional name for their children.

For what it's worth, in Singapore, which has significant Indian, Chinese, and Malay populations but is also highly westernized, the government identity card provides just a single full name. Parents can choose their children's names in accordance with their culture—or not. You can put your first name before your last name, after it, or surrounding it. Or include your father's name if needed.



By ignoring the structure in data acquisition phase, you just postpone decisions about structure to data processing phase, now without necessary information about the structure (which could be obtained in data acquisition phase).

For example, such basic functionality like changing sort order between given name and surname would be much more complicated.


Besides, the whole story is about the problem that name including title was in one field and later processing (title removal) misbehaves due to insufficient information.


Is it important to parse names like that? You can just do a search by substring in most cases


It sounds like all you really need to handle names reliably is to ask for the entire name in one field, then have another field for their preferred name (which could be the first name, or the middle name, and a diminutive). And if you need to do something more formal with a title (like Mrs Lastname), potentially have that as a third field.

Sometimes the dumb solution is better than trying to be clever, and it saves some trouble with localisation.
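
As a rough Python sketch of that three-field shape (the class and field names are made up for illustration):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NameRecord:
    full_name: str                        # stored exactly as typed, never parsed
    preferred_name: Optional[str] = None  # "what should we call you?"
    formal_address: Optional[str] = None  # e.g. "Mrs Lastname", only if needed

    def __post_init__(self) -> None:
        # The only validation: the full name must not be blank.
        if not self.full_name.strip():
            raise ValueError("full_name must not be blank")

    def greeting(self) -> str:
        """Informal contexts: preferred name if given, else the full name."""
        return f"Hello, {self.preferred_name or self.full_name}"

    def salutation(self) -> str:
        """Formal contexts: the formal address if given, else the full name."""
        return f"Dear {self.formal_address or self.full_name}"
```

Both optional fields fall back to the full name, so someone who fills in only one field is still addressed by exactly what they typed.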


W3C recommends the same pattern: [0].

[0]: https://www.w3.org/International/questions/qa-personal-names


> Sometimes the dumb solution is better than trying to be clever

It's astonishing how often this turns out to be true, which has been probably the single most important lesson of my career. I think it's that clever solutions tend to depend on more assumptions, which rarely have P(true) = 1.


"First" and "last" is a wrong representation anyway, because it assumes people always write their names in that order.

That breaks for Chinese, Japanese, Korean, and probably multiple other types of names.

s/first name/given name

s/last name/surname

But this has its own issues, like assuming people have either a given name or a surname in the first place.


Also hard to decide which is the surname to use with some names/cultures.

My native country, Norway, went through an assimilation period of standardising surnames a few hundred years ago. Before that your name often was in 3 parts:

First name(s) - father's name - farm/manor/village.

So names were something like "Ivar Ragnarsson of Torp" or "Sverre Haraldson Bjerkeli". (With the -son bit to say whether a son or daughter).

With assimilation into standard more Continental Christian Danish society and most likely standard registration for tax - people dropped either the farm name or the father's name in their names. And froze the father's name in the surname in future generations. And changed the -son to a more Danish -sen for all genders. So, since the 1700s people have just 2 parts to their names. Unlike Iceland which has kept the naming tradition.

However,... what is common again today is to have 2 surnames. One from each parent. Unhyphenated. Similar to the Spanish convention (first-name - father's surname - mother's surname) but not as standardised, and mostly opposite order with father's surname at the end being the official family surname. And that makes internationalised computer systems so complicated.

My children have both our surnames, both by choice and necessity so either of us can get through passport control with them. (mother's surname - father's surname). But they had to have their surnames hyphenated to be able to register their births and British passports. Which still angers me today as my family convention of the latter surname being the main one is now mostly ignored.


How does that work after 2 generations? Wouldn’t you end up with names like this, and longer afterwards? Which ones are carried on?

Bob Jones Alexander Richardson Hill


People are free to choose what they want but you mainly keep only the "main" surname from each parent.

I know in England there was a tendency of people keeping both names of powerful families[1][2], then as double-barreled surnames. Which then sometimes went a bit nuts a few generations later if they married into other double-barreled families [3].

I think it was when I visited Stowe School, the seat of the Dukes of Buckingham and looked at the family tree, that I even saw some surnames repeated if they married into other families which shared one of their multi-barreled surnames...

[1] https://www.theguardian.com/lifeandstyle/2017/nov/02/keeping...

[2] https://www.telegraph.co.uk/family/parenting/are-we-heading-...

[3] https://en.wikipedia.org/wiki/Richard_Temple-Nugent-Brydges-...


I don't know about the OP, but in Quebec this is fairly common. Usually you have 2 surnames until you turn 18 and then you choose one of them. I really like the idea of this, personally.


Myanmar is another one, almost everyone has just one given name unless they are specifically following the western or some other style (and that's rare in my experience). The last name field should always be optional at a minimum.


> "First" and "last" is a wrong representation anyway, because it assumes people always write their names in that order.

I do not see this as a problem. If I know English well enough to fill in an English-labeled form on an English webpage, I would also have enough cultural knowledge to translate first name to given name and last name to surname.

> But this has it own issues, like assuming people have either a given name or a surname in the first place.

This is only a problem if the form validates that both fields must be non-NULL. Problem is not with the split itself but with the validation code.
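A minimal sketch of validation that keeps the split but doesn't demand both halves. The helper name `validate_name` and its parameters are made up for illustration; the only rule is that at least one field has content.

```python
from typing import Optional


def validate_name(given: Optional[str], family: Optional[str]) -> bool:
    """Accept the form if at least one name field has content.

    Hypothetical helper: neither field is mandatory on its own, so a
    person with a single name can leave the other one blank.
    """
    given = (given or "").strip()
    family = (family or "").strip()
    return bool(given or family)
```

With this, someone with a single name can fill either field and still pass, while a completely blank form is rejected.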


I know a Romanian online and his surname comes first... It’s a historic IT clusterfuck to assume all names are firstname, lastname


Worked on some business software previously, and customers insisted on first/last or first/middle/last, despite the fairly obvious issues. They also demanded address fields in a US style despite needing to support international addresses (I still have no idea how their staff handled that).

People want to follow the conventions they know, even apparently if they're told it will cause issues.


Yet in reality, nobody actually needs a name that's sorted by surname: that's a holdover from paper phone books. We have search, and we have stable sorting algos. Every requirement "sort by surname" I've ever seen turned out to mean "sort the names in a predictable way, btw this is the way we always did that, because we always did that."

(Yes, familiarity is a part of UX; but do note that this one specifically is a historical, not intrinsic, motivation)


We used to have problems with non-ASCII chars in names, we fixed that with UTF, we had problems with currencies and numbers, we made libraries that understand locales and even directions of writing, time zones same thing. So it’s time for us to resolve names now with standard libraries that have been thought through like the above.


While I'd probably run in to edge cases, it would be nice to actually point to a standard and say "the libraries all support standard XYZ. that's built in - doing it any other way is going to mean problems ABC and cost $$".


I worked at a place that required a middle name. I don't have one, so at their insistence I picked a name, and "Xavier" became my middle name.


Credit card forms handle this perfectly. There is just a name field. It seems odd that this is perfectly acceptable for financial companies that in essence loan billions a year, but it's not OK for anyone else.


I have a similar situation, it's so frustrating. I couldn't buy insurance from Progressive because Lexis Nexis thinks I'm my father and it was too much of a hassle to resolve. Simply because I'm a Jr and we lived at the same address (obviously) for about 18 years.


I still have problems with this because I have the same name as my dad. The worst was when my bank account was locked because his boss at the time had been taking taxes off their cheques but not paying the government. For some reason they called my bank and locked my account thinking I was him (at the time they thought he wasn't paying taxes; it was all worked out eventually) despite our birth dates being nearly 30 years apart.

It happened without any explanation. I had to go to the bank and ask why I couldn't use my debit card. They explained the government had a hold on my account. I ended up having to prove I was not my dad with a bunch of pieces of ID. It took like half a day to get access to my account restored. I was 20 or something at the time. There was no way I could be mistaken for someone in his 40's.


>There was no way I could be mistaken for someone in his 40's.

But when designing automated systems, you can't add in a condition of "unless person doesn't look their age". If one can't use names as a unique identifier, then that leaves a number issued at birth by a central government. But as I understand, even SSNs aren't unique and get re-used.

As an aside though, cultures that use the same name for multiple people in the same family confuse me. To me, the purpose of a name is to identify, so what is the point of naming someone the same? Perhaps, historically, it was a way to establish credibility before the time of credit reports and phones.


> “As an aside though, cultures that use the same name for multiple people in the same family confuse me.”

i’d wager their desire for affiliation is stronger than their desire for differentiation. individualism isn’t as valued in many cultures.


SSNs are not (yet) reused. There's ~900 million potential SSNs, we've run through half, and are using ~5.5 million a year, which gives us at least another half a century before we have to start reusing.
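The arithmetic behind that estimate, as a quick sketch. All three inputs are the approximate figures quoted in the comment above, not official numbers.

```python
# Figures as quoted in the comment above; all are approximate.
potential_ssns = 900_000_000   # "~900 million potential SSNs"
used_fraction = 0.5            # "we've run through half"
issued_per_year = 5_500_000    # "~5.5 million a year"

remaining = potential_ssns * (1 - used_fraction)
years_left = remaining / issued_per_year
print(round(years_left))  # 82
```

So "at least another half a century" is a conservative reading of those numbers, assuming the issuance rate stays roughly flat.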


There are all sorts of reasons.

In some Irish families (mostly older people now), people had a Gaelic derived name and a legal English name, because the authorities banished Gaelic names. When you do genealogy, it’s very difficult to track people in certain circumstances as spellings and references change.


That happens in areas of Brazil that had German immigrants. They had their German-style names, but at one point, Brazil forced them all to adopt Portuguese-style names.


I'm surprised it was so quick.

Normally when I discover things like that which require some interaction with bureaucracy at some level, I find out on a Friday afternoon at 4pm, have to wait until Monday, then things are unresolved for days because people are 'backed up' on Monday...


I actually wrote an API for handling personal names, because software mangling people's names irked my pedanticism. The fundamental takeaway is that names are ridiculously complex, equivalent to any other part of natural language. For every rule you could contrive there are exceptions, and many more legitimately ambiguous cases.

You shouldn't ever use first/last name fields, because they force users to adapt around your system (many names don't follow this structure). A long unstructured text field is best, because it can accommodate nearly anyone whose name can be spelled with Unicode. Finally, always check your interpretation of a name with the person in question, seeing as they're the end authority.

https://www.alphanym.com/demo/?jm1


I also have a Roman numeral suffix and experience the exact same issues all the time, it's insane! I understand what my parents were going for with the suffix, but that tradition will end with me if I have kids.

The credit agencies also started attributing debt to my name from someone with a very similar name and social. Of course they don't like paying their loans and so every once in a while my credit tanks while I go through the process to get it cleared up. As far as I've been able to find there is no long term solution to this, I just have to deal with it every couple years.


My first name differs from my sibling's by two letters (and same last name), and I often get credit history "verification" questions about him. I mean, they're two very different names, and whatever BS company it is that's behind this is complete crap.


I can’t use online question verification systems.

The people I bought my house from have the same last name as me. So every time I do one of those, it asks me questions about them. I know none of their information as I never met them.


My wife has the same maiden name as her mom, with a middle initial that is one letter apart.

This flags more corrective correlation, so despite having a different last name for 19 years, insurance companies and Bank of America get them confused. The insurance is worse because you cannot appeal insurance data.


I have a suffix as well (II), but I stopped using it years ago. I have one old credit card that has my full name including suffix on it that I have to remember to include the suffix with, but that's about it.


Maybe it would help to spell out II as "The Second" or IV as "The Fourth"? People with hyphenated or apostrophed last names could even spell out their punctuation too. But then how do you standardize on the spelling of punctuation?

FORTH solved the punctuation spelling problem by systematically documenting how every word (including punctuation) is pronounced, so you could unambiguously speak FORTH programs over the telephone.

http://forth.sourceforge.net/standard/fst83/fst83-12.htm

-Don O Apostrophe Dell Hyphen Hopkins The Fourth (Not to be confused with Don O Tick Dell Minus Hopkins The FORTH)


Never knew this about FORTH, very interesting!


I had a friend in college named jane smith, capitalized as so—that is, not capitalized. She always entered it this way and it always appeared to be correct until the next nightly database cleaning job ran and gave her initial capitals. Eventually the registrar told her that nothing could be done but that a note was made in her file to ensure her diploma was printed properly. Of course it wasn’t, but they did re-print it free of charge without complaint.

(Name changed to protect the innocent.)


I also have a numeral. Sometimes as a child, when I went to the local hospital they would initially pull up my father's information instead of mine. I'm not sure how they thought that could be correct, given I was a child in the 2000s and my father was born in the 70s.


Amr Eladawy has personally contacted GDS but been ignored. Sadly, this is one of those situations where getting a lawyer to send a letter to them probably would work. A bug that affects a group of minorities with a mention of interfering with the US Constitution right to travel with a hint of possible class action will tend to fix this type of bug. I don't like bug-report-via-lawyer but it does work.


The GDS systems have worked like this forever, and likely this won't ever change. The GDS can claim the issue is on the travel agent side, who should have written the lastname as AMRMR.

Having said that it clearly sucks and is super confusing and annoying to affected people, especially given the horrendous prices some airlines demand to change the name on the booking etc.

It seems like a small thing and I'm hypothesizing, but I'm pretty sure no one sane will risk changing this, deploying to prod and hoping that nothing breaks. There are hundreds of airlines and thousands of travel agents involved.

The rule about having a title after the name without any separator is clearly a bad design, but now you have decades of systems (and hacks) built on top of it, so you can't just change it overnight. In the same way, you can't change how certain broken web APIs work because there are too many websites that rely on that broken behavior.
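To make the ambiguity concrete, here is a rough sketch of what "title appended with no separator" implies for anything that later tries to split the field. The title list and the `strip_title` function are assumptions for illustration, not any GDS's actual parsing code.

```python
from typing import Tuple

TITLES = ("MRS", "MR", "MS")  # assumed list, longest first; real systems may differ


def strip_title(field: str) -> Tuple[str, str]:
    """Naively split a trailing title off a name field with no separator.

    Returns (name, title). Illustrative only.
    """
    for title in TITLES:
        if field.endswith(title) and len(field) > len(title):
            return field[: -len(title)], title
    return field, ""


print(strip_title("AMR"))    # ('A', 'MR')   -- the name itself is misread
print(strip_title("AMRMR"))  # ('AMR', 'MR') -- the workaround from the thread
```

Without a separator there is no way for the parser to tell a name that ends in MR from a name followed by the title MR, which is why the workaround is to append the title explicitly.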

Yet another, more local and fixable issue is the airline app which doesn't allow 1-letter last names. This is the same as the apps which helpfully validate email address and reject many valid emails. The best regex for validating emails is /@/. Anything else is probably broken.
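For what it's worth, the /@/ check is about this much code. `looks_like_email` is just an illustrative name.

```python
import re

# The deliberately permissive check suggested above: just require an "@".
EMAIL_RE = re.compile(r"@")


def looks_like_email(value: str) -> bool:
    """Return True if the value contains an "@" anywhere."""
    return EMAIL_RE.search(value) is not None
```

It accepts odd but valid addresses (plus tags, apostrophes, single-letter local parts) and only rejects input that can't possibly be an address. Whether the mailbox actually exists is something only sending a message will tell you.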

--

Another big issue in the industry is that PNRs are 6 digits long and sequentially generated over time, which is a security problem, as demonstrated by CCC a few years ago. If you know someone's name and that they're flying on a given date with a given airline, you can try to brute-force-guess their PNR number and get their personal data or even change their reservation. Again, this is so rooted into the systems that it can't be changed without a massive industry collaboration - probably a years-long project with $$$ cost. For now the GDS have anti-bruteforce mechanisms in place, but it's not a good enough solution IMO against a determined attacker.


> given the horrendous prices some airlines demand to change the name

In practice airlines are fairly lenient concerning misspellings, out of order names, missing diacritics, etc. The name change fee applies mostly to changing the person who is flying.


I have a one letter first name and Air New Zealand refused to book me with only one letter in my name and then equally refused to change it when it caused problems later matching my frequent flier mile account on another airline without charging the ridiculous name change fee, which the airline agent at the desk said the computer system would not allow her to waive.


[flagged]


“People whose names break my system are weird outliers. They should have had solid, acceptable names, like 田中太郎.”

— #39, Falsehoods Programmers Believe About Names

https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-...


I'm not talking about computer systems. It is trivial to write code to handle an arbitrary UTF-8 string. I mean for his own sake, it's probably best to have a real name.


He has a real name.

You continuing to assert that his name is in some way not real is incredibly rude.


Do you even realize how rude and offensive this suggestion sounds? For most people their name is quite a large part of their identity, it's not on you (or developers of the protocols in question for that matter) to decide what does or does not constitute a "real name".

The mere fact that you're reading about these cases here should be good enough of an indication that they're indeed not seeing it as a waste of time.


But how much time would then be wasted writing/typing out all of those extra letters in their name? Then, dealing with updating every account they have. Your sarcastic response wasn't even well thought out. Developer's time is expensive to waste in a gov't office to address a bug left from their parents.


I was not being sarcastic whatsoever. I know two people who have changed their name because they didn't like the name they were born with. My suggestion is 100% dead serious.


Username checks out


Right. The GDS systems doing anything other than pointing the finger at a third party (like, say, fixing the bug) might be seen as admitting responsibility. In fact, this is a good way to ensure that the bug never gets fixed.

If Sabre has enough money to send Bill Clinton and Mark Ronson to Hawaii for a glorified frat party [0], it has enough money to tear through all but the best-funded legal challenges. (You'd think I'm joking. Nope.)

[0] https://hawaii.fcglobalgathering.com/a-huge-global-wrap-up/


Trying to invoke a constitutional right to fly on a plane will do nothing but make them think you might be a nutjob.


Just because nutjobs like to try to invoke constitutional rights doesn't mean those rights don't exist and can't be helpfully invoked. The stereotypical nutjob-invoking-constitutional-rights move is to respond to every police encounter by asking, "am I being detained?", but sometimes you're better off asking that question.


He kind of has a point, though -- I agree the right to travel does not specify a specific mode of travel and should cover planes, but then pretty much everything that goes on w.r.t. planes and cars is unconstitutional.

(Consider that while the government definitely has the right to tax you for road use, requiring you provide an ID to prove you paid a tax is like requiring you to carry an ID proving you paid your federal income tax, or requiring an ID to exercise free speech.)


"Am I being detained" is level 1 nutjob stuff, fairly normal. Incorrectly invoking a "right to travel" is advanced nutjob. It will immediately make a lawyer think you're a sovereign citizen since that's a cornerstone of their legal philosophy.


It's from the "Privileges and Immunities Clause" of the US Constitution and has shown up in many court cases. What may seem "nutjob" to you might have a whole different meaning to the lawyer you have hired to protect your company. Which is why you really want to make sure the letters that you get from lawyers are run by a lawyer even if they seem a bit off.


Probably, but they would not do the same for your lawyer and a few well placed media mentions.


To be fair, we don't know if the problem is being ignored, or just isn't in the current sprint.

Think about how this would go on your team. Customer service sends in a ticket to your Product owner, complaining that a customer has a problem, and called in not only reporting it, but telling you exactly what their expected fix was. Even if the team acknowledged the problem and wanted to fix it, would it really be bumped to the #1 priority, and would the team really just take the fix from the customer instead of designing their own solution?

Or would it be triaged and prioritized, taking into account all other product and customers needs at the same time?

I'm not arguing that airline customer service is great. But a post on stackexchange venting about a customer service team's lack of communication gives us zero insights into what is really going on.


All the Agile nonsense is irrelevant if you're not communicating with the individual.


True. Where I worked, they'd scrap a sprint to fix something that came in over a CxO's channel.


if the bug is known for 6 years I wonder how long the sprint takes


The post said that it had been working fine for 6 years, and this is a new, recent bug. Or, more specifically, that the problem's workaround is no longer valid recently because of the new bug that requires longer names.


These PNRs are stored on mainframes running software written in COBOL which has not been touched for years. The actual PNR is a flat text document with semi-structured lines as explained in the article.

We never make changes to the mainframes, we just wrap the functionality. The structure of a line cannot change as the industry relies on all airlines using the same structure. If a new "field" needs to be added to the PNR we add it as an "RM" (remark) entry.


FYI Amadeus has migrated away from mainframes recently (some 2 or 3 years ago)


I dunno about you, but I always liked the practice of having an on-call rotation (even if you weren't actually on call) so someone would be able to pick up minor bug fixes without them getting in the forever backlog.


>Recently, another smart developer decided to prevent people with first name less than 2 characters from checking-in.

PSA, don't assume everyone has a name of at least 2 characters; it isn't remotely true. Some people don't even have more than one name.


My entire "legal" name is "Aaron." (And now you know what my HN name means.) I was born with the usual three names. When I was thirty, years before 9/11 and DHS, I got a court order and dropped the middle and last. Because, OK?

Never a problem, often a conversation. Sometimes I'd have to spend a little effort to be sure it said "Aaron" on my driver license. Other areas like credit cards and employment I've been more flexible, going by "A Aaron" or "Aaron Aaron," etc.

SS used to call me "Aaron." At some point they renamed me to "UNK Aaron."

All fine.

Coming on two years ago, as part of a "pivot," I got my CDL, Commercial Driver License.

Up to this point my DL said "Aaron." During training my learners permit said "Aaron."

When it came time to take the CDL road test, the CDL office, separate in some way from the DMV, would not schedule my test.

"Because there must be something in the first and last name fields on the driver license."

Without investigating, I'm quite sure that they populate some CDL office record from some DMV record, and their software was written assuming that there must be two names and assuming that the DMV records would all have at least two names. But not three or more, because the programmer or requirements writer had personal experience with people without a middle name.

So I had to go back to the DMV and get my driver license changed to something else.

It took a while, phone calls, "can't you just ..." etc.

The clerk finally agreed to list me as "Unknown Aaron." Which, note, is not my legal name, just what the DMV agreed to call me. So my legal name and "wallet" name are not the same.

Now the CDL office recognized me as "Unknown Aaron." Took my test, got my permanent CDL. Which says "Unknown Aaron."

Hired on with a company which knows me as "Unknown Aaron," because they have to use my CDL name because the feds know me that way now.

Which means my health and other insurance knows me that way.

Which means my doctors know me that way.

So thanks, anonymous programmer.


> I've been more flexible, going by "A Aaron"

I'm sorry, but if I found that name IRL I'd have a hard time not laughing. Have you seen the Substitute Teacher sketch by Key and Peele?

https://www.youtube.com/watch?v=Dd7FixvoKBw


Yup. Funny.


I was tempted to go with a single name when I was forced to legally change my full name to meet ID requirements for a driver's license.

My birth certificate name is patterned like Andrew Bruce-Carlos Denis-Edward Fatherson. After my parents split, I had many variations. E.g. school knowing me as Andrew Charles Motherson, my bank as Andrew Bruce Motherson, etc.

Every official document was something different, so I had an official change to get the most common variants in a close-enough form that still fits on most forms, i.e. Andrew Bruce Carlos Motherson.

It still feels like a missed opportunity to go as simple as possible, and shed some family baggage.


Should have gone for length tbh


You could go by "Aaron Nmn Nln" which would tongue-tie a lot of people.

NMN = No Middle Name

NLN = No Last Name


Assuming what is and isn't valid for user inputs is a dangerous game because there are always exceptions.

I ran into a similar issue with many online retailers when I was living in the inner city of Mannheim, Germany because a lot of online systems make assumptions on how a valid address looks.

Addresses in Mannheim's inner city follow the format "Char Number, Number". "A1,1" is a valid address if you want to send a letter to the district court. A1 being the city block the court is located at and 1 being the house number within that block.
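As an illustration of how such an assumption ends up in code, here is a sketch of a validator that expects "street name, then house number". The regex is invented for this example and isn't taken from any real retailer.

```python
import re

# Assumed example of an over-strict rule: "street name, then house number".
NAIVE_STREET_RE = re.compile(r"^[A-Za-zÄÖÜäöüß .\-]{3,}\s+\d+[a-z]?$")


def naive_is_valid(street_line: str) -> bool:
    """Over-strict check: letters first, then whitespace, then a number."""
    return NAIVE_STREET_RE.match(street_line) is not None


def permissive_is_valid(street_line: str) -> bool:
    """Only insist that the user typed something."""
    return bool(street_line.strip())


print(naive_is_valid("Hauptstraße 12"))  # True
print(naive_is_valid("A1,1"))            # False
print(permissive_is_valid("A1,1"))       # True
```

The strict version looks reasonable for most German streets and still locks out an entire city centre.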

I didn't get to do a lot of online shopping for years when I lived there.


Simple workaround would be putting in a human-readable long string. Anything like, "Daniel's residence, A1,1". Post people know how to read.

Upd: reaction to this suggestion shows that some people don't understand how post office operates. They go to great lengths to understand where to deliver the mail/parcel. In most cases, addresses like "big yellow house with a red door overlooking the cliff near the lighthouse" would work. So the only challenge here is to get past the whatever dumb rule the service developer imposed on the address format. Likely it is just filter by string length.


Post people know how to read, but I think now most mail sorting/routing is done with computers and OCR. I sometimes get mail addressed to people who used to live at my address but have long since moved. I tried writing “return to sender, addressee not at this address” or similar things but the mail kept coming back to me. I finally went into the post office and they said that the machines would just rescan the address and send it right back to my address for delivery. So I think relying on postal employees to see/interpret things on address labels is no longer a viable approach in many places.


No, not really. Sorting is always manual when automation fails.

In your case, automation actually didn't fail, it just didn't recognize your additional instruction. Probably, you could have just patched the address with an easily removable piece of tape and that would definitely trigger a human attention, and delivery would go where it should


My parents (living in rural Norway) once had a postcard delivered where the address given was simply their first names - no last name, no street, no town, no nothing.

Having a database in which every citizen's domicile is registered does have its occasional advantages.


Similarly, there was a story going round a few years back about mail being delivered in Iceland where instead of an address there was a map to the house to be delivered: http://i.imgur.com/1GVjLKF.jpg


I live in Japan. I once had a package delivered from overseas where their printers couldn’t print CJK fonts and thus the whole address resulted in just small empty boxes. The post office inferred my address from the post code + my name and delivered it correctly. There wasn’t even a (noticeable) delay.


> Probably, you could have just patched the address with an easily removable piece of tape and that would definitely trigger a human attention, and delivery would go where it should

Fair point. And that was the advice given to me by the postal service.


Cross through the wrong address. Every year or two somebody from the management agency tidies the noticeboard for the building I live in and removes my hand written sign explaining how this works. Then, next September/October when lots of people move in (some fraction of the occupants are students) the noticeboard gets envelopes pinned to it with undeliverable mail. I write a fresh sign.

The sign is a flowchart. It says: first, is this mail for a different address? If so, either redeliver it (duh) or write "Misdelivered" in bold letters and put it into any postbox.

If not, but you don't recognise the recipient, strike through the whole address in black pen and write clearly "Not at this address" then put it into the postbox.

This won't stop you getting more mail by the way, I still get letters labelled "Urgent" with the name of the previous owner years after I bought this place. But it does stop literally the same mail coming back since the OCR will reject the crossed out address -- it's just that the sender may not have any effective process for what to do when they get the mail back undeliverable.


Carmel, California used to have deliverable descriptive addresses like that. Been years, don't know if they still do.


> They go to great lengths to understand where to deliver the mail/parcel

Only for private unregistered mail. Registered mail is required to specify the address accurately.


Define 'accuracy' though. That's what this whole thread is about. When what you're comparing against is itself wrong, what does accuracy mean? Also, "big yellow house with a red door" etc. is in fact accurate for that place.


That's the whole issue: a random coder decides "and this is my idea what's acceptable: Google Maps/whatever finds it from the input string; worksforme, done!" without second thought or even authority to make such decision; this, an operational decision, gradually becomes doctrine, even dogma.


A friend of mine living there works around it by using „Quadrat A1 1“ (translated: square A1 1 - since most blocks are roughly a square in Mannheim) and it seems to work okay.

But the naming in Mannheim causes a lot of issues; I remember early navigation systems having a hard time with the format. IIRC a TomTom even crashed when trying to announce the street.


That is actually a pretty clever solution, I wish I had thought of that back then.



It's always annoyed me that the blog post above didn't contain examples, and I'm grateful that someone posted a later post that does contain examples:

https://shinesolutions.com/2018/01/08/falsehoods-programmers...


This entire thread is a testament to Falsehoods Programmers Believe About Names. Shouldn't everyone by now have access to a list of "edge case" names to test their software against so these kinds of things don't get deployed? It's hard to believe it's 2019 and we are still struggling with things like names, addresses and dates.


The edge cases in this example are trivial to generate: length 1, or empty string. ("Ends with the letters MR" isn't on this list, and is awfully specific.) If you do nothing but have a text field that can contain anything at all the user chooses to type, that will work!
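A small sketch of what testing against such a list could look like. The names come from this thread; `naive_clean` is a made-up stand-in for the kind of "helpful" processing people have described here (minimum length, forced capitalization).

```python
# Edge cases named in the comment above, plus a few names from this thread.
EDGE_CASE_NAMES = ["", "O", "Aaron", "jane smith", "Amr", "Cédric O"]


def naive_clean(raw: str) -> str:
    """Hypothetical over-eager cleanup: minimum length plus title case."""
    if len(raw) < 2:
        raise ValueError("name too short")
    return raw.title()


def keep_as_typed(raw: str) -> str:
    """Store exactly what the user entered."""
    return raw


def survives(clean, name: str) -> bool:
    """True if the cleanup step returns the name unchanged."""
    try:
        return clean(name) == name
    except ValueError:
        return False


broken = [n for n in EDGE_CASE_NAMES if not survives(naive_clean, n)]
print(broken)  # ['', 'O', 'jane smith']
```

The plain text field passes every case simply because it does nothing.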

Add in "user X's name changes to Y on date Z" (hard to put in a list of edge case names) and you've covered 98% of these issues.


I must admit I have not seen one. Sounds like a perfect thing to crowdsource


I’ll add the HN thread from that here too.

That link of yours is very good.

https://news.ycombinator.com/item?id=1438472


The French Minister for Digital is called Cédric O. Every time I see his name I wonder how many problems he's had with digital systems.


OTOH must be a good incentive to fix a system when a programmer gets told that the minister can't use it.


I have a family member with just one name, officially. We usually don't use surnames in our community.

He has had problems registering his bike in India, with his name.


This is a great test of systems - which field does it go in? First name? Middle? Last? It would not work well on systems I use.


As a person with one name and a programming background, I've always put my actual name in the last name field, assuming that that would usually be the most significant sorting field.


What is registering a bike? Register with whom, and why?


"Bike" in India is short for "motorbike". OP's relative probably needs to register a newly bought motorcycle with the local transport authorities.


Also don't assume these requirements come from developers. I've built form fields countless times and explained these rules to countless stakeholders, and they always push back with "no, street address is a number" or "no, phone numbers must always conform to this format" and "no, everybody must have two names because I need to be able to sort the list by surname".

And then the system launches and the complaints start to roll in to support.


Also don't assume people can't have the same first and last name (Norwegian Air...). When their website failed to book my ticket, a customer rep told me to put NAMEMR as my first name. This then led to my ticket showing NAME MR MR. Surprisingly nobody batted an eye on international travel day, but it annoys me that their website product team decided to take it on themselves to take what I assume were Norwegian naming norms and improve on lax GDS constraints.


Don't assume anything about names, let a user enter whatever they wish - https://shinesolutions.com/2018/01/08/falsehoods-programmers...


I have an apostrophe in my name and it causes allll sorts of issues like this. (Think Bobby Tables). To the point where I’m pretty convinced that the internet is going to wipe out apostrophes from people’s actual names. In fact I just omitted it in my most recent driver's license.
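On the database side, at least, this particular failure is avoidable. A minimal sketch using Python's built-in sqlite3 with a placeholder; the table and the sample name are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT)")

name = "O'Brien"  # hypothetical name containing an apostrophe

# Placeholder binding: the driver passes the value separately from the SQL,
# so the apostrophe is stored as data and never parsed as a quote.
conn.execute("INSERT INTO people (name) VALUES (?)", (name,))

stored = conn.execute("SELECT name FROM people").fetchone()[0]
print(stored)  # O'Brien
```

The breakage usually comes from building the SQL by string concatenation, or from a form validator that rejects the character before it ever reaches the database.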


Ouch. I have a ü in my last name from family in a different country way back somewhere. It doesn't crash systems as much as an apostrophe would, but it's very good at showing encoding issues between systems.

It's not as big an issue as it used to be, at least. Before I've had online transactions failing because of a mismatch between my name (with ü), and the name on the card (with u). The systems seem more forgiving now, having handled that case or something. I also remember being a bit scared traveling to Japan many years ago, as we were told it was SOoo important that the names and everything matched to gain entry. And then the name on my ticket was completely mangled. But no one cared.

Here's a SO post about someone with the last name Null: https://stackoverflow.com/q/4456438/923847


The Japanese are quite used to mojibake [1], so they would've understood immediately that the mismatch between your ticket and passport was caused by encoding issues.

[1] https://en.wikipedia.org/wiki/Mojibake


Interestingly, I've had problems in Korea (Gimpo Airport) because my name contains an "ö", and the canonical spelling in the passport for this is "oe". This was cause for much confusion among the airport staff.

I would have thought that people from CJK-countries were more understanding of encoding-to-latin weirdness than most, but apparently not.


I think their understanding would be focused on the encoding for their language and a relatively narrow set of problems. I've encountered name issues in CJK countries that keep names in native encoding due to an assumption that full names fit within a couple of characters with no need for any spaces or punctuation. Some systems might be designed to be "accommodating" and take even up to 8 or 10 characters! There was one train system where my name had at least four different iterations through the tickets I collected, with different ordering of first and last names and truncating.


In defense of the Korean airport staff, they might have been more accommodating if the "ö" was completely and obviously broken, like "£‡�". Spelling it as "oe" makes it look like there are no encoding issues, in which case strict checking makes more sense.

It's much easier to identify mojibake (they tend to be extremely obvious in CJK encodings) than to remember canonical spellings and other variations in a whole bunch of different languages. Airport staff probably know that "oe" and "œ" are interchangeable, but that's about it.


Diacritics are usually stripped in air travel. In Hungarian we have many letters with diacritics, but it is never a problem that the passport has them and the system doesn't.


> Diacritics are usually stripped

Not in all cases. In Germany and Finland (maybe all EU passports???) ä is spelled ae, ö is spelled oe in the machine readable part (umlauts shown in the "human-readable" part). This is important to know when you need a visa.

For Germans this is not a big problem because it has been like this forever if the umlaut is not available for technical reasons. For Finns this is a problem, because this "transcription" is completely unknown in Finnish. For a couple of weeks now it has been possible to get an electronic visa for Russia on the internet. Reportedly many Finns with an ä in their name (that's not uncommon) dropped the dots when applying for their visa, because an ä is not accepted. At the border they were not allowed to enter, because the machine-readable part of the passport has ae instead.
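
To make the mismatch concrete, here is a minimal Python sketch of the two conventions described above: expanding umlauts (ä to AE, ö to OE) versus simply dropping the dots. The mapping table is an illustrative subset and an assumption, not any country's official rule set, and "Mäkinen" is just a made-up sample name.

```python
import unicodedata

# Illustrative subset only (an assumption): issuing authorities pick their own rules.
EXPANSIONS = {
    "\u00c4": "AE",  # Ä
    "\u00d6": "OE",  # Ö
    "\u00dc": "UE",  # Ü
}

def strip_diacritics(name):
    """Drop combining marks, e.g. what a visa form that rejects 'ä' might push people to do."""
    decomposed = unicodedata.normalize("NFD", name.upper())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def expand_umlauts(name):
    """Expand umlauts to two letters first, then strip any remaining diacritics."""
    composed = unicodedata.normalize("NFC", name).upper()
    expanded = "".join(EXPANSIONS.get(ch, ch) for ch in composed)
    return strip_diacritics(expanded)

sample = "M\u00e4kinen"  # "Mäkinen", hypothetical sample name
print(expand_umlauts(sample))    # MAEKINEN
print(strip_diacritics(sample))  # MAKINEN
```

The two outputs differ by one letter, which is all an automated exact-match check needs to reject the traveller.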


Good point, I don't know any Hungarians with ü or ö in their name, just á and é.

I do wonder what happens to ű and ő though.


There is an ICAO recommendation. However, it is not unambiguous and of course it's not legally binding. So in the end every country decides what they do. (Possibly there are more multinational agreements e. g. inside EU, but I doubt there is anything truly worldwide.)

https://www.icao.int/publications/Documents/9303_p3_cons_en....

Ü is written as UE, UXX or U

Ű is written as U

According to https://en.wikipedia.org/wiki/Machine-readable_passport#Name... Hungary uses UE for Ü, but there is no reference given. According to the same article Russia uses even 2 different transliteration systems depending on the type of document.


For German names, this is a problem. I have an ü in my name and this is transcribed as a "ue" in my passport. Transcribing it as u would produce a different name (which AFAIK actually exists).


In Hungarian the diacritics are also important, for example Szilasi and Szilási are different and are pronounced differently. Still, it won't be an issue when flying or other stuff.

German is more complicated though with all the substitution rules.

Not to mention Germans who actually have an ue in their name, still pronounced as ü, but written as ue only, never as ü. Or someone may be called Gross, but it would be incorrect to write it as Groß, while someone else's name may be Groß with the acceptable alternative spelling Gross when ß is unavailable.


I too have an apostrophe in my name and experience the same thing. I've had people put it into their system as a comma, dash, space, and all sorts of weirdness despite my calling it out specifically.

My experience has actually improved substantially in the last 10 years or so, and most of the government systems I encounter these days actually handle it properly (as well as handling suffix properly too). That said, I've somewhat recently started having trouble checking in for flights again -- I flew last month and it took the ticketing agent >20 minutes to find my reservation on both the outbound and return flights, even despite my providing the 'confirmation code' / itinerary email (we were checking bags & flying with infants, else I'd have done online check-in).

It can be really frustrating -- though I'm hopeful it will continue improving and hopefully be a smoother experience by the time my kids are adults.


> I've had people put it into their system as a comma, dash, space, and all sorts of weirdness despite my calling it out specifically.

Ugh, yes. And it's insane how many people seem to just NOT KNOW what an apostrophe is.

> checking in for flights

Yea, airlines seem to be one of the worst offenders. I have Precheck but Spirit in particular is never able to match the name on my ticket to the name in the gov't database so I never get it. Just one more reason to avoid flying them I guess.


Out of curiosity, why not just omit the apostrophe for airline reservations then? I understand wanting your full, real name in many circumstances, but who cares about what the boarding pass says as long as you get to fly? I doubt the people checking ID would care about the missing apostrophe.


I often did do this when I used to fly more often domestically, but it tends to cause other issues -- the primary one being a "frequent-flyer/mileage account name mismatch" which means that I have to undertake some manual process to collect my miles. I've lost out on countless 'airline miles' as a result via forgetting to do the manual process within N days after the trip.

Similarly, automated check-in kiosks are then usually unable to find the reservation via credit-card or passport scan -- meaning you're back to looking up the reservation code, and even that often fails, as if the apostrophe just flat-out causes issues with the query/lookup or something.

It can be very frustrating, and I'm increasingly often impressed (and vocalize the same) when I spell my name and the agent enters it correctly AND the system flawlessly handles it, too! The DMV systems in my state are one such example where I used to have issues but, in recent years, the problem appears to have been wholly addressed/handled.


Update your name in the mileage program to take the apostrophe out. They might be able to do it


Practically speaking, that makes sense. Philosophically, it's abhorrent. Blaming the user is bad behavior in general. Expecting a person to alter their name to conform to a poor software implementation is just wrong.


Well, people with names that are not written with Latin script are coerced into whatever Latin transliteration their government uses when issuing passports. Bonus points for altering the transliteration rules from time to time.


But ID checking is also done electronically at some checkpoints. If your ticket doesn't match your passport, your visa, your visa waiver, etc. you are going to be in trouble.

That being said last time I went to the US the person booking the ticket swapped my first name and last name. Only the person at the baggage dropoff noticed it, and after much deliberation they suggested to leave it that way. I went through with no issues apart from not being able to register the mileage.


He shouldn't have to. He is a human being. Computers were made to serve us, not the other way around.


I have a hyphen in my first name that also causes problems. I love it when I put my name in and the web site says "invalid first name." Thanks mom and dad...

What is worse is people who "fix" my name by moving the second half of my first name and making it part of my last name. I'm an adult. I know what my name is.


In Quebec, composite first names (prénoms composés) like Marc-Antoine are pretty common, so there was nothing weird about my parents giving me such a name. And frankly, most webforms I had to fill out while living there accepted my name just fine.

However, now that I've moved to the United States, it's been a bit of an annoyance.


I have a double first name, so I have a space in my first name. Many people / systems seem to think I accidentally put my middle name in the first name field and helpfully move or drop the second part. Putting a hyphen in (which is not really supposed to be there) typically fixes it, so I'm variously known with a hyphenated and non-hyphenated name. But it rarely causes issues.


I write very strongly-worded letters to those companies. Honestly, I wish someone publicly shamed all those stupid companies.


My name is officially spelled as Léon.

The letter e with an acute accent causes all sorts of UTF-8 encoding issues with many services, not just airlines. If you interpret the UTF-8 é (0xC3A9) as ASCII it becomes à (0xC3) + © (0xA9), so my name often comes out as 'Léon'.

Airlines make it worse, because they strip both characters during sanity checking, so my name comes out as 'Lon', which has caused me problems a couple of times as the name on my passport did not match the name on the ticket.


Reminds me of the ode to a shipping label:

http://i.imgur.com/4J7Il0m.jpg

What these things all reinforce is that a lot of programmers take text encoding as a given, and don’t realize all the potential places for errors to sneak in.


Could be a fun way to hunt for buffer overflows on internal shipping services. Just fill out the sender name field to just "óóóóóóóóóóóóóóóóóóóóóóó" and let it expand. If the parcel arrives, not vulnerable. If the packet doesn't arrive, you've found a vulnerability... somewhere...


I wouldn't say an accent "causes" UTF-8 encoding issues. If acute accents are a problem, then UTF-8 handling has completely failed.

It is amazing to me where I see failed encoding like that. For instance, many SEC filings and job ads for tech companies. I mean, I feel like I'm expected to spell things correctly on my resume and emails at work...


> If you interpret the UTF-8 é (0xC3A9) as ASCII it becomes à (0xC3) + © (0xA9)

As latin1 (ISO-8859-1) or Win-1252; ASCII doesn't have either à or ©.

latin1 is the default for text, including HTML, if you don't specify in protocols such as HTTP (modulo some stupidity from the WHATWG where it might be Win-1252 instead) and Windows-1252 is the default encoding in Windows in the USA (at least, prior to the Unicode APIs being added. The old APIs probably still exist though…). So these codecs pop up a lot in places where people who don't know what they're doing end up touching text.
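
For anyone who wants to see it happen, a small Python sketch of the mix-up described above: encode as UTF-8, then decode the same bytes with a legacy single-byte codec. The codec names are Python's standard ones; nothing here is specific to any airline's system.

```python
# "Léon", written with an escape so the source file's own encoding can't interfere.
name = "L\u00e9on"
utf8_bytes = name.encode("utf-8")

# Decoding UTF-8 bytes with a single-byte codec yields one character per byte.
as_latin1 = utf8_bytes.decode("latin-1")
as_cp1252 = utf8_bytes.decode("cp1252")

print(utf8_bytes)  # b'L\xc3\xa9on'
print(as_latin1)   # LÃ©on
print(as_cp1252)   # LÃ©on

# Real ASCII is 7-bit and has no meaning for bytes >= 0x80, so a strict decoder refuses them.
try:
    utf8_bytes.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False
print(ascii_ok)    # False
```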


The WHATWG HTML spec requires UTF-8 for conforming documents and scripts [WHATWG 4.2.5.4]. In both HTML specs, charset declarations, if provided, must be UTF-8 [4.2.5].

If the transport, content-type, lack of charset declaration, and sniffing fail to determine an encoding, both specs use defaults based on the configured locale; for English that's windows-1252 [WHATWG: 12.2.3.2 W3C: 8.2.2.2]. latin1/ISO-8859-1 is prohibited [WHATWG: 12.2.3.3 W3C: 8.2.2.3].


I ran across some code once for descrambling data that had been incorrectly processed like that, which I found common in legal documents. It's an interesting problem, because strictly speaking, it's lossy, but you can use probabilities to figure out something plausible. You can decode/encode one thing as another, or you can decode/encode multiple times...
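
A rough Python sketch of the "decode/encode multiple times" idea, not the code I saw: it assumes the damage was always "UTF-8 bytes read as Windows-1252" and just reverses that until it stops working. The function name `undo_mojibake` and the round limit are made up for illustration. It skips the probability part entirely, so it will happily "fix" text that legitimately contains sequences like "Ã©".

```python
def undo_mojibake(text, wrong_codec="cp1252", max_rounds=3):
    """Repeatedly reverse a 'UTF-8 bytes decoded as wrong_codec' mistake."""
    for _ in range(max_rounds):
        try:
            candidate = text.encode(wrong_codec).decode("utf-8")
        except (UnicodeEncodeError, UnicodeDecodeError):
            break  # the text no longer looks like mis-decoded UTF-8
        if candidate == text:
            break  # plain ASCII round-trips unchanged, nothing to repair
        text = candidate
    return text

once = "L\u00e9on".encode("utf-8").decode("cp1252")  # mangled one time
twice = once.encode("utf-8").decode("cp1252")        # mangled two times
print(undo_mojibake(once) == "L\u00e9on")   # True
print(undo_mojibake(twice) == "L\u00e9on")  # True
```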


Any chance you have a link? I’ve had to implement solutions to this myself and it’s very tedious. If someone has built a more complete solution I would love to just use that instead.


This HN thread has some links and discussion: https://news.ycombinator.com/item?id=16103356


That might be what I'm remembering; then again, I don't really do Python, so maybe it was something else. I doubt it was anything better than the link above, regardless.


You could try inputting your name as [Latin Small Letter E][Combining Acute Accent]:

e◌́ => é

Which should keep the `e` intact, while the combining acute accent (0xCC 0x81) may "only" get converted to a `Ì` which may be stripped. 0x81 is undefined in Windows-1252, so I have no idea what would happen to that, but probably be stripped as well, keeping just Leon.


Unless someone decides to NFC-normalize the text along the way. And it's generally agreed that text should be normalized with NFC, although there is often a fierce debate about who should do it ("not me").
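
A short Python illustration of both points, using only the standard `unicodedata` module. The "strip everything non-ASCII" step is an assumption about what a sanity check might do, not a known airline implementation.

```python
import unicodedata

decomposed = "Le\u0301on"  # 'e' followed by U+0301 COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)  # single code point U+00E9

print(len(decomposed), len(composed))  # 5 4
print(decomposed.encode("utf-8"))      # b'Le\xcc\x81on'
print(composed.encode("utf-8"))        # b'L\xc3\xa9on'

def strip_non_ascii(text):
    """Hypothetical sanity check that silently drops anything outside ASCII."""
    return text.encode("ascii", "ignore").decode("ascii")

print(strip_non_ascii(decomposed))  # Leon
print(strip_non_ascii(composed))    # Lon
```

So the decomposed form survives the stripping as "Leon", but only as long as nothing upstream normalizes it to NFC first.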


Reminds me of the times when Amazon failed to reproduce the ü in my last name on their shipping labels. They consistently printed the UTF-8 encoded character interpreted as a sequence of 8-bit characters. That bug was present for a couple of years.


"I was not trying to do SQL injection, sir! My name is John Letme'or True"
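
For the record, the fix is old and boring: bind the name as a parameter instead of pasting it into the SQL text. A minimal sketch with Python's built-in `sqlite3`; the table and column names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE passengers (name TEXT)")  # hypothetical schema

name = "John Letme'or True"

# Unsafe: the apostrophe ends the SQL string literal early and the rest becomes SQL.
unsafe_sql = "INSERT INTO passengers (name) VALUES ('" + name + "')"
try:
    conn.execute(unsafe_sql)
    unsafe_failed = False
except sqlite3.Error:
    unsafe_failed = True  # here the leftover quote makes the statement invalid

# Safe: the driver passes the value separately, so quotes are just data.
conn.execute("INSERT INTO passengers (name) VALUES (?)", (name,))
stored = conn.execute("SELECT name FROM passengers").fetchone()[0]
print(unsafe_failed, stored == name)  # True True
```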


> the internet is going to wipe out apostrophes from people’s actual names

Also seeing that with accented uppercase letters in French, even in nouns, because it's hard to type them on Windows.

People still use accents in lowercase of course, but think that it's incorrect to use accents for uppercase letters, even when handwriting.


I only learned they do have the accents from your post. I was taught to omit them about a decade ago (as a second language).


It is obligatory in Spanish -using accents both in lower and upper cases...


I did some research a few weeks back about why I have an apostrophe in my name. When the British conquered the Irish they started keeping records of the citizenry. The Ó used in Irish names to track descent was eschewed for O' in British record keeping.

https://en.wikipedia.org/wiki/Irish_name


Holy crap- still? This was an issue on HN like 10 years ago or so and I thought the word got out to fix it.


It was also an issue 10 years before that. You'd think the word would get out to fix it...


Just because the engineers know that there's a tricky problem with input validation doesn't mean the business people want to take the time to solve it, unfortunately.


For those unfamiliar with Bobby:

https://xkcd.com/327/


D'Von?


Nah, pretty common Irish last name


As a general rule, the earlier an industry automated their systems, the worse the implementation.

In the earlier decades of computing, the software industry really just didn't know what it was doing too much. (Not that we're perfect now, but we were even worse back then.)

And replacing an implementation is a huge undertaking, and a lot of industries just don't bother.

This leads to a paradoxical situation where industries that most obviously need automation are the ones that have the worst automation. Before others, they pushed to get it done, and they got locked in to something primitive and/or outmoded.


I mean, it doesn't hurt that hardware now is so fast and scalable that performance and efficiency can take a back seat to usability and clarity. Guarantee this was done because storing the title as part of the first name field saved a few bytes over having a separate field, and that really mattered in the 70s when these systems were designed.


Yeah, it is a mixture of causes. Sometimes hardware constraints really did force software into tough choices.

The C programming language might be a good example. From a modern perspective, requiring forward declarations seems like pointless busywork. At the time it was written, decisions like that made it possible to have a one-pass compiler, which was an important efficiency gain. You could reduce I/O and maybe save RAM.


Based on the top answer, this appears to be an error in the CLI "helpfully" re-interpreting Amr as { First name: A, Title: Mr }, not an issue on the data storage side.
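
A toy Python sketch of how that kind of "helpful" parsing could go wrong. This is a guess at the failure mode, not the real system's logic; the title list and function names are hypothetical.

```python
# Hypothetical title codes; longer codes are listed first so "MRS" wins over "MR".
TITLES = ("MRS", "MR", "MS", "DR")

def naive_split_title(given):
    """Treat any trailing title code as a title, even with no space before it."""
    upper = given.upper().replace(" ", "")
    for title in TITLES:
        if upper.endswith(title) and len(upper) > len(title):
            return upper[:-len(title)], title
    return upper, None

def safer_split_title(given):
    """Only treat the last whitespace-separated token as a title."""
    parts = given.upper().split()
    if len(parts) > 1 and parts[-1] in TITLES:
        return " ".join(parts[:-1]), parts[-1]
    return " ".join(parts), None

print(naive_split_title("Amr"))       # ('A', 'MR')
print(naive_split_title("Alexandr"))  # ('ALEXAN', 'DR')
print(safer_split_title("Amr"))       # ('AMR', None)
print(safer_split_title("Amr MR"))    # ('AMR', 'MR')
```

Requiring a separator removes the ambiguity, at the cost of breaking any workflow that relied on typing the title without a space.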


This interpretation imputes a principled separation of the "CLI" from the "data storage side". Not a winning bet.


Would be interesting to know whether other people whose names end in mr also have this problem


> it doesn't hurt that hardware now is so fast and scalable that performance and efficiency can take a back seat to usability and clarity.

Agree. Unfortunately, the users of even the new usable systems may be loath to take up these changes. Why? I guess inertia and priorities. My mind goes back to QWERTY v/s Dvorak and how it panned out.

The top answer on StackExchange alludes to this, too.


> My mind goes back to QWERTY v/s Dvorak and how it panned out.

Sometimes the new system isn't better so there's no reason to switch... AFAIK, Dvorak is not better than QWERTY (despite claims to the contrary).

I worked in IT at a hotel company when they made the cutover from a 3270 based text system to an early 90's modern Windows GUI -- agents hated it, even with command shortcuts it was slower than the old interface. Training new agents was faster, but experienced agents were much slower.


It is nice to see it when we get first implementation where "business" invents some rules, like only 2 items for this list or something like that.

Next month when feedback from user comes, we are removing loads of rules that were supposed to be helping.

So with agile we are better now because we can remove stupid rules in next iteration and not leave it hanging there for decades.


"So with agile we are better now because we can remove stupid rules in next iteration and not leave it hanging there for decades."

Nice try. Fixing bugs and changing bad logic in timely manner were done ages ago when nobody knew that Agile term.


Yes, but in many cases the rules were not fixed also. Agile is a mindset and philosophy. Whilst not a panacea, I feel it does help reduce inertia in organisations that adopt it.


> Yes, but in many cases the rules were not fixed also

What does this have to do with Agile? You either have adequate resources (human and money) and desire to fix problems or not. Absent that, issues and bugs linger for years in organizations that employ agile. I saw it with my own eyes.


Agile doesn't help you here. It might only take a day to fix this issue in your code base. But then it will take months to get it tested and approved. And then it has to be rolled out into the organization and periodically analyzed during the rollout until it's done. Given how large airlines are and how many people interface with an airline booking system Agile's "contribution" to development timing is meaningless here. You could spend a day making a fix and a couple years watching that fix slowly percolate into the organization, followed by another couple of years waiting for all the travel agents to stop calling and complaining about how this impacts the workflow they've had for the last 20 years.


No, as others have mentioned the reason for this was to save bytes or bandwidth when bytes and bandwidth were really expensive.


That's what others here have conjectured. Suddenly it's authoritative?


Look up two-digit date fields, packed-decimal, zoned-decimal, bitfields, and other encoded-field datatypes. All were explicitly created to save on data storage, when the main transmission medium was punch cards.

The rationalisation is not conjecture. It's fact. Bits and baud were expensive.

And that's before getting to bitshifted storage of software and similar tricks.

SABRE dates from the 1950s. Which was a long time ago in Internet Years.

https://en.wikipedia.org/wiki/Sabre_(computer_system)

The computer it was based on, the IBM 7090, had memory storage of 32,768 words of 36 bits, about 144 KB in today's 8-bit bytes. It operated at about 100 kFLOPS. A modern AMD64 CPU manages around 4-64 FLOPs per cycle, or in the neighbourhood of 4-250 GFLOPS, up to about 2.5 million times faster.

https://en.wikipedia.org/wiki/IBM_7090

https://en.wikipedia.org/wiki/FLOPS
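
A quick check of the arithmetic behind those figures, assuming 8-bit bytes and taking the 100 kFLOPS and 4-250 GFLOPS numbers at face value:

```python
# Memory: 32,768 words of 36 bits each, expressed in 8-bit bytes.
words = 32_768
bits_per_word = 36
total_bytes = words * bits_per_word // 8
print(total_bytes, total_bytes // 1024)  # 147456 144

# Speed: ratio of a modern CPU's range to roughly 100 kFLOPS.
old_flops = 100 * 10**3
low_ratio = 4 * 10**9 // old_flops
high_ratio = 250 * 10**9 // old_flops
print(low_ratio, high_ratio)  # 40000 2500000
```

That works out to 144 KiB of memory and a speedup somewhere between 40 thousand and 2.5 million times.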


I've programmed on punched cards. System/360, FORTRAN77. Carter administration.

Whether history exists or not, is not the question. Whether byte misers from long ago caused this name-mangling issue, that's the conjecture part. At least one aspect of the guy's complaint was of recent origin, and ironically that was from arbitrarily insisting the name be longer than it is.


The SE answer links to the relevant documentation. Spaces are indeed optional. It's not explicitly documented that a non-whitespaced 'MR' is read as a title, but it seems likely.

http://www.amadeus.com/bg/documents/aco/bg/basic-qrg.pdf


Well I'm convinced then. That's nuts. Not so much that something so crappy can exist, but that it can persist!


Yes, because some of the "others" were actually there, then...


My first name ends with "-dr" and I've been awarded doctorates by quite a few airlines.


An honorary Doctorate from American Airlines, quite the impressive feat.


Would like to extend my congratulations for your accomplishments


I was prevented from doing an advance online check-in with Emirates (while overseas) because - I was told later - my son and I have the same (first/last) name and their system couldn't handle it. Subsequently, the flight was overbooked and we got bumped (which would have been avoided had I been able to use their online advance check-in).

I still can't get my head around how their online check-in system was set up where this could happen.

updated for clarity


This is strange as airlines usually allow you to enter the PNR (booking reference) and any of the names in the booking. The implementation is usually that they look up your booking from the PNR then effectively regex to see if your name is inside one of the name lines in the record. I can't think of a reason why this wouldn't work unless it was a connecting journey and the carrier didn't have the correct permissions on the record.
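
Roughly what I mean, as a Python sketch. Everything here is hypothetical: the in-memory `BOOKINGS` store, the "SURNAME/GIVEN TITLE" name-line format, and the sample data are stand-ins, not any carrier's real record layout.

```python
import re

# Hypothetical stand-in for a reservation store, keyed by booking reference (PNR).
BOOKINGS = {
    "ABC123": ["SMITH/JOHN MR", "SMITH/JOHN MSTR"],
    "XYZ789": ["JONES/FRED MR"],
}

def matching_name_lines(pnr, surname, given):
    """Look up the booking by PNR, then regex-match the name against its name lines."""
    name_lines = BOOKINGS.get(pnr.upper(), [])
    pattern = re.compile(
        "^" + re.escape(surname.upper()) + "/" + re.escape(given.upper()) + r"\b"
    )
    return [line for line in name_lines if pattern.search(line)]

print(matching_name_lines("xyz789", "Jones", "Fred"))  # ['JONES/FRED MR']
print(matching_name_lines("abc123", "Smith", "John"))  # two matching lines
```

Finding the booking works fine either way. The only place I could imagine identical names hurting is later code that assumes exactly one name line matches, and that part is pure speculation.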


So my best guess with my limited knowledge about GDS:

- You were on the same booking with an infant with the same name

- You were the primary passenger

- Due to overbooking people get bumped

- They usually start with comfort seats (both in your name) or an accidental double booking

- Because your son, probably underage at that time, was bumped the primary passenger was bumped too


It says that the issue was 2 persons on a flight with the same name. The system can’t handle it.


Cool, but not being able to check in online is usually a sign of already being bumped, not the other way around.


Well, that's certainly plausible. Though I was told - after escalating the issue - that their system had an "issue with our names being the same" causing some bug in their online system.

Normally our middle initials are carried forward to the carrier's booking system and that differentiates my son and me when we fly together... but in Emirates' case it seems they weren't.


I noticed while watching the Athletics World Championships, that the 5000m runner Ben True always had his name shown in the results as:

SMITH John

True Ben

JONES Fred

BLOGGS Bill


So it probably means that a "dynamically" typed system is used. Is there a database where if you INSERT the string 'TRUE' you get an actual boolean stored? IIRC MySQL used to not actually coerce types into the type the column is declared as.
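
One way this could happen without any database involved, sketched in Python: a loader that "sniffs" types from strings and a display layer that prints whatever comes back. This is a guess at the mechanism, and `sniff_value` is a made-up helper, not anything from the actual results system.

```python
def sniff_value(text):
    """Naive type sniffing: turn boolean-looking and integer-looking strings into values."""
    lowered = text.strip().lower()
    if lowered in ("true", "false"):
        return lowered == "true"
    try:
        return int(text)
    except ValueError:
        return text

surnames = ["SMITH", "TRUE", "JONES"]
sniffed = [sniff_value(s) for s in surnames]
rendered = [str(v) for v in sniffed]
print(rendered)  # ['SMITH', 'True', 'JONES']
```

Note that the round trip through a boolean also changes the capitalisation, which matches what showed up on screen.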


A simpler case is Hispanic surnames. Suppose I'm writer Miguel de Cervantes Saavedra. From airline tickets to name tags in networking/speaking engagement programs, I show up as Miguel Saavedra. But I'm Cervantes.

I'm not Cervantes and I suppose I'm a little patriarchal, but I'm very proud of my dad's achievements and wear his name (whenever they let me) with pride.


Also with a name like "Firstname Last Name" I love having my initials show up in software as "FN"


Telegram does this in chat icons and it's very annoying.

