
A Japanese company once made the decision that they needed "virtual" employees in a particular system, e.g. to support adding a job to the org chart before that position had been filled (and another dozen use cases), so they had the clever idea "Hey, if we need to do this, we'll just input their 'name in Japanese' as one of a dozen status flags, like XX_JOB_REQUEST or XX_INCOMING_TRANSFER."

One developer at this company, who was annoyed with having to tweak a particular system every time they added a new possible status flag, wrote code which was, essentially:

  if (InternalStringUtils.isAllLatinCharacters(employee.getJapaneseName())) {
    /* no need to pay this 'employee' so remove them from batch
    before we retrieve bank details for salary transfers */
  }

Do I have to explain why I'm aware of this curious implementation choice?

Many systems in China depend on your Chinese name and identification number. Suffice it to say, foreigners who work in China don't have either of these: a made-up Chinese name is not meaningful or legal (China also lacks any kind of kana), and a passport number is not a "valid" ID number and changes every 10 years anyways. I don't get to use many online services accordingly, and every year there is some problem with how they handle my last name (MC DIRMID, there is a freaking space after the MC in my passport, which causes all sorts of problems).

When I worked in Taiwan they came up with a Chinese name for me and gave me a little official wooden stamp I could use to "sign" official documentation. The name was indeed useful to fill in all sorts of forms that expected the Chinese format. That was a neat time.

I'm picturing this as a character saying 'silly blond english man' or along those lines.

鬼佬 (Gweilo). Or, where I live now, 红毛 (Ang Mo).

It's meaningful in some contexts. I had a different Chinese name on my work permit and marriage certificate (in both cases transliterated without my input), and it caused me no end of hassle when I was applying for a mortgage - we got refused the first time and had to get the work permit changed to make them the same.

Yes, we (my wife and I) were careful about that when we got married last year. But to be honest, I think that's the only case (and the name on my work permit is not even my own chosen Chinese name!)

Yes, and imagine the impact this has on "foreigners who work in China" such as Tibetans in Lhasa.

Tibetans (with Hukou anyways) have Chinese names though. Usually they are phonetically chosen. It's something that must be done when you are born in China I guess, even if your native language isn't Chinese. This applies to all minorities who use different writing systems (Uyghur, Manchu, Mongolian, etc...).

Japanese of Korean descent can also choose Kanji names I think, to use as legal aliases.

Anyone who lives in Japan can register a legal alias, regardless of their citizenship. The special permanent residents that you mentioned (who typically hold North / South Korean or Chinese citizenship) are probably the most common users, but some Japanese people who are divorced use them, too.

An alias with Kanji is really useful for living in Japan. I'm an American citizen but I use an alias with a Kanji last name for just about everything I can (including my job, bank account, and apartment contract). Immigration paperwork and credit cards are just about the only things where the alias can't be used.

That is actually a crazily innovative solution to a hard problem. I wish China would adopt something like this.

I think SiVal was making a reference to China's invasion of Tibet.

It's expected when we look at how many English pages handle non-ASCII characters.

Took me a minute...

    InternalStringUtils.isAllLatinCharacters("Patrick McKenzie") == true
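For anyone else who needs the minute: here is a minimal Python sketch of what such a check might look like. The function below is a hypothetical stand-in, not the company's actual `InternalStringUtils` code, and the rule it applies (every letter is a Latin-script letter) is an assumption about what "all Latin characters" meant.

```python
import unicodedata


def is_all_latin_characters(name: str) -> bool:
    """Hypothetical stand-in for InternalStringUtils.isAllLatinCharacters.

    Returns True when every letter in the name is a Latin-script letter.
    Spaces, digits, and punctuation such as '_' are ignored.
    """
    letters = [ch for ch in name if ch.isalpha()]
    # Unicode names of Latin letters contain the word "LATIN",
    # e.g. "LATIN CAPITAL LETTER X" or "FULLWIDTH LATIN CAPITAL LETTER X".
    return bool(letters) and all(
        "LATIN" in unicodedata.name(ch, "") for ch in letters
    )


# A status flag stored in the name field is caught, as intended...
print(is_all_latin_characters("XX_JOB_REQUEST"))    # True
# ...but so is a real employee whose name happens to be written in Latin.
print(is_all_latin_characters("Patrick McKenzie"))  # True
# Only a name with non-Latin letters makes it through to payroll.
print(is_all_latin_characters("田中太郎"))            # False
```

The check cannot tell a status flag from a person, which is the whole problem with overloading a name field as a flag.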

I worked in a security software development department where the primary security request application had to allow a request from anyone, for anyone (Approval was more stringent). I personally found several bugs in the system in my first few months, because I, personally, conflicted with the various "uniqueness" constraints in the system... like lastname + ssn-last-four, or dob + firstname, etc.

The org had 380K active entries, so it was definitely interesting being a dev on such a project, with a relatively common name, and conflicting dob and last 4-5 of my ssn.

Your DOB and last 4-5 of your SSN matched someone else?

The reason I ask is because rules like this are often used to de-duplicate records. It's not perfect but it is useful, especially when trying to integrate data from more than one system. It's also used quite a bit in fraud detection etc. to find connections in the data.

There were about 380K users in the various systems... so conflict chances were pretty high... I can't imagine what it would be like to have a name like "John Smith" or "Adam Jones" ... even more common...

Well, there are 366 dates of birth, and 9999 last 4 digits in SSNs, so approximately 3.7M combinations.

If you drop 380k users into 3.7M buckets, that's only ten times as many buckets as users. A lot of the buckets will be shared.
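To put rough numbers on that, here is a back-of-the-envelope sketch in Python. It takes the figures above at face value (366 birthdays, 9999 last-four values, 380K users) and assumes users are spread uniformly and independently across buckets, which real populations are not.

```python
users = 380_000
buckets = 366 * 9_999  # birthdays x last-4 SSN values, as in the comment above

# Chance that a given user shares a bucket with at least one other user,
# under the (unrealistic) uniform-and-independent assumption.
p_shared = 1 - (1 - 1 / buckets) ** (users - 1)

# Expected number of colliding pairs across the whole population.
expected_pairs = users * (users - 1) / (2 * buckets)

print(f"buckets: {buckets:,}")
print(f"P(a given user collides): {p_shared:.1%}")
print(f"expected colliding pairs: {expected_pairs:,.0f}")
```

Under those assumptions roughly one user in ten shares a bucket with somebody else, so a "unique" key built this way collides constantly at that scale.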

I'm actually curious now, was your (or whoever became the victim of this code) "Japanese name" not in the system in katakana or something?

That was imprecision because I was trying to avoid the quick discussion of Japanese orthography. Like most systems in Japan dealing with names, there are separate fields for 漢字名 and カナ名. (Some systems also have ローマ字名.)

Japanese systems have wide, wide variability in what they do for 漢字名 for people who, ahem, don't have one. Some repeat the カナ名. Some do so but use half-width kana (半角 vs. 全角). Some managers who believe that there is such a thing as an "official name" think that one's official name should go in 漢字名, regardless of whether it is 漢字 or not.

A related problem: what happens when you have two systems which have different behaviors on this? For example, let's say you're a Japanese bank, and your branch employees were instructed in 2012 to update any 漢字名 of foreigners to be the name written on their foreigner registration card, in double-width characters. Let's further suppose that your web tier does Javascript validations when you try to sign up for online banking, and because any engineer can see that DOUBLEWIDTH latin characters are not 漢字, this means that it is literally impossible for the web tier to match the DB for affected customers.
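One common mitigation for this kind of mismatch is to normalize both sides before comparing. A minimal Python sketch, assuming Unicode NFKC normalization is acceptable for matching purposes (it folds width variants, but it is not something the bank in the story is known to have done, and the sample names are illustrative):

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Fold width variants so full-width and half-width forms compare equal."""
    # NFKC maps full-width Latin to ASCII and half-width kana to full-width kana.
    return unicodedata.normalize("NFKC", name).strip()


branch_record = "ＭＣＫＥＮＺＩＥ　ＰＡＴＲＩＣＫ"  # full-width, as typed at the branch
web_input = "MCKENZIE PATRICK"                 # ASCII, as typed into the web form

print(branch_record == web_input)                                  # False
print(normalize_name(branch_record) == normalize_name(web_input))  # True
print(normalize_name("ﾐｯｹﾝｼﾞｰ") == "ミッケンジー")                   # True
```

Normalizing for comparison while storing the original string untouched keeps the name as the customer wrote it and still lets the two tiers agree.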

Hilarity ensues.

It's not quite the same, and it doesn't break things so much as make them rubbish, but try getting a radiology information system to talk to a scanner of some description (CT, MRI etc). GE scanners accept Surname^Name. If someone has a middle name it doesn't display or come across to the scanner, so as to save space (I assume). This is fine until you get someone who has a first name with 2 separate words. I discovered it with someone called something like Al Amen as a first name. No hyphen. So now he is called Al. To make the medical images correct we have to incorrectly spell his name and make the RIS incorrect. Since then I look out for this and I have seen lots of patient names broken in this manner. Mid-name capitalization also breaks and it all becomes lower case. McDonald to Mcdonald etc. Names are horrid to deal with and people (myself included) like them to be correct.
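The failure mode is easy to reproduce. The Python sketch below is a hypothetical model of the lossy handling described above, not GE's or any vendor's actual code: it assumes the receiving side treats everything after the first word of the given-name field as a middle name and re-capitalizes each component.

```python
def naive_scanner_name(ris_name: str) -> str:
    """Hypothetical model of lossy 'Surname^Given' handling.

    Keeps only the first word of the given-name field and
    re-capitalizes each component.
    """
    surname, _, given = ris_name.partition("^")
    words = given.split()
    first_word = words[0] if words else ""
    # str.capitalize() lowercases everything after the first letter,
    # which is what turns "McDonald" into "Mcdonald".
    return f"{surname.capitalize()}^{first_word.capitalize()}"


print(naive_scanner_name("McDonald^Al Amen"))  # Mcdonald^Al
```

Both the dropped word and the flattened capital come from the software guessing at structure inside a field instead of passing it through unchanged.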

For a few years my airplane boarding passes said I was PAULA JUNGWIRTH because A is my middle initial. I got a few questions trying to board. I've noticed the last couple years that they print with a space now.

When I flew Lufthansa, I was granted an honorary Ph.D.

My official name is "Aleksandr Feinberg" -- it's transliterated from Cyrillic (hence the "ks") and the Russian version of "Alexander" does not put an e between the d and r -- which Lufthansa decided to print on my boarding pass as Dr. Aleksan Feinberg

I recently moved to a house on a street named "Martin Luther King Junior Way East". The service rep must have bumped the tab key while I was signing up for DSL, because I now get mail addressed to "Mars Saxman Junior Way East".

The funny part is that the street name is still filled out in all of its glorious detail - but the capitalization is different, which tells me they did some kind of zip code based address sanitization.

so i'm reading this hilarious thread about names, and i come across a comment from _mars_saxman_...

my word, it's been long-time-no-see, mr. saxman!

-bowerbird intelligentleman

I've never understood why people entering data into a system enter an initial. It's a partial entry. You wouldn't enter a date of birth as 3. On further thought, actually, they do. Then dismiss all the error messages and quit the program to get past the system keeping them in the field waiting for completion. Users seem hell bent on breaking our databases.

I frequently have to enter my first initial and middle name as my "first name". Why? Because that is how it appears in numerous official places, such as my credit card.

The users aren't broken, your database (and your assumptions about names) is.

The database has errors and faulty assumptions, yes. I have some too, but I do not allow names entered into a big medical system to be anything other than the person's name, minimum of first name and last name, and I go over every record that passes through our scanners and enter middle names too. We have an AKA field where the patient can be called whatever they want, characters and numbers allowed. This is not stuck into medical image DICOM headers but appears on the information system which is used when talking to patients or browsing records. DICOM files are transmitted across the hospital; our information system data isn't, but it does transmit reports with a limited amount of patient data on them. Screw-ups with identification happen too often already (once is too many) and matter too much to have a load of bad data in the system. Abbreviate anything at great risk. We have lots of people with the same name and same date of birth already, so extreme care is needed.

Wow, that first one is great, thanks. I'm saving that for future reference. Japan seems a hotbed for hard database problems. There is no way any of our systems could handle some of that, and it's mostly not our fault - try getting access to change stuff on medical database software or imaging equipment. It isn't possible.

J Strother Moore[1,2]. Ok, it's his first name.

[1] http://en.wikipedia.org/wiki/J_Strother_Moore

[2] http://www.cs.utexas.edu/~moore/

That example came along fast. I didn't expect an Anglo-American example (assumption based on links) and I wonder about the origin? The lack of a period makes it somewhat simpler for system handling, but I wouldn't bet on it sailing through without issue.

My father's middle "name" was a single letter. My grandparents didn't give him a middle name (at least not in English), but the nurse took it upon herself to record what sounded like a middle initial to her, and that's what ended up on his birth certificate. It stood for nothing but the letter itself.

I love it when I get asked that as a security question, only to be told it's invalid (too short). Tell that to Harry S Truman!


Well, that's one way to deal with having to name 10 kids.

Do you think it's impossible that someone's middle name is only one letter long?

Ok, more accurately, people enter names into our database as an initial with a period stuck on. John Andrew Doe becomes Doe^John^A., and the period is not part of their name. I agree that it's possible that is someone's name, but I am 100% certain that every instance I have encountered is incorrect. The various manufacturers' software we deal with either gets confused by middle names or drops them. They also commonly assume that having 2 names in the first name field means that one is a middle name, and drop it. This isn't useful. I have never encountered someone with a single letter name in my workplace (first, last, anything) and so hadn't considered it. I am confident of this as I compare what every person writes on a form as their name with what our system says.

Some people might find their middle name embarrassing, thus choosing to use an initial to keep it secret where it isn't required for anything but disambiguation.

When the middle name avoids confusion with another person and the situation is medical files, it's about as important as it gets. There is also a legal obligation for it to be accurate with some of our government contract work. That said, they don't seem to monitor accuracy. I spent a lot of time monitoring it though (checking, correcting, hounding data entry inaccuracy serial offenders), and it still causes me cold sweats every now and again.

Some people have surnames that are only one letter.

Particularly names in southern regions of India, where traditionally people would have only a single name (no surname) and when a surname is required they might give the first letter of their father's name.

The name of someone is a delicate problem, but all of these are straightforward questions.

There is an "official name" and it's on your registration certificate. That's the one that goes in the 漢字名 field and that's the one that the bank will accept (my registration had both the double-width Latin characters and the katakana in parentheses: no questions asked, they both go, parentheses included). The validation of the kanji name is not done on the exact range of the characters ('is it really a kanji?') but on whether it's double width or not: double-width Latin characters are OK, and you could use emoji and the validation would pass. Half-width katakana would get rejected.
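A width-based check of that kind can be sketched in a few lines of Python. This is only an illustration of the idea, assuming "double width" corresponds to the Unicode East Asian Width classes W (wide) and F (fullwidth); the bank's real rule is not known.

```python
import unicodedata


def is_all_double_width(text: str) -> bool:
    """True when every character is East Asian Wide ('W') or Fullwidth ('F')."""
    return bool(text) and all(
        unicodedata.east_asian_width(ch) in ("W", "F") for ch in text
    )


print(is_all_double_width("田中太郎"))        # True: kanji
print(is_all_double_width("ＭＣＫＥＮＺＩＥ"))  # True: full-width Latin
print(is_all_double_width("\U0001F600"))      # True: an emoji
print(is_all_double_width("ﾀﾅｶ"))             # False: half-width katakana
print(is_all_double_width("MCKENZIE"))        # False: ASCII
```

Note that the check says nothing about script, which is exactly why full-width Latin and emoji sail through a field nominally reserved for kanji.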

Protip: if you have a shitty registration name, have it changed; that's easy and it's for your own good.

They're not straightforward.

> There is an "official name" and it's on your registration certificate.

The thing about official names is that I have so many to choose from! Alien registration certificate? MCKENZIE PATRICK JOHNATHAN. No kana because town hall called up the local immigration authorities and heard "'Nicknames' are not required for the administration of Japanese immigration law and accordingly should not be registered. You should only register him under the exact name printed on his passport." (This is official policy, but many local government authorities ignore it, including half of the clerks at Ogaki. I drew the short straw on my most recent visit though and "had to change.")

Mr. Short Straw did not, however, actually use the name written in my passport, because some genius at the US Passport Control Center thinks Irish people get an extra space in their last names and, after substantial argument with town hall, I was able to convince them that a lifetime of being addressed as Mc-san would be very inconvenient for my wife and I.

But wait there's more! As a result of marriage the McKenzie household finally exists on the books in Japan as a 戸籍, whereas before it was just little ol' me happily residing here as a foreigner. An hour of investigation with a totally different part of the Ministry of Justice later, Town Hall refused to register a 戸籍 with Latin characters, and was actually able to produce an authoritative Least Frequently Asked Questions At Ogaki City Hall internal guidelines document on what to do in the event of international marriages. So my "official" name in that part of the system is different: ミッケンジー、パトリックジョナサン. Mr. Short Straw remarked, direct quote, "Cripes, that seems like an inconvenient name to go around with. Have you considered just changing it? I've got the forms and I'm pretty sure you could be Tanaka Taro by the end of today." (Bonus points: We filed a name change for Ruriko at the same time as getting married, and hers is based on what's written in the 戸籍 and her 住民票, which gives us the wonderful circumstance where "Wife took husband's name after marriage but, important note, their names will still fail naive string compares... well, some of the time, depending on which agency and what data source we're querying.")

But wait there's more! City Hall is my single point of contact for Japanese Social Security, Japanese national insurance, and the Gifu prefectural revenue office. I think I count four different official names there unless one or more decided to change policies recently. Gifu extends its apologies but it is physically incapable of handling sole proprietors with given names which are 7 letters long because, quote, "Who does that to a child?!", so Kalzumeus Software is on the books as being owned by MCKENZIE P.

The decision not to manage "nicknames" (通称名) under the new immigration law because they aren't necessary from an administrative standpoint is illustrative of the disconnect between the people making these laws in Japan and the people that are subject to them. I realize that this is inevitable because foreign residents can't vote, but it's frustrating that the government doesn't seek input from them when formulating new policy that will have large effects on them.

Because of the difficulties in using foreign names with Japanese computer systems and paperwork that you mentioned, 通称名 ("nicknames") are essential for many foreign residents. Some groups have been using them for decades now, so even a cursory attempt to get feedback on the new laws would have identified this problem.

Still, some groups of special permanent residents have organized and successfully overturned some of the more odious aspects of the immigration law, like the fingerprinting requirement for alien registration. In particular, the Korean special permanent resident community has some degree of influence on policy because of their size and organization.

Given the general ignorance of the central government (and the immigration bureaucracy as a whole) towards the real needs of foreign residents, I see this decision as ignorance of the importance of 通称名 rather than an attempt to quash the rights of foreign residents. In my experience the local governments tend to be more sympathetic towards the actual needs of foreign residents, perhaps because they have more prolonged interaction with them. (Though as with every government organ in Japan, the interpretation of the law varies wildly depending on which clerk you interact with.)

Troubles like the ones you describe are a large part of why I registered 長瀬ダニエル as a 通称名 and use it for everything I possibly can.

> I've got the forms and I'm pretty sure you could be Tanaka Taro by the end of today

Now I see where point 39. of your post about names comes from :).

Dealing with names in Japan is really a life experience in itself. Yes, I was referring to the Alien registration certificate.

As you say, there are so many to choose from. I changed 4 times during my stay (at the end I had my name twice in the same field, once in Latin characters and once in katakana, plus a 'nickname' with my wife's family name). BTW it was the best choice so far, even if it's awkward to fill in bank papers with 'SOME ROMAJI NAME (カタカナ名)'. Immigration Office staff really do a shitty job at dealing with the registration, but I had it changed in the prefecture where I lived. They are much more forthcoming, and will accept anything reasonable as a name.

> some genius at the US Passport Control Center thinks Irish people get an extra space in their last names

Just as a point, some people with "Mc" surnames do put a space after the "Mc". Varies from person to person.

Could this have been solved by using "外人第一" as a suffix or instead of the romaji/katakana name?

I once tried to apply for a Japanese credit card online about 10 years ago (certainly things have changed with some banks since then). IIRC the form would not accept romaji and my kana name was too long for the kanji input field.

This was painfully frustrating at the time but helped frame my approach to forms and DB specification when I got into web development (e.g. always using UTF8 in MySQL, full name as a single field in some applications, etc.).

I think the problem you ran into wasn't so much a user interface problem, but a "gaijin aren't our target clientele" thing. 10 years ago a lot of Japanese banks regarded "gaijin without permanent residence" as riskier than a 20-year-old Japanese student; this included gaijin with good credit records and income above average. Things have been changing, though.

I think it's better to paraphrase Hanlon: "Never attribute to malice that which is adequately explained by lack of imagination," or something similar.

1) I suspect Chinese(!) names would have been validated.

2) I successfully applied for the same card via paper form 18 months later with guidance by a sales rep.

As a Norwegian in the UK without any problematic things about my name, I think you don't need to assume malice or lack of imagination, simply that they did not want to handle anything but the braindead "safe" situations online.

I similarly often prefer to apply offline or in person about things, because despite more than a decade here, and great credit history, I occasionally get hit by UK banks assuming that not being on the electoral roll means increased risks (it can mean you're trying to keep your real address out of official registers). They have no problems dealing with me in person, when someone manually reviews the situation, but either they've decided it's not worth the hassle to try to deal with this online, or that it's safer to just point me to a branch or call for extra verification.

Fair enough, but I'm not exactly assuming "malice", I'm just saying that such "lack of imagination" wouldn't happen if foreigners were considered an important target for the banks.

If you're using MySQL, then to get real UTF-8 you need to declare the column with a different charset name, utf8mb4. The charset that MySQL calls plain utf8 only stores up to three bytes per character.

Thanks for this. MySQL 5.5+ it looks like. We're on earlier MySQL versions but will keep this in mind for after we upgrade in the future.
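A quick way to see which names are at risk in the meantime: MySQL's legacy `utf8` charset holds at most three bytes per character, so anything that encodes to four bytes in UTF-8 will not fit. A small Python sketch (the helper names are made up for illustration):

```python
def utf8_len(ch: str) -> int:
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_three_byte_utf8(text: str) -> bool:
    """True if every character fits MySQL's legacy 3-byte `utf8` charset."""
    return all(utf8_len(ch) <= 3 for ch in text)


print(utf8_len("A"))           # 1
print(utf8_len("漢"))          # 3
print(utf8_len("\U00020BB7"))  # 4  (U+20BB7, a kanji variant seen in surnames)
print(fits_in_three_byte_utf8("\U00020BB7田"))  # False
```

Most everyday kanji and kana fit in three bytes, which is why the limitation tends to surface late, on the one customer whose name uses a rarer character.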
