Good article, Patrick. It is so easy to say "oh, those stupid/lazy/naive web-form programmers". But actually this is a fiendish problem to solve correctly - as your article demonstrates.
what's wrong with rejecting no characters and using escaping/encoding to avoid injection problems etc? that catches every possible edge case - even the ones I don't know yet.
the hard problem is validating the data - but you don't actually have to do it.
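A minimal sketch of the "reject nothing, escape at the boundary" idea, assuming a SQL backend. It uses Python's standard `sqlite3` module with a `?` placeholder; the table name `people` and the helper `store_and_fetch` are made up for illustration.

```python
import sqlite3

def store_and_fetch(name: str) -> str:
    """Store a name verbatim via a parameterized query, then read it back."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE people (name TEXT)")
        # The ? placeholder sends the value separately from the SQL text,
        # so quotes and other "special" characters need no manual escaping.
        conn.execute("INSERT INTO people (name) VALUES (?)", (name,))
        row = conn.execute("SELECT name FROM people").fetchone()
        return row[0]
    finally:
        conn.close()

# Hypothetical inputs: an apostrophe, an accent, and an injection attempt.
for candidate in ["O'Neil", "Bánffy", "x'); DROP TABLE people;--"]:
    assert store_and_fetch(candidate) == candidate
```

Every input round-trips unchanged, including the one that looks like an attack, because the database never parses it as SQL.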
1. A web site that demands your mother's maiden name for a security verification question. Of course, that's ridiculously insecure, so I entered my wife's mother's maiden name instead. It turns out that it won't accept names shorter than 4 letters -- which excludes a very large portion of those of Chinese descent.
2. New Jersey has all kinds of regulations governing your identity authentication when you go to renew your driver's license. If your current license mismatches your documents, you really have to jump through hoops. My problem is that my first name is "Christopher" -- a common enough name -- which is too long to fit in their system's 10-character field. So they only put "C S Wuestefeld" on my license, ensuring that I'll always have a mismatch and be subject to their extra verifications.
Whenever possible, I just enter a randomly generated password for those ridiculous "security question" fields. Isn't "your first job" or "your hometown" easily discoverable via my Facebook profile?
Right - the security questions are just a secondary verification method. I simply fill them in with a secondary password.
This does occasionally lead to hilarity when I have to do business over the phone, and they want to use those questions to verify my identity. "What town did you honeymoon in?" "P7qkn1~f"
Situations like that are exactly why I try to avoid using the "~" character in strings that anybody else might have to type in. A lot of people have no idea what "tilde" means, or where the ~ key is located. The same problem crops up with ampersands, carets, and "#", for which nobody can seem to agree on a name.
Some sites have taken to occasionally asking you these verification questions, even under "normal" circumstances when you haven't lost your password. So it's necessary to remember the answers. And it seems that tools like RoboForm aren't yet very good for tracking this.
Amazon handles my apostrophe without drama. One advantage to having an apostrophe in your name is that you can spot when people migrate their databases without paying attention, because the apostrophes tend to double each time. DirecTV is now sending my bill to a Mr. D''''''A…
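One plausible mechanism for the doubling, sketched under the assumption that each migration re-applies SQL-style quote escaping to data that was already escaped. The function names and the name "O'Neil" are hypothetical.

```python
def escape_sql_literal(value: str) -> str:
    """SQL-style escaping: every single quote becomes two."""
    return value.replace("'", "''")

def migrate_badly(value: str, times: int) -> str:
    """Re-escape already-escaped data once per careless migration."""
    for _ in range(times):
        value = escape_sql_literal(value)
    return value

name = "O'Neil"  # hypothetical name with one apostrophe
for n in range(4):
    print(n, migrate_badly(name, n).count("'"))
# prints 0 1, 1 2, 2 4, 3 8 (one pair per line)
```

Under this assumption the count doubles with every pass, so the number of apostrophes hints at how many times the data was moved.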
Some systems will manage to accept the apostrophe properly, and then make life difficult by making up the collation as they go along.
At least 80% of the time someone tries to look me up in an alphabetical list, I have to try to convince them that no, I really am on the list, but that I might be found at the very beginning of the Ds, at the very end of the Ds, in the Ds where I'd be with no apostrophe (depending on how they handle alphabetization of punctuation), or at the top of the Fs (middle initial) or in the As (on lists where the D has been interpreted to be an extra middle initial).
Even better, I often wind up on these lists several times when someone decides that it's easier to enter all my information again rather than look for it in one or two more places.
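The alphabetization problem above can be reproduced in a few lines. This is only a sketch: it compares a raw code-point sort with a sort key that ignores punctuation, and the names and the helper `letters_only_key` are invented for the example.

```python
def letters_only_key(name: str) -> str:
    """Sort key that drops punctuation and case before comparing."""
    return "".join(ch for ch in name.casefold() if ch.isalnum())

names = ["Olsen", "O'Neil", "Owens", "Oakes"]  # hypothetical list

# Raw code-point order: the apostrophe (0x27) sorts before every letter.
print(sorted(names))
# ["O'Neil", 'Oakes', 'Olsen', 'Owens']

# Punctuation ignored: the name files as if it were spelled "Oneil".
print(sorted(names, key=letters_only_key))
# ['Oakes', 'Olsen', "O'Neil", 'Owens']
```

Both orders are defensible, which is exactly why the same person ends up in different places on different lists.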
Credit-card name verification seems to be a bit smarter; every credit card I have has a different variant of my name on it (and a lot of credit-card forms consider an apostrophe 'invalid' in a name even though there's one on my business Visa card), but I've never had a charge rejected because the name was not an exact match.
Reminds me of those websites that force you to have a password that conforms to their overly-specific rules. Here is a real example:
Passwords must be 8 to 12 characters long composed of the following character types:
Uppercase Alpha (A, B, C, etc.)
Lowercase Alpha (a, b, c, etc.)
Numeric (1, 2, 3, etc.) or the following Special Characters (!, @, #, $,*, +,-)
Each password must contain UPPERCASE AND LOWERCASE ALPHA CHARACTERS, and at least one character that is either a Numeric or a Special Character.
So if your password is 13 characters long, it doesn't work. Or if it has letters, numbers and punctuation but no uppercase letters, it doesn't work.
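For reference, here is roughly what that quoted policy looks like as code. This is my reading of the rules as written, not the site's actual implementation; `meets_quoted_policy` is a made-up name.

```python
import string

SPECIALS = set("!@#$*+-")  # the special characters listed in the policy
ALLOWED = set(string.ascii_letters) | set(string.digits) | SPECIALS

def meets_quoted_policy(password: str) -> bool:
    """Check a password against the quoted rules, as I interpret them."""
    return (
        8 <= len(password) <= 12
        and all(ch in ALLOWED for ch in password)
        and any(ch in string.ascii_uppercase for ch in password)
        and any(ch in string.ascii_lowercase for ch in password)
        and any(ch in string.digits or ch in SPECIALS for ch in password)
    )

print(meets_quoted_policy("Abcdefg1"))       # True
print(meets_quoted_policy("Abcdefghijkl1"))  # False: 13 characters
print(meets_quoted_policy("abcdefg1!"))      # False: no uppercase letter
```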
Those are actually very close to the rules for setting up the admin password for Windows Server 2008 during install, except that you're never TOLD those rules during the install. I spent half an hour typing passwords and having Windows tell me that they weren't "complex" enough before having to go and google what the hell it was really after.
Or even worse, they accept your 13-character password and silently truncate it to 12. The next time you sign in, your password is invalid and you have no idea why!
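A sketch of how that truncation bug can arise, assuming the signup form cuts the password to 12 characters but the login form does not. The function names are hypothetical, and the bare SHA-256 is only there to keep the example short; it is not how passwords should be stored.

```python
import hashlib
import hmac

MAX_LEN = 12  # assumed limit, applied at signup only

def _digest(password: str) -> str:
    # Illustration only: real systems should use a salted, slow password hash.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

def register_truncating(password: str) -> str:
    """Signup path: silently drops everything past MAX_LEN."""
    return _digest(password[:MAX_LEN])

def login(password: str, stored: str) -> bool:
    """Login path: hashes exactly what the user typed."""
    return hmac.compare_digest(_digest(password), stored)

stored = register_truncating("correcthorse1")  # 13 characters
print(login("correcthorse1", stored))  # False: the password you chose fails
print(login("correcthorse", stored))   # True: only the truncated one works
```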
I had a similar experience. I'm based in the UK so I have my computer set up to use a UK keyboard layout. One time, I decided to use the # symbol in my Windows password. The next time I tried to log in to Windows, my password was rejected. I eventually discovered that Windows was using a US keyboard layout until I logged in, meaning that # was in a completely different location on the keyboard and I was typing a completely different character. Of course, there's no option to actually see your password while you type it so it took me a long time to realise that I had to type shift-3 to get that character. I'm used to getting £ when I hit shift-3, that being the pound sign that we use over here.
My first name is Patrick, but I go by Paddy, which is a common enough name in Ireland. When I initially signed up for a hotmail account (around 1997/98) I got a message telling me my name was a racially offensive term (or words to that effect). I found it quite funny at the time, so in the end I just registered as 'Patrick' and didn't use paddy in my email address. (No such problems with gmail, which I use now)
Years ago I tried to get a Hotmail account with my real name and had the same experience because of Cumming. Instead I signed up for a Hotmail address with the name Ivana Watch-Teens-Give-Head.
A lot of people really would find that sentence less offensive than "I jacked off in the shower" or "Fuck, you're awesome" simply due to the more decorous phrasing. The fact that your example describes bestiality and my second example is actually an ecstatic compliment doesn't matter to such people — the phrasing itself is what offends them, not the content of the idea.
As a developer, I'm wondering if there is ever any legitimate reason for this. Is it the laziness of properly escaping inputs on the part of the developer? Complying with some standard? Integrating with a legacy system?
My favorite part is when I'm not allowed to use a special character in my password.
It is still a good idea to strip leading/trailing spaces, however. Users may inadvertently include a space when copy and pasting, and some less computer-savvy ones may just include one by accident.
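Trimming is a one-liner in most languages. A minimal Python sketch, with `normalize_field` as a hypothetical helper name; note that it only touches the ends, so a space inside a name survives.

```python
def normalize_field(raw: str) -> str:
    """Trim leading/trailing whitespace; leave everything inside untouched."""
    return raw.strip()

print(repr(normalize_field("  Di Cillo \n")))  # 'Di Cillo'
print(repr(normalize_field("\tSchröder")))     # 'Schröder'
```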
I think it comes from a silly belief that some characters are "dangerous", i.e. the application fails to escape data properly and tries to make up for it by forbidding anything that the author can imagine to be useful in exploits.
Or the author heard that you need to 'sanitize inputs' and equated that with trying to completely validate all incoming input (i.e. trying to figure out if a name is valid/invalid rather than just escaping it properly).
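To make the escaping-versus-validating distinction concrete, here is a small sketch for the HTML case using Python's standard `html.escape`. The function `render_greeting` and the sample inputs are invented for illustration.

```python
import html

def render_greeting(name: str) -> str:
    """Escape at output time instead of rejecting characters at input time."""
    return f"<p>Hello, {html.escape(name)}!</p>"

# The name is stored as typed; only the rendered page sees the escaped form.
print(render_greeting("O'Neil & Sons"))
# <p>Hello, O&#x27;Neil &amp; Sons!</p>

print(render_greeting("<script>alert(1)</script>"))
# <p>Hello, &lt;script&gt;alert(1)&lt;/script&gt;!</p>
```

No judgment about whether the name is "valid" is needed: the browser displays whatever was typed, and nothing is executed.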
Or the PHB saw that you could enter a character that 'no one' would have in their name, and demanded something like Amazon/eBay/<popular website that PHB uses> has on their forms.
I stopped counting the number of times I called up banks in the UK to give them grief over their "security". My favourite pet peeve is Mastercard SecureCode and Verified by Visa. Both allow only a few characters with just letters and numbers. I get furious that my very secure passwords are not allowed.
One bank blamed the system, saying the reason it's doing it this way is because it's "American".
But you get locked out forever after three tries. Consider the PIN to be a convenient way to not have to call the helpdesk every time you want to access your account, not as a key used to secure your data for eternity.
Does Verified by VISA work (or fail to work) the same way everywhere? I thought each bank had to support it independently, so I'm only blaming my bank for requiring me to use an exactly 6-digit number as my VbV password.
My vote would be on stupidity. Maybe there really was a time when getting it right was hard (COBOL with custom proprietary databases?), but it is not hard anymore. And seeing as it really is rather important, there simply is no excuse for not fixing it.
Think about it another way: if they can't get even that right, what does it say about the rest of their IT systems?
Welcome to my world. I have mostly given up on writing "Bánffy" in web forms. I have even given up on pronouncing it right, as nobody in Brazil (except Hungarian expats) can do it in a non-painful - for me to hear - way.
Well... At least your name is valid ASCII... Once entered, it won't be mangled at the database layer.
I was just about to say this, too. I've ended up just splitting up the letter æ in my last name into a and e. Even some Norwegian businesses end up mangling my name, which is really very embarrassing. For them, that is. Æ, Ø, and Å aren't considered ligatures, they're considered separate characters in our alphabet. Worth taking the time not to mangle them, since they're fairly common in names :P
+1 for wishing you had a name in ASCII only. :-/ My last name is Schröder, which some systems can't handle, some systems transcribe to Schroeder, some systems make it Schroder, and some systems handle perfectly. I'm always slightly nervous when booking flight tickets abroad, because there's always some mismatch between what's in my passport, what's in the booking, and what's on my credit card, etc.
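A rough sketch of how a system could treat those spellings as the same name when comparing a booking against a passport. This is an assumption-laden illustration, not an airline standard: the digraph table covers only three German umlauts, and `ascii_variants` and `probably_same_name` are made-up helpers.

```python
import unicodedata

# Assumed transcription table; other languages need other rules.
DIGRAPHS = {"ä": "ae", "ö": "oe", "ü": "ue"}

def strip_diacritics(text: str) -> str:
    """Decompose characters and drop the combining marks (ö -> o)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def ascii_variants(name: str) -> set:
    """Return the spellings different systems might produce for a name."""
    base = unicodedata.normalize("NFC", name).casefold()
    expanded = "".join(DIGRAPHS.get(ch, ch) for ch in base)
    return {base, expanded, strip_diacritics(base)}

def probably_same_name(a: str, b: str) -> bool:
    """True if the two names share at least one plausible spelling."""
    return bool(ascii_variants(a) & ascii_variants(b))

print(probably_same_name("Schröder", "Schroeder"))  # True
print(probably_same_name("Schröder", "SCHRODER"))   # True
print(probably_same_name("Schröder", "Schmidt"))    # False
```

Note that "Schroeder" and "Schroder" do not match each other under this scheme; each only matches the original with the umlaut, which is why the original spelling is worth keeping on file.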
I'm from Sweden and my first name is Mikael, not that uncommon a name. A while ago I saw that a nordic airline had "helpfully" reverse transcribed my first name on the ticket to Mikäl, which I've never seen as a first name. Wonderful :) Fortunately, I didn't have any troubles with the mismatch between the ticket name and the passport name.
I got pissed off with having a home address that different organisations insisted on putting their own spin on, so I can imagine how frustrating it must be when it's your actual name!
Oh, my home address has a "ö" in it as well, and that's also very hard to handle correctly everywhere, but addresses are pretty resilient, if you get my street name wrong on a letter to me, it will most probably be delivered anyway.
These things are especially hard to endure for developers, because they are so obviously stupid. I have two umlauts in my name, basically my name is completely unsuitable for the 21st century.
Yeah, I did the same thing. "Name must contain only ASCII characters" and "must be easily pronounceable in English" were two of my requirements when naming my daughter.
I know a woman whose last name is Null. I remember her telling stories of frequent failed form validation years ago (for failing to enter a last name, of course). I wonder if it's gotten any better as the web has matured.
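My guess at the kind of check that trips on this, next to one that does not. Both functions are hypothetical; the point is only that "no value" and the text "Null" are different things.

```python
from typing import Optional

def is_missing_buggy(last_name: Optional[str]) -> bool:
    """Treats the *text* 'null' as if no value had been entered."""
    return last_name is None or last_name.strip().lower() in {"", "null", "none"}

def is_missing(last_name: Optional[str]) -> bool:
    """Only an absent or blank value counts as missing."""
    return last_name is None or last_name.strip() == ""

print(is_missing_buggy("Null"))  # True: Ms. Null is told to enter a last name
print(is_missing("Null"))        # False: the name is accepted
print(is_missing(None))          # True
```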
I've found having a hyphenated surname and an email address that ends in ".id.au" quite beneficial in determining how much thought a company has put into their computer systems.
I've got a name fairly unfamiliar to many speakers of English, ("Maciej", or "Maciek" in the everyday diminutive form). There aren't any invalid characters in it, but I have on several occasions had it re-written or transcribed by a human being after I've entered it into a form on a website, often to something completely different or garbled (with the implication that I spelled my own name wrong). In particular, my bank did this to me and for about a year and a half I was receiving statements addressed to a "Macie J", where someone had clearly thought the J was part of my middle name (which itself is another tongue twister to English-speaking folks).
No wonder so many immigrants to Canada and the US change their names :)
Happens to everyone. I’ve got a friend called Marc, but inevitably he gets entered on forms as Mark Withersea. Should have learnt by now not to give his name as Marc-with-a-'c' I guess :)
This wasn't really about the fact that it wouldn't accept his name but about the message it gave him, which to me is a problem of letting developers write error messages. Why do we write error messages that sound like robots when spoken?
Maybe that is unfair to robots who speak the way developers programmed them.
Either way it doesn't seem like anything worth getting in a huff over. I don't think it was meant as purposefully degrading or had intent to insult. If anything it just wasn't clear enough; they meant - contains invalid characters for this system. Why take it personally? I don't think whoever wrote the error message was trying to make some sort of statement against "traditional" names.
I feel you. My last name has a space (Di Cillo) and here in the States everyone thinks that Di is my middle name, so now I have a bunch of documents (even credit cards) with my name written Davide D. Cillo... sigh.
I went to lunch once with a few people in Coeur d'Alene, Idaho. They always complained that they couldn't enter their city in address fields. Likewise, though there are many American placenames which are semantically possessive, they usually lack apostrophes, by policy of the United States Board on Geographic Names.
They didn't mean to be rude, but do come across as such, especially for people who are not as computer-savvy. Phrasing things badly can be as rude as a deliberate insult; it signals you don't respect the one you communicate with enough to think things through.
No human working in customer service would get away with saying things like that; computer user interfaces should be held to the same standards.
TL;DR: Here is an error message. It could be worded better, but I understand exactly what it means and where it came from. I'm going to get all offended about it, even though there was clearly no intent to offend.
http://www.kalzumeus.com/2010/06/17/falsehoods-programmers-b...