

You Suck (On web apps that mismanage text) - raganwald
http://weblog.raganwald.com/2007/09/you-suck.html

======
jgrahamc
I just tried to sign up for Notifo and wasn't allowed to use my real name
because it has a - in it. So now I'm waiting for that to be fixed.

And don't get me started on how profanity filters consider my last name
profane. Years ago I had a Hotmail account registered to the name Ivana C.
Teens-Give-Head because they wouldn't accept John Graham-Cumming.

Edit: Notifo tells me this will be fixed shortly.

~~~
patio11
We should start a club of people whose names vex software.

I could talk your ear off with how many ways I've broken systems in Japan, but
one of my Asian American friends takes the cake. Her name is not Kim Kim, but
it could be. Apparently some systems check to make sure you don't enter your
first name twice...

~~~
pchristensen
Can't find the reference, but Caterina Fake has complained about her name
being rejected as, well, fake.

~~~
idoh
Penenberg: Fake is real, right?

Fake: Yes. I can't tell you how many times I've booked an air ticket only to
get to the airport and find out they killed my ticket because it goes into the
system and the program tosses a ticket that says "fake" on it. Twice I've gone
to the counter for a KLM flight through Northwest and have been rejected. They
say, "You don't have a ticket." I give them a confirmation and after some
investigation I learn my ticket has been cancelled because the system deleted
it. For a while I couldn't join Facebook because of my last name. During the
registration process I was asked for my real name and when I wrote "Fake" it
rejected me. Finally a friend working for Facebook took care of me.

<http://www.fastcompany.com/blog/adam-penenberg/penenberg-post/flickr-co-founder-caterina-fake-value-viral-loops-exclusive-qa>

------
jcromartie
For any "human" data: trim and escape, and you're done. If you want to
validate it, just ask the party that knows for sure (send an email, run a
transaction, visit the URL).

This includes names, addresses, phone numbers, emails, URLs, CC/account
numbers, user names, passwords (maybe tell them that caps-lock is on or any
other weird keyboard state if you can).

~~~
DougBTX
As long as you wait until you know where the data is going, escaping is always
a good idea.
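
A rough Python sketch of "trim on the way in, escape on the way out". The helper names (`clean_input`, `for_html`, `for_url_component`) and the sample name are made up for illustration; the point is only that the stored value stays as typed and the escaping is chosen per destination.

```python
import html
from urllib.parse import quote


def clean_input(raw: str) -> str:
    """Trim surrounding whitespace; otherwise keep what the person typed."""
    return raw.strip()


def for_html(value: str) -> str:
    """Escape for an HTML text node or a quoted attribute value."""
    return html.escape(value, quote=True)


def for_url_component(value: str) -> str:
    """Percent-encode for use as a single URL path or query component."""
    return quote(value, safe="")


# Hypothetical input: the hyphen and apostrophe are kept, not rejected.
name = clean_input("  Anne-Marie O'Neil  ")
html_name = for_html(name)          # escaped only when rendering HTML
url_name = for_url_component(name)  # escaped only when building a URL
```

For SQL the equivalent step would be a parameterized query rather than hand-rolled escaping, but that depends on the database driver, so it is left out here.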

------
jrockway
Yes! For years I thought _I_ was the crazy one for telling our clients, "We
don't really need to validate names. If they give you the wrong name, it's
their problem; and it's more work for us; and we'll probably fuck it up and
make someone mad." The answer was always "do it anyway, it's what we agreed
on." The client feels like we are screwing them if we make the work easier,
even though the end result is a higher-quality, more usable website. Sigh.

The comment about developers making work for themselves is also spot on. I
answer a lot of programming questions, and the questions are always asked
because the programmer has reached the end of a twisty maze of his own
creation. Turn around, walk, spin around, and try again. You'll find a better
solution.

And oh yeah, I do this all the fucking time. Pick any random github project of
mine, and you'll see 8 revisions of the API before I finally pick one that's
not retarded. Even then. (Side note: I don't change the API after I release.)

Anyway, best rant ever.

~~~
uriel
> The comment about developers making work for themselves is also spot on. I
> answer a lot of programming questions, and the questions are always asked
> because the programmer has reached the end of a twisty maze of his own
> creation. Turn around, walk, spin around, and try again. You'll find a
> better solution.

This deserves to be repeated a thousand times.

How often bad code and bad ideas stick around simply because those who
came up with them can't even _imagine_ doing without them.

I have run into this many times with people that try Plan 9, ' _where is my
pet unix "feature"?!?_ ', guess what? It was not a 'feature' and it causes
untold pain, and that is why it is not in Plan 9.

Just last week somebody was on the Go mailing list asking why there is no
preprocessor! _sigh_

------
IgorPartola
// from appendix B of rfc 3986 (<http://www.ietf.org/rfc/rfc3986.txt>)

'&^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?&'

The above regular expression is meant to match URIs. Since almost anything
can be a URI, the regex also matches almost everything.
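
For anyone who wants to poke at it, here is the pattern as printed in Appendix B of RFC 3986, wrapped in a small Python helper. The group numbers (2, 4, 5, 7, 9) are the ones the RFC assigns to scheme, authority, path, query and fragment; the function name `split_uri` and the sample strings are my own.

```python
import re

# Pattern copied from RFC 3986, Appendix B. Every component is optional,
# so the match always succeeds: this splits a string, it does not validate it.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(text: str) -> dict:
    """Split text into the five generic URI components (None if absent)."""
    m = URI_RE.match(text)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


real = split_uri("http://www.ics.uci.edu/pub/ietf/uri/#Related")
junk = split_uri("not a uri at all")
```

The second call "succeeds" as well: the whole string simply ends up as the path, which is the behaviour described above.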

~~~
patio11
You know you have done too much work with regular expressions when you think
"Hey, wait a second, that can't possibly work" and start trying to debug it in
the Ruby console for 10 minutes prior to realizing "Oh, HN is italicizing it
because of the asterisks it is silently stripping."

------
adnam
I had this booking a flight. The system mangled a hyphenated surname, so the
"pay now" page was wrong. With no way to go back and modify it, we had to
return to the start and try all over again. On the third failed attempt the
clever system had detected a certain interest in our flights, and socked up
the price by $200!

~~~
jrockway
_On the third failed attempt the clever system had detected a certain interest
in our flights, and socked up the price by $200!_

Or the inventory you were trying to buy was no longer available. Seats are not
all the same price; they are divided into "fare buckets" that are usually
lettered. There are very few cheap fares on each flight, more medium fares,
and even more full-fares. You should check the code as you are booking; if the
fare code changed, they ran out of inventory. If the fare code stayed the
same, then they raised the price of that inventory.

Just saying -- it wasn't some conspiracy. Someone just bought them out from
under you. Most airlines let you hold a reservation for a few hours, so if
this ever hits you again, just hold the reservation, call the web services
desk (not the general reservations desk), ask web services to fix your name,
and then continue the ticketing process online.

(This is the procedure for AA, anyway. Dunno about other airlines, as I've
never used them.)

Personally, I always hold, triple-check my plans, and then buy. So I have
never had inventory disappear out from under me, and I have never needed to
change a non-changeable fare :)

------
stan_rogers
Dammit, Reg -- I thought you were back in business when I saw this. Call me a
greedy bastard if you must, but I've been suffering major withdrawal and the
methadone I've found out there just ain't cuttin' it anymore.

~~~
jwinter
He's still publishing, at least as of February 2010:
<http://github.com/raganwald/homoiconic/tree>

------
nollidge
I just went through a website sign-up yesterday, and got my password e-mailed
to me in cleartext. These anti-patterns will always exist.

~~~
arethuza
Hacker News emails passwords in plain text :-)

~~~
IgorPartola
Not if you use your own OpenID.

~~~
jrockway
Pretty sure that if I use an OpenID, Hacker News still emails passwords in
cleartext.

------
raganwald
Meta-comment: The comments on the OP and here are far, far better than the
post itself.

~~~
jrockway
Very interesting. Normally I don't read blog comments, because usually they're
dumb, but these aren't too bad. I think you wrote a rant that everyone can
agree with. We have all been burned by validation before, and we have all been
forced to write it. It's boring and annoying for everyone. (Watching the
clients test the website usually consists of them typing stuff to test the
validation rules. They don't check the spelling, they don't check that it
works as they specified, but they do check that they can't put 999999 as their
zip code. Sigh!)

You also have the right readership -- the people that will disagree with your
post don't even know what a "raganwald" is.

A perfect storm, if you will, for constructive blog comments :)

------
avar
Here's another thing that needs to stop: Asking people for their first name +
surname.

Not every culture has the concept of a surname. If you need to ask people for
their names just do so via _one_ text field.

------
pilif
Way back when, I added code to validate a Swiss ZIP code to an
application, thinking that the 8000 area must be the highest-numbered area.

Of course it is 9000, and ever since, the ZIP code field has been a
non-validated text field in every application I have written :-)

~~~
a-priori
Not only that, but they're only called ZIP codes in the US (this is a peeve of
mine). In Switzerland they're "post codes" ("code postal" in French, don't
know the Swiss German name). They also write addresses in a different order.
As an example, here's the address of a Kebap shop I used to frequent:

    
    
      Avenue de la Sallaz 29
      1010 Lausanne
      Suisse
    

1010 is the post code, identifying La Sallaz (or a part of it?), in Lausanne.
My point? All this stuff is very local and if you want to do it right you
should just ask for addresses free-form and if you need to extract information
from it then you should use a geocoding library (e.g.
<http://geocoder.rubyforge.org/>) to normalize it for you.

~~~
pavlov
My pet peeve is the field labelled "State or province" which can't be left
empty. This is surprisingly common.

Most countries on Earth are not federations; dozens of nations are small
enough that dividing into "provinces" would be meaningless; and even larger
countries that do have regions may not include their names in mailing
addresses.

~~~
jrockway
Clearly the country is "European Union" and the state is "France".

------
tptacek
If the email address was literally "foo+bar@domain", they may not have gone
out of their way to screw you; there are lots of web apps that treat "+" as a
special character, so all they had to do was pass it over another HTTP
connection.
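
A small Python illustration of how that can happen, assuming the address gets pasted into a query string without encoding somewhere along the way (the parameter name `email` is made up). In form-encoded data a literal "+" decodes to a space, so the receiving side sees a different address unless the sender percent-encodes it first.

```python
from urllib.parse import parse_qs, urlencode

address = "foo+bar@domain"

# Naive: concatenate the raw value into a query string.
naive_query = "email=" + address
decoded_naive = parse_qs(naive_query)["email"][0]

# Safer: let urlencode percent-encode the value ("+" becomes "%2B").
safe_query = urlencode({"email": address})
decoded_safe = parse_qs(safe_query)["email"][0]
```

Here `decoded_naive` comes back with a space where the plus was, while `decoded_safe` round-trips unchanged.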

~~~
Terretta
I use these as labels to auto-file registrations in Gmail, and I've seen a few
anti-patterns:

1. Reject at data entry even though they make me type it twice and they're
going to do a round-trip click-to-verify. Too many to name fail this way.

2. Accept at data entry, convert it to a space (+ on URL means space). Pray
that the login routine accepts an email address with a space (it probably
won't). Tirerack fails this way.

3. Accept at data entry, then fail to create the account in other internal
systems. Allow login using the +, but once logged in, data from internal
systems is unavailable, and the portal errors in unusual ways. VMWare fails
this way.

Failing using anti-pattern 1 is preferable to pattern 2 (replace with space
then fail login) or 3 (accept and allow login but fail to interoperate with
other systems internally).

After 45+ emails and calls about anti-pattern 3 over 18 months, VMWare still
hasn't successfully delivered a VMWare license to me. By now they're on
version 3 (a free upgrade if v2 was bought recently) but still haven't
delivered me version 2 or 3. Next email, perhaps I should send them
raganwald's article.

// EDIT: Added Tirerack.

~~~
tptacek
This is pretty close to an argument for writing the extra line of code to
reject email addresses with "plus" characters in them: the front-end team
might not know how the backend team will screw up.

------
tlb
The problem is that the actual spec for validating email addresses is
preposterously long and complex, and can't even be implemented as a regexp
since it requires nested parsing. So everyone just writes
/^\w+@\w+\.[\.\w]+$/ or something lame.

~~~
ominous_prime
No, the problem is people trying to validate email addresses when they
shouldn't. Similar to the credit card name example in TFA, you're trying to
save a call to the MTA, when in reality, a user that doesn't want to be
contacted will enter foo@foo.com, and you have to send it anyway.

~~~
Confusion
You want to validate email addresses, because a surprising number of users are
incapable of typing their email address correctly in one try. Validation saves
a lot of rework.

~~~
ars
What I do is validate and warn if it looks bad, but not prevent.
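
Something like the following Python sketch, where the pattern is deliberately loose and is my own guess at "looks bad", not anything from the email RFCs. The function name `check_email` is hypothetical. It returns the trimmed address plus a list of warnings for the UI to show, and never rejects anything itself.

```python
import re

# Loose heuristic: something, an @, something with at least one dot.
# This is an assumption about what "looks like a typo", not a spec.
LOOKS_PLAUSIBLE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_email(raw: str) -> tuple:
    """Return (address, warnings). Warnings are advisory, never blocking."""
    address = raw.strip()
    warnings = []
    if not LOOKS_PLAUSIBLE.match(address):
        warnings.append("This doesn't look like a typical email address. Is it correct?")
    return address, warnings


ok_address, ok_warnings = check_email("  foo+bar@example.com ")
odd_address, odd_warnings = check_email("foo@localhost")
```

Both addresses are accepted; the second one just comes back with a warning the user can dismiss. The real check is still the confirmation email.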

------
shalinmangar
I cannot include my middle name on my twitter profile because they have a
limit on how long a name can be.

------
qeorge
Ragan is totally right about this being wrong and needing to be fixed.

However, the idea that websites should use the bank's payment gateway for
validation is misguided. Your fees will be increased (or your account will be
suspended) on many payment gateways if you do this.

~~~
raganwald
I think it depends on what you're validating. If you are trying to validate a
name, you had better get it right! For example, my Visa still says Reginald
Braithwaite-Lee, but I might register on your site as Reg Braithwaite. Did I
misspell my name or is that what's on my Visa? Is the hyphen a typo?

OTOH, some things seem to be more certain, like rules about check sums. I like
the approach suggested by many folks: Use JS to validate on the client, and
put up a "Are you really, really sure?" message for things that seem unusual
like a single name or a funny character.
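
As a sketch of the check-sum case, assuming the check sum in question is the Luhn check used on payment card numbers: the Python below (the comment suggests JS on the client; the logic is the same) returns True or False so the caller can show an "are you sure?" prompt rather than a hard rejection. The function name `luhn_ok` is my own, and the sample number is the widely used Visa test number, not a real card.

```python
def luhn_ok(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn check.

    Spaces, hyphens and other non-digit characters are ignored.
    """
    digits = [int(ch) for ch in number if ch in "0123456789"]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9  # same as adding the two digits of the product
        total += digit
    return total % 10 == 0


test_number_ok = luhn_ok("4111 1111 1111 1111")
typo_detected = not luhn_ok("4111 1111 1111 1112")
```

A failed check here almost always means a typo, which is why it is a safer thing to flag than a name.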

------
Tichy
I wonder how much you could get out of their support as an apology, i.e. free
flights and so on? I never really try that, but I hear some people extract
bonus offers routinely.

~~~
raganwald
REDDIT MODE=ON

Apology? Mwahahaha! What happened was that they sent me an email saying the
electronic ticket would follow by email within 24 hours. This was on a Friday
for a Monday morning flight at 8:00. When no ticket arrived by Sunday morning,
I called them to discover their office was closed. I called again at 5AM on
Monday morning and they were still closed, so I waited until their office
opened at 8:00.

Wrong move. They charged me for the flight, saying that even though they
promised an e-ticket and didn't send it, and even though their offices were
closed, I should have shlepped out to the airport where the airline would have
resolved my problem. I appealed to Visa but lost.

~~~
jrockway
This reminds me -- I never look for my e-tickets in my email. I just show up
at the airport and print it there.

I have never _needed_ email to fly on an airplane.

------
gnubardt
Isn't the surefire way of validating an email address to connect to the mail
server and see if the address exists?

~~~
QE2
Many mail daemons will "accept" mail no matter who the recipient is.
Internally, unmatched mail may be discarded or forwarded to a "catch-all"
account, but all the sender sees is "Recipient ok".

I think this started as a response to spam bots that used to "RCPT TO:" random
strings and save a list of valid addresses.

~~~
vsync
Pretty sure that's a MUST NOT in the RFC

------
TrevorBurnham
In the era of server-side JavaScript, there's no excuse for not using the
exact same validation code everywhere.

------
msie
...and that's why I have a simple email address: only letters and numbers
(plus the @ sign).

------
fnid2
tl;dr, couldn't get past the absurdity. Did this post have a point?

~~~
raganwald
This was not one of my best, so I forgive your snarky tl;dr :-)

Here's an even longer but (IMO) much better article about much the same thing:
<http://weblog.raganwald.com/2007/09/we-have-lost-control-of-apparatus.html>

