
Email address validation: please stop - sadiq
http://blog.sinjakli.co.uk/2011/02/13/email-address-validation-please-stop/
======
georgecmu
What's even worse is when they have different validations in different places.
I've had ticketmaster accept my name+tm@gmail.com email at sign up, only to
never let me log in after that first time.

~~~
raganwald
I ran into this with a flight reservation system. I lost hundreds of dollars
and missed my flight when the front end accepted my email address but the back
end didn't send me a confirmation.

A plausible explanation given to me was that the web application worked just
fine, but it sent the data to some COBOL-ish back end written long before the
advent of web bookings, and the integration code mangled my email address when
it stuffed it into some data field that was never intended to hold an email
address.

This is why any real-world system needs end-to-end integration testing and not
just unit testing :-)

------
retube
I recently changed our email validation: all I do now is check for the
existence of an "@".

~~~
Torn
Then why validate at all? Just send the user an email with an activation link.
If it doesn't work and they're still logged into the site (with a limited
account), then let them change their email address.
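
A rough sketch of that activation-link flow in Python. The `pending` dict, the function names, and the example.com URL are placeholders invented for illustration, not anything from the article:

```python
import secrets

# In-memory stand-in for a real datastore (an assumption for this sketch).
pending = {}

def start_activation(email: str, base_url: str = "https://example.com/activate") -> str:
    """Store an unguessable token for the address and return the link to email out."""
    token = secrets.token_urlsafe(32)
    pending[token] = email
    return f"{base_url}?token={token}"

def activate(token: str):
    """Return the confirmed address, or None if the token is unknown or already used."""
    return pending.pop(token, None)
```

The address is never pattern-matched here; it only counts as confirmed once the emailed token comes back.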

~~~
eli
If you want your site to have a "login as guest" type feature that's one
thing, but if an email address doesn't have an @ it will never work and it
would be misleading to encourage people to check their email for a validation
link.

People sometimes misread labels and enter their name on the line for their
address. This would stop that.

~~~
retube
Yeah exactly. It's more to catch wrong stuff entered in wrong field.

~~~
muitocomplicado
I had people write www.hotmail.com in the email field in the past.

------
furyg3
The "+" feature of gmail is great, but I hesitate to use it after some weird
validation problems I've had. I've stopped asking that people validate
properly, and started hoping that they a) don't validate or b) fail
gracefully.

One (very important) site properly validated my "+" email address on the front
end (gave me no errors), but the backend failed and I never received the
required confirmation email... all resulting in a customer service call.
Arggg.

~~~
T-hawk
Shouldn't the spammers have figured out the "+" feature by now? Just remove it
and the suffix to get a valid address, for gmail or any other provider that
uses the syntax.

return Regex.replace(email, "^([^+@]*)[^@]*(@.*)$", "$1$2")
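
A minimal Python sketch of the same idea, assuming the provider treats everything from the first "+" up to the "@" as a throwaway tag. The name `strip_plus_tag` is made up for illustration:

```python
import re

# Hypothetical helper: drop a "+tag" suffix from the local part.
# Assumes an unquoted local part, so the first "@" separates the domain.
_PLUS_TAG = re.compile(r"^([^+@]*)\+[^@]*(@.*)$")

def strip_plus_tag(email: str) -> str:
    """Return the address without its "+tag"; addresses with no "+" pass through unchanged."""
    return _PLUS_TAG.sub(r"\1\2", email)
```

For example, `strip_plus_tag("name+tm@gmail.com")` gives `"name@gmail.com"`.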

~~~
muitocomplicado
I don't see this feature as a way to fight spam, but to make it easier to
label incoming mail by using the suffixes. As you said, it's easy to bypass it
with some simple find and replace.

------
citricsquid
I like the idea of having validation but when the email doesn't match your
pattern, give the user a warning that says "sorry, we don't think this is
correct" but allow them to continue if they think it's legit, then have them
click a link to validate so an incorrect email serves 0 purpose for them.
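
Something like this, as a hedged Python sketch. The loose pattern and the `user_confirmed` flag are assumptions chosen for illustration:

```python
import re
from typing import Optional, Tuple

# Deliberately loose: something, "@", something, ".", something.
_LOOKS_LIKE_EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

WARNING = "Sorry, we don't think this is correct. Submit again to use it anyway."

def check_email(email: str, user_confirmed: bool = False) -> Tuple[bool, Optional[str]]:
    """Return (accepted, warning). A pattern mismatch only warns; it never blocks
    a user who confirms that the address is right."""
    if _LOOKS_LIKE_EMAIL.match(email) or user_confirmed:
        return True, None
    return False, WARNING
```

The first submission of an odd-looking address comes back with the warning; resubmitting with `user_confirmed=True` lets it through to the activation email.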

~~~
brown9-2
The point here is that "the pattern" used by developers is often grossly
incorrect. It'd be better to not even attempt to enforce any pattern.

~~~
patio11
Optimize for a few hundred spam obsessed power users, or, prevent a major
cause of the #2 most common CS complaint at many businesses. This does not
take much pondering.

P.S. Trivially A/B testable at high volumes if your CS infrastructure is
capturing sufficient data.

~~~
brown9-2
Which part of this is the #2 complaint - users making a mistake entering in
their email address? Isn't this why you ask them to confirm it by sending them
an activation email?

~~~
patio11
The activation email isn't a panacea for users fumble fingering (or
misremembering, or not knowing) their email address. Users who don't receive
it will either a) ignore it if you let them use the application anyway or b)
frequently bounce hard if they don't get it, because they assume naturally
that your Googles are broken.

The single most compelling reason to send people activation emails is -- I kid
you not -- to remind them that they signed up for your website and how to get
back to your website. A secondary consideration is not proving that they got
their inbox right but proving that they didn't get someone else's inbox wrong.

------
beaumartinez
Some people, when confronted with a problem, think "I know, I'll use regular
expressions!". Now they have two problems.

------
DevX101
While we're on the topic of emails, does anyone have any anecdotes or data on
how often users will click activation links if I log them in after
registration?

I always hated having to log in to my email after signing up, so I just create
an account and login users without any upfront verification.

My email to the user says I will disable accounts that are not activated in 4
days, but it's just a bluff :)

~~~
jacobolus
I sometimes put my email down as foobar@example.com, so I’m definitely not
going to click your activation links. That’s probably not representative of
most users though.

~~~
eli
You'd be surprised how many people supply bogus email addresses for something
they ostensibly actually would like emailed. Like the To field on "tell a
friend" or the signup box for an email newsletter.

------
uptown
Ironic ... I tried to leave a comment on his blog with a random email address,
but received this message:

"Error: please enter a valid email address."

------
caryme
This is truly annoying and makes me hesitate to use the '+' feature. The
problem I've had is when I get manually subscribed to an email list (like when
giving my email on paper or being added after sending a senator an email) and
then cannot unsubscribe due to validation failures.

------
eli
There are two separate issues going on here.

One: validating addresses to catch typos. A common example is typing a comma
instead of a dot or typing just a username instead of a whole email address.
Flagging these errors is a good thing.

Two: some developers believe that they can make people enter _real_ email
addresses by being very clever about only accepting strings that _look_ like
real email addresses. This is stupid, doesn't work, and often blocks
legitimate addresses.

~~~
pbhjpbhj
Re "Two", if you choose an email address that doesn't look like an email
address and it gets blocked then I'm not sure that it is the developer [alone]
who is being stupid.

~~~
eli
I'm not talking about escaped @ symbols here (that's bonkers). There is still
plenty of code out there that assumes a domain suffix is only ever 2 or 3
letters long and that usernames are only letters and numbers.
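
To make that concrete, here is an illustrative Python sketch of such a pattern. `NAIVE` is written for this example and is not taken from any particular codebase:

```python
import re

# A deliberately naive pattern of the kind described above (illustrative only):
# letters-and-digits username, and a domain suffix of exactly 2 or 3 letters.
NAIVE = re.compile(r"^[A-Za-z0-9]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,3}$")

def naive_accepts(email: str) -> bool:
    return NAIVE.match(email) is not None

samples = [
    "name@gmail.com",
    "name+tm@gmail.com",       # "+" in the username
    "first.last@example.com",  # "." in the username
    "curator@example.museum",  # suffix longer than 3 letters
]
rejected = [s for s in samples if not naive_accepts(s)]
```

Only the first sample passes; the other three are ordinary-looking addresses that this kind of check turns away.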

------
perlgeek
Please stop... collecting email addresses you don't really need.

When I participate in some kind of online community, I want to choose if I
receive emails from them at all. And if not, it should be my choice if I
provide any email address at all.

I have a small site where you can participate anonymously or log in, and when
you create an account it's your choice if you provide an email address at all.
If not, and you lose your password, you're out of luck.

~~~
ChuckMcM
True, although I'm a fan of 'tiered' services since robots (spammers, trolls,
and others) also participate, I'd like a way of saying "this is a real
person".

If you're looking for a startup idea how about a service that creates an
anonymous ID (to me anyway) where the user provides that id to me, I send it
to a service and get back a 'reputation' bit which says if you're a good guy
or a bad guy (person, whatever). And a way to report you've not been
cooperating so that others can benefit.

Ebay reputation model but nominally anonymous. (at some point in some server
somewhere there will be a way to link token a to token b but I'm totally ok if
it can't be resolved into an actual person.)

------
PaulHoule
I dunno. I know a guy who bought a list of 750,000 e-mail addresses from a
shady source and was dismayed to discover that many of them didn't even have @
signs in them... ;-)

------
glenjamin
I tend to rely on <http://www.regular-expressions.info/email.html> when coming
to validate an email address.

I do often fall into the trap of trusting the framework's built-in email
validation to be correct.

Apparently, this is the regex to match RFC2822:

(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

~~~
retube
The problem with matching against the fully fledged RFC compliant regex is
that not all email addresses are RFC compliant. As I indicated in my comment
above, I've abandoned trying to "correctly" or "completely" validate email
addresses. There's only one thing certain in an email address: it contains the
"@" character.

~~~
metageek
Can you give an example of a noncompliant address that actually works?

~~~
nourishingvoid
One cellular phone company in Japan used to allow people to register e-mail
addresses with two periods in a row before the @ character. I have seen some
addresses like this in the wild, but the decision to mark these as valid or
not for an app depends on the domain you're working in. At my old job making
web apps for Japanese companies, programmers would usually allow these types
of addresses if we were making a mobile site.

------
d99kris
A loosely related anecdote:

I was registering a general purpose domain name a couple of years back and
asked a friend if he had any input on a good short name. He replied "nope",
and being a Swede I registered nope.se.

There was a time I still forwarded all nope [at] nope [dot] se emails to my
primary email, but as it turned out (not that unexpected), this was an address
frequently used by Swedes to register "anonymously".

Anyway, it was an interesting/alternative way of keeping track of popularity
of new communities etc. Clearly not all users expected that they had to verify
their email addresses.

------
trustfundbaby
My fandango account is completely irrecoverable because of this ... I turned
my name@gmail.com address into name+fandango@gmail.com and was able to login
without any trouble, but of course when I went back to login months later, I
had completely forgotten about it, so I couldn't get into my account (kept
putting in name@gmail.com and couldn't figure out what the problem was)

Naturally I reset my password, and the temporary password arrives in my
account, but when I went to put in the new password, it puked on the email I
was using ... then I remembered what I had done, but till today whenever I go
to put in the new password with the correct login (name+fandango@gmail.com)
... I keep getting sent back to the reset password screen, over, and over and
over.

It knows it's me, because my credentials (Hi xxxx) are displayed in the top
right hand corner, but it simply won't reset my password correctly.

Fandango support is well ... worse than useless.

It's all very maddening ... an account I've had for lord-knows-how-long
containing my entire theater going experience, inaccessible. That's what I get
for trying to be clever.

------
Tichy
On the other hand, maybe the specification for email addresses is too loose.

~~~
petenixey
Too loose for what though? To make it more useful as a communication format or
to make it easier for developers to validate it? It's hard to believe that a
tighter spec could have improved the former.

~~~
tomjen3
Well for starters, why do you need to put comments in your email? That can't
be used to make it more useful as a communications format.

~~~
petenixey
I'm afraid I don't understand your question

~~~
thwarted
Then you are unfamiliar with email address formats, which have a specific
formatting that allows for "comments", and it is kind of hairy.

------
thirsteh
Here's an example of RFC 2822 using RegEx in case HackerNews comments filter
out some of the symbols: <http://bit.ly/g1uFMz>

~~~
pbhjpbhj
I note that you apparently put the source for the regex in there -
<http://tools.ietf.org/html/rfc2822>?

~~~
thirsteh
That's the RFC it follows, yes. I found out about the regex here:
<http://www.regular-expressions.info/email.html>

The author does say you shouldn't use it -- it's a crazy regular expression
after all -- but it IS the RFC :D

------
Sinjo
As a little addendum to this piece, I now realise that WordPress does exactly
the kind of horrible validation I was talking about in the article, and
apologise for it.

------
adrahon
I still think you should validate, but instead of rejecting "incorrect" email
addresses, just ask the user to check if there's no mistake.

------
Sinjo
I wrote a little follow up to the article, covering some of the points
mentioned here and on reddit:
<http://blog.sinjakli.co.uk/2011/02/15/email-address-validation-an-addendum/>

------
AllahJesus
I think that recaptcha is just a better idea in regards to this. Yes, it's an
extra step, but it does two things: a. verify that the person is in fact a
person and b. cancel out spam bots, because of the need for the spam bots to
be able to read the image, which is almost always impossible to fake.

This way, email validation is not even important anymore to avoid spam.

Using part of what was suggested in your post, if we do both, use recaptcha
and send an email validation link before sending any emails, we avoid spam to
our servers and to the people from us and we save everybody a little bit of
time. :)

The next issue arises with email delivery. How do we then ensure that our
validation emails don't get filed as spam? Because if the user never sees it,
then it becomes a hassle for them and chances are, unless they really, really
wanted access to our site, they're not going to spend time contacting us to help
them with validation so that they can login or otherwise...

------
pschlump
Interestingly enough - I subscribe to an email list that has an email address
that fails to validate at google. Most irritating.

------
ChuckMcM
"If you want to know that you’re being given a valid address, send it an email
and have the user click a validation link in it, and stop annoying your
users!"

Epic fail. It's this sort of approach that ends up resulting in cross site
scripting bugs. Oh just take whatever the user typed in, and send it to the
server they told me to send it to. Boom!

The perl code is perfectly reasonable for validating RFC compliant addresses.

------
aneth
I never understood the point of enforcing the spec for user input. Even if
done properly it may reject some working, but invalid email addresses. And it
does nothing to increase your chances of getting a good email address. Your
user is either willing to give you their real address or not. If they are
willing, validating fully does not protect against typos and if they are not
you will get a well formatted fake address - validating to spec serves no
purpose and possibly harms. Just don't do it.

Check for an @ sign and possibly a top level domain name (at least one dot)
and be done with it.
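
In Python that check might look roughly like this. The name `looks_deliverable` is made up, and requiring a dot is the "possibly" part: it will turn away dotless domains such as user@localhost:

```python
def looks_deliverable(email: str) -> bool:
    """Loose check: text before the last "@", and a dotted domain after it."""
    local, sep, domain = email.rpartition("@")
    if not sep or not local:
        return False
    # At least one dot, with a non-empty label on each side of every dot.
    return "." in domain and all(domain.split("."))
```

It catches a bare name or "www.hotmail.com" typed into the email field, and makes no attempt to judge anything else about the address.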

~~~
JoachimSchipper
Some _really_ broken/misconfigured mailer software may still accept the
'foo@bar@qux' syntax (route mail for user foo@bar via qux, or the other way
round - I forgot which, since no sane system has implemented this since the
word 'spam' came to mean bad e-mail.)

So there may, theoretically, be some value in checking for the presence of
exactly one @.

~~~
ZoFreX
@'s are fine as long as they are escaped properly!

------
cynoclast
You should first ask yourself if you really need an email address. And see if
you can get away with not having one.

Requiring that the user have an email on file is not as necessary a
requirement as a lot of people seem to think. It seems like half the time or
more, they just want to spam it anyway.

~~~
eli
Honest question: how do you handle the extremely common case of a lost
password without an email address?

~~~
torme
As someone else mentioned in the thread, make the email address optional, with
the understanding that without it they can't recover lost passwords.

This is probably an unacceptable solution for some sites, but I can definitely
think of cases where the only reason you need email is for password recovery,
and account loss isn't the end of the world.

------
soulclap
While I agree that some kinds of validation are 'too eager' and annoying, just
use a 'legit' e-mail address, ffs.

By including super-special characters and whatever extra features GMail or
whoever provides, you're just asking for it, sorry.

Especially if you're a coder yourself, you can already assume that even if it
passes the initial validation, it probably won't be properly stored or escaped
when the actual mail is sent, when you try to log in with your address later,
etc.

~~~
EliRivers
The relevant RFCs make it clear what is a correct eMail address. Why should we
have to put up with lazy or incompetent coders who can't be bothered to meet the
standard?

~~~
retube
Yeah but not all email addresses are RFC compliant. Plenty of mail servers
accept, or can be configured to accept, non-compliant addresses.

~~~
frobozz
That's not the issue at hand - the grandparent comment advocates that you
shouldn't even expect a service to accept RFC-Compliant addresses.

Validating against the RFC is more, not less permissive than the position held
by the grandparent comment.

~~~
soulclap
Well, just go ahead and try signing up at facebook with {^|~!}@gmail.com then.

~~~
EliRivers
So because Facebook don't know what counts as a valid eMail address, everyone
else has to adapt to them?

~~~
T-hawk
When that's the behavior of the 800-pound gorilla, then the answer is yes. IE6
didn't know what counts as valid DOM or CSS but everybody else adapted to it.

