Stop Validating Email Addresses With Your Complex Regexes

jyap · on Sept 13, 2012

There's got to be a bug because this was on the front page 6 days ago: http://news.ycombinator.com/item?id=4486108

Well the link in the previous post had a trailing slash.

crisnoble · on Sept 13, 2012

So to sum up the comments on the previous submission:

1) Someone did make a fully compliant Regex https://github.com/larb/email_address_validator

2) Lots of people (including the author http://news.ycombinator.com/item?id=4486264) think that you should at least validate that the '@' is somewhere to avoid username / email mixups

3) If your regex fails for foo+tag@something.com lots of people get pissed. I learned you can use that for inbox sorting since foo+tag@something.com goes to foo@something.com.

4) There are other schools of thought that promote validation and telling the user something is wrong like mailcheck.js (https://github.com/Kicksend/mailcheck), but these should not stop users from submitting forms.

Did I miss anything?

ars · on Sept 13, 2012

Yes.

5) Do some basic validation. If it fails, ask the user: Are you sure this email address is correct? If they say yes, then allow it anyway.

This lets you catch all the typos, without annoying people with more complicated email addresses.
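A minimal sketch of that flow, assuming a deliberately loose pattern (something, an "@", something containing a dot). The names `LOOSE_EMAIL` and `check_email` are made up for illustration, and the pattern is nowhere near RFC-compliant, which is the point:

```python
import re

# Deliberately loose: no whitespace, exactly one "@", and a dot in the domain.
# This is a typo catcher, not an RFC-compliant validator.
LOOSE_EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def check_email(address: str, user_confirmed: bool = False) -> str:
    """Return "accept" or "confirm". An address is never rejected outright."""
    if user_confirmed or LOOSE_EMAIL.match(address.strip()):
        return "accept"
    # Looks odd: ask "Are you sure this email address is correct?"
    return "confirm"
```

An address like foo+tag@something.com passes straight through. Something like user@localhost comes back as "confirm", and calling again with `user_confirmed=True` lets it in anyway.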

pbreit · on Sept 13, 2012

That the solution proposed kills conversion and is unnecessary.

terramauthe · on Sept 13, 2012

Funny part is that this submission made it to the front page again... I guess this is a really popular topic?

pdenya · on Sept 13, 2012

It's a really popular headline at least. I'm not impressed with the actual article, but email regexes are a pain.

ry0ohki · on Sept 13, 2012

One suggestion I would add: Let them in right now, and confirm later. Sites that require you to wait for an email, click a validation link etc... have a higher barrier to entry. Sometimes email is slow, sometimes it ends up in spam, etc...

Just let the users in at first to poke around before forcing them to validate their email. After a day or two (or maybe to access certain features) remind them they need to confirm that email they got if they haven't already. Yes, some people may try your product and you won't have their real email address, but the ROI on spamming these people later is probably not worth the initial friction.
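Roughly what I mean, as a sketch only. The `Account` fields, the two-day `GRACE` window, and `needs_reminder` are all assumptions picked for illustration, not anything from a real framework:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed grace period: "a day or two" is taken as two days here.
GRACE = timedelta(days=2)


@dataclass
class Account:
    email: str
    created_at: datetime
    confirmed: bool = False  # flipped once the user clicks the emailed link


def needs_reminder(account: Account, now: datetime) -> bool:
    """True once an unconfirmed account is older than the grace period."""
    return not account.confirmed and now - account.created_at >= GRACE
```

Login works immediately either way; the check only decides when to start nagging, or when to gate the features that really need a working address.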

jere · on Sept 13, 2012

If you've been developing login systems for any length of time, this should be fairly obvious.

Be careful about taking that "it’ll get bounced" attitude too far though. The last time I did so I forgot to trim the email addresses and didn't lowercase them. Failing to trim will probably result in the email going through, but then might cause problems later on when you try to match their login ID to what they enter the next time.

A similar issue arises for case. The local part of an email address is technically case sensitive, but providers don't seem to take advantage of it in practice. Again, the case a user types varies from time to time (I assumed nobody used upper case... it seems silly). And if you switch to case insensitive login IDs down the line, you may have to deal with duplicate accounts (same email but different case).
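What I do now is roughly the following, assuming you have already decided to treat addresses as case insensitive for login purposes. `normalize_email` is just a name I picked:

```python
def normalize_email(raw: str) -> str:
    """Trim surrounding whitespace and lowercase, for use as a login key."""
    return raw.strip().lower()
```

Run it both when the account is created and on every login attempt, so the stored ID and the typed ID always go through the same function.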

Steuard · on Sept 13, 2012

Regarding upper case: I've taken to typing the first part of my work email address as "JensenS". My wife and I work at the same place, and when I wrote it as "jensens" there were honestly people who assumed that address was a plural and went to both of us.

jere · on Sept 13, 2012

>when I wrote it as "jensens" there were honestly people who assumed that address was a plural and went to both of us.

facepalm

ericd · on Sept 13, 2012

Following this advice creates a lot of "I never got my activation email" and "Why won't it let me log in" support emails that are a big pain to deal with. It's pretty important to nip that in the bud, hence immediate email address validation.

egiva · on Sept 13, 2012

I couldn't help thinking while reading this that my main concern with registration systems isn't the complicated Regex as much, but rather the really annoying registry bots that sign up phantom accounts. I'm not a huge Captcha fan, but without something (Recaptcha, ghosted fields, etc) you'll get SPAMMED with tons of fake accounts - and they have valid emails, AND the bots click on the links in the confirmation email automatically. It's really sad.

crisnoble · on Sept 13, 2012

Well now some bots can read captchas too: http://userscripts.org/scripts/show/56989

leeoniya · on Sept 13, 2012

we've been doing full mx lookups and smtp RCPT TO: queries for some time.

beware: yahoo's smtp servers always say addresses are valid...making validation quite pointless.

TomatoTomato · on Sept 13, 2012

What about just checking if a MX record exists?

kingatomic · on Sept 13, 2012

That verifies the domain, but not the actual mailbox.

One possibility would be to look up the MX, then connect to the remote MTA and ask it directly if the mailbox is valid. But that seems a bit like over-engineering.
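For the curious, the second half of that can be sketched with Python's standard `smtplib`. This assumes you already have the MX host (the DNS lookup needs a third-party resolver, so it is left out), and `probe_mailbox`, the `probe@example.com` sender and the `smtp_factory` parameter are hypothetical names of mine. As noted above, some servers answer "valid" for everything, so a True here proves little:

```python
import smtplib


def probe_mailbox(address, mx_host, sender="probe@example.com",
                  smtp_factory=smtplib.SMTP):
    """Ask the MTA at mx_host whether it would accept mail for address.

    Returns True on a 2xx reply to RCPT TO, False on a 5xx reply, and
    None when there is no definite answer (4xx, timeout, refused connection).
    """
    try:
        # The context manager sends QUIT and closes the connection on exit.
        with smtp_factory(mx_host, 25, timeout=10) as smtp:
            smtp.helo()
            smtp.mail(sender)
            code, _message = smtp.rcpt(address)
    except (OSError, smtplib.SMTPException):
        return None  # unreachable or misbehaving server: unknown, not invalid
    if 200 <= code < 300:
        return True
    if 500 <= code < 600:
        return False
    return None
```

The `smtp_factory` argument is only there so the function can be exercised with a fake server object instead of a live connection.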

ars · on Sept 13, 2012

It's also possible that the email server is down at the moment, but will come back up later.

tete · on Sept 14, 2012

HTML5 input fields support it now with type="email" (if you want to do it for the user input).

madprops · on Sept 13, 2012

I just check if there's a @

crisnoble · on Sept 13, 2012

Which the author conceded is a good idea most of the time, on the old submission: http://news.ycombinator.com/item?id=4486264