Stop Validating Email Addresses With Your Complex Regexes (davidcelis.com)
46 points by cyen on Sept 13, 2012 | 20 comments



There's got to be a bug because this was on the front page 6 days ago: http://news.ycombinator.com/item?id=4486108

Well the link in the previous post had a trailing slash.


So to sum up the comments on the previous submission:

1) Someone did make a fully compliant Regex https://github.com/larb/email_address_validator

2) Lots of people (including the author http://news.ycombinator.com/item?id=4486264) think that you should at least validate that the '@' is somewhere to avoid username / email mixups

3) If your regex fails for foo+tag@something.com lots of people get pissed. I learned you can use that for inbox sorting since foo+tag@something.com goes to foo@something.com.

4) There are other schools of thought that promote validation and telling the user something is wrong like mailcheck.js (https://github.com/Kicksend/mailcheck), but these should not stop users from submitting forms.

Did I miss anything?


Yes.

5) Do some basic validation. If it fails, ask the user: Are you sure this email address is correct? If they say yes, then allow it anyway.

This lets you catch all the typos, without annoying people with more complicated email addresses.
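A minimal sketch of that "warn, don't block" flow, assuming the basic check is just "has an @, a non-empty local part, and a dot in the domain". The function names `check_email` and `accept_email` are made up for illustration and are not from the article or the earlier thread.

```python
def check_email(address: str) -> str:
    """Return "ok" or "confirm"; this check never rejects outright."""
    address = address.strip()
    # Split on the last "@" so unusual local parts are left alone.
    local, at, domain = address.rpartition("@")
    looks_plausible = (
        bool(at)
        and bool(local)
        and "." in domain
        and not domain.startswith(".")
        and not domain.endswith(".")
        and " " not in address
    )
    return "ok" if looks_plausible else "confirm"


def accept_email(address: str, user_confirmed: bool = False) -> bool:
    """Accept if the address looks plausible, or if the user insists it is right."""
    return check_email(address) == "ok" or user_confirmed
```

With this, foo+tag@something.com passes straight through, while something like user@localhost only triggers the "are you sure?" prompt and is still accepted once the user says yes.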


That the solution proposed kills conversion and is unnecessary.


Funny part is that this submission made it to the front page again... I guess this is a really popular topic?


It's a really popular headline at least. I'm not impressed with the actual article, but email regexes are a pain.


One suggestion I would add: let them in right now, and confirm later. Sites that require you to wait for an email, click a validation link, etc. have a higher barrier to entry. Sometimes email is slow, sometimes it ends up in spam.

Just let the users in at first to poke around before forcing them to validate their email. After a day or two (or maybe to access certain features) remind them they need to confirm that email they got if they haven't already. Yes, some people may try your product and you won't have their real email address, but the ROI on spamming these people later is probably not worth the initial friction.
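Roughly what that could look like in code. This is only a sketch: the `Account` shape, the two-day `GRACE_PERIOD`, and both helper names are assumptions for illustration, not anything the article prescribes.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Assumed value; the comment above only says "a day or two".
GRACE_PERIOD = timedelta(days=2)


@dataclass
class Account:
    email: str
    created_at: datetime
    confirmed_at: Optional[datetime] = None  # None until the link is clicked


def should_remind(account: Account, now: datetime) -> bool:
    """Nag only unconfirmed accounts that are past the grace period."""
    return account.confirmed_at is None and now - account.created_at >= GRACE_PERIOD


def can_use_restricted_feature(account: Account) -> bool:
    """Gate selected features, not the whole site, on confirmation."""
    return account.confirmed_at is not None
```

Login itself is never blocked here; only the reminder and the gated features depend on `confirmed_at`.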


If you've been developing login systems for any length of time, this should be fairly obvious.

Be careful about taking that "it’ll get bounced" attitude too far though. The last time I did so I forgot to trim the email addresses and didn't lowercase them. Failing to trim will probably result in the email going through, but then might cause problems later on when you try to match their login ID to what they enter the next time.

A similar issue arises for case. Strictly speaking, the local part of an email address (the bit before the @) is case sensitive, but providers don't seem to take advantage of it in practice. Again, the case a user types varies from time to time (I assumed nobody used upper case... it seems silly). And if you switch to case insensitive login IDs down the line, you may have to deal with duplicate accounts (same email but different case).
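For what it's worth, the fix is small. A sketch, assuming you are happy to treat the whole address as case insensitive for login purposes (a policy choice, not something the spec promises); `normalize_login_email` and `find_duplicates` are hypothetical helper names.

```python
from typing import Dict, Iterable, List


def normalize_login_email(raw: str) -> str:
    """Trim surrounding whitespace and lowercase before storing or comparing."""
    return raw.strip().lower()


def find_duplicates(addresses: Iterable[str]) -> Dict[str, List[str]]:
    """Group stored addresses that collapse to the same normalized login ID."""
    groups: Dict[str, List[str]] = {}
    for raw in addresses:
        groups.setdefault(normalize_login_email(raw), []).append(raw)
    # Keep only the keys that more than one stored address maps to.
    return {key: raws for key, raws in groups.items() if len(raws) > 1}
```

Running `find_duplicates` over existing rows before switching to case insensitive login IDs shows which accounts would collide.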


Regarding upper case: I've taken to typing the first part of my work email address as "JensenS". My wife and I work at the same place, and when I wrote it as "jensens" there were honestly people who assumed that address was a plural and went to both of us.


>when I wrote it as "jensens" there were honestly people who assumed that address was a plural and went to both of us.

facepalm


Following this advice creates a lot of "I never got my activation email" and "Why won't it let me log in" support emails that are a big pain to deal with. It's pretty important to nip that in the bud, hence immediate email address validation.


I couldn't help thinking while reading this that my main concern with registration systems isn't so much the complicated regex, but rather the really annoying registration bots that sign up phantom accounts. I'm not a huge Captcha fan, but without something (Recaptcha, ghosted fields, etc.) you'll get SPAMMED with tons of fake accounts - and they have valid emails, AND the bots click on the links in the confirmation email automatically. It's really sad.


Well now some bots can read captchas too: http://userscripts.org/scripts/show/56989


We've been doing full MX lookups and SMTP RCPT TO: queries for some time.

Beware: Yahoo's SMTP servers always say addresses are valid, which makes the validation quite pointless for them.


What about just checking if an MX record exists?


That verifies the domain, but not the actual mailbox.

One possibility would be to look up the MX, then connect to the remote MTA and ask it directly if the mailbox is valid. But that seems a bit like over-engineering.


It's also possible that the email server is down at the moment, but will come back up later.
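For the curious, here is roughly what the "ask the remote MTA" probe discussed above looks like with Python's standard `smtplib`. It is a sketch under assumptions: the MX host is passed in (the standard library has no MX resolver, so that lookup needs a DNS library), the `sender` default is a placeholder, and the three-way result is my own naming. A server that is down or answers with a temporary 4xx code comes back as "unknown" instead of "rejected".

```python
import smtplib


def classify_rcpt_code(code: int) -> str:
    """Map an SMTP reply code for RCPT TO onto a coarse verdict."""
    if 200 <= code < 300:
        return "accepted"  # may still be an accept-everything server
    if 500 <= code < 600:
        return "rejected"  # permanent failure, e.g. no such mailbox
    return "unknown"       # 4xx and anything else: try again later


def probe_mailbox(mx_host: str, address: str,
                  sender: str = "postmaster@example.com",
                  timeout: float = 10.0) -> str:
    """Open an SMTP session and stop after RCPT TO, so no mail is sent."""
    try:
        with smtplib.SMTP(mx_host, 25, timeout=timeout) as server:
            server.helo()
            server.mail(sender)
            code, _message = server.rcpt(address)
    except (OSError, smtplib.SMTPException):
        # Unreachable, timed out, or dropped the connection: not a verdict.
        return "unknown"
    return classify_rcpt_code(code)
```

Note that "accepted" only means the server did not object; as pointed out above for Yahoo, some servers say yes to everything.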


HTML5 input fields support it now via type="email" (if you want to do it for the user input).


I just check if there's a @


Which the author conceded is a good idea most of the time on the old submission: http://news.ycombinator.com/item?id=4486264



