To anyone that sees this email validation regex, DO NOT USE IT. Hope that was cl...

robomartin · on Jan 3, 2013

>Use something like `^([^\s]*)@([^\s]*\.[^\s]*)$` which will do most of the work for you

I don't understand. How does this expression do anything even remotely close to email validation?

For example, how does it tell you that:

    These are valid:
      test@nasa.gov
      ~~~~@nasa.gov
      joe+sometext@nasa.gov
      test@bbc.co.uk

and that:

    These are NOT valid
      test@example.com    (no MX RR)
      test@-nasa.gov
      test"@nasa.gov
      test@nasa.gov-
      test         
      test@nasa.rockets
      test@bbc.co..uk
      test@bbc.com.uk
      test@bbc.co.eu.uk

You'd have to write all the validation logic yourself all over again. And that's just a few examples.
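To make that concrete, here is a minimal sketch (Python, picked only for illustration) that runs the quoted expression over the addresses listed above. The names `PERMISSIVE`, `SAMPLES` and `accepted` are made up for this example.

```python
import re

# The expression quoted above, written with ordinary regex syntax.
PERMISSIVE = re.compile(r"^([^\s]*)@([^\s]*\.[^\s]*)$")

# The addresses from the two lists above, valid ones first.
SAMPLES = [
    "test@nasa.gov",
    "~~~~@nasa.gov",
    "joe+sometext@nasa.gov",
    "test@bbc.co.uk",
    "test@example.com",
    "test@-nasa.gov",
    'test"@nasa.gov',
    "test@nasa.gov-",
    "test",
    "test@nasa.rockets",
    "test@bbc.co..uk",
    "test@bbc.com.uk",
    "test@bbc.co.eu.uk",
]


def accepted(address: str) -> bool:
    """Return True if the permissive expression matches the address."""
    return PERMISSIVE.match(address) is not None


if __name__ == "__main__":
    for address in SAMPLES:
        print(f"{address!r:28} accepted={accepted(address)}")
```

Every sample except the bare `test` is accepted, so the expression cannot tell the two lists apart.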

Barring anything else, the RFC822 expression isn't so bad that someone should replace it with the kind of thing you are suggesting.

Sorry if I don't see it.

rawb92 · on Jan 3, 2013

Honestly I don't understand why people get all flustered over email validation. I would probably use something along the lines of this just to check that the email address looks like name@domain.com; obviously this could do with a little tweaking.

The best way to validate an email address is to send an email to whatever address is supplied to you. If it is a real address, the user will receive the email and it will be validated; if not, their account or query will go unused or unanswered, and that will be down to them.
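A rough sketch of that confirm-by-sending flow, in Python for illustration. The in-memory `pending` and `confirmed` stores and both function names are hypothetical, and actually mailing the link is left out.

```python
import secrets

# Hypothetical in-memory stores; a real site would use its database.
pending = {}       # token -> address awaiting confirmation
confirmed = set()  # addresses whose owner clicked the link


def start_confirmation(address: str) -> str:
    """Create a one-time token to embed in the link mailed to the address."""
    token = secrets.token_urlsafe(32)
    pending[token] = address
    return token


def confirm(token: str) -> bool:
    """Mark the address as verified if the token from the link is known."""
    address = pending.pop(token, None)
    if address is None:
        return False
    confirmed.add(address)
    return True
```

An address only counts as validated once `confirm` has been called with the token that was mailed to it; a mistyped address simply never gets confirmed.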

robomartin · on Jan 3, 2013

> Honestly I don't understand why people get all flustered over email validation

Multiple reasons, and, yes, context is important.

Landing Page: You have one, and ONLY ONE, opportunity to capture a potential new customer's contact info. If they make a mistake entering their email and you didn't catch it, you'll lose them forever. You can't send an email to let them know they entered two periods by mistake, can you? They are gone and you screwed up.

Every single potential customer is sacred. Thou shalt not lose them by being careless.

Forum signup: In general terms, if someone is visiting a forum it probably means that they want to sign up. In this case, it is OK to make them enter their address twice, make sure the two entries match, and send them a confirmation email. If something went wrong, they'll probably discover it when they try to log on later, and re-register.

While I said "that's OK", I also think it is bad form not to do at least enough validation of all input data, including email, to catch innocent mistakes. I think people who are against email validation might hold that position because they don't understand it, or because they got bitten by a crappy regex and that was that.
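As a sketch of what "enough validation to catch innocent mistakes" could look like (Python, for illustration only): the function name and the messages are invented here, and this is deliberately not the full RFC grammar. It assumes plain unquoted addresses, so it would also flag consecutive periods inside a quoted local part.

```python
def obvious_typos(address: str) -> list[str]:
    """Return human-readable problems; an empty list means none were spotted."""
    problems = []
    if any(ch.isspace() for ch in address):
        problems.append("contains whitespace")
    if address.count("@") != 1:
        # Without exactly one @ the remaining checks make no sense.
        problems.append("needs exactly one @")
        return problems
    local, domain = address.split("@")
    if not local:
        problems.append("nothing before the @")
    if ".." in address:
        problems.append("two periods in a row")
    labels = domain.split(".")
    if len(labels) < 2:
        problems.append("domain has no period")
    if any(label.startswith("-") or label.endswith("-") for label in labels):
        problems.append("domain label starts or ends with a hyphen")
    return problems
```

The point is that the visitor sees the message while still on the form, for example `obvious_typos("test@bbc.co..uk")` reports the doubled period, instead of never hearing from you.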

Now your forum sign-up user is angry because they have to enter all of their information again and go through the process one more time. Who knows, they might make a mistake once again. While I don't have any data to back this up I would venture to guess that the drop-off rate for making a visitor enter all of their data multiple times is significant.

Payment Confirmation: Must check as much as you can.

From my vantage point taking ANY action that might lose or annoy a visitor is simply --to be kind-- bad programming. There's no excuse for that in my book.

riquito · on Jan 3, 2013

It's a good approach because it doesn't filter out grammatically correct e-mails while blocking the blatantly false.

About the invalid cases, who cares? It's not a problem, you must send an e-mail to check for correctness either way:

your user may

* wrongly type his e-mail e.g. bil.gates@microsoft.com

* write on purpose a valid e-mail of another person e.g. yourbestenemy@gmail.com

* write a grammatically valid but nonexistent address

* forget how to access his own e-mail address

You must always send a mail to confirm that the address is valid, so if you have some false positives there is no harm, and it's faster to validate too.

robomartin · on Jan 3, 2013

> It's a good approach because it doesn't filter out grammatically correct e-mails while blocking the blatantly false.

Sorry, that's not a good reason to use this. If you use the correct approach you will NOT filter out syntactically correct emails and you WILL catch all invalid addresses that can be detected syntactically.

It just isn't a good idea to use this expression in place of the RFC822 expression. And, keep in mind, I am not a fan of the RFC822 expression.

With regard to your other scenarios, please read my reply to "rawb92" here:

http://news.ycombinator.com/item?id=5003032

In a nutshell, if someone enters a malformed email address by mistake and you don't catch it, it's game over. What are you going to do? Send them an email?

The spammers and tricksters will always exist. You'll have to decide how to deal with them yourself. I mean stuff like someone attempting to sign up their buddy to a porn site. That's got nothing to do with email validity; that's a matter of identification, and, yes, in that case the first line of defense is to send out a confirmation email.

godDLL · on Jan 16, 2013

This is an interesting sample. Can I see the rest of your data?

TeMPOraL · on Jan 3, 2013

> then check second group for common domain typos, and what have you.

You can use something like Mailcheck.js [0] for that client-side; it'll help weed out a lot of domain typos.

[0] - https://github.com/kicksend/mailcheck
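
For anyone who wants the same idea server-side, here is a small sketch in Python. To be clear, this is not Mailcheck's API or algorithm; it swaps in the standard library's `difflib` for the fuzzy matching, and `COMMON_DOMAINS`, `suggest_domain` and the `0.8` cutoff are assumptions made for the example.

```python
import difflib

# Hypothetical short list; a real deployment would tune this to its users.
COMMON_DOMAINS = ["gmail.com", "yahoo.com", "hotmail.com", "outlook.com", "aol.com"]


def suggest_domain(address: str, cutoff: float = 0.8):
    """Return a corrected address if the domain looks like a near miss, else None."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return None  # nothing sensible to suggest without both parts
    domain = domain.lower()
    if domain in COMMON_DOMAINS:
        return None  # already a known domain, no suggestion needed
    close = difflib.get_close_matches(domain, COMMON_DOMAINS, n=1, cutoff=cutoff)
    return f"{local}@{close[0]}" if close else None
```

With this list, `suggest_domain("joe@gmial.com")` proposes `joe@gmail.com`, which you would show as a "did you mean?" prompt rather than silently rewriting the input.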