There are basically three levels of address checking:
1) You need to validate an email field for login or a website - checking for an @ mark with some text before and at least one . after the @ will do for this.
2) You need to do some sort of address validation; library regexps like this will do for 99.9...% of these.
3) You are building an email handling system which needs to actually support the RFCs, in which case regexp will not handle what you need, and you need to use a proper parser, like https://github.com/mikel/mail/tree/master/lib/mail/parsers
Ref: I am the original author of the Ruby mail gem.
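For what it's worth, level 1 above can be sketched in a few lines of Python. The function name `looks_like_email` is made up for illustration, and this is deliberately not RFC validation, just "an @ with text before it and a dot somewhere after it".

```python
def looks_like_email(value: str) -> bool:
    """Level-1 check: some text before '@' and at least one '.' after it."""
    # rpartition splits on the last '@'; `at` is empty if there was none.
    local, at, domain = value.rpartition("@")
    if not at or not local:
        return False
    # Require a dot in the domain with text on both sides of it.
    head, dot, tail = domain.rpartition(".")
    return bool(dot and head and tail)
```

By design this rejects dotless domains, so it trades a few technically valid addresses for catching obvious typos.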
Yep. Back in university I was acquainted with the fellow who set up the MX record for .ai (and assigned himself the memorable address n@ai). IIRC he was involved in planning an academic conference to be held there.
As soon as I read the OP's comment I knew someone would reply to say the dot was not technically required, but I didn't expect to see an actual publicly addressable example!
Heh. A while back I worked at <now bought-out startup> whose main business included handling emails, and they were looking to speed it up, so I came up with this code to do the header parsing, which was 250x faster than the mail gem... but they ended up not going with it due to risk >..<
Well it’s usually worth doing 1. It’s super easy, and it catches silly typos (like people putting their name instead of their email address or something)
You can certainly do it with regex, given a certainly non-regular regex implementation and a probably unbounded computational space. But you shouldn’t.
Ref: I (regrettably) have one of the top SO answers for matching URLs. It’s wrong in a few different ways and I’ve stopped fielding edits/comments for the last few years.
The most helpful thing I've used in the real world is something that looks for common typographical errors, even if the email is technically valid.
Like, if the user types "john.doe@gnail.com", it pops a dialogue asking "Did you mean john.doe@gmail.com?". But lets them keep what they typed, or do a different fix if needed.
I assume it's using popularity statistics, edit distance, etc, to come up with suggestions. There are updated clones that use react, vue, etc, instead of jquery.
With a working ecommerce site, this improved the percentage of correct emails more than anything else I tried, and I had tried many things. Because it's a bad situation when you've taken someone's money and have nothing other than a shipping address to contact them if something goes wrong (bad shipping address, out of stock situation, etc).
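I don't know how that library works internally, but the "did you mean" idea above can be roughly sketched in Python with the standard library's `difflib`. The domain list, the function name `suggest_email`, and the 0.8 similarity cutoff are all assumptions picked for illustration.

```python
import difflib
from typing import Optional

# Hypothetical short list; a real one would come from your own signup statistics.
COMMON_DOMAINS = ["gmail.com", "yahoo.com", "hotmail.com", "outlook.com", "icloud.com"]

def suggest_email(address: str, cutoff: float = 0.8) -> Optional[str]:
    """Return a 'did you mean' suggestion, or None if there is nothing to suggest."""
    local, at, domain = address.rpartition("@")
    if not at or not local or not domain:
        return None
    domain = domain.lower()
    if domain in COMMON_DOMAINS:
        return None  # already a known domain, nothing to suggest
    # get_close_matches ranks candidates by SequenceMatcher similarity ratio.
    matches = difflib.get_close_matches(domain, COMMON_DOMAINS, n=1, cutoff=cutoff)
    return f"{local}@{matches[0]}" if matches else None
```

The suggestion should only ever be offered, never applied automatically, so the user can keep what they typed.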
There is really no point going further than this. It's more likely that someone will type the email wrong but still valid than they will type it completely invalid. There are also some completely wrong validators out there which expect the TLD to be 2-3 chars only.
The ultimate email validation is just trying to send an email to the address and confirming with a code/link.
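A minimal sketch of that confirm-with-a-code flow in Python, assuming an in-memory store and leaving the actual sending out. The names `pending`, `start_verification`, and `confirm` are hypothetical.

```python
import hmac
import secrets

# In-memory store for illustration only; a real site would persist this with an expiry.
pending: dict[str, str] = {}

def start_verification(address: str) -> str:
    """Create a one-time code to be emailed to `address` (sending is out of scope here)."""
    code = secrets.token_urlsafe(16)
    pending[address] = code
    return code

def confirm(address: str, code: str) -> bool:
    """True only if `code` matches the one issued for `address`; codes are single-use."""
    expected = pending.get(address)
    # Compare as bytes in constant time to avoid leaking how much of the code matched.
    if expected is None or not hmac.compare_digest(expected.encode(), code.encode()):
        return False
    del pending[address]
    return True
```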
Never mind the regex, `email.indexOf("@") > 2` does the trick and faster if you happen to need to check many emails. All websites these days require verification of emails (regardless of whether or not it's necessary), and if that's not enough validation, I don't know what is!
The index of the first character is 0 and an email must have a local part so that means the index of "@" has to be at least 1. My guess is OP also forgot the index of the first character is 0 instead of 1 resulting in 1+1=2 (that or they meant >=).
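To make the off-by-one concrete, here is the same check in Python, where `str.find` plays the role of JavaScript's `indexOf`. The helper names are just for illustration.

```python
def has_local_part_strict(email: str) -> bool:
    # Mirrors the original `> 2`: requires at least three characters before '@'.
    return email.find("@") > 2

def has_local_part(email: str) -> bool:
    # Index 0 is the first character, so '@' at index >= 1 means a non-empty local part.
    # find() returns -1 when there is no '@' at all, which also fails this test.
    return email.find("@") >= 1
```

With the `> 2` version, a short but valid address like n@ai is rejected; with `>= 1` it passes.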
Off by one errors are about half of working with arrays.
The regex listed there isn’t that much more complex. It’s basically a check for *@a*.* where * is some minimal whitelist of valid characters and a is an alphanumeric to start the domain name.
And even though that sounds reasonable, it is still wrong. Technically, you do not even need the dot, if you add an MX record to a whole TLD or have some other funky DNS setup.
> This requirement is a willful violation of RFC 5322, which defines a syntax for email addresses that is simultaneously too strict (before the "@" character), too vague (after the "@" character), and too lax (allowing comments, whitespace characters, and quoted strings in manners unfamiliar to most users) to be of practical use here.
Personally, I say if it's good enough for Web browsers, it's probably good enough for your app.
Yep! I know that @mil addresses used to be a thing. With the explosion of new TLDs I would be unsurprised if name@gmail addresses became a thing in the future and silly devs handled it by adding an exception to their huge regex instead of just using “.+@.+”.
This depends on your use-case. If you're writing a mail agent, then you do probably want to parse email addresses in their entirety. If you're writing a website that accepts email addresses and wants to make sure the user doesn't just type "foo", then yeah, check for `.+@.+` and call it a day.
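For the website case, the `.+@.+` check is about this much Python. The names `SIMPLE_EMAIL` and `is_plausible_email` are made up for the example.

```python
import re

# Same idea as `.+@.+`: something, an '@', something.
# re.DOTALL is not set, so '.' will not match a newline.
SIMPLE_EMAIL = re.compile(r".+@.+")

def is_plausible_email(value: str) -> bool:
    # fullmatch anchors the pattern at both ends of the string.
    return SIMPLE_EMAIL.fullmatch(value) is not None
```

It accepts oddities such as "a@b@c", which is the kind of weird false positive this approach knowingly lets through.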
It’s probably a waste of time for an individual developer to write a one-off complicated regex for a contact form. A team of contributors to a standard library should be optimizing their regex a bit more, since doing so will save many developers time compared with using even a very simple one-off regex, once testing is accounted for. The optimizations here are reasonable and internationally compatible.
Interesting question. RFC1123 states that TLDs should be alphabetic, and that the first (or only) character must be a letter or a digit, which should exclude a TLD of '@'. local-parts can include most printable characters other than ()<>[];:@\,.
/^.+@.+$/ is easy and the false positives it allows are weird enough to accept.
I feel like every developer at some point Googles "URL regex" and is inevitably led down a rabbit hole of different regexes — some optimizing for maximum accuracy, others for minimum insanity.
Having been down that rabbit hole before myself, I have to admit, this email regex is tamer than I expected it to be.
I have yet to find the library that is doing this, but I have had a number of issues with websites really not liking an "@me.com" email address.
I assume there is some commonly used library (or several) out there that doesn't recognize an email with a domain name shorter than 3 characters?
It is driving me insane. Most recently I was on the phone with my vet and she told me their system said my email was invalid (and would not accept it).
Recently I had a doctor's office call me to confirm an appointment instead of obeying my wishes to be contacted via email. The email I provided to their contact form was <theirname>@<mydomain>. The receptionist was convinced the provided email was incorrect because it was "their" email. I'm not sure what I expected.
I've done this for decades so I could see who was selling my email address and spamming me, and the answer is: nobody. That's not where spam comes from apparently. I still do it out of habit though.
SMTP had a very useful VRFY command after you've tested for the @ and MX record, but only a handful of service providers will tell you if the email is invalid nowadays due to spam concerns.
Gmail still does though, which is a big deal: 90% of the people who register on my sites use a Gmail address, so they are easy to verify instantly and I can prompt the user to double-check the spelling.
Yes, although that only helps if they typo the address to one that doesn’t happen to exist. With the size of Gmail's user base, it is quite likely to hit a valid one by mistake. I know at least one person who uses my email address by mistake.
This regexp and the whatwg one it is based off (correctly) do not validate the presence of a TLD since it's not technically required (foo@bar is considered valid). But if you are building consumer products it's best to test that there is at least a presence of something TLD-like after validating against this regexp.
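A sketch of that two-step check in Python. The pattern is the one from the HTML Standard's definition of a valid e-mail address as I remember it, so verify it against the spec before relying on it; `is_consumer_email` is a hypothetical name.

```python
import re

# Pattern from the HTML Standard's "valid e-mail address" definition,
# transcribed from memory: check it against the spec before relying on it.
WHATWG_EMAIL = re.compile(
    r"[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?"
    r"(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*"
)

def is_consumer_email(value: str) -> bool:
    """Syntax check first, then require something TLD-like after the '@'."""
    if WHATWG_EMAIL.fullmatch(value) is None:
        return False
    domain = value.rpartition("@")[2]
    # The pattern already guarantees non-empty labels, so one dot is enough.
    return "." in domain
```

So foo@bar passes the pattern itself but is turned away by the extra dot check, while foo@bar.com passes both.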
See also this comparison of email regular expressions (as found in various languages and libraries), compared against a selection of valid email addresses:
It's also interesting to step through the history on this line - it's undergone several revisions and, of course, also seen some reverts of well-intentioned features.
Right. The Discover bank app's Zelle settings don't allow any email on a .in domain; they apparently assume that nobody from India who already has an email on a .in domain will come to the US and use their Zelle.