If it doesn't actually send a validation email to the address, I don't see the value over a simple m/.+\@.+\..+/
They say up front that email validation is hard, and yes there are tons of edge cases and obscure tricks and rules and probably there's no guarantee that even they managed to get it right with this service, but ultimately the customer either puts in the correct address or they don't. If they're going to make a typo then it's far more likely that it would be a legal typo, and if they're going to intentionally enter a false address then it's likely it'd be a simple asdf@asdf.com.
Edit: This was a bit of a knee-jerk reaction to what I at first saw as a redundant overcomplication. However, as russjones points out below, it has already proven its value in reducing bounce rates by a significant percentage.
So, it might not fit my own limited use cases, but it certainly can't be ruled out entirely. Best of luck to the Mailgun team and I hope people smarter than I am can put this service to good use.
Hi Zikes, I'm the developer of this service at Mailgun. Nice to meet you.
We've been using this service at Mailgun during testing and we've reduced bounce rates by 5%. That might not seem like a significant number, and it may not be for a personal blog, but it can be a significant number for a larger ecommerce website.
For someone like us, a service that sends billions of emails, it's huge, and we wanted to provide additional value for our customers to help them reduce their bounce rates and improve conversion.
Plus we don't correct typos on local-parts, just domains. So we won't correct Jooohn@gmail.com, but would suggest a correction for john@gmaill.com.
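To make the "domains only" behaviour concrete, here is a minimal sketch. It is not Mailgun's implementation: the `KNOWN_DOMAINS` list and the `suggest_domain` name are made up for illustration, and `difflib` fuzzy matching stands in for whatever data the real service uses.

```python
import difflib

# Hypothetical list of common mail domains; a real service would use far more data.
KNOWN_DOMAINS = ["gmail.com", "yahoo.com", "hotmail.com", "outlook.com"]


def suggest_domain(address, cutoff=0.8):
    """Return a corrected address if the domain looks like a typo, else None.

    The local part is never altered; only the domain is compared.
    """
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        return None  # not even shaped like local@domain
    domain = domain.lower()
    if domain in KNOWN_DOMAINS:
        return None  # already a known domain, nothing to suggest
    matches = difflib.get_close_matches(domain, KNOWN_DOMAINS, n=1, cutoff=cutoff)
    return f"{local}@{matches[0]}" if matches else None
```

With this sketch, `suggest_domain("john@gmaill.com")` suggests `john@gmail.com`, while `suggest_domain("Jooohn@gmail.com")` returns `None` because the local part is never inspected.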
I think I've been a bit harsh and quick to judge, sorry about that. Obviously the service must have value, and it's my own failing not to have seen it.
I do wish you the best of luck with this new service, and I hope that many people are able to benefit from all the hard work you guys have put into it.
Thanks for this service. There isn't any replacement for actually sending an email and having someone click a link to confirm receipt, but this is a step closer towards that.
Maybe I'm an idiot but it took a while to see the 'user' actually entered "gmall" (I assumed it was correcting the spelling of Michael since the vowels in that name get mixed up so frequently).
Can you talk about why reducing bounce rate is important? For me, ensuring that the email address entered belongs to the user who entered it is the most important.
Hey, mailgunner here. Thanks for the UX suggestions! On bounce rate, a bounce will happen when you collect an email address that is invalid, and then send to it. It is basically the symptom that you can measure to know that the email is indeed invalid. By measuring a reduction in bounce rate, we've been able to determine that the validator does its job catching invalid addresses. For determining whether the email address entered belongs to the user who entered it, you can (and should) use double opt-in, where you email a confirmation link to the users.
The "did you mean" feature alone is gold. I talked to the guys in our customer service department about a "did you mean xvy@hotmail.com"/ "xvy@gmail.com" feature for a while. That one feature alone could cut our misdelivered emails in half. Forget adding two email fields to a form, people just copy and paste anyway.
Honestly, to you and me, it may seem trivial, but your customers have a really hard time entering their email address. Our customers are end users, really end users, these are people for whom their email address means VERY little, but at least they remember that their email isn't at gmalr.con, if you remind them.
I'm not a Mailgun customer, but I would pay for this service (but not too much).
This has actually existed as an independent library called Mailcheck.js[1] for a long time and I have used it in quite a number of projects. So if that is the only reason you want to use this API it would be better to go with mailcheck.js, simply to remove an external dependency.
I know, it's a great deal more than a regex, and that's actually my point. It's dedicating tons of extra resources and time to a process that is ultimately redundant.
You can't truly validate someone's email address because you don't know what their email address actually is. Your only two options are:
If someone's buying something from my site, I'm not going to send them a validation email during the checkout process.
Something like this greatly reduces the number of customer support emails saying "I didn't get an order confirmation email from you!" when they typed in joe@yahoo.cm.
That's the point: Mailgun's Validation API does catch that joe@yahoo.cm is incorrect and suggests a correction. Try it: http://mailgun.github.io/validator-demo/
I see that as the only valuable part of this service, and I can instead use it statically via mailcheck JavaScript: https://github.com/Kicksend/mailcheck
... not with 100% accuracy, but it's totally plausible that you can do substantially better than random (or a simple regex). So there is incremental value here.
I wish more places would validate their email. In the past two days I've been informed that I now have a Playstation, EA, and Blizzard account because someone signed up with my email address (just without the dots.)
Only Blizzard tried to confirm the email address. The other two created the account, and now won't let me disown the accounts. I can go through the forgot password process and turn them off, but I'm not sure how kosher that is.
Have you ever crafted a sign-up form for a paid app though? Requiring users to go and check their email before they can proceed is going to kill conversions. Mail providers (I use mandrill) also deliberately throttle outgoing email - I've seen them delayed by several minutes.
Hi fsckin, example@example.com fails because while we do fall back to A records, no mail exchanger responds to us, so no one can actually receive mail at that inbox, which is why we mark it as invalid.
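As a rough sketch of the "MX first, fall back to the A record, then see if anything answers" idea: this is an assumption about the general shape of such a check, not the service's actual code. The function names are hypothetical, and the DNS results and the `responds` probe are passed in by the caller so that no particular DNS library is implied.

```python
def candidate_mail_hosts(domain, mx_records, has_a_record):
    """Return hosts to try for delivery, in order.

    mx_records: list of (preference, hostname) tuples from an MX lookup.
    has_a_record: whether the domain itself has an address record.
    """
    if mx_records:
        # Lower preference values are tried first.
        return [host for _, host in sorted(mx_records)]
    if has_a_record:
        # No MX at all: fall back to the domain's own address record.
        return [domain]
    return []


def looks_deliverable(domain, mx_records, has_a_record, responds):
    """True if at least one candidate host answers.

    responds: caller-supplied callable(host) -> bool, e.g. a connect probe.
    """
    hosts = candidate_mail_hosts(domain, mx_records, has_a_record)
    return any(responds(host) for host in hosts)
```

Under this sketch a domain with only an A record and nothing listening for mail is reported as not deliverable, which matches the example@example.com outcome described above.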
Just curious: is this a one-time immediate check for an MX that responds?
Say, for example, that example.com has only a single MX and for whatever reason (server is down, network connectivity is broken, etc.) it is unreachable or unresponsive at the exact time you test. Would that constitute a "fatal error" in your validation process?
If there is a single MX RR but it "expands" to multiple hosts (multiple A RRs), do you test a second one if the first doesn't respond?
It does a DNS lookup, advanced parsing (which can be hard [1]) and common typo-fixing suggestions.
Not a useless service, to be sure; you have to make your customers aware that their email is submitted to a third party for validation, though. Usually people don't care all that much (and have those same email addresses listed in public profiles), but a relevant bit of fine print should be there.
It only implements a subset of DNS functionality, so I'd argue it's giving a false sense of security and false negatives. mailtest@пример.испытание fails, for example, even though it is a valid email address.
This was my first response too; there's been a CLI tool (called vrfy) for a while that I've given up on using because it almost always says "send some mail, i'll try my best". That tends to be the ultimate test, which you can probably already do pretty quickly with a backend for a web form.
EDIT: That being said, making a potential customer wait for thorough verification, on top of "spamming" them before they've (technically) bought from you would be unwise; I can see how this would have merit.
Have you ever worked on e-commerce sites? Bad email validation or the requirement for a confirmation email can kill business. If a user buys a product but types `sally@yahhhoo.com` and then doesn't get their purchase or shipping confirmation, you can bet there will be long labor-intensive interactions with customer service. Getting a high confidence value in an email address without a confirmation email has great value.
Yes, and none of those extra steps actually have a significant effect over simple regex checking to confirm the validity of the email, because there is no way to know what the user's correct email address actually is besides sending them a validation email. None.
The domain may exist, the mailbox may exist, it may match all the rules for that particular mail exchange, it can pass this validator with flying colors, but you still have to send that validation email because I could just as easily have entered your email as my own, and that makes this a moot step.
Mailgunner here. Perhaps we should have given it a different name? We do not guarantee that the address is valid. But we surely beat your regexp. Why?
Simply because we're an ESP and have a huge database of valid MX hosts and the most common misspellings for them that are NOT valid MX hosts. And we give you suggestions to implement autocompletion/autocorrection.
Besides, many of our customers reach out to our support asking for regexp suggestions. So we figured it would be cool to offer this to everyone for free, so we launched it.
Actually I think you've got the name spot on, I had the wrong term. What I was looking for was confirmation email, not validation.
And yes, your service is a pretty big step up from regular validation regexps. I didn't see the value in that before, but that's my own fault. I now see there's a lot of potential for a service like this, and I hope it's put to good use.
I don't think anyone is claiming that this can replace the function that a validation email serves.
What this allows you to do is give feedback to users quicker. If a user signs up for your service and mistypes the domain name for their email address, they'll never hear anything from you. And if they don't know they made a typo, they'll just think that you have a flaky service. This lets you prevent that from happening.
I think you're missing the intended use case. It's meant to keep people from fat-fingering their address on an email signup form (think mailing list, not web app).
This may be very German of me, but the privacy implications of sending an email address to a third-party service before form submission appear murky to me.
Also, we need to find a name for "give me some personal data in return for a minimal value add service" offerings.
Would you be interested in something open source that does this? I would be willing to release https://www.emailitin.com/email_validator to github. It does all the same checks as this except for the "valid user part for Yahoo.com" stuff.
I would definitely appreciate an open source option, though to be honest I am more interested in the correction feature than I am in validation.
I have been using the same regex[1] for years; it gets the job done adequately--at least, I've never received any complaints from users or clients.
I've used the regex on some relatively high profile sites--the kind where if someone was unable to signup with a valid email address we would've definitely heard complaints.
"No addresses submitted to the guardpost service are ever stored on any Rackspace servers. Nothing is persisted after the request is complete."
However, the API is using GET.
So unless you turned off logging, you are storing all tested emails in your web server logs which most places gzip up and archive indefinitely. In other words, I'd imagine these logs "persist" after the request is complete.
ADDED: As an end user, you may or may not be concerned that these logs will contain:
1. Your email address
2. The place you are making an account
3. The time you created the account
And this information will likely be stored indefinitely (whatever the server log retention policy is). These logs also give mailgun and/or Rackspace a great resource for the membership rolls and adoption rate of any site adopting this service.
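To illustrate why GET matters here, a small sketch of how an address passed as a query parameter ends up in the request line that web servers typically write to their access logs. The endpoint URL, parameter name, timestamp and log format below are all placeholders for illustration, not the service's real API or logs.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical endpoint and parameter name, for illustration only.
ENDPOINT = "https://api.example.invalid/v1/address/validate"


def build_get_url(address):
    """Build the URL a GET-based validation call would request."""
    return f"{ENDPOINT}?{urlencode({'address': address})}"


def access_log_line(url, timestamp):
    """Mimic the request-line portion of a typical access log entry."""
    parts = urlsplit(url)
    return f'[{timestamp}] "GET {parts.path}?{parts.query} HTTP/1.1" 200'


url = build_get_url("joe@yahoo.com")
line = access_log_line(url, "01/Jan/2000:00:00:00 +0000")  # placeholder timestamp

# The address is recoverable from the logged request line alone.
request_target = line.split('"')[1].split()[1]
logged_address = parse_qs(urlsplit(request_target).query)["address"][0]
```

A request body (as with POST) is not normally written to access logs, which is the distinction being drawn above.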
Perhaps a good solution would be a JS library to validate and strip the local part, then ship the non-local part to Mailgun's servers for domain validation.
I have to agree with pudo. Sending a user's email to a third party service has many privacy implications. I don't care if you share, not share, store, not store, or whatever; users are trusting you not to share their emails and you should comply.
If it’s outside the EU, you would need permission to send the address there, and after PRISM, German data protection agencies decided to block all further requests based on the safe harbour agreements with the US.
I'm going to come up with a free password validation API for web forms. Just call my API with your username and password and the service it's used for and I'll return 200 OK status if it's a secure password.
Sure this is an extreme edge-case, but this was my second test. Who knows what else it rejects. Actually, why do we validate email addresses anyway? Why not just try and send that validation email that you're going to send anyway? And on top of that, why would I ever trust a free third party service to check all my users' addresses?
A surprising number of people mistype their email addresses - if you can cut that down, it cuts down a lot on unnecessary bounced traffic, and the support costs of explaining to users that they've used the wrong email address to sign up with.
I appreciate your point that email addresses are allowed to include all kinds of wacky characters, but if the check simply triggers a warning to double check the address, and doesn't block registration, I don't see the harm in doing it - it reduces a lot of unnecessary problems for ordinary users signing up and making mistakes who don't have addresses like the one above.
Hi lucb1e, thanks for finding this issue. It actually gets validated correctly by our parser (which will soon be open source) but hits a bug in our API. We are working on a fix right now.
Unfortunately this is broken just like all other attempts I've ever seen.
Unfortunately you didn't read the blog, like most other commenters posting about ridiculously unusable but otherwise valid addresses:
"""Our goal is not to make a perfect address validator that can validate every single address that has ever been created. Our goal is to build a realistic address validator for the types of addresses we see everyday."""
For the folks with privacy concerns: we are actually planning on open sourcing the entire service as well as our MIME handling library. So if you have privacy concerns, you'll be able to run it locally.
Great to hear! Maybe it was mentioned and I missed it, but what language(s) will it be implemented in? And is there a way I can be notified when it's released?
We have a web form builder service. (Jotform) We serve 3 million forms and process hundreds of thousands of submissions daily. Every form pretty much has an email question and most people enable validation feature.
We first started with a very, very smart and long email validation check that was going to be perfect. Every time users reported a case our regular expression didn't envision, we had to reduce it. Over the years we had to change it so many times that I am pretty sure we are left with something like this: Does it have an @? Does it have a period? Great! You are validated!
"Does it have an @? Does it have a period? Great! You are validated!"
Technically a period is not required, you can have an email address directly on a top-level domain. Though, of course, that is a rare enough use case that I'm sure anyone who has one has long since given up on expecting it to be validated correctly by most webforms.
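For concreteness, here is roughly what the "has an @, has a period" check amounts to, written as a Python rendering of the m/.+\@.+\..+/ pattern from the top of the thread. The function name is made up for illustration.

```python
import re

# Unanchored, like the Perl-style m/.+\@.+\..+/ it is modelled on.
SIMPLE_RE = re.compile(r".+@.+\..+")


def has_at_and_period(address):
    """True if there is an @ followed somewhere later by a period."""
    return SIMPLE_RE.search(address) is not None
```

This accepts the typo joe@yahoo.cm without complaint and rejects an address directly on a top-level domain such as n@ai, which is exactly the trade-off being discussed.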
"We know that gmail.com is a valid MX host while gmali.com is not."
They've failed in their own example. gmali.com has valid MX records and accepts mail. Just because they don't accept billions of email per month, why should this service block any mail?
When I enter fred@gmali.com on their validation demo, I get a 'Did you mean' message, but the icon is a warning indicator (yellow exclamation point), rather than the error indicator (red cross).
When I enter fred@gmal.com, I get a 'Did you mean' message, with the red cross error icon.
I presume this means that it's detecting that gmali.com is a valid domain and can receive e-mail, but for most people it's not what they actually meant, whereas gmal.com is both a probable typo and a domain that cannot receive e-mail, and therefore invalid.
In other words, I think it's doing the right thing, in that it's detecting that fred@gmali.com is a valid address, but warning you that it's not the correct one. I think there is definitely a usability improvement though, as it's easy to miss the yellow warning icon, and assume that it's actually telling you that you can't use that address.
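The yellow-warning versus red-error distinction can be sketched as a tiny decision function. This is only a guess at the logic from the demo's observed behaviour; the `Verdict` names and the `classify` signature are hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    OK = "ok"
    WARNING = "warning"  # can receive mail, but probably not what the user meant
    ERROR = "error"      # cannot receive mail at all


def classify(deliverable, suggestion):
    """Combine a deliverability flag with an optional 'did you mean' suggestion."""
    if not deliverable:
        return Verdict.ERROR
    return Verdict.WARNING if suggestion else Verdict.OK
```

Under this reading, fred@gmali.com (deliverable, with a suggestion) is a warning and fred@gmal.com (not deliverable) is an error.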
Because you'll probably have more people who accidentally type gmali.com instead of gmail.com than people who have actual gmali.com email addresses.
"Again, due to the robustness principle, just because a host does not define MX records does not mean they can’t accept mail. Mail servers will often fall-back to A records to try and deliver mail. That’s why we go one step further than just a DNS query, we ping the Mail Exchanger to make sure that it actually exists."
Plenty of boxes don't respond to pings (icmp). Can I assume you're doing a tcp scan on mail ports?
> Can I assume you're doing a tcp scan on mail ports?
Nope. We're an ESP ourselves. Mailgun has delivered (and accepted) many billions of emails since our launch a few years ago. We have a lot of data on 99.99% of other ESPs in the world and the most common misspellings for them. We also have ESP-specific data on what kind of RFC-compliant email addresses they do not allow.
It may make business sense for something like an e-commerce site to reject this sort of technically valid but far more likely to be a typo e-mail address.
Works on my validator (which is also free, and does the same checks as this one with the exception of knowing the Yahoo trick): https://www.emailitin.com/email_validator
I only have performance numbers for the parser right now, and on average it takes about 0.07 milliseconds to parse and validate an address. That's the pure RFC syntax part.
The longest part of validation service is DNS checks, but once the DNS server warms up and starts caching lookups the roundtrip time is going to be the longest part of the request.
We're still collecting reliable statistics and once we have them we can follow up with you.
99+% of the time it's a couple of in-memory lookups plus a bit of CPU for syntax validation. The API endpoint is hosted in Chicago on a dedicated aggregation to DC networks (nice to be on Rackspace). Should be fast.
This is great, thanks for your effort. Couple more things I'd love to see:
1. Parse a chunk of text and salvage any email addresses from it that you can find. Use case: my users upload spreadsheets with emails of their other team members, but the email field would often contain more than one email (separated by slash or space or comma or god knows what), or other stuff like a Skype account etc.
2. Actual validation service. I'd pay for it at standard mailgun rates, it would be easier for me than rolling my own as I do now.
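For the first request, a minimal sketch of salvaging addresses from a messy spreadsheet cell. The pattern is deliberately loose and is an assumption of mine, not anything the service offers; each candidate would still need real validation afterwards.

```python
import re

# Loose pattern: something@something.tld, stopping at whitespace and common separators.
EMAIL_RE = re.compile(r"[^\s@,;/<>()]+@[^\s@,;/<>()]+\.[^\s@,;/<>()]+")


def salvage_addresses(text):
    """Return candidate addresses found in free-form text, in order, de-duplicated."""
    found = []
    for match in EMAIL_RE.findall(text):
        candidate = match.strip(".")  # drop stray leading/trailing periods
        if candidate not in found:
            found.append(candidate)
    return found
```

For a cell like `ann@example.com / bob@example.org, skype: bob_1`, this returns the two addresses and ignores the Skype handle.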
Hi cleverjake, I'm the developer of this service at Mailgun.
You can't actually register "email with spaces"@gmail.com or "very.unusual.@.unusual.com"@mail.yahoo.com with Google or Yahoo.
Both addresses pass pure syntax checks but then the validator kills it when it notices that Google or Yahoo won't let you register addresses like that.
They're only valid if you can own that email address and
The only one that looks valid there is the last n@ai, since apparently ai actually has an mx record.
The first two are legal by rfc but not allowed by the individual providers. The third is not legit by rfc 2606 because example.com is a reserved domain.
If it was just an RFC validation then it should validate all of those (except maybe example.com). But they go beyond that:
"Furthermore, the validator is ESP specific, so we can go way beyond valid syntax checks, bring in specific requirement for Gmail vs. Yahoo vs. Hotmail."
The best "email address validation" that ever existed was SMTP's VRFY verb (and EXPN was quite useful too). Unfortunately, the spammers killed that real quick.
There are a few concerns about privacy and response times using this approach, but you can get a very similar effect just by writing/hosting a little bit of js locally which consults a list of common errors, does a simple syntax check, and shows a warning beside the email entry box asking if the user is sure about the mail address they are submitting. That's enough to make users look hard for any mistake, and can be done in about 10 lines of js, automatically whenever the email field loses focus. I've found even a very simple script checking common errors/domains has a significant effect on typos and bounce rates, and it has the advantage of not sharing the email and being a lot quicker. If you're only ever warning the user to take a second look, you don't have to worry about false positives.
I'm not sure that anything other than basic checks adds a lot of value, and I'd worry about sending off users' email addresses to an API on a third party website before they've even agreed to terms - I don't think most users would be happy to find out that was happening.
It would be interesting to experiment with different levels of checks, and see which ones provide the most value though.
Oh the edge cases will be so angry. There are a very few engineers at Google who have Gmail account names shorter than 6 characters. It's not the norm, but they exist. Their addresses can't be validated.
I'm sure there are special cases all over the place. It would be nice if Mailgun differentiated between 'this address is just malformed' and 'from what I know of [ISP], this address oughtn't exist'.
Well, you are not forced to reject the email; you can just stick to the unobtrusive spellchecker, which only makes a suggestion if something looks suspicious. So the end decision is up to your app.
The spellchecker doesn't actually kick in all that often. The better solution is to give flags back differentiating between 'unlikely' and 'impossible' email addresses.
To all those that think this is no better than a regex:
Yesterday a company invited 7 new users to their account using their email addresses. 3 of those addresses had typos in the domain names which this service would have caught. As it was, this error was only discovered when the service tried to send invitation emails to the new users and that's not a great UX.
Validation emails, particularly those with a confirmation link, are a horrible horrible solution. They interrupt the user's process flow, taking them away from their web browser, possibly delaying the process, and you'll also get users searching through their emails and clicking that link just to access their account (yep, really).
I think I'm going to implement something like this Mailgun service plus sending a welcome email (with no confirmation link). If the welcome email bounces then I can handle that case but it should happen less often with the Mailgun live-validation.
This looks pretty awesome and my first thought is it would be sweet if it was integrated into common webmail software.
I work for an ISP and we, of course, provide e-mail access via webmail. Right this moment, I can see dozens and dozens of e-mail messages queued up on our outbound relays that will never be delivered because the user typo'd the recipient's e-mail address.
An amazingly high number of messages bounced back to our users (the original senders) are due to typos like this. Some people, despite not being "techies", can skim over a bounce message and realize they misspelled "live.com" and will resend. Others, well, they call support wanting to know why they suddenly can't e-mail Aunt Sally.
Here's an easier option for those concerned with the privacy of their users.
1) Validate the email address first on your side (using regex).
2) Then send the domain name part to the service for validating and correction. Maybe append a fake username before the @.
So if a user enters someone@yahoo.cm, I validate it first and then send someotherperson@yahoo.cm to Mailgun. Now your real user is protected.
Now you don't send them the real user name but get most of the benefits.
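A minimal sketch of that masking step, assuming the third-party service returns its suggestion as a full address. The helper names and the `user` placeholder are made up for illustration.

```python
def mask_local_part(address, placeholder="user"):
    """Replace the local part with a placeholder so only the domain is shared."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not shaped like local@domain")
    return f"{placeholder}@{domain}"


def restore_local_part(original, suggestion):
    """Apply a domain suggestion returned for the masked address to the real one."""
    local = original.rpartition("@")[0]
    suggested_domain = suggestion.rpartition("@")[2]
    return f"{local}@{suggested_domain}"
```

So someone@yahoo.cm goes out as user@yahoo.cm, and a suggestion of user@yahoo.com comes back as someone@yahoo.com. The cost is that any provider-specific rules about the local part cannot be checked this way.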
It doesn't seem to accept IDN addresses, though it's able to resolve the punycode equivalent just fine.
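For anyone who wants to do the punycode conversion client-side before submitting, a short sketch using Python's built-in `idna` codec. Note the assumption: that codec implements the older IDNA 2003 rules, and the function name is hypothetical.

```python
def to_ascii_address(address):
    """Convert the domain of an address to its ASCII (punycode) form.

    The local part is left untouched; only the domain is encoded.
    """
    local, sep, domain = address.rpartition("@")
    if not sep:
        raise ValueError("missing @")
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local}@{ascii_domain}"
```

Calling `to_ascii_address("mailtest@пример.испытание")` yields an address whose domain labels each start with `xn--`, which can then be resolved like any other domain.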
I was just implementing email and name validation checks for a project myself. Luckily email addresses can at least be validated by a confirmation email; it's the real name field I have no clue what to do with.
It's funny that after all these years, we still don't seem to have cracked these basic problems.
This is cool! One thing: as a Mailgun user, can I use it to check for valid email addresses when one of my clients sends an email to his customers? That would save him some email credits and reduce the bounce rate.