
Real Email Validation - pythonist
http://www.djangotips.com/real-email-validation
======
roc
The only real email validation involves sending an actual email with a response
link.

Because even if people happen to give you a _functional_ email address, it
isn't necessarily _their_ email address.

And I say that as someone who has come to regret registering a first-initial-
last-name gmail address. And it's not even a particularly common last name.

~~~
vincentkriek
I think the purpose of this validation is to help people who mistype their
email address, not to check whether it is their email address.

~~~
cincinnatus
Right, but on a large system it is possible to mistype it into a valid address
that isn't yours.

------
baudehlo
This is just awful. A quick scan of the code brings up the following problems:

* It fails to deal with the case where there is no MX record for the domain (fall back to A record)

* It fails to sort the MX records by preference, potentially falling foul of tarpits

* It fails to connect to each A record lookup of the MX host on failures

* It fails to deal with transient failures (such as 4xx responses)

That was just from a quick scan.
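
To make the first two points concrete, here is a minimal sketch, not the article's code. `delivery_candidates` is a hypothetical helper, and it assumes the MX records have already been fetched by whatever DNS library you use and handed over as (preference, hostname) pairs.

```python
from typing import List, Tuple


def delivery_candidates(domain: str, mx_records: List[Tuple[int, str]]) -> List[str]:
    """Return the hosts to try, in order, for mail addressed to `domain`.

    `mx_records` is assumed to be a list of (preference, hostname) pairs
    looked up elsewhere; this helper only handles ordering and fallback.
    """
    if not mx_records:
        # No MX records: fall back to the domain's own address record,
        # i.e. treat the domain itself as the only mail host.
        return [domain]
    # A lower preference value means higher priority, so sort ascending.
    # Strip the trailing dot that fully qualified DNS names often carry.
    return [host.rstrip(".") for _, host in sorted(mx_records)]
```

Each returned host would then still need its own address lookup, a connection attempt with a short timeout, and a move to the next candidate on failure, which is where the remaining two points come in.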

Connecting to MX servers in a web environment (especially one using blocking
I/O like Django) is generally a really bad idea. Many MX servers use delays
and slow responses to combat spammers, and you're passing those slow responses
on to your users.

Just check it looks vaguely like an email (the regexp fein posted is good
enough most of the time) and send a confirmation email - it's the right thing
to do.

~~~
greyboy
Additionally, doesn't it rely on the truthfulness of the SMTP server? That's
not a good assumption - it's common to accept anything and null-route bad
addresses.

~~~
baudehlo
Indeed it does - the only way to truly validate is to get that confirmation
email through.

On the flip side I do think there's some value in a service which provides a
check on the domain - that way you can prevent someone typing in
username@gmail.con by accident. But you'd have to actually implement it
correctly.

Would people be interested in something like this as a service?

------
jodrellblank
And I'll still give you fakeaddress@mailinator.com, it will pass every check
you can throw at it, including sending an email and getting me to click a
link, and it still won't be a _real_ email address.

Still your move, e-mail harvesters.

Checking that I haven't mistyped it or put the wrong thing in the wrong field
is a basic sanity check. Beyond that, the only way to actually get a real
email address that I read is to _be a service I care about_.

~~~
Swizec
For me the trick isn't to get my real email address, I give that to anyone.

But kudos to you if you can make it into my "Important and unread" inbox and
remain there. It's the only part of my email that I actually check.

Some services are _so great_ I let their daily reminder emails go there and
_enjoy_ reading them. That's right, there are services out there (I only know
of one) whose daily "You should use us" email is so awesome I enjoy reading it
every day.

~~~
npx
Out of curiosity, what service has such a great daily e-mail?

~~~
Swizec
750words.com

------
martinp
Making your app connect to random SMTP servers every time it needs to validate
an email address doesn't seem like a good idea.

Shared domains (gmail.com etc.) might even get you blacklisted if you flood
the same SMTP servers over and over again.

~~~
healthenclave
Is there a workaround? How about using a proxy? But I guess that adds another
layer of complexity.

~~~
SudoAlex
Use a queue processor - but that's probably going too far for simple email
validation.

The simple workaround - don't do it. This code is susceptible to Denial of
Service problems similar to the URLField verify_exists option
(<https://www.djangoproject.com/weblog/2011/sep/09/security-releases-issued/>):
a malicious SMTP server could tarpit all your SMTP connections from Django,
leaving your site with no workers to process other requests.

The email validation from an EmailField is designed to ensure that it could be
a valid email address, not that it's a valid mailbox. Live with the limitation
instead of trying to be too smart.

------
tomwalsham
The best way to improve email delivery is to understand that email addresses
represent humans. Address validation and long-term deliverability are primarily
a social engineering problem, not a technical one.

Ordinarily I'm in favour of things that can improve data quality with minimal
user friction, but in this case while it looks like an attractive solution,
it's both dangerous _and_ broken.

It's dangerous because if you repeatedly open empty SMTP sessions with major
ISPs (and some neckbeard boxen) to validate addresses, you will rapidly fall
onto blacklists. Furthermore existence of an address says nothing of the end
user's ownership of that address.

It's broken because of the myriad crazy responses that mailservers return:
5XX errors for soft-bounces, 4XX errors for permanent failures, deliberately
dead primary MX server... The web's email infrastructure is so massively
fragmented and quirkily non-RFC-compliant you just cannot rely on technical
solutions to these problems except at scale of an ESP (disclaimer: I work at
PostageApp.com, a transactional ESP, and we tackle this problem on a large
scale)

Finally, it fails my 'Spammer Sniff Test': If you think of a clever trick to
improve email delivery/opens/responses etc, it was thought up 10 years ago
by spammers and has long since been added to the blocked behaviours in email
protection infrastructure.

Check for '@', and craft your email verification process to incentivize
following through. For long term delivery (to bypass the mailinator issue)
provide value, pure and simple.

------
mmmooo
Greylisting is pretty common, and this would obviously fail:

<http://en.wikipedia.org/wiki/Greylisting>

------
bambax
As an aside, would there be some value in providing an email validator API?

Something exactly like this: <http://mythic-beasts.com/~pdw/cgi-bin/emailvalidate>

but which would respond in an easy-to-parse way (JSON|XML).

It could be enriched by detecting common spelling errors ('gmial' or 'g-a53'*
instead of 'gmail' for example).

*: gmail when typed on a European laptop with numlock on.
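
A rough sketch of the spelling-error part, using Python's standard `difflib`. The `KNOWN_DOMAINS` list and the `suggest_domain` name are made up for illustration, and the 0.8 cutoff is a guess rather than a tuned value.

```python
import difflib
from typing import Optional

# Illustrative list only; a real service would need a much longer one.
KNOWN_DOMAINS = ["gmail.com", "yahoo.com", "hotmail.com", "outlook.com"]


def suggest_domain(address: str, cutoff: float = 0.8) -> Optional[str]:
    """Return a likely intended domain, or None if there is nothing to suggest."""
    # Everything after the last "@" is treated as the domain.
    _, _, domain = address.rpartition("@")
    domain = domain.lower()
    if not domain or domain in KNOWN_DOMAINS:
        return None
    # get_close_matches ranks candidates by similarity ratio, best first.
    matches = difflib.get_close_matches(domain, KNOWN_DOMAINS, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

This catches transpositions and near-misses such as 'gmial.com' or 'gmail.con'. A keyboard-layout slip like 'g-a53' is too far from 'gmail' for a similarity ratio and would probably need an explicit key mapping instead.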

------
alexkus
It will also reject addresses that purposely soft-bounce (4xx) the first
attempt (or attempts within a certain time limit) to deliver to them.
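
As a sketch of what handling this would mean, assuming the checker has the numeric SMTP reply code in hand: `classify_smtp_reply` is a hypothetical helper that goes by the first digit of the code, which is only as trustworthy as the server sending it.

```python
def classify_smtp_reply(code: int) -> str:
    """Map an SMTP reply code to a coarse outcome by its first digit."""
    if 200 <= code < 300:
        return "accepted"
    if 400 <= code < 500:
        # Transient failure: greylisting servers answer like this on purpose,
        # so the address may well be fine on a later attempt.
        return "retry-later"
    if 500 <= code < 600:
        return "rejected"
    # Anything else (e.g. 3xx intermediate replies) is not a verdict.
    return "unknown"
```

The point is that "retry-later" must not be shown to the user as "invalid address", and a signup form has no sensible way to wait and retry.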

------
bambax
('SMPT' is used throughout instead of 'SMTP'.)

What does django.core.validators.EmailValidator actually do?

Validating an email address with a regex is surprisingly hard: see
<http://www.ex-parrot.com/~pdw/Mail-RFC822-Address.html>

I wonder if EmailValidator does this, or something simpler?

~~~
baudehlo
That validates RFC822 addresses, which is the full syntax of the From/To/CC
headers. You don't want that for validating an email address on a web form.

------
fein
Here's a secret:

regex: /^(.+)\@(.+)\.(.+)$/

maxlen: 254, minlen:5

Aside from sending your verification email, that's all you need.
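
In Python that check might look roughly like this. `looks_like_email` is just an illustrative name, and `fullmatch` stands in for the `^...$` anchors so a trailing newline doesn't slip through.

```python
import re

# Same shape as the pattern above: something, "@", something, ".", something.
EMAIL_SHAPE = re.compile(r"(.+)@(.+)\.(.+)")
MIN_LEN, MAX_LEN = 5, 254


def looks_like_email(address: str) -> bool:
    """Cheap sanity check only; it says nothing about whether the mailbox exists."""
    if not MIN_LEN <= len(address) <= MAX_LEN:
        return False
    return EMAIL_SHAPE.fullmatch(address) is not None
```

So `looks_like_email("a@b.c")` passes, while anything without a dot after the "@" does not.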

~~~
noneTheHacker
That excludes TLD emails like postbox@com

I have never come across someone using one but it is valid. I would actually
hate to see someone try to use one. I come across enough issues trying to use
'+' in my gmail email address.

[http://en.wikipedia.org/wiki/Email_address#Valid_email_addre...](http://en.wikipedia.org/wiki/Email_address#Valid_email_addresses)

~~~
nicktelford
It's not just TLDs. Machine aliases are also perfectly valid in e-mail
addresses, e.g. "root@localhost", "fred@finance" etc.

This might not be practical in a majority of applications (you're hardly going
to sign up to 3rd party services using an alias to a machine on your local
network) but if you're building a _generic_ e-mail address validation library,
it's an edge case you cannot ignore.

------
makethetick
Could be easily modified to verify email lists too, very handy if you haven't
sent for a while and want to avoid bounces.

------
jpadilla_
This is pretty awesome! I wonder how much time it would take to validate. The
last thing I would want is to make the signup process even slower. I guess you
could still let the user pass and then run an async task to check "if the
domain name exists, ask for MX server list from DNS, and verify that SMPT
server will receive a message to that address" and then maybe set a flag
somewhere.
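
Something along these lines, as a toy sketch using only the standard library. All names here are made up, `check` stands for whatever slow verification you plug in, and a real Django app would more likely hand this to a proper task queue and store the flag on the user record.

```python
import queue
import threading
from typing import Callable, Dict

pending: "queue.Queue[str]" = queue.Queue()
verified: Dict[str, bool] = {}  # stand-in for a flag column on the user record


def worker(check: Callable[[str], bool]) -> None:
    """Drain the queue forever, recording the result of `check` per address."""
    while True:
        address = pending.get()
        try:
            verified[address] = check(address)
        except Exception:
            # A failed or timed-out check should not mark the address as bad.
            pass
        finally:
            pending.task_done()


def start_worker(check: Callable[[str], bool]) -> threading.Thread:
    """Run the worker in a background thread so signups never wait on it."""
    thread = threading.Thread(target=worker, args=(check,), daemon=True)
    thread.start()
    return thread


def signup(address: str) -> None:
    """Accept the signup immediately and defer the slow check."""
    pending.put(address)
```

The signup request returns as soon as the address is queued. The flag shows up later, or never if the check fails, so nothing user-facing should depend on it.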

------
healthenclave
Very helpful, thanks!

