
Validating Email Addresses with a Regex? Do Yourself a Favor and Don’t - greengobble
http://blog.onyxbits.de/validating-email-addresses-with-a-regex-do-yourself-a-favor-and-dont-391/
======
herbst
Did I miss it, or did you forget that domains can also be Punycode?

My motto these days is: if I need to know the email works, I'll just send them
an email.

~~~
greengobble
> Did I miss it, or did you forget that domains can also be Punycode?

That's an interesting question. DNS doesn't handle non-ASCII characters, so
you have to encode them as xn--SOMETHING. The encoded form will validate. It's
debatable whether it should, because accepting it means the code you are
validating for has to handle it too.

> My motto these days is: if I need to know the email works, I'll just send
> them an email.

Making validation someone else's problem ;)? Sure, that works, unless your
task involves cleaning up a database containing a couple million addresses.

~~~
herbst
True, that's a different use case.

I would expect you (the form or whatever) to translate my input to Punycode.
I live in a German/French-speaking country where special characters in domains
are not exactly unusual, but I avoid them myself because I've encountered so
many sites that simply ignore their existence.
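
For what it's worth, a rough sketch of what "the form translates it" could
look like in Python. The function name `to_ascii_address` is made up for
illustration, and the built-in "idna" codec follows the older IDNA 2003 rules,
so treat this as an assumption-laden example rather than a complete solution:

```python
def to_ascii_address(address: str) -> str:
    """Return the address with its domain part converted to Punycode."""
    # Split at the last "@", since the domain is everything after it.
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("expected a local part, an '@' and a domain")
    # Python's built-in "idna" codec encodes each non-ASCII label as xn--...
    # (IDNA 2003 rules); ASCII-only domains pass through unchanged.
    ascii_domain = domain.encode("idna").decode("ascii")
    return f"{local}@{ascii_domain}"


print(to_ascii_address("info@bücher.de"))
```

Only the domain is touched here; the local part is left exactly as typed.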

~~~
greengobble
I guess that's the whole point of the blog post: when you try to validate
email addresses with a regex, you are screwed with Unicode domains. It is not
clear whether you should let them pass, let them pass only if they are properly
encoded, or not let them pass at all (as the software behind the validator is
clueless about them either way).
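
To make the three options concrete, here is a minimal sketch for the domain
part only. The policy names, `domain_kind` and `accept` are hypothetical, and
the label pattern is a simplified assumption, not what the blog post uses:

```python
import re

# Simplified ASCII label: letters, digits, inner hyphens, at most 63 chars.
LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def domain_kind(domain: str) -> str:
    """Classify a domain as raw-unicode, punycode, plain-ascii or invalid."""
    if not domain.isascii():
        # Raw Unicode is not checked any further in this sketch.
        return "raw-unicode"
    labels = domain.split(".")
    if not all(LABEL.fullmatch(label) for label in labels):
        return "invalid"
    if any(label.lower().startswith("xn--") for label in labels):
        return "punycode"
    return "plain-ascii"


def accept(domain: str, policy: str) -> bool:
    """Apply one of the three policies discussed above."""
    kind = domain_kind(domain)
    if policy == "allow-all":      # let Unicode domains pass
        return kind != "invalid"
    if policy == "encoded-only":   # pass only if properly encoded
        return kind in ("plain-ascii", "punycode")
    if policy == "ascii-only":     # do not let them pass at all
        return kind == "plain-ascii"
    raise ValueError(f"unknown policy: {policy}")
```

Which policy is right depends on whether the software behind the validator can
actually deal with xn-- labels, which is exactly the open question.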

Building the DFA yourself gives you more control.
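
As an illustration of that control (not the automaton from the blog post, just
a small sketch that assumes a simplified domain grammar), a hand-written DFA
lets you decide per character class what happens to non-ASCII input. The state
names and the `allow_unicode` switch are made up for this example:

```python
# States: START = at the beginning of a label, ALNUM = last char was a
# letter/digit, HYPHEN = last char was a hyphen. Only ALNUM is accepting.
START, ALNUM, HYPHEN = range(3)

TRANSITIONS = {
    (START, "alnum"): ALNUM,
    (ALNUM, "alnum"): ALNUM,
    (ALNUM, "-"): HYPHEN,
    (ALNUM, "."): START,
    (HYPHEN, "alnum"): ALNUM,
    (HYPHEN, "-"): HYPHEN,
}


def char_class(ch: str, allow_unicode: bool):
    """Map a character to an input symbol, or None if it is not allowed."""
    if ch.isascii():
        if ch.isalnum():
            return "alnum"
        return ch if ch in "-." else None
    # This is the point where you get to choose how Unicode is treated.
    return "alnum" if allow_unicode and ch.isalnum() else None


def domain_matches(domain: str, allow_unicode: bool = False) -> bool:
    """Run the DFA over the domain; missing transitions mean rejection."""
    state = START
    for ch in domain:
        state = TRANSITIONS.get((state, char_class(ch, allow_unicode)))
        if state is None:
            return False
    return state == ALNUM
```

With `allow_unicode=False` only ASCII domains, including xn-- encoded ones, get
through; flipping the switch lets raw Unicode labels pass as well.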

