
There are also some interesting Unicode features in the regexes... \d matches anything that is a digit in Unicode, for example. What if you only want ASCII though, like for computer languages and maybe security?

EDIT: security as in avoiding similar-looking but different characters that could confuse users, etc.




Guess one can always do [0-9]+. Though I can't see why, for example, an Arabic numeral would pose a security problem. ISIS?
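
To make the \d vs [0-9] difference concrete, here is a minimal sketch assuming Python 3's `re` module (the thread does not say which regex engine is meant, so treat the language choice as an assumption). The sample string is made up for illustration: it is "123" written with Arabic-Indic digits.

```python
import re

# Hypothetical sample: "123" in Arabic-Indic digits (U+0661..U+0663).
arabic_indic = "\u0661\u0662\u0663"

# For str patterns, \d is Unicode-aware by default in Python 3.
unicode_digits = re.compile(r"\d+")
# re.ASCII restricts \d (and \w, \s) to their ASCII-only meaning.
ascii_flag = re.compile(r"\d+", re.ASCII)
# An explicit class behaves the same regardless of flags.
ascii_class = re.compile(r"[0-9]+")

print(unicode_digits.fullmatch(arabic_indic) is not None)  # True
print(ascii_flag.fullmatch(arabic_indic) is not None)      # False
print(ascii_class.fullmatch(arabic_indic) is not None)     # False

# int() also accepts Unicode decimal digits, so input validated with
# the default \d can still parse to an ordinary integer.
print(int(arabic_indic))  # 123
```

So if only ASCII digits should be accepted, either the explicit [0-9] class or the `re.ASCII` flag works here; other engines may behave differently.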


There are a number of security issues, mostly around using "look alike" Unicode characters in phishing/impersonation attacks. See, for example:

http://unicode.org/reports/tr36/

http://unicode.org/faq/security.html

https://www.blackhat.com/presentations/bh-usa-09/WEBER/BHUSA...
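
As a rough illustration of the look-alike problem, here is a small Python sketch. The helper `scripts_used` and the sample strings are hypothetical, and guessing the script from the first word of a character's Unicode name is only a crude heuristic, not the confusable detection the Unicode reports describe.

```python
import unicodedata

def scripts_used(text: str) -> set:
    """Crude script guess: first word of each letter's Unicode name."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in text
        if ch.isalpha()
    }

latin = "paypal"
# Hypothetical spoof: the second letter is CYRILLIC SMALL LETTER A (U+0430),
# which renders almost identically to Latin "a" in many fonts.
spoof = "p\u0430ypal"

print(latin == spoof)          # False, although they look alike on screen
print(spoof.isascii())         # False, so an ASCII-only check rejects it
print(scripts_used(latin))     # {'LATIN'}
print(len(scripts_used(spoof)))  # 2 (LATIN and CYRILLIC mixed in one word)
```

Rejecting non-ASCII outright is the bluntest option; flagging strings that mix scripts is a softer one, but it would still miss look-alikes drawn entirely from a single non-Latin script.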



