
It's both, but ASCII vs. EBCDIC is worse. Even in Unicode, the regex still matches the printable characters that are also part of ASCII, so nothing looks wrong until you hit characters outside that range. In EBCDIC, things get much hairier: it misses every capital letter and lowercase s through z (though it matches all the other lowercase letters), and it misses brackets and braces (though it matches parens).
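If you want to check this yourself, here is a rough sketch. It rests on two assumptions of mine: that the regex in question is the range `[ -~]` (space through tilde), and that "EBCDIC" means IBM code page 037, which Python ships as the `cp037` codec. Other EBCDIC code pages place some punctuation differently. The helper name is made up for illustration.

```python
# Assumption: the regex under discussion is the range [ -~] (space through tilde),
# and "EBCDIC" means IBM code page 037 (Python codec "cp037").
CODEC = "cp037"


def in_range_under_ebcdic(ch: str, codec: str = CODEC) -> bool:
    """Return True if ch's EBCDIC byte lies between the bytes for ' ' and '~'."""
    lo = " ".encode(codec)[0]
    hi = "~".encode(codec)[0]
    return lo <= ch.encode(codec)[0] <= hi


# Every printable ASCII character, space through tilde.
printable_ascii = [chr(cp) for cp in range(0x20, 0x7F)]

matched = "".join(c for c in printable_ascii if in_range_under_ebcdic(c))
missed = "".join(c for c in printable_ascii if not in_range_under_ebcdic(c))

print("matched:", matched)
print("missed: ", missed)
```

The point is that a character range is really a range of code values, so the same `[ -~]` covers a different set of characters once the encoding underneath changes.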
