
Would it really be a lot of work? I'd imagine the effort expended towards security in general exceeds what you're proposing.

EDIT: As I understand it, languages written in Cyrillic use code page 866 as an extension to ASCII (http://en.wikipedia.org/wiki/Code_page_866). Is this correct?




I'm sure that something would be extremely painful. For example, some sort-of-independent tribe or people that uses one or two letters that are "really part" of another language. In short, politics.

To the best of my knowledge, languages written in Cyrillic don't use the Roman script (except where letters happen to look similar). The ASCII subset of code page 866 is there for things like "cd C:\", not for writing Cyrillic text.


I agree, it could be tricky politically, but I'm not sure the alternative of representing the characters via Punycode conversion is any more culturally supportive or sympathetic.
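
For anyone who hasn't seen what the Punycode conversion looks like, here is a rough sketch using Python's built-in "idna" codec. The label "paypal" and the helper name to_ascii_label are made up purely for illustration, and this is not how any particular browser or registry decides what to display.

```python
def to_ascii_label(label):
    """Convert one domain label to its ASCII (Punycode) form via the stdlib idna codec."""
    return label.encode("idna").decode("ascii")

# Hypothetical labels: the second one swaps Latin "a" for Cyrillic "а" (U+0430).
plain = "paypal"
spoof = "p\u0430ypal"

print(to_ascii_label(plain))  # an all-ASCII label passes through unchanged
print(to_ascii_label(spoof))  # a label with a Cyrillic letter becomes an "xn--" form
```

The two labels look alike on screen, but only the pure-ASCII one survives the conversion unchanged.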


CP866 is a very old standard; it was used in pre-Windows times. There are at least 3 more standards for encoding Cyrillic in 8 bits. Today, most Cyrillic text on the web is encoded in either UTF-8 or CP1251.

All of them define the whole alphabet, though, so even the letters that look similar to some Latin letters are always encoded differently.
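
A quick way to check this yourself, as a minimal sketch with Python's standard codecs (KOI8-R is included here as one example of the other 8-bit encodings; the helper name encode_all is just for illustration):

```python
LATIN_A = "a"          # U+0061 LATIN SMALL LETTER A
CYRILLIC_A = "\u0430"  # U+0430 CYRILLIC SMALL LETTER A, looks the same in most fonts

CODECS = ("cp866", "cp1251", "koi8_r", "utf-8")

def encode_all(ch):
    """Return the bytes of a single character under each codec in CODECS."""
    return {codec: ch.encode(codec) for codec in CODECS}

for name, ch in (("Latin a", LATIN_A), ("Cyrillic a", CYRILLIC_A)):
    # Show the bytes as hex so the two letters can be compared codec by codec.
    print(name, {codec: raw.hex() for codec, raw in encode_all(ch).items()})
```

The Latin letter comes out as the single ASCII byte 0x61 under every codec, while the Cyrillic letter gets a different, non-ASCII byte sequence in each one.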



