
I don't think today's chaos is related to other wide encodings (those are probably very rarely used). Today's chaos is as Batchelder describes it, but I'm suggesting that part of it comes from the ambiguity of the encoding: is the data I'm consuming ISO-8859-x or UTF-8? That ambiguity contributes to the whack-a-mole, and it's a big part of the chaos IMO.
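
To make the ambiguity concrete, here is a rough Python sketch. The `decode_guess` helper is a name I made up for illustration, and "try strict UTF-8, then fall back to ISO-8859-1" is just one common heuristic, not a reliable detector:

```python
def decode_guess(data: bytes):
    """Hypothetical helper: try strict UTF-8, else fall back to ISO-8859-1.

    ISO-8859-1 maps every byte to a character, so the fallback never fails.
    """
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return data.decode("iso-8859-1"), "iso-8859-1"


utf8_bytes = "café".encode("utf-8")         # b'caf\xc3\xa9'
latin1_bytes = "café".encode("iso-8859-1")  # b'caf\xe9'

# The UTF-8 bytes are also perfectly valid ISO-8859-1, just with a different meaning.
misread = utf8_bytes.decode("iso-8859-1")

# The reverse trap: ISO-8859-1 text that happens to form valid UTF-8 gets guessed wrong.
trap_text, trap_guess = decode_guess("Ã©".encode("iso-8859-1"))
```

Nothing in the bytes themselves says which reading is right: the guess works for the first two inputs, and silently picks the wrong answer for the last one.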

That said, would anyone have been interested in a totally new encoding? For European languages, which mostly use the same 26 Latin letters with occasional diacritics, UTF-8 read by an incompatible consumer degrades into occasional unreadable characters. But if your out-of-date browser or application had instead given you a "cannot decode this encoding" error, that might have caused a whole lot of pain during the transition. Not to mention that some of the same issues with OS/filesystem/language library interaction would probably remain.
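
A small Python sketch of the two failure modes, under my own assumptions: the legacy consumer is modeled as something that decodes everything as ISO-8859-1, and `legacy_decode` plus the encoding name `x-hypothetical-new-encoding` are invented for illustration only:

```python
text = "naïve café"
wire = text.encode("utf-8")

# Soft failure: a consumer that assumes ISO-8859-1 keeps every ASCII letter
# and turns each accented character into two junk characters.
degraded = wire.decode("iso-8859-1")


def legacy_decode(data: bytes, declared: str) -> str:
    """Hypothetical consumer that gives up on encodings it has never heard of."""
    try:
        return data.decode(declared)
    except LookupError:  # Python raises LookupError for an unknown codec name
        return "cannot decode this encoding"


# Hard failure: a brand-new encoding leaves the old consumer with nothing to show.
hard_failure = legacy_decode(wire, "x-hypothetical-new-encoding")
```

In the first case the text stays mostly readable; in the second the reader gets nothing at all, which is the trade-off I mean.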
