Hacker News new | past | comments | ask | show | jobs | submit login

There's a very sound argument to be made for the opposite conclusion, that if we care about a problem we should make it necessary to solve the problem correctly or else stuff very obviously breaks, not have broken systems seem like they kinda work until they're used in anger.

Outside of MySQL (which unaccountably had a weird MySQL-only character encoding which only covered the BMP and named it "utf8" then when you tried to shove actual UTF-8 strings into it, they'd get silently truncated because YOLO MySQL) UTF-8 implementations tended to handle the other planes much better than UTF-16 implementations, many of which were in practice UCS-2 and then some thin excuses. Why? Because if you didn't handle multiple code units in UTF-8 nothing worked, you couldn't even write some English words like café properly. For years pretending your UCS-2 code was UTF-16 would only be noticed by people using obscure writing systems or academics.

I am also reminded of approaches to i18n for software primarily developed and tested mainly by monolingual English speakers. Obviously these users won't know if a localised variant they're examining is correctly translated, but they can be given a fake "locale" in which translated text is visibly different in some consistent way, e.g. it has been "flipped" upside down by abusing symbols that look kind of like the Latin alphabet upside down, or Pig Latin is used "Openway Ocumentday". The idea here again is that problems are obvious rather than corner cases, if the translations are broken or missing it'll say "Open Document" in the test locale which is "wrong" and you don't need to wait for a specialist German-speaking tester to point that out.




> There's a very sound argument to be made for the opposite conclusion, that if we care about a problem we should make it necessary to solve the problem correctly or else stuff very obviously breaks, not have broken systems seem like they kinda work until they're used in anger.

Oh, definitely :)

I'm rationalizing the focus on grapheme clusters, if I had my way "what is a string" would be a mandatory unit of programming language education and reasoning about this would be more strongly enforced by programming languages.




Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: