> Unicode provides a unique code for every character, regardless of the language.
Moreover, this is not strictly true even after the generous reinterpretation (assuming "unique-under-normalization", "code point sequence", "abstract character" and "script") because Unicode still doesn't encode some scripts [1].
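To make the "unique-under-normalization" caveat concrete, here is a minimal sketch using Python's standard `unicodedata` module. The variable names are just for illustration. It shows the same user-perceived character "é" spelled as two different code point sequences that only compare equal after normalization.

```python
import unicodedata

# Two spellings of the same user-perceived character "é".
composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE (one code point)
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT (two code points)

# As raw code point sequences they are different strings.
print(composed == decomposed)            # False
print(len(composed), len(decomposed))    # 1 2

# After normalizing both to the same form they compare equal.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", composed)
print(nfc == composed)                   # True
print(nfd == decomposed)                 # True
```

So "a unique code for every character" only holds if you pick a normalization form first and compare sequences, not single code points.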
… and Han unification means that one code point often represents several different “characters”, so you sometimes have to convey the language out-of-band (e.g. via an XML or HTML lang attribute) for the text to be understood correctly.
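A rough sketch of what "out-of-band" looks like in practice, in Python. The helper `tag_language` is hypothetical (not from any library), and U+76F4 is just one commonly cited unified ideograph whose preferred glyph differs between Japanese and Chinese fonts. The string data is identical in both cases; only the `lang` attribute tells the renderer which font conventions to prefer.

```python
from html import escape

def tag_language(text: str, lang: str) -> str:
    """Hypothetical helper: wrap text in a <span> carrying a lang attribute."""
    return f'<span lang="{escape(lang, quote=True)}">{escape(text)}</span>'

# One unified code point; the glyph a reader expects depends on the language.
han = "\u76f4"

ja = tag_language(han, "ja")        # hint: render with Japanese glyph conventions
zh = tag_language(han, "zh-Hans")   # hint: render with Simplified Chinese conventions

print(ja)   # <span lang="ja">直</span>
print(zh)   # <span lang="zh-Hans">直</span>

# The encoded text itself carries no language information at all.
print(han.encode("utf-8").hex())    # e79bb4
```

Whether the hint is honored still depends on the renderer and on which fonts are installed, so this is a hint and not a guarantee.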
Yeah I've got beef with Unicode. It doesn't support CJK*. Since I work in games and there are a lot of Japanese games that want to be sold in the Chinese market (and vice versa), this is A Problem. I don't know where they got off thinking those character sets were the same, because if I treat them the same, I don't get paid.
*In my particular example, you can say Unicode doesn't support Japanese, /or/ doesn't support Chinese. The answer depends on what font you're using. "Han Unification" affects more than just those two languages, but those are the ones I have experience with.
[1] https://www.unicode.org/standard/unsupported.html