> Java (and JavaScript) is outdated: if you were designing them today, their strings would not be UTF-16.
Except that UTF-16 makes a lot of sense on Windows, which won’t change anytime soon.
Since you always have to deal with noncharacters, initial vs. non-initial BOMs, isolated combining characters, etc., and you have to validate your inputs regardless (meaning you almost always need a failure path for unvalidated strings), I’m not sure (unpaired) surrogates add that much more of a complication.
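To make the point concrete, here is a rough sketch of what checking for unpaired surrogates looks like if you already have a validation pass over UTF-16 code units. The function name `find_unpaired_surrogates` and the list-of-ints input are my own choices for illustration, not any particular library's API.

```python
def find_unpaired_surrogates(units):
    """Return the indices of UTF-16 code units that are lone surrogates.

    `units` is assumed to be a sequence of ints in the range 0..0xFFFF.
    """
    bad = []
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:
            # High (lead) surrogate: must be followed by a low surrogate.
            if i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2  # well-formed pair, skip both units
                continue
            bad.append(i)
        elif 0xDC00 <= u <= 0xDFFF:
            # Low (trail) surrogate with no preceding lead.
            bad.append(i)
        i += 1
    return bad


# "A", a valid pair (U+1F600), a lone lead, "B", a lone trail
sample = [0x0041, 0xD83D, 0xDE00, 0xD800, 0x0042, 0xDC00]
print(find_unpaired_surrogates(sample))
```

It is one more branch in a loop you are probably writing anyway, which is roughly why I don't see it as a large extra burden.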
It probably won't change soon, but Microsoft (or at least some teams within it) has acknowledged that UTF-16 was a mistake, and has taken some steps toward UTF-8: