I have not personally encountered this problem, but it's definitely there. The other problem, historically, is that Java didn't require clients to specify an encoding explicitly when converting between strings and bytes. That's been cleaned up quite a bit in recent releases of the JDK.
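
To make the encoding point concrete, here is a minimal sketch of passing the charset explicitly rather than relying on the platform default. The class and method names (`EncodingDemo`, `toUtf8`, `fromUtf8`) are made up for illustration; `StandardCharsets` and the `Charset` overloads of `String.getBytes` and the `String` constructor are the standard JDK API (Java 7 and later).

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Encode with an explicit charset so the result does not depend on
    // the platform's default encoding.
    public static byte[] toUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Decode with the same explicit charset that was used to encode.
    public static String fromUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café": the last character is outside ASCII

        byte[] utf8 = toUtf8(text);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);

        // The same string produces different byte sequences per encoding.
        System.out.println(utf8.length);   // 5
        System.out.println(latin1.length); // 4

        // Round-tripping with matching charsets preserves the text.
        System.out.println(fromUtf8(utf8).equals(text)); // true

        // Decoding with the wrong charset silently corrupts it.
        String wrong = new String(utf8, StandardCharsets.ISO_8859_1);
        System.out.println(wrong.equals(text)); // false
    }
}
```

The no-argument `getBytes()` and `new String(byte[])` are the calls that pick up the default encoding implicitly, which is where this kind of bug has tended to hide.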

All things considered, Java character handling was an enormous improvement over the languages that preceded it, and it is still better than the implementations in many other languages. (I wish the same could be said of date handling.)

