> People’s names are all mapped in Unicode code points Curious how the author re...

zinekeller · 2025-03-08T06:27:47 1741415267

Actually, there is a case of this in China where (previously) both Unicode and the Chinese standard (GBK) were unable to encode Ma Cheng's (马𩧢) name. If you have Android, you might even see this in action (since Noto still does not cover this character).

Currently, the solution going forward is to restrict which characters are acceptable in names.
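To make the encoding gap concrete, here is a minimal Python sketch that reports which characters of a name a given codec cannot represent. The helper name `unencodable_chars` is made up for illustration, and it relies on Python's built-in `gbk` and `gb18030` codecs rather than on any official registry software.

```python
def unencodable_chars(name: str, codec: str) -> list:
    """Return the characters of `name` that `codec` cannot represent."""
    missing = []
    for ch in name:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            # The codec has no mapping for this character.
            missing.append(ch)
    return missing


name = "马𩧢"

# Show each character's code point; the second one lies outside the BMP.
print([f"U+{ord(ch):04X}" for ch in name])

# GBK has no mapping for the second character, so it is reported as missing.
print(unencodable_chars(name, "gbk"))

# GB 18030 and UTF-8 cover every Unicode code point, so nothing is missing.
print(unencodable_chars(name, "gb18030"))
print(unencodable_chars(name, "utf-8"))
```

A check like this only says whether the bytes can be stored. Whether a device can actually draw the character is a separate font question, which is why the name can still show up as a missing-glyph box.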

hinoki · 2025-03-08T07:05:05 1741417505

Safari on iOS 18.1 doesn’t render the second character properly either. (Unless it’s a 口 :)

decimalenough · 2025-03-08T07:07:56 1741417676

There are lots of Chinese people whose last names use rare characters or variants that are not mapped in Unicode. Taiwan, which retains traditional characters and hasn't forced people to standardize as much as mainland China and Japan, is particularly notorious for this. There's also the whole Han unification debacle, where similar but not always identical characters used in Chinese/Japanese/Korean have been mushed together.

Support for some Indic scripts also remains quite patchy: https://modelviewculture.com/pieces/i-can-text-you-a-pile-of...

numpad0 · 2025-03-08T08:08:35 1741421315

I'm guessing the solution is to stop using Unicode sequences as identifiers and assign a hashed integer account number instead. Treat Unicode as a non-textual data format for human use only, like account pictures. I think many web systems aimed mainly at CJK users do so.
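A rough sketch of that idea in Python, assuming a hypothetical `AccountRegistry` where the integer ID is the only lookup key and the name is stored as opaque display data. For simplicity it hands out sequential integers rather than hashed ones; the point is only that the name never acts as an identifier.

```python
import itertools
from dataclasses import dataclass


@dataclass
class Account:
    account_id: int    # the only value used to look up or compare accounts
    display_name: str  # human-facing data, stored as-is and never used as a key


class AccountRegistry:
    """Hypothetical in-memory registry keyed by integer account ID."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._accounts = {}

    def create(self, display_name: str) -> Account:
        account = Account(next(self._next_id), display_name)
        self._accounts[account.account_id] = account
        return account

    def get(self, account_id: int) -> Account:
        return self._accounts[account_id]

    def rename(self, account_id: int, new_name: str) -> None:
        # Changing the name does not change the account's identity.
        self._accounts[account_id].display_name = new_name


registry = AccountRegistry()
first = registry.create("马𩧢")
second = registry.create("马𩧢")  # same name, still a distinct account

print(first.account_id, second.account_id)
print(registry.get(first.account_id).display_name)
```

With this layout, a name that cannot be normalized, compared, or even rendered reliably still works, because nothing in the system depends on matching it.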