The safest path is to treat it as a blob: some library can render it magically, and letting that library do its job is the only wise thing you can do. The internal structure is hard to understand, and the definitions change over time, so you'd better leave it all to the professionals.
The thing about Unicode is... anybody who tried to do it “more simply” would eventually just end up developing a crappier version of Unicode.
Unicode is complex because the sum of all human language is complex. Short of a ground-up rewrite of the world's languages, you cannot boil away most of that complexity... it has to go somewhere.
And even if you did manage to “rewrite” the world's languages to be simple and remove the accidental complexity, I assert that over centuries they would devolve right back into a complex mess. Why? Languages represent (and literally shape and constrain) how humans think, and humans are a messy bunch of meat sacks living in a huge world rich in weird, crazy things to feel and talk about.
There are definitely crappy things about Unicode that are separate from language:
- Several writing systems are widely scattered across multiple ‘Supplement’/‘Extended’/‘Extensions’ blocks; Latin alone spans Basic Latin, Latin-1 Supplement, Latin Extended-A through E, and Latin Extended Additional (see the first sketch after this list).
- Operators (e.g. combining forms, joiners) are a mishmash of postfix, infix, and “halffix”. They should have been (a) placed in an easily tested reserved block (e.g. 0xF0nn for binary operators, 0xFmnn for unary), so that you could parse over a sequence even if it contains specific operators from a later version, i.e. separating syntax from semantics, and (b) uniformly prefix, so that read-ahead isn't required to find the end of a sequence (and dead keys become just like normal characters). The second sketch below shows the read-ahead problem in practice.
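To make the scattering concrete, here is a quick illustration in Python. The block names in the comments are the standard Unicode block assignments; Python's `unicodedata` module exposes character names but not block names, so the blocks are annotated by hand.

```python
# Ordinary Latin letters scattered across seven different blocks.
import unicodedata

samples = [
    "\u0041",  # Basic Latin
    "\u00E9",  # Latin-1 Supplement
    "\u0101",  # Latin Extended-A
    "\u01DD",  # Latin Extended-B
    "\u1E0D",  # Latin Extended Additional
    "\u2C60",  # Latin Extended-C
    "\uA726",  # Latin Extended-D
]

for ch in samples:
    print(f"U+{ord(ch):04X}  {ch}  {unicodedata.name(ch)}")
```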
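And to see why the mixed fixity forces read-ahead: combining marks are postfix (the mark follows its base) and ZWJ is infix (it sits between the code points it joins), so a scanner cannot declare a cluster finished until it has peeked at the next code point. Below is a minimal sketch of that scanning loop; it deliberately handles only those two cases and is nowhere near the full segmentation rules of UAX #29.

```python
# Simplified cluster scanner: keeps reading ahead as long as the next
# code point is a combining mark (postfix) or a ZWJ joining pair (infix).
import unicodedata

ZWJ = "\u200D"  # ZERO WIDTH JOINER

def cluster_end(s: str, start: int) -> int:
    """Index one past the simplified 'cluster' that begins at start."""
    i = start + 1
    while i < len(s):
        if unicodedata.combining(s[i]):       # postfix: mark extends the cluster
            i += 1
        elif s[i] == ZWJ and i + 1 < len(s):  # infix: joiner pulls in the next code point
            i += 2
        else:
            break  # only now do we know the cluster ended
    return i

# 'e' + COMBINING ACUTE ACCENT (postfix), then WOMAN + ZWJ + LAPTOP (infix).
for text in ["e\u0301tude", "\U0001F469\u200D\U0001F4BB ok"]:
    end = cluster_end(text, 0)
    print([f"U+{ord(c):04X}" for c in text[:end]])
```

A uniformly prefix encoding, as proposed in (b) above, would tell the scanner the shape of the sequence up front instead of making it peek ahead at every step.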