I tried using U+13161 EGYPTIAN HIEROGLYPH G029 [1], which resulted in a string of length 2 as expected (two UTF-16 code units, i.e. a surrogate pair).
Creating a file named with both chars (code units) and one named with just the first char (code unit) worked equally well. In Windows Explorer the first file shows the stork as expected, while the second shows that "invalid character" rectangle.
So yeah, treating filenames as nearly-opaque sequences (bytes, or 16-bit code units on Windows) rather than as valid Unicode text is probably the best approach.
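For anyone who wants to see the code units involved, here is a rough Python sketch. It is not the code I ran for the test above and it does not touch the filesystem; the helper names `utf16_code_units` and `is_valid_unicode` are made up for illustration. It splits U+13161 into its UTF-16 surrogate pair and shows that the first code unit on its own cannot be encoded as UTF-8, which is why a filename containing only that unit is not valid Unicode text.

```python
def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of ints."""
    # "surrogatepass" lets lone surrogates through instead of raising.
    data = text.encode("utf-16-le", errors="surrogatepass")
    return [int.from_bytes(data[i:i + 2], "little") for i in range(0, len(data), 2)]


def is_valid_unicode(text):
    """True if text can be encoded as strict UTF-8 (no lone surrogates)."""
    try:
        text.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


stork = "\U00013161"              # EGYPTIAN HIEROGLYPH G029
units = utf16_code_units(stork)   # high surrogate, then low surrogate
lone_high = chr(units[0])         # just the first code unit

print([hex(u) for u in units])      # ['0xd80c', '0xdd61']
print(is_valid_unicode(stork))      # True
print(is_valid_unicode(lone_high))  # False
```

Python itself counts code points, so `len(stork)` is 1 here; the length of 2 only appears in environments whose strings are sequences of UTF-16 code units.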
[1]: https://en.wiktionary.org/wiki/%F0%93%85%A1