
Unix paths are nul-byte-terminated, so UTF-16 generally doesn't make any sense in this context: it encodes every ASCII character with a 0x00 byte, which the kernel would read as the end of the path. Valid Unicode paths are encoded as UTF-8 on unix systems. UTF-16 and UTF-32 are invalid ways to encode Unicode paths, for the same reason. (That's not to say no one has tried to do it, just that it doesn't make any sense.)

(As other commenters have pointed out, Unix paths do not require a specific encoding, so robust applications cannot assume anything about the encoding of existing file names. But when creating new files, they must not try to encode paths as UTF-16.)
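To make the nul-byte problem concrete, here is a rough Python sketch. The sample path and the `c_string_view` helper are made up for illustration; the helper only imitates how a C API reading a nul-terminated string would see the bytes, it is not a real system call.

```python
# Hypothetical sample path, chosen only for illustration.
path = "/tmp/héllo.txt"

utf8 = path.encode("utf-8")
utf16 = path.encode("utf-16-le")  # little-endian, no byte order mark


def c_string_view(raw):
    """Return the bytes a nul-terminated C API would see: everything before the first 0x00."""
    return raw.split(b"\x00", 1)[0]


# UTF-8 never produces a 0x00 byte for a non-NUL character, so the path survives intact.
print(c_string_view(utf8))

# UTF-16 encodes "/" as 0x2F 0x00, so the string is cut off after the first byte.
print(c_string_view(utf16))
```

The UTF-8 bytes pass through unchanged, while the UTF-16 version is truncated to a single `/`.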




> Unix paths are nul-byte-terminated

That's the point. The separator is not "/", because "/" would be a character to encode. The separator is a specific byte (0x2F), and so is the terminator (0x00).

> Valid Unicode paths are encoded as UTF-8 on unix systems.

There is no such thing as "Unicode paths" on UNIX systems, valid or invalid.
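A small Python sketch of what "paths are just bytes" means in practice. The byte string and helper names below are invented for the example; the code only manipulates bytes in memory and does not touch the filesystem.

```python
# Hypothetical path whose middle component is "café" in Latin-1.
# The lone 0xE9 byte makes it invalid as UTF-8, yet it is a legal Unix path.
raw = b"/tmp/caf\xe9/notes.txt"


def split_components(path):
    """Split on the separator byte 0x2F, with no decoding involved."""
    return [c for c in path.split(b"\x2f") if c]


def is_valid_utf8(data):
    """Report whether the bytes happen to decode as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


components = split_components(raw)
print(components)
print([is_valid_utf8(c) for c in components])

# If a str is needed, the surrogateescape error handler keeps the
# undecodable byte recoverable, so the original bytes round-trip.
as_text = raw.decode("utf-8", "surrogateescape")
round_tripped = as_text.encode("utf-8", "surrogateescape")
print(round_tripped == raw)
```

Splitting works without knowing the encoding, and only the middle component fails the UTF-8 check. That is why a robust tool keeps existing names as bytes (or uses a lossless scheme like surrogateescape) rather than assuming UTF-8.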



