APFS does not normalise unicode filenames (mjtsai.com)
38 points by okket 1 hour ago

I agree that this is a good change. Unicode, normalisation, character encodings, etc. should really be handled at the presentation layer, and everything below that should just treat filenames as sequences of bytes, perhaps with one or two reserved values like '/' and \0.
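To make the "sequences of bytes" point concrete, here is a rough Python sketch showing that the same visible name has different UTF-8 bytes depending on its normalisation form. The helper name `utf8_forms` and the sample filename are made up for illustration; only the standard `unicodedata` module is used.

```python
import unicodedata


def utf8_forms(name):
    """Return the UTF-8 bytes of `name` in NFC and in NFD form."""
    nfc = unicodedata.normalize("NFC", name)  # precomposed characters
    nfd = unicodedata.normalize("NFD", name)  # base letter + combining mark
    return nfc.encode("utf-8"), nfd.encode("utf-8")


# "caf\u00e9.txt" is "café.txt" written with a precomposed e-acute.
nfc_bytes, nfd_bytes = utf8_forms("caf\u00e9.txt")

# Both render identically, but a byte-oriented filesystem that does not
# normalise would see two distinct names.
print(nfc_bytes)
print(nfd_bytes)
print(nfc_bytes == nfd_bytes)
```

A layer that compares names byte for byte treats these as two files; deciding whether they are "the same" is then left to whatever sits above it.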

It is interesting to consider a theoretical system in which paths are represented in a length-prefixed, 0-terminated format (e.g. "foo/bar/baz/myfile.txt" would be "\003foo\003bar\003baz\012myfile.txt\000", with each length byte written in octal), truly allowing any byte in a filesystem node's name, although that might be going a little bit too far.
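A minimal sketch of the layout in the example above, assuming one length byte per component (so names of 1 to 255 bytes) and a single 0 byte ending the path. The function names `encode_path` and `decode_path` are hypothetical, and this is not how any real filesystem stores paths.

```python
def encode_path(components):
    """Encode a list of byte-string names as length-prefixed components."""
    out = bytearray()
    for name in components:
        # Assumption: a single length byte, so 1..255 bytes per component.
        if not 1 <= len(name) <= 255:
            raise ValueError("component must be 1 to 255 bytes long")
        out.append(len(name))  # length prefix
        out += name            # raw bytes, no reserved values
    out.append(0)              # a zero length marks the end of the path
    return bytes(out)


def decode_path(buf):
    """Inverse of encode_path: split the buffer back into components."""
    components = []
    i = 0
    while buf[i] != 0:
        n = buf[i]
        components.append(buf[i + 1:i + 1 + n])
        i += 1 + n
    return components


encoded = encode_path([b"foo", b"bar", b"baz", b"myfile.txt"])
print(encoded)
print(decode_path(encoded))
```

Because the decoder relies on lengths rather than separators, a component such as b"a/b\x00c" survives a round trip unchanged, which is the "any byte" property the comment describes.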

Things are much easier for the file system if it can just treat names as bags of bytes.

If you're really talking about bags (unordered multisets), that would certainly make for an interesting filesystem since filename.txt, filemane.txt, and maletent.fix would all be the same...

This is a big change. I guess they have now decided that compatibility with external systems is a more important goal than end-user friendliness.

It’s a reasonable decision to come to, but it will cause quite a bit of churn in the short term.

