
What do you mean by valid characters? A file name can usually include any character except the forward slash. There are, of course, filename limitations, which are implementation-specific.


Well, any character but 0x2f and 0x00. :P
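To make the byte-level rule concrete, here is a minimal sketch of a check for a single path component. The function name `is_valid_component` and the `NAME_MAX = 255` default are assumptions for illustration; the real length limit depends on the filesystem.

```python
# Sketch only: the byte-level rule discussed above, not any specific filesystem's check.
NAME_MAX = 255  # assumption: a common limit; the real value is implementation-specific


def is_valid_component(name: bytes, max_len: int = NAME_MAX) -> bool:
    """Return True if `name` could be one path component on a typical Unix filesystem."""
    if name in (b"", b".", b".."):
        # Empty names are not allowed, and "." / ".." are reserved directory entries.
        return False
    if len(name) > max_len:
        return False
    # The only two forbidden bytes: the path separator (0x2f) and NUL (0x00).
    return b"/" not in name and b"\x00" not in name


if __name__ == "__main__":
    for candidate in (b"notes.txt", b"caf\xc3\xa9", b"\xff\xfe", b"a/b", b"a\x00b"):
        print(candidate, is_valid_component(candidate))
```

Note that `b"\xff\xfe"` passes even though it is not valid UTF-8, since the check never interprets the bytes as text.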


I was thinking of that, but more than that I was thinking of filesystems that accept UTF-8 or UTF-16 names and reject invalid code points. From this morning's research, though, that seems to be only NTFS with UTF-16, which is not exactly a "unix" FS.


I don't think that's strictly true. Most unix filesystems "allow" UTF-8 characters, specifically because they treat the filename only as an array of bytes and don't interpret that array at all. Perhaps NTFS does some work to present names as UTF-16 code points, I don't know, but it's far from the only file system that allows this.
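A rough illustration of what "an array of bytes" means for a program that wants to show names as text: the name may or may not be valid UTF-8, so the program needs a lossless way to carry the odd bytes around. The helper names below are hypothetical; the `surrogateescape` error handler is standard Python, and as far as I know it is the same scheme `os.fsdecode` relies on for POSIX systems.

```python
def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the raw filename bytes happen to be well-formed UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def name_to_text(raw: bytes) -> str:
    """Decode a filename for display; undecodable bytes become lone surrogates."""
    return raw.decode("utf-8", errors="surrogateescape")


def text_to_name(text: str) -> bytes:
    """Inverse of name_to_text: recover the exact original bytes."""
    return text.encode("utf-8", errors="surrogateescape")


if __name__ == "__main__":
    raw = b"report-\xff.txt"  # 0xff can never appear in well-formed UTF-8
    print(is_valid_utf8(raw))
    # The round trip preserves the bytes even though they are not valid UTF-8.
    print(text_to_name(name_to_text(raw)) == raw)
```

The filesystem stores the bytes untouched either way; only the program reading the name has to decide how to interpret them.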


He may have meant filesystems that accept only UTF-8/16. The very nice thing about UTF-8, though, is that it plays so well with routines that expect ASCII or Latin-1. You can't use old routines to count characters, change case, etc., but at least they won't corrupt your string by accident.
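A small sketch of that property, assuming an "old routine" that only knows ASCII (the `ascii_upper` helper is made up for illustration). Every byte of a multi-byte UTF-8 sequence is 0x80 or above, so a routine that only touches 0x61..0x7a leaves those sequences alone.

```python
def ascii_upper(data: bytes) -> bytes:
    """Upper-case ASCII letters only, the way a byte-oriented legacy routine would."""
    # Bytes >= 0x80 (all parts of multi-byte UTF-8 sequences) pass through unchanged.
    return bytes(b - 0x20 if 0x61 <= b <= 0x7A else b for b in data)


if __name__ == "__main__":
    text = "na\u00efve caf\u00e9"  # two non-ASCII letters, each two bytes in UTF-8
    raw = text.encode("utf-8")

    shouted = ascii_upper(raw)
    # Still valid UTF-8: the accented letters are untouched, not corrupted.
    print(shouted.decode("utf-8"))

    # Byte count and character count differ, so a byte-based length is "wrong".
    print(len(raw), len(text))
```

The result is incomplete rather than broken: the accented letters keep their case and the byte length overstates the character count, but the string still decodes. A routine that applies Latin-1 case rules to bytes above 0x7f would be a different story, since it can rewrite UTF-8 lead bytes.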



