
Fixing Unix/Linux/Posix Filenames: Control Characters (Such as Newline), 2009 - based2
https://dwheeler.com/essays/fixing-unix-linux-filenames.html
======
catern
The problem is not in the filesystem, it's in the shell. Stop writing shell
scripts for tasks that need to be robust! None of these problems hit you when
using a better programming language like Python. (Yes, Python 3 has a bit of
an impedance mismatch between its native Unicode string type and the
filesystem, but that just makes for slightly uglier code, not actual bugs)
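
To make that concrete, here is a minimal sketch (POSIX assumed; `demo` and `round_trip` are names made up for this illustration). It creates a file whose name starts with a hyphen and contains a newline, then lists the directory both as `str` and as raw `bytes`. The name comes back as one entry either way, and `os.fsdecode`/`os.fsencode` round-trip a byte that is not valid UTF-8.

```python
import os
import tempfile


def round_trip(raw_name):
    """Decode a raw filename to str and encode it back to bytes.

    On POSIX, undecodable bytes survive via the surrogateescape handler.
    """
    return os.fsencode(os.fsdecode(raw_name))


def demo():
    """Create one awkwardly named file and list it as str and as bytes."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical hostile name: leading hyphen plus an embedded newline.
        awkward = "-rf\nnot a second file"
        with open(os.path.join(tmp, awkward), "w") as handle:
            handle.write("data")
        as_text = os.listdir(tmp)                # list of str
        as_bytes = os.listdir(os.fsencode(tmp))  # list of bytes
        return as_text, as_bytes
```

The "slightly uglier code" part is having to pick between the `str` and `bytes` flavours of the `os` functions; nothing here splits on whitespace or newlines.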

~~~
blablabla123
I wonder how much effort it would take to make the shell robust when it comes
to this. Obviously in Python 3 it's not a big deal to process UTF-8 strings
with control characters, store them in a database and still have decent XSS
protection with no ("manual") input filter. But with Python you lose the
flexibility of pipes, combining commands tersely on the fly.

~~~
JdeBP
Of course, the problem here is in thinking that there is "the" shell, and that
one needs to always use a POSIX compatible shell for interactive login. (The C
shell users of the world should disabuse one of that particular notion, if
nothing else.)

* [https://news.ycombinator.com/item?id=11672023](https://news.ycombinator.com/item?id=11672023)

* [https://news.ycombinator.com/item?id=17989710](https://news.ycombinator.com/item?id=17989710)

------
j0057
"Filenames are reasonable" is in practice a perfectly workable assumption. In
nearly 20 years of using Linux, I've never – not once – been bitten by a
filename starting with a hyphen, or one that contains a newline or control
characters. It seems plenty dangerous to allow untrusted user input for naming
files, so just don't do that and you're fine.
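
For the cases where a name does come from somewhere untrusted, one common defence against the leading-hyphen problem is to prefix relative names with `./` before handing them to another program. A rough sketch, assuming POSIX paths (`safe_arg` is a hypothetical helper, not something from the article):

```python
import os


def safe_arg(name):
    """Return a name that a command-line tool will not mistake for an option.

    Names starting with '-' get a './' prefix; everything else is unchanged.
    """
    if name.startswith("-"):
        return os.path.join(os.curdir, name)
    return name
```

Many tools also accept `--` to mark the end of options, but not every tool does, which is why the `./` prefix is often suggested instead.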

I don't get why you'd want to limit yourself to POSIX in 2019. Do you really
need to run your script on AIX or what have you? In 99.9% of cases you just
want GNU and macOS; as far as I know, that includes the most important GNU
tools.

Bash is really good enough for many tasks that involve gluing together a
number of utilities or shuffling files around. To do this in Python or Node.js
is to invite another world of hurt where you write easily 10x as much code and
need to worry about package management all of a sudden.

------
sys_64738
I've come across some crazy filenames under UNIX before and most originate
from SMB connections to PCs. Ones I've seen had '*' and '&' in their names
amongst other combinations. The craziest I ever saw had several '#' in their
name and that was interpreted by the bash script as a comment!

~~~
thisacctforreal
This isn't a problem if you properly use double quotes in your scripts.
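
The same idea from the Python side, as a sketch only: if a name has to pass through a shell, quote it first so `#`, `*` and `&` stay literal. `echo_via_shell` is a made-up helper, and it assumes a POSIX `sh` is on the PATH. Note that `shlex.quote` uses single quotes rather than the double quotes you would put around a variable in a script.

```python
import shlex
import subprocess


def echo_via_shell(name):
    """Print a filename through sh, quoted so the shell treats it as one word."""
    # shlex.quote wraps the name so '#', '*', '&' and spaces are not interpreted.
    command = "printf '%s' " + shlex.quote(name)
    result = subprocess.run(
        ["sh", "-c", command],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing an argument list to `subprocess.run` without any shell at all sidesteps the quoting question entirely; the shell is only used here to show that the quoted name survives it.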

------
usr1106
It's 10 years later, what has changed? Not that much. I am not aware of any
Linux distro that comes with an LSM to prevent nonsense/dangerous names.

All distros I have used use UTF-8 locales by default now, so that part of his
lengthy argument could nearly be removed. In Yocto, though, non-ASCII
characters on the command line still cause havoc. Well, Yocto is not a distro
but a way to build your own, so one needs to replace the busybox versions of
several standard tools like vi, less, etc. with the "full" versions.

------
dcassett
In the '90s there was a case of corporate espionage (theft of integrated
circuit IP) where the guy had copied the design databases into a Unix
directory named '..^A' (control-A at the end) such that an 'ls -a' would show
two '..' entries.
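
A name like that is easy to spot once control characters are escaped instead of printed raw. A small sketch of one way to do it (both helper names are hypothetical):

```python
def has_control_chars(name):
    """Return True if the name contains an ASCII control character or DEL."""
    return any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in name)


def visible(name):
    """Render a name with non-printable characters shown as backslash escapes."""
    return name.encode("unicode_escape").decode("ascii")
```

With these, the directory from the story would show up as `..\x01` rather than as a second `..`.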

------
based2
[https://lobste.rs/s/luehpr/fixing_unix_linux_posix_filenames](https://lobste.rs/s/luehpr/fixing_unix_linux_posix_filenames)

