

Filenames and Pathnames in Shell: How to Do It Correctly (2010) - thefox
http://www.dwheeler.com/essays/filenames-in-shell.html

======
Adaptive
One of the reasons I like zsh is its copious expansion flags. If you aren't
wedded to POSIX or bash syntax, consider zsh for this reason alone.

In the case of the first example he shows, the cat -n problem, you can add the
:A modifier in zsh so that each glob match is expanded automatically:

    
    
        cat *(:A)
    

This will give cat full paths. That is not significantly different from the
./* prefix in his example, which refers to the working directory, but it
applies more universally, particularly when you start to use things like the
file type filters:

    
    
        cat *(.:A)
    

giving you only files, not directories, and also expanding to absolute paths,
while

    
    
        cat **/*(.:A)
    

does the same for plain files in the working directory and all subdirectories
as well.

Remember to test your patterns with a print statement first:

    
    
        print -l **/*(.:A)
    

before passing them to a command.
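
For readers without zsh, a rough portable approximation of `**/*(.:A)` can be sketched with find. This is only a sketch under stated assumptions: the temporary directory and the file names are made up for the demo, and unlike the zsh pattern, find also lists dotfiles and does not resolve symlinks. Newlines in file names would break the line-based count.

```shell
#!/bin/sh
# Hypothetical demo tree; the names are illustrative only.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/sub"
touch "$demo_dir/top.txt" "$demo_dir/sub/nested.txt"

# -type f keeps plain files only (like the "." qualifier); starting from an
# absolute path makes find print absolute paths for every result.
found=$(find "$demo_dir" -type f | sort)

# Count the result lines; tr strips the padding some wc versions add.
file_count=$(printf '%s\n' "$found" | wc -l | tr -d ' ')

rm -rf "$demo_dir"
printf '%s\n' "$found"
```

As with `print -l`, printing the list first is a cheap way to check the selection before handing it to a command.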

~~~
sjolsen
>Remember to test your patterns with a print statement first

ZSH has an option to expand globs onto the command line the first time you hit
return, and issue the command the second. I don't remember what it is.

~~~
Adaptive
By default zsh will expand globs if you hit tab on the command line.

However if you have more than a handful of matches it's going to be a
suboptimal method of eyeballing the glob results.

I prefer to use `print -l` as it will break the glob results on separate
lines.

An additional benefit of ZSH is that it will normally return each glob result
as an independent item (this avoids the problem bash has with just returning
strings that in turn are broken on IFS characters). Using the -l flag with
print will give you a clear sense of the actual results.
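
A small sketch of where the IFS splitting shows up in a POSIX-style shell. The file names and the temporary directory are made up for the demo. The glob itself yields one word per file; the breakage appears once the results are flattened into a string and expanded again without quotes.

```shell
#!/bin/sh
orig_dir=$PWD
demo_dir=$(mktemp -d)          # hypothetical scratch directory
cd "$demo_dir" || exit 1
touch "a b" "c"                # one name contains a space

# Glob results arrive as separate words, so "a b" stays one item.
set -- *
glob_words=$#

# Flattening to a string and expanding it unquoted re-splits on IFS.
flat="$*"
set -- $flat
split_words=$#

cd "$orig_dir" || exit 1
rm -rf "$demo_dir"
echo "glob: $glob_words words, re-split: $split_words words"
```

With the two files above, the glob yields 2 words and the re-split string yields 3.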

------
deathanatos
Today I learned globbing happens after word-splitting.

Can someone explain the following:

    
    
       for file in ./* ; do        # Prefix with "./*", NEVER begin with bare "*"
         if [ -e "$file" ] ; then  # Make sure it isn't an empty match
    

1. Why prefix with a "./"? Is that just to help avoid the `cat $filename`
scenario? (i.e., that $filename will be "./-n" instead of "-n", and that

    
    
      cat -- *
    

is perfectly valid?)

2. What's the -e check for? It says "an empty match" — -e means that the file
exists, but * would only return files that exist, so -e must (with some
caveats) be true. (The caveat being that there's a race condition between the
globbing and the test, but with the added test, there's _still_ a race
condition between the glob, the test, and the command execution. Are we just
attempting to minimize the amount of race-condition by testing?)

~~~
rjgray
1. That's my understanding from the article.

2. I think this is for the situation where the glob doesn't match, and the
nullglob shell option is not set. Without that option, a non-matching glob is
left as a literal word, e.g. in an empty directory:

    
    
      $ for file in ./*; do echo $file; done
      ./*
    

Note the glob pattern is printed by the echo statement. The -e test catches
this condition.
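
The check can be sketched as a small function. `count_matches` and the temporary directory are hypothetical names for this demo, and the extra -L test is an assumption added so that dangling symlinks (which fail -e) still count as real matches.

```shell
#!/bin/sh
# Count the entries a glob really matched, skipping the literal pattern
# that an unmatched glob leaves behind when nullglob is not in effect.
count_matches() {
    dir=$1
    count=0
    for file in "$dir"/*; do
        # With no match, $file is the literal string "$dir/*", which
        # names no existing file, so both tests fail and it is skipped.
        if [ -e "$file" ] || [ -L "$file" ]; then
            count=$((count + 1))
        fi
    done
    echo "$count"
}

demo_dir=$(mktemp -d)                      # starts out empty
empty_count=$(count_matches "$demo_dir")
touch "$demo_dir/-n" "$demo_dir/a b"       # awkward but legal names
full_count=$(count_matches "$demo_dir")
rm -rf "$demo_dir"
echo "empty: $empty_count, populated: $full_count"
```

The loop body runs once even for the empty directory; the test is what keeps the literal `*` from being treated as a file.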

------
white-flame
The original strength of Unix also ends up being such a commonly frustrating
feature: Everything is marshalled through strings.

With human-manageable strings comes ambiguities, especially in concatenative
situations like commandline expansion and SQL injection susceptible code.

There's really no good universal solution. Judiciously adding explicit
boilerplate as the article describes, or using less open-ended syntax which
ends up adding common syntactic overhead as well, are both more painful to the
user in common cases.

~~~
sjolsen
The solution is actually pretty straightforward: don't do everything with
text. The actual content you want to manipulate, sure, do that with text. But
there's no good reason to _structure_ with syntax except that that solution is
compatible with even the most arcane computer interfaces.

_How_ you replace syntax-based structuring is the hard part, but it's not
impossible.

~~~
sukilot
What can you structure with besides syntax? Even data structures in memory
have a (binary) syntax

~~~
sjolsen
>What can you structure with besides syntax?

At the interface level, nested navigable fields.

>Even data structures in memory have a (binary) syntax

No, they don't. Syntax is the expression of structure through the arrangement
of the contents of a single sequence. Data structures as they are typically
realized express structure through the relationships _between_ several
sequences.

------
bch
Note also that some[1] commands will have a "--" (dash dash) flag indicating
"end of flags", so (e.g.) "cat -- -n" really would cat a file called "-n"[2].

[1] on my BSD system, many internal Tcl commands honor this convention. Damned
if I can find a section 1 shell command that uses the convention, but I'm sure
I've seen them.

[2] my version of cat doesn't have a -- flag, so my example is contrived; not
sure if GNU cat differs.

EDIT: typo, perl(1) supports "--". See perlrun(1) for details.
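
A quick way to try both spellings, assuming a cat that follows the POSIX utility syntax guidelines (GNU coreutils cat does; other implementations may differ). The scratch directory and file content are made up for the demo.

```shell
#!/bin/sh
orig_dir=$PWD
demo_dir=$(mktemp -d)              # hypothetical scratch directory
cd "$demo_dir" || exit 1
printf 'hello\n' > ./-n            # create a file literally named "-n"

# A bare "cat -n" would parse -n as an option and wait on stdin.
with_prefix=$(cat ./-n)            # "./" keeps the name from looking like a flag
with_dashes=$(cat -- -n)           # "--" ends option parsing before the name

cd "$orig_dir" || exit 1
rm -rf "$demo_dir"
echo "prefix: $with_prefix, dashes: $with_dashes"
```

The "./" form works with any command, while the "--" form depends on the command honoring the convention.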

------
Scaevolus
Shellcheck will find a broad variety of unsafe shell operations, including
most (all?) of the issues on this page:
[http://www.shellcheck.net/](http://www.shellcheck.net/)

------
swatow
First step for doing things correctly:

    
    
      import shutil

------
artmageddon
I'd love to know if there's a Windows equivalent to this.

~~~
est
Windows's parameter switch is clever. It uses a slash, like

dir /s

Since / cannot appear in a filename, it avoids the problem completely.

~~~
vacri
I couldn't find a link, but a few years ago there was a problem with a virus
checker and a game that (errantly) triggered it (I read this in the game's
support pages). It turns out that the given virus checker would quarantine
executables to a holding file called "c:\program". This game's launcher was
quarantined by the virus checker to that location.

So, it turns out that when Windows wanted to launch things, it would find the
first exe it could, then apply the rest of the command as args. "c:\program"
comes before "c:\program files\", so every time a user went to launch a
program, windows would find the "c:\program" exe first, and apply the rest of
the string as args (" files/and/rest/of/string"). So the launcher would fire
up, and it ignored the args. For some reason I can't recall, Windows kept
looking for the right program and eventually it would launch as well.

So the end-user, on trying to run any application, would get that application
plus the game's launcher, all because of the crazy way Windows searches its
path... well, when combined with a crazy virus checker behaviour.

Unfortunately I can't recall the checker or the game, sorry.

~~~
gear54rus
That's a scenario more common than it should be actually:

[http://www.commonexploits.com/unquoted-service-paths/](http://www.commonexploits.com/unquoted-service-paths/)

There's even a hint of privilege escalation there (but not always: writing to
C:\ still requires root in most cases).

~~~
vacri
A much better explanation, thank you.

