Unix Wildcards Gone Wild (defensecode.com)
269 points by 0x0 on Aug 17, 2014 | 51 comments



David A. Wheeler, FOSS (and occasionally security) luminary, who also happens to be the creator of the popular sloccount tool, has an excellent page that covers this topic and how to use paths safely and portably in shell scripts (spoiler: it's hard):

http://www.dwheeler.com/essays/filenames-in-shell.html

I strongly recommend his many other essays to HN readers: http://www.dwheeler.com/

Edit: a simple way to avoid these problems is to prepend the wildcard with ./ (so globbed files won't start with - or -- but with the path ./), and on GNU systems to put -- before the wildcard, telling the tool that the following arguments are not options.
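For example, a quick sketch (rm here just stands in for any command):

    touch -- '-rf'   # an attacker-created file whose name looks like an option
    rm *             # expands to `rm -rf ...` -- the filename is parsed as options
    rm ./*           # expands to `rm ./-rf ...` -- safe with any tool
    rm -- *          # safe with tools that honor "--"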


The double dash -- to stop option parsing is not a GNU thing – it comes from the POSIX standard.


Thanks! I had always figured that, along with long options, they were a GNU thing.


There's some digging here: http://unix.stackexchange.com/questions/147143/when-and-how-...

Looks like the convention was introduced as part of getopt in AT&T Unix System III (1980), though initially only in a handful of places. Then POSIX adopted it as standard for all utilities sometime later.


They were quite distinct.

When --longopt was starting to get popular (in an ad-hoc way) for GNU and other utilities, there was a poll (gnu.misc? late 90s?) on whether to make it a GNU standard. The other choices were a +, or having commands not accept both bundled single-letter options and long opts. There may have been other choices I've forgotten about.

The double-dash won. There were some people who were concerned that it might be confused with the end-of-args double dash.

Maybe that is true, but they are certainly parsed unambiguously.


> portably in shell scripts (spoiler: it's hard)

At what point do you give up and, if you must have portability, bootstrap a saner environment?

Are there many options in this area? Assume Perl is available? Use an autoconf-like shell script compiler? Start with Lua (which I hope/assume can build anywhere, though I know it's missing lots of functionality out of the box)?


I'm lucky not to have to worry about this, but I figure that having Perl is a safe assumption for desktop and server Unix systems.

Not embedded devices, but I guess those are different enough that you're probably not targeting them for portable scripts. Oh no, have I contradicted myself? :)


"I'm not sure how to put it mildly, but I think you might have been scooped on this some 1-2 decades ago..."

http://seclists.org/fulldisclosure/2014/Jun/136


I really wonder which "old-school Unix admin" this person asked that didn't know of this attack. This article also doesn't mention the countermeasure available in every utility I know of: the -- argument, which stops option parsing so that all further arguments are treated as filenames.


Even if you're using a tool that doesn't support "--", you can just use "./*" and everything will be fine.


Many of these gotchas have been known for quite a while. For people who don't know it yet, I suggest reading the UNIX-HATERS Handbook:

- homepage http://homes.cs.washington.edu/~weise/unix-haters.html

- working download link http://richard.esplins.org/static/downloads/unix-haters-hand...

A lot of it is outdated, and yet many things are still incredibly relevant.


Many people are recommending the '--' option-terminating option. Note David Wheeler's caution (in the essay already linked by AceJohnny2 at https://news.ycombinator.com/item?id=8190208) about why this is not an all-purpose solution: http://www.dwheeler.com/essays/filenames-in-shell.html#dashd....


The reasons given there aren’t really any good:

  1. For “--” to work, all maintainers would have to faithfully use “--” in
     practically every command invocation. That just doesn’t happen in real
     life, even after decades of people trying. People forget it all the
     time; no one is that consistent, especially since code seems to work
     without it. Very few commands require it, after all.
So because other people may or may not forget it, I shouldn’t use it in my scripts/day-to-day usage? That’s about as silly as saying that, because other people will write unreadable code anyway, I shouldn’t bother with comments, short functions, or sensible variable names.

  2. You can’t do it anyway, even if you were perfectly consistent; many
     programs and commands do not support “--”. POSIX even explicitly
     forbids echo from supporting “--”, and echo must support “-n” (and
     GNU coreutils echo supports other options too).
This is a problem, but only if you have to use echo for some reason. printf works nearly as well and supports -- just fine. I may have read somewhere that using printf is actually advocated nowadays, but I'm not sure where or why.
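For example, a minimal sketch; the point is that the data goes through the format string and is never parsed as options:

    var='-n'             # imagine this came from a filename or user input
    echo "$var"          # bash's echo treats it as the -n option: prints nothing
    printf '%s\n' "$var" # prints "-n" followed by a newline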


> So because other people may or may not forget it, I shouldn’t use it in my scripts/day-to-day usage?

He specifically mentions that that is not his point, but rather that he is arguing against exactly the sort of "just use '--'" response that one can find in this post:

> Do feel free to use “--” between options and pathnames, when you can do it, as an additional protective measure. But using “--” as your primary (or only) mechanism for dash-prefixed filenames is bad idea.


The excellent Shellcheck static analysis tool will save you from many of these potential pitfalls:

^-- SC2035: Use ./* so names with dashes won't become options.

http://www.shellcheck.net/about.html
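For instance, a toy line of the sort it flags, with the suggested fix (a sketch):

    chmod 000 *      # SC2035: Use ./* so names with dashes won't become options.
    chmod 000 ./*    # what shellcheck suggests instead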


I have worked with Linux systems since I was able to write, and this is the first time I've heard about this. Thanks HN \o/


    echo rm *
    echo chown -R nobody:nobody *.php
    echo chmod 000 *
    echo tar cvvf archive.tar *
    echo tar cf archive.tar *
    echo rsync -t *.c foo:src

This article starts with the premise that the person executing the command has no idea what files are in the current working directory. That is itself a more serious problem than the behaviour of wildcards.

Later in the article we learn that it also assumes GNU utilities. That is a second problem (IMO), and arguably also one more serious than the behaviour of wildcards. GNU userland and unneeded complexity (e.g. more features than any user will ever use) are practically synonymous.

Then there is the peculiar assumption that someone can place arbitrary files beginning with - or -- on this system. That itself is a far more serious problem than the behaviour of wildcards; I would say with that capability it is more or less "game over". In BSD you have, at the very least, mtree. How does the Linux user know she isn't executing some substituted executable?

Moreover, if caution was important to the hypothetical user in the examples, I think they would be in the form

   /path/to/program *


> GNU userland and unneeded complexity (e.g. more features than any user will ever use) are practically synonymous.

Ahhhh, the memories. These accusations bring me back to the early 1990s. Remember? Remember how it was? Oh, boy, how did we all get so old? To be young and running System V again...

And still today, whenever I'm faced with a system without GNU utilities, I find myself installing them to get what seems to me like basic functionality. We have never changed, have we?


To be young and running VMS again...


> This article starts with the premise that the person executing the command has no idea what files are in the current working directory. That is itself a more serious problem than the behaviour of wildcards.

Substitute `person executing the command' with `simple script', and you've got a better justification. I don't want my simple scripts to break down in the face of silly filenames.


I keep forgetting that GNU echo has an -e option which, if I'm not mistaken, makes it interpret backslash escapes the way printf does. (Why does it need this feature when there also exists a builtin printf? Never mind.)

Anyway, I didn't think about what happens if you create one file called "-e" and then do

    echo *
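(Spoiler, a quick sketch of what happens:)

    touch -- -e    # a file literally named "-e"
    echo *         # expands to `echo -e`; the "filename" is eaten as an option
    echo ./*       # expands to `echo ./-e` and prints it literally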
As someone else pointed out, using ./* instead of * will defeat the "exploits" in the article.


I remember when I used pdfimages to extract images from a PDF file without reading the manual beforehand. It turns out you should call it as pdfimages <pdf_file> <prefix>, and if you don't specify a prefix it generates filenames of the form -img001, -img002, ... (or something like that). I had a hard time deleting those images.
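For anyone who hits the same thing, either of these forms handles the dash-prefixed names:

    rm -- -img*    # "--" ends option parsing first
    rm ./-img*     # or anchor the glob so every expansion starts with ./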


Tangentially, anyone know how to make zsh less greedy about parsing wildcards? Something like this will fail with "no files matched", and the command won't run:

    rsync example.com:/foo/* .
My workaround is to quote the argument, but it's annoying.


Escape it, since you know the shell is going to be greedy about things.

rsync accepts globbing; it's not shell expansion that makes it work.

     rsync example.com:/foo/\* .


Well, the problem is that "example.com:/foo/*" isn't actually a local path that can ever be globbed by zsh alone -- it's an rsync/ssh/scp-style remote path, yet zsh still tries to glob it locally. Not sure why. I think the solution is to make zsh ignore arguments containing colons, but I don't know what the config option for that is.

Edit: Just remembered that zsh's over-eager globbing also fails with git -- e.g., "HEAD^" must be quoted.


IIRC zsh has an option for this (if a wildcard has no matches, it just gets left in its original form), but I don't remember what.

There's also `noglob`, which disables glob expansion for a command (e.g. `noglob echo 3*4 | bc`).
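If I remember right, the option is `nomatch`; unsetting it gives the behavior described above. A sketch of both approaches:

    unsetopt nomatch                    # leave unmatched globs in place instead of erroring
    noglob rsync example.com:/foo/* .   # or disable globbing for one command
    alias rsync='noglob rsync'          # common trick: make that permanent for rsync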


Unfortunately, this has some unwanted side effects (see https://github.com/robbyrussell/oh-my-zsh/issues/2945). The noglob option might work, if it doesn't prevent scp from working with local wildcards.


How about one of

    rsync -a example.com:/foo/. .
    rsync example.com:/foo/"*" .


Good summary of surprising behavior. You can work around a lot of these issues using "--" as an argument before you use any wildcards. This tells most commands to stop processing options and treat the rest of the arguments as files (or whatever other non-option arguments the command takes). That's getopt(3)'s behavior[0]. For example, "rm -- *" will not have the problem where directories are removed if there's an entry called "-rf" in the directory.

[0] http://pubs.opengroup.org/onlinepubs/9699919799/functions/ge...
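A minimal sketch of that behavior using the shell's own getopts (demo is a made-up function name):

    demo() {
      local opt OPTIND
      while getopts "rf" opt; do
        echo "option: $opt"
      done
      shift $((OPTIND - 1))           # drop the parsed options and the "--"
      printf 'operand: %s\n' "$@"
    }
    demo -r -- -f    # prints "option: r", then "operand: -f"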


A better workaround is to prefix your wildcarded arguments with ./ as this will work with all commands. For example, rm ./* will safely remove files regardless of how they're named.


Thankfully, on BSD the options must be passed before the file names.

For example, chown username:username files directory -R doesn't actually work. You have to move the -R to the front: chown -R username:username files directory.

Same thing with rm.


This is why I not only use "--" everywhere, I also religiously use full quoting of "${vars[@]}" and options like mv(1)'s --no-clobber when appropriate. Even without the security concerns, this kind of "least privilege" approach can help prevent a lot of really-annoying bugs.
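For example, a sketch of that style (--no-clobber is GNU mv; the names are made up):

    files=( ./*.log )                              # anchored glob, can't expand to options
    mv --no-clobber -- "${files[@]}" /var/archive/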

That said, I'm going to have to check a few scripts for that chmod attack (or similar) - I think I've seen that type of attack before, but I must have forgotten about it... sigh


This is one of the reasons sudo should (by default) only allow a whitelist of built-in commands to be run with wildcards.

Somewhat like sudoedit.

This is of course for the corporate case of a less privileged user performing a certain task at elevated privileges. Not for the more common use of sudo (these days) of people managing their own personal machines.


It's the shell that expands the wildcards; sudo only ever receives the resulting arguments.
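That is, by the time sudo runs, the wildcard is already gone. A sketch:

    sudo rm *    # the invoking shell expands * first, so if a file named "-rf"
                 # exists, sudo and rm just see `rm -rf <other files>`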


This is a dumb question.

Can files be created with "-" in Unix? I am using a Linux system and I am not able to do so using the vim or touch commands.


You can also use the "touch ./-filename" syntax.


Try this:

    touch -- -asdf


That works. Thanks :)


I think if you have the ability to create new files on a remote host, you are already compromised. No need to wait for an Admin mistake.


There are hosts with multiple users on them, who have some level of write access to somewhere on the filesystem. After all, Unix is a multi-user system, so it is not unheard of to have multiple users on it. That being said, this article is just saying to be careful with wildcards when you are sitting in a user-owned (or user-writable, such as /tmp) directory.


Not necessarily; you could make a .zip/.tar file with filenames like these that could trip up the end user trying to clean up after unpacking.


Was this bug already fixed in OpenBSD?


There's a reason this isn't talked about: it's not an actual legitimate/common vector for compromising a server.


Oh, it is. Many people do e.g. backups of all user dirs by running tar * as root from a cron job, thus rendering themselves vulnerable.
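Roughly the trick from the article, using GNU tar's checkpoint options (evil.sh stands in for the attacker's payload):

    # attacker, with write access to a directory that root backs up:
    touch -- '--checkpoint=1'
    touch -- '--checkpoint-action=exec=sh evil.sh'

    # root's cron job later runs:
    tar cf /backup/home.tar *
    # ...which the shell expands to (the space survives inside one argument):
    #   tar cf /backup/home.tar --checkpoint=1 '--checkpoint-action=exec=sh evil.sh' ...
    # and tar executes evil.sh as root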


Yes, because no user-uploaded filename will ever be parsed by a maintenance script in a server.


Not by any sane person, no. Why are you letting people upload and name their own files on your server? Should we be posting articles about the vulnerabilities in the finger daemon or Solaris 8's NIS implementation, while we're at it?

It seems like this article is aiming towards shared servers where you actually allow shell login to "untrusted" users, which IMO is a relic of days long past, that only really persists at (maybe) universities. Hell, even at the university I last worked at 6 years ago we just gave everybody their own VM. And nowadays I wouldn't even need to give them that... students can download and run a vagrant environment on their personal machine in like 2 commands.

It's not to say there's no audience for articles like this... I'm sure there's plenty of environments out there that still follow the multi-user server model from the 1970's, and I certainly pity anyone who has to administer those types of systems. But it certainly should be no surprise to anyone that there's a lot of malicious things you can do if you have shell access to a system (or to your point, the ability to upload arbitrarily-named files with arbitrary content. shudder)


This kind of article is good to keep bubbling up over time, to educate new users in best practices. Not everyone is a 20-year unix admin that's seen a bit of everything.


Let's say you have a web server, and this web server runs a web application that allows users to upload files. The web application is sane: it doesn't allow path traversal and has a proper .htaccess inside the upload directory.

Now all you need is a user who kindly requests a copy of the uploaded files. If you're not aware of this issue (and you must be "actively aware", i.e. watch out for it all the time), you can do something that shouldn't be possible.


I'm a bit late replying, but I wouldn't consider an application that allows users to upload files and pick their names to be a sane application. Do you think imgur (as an example) lets users name their files? What about stuff like the defunct megaupload or other file sharing sites? You get to pick filenames but that's really metadata... the URLs and (presumably) the underlying file storage structure are database-driven.


Could you go into more detail?


In most environments it would be insane to allow anybody untrusted to put files on your server. That includes a trusted sysadmin extracting unknown tar files on your server. It's a case of "if they got this far, you're already fucked."



