Common shell script mistakes (2008) (pixelbeat.org)
166 points by pmoriarty 500 days ago | 79 comments

Shout out to Shellcheck (http://www.shellcheck.net/), a linter for shell scripts

If you're on Sublime, this works like a charm!


If you're on Vim or Emacs you (maybe) already know there's shellcheck integration

Emacs - https://github.com/koalaman/shellcheck

Vim - part of syntastic - https://github.com/scrooloose/syntastic/blob/master/syntax_c...

The Google shell style guide linked from the article now lives here and is probably an even better starting point than this article:


The Google shell style guide uses bash, which is not portable, one of the main points of the OP article.

Bash is the default shell on the most popular server *nix (Linux) and the most popular desktop *nix (OS X, though it's always a slightly old version).

If you use OpenBSD and hate ports, then maybe ignore all the good things bash brings and write everything in Bourne shell. If not, then write bash. Just fire it up with '#!/usr/bin/env bash' to label it correctly.
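That is, start the script with an env shebang so the kernel finds bash wherever it happens to be installed (a minimal sketch):

```shell
#!/usr/bin/env bash
# env searches $PATH, so this also works on systems where bash
# lives in /usr/local/bin (the BSDs) rather than /bin.
echo "running under bash $BASH_VERSION"
```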

"OS X, though it's always a slightly old version"

Apple ships the last version that had a GPL2 license. That version gets older by a day every day.

On OS X 10.11.2, it seems to be bash 3.2 (http://opensource.apple.com/source/bash/bash-99/, via http://opensource.apple.com/release/os-x-10112/), which is from October 12, 2006 (https://lists.gnu.org/archive/html/info-gnu/2006-10/msg00006...)

I wouldn't call that slightly old.

Bourne shell being replaced as /bin/sh by dash doesn't change anything: Linux and OSX don't use /bin/sh as the default shell.

> Linux and OSX don't use /bin/sh as the default shell.

You didn't even read the first sentence of the first hyperlinked page.

Yes I did. Proper shell scripts specify their actual shell - ie, bash scripts use bash as their interpreter. Your linked article mentions this explicitly.

So, again: whether bin sh is Bourne or Dash doesn't change anything, because a properly written script will specify its interpreter.

I believe xe meant "default login shell". Also, "Bourne Again shell", while we're complaining about missing words :-)

Xe would have been a bit daft to mean that, given the headline topic as well as that the conversation so far has dealt in shell scripts and shell script interpreters, to which the choice of login shell is almost wholly irrelevant.

My username is nailer not xe, but you're right, I meant default login shell when I wrote 'default shell', rather than what bin sh is linked to.

Even if bin sh links to bash, bash will still run a bin sh script in Bourne shell mode, stripping newer features, in some cases. Hence proper scripts always specify bash explicitly.

However I meant Bourne shell when I wrote 'ignore all the good things bash brings and write everything in Bourne shell.' Bourne shell being the original /bin/sh (though it's murky).

>If you use OpenBSD and hate ports, then maybe ignore all the good things bash brings and write everything in Bourne shell. If not, then write bash.

If Bourne is not comfortable or powerful enough for the task you're trying to do, you're using the wrong tool and Python is probably what you should be using.

Yeah I'd do most systems work in Python too, but bash has some useful safety options in shopt, some neat syntax items like ..., arithmetic subshells, etc. which can make it fine for basic scripts.

I guess you think Bourne shell -> Python depending on complexity. I think bash -> Python depending on complexity.
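For reference, the sort of safety preamble being alluded to (a sketch; pick the options that fit each script rather than applying all of them blindly):

```shell
#!/usr/bin/env bash
set -e             # exit when a command fails
set -u             # treat expanding an unset variable as an error
set -o pipefail    # a pipeline fails if any stage fails, not just the last
shopt -s nullglob  # unmatched globs expand to nothing instead of themselves

echo "safe-mode preamble loaded"
```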

>which can make it fine for basic scripts.

As long as it's for your own system administration, sure. The moment it's being used for software intended to be used by third parties, no, the niceties of bash over bsh are not worth it.

> the niceties of bash over bsh are not worth it?

Worth what? What do you lose?

...portability? That's the whole point of this conversation :)

So many edge cases. I prefer to just use perl if I am doing anything more complicated than making a wrapper for some other executable. Harder to screw up, easier to test, same expressiveness, better portability (works with all shells and with windows), plus you get regular expressions for free.

Python is a good alternative as well.

Perl is brutally under-rated as a scripting tool these days. I suppose it's passed into software orthodoxy by this point, the notion that perl is a uniformly horrific tool akin to nuclear waste.

I think that this is because people are finding Python equally well suited for such tasks and more modern. Perl's (easily) obscure syntax is not helping either.

If you're doing a really grungy, grotty text mangling task in a UNIX pipeline, suddenly all those crazy Perl features start making sense. The reason why Perl is arguably not the best choice for other tasks, such as creating large systems, is precisely that it's actually got many distinct advantages over Python for that sort of task. It's also really good for "I have a UNIX pipeline but this one chunk of it needs a database and this other chunk needs a JSON file and a few other things that shell just can't do".

For instance, for the first time in 15+ years, I recently had a reason to use Perl's half-deprecated, off-handedly-included form printing support, and, you know what? It's really nice if that's what you want. Many dozens of lines of code replaced by what is basically a picture of what I want.

Python is a great language, but every once in a while "explicit is better than implicit", which is generally a philosophy I agree with wholeheartedly, results in a lot more "explicit", in the form of "code I had to write", than you might be looking for. (Or, alternatively, you start writing Python that is a lot more implicit, which isn't exactly a huge victory for Python here.)

Nowadays I'd also suggest that if you find yourself having a lot of those tasks, you need to ask some questions about how you're storing your data. But pretty much everybody ends up with some of these somewhere; it's hard to eliminate them, and not really worth it (due to diminishing returns).

you can't learn perl 5 (as a new programmer) because of perl 6, and the last time I did a search for a perl 6 solution it didn't work for me, I actually ended up installing perl 5 and just doing it that way. I remember now, I literally googled "perl 6 installer windows" and clicked the first link and within 10 minutes couldn't install it. I think the installer just hangs - http://rakudo.org/how-to-get-rakudo/

so I just installed perl 5 and did it my usual way. But if I didn't know Perl already it would be a non-option. it's just not a language anymore, it's a legacy thing that if you already learned you can sometimes still use.

Although Perl6 exists, Perl5 is completely unfit for any purpose in the modern world. And just the memories of helpless hours debugging a simple 30-line script mean that I'll resort to Haskell before trying Perl6 for text manipulation.

A younger developer may think differently, but I do think the 'Perl' brand is already too damaged to survive the transition, even if Perl6 is better than modern languages (which I haven't seen anybody claim yet).

political correctness easily gets in the way of the serious warning that should go out to the younger folks: stay the f*ck away from bash and its siblings! do not ever think it is a solution worth considering! abandon!

instead use something with a cross-platform interface (Python, Chrome or Lua, whatever), compile your own executable or use a compiler that targets bash.

besides, it is only a matter of time before the systemd trojan invades this area, and your newly acquired skills will be obsolete </sarcasm>.

Great post. I disagree on using the concise form of the if statement, however.

  if [ "$var" = "find" ]; then
    echo "found"
  fi
Is far more readable than its equivalent

  [ "$var" = "find" ] && echo "found"
I understand the upside of readability. What does concision get me?

concision gets me less code to read. that's how i define "readability". but if forum comments are any indication, i know my preferences do not follow the norm.

most programmers seem to prefer verbosity.

however in my case verbosity slows me down.

>concision gets me less code to read. that's how i define "readability".

That's a bad definition. Readability should be: "code easier to read", not just "less code to read" -- since less is not always easier and can even be much harder.

Case in point: the J language.

Or "magic" constructs that do too much behind the scenes and don't let you immediately understand what a piece of code is doing.

By that logic you find minified javascript easier to read too.

This is not @pwd_mkdb's logic though, it's you who brought this to the extreme.

Not really. @pwd_mkdb said he defines readability as "less code to read".

If that's the only criterion, then the parent just followed the logic. Or let's put it another way, if the definition of readability can't survive taking it to the extreme, it is flawed.

> If that's the only criterion, then the parent just followed the logic.

wodenokoto offered up some reductio ad absurdum and you are calling it logic.

It was not the only criterion. It was the only criterion that was explicit. There are implied criteria, and most of us understand what pwd_mkdb means even if we don't necessarily agree with him.

>It was the only criterion that was explicit.

Well, even in itself, it is wrong.

It's not just that other criteria apply too -- it's that it alone needs several caveats, as succinctness is quite orthogonal to readability (e.g. sometimes even needless boilerplate syntax that the compiler could infer by itself makes for better readability when present).

> Well, even in itself, it is wrong.

I am not in disagreement with you about whether it is wrong, I am in disagreement with you about why it is wrong.

The counter argument was that Y is wrong and X is Y, therefore X is wrong. While this is a valid argument, it is not sound, because X is not Y.

> ... succinctness is quite orthogonal to readability ...

There is a relationship between succinctness and readability. The relationship is definitely not directly proportional, as pwd_mkdb's post could be read to imply, but to say there is no relationship between the two is flatly absurd.

I define readability the same way the Linux kernel style guide does: Avoid tricky expressions.

Something very 'readable' is something read and understood easily. Python is considered very readable, for example. I would not consider Python verbose. C++ template programming is verbose and it is not particularly pleasant to read. At the same time, APL or J or something can be very concise and is largely unreadable to most. Notice how Python, which is considered by most to be readable, is somewhere in between.

If the command is as short as "echo 'found'", then that one-liner is easier, but if it's a nice, lengthy command, I find the 'full' version easier to read.

Is ts cmmt esr t rd?

Is it "more readable" or "more familiar to you"? They're both familiar to me and I find them equally readable.

Concision gets you fewer places to possibly have a bug, and fewer things to possibly miss/misunderstand.

Bash is the love of my life! I have been working for years on this problem now (not full-time of course), gradually moving in the direction of finally being able to challenge this:

"Inappropriate use

shell is the main domain specific language designed to manipulate the UNIX abstractions for data and logic, i.e. files and processes. ... Correspondingly, please be wary of writing scripts that deviate from these abstractions, and have significant data manipulation in the shell process itself. While flexible, shell is not designed as a general purpose language and becomes unwieldy when ... "

Another person has actually solved the most important show stopper already: http://ctypes.sh.

What now remains to be solved, are a few minor, additional details, and then simply writing a good manual of how to very successfully use bash as a general-purpose language.

My personal belief is that everything that you can do in other scripting languages, you can also do in Bash, only better.

Ok, I'll bite.

>> My personal belief is that everything that you can do in other scripting languages, you can also do in Bash, only better.

1) Native JSON, XML

2) Classes, namespacing, objects

3) Multiprocessing, multithreading

4) Performance

5) Package management

6) Portability

7) Documentation

8) Runtime debugging (!set -x)

I'm too tired to continue.

>3) Multiprocessing

IMO shell makes it very easy to work with multiple processes (&). It's built in and natural.

>4) Performance

If you are careful and know what you're doing, you can achieve very good performance with the shell. Usually, better performance is achieved by processing less data, ie being intelligent. It rarely depends on the language (unless you care about cycle-level performance, then yes :).

>6) Portability

I claim that it's way easier to depend on sh being on a (UNIX) system than $SCRIPTING_LANG.

>7) Documentation

?? You can mess up documentation in any language.

Shell makes it easy to spawn multiple processes. It makes it reasonably easy to read those processes' standard out or standard error, though it's not that much fun to try to do both at the same time while keeping them distinct. [1]

It pretty much doesn't do anything else that you might want to do with multiple processes, though, and it tends to encourage multiple processes to communicate via text which is a problematic limitation that one often finds oneself "working around".

Shell is really powerful, but it hits a certain limit of what kind of tasks it can do and it hits that limit hard, and that's why when one imagines orchestrating many processes on a machine to do some task, to say nothing of orchestrating many processes on many machines, you don't see solutions based on shell, and indeed the very idea is laughable. Shell is best used by making sure it stays firmly restricted to the domain it shines in and not so much as trying to dip a toe into the spaces where it is not.

[1]: Note "not much fun" != "can't". Shell is fundamentally written around the idea that a process has one stream STDOUT that may go to other processes, and one stream STDERR which is generally intended to go to the console (or other user output like a log) no matter how complicated the pipeline. While you can get both streams and do things to them, you're starting to fight shell, which really wants to create pipelines with one "through" path with no branches out.
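For what it's worth, one workable pattern for keeping the two streams distinct is to park stderr in a temp file while capturing stdout normally (a sketch; `some_command` is a made-up stand-in):

```shell
#!/usr/bin/env bash
some_command() {        # hypothetical stand-in for a real program
  echo "to stdout"
  echo "to stderr" >&2
}

# Keep the two streams distinct by parking stderr in a temp file:
tmp_err=$(mktemp)
out=$(some_command 2>"$tmp_err")
err=$(cat "$tmp_err")
rm -f "$tmp_err"

printf 'OUT: %s\nERR: %s\n' "$out" "$err"
```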

I think with the shell you have to adapt your abstractions to the "unix way". For example, a queue to process will be a directory with N files, and each file can be processed in parallel with something like `for f in dir/*; do process.sh "$f" & done` ... but yeah, it has limitations like everything.

With regards (3), my problem in shell is that it is very hard to spawn children without risking overloading the machine.

What I would like in bash is some easy way to limit the number of background processes I can spawn, and to just wait when I try to start another one until an existing one is finished.

Some simple jobs can be converted to use xargs -P, but for more complex things I end up having to do them without parallelisation, so I don't end up spawning 100s of background processes and bringing my computer to its knees.
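For what it's worth, bash 4.3 added `wait -n`, which makes a simple bounded-parallelism loop possible without xargs (a sketch; `process` is a made-up stand-in for the real work):

```shell
#!/usr/bin/env bash
max_jobs=4

process() {                     # hypothetical stand-in for the real work
  sleep 0.1
  echo "done: $1"
}

for f in item1 item2 item3 item4 item5 item6; do
  # Already at the limit? Block until one background job exits.
  while (( $(jobs -rp | wc -l) >= max_jobs )); do
    wait -n                     # bash 4.3+; older bash must 'wait' for all
  done
  process "$f" &
done
wait                            # collect the stragglers
```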

Yes ... I think that should not be allowed (bringing down the machine with a non-root user process). In Linux, cgroups (CPU/memory controllers) can help, and the fair scheduler has improved the situation a bit from the old days, where a fork bomb would bring the machine down.

But limiting the number of spawned children is possible with fairly simple ad-hoc solutions, though I guess it depends on the specific problem.

my personal belief is that anything one can do in bash i can do in sh. not sure if that's really true in practice, but that's my belief. i never use bashisms because i do not know what they are or how to use them.

You're just having a laugh at the OP, right?

In case you're not, here are some "Bashisms" that really suck to be without:

* built-in regex support (e.g. `[[ $var =~ ^1\.2\.[34]$ ]]`)

* process substitution (e.g. `diff <(before_command) <(after_command)`) and all sorts of other redirection tricks

* indexed and associative arrays

Some of this can be worked around by shelling off to grep for regular expression matching or awk for arrays, but Bash makes things so much cleaner and maintainable.
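For illustration, a quick sketch of those three in action:

```shell
#!/usr/bin/env bash

# Built-in regex matching, with capture groups in BASH_REMATCH:
ver="1.2.3"
if [[ $ver =~ ^1\.2\.([34])$ ]]; then
  echo "patch level: ${BASH_REMATCH[1]}"
fi

# Process substitution: diff the output of two commands directly
# (|| true because diff exits non-zero when the inputs differ):
diff <(printf 'a\nb\n') <(printf 'a\nc\n') || true

# Indexed and associative arrays:
fruits=(apple banana)
declare -A color=([apple]=red [banana]=yellow)
for f in "${fruits[@]}"; do
  echo "$f is ${color[$f]}"
done
```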

Add to that better error handling.

AFAIK in POSIX /bin/sh it's not possible to detect if a process that writes into a pipe exits with an error status.

bash has "set -o pipefail" and "$PIPESTATUS" for that.
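A minimal sketch of the difference:

```shell
#!/usr/bin/env bash

# By default a pipeline's exit status is that of its LAST command,
# so a failure in the middle goes unnoticed:
false | cat
echo "default: $?"          # cat succeeded, so this reports 0

# pipefail makes the whole pipeline fail if any stage fails:
set -o pipefail
false | cat
echo "pipefail: $?"         # now reports 1

# PIPESTATUS records each stage's status; snapshot it immediately,
# since the very next command overwrites it:
true | false | cat
stages=( "${PIPESTATUS[@]}" )
echo "stages: ${stages[*]}" # one status per stage: 0 1 0
```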

But once you adopt bash-only features, you're losing the main argument for a shell script: portable scripting without the need to first install something to get something else running. Once you require Bash, it's equally easy to demand Perl and that will provide a much richer scripting experience.

When was the last time you logged into a box and /bin/bash wasn't an option? Let me guess: 1999, on a SPARC box running Sun Solaris?

If you limit yourself to either a popular Linux distro or one of two Unixes/Unix-likes, then bash can be available out of the box. Just as C is not C++, shell is not bash, so a shell script is what runs on (d)ash, busybox, toybox, ksh, bash, zsh, etc., without modifications. If it requires bash or zsh, then it's not a shell script but a bash/zsh/fish script.

To name a popular non-Linux OS, take a fresh FreeBSD or OpenBSD install. No bash to be found, unless installed via ports, and rightfully so. That said, I use bash myself all the time as an interactive shell, but /bin/sh is not bash. sh and its descendants (including bash and zsh) are terribly hard to write correct and resilient scripts in, and even rc is much saner to script in.

> To name a popular non-Linux OS, take a fresh FreeBSD or OpenBSD install. No bash to be found, unless installed via ports and rightfully so.

And even if you do install it, it still won't work with scripts that specify /bin/bash (which is a lot of them, thanks to sloppy tutorials), since it will be in /usr/local/bin and not /bin.

> When it this last time you logged into a box and /bin/bash wasn't an option?

A few hours ago; and before that, last Wednesday. The latter says:

    JdeBP % /bin/bash
    zsh: no such file or directory: /bin/bash
    JdeBP %
The Z Shell is an add-on here, too. /bin/sh is the Korn shell. No, it's not OpenBSD.

today. the various BSD systems all come without bash in their default configurations (and it's never in /bin).

this should not be construed as an argument in support of /bin/sh.

On QNX all you get is ksh, which has little support for bash-isms.

Do embedded systems count? ;)

I'm having trouble remembering the last time I saw a BSD box during my work day, unless Juniper gear counts. No knock against BSD but seriously how many people are writing POSIX shell scripts because they need their shell scripts to work with Linux, QNX and BSD?

You should be first asking how many people are writing POSIX shell scripts because they need their shell scripts to work with Debian and Ubuntu.

* http://unix.stackexchange.com/a/250965/5132

* https://debian.org/doc/debian-policy/ch-files.html#s-scripts

* http://manpages.ubuntu.com/manpages/xenial/en/man1/checkbash...

* http://manpages.ubuntu.com/manpages/xenial/en/man1/posh.1.ht...

The answer is, of course, lots of them; as there was a massive project to do exactly this in Debian and Ubuntu some years ago.

A few, but there are still cases where it's best to stick with the plain old Bourne shell syntax.

When I mentioned embedded systems I primarily thought of Linux/Busybox-based devices, like OpenWRT. While one surely can have bash there, usually the base image doesn't contain it.

Same story for the most common Linux-based OS out there: Android. And while no one manually runs shell scripts there, apps quite frequently exec() things, so shell scripting still matters there - a tiny bit. Also, shell scripting is heavily used in firmware upgrades/patches, as well as glue for the root/unlocking hacks.

If you plan to write a redistributable component or a useful hack for such platforms, you'd best stick with a very limited subset of POSIX.

> My personal belief is that everything that you can do in other scripting languages, you can also do in Bash, only better.

Matrix multiplication?

Effective bash programming involves invoking other Unix commands. That's what the shell is for, after all. I don't know off the top of my head what you'd use for matrix multiplication; I guess it depends.

For another example, jq and jshon are excellent tools for querying, transforming, or creating JSON values.
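For instance, a couple of one-liners with jq (assuming jq is installed; the JSON is made up for illustration):

```shell
#!/bin/sh
# Bail out gracefully if the dependency is missing:
command -v jq >/dev/null || { echo "jq not installed"; exit 0; }

# Query: pull a single field out of a JSON document.
printf '{"name": "widget", "tags": ["a"]}' | jq -r '.name'

# Transform: append a tag and re-emit compact JSON.
printf '{"tags": ["a"]}' | jq -c '.tags += ["b"]'
```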

My experience is that programming by composing Unix programs that communicate via pipes and files can be very pleasant and productive.

Sometimes I lament aspects of bash syntax and semantics, but it's also much more convenient than other scripting languages for tasks related to spawning programs and pipelines and working with files.

One of my recent projects that I've been having a lot of fun with is a kind of continuous integration system built mostly with bash scripts. It involves a lot of calls to git and Docker, process manipulation, file descriptor redirection, and file system manipulation—and bash makes all this pretty easy and concise.

You're missing out on Zsh ;)

The big missing bashism in the list is the use of &> or >& to redirect both standard output and standard error.

In a POSIX conforming shell,

    &> word
is the same as

    & > word
since '&>' is not a distinct token; it puts the command in the background and redirects standard output.

The more commonly seen >& will bite you even in bash. What does this do?

    >& $FILENAME
Answer: in bash, it depends on the spelling of the file name.

Note that bash accepts both even in its so-called ‘POSIX compatible’ mode.
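Concretely, the trap is that with the `>&` form bash decides between "redirect both streams" and "duplicate a file descriptor" based on what the word expands to. A sketch of the surprise:

```shell
#!/usr/bin/env bash
cd "$(mktemp -d)"

FILENAME=out.txt
echo hello >& $FILENAME   # word is a filename: stdout AND stderr go to out.txt
ls out.txt                # the file was created

FILENAME=1
echo hello >& $FILENAME   # word expands to a digit: this is fd duplication
                          # (equivalent to >&1), so no file named "1" appears
ls 1 2>/dev/null || echo 'no file named "1" was created'
```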

> please be wary of writing scripts that deviate from these abstractions, and have significant data manipulation in the shell process itself.

I appreciate he mentions this first. Shell scripting is excellent in its domain, just as SQL is great in its.

You wouldn't write an application in SQL, and you shouldn't write one in shell.

> Just test the string directly like [ "$var" ] && echo "var not empty"

Will this work if you've activated set -u, fail on unset variables? Will the !-z variant do?

If `set -u` is enabled, the aforementioned test will fail on unset variables. To avoid this, you can "declare" `var` at the beginning of the script, e.g. `var=''`. Also instead of `[ ! -z "$var" ]` I prefer `[ -n "$var" ]`, which is the same as `[ "$var" ]` but more explicit.
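A sketch of both workarounds side by side:

```shell
#!/usr/bin/env bash
set -u

# This would abort the script, since $unset_var was never assigned:
#   [ "$unset_var" ] && echo "non-empty"

# Option 1: declare the variable up front.
var=''
[ -n "$var" ] || echo "var is empty"

# Option 2: use a default expansion, which set -u permits.
[ -n "${unset_var:-}" ] || echo "unset_var is empty or unset"
```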

"Writing shell scripts leaves a lot of room to make mistakes, in ways that will cause your scripts to break on certain input, or (if some input is untrusted) open up security vulnerabilities. Here are some tips on how to make your shell scripts safer.

Don't" [0]

[0] MIT, writing safe shellscripts ~ https://sipb.mit.edu/doc/safe-shell/

That temp file example is vulnerable to symlink attacks.
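For reference, the standard fix is to let mktemp create the file atomically rather than composing a predictable name (a sketch):

```shell
#!/bin/sh
# Vulnerable pattern: a predictable name an attacker can pre-create
# as a symlink pointing somewhere you'll then happily overwrite:
#   TMPFILE=/tmp/myscript.$$

# Safer: mktemp creates the file itself, with an unpredictable name
# and mode 0600, rather than reusing an attacker-supplied entry.
TMPFILE=$(mktemp) || exit 1
trap 'rm -f "$TMPFILE"' EXIT    # clean up even on early exit

echo "scratch data" > "$TMPFILE"
cat "$TMPFILE"
```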

Is there a modern "JavaScript: The Good Parts" for bash? There are so many ways to do things and it's often hard to tell which is preferable.

Not sure if this is what you're looking for, but it's where I tend to go when writing something.


I'm a big fan of this guide. I wish every language had such a clear best practices list. See also http://mywiki.wooledge.org/BashPitfalls

Not exactly "The Good Parts" for bash, but one might want to look at Turtle library:


Related: http://www.haskellforall.com/2015/01/use-haskell-for-shell-s...

Written in 2008.

Shell hasn't changed much, if at all, since, has it?

nope. reminds me of comments where someone is purporting to be able to assess the quality of software based only on looking to see when the last changes to the source code were made.

I'm not saying that is a valid strategy, but man I see more bad old code than new code.

Thanks; added.
