Defensive BASH programming (kfirlavi.com)
114 points by jdkanani on Dec 15, 2015 | 52 comments



The only thing in this post that can be accurately called defensive is the use of "local" and "readonly". The rest is all just style preferences, which are rather subjective, and none of which are very appealing to me.

Three real defensive bash programming tips are:

- Quote all uses of variables

- set -o nounset

- set -o errexit

And many others can be found in and around http://mywiki.wooledge.org/BashFAQ
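The three tips above can be sketched in a few lines. This is a minimal illustration (the `/tmp/my files` path is made up for the demo) of what quoting plus `nounset` plus `errexit` buy you:

```shell
#!/bin/bash
set -o nounset   # referencing an unset variable aborts the script
set -o errexit   # any command exiting nonzero aborts the script

dir="/tmp/my files"    # a hypothetical path containing a space
mkdir -p "$dir"        # quoted: one argument; unquoted $dir would split into /tmp/my and files
ls -d "$dir"
rmdir "$dir"
```

Without the quotes, `mkdir -p $dir` would create two directories; with `nounset`, a typo like `$dri` would be a fatal error instead of a silent empty string.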


While not as common as nounset and errexit, pipefail is a useful option as well (set -o pipefail).

Using pipefail, if any program in a pipeline fails (i.e. exit code != 0), then the exit code for the pipeline will be != 0.

E.g. pipefail can be useful to ensure `curl does-not-exist-aaaaaaa.com | wc -c` doesn't exit with code 0!
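A quick sketch of the difference, using `false | true` as a stand-in for a pipeline whose first command fails:

```shell
#!/bin/bash
false | true
echo "without pipefail: $?"    # default: the pipeline reports only the last command's status

set -o pipefail
false | true
echo "with pipefail: $?"       # now any failing stage makes the whole pipeline fail
```

The first `echo` prints 0, the second prints 1.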


You can set all three of them in a single line. Set up your Bash template with this today:

    set -o nounset -o pipefail -o errexit


I usually shorten this to:

    set -eu -o pipefail


I used to do that, but the long versions are more understandable to other people who will read the script.


Note that this can be shortened to "-e" and "-u":

    #!/...
    set -eu

    # ... Main part of your script ...


I'd used "set -e" and "set -u" before, but I had never seen it written as "-o nounset" and "-o errexit".

The latter makes it clearer exactly what features are being enabled, and it's a bit of a false economy to try and "shorten" the script like this.


Very interesting article and I agree with most of the points. One point I strongly disagree with is what is called "Code clarity", where the author replaces conditional expressions (https://www.gnu.org/software/bash/manual/html_node/Bash-Cond...) with function calls.

I agree that the function call introduces a better name. The problem is that this name is specific to the author of the script. It replaces reusable (if tricky) knowledge of the language with knowledge of one author's conventions.

If we take into account the increased verbosity, increased typing, increased number of lines, there is a clear loss in using these functions.

For me, adding a comment would be far enough and far less annoying.


I disagreed with that too, but mostly because those tests are a core part of Bash programming. If you don't know what they mean by reading them, then you just need to learn.

It'd be like defining an "and" function and writing

    if ( and(conditionA, conditionB) ) {
        //do something
    }
because "&&" is too confusing. It's just syntax. Learn it.

And on the subject of &&, it's interesting that in his example about clarity he chooses to use short-circuit and for brevity instead of the more readable if block.
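For reference, the two forms in question look like this (`is_empty` is a made-up helper for illustration); they behave identically, so the choice is purely about readability:

```shell
#!/bin/bash
is_empty() { [ -z "$1" ]; }

name=""

# Short-circuit form: terse, runs the echo only if the guard succeeds
is_empty "$name" && echo "empty (short-circuit form)"

# if-block form: more verbose, arguably more readable
if is_empty "$name"; then
    echo "empty (if-block form)"
fi
```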


A guy publishing a guide for "defensive bash programming", who, in the process, provides this listing as an example for anything, is not fit for publishing said guide in the first place:

  main() {
      local files=$(ls /tmp | grep pid | grep -v daemon)
  }


You might have been better off reading the text that followed the code example you quoted.

The text read:

- Second example is much better. Finding files is the problem of temporary_files() and not of main()’s. This code is also testable, by unit testing of temporary_files().

- If you try to test the first example, you will mish mash finding temporary files with main algorithm.


The problem isn't where it is - using "ls" to find files is never a good idea from my understanding. I think it's because of ls garbling file names but I might be mistaken there.

That the author is using that to find files is reasonable enough to me to disregard the rest of the blog.

edit: take that back...looked through rest of blog but don't see anything useful. I hate his idea for functions for builtins, using local is ok, please don't break up shell pipelines over N lines for simple stuff


Google also mandates that you shouldn't assign to a local var at declaration, because the exit code of the command producing the value is overwritten by the exit code of `local`.

https://google.github.io/styleguide/shell.xml?showone=Use_Lo...
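A small sketch of the pitfall (`might_fail` is a hypothetical stand-in for any command that can fail): combining `local` with the assignment masks the real exit code, while declaring and assigning separately preserves it.

```shell
#!/bin/bash
might_fail() { return 1; }

bad() {
    local files=$(might_fail)  # $? now reflects 'local' itself (0), not might_fail
    echo "assigned at declaration: $?"
}

good() {
    local files
    files=$(might_fail)        # $? is might_fail's real exit code
    echo "assigned separately: $?"
}

bad     # prints "assigned at declaration: 0"
good    # prints "assigned separately: 1"
```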


It's sad that all the examples were missing variable quoting. That makes all the other defensive measures useless.


Slightly off-topic but maybe some people will find it useful:

I used to write my various glue-things-together scripts in bash, but this quickly becomes a nightmare as the script grows, due to bash's corner cases, syntax, portability issues etc.

Recently I wrote my massive glue-things-together script with nodejs (since I already use node for many things) and it's much more maintainable and I couldn't be more happy. Node 0.12 has execSync which was the missing piece for making node the proper shell scripting platform.

If you are interested, you may want to check shelljs [1] and my snippets for robust `exec()` and `cd()` in node.

[1] https://github.com/shelljs/shelljs [2] https://gist.github.com/jakub-g/a128174fc135eb773631


Is it really a 'script' if you need to add a dir full of modules? I've always thought of 'scripts' as all-in-ones.


Well, you're right, it's a tradeoff. The 'script' is not standalone anymore, but it's a (script + package.json) and needs `npm install` to work properly. I still use bash for simple things, but if it grows too much and logic starts getting non-trivial, IMO it's a valid use case to switch.


I would love to write my bash scripts in node, but end up writing them in bash anyway ... How do you do something like this in node?

  sudo -u 2>&1 >> logfile | tee | echo


I'm assuming this isn't a real example of a thing you would want to do, because it doesn't make sense. (redirecting stderr to stdout, stdout to a file, then pipe to tee and echo?)

At any rate, you'd:

1. use some sort of popen()

2. use pipes and pass the same pipe to stdout as stderr

3. pass stdout of one popen to another's stdin

4. just use bash, because this gets so hairy and bash does it really well. When I want to use a complex pipeline of programs in non-bash programs, I'll frequently popen() to a bash script. This is ugly, but if you're really careful (shellshock?) it can be ok.


Does anyone know if these ideas conflict with http://mywiki.wooledge.org/BashFAQ or http://mywiki.wooledge.org/BashGuide? These are the resources that are drummed into you on #bash on Freenode, with the strong suggestion that All Other Resources Are Bad And You Should Feel Bad


The #bash users on freenode are right on that one. Everything in the wiki you linked to is documented and everything is explained clearly.

Another very good resource is the O'Reilly book: http://shop.oreilly.com/product/9780596009656.do


At some point, "defense" against a language's faults should translate to "use the right language for the job".

Any sufficiently-complex shell script can usually be written clearly as a Python or Perl program for instance, without having to worry about how the code might be misinterpreted.

Yes, I write shell scripts sometimes. I just make sure they're doing something pretty straightforward.


I think the best "defensive" piece of advice you left out, which everyone gets wrong, is to never pipe plain `find` results to `xargs`. One should always use `find ... -print0` with a read-while loop, because of filenames with whitespace.


`xargs -0' was designed to properly handle the output of `find -print0' -- no while loop needed.

`find -print0 | xargs -0 grep some_pattern'


And you also want to avoid `while` loops in shell scripts. See http://unix.stackexchange.com/q/169716/38906 for more details.


Use "find ... -exec command '{}' '+'" instead of "find | xargs command".
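Both safe approaches from this subthread, demonstrated against a deliberately awkward filename (the temporary directory and `.pid` name are made up for the demo):

```shell
#!/bin/sh
dir=$(mktemp -d)
touch "$dir/a file.pid"    # a space in the name breaks naive find | xargs

# NUL-delimited: names survive any whitespace (find -print0 / xargs -0)
find "$dir" -name '*.pid' -print0 | xargs -0 ls -l

# POSIX: -exec ... {} + batches names into the command, no pipe at all
find "$dir" -name '*.pid' -exec ls -l {} +

rm -r "$dir"
```

Both commands list the file correctly; plain `find "$dir" | xargs ls -l` would split the name at the space.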


Avoid the read-while loop; there are problems with filenames starting/ending with spaces. Better to use

    find ... -print0 | xargs -0 ... [eventually with -n1]


And you need the non-standard features `-print0` and `xargs -0` there. A better way is `find ... -exec cmd {} +`.


Is there a place you can use `-exec cmd {} +` but not `-print0`? My experience has been that I can use both or neither (Solaris 10 boxes).


Systems with a non-GNU find, including the default one in Solaris 10 (I've just double-checked). Perhaps you were using the GNU find in Solaris?

By the way, there are systems where 'print0' is a recognized parameter for the default find, but not '+'. RHEL 4.x comes to mind.


Sorry, I think we're saying the same thing. On systems with non-GNU find, do either `-exec cmd {} +` or `-print0` work? My experience was that both did NOT work. So either both work or both don't.

But if I understand you correctly, on RHEL 4.x `-print0` works but `-exec cmd {} +` doesn't.

Which is to say I disagree with the OP that it's better to rely on `-exec cmd {} +`, when it seems you're more likely to have `-print0` and `xargs -0` than that.


The backslashes ("sloshes") are redundant when the line ends with a pipe symbol. i.e.

    ls $dir \
        | grep something
is the same as

    ls $dir |
        grep something


However, the author argues that the pipe symbol should be moved to the beginning of the next line, so the backslash isn't optional.
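The two continuation styles side by side (grepping a trivial `printf` stream just for illustration):

```shell
#!/bin/bash
# Leading pipe: the backslash is required, otherwise the first
# line is a complete command on its own.
printf 'a\nb\n' \
    | grep a

# Trailing pipe: the shell knows the command is unfinished and
# keeps reading, so no backslash is needed.
printf 'a\nb\n' |
    grep a
```

Both print `a`.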


All these recommendations are excellent, except for the omission of the necessary obsessive quoting of all variables everywhere (just in case they contain a space).

I've been enjoying BATS [1] for my bash testing.

[1] https://github.com/sstephenson/bats


Why BATS? Why not something *unit-like? Example: https://github.com/vlisivka/bash-modules/blob/master/main/ba... .


TAP output is killer, plus I prefer to write my tests in bash -e, not Java, thanks.


BATS looks great, but it might not be maintained any more: https://github.com/sstephenson/bats/issues


I have always tried to avoid using functions in my bash scripts. That said, it's always nice to see other people's bash programming techniques and styles. This one had the biggest impact for me, although I can't imagine it's news to any of you on here: http://redsymbol.net/articles/unofficial-bash-strict-mode/


Remembering best practices for BASH can be hard. I recommend using http://www.shellcheck.net/. I use it with vim and syntastic. This article talks more about it: http://jezenthomas.com/shell-script-static-analysis-in-vim/


Content from so long ago, posted here on HN less than two weeks ago... here are some recent reddit comments about the post: https://www.reddit.com/r/bash/comments/3w354v/defensive_bash...


IMHO, it is much easier to parse arguments using bash_modules args module (disclosure: I am author).


I'm not sure scripting can be considered programming. Excessive scripting implies a lack of software architecture, not to mention that it's the least efficient way to work with the FS.


Depends on how you define scripting vs programming. I'd argue that you can "program" in a shell language; bash is Turing-complete after all.

Whether doing so is a good idea is another question entirely.




This guide is anything but ready for prime time. Vote it down; it is a pile of bad habits. Sure, there are one or two good ideas, but nothing revolutionary.


Use www.shellcheck.net as a complement


Step 1: Choose a better language. Almost anything will do.


I quite like

    set -u
    set -e
with

    trap 'echo $0 internal error at $LINENO' ERR


I kind of like that BASH allows you to make unconventional programs.


Yes, I applied some of these in my bash program.


To me, BASH is very offensive



