When using `set -x`, this makes it show the filename, function name, and line number, which can be quite handy when debugging larger Bash scripts.
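For anyone wondering how that context gets into the trace output: I assume it's done by customizing PS4 before turning on xtrace, something along these lines:

    # sketch: put file, function, and line number into the xtrace prefix
    export PS4='+ ${BASH_SOURCE}:${FUNCNAME[0]:-main}:${LINENO}: '
    set -x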
I recommend shellcheck as well. It might not catch your problem, but it will point out possible problems.
Also, I recommend rewriting scripts in another language. At work we are converting bash scripts to rust, and while the ramp-up time is high, the resulting code is much easier to maintain and I have a much higher level of confidence in it. Bash is still good for quick scripts, but once you hit 100 lines or so you really deserve a language with stronger guarantees.
That's a nice list; I guess every experienced user has their own helper functions. However, I have a small criticism of the philosophy of that `die`: a `die` function should by default pass along the exit code of the failed command, and not silence its error output. If I want to give my own meaning to a command failure in a large script, I will use a different, more specialized `die`. My own `die` is roughly as follows:
    __errex() {
        printf "Fatal error [%s] on line %s in '%s': %s\n" \
            "${1:-"?"}" \
            "${2:-"?"}" \
            "${3:-"unknown script"}" \
            "${4:-"unknown error"}" >&2
        exit "${1:-1}"
    }
    alias die='__errex "$?" "${LINENO}" "$0"'
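A usage sketch, in case it isn't obvious; one gotcha is that non-interactive bash doesn't expand aliases unless you turn that on:

    #!/bin/bash
    shopt -s expand_aliases   # aliases are not expanded in scripts by default
    # ... __errex and the die alias from above ...

    cp /etc/hostname /tmp/backup/ || die "backup copy failed"
    # on failure, prints e.g.: Fatal error [1] on line 6 in './myscript': backup copy failed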
Is there a reason bash is still the de facto shell scripting language other than sheer momentum of legacy? I'm able to get what I need done in it, but it's clunky and the syntax is horrid. I guess it forces you to move to a proper language once scripts grow to a certain size/complexity, so perhaps it's by design?
Its legacy usage is certainly a big part of its popularity. You can generally rely on having a newish version of it on any modern distro and you don't have to worry about the version unless you want to do stuff with arrays etc.
What I find compelling about bash is its position in relation to other languages and tools. It's ideal for tying together other tools and is close enough to the operating system to make that easy, whilst also not requiring extra libraries to be installed (cf. python).
I often hear the opinion that more complex scripting should be moved to a language such as python, but that adds a layer of complexity that is probably not helpful in the long-run. I can take a bash script that I wrote twenty years ago and it'll still work fine, but a python programme from twenty years ago may well have issues with versions.
Bourne shell scripting is good enough, which makes it nearly impossible to replace. Plan 9's rc is a bit cleaner, but no one is going to switch for 'more of the same, but cleaner'. You haven't switched to something similar but better even though you could literally do it right now (https://pkgsrc.se/shells), and it doesn't run any differently for anyone else. It usually takes something several times better in some crucial aspect to replace an entrenched technology. For example, Plan 9 is better than UNIX-like systems, but not enough so to replace them. I don't think it's possible to make something good enough to replace Bourne shell scripting in its niche, because by the time you have something several times better, good enough to actually replace it, you're in a different ecological niche or problem domain: that of real scripting languages like Perl, Python, and Ruby. It's a local maximum that sucks the air out of the room for potential competition closer to the theoretical global maximum for this narrow problem domain.
It really is just legacy and momentum. Recent additions build on sh/bash really well, but in the end shell scripting is a means to an end that needs to evolve much more slowly than standard programming languages.
I think bash/sh's key feature is that they are anti-entropy: there's no development or evolution, so there's no chance you need to mess with dependencies or new features, and the stuff that worked 20 years ago will continue to be the "bread and butter". By design, this results in a system that's averse to change and incentivizes people to reach outside of it once its limits are hit.
Are you sure it's bash? Most scripts on FreeBSD are written for sh, which I feel is much more widely supported due to being part of the POSIX standard. Bash is just popular, I think.
FreeBSD's /bin/sh is based on ash, like NetBSD's, although I'm not sure how much they have in common these days. dash was forked from NetBSD's version of ash and then simplified considerably and fixed up to be fully (? or at least mostly) POSIX compliant. A while after that NetBSD's shell also had a bunch of POSIX fixes. I'm not sure how FreeBSD's shell is in terms of strict POSIX compliance.
In my opinion, bash has two things (at least vs NetBSD's shell, possibly a few more vs POSIX) that make the average shell script (that I write) much easier. The first is &>, which makes it easy to redirect both stdout and stderr to a file for logging. The standard 2>&1 can work, but it needs to be placed correctly or it doesn't; that place isn't always the obvious one, as it is with &>, and running bash seems much preferable to me over figuring that out.
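A quick illustration of the placement issue (`some_command` is just a stand-in):

    # equivalent: both stdout and stderr end up in the log
    some_command &> build.log
    some_command > build.log 2>&1

    # NOT equivalent: stderr is duplicated to wherever stdout pointed
    # *before* the redirection (usually the terminal), not to the file
    some_command 2>&1 > build.log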
The second is ${var@Q} which prints var quoted for the shell, which is nice to use all over the place to make sure any printed file names can be copied and pasted.
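For example (this needs bash 4.4 or newer, if I remember correctly):

    f='my file (draft).txt'
    echo "skipping ${f@Q}"
    # prints: skipping 'my file (draft).txt'   -- safe to paste back into a shell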
My sense is that targeting POSIX is usually done for maximum portability or for use on systems that don't have bash installed by default. However, bash is quite widely available even if not by default and very widely used so I wouldn't say it is unreasonable to look at bash as the de facto standard and POSIX and other shells as being used in more limited circumstances.
I feel like when I see a shell script in my work (which is not in operating systems development, of course), people are targeting bash. I agree many things are careful to target sh for certain reasons (e.g. a script that runs in a container where the base image doesn't have bash installed), but I still think GP's question is interesting, because it's not common to see, say, a zsh shell script, whereas seeing #!/bin/bash is super common.
I have done some delightful stuff in `zsh`, but I always lament how slow its numerical array traversal is. Frustratingly, experts told me it really doesn't have to be slow, the devs just don't seem to be bothering to revamp the underlying data structure because they are focusing more on associative arrays.
Bash is pretty much expected to be installed on any Linux distro. On FreeBSD (and likely other BSDs) it is an optional install. If you want a script to run on either, use sh. If strictly Linux, bash is probably safe.
Bash/sh is good for when you need to combine some commands and what needs to be done can be accomplished mostly by CLI commands with a little glue to tie them together. Sometimes it is surprising what can be accomplished. I wrote a program to import pictures from an SD card on Windows using C#, copying pictures to C:\Pictures\YYYY\MM\DD according to the EXIF data or, failing that, the file timestamp. I tried to port it to Linux but ran into problems trying to connect to the EXIF library. After struggling with that, I rewrote it using sh, an EXIF tool, and various file utilities. It took 31 lines, about half of which were actual commands and the rest comments or whitespace.
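The gist of it was roughly the following (a sketch, not the actual script; the paths are made up, exiftool stands in for whatever EXIF tool you have, and the mtime fallback assumes GNU date):

    #!/bin/sh
    SRC=/media/sdcard/DCIM
    DEST="$HOME/Pictures"

    find "$SRC" -type f \( -iname '*.jpg' -o -iname '*.jpeg' \) | while read -r f; do
        # EXIF DateTimeOriginal looks like "2023:05:12 10:30:00"
        d=$(exiftool -s3 -DateTimeOriginal "$f" 2>/dev/null | cut -d' ' -f1 | tr ':' '/')
        # fall back to the file's modification time if there is no EXIF date
        [ -n "$d" ] || d=$(date -r "$f" +%Y/%m/%d)
        mkdir -p "$DEST/$d" && cp -p "$f" "$DEST/$d/"
    done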
A much bigger project is a script to install Debian with root on ZFS. It's mostly a series of CLI commands with some variable substitution and conditionals depending on stuff like encrypted or not.
> Bash/sh is good for when you need to combine some commands and what needs to be done can be accomplished mostly by CLI commands with a little glue to tie them together.
Once I learned bash, I realised how many more problems I could solve, including a majority of the old ones. It's an entirely new level of "computer literacy", and a more genuine one.
Plug and slightly related. I once created a bash pipeline debugger that preserves the intermediate outputs. Has a few limitations but maybe generally useful: https://github.com/ketancmaheshwari/pd
The `die()` trick is good, but bash has an annoying quirk: if you try to `exit` while you're inside a subshell, then the subshell exits but the rest of the script continues. Example:
    #!/bin/bash
    die() { echo "$1" >&2; exit 1; }
    cat myfile | while read line; do
        if [[ "$line" =~ "information" ]]; then
            die "Found match"
        fi
    done
    echo "I don't want this line"
..."I don't want this line" will be printed.
You can often avoid subshells (and in this specific example, shellcheck is absolutely right to complain about UUOC, and fixing that will also fix the die-from-a-subshell problem).
But, sometimes you can't, or avoiding a subshell really complicates the script. For those occasions, you can grab the script's PID at the top of the script and then use that to kill it dead:
    #!/bin/bash
    MYPID=$$
    die() { echo "$1" >&2; kill -9 $MYPID; exit 1; }
    cat myfile | while read line; do
        if [[ "$line" =~ "information" ]]; then
            die "Found match"
        fi
    done
    echo "I don't want this line"
...but, of course, there are tradeoffs here too; killing it this way is a little bit brutal, and I've found that (for reasons I don't understand) it's not entirely reliable either.
Just adding `set -e` also exits the script when a subshell exits with non-zero error code. I'm not sure why I would leave `set -e` out in any shell script.
I use `set -e` but it has its own quirks. A couple:
An arithmetic expression that evaluates to zero will cause the script to exit, e.g. this will exit:
set -e
i=0
(( i++ )) # exits
Calling a function from a conditional prevents `set -e` from exiting. The following prints "hello\nworld\n":
    set -e
    main() {
        false # does not return nor exit
        echo hello
    }
    if main; then echo world; fi
Practically speaking this means you need to explicitly check the return value of every command you run that you care about and guard against `set -e` in places you don't want the script to exit. So the value of `set -e` is limited.
You can `set +e` before and `set -e` after every such command. I indent those commands to make it look like a block and to make sure setting errexit again isn’t forgotten.
But you probably still want an error if the input file does not exist. Handling grep correctly in a robust manner requires many lines of boilerplate: set +e, storing $?, set -e, and distinguishing exit values 0, 1, and 2.
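Something like this, with `pattern` and `file` standing in for the real values:

    set +e
        grep -q "$pattern" "$file"
        status=$?
    set -e
    case "$status" in
        0) : ;;                                   # match found
        1) echo "no match in $file" >&2 ;;        # fine for grep, maybe not for you
        *) echo "grep failed (missing file?)" >&2; exit 1 ;;
    esac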
Perhaps, but in any case I would never write code like this.
First of all, sending SIGKILL is literally overkill and perpetuates a bad practice. Send TERM. If that doesn't work, figure out why.
Secondly, subshells should be made as clear as possible and not hidden in pipes. Relatedly, looping over `read` is essentially never the right thing to do. If you really need to do that, don't use pipes; use heredocs or herestrings.
Thirdly, if you cannot avoid subshells and you want to terminate the full script on some condition, exit with a specific exit code from the subshell, check for it outside, and terminate appropriately.
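Something like this, reusing the example from above (42 is an arbitrary sentinel code):

    #!/bin/bash
    cat myfile | while read -r line; do
        if [[ "$line" =~ "information" ]]; then
            exit 42   # exits only the subshell on the right of the pipe
        fi
    done
    status=$?          # the pipeline's status is the subshell's exit code
    if (( status == 42 )); then
        echo "Found match" >&2
        exit 1
    fi
    echo "only reached when no match was found"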
Do enlighten me on why it is a bad idea to loop over read; it's perhaps one of my favourite patterns in bash and, combined with pipes, appears to me to be one of the cleanest ways to correctly and concisely utilise parallelism in software.
The provided points don't seem to be reasons for generally avoiding the subshell loop pattern.
Reasons of 1) performance, 2) readability, and 3) security are provided as points against the pattern, and the post itself acknowledges that the pattern is a great way to call external programs.
I'd think that the fact that one is using shell to begin with would almost certainly mean that one is using the subshell loop pattern for calling external programs, which is the use case that your post approves of. In this case, subshells taking the piped input as stdin allow the easy passing of data streamed over a file descriptor, probably one of the most trivially performant ways of moving data, and the pattern is composable: certainly easier to remember, modify, and extend than the provided xargs alternative, without potential problems such as exceeding the maximum argument length. Having independent subshells also allows for non-interference between separate loops when run in parallel, offering something resembling a proper closure. In these respects, subshell loops provide benefits rather than pitfalls in performance and readability. Certainly read has some quirks that one needs to be aware of, but they aren't much of an issue when operating on inputs of a known shape, which is likely the case if one is about to provide them as arguments to another command.
Regarding "security", the need to quote applies to anything in shell, and has nothing specifically to do with the pattern.
Thanks for the reference! Seems like a really good resource.
I disagree with the reasoning about pipefail though. If I expect a command to return non-zero exit code I'd rather be explicit about it.
Gentoo has a script you can source at /lib/gentoo/functions.sh that provides various helper functions, mostly for printing messages, and it gives you nice little green and red stars to indicate whether something succeeded or failed.
I use functions.sh in all of my scripts that are known to be running on Gentoo only and it makes them feel Gentoo-y and is useful in general.
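A rough sketch of what that looks like (from memory, so take the exact helper names with a grain of salt):

    #!/bin/bash
    . /lib/gentoo/functions.sh

    einfo "Starting backup"
    ebegin "Syncing files"
    rsync -a /srv/data/ /mnt/backup/data/
    eend $?   # prints the green * on success, red * on failure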
It's fine for Bash versions above v3 and provides decent logging, though I typically extend the script so that I can pipe long-running commands into its logging framework. It also ensures that you specify the "help" options correctly, as it parses the usage information to process the command-line arguments, with support for short and long options.
I think of it as a gateway to writing better scripts. When you first run it and it highlights what it considers to be a problem, you end up reading why it considers it to be a problem and that clues you in on some of the many footguns that Bash has.
I find such reasoning backwards. Indeed, shell scripting is not friendly to debugging. But ensuring correctness of shell scripts is essential: usually, they touch part of your "$HOME" or system folders and do tons of I/O, some of it destructive. I find it baffling to see people write careless scripts; sometimes using `rm` for cleanup with unquoted parameters, or much worse, dangerous uses of `mv`.
I believe OP's point was that it doesn't matter how simple or complex a script (or anything else) is: everything requires debugging. And/or that you have to be ready to debug bash whether you like it or not, regardless of what you choose to write your own stuff in.
OP's comment is not unfunny, and not 100% untrue either; but not 100% true either. Even a single-word script still needs to be debugged.
I hear this argument occasionally and it’s very contextual. While it’s certainly possible to rewrite any given shell script in Python, Rust, or whatever language you prefer there are some things which are just clearer in Bash.
I wouldn't want to write an entire application in Bash, but equally I wouldn't want to write a script which does relatively simple file operations in Python. Bash is a language which has been honed over many decades for precisely that sort of thing, and so can communicate what's happening far more clearly than Python does, in my view.
While that's certainly true for people trying to do very complex things in "pure" shell, when the tools you're using are possibly buggy, it's not very useful. Sometimes you have to debug and figure out where the problem is occurring, and then you can do the much simpler work of replacing one part rather than writing a bespoke program to replace all the boring functionality obfuscated away by the shell and the working programs.
Doesn't make sense. What if you get a script someone else wrote? Printing every command and confirming as you run every command is a great idea.
And, unfortunately shell has become the norm in CI/CD environments, pipelines etc. Can be convenient at times but can also be inconvenient and confusing as these scripts don't run in interactive shells.
Good stuff! I use `set -x` frequently and have used something similar to die (but Julia's version is nicer). I'll consider using the debugger thing, but stepping through a bash script line by line sounds a bit tedious. Perhaps less so than having to reread a log and rerun the script a bunch, though.
You do e.g. `fail-unless somecommand`. The result (exit/return code) is captured in the function and, based on that, the function either logs and exits or carries on.
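Roughly like this (`fail-unless` is just my name for it):

    fail-unless() {
        "$@"
        local rc=$?
        if (( rc != 0 )); then
            echo "FATAL: '$*' exited with code $rc" >&2
            exit "$rc"
        fi
    }

    fail-unless cp important.conf /etc/myapp/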
The line-by-line debugging would probably only be useful for a particular section of your script that you're trying to fix. In that case, you can remove the trap at the end of it with `trap - DEBUG`
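A minimal sketch of that workflow (assuming an interactive terminal is available on /dev/tty):

    trap 'read -rp "next: $BASH_COMMAND -- press Enter to run " _ < /dev/tty' DEBUG
    # ... the section you are trying to fix ...
    trap - DEBUG   # stop stepping once past the interesting part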
I have almost that same die() in every script, except I call it abrt(). Maybe I'll switch to die() since it's shorter. Mine also prepends $0 and sometimes I use printf or echo -e so I can pass larger more complex messages with linefeeds and escape codes etc.
The `trap DEBUG` thing is pretty interesting; I almost always write POSIX code, so I don't get to play with such tricks. Does anybody know of some wizardry that could mimic this in arbitrary POSIX compliant shells?
Hitting ctrl-t on our main menu will, when booting with debug logging enabled, show a screen like this: https://i.imgur.com/Ge75zkP.png
We also have a flamegraph profiling mechanism that can be enabled with https://github.com/zbm-dev/zfsbootmenu/blob/master/zfsbootme... . That will dump data to a serial port, which when re-assembled, can be used to produce a graph like https://raw.githubusercontent.com/zbm-dev/zfsbootmenu/master...
Bash is surprisingly flexible.