Techniques I use to create a great user experience for shell scripts (nochlin.com)
412 points by hundredwatt 27 days ago | 275 comments



Don't output ANSI colour codes directly - your output could be redirected to a file, or perhaps the user simply prefers no colour. Use tput instead, and add a little snippet like this to the top of your script:

    command -v tput &>/dev/null && [ -t 1 ] && [ -z "${NO_COLOR:-}" ] || tput() { true; }
This checks that the tput command exists (using the bash 'command' builtin rather than which(1) - surprisingly, which can't always be relied upon to be installed even on modern GNU/Linux systems), that stdout is a tty, and that the NO_COLOR env var is not set. If any of these conditions are false, a no-op tput function is defined.

This little snippet of setup lets you sprinkle tput invocations through your script knowing that it's going to do the right thing in any situation.
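For example (the warning text here is made up), a call like this is then safe whether or not stdout is a terminal:

    # setaf 3 = yellow, sgr0 = reset; both become no-ops when the fallback kicks in
    printf '%sWarning:%s disk is nearly full\n' "$(tput setaf 3)" "$(tput sgr0)"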


Yes, and even better for speed and greater shell compatibility with basic colors, you can use this POSIX code:

    if [ -t 1 ] && [ -z "${NO_COLOR:-}" ]; then
      COLOR_RESET=''
      COLOR_RED=''
      COLOR_GREEN=''
      COLOR_BLUE=''
    else
      COLOR_RESET=''
      COLOR_RED=''
      COLOR_GREEN=''
      COLOR_BLUE=''
    fi
For more about this see Unix Shell Script Tactics: https://github.com/SixArm/unix-shell-script-tactics/tree/mai...

Be aware there's an escape character at the start of each color string, which is the POSIX equivalent of $'\e'; Hacker News seems to cut out that escape character.
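For anyone copy-pasting: something like this should reconstruct the intent without embedding a literal escape character, assuming the stripped strings were the standard SGR codes (my guess, not necessarily the original author's exact values):

    if [ -t 1 ] && [ -z "${NO_COLOR:-}" ]; then
      ESC=$(printf '\033')   # generate ESC portably instead of pasting a raw escape byte
      COLOR_RESET="${ESC}[0m"
      COLOR_RED="${ESC}[31m"
      COLOR_GREEN="${ESC}[32m"
      COLOR_BLUE="${ESC}[34m"
    else
      COLOR_RESET=''
      COLOR_RED=''
      COLOR_GREEN=''
      COLOR_BLUE=''
    fi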


You should also at least check for TERM=dumb which is an older convention than NO_COLOR.


Good point, you're right. Added. Thank you.


If you use tput a lot it's also worth caching the output, because invoking it for every single color change and reset can really add up. If you know you're going to use a bunch of colors up front you can just stuff them into vars

  RED=$(tput setaf 1)
  GREEN=$(tput setaf 2)
  RESET=$(tput sgr0)


There should just be a command for this. Like echo with a color flag that does something if you’re in a tty.


But since there isn’t, even if you make one, people won’t want to rely on it as a dependency.


Can we add it to bash?


It’s a bit more complicated. POSIX shell != bash; for example, the default login shell on macOS is now zsh, and /bin/sh on Ubuntu is dash. Bash may still be installed, but may be years older than 2024. At a certain point, you’re better off embracing a dependency that just does everything better, like Python or Oil shell, for example.


Why? Benefit? 10% of people have problems with colors. Depending on the terminal/background, you will produce bad/invisible output. Any good typesetting book will tell you not to use colors.


I have problems with non-colored output because it makes it harder to distinguish important from less important stuff.


Why are you arguing against the entire thread as a reply to my comment?


This reads like what I've named as "consultantware" which is a type of software developed by security consultants who are eager to write helpful utilities but have no idea about the standards for how command line software behaves on Linux.

It ticks so many boxes:

* Printing non-output information to stdout (usage information is not normal program output, use stderr instead)

* Using copious amounts of colours everywhere to draw attention to error messages.

* ... Because you've flooded my screen with an even larger amount of irrelevant noise which I don't care about (what is being run).

* Coming up with a completely custom and never before seen way of describing the necessary options and arguments for a program.

* Trying to auto-detect the operating system instead of just documenting the non-standard dependencies and providing a way to override them (inevitably extremely fragile and makes the end-user experience worse). If you are going to implement automatic fallbacks, at least provide a warning to the end user.

* ... All because you've tried to implement a "helpful" (but unnecessary) feature of a timeout which the person using your script could have handled themselves instead.

* pipefail when nothing is being piped (pipefail is not a "fix", it is an option; whether it is appropriate is dependent on the pipeline, and it's not something you should be blanket applying to your codebase)

* Spamming output in the current directory without me specifying where you should put it or expecting it to even happen.

* Using set -e without understanding how it works (and where it doesn't work).


Addendum after reading the script:

* #!/bin/bash instead of #!/usr/bin/env bash

* [ instead of [[

* -z instead of actually checking how many arguments you got passed and trusting the end user if they do something weird like pass an empty string to your program

* echo instead of printf

* `print_and_execute sdk install java $DEFAULT_JAVA_VERSION` who asked you to install things?

* `grep -h "^sdk use" "./prepare_$fork.sh" | cut -d' ' -f4 | while read -r version; do` You're seriously grepping shell scripts to determine what things you should install?

* Unquoted variables all over the place.

* Not using mktemp to hold all the temporary files and an exit trap to make sure they're cleaned up in most cases.
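For reference, the pattern I'm referring to is roughly this (a sketch; the filenames are made up):

  tmpdir=$(mktemp -d)              # one private directory for all temporary files
  trap 'rm -rf "$tmpdir"' EXIT     # cleaned up on normal exit, errors, and Ctrl-C

  sort measurements.txt > "$tmpdir/sorted.txt"
  # ... everything else works inside "$tmpdir" ...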


As a bash casual, these suggestions are a reminder of why I avoid using bash when I can. That's a whole armory of footguns right there.


What is better?


I think Python is overused, but this is exactly what Python is great for. Python3 is already installed or trivial to install on almost everything, it has an enormous library of built-ins for nearly everything you'll need to do in a script like this, and for all of its faults it has a syntax that's usually pretty hard to subtly screw up in ways that will only bite you a month or two down the road.

My general rule of thumb is that bash is fine when the equivalent Python would mostly be a whole bunch of `subprocess.run` commands. But as soon as you're trying to do a bunch of logic and you're reaching for functions and conditionals and cases... just break out Python.


I've been pretty happy with the experience of using Python as a replacement for my previous solutions of .PHONY-heavy Makefiles and the occasional 1-line wrapper batch file or shell script. It's a bit more verbose, and I do roll my eyes a bit occasionally at stuff like this:

    call([options.cmake_path,'-G','Visual Studio 16','-A','x64','-S','.','-B',build_folder],check=True)
But in exchange, I never have to think about the quoting! - and, just as you say, any logic is made much more straightforward. I've got better error-checking, and there are some creature comforts for interactive use such as a --help page (thanks, argparse!) and some extra checks for destructive actions.


Golang. You build one fat binary per platform and generally don't need to worry about things like dependency bundling or setting up unit tests (for the most part it's done for you).


I use different languages for different purposes. Although bash runs everywhere, it's a walking footgun, so I only use it for small, sub-100-line scripts with no options or one option. The rest goes to Python, which nowadays runs almost everywhere, Julia, or a compiled language for the larger stuff.


If you just want to move some files around and do basic text substitution, turning to Python or any other "full fledged programming language" is a mistake. There is so much boilerplate involved just to do something simple like rename a file.


You mean

    import os
    os.rename("src.txt", "dest.txt")

?


Yes. And it is only downhill from there.

Now, show us `mycommand | sed 's/ugly/beautiful/g' | awk -F: '{print $2,$4}' 1> something.report 2> err.log` in Python.


That looks like a snippet from a command session which is a perfectly great place to be using sh syntax.

If it became unwieldy you’d turn it into a script:

  #!/bin/sh

  beautify() {
    sed -e '
      s/ugly/beautiful/g
      …other stuff
    '
  }

  select() {
    awk '
      {print $2, $4}
      …other stuff
    '
  }

  mycommand | beautify | select
For me, now it’s starting to look like it could be safer to do these things in a real language.


I have a lot of scripts that started as me automating/documenting a manual process I would have executed interactively. The script format is more amenable to putting up guardrails. A few even did get complex enough that I either rewrote them from the ground up or translated them to a different language.

For me, the "line in the sand" is not so much whether something is "safer" in a different language. I often find this to be a bit of a straw-man that stands in for skill issues - though I won't argue that shell does have a deceptively higher barrier to entry. For me, it is whether or not I find myself wanting to write a more robust test suite, since that might be easier to accomplish with Ginkgo or pytest or `#include <yourFavoriteTestLibrary.h>`.


Is it really so bad? A bit more verbose but also more readable, can be plenty short and sweet for me. I probably wouldn't even choose Python here myself and it's the kind of thing shell scripting is tailor-made for, but I'd at least be more comfortable maintaining or extending this version over that:

  from subprocess import Popen, PIPE

  CMD = ("printf", "x:hello:67:ugly!\nyy$:bye:5:ugly.\n")
  OUT = "something.report"
  ERR = "err.log"

  def beautify(str_bytes):
      return str_bytes.decode().replace("ugly", "beautiful")

  def filter(str, *index):
      parts = str.split(":")
      return " ".join([parts[i-1] for i in index])

  with open(OUT, "w") as out, open(ERR, "w") as err:
      proc = Popen(CMD, stdout=PIPE, stderr=err)
      for line_bytes in proc.stdout:
        out.write(filter(beautify(line_bytes), 2, 4))
I would agree though if this is a one-off need where you have a specific dataset to chop up and aren't concerned with recreating or tweaking the process bash can likely get it done faster.

Edit: this is proving very difficult to format on mobile, sorry if it's not perfect.


In ruby you can just call out to the shell with backticks.

Like.

    myvar = `mycommand | sed 's/ugly/beautiful/g' | awk -F: '{print $2,$4}' 1> something.report 2> err.log`
That way, if something is easier in Ruby you do it in ruby, if something is easier in shell, you can just pull its output into a variable.. I avoid 99% of shell scripting this way.


That is fair...

But if all I need to do is generate the report I proposed...why would I embed that in a Ruby script (or a Python script, or a Perl script, etc.) when I could just use a bash script?


Bash scripts tend to grow to check on file presence, conditionally run commands based on the results of other commands, or loop through arrays. When it is a nice pipelined command, yes, bash is simpler, but once the script grows to have conditions, loops, and non-string data types, bash drifts into unreadability.


I don’t think it’s fair to compare a workflow that is designed for sed/awk. It’s about 10 lines of python to run my command and capture stdout/stderr - the benefit of which is that I can actually read it. What happens if you want to retry a line if it fails?


> I don’t think it’s fair to compare a workflow that is designed for sed/awk.

If your position is that we should not be writing bash but instead Python, then yes, it is absolutely fair.

> the benefit of which is that I can actually read it.

And you couldn't read the command pipeline I put together?

> What happens if you want to retry a line if it fails?

Put the thing you want to do in a function, execute it on a line, if the sub-shell returns a failure status, execute it again. It isn't like bash does not have if-statements or while-loops.
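A rough sketch of what I mean (the transform, the retry count, and input.txt are placeholders):

  transform() {
    printf '%s\n' "$1" | sed 's/ugly/beautiful/g'
  }

  retry() {
    for attempt in 1 2 3; do
      transform "$1" && return 0
    done
    return 1
  }

  while IFS= read -r line; do
    retry "$line" || echo "giving up on: $line" >&2
  done < input.txt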


My point is that if you take a snippet designed to be terse in bash, it’s an unfair advantage to bash. There are countless examples in Python which will show the opposite.

> And you couldn't read the command pipeline I put together?

It took me multiple goes, but the equivalent in python I can understand in one go.

> Put the thing you want to do in a function, execute it on a line, if the sub-shell returns a failure status, execute it again. It isn't like bash does not have if-statements or while-loops.

But when you do that, it all of a sudden looks a lot more like the python code


Just ask chatgpt and you’ll get a script, probably makes some tests too if you ask for it.


I have not really been a fan of ChatGPT quality. But even if that were not an issue, it is kinda hard to ask ChatGPT to write a script and a test suite for something that falls under export control and/or ITAR, or even just plain old commercial restrictions.


    import os

    os.system("mycommand | sed 's/ugly/beautiful/g' | awk -F: '{print $2,$4}' 1> something.report 2> err.log")


You forgot to point out all those "footguns" you avoided by writing in Python rather than bash...


This has all of the purported problems of doing this directly in a shell language and zero advantages...


Babashka/clojure is a fairly pleasant way to write scripts.

I also think bun alongside typescript is quite viable, especially with the shell interop:

https://bun.sh/docs/runtime/shell


Xonsh. Been using it since 2018. Bash scripting sucks in comparison.


For reference, it seems to be this: https://xon.sh

  XONSH is a Python-powered shell

  Xonsh is a modern, full-featured and cross-platform shell. The language is a
  superset of Python 3.6+ with additional shell primitives that you are used to
  from Bash and IPython. It works on all major systems including Linux, OSX, and
  Windows. Xonsh is meant for the daily use of experts and novices.
Haven't heard of it before personally, and it looks like it might be interesting to try out.


Use bash for simple stuff, and Perl or TCL for applications.


Python or go


Python.


Zsh


Not being a filthy BASH casual?


Embrace bash. Use it as your login shell. Use it as your scripting language. Double check your scripts with shellcheck.


POSIX gang disapproves


I stopped caring about POSIX shell when I ported the last bit of software off HP-UX, Sun OS, and AIX at work. All compute nodes have been running Linux for a good long while now.

What good is trading away the benefits of bash extensions just to run the script on a homogeneous cluster anyways?

The only remotely relevant alternative operating systems all have the ability to install a modern distribution of bash. Leave POSIX shell in the 1980s where it belongs.


> * #!/bin/bash instead of #!/usr/bin/env bash

Except that'll pick up an old (2006!) (unsupported, I'm guessing) version of bash (3.2.57) on my macbook rather than the useful version (5.2.26) installed by homebrew.

> -z instead of actually checking how many arguments you got

I think that's fine here, though? It's specifically wanting the first argument to be a non-empty string to be interpolated into a filename later. Allowing the user to pass an empty string for a name that has to be non-empty is nonsense in this situation.

> You're seriously grepping shell scripts to determine what things you should install?

How would you arrange it? You have a `prepare_X.sh` script which may need to activate a specific Java SDK (some of them don't) for the test in question and obviously that needs to be installed before the prepare script can be run. I suppose you could centralise it into a JSON file and extract it using something like `jq` but then you lose the "drop the files into the directory to be picked up" convenience (and probably get merge conflicts when two people add their own information to the same file...)


> Except that'll pick up an old (2006!) (unsupported, I'm guessing) version of bash (3.2.57) on my macbook rather than the useful version (5.2.26) installed by homebrew.

Could you change that by amending your $PATH so that your preferred version is chosen ahead of the default?


> Could you change that by amending your $PATH

I think the `#!/bin/bash` will always invoke that direct file without searching your $PATH. People say you can do `#!bash` to do a $PATH search but I've just tried that on macOS 15 and an Arch box running a 6.10.3 kernel and neither worked.


I think I misread the original recommendation as being the other way round i.e. to use #!/usr/bin/env bash instead of #!/bin/bash.

That's why env is generally preferred as it finds the appropriate bash for the system.


The GP is pointing out [bad bash, good bash] and not [good bash, bad bash]. It was unclear to me at first as well.

You two are in violent agreement.


No, they're not. The script they're critiquing uses #!/bin/bash, so they have to have been saying that #!/usr/bin/env bash is better.


They're definitely both critiquing the script in the OP for the same thing in the same way. They're in agreement with each other, not with the script in TFA


> They're in agreement with each other

Oh. Oh! This is a confusing thread. Apologies all!


Oh, I also got confused! You're right, this is just a confusing subthread.


The 1brc shell script uses `#!/bin/bash` instead of `#!/usr/bin/env bash`. Using `#!/usr/bin/env bash` is the only safe way to pick up a `bash` that’s in your $PATH before `/usr/bin`. (You could do `#! bash`, but that way lies madness.)


Madness is

    #!/usr/bin/env


As far as quick and dirty scripts go, I wouldn’t care about most of the minor detail. It’s no different to something you’d slap together in Ruby, Python, or JS for a bit of automation.

It’s only when things are intended to be reused or have a more generic purpose as a tool that you need them to behave better and in a more standard way.


I had some similar thoughts when seeing the script.

For better user friendliness, I prefer to have the logging level determined by the value of a variable (e.g. LOG_LEVEL) and then the user can decide whether they want to see every single variable assignment or just a broad outline of what the script is doing.

I was taken aback by the "print_and_execute" function - if you want to make a wrapper like that, then maybe a shorter name would be better? (Also, the use of "echo" sets off alarm bells).


What's so bad in using echo?


Well, strokes beard, funny you should ask.

Have a look at https://mywiki.wooledge.org/BashPitfalls#echo_.24foo

Most of the time, "echo" works as you'd expect, but as it doesn't accept "--" to signify the end of options (which is worth using wherever you can in scripts), it'll have problems with variables that start with a dash as it'll interpret it as an option to "echo" instead.

It's a niche problem, but replacing it with "printf" is so much more flexible, useful and robust. (My favourite trick is using "printf" to also replace the "date" command).
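For example (the second trick needs bash 4.2+; the variable contents are made up):

  var='-n'
  echo "$var"            # bash's echo treats the lone "-n" as an option and prints nothing useful
  printf '%s\n' "$var"   # printf prints "-n" as data

  printf '%(%Y-%m-%d %H:%M:%S)T\n' -1   # current time without calling date(1)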

Also, here's some more info on subtle differences between "echo" on different platforms: https://unix.stackexchange.com/questions/65803/why-is-printf...


Thank you!


> * #!/bin/bash instead of #!/usr/bin/env bash

This one becomes very apparent when using NixOS where /bin/bash doesn’t exist. The vast majority of bash scripts in the wild won’t run on NixOS out of the box.


BOFH much? It’s not as if this script is going to be used by people that have no idea what is going to happen. It’s a script, not a command.

Your tone is very dismissive. Instead of criticism all of these could be phrased as suggestions instead. It’s like criticising your junior for being enthusiastic about everything they learned today.


https://en.m.wikipedia.org/wiki/Bastard_Operator_From_Hell

For anyone else not familiar with this term


> BOFH much

This made me chuckle.

> Your tone is very dismissive.

I know, but honestly when I see a post on the front page of HN with recommendations on how to do something and the recommendations (and resulting code) are just bad then I can't help myself.

The issue is that trying to phrase things nicely takes more effort than I could genuinely be bothered to put in (never mind the fact I read the whole script).

So instead my aim was to be as neutral sounding as possible, although I agree that the end result was still more dismissive than I would have hoped to achieve.


It is your responsibility to try.

"I can't help myself" is not a valid excuse. If you seriously cannot bother to phrase things less dismissively, then you shouldn't comment in the first place.

One of the best guidelines established for HN, is that you should always be kind. It's corny and obvious, and brings to mind the over-said platitude my mom, and a million other moms, used to say: "if you don't have anything nice to say, don't say anything at all."

Your concession was admirable, but your explanation leads me to think that you misunderstand the role you play in the comments. You are not supposed to be a reaction bot; HN is not the journal for your unfiltered thoughts and opinions.

Despite how easy it would be, you cannot and must not simply write replies. Absolutely everything (yes, everything) written here should assume the best, and be in good faith. Authors and the community deserve that much.

This goes for other sites as well, but especially for a community that strives for intellectual growth, like Hacker News.

Apologies if I sounded harsh.


> It is your responsibility to try.

I don't agree that I have any responsibilities on the internet. (edit: Outside of ones I come up with.)

> One of the best guidelines established for HN, is that you should always be kind.

Kindness is subjective, I was not trying to be actively unkind. It's just that the more you attempt to appear kind across every possible metric the more difficult and time consuming it is to write something. I had already put in a lot of effort to read the article, analyse the code within it, and analyse the code behind it. You have to stop at some point, and inevitably someone out there will still find what you wrote to be unkind. I just decided to stop earlier than I would if I was writing a blog post.

> "if you don't have anything nice to say, don't say anything at all."

This is not a useful adage to live by. If you pay someone to fix your plumbing and they make it worse, certainly this won't help you. Likewise, If people post bad advice on a website lots of people frequent and nobody challenges it, lots of people without the experience necessary to know better will read it and be influenced by it.

> You are not supposed to be a reaction bot; HN is not the journal for your unfiltered thoughts and opinions.

I think it's unkind to call what I wrote an unfiltered thought/opinion/reaction. You should respect that it:

* Takes a lot of time and experience before you can make these kinds of remarks

* Takes effort to read the post, evaluate what is written in it, write a response, and verify you are being fair and accurate.

* Takes even more effort to then read the entire script, and perform a code review.

If I had looked at the title and headlines and written "This is shit, please don't read it." then I think you would have a point but I didn't do that.

More to the point, a substantial number of people seem to have felt this was useful information and upvoted both the comments.

> Despite how easy it would be, you cannot and must not simply write replies. Absolutely everything (yes, everything) written here should assume the best, and be in good faith. Authors and the community deserve that much.

I prefaced my first comment by pointing out that the people who make the mistakes I outlined are usually well meaning. My critique was concise and could be seen as cold but it was not written in bad faith.


Thank you for letting me know about BOFH, I'm going to read those stories now. Seems fun!


I appreciate the parent comment and its frankness. Not everyone, especially juniors, needs to be, or should be, coddled.


Coddling != taking issue with “but have no idea about the standards for”

I’ve seen more than enough code from folks with combative takes to know “[their] shit don’t shine” either.


Sure, but there’s styles of writing and talking that have a counterproductive effect.

If your goal is improvement instead of venting you don’t want to use those.


I agree.


> pipefail when nothing is being piped (pipefail is not a "fix" it is an option

I think it’s pretty good hygiene to set pipefail in the beginning of every script, even if you end up not using any pipes. And at that point is it that important to go back and remove it only to then have to remember that you removed it once you add a pipe?


Pipefail is not a fix. It is an option. It makes sense sometimes, it does not make sense other times. When you are using a pipeline in a script where you care about error handling then you should be asking yourself exactly what kind of error handling semantics you expect the pipeline to have and set pipefail accordingly.

Sometimes you should even be using PIPESTATUS instead.
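Concretely, that looks something like this in bash (the commands are just placeholders):

  mycommand | grep 'pattern' | sort > out.txt
  status=("${PIPESTATUS[@]}")   # copy immediately, the next command overwrites it

  if [ "${status[0]}" -ne 0 ]; then
    echo "mycommand itself failed (exit ${status[0]})" >&2
    exit 1
  fi
  # grep exiting 1 (no matches) might be acceptable here, so status[1] is deliberately ignored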


It’s an option and one that should probably have been on by default together with -e.

It’s not so much about error handling. It’s more about not executing all sorts of stuff after something fails.


I would argue pipefail and set -e are much more reasonable defaults to take.


It's not that bad.


Another way to look at this is to chill out, it's a neat article and sometimes we write low stakes scripts just for ourselves on one machine.


The colors topic is really bad! I like to use a blue background in my terminal, which breaks the output of those stupid scripts. DO NOT USE COLORS.


I think it's fine to use colours, but there should be a way to disable colours for when you're writing to a file etc.


Nowhere in this list did I see “use shellcheck.”

On the scale of care, “the script can blow up in surprising ways” severely outweighs “error messages are in red.” Also, as someone else pointed out, what if I’m redirecting to a file?


I find shellcheck to be a bit of a nuisance. For simple one-shot scripts, like cron jobs or wrappers, it's fine. But for more complicated scripts or command line tools, it can have a pretty poor signal-to-noise ratio. Not universally, but often enough that I don't really reach for it anymore.

In truth when I find myself writing a large "program" in Bash such that shellcheck is cumbersome it's a good indication that it should instead be written in a compiled language.


I’ve definitely hit places where shellcheck is just plain wrong, but I’ve started to just think of it as a different language that’s a subset of shell. It’s less of a linter and more like using gradual type checking, where there’s no guarantee that all valid programs will be accepted; only that the programs which are accepted are free of certain categories of bugs.


If ShellCheck is spitting out lots of warnings, then it'd be worth changing your shell writing style to be more compliant with it. Simple things like always putting variables in quotes should prevent most of the warnings. If anything, long scripts benefit far more from using ShellCheck as you're more likely to make mistakes and are less likely to spot them.

For the false positives, just put in the appropriate comment to disable ShellCheck's error ahead of that line e.g.

# shellcheck disable=SC2034,SC2015

That stops the warning and also documents that you've used ShellCheck, seen the specific warning and know that it's not relevant to you.


Thanks but I'm not really asking for advice. I'm uninterested in changing how I write correct, often POSIX-compliant shell scripts because of a linter that has an inferior understanding of the language. I'm also not a fan of this kind of dogmatic application of tools. Shellcheck can be useful sure, but my point is that, at least for me, the juice is often not worth the squeeze. I'm aware of how to disable rules. I often find the whole endeavor to be a waste of time.

If that doesn't track with you, that's cool, I'm happy for you.


That's an odd way to respond to someone who's trying to be helpful.

I find that there's a lot of good information in the comments on HackerNews, so sometimes advice and recommendations aren't just designed for the parent comment.

Your reply adds nothing of value and comes across as being rude - you could have simply ignored my comment if you found it of no value to you.


I think it’s because:

> If ShellCheck is spitting out lots of warnings, then it'd be worth changing your shell writing style to be more compliant with it.

Is just a very roundabout way of saying ‘If you get a lot of errors using shellcheck you are doing it wrong’, which may or may not be true, but it’d make anyone defensive.


You could be right - I certainly didn't intend my comment to be antagonistic.

My experience of ShellCheck is that you only get loads of warnings when you first start out using it and it finds all of your unquoted variables. Once you get more experienced with writing scripts and linting them with ShellCheck, the number of warnings should dramatically reduce, so it seems odd that an experienced script writer would be falling foul of ShellCheck being pedantic about what you're writing.

> ‘If you get a lot of errors using shellcheck you are doing it wrong’

I kind of agree with that, although a lot of ShellCheck's recommendations might not be strictly necessary (you may happen to know that a certain variable will never contain a space), it's such a good habit to get into.


> it seems odd that an experienced script writer would be falling foul of shellcheck being pedantic about what you're writing.

It's not simply being pedantic, it is wrong. Your writing gives the impression that the tool is infallible.

If I was new to writing shell scripts, shellcheck is clearly a wise choice. The language is loaded with footguns. But as someone who has been writing scripts for decades, I already know about all the footguns. My experience with shellcheck is that it mostly finds false positives that waste my time.


I do agree about ShellCheck being wrong sometimes - that's why I mentioned the method of disabling it for specific lines with a comment. When I first started using ShellCheck, it was highlighting lots of footguns that I wasn't aware of, but nowadays, it's very rare for it to spit a warning out at me - usually just for something like it not following a script or stating that a variable wasn't defined when it was.

I think the huge number of footguns is what makes BASH scripting fun.


> I think the huge number of footguns is what makes BASH scripting fun.

We have a completely different definition of fun.

I really only use bash when I need to chain a few commands. As soon as there is more complex logic I move to some programming language.


I read it as "when in Rome...".

Adopting any kind of quality assurance tool is implicitly buying into its "opinionated" worldview. Forfeiting one's autonomy for some person's (or persons') notions of convention.

Rephrased: Using shellcheck is a signal to potential users, per the Principle of Least Astonishment. No matter if either party doesn't particularly care for shellcheck; it's just a tool to get on the same page more quickly.


Yeah, there's that aspect to it as well. It's like using a coding convention - the reason behind the conventions may not be applicable for every time that you write a variable name, but I think they're good habits to get into.

e.g. I always put BASH variables in curly braces and double quotes which is often unnecessary (and more verbose), but it means that I don't trigger any ShellCheck warnings for them and it's easier to just type the extra characters than thinking about whether or not they'll actually make any difference.


People should learn to take constructive criticism and harsh truths better. I saw nothing unkind with that comment.


I wouldn't say it's unkind, but I do take issue with "it's worth changing how you write scripts" because, at least for me, it isn't.

If it's useful for you, then wonderful!


Shellcheck, and really any linter (and arguably also any other form of programming language safety, like static typing or compile-time memory safety), is not there for the very experienced author (which it sounds like you are).

Those mechanisms exist for the inexperienced author (especially in a team setting) where you want some minimum quality and consistency.

An example where Shellcheck might be useful for you is when working with a team of junior programmers. You don't necessarily have the time to teach them the ins and outs of bash, but you can quickly setup Shellcheck to make sure they don't make certain types of errors.

I think your position is totally valid and nobody can or should force you to use a linter, but I think that even for you there _can_ be situations where they might be useful.


Personally I disagree, but can understand your point.

I think I'm fairly experienced in shell and Python (~20 and ~8 YOE, respectively), and still find value in linters, type checkers, etc. Maybe moreso in Python, but that's probably a function of me writing larger programs in Python than in shell, and usually changing my mind on something as I'm writing it.


I agree that there is value even for very experienced users. A good example of this is how expert C/C++ programmers still make mistakes with memory management -- memory safe languages benefits beginners and experts equally in this case.

I personally setup linters/formatters/other static analysis for solo projects, even for languages I know very well.

I just didn't want to write a comment large enough to capture all of the nuance :)


This is one of the worst takes I've ever heard. People like you are the reason code breaks and kills people (or destroys property, etc.). Do you also refuse to use calculators, under the pretense of being too experienced, and as such calculating the square roots of four-digit numbers by hand?


I think you're misunderstanding my position. Either way, this is not a constructive comment. It contributes nothing to the discussion.


What sort of noise do you see? I find it's pretty rare to run into something that needs to be suppressed.


It's been so long since I used it seriously I couldn't tell you.

There's over 1000 open issues on the GitHub repo, and over 100 contain "false positive". I recognize several of these at first glance.

https://github.com/koalaman/shellcheck/issues?q=is%3Aissue+i...


It doesn’t immediately need to be a compiled language; that’s kind of like noticing you’re not going to walk the mile down to the pharmacy and deciding to take a Learjet instead. The gradient should include Python or similar scripting languages before you reach for the big guns :-)


I think a lot of people jump to Go because it’s nearly as convenient as bash for writing shell scripts?


People say that (and the same for Python), but I just don’t get it – and I’m a huge fan of Python.

With shell, I can take the same tools I’ve already been using as one-liners while fiddling around, and reuse them. There is no syntax mapping to do in my head.

With any other language, I have to map the steps, and probably also add various modules (likely within stdlib, but still). That’s not nothing.

I’ve rewritten a somewhat-complicated bash script into Python. It takes some time, especially when you want to add tests.


Completely agree.

I have a rule that if I’m using arrays I should move to Python, PHP, etc. It’s a nice red flag. Arrays in Bash are terrible and a sign that things are getting more complicated.


Akshuaaaly, I think you'll find that any BASH script uses an array for the command line arguments.

Personally, I'm fine with BASH arrays (even associative ones) and yes, the syntax can be opaque, but they get the job done. I find that the long-term advantages of BASH outweigh the many, many problems with it. (If you want something to keep running for 20 years on a variety of different machines and architectures, then BASH is easier to manage than almost any other language).
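For anyone who hasn't touched them, the syntax in question looks roughly like this:

  # Indexed array ("$@" itself behaves like one):
  files=("one.txt" "two words.txt" "three.txt")
  for f in "${files[@]}"; do
    printf '%s\n' "$f"
  done

  # Associative array (bash 4+):
  declare -A port
  port[http]=80
  port[https]=443
  echo "port for https: ${port[https]}"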


Command line argument parsing is already a complicated issue in Bash. It’s not a hard rule, but whenever I have to Google Bash’s array syntax I think to myself “stop it now, you’ll regret it”.


I use the same heuristic for when I should switch from shell to Python :-). Arrays (especially associative ones, at least for me) are a good indication that a more advanced language like Python might be more appropriate than shell.


It hasn't got the same level of cross compatibility though. I can just copy a BASH script from some x86 machine and it'll run just fine on an ARM processor.


I feel like that should be reversed. If you’re writing a tiny script, the odds that you’ll make a critical mistake is pretty low (I hope).

As others have pointed out, you can tune shellcheck / ignore certain warnings, if they’re truly noise to you. Personally, I view it like mypy: if it yells at me, I’ve probably at the very least gone against a best practice (like reusing a variable name for something different). Sometimes, I’m fine with that, and I direct it to be ignored, but at least I’ve been forced to think about it.


Always fun trying to read errors in a CI build and they are full of [[


Depends on the CI used, I guess. Gitlab CI and Github Actions show colors and I use them deliberately in a format check job to show a colored diff in the output.


It's just another Lisper wishing they weren't writing Bash right now.


shellcheck is an option for helping you create a good user experience by guarding against common errors we all make (out of a mix of laziness, bad assumptions, being in a rush, or just only thinking about the happy paths), but it isn't directly creating a great user experience and there are other methods to achieve the same thing.


You took the words right out of my mouth.


Or what if I use red background?!


It is impossible to write a safe shell script that does automatic error checking while using the features the language claims are available to you.

Here’s a script that uses real language things like a function and error checking, but which also prints “oh no”:

  set -e

  f() {
    false
    echo oh
  }

  if f
  then
    echo no
  fi
set -e is off when your function is called as a predicate. That’s such a letdown from expected- to actual-behavior that I threw it in the bin as a programming language. The only remedy is for each function to be its own script. Great!

In terms of sh enlightenment, one of the steps before getting to the above is realizing that every time you use “;” you are using a technique to jam a multi-line expression onto a single line. It starts to feel incongruous to mix single line and multi line syntax:

  # weird
  if foo; then
    bar
  fi

  # ahah
  if foo
  then
    bar
  fi
Writing long scripts without semicolons felt refreshing, like I was using the syntax in the way that nature intended.

Shell scripting has its place. Command invocation with sh along with C functions is the de-facto API in Linux. Shell scripts need to fail fast and hard though and leave it up to the caller (either a different language, or another shell script) to figure out how to handle errors.


Here's a script that left an impression on me the first time I saw it:

https://github.com/containerd/nerdctl/blob/main/extras/rootl...

I have since copied this pattern for many scripts: logging functions, grouping all global vars and constants at the top and creating subcommands using shift.
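The subcommands-via-shift part of that pattern, boiled down to a sketch (the command names here are made up, not from that script):

  #!/usr/bin/env bash

  cmd_install() { echo "would install: $*"; }
  cmd_remove()  { echo "would remove: $*"; }

  case "${1:-}" in
    install) shift; cmd_install "$@" ;;
    remove)  shift; cmd_remove "$@" ;;
    *)       echo "usage: $0 {install|remove} [args...]" >&2; exit 1 ;;
  esac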



    if [ "$(uname -s)" == "Linux” ]; then 
       stuff-goes-here
    else # Assume MacOS 
While probably true for most folks, that’s hardly what I’d call great for everybody not on Linux or a Mac.


Gotta draw a line somewhere


Yeah, but you could at least elif is mac then ... else unsupported end.


If you look at the next couple of lines of the code, it emits a warning if neither command is found, but carries on. Running without it in this case works, but it's not optimal, as described in the warning message.


The following check for gtimeout means that other OSs that don't have the expected behaviour in either command won't break the script; you'll just get a warning message that isn't terribly relevant to them (but more helpful than simply failing or silently running without timeout/gtimeout). Perhaps improving that message would be the better option.

Though for that snippet I would argue for testing for the command rather than the OS (unless Macs or some other common arrangement has something incompatible in the standard path with the same command name?).


The worst part is that if you're not running Linux but have a perfectly working 'timeout' command, the script will ignore it and fail.


Eh. It’s true for most, and if not, it’s probably still a *BSD, so there’s a good chance that anything written for a Mac will still work.

That said, I’ve never used any of the BSDs, so I may be way off here.



One of my favorite techniques for shell scripts, not mentioned in the article:

For rarely run scripts, consider checking if required flags are missing and query for user input, for example:

  [[ -z "$filename" ]] && printf "Enter filename to edit: " && read filename
Power users already know to always do `-h / --help` first, but this way even people that are less familiar with command line can use your tool.

If that's a script that's run very rarely or once, entering the fields sequentially could also save time, compared to the common `try to remember flags -> error -> check help -> success` flow.


Not trying to offend anyone here but I think shell scripts are the wrong solution for anything over ~50 lines of code.

Use a better programming language. Go, Typescript, Rust, Python, and even Perl come to mind.


> shell scripts are the wrong solution for anything over ~50 lines of code.

I don't think LOC is the correct criterion.

I do solve many problems with bash and I enjoy the simplicity of shell coding. I even have long bash scripts. But I do agree that shell scripting is the right solution only if

    = you can solve the problem quickly 
    = you don't need data structures
    = you don't need math 
    = you don't need concurrency


In my opinion, shell scripting is the right tool when you need to do a lot of calling programs, piping, and redirecting. Such programs end up being cumbersome in "proper" languages.


If there is already software written to do the stuff and I'm just coordinating it (no computation other than string manipulation) I'd take bash every day. I would only reach for Python if I need to do stuff like manipulating complex data structures or something with heavy logic.


"you can do anything not matter how horrible you feel"

but yea, shell is foremost a composition language/environment


~1k lines of bash with recutils for data persistency and dc for simple math. Was not quick to solve for me, custom invoices .fodt to .pdf to email, but I got it done. Shell is the only other scripting language I am familiar with other than Ruby. And I am worse at Ruby.

Sometimes options are limited to what you know already.


I enjoy the simplicity of shell coding

You mean, the complexity of shell coding? Any operation that in a regular language is like foo.method(arg) in shell expands into something like ${foo#/&$arg#%} or `tool1 \`tool2 "${foo}"\` bar | xargs -0 baz`.


> in a regular language [...] like foo.method(arg)

Note what you just said: when you want an object with a method that takes a parameter, you find bash too complex.

You gave an example that is not appropriate for bash.

However, bash does have functions. So if you don't need an entire underlying object system just to run your logic, you could have

    function foomethod () { 
        parm1=$1
        #more logic   
    }


My comment was more about basic string/number ops implemented as cryptic incantations, not functions per se. I regularly write bash. Simple things like trimming, deleting a suffix, removing quotes, taking a substring, etc. always look like someone is cursing in the code. I can’t remember this afterthought gibberish no matter how many times I write it, so I have to maintain a few pages of bash snippets in my obtf.

Be damned the day I decided to write a set of scripts in it rather than looking for a way to make typescript my daily driver, like it is now. Bash “code” “base” is one of the worst programming jokes.


Exactly, plus there's no compiler or type safety.


some data structures, math, and some concurrency are just fine in shell scripts. bash has arrays so you can do pretty elaborate data structures. where it falls down is being bug-prone. but some code is useful even if it's buggy


Try running a 6 month old Python project that you haven't run in that time and report back.

Meanwhile, 10 year old Bash scripts I've written still run unmodified.

Winner by a mile (from a software-longevity and low-maintenance perspective at least): Bash


That’s a testament to your distribution/package manager’s stability, not to the language itself. I happen to write my scripts in Elixir (which is as fast-moving as Python 3), using a pinned version of it in a Nix flake at the root of my `~/bin` directory. Scripts from 5 years ago are still as reproducible as today’s.


Isn't compare a Python project to a Bash script an unfair comparison?

Compare a Python script to a Bash script. If your Python3 script (assuming no dependencies) doesn't work after 6 months I got some questions for you.

(And I don't really get how a 6 month old Python _project_ is likely to fail. I guess I'm just good at managing my dependencies?)


Unless you have pinned the exact version numbers in requirements.txt (which you should) or kept your conda/venv etc. around, it might be hard. I know at this stage this would be too much compared to what we are talking about regarding Python scripts, but non-Python dependencies are a real pain.

I know that this is probably beyond bash scripting vs Python scripting, but I was just replying to how a 6 month old project can have problems.

Also, not on a 6 month scale, but the Python standard library definitely changes, and I would take bash anytime if I have to use a venv for running util scripts.

Edit: When Python 3.8 removed 'time.clock()' it was annoying to change that. I know that it was deprecated since 3.3 (as I remember) but this is the example I have in mind now. But it was probably before I was born that people were using

start=$(date +%s%N)

end=$(date +%s%N)

And I will be dead before/if this becomes impossible to use.


Some of us enjoy the masochism, thank you very much.


Hear, hear! Bash me datty.


Deno is great for writing quick scripts: https://deno.com/learn/scripts-clis

Bun has similar features: https://bun.sh/docs/runtime/shell


I draw the line after a single if.


I agree with the general idea, but 1) LOC is not a good metric 2) I would immediately take Python and Typescript off the list, along with anything that is compiled, leaving the list much shorter


TypeScript is good for this kind of a thing if you're only running it on your own machines and don't have to mess around with environments all the time. You can also run it with ts-node.


I draw the line at around 300 lines.


If you want a great script user experience, I highly recommend avoiding the use of pipefail. It causes your script to die unexpectedly with no output. You can add traps and error handlers and try to dig out of PIPESTATUS the offending failed intermediate pipe just to tell the user why the program is exiting unexpectedly, but you can't resume code execution from where the exception happened. You're also now writing a complicated ass program that should probably be in a more complete language.

Instead, just check $? and whether a pipe's output has returned anything at all ([ -z "$FOO" ]) or if it looks similar to what you expect. This is good enough for 99% of scripts and allows you to fail gracefully or even just keep going despite the error (which is good enough for 99.99% of cases). You can also still check intermediate pipe return status from PIPESTATUS and handle those errors gracefully too.
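In practice that style looks something like this (the commands and the fallback are placeholders):

  FOO=$(mycommand | awk -F: '{print $2}')
  if [ $? -ne 0 ] || [ -z "$FOO" ]; then
    echo "couldn't extract the value, falling back to a default" >&2
    FOO="default"
  fi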


> "It causes your script to die unexpectedly with no output."

Oh? I don't observe this behavior in my testing. Could you share an example? AFAIK, if you don't capture stderr, that should be passed to the user.

> "Instead, just check $? and..."

I agree that careful error handling is ideal. However, IMO it's good defensive practice to start scripts with "-e" and pipefail.

For many/most scripts, it's preferable to fail with inadequate output than to "succeed" but not perform the actions expected by the caller.


  $ date +%w
  0
  $ cat foo.sh 
  #!/usr/bin/env sh
  set -x
  set -eu -o pipefail
  echo "start of script"
  echo "start of pipe" | cat | false | cat | cat
  if [ "$(date +%w)" = "0" ] ; then
    echo "It's sunday! Here we do something important!"
  fi
  $ sh foo.sh
  + set -eu -o pipefail
  + echo 'start of script'
  start of script
  + echo 'start of pipe'
  + cat
  + false
  + cat
  + cat
  $
Notice how the script exits, and prints the last pipe it ran? It should have printed out the 'if ..' line next. It didn't, because the script exited with an error. But it didn't tell you that.

If you later find out the script has been failing, and find this output, you can guess the pipe failed (it doesn't actually say it failed), but you don't know what part of the pipe failed or why. And you only know this much because tracing was enabled.

If tracing is disabled (the default for most people), you would have only seen 'start of script' and then the program returning. Would have looked totally normal, and you'd be none the wiser unless whatever was running this script was also checking its return status and blaring a warning if it exited non-zero, and then you have an investigation to begin with no details.

> IMO it's good defensive practice to start scripts with "-e" and pipefail.

If by "defensive" you mean "creating unexpected failures and you won't know where in your script the failure happened or why", then I don't like defensive practice.

I cannot remember a single instance in 20 years where pipefail helped me. But plenty of times where I spent hours trying to figure out where a script was crashing and why, long after it had been crashing for weeks/months, unbeknownst to me. To be sure, there were reasons why the pipe failed, but in almost all cases it didn't matter, because either I got the output I needed or didn't.

> it's preferable to fail with inadequate output than to "succeed" but not perform the actions expected by the caller.

I can't disagree more. You can "succeed" and still detect problems and handle them or exit gracefully. Failing with no explanation just wastes everybody's time.

Furthermore, this is the kind of practice in backend and web development that keeps causing web apps to stop working, but the user gets no notification whatsoever, and so can't report an error, much less even know an error is happening. I've had this happen to me a half dozen times in the past month, from a bank's website, from a consumer goods company's website, even from a government website. Luckily I am a software engineer and know how to trace backend network calls, so I could discover what was going on; no normal user can do that.


> the script exited with an error. But it didn't tell you that.

Yes it did, by having a non-zero exit code. However, it didn't explicitly mention that it didn't complete successfully, but that's down to the script writer. I like to include a function to tidy up temporary files etc. when the script exits (e.g. trap __cleanup_before_exit EXIT) and it's easy to also assign a function to run when ERR is triggered - if you wish, you can set it to always provide an error backtrace.
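Roughly what that wiring looks like (the __on_error function and the TMP_DIR variable are just how I'd set it up, not the only way):

  __cleanup_before_exit() {
    if [ -n "${TMP_DIR:-}" ]; then
      rm -rf "$TMP_DIR"
    fi
  }
  __on_error() {
    echo "error: command failed around line ${BASH_LINENO[0]} in ${FUNCNAME[1]:-main}" >&2
  }

  trap __cleanup_before_exit EXIT
  trap __on_error ERR
  set -eE   # -E so the ERR trap is inherited by functions and subshells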


I'd add, in each my Bash scripts I add this line to get the script's current directory:

SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )

This is based on this SA's answer: https://stackoverflow.com/questions/59895/how-do-i-get-the-d...

I never got why Bash doesn't have a reliable "this file's path" feature and why people always take the current working directory for granted!


I like:

    readonly SCRIPT_SRC="$(dirname "${BASH_SOURCE[${#BASH_SOURCE[@]} - 1]}")"
    readonly SCRIPT_DIR="$(cd "${SCRIPT_SRC}" >/dev/null 2>&1 && pwd)"
    readonly SCRIPT_NAME=$(basename "$0")


I've been using

script_dir="$(dirname "$(realpath "$0")")"

Hasn't failed me so far and it's easy enough to remember


Read the answers and comments from the SO threads. It won't always work for other people in other contexts.


Every time I see a “good” bash script it reminds me of how incredibly primitive every shell is other than PowerShell.

Validating parameters - a built in declarative feature! E.g.: ValidateNotNullOrEmpty.

Showing progress — also built in, and doesn’t pollute the output stream so you can process returned text AND see progress at the same time. (Write-Progress)

Error handling — Try { } Catch { } Finally { } works just like with proper programming languages.

Platform specific — PowerShell doesn’t rely on a huge collection of non-standard CLI tools for essential functionality. It has built-in portable commands for sorting, filtering, format conversions, and many more. Works the same on Linux and Windows.

Etc…

PS: Another super power that bash users aren’t even aware they’re missing out on is that PowerShell can be embedded into a process as a library (not an external process!!) and used to build an entire GUI that just wraps the CLI commands. This works because the inputs and outputs are strongly typed objects so you can bind UI controls to them trivially. It can also define custom virtual file systems with arbitrary capabilities so you can bind tree navigation controls to your services or whatever. You can “cd” into IIS, Exchange, and SQL and navigate them like they’re a drive. Try that with bash!


I also hate bash scripting, and as far as Unix shell go, bash is among the best. So many footguns... Dealing with filenames with spaces is a pain, and files that start with a '-', "rm -rf" in a script is a disaster waiting to happen unless you triple check everything (empty strings, are you in the correct directory, etc...), globs that don't match anything, etc...

But interactively, I much prefer Unix shells over PowerShell. When you don't have edge cases and user input validation to deal with, these quirks become much more manageable. Maybe I am lacking experience, but I find PowerShell uncomfortable to use, and I don't know if it has all these fancy interactive features many Unix shell have nowadays.

What you are saying essentially is that PowerShell is a better programming language than bash, quite a low bar actually. But then you have to compare it to real programming languages, like Perl or Python.

Perl has many shell-like features, the best regex support of any language, which is useful when everything is text, many powerful features, and an extensive ecosystem.

Python is less shell-like but is one of the most popular languages today, with a huge ecosystem, clean code, and pretty good two-way integration, which mean you can not only run Python from your executable, but Python can call it back.

If what you are for is portability and built-in commands, then the competition is Busybox, a ~1MB self-contained executable providing the most common Unix commands and a shell, very popular for embedded systems.


> What you are saying essentially is that PowerShell is a better programming language than bash

In some sense, yes, but there is no distinct boundary. Or at least, there ought not to be one!

A criticism a lot of people (including me) had of Windows in the NT4 and 2000 days was that there was an enormous gap between click-ops and heavyweight automation using C++ and COM objects (or even VBScript or VB6 for that matter). There wasn't an interactive shell that smoothly bridged these worlds.

That's why many Linux users just assumed that Windows has no automation capability at all: They started with click-ops, never got past the gaping chasm, and just weren't aware that there was anything on the other side. There was, it just wasn't discoverable unless you were already an experienced developer.

PowerShell bridges that gap, extending quite a bit in both directions.

For example, I can use C# to write a PowerShell module that has the full power of a "proper" programming language, IDE with debug, etc... but still inherits the PS pipeline scaffolding so I don't have to reinvent the wheel for parameter parsing, tab-complete, output formatting, etc...


Windows still has horrendous automation support, PowerShell falls short and loses its USP as soon as you need anything that is not a builtin and series of bandaids like DSC didn't even ameliorate the situation. The UX is bad even when working with nothing but MS products like MSSQL.

The biggest leap for automation on Windows has been WSL, aka shipping Linux.


> Dealing with filenames with spaces is a pain, and files that start with a '-',

Wait! The fact that arguments with a leading hyphen are interpreted as options is not bash's fault. It's ingrained in the convention of UNIX tools and there's nothing bash can do to mitigate it. You would have the same problem if you got rid of any shell and directly invoked commands from Python or C.


Indeed, it is not the fault of bash but of the Unix command line in general. Made worse by the fact that different tools may have different conventions. Often, "--" will save you, but not always. And by the way, it took me years to become aware of "--", which is exactly the reason why I hate shell scripting: a non-obvious problem, with a non-obvious solution that doesn't always work.
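
A minimal sketch of both workarounds, with a throwaway filename (the './' prefix is the one that still works for tools that ignore '--'):

    rm -- -rf      # '--' ends option parsing; POSIX utilities are supposed to honour it
    rm ./-rf       # prefixing './' sidesteps option parsing entirely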

One of the GP's arguments in favor of PowerShell is that most commands are builtin, so this problem can be solved by the shell itself, and furthermore, it is based on strongly typed objects, which should make it clear what is a file and what is a command line option. And I think he has a point. Regular command line parsing is a mess on Windows though.

In "real" programming languages, library APIs are usually favored over command lines, and they are usually designed in such a way that options and file arguments are distinct. You may still need to run commands at some point, but you are not reliant on them for every detail, which, in traditional shell scripting includes trivial things like "echo", "true", "false", "test", etc... Now usually builtin.

As for bash "doing something about it", it would greatly benefit from a linter. I know they exist, but I don't know if it is standard practice to use them.


> Wait! The fact that arguments with a leading hyphen are interpreted as options is not bash's fault. It's ingrained in the convention of UNIX tools and there's nothing bash can do to mitigate it. You would have the same problem if you got rid of any shell and directly invoked commands from Python or C.

A better system shell could make it easy to define shims for the existing programs. Also it could make their calling easier, e.g. with named arguments. So when you wanted to delete your file called -rf, you would say

  rm(file="-rf")
or something like that, with your preferred syntax. It would be much safer than just passing big strings as arguments, where spaces separate the different arguments, yet spaces can also appear in the arguments, and arguments can be empty. Bash or Posix sh is not very good at safely invoking other programs, or at handling files.


What you're suggesting is that the shell should have every possible command builtin and not call external programs.

Let's analyze your example with 'rm': it works as long as 'rm' is an internal routine. If it's an external program, independently of the syntax you use to specify the arguments, sooner or later the shell will need to actually call the 'rm' executable, and to pass '-rf' to it as argument number 1. The 'rm' executable will then examine its arguments, see that the first one begins with a hyphen and interpret it as an option.

As I said, the only way to avoid all this would be to replace 'rm' with an internal routine. Then you would replace 'cp' and 'ln', and what else? Of course 'echo' and 'printf', 'cat', 'ls', 'cd' maybe, why not 'find' and 'grep'? What about 'head', 'tail', 'cut'? Don't forget 'sed' and 'awk'... the list is getting longer and longer. Where do you draw the line?

Seriously, the only mitigation would be to define a function to 'sanitize' an argument to make it appear as a file if used as an argument to an external program. Something like:

  force_file() {
    case "$1" in
      -*) echo "./$1" ;;
      *)  echo "$1" ;;
    esac
  }

This doesn't work with 'echo' though.


Bash also has a built-in to validate parameters; it’s called test, and is usually called with [], or [[]] for some bash-specifics.

Re: non-standard tools, if you’re referring to timeout, that’s part of GNU coreutils. It’s pretty standard for Linux. BSDs also have it from what I can tell, so its absence is probably a Mac-ism. In any case, you could just pipe through sleep to achieve the same thing.

> …inputs and outputs are strongly typed objects

And herein is the difference. *nix-land has everything as a file. It’s the universal communication standard, and it’s extremely unlikely to change. I have zero desire to navigate a DB as though it were a mount point, and I’m unsure why you would ever want to. Surely SQL Server has a CLI tool like MySQL and Postgres.


The CLI tool is PowerShell.

You just said everything “is a file” and then dismissed out of hand a system that takes that abstraction even further!

PowerShell is more UNIX than UNIX!


What? How are typed objects files?

What I’m saying is that in *nix tooling, things are typically designed to do one thing well. So no, I don’t want my shell to also have to talk MySQL, Postgres, SQL Server, DB2, HTTP, FTP, SMTP…


> "So no, I don’t want my shell to also have to talk..."

What's the point of the shell, if not to manage your databases, your REST APIs, files, and mail? Is it something you use for playing games on, or just for fun?

> designed to do one thing well.

Eeexcept that this is not actually true in practice, because the abstraction was set at a level that's too low. Shoving everything into a character (or byte) stream turned out to be a mistake. It means every "one thing" command is actually one thing plus a parser and an encoder. It means that "ps" has a built-in sort command, as do most other UNIX standard utilities, but they all do it differently. This also means that you just "need to know" how to convince each and every command to output machine-readable formats that other tools on the pipeline can pick up safely.

I'll tell you a real troubleshooting story, maybe that'll help paint a picture:

I got called out to assist with an issue with a load balancer appliance used in front of a bunch of Linux servers. It was mostly working according to the customer, but their reporting tool was showing that it was sending traffic to the "wrong" services on each server.

The monitoring tool used 'netstat' to track TCP connections, which had a bug in that version of RedHat where it would truncate the last decimal digit of the port number if the address:port combo had the maximum possible number of digits, e.g.: 123.123.123.123:54321 was shown as 123.123.123.123:5432 instead.

Their tool was just ingesting that pretty printed table intended for humans with "aligned" columns, throwing away the whitespace, and putting that into a database!

This gives me the icks, but apparently Just The Way Things Are Done in the UNIX world.

In PowerShell, Get-NetTCPConnection outputs objects, so this kind of error is basically impossible. Downstream tools aren't parsing a text representation of a table or "splitting it into columns", they receive the data pre-parsed with native types and everything.

So for example, this "just works":

    Get-NetTCPConnection | 
        Where-Object State -EQ 'Bound' | 
        Group-Object LocalPort -NoElement | 
        Sort-Object Count -Descending -Top 10
Please show me the equivalent using netstat. In case the above was not readable for you, it shows the top ten TCP ports by how many bound connections they have.

This kind of thing is a challenge with UNIX tools, and then is fragile forever. Any change to the output format of netstat breaks scripts in fun and creative ways. Silently. In production.

I hope you never have to deal with IPv6.


For fun, I took a crack at your example and came up with this craziness (with the caveat it's late and I didn't spend much time on it), which is made a bit more awkward because grep doesn't do capturing groups:

  netstat -aln \
  | grep ESTABLISHED \
  | awk '{print $4}' \
  | grep -Po '\:\d+$' \
  | grep -Po '\d+' \
  | sort \
  | uniq -c \
  | sort -r \
  | head -n 10
Changing the awk field to 5 instead of 4 should get you remote ports instead of local. But yeah, that will be fragile if netstat's output ever changes. That said, even if you're piping objects around, if the output of the thing putting out objects changes, your tool is always at risk of breaking. Yes objects breaking because field order changed is less likely, but what happens if `Get-NetTCPConnection` stops including a `State` field? I guess `Where-Object` might validate it found such a field, but I could also see it reasonably silently ignoring input that doesn't have the field. Depends on whether it defaults to strict or lenient parsing behaviors.


I know this sounds like nit-picking but bear with me. It's the point I'm trying to make:

1. Your script outputs an error when run, because 'bash' itself doesn't have netstat as a built-in. That's an external command. In my WSL2, I had to install it. You can't declaratively require this up-front; your script has to have an explicit check... or it'll just fail half-way through. Or do nothing. Or who knows!?

PowerShell has up-front required prerequisites that you can declare: https://learn.microsoft.com/en-us/powershell/module/microsof...

Not that that's needed, because Get-NetTcpConnection is a built-in command.

3. Your script is very bravely trying to parse output that includes many different protocols, including: tcp, tcp6, udp, udp6, and unix domain sockets. I'm seeing random junk like 'ACC' turn up after the first awk step.

4. Speaking of which, the task was to get tcp connections, not udp, but I'll let this one slide because it's an easy fix.

5. Now imagine putting your script side-by-side with the PowerShell script, and giving it to people to read.

What are the chances that some random person could figure out what each one does?

Would they be able to modify the functionality successfully?

Note that you had to use 'awk', which is a parser, and then three uses of 'grep' -- a regular expression language, which is also a kind of parsing.

The PowerShell version has no parsing at all. That's why it's just 4 pipeline expressions instead of 9 in your bash example.

Literally in every discussion about PowerShell there's some Linux person who's only ever used bash complaining that PS syntax is "weird" or "hard to read". What are they talking about!? It's half the complexity for the same functionality, reads like English, and doesn't need write-only hieroglyphics for parameters.


Because I didn't see the edited version when I was writing my original reply and it's too late now, I want to call out another problem that you graciously overlooked, which we can call #2 since it touches neatly on your #1 and #3 items and #2 is already missing. The extra junk you see in your wsl after the awk step is probably because the other big *NIX problem with shell scripts is that my `netstat` or `grep` or even `echo` might not be the same as yours. I originally wrote it on a mac, and while I was checking the man page for netstat to see how old it was and how likely netstat output would change, it occurred to me that BSD netstat and linux netstat are probably different, so I jumped over and re-wrote it on a linux box. Entirely possible your version is different from mine.

Heck, just checking `echo` between Bash, ZSH and Fish on my local machine here: Bash and ZSH's version is from 2003, provides a single `-n` option, and declares POSIX compliance, but explicitly calls out that `sh`'s version doesn't accept the `-n` argument. Fish provides their own implementation that accepts arguments `[nsEe]` from 2023. Every day I consider it a miracle that most of the wider internet and linux/unix world that underlies so much of it works at all, let alone reliably enough to have multiple nines of uptime. "Worse is better" writ large I guess.


I was worried that my toy problem wasn’t complex enough to reveal these issues!

I had an experience recently trying to deploy an agent on a dozen different Linux distros.

I had the lightbulb moment that the only way to run IT in an org is to use exactly one distro. Ideally one version, two at the most during transitions. Linux is a kernel, not an operating system. There are many Linux operating systems that are only superficially “the same”.


At least here, we can agree. If I ran a business and allowed employees to run Linux (which is reasonable, IMO), the last thing I'd want is someone's riced-out Gentoo with an unpatched security exploit getting onto the VPN.


Sure, I'm not arguing that having a set of well defined outputs and passing objects around wouldn't be better. You're talking to someone that often laments that SmallTalk was not more popular. But you'd need to get the entire OSS community to land on a single object representation and then get them to independently change all the tools to start outputting the object version. PowerShell and Microsoft have the advantage in this case of being able to dictate that outcome. In the linux world, dictating outcomes tends to get you systemd levels of controversy and anger.

Technically speaking though, there's no reason you couldn't do that all in bash. It's not the shell that's the problem here (at least, to an extent, passing via text I guess is partly a shell problem). There's no reason you couldn't have an application objNetStat that exported JSON objects and another app that filtered those objects, and another that could group them and another that could sort them. Realistically "Sort-Object Count -Descending -Top 10" could be a fancy alias for "sort | uniq -c | sort -r | head -n 10". And if we're not counting flags and arguments to a function as complexity, if we had our hypothetical objNetstat, I can do the whole thing in one step:

  objNetstat --json \
  | jq -c '[.[] | select(.state == "ESTABLISHED")] | group_by(.port) | map({port:.[0].port, count: map(.host) | length}) | sort_by(.count) | reverse | [limit(10;.[])]'
One step, single parsing to read in the json. Obviously I'm being a little ridiculous here, and I'm not entirely sure jq's DSL is better than a long shell pipe. But the point is that linux and linux shells could do this if anyone cared enough to write it, and some shells like fish have taken some baby steps towards making shells take advantage of modern compute. Why RedHat or one of the many BSDs hasn't is anyone's guess. My two big bets on it are the aversion to "monolithic" tools (see also systemd), and ironically not breaking old scripts / current systems. The fish shell is great, and I've built a couple shell scripts in it for my own use, and I can't share them with my co-workers who are on Bash/ZSH because fish scripts aren't Bash compatible. Likewise I have to translate anyone's bash scripts into fish if I want to take advantage of any fish features. So even though fish might be better, I'm not going to convince all my co-workers to jump over at once, and without critical mass, we'll be stuck with bash pipelines and python scripts for anything complex.


Half way through your second paragraph I just knew you'd be reaching for 'jq'!

All joking aside, that's not a bad solution to the underlying problem. Fundamentally, unstructured data in shell pipelines is much of the issue, and JSON can be used to provide that structure. I'm seeing more and more tools emit or accept JSON. If one can pinch their nose and ignore the performance overhead of repeatedly generating and parsing JSON, it's a workable solution.

Years ago, a project idea I was really interested for a while was to try to write a shell in Rust that works more like PowerShell.

Where I got stuck was the fundamentals: PowerShell heavily leans on the managed virtual machine and the shared memory space and typed objects that enables.

Languages like C, C++, and Rust don't really have direct equivalents of this and would have to emulate it, quite literally. At that point you have none of the benefits of Rust and all of the downsides. May as well just use pwsh and be done with it!

Since then I've noticed JSON filling this role of "object exchange" between distinct processes that may not even be written in the same programming language.

I feel like this is going to be a bit like UTF-8 in Linux. Back in the early 2000s, Windows had proper Unicode support with UTF-16, and Linux had only codepages on top of ASCII. Instead of catching up by changing over to UTF-16, Linux adopted UTF-8 which in some ways gave it better Unicode support than Windows. I suspect JSON in the shell will be the same. Eventually there will be a Linux shell where everything is always JSON and it will work just like PowerShell, except it'll support multiple processes in multiple languages and hence leapfrog Windows.


>Years ago, a project idea I was really interested for a while was to try to write a shell in Rust that works more like PowerShell.

So this whole conversation and a different one about python and its behavior around `exit` vs `exit()` sent me down a rabbit hole of seeing if I could make the python interpreter have a "shell like" dsl for piping around data. It turns out you sort of can. I don't think you can defeat the REPL and make a function call like `echo "foo" "bar" "baz"`, but you can make it do this:

  netstat("-ln") | filter_by({"state":"ESTABLISHED"}) \
    | group_by(["local_port"]) | count_groups \
    | sort("count", descending=True) | limit(10)
And only ever parse plain text once on the input from netstat. For your amusement/horror I present "ShPy": https://gitlab.com/tpmoney/shpy


> Years ago, a project idea I was really interested for a while was to try to write a shell in Rust that works more like PowerShell.

> Where I got stuck was the fundamentals: PowerShell heavily leans on the managed virtual machine and the shared memory space and typed objects that enables.

Hmmm, if you need that sort of shared memory access throughout the shell, you probably need a language like python (or maybe better Lisp) with a REPL and the ability/intent to self modify while running. Of course, every time you have to farm out because you don't have a re-written replacement internally to the shell app, you'd still be parsing strings, but at least you could write a huge part of data processing in the shell language and keep it in house. Years ago I worked for a company that was using Microsoft's TFVC services (before it was azure devops or whatever they call it now) and wrote a frontend to their REST API in python that we could call from various other scripts and not be parsing JSON everywhere. Where this is relevant to the discussion is that one of the things I built in (in part to help with debugging when things went sideways) was an ability to drop into the python REPL mid-program run to poke around at objects and modify them or the various REST calls at will. With well defined functions and well defined objects, the interactive mode was effectively a shell for TFVC and the things we were using.

Though all of that said, even if one did that, they would still either need to solve the "object model" problem for disparate linux tools, or worse, commit to writing (or convincing other people to write and maintain) versions of all sorts of various tools in the chosen language to replace the ones the shell isn't farming out to anymore. It's one thing to choose to write a shell, it's something else entirely to choose to re-write the gnu userland tools (and add tools too).


> Years ago, a project idea I was really interested for a while was to try to write a shell in Rust that works more like PowerShell.

Today's your lucky day!

https://www.nushell.sh


> PowerShell has up-front required prerequisites that you can declare

Anyone who's written more than a few scripts for others will have learned to do something like this at the start:

    declare -a reqs
    reqs+=(foo bar baz)
    missing=0
    for r in "${reqs[@]}"; do
        if (! command -v "$r" &>/dev/null); then
            echo "${r} is required, please install it"
            missing=1
        fi
    done
    if [ $missing -gt 0 ]; then
        exit 1
    fi
> Your script is very bravely trying to parse output that includes many different protocols, including: tcp, tcp6, udp, udp6, and unix domain sockets

They probably didn't know you could specify a type. Mine only displays TCP4.

> Now imagine putting your script side-by-side with the PowerShell script, and giving it to people to read.

I'm gonna gatekeep here. If you don't know what that script would do, you have no business administering Linux for pay. I'm not saying that in a "GTFO noob" way, but in a "maybe you should know how to use your job's tools before people depend on you to do so." None of that script is using exotic syntax.

> Note that you had to use 'awk', which is a parser, and then three uses of 'grep' -- a regular expression language, which is also a kind of parsing.

They _chose_ to. You _can_ do it all with awk (see my example in a separate post).

> Literally in every discussion about PowerShell there's some Linux person who's only ever used bash complaining that PS syntax is "weird" or "hard to read". What are they talking about!? It's half the complexity for the same functionality, reads like English, and doesn't need write-only hieroglyphics for parameters.

And yet somehow, bash and its kin continue to absolutely dominate usage.

There is a reason that tools like ripgrep [0] are beloved and readily accepted: they don't require much in the way of learning new syntax; they just do the same job, but faster. You can load your local machine up with all kinds of newer, friendlier tools like fd [1], fzf[2], etc. – I definitely love fzf to death. But you'd better know how to get along without them, because when you're ssh'd onto a server, or god forbid, exec'd into a container built with who-knows-what, you won't have them.

Actually, that last point sparked a memory: what do you do when you're trying to debug a container and it doesn't have things like `ps` available? You iterate through the `/proc` filesystem, because _everything is a file._ THAT is why the *nix way exists, is wonderful, and is unlikely to ever change. There is always a way to get the information you need, even if it's more painful.
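
A minimal sketch of that /proc fallback, illustrative rather than a drop-in ps replacement:

    # list PID and short name for every process, using only /proc and shell builtins
    for d in /proc/[0-9]*; do
        pid=${d#/proc/}
        read -r name < "$d/comm" 2>/dev/null || continue   # process may have exited mid-loop
        printf '%s\t%s\n' "$pid" "$name"
    done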

[0]: https://github.com/BurntSushi/ripgrep

[1]: https://github.com/sharkdp/fd

[2]: https://github.com/junegunn/fzf


put the pipe character at the end of the line and you don't need the backslashes


> What's the point of the shell, if not to manage your databases, your REST APIs, files, and mail? Is it something you use for playing games on, or just for fun?

It's for communicating with the operating system, launching commands and viewing their output. And some scripting for repetitive workflows. If I'd want a full programming environment, I'd take a Lisp machine or Smalltalk (a programmable programming environment).

Any other systems that want to be interactive should have their own REPL.

> This kind of thing is a challenge with UNIX tools, and then is fragile forever. Any change to the output format of netstat breaks scripts in fun and create ways. Silently. In production.

The thing is, if you're using this kind of script in production and then not testing it after updating the system, that's on you. In your story, they'd be better off writing a proper program. IMO, scripts are for automating workflows (human guided), not for fire-and-forget processes. Bash and the others deal in text because that's all we can see and write. Objects are for programming languages.


> In your story, they'd be better of writing a proper program.

Sure, on Linux, where your only common options are bash or "software".

On Windows, with PowerShell, I don't have to write a software program. I can write a script that reads like a hypothetical C# Shell would, but oriented towards interactive shells.

(Note that there is a CS-Script, but it's a different thing intended for different use-cases.)


I'm kind of with the OP that it would be nice if linux shells started expanding a bit. I think the addition of the `/dev/tcp` virtual networking files was an improvement, even if it now means my shell has to talk TCP and UDP instead of relying on nc to do that
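
For anyone who hasn't played with it, a tiny sketch of what that looks like (a bash-ism, not POSIX sh; host and port are just examples):

    exec 3<>/dev/tcp/example.com/80                             # open a TCP connection on fd 3
    printf 'HEAD / HTTP/1.0\r\nHost: example.com\r\n\r\n' >&3   # write the request
    head -n 1 <&3                                               # read the status line back
    exec 3>&-                                                   # close the connection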


> What's the point of the shell, if not to manage your databases, your REST APIs, files, and mail? Is it something you use for playing games on, or just for fun?

To call other programs to do those things. Why on earth would I want my shell to directly manage any of those things?

I think you're forgetting something: *nix tools are built by a community, PowerShell is built by a company. Much like Apple, Microsoft can insist on and guarantee that their internal API is consistent. *nix tooling cannot (nor would it ever try to) do the same.

> It means that "ps" has a built-in sort command, as do most other UNIX standard utilities, but they all do it differently.

I haven't done an exhaustive search, but I doubt that most *nix tooling has a built-in sort. Generally speaking, they're built on the assumption that you'll pipe output as necessary to other tools.

> This also means that you just "need to know" how to convince each and every command to output machine-readable formats that other tools on the pipeline can pick up safely.

No, you don't, because plaintext output is the lingua franca of *nix tooling. If you build a tool intended for public consumption and it _doesn't_ output in plaintext by default, you're doing it wrong.

Here's a one-liner with GNU awk; you can elide the first `printf` if you don't want headers. Similarly, you can change the output formatting however you want. Or, you could skip that altogether, and pipe the output to `column -t` to let it handle alignment.

    netstat -nA inet | gawk -F':' 'NR > 2 { split($2, a, / /); pc[a[1]]++ } END { printf "%-5s     %s\n", "PORT", "COUNT"; PROCINFO["sorted_in"]="@val_num_desc"; c=0; for(i in pc) if (c++ < 10) { printf "%-5s     %-5s\n", i, pc[i] } }'
Example output:

    PORT      COUNT
    6808      16
    3300      8
    6800      6
    6802      2
    6804      2
    6806      2
    60190     1
    34362     1
    34872     1
    38716     1

Obviously this is not as immediately straight-forward for the specific task, though if you already know awk, it kind of is:

    Set the field separator to `:`
    Skip the first two lines (because they're informational headers)
    Split the 2nd column on space to skip the foreign IP
    Store that result in variable `a`
    Create and increment array `pc` keyed on the port
    When done, do the following
    Print a header
    Sort numerically, descending
    Initialize a counter at 0
    For every element in the pc array, until count hits 10, print the value and key
You can also chain together various `grep`, `sort`, and `uniq` calls as a sibling comment did. And if your distro doesn't include GNU awk, then you probably _would_ have to do this.

You may look at this and scoff, but really, what is the difference? With yours, I have to learn a bunch of commands, predicates, options, and syntax. With mine, I have to... learn a bunch of commands, predicates, options, and syntax (or just awk ;-)).

> This kind of thing is a challenge with UNIX tools

It's only a challenge if you don't know how to use the tools.

> Any change to the output format of netstat breaks scripts in fun and create ways

The last release of `netstat` was in 2014. *nix tools aren't like JavaScript land; they tend to be extremely stable. Even if they _do_ get releases, if you're using a safe distro in prod (i.e. Debian, RedHat), you're not going to get a surprise update. Finally, the authors and maintainers of such tools are painfully aware that tons of scripts around the world depend on them being consistent, and as such, are highly unlikely to break that.

> Silently. In production.

If you aren't thoroughly testing and validating changes in prod, that's not the fault of the tooling.


And for anyone who might be open to trying powershell, the cross platform version is pwsh.

Pythonistas who are used to __dir__ and help() would find themselves comfortable with `gm` (get-member) and get-help to introspect commands.

You will also find Python-style dynamic typing, except with PHP syntax. $a=1; $b=2; $a + $b works in a sane manner (try that with bash). There is still funny business with type coercion: $a=1; $b="2"; $a+$b (3); $b+$a ("21");

I also found "get-command" very helpful with locating related commands. For instance "get-command -noun file" returns all the "verb-noun" commands that has the noun "file". (It gives "out-file" and "unblock-file")

Another nice thing about powershell is you can retain all your printf debugging when you are done. Using "Write-Verbose" and "Write-Debug" etc allows you to write at different log levels.

Once you are used to basic powershell, there are bunch of standard patterns like how to do Dry-Runs, and Confirmation levels. Powershell also supports closures, so people create `make` style build systems and unit test suites with them.


The big problem with trying to move on from BASH is that it's everywhere and is excellent at tying together other unix tools and navigating the filesystem - it's at just the right abstraction level to be the duct tape of languages. Moving to other languages provides a lot more safety and power, but then you can't rely on the correct version being necessarily installed on some machine you haven't touched in 10 years.

I'm not a fan of powershell myself as the only time I've tried it (I don't do much with Windows), I hit a problem with it (or the object I was using) not being able to handle more than 256 characters for a directory and file. That meant that I just installed cygwin and used a BASH script instead.


I am Microsoft hater. I cannot stand Windows and only use Linux.

PowerShell blows bash out of the water. I love it.


except for the fact that it is slower than hell and the syntax is nuts. I don't really understand the comparison, bash is basically just command glue for composing pipelines and pwsh is definitely more of a full-fledged language... but to me, I use bash because its quick and dirty and it fits well with the Unix system.

If I wanted the features that pwsh brings I would much rather just pick a language like Golang or Python where the experience is better and those things will work on any system imaginable. Whereas pwsh is really good on windows for specifically administrative tasks.


The fact that it is "basically just command glue for composing pipelines" makes it even more regrettable that it takes more knowledge and mental focus to avoid shooting my foot off in bash than it does in any other programming language I use.


If you're trying to write a full fledged program in it, it's going to be a pain as there are only strings (and arrays, I think). Bash is for scripting. If you have complex logic to be done, use another programming language like perl, ruby, python, $YOUR_PREFERRED_ONE,...


You’re arguing that the power of PowerShell is pointless because you’ve resorted to alternatives to bash… because it’s not good enough for common scenarios.

This is Stockholm Syndrome.

You’ve internalised your limitations and have grown to like them.


No. bash as a shell is for interactive use or for automating said interactions. I want the computer to do stuff. The “everything is a file” and text oriented perspective in the unix world is just one model and bash is very suitable for it. Powershell is another model, just like lisp and smalltalk. I’m aware of the limitations of bash, but at the end of the day, it gets the job done and easily at that.


I’m curious. How useful is Powershell outside of a windows environment? I use it on Windows since much of the admin side of things requires it.


My issue with powershell is that it’s a niche language with a niche “stdlib” which cannot be used as general purpose. The same issue I have with AHK. These two are languages that you use for a few hours and then forget completely in three weeks.

Both of them should be simply python and typescript compatible dlls.

> You can “cd” into IIS, Exchange, and SQL and navigate them like they’re a drive. Try that with bash!

This exists.


   PowerShell can be embedded into a process as a library... and used to build an entire GUI that just wraps the CLI commands.
Sounds pretty interesting. Can you tell me what search terms I'd use to learn more about the GUI controls? Are they portable to Linux?


It doesn’t have GUI capabilities per-se. Instead, it is designed to be easy to use as the foundation of an admin GUI.

The .NET library for this is System.Management.Automation.

You can call a PowerShell pipeline with one line of code: https://learn.microsoft.com/en-us/dotnet/api/system.manageme...

Unlike invoking bash (or whatever) as a process, this is much lighter weight and returns a sequence of objects with properties. You can trivially bind those to UI controls such as data tables.

Similarly the virtual file system providers expose metadata programmatically such as “available operations”, all of which adhere to uniform interfaces. You can write a generic UI once for copy, paste, expand folder, etc and turn them on or off as needed to show only what’s available at each hierarchy level.

As an example, the Citrix management consoles all work like this. Anything you can do in the GUI you can do in the CLI by definition because the GUI is just some widgets driving the same CLI code.


Bash is crap and powershell an abomination with a few good ideas.

fish, Python, and oilshell (ysh) are ultimately on better footing.


Or just the old Perl. Any Bash/AWK/Sed user can be competent with it in days.


I ask LLMs to modify the shell script to strictly follow Google’s Bash scripting guidelines[^1]. It adds niceties like `set -euo pipefail`, uses `[[…]]` instead of `[…]` in conditionals, and fences all but numeric variables with curly braces. Works great.

[^1]: https://google.github.io/styleguide/shellguide.html
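
For anyone curious, a rough before/after sketch of the kind of rewrite that prompt tends to produce (my own toy example, not taken from the guide):

    # before
    file=$1
    [ -z $file ] && echo "usage: $0 FILE" && exit 1
    grep foo $file

    # after
    set -euo pipefail
    file="${1:-}"
    if [[ -z "${file}" ]]; then
        echo "usage: $0 FILE" >&2
        exit 2    # usage errors conventionally exit 2
    fi
    grep foo "${file}"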


Why would you change a shell (sh?) script into a Bash script? And why would you change [ into [[ expressions, which are not Posix, as far as I remember? And why make the distinction for numeric variables and not simply make the usage the same, consistent for everything? Does it also leave out the double quotes there? That even sounds dangerous, since numeric variables can contain filenames with spaces.

Somehow whenever people dance to the Google code conventions tune, I find they adhere to questionable practices. I think people need to realize that big tech conventions are simply their common denominator, and not especially great rules that everyone should adopt for themselves.


>That even sounds dangerous, since numeric variables can contain filenames with spaces.

Or filenames that contain the number zero :D

    #!/bin/sh
    #
    # Usage : popc_unchecked BINARY_STRING
    #
    #   Count number of 1s in BINARY_STRING.  Made to demonstrate a use of IFS that
    #   can bite you if you do not quote all the variables you don't want to split.
    
    len="${#1}"
    count() { printf '%s\n' "$((len + 1 - $#))"; }
    saved="${IFS}"
    IFS=0
    count 1${1}1
    IFS="${saved}"
    
    # PS: we do not run the code in a subshell because popcount needs to be highly
    # performant (≖ ᴗ ≖ )


> This matches the output format of Bash's builtin set -x tracing, but gives the script author more granular control of what is printed.

I get and love the idea but I'd consider this implementation an anti-pattern. If the output mimics set -x but isn't doing what that is doing, it can mislead users of the script.


Even worse, it mimics it poorly, hardcoding the PS4 to the default.

The author could also consider trapping debug to maybe be selective while also making it a little more automatic.


I can highly recommend using bash3boilerplate (https://github.com/kvz/bash3boilerplate) if you're writing BASH scripts and don't care about them running on systems that don't use BASH.

It provides logging facilities with colour usage for the terminal (not for redirecting out to a file) and also decent command line parsing. It uses a great idea to specify the calling parameters in the help/usage information, so it's quick and easy to use and ensures that you have meaningful information about what parameters the script accepts.

Also, please don't write shell scripts without running them through ShellCheck. The shell has so many footguns that can be avoided by correctly following its recommendations.
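
To make that concrete, here's a made-up two-liner and the kind of thing ShellCheck (run as `shellcheck myscript.sh`) flags in it:

    tmpdir=$1
    rm -rf $tmpdir/cache    # SC2086: unquoted variable can glob and word-split; the fix is "$tmpdir"/cache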


Tiny nitpick - usage errors are conventionally 'exit 2' not 'exit 1'


The only one that’s shell-specific is 4. The rest can be applied to any code you write. Good work!


Even 4 can be generalized to "be deliberate about what you do with a failed function call (etc) - does it exit the command? Log/print an error and continue? Get silently ignored? Handled?"


I'd add that if you're going to use color, then you should do the appropriate checks for determining if STDOUT isatty


Or $NO_COLOR, per https://no-color.org


A tip:

        sh -x $SCRIPT
shows a debugging trace of the script in a verbose way; it's invaluable on errors.

You can use it as a shebang too:

         #!/bin/sh -x


Thanks! I've always edited the script adding a `set -x` at the top. Never occurred to me that the shell of course has a similar startup flag.


Few months ago, I wrote a bash script for an open-source project.

I created a small awk util that I used throughout the script to style the output. I found it very convenient. I wonder if something similar already exists.

Some screenshots in the PR: https://github.com/ricomariani/CG-SQL-author/pull/18

Let me know guys if you like it. Any comments appreciated.

    function theme() {
        ! $IS_TTY && cat || awk '

    /^([[:space:]]*)SUCCESS:/   { sub("SUCCESS:", " \033[1;32m&"); print; printf "\033[0m"; next }
    /^([[:space:]]*)ERROR:/     { sub("ERROR:", " \033[1;31m&"); print; printf "\033[0m"; next }

    /^        / { print; next }
    /^    /     { print "\033[1m" $0 "\033[0m"; next }
    /^./        { print "\033[4m" $0 "\033[0m"; next }
                { print }

    END { printf "\033[0;0m" }'
    }
Go to source: https://github.com/ricomariani/CG-SQL-author/blob/main/playg...

Example usage:

    exit_with_help_message() {
        local exit_code=$1

        cat <<EOF | theme
    CQL Playground

    Sub-commands:
        help
            Show this help message
        hello
            Onboarding checklist — Get ready to use the playground
        build-cql-compiler
            Rebuild the CQL compiler
Go to source: https://github.com/ricomariani/CG-SQL-author/blob/main/playg...

        cat <<EOF | theme
    CQL Playground — Onboarding checklist

    Required Dependencies
        The CQL compiler
            $($cql_compiler_ready && \
                echo "SUCCESS: The CQL compiler is ready ($CQL)" || \
                echo "ERROR: The CQL compiler was not found. Build it with: $CLI_NAME build-cql-compiler"
            )
Go to source: https://github.com/ricomariani/CG-SQL-author/blob/main/playg...


Definitely don't check that a variable is non-empty before running

    rm -rf ${VAR}/*
That's typically a great experience for shell scripts!



Also, you'd want to put in a double dash to signify the end of arguments as otherwise someone could set VAR="--no-preserve-root " and truly trash the system. Also, ${VAR} needs to be in double quotes for something as dangerous as a "rm" command:

    rm -rf -- "${VAR}"/*


These are all about passive experiences (which are great don't get me wrong!), but I think you can do better. It's the same phenomenon DHH talked about in the Rails doctrine when he said to "Optimize for programmer happiness".

The python excerpt is my favorite example:

    $ irb
    irb(main):001:0> exit

    $ irb
    irb(main):001:0> quit

    $ python
    >>> exit
    Use exit() or Ctrl-D (i.e. EOF) to exit

<quote> Ruby accepts both exit and quit to accommodate the programmer’s obvious desire to quit its interactive console. Python, on the other hand, pedantically instructs the programmer how to properly do what’s requested, even though it obviously knows what is meant (since it’s displaying the error message). That’s a pretty clear-cut, albeit small, example of [Principle of Least Surprise]. </quote>


`exit` will work as expected in Python 3.13: https://docs.python.org/3.13/whatsnew/3.13.html#whatsnew313-...


Yes. I'd be surprised if exit without parentheses quit the interactive shell when it doesn't quit a normal python script.


IPython quits without parentheses.


IPython includes a whole lot of extra magic of various kinds, compared to the built-in Python console.


Bad timing, this is changed in Python 3.13.

Still I’ve always used Ctrl+D, which works everywhere unixy.


On the one hand, being generous in your inputs is always appreciated. On the other hand, the fact that both exit and quit will terminate ruby means the answer to "how do I quit ruby" now has two answers (technically 4, because `quit()` and `exit()` also work). And if we're talking about "least surprise": if you accept "exit" and "quit", why not also "bye" or "leave" or "close" or "end" or "terminate"?

Python might be surprising, but in this example, it's only surprising once, and helpful when it surprises you. Now you know quitting requires calling a function and that function is named exit() (although amusingly python3 anyway also accepts quit()). And being fully pedantic it doesn't know what you mean, it is assuming what you mean and making a suggestion, but that's not the same as knowing.

From here on I'm not arguing the point anymore, just recording some of the interesting things I discovered exploring this in response to your comment:

You can do this in python (which IMO is surprising, but in a different way):

  ```
  >>> quit
  Use quit() or Ctrl-D (i.e. EOF) to exit
  >>> quit=True
  >>> quit
  True
  >>> quit()
  Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
  TypeError: 'bool' object is not callable
  >>> exit()
  ```
But this also gives some sense to python's behavior. `quit` and `exit` are symbol names, and they have default assignments, but they're re-assignable like any other symbol in python. So the behavior it exhibits makes sense if we assume that they're not special objects beyond just being built in.

`exit` is a class instance according to type. So we should be able to create something similar, and indeed we can:

  ```
  >>> class Bar:
  ...   def __repr__(self):
  ...     return "Type bar() to quit!"
  ...   def __call__(self):
  ...     print("I quit!")
  ...
  >>> bar = Bar()
  >>> bar
  Type bar() to quit!
  >>> bar()
  I quit!
  >>>
  ```
Interestingly this suggests we should be able to replace exit with our own implementation that does what ruby does, if we really wanted to:

  ```
  >>> class SuperExit:
  ...   def __init__(self, real):
  ...     self.real_exit=real
  ...   def __repr__(self):
  ...     print("Exiting via repr")
  ...     self.real_exit()
  ...   def __call__(self):
  ...     print("Exiting via call")
  ...     self.real_exit()
  ...
  >>> exit = SuperExit(exit)
  >>> exit
  Exiting via repr
  ```


> why not also "bye" or "leave" or "close" or "end" or "terminate".

We can include these as well, but each keyword that you include brings diminishing returns at the cost of clutter and inconsistency in the API. Python problematically decides that returns diminish after the first --- “first” according to developers, that is --- possibility in all cases. Ruby anticipates that everyone's first choice will be different and practically maximizes comfort of users.


>Python problematically decides that returns diminish after the first --- “first” according to developers, that is --- possibility in all cases

Eh, that feels pretty arbitrary to me. `quit()` and `exit()` both work, and looking at other languages, `exit()` should almost certainly be your first choice

     C: exit(int)
  Java: System.exit(int)
  SBCL: (quit)/(exit)
    C#: Environment.Exit(int)
   PHP: exit(int)/exit(string)
  Rust: std::process::exit(int)
Having `exit` or `quit` without the parens work might accommodate some people whose first choice isn't to call a function (I guess because they're thinking of the REPL as a shell?), but surely if you're going that far `bye` is a reasonable and practical choice. ftp/sftp use it to this day. At some point you make a cutoff, and where you do is pretty arbitrary. You can like that ruby is twice as lenient as python, but I think it's a stretch to say that python using the single most common function call and a very common alias is "problematic" or even surprising. IMO, python's behavior is less surprising because I don't expect `exit` to be a special command that executes inside the REPL. I expect `exit()` because that's what other programing languages do. And python famously ditched the inconsistent `print "Foo"` syntax in favor of `print("Foo")` in python3 exactly because inconsistency in what was and wasn't a function call was surprising.


> Having `exit` or `quit` without the parens work might accommodate some people whose first choice isn't to call a function (I guess because they're thinking of the REPL as a shell?),

In ruby, parentheses are optional for function calls. `exit` is a regular call, not some REPL peculiarity.

EDIT: Nevermind. Just found that despite the `exit` method being already defined, both irb and pry shadow it with a repl command that does the same thing. Maybe it's so that it can't be redefined.


this actually completely turned me off from python when I first encountered it. I was like... "the program KNEW WHAT I WAS TRYING TO DO, and instead of just DOING that it ADMONISHED me, fuck Python" LOL

The proliferation of Python has only made my feelings worse. Try running a 6 month old Python project that you haven't touched and see if it still runs. /eyeroll


>Try running a 6 month old Python project that you haven't touched and see if it still runs.

My experience has been 6 month of python works fine. In fact, python is my go to these days for anything longer than a 5 line shell script (mostly because argparse is builtin now). On the other hand, running a newly written python script with a 6 month old version of python, that's likely to get you into trouble.


argparse? docopt or google-python-fire


argparse, because argparse is built in. I'm usually writing shell scripts to automate some process for myself and my co-workers. The last thing I want is for them to have to be fiddling with installing external requirements if I can avoid it.


Regarding point 1, you should `exit 2` on bad usage, not 1, because it is widely considered that error code 2 is a USAGE error.


> if [ -z "$1" ]

I also recommend you catch if the argument is `-h` or `--help`. A careful user won’t just run a script with no arguments in the hopes it does nothing but print the help.¹

  if [[ "${1}" =~ ^(-h|--help)$ ]]
Strictly speaking, your first command should indeed `exit 1`, but that request for help should `exit 0`.

¹ For that reason, I never make a script which runs without an argument. Except if it only prints information without doing anything destructive or that the user might want to undo. Everything else must be called with an argument, even if a dummy one, to ensure intentionality.


Maybe in the late ‘90s it was appropriate to use shell for this sort of TUI (I used Perl for it back then), but now it’s wrong-headed to use shell for anything aside from bootstrapping into an appropriately dedicated language such as Python, Ruby, or hell, just…anything with proper functions, deps checks, error-handling, and TUI libraries.


Let's normalize using python instead of bash


Using what version of python? How will you distribute the expected version to target machines?

python has its place, but it's not without its own portability challenges and sneaky gotchas. I have many times written and tested a python script with (for example) 3.12 only to have a runtime error on a coworker's machine because they have an older python version that doesn't support a language feature that I used.

For small, portable scripts I try to stick to POSIX standards (shellcheck helps with this) instead of bash or python.

For bigger scripts, typically I'll reach for python or Typescript. However, that requires paying the cost of documenting and automating the setup, version detection, etc. and the cost to users for dealing with that extra setup and inevitable issues with it. Compiled languages are the next level, but obviously have their own challenges.
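
On the version-detection point, one cheap mitigation is a small wrapper that fails fast before anything else runs; a sketch, with the minimum version and script name picked arbitrarily:

    # refuse to run under a Python older than the one the tool was tested with
    python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 12) else 1)' || {
        echo "this tool needs python >= 3.12" >&2
        exit 1
    }
    python3 ./the_actual_tool.py "$@"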


> Using what version of python? How will you distribute the expected version to target machines?

Let's focus on solving this then. Because the number of times that I've had to do surgery on horrible bash files because they were written for some platform and didn't run on mine...


Depends on how long you want the script/program to be usable.

Try running a twenty year old BASH script versus a python programme on a new ARM or RISC-V chip.

Or, try running BASH/python on some ancient AIX hardware.


I can still run old python fine, as much as old bash scripts. I often have to edit bash scripts written by others to make it run on my mac.


I don’t remember where I got it, but I have a simple implementation of a command-line spinner that I use keyboard shortcuts to add to most scripts. Has been a huge quality of life improvement but I wish I could just as seamlessly drop in a progress bar (of course, knowing how far along you are is more complex than knowing you’re still chugging along).


can you share it?


Not OP but I have used this one with success.

https://stackoverflow.com/questions/12498304/using-bash-to-d...
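
The basic shape of most of these is something like this (a minimal sketch, not the code at that link; `long_running_task` stands in for whatever you're waiting on, and `sleep 0.1` assumes a sleep that accepts fractional seconds):

    spin() {
        local frames='|/-\' i=0
        while kill -0 "$1" 2>/dev/null; do      # $1 is the PID we're waiting on
            printf '\r%s' "${frames:i++%4:1}"
            sleep 0.1
        done
        printf '\r'
    }

    long_running_task &
    spin $!
    wait "$!"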


> Strategic Error Handling with "set -e" and "set +e"

I think appending an explicit || true for commands that are ok to fail makes more sense. Having state you need to keep track of just makes things less readable.
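
Concretely, the two styles side by side (the command names here are placeholders):

    # state-based: everything between the toggles is allowed to fail
    set +e
    flaky_cleanup
    optional_report
    set -e

    # per-command: the exemption sits right on the line it applies to
    flaky_cleanup || true
    optional_report || true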


Good stuff.

One rule I like is to ensure that, as well as validation, all validated information is dumped in a convenient format prior to running the rest of the script.

This is super helpful, assuming that some downstream process will need pathnames, or some other detail of the process just executed.


I was so frustrated by having to enter a lot of information for every new git project (I use a new VM for each project) so I wrote a shell script that automates everything for me.

I'll probably also combine a few git commands for every commit and push.


Sounds like a cool setup! Did you write it up somewhere publicly?

I also use VMs (qemu microvms) based on docker images for development.


Sorry, it's on my other machine so I don't have it at hand. But it's an extremely simple setup that configures the email and the user, removes the need for --set-upstream when pushing, and automates pushing with a token.

I asked ChatGPT to write it and double checked btw.


Most of those things can just be set in your global git config file, and surely you're using some kind of repeatable/automated setup for VMs.. I don't see why you'd ever need to be doing something other than "copy default git config file" in your Vagrantfile/etc


That's a good idea. I use VirtualBox but I'm sure there is something similar I can do.


The benefit of vagrant is it works with a wide variety of Hypervisors (including vbox) and strongly encourages a reproducible setup through defined provisioning steps.


  if [ -x "$(command -v gtimeout)" ]; then
Interesting way to check if a command is installed. How is it better than the simpler and more common "if command...; then"?


The form you propose runs `command`, which may have undesired side-effects. I always thought `which` to be standard, but TIL `command` (sh builtin) is[0].

[0]: https://hynek.me/til/which-not-posix/


To be clear: both alternatives shown below will invoke the same thing (`command -v gtimeout`).

    if [ -x "$(command -v gtimeout)" ]; then
and

    if command -v gtimeout >/dev/null; then

The first invokes it in a sub shell (and captures the output), the second invokes it directly and discards the output, using the return status of `command` as the input to `if`.

The superficial reason the second is "preferred" is that it's slightly better performance wise. Not a huge difference, but it is a difference.

However the hidden, and probably more impactful reason it's preferred, is that the first can give a false negative. If the thing you want to test before calling is implemented as a shell builtin, it will fail, because the `-x` mode of `test` (and thus `[`) is a file test, whereas the return value of `command -v` is whether or not the command can be invoked.
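
A quick way to see that false negative is to probe a builtin, which has no file on disk (assuming there isn't an executable literally named "cd" sitting in the current directory):

    $ command -v cd
    cd
    $ [ -x "$(command -v cd)" ] && echo found || echo missing
    missing
    $ command -v cd >/dev/null && echo found || echo missing
    found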


Ah! I misread the parent, I thought he meant `if command` to look for a random command (e.g. `if grep`)


This post and comment section are a perfect encapsulation of why I'll just write a Rust or Go program, not bash, if I want to create a CLI tool that I actually care about.


https://github.com/charmbracelet/glow is pretty nice for stylized TUI output


Glow is awesome.


No. Glow connects to internet servers, screw that.


Oops, I actually meant Gum, not Glow. Different project from the same folks.

That said, I use Glow to render markdown sometimes. When and how does it connect to internet servers?


I think the person you're responding to is FOS but anyone can audit the source code to find such a thing: https://github.com/charmbracelet/glow


Please point me to the exact place in the source code where it does that:

https://github.com/charmbracelet/glow



yeah, looks like they had a "secure markdown stashing feature" at some point which is a cool way to do usage tracking while claiming you're offering a useful feature... and then people weren't fooled.

I'm fine with it


In the 4th section, is there a reason why set +e is inside the loop while set -e is outside, or is it just an error?


He says in the article. 'Set +e' prevents any error in a fork blitzing the whole script.


That's clear, my question is why the +e is inside the loop and thus is set at each iteration, while the -e is outside it and thus is set only once at the end.


Nicely done. I love everything about this.


In the first example, the error messages should be going to stderr.


I liked the commenting style


The first four parts of my Typesetting Markdown blog describe improving the user-friendliness of bash scripts. In particular, you can use bash to define a reusable script that allows isolating software dependencies, command-line arguments, and parsing.

https://dave.autonoma.ca/blog/2019/05/22/typesetting-markdow...

In effect, create a list of dependencies and arguments:

    #!/usr/bin/env bash
    source $HOME/bin/build-template

    DEPENDENCIES=(
      "gradle,https://gradle.org"
      "warp-packer,https://github.com/Reisz/warp/releases"
      "linux-x64.warp-packer,https://github.com/dgiagio/warp/releases"
      "osslsigncode,https://www.winehq.org"
    )

    ARGUMENTS+=(
      "a,arch,Target operating system architecture (amd64)"
      "o,os,Target operating system (linux, windows, macos)"
      "u,update,Java update version number (${ARG_JAVA_UPDATE})"
      "v,version,Full Java version (${ARG_JAVA_VERSION})"
    )
The build-template can then be reused to enhance other shell scripts. Note how by defining the command-line arguments as data you can provide a general solution to printing usage information:

https://gitlab.com/DaveJarvis/KeenWrite/-/blob/main/scripts/...

Further, the same command-line arguments list can be used to parse the options:

https://gitlab.com/DaveJarvis/KeenWrite/-/blob/main/scripts/...

If you want further generalization, it's possible to have the template parse the command-line arguments automatically for any particular script. Tweak the arguments list slightly by prefixing the name of the variable to assign to the option value provided on the CLI:

    ARGUMENTS+=(
      "ARG_JAVA_ARCH,a,arch,Target operating system architecture (amd64)"
      "ARG_JAVA_OS,o,os,Target operating system (linux, windows, macos)"
      "ARG_JAVA_UPDATE,u,update,Java update version number (${ARG_JAVA_UPDATE})"
      "ARG_JAVA_VERSION,v,version,Full Java version (${ARG_JAVA_VERSION})"
    )
If the command-line options require running different code, it is possible to accommodate that as well, in a reusable solution.


literally nothing here of interest



