Handling command-line arguments in Bash is easy. Bash's `getopts` handles short and GNU-style long options without any problem, out of the box, with no need for libraries or complicated packages.
This pattern handles lots of option styles: short and long options (-h, --help), `--` for separating options from positional args, and GNU-style long options with values (--output-file=$filename).
while getopts :o:h-: option
do case $option in
       h ) print_help;;
       o ) output_file=$OPTARG;;
       - ) case $OPTARG in
               help )          print_help;;
               output-file=* ) output_file=${OPTARG#*=};;  # '#*=' strips only through the first '=', so values may contain '='
               * )             echo "bad option $OPTARG" >&2; exit 1;;
           esac;;
       '?' ) echo "unknown option: $OPTARG" >&2; exit 1;;
       : )   echo "option missing argument: $OPTARG" >&2; exit 1;;
       * )   echo "bad state in getopts" >&2; exit 1;;
   esac
done
shift $((OPTIND-1))
(( $# > 0 )) && printf 'remaining arg: %s\n' "$@"
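For anyone who wants to try the loop above without wiring it into a real script, here is a self-contained sketch. The `parse` wrapper function is mine, and `print_help` is replaced with a plain `echo` so the whole thing runs standalone:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the getopts loop above, for experimentation only.
parse() {
    local OPTIND option output_file=""
    while getopts :o:h-: option
    do case $option in
           h ) echo "help requested";;
           o ) output_file=$OPTARG;;
           - ) case $OPTARG in
                   help )          echo "help requested";;
                   output-file=* ) output_file=${OPTARG#*=};;
                   * )             echo "bad option $OPTARG" >&2; return 1;;
               esac;;
           '?' ) echo "unknown option: $OPTARG" >&2; return 1;;
           : )   echo "option missing argument: $OPTARG" >&2; return 1;;
       esac
    done
    shift $((OPTIND-1))
    echo "output_file=$output_file remaining=$*"
}

parse -o short.txt a b          # → output_file=short.txt remaining=a b
parse --output-file=long.txt c  # → output_file=long.txt remaining=c
```

The second call is the interesting one: getopts sees option `-` with OPTARG `output-file=long.txt`, which the inner `case` then dispatches.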
>I don't know where you learned shell scripting, but your formatting is very confusing
I don't know where you learned English composition, but I can't follow your critique. Are you pointing out your preferences (which may well match GNU or BSD standards; I don't know, tell me) or things that actually make a difference? Did he break emacs auto-indent? Are you pointing out all the flaws or the correct ways? I'm having to eyeball-diff. Use more of your words.
while
do
doesn't seem substantially worse or more confusing than `while; do`, unless you have some reason?
Spacing out `;;`: does it make a difference, or do you just like things spaced out?
I thought it was "professional", helpful, and a great comment. I didn't think yours was constructive, just unhelpfully treating a different formatting style as objectively worse, in an unfriendly way. "Be kind"! It was also hard to parse, as the GP pointed out.
edit: You changed your comment after I wrote this. Now it mentions that you flagged the GP. That's ridiculous.
If you refer to my example above, you'll find that Bash's native `getopts` handles long options just fine. It accepts `-` as a short option that takes an argument, and that is what handles --long-options and --long-options=with-arguments.
Honestly, I don't think I found a single case where this was true. Every time I try to rewrite a moderately complex bash script in Python it becomes hundreds or thousands of lines of code dealing with calling external binaries, streams and proper error handling. Perhaps if you're only dealing with text processing it will work, but the moment you start piping together external programs via Python it's all pointless.
Glad I'm not the only one. Honestly, whenever bash comes up in any context, 10 different people feel compelled to express this whole "replace all bash scripts with python" sentiment and I just have no clue what the fuck they're talking about.
I would consider myself an expert, or at least a near-expert, in Python, but I don't see opportunities to replace my shell scripts with Python popping up left and right. Do you open files manually and set up pipe chains with subprocess.Popen? I've done this, and it's generally many more LOC than the shell original, and harder to read.
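For what it's worth, here is what a trivial shell pipeline looks like when done with raw Popen plumbing (the data fed through `printf` is invented for illustration); it makes the LOC point concrete:

```python
import subprocess

# Shell one-liner equivalent:  printf 'b\na\nb\n' | sort | uniq -c
# One line of shell pipe becomes roughly ten lines of Popen plumbing.
src  = subprocess.Popen(['printf', 'b\\na\\nb\\n'], stdout=subprocess.PIPE)
sort = subprocess.Popen(['sort'], stdin=src.stdout, stdout=subprocess.PIPE)
uniq = subprocess.Popen(['uniq', '-c'], stdin=sort.stdout,
                        stdout=subprocess.PIPE, text=True)
src.stdout.close()   # let earlier stages receive SIGPIPE if a later one exits
sort.stdout.close()
output, _ = uniq.communicate()
print(output, end='')
```

And this version still doesn't check the exit status of the earlier stages, which a careful script would also have to do by hand.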
On the other hand, I'd consider myself maybe 7/10 skill level with bash, but most developers are only ever a 2/10 or 3/10 with bash/shell. I can't help but think that the average developer's lack of shell understanding is where all these suggestions to convert to python come from. If it's that easy or beneficial to convert to python, then it probably should have been written in python originally.
I think when you have a very-well-written bash script, it starts to get pretty close to Python in size, but Python wins in readability. A well-written Python script will handle errors cleanly, will not die on quoting problems, and can use complex data structures or objects when data handling gets hairy.
When you use regexes, letting python use them directly seems to be much cleaner than quoting and passing them to awk/sed/grep or other text processing tools.
Additionally, python seems to have libraries to handle all kinds of input like json and csv.
One thing that Python doesn't do as naturally as bash is invoke commands. For Python I usually include some functions to run commands passed as a list, and check the return code or return the output. Then 'cmd -f $arg1' becomes myrun('cmd', '-f', arg1)
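A minimal sketch of such a helper, assuming `subprocess.run` semantics (`myrun` is the commenter's name; this particular implementation is my guess at what it does):

```python
import subprocess

def myrun(*argv, check=True):
    """Run a command given as separate arguments and return its stdout.

    With check=True, a nonzero exit code raises CalledProcessError,
    roughly mirroring `set -e` behavior in a shell script.
    """
    result = subprocess.run(argv, capture_output=True, text=True, check=check)
    return result.stdout

# The shell line  cmd -f "$arg1"  becomes:
#   output = myrun('cmd', '-f', arg1)
print(myrun('echo', 'hello'), end='')  # → hello
```

Passing the arguments as a list sidesteps the shell's word splitting entirely, which is half the appeal of doing this in Python in the first place.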
Stringing up a pipe with a mix of awk, tr, sed, cut, sort, uniq etc. is certainly not more readable than a Python script with comments not invoking any of these but using Python's data structures, or even data frames/SQL.
100% agree. There are some libraries like https://amoffat.github.io/sh/ that aim to make that easier, but they always have some quirks that, funnily enough, are often the corner cases you were hitting in your complicated Bash script in the first place.
Yeah, I figure if a bash script goes over ~10-20 lines, or involves really long lines, or quotes within quotes within quotes of different kinds, it's time to move on to Python or similar.
When I'm working on/writing complex bash scripts, it's because the scripts have to run on a variety of different customer machines where Python is not consistently available (let alone being able to count on a particular version) and, if it's not there, cannot be installed.
The main advantage of bash is that it exists on just about every unix machine.
TypeScript is now tenable for bash scripts with the fast-starting Bun runtime https://bun.sh/.
Before Bun, Node+V8 was just too slow to start.
IMHO all scripts should be written in TypeScript...you get typechecking and all the rest of the editing experience. Plus things like Wallaby.js for live coding/testing.
My `.bashrc` now just runs a TypeScript script with Bun. Allows you to use proper structure instead of brittle spaghetti stuff. The number of times I'm debugging silly PATH stuff was too much...
>you get typechecking and all the rest of the editing experience
This is true for a number of languages that can also be run without AOT compilation (Go, Rust, etc). This feels like some really weird, incoherent astroturf take for Bun promotion.
Yeah, something like that. I am using unix pipes to communicate with the shell, basically just eval'ing commands. I did this because Bun didn't support sockets yet; sockets are probably a better approach. Pipes have crazy edge cases.
Fair - I find that every time a Python program becomes sufficiently complex though (defined as "needs something not in the stdlib of Python 3.6"), I end up rewriting them in Go or Rust ;-)
It uses a similar style of deriving the arguments from the usage declaration, but it also includes some useful logging functions and is all in one script. There's some more info available on their style choices here: https://bash3boilerplate.sh/
I really wish bash could evolve...in particular around control flow, variable assignment, string interpolation, and arithmetic.
I love working with bash, but it has some footguns that really require an expert hand.
I know bash as-is will always be around for backwards compatibility with the god-knows-how-many scripts out there. It'd just be nice if there were a widely embraced path forward that kept the shell scripting spirit while shedding some of the unintuitive behaviors
I gave up on bash (more precisely, on bashisms) and moved to dash or whatever is at /bin/sh. If it can't be done with ash/dash then I look into Python or Node or PHP or whatever fits the situation best.
I did the same. I believe the biggest selling point of shell over other scripting languages is its availability on so many platforms. Unfortunately bashisms have ruined the effort of writing a one-size-fits-all script. So I, too, moved to POSIX shell syntax for the small problems and another shell scripting language for the bigger problems.
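For the small problems, a portable option loop needs none of the bash-only machinery (no arrays, no `[[ ]]`, no getopts long-option tricks). A sketch, with invented option names and a simulated command line so it runs standalone:

```shell
#!/bin/sh
# Hypothetical POSIX-sh option loop; option names are made up.
# Simulated command line for demonstration; drop this in a real script.
set -- -v --output=out.txt foo bar

verbose=0
output=""
while [ $# -gt 0 ]; do
    case $1 in
        -v|--verbose) verbose=1 ;;
        -o)           output=$2; shift ;;
        --output=*)   output=${1#--output=} ;;
        --)           shift; break ;;
        -*)           echo "unknown option: $1" >&2; exit 1 ;;
        *)            break ;;
    esac
    shift
done
echo "verbose=$verbose output=$output args=$*"  # → verbose=1 output=out.txt args=foo bar
```

Everything here is in POSIX, so it behaves the same under dash, ash, busybox sh, and bash.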
We simply provide options to make sure that every exit code is checked. That's really all. But shells do NOT do this. In fact the standard specifies that shells shouldn't, which has left people mystified for decades.
I've gotten good feedback about error handling and the rest of the changes. Again, you can try it right now if you want to verify that the footguns are indeed fixed.
> As an anecdote, one thing that was extremely difficult was fixing all the footguns around set -e / errexit in bash.
Shameless plug: I've recently written a blog post about how set -e suddenly ceases to work when using function calls in flow control (what you refer to as "disabled errexit quirk"): https://snails.dev/posts/set-e.html
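The quirk in question fits in a few lines; this is my own minimal demonstration of the behavior the post describes (function name invented):

```shell
#!/usr/bin/env bash
set -e

step() {
    false                             # fails, but does NOT abort here...
    echo "still running after false"
}

# ...because while `step` is used as an `if` condition, errexit is
# suspended for the entire body of the function.
if step; then
    echo "condition succeeded"
fi
```

This prints both lines: `false` does not abort the function, and `step` even "succeeds" because its exit status is that of the final `echo`.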
I've always assumed that there was some argument parser available that just sets things as environment variables, and that my google-fu is just too weak to find it.
Why couldn't I just go `source argbash _ARG_ --single option o --bool print --position positional -- "$@"` and get _ARG_OPTION, _ARG_PRINT, and _ARG_POSITIONAL environment variables set based on the commands passed in, without having to dump a hundred lines of code in my script?
Am I the only one who finds argbash unnecessarily complicated?
For my own needs, I rely on a tiny function I wrote, called process_optargs. Example use:
source /path/to/process_optargs
source /path/to/error
source /path/to/is_in
function myfunction () { # example function whose options/arguments we'd want to process
    # Define local variables which will be populated or checked against inside process_optargs
    local -A OPTIONS=()
    local -a ARGS=()
    local -a VALID_FLAG_OPTIONS=( -h/--help -v --version )  # note: -v and --version represent separate flags here! (e.g. '-v' could be for 'verbose')
    local -a VALID_KEYVAL_OPTIONS=( -r/--repetitions )
    local COMMAND_NAME="myfunction"

    # Process options and arguments; exit if an error occurs
    process_optargs "$@" || exit 1

    # Validate and collect parsed options and arguments as desired
    if is_in '-h' "${!OPTIONS[@]}" || is_in '--help' "${!OPTIONS[@]}"
    then display_help
    fi

    if   is_in '-r'            "${!OPTIONS[@]}"; then REPS="${OPTIONS[-r]}"
    elif is_in '--repetitions' "${!OPTIONS[@]}"; then REPS="${OPTIONS[--repetitions]}"
    fi

    if test "${#ARGS[@]}" -lt 2
    then error "myfunction requires at least 2 non-option arguments"
         exit 1
    fi

    # ...etc
}
It works as you'd expect, with appropriate checks for correctness of inputs, and is compatible with most unix conventions (including '--' and '-' as arguments).
If anyone's interested let me know and I can share the code.
As cool as this is, I feel that anything with the complexity of more than one or two arguments should really be written in a different language, like Python (or AppleScript for macOS users). Bash just isn't the right tool for the job then.
AppleScript: It’s easy because you write it like natural English! Sort of. Well, it’s not really natural English. It also doesn’t have semantics that make sense for a programming language. Good luck!
Hey but don't worry, it has great documentation, in the form of question posts from 2006 on "MacOSXHints" that may or may not have been answered!
Also don't worry, you can write AppleScript in JavaScript syntax too! Too bad 100% of the code samples out there to help you are in the "other" syntax :D
Why is it insulting to point out that GPT can produce good results for this particular use-case?
Being able to define your argument types and generate parsing code for them using an example CLI invocation feels very natural and expressive to me. I personally found it to be useful for my work.
He's saying the code isn't crap and the attack on it is propaganda. The reason for the strong reactions is pretty simple. The majority of people here program for a living and GPT is a threat to their income.
I doubt anyone who codes professionally is threatened by kids posting chatbot output. There's a widening gap between actual knowledge and people who think they know something because they can regurgitate it. It's exactly like people who think they know about something because they can look it up on the internet. They're super annoying but completely nonthreatening
Languages that I work with too infrequently to remember how to use them - like Bash - are the absolute perfect place to apply LLM tech like ChatGPT.
Prompt:
> Write a bash script "foo.sh" that accepts a required filename, optional flags for "-r/--reverse" and "-s/--skip" and an optional "-o/--output=other-file" parameter. It should have "-h/--help" text too explaining this.
Then copy and paste out the result and write the rest of the script (or use further prompts to get ChatGPT to write it for you).
Could it be done better if I spent more time on it or was a Bash expert? Absolutely, but for most of the times when I need to do something like this I really don't care too much about the finished quality.
Cute and lots of effort went into this, but code generation is, unfortunately, unmaintainable and inflexible. This seems targeted at users who want to avoid mastery of their tools, which is fine for some.
If anyone is looking for a snippet to handle both positional args along with required and optional flags (both with short and long form formats) along with basic validation I put together: https://nickjanetakis.com/blog/parse-command-line-positional..., it includes the source code annotated with comments and a demo video.
Going to Python/whatever is not quite the same — shell scripts are not as much written as extracted from shell history, so switching to a separate language is a large extra step.
I have used argbash for many projects and it is wonderful. But as others have indicated, be mindful of whether or not Bash is the right tool for the task at hand.
The square brackets in your script have to match (i.e. every opening square bracket [ has to be followed at some point by a closing square bracket ]).
There is a workaround — if you need constructs such as red=$'\e[0;91m', you can put the matching square bracket behind a comment, i.e. red=$'\e[0;91m' # match square bracket: ].
I wish there were more projects on the other side of the spectrum: take the script's self-reported usage string, à la docopt [0], and derive argument-parsing code from that. After all, we have GPT-4 now.