
Conventions for Command Line Options - zdw
https://nullprogram.com/blog/2020/08/01/
======
exmadscientist
Just, whatever you do, please please PLEASE _PLEASE_ support `--help`. I
don't mean `-h` or `-help`, I mean `--help`.

There's only one thing the long flag can possibly mean: give me the help text.
I understand that your program might prefer the `-help` style, but many do
not. And do you know how I figure out which style your program likes? That's
right, I use `--help`. I have to use `--help` rather than just `-help` because
of GNU userspace tools, among others. It seems unlikely they're going to
suddenly clean up their act _this_ decade, so I have to default to starting
with `--help`.

So it's very frustrating when the response to `program --help` is "Argument
--help not understood, try -help for help." This is then often followed by me
saying indecent things about the stupid program.

~~~
Annatar
Traditional UNIX®️ options for help are -? and -h. --long-options are a horrid
GNU-ism and shunned by clean UNIX®️ compliant programs, because such programs
come with detailed, professionally written manual pages which contain
extensive SYNOPSIS and EXAMPLES sections.

Implementing --long-options makes the poor users type much more, hurting
ergonomics and the users' long-term productivity and efficiency.

~~~
jaen
-? - what are you smoking? :)

That is a glob in most shells, try putting a file named "-a" in your current
directory and see what happens...

The proper way to write it would be -\?

~~~
Annatar
It goes without saying that it has to be escaped in most shells, but that is
the traditional option for help on UNIX®️. You would be well advised to
educate yourself on the history of UNIX®️ before coming up with "what are you
smoking?"

------
chriswarbo
I write a lot of commandline utilities and scripts for automating tasks. I
find environment variables _much_ simpler to provide and accept key/value
pairs than using arguments. Env vars don't depend on order; if present, they
always have a value (even if it's the empty string); they're also inherited by
subprocesses, which is usually a bonus too (especially compared to passing
values to sub-commands via arguments, e.g. when optional arguments may or may
not have been given).

Using arguments for key/value pairs allows invalid states, where the key is
given but not the value. It can also cause subsequent, semantically distinct,
options to get swallowed up in place of the missing value. They also force us
to depend on the order of arguments (i.e. since values must follow immediately
after their keys), despite it being otherwise irrelevant (at least, if we've
split on '--'). I also see no point in concatenating options: it introduces
redundant choices, which imposes a slight mental burden that we're better off
without.

The only advice I wholeheartedly encourage from this article is (a) use
libraries rather than hand-rolling (although, given my choices above, argv is
usually sufficient!) and (b) allow a '--' argument for disambiguating flag
arguments from arbitrary string arguments (which might otherwise parse as
flags).
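
As a rough illustration of the style described above (not the commenter's
actual code), here is a minimal sketch. The names `read_settings`,
`split_on_double_dash`, `MYTOOL_WIDTH` and `MYTOOL_VERBOSE` are hypothetical,
chosen only for this example:

```python
import os

def read_settings(environ=os.environ):
    """Read key/value settings from the environment (names are hypothetical)."""
    # A variable that is present always has a value, even if it is empty,
    # so there is no "key given but value missing" state to handle.
    width = int(environ.get("MYTOOL_WIDTH", "40"))
    verbose = environ.get("MYTOOL_VERBOSE", "") != ""
    return {"width": width, "verbose": verbose}

def split_on_double_dash(argv):
    """Split argv at the first '--': flags before it, literal strings after."""
    if "--" in argv:
        i = argv.index("--")
        return argv[:i], argv[i + 1:]
    return argv, []
```

A caller would invoke the tool as something like
`MYTOOL_WIDTH=72 mytool -- -not-a-flag`, and the order of the variables never
matters.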

~~~
enriquto
The beautiful thing about environment variables is that you can read them
whenever you actually need them in your program. They pierce a hole through
your call stack to the point where you need them. By contrast, with command
line arguments you need to pass their values from the main function down to
whatever depth of the call stack you need them at.

~~~
Too
The same thing can be said about global variables. I think any sane developer
would agree they're an anti-pattern.

~~~
chriswarbo
Mutating global variables is certainly an anti-pattern, but I'm not so
convinced that global constants are a problem; i.e. if we treat env vars like
immutable variables (in fact I treat most variables as immutable).

------
ucarion
Most cli parsers don't fully support the author's suggested model because it
means you can't parse argv without knowing the flags in advance.

For example, the author suggests that a cli parser should be able to
understand these two as the same thing:

    
    
        program -abco output.txt
        program -abcooutput.txt
    

That's only doable if by the time you're parsing argv, you already know `-a`,
`-b`, and `-c` don't take a value, and `-o` does take a value.

But this is a pain. All it gets you is the ability to save one space
character, in exchange for a much more complex argv-parsing process. The `-abco
output.txt` form can be parsed without such additional context, and is already
a pretty nice user interface.

For those of us who aren't working on ls(1), there's no shame in having a
less-sexy but easier-to-build-and-debug cli interface.
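
To make the dependency concrete, here is a minimal sketch of a short-option
parser (a hypothetical `parse_short_cluster`, not taken from the article or
any library). It can only treat the two spellings above as equivalent because
the caller tells it up front which letters take a value:

```python
def parse_short_cluster(args, takes_value):
    """Parse clustered short options.

    `takes_value` is the set of option letters that require a value; the
    parser cannot split a cluster correctly without it.
    """
    opts = []
    i = 0
    while i < len(args):
        arg = args[i]
        if not arg.startswith("-") or arg == "-" or arg == "--":
            break  # first non-option argument (or '--') ends option parsing
        rest = arg[1:]
        while rest:
            letter, rest = rest[0], rest[1:]
            if letter in takes_value:
                if rest:             # attached form: -ooutput.txt
                    value = rest
                else:                # separate form: -o output.txt
                    i += 1
                    if i >= len(args):
                        raise ValueError(f"option -{letter} requires a value")
                    value = args[i]
                opts.append((letter, value))
                rest = ""            # the value consumed the rest of the cluster
            else:
                opts.append((letter, None))
        i += 1
    if i < len(args) and args[i] == "--":
        i += 1                       # skip the separator itself
    return opts, args[i:]
```

With `takes_value={"o"}` both `-abco output.txt` and `-abcooutput.txt` yield
the same four options. With an empty set, the second form is read as fourteen
separate flags.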

~~~
saurik
But every single command line parser I have ever used (and I have used many
over the past 25 years in many programming languages) does in fact know before
parsing that a b and c don't take a value and o does: accepting the grammar
for the flags and then parsing them in this way is like the only job of the
parser?

~~~
TeMPOraL
I suppose the only reason why you wouldn't want to have a centralized
description of the command line grammar is if you're using flags which alter
the set of acceptable flags. Like e.g. if you put all git subcommands into one
single executable - the combinations and meanings of commandline flags would
become absurdly complicated.

~~~
tom_
This is usually handled by adding support for that sort of syntax to the
parsing library. The result is typically not complicated: you have a parser
for the global options, then, if using subcommands, in effect a map of
subcommand name to parser for options for that command.
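
As one possible illustration, Python's argparse exposes this structure
through `add_subparsers`. The program name, subcommands and flags below are
made up for the example:

```python
import argparse

def build_parser():
    # Global options live on the top-level parser.
    parser = argparse.ArgumentParser(prog="mycmd")
    parser.add_argument("--verbose", action="store_true")
    # Each subcommand name maps to its own parser with its own options.
    sub = parser.add_subparsers(dest="command", required=True)
    commit = sub.add_parser("commit")
    commit.add_argument("-m", "--message", required=True)
    push = sub.add_parser("push")
    push.add_argument("--force", action="store_true")
    return parser
```

Here `mycmd --verbose commit -m msg` and `mycmd push --force` are each parsed
against only the options that make sense for that subcommand.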

------
barrkel
Very opinionated, and IMO without enough justification.

The assertions around short options with arguments - conjoining with other
short options, for example - are actively harmful to legibility in scripts,
since there's no lexical distinction between the argument and extra short
options. I don't recommend using that syntax when alternatives are available
and I deliberately don't support it when implementing ad-hoc argument parsing
(typically when commands are executed as parsing proceeds).

~~~
Spivak
Counterexamples where this is good for legibility.

    
    
        tar -xf archive.tar.gz
    
        tar -czf archive.tar.gz dir/

~~~
curryhoward
I'm guessing the comment was talking about examples like this:

    
    
        program -abcdefg.txt
    

Just from reading this, you can't tell where the flags end and the filename
begins unless you have all the flags and their arities memorized.

~~~
DangitBobby
Worse than that, it adds an unstable convention to an otherwise stable
interface. What if there used to be a, b flags and then they add a c flag? Who
knows what your command does now. It's just bad all around.

~~~
Spivak
It’s not unstable or ambiguous at all.

    
    
        -abcdefg.txt
    

In your hypothetical -b takes an argument which means it consumes all the
characters following it. Anything following b is never interpreted as a flag.
There is no danger of -c changing the parsing. Even if b takes an optional
argument it still consumes all characters following it.
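
A quick way to check this is Python's standard `getopt` module, which follows
the same rule for required arguments. The two option specs are hypothetical:
one before a `-c` flag exists and one after it has been added.

```python
import getopt

# Hypothetical option specs: in both, -b requires an argument ("b:").
before = "ab:"    # only -a and -b exist
after = "ab:c"    # a later release adds a boolean -c

args = ["-abcdefg.txt"]
opts_before, rest_before = getopt.getopt(args, before)
opts_after, rest_after = getopt.getopt(args, after)

# Both parses agree: -a is a flag and -b takes "cdefg.txt" as its argument.
print(opts_before)
print(opts_after)
```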

~~~
DangitBobby
Ah, you are right.

------
m463
I really do like argparse.

It will cleanly do just about anything you need done, including nice stuff
like long/short options, default values, required options, types like
type=int, help for every option, and even complicated stuff like subcommands.

And the namespace stuff is clever, so you can reference arg.debug instead of
arg['debug']
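
For anyone who hasn't used it, a minimal sketch of those features; the
options themselves are invented for the example:

```python
import argparse

parser = argparse.ArgumentParser(description="Hypothetical example tool.")
parser.add_argument("-d", "--debug", action="store_true",
                    help="enable debug output")
parser.add_argument("-n", "--count", type=int, default=1,
                    help="repeat count (default: %(default)s)")
parser.add_argument("-o", "--output", required=True, help="output file")

# Parse an explicit list here; parse_args() with no argument reads sys.argv.
args = parser.parse_args(["--debug", "-n", "3", "-o", "out.txt"])
print(args.debug, args.count, args.output)
```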

~~~
onei
I always found argparse did argument parsing well enough but it felt clunky
when you need something more complicated like lots of subcommands. I find
myself using it exclusively when I'm trying to avoid dependencies outside the
Python standard library.

My choice of argument parsing in Python is Click. It has next to no rough
edges and it's a breath of fresh air compared to argparse. I recently
recommended it to a colleague who fell in love with it with minimal persuasion
from me. I recommend it highly.

[1]
[https://click.palletsprojects.com/en/7.x/](https://click.palletsprojects.com/en/7.x/)

~~~
cb321
I feel like argh and plac preceded/inspired Click.

Also, it's not Python but in Nim there is
[https://github.com/c-blake/cligen](https://github.com/c-blake/cligen) which
also does spellcheck/typo suggestions, and allows --kebab-case --camelCase or
--snake_case for long options, among other bells & whistles.

~~~
onei
Spell check is something I'd love to see in Click. As complicated as git can
be, I always liked the spell check it has for subcommands.

As for the different cases, I personally avoid using camel or snake case
purely because I don't need to reach for the shift key. Maybe some people like
it, but I find it jarring.

~~~
cb321
Agreed vis a vis SHIFT.

cligen also has a global config file [1] to control syntax options like
whether '-c=val' requires an '=' (and if it should be '=' or ':', etc.) which
has been discussed here, various colorization/help rendering controls, etc.
Shell scripts can always export CLIGEN=/dev/null to force more strict and/or
default modes.

[1] [https://github.com/c-blake/cligen/wiki/Example-Config-File](https://github.com/c-blake/cligen/wiki/Example-Config-File)

------
alkonaut
> program -iinput.txt -ooutput.txt

What good is that? Who wants to save a space? Given -abcfoo.txt I can't tell
whether it's -abcf oo.txt or -abc foo.txt. So that's a definite drawback, and
the benefit is?

~~~
ibejoeb
Right, for all we know, that's 8 options and two duplicates. It's also harder
to implement because now I have to keep a symbol table (or I have to refer to
it more frequently). I can't see much of a utility argument here except
terseness for its own sake. In that case, why don't we just pack our own
bytes?

------
ohazi
> Go’s [...] intentionally deviates from the conventions.

 _Sigh_

Of course it does.

~~~
programd
My impression is that nobody bothers with the Go flag package and most people
use the POSIX-compatible pflag [1] library for argument parsing, usually via
the awesome cobra [2] cli framework.

Or they just use viper [3] for both command line and general configuration
handling. No point reinventing the wheel or trying to remember some weird
non-standard quirks.

[1] [https://github.com/spf13/pflag](https://github.com/spf13/pflag)

[2] [https://github.com/spf13/cobra](https://github.com/spf13/cobra)

[3] [https://github.com/spf13/viper](https://github.com/spf13/viper)

~~~
akdor1154
Nobody.. except Hashicorp :(

~~~
DangitBobby
I always wondered who to blame for the single - long arguments in terraform.

------
kazinator
> _When grouped, the option accepting an argument must be last ... program
> -abcooutput.txt_

Good grief, no.

There may be some utilities out there which work this way, but it is not
convention and should not be regarded as one.

Single letter options should almost always take an argument as a separate
argument.

Some traditional utilities allow one or more arguments to be extracted as they
are scanning through a "clump" of single letter options:

    
    
      -abc b-arg c-arg
    

Implementations of _tar_ are like this.

Newly written utilities should not allow an option to clump with others if it
requires an argument. Only Boolean single-letter options should clump.

Under no circumstances should an option clump if its argument is part of the
same argument string. For instance consider the GCC option -Dfoo=bar for
defining a preprocessor macro, and the -E option doing preprocessing only.
Even if -Dfoo=bar is last, we don't want it to clump with -E as -EDfoo=bar ---
and it doesn't.

But, in the first place, even if it did, we don't want to be looking to C
compilers for convention, unless we are specifically making a C compiler
driver that needs to be compatible.

------
desc
Some other commenters have mentioned environment variables as input.

IMO there are broadly two types of command: plumbing and porcelain. There's a
certain amount of convention and culture in distinguishing them and I'm not
going to try to argue the culture boundary...

For the commands which are plumbing (by whatever culture's rules), the
following apply:

* They are designed to interact with other plumbing: pipes, machine-comprehension, etc

* Exit code 0 for success, anything else for error. Don't try to be clever.

* You can determine precisely and unambiguously what the behaviour will be, from the invocation alone. Under no circumstances may anything modify this; no configuration, no environment.

For the commands which are porcelain (by the same culture's rules, for
consistency), the following apply:

* Try to be convenient for the user, but don't sacrifice consistency.

* If plumbing returns a failure which isn't specifically handled properly, either something is buggy or the user asked for something Not Right; clean up and abort.

* Environment and configuration might modify things, but on the command line there must be the option to state 'use this, irrespective of what anything else says' without knowing any details of what the environment or configuration currently say.

To make things more exciting, some binaries might be considered porcelain or
plumbing contextually, depending on parameters... (Yes, everyone sane would
rather this weren't the case.)
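
The porcelain rule about overrides can be sketched as a small lookup helper.
The `resolve` function and the `MYTOOL_COLOR` name are hypothetical, and
ranking the environment above the config file is an assumption made here; the
point above only requires that the command line wins over both:

```python
import os

def resolve(cli_value, env_name, config, key, default, environ=os.environ):
    """Pick a setting: command line first, then environment, then config."""
    if cli_value is not None:
        return cli_value          # an explicit flag always wins
    if env_name in environ:
        return environ[env_name]  # assumed to rank above the config file
    if key in config:
        return config[key]
    return default
```

Because the flag is checked first, a user can pass `--color=red` and get red
without knowing what the environment or configuration currently say.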

------
mmphosis
Do I add more to this code just for convention? The command line option
parsing (or broken ParseOptions dependency) will become orders of magnitude
larger and more complex than what the program does.

    
    
      usage = 0
      argc = len(sys.argv)
      if argc == 2 and sys.argv[1] == "-r":
       hex2bin()
      else:
       if argc == 2 and sys.argv[1].startswith('-w', 0, 2):
        s = sys.argv[1][2::]
       elif argc == 3 and sys.argv[1] == '-w':
        s = sys.argv[2]
       elif argc >= 2:
        usage = 1
       if usage == 0:
        try:
         width = int(s)
        except ValueError:
         print("Error: invalid, -w {}".format(s))
         usage = 1
        except NameError:
         width = 40
       if usage == 0:
        bin2hex(width)
       else:
        print("usage: mondump [-r | -w width]")
        print("       Convert binary to hex or do the reverse.")
        print("            -r reverse operation: convert hex to binary.")
        print("            -w maximum width: fit lines within width (default is 40.)")
      sys.exit(usage)

~~~
tom_
No, you take code away by using argparse! Handles all this GNU longopt and
argument parsing stuff for you, and autogenerates the --help display. Probably
something like this:

    
    
        import argparse
        import sys
    
        def auto_int(x): return int(x,0) # http://stackoverflow.com/questions/25513043/
    
        def main(argv):
            parser=argparse.ArgumentParser()
            parser.add_argument('-r',dest='reverse',action='store_true',help='reverse operation: convert hex to binary')
            parser.add_argument('-w',default=40,dest='width',type=auto_int,help='maximum width: fit lines within width (default is %(default)s.)')
            options=parser.parse_args(argv)
            # hex2bin() and bin2hex() are the functions from the parent comment
            if options.reverse: hex2bin()
            else: bin2hex(options.width)
    
        if __name__=='__main__': main(sys.argv[1:])

------
misnome
I recently ran into a case I hadn't seen before with Python's argparse:
multiple arguments to a single option. For example, "--foo bar daz" with nargs
for --foo set to '*' swallows both bar and daz, where I would have expected to
have to explicitly specify "--foo bar --foo daz" to get that behaviour. I
guess this is a side effect of treating non-option arguments the same as
dash-prefixed arguments, but I have no idea what the "standard" expectation
is here.

Otherwise, my main bugbear is software using underscore instead of dash for
long-names, and especially applications or suites that mix these cases.

I really like the simplicity that docopt somewhat forces you into, which
avoids most of these tricky edge cases, but I am seeing less and less usage of
it nowadays.

~~~
mixmastamyk
Hmm, if you configure an option to take all args following, why would one be
surprised by that?
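
For reference, a minimal sketch of the two argparse configurations being
compared in this subthread. `nargs="*"` makes one `--foo` take every following
non-option word, while `action="append"` requires the option to be repeated:

```python
import argparse

# One --foo swallows all following non-option arguments.
greedy = argparse.ArgumentParser()
greedy.add_argument("--foo", nargs="*")

# Each --foo takes exactly one argument; repeats accumulate into a list.
repeated = argparse.ArgumentParser()
repeated.add_argument("--foo", action="append")

a = greedy.parse_args(["--foo", "bar", "daz"])
b = repeated.parse_args(["--foo", "bar", "--foo", "daz"])
print(a.foo, b.foo)
```

Both end up with the same list, so which spelling to accept is a choice the
program's author makes when configuring the option.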

------
bschwindHN
I use clap/structopt in Rust for all my CLIs ever since those libraries came
out and it's just stupid easy to make a nicely functioning CLI. You define a
data structure which holds the arguments a user will pass to your program and
you can annotate fields to give them short and long names. You can also define
custom parsing logic via plain functions if the user is passing something
exotic.

At the end you can parse arguments into your data structure in one line. At
that point the input has been validated and you now have a type safe
representation of it. Well-formatted help text comes for free.

------
ur-whale
I very much like the subcommand paradigm git implements:

    
    
        mycmd <global args> subcommand <subcommand specific args>
    

However, I haven't found a C++ library that implements this properly with an
easy-to-use API (and what I mean by easy to use is: specifying the structure
of cmd line options should be easy and natural, _and_ retrieving options
specified by the user should be easy, _and_ the "synopsis" part of the man
page should be auto-generated).

If anyone knows of one, would love to hear about it.

~~~
htfy96
Have you tried
[https://github.com/CLIUtils/CLI11](https://github.com/CLIUtils/CLI11) ? The
subcommand example can be found at
[https://github.com/CLIUtils/CLI11/blob/master/examples/subco...](https://github.com/CLIUtils/CLI11/blob/master/examples/subcommands.cpp)
. It can't generate man pages though

------
dkrajzew
Hello there!

Yep, it's some kind of an advertisement, but for an open source project.

I gained some experience with parsing command line options when working on an
open source traffic simulation named SUMO
([http://sumo.dlr.de](http://sumo.dlr.de)) and decided to re-code the options
library. It works similarly to Python's argparse library - the options are
defined first, then you may retrieve the values in a type-aware way.

You may find more information on my blog pages:
[http://krajzewicz.de/blog/command-line-options.php](http://krajzewicz.de/blog/command-line-options.php)

The library itself is hosted on GitHub:

* cpp-version: [https://github.com/dkrajzew/optionslib_cpp](https://github.com/dkrajzew/optionslib_cpp)

* java-version: [https://github.com/dkrajzew/optionslib_java](https://github.com/dkrajzew/optionslib_java)

Sincerely, Daniel

------
ur-whale
I wish he mentioned what C++ option parser sticks to the rules he outlined.

------
mixmastamyk
Argparse doesn’t need sys.argv by default. The main gripe about it was also a
bit odd.

Why would one want an option with an _optional_ argument? And then complain
about ambiguity? argparse is flexible on purpose but it doesn’t mean one needs
to use every feature. optparse is still around for those who desire more
discipline.

click is nice though I don’t care for multiple decorator driven libs in one
project.

------
rkagerer
Nice to see these guidelines all laid out.

I grew up with the DOS / Windows convention of slashes, and clicked the link
hoping there'd be mention of it.

------
daitangio
For Python, the click library is fantastic and follows all the good practices
explained in the blog post.

------
smusamashah
program -c # omitted program -cblue # provided program -c blue # omitted

This is confusing

~~~
goto11
Your comment is confusing due to missing line breaks :-) Try indenting the
lines with two spaces.

And I agree, I would prefer:

    
    
      program -c=blue

~~~
smusamashah
Too late for that. Was on mobile app.

program -c # omitted program -cblue # provided program -c blue # omitted

This doesn't make sense. How -c blue is omitted while it looks just like an
argument for -c

~~~
goto11
If the argument to "-c" is optional, the parser doesn't know if "blue" is an
optional argument to "-c" or a separate argument to "program".

I think the takeaway is to avoid options with optional arguments. If the
argument to -c was required, "-c blue" would be fine.
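
To spell out the rule behind the three cases quoted upthread, here is a
minimal hand-rolled sketch (a hypothetical `parse_c`, not any library's
behaviour). It assumes the convention that an optional argument only counts
when it is attached to the option:

```python
def parse_c(argv):
    """Parse a hypothetical -c option whose argument is optional.

    The argument counts only when attached (-cblue); a separate word after
    -c is treated as an ordinary positional argument.
    """
    color_given = False
    color = None
    positionals = []
    for arg in argv:
        if arg == "-c":
            color_given, color = True, None      # option present, value omitted
        elif arg.startswith("-c"):
            color_given, color = True, arg[2:]   # attached value: -cblue
        else:
            positionals.append(arg)
    return color_given, color, positionals
```

Under that assumption `-c blue` reports the value as omitted and hands `blue`
to the program as a positional, which is exactly the surprise being discussed.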

