Exploring CLI Best Practices (localytics.com)
276 points by kddeisz on Oct 11, 2016 | 160 comments

Good and useful advice. Just one minor thing I'd change (copied from a comment that I posted there too):

Don't use -v for version. It's very uncommon (though I know software that does it, it's confusing) and people will often blindly add -v for verbosity. A better alternative is -V (capitalized) and of course --version. These should both work, just like -h and --help should both work.

Funny story about -v. I was using pkill to get rid of some hung processes on a remote server. For some reason I decided to get verbose output and ran pkill -v.

Turns out -v negates the match so I killed every process on the box but one.

I had to email someone to reboot the server.

I've done that exact thing! Took me a few minutes to realize what happened. Luckily it was just a VM.

It makes sense in the context of pgrep, since that emulates grep. But kill does not have a -v flag that means inverse.

I use grep -v all the time and still think it's a really weird choice for the negate option. (It's called "invert match" in the man page; there's at least a v in there.)
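
For anyone who hasn't run into it, here is a small sketch of the inverted match that grep shares with pgrep and pkill. The process names are made-up sample data, not output from a real machine.

```shell
# Hypothetical process list, one name per line (sample data only).
procs='sshd
nginx
hung-worker
cron'

# Normal match: select the lines that contain the pattern.
printf '%s\n' "$procs" | grep hung

# Inverted match: select every line that does NOT contain it.
# This is the sense of -v that the pkill story above ran into.
printf '%s\n' "$procs" | grep -v hung
```

The first command prints only the hung worker; the second prints everything else.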

Arguably grep predates any common practices, but I'd have to look that up.

Oh my gosh. I'm so glad I just read that. For the life of me I can't understand why someone would want to say "let me send a kill to every single process on this host but this one..."

No idea, but there it is:

   -v          Reverse the sense of the matching; display processes that do
                 not match the given criteria.
I'm similarly happy that I read that before I went off and did something stupid.

Sometimes it is easier to specify the reverse regex using the option instead of trying to invert it manually http://stackoverflow.com/questions/466053/regex-matching-by-...

Also, this fits in with advice #3, use common options. Users aren't "blindly" adding -v for verbosity; they're expecting it as a common option. Admittedly, the verbose/version/something else distinction is not as standard as I'd like, but if I were writing this I would have explicitly said in point 3, "-v is verbose. -V is version. -h is help. -?, if implemented at all, is a synonym for --help. -n is dry run, if that makes sense for your command. (By the way, unknown flags should cause the command to fail rather than silently do the wrong thing.) The first instance of -- causes all further arguments to be blindly accepted as positional arguments rather than flags, including further uses of --. If at all possible, all command forwarding (e.g., ssh, shell, autocall) is done using multiple positional params, not nested quoting nightmares."

All agreed. Most importantly, unknown arguments should not go by silently and -- should be implemented.
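
A rough sketch of those conventions with POSIX getopts, in case it helps. The tool name `mytool`, its version string, and the `parse_flags` helper are all invented for illustration; this is one possible layout, not the article's.

```shell
# parse_flags is a hypothetical helper; it prints what it understood
# so the behaviour is easy to inspect.
parse_flags() {
    pf_verbose=0 pf_dry_run=0
    OPTIND=1                      # reset so the function can be called again
    while getopts vVhn pf_opt; do
        case "$pf_opt" in
            v) pf_verbose=$((pf_verbose + 1)) ;;   # -v: verbose, repeatable
            V) echo "mytool 1.0"; return 0 ;;      # -V: version
            h) echo "usage: mytool [-vVhn] [--] [file...]"; return 0 ;;
            n) pf_dry_run=1 ;;                     # -n: dry run
            *) return 2 ;;        # unknown flag: fail (getopts has already complained on stderr)
        esac
    done
    shift $((OPTIND - 1))         # drop the parsed options, including a terminating "--"
    echo "verbose=$pf_verbose dry_run=$pf_dry_run args=$*"
}
```

With this, `parse_flags -vn -- -x file` treats `-x` as a positional argument, and `parse_flags -z` fails with a non-zero status instead of carrying on.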

I find it quite annoying every time I need to check the installed version of java and `java --version` doesn't work (I need to type `java -version`).

I hate `-long` style options too. It's not just aesthetic/confusing, it's objectively worse/more limited:

- It means `-thing` != `-t -h -i -n -g`, so the latter has to be that long;

- You also can't have `readable-long-things-as-an-option`;

- You can't (or not without crazy logic and a confusing UX) have variables passed with no space, like `-ffoobar`

In that third case, I find it crazy logic and confusing UX to have values with no space. If I type mysql -uroot on the command line, that should be equivalent to mysql -u -r -o -o -t but it's not.

Well the argument parser knows that `-u` must take a value, and the only way it can really associate it with its value is positionally (i.e., the value is the stuff that immediately follows it). So it could force you to make the value a separate argument, but that would be a purely arbitrary restriction — it couldn't allow anything but the value to follow, so it'd just have to return an error for the sake of it.

This behaviour is defined in POSIX, btw:

>If the SYNOPSIS of a standard utility shows an option with a mandatory option-argument (as with [-c option_argument] in the example), a conforming application shall use separate arguments for that option and its option-argument. However, a conforming implementation shall also permit applications to specify the option and option-argument in the same argument string without intervening <blank> characters.

Oh, I had never thought of it like that. That makes sense! I just treated those commands (like mysql) as one of those commands where -every -option -must -be -separate.

I suppose I don't mind that syntax anymore now. Thanks for changing my mind :)

It is the (social contract) responsibility of the programmer to document in the usage and in the manual page that -u takes a username as an argument, and where such documentation exists, the convention is that username may be provided as -uuser or -u user.

This is trivially achievable with getopts(1):

  while getopts u: Option; do
    case "$Option" in u) User="$OPTARG" ;; *) exit 2 ;; esac
  done
The colon tells getopts(1) that the -u option expects an argument. This behavior is also mandated by the POSIX specification.
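
To make the two spellings concrete, here is a minimal sketch. The `get_user` function is a hypothetical helper written for this example; only getopts itself comes from the shell.

```shell
# get_user is a hypothetical helper that extracts the -u option-argument.
get_user() {
    OPTIND=1                       # reset so repeated calls start fresh
    while getopts u: gu_opt; do
        case "$gu_opt" in
            u) echo "$OPTARG" ;;   # OPTARG holds the value in either form
            *) return 2 ;;
        esac
    done
}

get_user -uroot     # option and value in one argument
get_user -u root    # option and value as separate arguments
```

Both calls print `root`.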

Don't worry, java makes up for it by putting all the config into flags. At one place, I have a triple-monitor setup, all in a horizontal line. This was still not enough for one java application (elasticsearch? logsearch?) - even stretched across all three screens, I couldn't see the full process string on a single line :)

I've made that mistake at least once, if not more than once.

I agree that `--version` should always be present, but I think `-v` as an alias for version is not uncommon. In fact, I usually try it before I bother with `--version`.

Trying the first few programs off the top of my head I found node, ruby, docker, clang, npm, and even the man command all respond to the `-v` flag. I think that's common enough for me to support it when I don't need a verbosity toggle.

Hmm. I suppose if there is no default action (given no stdin input is provided) then -v might still mean verbose and print a version. Like, running grep by itself does nothing, but grep -v by itself might print a version because it's more info (more verbose) than nothing.

Commands that do something without arguments, like cat (even though it just blocks, waiting for input) might still print the version to stderr I suppose.

That would be a way to solve both cases.

"I agree that `--version` should always be present"

I'm not sure why version information has to have an option on its own. For humans it can be provided in help. Considering the limited usefulness it has as a separate thing, it's just command option garbage.

I thought -v was exclude, like grep and pkill.

-v traditionally means verbose, just like -n usually means dry run, but the application's manual page is always the final and definitive authority on which options the application provides, and what those options do.

The convention is to consult the manual page first.

Those are the exceptions to the rule in my experience. Depending on how old those tools are they might just predate the practice.

I'd expand on this and say every command line program should support GNU getopt_long() syntax, since this is the most popular syntax for command line interfaces and it saves having to guess the syntax when using a program for the first time. This doesn't mean all programs should use getopt_long(), but they should understand all the syntax that getopt_long() understands, such that someone could reimplement the program's command line interface using getopt_long() and get exactly the same behaviour.

This means, for example, supporting both space-separated and equals-separated option values (--name value and --name=value). Some programs only support one or the other, so you have to guess which will be supported when invoking a program for the first time. Both forms have their uses too. --name=value is more explicit and should throw an error if the --name option doesn't take a value, so it's good for use in scripts. On the other hand, a lot of shells don't tab-complete file names unless the file name is in a separate argument, so for interactive use, --name value can be more convenient.
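
For a script that can't call getopt_long() directly, accepting both forms by hand might look something like this. It is only a sketch: `parse_name` is a hypothetical helper, and `--name` and `--quiet` are made-up options.

```shell
# parse_name is a hypothetical helper; --name takes a value, --quiet does not.
parse_name() {
    pn_name=
    while [ $# -gt 0 ]; do
        case "$1" in
            --name=*) pn_name=${1#--name=} ;;      # --name=value
            --name)                                # --name value
                [ $# -ge 2 ] || { echo "--name needs a value" >&2; return 2; }
                pn_name=$2
                shift ;;
            --quiet=*) echo "--quiet takes no value" >&2; return 2 ;;
            --quiet) ;;
            --) shift; break ;;                    # end of options
            -*) echo "unknown option: $1" >&2; return 2 ;;
            *) break ;;                            # first positional argument
        esac
        shift
    done
    echo "name=$pn_name rest=$*"
}
```

`parse_name --name alice file` and `parse_name --name=alice file` give the same result, while `--quiet=yes` is rejected because that option takes no value.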

http://docopt.org/ demonstrates the common convention for CLI interfaces.

And when spawning external programs, always use the long options, and always close with "--" even if there are no options.
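
A small sketch of what that looks like in practice, assuming a grep that accepts GNU-style long options (GNU and BSD grep do). The `count_matches` wrapper is a hypothetical name.

```shell
# count_matches is a hypothetical wrapper around grep.
# --count and --fixed-strings are the long spellings of -c and -F;
# "--" stops a pattern such as "-v" from being parsed as an option.
count_matches() {
    grep --count --fixed-strings -- "$1"
}

printf '%s\n' 'pkill -v' 'grep -v' 'ls' | count_matches '-v'
```

Without the `--`, the pattern `-v` would be read as grep's own invert flag rather than as text to search for.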

I'd expand on this and say every command line program should support GNU getopt_long() syntax, since this is the most popular syntax for command line interfaces and it saves having to guess the syntax when using a program for the first time.

One of the greatest crimes which GNU is not UNIX has committed are the --long-options, and that horrible practice stems from the fact that command line applications on GNU/Linux have poorly designed usage displays, and even worse (often non-existent) manual pages, and when there is a manual page, it lacks an EXAMPLES section, in stark contrast to UNIX systems (Solaris, HP-UX and SmartOS are exemplary).

Ease of UNIX and efficiency thereof comes from brevity, and that particularly concerns command line options: the lowest amount of typing possible is the goal.

The solution is not to invent an arbitrary and horrible convention, but to design better usage output, and to deliver high quality manual pages with lots of examples.

Treat the root cause, not the symptom.

The quality of GNU documentation has absolutely nothing to do with the merits or demerits of option styles. Not sure why you conflate the two.

The GNU style makes things more explicit and readable, which is why people love it -- long options have been implemented far and wide outside of GNU.

And it has one nice feature: It lets you treat short options as concatenable (so instead of "-v -f" you can just do "-vf") without ever accidentally conflicting with long options.

Because --long-option is a way to cop out of writing a good quality usage output and a comprehensive manual page, as well as putting undue burden on the user to type more than is or should be necessary. Why do you think move is mv, list is ls, copy is cp, and remove is rm? And how long did it take to learn cp -p, ls -al, or rm -i? Ergonomy was and remains an extremely important topic in UNIX usage, and --long-option something or --long-option=something completely undermines that.

-vf usage has absolutely nothing whatsoever to do with GNU getopt; it is the core functionality of getopt(1), getopts(1), getopt(3C), and getopts(3C).

No, the names are shortened because the original Unix was meant to be used over slow connections, and every character not typed was time saved in command execution. We don't have those concerns now, and so to make code unreadable for the sake of conciseness is not really useful.

The names were shortened because the terminals had keyboards very similar to a typewriter, and stdout / stderr was a built-in printer, so mistakes were costly, and time consuming:



the ergonomy came first and foremost, not the slow links, and the ergonomy still plays an extremely important role on UNIX and UNIX-like systems, even today, because more typing and mistakes are still time consuming; those factors haven't gone away, and no link, however fast, can help with that.

Ergonomy is extremely important on UNIX. It's one of the major reasons for preferring the command line as opposed to a graphical user interface.

To test this: run non-customized /bin/ksh and see how fast you're able to type long command names without [TAB] completion and command history, and what your input error rate will be.

You can even use a modern shell, like tcsh or zsh which have extremely powerful command history and autocompletion facilities, and see how well you fare with the shell completion of --some-long-option=/path/to/somewhere. You can even use bash and see how well it will complete --some-option for you.

The message here is: if you're developing a command line application, please do not use --long-options, and please consider ergonomy very carefully. Save your users typing, and think about how you can reduce their error rates. --long-option isn't the way to do either of those things, it actually makes things more strenuous and slows one down.

I'm sorry, but that smacks of post-rationalization. Brevity used to be a benefit in the old days with slow connections, but the examples you give are also a small set of super-frequently-used coreutils commands. There are plenty of less core commands where brevity is a hindrance, not a helper, to understanding. For example, git has so many options that short options aren't even feasible for most of them.

As for "cop out", it's not like long options prevent you from writing good documentation! Those are two different, unrelated things. If my tool has an argument "--force", that's sort of self-documenting, to be sure, which "-f" isn't. But most tools do need to document exactly what "--force" does, irrespective of whether it's a long option. GNU's lack of documentation can't be blamed on option style.

Your argument also sets up another false dichotomy: The presence of long options doesn't preclude the possibility of having short ones (--force maps to -f and so on), or indeed the possibility of short command names.

PS. I noticed the other comment about shell autocompletion. I use zsh, and autocompletion of long options works just fine (including just typing -- + tab and getting a list of available options).

> As for "cop out", it's not like long options prevent you from writing good documentation! Those are two different, unrelated things.

I must vehemently disagree with you and anyone else that these are two different, unrelated things, because I assert that there is a direct link between poor or non-existent manual pages and --long-options, and no amount of punting to GNU texinfo, a solution proprietary to GNU/Linux, will change that.

As noted in a related post, brevity in UNIX has to do with ergonomy, and nothing whatsoever to do with link speed. That git has so many options is either a failure of the interface, or poor documentation, or both (after trying out git, I'm completely skipping it and going from Mercurial to Bitkeeper, and the interface is one of the reasons for that).

Actually Mercurial is a perfect example of just how irritating and anti ergonomic long options are, example:

  hg revert --no-backup some/file
so much typing. Unbelievably frustrating and irritating to no end.

As for zsh, how could it possibly know how to complete the long options of every single application on the system? How does that work?

Also a point on your coreutils argument: two letter application names, where names can be meaningfully shortened, is good design every programmer who cares about their and their users' productivity should follow; where that isn't possible, the name should be shortened as much as it can be. For example, I wrote a program which builds a list of SVR4 package dependencies, but instead of naming it find_dependencies, I named it fndeps. The manual page should explain what the program does, not the --long-options; software without a comprehensive manual page on UNIX is fundamentally shoddy software.

The link between long options and poor documentation is something you'll have to prove, I'm afraid. I'm not buying it.

You've also yet to show that long options are some kind of "great crime" that hurts anyone. It's ridiculous. People invented long options because short options are not intelligible unless you already know what they mean.

Your Mercurial example actually undermines your argument completely. I don't know "hg" at all, but when seeing "--no-backup" I immediately understand what that flag does, something which would not be true if the command were "hg revert -C".

Moreover: The fact that "--no-backup" exists does not exclude anyone. You can use "-C". If you can remember it. There's your hallowed ergonomy, you already have it.

As for zsh completion: A program simply needs to follow a specific protocol (take input, reply with the right output), and then be registered with the "complete" command. zsh comes with autocompletion for lots of things.

> Your Mercurial example actually undermines your argument completely. I don't know "hg" at all, but when seeing "--no-backup" I immediately understand what that flag does, something which would not be true if the command were "hg revert -C".

Are you a casual user of command line tools? If you aren't, you know that command line options of anything become muscle memory if you use it long enough; at that point anything that stands between you and the correct result is nothing but a hindrance.

The workflow on the command line is:

1. manual page;

2. SYNOPSIS section in the manual page;

3. EXAMPLES section in the manual;

4. run the application.

If your operating system of choice has inadequate manual pages because developers say "I built in --long-options because I didn't want to figure out how to write a proper manual page", it's time to seriously contemplate ditching such a system, and replacing it with a real UNIX. At that point, when that happens, you lost and you lost big.

> As for zsh completion: A program simply needs to follow a specific protocol (take input, reply with the right output), and then be registered with the "complete" command. zsh comes with autocompletion for lots of things.

So basically there is a hardcoded file with all the programs which the developers or the packagers of zsh were aware of, so that --long-options would work? So now you're assuming that everyone is using the application on the same substrate as you? What if that file isn't there on my OS? And how does that hardcoded hack scale?

Yes, I can remember short command line options of tools I use often enough, and if I don't -- my operating system substrate comes with comprehensive manual pages, so I don't have to.

I am emphatically not a casual command line user who would need --long-options because they're more intuitive, so --long-options suck because ergonomy trumps casual use.

I will never provide GNU style of long options in my software, because I deeply care about my users, and I'll go out of my way to spare them from typing in the long term. What I will do is write a comprehensive manual page that will be a faithful companion and a good friend to them on their journey. And if they just don't care and want to hack something up on the quick -- I'm not that kind of a person, and it's not the kind of a relationship I want to have with my users; they can go somewhere else if they just needed to run some application casually. My tools are designed for long term, efficient use and comfort. Let people type --no-backup all day long "because it's more intuitive". Good luck with that.

Non-existent manual pages? You forget the GNU convention; have a stub manual page that tells you to run "info" instead, so you can use the text browser to navigate 90 separate pages of two sentences each at your leisure.

If this is sarcasm, I can completely empathize with you; if it isn't, then the stub manual page is as good as non-existent, because no matter what, I am not going to waste my time struggling to navigate through a GNU texinfo page; I will just go to a real UNIX operating system which has high quality manual pages[1], and get my work done there, and that will be the end of it.

[1] https://news.ycombinator.com/item?id=11927525

I like having the option to use both short and long forms. For common tasks the short form becomes second nature and saves time. For scripting or rare tasks I appreciate that the long form is self-documenting and less prone to typos.

>Provide long, readable option names with short aliases.

It's unnecessary, and potentially even burdensome, to provide short aliases for less commonly used options. `ls --quoting-style` doesn't need a short option — you're almost never going to need it.

>Provide a version command that can be accessed by version, --version or -v.

Don't use `-v` for that. It's true that some utilities do, but i think even more use `-V` — including most GNU tools, Python, PostgreSQL, cURL, OpenSSH, iptables, iproute2, procps-ng, just about any Symfony Console app, &c. Even if you don't need a verbosity option right now, you might some day, and it's best to have that work as expected.

>Sometimes your script will take longer to execute than people expect. Outputting something like ’Processing…’ can go a long way toward reassuring the user that their command went through.

If you do print status messages like that, make sure to send them to STDERR if your utility outputs any actual data (like a report or file listing). Same for progress meters.
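
Something like the following, as a sketch. `list_squares` is a hypothetical stand-in for a long-running job; the point is only which stream each line goes to.

```shell
# list_squares is a hypothetical job: data goes to stdout,
# the status line goes to stderr.
list_squares() {
    echo 'Processing...' >&2
    for ls_i in 1 2 3; do
        echo $((ls_i * ls_i))
    done
}

# Only the data travels down the pipe; the status line bypasses it.
list_squares | sort -rn
```

Here sort only ever sees the three numbers, so the status message can't corrupt whatever consumes the output.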

>Conversely, don’t exit with a nonzero status code if your CLI didn’t encounter an error. Your cleverness will end up confusing and frustrating your users, especially if -e is set.

Using `set -e` is the 'cleverness' here. It's a bad idea in all but a few cases. Also, i'm not sure what qualifies as 'encountering an error', but i think it's perfectly reasonable to return a non-zero status in certain non-error conditions, like when `grep` doesn't find a match, or in other cases where output is valid but empty.

Stderr should only get error messages, and, arguably, warning messages. It shouldn't get status and progress messages. Those should go to /dev/tty, and the program should be gracefully quiet if run without a terminal, e.g., from cron.
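
One way to get that behaviour, as a rough sketch: probe whether /dev/tty can be opened and stay silent if it can't. The `progress` helper and its message are hypothetical.

```shell
# progress is a hypothetical helper: it writes to the controlling terminal
# if one can be opened and otherwise says nothing (e.g. under cron).
progress() {
    # The probe runs in a subshell so a failed open cannot abort the script.
    if (: >/dev/tty) 2>/dev/null; then
        printf '%s\n' "$*" >/dev/tty
    fi
}

progress 'copying file 3 of 10'
```

Because the message never touches stdout or stderr, redirecting either stream leaves both the data and the error log clean.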

Shouldn't the "verbose" option take care of the presence of the status or progress messages? Also, arguably, the error and warning messages are status messages too, so I would say all the status messages, regardless of gravity, belong to the same channel.

Error and warning messages are not status or progress messages, at least not in the context of this discussion, which is that a long-running computation should indicate to the user that it's doing something.

Example: "warning, the x is in y condition and thus z will take place affecting the computation in progress" or "error, could not perform x (out of many x-like things to do, but I'll continue because you instructed me to)". These are status messages in a long-running computation context.

I disagree. They're a warning and an error message, and should go to stderr.

It may _also_ be useful to show them on the terminal, in case the user would like to see them now, and has redirected stderr somewhere other than the terminal. In that case a program might show them on /dev/tty as well.

> return a non-zero status in certain non-error conditions, like when `grep` doesn't find a match, or in other cases where output is valid but empty

This has bitten me and several colleagues when trying to use grep with hadoop streaming. If any of the input splits doesn't have a match, grep returns an error code, and hadoop interprets that as a failure. We switch to awk in those cases.
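
An alternative to switching tools (not what the comment above did) is a thin wrapper that maps "no match" to success. This is only a sketch, and `grep_filter` is a hypothetical name.

```shell
# grep_filter is a hypothetical wrapper: "no match" (status 1) becomes
# success, while real errors (status 2 or higher) are still reported.
grep_filter() {
    gf_status=0
    grep "$@" || gf_status=$?
    if [ "$gf_status" -le 1 ]; then
        return 0
    fi
    return "$gf_status"
}
```

A caller that treats any non-zero status as failure then only sees genuine errors, such as an unreadable input file.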

> i think it's perfectly reasonable to return a non-zero status in certain non-error conditions, like when `grep` doesn't find a match

To be fair, grep only does that if you explicitly pass it an option whose only purpose is to enable that behaviour.

Not true. This is grep's default behavior. From https://linux.die.net/man/1/grep:

> Normally, the exit status is 0 if selected lines are found and 1 otherwise.

    $ touch bar
    $ grep foo bar
    $ echo $?
    1

    if grep -q foo file.txt; then
        echo found
    fi
Very useful

This is missing something that I highly recommend:

"Start with supporting stdin/stdout as the only input and output. This ensures that it is composable with other utilities. You may find you never need anything else"

Need to read a file? `my-cli < file`

Need to write a file? `my-cli > file`

How about read from a URL? `curl url | my-cli`

Read from a URL:

    vim <(curl -fsL https://news.ycombinator.com)
This creates a temporary file descriptor from the output of the invoked sub shell.

Read standard input in a shell script:

    #!/usr/bin/env bash
    cat <&0 > $(date +%s).txt

While I agree that v0.1 should support stdin/stdout, I don't know that you are serving all of your users well by limiting i/o to ONLY stdin/out.

The two applications for specifying files on the command line are when they actually do something with that file (e.g. move the file) or when you operate on multiple files and/or directories (e.g. backup application).

Otherwise it still might be useful, but in general it's kind of unnecessary. It adds logic to your application that it doesn't need, which violates "do one thing, and one thing very well" (albeit only a very small violation).

ISTM that you're arguing from a perspective of intrinsic necessity. Your argument is that, anyone can cat a file into my utility, if they need that functionality.

Sure. However, for example, GNU sort doesn't work that way. Most utilities don't work that way. Most utilities accept a file as a source of input, and most of those don't act on the inode. That's the status quo.

sort is a bad example: to support sorting in-place, it has to know the file name. Pity the user who typed

    sort < bigfile > bigfile
for said user just lost a file.

A general solution is something like:

  $ sort "$name" | sponge "$name"
Check whether:

  $ sort -o "$name" "$name" 
works with your version of the sort utility.

This has bitten me occasionally, even though I know the workarounds (tempfiles or pipe-consumers like sponge(1)).
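
For reference, the tempfile workaround can be sketched like this. `sort_in_place` is a hypothetical helper, and it assumes mktemp is available (it is not part of POSIX, though most systems ship it).

```shell
# sort_in_place is a hypothetical helper using the temp-file workaround.
sort_in_place() {
    sip_tmp=$(mktemp) || return 1
    sip_status=0
    # The input is read completely into the temp file before the
    # original is opened for writing, so nothing gets truncated early.
    sort -- "$1" >"$sip_tmp" && cat -- "$sip_tmp" >"$1" || sip_status=$?
    rm -f -- "$sip_tmp"
    return "$sip_status"
}
```

Copying the contents back with cat keeps the original inode and permissions, at the cost of not being atomic.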

I'm wondering if there's any practical use for the behaviour, or if it's worth hacking a shell such that it produces a warning/interactive confirm prompt for it (or transparently buffers to DWIM maybe?)

It's pretty common to `set -o noclobber` in beginner dotfiles; doing deferred-open-if-exists is an interesting idea that would probably get a lot of resistance :-)
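
A quick sketch of what noclobber does, run in a subshell and a scratch directory so nothing leaks into the calling shell. `noclobber_demo` and `data.txt` are made-up names, and mktemp -d is assumed to be available.

```shell
# noclobber_demo is a hypothetical demo; the ( ) body runs in a subshell.
noclobber_demo() (
    nd_dir=$(mktemp -d) || exit 1
    cd "$nd_dir" || exit 1
    set -o noclobber
    echo first >data.txt
    # With noclobber set, a plain ">" refuses to overwrite an existing file.
    if (echo second >data.txt) 2>/dev/null; then
        echo overwritten
    else
        echo refused
    fi
    # ">|" is the explicit override.
    echo third >|data.txt
    cat data.txt
    cd / && rm -rf "$nd_dir"
)
```

Calling `noclobber_demo` reports that the plain redirect was refused and then shows the content written through `>|`.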

A third is "it uses multiple files (and the behavior can't be replicated by concatenating them)."

Picking one input to still be consumed from stdin can make sense, though. A special case of this is config files, which are almost never read from stdin (they usually have a default location or several).

Outputting to stdout gives the control and power to the user - if they want it in a file which they want to name, they can do that; if they want to pass the output as input to another application - they can do that too.

There are few usage scenarios where such behavior isn't enough, like for example fsck, but even there this paradigm is flexible enough to work - for such applications could be split into an analyzer and a repair program. There is nothing stopping one from outputting a binary data stream on stdout; Lots of applications do exactly that on UNIX, compressors come to mind.

What use case have you where stdout/stdin/stderr isn't enough on an operating system family where the core paradigm of usage is that everything is a binary stream of data?

> Outputting to stdout gives the control and power to the user - if they want it in a file which they want to name, they can do that; if they want to pass the output as input to another application - they can do that too.

Please note that we already agree here.

> What use case have you where stdout/stdin/stderr isn't enough on an operating system family where the core paradigm of usage is that everything is a binary stream of data?

My argument isn't that I have a use case where it 'isn't enough.' My argument is that many people come to a cli utility expecting that <foo /path/to/myfile> simply works.

> My argument is that many people come to a cli utility expecting that <foo /path/to/myfile> simply works.

So are you for or against that? If you're for the < /path/to/my/file argument, then we are in complete agreement. A command line application should read from stdin where that applies.

It seems to me there was confusion here caused by an unfortunate choice of angle brackets for quoting.

I think `foo /path/to/myfile` is what the parent was trying to say.

It was. Thank you.

I want stdin to work. I also want specifying a file name to work. I think supporting both is good UI design, and follows the principle of least surprise.

Though, you still should provide a way to read from file/write into file. I prefer passing "-" or /dev/stdin (/dev/stdout) for that purpose. Just for the sake of debuggability, if someone would like to debug your script with a debugger/debug prints/debug reads

Why not both? Many of my favorite utilities are stdin/out by default, have file parameters, and will accept '-' to go back to stdin/out.

This is highly useful if you want to allow composition into scripts without forcing users to dynamically build parameters. Meaning, they can use, e.g. `FILE_NAME=${FILE_NAME:-"-"}; some-util --output $FILE_NAME;` and not have to decide whether or not to use the --output parameter in their script.
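
On the input side, the same convention is only a few lines of shell. A minimal sketch, with `read_input` as a hypothetical helper:

```shell
# read_input is a hypothetical helper: it accepts a file name, "-",
# or no argument at all.
read_input() {
    ri_file=${1:--}            # default to "-" when no argument is given
    if [ "$ri_file" = - ]; then
        cat                    # "-" means standard input
    else
        cat -- "$ri_file"      # anything else is treated as a file name
    fi
}
```

So `read_input file`, `read_input - < file`, and `read_input < file` all behave the same way, and callers never need to build the argument list conditionally.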

Is that not why you have stderr?

Is there a simple way to process a list of files with this restriction? Something like `ls | xargs $(my-cli < $MAGIC_XARGS_VAR > $MAGIC_XARGS_VAR)`?

I know that it could be done with a loop, but that's a little unwieldy compared to `ls | xargs my-cli`

If processing the list of files is equivalent to processing the concatenation of the list of files, then you can `ls | xargs cat | my-util`. If it is not, then as I argued in a cousin comment I think that's another good case to support accepting file names.

>`ls | xargs cat | my-util`

lol, this is really turning the whole 'useless use of cat' thing on its head.

`cat` is short for conCATenate. Here it is being used to concatenate files. This is the canonical useful use of cat.

The problem with that example is parsing `ls` output.

> The problem with that example is parsing `ls` output.

Yeah, I left that alone to keep things simpler and was assuming `ls` was simply being used as short-hand for "some unspecified approach to outputting file names". In a context where hard-to-handle filenames are possible, you'd of course need to do something a little more robust.

Yes, that's what i was getting at. The use of `cat` isn't useless (for once) — the use of `ls` and `xargs` is.

so, how to avoid parsing ls? maybe use cat *, or something?

It really depends on what you are trying to accomplish.

`cat -- *` might be the right choice. Maybe `find . -maxdepth 1 -type f -exec cat -- {} +`
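
If the loop form is acceptable, a glob avoids parsing `ls` entirely, since the shell expands names without splitting them on spaces or newlines. A sketch, with `concat_files` as a hypothetical helper:

```shell
# concat_files is a hypothetical helper: concatenate the regular files
# directly inside the directory given as $1.
concat_files() {
    for cf_f in "$1"/*; do
        # Skips directories, and the literal pattern when nothing matches.
        if [ -f "$cf_f" ]; then
            cat -- "$cf_f"
        fi
    done
}
```

Like `cat -- *`, this skips dotfiles, so it isn't a drop-in replacement for the find version.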

> 12. Write to stdout for useful information, stderr for warnings and errors.

I would constrain this even further to "Write to stdout only if information is useful as input to another program."

Even useful information for a human reader can make downstream integration overbearingly complex if the information is intermingled with a lot of extraneous, albeit human-readable, information. Machine-readable layouts (structured somehow: csv, tsv, json, xml, etc.) are vastly more useful for integration.

>I would constrain this even further to "Write to stdout only if information is useful as input to another program."

May not always be possible to know this in advance.

Might be of interest:



Also, stdout and stderr can be redirected independently, and either one can be redirected to the other (cmd args 2> file, 2>&1, 1>file, 1>&2, etc.). There are even more advanced options. "man bash" has the details.

Edit: Fixed a URL typo.
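
One detail worth a quick sketch is that redirections are applied left to right, so their order matters. `both` is a hypothetical command that writes one line to each stream.

```shell
# both is a hypothetical command that writes one line to each stream.
both() {
    echo out
    echo err >&2
}

both 2>/dev/null          # keep stdout, drop stderr
both 2>&1 >/dev/null      # stderr goes where stdout WAS pointing, then stdout is dropped
both >/dev/null 2>&1      # stdout is dropped first, then stderr follows it: silence
```

The second form is the usual trick for piping only the error stream into another program.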

> I would constrain this even further to "Write to stdout only if information is useful as input to another program."

No. `foo --help' should never ever write to STDERR. It breaks piping the message to `less', so I need to either juggle with file descriptors (bash) or use special syntax sugar for this juggling (zsh).

Whatever lands in `less' can hardly be considered "input to another program", because it's mainly a message for a human that just happens to be paged.
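
A sketch of the split: help that was asked for goes to stdout with a zero status. Sending the same text to stderr on a usage error is my assumption of common practice, not something stated above. `usage`, `main`, and `mytool` are hypothetical names.

```shell
# usage and main are hypothetical; "mytool" is a placeholder name.
usage() {
    echo 'usage: mytool [--help] file'
}

main() {
    case "${1-}" in
        --help|-h) usage; return 0 ;;       # requested help: stdout, success
        '')        usage >&2; return 2 ;;   # missing argument: stderr, failure
        *)         echo "processing $1" ;;
    esac
}
```

With this, `main --help | less` pages the text without any descriptor juggling.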

Completely agreed. In addition, the following point

   > 8. Don't go for a long period without output to the user.
should be controlled by a switch and off by default. I hate chatty programs (if only there were an easy way to tell git to shut up).

on by default. If it's too noisy, you can look for how to make it shut up, and you will find it. If it's too quiet, you will look for why it's hanging, and not find it (because it's not hanging, just slow).

Shouldn't all programs be, besides their functional output, quiet by default? Isn't that why the "verbose" option exists?

Have you ever used wget? Do you think it's too chatty?

There is a signal, SIGINFO. For example, see BSD's `cp(1)'

     If cp receives a SIGINFO (see the status argument for stty(1)) sig-
     nal, the current input and output file and the percentage complete
     will be written to the standard output.

That is seriously neat. But! The user has to know this. Surely, if you're trying to copy a 6GB file, it's better to take a lesson from wget and print a progress indicator than force the user to (1) read the man page, and (2) keep polling the program to ask how it's doing.

"Write to stdout only if information is useful as input to another program."

Maybe devise a new std console channel for this integration purpose alone? "stdipc" (as in standard inter-process communication) maybe?

That's why they're called standard INput and standard OUTput.

"ipcin" and "ipcout" then?

I strongly disagree. A CLI is fundamentally for the benefit of humans interacting with the application, and not for the benefit of interprocess communication. When run interactively, a CLI should print human-useful information on the console.

Conversely, a structured way of exposing program outputs and state should be the preferred way of interprocess communication.

Because of convention and deliberate design choices, on the Unix command line, these two often find themselves in conflict. In my opinion, the solution isn't to compromise human usability to support machine-consumability of outputs.

For one, STDOUT and STDERR both go to the console, so it's very feasible to do both. Additionally, there are quite a few machine-readable layouts that are very very human readable (table or tsv, for example, as a huge swath of existing CLI programs output by default).

I can very easily understand the output of `ps` or `ls` and I can very easily capture their output with a subshell: `for file in $(ls)` or `for process in $(ps -opid | tail -n +2);`, etc.

The more ways there are to accomplish the same goal, the greater the number of users who can figure out some way of automating what they want to do. So yeah, make output format an option, with `--output` or `--format` for those who want to use `jq` or some other tool, but strongly prefer defaulting to table output so that standard tools (awk, cut, etc.) can be integrated easily.


A CLI is fundamentally for executing from the command line. Thus, embracing the power of the command shell is best practice.

If you want a human to see it, put it on stderr.

If you want a machine to see it, put it on stdout. Also I strongly agree with making it "structured" output. At minimum, a line-oriented record output is fairly easy to process downstream.
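For what it's worth, here is a minimal sketch of that split in Python. The `report` helper and the name/size record layout are made up for illustration; the only point is that records go to stdout, one per line, and the narration goes to stderr.

```python
import sys

def report(records, out=sys.stdout, err=sys.stderr):
    """Write one tab-separated record per line to out; narrate on err."""
    print(f"processing {len(records)} records...", file=err)
    for name, size in records:
        # Data only on stdout, so `prog | cut -f1` never sees the chatter.
        print(f"{name}\t{size}", file=out)
    print("done", file=err)
```

Piped into `cut` or `awk`, only the records travel down the pipe, while the progress lines still show up on the terminal.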

EDIT: If it is an interactive console application (REPL), then stdout is okay i guess.

Would a good compromise be to provide -q|--quiet to suppress extra, human readable messages when piping out to a different app?

Personally, I'd just say use a tab-delimited table by default. And put columns that may contain whitespace last. This way it's human readable and more easily awk/cut parseable.

Unfortunately, with e.g. `docker ps`, I can't do as much as I'd like, because it's purely human readable by default. For example:

    docker ps | awk '{print $7}'

Another option is looking at isatty. I'm not sure how I feel about either.

Please provide a flag that bypasses the isatty check. I use nohup for anything slow or hard to rerun and then tail and/or grep the nohup.out file.

Yeah, I think that's definitely good advice. isatty can be a good way of deciding which defaults to use, but it shouldn't be the only way of accessing functionality. Note that the flag might say "pretend isatty said yes/no", or there might be a few flags that independently mediate all of the things set based on isatty.
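A rough sketch of "isatty picks the default, a flag overrides it", using Python's argparse. The `--progress` / `--no-progress` names are hypothetical; pick whatever fits your tool.

```python
import argparse
import sys

def want_progress(argv, stream=sys.stderr):
    """Explicit flag wins; otherwise fall back to whether stream is a TTY."""
    parser = argparse.ArgumentParser()
    # Hypothetical flag names. Both write to the same destination, and
    # None means "the user did not say either way".
    parser.add_argument("--progress", dest="progress",
                        action="store_true", default=None)
    parser.add_argument("--no-progress", dest="progress",
                        action="store_false", default=None)
    args = parser.parse_args(argv)
    if args.progress is not None:
        return args.progress
    return stream.isatty()
```

With this shape, something run under nohup can still ask for progress output explicitly, and an interactive user can switch it off.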

It may be even better to be quiet by default and explicitly turn on the verbose mode when the "extra, human readable messages" are actually needed.

To suppress the human messages while piping stdout, I add `2> /dev/null`.

There are plenty of times that that's appropriate, but note that it also hides genuine error messages on failure. It's not a good solution for making output less chatty - that's precisely what a `-q` option is for.

yes, that would be a good compromise

I used to use Ruby for CLIs. Then I started using Golang and delivering compiled binaries for each OS that I supported. This has been a game changer for me. With the CLIs that I've worked on that are open source https://github.com/RiotGamesMinions/motherbrain and some other ones in Ruby, we consistently had issues with rubygems and having people be able to run our CLI from version to version.

Using something like Golang for distributing compiled binaries means that as long as they have the CLI, it will continue to work. With Ruby, Python, PHP, etc, there is absolutely no guarantee that your application will work in the future.

I'm not a Go user but I agree. While I make Node CLIs for myself because I'm very quick with JavaScript, I'd choose Go if I was making something for a lot of people!

Go has been a game changer for me also. There is also a very good package https://github.com/spf13/cobra that can make the development of complex CLIs easier.

Go's default flag package is pure garbage though. That's a fact. However, Go cross compilation makes it easy to develop multi-OS CLI tools. The downside is the size of the executable: it's easy to reach the 50MB bar with Go binaries.

Yes. The default "flag" package is trash. I like the Go team a lot, and I think they did a great job with many things, but it seems to me like they just said, "meh, get something that can work, usability is not something that matters."

consider the history, it probably makes sense if you're used to research unix and plan9

How exactly do you create a program in go that compiles to a 50mb exe?

Why is the flags package trash?

One trend I've really not appreciated in recent years is neglecting man pages for command line utilities.

Having a `--help` option is great as a cheat sheet, but nowadays these help outputs are much too large and hard to navigate because.

Alongside complete man pages, help commands can be helpful high level overviews focusing on common use cases with examples.

That's been a long-standing beef of mine for well over a decade. There are several apparent sources of this.

The GNU project deprecates man in favour of info. This is a category error for numerous reasons. The good news is that the documentation still exists, man can be configured to fall back to 'pinfo' or similar, and projects (Debian are particularly good about this) can schlep the Info format into a manpage.

There's an argument in favour of info: it was a hypertext document format which is, arguably, more powerful than man. On the other hand, it has a format-specific reader (info), there's a far more dominant hypertext document format (HTML), and there are utilities to provide manpages in HTML format and over a local HTML server (dwww:https://packages.debian.org/stable/doc/dwww). GNU should have bailed on this decades ago.

Several projects, notably GNOME (a GNU project) and many Red Hat utilities, lack good / updated / any manpages. This is particularly frustrating. KDE similarly fails frequently to provide manpages.

Again, Debian frequently will create and provide manpages, but they will frequently run behind package development, so new features aren't clearly documented.

Increasingly, standalone packages (say, imagemagick) don't fully document themselves in manpages. Worse are vendor utilities which lack any manpage, a useful --help or -? -h page, or anything else vaguely resembling sane practices.

This is entirely inexcusable.

Write a man page, but it's 2016, you don't have to write it in ROFF... It got a lot easier to get people to write or update man pages for a bunch of in-house CLI tools when I found ronn https://rtomayko.github.io/ronn/ which starts from markdown (you'll still get a lot of copy/paste formatting from your other man pages but at least it'll be readable.)

Thanks for saying this, I was about to say the same thing. Write a man page. --help or -? is OK for a terse summary but don't put detailed instructions in --help output, because then I have to run it twice (since I didn't add '| less' the first time).

Maybe including dependent switches to enable/disable different parts of shown information? For example just "--help" would show a short description and the existing switches for further debriefing. Then you'll have to type "--help --commands" for showing a list of commands and description in one or two words, or "--help --examples" to show a whole lot of examples, and so on.

Very good article.

Another point that I think is valuable is: "Output one line per record data"

This is useful if your CLI has to output data to the user. It makes it easier to interface with programs like `grep`.

I'd add to that an option to delimit records with NUL instead of newline and treat all other characters as literals.
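Something like this, as a sketch in Python. The `null` parameter stands in for whatever switch you'd expose (the precedent being `find -print0` feeding `xargs -0`); the function name is made up.

```python
import sys

def write_records(records, out=sys.stdout, null=False):
    """Emit one record per line, or NUL-terminated when null=True."""
    # NUL cannot appear in a filename, so it is a safe terminator even
    # for records that contain spaces or newlines.
    sep = "\0" if null else "\n"
    for record in records:
        out.write(record + sep)
```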

Suppressing the header line and any fancy formatting jazz is a nice idea as well, since otherwise there's invariably some buggery needed to strip it like:

    ps | tail -n +2 | cmd_with_just_records

or, if you're doing column extraction as well, I like:

    ps | perl -lanE'say $F[0] if 2..eof'

[ps used here as a stand-in for some other command that has a fixed header. I'm aware that you can omit it via the somewhat long-winded `ps -o "pid= command= [...]"' there, but afaik there's no simple switch for 'do what you were going to, just without a header line at the top']

> 5. Don’t have positional options.

But also, don't conflate options with arguments. Options should be optional (the clue is in the name) and prefixed with - (single letter options) or -- (long form options). Arguments are typically positional, but as the author says, they can become confusing if there are too many, or the ordering is non-obvious, in which case the syntax may need a rethink.

A blanket "don't use positional arguments" would be tedious though. For example, if I ignore what I say above, imagine having to type "cd --directory=foo" or "cp --source=foo --destination=bar" all the time.

> "cp --source=foo --destination=bar"

Cf. `dd if=foo of=bar`

Not really disagreeing, just adding context.

Then maybe one should also add the bit of context that dd's command line syntax was intended as a joke, a pun on some job control language of yore.

Oh, really? Interesting. While... odd, I always thought it appropriate to be extra explicit with dd, where for some common uses you've the chance of nuking the wrong disk.

Do you know which job control language of yore?

OS/360 JCL, apparently. At least Rob Pike wrote something to the effect on his Google+.

dd is horrible on purpose. It's a joke about OS/360 JCL. But today it's an internationally standardized joke. I guess that says it all.

How the hell does one direct-link to a G+ comment? Well, click the thing that says "View 384 bazillion previous comments" and then search for dd.


Semi-related: if you write CLI tools with nodejs, I highly recommend using yargs and/or inquirer - yargs for robust command line arguments parsing, and inquirer for interactively asking questions to the user.



When using these tools correctly (following readme), you get most of the stuff written in the article out of the box.

Ew. Most programs shouldn't ask me questions.

They should print a usage statement if I haven't provided enough info.

Yep, they're commands, not conversations.

And if you want something a bit above yargs (and to prove the point of JS having many solutions to everything!) use `meow`:


cli-kit is rather comprehensive, well-thought out, and modular too if you're looking for Node options:


Most, I agree with, others, I don't.

> 1. Every option that can have a default option should have a default option.

The exception to this are output files. A program should only output to a path that has been explicitly given to it. Too many scripts litter their working directory with output files.

> 8. Don't go for a long period without output to the user.

If nothing's wrong, then nothing needs to be printed. Especially once a script gets called from another script, I don't want to have any output from a script unless it requires my attention.

Default output files: that's what stdout is for.

> I don't want to have any output from a script unless it requires my attention.

I lean this way, too. That said, a rare "this may take a while" before things that may take a while might do more good than harm. Another option is letting the user ask for a status update (maybe with a signal, like dd?).

Most, I agree with, others, I don't.

This is the fundamental reason why I've grown tired of the phrase "best practices". They probably are the best you can come up with, but that doesn't mean they're the best (or sometimes even close) for everyone else.

I think the discussions are useful, even if the conclusions aren't. Discussing best practices gets everyone to bring up the pros and cons of each. Then, when confronted with a new area, one can apply those arguments to figure out a decent set of best practices for that particular field.

Related to CLI best practices, is there any good styleguide or "best practices" list for formatting help and text output?

I've only written a handful of small CLI utils, and each time I end up looking through tools I regularly use and adapt ideas from each. But this is more time consuming than being able to run through a list of suggestions prepared by someone with more experience in this domain.

POSIX talks about it a little bit: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_...

Some argument-handling libraries like Python's argparse and PHP's Symfony Console generate usage help for you, and they all use a relatively similar formatting that i think was inspired by the formatting of man pages: http://man7.org/linux/man-pages/man7/man-pages.7.html

docopt (http://docopt.org/) is an attempt to standardise it from the other way around — instead of you defining the options in the code and it generating the usage help for you, you write the usage help and it generates the code to define the options. But i don't know that it's super widely used.

My favourite annoyance is using the option syntax for things that are not optional.

If it really is a command, then require a command without the -- or -.

Today's examples:

1. centos6/redhat6 /sbin/chkconfig; it needs one of --list or --del or ... so why not just list or del or ?

2. the kafka command line tools, e.g. kafka-topics. It requires --create or --delete or --list ... So just use create/delete/list/... !
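For comparison, this is roughly what the "bare subcommand" shape looks like with Python's argparse. The `topics` program name and its three subcommands are hypothetical stand-ins, not the real kafka or chkconfig interfaces.

```python
import argparse

def build_parser():
    """Build a parser where exactly one subcommand is mandatory."""
    parser = argparse.ArgumentParser(prog="topics")  # hypothetical tool name
    # required=True makes a missing subcommand a usage error rather than
    # something the program has to check for by hand.
    sub = parser.add_subparsers(dest="command", required=True)
    create = sub.add_parser("create")
    create.add_argument("name")
    delete = sub.add_parser("delete")
    delete.add_argument("name")
    sub.add_parser("list")
    return parser
```

So `topics create events` and `topics list` parse, and a bare `topics` fails with a usage message instead of silently doing nothing.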

"it needs one of --list or --del or ... so why not just list or del or ?"

That's all well and good, but what about parameters that are optional only in the sense that specifying them explicitly deviates from an existing default value?

I find myself using environment variables much more than commandline arguments when writing CLI programs:

- Env vars provide a key/value interface, which is trickier to handle with arguments (e.g. "get the argument immediately after '-f'").

- Env vars are in-scope at the point they're needed, whereas non-trivial arguments usually have to be parsed up-front all at once and passed into the processing code somehow. Yes, it's good to check things up-front; we can check mandatory/mutually-exclusive env vars up front too, with the bonus that we can keep such 'invocation checking' separated from the rest of the processing.

- Setting env vars in preparation for a subprocess call can be done incrementally, by extending the environment over and over until we're ready. In contrast, commandline arguments must be accumulated into a single (correctly parsable) array.

- Sharing env vars across a program and its children is easy (e.g. "VERBOSE=1 ./foo.sh").

- We can use env vars to control a program even when it's called deep in the bowels of something else, without having to fiddle with all of the layers in between.

I find this works very nicely, as long as we:

- Treat the environment as append-only, i.e. no changing a variable once it's been set. We can append new variables, e.g. "export FOO=bar", and we can invoke processes with extra variables, e.g. "FOO=bar ./baz.sh", but we don't unset or alter a variable.

- Don't use env vars for communication within a process, i.e. we can use the environment variables we're invoked with, and we can append new variables for use by the programs we invoke, but within a program we should refrain from setting variables which affect our own execution (since that causes the coupling and spooky-action-at-a-distance which globals are notorious for).

What are others' thoughts on this?
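To make the "append-only" rule concrete, here's a small Python sketch of invoking a child with extra variables while refusing to change anything already set. `run_with_env` is a made-up helper, and refusing outright (rather than warning) is just one way to enforce the rule.

```python
import os
import subprocess

def run_with_env(cmd, extra):
    """Run cmd with extra variables added to, never replacing, the environment."""
    clash = [key for key in extra if key in os.environ]
    if clash:
        # Append-only: an already-set variable is never altered.
        raise ValueError(f"refusing to override: {clash}")
    env = {**os.environ, **extra}
    return subprocess.run(cmd, env=env, capture_output=True, text=True)
```

This is the programmatic equivalent of `FOO=bar ./baz.sh`: the parent's own environment is left untouched, and only the child sees the addition.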

I think if we make too much of a habit of this, we'll wind up with painful collisions. It also does the wrong thing with no feedback if you try to set a shell variable that isn't actually recognized (and possibly if you misspell the option name, &c).

That said, I agree that it can work quite nicely in the small, but it should probably be reserved for things like VERBOSE where you are likely to want propagation.

> Env vars provide a key/value interface, which is trickier to handle with arguments

Many (most? all?) languages practical for assembling a shell utility have a getopt-ish library that should make this easier (though still likely not as easy as grabbing from a shell var, granted).

> (e.g. "get the argument immediately after '-f'")

Is there a language commonly used for cli tools that doesn't have an inbuilt options lib? Even bash has inbuilt 'getopts' to handle this for you.

    while getopts "ab:" option; do
        case $option in
            a) VAR="foo";;          # -a takes no argument
            b) VAR2="$OPTARG";;     # -b takes one argument
        esac
    done
    shift $((OPTIND-1))             # drop the parsed options

which will allow you to "myscript -a -b myarg"

By all means, use env vars if it works for you, but it should be right tool for the right job. Imagine trying to use 'ls' with env vars...

> Is there a language commonly used for cli tools that doesn't have an inbuilt options lib?

Yes, namely C/C++ under the Visual Studio toolchain: https://stackoverflow.com/a/12689342

I stand corrected. I thought I might get burned by C, but didn't expect it to bring Windows in on the action :)

Interesting. I'll have to parse (heh) your comment, but for now, can say that IIRC, environment variables, command line options, and one other thing (maybe config or .*rc - for Run Command - files, such as .exrc, .vimrc, .bashrc) are supposed to form a hierarchy, such that A can override B can override C (something like global and local CSS styles). And out of two of them at least, command-line options can override the values of environment variables, maybe since they can be changed more easily on a per-command-invocation basis.
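If it helps, one ordering I've seen often (treat it as an assumption, not a standard) is command-line flag over environment variable over config file over built-in default. As a Python sketch, with plain dicts standing in for the three sources:

```python
def resolve(name, cli, env, config, default=None):
    """Return the first value found: CLI flag, then env var, then config file."""
    for source in (cli, env, config):
        value = source.get(name)
        if value is not None:
            return value
    return default
```

For example, `resolve("color", cli={}, env={"color": "never"}, config={"color": "auto"})` picks the environment's value, and passing the flag on the command line would beat both.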

Reminds me of this super useful thread from SO http://programmers.stackexchange.com/questions/307467/what-a...

As is typically the case in software, different standards tend to emerge from different projects and their associated tools.

Two of the primary diverging Linux/Unix standards are FreeBSD (arguably the older, from AT&T & BSD traditions) and GNU.

The GNU programming standards specify conventions for command line interfaces in section 4.7, strongly influenced by GNU getopt(3). https://www.gnu.org/prep/standards/html_node/Command_002dLin...

This references the Table of Long Options (this needs to be worked in to Game of Thrones): https://www.gnu.org/prep/standards/html_node/Option-Table.ht...

For FreeBSD, the equivalent guide appears to be (I'm not an expert at this, so salt appropriately) the kernel source file style guide: https://www.freebsd.org/cgi/man.cgi?query=style&sektion=9&ma...

There may be another source.

I've seen other significant projects impose their own argument styles. Among those:

MIT/X11 has its own family of arguments.

Various toolkits seem to engender their own styles.

The GNOME and KDE projects have tended to both adopt idiosyncratic and largely undocumented argument formats. I find both tremendously frustrating.

Major applications often develop yet more idiosyncratic command and argument syntax. Chrome, Firefox, and LibreOffice come to mind.

I'm going to pretend Java doesn't exist at all. Nope. It's a myth.

The notorious 'dd' owes its syntax to mainframe JCL notation, from which it is derived.

Providing command line completion should be on this list!

How would you do that unless you were changing the CLI itself?

Modern shells, such as bash or zsh, support specifying completion through external files or scripts that the shell can parse. Having never written one, I'm not familiar with the exacts, but suffice it to say the right file in the right location with the right contents can inform the shell as to how to auto complete.

i.e., the auto-completion facilities are general / extensible.

Especially if many programs follow a general format of

  program subcommand arg arg arg --optional-flag --option
(my personal favorite, as I find it most clear; followed by e.g., argparse in Python, git, many GNU utilities)

then it should be easy to see how a small specification of what subcommands take what for args or options should be enough to enable a pretty powerful auto-complete. (This is an example; I think zsh's autocompleters are actually small scripts; see https://github.com/zsh-users/zsh-completions/blob/master/zsh...)

(This, in zsh, combined with zsh's fuzzy autocomplete, is amazing.)

For bash:

    $ help complete
    complete: complete [-abcdefgjksuv] [-pr] [-DE] [-o option] [-A action] [-G globpat] [-W wordlist]  [-F function] [-C command] [-X filterpat] [-P prefix] [-S suffix] [name ...]
        Specify how arguments are to be completed by Readline.
        For each NAME, specify how arguments are to be completed.  If no options
        are supplied, existing completion specifications are printed in a way that
        allows them to be reused as input.
          -p	print existing completion specifications in a reusable format
          -r	remove a completion specification for each NAME, or, if no
        	NAMEs are supplied, all completion specifications
          -D	apply the completions and actions as the default for commands
        	without any specific completion defined
          -E	apply the completions and actions to "empty" commands --
        	completion attempted on a blank line
        When completion is attempted, the actions are applied in the order the
        uppercase-letter options are listed above.  The -D option takes
        precedence over -E.
        Exit Status:
        Returns success unless an invalid option is supplied or an error occurs.

There are a lot of canned completion approaches, usually tweakable in small ways, but for heavy lifting `complete -F foo bar` will call the function `foo` when you are asking for completion of a command where the first word is `bar`. Information on what words are already on the line, where the cursor is, etc is passed in in shell variables starting with COMP_. See the "Programmable Completion" section of the bash manpage for much more detail.

Note that it's settings for a specific shell process, not global across all instances of bash. "The right file in the right location" is only relevant 1) to set up your shells a particular way by default, and 2) if the function called references a file.

Tying this back in to the earlier discussion, many distros provide standard places for installed packages to drop scripts, which will be sourced during shell startup and which can thus configure the shell appropriately each time. "Providing command line completion" for your utility, therefore, means providing any such scripts as are appropriate, and setting up your packaging (or install scripts) to install them.

For in-house stuff, I also run by the rule that any tool should be safe to run if there are no args provided. Run any of my tools, and if it's potentially destructive, you'll only get a help screen if there are no args provided. Even tools that don't need any args to function will still require a misc arg if they might theoretically damage something.

The theory here is that these tools aren't well-known, so it's an extra level of safety. It also means that colleagues don't have to hunt me down to ask about what it does...
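A sketch of that rule with Python's argparse, in case it's useful. The `purge-cache` tool and the `--go` flag are hypothetical; the idea is only that a bare invocation prints help and touches nothing.

```python
import argparse
import sys

def main(argv, out=sys.stdout):
    """Do nothing but print help unless explicitly told to go ahead."""
    parser = argparse.ArgumentParser(prog="purge-cache")  # hypothetical tool
    parser.add_argument("--go", action="store_true",
                        help="actually perform the purge")  # hypothetical flag
    args = parser.parse_args(argv)
    if not args.go:
        # Bare invocation: explain what the tool does and stop.
        parser.print_help(file=out)
        return 0
    print("purging...", file=out)  # the destructive work would go here
    return 0
```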

Great advice on the whole!

> 3. Use common command line options.

The GNU options standards are good, but see also the "Command-Line Options" chapter in "The Art Of Unix Programming" for some of the reasoning and history behind these conventions. Actually -- if you're writing unix programs often, do yourself a favor and read all of TAOUP. It really helps crystallize the Unix Philosophy in a way that a grab bag of suggestions doesn't.


> 4. Provide options for explicitly identifying the files to process.

Lots of options pointing to different files is a CLI smell. If your program "does one thing and does it well" (the Unix Philosophy), it often works like a filter: take one or more input files, apply the same transformation to them, output the result. `cat`, `head`, `tail`, `grep`, `cut`, `sort`, `join`, `sed`, `gzip`, `tar`, `cc`, `curl`, `md5sum` -- at the heart of all these programs is a unix filter.

In this case, just have your program accept input files as positional arguments beyond the last option. A nice side effect is that your program will be `find | xargs` -friendly. That puts the user in total control of the inputs. It also means they can parallelize your program with `xargs -P`!

Also consider whether you need input files at all! Can your unix filter program just transform stdin to stdout? If so, it can be used to process streams of data much larger than available disk space. Often you can trick programs that require a file argument by passing `/dev/stdin`, but note that `/dev/stdin` won't be accessible in some environments.

> 8. Don't go for a long period without output to the user... Outputting something like ’Processing…’ can go a long way toward reassuring the user that their command went through.

I get the reasoning behind this, but as a heavy CLI user I'd prefer if programs didn't crap progress all over the terminal by default. If I want verbose progress, I'll pass `-v`. If I want even more insight into what's happening, I'll pass `-v -v`.

See Rule Number 11 in TAOUP:

Rule of Silence: When a program has nothing surprising to say, it should say nothing.
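The `-v` / `-v -v` convention is cheap to support. A minimal sketch with Python's argparse and logging; the mapping of counts to levels is my own choice, not anything standardized.

```python
import argparse
import logging

def log_level(argv):
    """Map repeated -v flags to a logging level; quiet by default."""
    parser = argparse.ArgumentParser()
    # action="count" turns -v, -vv, or -v -v into 1, 2, 2.
    parser.add_argument("-v", "--verbose", action="count", default=0)
    args = parser.parse_args(argv)
    levels = [logging.WARNING, logging.INFO, logging.DEBUG]
    return levels[min(args.verbose, len(levels) - 1)]
```

With no flags only warnings and errors get through, which keeps the default in line with the Rule of Silence.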


> 10. For long running operations, allow the user to recover at a failure point if possible.

One way to perform a sequence of expensive steps iff they haven't been done yet ("recover at failure points"): orchestrate it with a Makefile.

>> 8. Don't go for a long period without output to the user... Outputting something like ’Processing…’ can go a long way toward reassuring the user that their command went through.

>I get the reasoning behind this, but as a heavy CLI user I'd prefer if programs didn't crap progress all over the terminal by default. If I want verbose progress, I'll pass `-v`. If I want even more insight into what's happening, I'll pass `-v -v`.

>See Rule Number 11 in TAOUP:

>Rule of Silence: When a program has nothing surprising to say, it should say nothing.

Agreed. In fact, informally, that rule goes back to much before TAOUP was written, I think. (TAOUP may have only formalized it.) I remember something like it from classic The Unix Programming Environment book (UPE, I first read it several years ago) which I referred to in another comment in this thread.

IIRC, in UPE it may have been phrased a bit differently, that's all - something like:

if the program succeeds, in some cases don't generate any output, at least on stderr (of course, if the program generates output as part of its normal behavior, as filters and some other programs do, then it has to write to stdout even when there are no errors). The behavior of the cmp command (compare two files) is an example of that - it produces no output on a successful compare (the files match), only on an unsuccessful one (the files differ).

Also, part of the reason for the brevity of Unix commands and terseness of output is supposed to be that the first Unix versions were actually developed on teletypes (which were like the old telex machines) for output - that actually printed the output as you typed commands, on rolls of paper. Video screens came later. And the same is the reason for the Unix editor ed's brevity, and even its p(rint) command - you actually had to (physically) print the changed lines after an edit, to even see them ... :-) All in all, I'd say it was an even more incredible job to develop an OS like Unix under such constraints.

There is also pipe viewer. Peteris Krumins wrote a post about it a while ago:


You insert it at points in the pipeline and it lets you see the progress of the pipe.

And totally agree with your recommendation of TAOUP. The wording can be a little ornate / verbose at times, but that is just ESR's style. Worth tolerating it for the content.

Great post. An amazing opt library is docopt[1]. Instead of writing a lot of code, with docopt you write your usage doc based on long standing best practices and docopt parses that USAGE block.

[1] http://docopt.org/

I'd add "if you're going to output locations (within files), format them conventionally": a line starting with "filename:line: " or "filename:line:column: ". Like `grep -n` or most compile errors.
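In code that's about one line. A Python sketch, with a made-up helper name:

```python
def format_location(path, line, column=None, message=""):
    """Format a diagnostic as path:line: or path:line:column: plus a message."""
    where = f"{path}:{line}:" if column is None else f"{path}:{line}:{column}:"
    return f"{where} {message}"
```

Editors and IDEs that parse compiler output tend to recognize this shape, so locations become jump-to-able for free.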

I'm hoping the angular cli will be able to send email soon.

Also: Write a manpage.

Also, -- (two dashes) in the command line means "end of the options", which allows, e.g. rm to delete a file starting with dash.
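Most option parsers give you this for free. For example, Python's argparse already treats everything after `--` as positional; the `rm-ish` program below is a hypothetical stand-in.

```python
import argparse

def parse(argv):
    """Parse an rm-like command line: one flag plus any number of files."""
    parser = argparse.ArgumentParser(prog="rm-ish")  # hypothetical tool
    parser.add_argument("-f", "--force", action="store_true")
    parser.add_argument("files", nargs="*")
    return parser.parse_args(argv)
```

So `rm-ish -- -f` sees a file named `-f` and leaves the force flag unset.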

Good feedback. I have one comment and one question:

1. I personally prefer to present a terse message to stderr and stack traces to a debug file. This helps a user give support more information when they're hitting transient issues.

2. I fully agree with the recommendation for a --dry_run option, but I think the output needs to be actionable. Do people have good examples of actionable --dry_run output that lets the user verify their intent? E.g. The list of files that would be deleted by "rm -rf"
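On point 1, a minimal Python sketch of "terse on stderr, stack trace in a debug file". The `run` wrapper and the debug file path are assumptions for illustration.

```python
import sys
import traceback

def run(main, debug_path, err=sys.stderr):
    """Run main(); on failure print one line to err and the traceback to a file."""
    try:
        main()
        return 0
    except Exception as exc:
        # Full details go to the debug file so support can ask for it later.
        with open(debug_path, "a") as debug:
            traceback.print_exc(file=debug)
        print(f"error: {exc} (details in {debug_path})", file=err)
        return 1
```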

+1 to Option #1

I'd argue that good software has what I call "good factory defaults".

Pine was a good example of this. It allowed people to use email right out of the box with plenty of options for customization in its config settings.

No idea why the hivemind is rejecting this, but yes, absolutely.

Defaults should be sane, non-destructive, intuitive, and suit the common case. People don't change defaults, and actions which can be destructive should not be easy to invoke accidentally. Even (or especially) on a CLI.

This was all figured out in the 1980s.

New techies are coming online all the time.

One advantage the teaching profession has is that new opportunities are being created daily.
