Why do long options start with two dashes? (2019) (djmnet.org)
356 points by spyh 9 months ago | hide | past | favorite | 246 comments

It is sad that many new command-line parsing libraries don't follow the GNU rules anymore. They more often use "-long". Then users have to figure out whether this means "--long" or "-l -o -n -g". To make the command line even more confusing, multiple tools I have used allow spaces in optional arguments (e.g. "-opt1 arg1 arg2 -opt2", where arg1 and arg2 set two values for -opt1). Every time I see this, I worry that I might be misusing these tools. I wish everyone would just follow getopt_long() and stop inventing their own weird syntax.

Yet another tragedy brought about by golang (at least in part)! :)


Edit: To be clear, I'm mostly "blaming" Go for re-popularizing this style by a) putting it in the standard library and b) being a widely used programming language; I'm not saying Go came up with this or anything.

(idk about the space separated args tho that's even worse)

Cobra (https://github.com/spf13/cobra), which is a pretty popular library for Go CLI applications, behaves more like classical GNU tools. It also offers usage/help autogeneration and autocompletion for popular shells.

Not sure about the relevant point on compact short options syntax as in `tar -xvzf archive.tgz` though... (edit) after a quick & sloppy test it seems to work as expected

Yeah but `tar xvzf archive.tgz` also works so I remain wary of tar. Basically every time I have tried to do something that's not tfz or cfz or xfz, it went wrong until I checked the manpage.

Right. I probably picked one of the flakiest examples, sorry for that. Let's say `ls -lah`, which (I hope...) is less ambiguous.

In my defense, the specific example I gave is valid for both GNU and BSD versions of tar. If I understood correctly, the issue you point to (order among short form flags) is related to the fact that `f` expects an argument and consequently has to appear in the last position.

Ah, it's not a direct argument when you omit the hyphen and fall into "traditional" mode. I think after years and years I can finally wrap my head around how that works. :D

speaking of `man`, why can't it be more like `tldr`?

Because learning from unexplained examples is useless.

Is it? That's how humans learn to speak their language, one of the most complex tasks they need to achieve in their life...

On the other hand, cocking up an unfamiliar phrase in a spoken language doesn't usually result in accidentally killing the listener.

I haven't yet had a computer ask for clarification when I used tar or dd in an uncommon and destructive way.

You don't "have to" use the examples, you can read them to get a feel, and read the captions to find the one that does what you want...

Which is faster and probably safer than scanning the documentation for individual flags and hoping you got the nuances right...

See, the two cases aren't:

(1) Thoroughly study man page -> (2) Become expert at the command's options -> (3) try command secure in your mastery of it


(1) Check tldr examples -> (2) try command

They're rather:

(1) Open man page, (2) scan and skim the man page and the dozens of irrelevant flags, caveats, and obscure options, until you find some flags that look to do what you want, (3) half-read them, (4) try command


(1) Check tldr examples, (2) find an example that does what you want (which is usually one of the covered use cases), (3) try the command using the example syntax

I guess you could consider https://github.com/tldr-pages/tldr.

I personally wouldn't touch that, but that's related to my allergy to the JS ecosystem and my predisposition to panic attacks when I see stuff like this: https://github.com/tldr-pages/tldr/blob/master/package-lock.....

I've switched to tealdeer: same database, rust implementation.


I'm generally satisfied by ZSH inline options summary, but I'm happy to see a sane instantiation of this, it clearly fits a need. Thanks for the pointer (and sorry for the troll :/).

"ZSH inline options summary"?

TIA if you could explain; is it native zsh or a plugin?

There's also tealdeer in rust.

It's really a Googleism that was inherited by Go. I remember their open source C++ command-line library did the same thing.

Google's command line flags library, known to the public as absl::Flags and formerly gflags, does not distinguish between --foo and -foo, these are both the flag "foo". Each flag has a unique name so there is never a short -f equivalent to --foo, and -foo can never mean -f -o -o.

The main design motivation of absl::Flags is that the flag definitions can appear in any module, not just main. Go inherits this. A quirk that Go did not inherit is gflags --nofoo alternate form of --foo=false.

This is all documented at https://gflags.github.io/gflags/#commandline, which is pretty much a verbatim export of the flags package documentation that a Google engineer would see internally.

> The main design motivation of absl::Flags is that the flag definitions can appear in any module, not just main.

Well that's kind of horrifying. That means that command-line arguments are a form of global state, and can silently alter the behavior of the program without the calling scope noticing.

I'm kind of wary of these mechanisms, because I've been bitten by them before. There was a Python library I used that read its configuration from sys.argv the first time an object from the library was constructed. I had a rather painful time debugging to find that my script accepting a -b argument resulted in the library switching to batch mode and suppressing all graphics. Dang it, those were my arguments, and the library had no right to go behind my back and look at arguments that hadn't been directly provided to it!

If you think that's horrifying, what if I told you that a sufficiently-entitled operator of a given program can alter the flags at runtime ... using their web browser. https://twitter.com/jbeda/status/888635505201471490

Oh my. I have a gut feeling that I don't like it one bit, though I tend to be a bit more generous about logging. Logging is one of the only cases where its presence or absence doesn't change the inputs or outputs of any function, nor any other observable effect of the program. Adding or removing logs doesn't impact the testability of a function, unlike any other use of global configuration.

You seem like a pretty reasonable person so prepare to be more shocked :-) In a glog stream like this, the things on the right side are not evaluated unless verbosity is on.

  VLOG(2) << expression_with_side_effect() << " LOL";

I have on occasion been called a reasonable person, and good heavens! I could understand that in a functional language with lazy evaluation, but that doesn't fit at all with my mental model of how C++ works. It can't be a macro, because the VLOG parentheses would need to enclose the entire expression. It can't just be the normal operator<< , because then the expression would always be evaluated. I suppose expression_with_side_effects() could return an object that is implicitly convertible to string, and the actual side effects happen in that optional conversion, but that would require lots of cooperation from the user.

I'm almost scared to ask. How is that even implemented?

It is macros. It expands, through several macros, to:

  !VLOG_IS_ON(level) ? (void) 0 : [a hack to stop compiler warnings] & LOG(INFO) << ...

It's originally from Plan9, which predates Google.

too few people understand this.

Go was designed by former Bell Labs people who worked on Unix, Plan9, or both. Many things about Go that people attribute to "googlism" are really attributable to work done at Bell Labs.

In my experience at Google, only Go does flags like this. Everything else (python, java, c++, blaze) all use the same flag syntax, which is all via long args with two dashes.

The Java ecosystem has historically used single-dash options, both in the SDK tooling (e.g. `java -jar`, `javac -classpath`) and in classic common libraries like Jakarta Commons CLI. It has moved away from this in recent years, so now you get a mishmash of single and double dashes depending on how old an option is. In some cases you end up with stuff like `java -showversion`, which prints the version to stderr, but `java --show-version`, which prints to stdout.

I have seen a mix. For example, many Android developer tools (not written in Go) use this single-dash style. I believe the standard libraries used for parsing in internal tools mostly support both syntaxes, although some docs do describe the old single-dash style by default.

This seems pretty "standard": https://fuchsia.googlesource.com/fuchsia/+/master/docs/conce...

Is it based on Google's internal preferences?

TBH I have no idea; I've heard of Fuchsia, but know nothing about it. It seems pretty far removed from the majority of work I've done in Google3 (the monorepo).

PowerShell picked up the single-dash flag syntax too.

>Many things about Go that people attribute to "googlism" are really attributable to work done at Bell Labs.

We're 30 to 50+ years away from that Bell Labs work. They could have checked what happened in the meantime with the rest of the computing world before re-imposing obsolete ways with the full power of Google behind them...

No, Plan 9 doesn't have long options at all.

I was going to say, it seems like something Google tooling prefers, even non go tooling.

It predates golang significantly. C and C++ bioinformatics tools have used single dash long opts since the 1990s, unfortunately. I expect the transgression didn't originate in the bioinformatics community.

Single-dash long options did not start in bioinformatics, but they are used more often in this field than elsewhere. Perhaps that is partly because some of the most popular tools (e.g. blast, muscle, bedtools and gatk) followed this unfortunate convention.

I'd always assumed that was because people without significant experience on a terminal are far more likely to type -help than --help or -h

One thing Go's flag package does that deserves a lot of blame is to automatically sort the flags alphabetically in -help output. And the fact that you need to hack your way around it, instead of there simply being an option like nosort=true or whatever, is even worse. The whole idea is crazy and basically equivalent to the statement that the order of parameters in -help serves no useful purpose.

And yet, I expect flags to be sorted in a man page; I rarely read things in a logical order, I'm just looking into what flag does what.

It's a convention-over-configuration thing I think. I mean they set a standard, so you can move on. The alternative is to sit and think and discuss about what order to put your documentation in.

You read text from top to bottom. Chances are that you're writing help text and describing the most commonly used flags at the top, and the more obscure ones lower down.

> I'm just looking into what flag does what.

So do you read the whole man page when you need a flag that does something specific, or do you mean you never write new things and just have to look up flags already in use by some script? For everything else, that seems like a fascinating waste of time.

if I'm searching I'm... searching: like using grep with some keywords. Why should the order of the parameters matter in this case?

Because you don't always know which words the man page uses to describe specific functionality. So many ways to express similar ideas, language is fun that way.

Thankfully we have git.sr.ht/~sircmpwn/getopt github.com/pborman/getopt github.com/mattn/go-getopt rsc.io/getopt and a hundred more, but I really wish getopt was a part of the standard library.

I think Go's package "flag" was partially inspired by the one made by Apache for Java, but I can't find any sources confirming that now, so I might have seen that in a dream, heh.

I found it surprising too that the native library in Go does not follow this standard.

Fortunately, there are alternative packages that do.

It's not a standard, it's a GNUism. Why would the inventor of Unix follow GNU, which is Not Unix.

because it's better

No, it's weird af to use two dashes.

It's by Ken Thompson, who invented Unix, so it's ok.

I think Powershell does it too.

In my experience, i've never actually stumbled upon a formal list of these "GNU rules"; the closest thing i can find is this: https://www.gnu.org/software/libc/manual/html_node/Argument-...

However, even those seem to raise some questions, for example:

> To make command line even more confusing, multiple tools I have used allow spaces in optional arguments (e.g. "-opt1 arg1 arg2 -opt2", where arg1 and arg2 set two values for -opt1).

Is described as something that's permissible:

> An option and its argument may or may not appear as separate tokens. (In other words, the whitespace separating them is optional.) Thus, ‘-o foo’ and ‘-ofoo’ are equivalent.

Therefore the below would be considered equivalent:

  -opt1 arg1 arg2 -opt2
  -opt1arg1arg2 -opt2
Were your expectations different?

Are there any good articles on the benefits of following such rules (any tangible improvements to legibility or usability, as opposed to just "consistency amongst different tools")?

Are there any tools which can validate whether any piece of software conforms to this standard (either by scanning the man pages, or the code, or a formalized format of parameters the app supports)? Personally, the closest i've found is Typer ( https://typer.tiangolo.com/ ) but without anything that can automatically reject non-conformant code as a part of a CI process, i think enforcing such formats would be a non-starter for me.

I think disallowing short options altogether is not a bad convention. With only `-long` and `-long=xxx`, my command-line parsing is simply

    foreach my $a (@arglist) {
        if ($a =~ /^-(\w+)(?:=(.*))?$/) {
            $opts{$1} = defined $2 ? $2 : 1;
        } else {
            push @pos_list, $a;
        }
    }
There is no need for the dependency and complexity of `getopt` library.

And on the user side, no more cryptic ninja arts. The only trick users need to learn is shell aliases and functions.

The point isn’t to use getopt with all its complexity, the point is that two dashes for long options is already extremely well established and Go popularizing long options with a single dash is very much a regression. It creates a lot of unnecessary confusion when they really should have known better and just stuck with the conventions.

In your regex at least, removing the confusion is as simple as adding another ‘-’, and now you’re in compliance with the expectations of almost every IT person in the world who uses Unix command lines.

Go's flag parser treats two leading minus signs the same as one.


Sure, but convention is to treat `-long` as `-l -o -n -g`.

But that is a bad convention, it prioritizes typing speed over readability, and it only works in certain cases (for flag-only parameters). I'd say good riddance to it.

That doesn't justify long options starting with a single dash, as one could have made every option start with two dashes. Sure, `--` is longer than `-`, but typing speed shouldn't matter right?

Sure, but the only reason to add an extra dash is to differentiate --long from -l -o -n -g. No reason to add extra characters if you don't need this differentiation. Not to mention, Go command-line parsing actually accepts both -long and --long, if you find the -- version more aesthetically pleasing.

If we only did "the accepted convention" indefinitely, there would be no moving forward. I see this change (of being explicit) as a win. The situation was already confusing before, with different tools using different conventions. Being explicit this way allows you to be consistent across OSes too. The world is not only GNU, fortunately.

Buffer overflows are well established too, not all traditions are good.

GCC does this pretty heavily, no?


If it mattered much, I'd expect GNU to be internally consistent.

GCC or anyone else's C compiler doesn't really count as far as this convention goes. GCC's flag parsing is aiming to be compatible with conventions that preceded GNU, other vendor's compilers break their own conventions to be compatible with GCC or whatever else cc(1) is etc.

GNU's conventions are generally complementary to, and not incompatible with, POSIX. And POSIX specifies the behavior of the sort of flags cc(1) should understand[1].

There are many POSIX and other traditional *nix tools that are a convention unto themselves for historical reasons. E.g. notice how GNU "dd" doesn't follow normal GNU command-line conventions either.

1. https://pubs.opengroup.org/onlinepubs/7908799/xcu/cc.html

-long = -l -o -n -g was probably a mistake. But then again, tar xvzf should have been called untar, so it’s not like there's a shortage of opinions and historical mistakes.

How would one call tar `xvjf` then?

You can just call ‘tar xvf’ and it will detect the compression format.

My question was about how "tar xvzf" should be called "untar".

Your reply might still make sense (i.e. untar could automagically figure it out), but I was highlighting how tar/untar today also means (de)compressing that tar archive using many different compression formats.

It should probably be `untar --format gzip`

They are just respecting older, more venerable tools, like find.

Find's syntax is really annoying and clashy too. The convention makes sense: -- for extended syntax and - for shorthand. Why can't it follow this? Also fuck dd

I'd like a word with the person who thought that regular ( ) parentheses for grouping were a good idea in the find syntax, requiring them to be backslash-escaped in shell scripts.

The obviously right choice would have been [ ]. You know, like in

   if [ $foo ... ]

That won't work because '[' is an alias for test.

"[" as an argument to another command is fine.

"[" is just another name for the "test" command. It isn't special syntax.

It's pretty silly to ask a command written in the seventies(?) to follow a convention (for another operating system) presented in the nineties.

dd I really don't mind, because it makes me think and double-check that I'm flashing the right device every time.

find is annoying, though. I'd encourage you to check out fd.


Ah, thank you very much!

Same with ffmpeg

The Amiga had a pretty cool feature where CLI argument parsing and help was provided via a library. This made things nicely consistent across almost all of the CLI tools.

Suddenly I'm reminded of how Windows represents the command line as a single string (PWSTR), and how entry points that expect argv-style are parsed by the CRT at startup.

vs. Unix where char *argv[] is what makes it to the syscall layer.

The result there is that command line processing is more consistent program-to-program on Unix. On Windows, every program could decide to tokenize the arguments differently.

> On Windows, every program could decide to tokenize the arguments differently.

Worse, even Microsofts two implementations (CRT and WINAPI) disagree: https://github.com/rust-lang/rust/issues/44650

I feel like there are a few interesting Microsoft phenomena that contrast with Unix thinking in both of these examples.

CommandLineToArgvW - You called that "WINAPI", but it's worth mentioning the more specific provenance of shell32.dll. This is not a core, foundational part of Windows that is used in core, foundational things. It's a helper function from the shell (Explorer, not shell in the Unix sense). So, while it has a look and function that seems pretty foundational, it really isn't. It's there because somebody working on Explorer long ago found it useful to have and decided to export their helper function in the DLL.

CRT - A CRT binary ships with Windows, but really, that code is maintained and distributed by the compiler guys and DevDiv. So theoretically, the argv parser could change at those people's whim alongside a new Visual Studio release. And it seems from squinting at that github issue like that might have happened here.

So really ... there are more artifacts here attesting to the fact that the command line arg parser is not part of the operating system. People find that functionality useful, so they look for things that "look like" the operating system official method, and maybe they find stuff that does "look like it" -- but such a thing isn't really there.

I was not arguing that it was or was not part of the OS, but just showing that deferring the parsing to application code has produced two subtly incompatible implementations that differ for no reason other than that they do.

Yeah, I am not considering anything you say to be argumentative, I am just going in tangents with this topic because I have some experience there and find it interesting.

That's a good thing. You have to be careful using a command line SQL query when typing "SELECT *". If the processing is left to the program, an SQL app in Windows knows you didn't mean "*" to mean all the files in the current folder.

How is that different from the getopts library? I don’t understand.

I believe it was standardized and built in, add peer pressure and it was just short of being enforced.

There were also the Amiga style guides that were published with 2.0 that detailed how developers should build application user interfaces. The fragmentation in Linux/Unix distributions means that this kind of consistency is pretty much impossible, although FreeBSD does a much better job of being consistent than $majorlinuxdistros.


I had the impression only CLI tools from the Java world are that strange.

Yes, I messed up -Xmx1024m a million times in my career. We used GNU-style flags for in-house stuff, so sometimes I'd have --Xmx and -Xmx on the same line.

yeah; long w --, short w - is intuitive and annoyingly close to universal...

I love the "long options start with two dashes" convention. It means that you can choose short options that are easily combined (in cases where the command and its options are often used), or you can use long options that are much easier to understand (because they are full words). More command line tools should support them.

I typically use long options in shell scripts that will be checked in or shared with others. The self documenting nature of long opts is much nicer (imho) than the terseness of short ones.

I’m also glad short opts are available for my personal day-to-day work. I spend most of my time in a terminal and appreciate having short-hand available.

It always confused me when tools don't follow that rule. E.g. "find", where "find . --name '*.dat'" won't work but "find . -name '*.dat'" will, and it's not the only one.

`find` is weird anyway. The stuff after the arguments aren't really flags, they're a tiny filter language, with significant ordering and operator precedence and all that stuff. Using "normal" option syntax wouldn't make a lot of sense for it either.

Yes exactly, find is like test a.k.a [.

    test -f foo -a -f bar -o -z foo
can be read

    isfile('foo') && isfile('bar') || emptystring('foo')

    find . -name '*.py' -a -executable -o -printf '%P\n'
can be read

    (F.name matches '*.py') && (F is executable) || print('%P\n', F)
where F is the current node in the file system traversal.

They both respect -o as OR, ! as NOT, and ( ) for precedence, which you have to quote as \( and \).

A couple years ago, someone helped me implement a better "find" without this wonky syntax for https://www.oilshell.org/ . But it isn't done and needs some love. If anyone wants to help, feel free to join Zulip :)

I do think that "find" is more like a language than a command line tool. It's pretty powerful, e.g. I just used it to sort through 20 years of haphazard personal backups.

Related: Problems With the test Builtin: What Does -a Mean?


Thanks for the article! How did I not see this before? Didn't know that POSIX has obsoleted -a and -o either. I guess I have some shell scripts to rewrite, heh.

I feel like `jq` does the "more like a language" thing better than `find`, but possibly it's just a product of its time.

Well the thing find and test have in common is that they lack a lexer! They abuse the argv array for tokens instead. I might call it the "I'm too lazy to write a lexer" pattern :)

jq has a lexer and hence a "real" syntax, but so does awk, which is maybe 30 years older. But yes jq is a surprisingly big and rich language, maybe bigger than awk:


Find is about as user-unfriendly as a shell command could be. I never get it to do what I want on the first try. And its error messages are always cryptic and unhelpful.

I don't think any shell commands are particularly "friendly." Most are intentionally terse (in fact I find verbose, "friendly" command options to be annoying), and you learn them by repeated use, or for those that you use only occasionally, by consulting the man pages.

Yes, but errors are at least somewhat helpful. With find, it's this:

    $ find -name something
    find: illegal option -- n
    usage: find [-H | -L | -P] [-EXdsx] [-f path] path ... [expression]
           find [-H | -L | -P] [-EXdsx] -f path [path ...] [expression]
What does "illegal option" mean exactly? Why is it "n" which is the first letter of "-name"? Yes, it wants a path. Yes, even if you want to search in the current directory. Yes, it IS unusual, because all other commands that operate on directories, like `ls`, assume current directory if you don't specify any.

Why could it not just say "a path is required" instead?

It's saying that because it's using getopt to parse any initial option arguments. That diagnostic message is the standard default message printed by the getopt function whenever encountering an invalid option flag. It means all utilities using getopt will, unless you disable the default behavior, display the same initial diagnostic. It's idiomatic for utilities to then print a short usage message of its own.

Judging by the usage message you printed, you were almost certainly using a BSD implementation, probably on macOS, which in turn is probably sync'd from FreeBSD. `find -name something` will fail early in main. See https://github.com/freebsd/freebsd-src/blob/b422540/usr.bin/... When processing the 'n' in '-name' getopt() will return '?', which will end up calling usage().

The GNU implementation of find is completely different, though I'm not sure it does what you expect:

  $ find -name something
That prints nothing and returns a successful exit code. But if you remove the "something" operand you get what I presume you were originally expecting as an error message:

  $ find -name
  find: missing argument to `-name'
But try deciphering the option processing of GNU find to understand why it behaves that way: https://git.savannah.gnu.org/cgit/findutils.git/tree/find/ft... Hint, see https://git.savannah.gnu.org/cgit/findutils.git/tree/find/ut...

Not rocket science, but as a programmer and maintainer which approach do you think makes more sense? Is trying to do the supposedly intuitive thing worth it, especially considering find's already arcane and irregular syntax? As an experienced command-line user I'd just be thankful that the option flags (as opposed to the filter directives) are parsed regularly.

This is a good explanation of why it has the current behaviour, but it doesn't answer the question of why the behaviour isn't better (i.e. telling the user what's needed, the path, instead of telling the user that what was provided isn't what's needed, which is vague and leaves it up to the user to figure it out).

It's not like the source code is now etched into stone and can't be changed. Or is it?

GNU find, or at least my version of GNU find (4.8.0), will just assume "." if the path is missing, and will work as expected. I think various forms of BSD find are a bit more strict, and based on that usage message it seems to be BSD find.

It gave you the list of options (I think that's at most one of -H and friends, as many as you like of -E and friends, and -f with an argument), and -n isn't one of them.

Several BSD commands are pickier than GNU commands about option order, sometimes for good reason, sometimes because it was easier to write that way.

This is why I've ultimately come to the conclusion that shells are for casual use only, not for any kind of serious work. There are too many implementation details, inconsistencies, and footguns to write anything that needs to be somewhat reliable.

What do you use instead?

To be fair, there is one shell that I think someday we could rely on. https://www.nushell.sh/ Besides that, my answer is "any programming language," since at the core, dealing properly with system calls and their outputs is the whole reason PL's exist. In practice, I've been using Rust lately which makes a nice systems language, but JS and Python are always options for shell-like scripts that don't suffer from quite the level of degeneracy when encountering weird filenames or unexpected input in general.

> my answer is "any programming language,"

That would be a terrible shell. Changing directories, listing them, moving files, running programs are all simple no-brainer operations in any reasonable shell, but are non-trivial in any programming language that's not designed to be a shell.

So you use the shell for things that require no brain: browsing your directory tree, casual printing of files. Then, when you need to encode these operations in a script, you pull out a scripting language, because you need more than the shell can provide with its casual nature.

Legacy and backwards-compatibility. find(1) is a really funny example, too, because POSIX find doesn't have that many flags, so they could probably fit all of them into the short format.

If anyone is looking for alternatives, try fd


fd is even more than that. In most cases `fd -x` can replace `find ... | xargs ...`.

From the article, it sounds like find predates the two dashes convention, so I think it gets a pass.


  tar xvf foo.tar

I've seen this a few times, but the one that always gets me is things like aws: it does something in response to "aws --help", but it doesn't tell you that you really want to call "aws help" to get some useful help.

I've seen that pattern before, but it always drives me a little crazy.

That. And Git’s passive aggressive approach when it kind of knows what you wanted, but will show you how you failed to write that command.

What's the alternative? Accepting the dev's favourite misspelling npm-style doesn't seem like a good idea to me.

No, but a "command not found" would suffice and not hint that it could have figured it out for you.

Alternatively, it could offer a y/n prompt.

I'd rather have the suggestions, I don't take hints from computer software personally :) Sometimes I just misremember a particular command (i.e. "submodule" vs "submodules").

Git has an option to run misspelled commands anyway.

Still annoyed that PowerShell didn't follow POSIX standard for arguments, at a time when MS was working hard on open-source compatibility.

It’s unfortunate that Go’s standard package `flag` doesn’t follow the standard either, given the language is otherwise a good fit for command-line tools.

By standard, do you mean -s for short flag, and --two_dashes_for_long_flags?

Because if you don't care about chaining together short flags and just want to use two dashes for your long flags, Go will happily accept that.

https://golang.org/pkg/flag/ : "One or two minus signs may be used; they are equivalent."

But -help prints a single dash for long flags, which contributes to the decline of double-dash long flags.

Oh, I agree.

I ran into a related issue a couple of years back where people were using single-dash flags for a C++ project that was using Abseil flags in conjunction with getopt parsing of short flags (for legacy reasons). Why were they using single-dash flags, despite that not showing up anywhere in our documentation? They copy-pasted from --help.

(I'm happy to say that --help in Abseil has since been fixed.)

But that doesn’t preclude mistakes by collision (N short flags match a long one) or unpredictable bugs in a long flag interpreter (a short flag being a substring of a long one)—both being trivially common bugs when this ambiguity is allowed, especially when an API is ported to another environment with less tooling standardization around interpreting the input.

Go doesn't allow for specifying multiple short flags all run together, or for flag args without spaces, so neither of those are directly relevant here.

Also, that first issue happens with POSIX flags (with the GNU long flag extension, anyhow): `grep -help` is different from `grep --help` (and if you type the former, it'll just wait patiently for you to close stdin).
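
With GNU grep you can see why: the single-dash form is parsed as bundled short options, `-h` plus `-e` with the rest of the token ("lp") as its pattern, so it greps stdin for "lp". A quick sketch, assuming GNU grep:

```shell
# "-help" parses as: -h (suppress filenames) -e lp (pattern "lp")
printf 'help\nnope\n' | grep -help    # prints "help" (the only line containing "lp")
```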

Because of Go, I have to monitor what language a command-line program is written in before using it.

>at a time when MS was working hard on open-source compatibility.

You mean around 2002-2006? I find that pretty hard to believe.

I feel like the open-source compatibility paradigm really started right after PowerShell

It was probably even partly because of the reception to PowerShell.

/Options are the norm for Windows (and uh cough VMS).

Which is also why Windows uses backslash (\) as their path separator. Because forward slash would have collided with the slash option marker Windows inherited from VMS.

That is surprisingly false. Microsoft operating systems use both / and \ as a path separator, going all the way back to DOS.

Early versions of MS-DOS made it a user preference option in the command.com interpreter, whether the user wanted to use / for options and \ for path separation or vice versa.

IIRC the "forward slash" convention for Windows command options traces back to DOS, not VMS. Where DOS inherited the convention from I do not know.

CP/M https://en.wikipedia.org/wiki/CP/M

In longer words: Windows was originally a GUI system on top of DOS which was influenced by CP/M. The NT kernel did away with DOS, but the influence still lives to this day. For a simple one: not being able to name a file "con" (or any capitalized variation) comes all the way from CP/M.

For the uninitiated: OSes from that era didn't have "directories"; Everything lived in the root of the drive, including device files. So, to print a file, you could literally do something like:

    A> type FILE.TXT > PRN
When DOS added directories, they retained this "feature" so programs unaware of what directories were could still print by writing to the `PRN` "file". Because of "backwards compatibility", NT still has this "feature" as well.

One thing VMS got right is that each binary declared its supported options and the shell could tell you what they were. And it would take any unique abbreviation.

Powershell scripts and cmdlets work similarly. They probably won't have help text but at least you can see what's available without having to look at the argument parsing section of the script. And you can use the shortest unique prefix as the short form of an argument (though I don't love this since adding an argument can break the shortened form of other arguments)

It’s easy (although verbose) to add help text, and valid options, too.

Also easy to create option sets so that mutually exclusive arguments are shown in the help as different ways to invoke the script.

And a bunch of other niceties, all queryable without running the script, and all feeding autocomplete with useful information.

..and TOPS-10 and TOPS-20 and RT-11 and RSX-11 and RSTS-11.

They were for DOS, too. Not that I’m disputing the VMS roots of WNT.

What would be a good reason to have POSIX standards in PowerShell, aside from, that's what POSIX does?

It'd make the typing simpler. PowerShell has posix-like aliases, like 'rm' and 'cd', but they don't accept POSIX parameters. So you end up with "rm -Recurse", since rm is an alias for Remove-Item.

I like PS in theory but the syntax and naming just absolutely kill me. What were they smoking when they named as simple an operation as delete "Remove-Item"? And what's with all of the capital letters?

That's what happens I guess when the people designing it haven't actually used a CLI day to day much, because, well, they're using Windows.

I can't agree. I have used Linux shells for some time (since 97), and while the olden days would be me laughing at vbs and all that awfulness, I'd take PowerShell any day.

The short terse commands and the really awkward, confusing, mistake-prone syntax of sh or bash really rear their ugly heads in scripts.

Interactive shell? No problem. But that's the beauty of PowerShell: verbosity and correctness in scripts, where the IDE quickly expands those long commands, and short aliases for interactive use.

> The short terse commands and the really awkward, confusing, mistake prone syntax

When used in an interactive shell, short commands save time and effort. And it is easy to learn and remember them because in everyday work you need only about 10 commands. For some commands which I use a lot I have one- or two-letter aliases to type even less, e.g. i=fgrep.

It makes shell scripts less readable for someone who comes from Windows and doesn't know even common shell commands, but for someone who uses the shell at least from time to time it should be easy to read.

Yeah I agree with that. Bash (and friends) scripts are awful. PS scripts are nice and readable, and not subject to the insane quirks of bash ([ vs [[ vs test? come on)

Seems like the real solution is separating scripts from interactive use.

Ironically, it already happened: bash for the user interface, but /bin/sh is something else. Yet bash keeps being a REPL that was accidentally promoted to a user interface.

> What were they smoking when they named as simple an operation as delete "Remove-Item"?

Simple. All these commands work with providers, of which a file system is just one. Other providers include the Windows Registry, environment variables, certificate stores, functions and variables in the PowerShell runtime. More providers can also be created and plugged into the system. PowerShell Providers are essentially Windows' FUSE. See [0] for details.

So, for instance, you can do `Get-ChildItem HKCU:` to list entries under HKEY_CURRENT_USER in the Registry, the same way `Get-ChildItem C:/` will list you top-level items on the C: drive. Worth observing: while the console output for these two commands is similar, the results are in fact different objects underneath (Microsoft.Win32.RegistryKey vs. System.IO.FileInfo).

In short, these commands are an abstraction over file-system-like things. Whether or not that was a good idea is a different question.


[0] - https://docs.microsoft.com/en-us/powershell/module/microsoft...

It makes a little more sense in context to me. The verbose Verb-Noun scheme works because the verbs are designed to be limited. E.g. there's Remove- but no Delete- in the standard set (shown by `Get-Verb`). So you can press ctrl+space after typing Remove- and see all the different types of things you can remove. Too many? You can filter to Remove-<prefix>* etc. The verbosity of cmdlet names when using it as a shell is mitigated by the aliases (e.g. rm), and for parameters by accepting any case and shortening to anything non-ambiguous (e.g. `rm -rec -fo`). I guess the capitalisation comes from C# or .NET's casing? I like PascalCase for its great readability/conciseness tradeoff, and it's standard Windows case-insensitive so I've never had a huge issue with it.

The tradeoff is that "all the things I can remove" is usually "the set of all things my shell knows about" and not "the set of things related to my task at the moment" -- ChildItem-* would be more helpful!

Neat thing you can do is type "*-Noun" and the tab completion will give you options that fill in the "*". Alternatively "Get-Command *-Noun" will also list out all of the matching commands. Get-Help also supports that kind of wildcard so you get the list of commands along with their help summary.

The "*" can even be in the middle. I open VS solution files all the time from Powershell. Since there are often many other files and folders with similar names alongside them I just type ".\*.sln" and hit tab.

> What were they smoking when they named as simple an operation as delete "Remove-Item"?

The long names are the official readable names for scripting. It can and does have short aliases like "rm" that you would use in interactive mode.

> And what's with all of the capital letters?

PowerShell is case-insensitive. The capital letters are for readability.

I disagree and agree with the sentiment. As someone more familiar with Linux, I sure would prefer to be able to assume a similar style.

But the biggest thing I'm happy about WRT Powershell is that it's consistent (and pretty well documented). At least it makes sense. Batch scripting really didn't.

Just annoyingly inconsistent when calling PowerShell commandlet vs local exes.

PowerShell is different enough that maybe it's not a bad thing?

Seeing functions aliased to their POSIX names is already a little bit misleading when you realize they are not a drop-in replacement at all.

PowerShell was born with a “we know better” attitude that, I hope, is gone by now.

Because they really didn’t.

Except they did, and I for one wish traditional Unix shells would die. Composing software by having every single program and script include a half-assed parser and serializer is causing a lot of unnecessary waste and occasional security problems in computing. Moving structured data in pipes is just a better idea.

Then use JSON or XML in those pipes. Nothing forces you to deal with unstructured data.

Wish I could (actually, I'd prefer JSONB or other binary format). Unfortunately, every program in the UNIX ecosystem assumes unstructured text in pipes, and makes it my responsibility to glue them together by building ad-hoc parsers with grep, head, sort, sed and awk.

A lot of more recent programs (such as AWS and K8s tools) can easily output JSON. You can make schemas match, but most of the time you'll need to use something like jq to transform what one program outputs into what makes sense for the other.

I always try to design my tools with a "terse" output that makes it easier to pipe it into other programs.

"POSIX_ME_HARDER" passive aggressive done well

Fwiw I'm pretty sure that POSIX_ME_HARDER wasn't an RMS-ism. RMS invented the "-pedantic" gcc flag (to enable some warning messages that he felt weren't necessary); that always got a laugh when he talked about it, and that was more his style. POSIX_ME_HARDER was more of a signature style of one of the other devs at the time, rather than of RMS.

maybe so, but it's still brilliant

Do you recall who was that other dev?

Yes, but I don't know what he's doing now and I wouldn't want to put his name here unless he said it was ok.

fair. Let's hope this part of history is written some day.

remember seeing this buried in some ifdefs somewhere or in compile output back in the day.. to me, finding the origins of this is almost as interesting as the longopt story itself, if not more so :)

sssht! be quiet! the moral police are going to ban this as well!

Oh come on. I had a laugh reading it, but immediately understood why it was changed—voluntarily, not by anything resembling police. I wouldn’t wish real censorship on anyone, but I wish y’all were at least able to identify it with better accuracy than a poorly trained AI.

Problem is, all of us including GP are primed by poorly trained AI to take comments in a divisive direction.

I’m primed by poorly trained cognitive chemistry to ignore that I know that and try to make lemonade

I was surprised to read that long options were invented so recently, in only 1990.

Well... at that time, paper terminals were still in use in some places. The gap between legacy and modern hardware that was in operation was huge.

I think what's recent is the syntax for mixing short and long options together on the same command. Some commands had long options, some had short, but with "--" one command can have both.

Pretty sure some programs used them before 1990, just not with a convenient getopt_long(). I know it's not the best example, but 'dd' used things like "if=whatever skip=123" prior to 1990. The article also mentions find, but it used single dash long options.

Found an example that's a bit closer, from Minix 1.0, 1987. The "pr" program:

  Usage: pr [+page] [-columns] [-h header] [-w with] [-l length] [-nt] [files]
Mixed in this case, with -columns and +page, and all hand parsed. But long options nonetheless.

Isn't that expecting say pr +4 -172

to format for printing starting at page 4, on wide 172-column paper?

Ah, yes, you're right...

I think the "find" command was around before then, and it has long options. For example:

    find dir -type f -name "*.h" -print

Those aren't really options. The syntax of the find command is

  find <options> <paths> <expression>
Those thing you list are part of the <expression> part of the command. The <options> part in BSD find, and I believe GNU find, only uses options of the form -X where X is a single character.

It's a little confusing because the man pages for both BSD and GNU find do call some of the things that appear in the <expression> part of the command "options".

Find is specifically called out in the article.

> There were a few programs that ran on Unix systems and used long option names starting with either - or no prefix at all, such as find, but those syntaxes were not compatible with Unix getopt() and were parsed by ad-hoc code.

The Free Software Foundation held a public election on how to do long options three decades or so ago.

This was likely before the effects of Eternal September began destroying the public Usenet, so the vote may well have been held there, in one of the newsgroups relevant to the FSF, GCC or GNU.

The '--' alternative won overwhelmingly, as I remember it. A few hundred votes were cast by email.

Why did POSIX disallow +options?

The shell uses "-" to enable an option and "+" to disable it. I don't know who's to blame for this, but it's part of the POSIX standard.
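
For example, with the `set` builtin in any POSIX shell:

```shell
set -x    # "-" enables the xtrace option: commands are printed before running
echo hi   # this line is traced to stderr
set +x    # "+" turns the very same option back off
```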

> The shell uses "-" to enable an option and "+" to disable it.

things like this are how I know that there are too many people who are absolutely insane and who make mundane decisions.

    - to add
    + to subtract
absolute genius. is this what higher education teaches people? I didn't go to college, and maybe that was best.

One of my favorites is in POSIX date format strings:

    %p     locale's equivalent of either AM or PM
    %P     like %p, but lower case
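
It's real, and easy to check with GNU date (strictly speaking, %P is a GNU extension rather than POSIX); in the C locale:

```shell
LC_ALL=C date -u -d @0 +%p    # prints "AM" (midnight UTC at the Unix epoch)
LC_ALL=C date -u -d @0 +%P    # prints "am" -- the capital letter gives lower case
```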

Weirdly, this kind of syntactic idiosyncrasy is something that got me interested in Erlang. Finally a language that uses full stops when a routine full stops. I find most of the rest of its syntax uncomfortable (I didn’t spend much time with the language, I’m sure it’s fine when you’re used to it), but I always found it weird to end a completed statement with a statement-list-joining punctuation mark.

This was inherited from Prolog, which ends terms with a full stop.

Most other languages didn't want to handle the syntactic ambiguity of using the period as a decimal point and a statement separator.

I thought of mentioning the Prolog heritage. Weirdly CSS (having the worst syntax consistency of any language I can think of) is hyphen-heavy and solves its negation infix operator ambiguity well: it needs to be surrounded by whitespace.

For Prolog/Erlang, I think the preceding syntax is disambiguating enough

COBOL terminates some statements with a period. And before that FLOW-MATIC https://en.wikipedia.org/wiki/FLOW-MATIC

> I always found it weird to end a completed statement with a statement-list-joining punctuation mark.

But if you're producing a list of statements, isn't a statement-list-joining punctuation mark the perfect thing to use?

There's also the difference in different languages between statement separators and statement terminators, but I don't really know enough about it.

I always thought this was a Wirthism, because Pascal ends unit and program with "end." (with a dot), whereas function and procedure are terminated with "end;" (with a semicolon). I don't know about other Wirth languages though; maybe it is Pascal-specific and not really something typical for Wirth?

The punctuation in Erlang mirrors English so closely I find it frustrating when people complain about it.

It's incredibly simple. Comma means "and", semi-colon ends a clause, full stop closes out the entire thought.

Erlang is great, and I got used to the punctuation, but it's kind of a pain when you're moving code around.

Oh, now this is the last thing, gotta take off the ; or replace a , with a .

At least when I was starting out, I'd have loved a more C-like syntax with {} and consistently semicolons. Of course, Elixir came and just got rid of most punctuation, which I like less.

Anyway, it's consistent and after a couple weeks of messing it up, I can consistently see where the mistake is from the compiler error; after several years, I still sometimes mess it up, but oh well. I can't recall having messed up the punctuation so much that it still compiled but wasn't what I meant, so it's almost always quick to recover.

That's not a plus sign, it's a crossed-out hyphen

> "that's a smile, not an upside-down frown!"

+ is the opposite of -. Not too hard to remember.

Or just think about all options as being prefixed with a "-"

I can kind of see it as an ASCII art flowchart line.

I.e. a line to indicate use of this option, and a line with a strikethrough to indicate it has been "struck out" as an option.

But using the conventional minus and plus symbols is indeed confusing.

I mean, if you want to undo subtracting something, you add it? The only reason I ever found it confusing is that it was "backwards", but that feels like a forced error due to + requiring a shift while - doesn't.

"-" doesn't require hitting the shift key. Making the common case easy seems like a reasonable argument.

You could ask Richard Stallman himself, who is well known for responding to random inquiries from people.

Presumably it was rejected because the whole point of POSIX was to consolidate, regularize, and simplify pre-existing practice. Adding "+" as an additional standard option signifier would take a huge step in the complete opposite direction. The only precedent for "+" would have been the `set` shell builtin, and AFAIU the committee only begrudgingly grandfathered that syntax.

Someone elsethread mentioned the `date` utility, but if you look at the BSD implementation "+" isn't used as an option marker, per se, but rather to disambiguate operand strings. The 2001 standard only defined the following:

  date [-u] [+format]
  date [-u] mmddhhmm[[cc]yy]
It's splitting hairs, but POSIX was at least able to shoehorn the legacy syntax into a more regularized base interface.

In many astronomy programs, we have to put up with the PFILES convention, where arguments are given like so

    mytool infile=foo.fits outfile=bar.fits
Some parameters need to be given (and will be prompted for if not given), and some get a default if they are omitted. The extra tricky part is that the parameters and their defaults are read from .par files in a path. There are the default ones for the tool, and a user-specific parameter file which can be modified using the "pset" program (or read with "pget"). A tool when run will also modify the par file to update various parameters with the ones given on the command line.

Unfortunately, there is no form of locking on these par files, so one has to mess around with the path settings (to make per-process paths), or use some form of locking, to ensure they don't get corrupted if multiple processes are run at once.

Many coders died in the getopt() wars of the late eighties. Unexploded shar files should be treated with great caution.

Unfortunately the / vs \ (and corresponding - vs /) war was not resolved.

Kind of off-topic, but somehow I've found that blogs that use that particular template (with the blue top header and all) pretty much always have content that I find interesting or useful. Often they contain niche information that's hard to come by elsewhere. Has anybody felt similarly/know what might cause this?

EDIT: Trying to remember examples of it; here's one I can recall: http://www.nynaeve.net/?p=180

This theme is called "Kubrick". It was the default Wordpress theme from ~2005-2010. It was designed by Michael Heilemann (https://twitter.com/Heilemann), here's a nice article about it https://www.huffpost.com/entry/the-secret-history-of-kub_b_4...

The golden age of blogs

It’s a rare moment on HN when a commonly maligned CMS written in a commonly maligned language gets such high praise for its default behavior. Just commenting here in hopes I can come back and reflect on the weird disconnect between UX and nerd preference.

I believe it is a WordPress default template. Perhaps it is because these authors are so focused on quality writing they have no time for frivolous nonsense like 'templates'.

It was the default Wordpress theme from ca. 2005-2010 which at least in my mind was basically the peak era of "the blogosphere" especially on technical topics.

As others have said it's a default wordpress theme, so often used by people who install wordpress in its default configuration because they want a reasonable web GUI to type content into a blog, but don't care about making it pretty or special looking (as you would if you were using wordpress to build a company webpage).

I remember reading something about a debate between --options and -=options. It's pretty tough to search for this but maybe someone else knows where I might have come across it?

GNU stuff is so influential. It’s really remarkable. Copyleft, the open compiler, open desktop. The Linux desktop changed my life when I encountered it as a 12 yr old.

It’s hard to imagine that the philosophy of these early heroes has so pervaded the world. Free software is everywhere. Other fields don’t do this to the degree software does. So much value for humanity just because the first few took one approach when another would have worked just as well.

With years of phenomenal dedication to careful language and explaining all the boring legalese. (the GPL versions are all great reads.)

Years after it was first proposed, I'm still saddened that

    sudo --with-wheel-group
has not (yet) been sneaked in to any major *nix distribution.

What would that do?

So it is not confused with the "marriage operator": in `ls -- <args>`, the lone "--" marks the end of options and the beginning of arguments.

I always thought it had to do with chaining flags. For example, if multi-character single-dash flags were supported, and you wanted to write a `-lah` option for `ls`, you wouldn't be able to, or else you'd introduce ambiguity. And nobody wants to type `ls -l -a -h`.

Not to mention the double dash convention to support dashed parameters afterwards.

  grep bar -w -- -foo-file
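
A sketch of how that plays out, assuming GNU grep and coreutils:

```shell
printf 'bar baz\n' > ./-foo-file   # a file whose name starts with a dash

grep -w bar -foo-file       # fails: "-foo-file" is parsed as options
grep -w bar -- -foo-file    # works: "--" ends option parsing
rm -- -foo-file             # same trick to remove the file again
```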

Okay, now tell me why there's no consistency between the case of the short version of `--recursive`

Is this not part of POSIX? I see folks churning about Go and the Bell Labs people, but these styles were exactly what (in my mind) POSIX was partially in response to.

No - POSIX specifies getopt(), but no 'getopt_long' (AFAICT).

OpenBSD, just as one example of a true unix-derived system, added its own getopt_long only in the 3.3 release (2003), ~15 years after the first POSIX standard in 1988. This article mentions getopt_long originating ~1990, after the first POSIX standard.


Interesting. I'd love for long and short flags to be standardized. Their variable implementations are highly confusing for users.

It looks like date allows the + option. For example,

    date +'%Y-%m-%d-%H-%M-%S'

The + prefix for the format passed to date is mandated by POSIX-- to distinguish the date format from other arguments.

that's not the kind of option the article is talking about -- the long options are generally English words, and there is a limited set of them. date uses two dashes for them -- like "--utc" and "--set". They won't work with +.

Your example is a format string. date needed to tell if the positional argument is a format or a new date, and to make it easy, they decided to prefix the format with a special character. I am going to guess that this character should not be - (to avoid confusion with options), or numeric (to avoid confusion with new date), not have special meaning in common shells (to keep quoting simpler). They could have chosen : or ^ for example.

Nah. date could have used -f or any other letter to specify the format string. They did not _have_ to use +.

Completely agree, the implementer of date had many options. Choosing the + character, they apparently went for conciseness instead of convention.

Not sure what you are disagreeing with though; I made no claims that someone had to use +, or a positional argument in general. Just an observation about what might have guided the design process.

Am I the only one who read the title and thought this had to do with options trading?

I thought of the options syntax straight away. Do you work in finance by any chance?

I'm a programmer without any special finance knowledge and I also read it as being about the options the financial instrument at first, for some reason.

that explains dig?

And java -version

Java never supported bundling single-letter options, which is why no other meta-character is necessary at all.

ls -la == ls -l -a, but then you can't ls -all because it's the same, you need --all (or +all).
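
That's easy to check with GNU ls:

```shell
ls -la  /tmp > /dev/null    # bundled short options: -l plus -a
ls -l -a /tmp > /dev/null   # exactly equivalent
ls -all /tmp > /dev/null    # NOT a long option: parses as -a -l -l
ls --all /tmp > /dev/null   # the long form needs the second dash
```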

Java supports -cp or -classpath but in both cases the token after the - always identifies a single option, so you don't need the second dash.

I get the logic, but it still creates cognitive load to have to keep track of what a given cli utility supports

java -version AND java --version

Maybe because java was intended to run on multiple operating systems, although both look out of place on MS-DOS!

java --version only works on Java 11 and newer.
