
A magic getopt - Hello71
http://www.daemonology.net/blog/2015-12-06-magic-getopt.html
======
majika
This is really cool (I've been wondering how to do a string-switch for a while
now), but I don't think getopt is a great use case, because getopt still
results in imperative argument parsing and this always results in pain.

I've been working on libargs for the past year; it's a declarative
argument-parsing library for C:
[https://github.com/mcinglis/libargs](https://github.com/mcinglis/libargs)

The main idea is that each argument is by default parsed and stored as a
string, or you can optionally specify a function of the type `void f(char *
name, char * arg, void * dest)` to parse the argument string and store it in a
well-typed destination. This way, you can have an `int` argument by passing
`int__argparse` as the parser, and if the user passes a value outside the
range of `int`, then an appropriate out-of-range error is printed to the
console. Similarly with `uchar__argparse` or something like `point__argparse`
(e.g. taking some format like `{x,y}`).
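
To make the parser shape above concrete, here is a minimal sketch of an `int` parser with that signature. It is not libargs's actual code: the names `int_argparse_sketch` and `parse_failed` are hypothetical, and how the real library reports a failure is not described here, so the sketch only prints a message and sets a flag.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical error flag, used only by this sketch. */
static int parse_failed = 0;

/* Parser of the shape `void f(char *name, char *arg, void *dest)`.
 * Converts `arg` to an int and stores it through `dest`; on failure it
 * prints a message, sets `parse_failed`, and leaves `*dest` untouched. */
static void int_argparse_sketch(char *name, char *arg, void *dest)
{
    char *end = NULL;
    long value;

    assert(arg != NULL && dest != NULL);
    errno = 0;
    value = strtol(arg, &end, 10);

    /* Reject empty input and trailing junk such as "12abc". */
    if (end == arg || *end != '\0') {
        fprintf(stderr, "%s: '%s' is not an integer\n", name, arg);
        parse_failed = 1;
        return;
    }
    /* strtol reports overflow of long via ERANGE; int may be narrower. */
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
        fprintf(stderr, "%s: '%s' is out of range for int\n", name, arg);
        parse_failed = 1;
        return;
    }
    *(int *)dest = (int)value;
}
```

A caller would pass the address of an `int` as `dest`, e.g. `int_argparse_sketch("--count", "42", &count)`.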

libargs is quite flexible and has worked well for me so far. Automatic help
text generation can be added in future while maintaining (non-ABI) backwards
compatibility.

The main disadvantage is that it depends on other libraries I've developed
that are essentially Jinja-templated C source files that function as makeshift
generic types / typeclasses in C. Your inclination towards this approach
depends on taste; personally I much prefer deferring the pain to the build
system, as opposed to the source code.

~~~
ZaoFishbones
Fun fact, C++ reserves double-underscores anywhere in names for the
implementation. It's highly unlikely that you'll run into anything colliding
with your names in the wild, but if someone wants to use your library in C++,
it's technically bogus.

~~~
Kristine1975
It's similar in C. C11 Standard chapter 7.1.3:

> _All identifiers that begin with an underscore and either an uppercase
> letter or another underscore are always reserved for any use._

I doubt the library is usable in C++ though, since longjmp doesn't play well
with the destruction of local objects.

------
jzwinck
Anyone who designs command-line interfaces for GNU/Linux should read this:
[https://www.gnu.org/prep/standards/html_node/Command_002dLin...](https://www.gnu.org/prep/standards/html_node/Command_002dLine-Interfaces.html)
and this:
[https://www.gnu.org/prep/standards/html_node/Option-Table.ht...](https://www.gnu.org/prep/standards/html_node/Option-Table.html)

If you are using C++, Boost.Program_options is a solid choice. It is one of
the relatively few command-line argument parsers that support close to a full
set of (GNU-like) behaviors without custom workarounds. It also enables type
safety in the sense of disallowing decimals where integers were expected.

~~~
gioele
Thank you for the link. It contains this tiny gem: a glimpse into a path that
CGI applications have not taken:

> CGI programs should accept these as command-line options, and also if given
> as the PATH_INFO; for instance, visiting
> ‘http://example.org/p.cgi/--help’ in a
> browser should output the same information as invoking ‘p.cgi --help’ from
> the command line.

~~~
jzwinck
Regardless of CGI, I do advocate writing web services that respond to "GET
/help" with a developer-facing synopsis of available routes. Pretty much the
same thing.

~~~
LukeShu
I advocate using "OPTIONS /path" for that.

~~~
jzwinck
That prevents people from loading it with a plain-vanilla web browser. "GET
/help" is more accessible, therefore more likely to be used. By humans, I
mean.

------
yoshuaw
You can use getopt in POSIX shell too -
[https://github.com/yoshuawuyts/knowledge/blob/master/unix/sh...](https://github.com/yoshuawuyts/knowledge/blob/master/unix/shell.md#command-line-switches)

~~~
pdkl95
To add to this, here's a full example of conditionally using getopt that still
works without it. It also includes an example of the recommended-by-GNU help
and version options, which jzwinck mentioned.

[https://gist.github.com/pdkl95/363d48999e9df027a99c](https://gist.github.com/pdkl95/363d48999e9df027a99c)

------
jevinskie
I've found LLVM's solution to be the fastest/slickest method I have thus far
discovered. [0] I love that it is declarative and works across TUs. I've
thought about making a standalone distribution for some time.

[0]:
[http://llvm.org/docs/CommandLine.html](http://llvm.org/docs/CommandLine.html)

~~~
sgt
That looks very slick - although it seems to be C++ only.

------
cperciva
Answers to some questions here:
[http://www.daemonology.net/blog/2015-12-07-design-of-magic-g...](http://www.daemonology.net/blog/2015-12-07-design-of-magic-getopt.html)

------
cbsmith
One word: docopt.

~~~
ams6110
Not sure why you were downvoted. This is the way to go IMO. Declare your
options like writing a man page. Let docopt generate your option handling
code. There is an implementation for C.
[https://github.com/docopt/docopt.c](https://github.com/docopt/docopt.c)

~~~
thechao
That's a generator for docopt parsers, written in Python, targeting C. The
closest docopt (variant) in C that I know of is the one I wrote:
[https://github.com/jaroslov/docoptc](https://github.com/jaroslov/docoptc) .
Although, at best, I'd say that code is "looking for a maintainer".

------
joepvd
As I am not the first one to shamelessly self-plug: I made ngetopt.awk[0], an
argument parser for GNU awk that handles long options as well.

I've been using it quite a bit to turn a three-line awk program into a
full-fledged command with little effort. Very nice to share with colleagues.

[0]:
[https://github.com/joepvd/ngetopt.awk](https://github.com/joepvd/ngetopt.awk)

------
IgorPartola
Shameless plug: I wrote a Python library that not only lets you declaratively
define all your options, but also works with config files. Don't write
your own logic for this. Just use
[https://github.com/ipartola/groper](https://github.com/ipartola/groper)

~~~
Latty
For Python, I remember docopt ([http://docopt.org/](http://docopt.org/)) being
a big deal for a while - never used it, but sounds much nicer than writing out
lots of code to construct your options parsing. Any reason why groper is a
better option?

~~~
IgorPartola
groper is better in several ways. First off, you specify all your options in
Python, not a DSL that won't be checked until run time. I mean, sure you
could add code highlighting for docopt to your editor of choice, but chances
are it already supports Python and works well.

Second, with groper each module is free to define its own options. You no
longer need to have a centralized place where all the options are defined, and
you don't need to do the plumbing through your application to give the right
values to the right modules. This way your server.py can say "I want a host
and a port" and your logger.py can say "I want the verbosity level and the
filename" and the two don't have to know about each other (but can).

Third and most important, groper supports config files, which no other
argument parser does. Chances are that if you are creating something more than
just a simple command line program, you'd have too many options to specify on
the command line. Instead, it'd be lovely to have a config file, and be able
to override some of the options via command line args. Python's ConfigParser
is a mess, and it's a mess that works completely differently from
argparse/optparse/getopt. I don't consider something like config.py to be
good practice either.

So basically, groper is your one stop shop for getting config files and
command line options right. Oh, and it can generate sample config files for
you if you want.

------
tjholowaychuk
[https://github.com/clibs/flag](https://github.com/clibs/flag) is yet-another
approach, more like Go's flag package. Docopt is awesome too.

~~~
LukeShu
docopt is _almost_ perfect. It needs to support optional flag arguments (à la
`--color[=WHEN]`), and to have the first un-indented line after `Usage:` be
treated as prose. The latter is on `master` of the Python (reference)
implementation, but hasn't landed in a release or a spec.

Also, better errors for when the programmer screws up the formatting of the
doc string.
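
For readers unfamiliar with the `--color[=WHEN]` form: this is what `getopt_long` calls an optional argument. The sketch below is only an illustration of that behaviour in C, not anything docopt does; the helper name `parse_color` and the fallback value `"always"` are assumptions made for the example.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Returns the WHEN value for `--color[=WHEN]`:
 *   NULL      if --color was not given,
 *   "always"  if --color was given without a value (assumed default),
 *   the text after '=' otherwise. */
static const char *parse_color(int argc, char **argv)
{
    static const struct option longopts[] = {
        { "color", optional_argument, NULL, 'c' },
        { NULL, 0, NULL, 0 }
    };
    const char *when = NULL;
    int opt;

    assert(argv != NULL);
    optind = 1; /* restart scanning so the function can be called again */
    while ((opt = getopt_long(argc, argv, "", longopts, NULL)) != -1) {
        if (opt == 'c')
            when = (optarg != NULL) ? optarg : "always";
    }
    return when;
}
```

Note that with an optional argument the value has to be attached with `=`: `--color=never` sets WHEN to `never`, while in `--color never` the word `never` is treated as an ordinary operand.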

~~~
ridiculous_fish
Shameless self-plug: you may be interested in my version of docopt at
[https://github.com/ridiculousfish/docopt_fish](https://github.com/ridiculousfish/docopt_fish)
. Its syntax is more forgiving, it's capable of expressing more usage specs,
and it has excellent error reporting for doc strings.

~~~
LukeShu
Very cool! I'll definitely keep it in mind (though I've been tending to use Go
for greenfield projects recently).

------
codezero
Almost totally unrelated, but taking the opportunity to rant. If you use OS X,
your getopt is fully POSIX compliant (as are many BSDs, I think), unlike most
Linux distributions. This means globs (file arguments) must come after
options.

For example:

ls -la *.txt

OK

ls *.txt -la

Not OK

This drives me mad.
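
The ordering rule can be sketched in C. This assumes glibc, where a leading `+` in the option string requests POSIX ordering (scanning stops at the first operand); behaviour on other platforms is not verified here. The helper name `count_leading_options` and the flag set `la` are made up for the example.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Counts the options seen before the first operand (non-option argument)
 * and reports the index of that operand through *first_operand. */
static int count_leading_options(int argc, char **argv, int *first_operand)
{
    int count = 0;

    assert(argv != NULL && first_operand != NULL);
    optind = 1; /* restart scanning for repeated calls */
    opterr = 0; /* keep getopt from printing its own diagnostics */
    /* The leading '+' asks glibc for POSIX ordering; 'l' and 'a' are the
     * only flags this sketch knows about. */
    while (getopt(argc, argv, "+la") != -1)
        count++;
    *first_operand = optind;
    return count;
}
```

With `ls -la a.txt` this counts two options and reports the operand at index 2. With `ls a.txt -la` it counts none, because scanning stops at `a.txt` and `-la` is left behind as an operand.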

~~~
cperciva
Yes, I deliberately did not implement that GNU bug. I also avoided
implementing the --prefix-of-long-option bug.

~~~
codezero
It makes me very happy that you refer to this as a bug :)

------
RustyRussell
I can recommend CCAN's opt module:
[http://ccodearchive.net/info/opt.html](http://ccodearchive.net/info/opt.html)

