
Ask HN: What high level language for writing a command line utility? - lawpoop
I notice that a number of shell utilities are written in C. I have some ideas for some utilities, but I won't be learning C anytime soon. And it would seem to introduce a level of low-level concerns, such as memory management, that would be a pain for a small, simple utility.

What high level language would you recommend for writing a utility? Off the top of my head, it seems that python and ruby would be the top contenders. Bash itself seems too simple to write a decent-sized utility in; I understand things like arrays are cumbersome.
======
wahern
If you want the utility to be useable in most environments without the user
having to install dependencies, your best choices are POSIX shell and Perl.
This applies even if you only expect your utility to run on Linux, as embedded
Linux systems often only have dash, a strict POSIX shell.[1] If it's not a
totally barebones environment, it will invariably also have Perl.[2]

For maximum portability, you'll want to stick to POSIX shell scripts. If you
want to make use of complex data structures, then you'll probably want Perl.
You can emulate arrays in POSIX shell fairly easily, but not associative
arrays (aka dictionaries, hash tables).

[1] If you care about BSDs, know that Bash isn't installed by default, and
I've never needed to install it except on interactive shell servers where
squeaky users were adamant about using Bash. Interestingly, Solaris has
recently started shipping with Bash. But Perl is available natively on all
these. I don't know if Perl is installed by default on AIX, but it's present
on the AIX shell I have access to.

[2] IME you'll find Perl at least as often as Bash, though it's not
universally the case. I never understood the recent obsession with Bash
scripting. Its extensions are poorly designed (ksh is much more elegant) and
it has a horridly complex implementation with subtle gotchas. The one bright
point about Bash is that it has strong POSIX compliance. In any event, if you
want to use Bash because of its language extensions, IMO you should just use
Perl. But if you can deal without nice language constructs, you may as well
stick to POSIX compliant code. Among other things, the POSIX shell
documentation (the Single UNIX Specification is free from
opengroup.org) is simple and concise and easily navigable, in stark contrast
to Bash documentation. Read through the "Shell Command Language" chapter a
couple of times and you'll understand shell scripting better than the vast
majority of seasoned Unix hackers.

~~~
lawpoop
Is there an authoritative source on POSIX shell?

~~~
wahern
Yes, the Single UNIX Specification (SUSv4) published by the Open Group,
which is identical to the POSIX standard.[1] This publicly accessible URL for
SUSv4 Issue 7[2] works for me, but I'm unsure if it's stable or intended to
be directly accessible:

    http://pubs.opengroup.org/onlinepubs/9699919799/

You can download a free copy of the above HTML site, or a free copy of the
PDF-formatted specification, from
[https://www2.opengroup.org/ogsys/jsp/publications/Publicatio...](https://www2.opengroup.org/ogsys/jsp/publications/PublicationDetails.jsp?publicationid=11867).
The download requires free registration. I've never received spam from them. I
think at this point the Open Group is just happy to have more mindshare.

I prefer having a local HTML copy because even though the frames-based site is
eminently navigable, occasionally it's useful to be able to use grep(1) to
search through the standard.

Another useful resource is the bug tracker for the SUS/POSIX working group.
You can browse a list of the errata and requested future changes at
[http://austingroupbugs.net/view_all_bug_page.php](http://austingroupbugs.net/view_all_bug_page.php)

For years Sun was the de facto steward of the working group. But as is
evident, with some sleuthing, from the above database, Red Hat is the de facto
steward today. Which means if you're curious where Red Hat intends to evolve
(or lock down) the Linux user land, watching where the future POSIX/SUS
standard is going can be revealing. Commercial unices like Solaris and AIX
have begun implementing Linux API and utility extensions directly, but the
BSDs tend to emphasize implementing existing and forthcoming POSIX-defined
interfaces. In any event, unlike 15 or 20 years ago POSIX compliance is
exceptionally strong across the board, but especially on open source
platforms. If you're into systems programming, it pays to be familiar with the
standard. It may seem esoteric, but IME the commitment to the POSIX standard
by the community is stronger than it has ever been. Everybody is
asymptotically approaching complete POSIX compliance. I always code to POSIX
first and only supplement with native interfaces when there's demonstrated
utility. Linux has strong backward compatibility, but native interfaces can
change and are more likely to change than those required for POSIX compliance.
For example, Linux deprecated the BSD sysctl(2) syscall years ago in favor of
/proc, which broke a ton of applications, including Tor. Even though Linux
_technically_ still supports sysctl(2) via a compile-time option, it's been
removed from all modern Linux distributions AFAIK (certainly Red Hat a while
ago, more recently Ubuntu). By contrast, even if Linux doesn't support some
POSIX interface currently, it's likely that support will be eventually added.
Similarly, if you relied too heavily on glibc extensions 10 years ago, you
would have made a lot of trouble for yourself when musl libc came out.

I'm very pragmatic when it comes to maximizing portability, and I don't write
strictly POSIX compliant code. In fact, for C programming I've found that the
worst thing you could do for portability is define the _POSIX_C_SOURCE macro
as it hides both native interfaces as well as newer POSIX interfaces that are de
facto portable. (Most systems support most of the SUSv4/POSIX-2008 interfaces,
but nobody supports the complete suite of interfaces, so defining
_POSIX_C_SOURCE=200809L can backfire, and _POSIX_C_SOURCE=200112L is
needlessly restrictive.) And my "POSIX" shell scripts always use #!/bin/sh,
even though on Solaris the POSIX-compliant shell is at /usr/xpg4/bin/sh, and
shebang interpretation isn't even required by POSIX. (IME /bin/sh is
sufficiently compliant everywhere, as is shebang interpretation[3]; and
because there's no substitute for actual portability testing, the corner cases
will quickly reveal themselves. In any event with perhaps one exception I
don't think I've personally found a case where Bash or dash didn't follow
POSIX.) But, just to reiterate, from a purely pragmatic perspective
emphasizing POSIX compliance has proven a good habit.

[1] SUSv4 is the same as the POSIX-2008 specification.

[2] Issue 7 is the latest amended specification which fixes errata for
SUSv4/POSIX-2008. I suspect SUSv5 will come out in the next year or two, but
that's just conjecture.

[3] However, shebang interpretation is definitely not implemented the same way
everywhere. For example, the Linux kernel concatenates all arguments after the
command, whereas BSD kernels (including macOS) will split the shebang line up
to some fixed limit. Which is why you can do "#!/usr/bin/env perl" but not
"#!/usr/bin/env perl -w"; because on Linux /usr/bin/env is invoked with a
single argument, "perl -w", rather than two arguments, "perl", "-w". Linux and
macOS support recursive invocation of shebang commands--where the command,
/bin/foo, invoked by #!/bin/foo is itself a script using shebang
interpretation--but nobody else does.
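
As a rough illustration of the difference described in [3], here is a minimal Python sketch of the two argument-splitting behaviours. It is a simplified model, not kernel code: the function names are made up for this example, and it ignores length limits and other per-platform details.

```python
def linux_style_argv(shebang_line, script_path):
    """Model the Linux behaviour described above: everything after the
    interpreter path is passed along as ONE argument."""
    body = shebang_line[2:].strip()      # drop the leading "#!"
    parts = body.split(None, 1)          # interpreter, then the rest (if any)
    argv = [parts[0]]
    if len(parts) > 1:
        argv.append(parts[1].strip())    # "perl -w" stays a single string
    argv.append(script_path)             # the script itself comes last
    return argv


def split_style_argv(shebang_line, script_path):
    """Model a kernel that splits the shebang line on whitespace."""
    return shebang_line[2:].split() + [script_path]


if __name__ == "__main__":
    line = "#!/usr/bin/env perl -w"
    print(linux_style_argv(line, "./tool"))
    print(split_style_argv(line, "./tool"))
```

With "#!/usr/bin/env perl -w", the first model hands env the single argument "perl -w", which env then tries to look up as a program name; the second hands it "perl" and "-w" separately.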

------
savethefuture
I've used golang and
[https://golang.org/pkg/flag/](https://golang.org/pkg/flag/) in the past;
it works pretty well (and I love Go). Here's another great resource:
[https://gobyexample.com/command-line-flags](https://gobyexample.com/command-line-flags)

------
melezhik
Take a look at sparrow - [https://sparrowhub.org](https://sparrowhub.org).
It is a framework that makes it easy to develop and distribute command line
utilities written in various languages: Perl, Bash, Python and Ruby. Many
features are included, so you don't have to spend your time on the things you
usually do when developing CLI tools.

------
flukus
If you're considering python and ruby then you might want to consider crystal
(and the python equivalent that I don't remember the name of) and avoid the
runtime costs of a VM. Can you give some examples of the kind of utilities
you're talking about? Is it something you'd load up and have running all day
or a command you'd run frequently? VM overhead wouldn't matter in the former,
but would in the latter. What are the performance needs of the utilities?
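
For a rough sense of the per-invocation cost in question, the sketch below times how long an interpreter takes to start and exit without doing any work. It assumes Python is the runtime being measured; the function name is made up for this example, and the numbers will vary widely by machine and by runtime.

```python
import subprocess
import sys
import time


def mean_startup_seconds(command, runs=5):
    """Return the mean wall-clock time to run `command` to completion."""
    total = 0.0
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            command,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        total += time.perf_counter() - start
    return total / runs


if __name__ == "__main__":
    # sys.executable is the running Python interpreter; "-c pass" does nothing,
    # so the measurement is dominated by interpreter startup and teardown.
    cost = mean_startup_seconds([sys.executable, "-c", "pass"])
    print(f"mean startup: {cost * 1000:.1f} ms")
```

Passing a different command list (for example, a compiled binary that exits immediately) gives a comparison point on the same machine.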

Manual memory management isn't really a problem for small, simple utilities;
it becomes a problem on larger scale systems. C is a rather simple language to
learn.

~~~
lawpoop
I guess I'm not so concerned about doing it in C, but rather getting stuck in
certain places (I'm only thinking of memory management, but I suppose there
are others) that higher level languages avoid completely. Also, as a learner,
I am afraid I would be writing idiosyncratic weirdness instead of idiomatic,
readable code.

I would want the codebase to be accessible to a wide audience.

------
manyxcxi
It entirely depends on my use case. If I'm stringing I/O together from other
applications, I'll just glue them together in Bash.

If I'm building something a bit more robust or integrated I'll use Python or
Groovy or Node.

An example would be: we have a Java web app with some REST APIs for updating
content. Instead of rebuilding the models, I just pulled the domain objects
into a core library and wrote a Groovy CLI around it.

~~~
lawpoop
I'll probably want something that communicates with the web through REST
calls, with authentication, so I'll need SSL.

~~~
manyxcxi
I've done a lot with curl and jq for REST calls with bash. But if I'm doing a
lot of API calls I usually go straight to Python or Node, usually Node because
I feel like I'm faster with it for that type of stuff.

I tend to save Groovy for bigger projects, or take my bash scripts that get
unwieldy and move them over.
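
For the Python route, a minimal sketch of a CLI that makes an authenticated REST call over HTTPS using only the standard library might look like the following. The bearer-token scheme, the --token flag, and the example.com URL are assumptions for illustration, not anything from a real API.

```python
import argparse
import json
import urllib.request


def build_request(url, token):
    """Build a GET request carrying a bearer token (hypothetical auth scheme)."""
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    )


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Fetch a JSON resource over HTTPS.")
    parser.add_argument("url", help="resource URL, e.g. https://api.example.com/items")
    parser.add_argument("--token", required=True, help="API token (hypothetical flag)")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    request = build_request(args.url, args.token)
    # urlopen verifies TLS certificates by default for https:// URLs.
    with urllib.request.urlopen(request, timeout=10) as response:
        print(json.dumps(json.load(response), indent=2))


if __name__ == "__main__":
    main()
```

Keeping request construction separate from the network call makes the argument handling and headers easy to check without hitting a server.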

------
kristianp
If you already know a dynamic language like Ruby or Python, or a high level
static language like C# or Java, then use the language you know.

------
balinjdl
Although Node.js has been associated (usually) with web programming, it can be
very useful for command-line utilities. Worth considering, as the number of
supporting libraries is very large.

------
jmduke
Python has click ([http://click.pocoo.org/5/](http://click.pocoo.org/5/)),
which is pretty great.

------
anigbrowl
C is a beautiful language. Give in to the temptation.

