What was the point of [ "x$var" = "xval" ]?

dataflow · on April 12, 2021

It's still relevant if you use -a, -o, or parentheses (which I guess is a reason to avoid them):

  $ a=1; b=1; c='!'; d='!'
  $ [ "$a" = "$b" -a "$c" = "$d" ]
  -bash: [: too many arguments
  $ [ '!' '(' "$c" = "$d" ')' ]
  -bash: [: `)' expected, found !
  $ echo "${BASH_VERSION}"
  5.1.4(1)-release

Note that it's not even a bug here; it's necessary for disambiguation.
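A minimal sketch of the two usual workarounds, reusing the variables from the session above. The result names `prefixed` and `chained` are only illustrative, and the 7-argument form with `-a` is unspecified by POSIX even though common shells accept it:

```shell
a=1; b=1; c='!'; d='!'

# Workaround 1: the historical "x" prefix, so no operand can be mistaken
# for an operator such as "!" or "(".
if [ "x$a" = "x$b" -a "x$c" = "x$d" ]; then prefixed=equal; else prefixed=different; fi

# Workaround 2: avoid -a entirely and chain two three-argument tests,
# which are unambiguous even when an operand is "!".
if [ "$a" = "$b" ] && [ "$c" = "$d" ]; then chained=equal; else chained=different; fi

echo "$prefixed $chained"
```

The second form is usually the one to prefer, since it needs no prefix and stays within what POSIX specifies.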

fooblat · on April 12, 2021

Alternatively you can use the newer (and highly recommended) bash syntax that is also quote safe:

     $ a=1; b=1; c='!'; d='!'
     $ [[ $a = $b && $c = $d ]] && echo true
     true

For anyone writing bash scripts today I would avoid the old test/[ functions entirely.

simias · on April 12, 2021

As always my take on this is that if you don't care about portability, why even bother with bash?

My rule of thumb is that if your script becomes so complicated that you find POSIX sh limiting, it's probably a good hint that you should rewrite it in a proper programming language. Bash scripting is an anti-pattern in my opinion; it's the worst of both worlds since it's neither portable nor that much better than POSIX.

derefr · on April 12, 2021

There’s portability and there’s portability.

I don’t really care about targeting, say, IRIX, or z/OS, or the early-2000s-era Windows POSIX Subsystem. But I do care about targeting Linux, macOS/iOS, all the modern living BSDs, and Android (where applicable.) I also maybe care about letting my software run under a Linux Busybox userland.

It’d be great to know what the lowest-common-denominator standard is for just that set of targets. I don’t care about POSIX, but if that set of targets had a common formal named standard, I’d adhere to it rigorously.

Since it doesn’t, I just mostly adhere to POSIX — except where it gets to be too much (like having to write 1000-line Bourne Shell scripts.) When that happens, I look around to see if the set of targets I care about all do something the same way. If they do (such as all happening to ship some — perhaps ancient — version of Bash), then I break away from POSIX and take advantage of that commonality.

I don’t know enough about all the things these OSes do or do not have in common to truly code to the implicit standard 100% of the time. I wish I did. So, in most respects, wherever I haven’t done independent research, I have to hold myself to a much stricter standard — POSIX. But it’s only ignorance keeping me doing it!

simias · on April 12, 2021

>I don’t really care about targeting, say, IRIX, or z/OS, or the early-2000s-era Windows POSIX Subsystem.

Sure, those are very niche, but not all BSDs or Linux distros have bash in the base distribution. And if you work with embedded systems, bash is very much a luxury. You mention busybox, but that one is very much a crapshoot because everything in busybox is opt-in. You want to use `seq`? Better make sure that it's configured. On the other hand you almost certainly won't have bash on these systems either.

At any rate if the alternative is between having to install python, ruby, perl or lua on my machine or having to maintain and hack on ginormous bash scripts I know what I'll choose.

But in general I like your approach, it's pragmatic. It's just pretty hard these days to find a system that ships bash but can't give you some decent scripting support, at least in my experience.

MichaelMoser123 · on April 13, 2021

not just limited to embedded environments, some docker images used in cloud setups also don't have bash - alpine doesn't have it. The alpine image is used a lot, due to its small size; you can add bash easily, but bash is not part of the base image (there is a difference if you are in an ssh session to a container that doesn't have access to the net)

   docker run --rm -it alpine
   / # bash
   /bin/sh: bash: not found

floatingatoll · on April 12, 2021

Note that targeting bash on macOS means targeting bash v3, as Apple is not updating to bash v4+ (due to GPLv3) and has changed the default interactive shell to zsh as of macOS 10.15 (Catalina).
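One way a script could notice that it is running on an old bash is sketched below. `bash_at_least_4` is a hypothetical helper, not something bash provides; it only parses the major number out of a version string such as `$BASH_VERSION`:

```shell
# Hypothetical helper: succeeds only if the given version string
# (default: $BASH_VERSION) has a numeric major version of 4 or more.
bash_at_least_4() {
  version=${1-${BASH_VERSION-}}
  major=${version%%.*}            # "5.1.4(1)-release" -> "5"
  case $major in
    ''|*[!0-9]*) return 1 ;;      # empty or non-numeric: not bash, or unparseable
  esac
  [ "$major" -ge 4 ]
}

if ! bash_at_least_4; then
  echo "note: bash 4+ features (e.g. associative arrays) are unavailable here" >&2
fi
```

The helper is written in plain POSIX sh so that the check itself still runs in the shells it is meant to detect.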

fooblat · on April 12, 2021

> it's the worst of both worlds since it's neither portable nor that much better than POSIX

Writing modern bash using the updated syntax, it is easy to make robust scripts and the syntax around things like quoting is simple and consistent.

It seems odd to me to suggest that one should either stick to the old brittle /bin/sh syntax or not use bash at all. It has its place in the world and it should be done the modern way.

You'll find 10s of thousands of lines of bash in the kubernetes repo and all the files I checked are written using modern bash syntax. There must be a reason they choose bash for these parts.

Of course, in any case everyone should use the tools they like and find most suited to the task at hand.

scbrg · on April 12, 2021

> There must be a reason they choose bash for these parts.

Kubernetes is a Google creation. Google mandates bash for shell scripts: https://google.github.io/styleguide/shellguide.html#s1.1-whi...

lanstin · on April 13, 2021

Because it's everywhere anyone cares about and isn't some weird thing like zsh or Lord forgive us tcsh.

Also, all of these modern tools are basically fancy wrappers around bash scripts. Even a Dockerfile is just a way to run some shell commands at image build time and other shell commands at launch time. Not to mention puppet or ansible or user-data.sh ....

skj · on April 12, 2021

"there must be a reason" hahahahahaha

chasil · on April 12, 2021

Shellcheck specifically supports POSIX, BASH, and Korn. Each of these has an important role in scripting.

Development of the formal ksh93 was halted and rewound by AT&T after David Korn's departure, precisely because the user community remains extremely intolerant to changes in functionality and/or performance. Korn shell development that requires a "living" shell should likely target both ksh93 and the MirBSD Korn shell (the default shell in Android).

BASH has diverged from the Korn shell in a few (annoying) ways, but remains core to the GNU movement and rightly commands its own audience. It is much larger than the MirBSD Korn shell.

POSIX shells were developed to target truly minimal systems - POSIX rejected Korn likely due to (Microsoft) XENIX on 80286 systems that only allowed 64K text segments. Korn was able to run on such machines, but the code was not maintainable. Clean C source for a POSIX shell is far easier to achieve for XENIX, and remains a better fit for embedded environments. This is likely why POSIX is not Korn.

alerighi · on April 12, 2021

For small scripts, bash is still the easier way to go. Other programming languages are not as immediate for doing simple things. How many lines of code does it take to pipe one command into another in Python, for example?

Also, bash is not portable, but you can expect to find a recent version of bash on any Linux system. And for BSD systems or macOS, you can install it with a single command. And yes, there are other ancient UNIX systems, but who uses them? But what other programming language can you be sure to find on any system?

Finally, bash is quite efficient. I mean that it doesn't have the weight that a full programming language like Python has, especially the time that it takes to start the interpreter. That could make a difference in scripts that are executed many times from other scripts or programs. Or even in wrapper scripts, since bash doesn't add noticeable delay when executing a command.

There is still a reason to use and know bash to me. Of course implementing whole programs in bash doesn't make sense, but for simple scripts it does.

andrewshadura · on April 12, 2021

> How many lines of code does it take to pipe one command into another in Python, for example?

None.

  from sh import sort, du
  from glob import glob
  print(sort(du(glob("*"), "-sb"), "-rn"))

eru · on April 17, 2021

Looks like three lines to me?

throwaway823882 · on April 12, 2021

Sometimes you write POSIX shell (or very nearly POSIX) for portability. Sometimes you write in "a proper programming language" for more powerful/flexible features. And sometimes you write Bash to get both.

Bash is ported to basically every system, it's a small binary (by today's grotesque standards), and it gives you some added functionality that would be otherwise annoying to re-implement in POSIX sh (or Bourne sh). But it's also far less complex than Python. In today's world, if someone has any shell, it's a safe bet that they either already have Bash, or they can install it just like they'd have to install Python.

It's less painful in the long term to use Bash rather than Python. Python is more time-consuming and costly. It's a larger space requirement, you have to manage more runtime dependencies and a virtual environment, debugging is worse, and there's more opportunity for bugs because it's more complex. Bash scripts also rarely get bugs down the line due to dependencies changing, whereas this happens in Python so frequently we have to pin versions in a virtualenv. When's the last time you pinned the version of a Bash script dependency?

chasil · on April 12, 2021

There are two specific cases where you do not have BASH.

1. Ubuntu/Debian /bin/sh is the Debian variant of the Almquist shell, known as dash. The only non-POSIX element it includes is the "local" keyword (afaik).

2. Busybox, with the Almquist shell. Busybox also has a "bash" which offers a few cosmetic improvements to ash (an alias of [[ to [ is one that I can see in the source). I write POSIX shell scripts on Windows with busybox quite often.

The only other shell that omits everything outside of the POSIX specification is mrsh. I don't think mrsh is widely-deployed in any major distribution.

(You also aren't going to have bash in AIX/HP-UX and maybe a Solaris base load, but as the parent article says, we aren't talking about dinosaur herders.)

alexhutcheson · on April 13, 2021

For #1, you can just put #!/bin/bash at the top of the file to use Bash. Bash is still available, it’s just not the default for scripts that specify #!/bin/sh.

#2 is still currently tricky, but Rob Landley (former Busybox maintainer) is working on a full bug-for-bug compatible Bash clone called toysh which will be included in an upcoming release of Toybox[1]. Once that’s released, I’m looking forward to (hopefully) never writing a script for BusyBox ash again.

[1] http://landley.net/toybox/about.html

saagarjha · on April 13, 2021

tcsh was the default shell in Mac OS X for a while…

HellsMaddy · on April 12, 2021

Bash is ugly and rife with pitfalls, but it can be incredibly productive for certain classes of problems if you know what you’re doing. Trying to use Python or Go to whack some external commands together feels very cumbersome to me.

For pure programming, “real” languages are absolutely preferable in almost every way, but all of them fail when it comes to running external programs and redirecting their output in a way that doesn’t make me pull my hair out. As a heavy user of both “real” programming languages and shell scripting languages, I’m left craving something that brings the best of both worlds.

There are a number of newer shell projects in this vein that are very exciting, like Oil [0], Elvish [1], and Xonsh [2]. I was also hopeful that Neugram [3] would go somewhere, but the project seems to have died out. While many people cite Bash’s lack of portability as a reason not to use it, I find Bash to be very portable for my use cases and avoid using these newer shells for their lack of portability. Maybe one day we can have nice things.

[0]: https://github.com/oilshell/oil

[1]: https://github.com/elves/elvish

[2]: https://github.com/xonsh/xonsh

[3]: https://github.com/neugram/ng

andrewshadura · on April 12, 2021

> Trying to use Python or Go to whack some external commands together feels very cumbersome to me.

Try https://amoffat.github.io/sh/

  from sh.contrib import git
  git.reset(hard=True)

PhantomGremlin · on April 14, 2021

How is this better than just using subprocess? Faster? More elegant?

The doc says: sh is a full-fledged subprocess replacement

It would be nice if that page had some sort of comparison. I guess the examples shown are sufficient if you're a regular user of subprocess?

HellsMaddy · on April 12, 2021

Wow that’s cool. Thanks for sharing.

hnlmorg · on April 12, 2021

It's worth adding mine to the list too:

[4]: https://github.com/lmorg/murex

Superficially it's quite similar to elvish (purely by coincidence) but it's aimed around local machine use (eg by developers and devops engineers) so is as much inspired by IDEs as it is by shells.

throwaway823882 · on April 12, 2021

Sad that none of those projects took the easy route. If you want to replace Bash, do it the way Bash replaced Bourne: Make it backwards compatible.

Bash acts like 'historical versions of sh' if it was invoked as sh. So, a Bash replacement could act like Bash if it was invoked like bash. Then implement all the extra crazy shit if you invoke it as slash or something.

Assuming it was a small, fast, compiled binary, this would take off in all the distros pretty much immediately. And if you really want it to succeed, implement a spec first, and implementations second. Add a slash implementation to Busybox, and then it's on every embedded Linux system in the world.

hnlmorg · on April 12, 2021

Literally the first in the GP's list, Oil shell, aims to do just that.

> Oil is a new Unix shell. It's our upgrade path from bash to a better language and runtime.

https://www.oilshell.org/blog/2021/01/why-a-new-shell.html

HellsMaddy · on April 12, 2021

It would certainly be nice, but I wouldn’t call that the easy route. Creating a new shell that’s fully backwards compatible with the monstrosity that is Bash sounds like a massive undertaking. POSIX or the Bourne shell is probably an easier target, and there are numerous alternative shells that are POSIX-compatible and add nice new features or optional alternate syntaxes. But I agree, to properly dethrone Bash would probably require backwards compatibility with it.

hnlmorg · on April 12, 2021

Personally I don't see the point dethroning bash. I mean Bash still hasn't completely dethroned the Bourne shell. And now there is a lot more competition in shells than there ever was.

Personally I think the better approach is to accept that bash/sh will always be about for legacy stuff and for alternative shells to carve out a niche elsewhere. Particularly because some of bash/sh's pain points can't be addressed without breaking compatibility in the first place (like handling file names with spaces).

lordgroff · on April 12, 2021

This used to be my take, but I'm not sure any more. Now, I don't think Python is ideal, and I am looking for something that maybe fits better than it in that gray zone of programming language vs "quick" script, but I can't see your point about debugging and bugs: bash is notoriously annoying to write bug free, and I can't see what's the issue with the Python debugger.

Also if we're in that gray zone of bash vs Python, you can pretty much stick to the stdlib.

scoopdewoop · on April 12, 2021

I'm hopeful that Deno can be just that runtime. Since it's a single binary, it's trivial to install and script for. TBH, I don't have any technical reasons for JS/TS over python besides that I have coworkers that have never had to learn python and would prefer not to.

  /**
   * cat.ts
   */
  for (let i = 0; i < Deno.args.length; i++) {
    const filename = Deno.args[i];
    const file = await Deno.open(filename);
    await Deno.copy(file, Deno.stdout);
    file.close();
  }

dylan604 · on April 12, 2021

python3 or python2.7???

brobinson · on April 12, 2021

I would go with the one that hasn't been out of support since Jan 1, 2020.

https://www.python.org/doc/sunset-python-2/

dylan604 · on April 12, 2021

yet everything comes standard with stock install of 2.7. ???

chungy · on April 12, 2021

No responsible distro still ships 2.7

dylan604 · on April 12, 2021

hahahaha. I like your definition or responsible as AWS Linux 2 comes with 2.7 and macOS comes with 2.7

glad to know i'm such a hellion running such irresponsible OSes! down with the man!! I do what I want!

Or more likely, the world isn't as clear cut as you might think

chungy · on April 12, 2021

> I['d] like your definition [f]or responsible

A distribution that removes Python 2.7 once upstream security support has been cut. Since Python is frequently used on network-facing services, it is unconscionable to allow its use past January 2020. Heck, one might even have setuid programs in Python and you don't really want to risk an unfixed local privilege escalation either.

> AWS Linux 2

A butchered version of CentOS

> and macOS

Not a distribution, and Apple doesn't exactly have a good track record with keeping things up to date or secure.

> Or more likely, the world isn't as clear cut as you might think

Given that the vast majority of the Python ecosystem is no longer compatible with 2.7, there isn't even a technical reason for kicking around 2.7 anymore.

Python 2.7 is trivial to build and run if you really, really want or need it, but since all the major upstream distributions have removed it, it sends a clear signal that you must take the risks into serious consideration before doing it.

fomine3 · on April 13, 2021

Ruby!

blibble · on April 12, 2021

python is indeed very painful for typical short shell script tasks

for example: anything with pipes

however once you start having to run multiple background jobs with custom wait logic it probably becomes worth it
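For reference, a rough sketch of what per-job wait logic looks like when it stays in POSIX sh. `task` is a made-up stand-in for real work, and the status variables are illustrative names:

```shell
# Hypothetical job: sleeps for $1 seconds, then exits with status $2.
task() { sleep "$1"; return "$2"; }

task 0 0 & pid_ok=$!     # start both jobs in the background
task 0 3 & pid_bad=$!    # and remember each PID

# `wait PID` returns that job's exit status, so each one is collected separately.
status_ok=0;  wait "$pid_ok"  || status_ok=$?
status_bad=0; wait "$pid_bad" || status_bad=$?

echo "$status_ok $status_bad"
```

Anything beyond this (timeouts, retrying one job, cancelling the rest on first failure) has to be built by hand from `wait`, `kill` and PID bookkeeping, which is where a general-purpose language starts to pay off.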

tacitusarc · on April 12, 2021

Python is far easier for anything involving http and data parsing/manipulation. It's far worse at running sequential shell programs.

andrewshadura · on April 12, 2021

  from sh import wc, ls
  print(wc(ls("-1"), "-l"))

citrin_ru · on April 13, 2021

Try to do in Python something like:

Where the size of the data >> RAM (but less than free space in /tmp). It is of course possible, but I suspect it would require more effort and would run slower than the shell line above.

Too · on April 14, 2021

If you want to compare memory efficiency, the sort+uniq -c would likely be more optimal in a high-level language, as there you would keep only the count+value in a defaultdict or the like instead of sorting the entire dataset, which requires keeping it all in memory.
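The same count-per-value idea can be sketched without leaving the shell by using an awk array. This assumes the pipeline under discussion ends in something like `sort | uniq -c`; both function names below are illustrative:

```shell
# Sort-based counting: every input line goes through sort.
count_with_sort() { sort | uniq -c; }

# Hash-based counting: keeps one counter per distinct line in an awk array,
# which is the shell-side analogue of a defaultdict of counts.
count_with_hash() {
  awk '{ count[$0]++ } END { for (value in count) print count[value], value }'
}

# awk's for-in order is unspecified, so sort the (small) result for display.
printf 'b\na\nb\n' | count_with_hash | sort
```

The trade-off is the one debated here: the array holds only the distinct values, which helps when there are many repeats, but it lives entirely in memory.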

As for brevity, yeah, this particular example would require 5 lines instead of 1.

citrin_ru · on April 14, 2021

For small datasets something like defaultdict may be more efficient, but if it would not fit in RAM you have a problem.

gmfawcett · on April 12, 2021

In the space of shell programming, virtualenv and third-party Python libraries are irrelevant and unnecessary.

rattray · on April 12, 2021

Huh? That's only true if you don't use any libraries, right?

What if you want to use something like delegator[0] to make running shell commands less toilsome?

[0] https://github.com/amitt001/delegator.py

lmm · on April 12, 2021

Python with no libraries is still a much, much nicer programming environment than bash.

dylan604 · on April 12, 2021

How say you? To even ask the system a question in python, you have to "import os". Is that not a library? I'm not a python-first type of person, but I have hacked enough scripts to be functional. My one take away is how python core is not very useful, and to even remotely do anything one must have a list of imports.

scoopdewoop · on April 12, 2021

'os' is in the standard library, not a concern when discussing virtual-env and third party libraries.

dylan604 · on April 12, 2021

then why must it be imported as a 3rd party? this is my question on python in general. i get needing to explicitly import things that are not part of the core/standard, but why are core/standard required to be imported? why can't the methods they provide just be immediately available?

*clearly, i've never taken a python CS style class, but this importing of core functions is just strange to me

gmfawcett · on April 12, 2021

"Third party" means a third party provided it: not yourself, and not the Python distribution, but someone else.

"os" is a second-party library. You didn't write it; it was included in the Python distribution for you.

What you're asking, I think, is why the "os" functions are not, in Python terminology, "builtins." The builtins are functions that are available without any kind of import: like len(), max(), sorted(), etc.

https://docs.python.org/3/library/functions.html

Why do you have to "import os"? Python was designed that way, that's all. So were the majority of other languages that come with standard libraries. (JavaScript is one exception that comes to mind, but there aren't many such exceptions.)

The discussion above relates to "virtual environments", which are ways to manage third-party dependencies. My claim is that virtualenvs, while handy for general-purpose development, are basically pointless for replacing shell scripts. You don't need them -- the core language and its standard library are sufficient for most shell scripting purposes. I'm basing this opinion on my ~20 years of using Python for this kind of work.

admax88q · on April 12, 2021

Using a "proper" programming language adds a dependency. If you're okay adding a dependency, then adding a dependency on bash should be fine.

The worst part of Linux development is the insistence on 'portability' when using the Unix tools. As a result bash, make, awk, have all stagnated.

amelius · on April 12, 2021

With all the different shells that could be on a system like sh, zsh, ksh, csh, tcsh, ... I'm thinking that bash itself is a dependency.

alerighi · on April 12, 2021

You can have other shells that are not bash, true. But you are guaranteed to find a bash shell in any Linux distribution (well, except minimal distros like OpenWRT or Alpine Linux that for saving space use busybox). Let's say that in any server or desktop Linux installation you find bash.

You would probably also find Python, but what version? Python 3? These days you are probably guaranteed to find it, except that there are old Red Hat servers still in production where it is not installed. And Python 3, what version? 3.7? Older? What features are safe to use and what are not? Do you have to research to find out? Or are you stuck with using Python 3.2 just to be safe? Well, in the time you wasted thinking about that, you would have finished writing the script in bash.

andoriyu · on April 12, 2021

Agreed. I think bash is a dependency. Even more, it's a dependency I'd want to avoid at all cost.

Dylan16807 · on April 12, 2021

What's a good environment that has:

1. interactive file listing and tab completion

2. the ability to run one-off lines

3. the ability to run parts of scripts and build up into a bigger script

Currently I write a lot of small-ish scripts using either bash or bash plus a text editor. I would like to use a less obtuse language than bash, but the interactive part is really important. I don't want a system where I have to run exploratory and test commands in bash and then translate them into a different language, because that means figuring out how to say it twice and doing extra debugging.

amelius · on April 12, 2021

> Bash scripting is an anti-pattern in my opinion

Totally agree here. Shells are good for one thing: user interaction. For all the rest, use a programming language.

bityard · on April 12, 2021

Except for perhaps C, which "proper" programming language has been ported to more systems and architectures than bash?

gnubison · on April 12, 2021

> Alternatively you can use the newer (and highly recommended) bash syntax that is also quote safe

Well, mostly:

  $ a=1; b='*'; c='!'; d='*'
  $ [[ $a = $b && $c = $d ]] && echo true
  true
  $ [[ $a = "$b" && $c = "$d" ]] && echo true
  $

From the bash(1) manpage:

> When the == and != operators are used, the string to the right of the operator is considered a pattern and matched according to the rules described below under Pattern Matching, as if the extglob shell option were enabled.

POSIX syntax is more consistent and more portable:

  $ [ "$a" = "$b" ] && [ "$c" = "$d" ] && echo true
  $

WaxProlix · on April 12, 2021

You can also use shellcheck[1], which has saved me from pushing numerous potential bugs into production.

[1] https://www.shellcheck.net/

nik_0_0 · on April 12, 2021

This article appears to be discussing how to handle this case from within said shellcheck :)

dataflow · on April 12, 2021

It's always fun to assume everyone is running Bash until your Mac's zsh or your Ubuntu's dash (or whatever) choke on your script a few times. When this happens enough times you eventually reconsider...

linkdd · on April 12, 2021

That's what shebangs are for:

  #!/usr/bin/env bash

If it's not there, it will fail.

I expect every script with `#!/bin/sh` as shebang to be POSIX shell only.

dataflow · on April 12, 2021

> If it's not there, it will fail.

Until your users/colleagues source it into another shell, or copy-paste some part of it into their shell/Makefile/Bazel/other script... and then either you feel sorry for them, or you laugh at them and blame them for burning themselves and trashing their system. (Hopefully your choice didn't cause their files to get wiped. Unless you think they deserved it for daring to paste your code somewhere else, I guess.)

The world isn't perfect.

linkdd · on April 12, 2021

This problem is not exclusive to bash, try copy-pasting python 2 code into python 3.

Or ES2020 code in a browser.

Or ASM z80 into an x86 program.

When you're copy-pasting, you should be aware of what you're copying, and where you're pasting it.

Surely, no domain in IT is idiot-proof.

dataflow · on April 12, 2021

> This problem is not exclusive to bash, try copy-pasting python 2 code into python 3.

No, it is highly exclusive to Bash actually. In Bash, even one-character mistakes have a high potential to cause catastrophic damage. Just splitting or globbing on the wrong character can wipe your filesystem. As well as issues like these, where your string being '!' suddenly causes the command to be interpreted entirely differently. And Bash has modes that keep going when there is an error so it's not like you'll even know you've encountered an error or get any kind of visible crash + stack trace necessarily. It just keeps trashing everything further and further. These are extremely wildly different from Python.
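A small, harmless illustration of the splitting hazard, assuming the default `IFS`. `count_args` is a throwaway helper that just reports how many arguments it received:

```shell
# Throwaway helper: prints the number of arguments it was called with.
count_args() { echo "$#"; }

name='my file.txt'

unquoted=$(count_args $name)     # unquoted: split on the space into two words
quoted=$(count_args "$name")     # quoted: passed through as a single word

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
```

Replace `count_args` with `rm` and the unquoted form targets `my` and `file.txt` instead of the intended file. zsh does not split unquoted expansions by default, which is one reason the same line can behave differently when pasted there.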

> Surely, no domain in IT is idiot-proof.

I would love to know what happens when you say this to a user who pastes your Bash in zsh. Do you tell them your code isn't idiot-proof? Is everyone who ends up running Bash in zsh for some reason or another an idiot?

linkdd · on April 12, 2021

If you're not doing `set -euo` and if you run your scripts as root before testing them, you're asking for something wrong to happen.

EDIT in response to your EDIT:

My users get either:

  - a tar.gz archive
  - a binary
  - a docker image
  - a helm chart
  - a POSIX shell script (very rare)

Then they can use any method they want to deploy it.

You want your software to be portable? You have to put in the work to do it.

If you expect your users to know better (assume they ARE idiots), again, you're asking for something wrong to happen.

dolmen · on April 12, 2021

You mean "set -euo pipefail", don't you?

You should use more copy/paste. ;)

dataflow · on April 12, 2021

> If you're not doing `set -euo` and if you run your scripts as root before testing them, you're asking for something wrong to happen.

...because you can't delete important files unless you're root, right? Subfolders in your home directory are totally OK being deleted? You store your documents so that only root can modify them?

And do you know set -e has a ton of its own pitfalls and doesn't save you from this? It's a half-broken mitigation. I don't have the energy to keep arguing here, but I would suggest learning about the pitfalls of Bash and set -e and such.
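One such pitfall, as a minimal sketch: `set -e` is ignored for everything inside a function that is called as an `if` condition (or on the left of `&&`/`||`). The function name `risky` is made up for the example:

```shell
risky() {
  false                  # a failing command in the middle of the function
  echo "still running"   # reached anyway, because -e is ignored in this call
}

# Run inside a command substitution so that `set -e` stays out of the caller.
out=$(
  set -e
  if risky; then echo "reported success"; fi
)

printf '%s\n' "$out"
```

Here the failure of `false` is silently skipped and `risky` ends up reporting success, even though `set -e` is active. Called as a plain statement instead, the same function would abort at `false`.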

DonHopkins · on April 12, 2021

No, if you're using a language that requires you to perform some obscure incantation like "set -euo" before running your scripts, you're asking for something wrong to happen.

Blame the disease, not the symptom.

dylan604 · on April 12, 2021

if you're pedantic enough to use "set -euo", you're going to add it to the script and not make a user set it first.

acdha · on April 12, 2021

If people copy-and-paste complex shell scripts without testing, you’re going to have a bad day no matter what you do. The solution is to work on better practice, use of shellcheck, rewriting complex logic in Python, etc. since that’ll pay dividends in so many other ways.

dolmen · on April 12, 2021

> Until your users/colleagues source it into another shell

The intersection of people who know the "source" command and don't know what a shebang is, or the difference between sh and bash, is quite small.

dataflow · on April 12, 2021

So what is your point here? I'm hallucinating when I see people do this?

cjaybo · on April 12, 2021

Are you suggesting that leaving off the shebang line makes these types of mistakes less likely? That makes no sense.

skinner927 · on April 12, 2021

If I copy Python 2 code into Python 3 it’s not gonna work right either. What’s your point?

fooblat · on April 12, 2021

I've been seeing this argument for decades and yet I've never run into any issues like you describe.

When I write a bash script it is for my use or in my production system. In either case I call bash explicitly (/usr/bin/env bash) so the script will work correctly or not at all.

Rarely do I have a project where there is a requirement to write a "run anywhere" script but if it were I would stick to posix.

dataflow · on April 12, 2021

There are more people in the world than you. You might do everything perfectly; other people won't. Especially users who have yet to learn about these pitfalls. If you haven't run into this, either you're lucky enough that it doesn't apply to your situation, or you haven't become aware of it or tried to pay attention to it. I know I've seen people try to run Bash in their Mac's zsh (not necessarily deliberately either; some programs just run the default shell) and watched it crash and burn. I'm not condoning it, I'm saying it happens whether you and I like it or not, and you can either blame the victim (which I guess wouldn't be the first time I see this happen), or you can do something about it.

fooblat · on April 12, 2021

Am I understanding correctly that your point is that we shouldn't write modern bash scripts because some number of untrained people will have problems by accidentally misusing the scripts (wrong interpreter, blind copy/paste, etc.)? Is that a correct summary?

To me this sounds like saying we shouldn't have powerful tools like chainsaws because every year some people without safety training use them wrong and hurt themselves.

dataflow · on April 12, 2021

No, that's just an exaggerated caricature of what I'm saying.

I'm only saying you should avoid using modern things needlessly. This means, don't do it when the POSIX-compliant syntax also serves your needs. e.g., don't do [[ -f "$foo" ]] when you could just as easily do [ -f "$foo" ]. And if you do need to use Bash-specific features, make sure it's safe enough that foreseeable misuses will fail safely. Or when that's not possible, try to mitigate it somehow (maybe with a comment, or by putting in some extra dynamic checks, or other solutions you can think of).

And this goes for other things too, not just Bash. Probably outside programming too.

fooblat · on April 12, 2021

> I'm only saying you should avoid using modern things needlessly.

I agree with that idea.

However, I don't think it applies in this case. The point of using modern bash syntax is that it fixed the brittleness and pitfalls of posix sh. This is significant. This is what I need in production. This is why our style guides insist on modern bash syntax for bash scripting.

I'd say the rest of your suggestions apply to shell scripts no matter what syntax is used.

raffraffraff · on April 12, 2021

Also, bash on Mac is shitty. I've run into problems with it before (can't remember exactly what it was) so we've reversed it: our shell scripts are written in zsh, and we just ensure that zsh is installed everywhere.

jcynix · on April 12, 2021

Besides that bash has its problems with array variables. Unless I switch to one of the "real" scripting languages, I use zsh too and start my scripts with

  if [[ -n "$BASH" ]] ; then
    echo $0: use zsh, not bash
    exit 1
  fi

Earlier on I used the more compact

  [[ -n "$BASH" ]] && { echo $0: use zsh, not bash;  exit 1 }

But bash chokes on that with a syntax error.

Izkata · on April 12, 2021

The syntax error is on a required semicolon. Put one after the "exit 1":

  [[ -n "$BASH" ]] && { echo $0: use zsh, not bash;  exit 1; }

dataflow · on April 12, 2021

Oh yeah that too, that's another whole can of worms I forgot to bring up, thanks for mentioning it.

inetknght · on April 12, 2021

> For anyone writing bash scripts today I would avoid the old test/[ functions entirely.

I think it's better said that anyone writing bash scripts today should always pass their script to shellcheck.net. It can even be installed locally in case sharing the script online is undesired.

Shellcheck will not only look for problems in the script but it will also usually provide some background about how the problem is a problem and a suggested workaround or solution.

1vuio0pswjnm7 · on April 13, 2021

I would avoid using bash for writing scripts (i.e., non-interactive use). There are better shells for scripting, like the one most Linux distributions are using, derived from the default shell in NetBSD, derived from the Almquist shell. The thing I miss most when using Linux is the default NetBSD shell. It has the speed of dash plus command line history.

jolmg · on April 13, 2021

> that is also quote safe

Depends on what you mean by quote safe. Quotes aren't redundant:

  $ d='*'
  $ [[ foo = $d ]]; echo $?
  0
  $ [[ foo = "$d" ]]; echo $?
  1

fanf2 · on April 12, 2021

The [[ builtin is a ksh feature copied by bash. POSIX includes [[ and ]] in its list of reserved keywords but doesn't specify their behaviour.

arthur2e5 · on April 12, 2021

The post mentions the issue is fixed by POSIX, which… notably does not standardize -o -a and parens. I guess now we know why POSIX didn't standardize those.

hvdijk · on April 12, 2021

POSIX did standardize those, but the functionality is marked XSI, so only XSI-conforming implementations are required to provide it.

There is never a reason to use them, though. They can be replaced by the shell's && and || operators, which nicely bypasses the parsing ambiguities at the same time.

dataflow · on April 12, 2021

> There is never a reason to use them, though. They can be replaced by the shell's && and || operators, which nicely bypasses the parsing ambiguities at the same time.

Not sure I agree. The equivalent of more complicated expressions like (a AND b) OR (c AND d) with && and || isn't always so "nice" IMO. e.g., try it for this:

  [ -f a -a -f b -o -f c -a -f d ]

I think the simplest syntax will require you to spawn subshells, which is neither particularly nice, nor particularly performant (especially on Windows).

sltkr · on April 12, 2021

Is this really so much worse:

   { test -f a && test -f b; } || { test -f c && test -f d; }

?

Yes, it's more verbose, but the original expression isn't particularly readable either, and if you'd want to evaluate something like (a OR b) AND (c OR d) you'd need to introduce parentheses anyway.
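For reference, a sketch of that (a OR b) AND (c OR d) case using brace groups. The grouping matters because && and || have equal precedence in shell lists and associate left to right, so without braces the expression would be read as ((a OR b) AND c) OR d. The helper name and the idea of passing the four file names as arguments are assumptions for illustration.

```shell
# Hypothetical helper: succeeds when ($1 or $2 exists) AND ($3 or $4 exists).
# Brace groups run in the current shell, so no subshell is spawned.
either_pair_exists() {
    { test -f "$1" || test -f "$2"; } && { test -f "$3" || test -f "$4"; }
}

# Example call with placeholder names:
#   either_pair_exists a b c d && echo "condition holds"
```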

dataflow · on April 12, 2021

It's not "so much" worse, it's just mildly worse. All I was saying here was the verbosity is a reason for using them. Obviously there are reasons not to use them too, which I said in my own top comment.

hvdijk · on April 12, 2021

Note also that this does not need any subshells. I think dataflow was thinking of (test -f a && test -f b) || (test -f c && test -f d) instead, but { } isn't worse than that.

dataflow · on April 12, 2021

Yeah almost. I was thinking of

  ([ -f a ] && [ -f b ]) || ([ -f c ] && [ -f d ])

instead of

  (test -f a && test -f b) || (test -f c && test -f d)

dolmen · on April 12, 2021

Parentheses launch a subshell, so you probably want to use braces instead:

  { [ -f a ] && [ -f b ] ;} || { [ -f c ] && [ -f d ] ;}

dataflow · on April 12, 2021

With braces I'd prefer the test syntax actually, this is too many special characters to be comfortably readable for me.

And yes I know it creates subshells, that was precisely what I was saying in my comment where I said the simplest syntax I could think of (i.e. that very code) creates subshells.

jolmg · on April 13, 2021

> Note that it's not even a bug here; it's necessary for disambiguation.

I would call it a bug when there's one valid way to read it, and [ fails to see it. I don't see the ambiguity. For example, zsh's [ doesn't have these errors:

  $ a=1; b=1; c='!'; d='!'
  $ [ "$a" = "$b" -a "$c" = "$d" ]; echo $?
  0
  $ [ '!' '(' "$c" = "$d" ')' ]; echo $?   
  1

Same result with FreeBSD's tcsh's [.

I think bash's [ is just lacking the ability to backtrack while parsing its arguments.

Curiously, GNU's [ fails to parse the arguments of the first, but not the second:

  $ command [ "$a" = "$b" -a "$c" = "$d" ]; echo $? 
  [: extra argument ‘!’
  2
  $ command [ '!' '(' "$c" = "$d" ')' ]; echo $?   
  1

It seems its argument parsing works differently when using parentheses:

  $ command [ "$a" = "$b" -a '(' "$c" = "$d" ')' ]; echo $? 
  0

dataflow · on April 14, 2021

I was just being a little sloppy in what I wrote. I meant you can encounter ambiguities in the language when you start using -a/-o or (). I didn't mean those particular examples' parse trees are ambiguous.

alerighi · on April 12, 2021

Well, never seen any reason to use -a or -o. To me they are less clear and you are prone to error. I find things like: [ "$a" = "$b" ] && [ "$c" = "$d" ] much more clear.
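A small sketch of why the split form sidesteps the problem from the top comment: each [ call receives exactly three arguments, and POSIX specifies that with three arguments a binary operator in the middle position wins, so an operand like '!' or '(' cannot be misread. The function name pairs_match is made up for the example.

```shell
# Hypothetical helper: compares two pairs of strings with separate [ calls.
pairs_match() {
    [ "$1" = "$2" ] && [ "$3" = "$4" ]
}

a=1; b=1; c='!'; d='!'
if pairs_match "$a" "$b" "$c" "$d"; then
    echo "both pairs match"
fi
```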

FirstLvR · on April 12, 2021

adrianmonk · on April 12, 2021

On a side note, I think this style tweak makes it more readable:

    [ x"$var" = x"val" ]

My eye doesn't have to separate "xval" into "x" and "val", so it's more obvious that the value being compared against is "val".

(It behaves the same way since the shell allows quoting to start in the middle.)
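A quick way to convince yourself, as a sketch: the shell joins adjacent quoted and unquoted pieces into a single word before the command runs, so all of these spellings hand over the same argument.

```shell
var='a b'
# Quotes can open and close mid-word; the pieces are joined into one argument.
# Each of the three lines printed is <xa b>.
printf '<%s>\n' x"$var" "x$var" x"a b"
```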

chubot · on April 12, 2021

Related: Problems With the test Builtin: What Does -a Mean?

http://www.oilshell.org/blog/2017/08/31.html

The POSIX spec did indeed make things cleaner; the last section quotes it and gives some style advice.

aasasd · on April 12, 2021

As usual, the problem is that Sh & co. use in-band control, especially when talking to utilities. Just a mess of text being passed around, of which every program must make sense on its own—but this happening in shell's own functions takes the cake.

Moreover, similar problems are inherited by some programs that try to do shell-like commands but put their own syntax in the way. E.g. Ansible, where YAML, JSON, module call DSL and Jinja are all mixed on top of the commands so you can bump into any of them.

improv32 · on April 13, 2021

Interestingly enough, this isn't a mess in the shell's own builtins. The [ ] expression syntax is provided by the program /usr/bin/[ which is (often) a symlink to /usr/bin/test. It's just a clever trick! [ is a program that evaluates its argv as an expression, less the trailing ], and then returns the result as its exit status.

Actually re-reading your post you probably know that already! Never mind then. Maybe someone reading this will be enlightened.

aasasd · on April 13, 2021

Yes, I know about `[` being both a program and (potentially) a built-in. However the problem is exactly that an Sh-style shell can't easily divorce itself from the same gotchas unless it invents a DSL for its builtins with different semantics—because normally it must parse vars before invoking the command, and thus vars are interpolated into the text, which the command is. If the shell does the sane thing of interpreting the command before evaluating vars, it means breaking off from the normal semantics and thus being inconsistent.

I imagine the introduction of `[[` and `((` in Bash may have come partly from this reasoning (to abandon the POSIX semantics for these commands), though I never delved far enough into nerdery to learn the difference from `[`—so don't know if they do evaluate vars in the sensible way.

I keep wondering if there are any shells or other software calling into utils, that manage to walk the edge of writing commands without bumping into such syntax problems—but avoid having to "enquote" "each" "argument" all the time.

naniwaduni · on April 13, 2021

[ is not, in principle, the shell's own function. It merely happens to be implemented as one in most shells anyone actually uses as an optimization.
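If you want to see which one your own shell uses, something like the following sketch should tell you. The /usr/bin/[ path is an assumption and varies by system, and the exact wording of the output differs between shells, so none is shown here.

```shell
# Ask the current shell how it resolves "[" and "test".
command -v [
command -v test

# An external copy, if installed, is an ordinary file named "[".
# The path is an assumption; adjust it for your system.
ls -l '/usr/bin/[' 2>/dev/null || echo "no /usr/bin/[ on this system"
```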

Ericson2314 · on April 13, 2021

Indeed! As always, http://langsec.org/

nwellnhof · on April 12, 2021

The "POSIX prescriptions" mentioned in the quoted comment can be found here under "algorithm for determining the precedence": https://pubs.opengroup.org/onlinepubs/9699919799/utilities/t...

chunkyks · on April 12, 2021

So, in conclusion it's a hack... but, it's also incredibly robust. It doesn't hurt, and we acknowledge there are cases where it could help. It's idiomatic enough that literally everyone who's done a nontrivial amount of scripting is familiar with it.

At some point, "decades old idiomatic hack that increases robustness" stops being a hack and becomes an idiomatic way to increase robustness.

A better conclusion would be "yes, keep using it".

m463 · on April 12, 2021

> So, in conclusion it's a hack... but, it's also incredibly robust.

You know, shell scripts are a hack.

I really wish there was a middle ground between shell scripts and say python. Python has proper lists and hashes and loops and can manipulate paths with spaces or quotes or unicode. It doesn't get dragged down by the escape-an-escape-within-a-regex nonsense that makes things undecipherable. or ${VAR:-foo} or ${VAR##ugh}
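For anyone who hasn't met those two expansions, a short sketch of what they do (the variable names and the sample path are placeholders):

```shell
unset VAR
echo "${VAR:-foo}"     # VAR is unset or empty, so the fallback is used: foo

path=/usr/local/bin/tool
echo "${path##*/}"     # remove the longest prefix matching */  -> tool
echo "${path%/*}"      # remove the shortest suffix matching /* -> /usr/local/bin
```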

But python is a bit fumbly when it comes to invoking external programs. try: ... subprocess.bleh except: ... ugh.

chubot · on April 13, 2021

That's one good way to describe Oil: a middle ground between shell and Python.

e.g. this post has the slogan that it's "for Python and JS programmers who avoid shell".

http://www.oilshell.org/blog/2020/01/simplest-explanation.ht...

Oil has hash tables, and can manipulate paths with spaces -- in fact the latest blog post is about that!

http://www.oilshell.org/blog/2021/04/simple-word-eval.html

Also the same blog post mentions that we should get rid of ${VAR##ugh}. Help wanted :)

acdha · on April 13, 2021

> But python is a bit fumbly when it comes to invoking external programs. try: ... subprocess.bleh except: ... ugh.

I actually tend to prefer that these days: if I write shell I'm tossing in "set -e -u -o pipefail" anyway, and subprocess's newer APIs end up being cleaner when you need to do anything non-trivial for error handling (I usually end up with one `try:` block for most of the program and maybe one or two `except: cleanup(); raise` blocks). Having the logic be clean and consistent saves you so much time when you revisit that program a month later.

The biggest wins I've seen come from quoting external program arguments: either not needing to do so at all, or passing things cleanly through shlex.quote, makes it so much easier when you have the possibility of any special characters in arguments.

zokier · on April 12, 2021

Perl?

eb0la · on April 12, 2021

I learnt this trick from OpenBSD 2.5 (yes, a long time ago). I still remember getting root with a shell-script mp3 player that a friend coded because of this.

The exploit was based on inserting some \$$ make_a_new_root_account_in_etcpassw $ inside the variable inside the test.

As a result, both my friend and I stopped writing shell scripts as CGI and started learning perl just to do it.

montroser · on April 12, 2021

Note that this is a relic of the past, as TFA explains. Modern bash can be totally manageable and even fun, if you take the time to learn and have a good style guide and shellcheck by your side.

tragomaskhalos · on April 12, 2021

I use this hack as a matter of course, having internalised it aeons ago. Bash does lots of things better now but as others have commented it is still so fundamentally weird that an innate conservatism is often a wise approach, especially when one lacks the time (or motivation) to explore its "newer" features.

superasn · on April 12, 2021

I sometimes wonder why bash transpilers aren't more popular? Bash is such a strange language (or maybe just very old with lots of cruft) that it's impossible to write anything meaningful without checking SO 5 times.

Granted, I haven't spent a lot of time writing shell scripts yet, but the syntax is just super weird. For example, I recently learned(1) that when you write if [ -e /etc/passwd ]; then .. that bracket is not shell syntax but just a regular command with a funny name.

Is there a recommended bash transpiler like what babel is to JS? So I can write bash scripts using a more coherent language and it spits out .sh files.

(1) https://news.ycombinator.com/item?id=26701079

GuB-42 · on April 12, 2021

That's more or less what Perl is (a more coherent language).

Perl may look ugly, but it doesn't have the escaping problems that Unix shells have. And most of the usual shell features (listing files, launching commands, etc...) are built-in. It is also available in almost all Unix systems, maybe even more so than bash.

nikau · on April 12, 2021

> It is also available in almost all Unix systems, maybe even more so than bash.

That's a bit of a deceitful claim - maybe bash isn't on every UNIX box but /bin/sh sure is along with awk and cut and all the other associated helpers.

Even if perl is installed it's likely missing all the cpan modules that make it useful, which isn't great if the host doesn't have carte blanche access to install whatever it feels like.

As a sysadmin/devops person knowing the utilities that are guaranteed to be installed is essential, especially now that docker images trimmed of any fat are trendy for bogus security ideals.

dolmen · on April 12, 2021

The CPAN modules availability can be worked around using App::FatPacker. https://metacpan.org/pod/App::FatPacker

That's what I'm using for github-keygen for maximum portability. https://github.com/dolmen/github-keygen/

nikau · on April 12, 2021

sure, but then you have the overhead of packaging all that junk up vs a single .sh file.

gspr · on April 12, 2021

I wouldn't go that far as calling it impossible, but it sure is crufty and quirky seen with modern eyes.

My "favorite" (i.e. most terrifying) BASH fact is this: https://unix.stackexchange.com/questions/121013/how-does-lin...

codetrotter · on April 12, 2021

Interesting that the answer to that question names the file as .bash – I do this too, but I didn’t know anyone else did.

I do this because my shell scripts use bash specific features, and I want to distinguish them from portable shell scripts that use POSIX shell features only. The latter being the only files that really deserve being named as .sh in my eyes :)

hnlmorg · on April 12, 2021

`if` and iterations were two things that annoyed me about bash enough to write my own $SHELL. It's not a transpiler though so I can't realistically use it for production services. However for quickly hacking stuff together for testing / personal use it's been a life saver due to having the same prototyping speed of bash but without the painful edge cases.

Not suggesting it's suitable for everyone though. But it's certainly solved a lot of problems for me.

https://github.com/lmorg/murex

devit · on April 12, 2021

That's false, [ is a bash shell builtin (it's also a coreutils program, but the program is not called unless you specify /usr/bin/[ or use a shell that doesn't have it built-in).

Hendrikto · on April 12, 2021

> I recently learned(1) that when you write if [ -e /etc/passwd ]; then .. that bracket is not shell syntax but just a regular command with a funny name.

That’s how it used to be. Nowadays, most of the time it is a built-in, not its own executable anymore.

pastage · on April 12, 2021

Not sure how another standard would make it better. But sure, maybe something that used dash as a backend. It does seem like it really would suck to actually use, if my little experience with the JavaScript ecosystem is anything to go on. So different perspectives maybe?

SavantIdiot · on April 12, 2021

"This happened because the utility used a simple recursive descent parser without backtracking, which gave unary operators precedence over binary operators and ignored trailing arguments."

It is a bit sad that this problem has to be encountered and resolved over and over and over again. Granted, it was a SIMPLE PARSER because resources were extremely tight. Note, I don't mean shells, I mean working with projects that implement parsers. As soon as someone busts out lex/yacc I know I'm going to see at least one "ah, we didn't think of that" issue. Parsers are hard.

rspeele · on April 12, 2021

While I agree that there can be pitfalls to parsing, most of the time it isn't as bad as this. The "test" or "[" program has a unique challenge in that it never gets to see the quotation marks or variable references, because the shell already expanded them.

So when test sees that one of its arguments is the string "-f", it has no way to know whether that string came from a variable like $var, a quoted string literal like "-f", or just the text -f. Most of the time if you're writing a parser for your own DSL you don't have this problem. You can tell the difference between the string literal "+" and the operator +.
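To make that concrete, here is a sketch with a made-up helper, show_argv, that prints each argument it receives in brackets. It stands in for what test gets to see: three different spellings on the command line arrive as the same argument list.

```shell
# Hypothetical helper: print every received argument wrapped in brackets.
show_argv() {
    for arg in "$@"; do
        printf '[%s]' "$arg"
    done
    printf '\n'
}

var='-f'
show_argv "$var" = "-f"   # from a variable and a quoted literal
show_argv -f = -f         # from bare text
# Both calls print the same line: [-f][=][-f]
```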

znpy · on April 12, 2021

I first learned about this hack while reading a debian init script. Thankfully systemd has removed the need for this!

gorgoiler · on April 12, 2021

Init scripts are just executable files.

They can be written in any compiled language you like or even an interpreted language, if you are prepared to add the dependency of an interpreter for that language at boot time.

This is a lot simpler than migrating to (and getting locked in to) systemd.

choeger · on April 12, 2021

Systemd does a lot of things right: Having one process for the init, declarative definition of common options, streamlined interface. Note that many distros used to provide shell libraries for the same effect, but ended up with incompatible init scripts.

In the end, the init script style is just the bare minimum that works solely by POSIX features and is a PITA when it comes to advanced use cases like dependencies or machine-readable system state or platform independent init definitions. Systemd is certainly not the ultimate tool ever written, but it is a bold step in the right direction. Maybe someone will rewrite it someday (if so, presumably in rust) and I will appreciate the effort to make things even better.

Shish2k · on April 12, 2021

> This is a lot simpler than migrating to (and getting locked in to) systemd

Depends on your perspective - for a developer, “do whatever you want in whatever way you’re comfortable with” is simple. As a sysadmin, I much prefer the simplicity of “there is one way to do it; features like resource limits and automatic restarts work out of the box, and are consistent across all services”.

im3w1l · on April 12, 2021

The flexibility for the author makes it harder for the reader though. I can read a systemd service file from top to bottom and understand everything in a minute or two. That's valuable for me.

otabdeveloper4 · on April 12, 2021

systemd is exactly the boot time interpreter you're asking for.

corty · on April 12, 2021

No, systemd doesn't really interpret much; unit files are mostly not a full replacement for what init-scripts were capable of. Which leads to lots of ExecStart=/some/shell/script.sh in unit files, which is worse than before because now you have multiple files and more strange interactions.

BenjiWiebe · on April 12, 2021

Is that actually common? I write quite a few systemd unit files, and have yet to need a wrapper shell script. And the vast majority, or maybe even all, of the unit files my system has by default are also not calling shell scripts.

alerighi · on April 12, 2021

Depends on what you are doing.

Init scripts were simpler because the boot process was well defined, meaning that you had all your scripts in rc.d that were executed in that particular order.

Now with systemd it is more difficult to have something start after something else: you have all the dependencies, and if you get them wrong it will not work. Worse, the boot process is no longer deterministic, meaning that 99 times it could work and it could break the 100th time.

On the other side, systemd is a necessary evil for a modern desktop system, where you have multiple events that could fire (hotplug, network, power events, etc). But not so sure if it's that necessary for a server or an embedded device.

otabdeveloper4 · on April 13, 2021

> Now with systemd is more difficult to have something start after something else

Not really, it's just one simple line in a unit file. (https://fedoramagazine.org/systemd-unit-dependencies-and-ord...)

corty · on April 13, 2021

No. First, you usually need both Wants= and After=. Second, that After= isn't really "after" in the traditional sense, because systemd will start A, not wait for A to be up and running, and immediately after the "A has been started" event start B. If you really want A to be available when B is started (which is the traditional init-script sense of "after") you need to modify the software A to signal completion to systemd, or you need some horrible shell script kludges in the ExecStart= line of B that check for the availability of A. Or an in-between unit with an ExecStart=sleep 5 or something.
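For illustration only, the kind of shell kludge being described might look roughly like this sketch. The helper name wait_for and the socket path in the example are invented, and this is a polling workaround, not a substitute for the service signalling readiness itself.

```shell
# Hypothetical helper: retry a command until it succeeds or the attempt
# budget runs out, pausing one second after each failed attempt.
wait_for() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        "$@" && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Example (the socket path is made up):
#   wait_for 30 test -S /run/service-a.sock || exit 1
```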

Claiming "it is just 1 line" is either inexperience or dishonesty.

p4l4g4 · on April 12, 2021

Wondered for ages whether there is a real situation where this is needed. Good to see this publicly debunked. Now I have a resource to point to when I get the "better safe than sorry" argument.

DonHopkins · on April 12, 2021

What's been publicly and thoroughly debunked is the very idea of writing shell scripts in the first place, instead of using a non-ridiculous scripting language like Python, Perl, or JavaScript. If you want to be safe instead of sorry, then never write shell scripts.

jude- · on April 13, 2021

People complaining about how underpowered shells are should have a look at Ion [1]. It's mostly the same as bash, but with first-class support for arrays, maps, byte slices, variable scoping, and (primitive) type-checking. Oh, and it's written in Rust.

[1] https://gitlab.redox-os.org/redox-os/ion

chubot · on April 13, 2021

BTW Ion is mentioned here along with many other alternative shells: https://github.com/oilshell/oil/wiki/Alternative-Shells

svnpenn · on April 13, 2021

Last release was 3 years ago...

jude- · on April 13, 2021

Last commit was 2 weeks ago.

raverbashing · on April 12, 2021

> Amazingly, the x-hack could be used to work around certain bugs all the way up until 2015, seven years after StackOverflow wrote it off as an archaic relic of the past!

Oof. If people think JS is bad wait until they try to do anything moderately complex in shell script

dale_glass · on April 12, 2021

Yup. These days if it's more than a few lines, I write it in Perl instead if it's for personal use, or Python if it's going to be shared (Python just takes me more effort still).

For all the vaunted Unix Philosophy it's amazing how clunky some of it is. Countless shell scripts still trip over spaces in filenames. Of those that don't, nearly all of them will still be confused by any unusual characters like newlines. If you want something done properly, you have to pull out a proper scripting language, to have readdir() and the ability to pass arguments directly to a process.

Even Perl, which generally excels in such tasks has weird lapses in convenience. You can run a command with one line of code without the possibility of confusion with `system("ls", "-l", $dir)`, but you can't get its output that way. There's no version of `system` that'd allow you to both explicitly specify each argument to the process, and obtain its output. You either use backticks and risk quoting trouble, or need to use `open`, which is a lot more verbose.

It's interesting that it took Microsoft to do a new approach in this regard. PowerShell has its amount of weirdness I really hate, such as that the environment escapes the program into the commandline, but it's really refreshing how it dispenses with brittle grep, awk and cut stuff.

mfontani · on April 12, 2021

For Perl, you might want to look at https://metacpan.org/pod/IPC::Run3

You can:

- specify each argument separately

- get separate stdout and stderr

- provide input

dale_glass · on April 12, 2021

Oh, I know. There's that, and File::Slurp, and a bunch of other stuff one would think would exist from the start, but for some reason doesn't. But those are all external modules that somebody had to write, that didn't exist at some point in time, and which sometimes one can't use because in some cases you can only rely on what's in core.

I'm just wondering at that bizarrely enough, Larry Wall (or somebody else) found it useful to have a simple, convenient way to execute a command with exact arguments, but not for when you need its output.

I must have written the same wrapper around open() several dozen times by now, due to needing to write scripts for cases where installing dependencies is undesirable.

dolmen · on April 12, 2021

> I'm just wondering at that bizarrely enough, Larry Wall (or somebody else) found it useful to have a simple, convenient way to execute a command with exact arguments, but not for when you need its output.

It had backquotes from the start. So it was as convenient (and as unsafe) as the Bourne shell.

Of course you quickly want more safety and backquotes are just legacy you can never use for serious stuff.

dolmen · on April 12, 2021

> You can run a command with one line of code without the possibility of confusion with `system("ls", "-l", $dir)`

$dir = '-r';

You loose.

Quoting is not enough. You also have to fully understand the syntax of each command.

Fix: system("ls", "-l", "--", $dir)
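The same trap exists when calling ls from the shell itself. A minimal sketch, run in a throwaway directory so that a directory literally named -r can be created safely (the file name is a placeholder):

```shell
# Work in a scratch directory so the oddly named entries are easy to discard.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

dir='-r'
mkdir -- "$dir"
touch -- "$dir/file.txt"

ls -l -- "$dir"     # "--" ends option parsing, so -r is taken as a name
ls -l "./$dir"      # alternative: a leading ./ cannot be mistaken for an option
```

Without the "--", quoting alone does not help: ls still receives -r as its argument and reads it as the reverse-sort option.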

dale_glass · on April 12, 2021

Bad example, yes. Plus it's a pointless thing to do to start with, because you run into trouble with special characters in filenames that way. Got to use readdir.

And while we're being pedantic, it's 'lose'.

dolmen · on April 12, 2021

> And while we're being pedantic, it's 'lose'.

Fair. I'm better at shell and Perl languages than English. None of them are my mother tongue.

arnsholt · on April 12, 2021

`open my $fh, '-|', 'ls', '-l', $dir` should do the trick I think, but apparently it only works on platforms with a "real fork" (so fine as long as you don't care about Windows, probably).

dale_glass · on April 12, 2021

It does, but it's not convenient. You need a loop, perhaps deal with $/, close it at the end.

I'm not saying it can't be done, just that the language is oddly lacking in convenience for this use case.

dolmen · on April 12, 2021

$dir = '-r';

You loose.

DonHopkins · on April 12, 2021

The TOPS-20 command interpreter with its regularized parameters and prompting command-line completion and inline help and noise words was so much better designed and user friendly than any of the pathetic Unix shells.

https://www.bourguet.org/v2/pdp10/cmds/chap1

https://en.wikipedia.org/wiki/Command-line_completion#Histor...

https://en.wikipedia.org/wiki/TOPS-20

Command Processor. Rather advanced for its day were some TOPS-20-specific features:

Command completion. Dynamic help in the form of:

noise-words - typing DIR and then pressing the ESCape key resulted in DIRectory (of files)

typing I and pressing the Esc key resulted in Information (about)

One could then type ? to find out what operands were permitted/required.

PCL language features. PCL includes:

flow control: DO While/Until, CASE/SELECT, IF-THEN-ELSE, GOTO

character string operations (length, substring, concatenation)

access to system information (date/time, file attributes, device characteristics)

https://web.archive.org/web/20200115181431/http://www.opost....

A Brief Description of Recognition and Help

Typing the escape key says to the system, "if you know what I mean from what I've typed up to this point, type whatever comes next just as if I had typed it". What is displayed on the screen or typescript looks just as if the user typed it, but of course, the system types it much faster. For example, if the user types DIR and escape, the system will continue the line to make it read DIRECTORY.

TOPS-20 also accepts just the abbreviation DIR (without escape), and the expert user who wants to enter the command in abbreviated form can do so without delay. For the novice user, typing escape serves several purposes:

Confirms that the input entered up to that point is legal. Conversely, if the user had made an error, he finds out about it immediately rather than after investing the additional and ultimately wasted effort to type the rest of the command.

Confirms for the user that what the system now understands is (or isn't) what the user means. For example, if the user types DEL, the system completes the word DELETE. If the user had been thinking of a command DELAY, he would know immediately that the system had not understood what he meant.

Typing escape also makes the system respond with any "noise" words that may be part of the command. A noise word is not syntactically or semantically necessary for the command but serves to make it more readable for the user and to suggest what follows. Typing DIR and escape actually causes the display to show:

         DIRECTORY (OF FILE)

This prompts the user that files are being dealt with in this command, and that a file may be given as the next input. In a command with several parameters, this kind of interaction may take place several times. It has been clearly shown in this and other environments that frequent interaction and feedback such as this is of great benefit in giving the user confidence that he is going down the right path and that the computer is not waiting to spring some terrible trap if he says something wrong. While it may take somewhat longer to enter a command this way than if it were entered by an expert using the shortest abbreviations, that cost is small compared to the penalty of entering a wrong command. A wrong command means at least that the time spent typing the command line has been wasted. If it results in some erroneous action (as opposed to no action) being taken, the cost may be much greater.

This is a key underlying reason that the TOPS-20 interface is perceived as friendly: it significantly reduces the number of large negative feedback events which occur to the user, and instead provides many more small but positive (i.e. successful) interactions. This positive reinforcement would be considered quite obvious if viewed in human-to-human interaction terms, but through most of the history of computers, we have ignored the need of the human user to have the computer be a positive and encouraging member of the dialog.

Typing escape is only a request. If your input so far is ambiguous, the system merely signals (with a bell or beep) and waits again for more input. Also, the escape recognition is available for symbolic names (e.g. files) as well as command verbs. This means that a user may use long, descriptive file names in order to help keep track of what the files contain, yet not have to type these long names on every reference. For example, if my directory contains:

        BIG_PROGRAM_FILE_SOURCE
        VERY_LONG_MANUAL_TEXT

I need only type B or V to unambiguously identify one of those files. Typing extra letters before the escape doesn't hurt, so I don't have to think about the minimum abbreviation; I can type VER and see if the system recognizes the file.
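As a rough illustration of the recognition rule described above (not TOPS-20's actual implementation), here is a minimal POSIX shell sketch. The function name `recognize` is hypothetical, and matching is case-sensitive here, which is an assumption of the sketch rather than something the text states:

```shell
# recognize PREFIX NAME...: print the one NAME that starts with PREFIX.
# Returns 1 (the "bell" case) when zero or several names match.
recognize() {
    prefix=$1
    shift
    match=
    count=0
    for name in "$@"; do
        case $name in
            "$prefix"*) match=$name; count=$((count + 1)) ;;
        esac
    done
    [ "$count" -eq 1 ] || return 1
    printf '%s\n' "$match"
}

recognize V   BIG_PROGRAM_FILE_SOURCE VERY_LONG_MANUAL_TEXT   # prints VERY_LONG_MANUAL_TEXT
recognize VER BIG_PROGRAM_FILE_SOURCE VERY_LONG_MANUAL_TEXT   # same file; extra letters are harmless
recognize ""  BIG_PROGRAM_FILE_SOURCE VERY_LONG_MANUAL_TEXT || echo "ambiguous, waiting for more input"
```

The quoted `"$prefix"` is matched literally, so a prefix containing `*` or `?` is not treated as a pattern.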

DonHopkins · on April 12, 2021

The "vaunted Unix Philosophy" has been intellectually and pragmatically bankrupt for decades.

https://en.wikipedia.org/wiki/The_UNIX-HATERS_Handbook

http://www.art.net/~hopkins/Don/unix-haters/BarfBag.gif

https://web.mit.edu/~simsong/www/ugh.pdf

Preface

“I liken starting one’s computing career with Unix, say as an undergraduate, to being born in East Africa. It is intolerably hot, your body is covered with lice and flies, you are malnourished and you suffer from numerous curable diseases. But, as far as young East Africans can tell, this is simply the natural condition and they live within it. By the time they find out differently, it is too late. They already think that the writing of shell scripts is a natural act.”

— Ken Pier, Xerox PARC

The Shell Game, p. 149

Shell crash

The following message was posted to an electronic bulletin board of a compiler class at Columbia University.

    Subject: Relevant Unix bug
    October 11, 1991

    Fellow W4115x students—
    While we’re on the subject of activation records,
    argument passing, and calling conventions, did you
    know that typing:

    !xxx%s%s%s%s%s%s%s%s

    to any C-shell will cause it to crash immediately? 
    Do you know why?

    Questions to think about:

    • What does the shell do when you type “!xxx”?

    • What must it be doing with your input when you type
    “!xxx%s%s%s%s%s%s%s%s” ?

    • Why does this crash the shell?

    • How could you (rather easily) rewrite the offending
    part of the shell so as not to have this problem?

    MOST IMPORTANTLY:

    • Does it seem reasonable that you (yes, you!) can bring what 
    may be the Future Operating System of the World to its
    knees in 21 keystrokes?

Try it. By Unix’s design, crashing your shell kills all your processes and logs you out. Other operating systems will catch an invalid memory reference and pop you into a debugger. Not Unix.

Perhaps this is why Unix shells don’t let you extend them by loading new object code into their memory images, or by making calls to object code in other programs. It would be just too dangerous. Make one false move and—bam—you’re logged out. Zero tolerance for programmer error.
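One common reading of the questions in that posting is that the shell hands the user's text to a printf-style function as the format string. Assuming that reading, the same class of mistake and its fix can be sketched in POSIX shell. The function names are hypothetical, and shell `printf` merely substitutes empty strings for missing arguments where C code would read invalid memory:

```shell
# Hypothetical error reporters; $1 stands for whatever the user typed.
report_unsafe() {
    # Bug: the input itself becomes the format string, so %s is interpreted.
    printf "$1: Event not found.\n"
}

report_safe() {
    # Fix: the format string is constant; the input is only ever data.
    printf '%s: Event not found.\n' "$1"
}

report_unsafe 'xxx%s%s'   # prints "xxx: Event not found."
report_safe   'xxx%s%s'   # prints "xxx%s%s: Event not found."
```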

The Metasyntactic Zoo

The C Shell’s metasyntactic operator zoo results in numerous quoting problems and general confusion. Metasyntactic operators transform a command before it is issued. We call the operators metasyntactic because they are not part of the syntax of a command, but operators on the command itself. Metasyntactic operators (sometimes called escape operators) are familiar to most programmers. For example, the backslash character (\) within strings in C is metasyntactic; it doesn’t represent itself, but some operation on the following characters. When you want a metasyntactic operator to stand for itself, you have to use a quoting mechanism that tells the system to interpret the operator as simple text. For example, returning to our C string example, to get the backslash character in a string, it is necessary to write \\.

Simple quoting barely works in the C Shell because no contract exists between the shell and the programs it invokes on the users’ behalf. For example, consider the simple command:

    grep string filename

The string argument contains characters that are defined by grep, such as ?, [, and ], that are metasyntactic to the shell. Which means that you might have to quote them. Then again, you might not, depending on the shell you use and how your environment variables are set.

Searching for strings that contain periods or any pattern that begins with a dash complicates matters. Be sure to quote your meta character properly. Unfortunately, as with pattern matching, numerous incompatible quoting conventions are in use throughout the operating system.

The C Shell’s metasyntactic zoo houses seven different families of metasyntactic operators. Because the zoo was populated over a period of time, and the cages are made of tin instead of steel, the inhabitants tend to stomp over each other. The seven different transformations on a shell command line are:

    Aliasing                     alias and unalias
    Command Output Substitution  `
    Filename Substitution        *, ?, []
    History Substitution         !, ^
    Variable Substitution        $, set, and unset
    Process Substitution         %
    Quoting                      ',"

As a result of this “design,” the question mark character is forever doomed to perform single-character matching: it can never be used for help on the command line because it is never passed to the user’s program, since Unix requires that this metasyntactic operator be interpreted by the shell.

Having seven different classes of metasyntactic characters wouldn’t be so bad if they followed a logical order of operations and if their substitution rules were uniformly applied. But they don’t, and they’re not.

[...followed by pages and pages of more examples like "today’s gripe: fg %3", "${1+“$@”} in /bin/sh family of shells shell scripts", "Why not “$*” etc.?", "The Shell Command “chdir” Doesn’t", "Shell Programming", "Shell Variables Won’t", "Error Codes and Error Checking", "Pipes", "| vs. <", "Find", "Q: what’s the opposite of ‘find?’ A: ‘lose.’"]

My judgment of Unix is my own. About six years ago (when I first got my workstation), I spent lots of time learning Unix. I got to be fairly good. Fortunately, most of that garbage has now faded from memory. However, since joining this discussion, a lot of Unix supporters have sent me examples of stuff to “prove” how powerful Unix is. These examples have certainly been enough to refresh my memory: they all do something trivial or useless, and they all do so in a very arcane manner.

One person who posted to the net said he had an “epiphany” from a shell script (which used four commands and a script that looked like line noise) which renamed all his '.pas' files so that they ended with “.p” instead. I reserve my religious ecstasy for something more than renaming files. And, indeed, that is my memory of Unix tools—you spend all your time learning to do complex and peculiar things that are, in the end, not really all that impressive. I decided I’d rather learn to get some real work done.

—Jim Giles, Los Alamos National Laboratory

dolmen · on April 12, 2021

CSH Programming considered harmful (1991)

http://harmful.cat-v.org/software/csh

TeMPOraL · on April 12, 2021

Or CMake.

I can excuse shell languages because they're optimizing for being fast to type in a command line, while also being used for writing programs. But other than that, I weep when I think of the time being wasted on languages with arcane syntax. I wish the world would just switch over to s-expressions.

I understand that many people will find well thought out syntax cleaner than writing trees. But what languages like Bash or CMake, and arguably JS too, have in common is that their syntax wasn't thought out, it grew organically into a mess. S-expressions eliminate this entire class of problems up front.

iso1631 · on April 12, 2021

> If people think JS is bad wait until they try to do anything moderately complex in shell script

I've lost count of the times I've started with a simple bash script and regretted it down the line. Now I refactor into Perl far earlier in the process.

wruza · on April 13, 2021

To me the question always was: what’s the point of [ba]sh for non-interactive shell scripting when there is Perl? (I’m aware that it has issues with versions, portability to non-PCs, etc., so think “theoretical standardized lightweight perl mode for sh”, not a real one.) Shell syntax and semantics are a pile of ugly hacks and landmines, which came about because everyone wanted to program the CLI. When you want to program, just take a programming language, not a poor excuse for one. The amount of time you end up wasting with bash is much greater than the time spent reading perldoc *, and mastering bash never pays back the investment.

laci37 · on April 12, 2021

My second task at my first job was to connect up a horrible mess of bash, sed, awk, Python and R to use a web service as input instead of local files. I ended up rewriting the whole thing in Python. It was a good decision.

mikedilger · on April 12, 2021

I still use this hack without thinking about it in case the variable on the left is empty. But I suppose I should actually be quoting.

  [ $X == "hello" ]

When X is empty, it gives:

  example.sh: line 5: [: ==: unary operator expected

So I fix it (apparently wrongly) with:

  [ x$X == x"hello" ]

I'll make a point of doing this instead:

  [ "$X" == "hello" ]

koala_man · on April 12, 2021

Definitely quote. `[ x$X == x"hello" ]` may fail for a variety of different values of X like `*`, `Rocky_[1976].jpg`, `hello world` and `foo = xfoo -o bar`. Quoting works in all those cases.
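A minimal sketch of the quoted form run against those values, plus the empty string. The helper name `is_hello` is hypothetical, and the sketch uses the single `=` that POSIX `test` specifies instead of `==`:

```shell
# is_hello VALUE: succeed only when VALUE is exactly "hello".
# Quoting "$1" prevents word splitting and glob expansion, so [ always
# receives exactly three arguments.
is_hello() {
    [ "$1" = "hello" ]
}

for value in '' '*' 'Rocky_[1976].jpg' 'hello world' 'foo = xfoo -o bar' 'hello'; do
    if is_hello "$value"; then
        printf 'match:    [%s]\n' "$value"
    else
        printf 'no match: [%s]\n' "$value"
    fi
done
```

Only the last value matches; none of the others produces an error from `[`, regardless of which files exist in the current directory.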

naniwaduni · on April 13, 2021

Of the listed, it should only actually fail on the ones with spaces in them (or rather, the ones that get split according to IFS).

koala_man · on April 13, 2021

Whether or not it should, it may fail for any kind of glob.

This could be either because the glob ends up unintentionally matching something:

    $ var='*'; touch xfoo xbar
    $ [ x$var == x"hello" ]
    bash: [: too many arguments

    $ var="Rocky_[1976].jpg"; touch xRocky_7.jpg
    $ [ x$var == x"Rocky_[1976].jpg" ]
    (false, should be true)

or because you have something like nullglob or failglob enabled:

    $ var="Rocky_[1976].jpg"; shopt -s nullglob
    $ [ x$var == x"hello" ]
    bash: [: ==: unary operator expected

kiwijamo · on April 12, 2021

If you wish to support the author of ShellCheck, you can become a GitHub sponsor: https://github.com/sponsors/koalaman

Disclaimer: I am one of the current sponsors.

spockz · on April 12, 2021

The GitHub docs speak of being able to do single donations but I’ve not seen the option yet. Is it something you need to explicitly configure as the one being sponsored?

koala_man · on April 12, 2021

One-time donations were not originally an option, but I'm happy to see it's been added since. It's now enabled. Thanks!

dcminter · on April 12, 2021

Thanks for that tip, I've benefitted from Shellcheck a bunch of times so I've just sponsored and wouldn't have without your comment.

I've puttered along for years writing little bash scripts for one thing or another and always had the uneasy feeling that I was littering them with errors.

Shellcheck's linting showed me (a) that I was right and (b) how to fix them. It's made me a better developer and I'm so grateful for it!

koala_man · on April 12, 2021

Thank you for your support ^____^

majkinetor · on April 12, 2021

Amazing how people still put up with this nonsense. Bash really has no place in this age.

citrin_ru · on April 12, 2021

POSIX shell is a somewhat quirky language, and bash adds a few quirks of its own, but you can write shell code that works reasonably well if you approach it like any other programming language: learn it (read guides, read the man page), choose a style guide (or write your own), use static analyzers (shellcheck is a great tool), test the code on different inputs, and ask for code review.

For some reason it is commonly believed that writing shell is so easy that one doesn't need any of this, not even to learn it.

majkinetor · on April 12, 2021

I guess you can use that approach for pretty much anything in the world. You can totally use it for brainfuck.

However, good systems impose elegance without users having to create complex standards around it.

citrin_ru · on April 12, 2021

1. Yes, but there is little reason to use brainfuck in production code, and there are reasons to use POSIX shell: (a) it is available on practically all UNIX-like systems, from NetBSD to Ubuntu, in their default installation (no additional dependencies); (b) for small scripts it can run faster than Python/Ruby/Perl thanks to its short startup time; (c) it is a handy tool when all you need is a small script that mostly runs other commands/CLI tools (but you still have to spend at least a little time learning it first).

Maybe the availability of the shell is an unfortunate historical artifact, yet it is still a reason why one may want to use it.

2. The idea of imposing elegance upon users is attractive, but even if we ignore that it makes system/language design harder, different people can have different opinions on what is elegant and what is not. Having said that, I like Go (which tries to enforce what can be enforced) because the opinions of its authors are not too far from mine. But not everybody is happy with Go, and it is not hard to find criticism of it.

majkinetor · on April 12, 2021

> people can have different opinions on what is elegant and what is not

There are also measurable effects of an elegant system, like the number of bugs, questions, libraries, standards, etc.

citrin_ru · on April 12, 2021

To compare, say, how language syntax/features affect the number of bugs, questions, etc., you have to hold everything else equal, and that is very hard, if possible at all, in the real world.

klapatsibalo · on April 12, 2021

I agree. However, we had to create complex C "standards" for dos and don'ts in order to _try_ to write safe code.

And C is still omnipresent. So there must be some other characteristic of Bash that makes it not fit for this age (or some characteristic of C that makes it irreplaceable, that Bash does not have).

majkinetor · on April 12, 2021

> And C is still omnipresent.

And actively being replaced by Rust each day because those "standards" are obviously not enough.

And there is some other characteristic of C - it's portable.

skywhopper · on April 12, 2021

Bash has multiple better solutions to this. This is a really really really old hack. New code that uses it is either trying to be backwards compatible to the extreme or is more likely just someone unthinkingly copying what they’ve been taught or seen.

qiqitori · on April 12, 2021

What's a decent replacement shell? Sometimes I think there should be one, with all backwards compatibility for if/looping thrown out, but still adhering to the core concept of mostly being made of a handful of builtins and otherwise executing commands (i.e., without having to type system("") or exec("") etc. to execute commands).

lmm · on April 12, 2021

tclsh is very nice for scripts - it's a first-class part of TCL which is a proper grown-up programming language, but it pulls it very much into being a shell, and it's been around for decades so there's a decent chance that any old unix system will have it installed.

orthoxerox · on April 12, 2021

Does it have to be a shell? I would be fine with having bash as a shell for basic stuff like piping and a non-interactive scripting language for actual scripts.

hnlmorg · on April 12, 2021

I've written one specifically to solve if and looping: https://github.com/lmorg/murex (docs: https://murex.rocks)

Others I've not used are Oil shell, which aims to be a better Bash, and Elvish, which has a heck of a lot of similarities with my own project (and is arguably more mature too). There are REPL shells in LISP, Python and all sorts too.

So there are options out there if you want a $SHELL and are happy to break from POSIX and/or bash compatibility.

TeMPOraL · on April 12, 2021

Seconding PowerShell. Or some future development that files off the rough edges, but otherwise follows the same sane principle of piping objects instead of unstructured text :).

michaelcampbell · on April 12, 2021

So. Much. Typing. though.

netmare · on April 12, 2021

Command/type aliases and autocompletion make this a non-issue on the console and advanced code editors. Personally, I don't mind the verbosity even when using plain editors, like Notepad2, my text/script editor of choice.

Extra-long command names and deeply namespaced .Net types can be a chore both to read and write, but I find the *nix tools way too cryptic. Never could get used to them, but then again, I've been on DOS/Windows all my life since the 90s.

To be honest, I'd occasionally get frustrated and think about switching to Linux. However, Python and PowerShell are the two things that have kept me on Windows. The fact that they are both open-source and cross-platform is just the cherry on top.

mkj · on April 12, 2021

Python with sh.py is often alright in the place of a shell script.

majkinetor · on April 12, 2021

Really?

    sh.ls("-l", "/tmp", color="never")

majkinetor · on April 12, 2021

PowerShell obviously, if you can handle awesome.

Now there will be tons of angry dudes complaining about:

- Optional verbosity

- 1-5s startup time on some machines

- Random MS agenda to conquer and destroy

- Random problem with usage that is solved by RTFM

- Suggesting python, ruby, go, node, perl and whatever non-shell language

Thankfully, ignoring is still a great human feature.

jen20 · on April 12, 2021

First, I like PowerShell - sufficiently much to have used it as a login shell _on macOS_ for a while.

However, you grossly overestimate the “awesome”. To answer each “angry dude” point in turn:

- The verbosity is fine - in scripts, one should fully expand command names. Interactively, `gci` is fine (or the `ls` alias, though some of the default aliases are travesties).

- The startup time is insane. I start a new terminal window hundreds of times per day, and the startup time is what made me abandon POSH.

- Apache 2 licensed, so I don’t care. I do default to assuming that MSFT are doing something shady though, and I’m right [1].

- RTFMing is perfectly fine (and Get-Help has high quality documentation for built-ins), but one must accept that most users will not. ISE (if it existed outside of Windows) was a nice workaround for needing this.

- Python, Ruby, Go, Node and Perl are all more portable (see my last comment about FreeBSD support).

As an interactive shell, nushell [2] is probably the closest thing to the PowerShell experience which is not tied to .NET Core.

[1]: https://docs.microsoft.com/en-us/powershell/module/microsoft...

[2]: https://github.com/nushell/nushell

majkinetor · on April 12, 2021

So your answers just confirm my 'angry dude' bullet list.

As far as I can see, the only really problematic one is startup time, which is surely something that will be resolved in the future. I have witnessed it only rarely while using PowerShell on Windows and Linux: it was on Windows, on a VM with a very slow disk, with some modules that keep each function in a separate file.

I welcome nushell development, but you can't really compare it to pwsh - even if they continue developing it, it's a decade or so from being usable as a main shell.

> you grossly overestimate the “awesome”.

Nah, it's awesome. I save a bunch of time any time I open PowerShell. And any time one tries to be quicker in some other language, one eventually is not.

jen20 · on April 12, 2021

In that case you cannot see sufficiently far: telemetry on by default is a non-starter. If the startup time were going to be fixed “real soon now”, it would have improved since 2006. It has not, especially when basic usability enhancements like “posh-git” are enabled.

> [nushell is a] decade or so from being usable as a main shell

News to me. I replaced Powershell with NuShell.

majkinetor · on April 12, 2021

> it would have improved since 2006

Cross-platform pwsh is just a few years old.

> I replaced Powershell with NuShell.

Sure, you can replace it with LOLCAT too, it's totally legit.

Regarding startup time, I can't believe there is such drama. If that troubles you so much, you can easily work around it by always keeping a PowerShell process pool ready in the background (1 or 2 suspended instances until you use them).

scbrg · on April 12, 2021

> you can easily work around it by always keeping a PowerShell process pool ready in the background (1 or 2 suspended instances until you use them).

This is a pretty absurd take when the discussion is about alternative shells for writing scripts. Shell scripts are everywhere. A fairly heavily used UNIX box might easily execute hundreds of shell scripts every minute. A startup time of several seconds makes it completely unusable.

majkinetor · on April 12, 2021

Not sure what to tell you to burst your bubble. I have literally thousands of pwsh scripts running in production in critical gov projects, most of them as part of CI, CD, build and automated testing, but some of them serving millions of users. I had a dozen such projects in the previous decade on Windows and several on Linux, and never really had any significant problem with slow startup. Occasionally some machines had slow startup, but the team usually fixed that one way or another.

A usage pattern where hundreds of shell scripts run every minute is IMO not something you should brag about in systems architecture. And pwsh is different - it has modules, which are encapsulated, so you don't have to start another instance of the shell to keep state from mingling. Parallelism ofc still benefits from faster startup.

scbrg · on April 12, 2021

> I had a dozen such projects in the previous decade on Windows and several on Linux, and never really had any problem with slow startup.

OK. So then startup time is not slow? Well, then it's of course a non issue - but then I also don't see why a process pool would be required.

> A usage pattern where hundreds of shell scripts run every minute is IMO not something you should brag about in systems architecture.

Yet, you just did :) Anyway, it was hardly intended as bragging, rather than just stating facts. Lots of stuff are shell scripts. Lots of user facing commands are wrapped in shell scripts. Of the ~3k commands in my $PATH, around 500 are shell scripts. You may not think this is a good state of affairs, but it's still a fact. (And it's quite unclear to me what's so bad about it - an excellent use for scripts is providing environment for other programs.)
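For anyone who wants a comparable number for their own machine, here is a rough POSIX shell sketch. The function name `count_shell_scripts` is hypothetical and the shebang matching is a heuristic assumption (only `sh` and `bash`, invoked directly or via `env`), so treat the result as an estimate:

```shell
# count_shell_scripts DIR...: count executable regular files whose first
# line is a shebang naming sh or bash. Heuristic only; other shells and
# scripts whose single line lacks a trailing newline are not counted.
count_shell_scripts() {
    count=0
    for dir in "$@"; do
        [ -n "$dir" ] || continue              # skip empty PATH entries
        for file in "$dir"/*; do
            [ -f "$file" ] && [ -x "$file" ] || continue
            # Read only the first line; unreadable files are skipped.
            IFS= read -r first_line 2>/dev/null < "$file" || continue
            case $first_line in
                '#!'*/sh | '#!'*/sh\ * | '#!'*/bash | '#!'*/bash\ *) count=$((count + 1)) ;;
                '#!'*'env sh'* | '#!'*'env bash'*) count=$((count + 1)) ;;
            esac
        done
    done
    printf '%s\n' "$count"
}

# Usage over the whole search path (subshell keeps the IFS change local):
# ( IFS=:; set -f; set -- $PATH; set +f; count_shell_scripts "$@" )
```

A command that appears in several `$PATH` directories is counted once per directory, which is another reason to read the output as approximate.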