As always my take on this is that if you don't care about portability, why even bother with bash?
My rule of thumb is that if your script becomes so complicated that you find POSIX sh limiting, that's probably a good hint that you should rewrite it in a proper programming language. Bash scripting is an anti-pattern in my opinion; it's the worst of both worlds, since it's neither portable nor that much better than POSIX sh.
I don’t really care about targeting, say, IRIX, or z/OS, or the early-2000s-era Windows POSIX Subsystem. But I do care about targeting Linux, macOS/iOS, all the modern living BSDs, and Android (where applicable.) I also maybe care about letting my software run under a Linux Busybox userland.
It’d be great to know what the lowest-common-denominator standard is for just that set of targets. I don’t care about POSIX, but if that set of targets had a common, formally named standard, I’d adhere to it rigorously.
Since it doesn’t, I just mostly adhere to POSIX — except where it gets to be too much (like having to write 1000-line Bourne Shell scripts.) When that happens, I look around to see if the set of targets I care about all do something the same way. If they do (such as all happening to ship some — perhaps ancient — version of Bash), then I break away from POSIX and take advantage of that commonality.
I don’t know enough about all the things these OSes do or do not have in common to truly code to the implicit standard 100% of the time. I wish I did. So, in most respects, wherever I haven’t done independent research, I have to hold myself to a much stricter standard — POSIX. But it’s only ignorance keeping me doing it!
>I don’t really care about targeting, say, IRIX, or z/OS, or the early-2000s-era Windows POSIX Subsystem.
Sure, those are very niche, but not all BSDs or Linux distros have bash in the base install. And if you work with embedded systems, bash is very much a luxury. You mention busybox, but that one is very much a crapshoot because everything in busybox is opt-in. You want to use `seq`? Better make sure it's configured in. On the other hand, you almost certainly won't have bash on those systems either.
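If you do target busybox-class systems, one cheap defensive pattern (just a sketch; `command -v` is POSIX) is to probe for each non-guaranteed applet up front instead of failing halfway through:

    command -v seq >/dev/null 2>&1 || { echo "this script needs seq" >&2; exit 1; }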
At any rate, if the alternative is between installing python, ruby, perl or lua on my machine and maintaining and hacking on ginormous bash scripts, I know which I'll choose.
But in general I like your approach, it's pragmatic. It's just pretty hard these days to find a system that ships bash but can't give you some decent scripting support, at least in my experience.
It's not just embedded environments: some docker images used in cloud setups also don't have bash — alpine doesn't have it. The alpine image is used a lot due to its small size; you can add bash easily, but it is not part of the base image (which makes a difference if you are in an ssh session to a container that doesn't have access to the net):
docker run --rm -it alpine
/ # bash
/bin/sh: bash: not found
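For completeness, adding bash to that image is a one-liner when the container does have network access (a sketch; it assumes the default Alpine package mirrors are reachable):

    / # apk add --no-cache bash
    / # bash --version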
Note that targeting bash on macOS means targeting bash v3, as Apple is not updating to bash v4+ (due to GPLv3) and has changed the default interactive shell to zsh as of macOS Catalina (10.15).
> it's the worst of both worlds since it's neither portable nor that much better than POSIX
Writing modern bash using the updated syntax, it is easy to make robust scripts and the syntax around things like quoting is simple and consistent.
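A small sketch of what "modern" buys you here (hypothetical variable): inside [[ ]] there is no word splitting or globbing of unquoted parameters, so the quoting rules stop being a minefield:

    path='/tmp/some dir/report.txt'
    if [[ -f $path && $path == *.txt ]]; then   # no quotes needed inside [[ ]]
        echo "found a text file: $path"         # quotes still needed for ordinary expansion
    fi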
It seems odd to me to suggest that one should either stick to the old, brittle /bin/sh syntax or not use bash at all. It has its place in the world, and it should be done the modern way.
You'll find tens of thousands of lines of bash in the kubernetes repo, and all the files I checked are written using modern bash syntax. There must be a reason they chose bash for these parts.
Of course, in any case everyone should use the tools they like and find most suited to the task at hand.
Because it's everywhere anyone cares about and isn't some weird thing like zsh or Lord forgive us tcsh.
Also, all of these modern tools are basically fancy wrappers around shell scripts. Even a Dockerfile is just a way to run some shell commands at image build time and other shell commands at launch time — to say nothing of puppet or ansible or user-data.sh ....
Shellcheck specifically supports POSIX, BASH, and Korn. Each of these has an important role in scripting.
Development of the formal ksh93 was halted and rewound by AT&T after David Korn's departure, precisely because the user community remains extremely intolerant to changes in functionality and/or performance. Korn shell development that requires a "living" shell should likely target both ksh93 and the MirBSD Korn shell (the default shell in Android).
BASH has diverged from the Korn shell in a few (annoying) ways, but remains core to the GNU movement and rightly commands its own audience. It is much larger than the MirBSD Korn shell.
POSIX shells were developed to target truly minimal systems - POSIX rejected Korn likely due to (Microsoft) XENIX on 80286 systems that only allowed 64K text segments. Korn was able to run on such machines, but the code was not maintainable. Clean C source for a POSIX shell is far easier to achieve for XENIX, and remains a better fit for embedded environments. This is likely why POSIX is not Korn.
For small scripts, bash is still the easier way to go. Other programming languages are not that immediate to do simple things. How many lines of code take piping a command into another in python, for example?
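For instance, something like this (hypothetical log file) is one line in the shell, versus a subprocess pipeline you have to wire up by hand in Python:

    grep -i error /var/log/syslog | sort | uniq -c | sort -rn | head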
Also bash is not portable, but you can assume to find a recent version of bash on any Linux system. And for BSD systems or macOS, you can install it with a command. And yes, there are other ancient UNIX systems, but who uses them? But what other programming language can you be sure to find on any system?
Finally, bash is quite efficient. I mean that it doesn't have the weight that a full programming language like Python has, especially the time it takes to start the interpreter. That can make a difference for scripts that are executed many times by other scripts or programs, and for wrapper scripts, since it doesn't add a noticeable delay when executing a command.
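The startup cost is easy to measure for yourself; a rough sketch (numbers vary a lot by machine, but the shell is consistently the cheaper of the two to start):

    time bash -c 'true'
    time python3 -c 'pass'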
There is still a reason to use and know bash to me. Of course implementing whole programs in bash doesn't make sense, but for simple scripts it does.
Sometimes you write POSIX shell (or very nearly POSIX) for portability. Sometimes you write in "a proper programming language" for more powerful/flexible features. And sometimes you write Bash to get both.
Bash is ported to basically every system, it's a small binary (by today's grotesque standards), and it gives you some added functionality that would be otherwise annoying to re-implement in POSIX sh (or Bourne sh). But it's also far less complex than Python. In today's world, if someone has any shell, it's a safe bet that they either already have Bash, or they can install it just like they'd have to install Python.
It's less painful in the long term to use Bash rather than Python. Python is more time-consuming and costly. It's a larger space requirement, you have to manage more runtime dependencies and a virtual environment, debugging is worse, and there's more opportunity for bugs because it's more complex. Bash scripts also rarely get bugs down the line due to dependencies changing, whereas this happens in Python so frequently we have to pin versions in a virtualenv. When's the last time you pinned the version of a Bash script dependency?
There are two specific cases where you do not have BASH.
1. Ubuntu/Debian /bin/sh is the Debian variant of the Almquist shell, known as dash. The only non-POSIX element it includes is the "local" keyword (afaik).
2. Busybox, with the Almquist shell. Busybox also has a "bash" which offers a few cosmetic improvements to ash (an alias of [[ to [ is one that I can see in the source). I write POSIX shell scripts on Windows with busybox quite often.
The only other shell that omits everything outside of the POSIX specification is mrsh. I don't think mrsh is widely-deployed in any major distribution.
(You also aren't going to have bash in AIX/HP-UX and maybe a Solaris base load, but as the parent article says, we aren't talking about dinosaur herders.)
For #1, you can just put #!/bin/bash at the top of the file to use Bash. Bash is still available, it’s just not the default for scripts that specify #!/bin/sh.
#2 is still currently tricky, but Rob Landley (former Busybox maintainer) is working on a full bug-for-bug compatible Bash clone called toysh which will be included in an upcoming release of Toybox[1]. Once that’s released, I’m looking forward to (hopefully) never writing a script for BusyBox ash again.
Bash is ugly and rife with pitfalls, but it can be incredibly productive for certain classes of problems if you know what you’re doing. Trying to use Python or Go to whack some external commands together feels very cumbersome to me.
For pure programming, “real” languages are absolutely preferable in almost every way, but all of them fail when it comes to running external programs and redirecting their output in a way that doesn’t make me pull my hair out. As a heavy user of both “real” programming languages and shell scripting languages, I’m left craving something that brings the best of both worlds.
There are a number of newer shell projects in this vein that are very exciting, like Oil [0], Elvish [1], and Xonsh [2]. I was also hopeful that Neugram [3] would go somewhere, but the project seems to have died out. While many people cite Bash’s lack of portability as a reason not to use it, I find Bash to be very portable for my use cases and avoid using these newer shells for their lack of portability. Maybe one day we can have nice things.
Superficially it's quite similar to elvish (purely by coincidence) but it's aimed around local machine use (eg by developers and devops engineers) so is as much inspired by IDEs as it is by shells.
Sad that none of those projects took the easy route. If you want to replace Bash, do it the way Bash replaced Bourne: Make it backwards compatible.
Bash acts like 'historical versions of sh' if it was invoked as sh. So, a Bash replacement could act like Bash if it was invoked like bash. Then implement all the extra crazy shit if you invoke it as slash or something.
Assuming it was a small, fast, compiled binary, this would take off in all the distros pretty much immediately. And if you really want it to succeed, implement a spec first, and implementations second. Add a slash implementation to Busybox, and then it's on every embedded Linux system in the world.
It would certainly be nice, but I wouldn’t call that the easy route. Creating a new shell that’s fully backwards compatible with the monstrosity that is Bash sounds like a massive undertaking. POSIX or the Bourne shell is probably an easier target, and there are numerous alternative shells that are POSIX-compatible and add nice new features or optional alternate syntaxes. But I agree, to properly dethrone Bash would probably require backwards compatibility with it.
Personally I don't see the point of dethroning bash. I mean, Bash still hasn't completely dethroned the Bourne shell. And now there is a lot more competition among shells than there ever was.
Personally I think the better approach is to accept that bash/sh will always be around for legacy stuff, and for alternative shells to carve out a niche elsewhere. Particularly because some of bash/sh's pain points can't be addressed without breaking compatibility in the first place (like handling file names with spaces).
This used to be my take, but I'm not sure any more. Now, I don't think Python is ideal, and I am looking for something that maybe fits better than it in that gray zone of programming language vs "quick" script, but I can't see your point about debugging and bugs: bash is notoriously annoying to write bug free, and I can't see what's the issue with the Python debugger.
Also if we're in that gray zone of bash vs Python, you can pretty much stick to the stdlib.
I'm hopeful that Deno can be just that runtime. Since it's a single binary, it's trivial to install and script for. TBH, I don't have any technical reasons for JS/TS over python besides having coworkers who have never had to learn python and would prefer not to.
/**
* cat.ts
*/
for (let i = 0; i < Deno.args.length; i++) {
const filename = Deno.args[i];
const file = await Deno.open(filename);
await Deno.copy(file, Deno.stdout);
file.close();
}
A distribution that removes Python 2.7 once upstream security support has been cut. Since Python is frequently used on network-facing services, it is unconscionable to allow its use past January 2020. Heck, one might even have setuid programs in Python and you don't really want to risk an unfixed local privilege escalation either.
> AWS Linux 2
A butchered version of CentOS
> and macOS
Not a distribution, and Apple doesn't exactly have a good track record with keeping things up to date or secure.
> Or more likely, the world isn't as clear cut as you might think
Given that the vast majority of the Python ecosystem is no longer compatible with 2.7, there isn't even a technical reason for kicking around 2.7 anymore.
Python 2.7 is trivial to build and run if you really, really want or need it, but since all the major upstream distributions have removed it, it sends a clear signal that you must take the risks into serious consideration before doing it.
Where size of data >> RAM (but less than the free space in /tmp). It is of course possible, but I suspect it would require more effort and run slower than the shell line above.
If you want to compare memory efficiency: the sort + uniq -c job would likely be handled more efficiently in a high-level language, since there you would keep only the count+value pairs in a defaultdict or the like instead of sorting the entire dataset, which requires keeping all of it in memory.
As for brevity, yeah, this particular example would require 5 lines instead of 1.
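The shell line being referred to isn't shown here, but it was presumably something along these lines (a guess): count distinct lines in a file much larger than RAM, letting sort(1) spill to temporary files in /tmp instead of holding everything in memory:

    sort -T /tmp huge.log | uniq -c | sort -rn > counts.txt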
How say you? To even ask the system a question in python, you have to "import os". Is that not a library? I'm not a python-first type of person, but I have hacked enough scripts to be functional. My one takeaway is that the python core is not very useful, and to do even remotely anything one must have a list of imports.
then why must it be imported as if it were 3rd party? this is my question about python in general. i get needing to explicitly import things that are not part of the core/standard, but why must the core/standard things be imported? why aren't the methods they provide just immediately available?
*clearly, i've never taken a python CS-style class, but this importing of core functions is just strange to me
"Third party" means a third party provided it: not yourself, and not the Python distribution, but someone else.
"os" is a second-party library. You didn't write it; it was included in the Python distribution for you.
What you're asking, I think is why the "os" functions are not, in Python terminology, "builtins." The builtins are functions that are available without any kind of import: like len(), max(), sorted(), etc.
Why do you have to "import os"? Python was designed that way, that's all. So were the majority of other languages that come with standard libraries. (JavaScript is one exception that comes to mind, but there aren't many such exceptions.)
The discussion above relates to "virtual environments", which are ways to manage third-party dependencies. My claim is that virtualenvs, while handy for general-purpose development, are basically pointless for replacing shell scripts. You don't need them -- the core language and its standard library are sufficient for most shell scripting purposes. I'm basing this opinion on my ~20 years of using Python for this kind of work.
You can have other shells that are not bash, true. But you are guaranteed to find bash in any Linux distribution (well, except minimal distros like OpenWRT or Alpine Linux, which use busybox to save space). Let's say that on any server or desktop Linux installation you'll find bash.
You would probably also find python, but what version? Python 3? These days you are probably guaranteed to find it, except that there are old Red Hat servers still in production where it is not installed. And for Python 3, which version? 3.7? Older? Which features are safe to use and which are not? Do you have to do research to find out? Or are you stuck using Python 3.2 features just to be safe? Well, in the time you wasted thinking about that, you would have finished writing the script in bash.
3. the ability to run parts of scripts and build up into a bigger script
Currently I write a lot of small-ish scripts using either bash or bash plus a text editor. I would like to use a less obtuse language than bash, but the interactive part is really important. I don't want a system where I have to run exploratory and test commands in bash and then translate them into a different language, because that means figuring out how to say it twice and doing extra debugging.
> When the == and != operators are used, the string to the right of the operator is considered a pattern and matched according to the rules described below under Pattern Matching, as if the extglob shell option were enabled.
POSIX syntax is more consistent and more portable:
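Presumably the example meant here is something like pattern matching via case, which is the portable way to do what bash's [[ $var == pattern ]] does (hypothetical variable names):

    case $filename in
        *.txt) echo "text file" ;;
        *)     echo "something else" ;;
    esac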
It's always fun to assume everyone is running Bash until your Mac's zsh or your Ubuntu's dash (or whatever) chokes on your script a few times. When this happens enough times you eventually reconsider...
Until your users/colleagues source it into another shell, or copy-paste some part of it into their shell/Makefile/Bazel/other script... and then either you feel sorry for them, or you laugh at them and blame them for burning themselves and trashing their system. (Hopefully your choice didn't cause their files to get wiped. Unless you think they deserved it for daring to paste your code somewhere else, I guess.)
> This problem is not exclusive to bash, try copy-pasting python 2 code into python 3.
No, it is highly exclusive to Bash actually. In Bash, even one-character mistakes have a high potential to cause catastrophic damage. Just splitting or globbing on the wrong character can wipe your filesystem. As well as issues like these, where your string being '!' suddenly causes the command to be interpreted entirely differently. And Bash has modes that keep going when there is an error so it's not like you'll even know you've encountered an error or get any kind of visible crash + stack trace necessarily. It just keeps trashing everything further and further. These are extremely wildly different from Python.
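A sketch of the kind of one-character mistake meant here (hypothetical paths — don't actually run it): drop the quotes and the space in the value splits the argument, so rm deletes a directory you never named:

    dir="/home/me/build output"
    rm -rf $dir/      # unquoted: word-splits into  rm -rf /home/me/build output/  (two arguments)
    rm -rf "$dir"/    # quoted: removes only the directory you actually named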
> Surely, no domain in IT is idiot-proof.
I would love to know what happens when you say this to a user who pastes your Bash in zsh. Do you tell them your code isn't idiot-proof? Is everyone who ends up running Bash in zsh for some reason or another an idiot?
> If you're not doing `set -euo` and if you run your scripts as root before testing them, you're asking for something wrong to happen.
...because you can't delete important files unless you're root, right? Subfolders in your home directory are totally OK being deleted? You store your documents so that only root can modify them?
And do you know that set -e has a ton of its own pitfalls and doesn't save you from this? It's a half-broken mitigation. I don't have the energy to keep arguing here, but I would suggest reading up on the pitfalls of Bash and set -e and the like.
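One well-known example, as a sketch (hypothetical function): set -e is ignored for every command executed while evaluating an if condition, including everything inside a function called from there, so a failed cd does not stop the script and the following commands run in the wrong directory:

    set -e
    cleanup_old_builds() {
        cd /nonexistent/build/dir            # fails, but errexit is suspended here...
        echo "would now run: rm -rf $PWD/*"  # ...so we reach this line in the wrong directory
    }
    if cleanup_old_builds; then
        echo "cleaned"
    fi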
No, if you're using a language that requires you to perform some obscure incantation like "set -euo" before running your scripts, you're asking for something wrong to happen.
If people copy-and-paste complex shell scripts without testing, you’re going to have a bad day no matter what you do. The solution is to work on better practice, use of shellcheck, rewriting complex logic in Python, etc. since that’ll pay dividends in so many other ways.
I've been seeing this argument for decades and yet I've never run into any issues like you describe.
When I write a bash script it is for my use or in my production system. In either case I call bash explicitly (/usr/bin/env bash) so the script will work correctly or not at all.
Rarely do I have a project where there is a requirement to write a "run anywhere" script but if it were I would stick to posix.
There are more people in the world than you. You might do everything perfectly; other people won't. Especially users who have yet to learn about these pitfalls. If you haven't run into this, either you're lucky enough that it doesn't apply to your situation, or you haven't become aware of it or tried to pay attention to it. I know I've seen people try to run Bash in their Mac's zsh (not necessarily deliberately either; some programs just run the default shell) and watched it crash and burn. I'm not condoning it, I'm saying it happens whether you and I like it or not, and you can either blame the victim (which I guess wouldn't be the first time I see this happen), or you can do something about it.
Am I understanding correctly that your point is that we shouldn't write modern bash scripts because some number of untrained people will have problems by accidentally misusing the scripts (wrong interpreter, blind copy/paste, etc). Is that a correct summary?
To me this sounds like saying we shouldn't have powerful tools like chainsaws because every year some people without safety training use them wrong and hurt themselves.
No, that's just an exaggerated caricature of what I'm saying.
I'm only saying you should avoid using modern things needlessly. This means, don't do it when the POSIX-compliant syntax also serves your needs. e.g., don't do [[ -f "$foo" ]] when you could just as easily do [ -f "$foo" ]. And if you do need to use Bash-specific features, make sure it's safe enough that foreseeable misuses will fail safely. Or when that's not possible, try to mitigate it somehow (maybe with a comment, or by putting in some extra dynamic checks, or other solutions you can think of).
And this goes for other things too, not just Bash. Probably outside programming too.
> I'm only saying you should avoid using modern things needlessly.
I agree with that idea.
However, I don't think it applies in this case. The point of using modern bash syntax is that it fixed the brittleness and pitfalls of posix sh. This is significant. This is what I need in production. This is why our style guides insist on modern bash syntax for bash scripting.
I'd say the rest of your suggestions apply to shell scripts no matter what syntax is used.
Also, bash on Mac is shitty. I've run into problems with it before (can't remember exactly what it was) so we've reversed it: our shell scripts are written in zsh, and we just ensure that zsh installed everywhere.
Besides that bash has its problems with array variables. Unless I switch to one of the "real" scripting languages, I use zsh too and start my scripts with
if [[ -n "$BASH" ]] ; then
echo $0: use zsh, not bash
exit 1
fi
Earlier on I used the more compact
[[ -n "$BASH" ]] && { echo $0: use zsh, not bash; exit 1 }
> For anyone writing bash scripts today I would avoid the old test/[ functions entirely.
I think it's better said that anyone writing bash scripts today should always pass their script to shellcheck.net. It can even be installed locally in case sharing the script online is undesired.
Shellcheck will not only look for problems in the script but it will also usually provide some background about how the problem is a problem and a suggested workaround or solution.
I would avoid using bash for writing scripts (i.e., non-interactive use). There are better shells for scripting, like the one most Linux distributions are using, derived from the default shell in NetBSD, derived from the Almquist shell.
The thing I miss most when using Linux is the default NetBSD shell. It has the speed of dash plus command line history.
The post mentions the issue is fixed by POSIX, which… notably does not standardize -o, -a, and parens. I guess now we know why POSIX didn't standardize those.
POSIX did standardize those, but the functionality is marked XSI, so only XSI-conforming implementations are required to provide it.
There is never a reason to use them, though. They can be replaced by the shell's && and || operators, which nicely bypasses the parsing ambiguities at the same time.
> There is never a reason to use them, though. They can be replaced by the shell's && and || operators, which nicely bypasses the parsing ambiguities at the same time.
Not sure I agree. The equivalent of more complicated expressions like (a AND b) OR (c AND d) with && and || isn't always so "nice" IMO. e.g., try it for this:
[ -f a -a -f b -o -f c -a -f d ]
I think the simplest syntax will require you to spawn subshells, which is neither particularly nice, nor particularly performant (especially on Windows).
{ test -f a && test -f b; } || { test -f c && test -f d; }
?
Yes, it's more verbose, but the original expression isn't particularly readable either, and if you'd want to evaluate something like (a OR b) AND (c OR d) you'd need to introduce parentheses anyway.
It's not "so much" worse, it's just mildly worse. All I was saying here was the verbosity is a reason for using them. Obviously there are reasons not to use them too, which I said in my own top comment.
Note also that this does not need any subshells. I think dataflow was thinking of (test -f a && test -f b) || (test -f c && test -f d) instead, but { } isn't worse than that.
With braces I'd prefer the test syntax actually, this is too many special characters to be comfortably readable for me.
And yes I know it creates subshells, that was precisely what I was saying in my comment where I said the simplest syntax I could think of (i.e. that very code) creates subshells.
> Note that it's not even a bug here; it's necessary for disambiguation.
I would call it a bug when there's one valid way to read it, and [ fails to see it. I don't see the ambiguity. For example, zsh's [ doesn't have these errors:
I was just being a little sloppy in what I wrote. I meant you can encounter ambiguities in the language when you start using -a/-o or (). I didn't mean those particular examples' parse trees are ambiguous.
Well, never seen any reason to use -a or -o. To me they are less clear and you are prone to error. I find things like: [ "$a" = "$b" ] && [ "$c" = "$d" ] much more clear.
As usual, the problem is that Sh & co. use in-band control, especially when talking to utilities. Just a mess of text being passed around, of which every program must make sense on its own—but this happening in shell's own functions takes the cake.
Moreover, similar problems are inherited by some programs that try to do shell-like commands but put their own syntax in the way. E.g. Ansible, where YAML, JSON, module call DSL and Jinja are all mixed on top of the commands so you can bump into any of them.
Interestingly enough, this isn't a mess in the shell's own builtins. The [ ] expression syntax is provided by the program /usr/bin/[ which is (often) a symlink to /usr/bin/test. It's just a clever trick! [ is a program that evaluates its argv as an expression, less the trailing ], and then returns the result as an exit code.
Actually re-reading your post you probably know that already! Nevermind then. Maybe someone reading this will be enlightened
Yes, I know about `[` being both a program and (potentially) a built-in. However, the problem is exactly that an Sh-style shell can't easily divorce itself from the same gotchas unless it invents a DSL for its builtins with different semantics—because normally it must expand vars before invoking the command, and thus vars are interpolated into the text that makes up the command. If the shell does the sane thing of interpreting the command before evaluating vars, it means breaking away from the normal semantics and thus being inconsistent.
I imagine the introduction of `[[` and `((` in Bash may have come partly from this reasoning (to abandon the POSIX semantics for these commands), though I never delved far enough into nerdery to learn the difference from `[`—so don't know if they do evaluate vars in the sensible way.
I keep wondering if there are any shells or other software calling into utils, that manage to walk the edge of writing commands without bumping into such syntax problems—but avoid having to "enquote" "each" "argument" all the time.
So, in conclusion it's a hack... but, it's also incredibly robust. It doesn't hurt, and we acknowledge there are cases where it could help. It's idiomatic enough that literally everyone who's done a nontrivial amount of scripting is familiar with it.
At some point, "decades old idiomatic hack that increases robustness" stops being a hack and becomes an idiomatic way to increase robustness.
A better conclusion would be "yes, keep using it".
> So, in conclusion it's a hack... but, it's also incredibly robust.
You know, shell scripts are a hack.
I really wish there was a middle ground between shell scripts and say python. Python has proper lists and hashes and loops and can manipulate paths with spaces or quotes or unicode. It doesn't get dragged down by the escape-an-escape-within-a-regex nonsense that makes things undecipherable. or ${VAR:-foo} or ${VAR##ugh}
But python is a bit fumbly when it comes to invoking external programs. try: ... subprocess.bleh except: ... ugh.
> But python is a bit fumbly when it comes to invoking external programs. try: ... subprocess.bleh except: ... ugh.
I actually tend to prefer that these days — if I write shell I'm tossing in "set -e -u -o pipefail" anyway, and subprocess's newer APIs end up being cleaner when you need to do anything non-trivial for error handling (I usually end up with one `try:` block for most of the program and maybe one or two `except: cleanup(); raise` blocks). Having the logic be clean and consistent saves you so much time when you revisit that program a month later.
The biggest wins I've seen are around quoting of external program arguments — either not needing to do so at all, or passing things cleanly through shlex.quote — which makes it so much easier when there's any possibility of special characters in the arguments.
I learnt this trick from OpenBSD 2.5 (yes, a long time ago).
I still remember getting root with a shell-script mp3 player that a friend coded because of this.
The exploit was based on inserting something like \$\( make_a_new_root_account_in_etcpassw \) inside the variable inside the test.
As a result, both my friend and I stopped writing shell scripts as CGI and started learning perl just to do it.
Note that this is a relic of the past, as TFA explains. Modern bash can be totally manageable and even fun, if you take the time to learn and have a good style guide and shellcheck by your side.
I use this hack as a matter of course, having internalised it aeons ago. Bash does lots of things better now but as others have commented it is still so fundamentally weird that an innate conservatism is often a wise approach, especially when one lacks the time (or motivation) to explore its "newer" features.
I sometimes wonder why bash transpilers aren't more popular. Bash is such a strange language (or maybe just a very old one with lots of cruft) that it's impossible to write anything meaningful without checking SO 5 times.
Granted, I haven't spent a lot of time writing shell scripts yet, but the syntax is just super weird. Like, I recently learned(1) that when you write if [ -e /etc/passwd ]; then .. that bracket is not shell syntax but just a regular command with a funny name.
Is there a recommended bash transpiler like what babel is to Js? So i can write bash scripts using a more coherent language and it spits out .sh files.
That's more or less what Perl is (a more coherent language).
Perl may look ugly, but it doesn't have the escaping problems that Unix shells have. And most of the usual shell features (listing files, launching commands, etc...) are built-in. It is also available in almost all Unix systems, maybe even more so than bash.
> It is also available in almost all Unix systems, maybe even more so than bash.
That's a bit of a deceitful claim - maybe bash isn't on every UNIX box but /bin/sh sure is along with awk and cut and all the other associated helpers.
Even if perl is installed, it's likely missing all the cpan modules that make it useful, which isn't great if the host doesn't have carte blanche access to install whatever it feels like.
As a sysadmin/devops person knowing the utilities that are guaranteed to be installed is essential, especially now that docker images trimmed of any fat are trendy for bogus security ideals.
Interesting that the answer to that question names the file as .bash – I do this too, but I didn’t know anyone else did.
I do this because my shell scripts use bash specific features, and I want to distinguish them from portable shell scripts that use POSIX shell features only. The latter being the only files that really deserve being named as .sh in my eyes :)
`if` and iterations were two things that annoyed me about bash enough to write my own $SHELL. It's not a transpiler though so I can't realistically use it for production services. However for quickly hacking stuff together for testing / personal use it's been a life saver due to having the same prototyping speed of bash but without the painful edge cases.
Not suggesting it's suitable for everyone though. But it's certainly solved a lot of problems for me.
That's false, [ is a bash shell builtin (it's also a coreutils program, but the program is not called unless you specify /usr/bin/[ or use a shell that doesn't have it built-in).
> I recently learned(1) that when you write if [ -e /etc/passwd ]; then .. that bracket is not shell syntax but just a regular command with a funny name.
That’s how it used to be. Nowadays, most of the time it is a built-in, not its own executable anymore.
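You can see both at once in bash; on a typical Linux box with coreutils installed the output looks roughly like this:

    $ type -a [
    [ is a shell builtin
    [ is /usr/bin/[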
Not sure how another standard would make it better. But sure, maybe something that used dash as a backend. It does seem like it would really suck to actually use, if my little experience with the Javascript ecosystem is anything to go on. So different perspectives, maybe?
"This happened because the utility used a simple recursive descent parser without backtracking, which gave unary operators precedence over binary operators and ignored trailing arguments."
It is a bit sad that this problem has to be encountered and resolved over and over and over again. Granted, it was a SIMPLE PARSER because resources were extremely tight. Note, I don't mean shells specifically, I mean working on projects that implement parsers. As soon as someone busts out lex/yacc I know I'm going to see at least one "ah, we didn't think of that" issue. Parsers are hard.
While I agree that there can be pitfalls to parsing, most of the time it isn't as bad as this. The "test" or "[" program has a unique challenge in that it never gets to see the quotation marks or variable references, because the shell already expanded them.
So when test sees that one of its arguments is the string "-f", it has no way to know whether that string came from a variable like $var, a quoted string literal like "-f", or just the text -f. Most of the time if you're writing a parser for your own DSL you don't have this problem. You can tell the difference between the string literal "+" and the operator +.
They can be written in any compiled language you like or even an interpreted language, if you are prepared to add the dependency of an interpreter for that language at boot time.
This is a lot simpler than migrating to (and getting locked in to) systemd.
Systemd does a lot of things right: Having one process for the init, declarative definition of common options, streamlined interface. Note that many distros used to provide shell libraries for the same effect, but ended up with incompatible init scripts.
In the end, the init script style is just the bare minimum that works solely by POSIX features and is a PITA when it comes to advanced use cases like dependencies or machine-readable system state or platform independent init definitions. Systemd is certainly not the ultimate tool ever written, but it is a bold step in the right direction. Maybe someone will rewrite it someday (if so, presumably in rust) and I will appreciate the effort to make things even better.
> This is a lot simpler than migrating to (and getting locked in to) systemd
Depends on your perspective - for a developer, “do whatever you want in whatever way you’re comfortable with” is simple. As a sysadmin, I much prefer the simplicity of “there is one way to do it; features like resource limits and automatic restarts work out of the box, and are consistent across all services”.
The flexibility for the author makes it harder for the reader though. I can read a systemd service file from top to bottom and understand everything in a minute or two. That's valuable for me.
No, systemd doesn't really interpret much, unit files are mostly not a full replacement for what init-scripts were capable of. Which leads to lots of Exec=/some/shell/script.sh in unit files which is worse than before because now you have multiple files and more strange interactions.
Is that actually common? I write quite a few systemd unit files, and have yet to need a wrapper shell script. And the vast majority, or maybe even all, of the unit files my system has by default are also not calling shell scripts.
Init scripts were simpler because the boot process was well defined, meaning that you had all your scripts in rc.d and they were executed in that particular order.
Now with systemd it is more difficult to have something start after something else: you have all the dependencies, and if you get them wrong it will not work. Worse, the boot process is no longer deterministic, meaning that it could work 99 times and break on the 100th.
On the other side, systemd is a necessary evil for a modern desktop system, where you have multiple events that could fire (hotplug, network, power events, etc). But not so sure if it's that necessary for a server or an embedded device.
No. First, you usually need both Wants= and After=. Second, that After= isn't really "after" in the traditional sense, because systemd will start A, not wait for A to be up and running, and immediately after the "A has been started" event start B. If you really want A to be available when B is started (which is the traditional init-script sense of "after"), you need to modify software A to signal completion to systemd, or you need some horrible shell-script kludges in the Exec= line of B that check for the availability of A. Or an in-between unit with an Exec=sleep 5 or something.
Claiming "it is just 1 line" is either inexperience or dishonesty.
Wondered for ages whether there is a real situation where this is needed. Good to see this publicly debunked. Now I have a resource to point to when I get the "better safe than sorry" argument.
What's been publicly and thoroughly debunked is the very idea of writing shell scripts in the first place, instead of using a non-ridiculous scripting language like Python, Perl, or JavaScript. If you want to be safe instead of sorry, then never write shell scripts.
People complaining about how underpowered shells are should have a look at Ion [1]. It's mostly the same as bash, but with first-class support for arrays, maps, byte slices, variable scoping, and (primitive) type-checking. Oh, and it's written in Rust.
> Amazingly, the x-hack could be used to work around certain bugs all the way up until 2015, seven years after StackOverflow wrote it off as an archaic relic of the past!
Oof. If people think JS is bad wait until they try to do anything moderately complex in shell script
Yup. These days if it's more than a few lines, I write it in Perl instead if it's for personal use, or Python if it's going to be shared (Python just takes me more effort still).
For all the vaunted Unix Philosophy it's amazing how clunky some of it is. Countless shell scripts still trip over spaces in filenames. Of those that don't, nearly all of them will still be confused by any unusual characters like newlines. If you want something done properly, you have to pull out a proper scripting language, to have readdir() and the ability to pass arguments directly to a process.
Even Perl, which generally excels in such tasks has weird lapses in convenience. You can run a command with one line of code without the possibility of confusion with `system("ls", "-l", $dir)`, but you can't get its output that way. There's no version of `system` that'd allow you to both explicitly specify each argument to the process, and obtain its output. You either use backticks and risk quoting trouble, or need to use `open`, which is a lot more verbose.
It's interesting that it took Microsoft to do a new approach in this regard. PowerShell has its amount of weirdness I really hate, such as that the environment escapes the program into the commandline, but it's really refreshing how it dispenses with brittle grep, awk and cut stuff.
Oh, I know. There's that, and File::Slurp, and a bunch of other stuff one would think would exist from the start, but for some reason don't. But those are all external modules that somebody had to write, that didn't exist at some point in time, and which sometimes one can't use because in some cases you can only rely in what's in core.
I'm just wondering why, bizarrely enough, Larry Wall (or somebody else) found it useful to have a simple, convenient way to execute a command with exact arguments, but not one for when you need its output.
I must have written the same wrapper around open() several dozen times by now, due to needing to write scripts for cases where installing dependencies is undesirable.
> I'm just wondering why, bizarrely enough, Larry Wall (or somebody else) found it useful to have a simple, convenient way to execute a command with exact arguments, but not one for when you need its output.
It had backquotes from the start. So it was as convenient (and as unsafe) as the Bourne shell.
Of course you quickly want more safety and backquotes are just legacy you can never use for serious stuff.
Bad example, yes. Plus it's a pointless thing to do to start with, because you run into trouble with special characters in filenames that way. Got to use readdir.
`open my $fh, '-|', 'ls', '-l', $dir` should do the trick I think, but apparently it only works on platforms with a "real fork" (so fine as long as you don't care about Windows, probably).
The TOPS-20 command interpreter with its regularized parameters and prompting command-line completion and inline help and noise words was so much better designed and user friendly than any of the pathetic Unix shells.
Typing the escape key says to the system, "if you know what I mean from what I've typed up to this point, type whatever comes next just as if I had typed it". What is displayed on the screen or typescript looks just as if the user typed it, but of course, the system types it much faster. For example, if the user types DIR and escape, the system will continue the line to make it read DIRECTORY.
TOPS-20 also accepts just the abbreviation DIR (without escape), and the expert user who wants to enter the command in abbreviated form can do so without delay. For the novice user, typing escape serves several purposes:
Confirms that the input entered up to that point is legal. Conversely, if the user had made an error, he finds out about it immediately rather than after investing the additional and ultimately wasted effort to type the rest of the command.
Confirms for the user that what the system now understands is (or isn't) what the user means. For example, if the user types DEL, the system completes the word DELETE. If the user had been thinking of a command DELAY, he would know immediately that the system had not understood what he meant.
Typing escape also makes the system respond with any "noise" words that may be part of the command. A noise word is not syntactically or semantically necessary for the command but serves to make it more readable for the user and to suggest what follows. Typing DIR and escape actually causes the display to show:
DIRECTORY (OF FILE)
This prompts the user that files are being dealt with in this command, and that a file may be given as the next input. In a command with several parameters, this kind of interaction may take place several times. It has been clearly shown in this and other environments that frequent interaction and feedback such as this is of great benefit in giving the user confidence that he is going down the right path and that the computer is not waiting to spring some terrible trap if he says something wrong. While it may take somewhat longer to enter a command this way than if it were entered by an expert using the shortest abbreviations, that cost is small compared to the penalty of entering a wrong command. A wrong command means at least that the time spent typing the command line has been wasted. If it results in some erroneous action (as opposed to no action) being taken, the cost may be much greater.
This is a key underlying reason that the TOPS-20 interface is perceived as friendly: it significantly reduces the number of large negative feedback events which occur to the user, and instead provides many more small but positive (i.e. successful) interactions. This positive reinforcement would be considered quite obvious if viewed in human-to-human interaction terms, but through most of the history of computers, we have ignored the need of the human user to have the computer be a positive and encouraging member of the dialog.
Typing escape is only a request. If your input so far is ambiguous, the system merely signals (with a bell or beep) and waits again for more input. Also, the escape recognition is available for symbolic names (e.g. files) as well as command verbs. This means that a user may use long, descriptive file names in order to help keep track of what the files contain, yet not have to type these long names on every reference. For example, if my directory contains:
BIG_PROGRAM_FILE_SOURCE
VERY_LONG_MANUAL_TEXT
I need only type B or V to unambiguously identify one of those files. Typing extra letters before the escape doesn't hurt, so I don't have to think about the minimum abbreviation; I can type VER and see if the system recognizes the file.
“I liken starting one’s computing career with Unix, say as an undergraduate, to being born in East Africa. It is intolerably hot, your body is covered with lice and flies, you are malnourished and you suffer from numerous curable diseases. But, as far as young East Africans can tell, this is simply the natural condition and they live within it. By the time they find out differently, it is too late. They already think that the writing of shell scripts is a natural act.”
— Ken Pier, Xerox PARC
The Shell Game, p. 149
Shell crash
The following message was posted to an electronic bulletin board of a compiler class at Columbia University.
Subject: Relevant Unix bug
October 11, 1991
Fellow W4115x students—
While we’re on the subject of activation records, argument passing, and calling conventions, did you know that typing:
!xxx%s%s%s%s%s%s%s%s
to any C-shell will cause it to crash immediately? Do you know why?
Questions to think about:
• What does the shell do when you type “!xxx”?
• What must it be doing with your input when you type “!xxx%s%s%s%s%s%s%s%s”?
• Why does this crash the shell?
• How could you (rather easily) rewrite the offending part of the shell so as not to have this problem?
MOST IMPORTANTLY:
• Does it seem reasonable that you (yes, you!) can bring what may be the Future Operating System of the World to its knees in 21 keystrokes?
Try it. By Unix’s design, crashing your shell kills all your processes and logs you out. Other operating systems will catch an invalid memory reference and pop you into a debugger. Not Unix.
Perhaps this is why Unix shells don’t let you extend them by loading new object code into their memory images, or by making calls to object code in other programs. It would be just too dangerous. Make one false move and—bam—you’re logged out. Zero tolerance for programmer error.
The Metasyntactic Zoo
The C Shell’s metasyntactic operator zoo results in numerous quoting problems and general confusion. Metasyntactic operators transform a command before it is issued. We call the operators metasyntactic because they are not part of the syntax of a command, but operators on the command itself. Metasyntactic operators (sometimes called escape operators) are familiar to most programmers. For example, the backslash character (\) within strings in C is metasyntactic; it doesn’t represent itself, but some operation on the following characters. When you want a metasyntactic operator to stand for itself, you have to use a quoting mechanism that tells the system to interpret the operator as simple text. For example, returning to our C string example, to get the backslash character in a string, it is necessary to write \\.
Simple quoting barely works in the C Shell because no contract exists between the shell and the programs it invokes on the users’ behalf. For example, consider the simple command:
grep string filename
The string argument contains characters that are defined by grep, such as ?, [, and ], that are metasyntactic to the shell. Which means that you might have to quote them. Then again, you might not, depending on the shell you use and how your environment variables are set.
Searching for strings that contain periods or any pattern that begins with a dash complicates matters. Be sure to quote your meta character properly. Unfortunately, as with pattern matching, numerous incompatible quoting conventions are in use throughout the operating system.
The C Shell’s metasyntactic zoo houses seven different families of metasyntactic operators. Because the zoo was populated over a period of time, and the cages are made of tin instead of steel, the inhabitants tend to stomp over each other. The seven different transformations on a shell command line are:
Aliasing: alias and unalias
Command Output Substitution: `
Filename Substitution: *, ?, []
History Substitution: !, ^
Variable Substitution: $, set, and unset
Process Substitution: %
Quoting: ', "
As a result of this “design,” the question mark character is forever doomed to perform single-character matching: it can never be used for help on the command line because it is never passed to the user’s program, since Unix requires that this metasyntactic operator be interpreted by the shell.
Having seven different classes of metasyntactic characters wouldn’t be so bad if they followed a logical order of operations and if their substitution rules were uniformly applied. But they don’t, and they’re not.
[...followed by pages and pages of more examples like "today’s gripe: fg %3", "${1+“$@”} in /bin/sh family of shells shell scripts", "Why not “$*” etc.?", "The Shell Command “chdir” Doesn’t", "Shell Programming", "Shell Variables Won’t", "Error Codes and Error Checking", "Pipes", "| vs. <", "Find", "Q: what’s the opposite of ‘find?’ A: ‘lose.’"]
My judgment of Unix is my own. About six years ago (when I first got my workstation), I spent lots of time learning Unix. I got to be fairly good. Fortunately, most of that garbage has now faded from memory. However, since joining this discussion, a lot of Unix supporters have sent me examples of stuff to “prove” how powerful Unix is. These examples have certainly been enough to refresh my memory: they all do something trivial or useless, and they all do so in a very arcane manner.
One person who posted to the net said he had an “epiphany” from a shell script (which used four commands and a script that looked like line noise) which renamed all his '.pas' files so that they ended with “.p” instead. I reserve my religious ecstasy for something more than renaming files. And, indeed, that is my memory of Unix tools—you spend all your time learning to do complex and peculiar things that are, in the end, not really all that impressive. I decided I’d rather learn to get some real work done.
I can excuse shell languages because they're optimizing for being fast to type in a command line, while also being used for writing programs. But other than that, I weep when I think of the time being wasted on languages with arcane syntax. I wish the world would just switch over to s-expressions.
I understand that many people will find well thought out syntax cleaner than writing trees. But what languages like Bash or CMake, and arguably JS too, have in common is that their syntax wasn't thought out, it grew organically into a mess. S-expressions eliminate this entire class of problems up front.
To me the question always was: what's the point of [ba]sh for non-interactive shell scripting, when there is perl (I'm aware that it has issues with versions, portability to non-PCs, etc, so think "theoretical standardized lightweight perl mode for sh", not the real one). Shell syntax and semantics are a pile of ugly hacks and landmines, which exist because everyone wanted to program at the CLI. When you want to program, just take a programming language, not a poor excuse for one. The amount of time you end up wasting with bash is much greater than the time spent reading perldoc *, and mastering bash never returns the investment.
My second task at my first job was to connect up a horrible mess of bash, sed, awk, python and R to use a web service as input instead of local files. I ended up rewriting the whole thing to python. It was a good decision.
Definitely quote. `[ x$X == x"hello" ]` may fail for a variety of different values of X like `*`, `Rocky_[1976].jpg`, `hello world` and `foo = xfoo -o bar`
Quoting works in all those cases.
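For the record, the quoted forms being referred to (a sketch):

    [ "$X" = "hello" ]    # fine for every value listed above: no globbing, no word splitting
    [ "$X" == "hello" ]   # == also works in bash's builtin [ , but = is the portable spelling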
The GitHub docs speak of being able to do single donations but I’ve not seen the option yet. Is it something you need to explicitly configure as the one being sponsored?
Thanks for that tip, I've benefitted from Shellcheck a bunch of times so I've just sponsored and wouldn't have without your comment.
I've puttered along for years writing little bash scripts for one thing or another and always had the uneasy feeling that I was littering them with errors.
Shellcheck's linting showed me (a) that I was right and (b) how to fix them. It's made me a better developer and I'm so grateful for it!
POSIX shell is a bit of a quirky language, and bash adds a few quirks of its own, but you can write reasonably solid shell code if you approach it like any other programming language: learn it (read guides, read the man page), choose a style guide (or write your own), use static analyzers (shellcheck is a great tool), test your code on different inputs, and ask for code review.
For some reason it is commonly believed that writing shell is so easy that one doesn't need any of this. Doesn't even need to learn it.
1. Yes, but there is little reason to use brainfuck in production code, whereas there are reasons to use POSIX shell: (1) it is available on practically all UNIX-like systems, from NetBSD to Ubuntu, in their default installation (no additional dependencies); (2) for small scripts it can run faster than Python/Ruby/Perl thanks to its small startup time; (3) it is a handy tool when all you need is a small script that mostly runs other commands/CLI tools (but you still have to spend at least a little time learning it first).
Maybe the availability of the shell is an unfortunate historical artifact, yet it is still a reason why one may want to use it.
2. The idea of imposing elegance upon users is attractive, but even if we ignore that it makes system/language design harder, different people can have different opinions on what is elegant and what is not. Having said that, I like Go (which tries to enforce what can be enforced) because the opinions of its authors are not too far from mine. But not everybody is happy with Go, and it is not hard to find criticism of it.
To compare, say, how language syntax/features affect the number of bugs, questions, etc., you have to hold everything else equal, and that is very hard, if possible at all, in the real world.
I agree. However, we had to create complex C "standards" for dos and don'ts in order to _try_ to write safe code.
And C is still omnipresent. So there must be some other characteristic of Bash that makes it not fit for this age (or some characteristic of C that makes it irreplaceable, that Bash does not have)
Bash has multiple better solutions to this. This is a really really really old hack. New code that uses it is either trying to be backwards compatible to the extreme or is more likely just someone unthinkingly copying what they’ve been taught or seen.
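For example (a sketch): bash's [[ ]] does no word splitting and handles an empty or dash-leading value safely without any x-prefix, and in POSIX sh plain quoting is enough on any non-ancient test/[ implementation:

    [[ $var == hello ]]     # bash: safe even if var is empty or starts with -
    [ "$var" = "hello" ]    # POSIX: quoting alone does the job on modern implementations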
What's a decent replacement shell? Sometimes I think there should be one, with all backwards compatibility for if/looping thrown out, but still adhering to the core concept of mostly being made of a handful of builtins and otherwise executing commands (i.e., without having to type system("") or exec("") etc. to execute commands).
tclsh is very nice for scripts - it's a first-class part of TCL which is a proper grown-up programming language, but it pulls it very much into being a shell, and it's been around for decades so there's a decent chance that any old unix system will have it installed.
Does it have to be a shell? I would be fine with having bash as a shell for basic stuff like piping and a non-interactive scripting language for actual scripts.
Others I've not used are Oil shell, which aims to be a better Bash, and Elvish, which has a heck of a lot of similarities with my own project (and is arguably more mature too). There are REPL shells in Lisp, Python and all sorts too.
So there are options out there if you want a $SHELL and are happy to break from POSIX and/or bash compatibility.
Seconding PowerShell. Or some future development that files off the rough edges, but otherwise follows the same sane principle of piping objects instead of unstructured text :).
Command/type aliases and autocompletion make this a non-issue on the console and advanced code editors. Personally, I don't mind the verbosity even when using plain editors, like Notepad2, my text/script editor of choice.
Extra-long command names and deeply namespaced .Net types can be a chore both to read and write, but I find the *nix tools way too cryptic. Never could get used to them, but then again, I've been on DOS/Windows all my life since the 90s.
To be honest, I'd occasionally get frustrated and think about switching to Linux. However, Python and PowerShell are the two things that have kept me on Windows. The fact that they are both open-source and cross-platform is just the cherry on top.
First, I like PowerShell - sufficiently much to have used it as a login shell _on macOS_ for a while.
However, you grossly overestimate the “awesome”. To answer each “angry dude” point in turn:
- The verbosity is fine - in scripts, one should fully expand command names. Interactively, `gci` is fine (or the `ls` alias, though some of the default aliases are travesties).
- The startup time is insane. I start a new terminal window hundreds of times per day, and the startup time is what made me abandon POSH.
- Apache 2 licensed, so I don’t care. I do default to assuming that MSFT are doing something shady though, and I’m right [1].
- RTFMing is perfectly fine (and Get-Help has high quality documentation for built-ins), but one must accept that most users will not. ISE (if it existed outside of Windows) was a nice workaround for needing this.
- Python, Ruby, Go, Node and Perl are all more portable (see my last comment about FreeBSD support).
As an interactive shell, nushell [2] is probably the closest thing to the PowerShell experience which is not tied to .NET Core.
So your answers just confirm my 'angry dude' bullet list.
As far as I can see, the only really problematic one is startup time, which is surely something that will be resolved in the future. I witnessed this only rarely while I use PowerShell on Windows and Linux - it was on windows and it was on VM with very slow disk, and some modules that each keep functions separated by files.
I welcome nushell development, but you can't really compare it to pwsh - even if they continue developing it, its decade or so from being usable as main shell.
> you grossly overestimate the “awesome”.
Nah, its awesome. I save bunch of time any time I open the PowerShell. And any time when one tries to be quicker in some other language one is not, eventually.
In that case you cannot see sufficiently far: telemetry on by default is a non-starter. If the startup time were going to be fixed “real soon now”, it would have improved since 2006. It has not, especially when basic usability enhancements like “posh-git” are enabled.
> [nushell is a] decade or so from being usable as a main shell
Sure, you can replace it with LOLCAT either, its totally legit.
Regarding startup time, I can't believe there is such drama. If that troubles you so much, you can easily workaround it by always keeping powershell process pool ready in the background (1 or 2 suspended instances until you use them).
> you can easily workaround it by always keeping powershell process pool ready in the background (1 or 2 suspended instances until you use them).
This is a pretty absurd take when the discussion is about alternative shells for writing scripts. Shell scripts are everywhere. A fairly heavily used UNIX box might easily execute hundreds of shell scripts every minute. A startup time of several seconds makes it completely unusable.
Not sure what to tell you to break your bubble. I have literary thousands of pwsh scripts running in critical gov projects in production, most of them as part of CI, CD, build and automatic test but some of them serving millions of users. I had dozen of such projects in previous decade on Windows and several on Linux and never had any significant problem with slow startup really. Sometimes it happened that some machines had slow startup but team usually fixed that one way or another.
Pattern of usage of bash scripts where hundreds of shell script run every minute is IMO not something you should brag about in systems architecture. And pwsh is different - it has modules which are encapsulated and you don't have to start another instance of shell to prevent mingle. Parallelism ofc still benefits from faster startup.
> I had dozen of such projects in previous decade on Windows and several on Linux and never had any problem with slow startup really.
OK. So then startup time is not slow? Well, then it's of course a non issue - but then I also don't see why a process pool would be required.
> Pattern of usage of bash scripts where hundreds of shell script run every minute is IMO not something you should brag about in systems architecture.
Yet, you just did :) Anyway, it was hardly intended as bragging, rather than just stating facts. Lots of stuff are shell scripts. Lots of user facing commands are wrapped in shell scripts. Of the ~3k commands in my $PATH, around 500 are shell scripts. You may not think this is a good state of affairs, but it's still a fact. (And it's quite unclear to me what's so bad about it - an excellent use for scripts is providing environment for other programs.)
> OK. So then startup time is not slow? Well, then it's of course a non issue
Its not as fast as bash or cmd but I don't start hundreeds of instances of shell
> but then I also don't see why a process pool would be required.
If you have habit of starting shell every ten seconds or so and equally fast close them, I guess powershell will not be your friend. Bad habit to have anyway IMO, because there are few tools that support that workflow.
> Yet, you just did :)
I never said I run bunch of instances of pwsh - I usually run couple only and within them, everything happens. Although pwsh starts in 300ms on my machine.
> And it's quite unclear to me what's so bad about it
The bad about it is jungle of bashizms and next to 90% of statements which have nothing to do with actual business logic but are there for text massage and parsing.
You can't really compare bash and pwsh tho - you have entire dotNet ecosystem there compared to bash where you need jungle of tools. You can do almost ANYTHING in pwsh without relying on external tools. That is consistency - on different flavors of linux, some tools are there, some are not, and some are just different, even the basic ones such as stat. I get nightmares from the fact that grep and sed and awk and perl and rg and whatever each use their own flavor of reg ex... Its just insane :)
pwsh is more comparable to mainstream languages like go or python. What separates it from them is that it is designed for shell. With that power, I can understand a little bit of slower startup time, but I am sure it will be solved eventually.
> Bad habit to have anyway IMO, because there are few tools that support that workflow.
This is absolutely false. Powershell and the Windows console prompt do not support this workflow. Every other shell/terminal emulator combination I can think of supports it fine.
Perhaps you can elaborate on why you consider it to be a "bad" habit?
The article only considers quoted arguments. Quoting should be recommended for test, and OP probably even has a formal check for quotes being in place. However, let's just set that aside for now. Quotes are optional. And if you don't put quotes, then yes, you need to use `[ x$var = xval ]` to address empty variables.
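Roughly, the empty-variable case the x prefix is meant to paper over looks like this (a sketch):

    var=""
    [ $var = val ]     # expands to: [ = val ]    -> "unary operator expected" or similar error
    [ x$var = xval ]   # expands to: [ x = xval ] -> valid, evaluates to false
    [ "$var" = val ]   # expands to: [ "" = val ] -> also valid in any POSIX test, evaluates to false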
Ah, upon rereading it turns out it is still an issue:
> ... would fail to run test commands whose left-hand side matched a unary operator.
> This was fixed [...] in 1979. However, test and [ were also available as separate executables, and appear to have retained a variant of the buggy behavior:
Any time I have to do any serious shell scripting at work, I die a little inside. I do need to try out shellcheck thoroughly, it might catch some bugs in our pipelines
Honestly, the only practical solution to this in my experience is to just buckle up and spend the hours/days it takes to learn the dirty corners of Bash properly. This doesn't mean you have to learn 100% of it, but it does mean you need to learn enough of it that will (a) serve your needs comfortably (instead of making you constantly have to hack around), and (b) give you insight as to what subset(s) to avoid unless absolutely necessary. Same goes with other tools, whether with warranted or unwarranted complexity (like git, Bazel, C++, etc.). Yes it's painful while you're still learning, but it's an investment that makes you more productive and just makes everything so much more tolerable afterward. If you never spend the time then you'll keep dying inside every time you have to deal with tools you don't like.
I also recommend the man pages for the dash shell. They are the first reference I look at when I need to remember how something works in POSIX shell, without bashisms.
Personally, I learnt my bash by reading a lot of Gentoo EBUILD scripts and config files. I can recognize other Gentoo users just by how many curly braces and double quotes I see in a simple one liner.
The one resource that I bookmarked decades ago and would highly recommend to get the ball rolling is the TLDP pages on bash:
I wish I had a great answer, but I don't. Maybe these will help, though.
What I've done is, every time I came across something that didn't behave the way I expected, I tried to Google it and tinker around with it until I could understand exactly what is going on, so that I "got" the underlying feel for the language and didn't have to look up every consequence every time. e.g., the difference between [ ] and [[ ]], which is confusing at first, becomes more intuitive once you realize [ is a portable POSIX command whereas [[ is special Bash syntax, and once you grasp this, you can understand what its implications are. e.g., that Bash has no way to detect '=' as a special token inside [ ], and consequently you should worry about ambiguities. (This does assume a little background knowledge of parsing ambiguities; if you haven't taken a course related to programming languages, going through one might also be a worthwhile investment.)
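For example, a quick sketch of that [ vs [[ difference:

    var='hello world'
    [ $var = hello ]     # [ is an ordinary command: $var is word-split first,
                         # so test sees four arguments and errors out
    [[ $var = hello ]]   # [[ is shell syntax: no word splitting, just false
    [[ $var = hello* ]]  # and the right-hand side can be a pattern: true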
Also, you have to be on the lookout for things that you would want to do in a different language, and learn how to do it in Bash, like creating functions & local variables. It's mostly a matter of being proactive about learning how to structure your code like you would in other languages. Often I see people try to keep things "simple" by just copy/pasting code so that they don't have to organize things at a higher level. You wouldn't do that in other languages; don't do that in Bash.
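As a sketch of what that structure can look like in Bash (the names and the task are made up):

    #!/usr/bin/env bash
    set -euo pipefail

    # log to stderr so messages don't pollute captured output
    log() {
        printf '%s\n' "$*" >&2
    }

    # count_lines FILE: print the number of lines in FILE
    count_lines() {
        local file=$1    # 'local' keeps the variable out of the global scope
        wc -l < "$file"
    }

    main() {
        local target=${1:-/etc/hosts}
        log "counting lines in $target"
        count_lines "$target"
    }

    main "$@"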
Some pitfalls you really just do have to learn the hard way; there's no other way I can think of. The key here is to spend the extra minutes necessary to actually learn what broke when you come across it, instead of just tinkering with your code until it works. Like when you realize that (set -e; false && false; echo hi) prints 'hi' despite the fact that (set -e; false; echo hi) does not (this is absolutely bonkers), dig in and figure out what's going on so you don't shoot yourself in the foot the next time. You can't avoid it the first time, but you can at least make sure to avoid it in the future.
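If it helps, that behaviour is easy to reproduce directly:

    (set -e; false; echo hi)            # prints nothing: false fails and -e aborts the subshell
    (set -e; false && false; echo hi)   # prints "hi": the left side of && is exempt from -e,
                                        # and the right side never runs at all
    (set -e; if false; then :; fi; echo hi)   # prints "hi": conditions are exempt too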
Also, always try to think of unintended consequences when you learn how to do something in Bash. Like when you learn how `trap 'rm -f file.bak' EXIT` lets you clean up a file, ask yourself what happens if someone has already set a trap? Ask yourself how to dynamically generate and pass a file name to that command safely? Ask yourself in what scope/context the command is evaluated? Spend some time figuring these out either online or by tinkering manually, then remember your solution for later (maybe write it down). Possibly the most costly mistake you can make is to assume something will work in exactly the same way you expect in every other language, because chances are pretty darn good it won't. Even pipelines (which you probably already use) are tricky if you start thinking about them... e.g., what happens if the first command terminates prematurely? What happens if you assign to a variable? etc.
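As a sketch of the kind of defensive version those questions push you towards (mktemp isn't POSIX, but it's widely available):

    #!/bin/sh
    # create the temp file first, then register cleanup
    tmpfile=$(mktemp) || exit 1
    # note: this replaces any EXIT trap set earlier in the script,
    # which is exactly the gotcha mentioned above
    trap 'rm -f "$tmpfile"' EXIT
    trap 'exit 1' INT TERM    # make the EXIT trap run on interrupt/termination too

    # ... do the actual work with "$tmpfile" ...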
The POSIX pages (OpenGroup, first link) are useful for understanding what you can rely on; I'd use that as a command reference whenever it's possible. In some cases they're overly pessimistic (e.g., you can probably rely on 'local' despite it not being POSIX), but generally they're good starting points.
tldp is where I still look up some things, like the heredoc syntax. I've probably looked that up 20 times and I still don't remember it well enough for some reason.
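For reference, the variants that trip me up too (just a sketch; when you actually use these, the closing delimiter has to start at the beginning of the line unless you use <<-):

    # plain heredoc: $variables and $(commands) are expanded in the body
    cat <<EOF
    home is $HOME
    EOF

    # quote the delimiter and the body is taken literally
    cat <<'EOF'
    literal $HOME, nothing expanded
    EOF

    # <<-EOF additionally strips leading tab characters from each body line,
    # which lets you indent the body inside functions or loops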
And it might be fun to just browse through the BashPitfalls page once you have some comfort with shell scripting just to get an idea of what can go wrong, but I wouldn't really recommend it as a learning guide, and you definitely shouldn't expect to memorize all of it.
Shellcheck is immensely helpful and I highly recommend incorporating it into your editing workflow (that is, lint as you go with a language server) before it hits the CI/CD pipeline.
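For what it's worth, the CLI alone is easy to wire in even before editor integration (the file names here are made up):

    # lint a single script
    shellcheck deploy.sh
    # follow source'd files and treat the script as POSIX sh rather than bash
    shellcheck -x --shell=sh deploy.sh
    # lint every tracked shell script in a repo, e.g. as a CI step
    git ls-files '*.sh' | xargs shellcheck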
(ba)sh scripting has so many footguns it astounds me that so much complicated code has been written (more-or-less) successfully in it. Once I reach more than a few dozen lines I usually hit the escape hatch and reach for a more powerful scripting language (you can always still shell out to utilize handy CLI tools if you need them).
On the other hand bash is one of my favorite languages to develop for. It took many years to get proficient and you shouldn't try to do anything too complex but it has good sides:
1. The REPL is top notch.
2. Good documentation. Most commands come with a complete manual and examples are easy to find.
3. It's portable and has lots of available APIs, probably even more than C.
Yeah, but I think it is good to do this kind of work because it reminds us of what can be done with an interpreter that is less than 1MB. Time and again I've heard folks on HN say, "Why aren't shell scripts just written in Python?" On the one hand, I think it is great to think in "bare metal"; on the other hand, do we really need to think in "bare metal"? Shouldn't we free our brains from solving (and re-solving) 40-year-old problems? I dunno, my point is: I feel that "little death" you describe too. (No, my French friends, not that petite mort!)
I started using it about a year ago (I wasn't aware of it before that) and it's astonishing the number of things it can pick up on in even a five line script.
Often they're things you will get away with most or even all of the time in a specific scenario, but there they lurk, waiting to blow your leg off when you start including spaces in a filename or somesuch!
I wrote The newline Guide to Bash Scripting[1] (released today!) with users like you in mind: already familiar with programming, needs to use Bash, but don't want to invest hundreds of hours to understand how to list files, use redirects, handle arguments, and so on.
"Any time I have to do any serious shell scripting at work,"
Right when this happens, that very second, remember there are hundreds of other languages that can do anything bash can in a readable fashion, e.g. os.system() in Python. If you are writing more than 3 lines of bash you are doing it all wrong.
When I need a shell script, I write a PHP script instead.
(PHP has its oddities, but it has many benefits over Bash, including an optionally strictly-enforced type system, binary-safe strings, clear C-like syntax… and like Bash it has backticks!)
In my book, with zsh being a very popular shell and this having been a bug as recently as 5 years ago, I'd almost lean towards recommending it rather than recommending against it. It seems like a relatively low-impact fix that improves portability.
Seeing or typing x$foo is a reminder that shell scripts are unsuited for anything more than a few lines long. If you're tempted to switch to [[ ]], just switch all the way to a real programming language.
The only good part of sh/bash scripting is running commands with piping/redirection support, IMO. Other than that, I see posts like this explaining sh/bash puzzles and pitfalls all the time.
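To be fair, that part is genuinely convenient - a throwaway pipeline like the following (paths invented) takes far more ceremony in most general-purpose languages:

    # ten most frequent ERROR lines across an app's logs, with counts
    grep -h 'ERROR' /var/log/myapp/*.log | sort | uniq -c | sort -rn | head -n 10 > report.txt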