How to do things safely in Bash (2018) (github.com/anordal)
173 points by soheilpro 74 days ago | 93 comments



I've recently taken to not bothering with shell scripts at all. I just use Python instead. The type system, while it's by no means state of the art, is still miles ahead of the stringly-typed bash. The libraries are excellent; much can be done with the standard library alone. The scripts are much faster (even though the language isn't particularly tuned for performance), since I don't need to spawn a subprocess for each little string manipulation task. And if I ever need to ‘shell out’ to an external process, there's always the subprocess module that is much more robust than ‘set -e’ ever was.

Seriously, if you can avoid it, just don't write shell scripts at all. Use a real programming language instead.


This, and I mean very specifically this with python always as the go-to replacement, comes up every single time anyone even utters the words "shell" or "bash" and every time all I can think is that the people saying it aren't writing the same kinds of shell scripts I ever see, use, or wind up writing.

And every time I've ever seen something that's allegedly "a shell script replaced with a python script" it's way more complicated than an equivalent shell script would have been.

I just wonder, really, what "a shell script" even is to you if not something where you need to "shell out to external processes"?


I wonder too. Things like file and directory manipulations are way more complicated in native Python and I find the exceptions harder to troubleshoot than native commands. I'm not even going to get into pythons version and dependency hell, suffice it to say I have more confidence in a standard version of bash and standard utilities generally being present on a modern Linux system. What I've seen from pythonistas as reasoning for this replacememt love is usually one or a combination of a few things - a claimed write once run anywhere ability that I think is generally fantasy, a belief that Python is a "real" language and shell is not which usually just means "I don't like shell or think it is unfashionable", or a belief that Python code can be made more reusable. The person in question almost always has a background of feature developer that is now doing devops type tasks.

All that said I like python and there are tasks I absolutely prefer it for, usually anything to do with scraping some web API that returns a complex piece of JSON or xml.


Here is a novel idea.

I am allowed to call 'mv' or 'cp' binaries from bash.

I am also allowed to call 'mv' or 'cp' binaries from python with subprocess!

(Edit: bad example binaries, 'df' or 'tar' would be probably better.)

With bash, I must do without libraries beyond the builtins (there are none to install).

I can choose not to install python libraries beyond builtins.

It becomes a really simple tradeoff.

Nobody forces you to step out of python's stdlib. No extra dependencies. And you can still reap benefits of a proper programming language.


It's worth noting that there are still a shocking number of production machines in the world that don't have stable installations of python3.x yet.


So? Python 2.x is just as good for this, and is pretty widely available. Besides, I don't write scripts for every production machine in the world, just a subset that I have access to.


So, a bash script written ten years ago probably still works on any version of bash since then.

A python script written ten years ago probably doesn't even parse on python3.0, let alone python3.6 (because there have been backwards incompatible changes since 3.0 even!).

And that's also true vice versa.

So like, the question is whether there's some "reasonable subset" of python you can use to make it portable (i.e. no external packages, so you don't need to worry about pip vs. setuptools or venvs or whatever, let alone whether they'll build). I'm asserting that there is not.


Not too long ago I discovered that a tiny temporary shell script I gave to a team to validate Akamai cache was still being used nine plus years later. It was supposed to be a short term work around until the UI team created a proper validation. They never did and somehow my script kept working for almost a decade. The only reason I found out was due to a minor change in our UI that broke the manifest lookup url pattern.


Again, so? Maybe you should update your scripts more than once every 10 years?


It is actually OK to call whatever one needs; just minimize the usage of Bash language features (Python provides most of them). It is not complicated and requires no dependencies; here's a self-contained helper: https://gist.github.com/fillest/8d64f8fa0cdb1745bfc9c683cf39...


Great, would that work with pipes and redirections too? At that point you might as well use sh[1] which is a more advanced abstraction. My issue with these approaches is that you'll inevitably hit a bug or limitation where just using a shell language would've been easier to maintain and more portable.

Python has its uses, but replacing shell scripts is not one of them IMO. If it was, then the problem didn't need to be a shell script to begin with (doesn't shell out often, needs complex data structures, etc.).

[1]: https://amoffat.github.io/sh/


Python's way of executing a command is somewhat of a nightmare. There are many, many ways to do the same thing, and some are safer than others. Some are better for capturing output, while others are better for just executing and getting a return code.


> file and directory manipulations are way more complicated in native Python

Have you looked at the pathlib[1] module that was added in Python 3.4? It's fantastic. Makes things both more convenient and more correct, in my experience, and it lets you manipulate Windows paths on Unix and vice versa.

[1] https://docs.python.org/3/library/pathlib.html
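A small sketch of the kind of thing it tidies up (the directory here is made up for illustration):

    from pathlib import Path

    logs = Path('/var/log/myapp')   # hypothetical directory
    logs.mkdir(parents=True, exist_ok=True)
    # list() materializes the matches before we start renaming
    for f in list(logs.glob('*.log')):
        f.rename(f.with_suffix('.log.bak'))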


Hum, no: those people are probably writing the same kind of scripts you do. Their Python versions are indeed longer and more verbose.

I always recommend Python anyway if you care about edge cases (which is not always). That longer script handles them in the obvious way, while the short and more readable shell script does something absolutely crazy every time a detail differs from what was planned. And if you try to correctly handle the edge cases in Bash (you'll fail), you'll get something much more complicated than the Python version anyway.


Which of these is simpler?

Rust:

    fn add(a: i32, b: i32) -> i32 {
        a + b
    }
Python:

    def add(a, b):
        return a + b
Arguably the Python function is "simpler" because it occupies fewer characters. And yet, the Rust function is more tightly specified; it does less. It can't throw exceptions, unlike the Python function. Invalid programs where you pass the wrong type to the Rust function won't even compile, whereas the Python function will blow up at runtime.

I figure that these supposedly "simpler" Bash programs are not simple at all.


Perhaps sometimes. But properly handling errors around process management in python is not simple either, even with libraries like subprocess. My experience is that when python scripts doing a lot of process management fail they fail in far more complex and hard to diagnose ways than shell scripts do (especially bash scripts with `set -e -o pipefail`).

And like, `subprocess.run()` defaults its check argument (the one that makes it throw an exception on a non-zero exit code) to False, which means that getting the same kind of behaviour as `set -e` in python means making sure you always pass check=True to every invocation, afaik.
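For concreteness, a sketch of the checked style (assumes Python 3.7+ for capture_output; the command is just an example):

    import subprocess

    # check=True raises CalledProcessError on a non-zero exit code,
    # which is roughly what set -e gives you for a single invocation
    result = subprocess.run(['git', 'status', '--porcelain'],
                            check=True, capture_output=True, text=True)
    print(result.stdout)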

Sometimes added complexity can introduce classes of errors.


I'm a little lost... neither of those functions is doing something you would be expected to do in a shell script.

The upthread point was that shell scripts are for coordinating the actions of separate programs run via their command lines and simple IPC like pipes. If that's not what you want to do, shell is the wrong language. If that *is* what you want to do, then you'll find Python and Rust are pretty severely handicapped.


What I was trying to say is that bash scripts are superficially simple but their behavior is hard to reason about.

I chose Rust for my example because its addition operation is tightly specified and thus the behavior is very easy to reason about. (I probably should have chosen a language other than Python for the counter example.)

As soon as shell scripts get mildly complex, like others in this thread I prefer to port them to Python.

Since I'm a FOSS programmer, I've often dealt with the misery of non-portable shell scripts. Userland incompatibilities across operating systems mean that the same bash program behaves differently because the userland programs bash calls behave differently.

In contrast, reasoning about the portability of a Python script is much easier than predicting which tangle of flags in a shell script will do the right thing.


> What I was trying to say is that bash scripts are superficially simple but their behavior is hard to reason about.

But the counterpoint is that this is dependent on situation, and if the behavior you're trying to "reason" about is better captured by the operation of tools like "find", "objcopy", "openssl", "nc", "rsync", etc... than it is by a Python library-based attempt at the same stuff, then putting that behavior into a shell script is a net win regardless of "complexity".


Oh, I'm really not comparing it with Rust. The comparison is between a general language and a shell. Whether it's Python, Rust, PHP, Java, or whatever is of secondary relevance.

I've done some ops work in C, Python, and C#. I'm also waiting for a good problem to try some Haskell shell monad. All the same old language pros and cons apply.


I'm just making a general argument about what "simple" means. A well-specified program may or may not be superficially simpler, but it will exhibit simpler behavior.


Oh, ok. The Rust version is at least as simple as the Python one, depending on what kinds of detail you care about.

The thing about shell code is that there are plenty of cases where you don't care about things like error management, bad inputs, etc. So it's not immediately obvious which language is simpler. When you do care about those, the shell becomes the most complex choice by far, and that's the duality problem that generates all those different opinions you see around.


> I just wonder, really, what "a shell script" even is to you if not something where you need to "shell out to external processes"?

A replacement for a shell script may not itself be a shell script.

A fairly typical python-as-shell-replacement of mine, which (among other things) creates/manipulates test resources in AWS corresponding to git feature branches, shells out less than it would if it were written in a shell language (using boto3 rather than shelling out to the AWS CLI, etc.), but still shells out to git. But it would be replacing a shell script even if I had a git-repository-interaction library that eliminated the need to shell out.
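Not the parent's actual script, but a sketch of the shape described (the branch-to-bucket naming is made up, and it glosses over AWS region and bucket-naming rules):

    import subprocess
    import boto3

    # still shelling out for the git bits
    branch = subprocess.run(
        ['git', 'rev-parse', '--abbrev-ref', 'HEAD'],
        check=True, capture_output=True, text=True).stdout.strip()

    # boto3 instead of shelling out to the AWS CLI
    boto3.client('s3').create_bucket(Bucket=f'test-resources-{branch}')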


I mean I'd agree that's a good candidate for replacement; once you start doing anything more complicated than maybe idempotent curls to an external service I think it ceases to be a 'shell script'.

This may sound a little like "no true scotsman" I guess, but "what is the scotsman" is kind of the key issue that I'm saying gets glossed over when people say blanket things like "never write shell, always use python."


While I’m not sure I’d agree with defining the scope of things for which an actual shell language is ideal as inherently “shell scripts” (for a few reasons, including the evolving variety of shell languages, the fuzzy boundaries of that set, etc.), I think it is probably the case that the choice between bash (or powershell or any other shell language) and Python isn’t as simple as “always use Python”, and that there is plenty of space for discussion around which tasks each is appropriate for, with the answer differing based on which shell languages you are looking at.


This (and a loooong history of similar posts) is the post that I was originally responding to:

> Seriously, if you can avoid it, just don't write shell scripts at all. Use a real programming language instead.

This may be slightly weaker than "never use shell" or "always use python", but it's pretty close to the former (you can almost always "avoid" using shell somehow), and people seem to believe that python is somehow uniquely suited to shell tasks, so I think some degree of the latter is implied.

So... I agree with you? I am saying that there is room for discussion around what constitutes a good use of shell script, and that there is a space in which that is true. I'm suggesting that a lot of people who believe otherwise may just mostly be writing things that aren't in that space, and I think this is actually a pretty charitable argument tbh.


As someone who has gold in Bash on Stack Overflow, I can say this with confidence: Bash is just a shitty programming language.

If you are writing anything larger than about 60 lines, don't use Bash. Almost anything else will be better. Python, PHP, Ruby, JavaScript, even Go.


True. However 60 lines of bash would amount to 300 lines of Python.


Possibly, not necessarily, and I'd argue it's for the better. If a bash one-liner requires a wizard to decipher (especially because people way overuse short options over long ones in scripts), and 10 lines of equivalent Python can be understood by someone who has never actually written any Python, I know what I'd pick.

This also helps in eradicating bugs. Bash is very hard to get right (hence this thread in the first place), even for the simplest stuff. This simply won't happen in Python (or any other proper language).


I agree, but I use Powershell rather than Python. (I've actually written Powershell scripts that implement pieces in Python.) Because gluing together bits of Powershell and Python is still better than trying to write it in bash.


Same. I feel like I spend 45 minutes every time I write a shell script to determine simple things like "is this environment variable set". Stack Overflow always has 100 answers on how to do this, and every one has someone replying "well that doesn't work if..."
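For the record, the test that tends to survive those "doesn't work if..." replies is the ${VAR+x} expansion (it also shows up further down the thread):

    # ${VAR+x} expands to x only if VAR is set, even if set to the empty string
    if [ -n "${VAR+x}" ]; then
        echo "VAR is set"
    else
        echo "VAR is unset"
    fi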

In general, after 20 years of using Unix, I'm starting to sour on the "everything is a string" and "streams are just newline separated plain-text records". The shell is a pretty good REPL, and glue like seq, find, xargs, grep, etc. are very good. But honestly, I find myself reaching for purpose-built tools rather than ad-hoc pipelines more and more. fdfind does more of what I want on average than find; rg does more of what I want than 'find | xargs grep'. I also find myself using more and more structured data, and write more jq pipelines than I do bash pipelines.

Finally, I am getting more and more frustrated at basic shell mechanics like history handling. I have 5 terminals open on average, and I don't understand why C-r in one won't find history in another (I know why, of course, but I don't like it).

I'm getting pretty close to just writing my own shell and toolchain. I feel like if the Unix shell needed to be fundamentally reimagined, someone would have already done it. But they haven't. They just hacked Unicode and remote RPCs into shell prompts so that pressing "enter" on an empty shell prompt takes 25 seconds to run. I'm not sure why people are so focused on the polish and not the core inadequacies, but they are, and I feel like I'm going to have to fix that myself in the next couple years...


Have you tried any of the shells here?

https://github.com/oilshell/oil/wiki/Alternative-Shells

Discussed recently: https://news.ycombinator.com/item?id=26121592

A lot of them are "fundamentally reimagining" shell -- actually I'd say MOST of them are; whether that's good or bad depends on the user's POV.

Oil is reimagining shell, but also providing a graceful upgrade path. Out of all the shells I'd say it's most focused on the fundamental language and runtime, and less on the interactive issues (right now).


I couldn't agree more!

> I feel like if the Unix shell needed to be fundamentally reimagined, someone would have already done it. But they haven't.

AFAICT this has a lot to do with the fact that most people who'd be able to "reimagine" / fundamentally improve the Unix shell are also the people that are so intimately familiar with the shell and all its quirks that they don't see any need for improvement. It has always been that way, so it must stay that way.

> I feel like I'm going to have to fix that myself in the next couple years...

Please let me know when you do. :)


I agree with some of your frustrations, but it's honestly not that bad once you set up your environment properly and get used to some of the quirks.

> I feel like I spend 45 minutes every time I write a shell script to determine simple things like "is this environment variable set".

Adopt some best practices[1] and have a template to start new scripts with. ShellCheck also helps with avoiding the most common pitfalls.
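For reference, the strict mode from [1] boils down to a three-line prologue:

    #!/bin/bash
    # exit on errors, on unset variables, and on failures inside pipelines
    set -euo pipefail
    # word-split only on newlines and tabs, not on spaces
    IFS=$'\n\t'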

> I find myself reaching for purpose-built tools rather than ad-hoc pipelines

It helps to distinguish day-to-day interactive work in the shell from writing portable shell scripts that will be robust and require little maintenance for years to come. For the former you want user friendly tools that help you the most with a particular task. For the latter you want to use battle-tested tools with a stable interface that are available on many systems or are preferably built-in to the shell.

> I am getting more and more frustrated at basic shell mechanics like history handling

That's mostly a solved problem in many shells. Zsh supports it natively[2].

I wouldn't want to discourage you from writing your own shell (good luck with that!), but there are plenty of good POSIX compatible and alternative shells out there. We can agree that "real" programming languages are much more powerful, safer and friendlier, but plain old shell scripts are the right tool for the job in many situations and shouldn't be avoided at all costs.

[1]: http://redsymbol.net/articles/unofficial-bash-strict-mode/

[2]: https://nuclearsquid.com/writings/shared-history-in-zsh/


All the deployment things are just wrappers to shell commands so if you want to help people debug user-data.sh, cron, ansible, docker files, or puppet, you need to understand shell. Also, it is better to have something in bash as code than not have it be fully automated. I draw the line at arrays tho. If you want arrays don’t do bash. But if you want to just run a bunch of commands on your compute, bash is there.

One thing I have started doing for stuff that need to go via ssh is I just do a here doc for an embedded shell script that I scp to the remote and then just use ssh to invoke it. The quoting and pseudotty stuff just got to be too dumb to understand.
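Presumably something of this shape (host and path are made up); the quoted 'EOF' is what keeps the local shell from expanding anything inside the embedded script:

    cat > /tmp/task.sh <<'EOF'
    set -euo pipefail
    echo "running on $(hostname) as $(whoami)"
    EOF

    scp /tmp/task.sh "$remote_host":/tmp/task.sh
    ssh "$remote_host" 'bash /tmp/task.sh'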

Also, and maybe this is just for non-interactive scripts, I find everything from writing them to using them is a lot easier if you take pains to make them idempotent, so you can run them one or 100 times with the same effect.


> The [python] scripts are much faster...

Pipelines are performant; many algorithms can be expressed as pipelines. Bash has many built-in string manipulation mechanisms which do not require a subshell.
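For example, parameter expansion covers much of what people spawn sed or cut subshells for:

    file='archive.tar.gz'
    echo "${file%.tar.gz}"   # strip a suffix: archive
    echo "${file##*.}"       # last extension: gz
    echo "${file//a/A}"      # replace all: Archive.tAr.gz
    echo "${#file}"          # length: 14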

Why do people who subscribe to the “avoid shell scripts at all costs” ideology feel so passionately about prescribing to others?

I like python, but if I am in a rush it’s shell.


Actually, the one thing that I as a systems engineer really like about Clojure is the threading macro (-> some-data some-function-applied-on-some-data some-otherfunction), or (->> ...) for inserting not as the second but as the last argument. That is like a pipeline, but usually more readable to me, even though I am proficient with the shell (and, I must admit, even PowerShell, which really is more comparable to Perl with a bit of SQL and other flavours here and there than to bash).

Sysadmins/systems engineers might love Babashka: https://github.com/babashka/babashka which is a large subset of Clojure + some frequently used libraries, shipped as a native GraalVM image. It is portable, has very fast startup, and if the script becomes a larger program, you can easily switch to ClojureScript + Node.js (e.g. for still very fast startup), to Clojure (on the normal JVM), or perhaps build your own GraalVM image. You might also just open a REPL and run it as a single session, but that is rather unique in the sysadmin/systems engineer space, where most things are launched on a schedule, e.g. every 5 minutes, by a script; if the startup time is long, it might dominate the execution time.

Btw. babashka seems to be about twice as fast to start on my Debian:

    time bb -e '(+ 1 1)'            # ~11 ms on average
    time python3 -c 'print(1 + 1)'  # ~23 ms on average


> Why do people who subscribe to the “avoid shell scripts at all costs” ideology feel so passionately about prescribing to others?

Are you asking why people make recommendations in general?


Pipelines are a kernel feature that can be used from Python too. You'll just need to abstract it in a function.
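A sketch of such an abstraction, using only the standard library (names are illustrative):

    import subprocess

    def pipeline(*cmds):
        # run cmds connected by OS pipes, like cmd1 | cmd2 | ...
        procs, prev = [], None
        for cmd in cmds:
            p = subprocess.Popen(cmd, stdin=prev, stdout=subprocess.PIPE)
            if prev is not None:
                prev.close()  # let the upstream process receive SIGPIPE
            prev = p.stdout
            procs.append(p)
        out = procs[-1].communicate()[0]
        for p in procs:
            p.wait()
        return out

    # pipeline(['ls', '-1'], ['grep', 'log'], ['sort'])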


At least shell follows the POSIX standard, which is more than one can say about Python. I have witnessed self-proclaimed "shell replacing" python scripts invoking commands and never checking the exit code. Debugging hell.


I like bash because it's easy to connect different data together. I can output a column from a sql query to a file and easily write a for loop that goes over the data and uses it.

I work mostly in payment related code, so most often I use bash to hot fix in production. Like if we need to refund certain people, I'll convert the refund api call to a curl call, get the data from sql and then just run a simple loop.
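Roughly this shape, presumably (endpoint and input file are hypothetical):

    # refund_ids.txt: one payment id per line, exported from sql
    while IFS= read -r id; do
        curl -fsS -X POST 'https://api.example.com/refunds' \
            --data "payment_id=$id"
    done < refund_ids.txt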


Shouldn't scripts over 100 lines be rewritten in Python or Ruby?

http://www.oilshell.org/blog/2021/01/why-a-new-shell.html#sh...

tl;dr I think the main skill that is missing is being able to write a Python script that fits well within a shell script. The shell script often invokes tools written in other languages like C, Rust, JavaScript, etc. (most of which you didn't write)

Good book on this: https://www.amazon.com/UNIX-Programming-Addison-Wesley-Profe...

Online for free: http://www.catb.org/esr/writings/taoup/html/


Use shell script for gluing stuff (i.e. calling other tools) and pipelines. Use GNU awk for complicated text processing. Profit!


Bash is fine as long as everything works. Handling errors in bash is a fool’s errand and you’re quite right to use Python for resilience in the face of errors or malformed input.

Sometimes you just gotta eat. Python is a sit-down meal. Bash is a can of beans. I’m happy eating both for dinner but it depends who (if anyone) I’m sharing my meal with.


100% agree. This article should just be "rewrite your script in Python". It's not the best scripting language ever, but it works fairly well and it's about 100 times better than Bash.


If you’re interested in writing safe shell scripts then check out shellcheck:

https://github.com/koalaman/shellcheck

If you’re interested I’ve written a git hook for it that runs a check when you git commit:

https://github.com/alblue/scripts/blob/main/shellcheck-pre-c...

You should also check out the Google shell script style guide:

https://google.github.io/styleguide/shellguide.html


Shellcheck and the VS Code Shellcheck extension[1] are really good. I never write a bash script without it.

Built-in autofixes and direct links to extensive documentation on each rule, with easy to grasp examples. It's just awesome.

[1]: https://marketplace.visualstudio.com/items?itemName=timonwon...


An earlier comment of mine about some more shell checking/linting/formatting tools:

https://news.ycombinator.com/item?id=26510549


I agree with almost everything in this guide, but it has a couple of recommendations that create new problems (while solving others):

The nullglob option ('shopt -s nullglob') makes things like 'for f in *.txt' work right when there are no matching files, but makes other commands like 'grep somepattern *.txt' change their behavior in ways that can cause serious trouble. grep (and many other commands), given no files as input, will read from standard input (up to EOF); if it's running interactively and input hasn't been redirected, this causes the script to hang for no discernible reason. If input has been redirected, it steals input that was presumably meant to be read by some other command. In my opinion, the problems this causes are more serious than the ones it solves.
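A minimal reproduction of the hang, assuming a directory with no .txt files:

    shopt -s nullglob
    # the glob expands to nothing, so grep gets no file arguments
    # and silently reads standard input instead
    grep somepattern *.txt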

The errexit option ('set -e' or 'shopt -s errexit') can both fail to exit when you expect/want it to (several such situations are described in the guide) and also exit when you don't expect/want it to. The guide mentions suppressing unexpected exits with '|| true', but it's not always obvious when this is needed, and it's not even consistent between versions of bash. There's a good parable about this (and some examples) at http://mywiki.wooledge.org/BashFAQ/105

I really don't like trying to predict and work around unpredictable features like this; I'd much rather deal with explicit error handling, like 'commandThatMightFail || { echo "Aaaargh" >&2; exit 1; }'


The fact that you have to use quoting nearly everywhere is a design flaw in the Bourne shell. Some shells, like Plan 9's rc, for example, don't expand after variable substitution. They have an operator to call if you want to explicitly force expansion. That's so much cleaner and less error prone.


Oil is Bourne compatible, but has a mode to opt you into the better behavior. Example:

    osh$ empty=''
    osh$ x='name with spaces.mp3'
This is like Bourne shell:

    $ argv $empty $x
    ['name', 'with', 'spaces.mp3']   # omit empty and split

    $ argv "$empty" "$x"       
    ['', 'name with spaces.mp3']     # unchanged
Opt into better behavior, also available with bin/oil:

    $ shopt --set oil:basic

    $ argv $empty $x                                                                                 
    ['', 'name with spaces.mp3']  # no splitting/elision
If you want to omit empty strings, you can use the maybe() function, which returns a 0 or 1 length array for SPLICING with @:

    $ argv @maybe(empty) "$x"
    ['name with spaces.mp3']      # omitted empty string
Example of splicing arrays:

    $ array=("foo $x" '1 2')

    $ argv $empty @array
    ['', 'foo name with spaces.mp3', '1 2']
This is called "Simple Word Evaluation": https://www.oilshell.org/release/latest/doc/simple-word-eval...

Feedback appreciated!

(Interestingly zsh also doesn't split words, but it silently removes empty strings).


> Interestingly zsh also doesn't split words

Should zsh users ever want that behavior they can enable it globally with the SH_WORD_SPLIT option, or more sensibly local to a given parameter with the = parameter expansion as in ${=var}. Also, it has the best comment in zsh's manpage: "SH_WORD_SPLIT [...] Note that this option has nothing to do with word splitting."

    $ x='name with spaces.mp3'
    $ print -l -- $=x  # "print -l" displays one element per line
    name
    with
    spaces.mp3
> it silently removes empty strings

You can keep the empty strings too if needed, but it requires using the @ expansion flag as in ${(@)arr}. It all becomes superbly readable, here I'll prove it:

    $ arr=($x '' 'old file')
    $ print -l -- "${(@D)arr:A:gs/old/new/}"
    ~/Desktop/name with spaces.mp3

    ~/Desktop/new file
In all seriousness, the zsh default handling feels right for interactive usage in this case. I'd love the strictness of oil, so that I simply don't need to remember these things. Not yet quite ready to give up on the zshexpn(1) goodies though.


OK interesting, didn't know about those options. I have seen the ${(...)x} syntax but not really used it. Yes it doesn't rate highly on my readability scale :)

I have heard the feedback that people like zsh expansion shortcuts, e.g. for globbing. Personally I am a find/xargs person, i.e. I select the files first with 'find' and then execute what I want with xargs.

It does require you to invert your thinking -- you're going from verb NOUNS to NOUNS verb. And you have to go back to the beginning of the command line and edit it. But I do find that it lets you test out the selection logic more naturally.

find can also be faster, e.g.

    find . -name .git -a -prune -o -print
skips even STATTING the .git directory, not just printing it, as opposed to ** I believe.

However find arguably has an even worse syntax (although it is explicit, just with some dumb shortcuts). One longstanding goal is to put a better syntax on the "evaluation model", which is pretty useful. It's basically a predicate that's evaluated over every node in the FS tree, but you can also customize the traversal.

It might be better for programs rather than interactively, but I started using it interactively too.

(When compiled with glibc, Oil also has bash/ksh-style extended globbing, which allows negation etc., but it seems only old scripts use that.)


You don't have to choose glob&loop or find&xargs as a zsh user, as there is a built-in zargs function:

    zargs -- $crazy_glob -- $command
Much like the `find | xargs` version you can begin with `zargs -- $glob` until your filter is correct and then tack the command on when you're ready.

Hmm, now I really like the idea of an oil-y version of this where you could do something like `oargs { $clearer_glob_DSL } { $command_block }`. It might even be possible right now using stest from dmenu¹ as a stand-in for a more advanced globbing alternative.

I believe part of the reason zsh users find the extended globbing functionality useful is that the basic usage matches their expectations from other tools, unlike find's quirky syntax. The common filters use the same values as you'd see in `ls --classify` output. For example, if you want executable files, tack a `*` on; for sockets, use a `=`.

FWIW, the find comparison isn't quite right. Out of the box `**/file` wouldn't traverse a .git directory anyway, unless the GLOB_DOTS option is set or you provide the equivalent flag as in `**/file(D)`. The less specific point is quite true though as adding negation and toggling flags in a single glob can be extremely difficult to read. See the examples at the bottom of zshexpn(1) for proof of that.

Edit to add: I hope this comes across as an attempt to be helpful as was intended, and not some awful stop motion attempt.

¹ https://tools.suckless.org/dmenu/


zsh also does variable substitution better than bash. FYI, I just released rust_cmd_lib 1.0 recently, which can do variable substitution without any quotes: https://github.com/rust-shell-script/rust_cmd_lib


> The fact that you have to use quoting nearly everywhere is a design flaw in the Bourne shell

I disagree. This can be considered a design flaw in other places, like filesystems that allow filenames with spaces and other idiotic complexity-inducing things.


Filenames with spaces only introduce complexity because bash was poorly designed. If you could treat strings with spaces the same as strings without spaces — as you can pretty much everywhere but the shell — then there wouldn’t be any additional complexity at all.


You have this point of view because you see filenames as data. Then, it wouldn't be nice to limit the contents of this data, thus filenames should be completely unrestricted. But there is another point of view. From the natural point of view of the shell, filenames are just identifiers. Like variable names in a programming language. Of course, there are programming languages that allow variable names with spaces, but they require weird quoting or a very limited syntax. Having variable names without spaces is so much more convenient that it is a hard restriction in most languages. The same is true for the beautiful shell language, but unfortunately unix filesystems are poorly designed in that they allow almost arbitrary filenames (note that the null character and slash are not allowed).

I prefer to be able to do "for i in `ls`..." in my shell rather than have filenames on my disk with hard spaces. This could be solved at the filesystem level, by a mount option (say, "-o cleannames") that exposes filenames with spaces using a non-breaking unicode space. You will take my simple shell one-liners that break with ugly filenames from my cold, dead hands.


Iterating a list of files should be handled by built-in support for lists at the language level and listing files from the stdlib, not based on your personal preference that a space in a string is a universal delimiter for list items because ls in some happy cases happens to print files that way. What are you going to do next time you have to iterate any other type of data with spaces in it? Claim that's complex too?

Trying to shoehorn arbitrary restrictions into data identifiers because of the way they're serialized in some situations just leads to the "csv problem", where you never reach a uniform standard because some prefer using spaces as separators, some use tabs, some use commas, semicolons, quotes are always allowed, etc...

Just define a standard array type once and for all and use it to pass data for everything. Take Python, for example: os.listdir() is just one call away, same for any other high-level language. (I am aware bash already has this and one should use globs instead of ls, as sketched below, though I wouldn't advocate bash for anything regardless, for multiple other reasons.)
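For reference, a sketch of that glob-based iteration in bash (paths are made up):

    # the glob expands to one list element per file; "$f" survives spaces
    # (combine with shopt -s nullglob so a no-match glob expands to nothing)
    for f in ./*.mp3; do
        mv -- "$f" /backup/
    done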


That standard already exists, and it is text identifiers separated by spaces.


I personally have found that AWK is a surprisingly good alternative to shells for scripts bigger than average.

Why?

1. Portability (AWK is a part of POSIX)

2. Very clean syntax

3. Powerful associative arrays

4. Super-powerful string manipulation facilities

5. Easy shell interoperability

As an example, here is a simplistic build tool [1] I’ve developed in AWK. As you can check [2], it runs unchanged under Linux/macOS/Win (via Git Bash).

[1] https://github.com/xonixx/makesure/blob/main/makesure.awk

[2] https://github.com/xonixx/makesure/actions/runs/702594092


AWK is good, but Perl is better. Perl is good, but Python is cleaner. Python is good, but Go is better. Go is good, but Rust is faster. Rust is good, but shell is simpler. Shell is good, but AWK is better...


Perl is like a Swiss Army knife for sysadmins.


Every time I think of writing some Bash I spend some time here: https://mywiki.wooledge.org/BashPitfalls and then ...I use something else.


Yeah, it's ironic that you have such a wonderful repository of knowledge, only to realize that a language shouldn't have that many corner cases and need such a bag of tricks.


Here's a great resource for proper Bash syntax

http://mywiki.wooledge.org/BashGuide


When calling bash from Python (to use pipes etc.), I found it useful to pass variables via the environment, like this:

    import os, subprocess

    def safe_call(command, **keywds):
        # run under bash (not /bin/sh) so pipefail works; merge os.environ so PATH etc. survive
        return subprocess.check_call(f'set -euo pipefail; {command}', shell=True,
                                     executable='/bin/bash', env={**os.environ, **keywds})

    safe_call('command1 -- "$bar" | command2 --baz="$baz" | command3 > "$output_file"', bar=bar, baz=baz, output_file=output_file)


The main difficulty I have with this method is that I don't have an easy way to pass a Python array to bash as an array of parameters.
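One workaround, for what it's worth: pass the list as extra arguments to bash -c, where it arrives as "$@" (the '_' placeholder fills $0):

    import subprocess

    files = ['a b.txt', "it's.txt"]   # arbitrary names survive intact
    subprocess.check_call(
        ['bash', '-c', 'set -euo pipefail; wc -l -- "$@"', '_'] + files)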


PowerShell is cross platform now and it’s quite good. I would encourage people who’ve never tried it to give it a shot.

Like 99% of other people who never used PoSh, I thought it was just a Windows shovelware replacement for Command Prompt. I was mistaken—the syntax is actually a lot simpler than bash, but it’s just as capable.

One of the cooler features about PoSh is that you can leverage Visual Basic/C#/.NET from it, and you can script GUIs (like Python’s tkinter library).


I'd probably like to see powershell usage increase too, but I think we may be in the minority. Every time it comes up I check to see if it is available in Debian and see that the packaging bug¹ has been open for years without anyone caring enough to move on it. The referenced upstream bug² points out that it isn't capital-F Free enough to distribute [yet], but you'd expect more comments in the bug if people were pushing(or a package for the non-free section).

¹ https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=834756

² https://github.com/PowerShell/PowerShell/issues/5869


But What-Is-Up-With-Those-Really-Long-Commands? :)


Discussed at the time:

Safe ways to do things in bash - https://news.ycombinator.com/item?id=17057596 - May 2018 (240 comments)


Embarrassingly, instead of fixing the code, I fix the data for bash. I just make sure my input is well-formed haha. I don't think I have any files or folders with spaces in them on my computer.

It may not be the way but it is a way.


I tend to do this as well.

This is the way.


I've been shell scripting intermittently for a long time and have written some bash scripts that have seen a lot of use. This document really showed me how bad some of my programming is. But then again, the first thing I tell people when demonstrating one of these programs: "I am not a programmer. I am a bash scripter. And, a bad one." The stuff still works. But I would feel a lot better if I were doing things the right way. I'll be reviewing this next time I write a bigger s/program/script/


Shellcheck helped me learn a lot about bash.

https://www.shellcheck.net/


See the same debate ensue each time with bash: use python, they shout. Perhaps the overriding factor here is familiarity? Past looping and some basic flow control, I'm guessing most people run out of comfort in bash and can get things done quicker in their fave lang?


Please read http://mywiki.wooledge.org/BashPitfalls. Now, you can find such a page of footguns for almost any language out there, but this one is on a whole different level and affects even the most fundamental features. One can't even for-loop over a list of files or use echo in the first intuitive way without running into a trap that potentially blows up your whole system. It's the complete opposite of a "pit of success".

Familiarity is of course an important factor, though I would actually claim the problem is the opposite: people reach for bash because of familiarity with what they type on the CLI, not realizing that scripting is a completely different world from interactive use, where you can inspect, correct, and manually adjust every action step by step, with the same input each time.

A language with only strings as a data type (it has arrays and numbers, but they're so confusing and convert implicitly that you might as well ignore them; IFS, anyone?), surprising quoting, data and input interpreted as code, globbing and expansions where you least expect them, global-variable everything, no module system, and the only error handling is either errexit or manually checking each command and every sub-pipe (functions make error handling behave even weirder). It can't be made safe no matter how hard you try to familiarize yourself with it.


It’s not about being comfortable, but about the million ways your shell script will fail. A file has a space in the name? Good chance your script failed. No file found? Same. Multiple files found after expansion? Same.


If you're gonna Bash, definitely use shellcheck for everything. If you're writing Bash in vim, I'd highly recommend installing Syntastic and enabling the checker below:

``` let g:syntastic_sh_checkers = ["shellcheck", "-e", "SC1090"] ```


Note that HN doesn't support markdown, but does have its own limited markup.

Specifically, backticks are simply displayed, rather than marking a code snippet or block.

You can indent by four spaces for a code block as below:

    This is a preformatted section preceded by
    four spaces in the comment editor.


This all has its place, and I learned a good few things from the article. But also, in many situations when you may not need to deal with completely arbitrary input, it can be nice to shed all of the "correct" syntax and just optimize for readability.


> Command substitution:
> Good: "$(cmd)"
> Bad: $(cmd)

And I’m over here still using `cmd`...


A portable way to test if the variable is set:

    test -n "${VAR+x}"


I think the best way to write things safely is to not write them in Bash in the first place. Better pick a strongly typed language of your choice.


How many strongly typed languages are:

+ Installed everywhere

+ Even within 20% as concise as bash


Indeed. Python does not win over bash in either category, and it is slower.

I wish there were a super-fast shell-like scripting language available everywhere that could serve as a general programming language. Like Perl, but with simpler rules and readable syntax. But there is nothing. Who would have thought, I may have to learn Perl eventually.


IMHO, we need more than one language in the shell: one to orchestrate launching of applications, plumbing, and processing of their inputs and outputs, another one for computations with integers, floats, complex values, arrays, matrices, another one for text processing and parsing, another one for argument parsing and default values, another one for error handling and recovery, another one for pattern matching and state machines, another one for object-oriented programming, another one for network programming, another one for record-oriented programming, another one for security and debugging, another one for container management, another one for system manipulation, upgrading, recovering, another one for documentation, and so on, with advanced way to share variables and data between these languages (environment variables on steroids).

It's not possible to implement such a monstrosity as one simple, portable, small, fast, and secure binary.


Why not? Busybox is an example of how to pack diverse capabilities into one binary.


How is it slower, and why would that matter at all? Goddamn bash calls an external process in if branches, so if a bash script is enough for a given process, then running python inside a vm inside a vm inside a vm will still be fast enough. Scripting doesn’t require performance.


It is slow because the process of interpretation is slow (conditions, number and string operations) and because to achieve anything nontrivial in shell/bash, one _has_ to call external processes, which is slow. In Perl one calls external processes much less because of its large number of capable libraries, and Perl is indeed much faster.


This doesn’t make sense. How would python be slower? Bash is indeed the prototypical "do what each line says" interpreter, and there are absolutely no optimizations there.

While Python is not the fastest thing out there, basically every implementation does a quick parse phase and runs some optimized form.


I meant that bash is slower than perl. Python is also slower than perl. Python is quicker than bash at number/string operations in-process, but much slower than bash at startup. The 'quick parse phase' is exactly the reason; it is not quick. Try to execute 1000 python processes and then 1000 bash processes that do something simple like add numbers given as command line arguments. Python is an order of magnitude slower to finish.
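Easy enough to measure; something like this, with numbers varying by machine:

    time for i in $(seq 1000); do bash -c 'echo $((1+2))' >/dev/null; done
    time for i in $(seq 1000); do python3 -c 'print(1+2)' >/dev/null; done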



