
How to write idempotent Bash scripts - kiyanwang
https://arslan.io/2019/07/03/how-to-write-idempotent-bash-scripts/
======
mnutt
It’s good to know which command line tools are idempotent, but I think once
you start running automated scripts repeatedly you’re better off with
something like Chef, where you declaratively define the end state and let Chef
figure out how to converge upon that state. Taking the “create a directory”
example, you would say “I want to end up in a state where this directory is
created and has these permissions” and Chef will determine if the directory
already exists.
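
For comparison, the same "declare the end state" idea can be approximated by hand in Bash. This is only a minimal sketch: the helper name `ensure_dir`, the example path and the mode 750 are assumptions made for illustration, not something from the article or from Chef.

```shell
#!/usr/bin/env bash
# ensure_dir DIR MODE: converge on "DIR exists and has permissions MODE".
# The helper name is hypothetical; mkdir -p and chmod are standard tools.
ensure_dir() {
    local dir=$1 mode=$2
    mkdir -p -- "$dir" && chmod -- "$mode" "$dir"
}

workdir=$(mktemp -d)                 # throwaway location for the demo
ensure_dir "$workdir/app/data" 750
ensure_dir "$workdir/app/data" 750   # second run: same end state, no error
ls -ld "$workdir/app/data"
```

Unlike a real configuration management tool, this sketch only covers the one resource it is told about, which is the limitation discussed next.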

There are of course some downsides to that approach, which stem from the fact
that you can’t possibly declare every possible thing. For instance, if you
were to run the above declaration and then later remove it, Chef wouldn’t know
to delete the directory; at that point Chef wouldn’t know anything about the
directory at all. So ideally you can build infrastructure from scratch each
time (Docker, etc.) rather than having to converge an existing machine.

~~~
pas
Chef isn't magic, it just uses a lot of cookbooks which are just running
things to check what the state is.

Usually those cookbooks will want to have things in a particular way, which
might not be the way you want.

Chef/Ansible/Puppet all have the problem of having many layers of overhead for
the same thing.

This makes them slow, hard to debug, hard to read and even write (in my
opinion).

Sure, bash is hell, but then go from bash to something like Rust, instead of
Chef/Ansible.

Sure, again, this might sound rather unconventional, but you get safety, speed
and convenience of a full fledged programming language, single binary output,
etc. And you need to test cookbooks/playbooks too, so why spend enormous
effort on cobbling together scripts and high-level abstractions instead of
writing just what you need with the assumptions you really have - instead of
what a general cookbook/playbook has to have (which is a vast difference).

~~~
lukeschlather
Rust doesn't get you anything resembling what you want here.

Chef/Ansible/Salt could be described as rudimentary type systems for operating
system states, that enable you to convert between different types/states.

It's conceivable you could build such a system in Rust, but Rust itself has no
primitives that would make it easy. It certainly isn't a good starting point
when you want to set up a database server.

~~~
pas
All Config Management systems execute other programs, parse their output,
check/parse files underneath their fancy skin.

I've used both Chef and Ansible. Their maintenance cost is pretty high,
unfortunately, and they are neither flexible nor resilient enough to be worth
it.

No wonder the container/k8s boom is so big. Immutable infrastructure (docker
images) with really powerful operational features is what can justify high
maintenance efforts. Whereas fancy but brittle (due to the inherent problem of
state discovery) CM systems are at best only useful for the initial platform
setup.

Rust is just useful because you can quickly and relatively safely produce
single binary programs. (Many people use Go for this, but it's easier to skip
error handling in Go, which results in runtime problems, which is very
inconvenient in a provisioning/setup system.)

~~~
ptman
Nix/Guix might be a better approach. Ansible and Chef describe part of the
desired state of a system and require a very large definition to cover the
whole system. Nix aims to track the whole system state and allows you to
supply changes which again combine into another whole system state.

~~~
pas
Yes. Though I'd probably want to simply use something "trusted", either Ubuntu
LTS or CentOS/RHEL, keep the installed packages to a minimum, use a local
repository mirror proxy, track package changes there, etc.

And the image build should be just a simple imperative sequence: install these
packages, use this config, run this command on invocation.

Nix is rather amazing with its powerful CLI stuff it provides (S3 compatible
dependency store, fetching via SSH, closures, etc).

My only problem with NixOS is that it's very much like Gentoo. It has infinite
composability built-in, but it means you have to rebuild everything. In
Debian/Ubuntu land you usually can simply enable/disable install/uninstall
specific feature related packages. (For example postfix and postfix-mysql
packages.)

------
Zenst
Many people who write scripts do not factor in the environment they run in,
and they omit traps to handle events like being terminated midway through and
cleaning up to a sanitized state. After all, when creating a file you can't
presume there is enough free space, but people do in scripts all the time.

Then you have scripts that you want to run once and, if run again, have no
further effect. For example, a script to be run by ops on production servers:
what happens if they accidentally run it twice? An easy way to cover that is to
have the script create a file, which you check for at the start of the script,
and exit if it is present. For this you can use the date/time, PID and script
name for a unique file name, created in a /tmp directory that you clean up
weekly via skulker, or at whatever frequency you require. It's a handy way to
handle scripts that you want to run successfully once and never again.
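
A rough sketch of both ideas, a cleanup trap and a run-once marker file, might look like this. All names (`run_once`, `migrate.done`, the directories) are hypothetical. The marker here uses a fixed name, on the assumption that a later run has to be able to find it; a name containing the PID or the time would differ on every run.

```shell
#!/usr/bin/env bash
state_dir=$(mktemp -d)             # stand-in for a persistent state directory
marker="$state_dir/migrate.done"   # fixed name so a later run can find it
scratch=""

cleanup() {
    # Runs on normal exit and after INT/TERM; removes partial work only.
    if [ -n "$scratch" ]; then
        rm -rf -- "$scratch"
    fi
}
trap cleanup EXIT
trap 'exit 130' INT
trap 'exit 143' TERM

run_once() {
    if [ -e "$marker" ]; then
        echo "already done, skipping"
        return 0
    fi
    scratch=$(mktemp -d) || return 1
    echo "real work goes here" > "$scratch/output"   # placeholder for the job
    touch -- "$marker"                               # record success last
    echo "done"
}

run_once   # first call does the work
run_once   # second call sees the marker and skips
```

Writing the marker as the last step means an interrupted run leaves no marker behind, so the next run tries again.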

~~~
cjvirtucio
I have not been doing DevOps for very long, but a useful trick I've found is
to create a staging directory, perform my work there, diff the result with the
existing state, and copy if a difference exists. I've found it very easy to
clean up to a sanitized state in this way; the existing state is safe, since
all I have to do is just destroy my work.
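
One way this staging workflow could be sketched in Bash is shown here. The function name `deploy_if_changed` and the directory layout are assumptions for illustration, and the comparison uses a recursive `diff`, which is only one of several possible choices.

```shell
#!/usr/bin/env bash
# deploy_if_changed STAGE LIVE: copy STAGE over LIVE only when they differ.
# The function name and the directory layout are hypothetical.
deploy_if_changed() {
    local stage=$1 live=$2
    if [ -d "$live" ] && diff -r -q "$stage" "$live" >/dev/null; then
        echo "unchanged"
        return 0
    fi
    mkdir -p -- "$live" && cp -R -- "$stage/." "$live/" && echo "updated"
}

root=$(mktemp -d)
mkdir -p "$root/stage"
echo "port=8080" > "$root/stage/app.conf"

deploy_if_changed "$root/stage" "$root/live"   # first run copies the files
deploy_if_changed "$root/stage" "$root/live"   # second run finds no difference
```

Note that this sketch never deletes files from the live directory that were removed from staging; handling that would need something like rsync with a delete option.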

~~~
nerdponx
What do you use to compute the diff? Just the diff command itself on
individual files?

~~~
dredmorbius
diff -r

You can use numerous tools to compare/manage branches; md5sum / shasum (md5 is
usually useful, though not entirely safe), diff and kin, vimdiff, rsync,
fdupes, jdupes, git, hg.

~~~
haddr
I went the „git” way once and I found it really cool! With all the goodies you
get for free from git, like applying patches, checking differences, resetting
to previous state etc.

~~~
dhimes
Same, but with mercurial.

------
united893
This is generally terrible advice. A better option is to 'set -e' and ensure
that the bash script exits when there's a failure.
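
To illustrate what `set -e` changes, here is a small sketch that runs the same snippet in a child bash with and without strict mode. The snippet and the variable names are made up for the demonstration.

```shell
#!/usr/bin/env bash
# The same two commands, where the first one fails.
snippet='false; echo "reached the end"'

loose=$(bash -c "$snippet")
strict=$(bash -c "set -euo pipefail; $snippet") || strict_status=$?

echo "without set -e: output='${loose}'"
echo "with set -e:    output='${strict}' status=${strict_status:-0}"
```

Without `set -e` the failure of `false` is ignored and the script carries on; with it, the child shell stops at the first failing command and returns a non-zero status.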

Bash scripts can't be idempotent because they operate in an external
environment that can't ever be. The better option is to just be extra safe.

See:

> [http://redsymbol.net/articles/unofficial-bash-strict-mode/](http://redsymbol.net/articles/unofficial-bash-strict-mode/)

~~~
dymk
That's orthogonal to the article - in fact, the article hints that you'd take
this advice _when_ your bash script is failing midway, such as it might with
`set -e`.

The article is about writing bash scripts that don't care if you've run them
once or 10 times.

~~~
news_hacker
Seems like all the pedants came out for this article. In good faith, I
understand the principles OP is suggesting and have first-hand experience of
their usefulness. But it's easier to nit-pick, I guess.

~~~
unixhero
Every time I discuss Bash on the net, these gremlin-style characters ruin the
fruitful discussions I've had ... I think the article provides excellent
advice.

------
empath75
This is surely a good collection of flags to know, but I sort of object to the
premise of the article.

There is a good reason that many of these flags are not default behavior, and
it’s that they can be quite destructive.

If you’re writing a script, as the beginning scenario says, that has errors in
it, and you expect to have more, do you really want to tell all the commands
to power through and delete and overwrite stuff if it encounters an unexpected
state? Telling people to just always use these flags and not only when they’re
quite sure what their script is doing, is probably playing with fire.

This may make the script idempotent in repeated runs, but the first run might
not be what you expect at all.

~~~
iforgotpassword
> If you’re writing a script, as the beginning scenario says, that has errors
> in it, and you expect to have more, do you really want to tell all the
> commands to power through and delete and overwrite stuff if it encounters an
> unexpected state?

Where are they saying this? These flags are about making sure the script
wouldn't error out right away if you run it again, e.g. how a regular mkdir
for a path that already exists would exit non-zero and thus end script
execution (given set -e is active, which the article seems to imply). Also, it
is not so much about purposely writing a script that contains errors, because
that would just mean the error is encountered every time, but more about
transient errors like running out of disk space, a curl call failing because
of network hiccups, etc. Everything your script did up until that failure
shouldn't prevent a second call from succeeding.
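
The mkdir case can be checked directly. A small sketch, using a throwaway path chosen for the demo:

```shell
#!/usr/bin/env bash
dir=$(mktemp -d)/cache

mkdir "$dir"                         # first run: succeeds
if mkdir "$dir" 2>/dev/null; then    # rerun: plain mkdir reports an error
    plain_rerun=ok
else
    plain_rerun=failed
fi
if mkdir -p "$dir"; then             # rerun with -p: existing directory is fine
    p_rerun=ok
else
    p_rerun=failed
fi
echo "plain mkdir on rerun: $plain_rerun; mkdir -p on rerun: $p_rerun"
```

Under `set -e`, the failing plain `mkdir` would end the script on the second run, while `mkdir -p` lets it continue.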

------
intc
> This is an easy one. Touch is by default idempotent. This means you can call
> it multiple times without any issues. A second call won’t have any effects

Every call to touch will alter the modification time of the file (example.txt
in this case).
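
If the timestamps matter, one possible workaround is to create the file only when it is missing. This is a sketch with a hypothetical helper name, `ensure_file`; it is not what the article proposes, and the check-then-create step is not atomic.

```shell
#!/usr/bin/env bash
# ensure_file PATH: create PATH if it is missing, otherwise leave it untouched
# (including its timestamps). The helper name is hypothetical.
ensure_file() {
    [ -e "$1" ] || : > "$1"
}

f=$(mktemp -d)/example.txt
ensure_file "$f"
touch -t 200001010000 "$f"   # backdate the file so any change would be visible
before=$(ls -l "$f")
ensure_file "$f"             # rerun: nothing is written, mtime is unchanged
after=$(ls -l "$f")
[ "$before" = "$after" ] && echo "timestamps preserved"
```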

~~~
jrumbut
It will change the access time too. Almost every suggestion in the post has
this problem which is unfortunate since access/modify/create time are used in
a lot of scripts to see which files were or were not processed by prior runs
of a script (whether this is a good idea or not).

It is very difficult to write truly idempotent bash scripts (consider log
files). Creating sort of idempotent or at least rerunnable scripts would be
nice but I think even that would take more care than this.

~~~
theamk
Do people actually use access time for anything?

I occasionally do full-text searches on system files, and this resets access
times on all files. Thus, I cannot imagine using a script which cares about
it.

------
suprgeek
A lot of the suggestions in this post are "idempotent" only in the sense of
the specific use case that the author is interested in. Please do not take
this advice to be generally applicable for the uses of "idempotent" or in
general to mean best practice while writing bash scripts.

Advice such as replacing rm <FileName> with rm -f <FileName> could lead to
disaster depending on the scenario. So a HUGE YMMV.

------
zamadatix
"The -f flag removes the target destination before creating the symbolic link,
hence it’ll always succeed."

Removing something and adding it back seems to be, by definition, not
idempotent. What if something tries to access that file in between? What about
the filesystem timestamps (same issues as with the "touch" claim)?

~~~
grzm
Idempotency focuses on the result, rather than implementation. If the intent
of the script is to ensure the file is there when the script is finished, then
it’s idempotent. If the time stamps matter for the intent of the script, then,
yes, what you pointed out would be an issue.

~~~
Godel_unicode
This is incorrect. Idempotency is about not needing to worry about running
something more than once. If you introduce race conditions with follow-on
runs, you're not idempotent.

~~~
grzm
If race conditions are an issue within the context of your intent, then yes,
you'll need to take that into account. Again, one needs to take into account
the intent and context of the script. If other aspects of the environment
ensure that the operations are serialized (say you're making an update to a
system which you've rotated out of production for the update), you won't need
to worry about race conditions. Need to make sure you have no timing effects?
That will affect how the script is written. Need to worry about CPU or IO
utilization (including reads)? That will need to be considered.

------
zimbatm
All the examples are more-or-less idempotent. For example the `blkid /dev/sda1
|| mkfs.ext4 /dev/sda1` doesn't guarantee what type of filesystem the
partition will have. `touch file` changes the mtime of the file. `ln -sf` will
fail if the target is a directory and is not atomic.
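
For the symlink case, a commonly used workaround is to create the new link under a temporary name and rename it into place. The sketch below assumes GNU coreutils, because it relies on `mv -T`; the helper name `atomic_symlink` and the temporary-name scheme are made up for illustration.

```shell
#!/usr/bin/env bash
# atomic_symlink TARGET LINK: point LINK at TARGET by renaming a temporary
# link over it, so LINK never disappears in between.
# Assumes GNU mv (-T); the helper name is hypothetical.
atomic_symlink() {
    local target=$1 link=$2 tmp
    tmp="$link.tmp.$$"
    ln -s -- "$target" "$tmp" && mv -T -- "$tmp" "$link"
}

d=$(mktemp -d)
mkdir "$d/v1" "$d/v2"
atomic_symlink "$d/v1" "$d/current"
atomic_symlink "$d/v2" "$d/current"   # repoint, even though v1 is a directory
atomic_symlink "$d/v2" "$d/current"   # rerun: same end state
readlink "$d/current"
```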

It would be nice to have a toolset of commands that are all idempotent to make
that type of task easier.

------
arendtio
For everybody interested in _portable_ shell programming, I'd like to
recommend the following page, which offers a good overview of POSIX-compliant
commands with links directly into the POSIX standard:

[https://shellhaters.org](https://shellhaters.org)

~~~
theamk
Is POSIX relevant today? I'd think that the overwhelming majority of shell
scripts would require either Linux or BSD dialects.

And if I wanted to limit my shell scripts significantly, I'd limit myself to
commands supported by busybox's "ash" shell -- this is a limited shell
environment which is pretty widely used in initrds and embedded devices.

------
zokier
The first example ("idiom") is a bit tricky, because the title is "Creating an
_empty_ file". An alternative would, for example, be

    echo -n > example.txt

Both fit the description of "Creating an empty file". Is it idempotent, or
more or less so than `touch example.txt`? If there is already an existing
file, then touching it obviously will not empty it, so the end state is not
"empty file exists" like one would expect from operation "Creating an empty
file"

Ironically, the article says "This is an easy one".

~~~
gabrielblack
"echo -n" is unnecessary; this is sufficient:

    > example.txt

without prepending the echo command. This will create an empty file or
truncate an existing file. The touch command could be preferable to preserve
the content, if needed.

~~~
gabrielblack
P.S.

    >> example.txt

This one (with a double >) could replace touch in this use case: it's less
readable but more efficient.
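
The difference between the two redirections can be checked with a short sketch. The `:` (null command) in front of each redirect is an addition made here, on the assumption that the script should also behave the same in shells such as zsh, where a bare redirect does something different.

```shell
#!/usr/bin/env bash
f=$(mktemp)
echo "keep me" > "$f"

: >> "$f"                    # append nothing: creates if missing, keeps content
after_append=$(cat "$f")

: > "$f"                     # truncate: creates if missing, empties the file
after_truncate=$(cat "$f")

echo "after >>: '${after_append}', after >: '${after_truncate}'"
```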

------
mongol
This is useful and I think it would be good if more tools had idempotency
options available. Just today I needed to run a script that added a zypper
repo in OpenSuse. It fails the second time because the repo is already added
and there is no option to avoid it. Also the error code is not distinct so it
is not easy to script around. Ideally every tool should have a mode
"accomplish this result" in addition to "do this".

------
rb808
I think Python is getting so popular & ubiquitous now I can't see any reason
to write bash scripts any more except for perhaps very simple ones that are
just a few lines. Other people's bash scripts are regularly just too difficult
and complex to maintain.

~~~
pletnes
If you’re into minimal docker images, bash or similar shell might be the only
script interpreter available.

Python might not exist on your windows machine.

Etc.

~~~
antpls
Surely we could all agree on a minimal subset of useful Python to write
scripts : loops, arrays, lists, some strings functions, etc. The resulting
Python interpreter would probably be tiny.

For example, there is
[https://docs.bazel.build/versions/master/skylark/language.ht...](https://docs.bazel.build/versions/master/skylark/language.html)
which is a subset of Python.

~~~
pletnes
Micropython runs in 16k RAM.

[https://micropython.org/](https://micropython.org/)

------
tjoff
Is there any hope for a more modern ubiquitous light weight scripting
language?

It can be learned, and one can even learn to love it. But scripting is often
done by people that very seldom write scripts, the barrier to make decent
scripts is much too high and readability isn't much better. The nonsensical
syntax is also easy to forget, the result is just an awful lot of headache and
poor scripts floating around. Surely we can do better?

------
finchisko
I think the problem should be solved within the OS using sandboxing, like in
iOS (or to some degree Android too) and macOS and Windows store applications.
For example, when you uninstall an app, there are no leftovers in the system.
You don't need to check, if file exists or not ...

To some degree, this has also been solved with containers. It's much easier to
create a new container image than to create a proper idempotent script. And
you can create an image from a running system too. So you can just "bash"
commands and not care about the state of the system. When done, you just
create the final image and you can be sure that the state of the system in the
container will be the same as you intended.

I'm talking mostly about install scripts. Of course there are valid use cases
for idempotent bash scripts.

------
deepsun
> ln -s source target

Terminology is broken. Correct is "ln -s TARGET LINK_NAME" per its man page.

------
bonyt
Don’t forget mktemp for making temporary files or directories, instead of
putting files in a set place in /tmp.

[https://linux.die.net/man/1/mktemp](https://linux.die.net/man/1/mktemp)
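
A typical way to use it, sketched under the assumption that the script wants a private scratch directory that is removed on exit (the `myscript` prefix and file names are illustrative):

```shell
#!/usr/bin/env bash
# Create a private scratch directory and remove it when the script exits.
# The template argument is optional; "myscript" is an illustrative prefix.
scratch=$(mktemp -d "${TMPDIR:-/tmp}/myscript.XXXXXX") || exit 1
trap 'rm -rf -- "$scratch"' EXIT

echo "intermediate data" > "$scratch/step1.out"
echo "working in a private directory under ${TMPDIR:-/tmp}"
```

Because each run gets a fresh directory, leftovers from an earlier, interrupted run cannot collide with the current one.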

~~~
arendtio
Well, sadly mktemp is not part of POSIX. So while it is available on a wide
range of systems, there are also different implementations with different
options. So while I like the job it does, I am also pretty disappointed how
'unportable' scripts become as soon as they use mktemp :-/

~~~
posix_me_less
It is sad to read these comments by people believing POSIX is something
relevant today. My advice is to give up on false hope that POSIX will solve
portability problems. POSIX ideas of shell are 30 years old and refusing to
use anything newer alone won't make your script work correctly on all systems.
If feasible, install bash, which has become the standard _de facto_. Or just
write your script to test for the shell version and then launch appropriate
code path.

~~~
arendtio
I know that POSIX doesn't meet the expectations many of us have, but I don't
see how it isn't relevant anymore. I mean, if you want to write portable shell
scripts it is still a good reference on what you can expect to find on many
systems (and which options were introduced by GNU and the like).

Yes, ultimately you have to test your scripts on the actual systems, but that
is something you have to do anyway. For example, when you run scripts on MacOS
and you run into old Bash bugs because Apple refuses to ship an up-to-date
version, those are issues a standard can't solve.

However, I have no experience how much you can count on POSIX when it comes to
C APIs and the like.

~~~
posix_me_less
POSIX is part of unix history and running systems have some compatibility
level with it, so in that respect it is relevant. But as for writing
universally working scripts, it is not a panacea, because system shells are
not exactly POSIX compliant. For example, bash and Linux, the standard of unix
de facto, deviate from POSIX, the standard de iure.

> but that is something you have to do anyway.

Exactly my point.

------
sharperguy
I like to use stuff like

    mountpoint -q "$MOUNTPOINT" || mount ...

instead of

    if ! mountpoint -q "$MOUNTPOINT"; then mount ...; fi

It's much more compact in the case of having a lot of commands that need to be
checked in case they don't need to be run again.

~~~
CameronNemo
I was going to mention this, but frankly it is just a style preference. No
real difference between either variant. Also, you can do something like:

    mountpoint -q /proc || {
        mount -t proc none /proc
        chmod 0400 /proc/slabinfo   # chmod, not chown: 0400 is a file mode
    }

if you need multiple statements.
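
The same "check || act" shape works for things other than mounts. As a sketch that can be run without root, here it is applied to appending a line to a config file; the file and the line are made up for the example.

```shell
#!/usr/bin/env bash
# Same guard shape as the mountpoint example, applied to a config line.
# The file and the line are illustrative.
conf=$(mktemp)
line='export EDITOR=vim'

for run in 1 2 3; do
    # -q: quiet, -x: match the whole line, -F: fixed string, not a regex
    grep -qxF -- "$line" "$conf" || printf '%s\n' "$line" >> "$conf"
done

cat "$conf"   # the line was appended once, not three times
```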

------
forty
I use ansible only not to have to do this :) nice tricks though

------
jakub_g
Another tricky thing: recursively copying a folder is not easy to get right in
an idempotent way due to arcane bash behaviors:

[https://unix.stackexchange.com/questions/228597/how-to-copy-...](https://unix.stackexchange.com/questions/228597/how-to-copy-a-folder-recursively-in-an-idempotent-way-using-cp)

~~~
posix_me_less
That is hardly arcane; the context-dependent behaviour of cp and mv is well
known and standard. You are right that it does make copying to a given path a
more complicated task.
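
A short sketch of the behaviour in question and one common way around it. The helper name `copy_tree` is hypothetical; the `SRC/.` form is the part doing the work.

```shell
#!/usr/bin/env bash
# copy_tree SRC DEST: make DEST contain the contents of SRC, whether or not
# DEST already exists. The helper name is hypothetical.
copy_tree() {
    mkdir -p -- "$2" && cp -R -- "$1/." "$2/"
}

d=$(mktemp -d)
mkdir -p "$d/src/sub"
echo hello > "$d/src/sub/file.txt"

# Naive form: the second run nests the source inside the destination.
cp -R "$d/src" "$d/naive"
cp -R "$d/src" "$d/naive"      # now creates $d/naive/src

# Guarded form: reruns converge on the same layout.
copy_tree "$d/src" "$d/good"
copy_tree "$d/src" "$d/good"
```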

------
rwestergren
Context and purpose of the bash script in question is important here. In the
example, the author is writing a simple bootstrap script for a dev machine. A
number of the critiques here, while valid, are aimed at different use-cases.

------
gist
A good way to remember idempotent is to think of the 'delete' or 'mark as
read' function for gmail.

You can delete an email or mark it as read even if it's been done before (on
another open screen as an example).

~~~
bklaasen
Actually I think the "summon lift" button is a better example of idempotence.
It doesn't matter how many times you mash the button, the lift is coming ASAP,
no quicker. Hitting the button when it's lit doesn't cancel summoning the
lift.

~~~
gist
By 'lift' you mean 'elevator'? When I first read this I thought you meant
'summon lyft' the ride service but then realized what you meant!

Interestingly in the 'old days' of elevator operators hitting it multiple
times would be either a 'hurry up' for the operator or an annoyance that maybe
made it come slower!

------
delinka
> This means you can call it multiple times without any issues.

Unless you have processes that depend on modification time.

> ln -sfn

If you don’t mind the possibility of ownership changing, this is OK.

------
mooneater
Great article, but a common scenario is missing: I want to run a long running
script, but not if another copy of it is already running.

~~~
dymk
[https://linux.die.net/man/1/flock](https://linux.die.net/man/1/flock) would
probably help

~~~
sirn
flock(1) is not POSIX, though. mkdir(1) can be used if you absolutely want a
POSIX way to manage locks. For example:

    if ! mkdir .lock; then
        printf >&2 "Already running?\n"
        exit 1
    fi

Some network file system implementations do not guarantee atomic mkdir, so you
still need extra caution with this method.
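
The snippet above never removes the lock, so a fuller sketch would probably release it in a trap. The version below makes that assumption; the lock path is a throwaway demo location and the helper name `acquire_lock` is hypothetical.

```shell
#!/usr/bin/env bash
lockdir=$(mktemp -d)/job.lock   # demo path; a real script would use a fixed one

acquire_lock() { mkdir -- "$1" 2>/dev/null; }

(
    acquire_lock "$lockdir" || { echo "already running" >&2; exit 1; }
    trap 'rmdir -- "$lockdir"' EXIT   # release only a lock we actually hold
    echo "working under the lock"
    # A concurrent instance would be refused at this point:
    acquire_lock "$lockdir" || echo "second acquire refused"
)
# The subshell's EXIT trap has removed the lock directory again.
```

A script killed with SIGKILL cannot run its trap, so a stale lock directory is still possible with this method.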

~~~
posix_me_less
Existence of a file is an unreliable indicator of script instance running.
Much more reliable is to search the script name or other characteristic in the
list of running processes. To use this, the script has to have unique name
though.

------
ape4
Hey I just learned about `mountpoint`. Great. I used to grep the output of
`df`. No more.

------
aey
Why wouldn’t you use make? Side-effect tracking and cleanup is the job of a
build tool.

~~~
andreareina
How do you get make to track something other than the existence and age of a
file?

~~~
aey
If it’s IO, then the inputs and outputs can be tracked as files. If it’s not
IO, and just a pure computation it can be reproduced on demand.

------
michaelmcmillan
Let's try to avoid using a hammer on anything that resembles a nail:
[https://github.com/valvesoftware/steam-for-linux/issues/3671](https://github.com/valvesoftware/steam-for-linux/issues/3671)

~~~
dymk
How is that related to the article aside from "both involve bash"?

~~~
pmarreck
It's because rm -rf "$STEAMROOT/"* will evaluate as rm -rf "/"* if $STEAMROOT
for some reason is blank or unset, which is what happened here. (As someone in
that discussion mentioned, a little more Bash knowledge might have suggested
using rm -rf "${STEAMROOT:?}/"* instead, to force it to (at least) error if it
is empty or unset.)
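
The effect of `${VAR:?}` can be seen safely by letting `echo` stand in for `rm`. In this sketch nothing is deleted, the glob is quoted so it is printed rather than expanded, and STEAMROOT is deliberately left unset.

```shell
#!/usr/bin/env bash
# Compare plain expansion with ${VAR:?} for an unset variable.
# echo stands in for rm, so nothing is deleted.
unset STEAMROOT

plain=$(echo rm -rf "$STEAMROOT/"'*')
guarded=$( { echo rm -rf "${STEAMROOT:?}/"'*'; } 2>/dev/null ) || guarded_status=$?

echo "plain:   $plain"
echo "guarded: '${guarded}' (exit status ${guarded_status:-0})"
```

With the plain form the command line silently becomes `rm -rf /*`; with `${STEAMROOT:?}` the shell aborts before the command is ever run.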

~~~
dymk
So literally the only thing in common with the article is that they involve
bash

~~~
michaelmcmillan
I can't help you if you don't see the connection.

------
unixhero
Great article. Very helpful.

------
NelsonMinar
I got hung up on the claim "Touch is by default idempotent. ... A second call
won’t have any effects". The whole point of touch is to have an effect every
single time you run it. It updates the file atime and mtime. That may seem
harmless in your application, but it's definitely not no effect. Also
promiscuous touching is the source of a bunch of bogus last modified dates in
source code bundles.

~~~
riazrizvi
I've heard Idempotency confused with Consistency. Idempotency is where a
function f can be applied (as a composition) multiple times and it gives the
same result, so Idempotency: _f(f(x)) = f(x)_. Whereas Consistency: _f(x) =
f(x) = f(x)_. An example of an idempotent function f is RaiseToThePowerZero(x)
on x > 0. We usually take consistency for granted, so it's perhaps better to
think of an example of a non-consistent function f which would be
RandomNumberGenerator().
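
The f(f(x)) = f(x) property can be tried out with shell filters. The two functions below are made-up examples, one idempotent and one not.

```shell
#!/usr/bin/env bash
# f is idempotent when f(f(x)) = f(x). Two illustrative filters:
dedupe() { sort -u; }                    # sorting sorted, unique input changes nothing
add_header() { echo "# header"; cat; }   # every pass adds another line

x=$'b\na\nb'
once=$(printf '%s\n' "$x" | dedupe)
twice=$(printf '%s\n' "$x" | dedupe | dedupe)
[ "$once" = "$twice" ] && echo "dedupe: f(f(x)) = f(x)"

h1=$(printf '%s\n' "$x" | add_header)
h2=$(printf '%s\n' "$x" | add_header | add_header)
[ "$h1" = "$h2" ] || echo "add_header: f(f(x)) != f(x)"
```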

~~~
mehrdadn
> as a convolution

Sorry, what does idempotency have to do with convolution?

~~~
riazrizvi
Should be composition, not convolution, my brain fart.

------
canadev
Personally, I feel like rm -f is dangerous, and I would never recommend using
it.

------
nurettin
To make your bash scripts truly idempotent, go to root and turn your system
into a git repo. You're welcome.

------
argd678
The problem with the idempotent trend in configuration management is that it’s
all based on not tracking or knowing what the current state of something is.
So reasoning about how these systems work is fundamentally impossible. It
would be better to focus on systems whereby we can always know the state and
improve the tooling there.

~~~
LfLxfxxLxfxx
On the contrary. With idempotence you know the state after an action. rm -f
will always delete the file (if possible). ln -sfn will always work, even for
a directory.

With the default behaviour of rm, ln -s, etc you know neither the state
before, nor after.

~~~
posix_me_less
The option -f is making the script more heavy handed and less likely to fail.
But this is not necessary for the script to be idempotent and sometimes is not
desirable. You can deal with errors in a safer way and still be
idempotent.
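
As one possible reading of that, a removal can treat "already gone" as success without passing -f, so that other failures still surface. This is a sketch with a hypothetical helper name, `remove_file`, and a made-up file name.

```shell
#!/usr/bin/env bash
# remove_file PATH: succeed if PATH is already gone, but let any real rm
# failure (for example, PATH is a directory) surface. Hypothetical helper.
remove_file() {
    [ -e "$1" ] || [ -L "$1" ] || return 0   # already absent: nothing to do
    rm -- "$1"                               # no -f: real failures are reported
}

d=$(mktemp -d)
f="$d/stale.pid"
echo 1234 > "$f"
remove_file "$f" && echo "removed"
remove_file "$f" && echo "already absent, still success"
```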

