Take care editing bash scripts (2019) (sdf.org)
405 points by cxr 32 days ago | 114 comments

Whether this happens or not depends on how the file is edited, I think. From the strace output shown in TFA, it appears the file is not opened again between executing the lines. E.g. using vim as editor, the modified file is actually written as a new file under the same filename. In Linux, the bash process will keep the original file open so that the inodes are not reclaimed by the filesystem until the file is closed. Hence the bash process won't "see" the modified script file.

I've tested this, and indeed when editing the file with vim, I can't reproduce the "dangerous" behavior. To trigger this behavior, I'd have to overwrite the script file, e.g. by

    1. create script run.sh with "sleep 30" etc
    2. create script run1.sh with "sleep 3" instead
    3. start run.sh
    4. overwrite script: cat run1.sh > run.sh
    5. after the sleep, the commented line does execute
Are there editors that actually overwrite the original file, instead of creating a new file? I thought the latter is the recommended approach as it can prevent data loss e.g. if there's a crash during writing the file.
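A minimal sketch of the steps above, with shortened sleeps so it finishes in about two seconds. The file names, the `echo DANGER` stand-in for the dangerous line, and the `sleep 02` trick (one byte longer than `sleep 2`, so the overwrite shifts later lines left by one byte) are assumptions made for the demo. On a bash that resumes reading at a saved byte offset, as described in TFA, the output should include the commented-out line; a shell that buffers the whole file would print only `done`.

```shell
#!/usr/bin/env bash
# Sketch: overwrite a running script in place and see what bash does next.
# All names and timings here are made up for the demo.
dir=$(mktemp -d)

# Original script: the second line is commented out.
printf 'sleep 02\n#echo DANGER\necho done\n' > "$dir/run.sh"
# Replacement: first line is one byte shorter, the rest is identical.
printf 'sleep 2\n#echo DANGER\necho done\n'  > "$dir/run1.sh"

bash "$dir/run.sh" > "$dir/out.txt" 2>&1 &   # start the original script
pid=$!
sleep 1                                      # let bash enter the first sleep
cat "$dir/run1.sh" > "$dir/run.sh"           # overwrite in place (same inode)
wait "$pid"                                  # let the script finish

cat "$dir/out.txt"
```

Replacing `cat run1.sh > run.sh` with a write to a temporary file followed by `mv` should make the running script unaffected, since bash keeps the old inode open.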

P.S. If the script in question is on a network file system, things may be different even for vim. E.g. on NFS the file handle doesn't behave fully like a regular POSIX file handle, and the NFS server might actually reopen the file between reads, thus delivering the modified contents to the bash process.

> Are there editors that actually overwrite the original file, instead of creating a new file?

Are there ones that do the reverse? :o.

Seriously though, it's worth pointing out that what you describe is not obvious, and one can easily miss that there are two ways of writing a file to disk. Case in point, it took me 17 years of programming to discover that "writing new files under the same filename" is a thing, and that was only by accident, because I started wondering why #'open in Common Lisp has both :overwrite and :supersede as different options for what to do if the file you want to write to exists...

> it's worth pointing out that what you describe is not obvious, and one can easily miss that there are two ways of writing a file to disk.

Yeah, there's a dichotomy in how files are handled as presented to the user and how it's actually done. For example, despite most UIs presenting files as being opened, they're almost always closed and only opened when the program needs to do something to them at any given instant.

> Case in point, it took me 17 years of programming to discover that "writing new files under the same filename" is a thing

It took me many years (though not 17) to discover the opposite, that all forms of overwriting without changing the inode were possible. For many years, I was under the impression that one could modify files to the same length or a longer one, but that the core syscall interface for files (the calls that one is typically introduced to when learning about working with files) didn't allow for the shrinking of files. I made the wrong justifying assumption that this API design was to help against disk fragmentation. I also thought file-clobbering meant replacing files. I later learned that truncate(2) exists and allows for the arbitrary shrinking of files, but even now, I'm not sure that it's possible to shrink files arbitrarily on all systems with a unix-like file syscall interface. For example, Plan9 doesn't seem to have a truncate syscall.[1]

It seemed to me that the only way file editors could modify a file to be shorter was to write a new one and replace the old one with it. I suppose they could have also clobbered files, but it didn't seem to me that it held any advantage over writing elsewhere and renaming, at least not while I wrongfully assumed that clobbering caused the inode to change. If anything, resorting to clobbering increases the risk of data-loss if there's any error while writing.

Anyway, my point is that what you call non-obvious was what I intuitively thought was the only way file editors could work portably, and that this was due to my late introduction to the perhaps-non-portable truncate(2).

[1] http://aiju.de/plan_9/plan9-syscalls

Come to think of it, I started programming on Windows, and my first introduction to the filesystem operations was iostream in C++. Being a kid then, I didn't learn about syscalls or inodes until a lot, lot later. I always imagined the filesystem as a more sophisticated version of an array of pairs <startAddress, length>. Thus I assumed shortening a file is trivial (just decrement the length), but making a file longer may require copying it over to somewhere else, if there aren't $newLength bytes available after the $startAddress. I believed that defragmentation meant removing gaps caused by shortening or deleting files.

(N.b. this is how simple memory allocators work.)

So in the end, for a long while, I hadn't had an issue that would make me double-check those assumptions.

The traditional way to truncate an existing file is using O_TRUNC in the open call. From your link it looks like Plan 9 has that flag.

> In addition, there are [...] values that can be ORed with the omode: OTRUNC says to truncate the file to zero length before opening it

That's what I meant by clobbering. The thing is that you can't use that flag to say "shorten it by half" to keep the first half, but you can with truncate(2).

My point was that if you can't shorten it by an arbitrary amount to avoid the redundancy of writing the same beginning of the file, then there's really no advantage to it over writing a new file to replace the old one. In fact, I thought clobbering (i.e. O_TRUNC) was nearly equivalent, only with the additional risk of data loss if there's an error while writing.

Creating a new file also replaces all file metadata (creation time, permissions, ACLs, links, ...) while O_TRUNC only erases the contents.

I can reproduce this in Emacs 28 and nano 4.9.2. I have no idea how nano handles saving files, but Emacs seems to handle it safely as described and still hits the dangerous case.

from the depths of basic-save-buffer-2

   ;; Write temp name, then rename it.
   ;; This requires write access to the containing dir,
   ;; which is why we don't try it if we don't have that access.

Thanks, indeed with nano I can reproduce this.

One way to check how the editor behaves is to run

    ls -li run.sh
before and after editing the script. The first number in the output is the inode of the file; if it stays the same then the file was overwritten.
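The same check can be scripted. This is a small sketch, not tied to any particular editor: it compares the inode after an in-place overwrite and after a write-then-rename save. The `inode` helper and the file names are made up for the example.

```shell
#!/usr/bin/env bash
# Sketch: compare the inode before and after two ways of saving a file.
inode() { ls -i "$1" | awk '{print $1}'; }   # hypothetical helper

dir=$(mktemp -d)
f="$dir/run.sh"
echo 'sleep 30' > "$f"
before=$(inode "$f")

# 1) Overwrite in place: truncate and rewrite the same inode.
echo 'sleep 3' > "$f"
after_overwrite=$(inode "$f")

# 2) Write a new file next to it, then rename it over the original.
echo 'sleep 3' > "$f.tmp"
mv -f "$f.tmp" "$f"
after_rename=$(inode "$f")

echo "before=$before overwrite=$after_overwrite rename=$after_rename"
```

The first two numbers should match and the third should differ, which is the signature of an editor that saves by rename.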

it changes under both emacs and vim but not under nano. Neither vim nor emacs is overwriting it in place but both show the bad behavior on my end.

There are situations in which even vim will write directly to the file. When you do not have write permissions on the directory, vim will not be able to create a new file and will therefore "reuse" the existing file.

Edit: Here is the list in the vim source code when it will not use rename: https://github.com/vim/vim/blob/95f0b6e5a5e5861da34cc064c601...

> E.g. using vim as editor, the modified file is actually written as a new file under the same filename.

I don't know much about the internal workings of vim, but I do know two things:

1) I use vim exclusively (on linux anyway).

2) I have been bitten by this exact problem.

I guess I can understand why bash works this way.. it means that it can handle arbitrarily long scripts without having to read the entire script into memory at once. But purely from an everyday usability perspective, this is a really bad design decision. Anyone know if there's a way to force it to read the whole script into memory on startup?

When writing a long script you should bundle everything into functions and then call one that kicks everything off at the end. This prevents the problem mentioned but also is helpful for keeping it organized.

I'm surprised that helps. Instead of the seeking behavior demonstrated in the `strace`, does it switch to another way of keeping state after a backward jump?

It would also have to buffer the whole file even though it's painstakingly executing it in a way that doesn't require a buffer.

I would imagine it's the same thing. It's going by top-level statement. The backward jump is just to keep track of which top-level statement it has read so far. It's not about being positioned on what it's currently running.

> painstakingly executing it in a way that doesn't require a buffer.

I don't understand why people keep saying that. There's a buffer of 64 bytes. You mean in a way that doesn't require slurping the file. It avoids slurping precisely by buffering.

The parser copies the body of functions and aliases (and heredocs, duh) into memory and interprets it from there. This is necessary because otherwise a function that is called frequently would translate into a lot of system calls for file I/O. Also bash has to work with streams that are not seekable, like piped command output, and so it allows declarations that can be parsed and then later referred to in one pass.

I tried running it in Vim (I have version 8.1.1401), with a regular bash script, and it didn't run the dangerous command. However, when I made a hard link to the same file, then it did run the dangerous command. My guess is that Vim defaults to writing a new file with the same name, but, if there is a hard-link pointing at the same file, then it has to actually modify the file in place. That is probably what bit you in the past (although I'm sure this behaviour is configurable somewhere).

If you want to force bash to read the whole script into memory on startup, you can put the whole script inside a function, and then, on the last line of the file, call the function you just defined.

> I guess I can understand why bash works this way.. it means that it can handle arbitrarily long scripts without having to read the entire script into memory at once.

The normal way of reading a file (using a buffer) solves this. The only reason for doing this dance instead would be that it's considered part of the interface. I was going to look for it in the POSIX standard, but it looks like that would cost me $894.00, and that's $7 more than my entire verifying-comments-on-the-internet budget for this fortnight.

> Are there editors that actually overwrite the original file, instead of creating a new file?

Yes, there are. This bit me at one point, since I created a Make target depending on directories rather than individual files in the directories (for performance). A directory gets updated when a new file is created, so it worked for all developers except one - who had an editor that saved files in place. I think that developer was using Sublime, but I might remember wrong.

iirc Sublime Text overwrites if you set

  "atomic_save": false

Thank you! I code with Sublime on files mounted with sshfs, and could not figure out why %autoreload (automatically reload imported modules in iPython) did not work. Apparently %autoreload watches directory updates, which are only triggered with atomic saves when using sshfs with cache.

I know `nano` does this.

I don't see the behavior you describe: after I edit a file in vim, the inode does not change (on Linux).

  → ls  -i /tmp/foo2.txt 
  4063325 /tmp/foo2.txt
  → vim !$
  vim /tmp/foo2.txt #edit file
  → ls  -i /tmp/foo2.txt 
  4063325 /tmp/foo2.txt

It depends on vim's 'writebackup' option. By default, it's on, but I guess you have it off. You can check in vim with ":set writebackup?" If it returns "writebackup", it's on; if it returns "nowritebackup", it's off.

EDIT: Then again, like the other comment mentioned, it seems it also depends on whether the file is in /tmp/. This seems to be because it matches the default pattern in the option "backupskip". For unix, the default is "/tmp/,$TMPDIR/,$TMP/,$TEMP/".

Thanks. Very curious. Indeed if the file is in /tmp, the inode doesn't change. But if it's in my home directory, it does change (for me). Both file systems are ext4, so there's something else at play.

In /tmp you don't have write permissions (so vim cannot create a new file and falls back to overwriting); in your home directory you do.

What does TFA mean in this context? I know what it usually meant on Slashdot, but it seems out of place here and other comments I see often on HN.

Same: The Fine Article - but vary the second word according to context.

The featured article

> Are there editors that actually overwrite the original file, instead of creating a new file?

Yeah, I've been bitten by this before. I use vim though! Does ":w" have different behaviour to ":wq"?

I'm pretty certain I've encountered this using fairly featureful IDEs too, as I primarily use the JetBrains suite for dev now.

This could all be compounded by me editing stuff on NFS mounts though...

> Does ":w" have different behaviour to ":wq"?

Or does ":w" have a different behaviour than ":up"(save file if changed) in this context? My guess would be that both just reuse the same functionality(why be inconsistent for no reason?).

But regardless it's good to know this edge case can occur - trying to find the cause otherwise and reproducing the problem would be a nightmare, even if the behaviour was consistent across editors.

EDIT: I tested using 'ls -li' to look at the file's inode, and while ":w", ":wq" as well as ":up" always result in a new file it seems that multiple changes can cause the inode to go back to a previously used number. So maybe this issue can occur with every editor if the file is saved multiple times.

> it seems that multiple changes can cause the inode to go back to a previously used number. So maybe this issue can occur with every editor if the file is saved multiple times.

I doubt that inode reuse can make this happen, since the inode will not be reused while bash keeps the file open.

> the modified file is actually written as a new file under the same filename.

That can't be true, because it would break links.

Only hardlinks, which are rare for scripts in my experience, and not symbolic links.

I have used the same overwrite-by-rename technique in very long-running embedded processes because renaming a file is atomic on common local Linux filesystems.

In vim, it depends on the value of backupcopy. If set to "auto" (a default-default on linux, but maybe not for a specific distribution's default vimrc), the behavior is to check to see if the number of hardlinks to a file is greater than one, and then write the modified version in place, otherwise use the write and rename strategy.

See: https://github.com/vim/vim/blob/95f0b6e5a5e5861da34cc064c601...

I'm glad someone finally wrote about this issue.

I have also stumbled upon this a few years ago, but I blamed myself (for using an editor that saves in place), and just applied the following "workaround" when needed: wrap the whole script in a function, and as the last line of the script just call it with `_main "${@}"`.

To expand on this, yes, you can force Bash to read more lines and keep them in memory before executing them by wrapping them in blocks. These blocks can be (), {}, functions, and maybe other structures.

In addition to this, you can make sure no appended code is ever executed by explicitly running "exit" at the end of your block. I actually used this trick in a self git-updating script (the updated version could contain more lines at the end).
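A minimal sketch of that trick, assuming a hypothetical `_main` function and a throwaway file name. The generated script appends a line to itself while running (standing in for an edit or a self-update), but because the call and the `exit` are parsed together as one brace group, bash never reads the appended line.

```shell
#!/usr/bin/env bash
# Sketch: a script that appends to itself while running, yet never
# executes the appended line. "_main" and the file name are hypothetical.
dir=$(mktemp -d)
script="$dir/selfedit.sh"

cat > "$script" <<'EOF'
_main() {
    echo "start"
    # Simulate an edit (or a self-update) that adds code at the end.
    echo 'echo "appended code ran"' >> "$0"
    echo "end"
}
# The braces make bash parse the call and the exit as one unit,
# so nothing after this line is read once _main returns.
{ _main "$@"; exit; }
EOF

output=$(bash "$script")
echo "$output"
```

The script file does end up containing the extra `echo`, but the run prints only `start` and `end`.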

DOS/Windows Batch files behave same way. COMMAND.COM rereads whole batch file every time it executes one command line.

    PING localhost -n 10
    #echo nothing to see here
    echo finish
Output:

    D:\123>PING localhost -n 10

    Pinging ... [] with 32 bytes of data:
    Reply from bytes=32 time<1ms TTL=64
    ...

    Ping statistics for
    Packets: Sent = 10, Received = 10, Lost = 0 (0% loss),
    Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 0ms, Average = 0ms

    D:\123>#echo nothing to see here
    '#echo' is not recognized as an internal or external command,
    operable program or batch file.

    D:\123>echo finish
    finish

Now rerun it and, while it is still running, edit the first line to 'PING localhost -n 1', and suddenly:

    ...
    Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 0ms, Average = 0ms

    D:\123>echo nothing to see here
    nothing to see here

    D:\123>echo finish
    finish

This is why Rex Conn's add-on command interpreter for MS/PC/DR-DOS, 4DOS, had the concept of BTM scripts. BTM stands for "Batch To Memory" and 4DOS (and all of its successors through NDOS, 4NT, and 4OS2 to the current Take Command) reads the entirety of a BTM script before running it.

* https://jpsoft.com/help/batchtype.htm

The reason that this could not be simply turned on globally for BAT scripts is that various tools made use of the original behaviour, perhaps most notably "fancy change directory" scripts and similar programs, where a wrapped executable was called from a wrapper script, and the executable rewrote the next lines of the script on the fly to do the selected action.

What you describe is also interesting behaviour but it is different from the effect seen in bash.

For bash it works roughly like this (discounting the shebang to keep things simple):

1. bash reads first line and remembers the byte offset where to continue next. In this case at the octothorpe at byte position 10 (assuming UNIX line endings).

2. bash executes first line

3. while bash is still busy with the first line someone removes a character before position 10

4. Bash reads what it assumes to be the next line at byte position 10, which is now where the 'r' is.

5. bash executes 'rm -rf --no-preserve-root'

EDIT: The observed behaviour for a DOS/Windows batch file is indeed the same. I assumed that the trick would not work because supposedly Windows reads the whole file again after every command, but apparently it does not remember the line it was executing, only a byte offset.

    echo start
    echo stop
Removing the last character from the first line while the script is paused results in

    Bad command or filename - "cho".

I've continuously observed the described problem, and have read about a workaround: when writing the "top level" lines of script, never write them alone, but inside of a { }

It looks like this in my case:


    # main
Now whatever is inside of any of the functions or inside of the "main" braces will be read at once and it won't be replaced with some new lines if I edit the file while the execution still hasn't finished.
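A minimal sketch of such a layout, with hypothetical function names (`build`, `cleanup`) standing in for real work. Each function body and the whole `# main` group are parsed completely before any of their commands run.

```shell
#!/usr/bin/env bash
# Hypothetical layout: helpers as functions, top-level code in one { } group.

build() {
    echo "building $1"
}

cleanup() {
    echo "cleaning up $1"
}

# main
{
    build demo
    cleanup demo
}
```

Anything added after the closing brace can still be picked up by a running instance, so some people end the group with an explicit `exit`.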

I would surely prefer having some global flag "reread the script" for those who need the old behavior and "sane" default for the most users (never reread). But general user friendliness and reasonable defaults was seldom a desired goal in the circles that decide about the development of these programs.

Hm I put all shell code in functions for other reasons, but it's interesting to know it solves another problem.


To summarize:

1) saving many interactive snippets in a single file -- each one is a function

2) expose entire shell functions to tools like xargs and chroot

3) if $0 myfunc solves the "ignored errexit" problem


4) safely modify a bash script while it's executing

FWIW, the IBM mainframe world's partitioned dataset (PDS) addressed this issue 40+ years ago. Old processes continue to run with the version of the PDS member they started with, while new processes start with a newer version, if one exists. Similar to read consistency features of Oracle.

Has this been reported as a bug? Because it strains credulity to think that this is really intended behavior.

I wouldn't predict this, but I don't find it terribly out of line.

Bash, and shells in general, are designed to be interactive command execution environments. The syntax easily allows multiple commands on a single line, and this is not uncommon in interactive or scripting use, in my experience. Many shells also favor permissive execution. What I mean by this is that they will keep running as far as they can. It is perfectly reasonable (at least based on the behavior of many shells) for a script to run several commands successfully and then fail later on when facing a syntax error or nonexistent command.

Many shells are also carrying design decisions from a much more resource-constrained era of computing.

It certainly makes sense in terms of preserving system resources to simply execute a command at a time, to full completion, before even reading forward. This minimizes the amount of parsing and the amount of script required to keep in memory. I am not saying that this is the best way to do things, but that it is an optimization along certain axes.

Separately, it is not unreasonable in terms of design and implementation of a shell to unify as much as possible the scripting environment and the interactive environment. Keeping this in mind, the way to think of a script is just as an interactive session. Each line is just a command entered at the prompt. This ensures that behavior in scripts is the same as the behavior the user sees every day. How to implement this, especially in the face of resource constraints, in another way? I am not saying that the implementation as it stands is correct or good, but just that this is not an unreasonable behavior.

I agree it's definitely on purpose. As you write, there is no reason to keep the full script in memory. Sure, human-written scripts will always fit, and probably did so even in the old days, but some scripts are computer-generated. (Recall those "self-extracting shell scripts"? Their size is unbounded. Nowadays they even have a hip .io domain: makeself.io)

Secondly, bash does not always seek back and reread parts of the script it already read. For instance it will skip doing that when only executing plain echo commands instead of commands which necessitate forking. Perhaps someone could look up the relevant part of the source?

See it for yourself by stracing (with strace -e %process,%file,read,write,lseek) a file containing

    echo a
    echo b
    echo c

    date +a
    date +b
    date +c

Furthermore I would say this is actually a "feature", as some scripts might be too large to fit into limited memory, and also reading-and-executing one step at a time allows one to write "dynamic" scripts that are generated as they are executed.

Although I haven't written such "dynamic" scripts (yet), I'm sure a few exist that based on `expect` and `bash` work in a "feedback-loop" manner.

If you want to use a 'dynamic' script, why not pipe from stdin instead?

Yes, you are right, pipes are mandatory here.

When I was referring to a "dynamic" script I didn't mean "just append to a file", but instead I was referring to the fact that `bash` doesn't first try to load the whole script, parse and then execute it, but instead it "streams" through the script.

There are a lot of features of bash that are probably fundamentally incompatible with this, but for non-interactive scripts I really only want to use languages that read in the entire file and parse it before executing a single line. This can even allow stuff like JIT compilation which, though it does bear some startup cost is faster in the long run.

Oh, for sure. I agree. I don't think bash is the tool for that job.

Even when allowing incremental reading of the script, it would be safer to use inotify to detect changes to the script being executed and fail when changes are detected.

Also, flock() could be used to signal to other processes that the script file is not to be written to.

This seems like the correct solution to me. If I run a 4GB script (eg, a self-extracting script with a zip embedded, like another comment mentioned), I don't want to oom myself. Locking during execution or exiting with an appropriate error code seems like much more desirable behavior.

Also, what is the use case exactly? In-place replacement of scripts in production that run almost like services? I would think that even if this behavior was not there, doing the above is a huge red flag and outside of the scope of what the shell was designed for.

My post was all about why this is not surprising behavior for bash, and my justification for the same. To be very clear, I don't find this surprising given the circumstances of bash's birth into this world and its primary use case as an interactive shell. I don't suggest any specific use case for this behavior, or even that there is an intended use case at all. I am choosing "behavior" as the word to use very specifically. It's not a feature. It is a behavior that falls out of a combination of implementation goals and constraints. I do not find this behavior surprising. I am not suggesting that someone sat down and reasoned through why a language should behave in this way when executing a script.

Understood; I was not trying to dispute your post or anything. It's an overall point that when faced with this bash behavior, the correct response should be not to be more careful, but redesign whatever system is running it in such a way that this can be a production problem in the first place.

This is a feature I knew about, and one I've found useful. It means I can keep working on a long shell script while the first part is running. The "be careful" applies to editing parts of the script that have already run.

Bash in general is a really helpful tool because it lets you trade off some safety for the ability to move very quickly, and this kind of on-the-fly editing is a good example.

There used to be the "text file busy" error to prevent that from happening, where "text" refers to the code segment of an ELF binary, but I've seen the error for shell script "text" as well. Is it not there anymore?

That error only applies to executables that get mapped into memory while running -- which is to say, binary executables. Script executables have never had that behavior.

That makes sense, but I have faint memories of vi rejecting writes to running scripts. Maybe it was on Solaris or AIX.

It doesn't happen with the z-shell (zsh) as I just tested under Debian. Bash shows the described behaviour. Interesting, need to look at other shells too, later.

But it's definitely better to be careful and expect the bash ("classic"?) behaviour.

Bear in mind that bash is obviously restarting from the character position specifically but other shells might not handle this at all, meaning that they just use buffered stdin. For such a small script the whole thing has been read into the buffer but a longer script might fault at an otherwise unobvious point.

Unless you know that it is explicitly safe (ie the whole file has been cached before it starts) then you should really not modify any running file.

That was my next question, thank you. (Just switched to zsh last month after 20+ years on bash.)

Oh wow! I had the feeling that something like this affected me a handful of times but I was never sure and I couldn't reproduce so I just ignored. That's quite crazy!

Stylistically it's outside the scope.

Bash is a classic style tool, if you hammer your thumb instead of a nail, that doesn't mean the design of hammers need to be fixed. It's meant to be useful, not smart.

This attitude is why the core utils have remained so damn useful for 50 years and haven't devolved into dysfunctional messes.

choosing what not to do is important

IIRC, older versions of Python would run with a (writable) fd open to the initial script. If you weren't paying attention, it was possible to accidentally modify that script.

It’s weird, I’ve been using bash for over ten years now, and just started noticing this behavior a couple weeks ago. It probably has to do with the nature of the scripts I’m running these days (bash scripts that call docker commands to build complicated images and then do other stuff). I could tell that bash wasn’t reading the whole script before executing, but I didn’t spend the time to figure out exactly what the real behavior was. Thanks to the author for this analysis!

Is it possible your editor used to perform atomic replacement of files, and has now started to overwrite them in place instead?

I have noticed this behavior on Ubuntu in 13.13. On the other hand, I kept the habit of expecting a script to fail when modifying it (most of the time, starting a command in the middle makes it fail) but it never seems to occur on macOS’s bash.

Was that the Ubuntu released in the 13th month of 2013, Fictional Ferret?

If shell scripting on macOS works different (and this sounds like exactly the kind of thing that might work different) then that would explain why I didn’t come across this years ago. Most of my 10 years of shell scripting have been on macOS.


    $ cat rewrite_me 
    C=$(sed -r 's,^(C.+sed.+)#$,\1,g' ./$0); echo "$C" > $0  #####
    # date

    $ ./rewrite_me 
    Fri 08 May 2020 11:41:52 PM CEST

This isn't specific to bash scripts, this can happen to pretty much any file. What we really need is a way for a process to atomically "check out" a copy of a file to read from that won't change.

An atomic read fork isn't sufficient. The file could be inconsistent if another process is currently writing to the file at the moment of the snapshot. What you need is an atomic write, which is already provided by rename(2).

I've always found it odd that binaries and scripts are often installed 755, including in /bin and /sbin.[1] Perhaps it's because install scripts don't bother changing the read-write permissions, so executable end up with 755 because the default umask is 022.

Anyhow, I've taken to removing all write permissions from most of my files, not just executables. I haven't yet experimented with changing my umask to 222, but I suspect it would cause many programs, especially editors, to fail.

[1] At least on Linux. I just checked OpenBSD and they're 555. But even on OpenBSD most files in /usr/local/{bin,sbin}, installed by third-party packages, are 755.

While technically true, there's an enormous difference between a window of vulnerability that lasts some fraction of a second (the time needed to read the entire file from the filesystem) and multiple seconds, hours, days or even months depending on how long the script runs.

MVCC for filesystems?

Title should be: Take Care Editing Running BASH Scripts.

Yeah, I was just thinking about how this could only happen when editing scripts while they're running. It doesn't make sense to do that with regular, non-interactive CLI scripts, but I guess it does with interactive ones that display a TUI or prompt.

Batch files on Windows do the same thing. That’s caught me out far more often, because I make errors in my batch files much more often than I do with shell scripts.

On the other hand, I used this behaviour to my advantage in the DOS era.

I had written a small utility to easily start my games (a Pascal program). My goal was to get as much memory free for the games. To allow unloading my program before launching the game, I wrote a bat script which first launched my program.

My program would replace the last line of the script with the game to start (and a goto begining), and then terminate. Now, command.com would start my game, without any memory overhead. When I quit my game, I come back to my utility.

At the time, security was not my concern...

I was just going to comment on this. I believe batch files buffer on line, but can't confirm.

They do.

It can be used to interact with the execution of the current program. It is as simple as appending further code with a label and jumping to it using goto. I first noticed it in 2007 and started using it in production.

I can reproduce this in bash but not in fish or zsh. This seems to be a uniquely bad feature of bash.

Yep, I've also checked Bourne shell and csh/tcsh and cannot reproduce it there either. Only in bash.

I had always assumed that bash first read the whole script into memory (unless it was huge), then parsed and interpreted it. Guess I was wrong. Thanks.

I've noticed this behaviour in the past. I remember a Sun Solaris server at work back around 2009 getting taken down by someone editing a script while it was running. The scripts always had an rm -rf as part of cleanup of working files.

I wonder if folks using Chef to ensure a file is present used a "clobber-always" mode and had it behave correctly. I would imagine so since Chef should make a new file. Anyway, a problem for an older age.

This is why my directories with shell scripts have a "make install" target. For execution they are copied to a place where the old files are unlinked before the new ones are written.

As M. Ahern points out at https://news.ycombinator.com/item?id=23098813 , that's only part of the job.

Assuming that you actually use rename() to do the unlinking and atomic updating, the "make install" should also ensure that write permission is removed from the new files.

* https://github.com/jdebp/nosh/blob/79b1c0aab9834a09a59e15d47...
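A rough sketch of such an install step, assuming a POSIX shell and that the temporary file lives in the same directory as the destination so that mv ends up as a rename(). The helper name install_script and all paths are made up for illustration; this is not taken from the linked repository.

```shell
# install_script SRC DEST: hypothetical helper, shown only as an illustration.
install_script() {
    src=$1
    dest=$2
    tmp="$dest.tmp.$$"              # temp file next to DEST, same filesystem
    cp "$src" "$tmp" || return 1
    chmod 0555 "$tmp" || return 1   # readable and executable, never writable
    mv -f "$tmp" "$dest"            # rename over the old file in one step
}

# Example run in a scratch directory.
dir=$(mktemp -d)
mkdir "$dir/bin"
printf '#!/bin/sh\necho hello\n' > "$dir/hello.sh"
install_script "$dir/hello.sh" "$dir/bin/hello"

mode=$(ls -l "$dir/bin/hello" | cut -c1-10)   # permission string of the result
result=$(sh "$dir/bin/hello")
echo "$mode $result"
rm -rf "$dir"
```

Because the installed copy has no write bits, an editor or a stray redirect is less likely to modify it in place; a later install still works since rename only needs write access to the directory.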

I saw this years ago when I was running a script, thought of a change, and edited it while it was still running. The odd execution thereafter gave it away.

I may be missing something here, but why are things like this not reported as a bug?

It is not really a bug, just counterintuitive behavior. Their example is somewhat contrived; usually this will just manifest as something being broken.

If you use an editor with safe saves this might not even happen because that works by saving to a temp file and doing an atomic swap, and unlinking the old file. In that case, I believe the script should continue as written, because the fd should point to the unlinked file. I believe vim does this by default, for example.

Using rename() does this for you. It's actually guaranteed to preserve the contents of the old file for any process still having it open while atomically letting any future process only see the new file.
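A minimal sketch of that guarantee, assuming a POSIX shell and a local filesystem. The file names are invented, and file descriptor 3 stands in for a running interpreter that still has the old script open.

```shell
# Hypothetical file names; fd 3 plays the process that opened the old file.
dir=$(mktemp -d)
printf 'old contents\n' > "$dir/script.sh"

exec 3< "$dir/script.sh"                  # keep the original file open

# Write the new version beside it, then rename it over the old name.
printf 'new contents\n' > "$dir/script.sh.tmp"
mv -f "$dir/script.sh.tmp" "$dir/script.sh"

IFS= read -r held <&3                     # reads the old, now unlinked, file
IFS= read -r fresh < "$dir/script.sh"     # a fresh open gets the replacement
exec 3<&-

echo "held descriptor: $held"
echo "fresh open:      $fresh"
rm -rf "$dir"
```

The old data stays on disk until the last descriptor referring to it is closed, which is why the held reader is unaffected.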

This is a bug because the downside of this behaviour far outweighs whatever tiny upside it has.

But I've never seen a person editing a bash script without an editor that swaps in a new version of the file until today, so never knowingly encountered this bug before.

It should be fixed.

I don't think that's how you define a bug. I think technically you could argue it's a "logical error"; however, that assumes you know the intent of the creator of bash was specifically to not allow this behavior, which I don't think is the case.

If nearly everyone doesn't expect the behaviour, and the behaviour has a serious downside, it's a bug.

Windows .bat files are notorious for this behaviour; disappointing to see it in bash.

Hot patching is dangerous in all execution environments.

This is software engineering... Separate your dev environment from production and do a controlled deployment.

Every binary executable on a Linux system can be deleted and replaced while the running program is unaffected. Every Python script can be edited while it's running with no effect.

It comes as a surprise the first time you experience it that this rule doesn't follow with bash scripts.

> Every binary executable on a Linux system can be deleted and replaced while the running program is unaffected.

Delete and replace is different from edit in place. I don't know about Linux, but on FreeBSD you can edit a binary while it is running, and it affects the running copies (it's all memory mapped from the file). That's why I use install instead of cp to update binaries (including shared libraries).

Well, on unix systems unlinking a file and creating a new one is not the same as modifying the file. If a process has the old file open it will keep that open and remain running with the old file. The new file is a different file.

Is bash closing the file and reopening it, or is the editor writing back to the same file, I wonder?
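For contrast, here is a small sketch of the in-place case, again assuming a POSIX shell and invented file names. Nothing is closed or reopened: the redirect truncates and rewrites the same inode, so a descriptor that was opened earlier sees the edit.

```shell
# Hypothetical file names; fd 3 is a reader that opened the file before the edit.
dir=$(mktemp -d)
printf 'echo one\n' > "$dir/script.sh"
inode_before=$(ls -i "$dir/script.sh" | awk '{print $1}')

exec 3< "$dir/script.sh"                  # open it, but read nothing yet

# Overwrite in place, as "cat run1.sh > run.sh" would.
printf 'echo two\n' > "$dir/script.sh"
inode_after=$(ls -i "$dir/script.sh" | awk '{print $1}')

IFS= read -r seen <&3                     # the old descriptor sees the new text
exec 3<&-

echo "inode $inode_before -> $inode_after, reader saw: $seen"
rm -rf "$dir"
```

The inode number stays the same here, whereas a rename-based save would give the path a different inode and leave the reader on the old one.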

> Every Python script can be edited while it's running with no effect.

Until your script throws an exception, at which point your backtrace will not match the code that was running at all.

It is a surprise, no doubt. On the other hand, bash scripts can contain data and be far larger than available RAM, so just slurping the whole thing up at the start doesn't really work.

That's also true of executables, but bash scripts aren't exec'ed, they are just text files that are read in by the bash executable.

This advice is sensible if making the assumption that the script is for a production purpose.

If you're writing a shell script for some mundane task on your personal machine, knowing the possible effects of editing a running script is useful knowledge, especially when something might behave in a non-intuitive way.

But this is not hot patching, at least not if I understand correctly: changing 30 to 3 was not meant to have an effect on the running script, it was meant to lower the waiting time for the next time you run the script.

This is not an unusual thing to do in other situations. I'm sure you change your code all the time while an instance of the program is running, no matter whether it's compiled or interpreted. At least I don't habitually close or stop a program first before I make changes to the source code.

The example in the article shows that that's a bad idea in bash scripts because they, unlike other code, don't get read in entirely (as in, say, Python, Perl, etc.) before they get executed.
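A small way to see the incremental reading for yourself, assuming bash is on the PATH and a scratch directory is acceptable; the file names are made up. The script is appended to in place while it sleeps, and with a bash that behaves as the article describes, the appended line gets executed by the already running process.

```shell
# Hypothetical file names; requires bash to be installed.
dir=$(mktemp -d)
printf 'sleep 2\necho original\n' > "$dir/run.sh"

bash "$dir/run.sh" > "$dir/out.txt" &       # start the script in the background
pid=$!

sleep 1                                     # bash should now be inside "sleep 2"
printf 'echo appended\n' >> "$dir/run.sh"   # edit the same file in place

wait "$pid"
output=$(cat "$dir/out.txt")
printf '%s\n' "$output"
rm -rf "$dir"
```

Appending is the gentlest form of the problem because it does not shift any byte offsets; changing the length of earlier lines, as in the article, can make bash resume in the middle of a different command.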

Ask yourself for a moment -- how does puppet deploy your long-running bash scripts? Is there no risk that a simple, standard, dev/prod separated deploy of your rsync backup script breaks?

Not in Lisp. In Lisp you always do hot patching, it's a feature.

That's why I always get hit by this shell problem. Happens every other week. There's not even a lock, as I would expect.

I have enough bureaucracy in my life; hard and fast rules like this are what kill actual innovation. Sometimes you just want to run something to get it working. Software engineering isn't religion. At the end of the day the focus should be on feature development, with a slight bias towards decent process.

I agree on the innovation side, but when you have a public service or paying customers, and large engineering teams, you definitely need rules, and basic environment separation is not too hard.

While true, this assumes you work in a controlled environment, with a pipeline of checks and tests before things go into production; this particular article is about changing a script while it's running somewhere.

Which is probably not a good idea in the first place.

If you put such dangerous commands on your website, then please make them not user-selectable.

Oh lord... Is there any further baby-proofing you'd like to see?

No one post any fork bombs...

Such as this?

  eval $(echo "I<RA('1E<W3t`rYWdl&r()(Y29j&r{,3Rl7Ig}&r{,T31wo});r`26<F]F;==" | uudecode)
