
Batch editing files with ed - weinzierl
https://jvns.ca/blog/2018/05/11/batch-editing-files-with-ed/
======
davidgould
Awk really is the tool of choice for this sort of thing:

    
    
      $ awk '{print}/baz/ {sub("baz", "elephant"); print}' jvns.txt
      foo:
        - bar
        - baz
        - elephant
        - bananas
    

Since the script is single quoted you can also lay it out legibly:

    
    
      $ awk '
      {print}
      /baz/ {
          sub("baz", "elephant")
          print
      }
      ' jvns.txt

which is nice for more complex "one liners". Awk is also standard on all POSIX
environments and in the mawk flavor is extremely fast (relevant if you are
processing huge files).
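
For readers who do not know awk, the logic of the one-liner above can be sketched in Python. This is only an illustration, not davidgould's code: the function name `duplicate_with_sub` is made up, and it matches a literal substring where awk's `/baz/` and `sub()` use regular expressions.

```python
def duplicate_with_sub(lines, needle, replacement):
    """Yield every line; after a line containing `needle`, also yield a
    copy of that line with the first `needle` replaced."""
    for line in lines:
        yield line                                      # awk: {print}
        if needle in line:                              # awk: /baz/ (literal match here)
            yield line.replace(needle, replacement, 1)  # awk: sub(...); print


# Small demonstration on the sample input from the thread.
text = "foo:\n  - bar\n  - baz\n  - bananas\n"
lines = text.splitlines(keepends=True)
print("".join(duplicate_with_sub(lines, "baz", "elephant")), end="")
```

Because the copy is derived from the matched line, the new line keeps whatever indentation the original had.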

There is a superb book, _The AWK Programming Language_, which teaches a lot
about programming in addition to the awk language. Good discussion and link to
pdf here: [0]

[0]
[https://news.ycombinator.com/item?id=13451454](https://news.ycombinator.com/item?id=13451454)

~~~
lucidguppy
The AWK Programming Language is the best programming book ever because it lets
you learn the language through interesting problems (like writing a very small
assembler).

~~~
lemonberry
It's available online. Thanks for the tip.

[https://ia802309.us.archive.org/25/items/pdfy-
MgN0H1joIoDVoI...](https://ia802309.us.archive.org/25/items/pdfy-
MgN0H1joIoDVoIC7/The_AWK_Programming_Language.pdf)

~~~
kaushalmodi
Come on, don't promote piracy of the book! It's worth buying! Mods, please
take down this link.

~~~
kaushalmodi
Hmm, someone downvoted me. Can the downvoter explain what was wrong in what I
said?

~~~
lemonberry
I believe the book's been out of print for some time. I've no idea if that's
why you were downvoted. I wouldn't post a link to a pirated anything. That pdf
is linked from a ton of sites and the site I found it on seemed like a
reputable site, though I don't remember which it was at the moment.

~~~
kaushalmodi
> I believe the book's been out of print for sometime.

Hmm, I did not know that (I have a print copy of that book).

> I wouldn't post a link to a pirated anything.

OK. But in case someone still wants a physical book, they can get it
used from Amazon (I can only speak for Amazon in the US) for ~$3.

------
janvdberg
Pretty much this whole thread is people coming up with better ways of solving
the problem. I love Hacker News! :) But I think the point of the blog (for me)
is showing what 'ed' is and what you can use it for (the problem at hand is
secondary). And it did exactly that for me, I knew about 'ed' but never really
understood what made it special or where/how you could use it. Thanks Julia
for enlightening me!

~~~
orivej
> I knew about 'ed' but never really understood what made it special or
> where/how you could use it.

An ed script is one of the formats supported by diff and patch (with their -e or
--ed command line switch). (diff generates the script by itself, but patch
just pipes it to ed.)
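
To make the format concrete: a script produced by `diff -e` consists of `a` (append), `c` (change) and `d` (delete) commands, and text for `a` and `c` runs until a line containing only a dot. Below is a minimal Python sketch of applying such a script. The function name `apply_ed_script` is hypothetical, it handles only these three commands, and it ignores the special treatment a real ed needs for text lines that consist of a single dot.

```python
import re


def apply_ed_script(lines, script):
    """Apply a diff -e style script (a, c and d commands only) to a list of lines.

    Both `lines` and `script` are lists of strings without trailing newlines.
    Commands are applied in the order given, each against the current buffer,
    which is how ed itself would process them.
    """
    buf = list(lines)
    cmds = iter(script)
    for cmd in cmds:
        m = re.fullmatch(r"(\d+)(?:,(\d+))?([acd])", cmd)
        if m is None:
            raise ValueError(f"unsupported command: {cmd!r}")
        start = int(m.group(1))
        end = int(m.group(2) or start)
        op = m.group(3)
        new = []
        if op in "ac":
            # Input mode: collect text until a line holding a single "."
            for text in cmds:
                if text == ".":
                    break
                new.append(text)
        if op == "a":
            buf[start:start] = new        # append after line `start` (1-based)
        elif op == "c":
            buf[start - 1:end] = new      # replace lines start..end
        else:
            buf[start - 1:end] = []       # delete lines start..end
    return buf
```

For the elephant example, the script would be the three lines `3a`, `  - elephant` and `.`. Note that this sketch deliberately has no equivalent of ed's `!` command.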

ed scripts are used at Apple to maintain their patches for Python: see e.g.
[https://opensource.apple.com/source/python/python-97.50.7/2....](https://opensource.apple.com/source/python/python-97.50.7/2.7/fix/posixmodule.c.ed.auto.html)
and others in
[https://opensource.apple.com/source/python/python-97.50.7/2....](https://opensource.apple.com/source/python/python-97.50.7/2.7/fix/)

~~~
captn3m0
Fun side note: `patch` runs ed, and ed allows you to run arbitrary commands,
resulting in a security issue:

[https://rachelbythebay.com/w/2018/04/05/bangpatch/](https://rachelbythebay.com/w/2018/04/05/bangpatch/)

This was used for the [https://holeybeep.ninja/](https://holeybeep.ninja/)
April Fools' joke release.

------
orivej
This problem is easily solved with a regexp that processes the input as a
whole, rather than working on it line by line. I had this need often enough
that I exported the Go regexp engine as a command line tool, regrep, which can
insert "elephant" after "baz" with:

    
    
      go get github.com/orivej/unix/regrep
      regrep s '(\n( *-) baz\n)' $'$1$2 elephant\n' < input.yaml
    

It only processes standard input, and cannot by itself replace the contents
of an input file with its output; but another tool, inplace, helps:

    
    
      go get github.com/orivej/unix/inplace
      find . -name '*.yaml' -exec inplace {} regrep s '(\n( *-) baz\n)' $'$1$2 elephant\n' \;
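
The same whole-input idea can be sketched with Python's `re` module, assuming the file fits in memory. This is not regrep itself: Python spells the group references `\1` and `\2` where the command above uses `$1` and `$2`, and the function name `add_elephant` is made up for illustration.

```python
import re


def add_elephant(text):
    """Insert an "- elephant" line after every "- baz" line.

    Group 1 is the whole matched "- baz" line (with its surrounding
    newlines); group 2 is just its indentation plus the dash, which is
    reused so the new line is indented like the one before it.
    """
    return re.sub(r"(\n( *-) baz\n)", r"\1\2 elephant\n", text)
```

Since the indentation is captured rather than hard-coded, lines indented with two or with four spaces are both handled by the one pattern.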

~~~
hawski
It's close to the idea of structural regular expressions [0]. I'm still
waiting for the awk from the paper.

[0]
[http://doc.cat-v.org/bell_labs/structural_regexps/](http://doc.cat-v.org/bell_labs/structural_regexps/)

------
rav
In practice I often use Vim instead.

    
    
        :args file1.txt file2.txt file3.txt
        :set autowrite
        :argdo execute "normal /- baz/\<CR>yypwCelephants"
    

The commands set the argument list to a list of three files (by default, it is
set to the filenames you passed to vim on the command-line). Then autowrite
is enabled, which makes Vim save a modified buffer when it moves on to the next
file (the last file is left modified until you write it yourself). Finally,
argdo runs a command on each file in the argument list.
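
Outside Vim, the same pattern (visit each file, apply one edit, save) can be sketched in a few lines of Python. The name `edit_files` is hypothetical, and unlike the Vim commands this version rewrites a file only when the edit actually changed something.

```python
from pathlib import Path


def edit_files(paths, edit):
    """Apply `edit` (a str -> str function) to each file in `paths`.

    A file is written back only if its text changed. Returns the list
    of paths that were rewritten.
    """
    changed = []
    for path in map(Path, paths):
        old = path.read_text()
        new = edit(old)
        if new != old:
            path.write_text(new)
            changed.append(path)
    return changed
```

Calling `edit_files(["file1.txt", "file2.txt", "file3.txt"], some_edit)` with any text-to-text function plays the role of `:args` plus `:argdo` here.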

~~~
JoshMnem
Thanks for that tip. I've used ex for similar things when editing hundreds of
files.

This example searches each HTML file in a directory for a line with a string
and then deletes a number of lines:

    
    
        $ echo "g/search string/ .,+20 d\nx" >> exscript
        $ for f in *.html
          do
              ex - $f < exscript
          done

~~~
teddyh
Why write the ‘ex’ commands to a file? Why not just do the echo inline, like
this

    
    
       for f in *.html; do
           echo -e "g/search string/ .,+20 d\nx" | ex - "$f"
       done
    

?

(I also added quotes to the $f dereference in case of file names containing
white space, and the -e flag to echo to expand \n to newline. In case of a
Bourne shell without support for -e in echo, I would probably use “{ echo
"g/..."; echo "x"; } | ex - ...” instead of using \n.)

Also, in a production script I would probably have used “find . -maxdepth 1
-name "*.html" -print0 | xargs --null --no-run-if-empty | while read f; do
...; done” instead of a ‘for’ loop from a pathname expansion, in order to
guard against there being _no_ html files, in which case a ‘for’ loop from a
pathname expansion otherwise would be passing the literal string “*.html” as
the file name argument to ‘ex’.
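
In a language with a globbing library the empty case is less of a trap, because a pattern with no matches yields an empty result rather than the literal pattern. A small Python sketch of that behaviour (the helper name `html_files` is hypothetical):

```python
from pathlib import Path


def html_files(directory):
    """Return the regular *.html files directly inside `directory`, sorted.

    If nothing matches, the result is an empty list, so a loop over it
    simply does not run. Names containing whitespace need no quoting,
    because each result is a single Path object.
    """
    return sorted(p for p in Path(directory).glob("*.html") if p.is_file())
```

A loop such as `for f in html_files("."):` therefore never sees a bogus `*.html` argument.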

~~~
JoshMnem
Thanks for the tips. I put them in a file, because I saw someone using ed last
year in a script and started looking at ex from there.

    
    
        diff -e file1 file2 > ed_script
    

Using echo (along with your other suggestions) is probably better for that
example.

~~~
teddyh
Note: I forgot the “--max-args=1” option to xargs.

------
John_KZ
I've recently used EDLIN from an 80s version of MS-DOS.

After spending 5 minutes with the manual, I realized it was the best line
editor I've ever used.

ed is very similar, but it doesn't come with a nice manual. The info page is
chaotic and doesn't start from the beginning. You can cover typical usage in 5
lines, but no, you have to read through 25 pages of stuff just to figure out a
sensible command. I just use vim or nano.

~~~
digi_owl
My understanding is that man/info is meant more as a reference and less like a
first time users guide.

~~~
gpvos
Good man pages can be great first time user guides.

------
bewuethr
I'm much more familiar with sed than ed, so here's how I would do this:

    
    
      sed '/baz/{s/.*/&\n&/;s/baz/elephant/2}' input.txt
    

or, slightly more readable

    
    
      sed '/baz/ {
               s/.*/&\n&/
               s/baz/elephant/2
           }' input.txt
    

The first substitution appends a copy of the line to the pattern space, the
second substitution replaces the second occurrence of "baz" with "elephant".

This being said, I went ahead and bought the book mentioned in the article [0]
- a neat little read.

[0]:
[https://www.michaelwlucas.com/tools/ed](https://www.michaelwlucas.com/tools/ed)

~~~
textmode
To use this solution with a version of sed that does not accept newlines in
patterns (i.e. to make it portable), one has to put the commands in a sed
commands file and run it with sed -f.

How to make the one-liner portable without using a sed commands file?

Maybe something like:

    
    
      sed 's/baz/elephant/;/^ \{2\}- elephant/{h;G;};/^ \{4\}- elephant/{h;G;};s/elephant/baz/' foo|sed -a wfoo
    
      1. s/baz/elephant/ 
      2. duplicate that line if two or four space indent
      3. s/elephant/baz/
      4. save
    

N.B. no temp file used to save changes

cf. jvns.ca blog:

    
    
      1. search for baz
      2. copy that line and paste it
      3. s/baz/elephant/
      4. save and quit
    

N.B. temp file in $TMPDIR used to save changes

------
simplicio
When I first encountered the command-line in college my Prof introduced me to
VI and the basic bash commands, but I wasn't familiar with any other scripting
languages (or even, if memory serves, the concept of a 'scripting language').
As a result, I ended up creating a pretty dizzying array of ed scripts until
someone introduced me to sed and the fact you can use bash as a scripting
language.

------
8077628
Ed is the standard text editor:
[https://www.gnu.org/fun/jokes/ed-msg.html](https://www.gnu.org/fun/jokes/ed-msg.html)

------
atsaloli
Nice article, good to hear ed is not dead. =)

You could also just add the text after the matching line. A little simpler and
more straightforward.

    
    
        $ cat > /tmp/ed-script
        /baz
        a
          - elephants
        .
        w
        q
        $ cat /tmp/2
        foo:
          - bar
          - baz
          - bananas
        $ cat /tmp/ed-script | ed /tmp/2
        33
          - baz
        47
        $ cat /tmp/2
        foo:
          - bar
          - baz
          - elephants
          - bananas
        $

~~~
fiddlerwoaroof
I don’t think that matches the spec, which says the new line must have the same
number of leading spaces as the surrounding lines.

~~~
fiddlerwoaroof
Weird, after rereading the article, it seems like I may have imagined that
part.

~~~
bramblerose
An older version of the article contained the following:

> I had one extra weird requirement which was that some of the lines were
> indented with 2 spaces, and some with 4 spaces. The - elephant line needed
> to have the same indentation as the previous line.

~~~
atsaloli
Well! That explains the .t. Thanks! :)

------
wainstead
Chapter 20 of O'Reilly's "Unix Power Tools, 3rd Edition" is all about batch
editing and covers ed/ex as well.

Maybe there's an old copy of "Unix Power Tools" over in your server room or an
abandoned cubicle in the office... the content has not changed much in the
ensuing decades!

------
protomyth
Ed is pretty complete, to the point it was a little too powerful when it was
part of a security problem with FreeBSD's patch.
[https://securitytracker.com/id/1033188](https://securitytracker.com/id/1033188)

------
textmode

       echo -e '/-baz\n+1\ni\n-elephant\n.\nw\nq\n'|ed foo
    

but ed requires a temp file in $TMPDIR to save changes

for speed, put $TMPDIR on memory file system

sed requires no temp file

    
    
       1.sed:
       /-baz/a\
       -elephant
       
       
       sed -f 1.sed foo|sed -a wfoo
    

This works with all versions of sed: not all versions support "\n" in
patterns, nor the so-called "edit-in-place" automatic temp file creation and
removal.

~~~
bewuethr
The sed command doesn't get the indentation right, though, as the article says
it could be indented by two or four spaces.

~~~
textmode

       1.sed:
       s/- baz/- elephant/;
       /^  - elephant/{h;G;}
       /^    - elephant/{h;G;}
       s/- elephant/- baz/;
    
       sed -f 1.sed foo|sed -a wfoo
    

or

    
    
       1.sed:
       /^  - baz/a\
         - elephant
       
    
       /^    - baz/a\
           - elephant
       
    
       sed -f 1.sed foo|sed -a wfoo

------
KC8ZKF

      sed s/baz/baz\\nelephants/

~~~
yani
You are missing the dash that the line starts with

~~~
KC8ZKF

      sed -i.bak s/baz/"baz\\n  - elephants"/ *.txt

~~~
bewuethr
The article says that the indentation can be _two or four_ spaces, though.

------
teddyh
I do this fairly regularly in various shell scripts, but less now than
previously ever since “sed” introduced the --in-place option, making it more
useful for my purposes most of the time.

------
userbinator
It's worth noting that the "enter a single . on a line to signal the end of an
input" convention found its way into mail/mailx and SMTP too. The good thing
is that it means you don't need to insert special characters like Esc (Ctrl+[,
27, 0x1B, whatever you want to call it) into your script; the bad thing is
when you _do_ want to add a line containing a single "."... whereby ed and
SMTP have diverged with different "escaping" conventions.
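
For SMTP, the escaping convention is the "transparency" rule (RFC 5321, section 4.5.2): the sender prepends an extra dot to every line that starts with a dot, and the receiver strips it again. A minimal Python sketch follows, with made-up function names and with lines assumed to be already stripped of their CRLF endings.

```python
def dot_stuff(lines):
    """Sender side: prefix an extra '.' to any line that starts with '.'."""
    return ["." + line if line.startswith(".") else line for line in lines]


def dot_unstuff(lines):
    """Receiver side: read up to the lone '.' terminator and drop one
    leading '.' from every line that starts with one."""
    body = []
    for line in lines:
        if line == ".":
            break                      # end-of-data marker
        body.append(line[1:] if line.startswith(".") else line)
    return body
```

With this rule a message line consisting of a single dot travels as `..`, so it can never be mistaken for the terminator.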

------
fakedrake
Good to know about ed! Since no one else has mentioned this: Emacs' keyboard
macros seem much easier to me, especially since, beyond the basic editing
features of awk/ed/sed, I can leverage all the editor extensions and
modifications I have accumulated over the years. That is, unless it's tens of
thousands of files and the edit is exceptionally simple. I would write a
script in that case too.

~~~
Symbiote
Since discovering them, I use Emacs keyboard macros all the time.

Let's say I have:

    
    
      key   value
      salt  pepper
      fish  chips
      vodka orange
      rum   cola
    

and I want the second column in uppercase.

<F3> to start recording a macro. Alt →, → to position the cursor at "v" (or
just →→→→→→ if this is a fixed width column), then Alt U to uppercase the next
word. → to move the cursor one forward, to the start of the next line. <F4> to
finish recording the macro.

Then press <F4> five times to run the macro five times.

(Explanation intended for users who've never used Emacs before. Of course,
there are optimizations.)
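
For comparison, the same transformation as a non-interactive script might look like the Python sketch below. The function name `upcase_second_column` is hypothetical, and the sketch assumes columns are separated by spaces or tabs.

```python
import re


def upcase_second_column(text):
    """Uppercase the second whitespace-separated word on every line.

    Group 1 is the first word plus the padding after it, group 2 is the
    second word. Using [ \t]+ rather than \s+ keeps a match from running
    over a newline when a line has only one word.
    """
    return re.sub(
        r"^(\S+[ \t]+)(\S+)",
        lambda m: m.group(1) + m.group(2).upper(),
        text,
        flags=re.MULTILINE,
    )
```

The padding between the columns is copied through unchanged, so the alignment of the table survives.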

~~~
rbonvall
The equivalent in vim is:

• qq to start recording a macro in register q,

• w to jump to the second word,

• gUaw to "go uppercase a word",

• j to move to next line (↓ works as well),

• q to stop recording,

• 5@q to apply macro in register q five times.

But in this example I would probably have used an ex command:

    
    
        :%normal wgUaw
    

(for every line do as if I had typed wgUaw) or visually selected the second
column as a block and just pressed U.

I'm genuinely interested in someone showing how to do this kind of
transformation in popular modern editors such as Atom and VSCode. Is there a
way as flexible as in the classic editors?

~~~
rodorgas
You can do it using multiple cursors. In Sublime, you place the cursor at the
beginning of “value”, then press Ctrl+Shift+Down until the end, so that there is
a cursor on every line. Then you press Ctrl+Right to select all values in the
second column. Then press Ctrl+P and choose “Convert to Upper Case”, or just
Ctrl+KU.

------
textmode
There is another program I use for editing that is older than ed. It is
written in asm. I think it may actually be faster than sed (and sed is faster
than AWK, Lua, Perl, Python, etc.)

    
    
      1.spt:
    
      ; x = "  - baz" 
      ; y = "  - elephant"
      ;a a = input :f(end)
      ;  output = a
      ;  a ? x :s(d)f(a) 
      ;d output = y
      ; :(a)
      ;end
    
      spitbol 1.spt < foo

~~~
davidgould
> sed is faster than AWK

Depends on the awk implementation and the task. However even gnu awk (gawk) is
very fast and mawk is astonishing.

Here is a simple example: count the lines, words, and characters in a 65MB
text file (10 copies of a novel stuck together).

Testing on Ubuntu GNU/Linux 16.10, reporting the middle of three tries:

    
    
      $ export LANG=ASCII    # avoid differences due to unicode
      $ time -p wc big10.txt 
       1284570 10956950 64886660 big10.txt
      real 0.29
      user 0.28
      sys 0.01
      $ time -p gawk '{l+=1; w+=NF; c+=length($0)+1} END {print l, w, c}' big10.txt
       1284570 10956950 64886660
      real 0.55
      user 0.53
      sys 0.01
    

Not bad, gawk is less than twice as slow as wc which is the standard tool for
this.

    
    
      $ time -p mawk '{l+=1; w+=NF; c+=length($0)+1} END {print l, w, c}' big10.txt
       1284570 10956950 64886660
      real 0.35
      user 0.33
      sys 0.01
    

But mawk is only 20% slower than wc. For a script!

Just for a check, even python is not terrible at this:

    
    
      #!/usr/bin/python
      import sys
      l, w, c = 0, 0, 0
      for line in file(sys.argv[1], "rb"):
          l += 1
          w += len(line.split())
          c += len(line)
      print l, w, c
      
      $ time -p ./wc.py big10.txt 
      1284570 10956950 64886660
      real 0.87
      user 0.86
      sys 0.01
    

About 3 times slower than wc and mawk.
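
The script above is Python 2 (`file()` and the `print` statement no longer exist in Python 3). For readers who want to repeat the experiment today, a rough Python 3 equivalent might look like the sketch below. The function name `count` is made up, and the timings quoted above were not re-measured with this version.

```python
#!/usr/bin/env python3
import sys


def count(path):
    """Return (lines, words, bytes) for the file at `path`, wc-style."""
    lines = words = chars = 0
    with open(path, "rb") as f:       # binary mode: count bytes, skip decoding
        for line in f:
            lines += 1
            words += len(line.split())  # split on runs of ASCII whitespace
            chars += len(line)
    return lines, words, chars


if __name__ == "__main__" and len(sys.argv) == 2:
    print(*count(sys.argv[1]))
```

Reading in binary mode mirrors the `"rb"` in the original and avoids any dependence on the locale's text encoding.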

~~~
textmode
On a much slower computer...

    
    
      time -p wc big10.txt
      1284570 10956950 64886660 big10.txt
    
      real         2.76
      user         2.68
      sys          0.08
    

Trying this as a novice with k3.

Because I am a novice, 2 out of 3 counts are incorrect, and this is probably
not the fastest solution.

Total "words" in the example was simply AWK's NF. But looking at big10.txt
there are anomalies such as words separated by "--" instead of a space.

Here I used non-space character followed by space. Far from accurate but not
too far.

    
    
      1.k: 
      w:0:"big10.txt";v:,/$w
      m:v _ss "[^ ] " / "word": char followed by space
      #w   / lines
      1+#m / words
      #v   / characters
    
      time -p k 1
    
      1284570
      10019630
      63602090
    
      real         2.70
      user         2.40
      sys          0.28
    
    

Counting lines with sed

    
    
      time -p wc -l big10.txt
      1284570 big10.txt
    
      real         0.13
      user         0.06
      sys          0.07
    
      time -p sed -n '$!d;=' big10.txt
      1284570
    
      real         0.29
      user         0.19
      sys          0.09

~~~
davidgould
That is a slow computer, mine is a pre-haswell i3.

    
    
      $ time -p sed -n '$!d;=' big10.txt
      1284570
      real 0.07
      user 0.06
      sys 0.00
    
      $ time -p mawk 'END {print NR}' big10.txt
      1284570
      real 0.04
      user 0.03
      sys 0.00
    
      $ time -p gawk 'END {print NR}' big10.txt
      1284570
      real 0.14
      user 0.13
      sys 0.00
    
      $ time -p wc -l big10.txt
      1284570 big10.txt
      real 0.02
      user 0.02
      sys 0.00

~~~
textmode
Revised 1.k.

    
    
      w:0:"big10.txt";v:{" ",x}'w;u:{#v[x] _ss " [^ ]"}'!#v;t:{#w[x]}'!#w
    
      #w / lines
      +/u / words
      +/t / chars
    

Counts for words and chars are closer but still short due to inexperience
using k.

But it _appears_ the script is now _faster than wc_.

    
    
      time -p wc big10.txt
    
      1284570 10956950 64886660 big10.txt
      real         2.78
      user         2.66
      sys          0.12
    
      time -p k 1
    
      1284570
      10956830
      63602090
    
      real         2.57
      user         2.42
      sys          0.14

------
zokier
There is also ex, which is sort of a cousin of ed, but also closer to the
nowadays more familiar vi(m).

~~~
greenyoda
It's closer because vi was built on top of ex -- vi's ":" commands are just ex
commands. In fact, on my Linux box, /bin/ex is just a symbolic link to
/bin/vi:

    
    
        $ ls -l /bin/ex
        lrwxrwxrwx 1 root root 2 Oct  2  2017 /bin/ex -> vi
    

_The original code for vi was written by Bill Joy in 1976, as the visual mode
for a line editor called ex that Joy had written with Chuck Haley. Bill Joy's
ex 1.1 was released as part of the first BSD Unix release in March 1978._[1]

(Bill Joy went on to become a co-founder of Sun Microsystems.)

"ex" stood for the _ex_tended version of ed.

[1] [https://en.wikipedia.org/wiki/Vi](https://en.wikipedia.org/wiki/Vi)

------
yani
Good article. It has everything - a problem and a good solution to this
problem.

------
tincholio
?

------
manish_gill
sed combined with iTerm's abilities to type in multiple panes lets me edit
100s of config files across multiple servers at the same time.

------
textmode
(Novice k user.)

    
    
        1.k:
    
        /k3
        v:"- baz";u:"  - elephant";t:"foo";s:"\n"
        w:_ssr[,/$0:t;v;v,u]
        w[1_ {x-2}'(&w="-")]:s
        t 0:_ssr[w;s;s," "],s
    
        k 1

------
NVRM
Php -> one line

