

An Awk Primer - networked
https://en.wikibooks.org/wiki/An_Awk_Primer

======
sonnym
I highly recommend The AWK Programming Language by Aho, Kernighan and
Weinberger[1][2]. It has a very similar style to The C Programming Language,
also co-written by Kernighan. Even as a general book on programming, it is
pretty good in that it presents the reader with useful exercises from simple
text processing all the way up to more advanced database implementations. I
would not recommend it as a first programming book, but it really covers how
powerful the AWK language is in a way that is very accessible.

1. [http://cm.bell-labs.com/cm/cs/awkbook/](http://cm.bell-labs.com/cm/cs/awkbook/)

2. [https://www.goodreads.com/book/show/703101.The_awk_Programmi...](https://www.goodreads.com/book/show/703101.The_awk_Programming_Language?from_search=true)

~~~
hudibras
The O'Reilly _Sed & Awk_ book is a lot of fun, too.

[http://shop.oreilly.com/product/9781565922259.do](http://shop.oreilly.com/product/9781565922259.do)

------
Tycho
I would love it if there were some sort of 'Become a Command Line Ninja'
course where you work through various tasks using command line tools, maybe
with an element of competition.

Obviously people pick this up on the job but not everyone who works with Unix
terminals has a fully fledged developer/sys-admin role. My 'bag-of-tricks'
evolves at a snail's pace.

~~~
buckie
Something to add to your 'bag-of-tricks' -- this was once handed down to me
by a master by the name of .ike, as it was handed down to him before. Maybe
you'd be interested:

"The 3-finger Claw Technique" -- the sweetest 3 functions, ever.

    
    
      shout() { echo "$0: $*" >&2; }
      barf() { shout "$*"; exit 111; }
      safe() { "$@" || barf "Cannot $*"; }
    

and always, exit 0

I use these daily; they're extremely cross-platform friendly, working in any
Bourne-derived shell I've used in the last 4 years.

    
    
      safexample.sh :
      --
      #!/bin/sh
    
      shout() { echo "$0: $*" >&2; }
      barf() { shout "$*"; exit 111; }
      safe() { "$@" || barf "cannot $*"; }
    
      # consider the following lines
      safe cd /some/dir
      safe tar xzvfp /my/big/tarball.tbz
    
      exit 0
      --
    

In the above example, using 'safe' suddenly gives the user the following
special powers:

1. If `/some/dir` does not exist, the script will safely exit before the
un-tarring does any damage.

2. The actual 'directory does not exist' error goes to the script's stderr
(as opposed to just telling us the useless fact that the enclosing script
failed). Now Bourne shell starts behaving like a modern language! (à la
Python/Ruby tracebacks, Perl errors, etc…)

Additionally, if you 'exit 0' at the end, you can run non-safe operations and
still be guaranteed that if the script exits zero, it ran to 'completion'. An example:

    
    
      # consider the following lines
      safe cd /some/dir
      tar xzvfp /my/big/tarball.tbz
    
    

Now, safe was removed from the un-tar command, right? So, imagine tarballs
created by people using a Mac (with HFS+ and some filesystem-specific sticky
bit somewhere in the filesystem that was tar'd up). Now you un-tar it on
some *BSD or other *NIX box, and tar complains as it goes -- and exits non-zero.
One way to handle this is to consider the tar 'reliable', and not use safe
for it.

Now, if we continue to be conscious of this simple safe/not-safe distinction
with these scripts, they can be called by other safe scripts -- and behave just
like any respectable UNIX program!

Now, consider this crontab entry:

    
    
      1       3       *       *       *       someuser flock -k /tmp/safexample.lock /path/to/safexample.sh
    

If it fails (for bad perms, no tarball to unpack, etc…), cron will actually
have a reasonable message to email or log!

Thanks to the noble efforts of the Clan of the White Lotus, these 3 functions
originate during the late thirteenth century. The original author, William
Baxter, explained that these three functions took 15 years to boil down to
what they are -- and I believe it, after mercilessly abusing them for several
years myself…

Below is another example with more bits.

    
    
      #!/bin/sh
    
      shout() { echo "$0: $*" >&2; }
      barf() { shout "$*"; exit 111; }
      safe() { "$@" || barf "cannot $*"; }
    
      for i in "before $@ after"
      do
       echo "arg <$i>"
      done
    
      for i in "before $* after"
      do
       echo "arg <$i>"
      done
    
      safe echo "this is ok"
      safe echo "this is ok, too" || safe echo "so is this"
      safe bad_echo "this is bad"
    
      exit 0

~~~
jzwinck
You don't need to copy those boilerplate functions everywhere you go. Bash and
others have "set -e" which will make the overall script exit with an error if
one of its commands returns an error. Also useful is "set -u" to detect the
use of unset variables (though some people write shell scripts which depend on
those, so it's not as universally applicable).

Try writing "set -eu" as the first line of your shell scripts. It's sort of
like compiling with "-Wall -Werror" -- not the default, but perhaps it should
have been.

------
jkbyc
Knowing bash, sed (single-line), grep quite well, I still regularly run into
problems (mostly multiline complex regex substitutions I guess) where I could
use a more expressive or more efficient tool. I am thinking that I should
perhaps invest in learning AWK. But then, why not go all the way and learn
Perl? What are the advantages and disadvantages?

~~~
cstross
If you've got bash, sed, and grep, and add awk, then you are 80-90% of the way
to Perl 4 proficiency -- that's basic Perl circa 1990, adequate for short
scripts (and more efficient than bash/sed/awk scripts). Perl started life as a
superset of awk, with sed, grep, and some weird extras (streams) thrown in.
Indeed, the perl distro ships with a2p, a program for automatically converting
awk scripts into perl scripts.

Perl 5 adds variable scoping, pointers, modules, OOP and a bunch of powerful
expressive stuff and abstractions -- but if you learn awk on top of what
you've already got it means you're about 90% of the way to introductory Perl
proficiency, and you can add the advanced stuff later.

So by all means learn awk! Just remember that it's not suitable for really big
jobs -- if you need to create awk scripts that go beyond a couple of dozen
lines of code, that's when you'll probably want to upgrade to Perl.

~~~
microtherion
While I agree that awk and Perl 4 have roughly comparable powers, I would not
call Perl "a superset of awk" (at least not Perl 3/4; I'm not familiar with
the syntax of earlier versions).

Where awk really shines is in situations that call for a set of record-based
production rules, and if your problem is in fact in that domain, you could
write hundreds or even thousands of rules without the problem becoming
unsuitable for awk. As with any domain-specific language, you could consider
the limitations imposed by awk valuable discipline that keeps you in the
problem domain.
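As a rough sketch of that rule-based style (the colon-separated sample records here are made up for illustration):

```shell
# hypothetical input: user records in name:uid:shell form
printf 'root:0:/bin/sh\ndaemon:1:/usr/sbin/nologin\nalice:1000:/bin/bash\n' |
awk -F: '
  BEGIN      { print "uid report" }         # runs once, before any record
  $2 == 0    { print $1, "is superuser" }   # each rule fires per matching record
  $2 >= 1000 { users++ }                    # ordinary accounts are just counted
  END        { print users, "regular user(s)" }
'
```

Each pattern-action pair is an independent production rule, so adding the hundredth rule is no harder than adding the second.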

While perl can be written in an awk-like style, it's really much more of a
general purpose language, so it can accommodate workflows that deviate
substantially from the "sequentially process records" style that awk excels
at.

~~~
emmelaich
The autosplit and loop options to Perl make it a very suitable awk
replacement. It even has BEGIN and END pseudo-patterns.

Once you've tried Perl you don't want to flip back and forth to awk, because
the inconsistencies in syntax (within each language and between the two) will
drive you mad.

~~~
cstross
However, IIRC BEGIN and END showed up in Perl 5.003 or thereabouts, circa
1995-97.

------
grdvnl
I have found this resource invaluable:

[http://www.grymoire.com/Unix/Awk.html](http://www.grymoire.com/Unix/Awk.html)

Also, check out the sed tutorial while you are there.

------
incision
Awk, more than anything else in my toolbox, has been the thing that earns
respect (and sometimes awe) when I come into a new environment, and it's a
routine time-saver in practice.

------
donatzsky
While somewhat specific to GAWK, the GNU Awk User's Guide is excellent.

[http://www.gnu.org/software/gawk/manual/gawk.html](http://www.gnu.org/software/gawk/manual/gawk.html)

------
ams6110
sed, grep, and awk are such useful utilities. A lot of developers I work with
tend to reach for Python first, but in most cases these tools are all that you
need.
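For example, a typical quick log question needs nothing beyond these tools (the access.log lines here are hypothetical, in an 'ip status bytes' shape):

```shell
# hypothetical access log: "<client-ip> <status> <bytes>"
printf '10.0.0.1 200 512\n10.0.0.2 404 0\n10.0.0.1 200 128\n' > /tmp/access.log

grep -c ' 404 ' /tmp/access.log               # count the 404s
sed -n 's/ .*//p' /tmp/access.log | sort -u   # distinct client IPs
awk '{ b[$2] += $3 } END { for (s in b) print s, b[s] }' /tmp/access.log   # bytes per status
```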

~~~
pm90
Using Python is probably a better choice, though. It's so easy to call other
programs, emulate a bash script, and much, much more... all in code which is
easily readable and maintainable!

~~~
eropple
I like Python well enough, but at Localytics we use Ruby where I'd previously
used Python and I'm finding it a lot nicer. Backticks alone make my life way,
way easier.

~~~
emmelaich
I agree that Python is a bit of an impedance mismatch for lots of short
scripts compared to Ruby and Perl; however, it's not that bad.

I'm forcing myself to use Python for SysAdmin tasks because I like it so much.

Here's backticks in Python, pretty much a copy n paste from the docs.

    
    
        import shlex
        import subprocess
    
        def backtick(cmd):
            # run cmd and capture its stdout (returned as bytes), like `cmd` in a shell
            return subprocess.Popen(shlex.split(cmd),
                stdout=subprocess.PIPE).communicate()[0]

~~~
eropple
_> I'm forcing myself to use Python for SysAdmin tasks because I like it so
much._

This seems unwise.

Use the best tool for a given job.

~~~
emmelaich
Well I exaggerate a little.

And folks tend to make too much of the differences between languages.

------
girvo
And here I've been using Awk as a nicer grep in my CLI for searching for
simple strings... This is pretty interesting; I can see it really being useful,
too.

