
Awk vs. Perl (2009) - type0
http://aplawrence.com/Unixart/awk-vs.perl.html
======
troydj
I learned Awk in 1988 before Perl was around (on our systems, anyway). It was
super useful at the time. But if you know Perl and Perl is available on your
system, there's certainly not a compelling need for writing standalone, multi-
line Awk programs. But Awk is really, really useful for one-liners. As Larry
Wall has said: "I still say awk '{print $1}' a lot."

Brian Kernighan himself, in this 2015 talk [1] on language design, states that
Awk was primarily intended for one-liner usage (he mentions this at 20:43).

[1]
[https://youtu.be/Sg4U4r_AgJU?t=19m45s](https://youtu.be/Sg4U4r_AgJU?t=19m45s)

~~~
ceronman
Once I learned Perl I never used awk or sed again. Even for one-liners: with
the -n, -p, and -a options you can easily write one-liners in Perl that are as
concise as those in Awk.

~~~
raldi
How do you write

    
    
        awk '{print $3 ":" $1 " " $2}'
    

in Perl?

~~~
jwilk

      perl -aE 'say "$F[2]:$F[0] $F[1]"'

~~~
raldi
That's indeed concise, but it doesn't work. I think you need -naE

~~~
jwilk
-a implies -n since v5.19.3.
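
For anyone who wants to try the Awk side of this exchange, here is a minimal
sketch. The `reorder_fields` wrapper name and the sample line are made up for
illustration; the Awk program is the one raldi posted.

```shell
# Hypothetical wrapper around the one-liner from the thread:
# print field 3, a colon, then fields 1 and 2 separated by a space.
reorder_fields() {
  awk '{ print $3 ":" $1 " " $2 }' "$@"
}

# Perl spelling from the thread, with the explicit -n for perls older
# than 5.19.3:  perl -naE 'say "$F[2]:$F[0] $F[1]"'
printf 'one two three\n' | reorder_fields   # prints: three:one two
```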

------
comstock
The nice thing about awk is that it's really only suitable for simple record
processing. You therefore stop using it fairly quickly when the problem
reaches sufficient complexity to warrant a better-suited language.

It's like a very neat, small domain-specific language for processing simple
record-based text files. And within those limits it's super useful.

~~~
vram22
>The nice thing about awk is that it's really only suitable for simple record
processing.

Not exactly only simple processing, IMO. Record processing, yes - it is a
domain specific language, and its core feature is the pattern-action thing,
and it was (at least originally) meant mainly to operate on record-oriented
text files (even if on lines, since a line is a record with one column). I
don't have good examples off the top of my head, but can mention a few things:

The books The Unix Programming Environment by Kernighan and Pike, and either
Programming Pearls or More Programming Pearls, by Jon Bentley, have some
examples of advanced uses of awk - so it is not just for simple record
processing. And I've read somewhere that just shell and awk and other Unix
tools have even been used to create a DBMS of sorts. Plus, with later versions
like GNU awk and so on, they've added more features to the language, including
probably many more built-in functions, and also some network programming
support [1].

[https://www.quora.com/What-is-the-most-complex-software-writ...](https://www.quora.com/What-is-the-most-complex-software-written-in-Awk)

According to an answer at the above link, an nroff-subset text formatter and
Lisp subset interpreter have been written in awk. Those are more complex than
simple record processing.

Also:

[1] Effective awk Programming, 3rd Edition
[http://shop.oreilly.com/product/9780596000707.do](http://shop.oreilly.com/product/9780596000707.do)

~~~
comstock
Right, I mean it's Turing complete, and you can do anything you want. But at
least when I use it, I quickly reach for another language after using it for
basic reformatting/simple calculations because using awk to do so would be too
difficult and awkward.

~~~
vram22
I get you now.

------
znpy
The article doesn't say much, besides replaying many common thoughts.
Actually, as an article, it's pretty much useless.

I have been reading "The AWK Programming Language" and actually kinda fell in
love with awk.

Awk does many simple things, it's fairly nice to combine such simple things,
and it's generally a nice and handy tool to know.

~~~
chubot
Yeah it's basically "I know Perl and I don't know Awk". I'm surprised it got
to the front page.

Obviously, the two languages overlap in functionality, and the one you know is
going to be easier for the job (for you).

If you know neither, Awk is easier to learn, but that's of course because it
does less. That may be good or bad, depending on what you are trying to do.

------
athenot
While I agree with the sentiment of the article, it misses a larger argument.
Perl is reasonably good at "making common things easy and hard things
possible". Meaning if the needs might vary, it's not a stretch to evolve the
code to match the requirements.

And despite having a strong affinity for Perl, I don't think it's productive
to just banish Awk to the trash: part of being well-rounded is knowing many
tools—which can overlap to some degree—and appreciating the sweet spot of each
one. For some one-liners, Awk is just as elegant as Perl, if not more so.

~~~
omaranto
Could you give some examples of one-liners you feel might be more elegant in
Awk than in Perl? It might be fun to get a thread going where people try to
give clean Perl versions.

~~~
majewsky
Basically anything that uses $1, $2,... because you save a split().

    
    
      awk '$4 ~ /T/ { print $1 }'
      perl -nE '@x = split /\s+/; say $x[0] if $x[3] =~ /T/;'
    

... _stares at this and wonders_ ... _reads `man perlrun`_ ... Oooh, there's a
switch to auto-split-on-whitespace in Perl.

    
    
      perl -anE 'say $F[0] if $F[3] =~ /T/;'
    

Sooo this doesn't quite answer your question anymore, but I'm going to leave
this here since it's a nice instance of TIL.

~~~
mitchty
The best part about using the @F array is that you get more than 10 things you
can split and dereference. In awk you only get $0-9.

That, and maybe because I'm an old Perl die-hard, it's just easier to pipe to
Perl than to remember how awk does its shenanigans.

~~~
kazinator
Awk is not limited to $0 to $9; where did you get that??? Maybe you're
confusing this with register substitutions in regex: \1 through \9.

The nice thing about the @F array is that $F[0] doesn't represent the whole
record.

There is a class of Awk bugs whereby arithmetic is being done on the
parameters using $N (where N is some calculated variable containing an
integer). Everything is fine when N >= 1, but if there is a bug where N is
accidentally zero, then $N refers to the whole record instead of throwing an
out-of-bounds error.

And things are accidentally zero quite easily in awk, e.g.

    
    
       awk '{print $X}'  # X is undefined so serves as zero; this prints all lines
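
One way to guard against this, sketched under the assumption that the field
index arrives from outside via -v: validate it before any record is read. The
`print_field` name and the exit status 2 are hypothetical choices, not anything
standard.

```shell
# Hypothetical helper: print field number $1 of every input line, but refuse
# a missing, non-numeric, or zero index instead of silently printing $0.
print_field() {
  awk -v n="$1" '
    BEGIN {
      # n must look like a positive integer before it is used as $n
      if (n !~ /^[0-9]+$/ || n + 0 < 1) {
        print "print_field: index must be a positive integer" > "/dev/stderr"
        exit 2
      }
    }
    { print $n }
  '
}

printf 'alpha beta gamma\n' | print_field 2    # prints: beta
print_field 0 </dev/null || echo "rejected"    # message on stderr, then: rejected
```

Plain awk has no out-of-bounds error for fields, so the check has to be
explicit like this; an index larger than NF still just yields an empty string.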

~~~
mitchty
Solaris awk. :)

~~~
kazinator
I see on my Solaris 10 VM that indeed the "broken old awk" doesn't work past
$9.

However, on the same OS, "nawk" does.

~~~
mitchty
Yep, there is also /usr/xpg4/bin/awk vs /usr/ucb/bin/awk etc...

It's been a long time since I had to deal with Solaris, but after about 10
years of it I always treat awk with a lowest-common-denominator approach.
Which means I tend to avoid it even to this day.

------
banku_brougham
I think awk is great; I enjoy working with awk. The GNU version is
indispensable because of multidimensional array support, of course. Perhaps I
enjoy the way it feels like driving an antique car; it really harkens back to
a bygone age. Yet it's fast.

I don't want to invest the time in Perl when there are better general-purpose
languages out there. Perl certainly has a different look, I can understand why
some would love it.
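
On the multidimensional array point above: plain POSIX awk can simulate
several dimensions by joining subscripts with SUBSEP, while gawk (since 4.0, as
far as I know) adds true arrays of arrays. A rough sketch of the portable form
follows; the `pivot_sum` name and the sample data are invented for
illustration.

```shell
# Hypothetical helper: input lines are "<row> <col> <value>";
# output is one total per (row, col) pair.
pivot_sum() {
  awk '
    { total[$1, $2] += $3 }            # subscripts are joined with SUBSEP
    END {
      for (key in total) {
        split(key, part, SUBSEP)       # recover the two subscripts
        print part[1], part[2], total[key]
      }
    }
  ' "$@" | sort                        # "for (key in ...)" order is unspecified
}

printf 'east q1 10\nwest q1 5\neast q1 7\n' | pivot_sum
# prints:
#   east q1 17
#   west q1 5
```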

------
bsg75
I have a strong affinity for Awk. I also have a mild dislike of Perl. Not sure
if this is normal or odd.

~~~
ShannonAlther
I find Perl extremely useful for making prototypes really quickly, and never
learned how to use Awk because I only started with Perl in the first place a
year or two ago. I'm sure it's just a matter of preference.

~~~
bsg75
Preference and experience.

My time with Perl was wrestling with code that people wrote to show how good
they were at Perl, and where maintainability was a secondary concern. This is
a bit harder to do in Awk (but not impossible). I moved from Perl to Python in
the 1.5 days in part because of this.

I believe this occurs with Java now too. Neither Perl nor Java is a bad
language; both are powerful and performant. However, there are some
architecture astronauts who seem to enjoy making life harder for the rest of
us.

~~~
ShannonAlther
+1 for "architecture astronauts", you made my day.

------
krylon
To me that question never really arose. I learned Perl before I knew the shell
well enough to make use of awk or sed. By the time my shell scripts hit the
complexity ceiling, I happily switched to Perl and haven't regretted it.

The shell scripts I write are usually just canned commands, maybe some
tweaking with environment variables. Anything more complex I usually handle
using Perl.

Shell scripts supposedly are more portable because the Bourne shell is always
there, but that is only true across different Unix flavours. Once you run into a
Windows box, (or more exotic systems like OS/400 (whatever IBM calls it these
days), VMS, z/OS, BS2000), shell scripts won't help you. Perl will. ;-)

Last but not least: CPAN. There are Perl modules available for nearly anything
you could ask for. (Except talking to SharePoint, which I sadly have to do on
a fairly regular basis. OTOH, having known its horrors, I can understand why
the Perl community wants to stay away from that POS.)

------
jopython
There is no reason to deal with multiple flavors of Awk when you have Perl,
especially when you are working with different operating systems. That, and
Perl's regex syntax, which is the de facto standard across multiple languages,
made me completely do away with Awk.

------
wodenokoto
A comparison of typical command-line scenarios, and perhaps even a speed
comparison, would have been nice.

~~~
okreallywtf
I would also be interested in this.

------
bangonkeyboard
Awk fits on a manpage and in my head. Perl doesn't.

------
davidw
FWIW, Ruby has a lot of the same command line arguments that Perl does.

~~~
lloeki
Ruby does not have $_ which you happen not to see in perl one liners because
of its implicit nature, and that makes those one liners very clear and terse.

~~~
davidw
Ruby has $_

    
    
         seq 1 20 | ruby -ne 'puts "num is #{$_}"'
    

Perl is probably a bit more concise, but you can do a lot of stuff with Ruby,
which is convenient if you use it for other things too.

------
kazinator
The author of this article makes himself appear bat-shit crazy for his claims
about Perl syntax being better.

In Awk, you can define a function like this:

    
    
      function add(a, m, n,
                   sum)
      {
        for (sum = 0; m < n; m++)
          sum += a[m]
        return sum
      } 
    

_Named parameters_ (wow, there is a concept); and few dermatological problems:
no damn sigils, or required statement-terminating semicolons. The array
reference is a[m], the scalar is m and so on.

Using an extra parameter (which is not specified in the call) for the local
variable sum is a massive design fail, but no worse than any of the Perl
design fails.

Perl repeats most of the design flaws in Awk, with compounded interest.
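
To see the extra-parameter idiom for locals in action, here is a small sketch
that wraps the function above in a complete program. The `sum_range` wrapper
and the `vals` array are names made up for this example.

```shell
# Hypothetical wrapper: read one number per line, sum them with add().
sum_range() {
  awk '
    # a, m, n are the real parameters; "sum" is never passed by callers
    # and so acts as a local variable.
    function add(a, m, n,    sum) {
      for (sum = 0; m < n; m++)
        sum += a[m]
      return sum
    }
    { vals[NR] = $1 }
    END {
      print add(vals, 1, NR + 1)
      # the global "sum" was never touched by the call
      print (sum == "" ? "sum is unset globally" : "sum leaked")
    }
  '
}

printf '1\n2\n3\n' | sum_range
# prints:
#   6
#   sum is unset globally
```

The extra whitespace before `sum` in the parameter list is only a common
convention for marking locals; awk itself does not care.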

------
CalChris
When I need to do something like this I usually use _sed_. It's similar enough
to _vim_ that I don't have to remember too much. I've never liked _perl_ at
all and _awk_ doesn't offer that much more than _sed_. Maybe that'd be
different if this sort of wrangling was a steady diet but it isn't. If I was
gonna use _awk_ I'd just skip straight on through to _python_.

~~~
kazinator
> _awk doesn't offer that much more than sed_

Not much; just, oh, stuff like floating-point math and trig functions; integer
math, associative arrays, access to environment variables, named functions
with arguments and full recursion; control and looping structures like if, for
and while, file I/O and redirection, string literals with C escapes, ...

    
    
      $ awk '$0=sin($1) + cos($2)'
      0 1
      0.540302
      1 0
      1.84147
    

How about sed?

Just kidding ...
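
Associative arrays from that list are easy to show as well. A minimal sketch,
with a made-up `count_by_first_field` name and invented input, that tallies
lines by their first field:

```shell
# Hypothetical helper: count how many lines share each first field.
count_by_first_field() {
  awk '
    { count[$1]++ }                                  # array keyed by string
    END { for (k in count) print k, count[k] }
  ' "$@" | sort                                      # iteration order is unspecified
}

printf 'get /a\npost /b\nget /c\n' | count_by_first_field
# prints:
#   get 2
#   post 1
```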

~~~
CalChris
Holy double angle formula, Batman. Awk has command line trig functions? Riddle
me this. Is the Google search bar just an Awk script and they won't tell you?
That's just the sneaky kind of trick L+S would try.

------
kazinator
TXR Lisp Awk macro:

[http://www.nongnu.org/txr/txr-manpage.html#N-012F3A2C](http://www.nongnu.org/txr/txr-manpage.html#N-012F3A2C)

POSIX standard's Awk examples, translated:

[http://www.nongnu.org/txr/txr-manpage.html#N-03D16283](http://www.nongnu.org/txr/txr-manpage.html#N-03D16283)

------
gegtik
I love awk, it's my go-to. Somehow learning Perl sufficiently (yet another
programming language) is more mental work than I'd like to undertake, what
with its idiosyncrasies and often implicit nature.

I find awk easy to understand when I read a script after not seeing it for a
long time (though I have to google each time I'm writing string manipulation
or using arrays).

------
forinti
I had to find the longest line in a really big file and tried both awk and
Perl. I was surprised at how much faster Perl was:

[http://alquerubim.blogspot.com.br/2016/09/a-linha-mais-compr...](http://alquerubim.blogspot.com.br/2016/09/a-linha-mais-comprida-de-um-arquivo-ii.html)

Still, awk was simpler.

~~~
kazinator
Your benchmark doesn't specify which implementation of Awk you're testing, and
which version of it.

Your code is verbose: length($0) shortens to just length:

    
    
      awk 'length > max { max = length } END { print max }'
    

The Perl seems to have an off-by-one error.
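
If you also want to see which line is the longest, a small variation on the
one-liner above might look like this. The `longest_line` name and the sample
input are made up, and I haven't benchmarked it.

```shell
# Hypothetical helper: report the length of the longest line and the line
# itself. On ties, the first such line wins because the test is a strict ">".
longest_line() {
  awk 'length > max { max = length; line = $0 }
       END { print max ": " line }' "$@"
}

printf 'ab\nabcde\nabc\n' | longest_line   # prints: 5: abcde
```

Whether `length` counts bytes or characters on multibyte text can differ
between Awk implementations and locales, which is another reason to state the
implementation when benchmarking.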

~~~
forinti
It was Oracle Linux 6 (64-bit) with GNU Awk 3.1.7 and Perl 5.10.1.

------
marmaduke
At work I grapple with an HPC job scheduler written as a set of concurrent
Perl scripts coordinating through transactions in a Postgres database. What a
disaster.

Still, it is good fortune that the scheduler was not written in awk.

------
andrewbinstock
Given that the comments are from 2009, I expect the title is slightly
misdated.

~~~
type0
Yeah looks like it, updated it

------
labster
I can't visit this site on mobile without being redirected to a spam page that
looks suspiciously like Facebook.

~~~
majewsky
I wondered at first because the text had some gaps in it, and thought that
images were not loaded because of my strict uMatrix policy. After enabling the
usual suspects (ajax.googleapis.com) didn't work, I inspected the gap and saw
that it was just an ad container. And in fact, the same ad service is
advertised below the article text.

------
kwoff
Anyone else learn of Perl in the footnotes of the "sed & awk" book? :)

~~~
vram22
IIRC I first learned of Perl in the Unix Power Tools book from O'Reilly. That
book was really good.

------
freyfogle
Why choose? TIMTOWTDI

~~~
kazinator
Because at some point you actually have to DI, and then you have to choose a
W.

