One really cool tool that web programmers should know if they work with JSON data a lot is jq: It's a line-oriented tool like sed, awk, and grep, but it's for manipulating JSON data. It can be really useful for quickly making sense of JSON-formatted log files. For example, you can do something like
What a great article. Even though I've picked up a lot of this through osmosis, I wish I'd read such a clear and lucid primer of Unix basics (including the author's other articles on the subject) a few years ago.
One tool that should have become more common but hasn't is Rob Pike's structural regular expressions, which are a fascinating generalization of awk for non-line-oriented data.
This is just a side effect of being the first popular scripting language. While Perl was being used by anyone & everyone to get stuff done, people who desired a more formalized and strict object-oriented structure migrated to Python. Non-programmers and get-it-done types hacked together Perl since the notion of objects (let alone subroutines) was beyond their ability. It's natural Python scripts are easier to read, since they tend to be written by more professional programmers.
Now that Python has eclipsed Perl's popularity, it's just a matter of time before you start seeing the same level of quality issues in Python. Untrained non-programmers will be creating write-only scripts in the new language soon enough.
It was just a few months ago that I debugged some Python scripts for a QA department at a smallish company. This code was the equivalent of any nightmare that I've seen in Perl. Not only was it all very "un-Pythonic", it didn't use classes, it hardly used subroutines, and it was equal parts commented-out attempts and "working" code. The gem was a script that wrote another Python script and executed it (written because the author only knew how to initialize multidimensional arrays, but didn't know how to build them on the fly).
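For what it's worth, building a multidimensional array at run time doesn't need generated code. A minimal sketch, assuming plain nested lists are acceptable (the `make_grid` name and the dimensions are made up for illustration):

```python
def make_grid(rows, cols, fill=0):
    """Build a rows x cols nested list at run time; each row is a distinct list."""
    return [[fill for _ in range(cols)] for _ in range(rows)]


grid = make_grid(3, 4)
grid[1][2] = 7  # only row 1 changes, because the rows are not aliased
```

The comprehension matters: `[[0] * cols] * rows` would repeat one row object, so a write to one row would show up in all of them.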
(And yes, there were popular scripting languages before Perl. I remember arguing the superiority of Bourne shell scripts over C-shell scripts...)
Perhaps we can retire the notion of "write-only" Perl -- all languages of sufficient complexity provide the means to obfuscate.
Are you sure that reams of special-use syntax and the "there's more than one way to do it" philosophy of language design don't play some role? While it's perfectly possible to write beautiful code in Perl and ugly code in Python, it seems a stretch to claim that it's just as easy as doing the reverse.
One shouldn't discount the free concurrency offered by the Unix pipelining paradigm. There are several data-mining use cases where you'd wind up writing much more performant scripts by exchanging I/O between processes, with less brainpower than it takes to write a bunch of for-loops in Python.
Languages like Awk and especially tools (you could say "language" because it's strictly true but, come on) like sed have built-in safeguards against writing code that stretches for more than a certain number of lines. That safeguard is that it's a really awful experience to actually do that. As a result these scripts tend to be short and to the point.
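A rough sketch of that concurrency, driven from Python for illustration: two child processes connected by an OS pipe run at the same time, the way `a | b` does in a shell. Both stages here are hypothetical stand-ins (tiny Python one-liners), not real data-mining tools.

```python
import subprocess
import sys

# Hypothetical stage 1: emit the numbers 0..9, one per line.
PRODUCER_SRC = "for i in range(10): print(i)"
# Hypothetical stage 2: sum whatever arrives on stdin.
CONSUMER_SRC = "import sys; print(sum(int(line) for line in sys.stdin))"


def run_pipeline():
    """Run producer | consumer as two concurrent processes and return the output."""
    producer = subprocess.Popen(
        [sys.executable, "-c", PRODUCER_SRC],
        stdout=subprocess.PIPE,
    )
    consumer = subprocess.Popen(
        [sys.executable, "-c", CONSUMER_SRC],
        stdin=producer.stdout,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Drop the parent's copy of the pipe so the producer sees a broken pipe
    # if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output.strip()
```

Neither stage knows about the other; the operating system schedules both and the pipe provides the buffering, which is the "free" part.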
It depends on what you want to do. Even for simple text processing in one-liners, there are quite a few common tasks that are difficult in awk. A big one for me is capture groups in regular expressions:
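As a hypothetical illustration of the kind of task meant here (not the commenter's own example), this is roughly what capture groups buy you, sketched in Python. The `key=value` line format and the `parse_setting` name are assumptions made up for the sketch.

```python
import re

# Two capture groups: the key (word characters) and an integer value.
LINE_PATTERN = re.compile(r"^(\w+)=(\d+)$")


def parse_setting(line):
    """Return (key, value) for a 'key=123' line, or None if it doesn't match."""
    match = LINE_PATTERN.match(line)
    if match is None:
        return None
    key, value = match.groups()
    return key, int(value)
```

The point is that the matched pieces come back directly, with no second pass to split the line again.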
Python and Ruby are just as bad as Perl for this, with Python being the least fluent of the three for scripting.
If you want a proper language to support your scripting goals, then if you go right up to something like OCaml or Haskell you'll skip all the pointless stringly-typed problems of Perl/Ruby/Python.
Tcl, REBOL or Red - or maybe some kind of Lisp even - could be better than Python for scripting. There are probably other good languages for this, like maybe Io.
Haskell and OCaml and Java and C++ are about equally badly suited for the job. No, they don't make good scripting languages. And they don't even want to. Why anyone would try to write shell scripts with them is really beyond me.
Yeah, "stringly-typed" isn't really a problem when you're mostly dealing with files made up of strings/lines. Interfacing between programs that emit mostly idiosyncratic output, over an interface of files and strings, isn't really made any easier or more robust by using a heavy type system and functional purity...
You're in for a fun surprise if you ever do embedded Linux software, where a full Python or Ruby interpreter will blow your flash space requirements, be too slow, or simply be unavailable. Perl is a way better option but even then might be too heavy. BusyBox, however, will have a sed or awk.
These are tools that are not going away tomorrow just because something better exists.
awk+sed, Perl, and Python are at least somewhat universal; OCaml and Haskell are not. Heck, awk+sed even more so: almost any Unix system, no matter how old or odd, has some version of those two tools on it.
Because it's line-oriented, it also works seamlessly with other tools, so you can pipe the output to, say, sort, to find the slowest requests.
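For comparison, the same find-the-slowest-requests idea sketched in Python instead of jq. It assumes one JSON object per line and a numeric `duration_ms` field; the field names and the sample records are invented for illustration.

```python
import json


def slowest_requests(lines, n=2):
    """Parse JSON-per-line log records and return the n slowest, slowest first."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        records.append(json.loads(line))
    return sorted(records, key=lambda r: r["duration_ms"], reverse=True)[:n]


# Hypothetical log lines; a real log would be read from a file or stdin.
sample_log = [
    '{"path": "/home", "duration_ms": 12}',
    '{"path": "/search", "duration_ms": 340}',
    '{"path": "/login", "duration_ms": 95}',
]
slowest = slowest_requests(sample_log)
```

This is noticeably more typing than a filter piped into sort, which is rather the point of the comment above.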