Show HN: An eBook with hundreds of GNU Awk one-liners
539 points by asicsp on April 2, 2020 | 48 comments

I recently published my ebook on GNU awk one-liners [1]. It starts from the basics of awk syntax and then discusses one-liner examples. There's a chapter on regular expressions as well. The github repo has the details on how to get the PDF version, all the example files and code snippets used in the book, sample chapters, as well as the markdown source used to generate the PDF.

I made all my ebooks [2] free last month amidst the pandemic fears. These include GNU grep & ripgrep, GNU sed and three books on regular expressions (Python, Ruby, JavaScript).

I'd appreciate your feedback and hope the books are useful. Happy learning :)

[1] https://github.com/learnbyexample/learn_gnuawk

[2] https://learnbyexample.github.io/books/

I always like when classic lean-and-mean Unix tools get attention, as opposed to big language ecosystems. Speaking of which, is there a particular reason why only gawk is covered? gawk has only very minor enhancements to POSIX awk, and gawk isn't even the default in many places. For example, Debian uses, or used to use, mawk as default, while the BSDs and Mac OS have nawk. I think the point of awk is that it's portable, and introducing gawkisms in your program not only makes it non-portable, but also would make it impossible to run on mawk, which is much much faster for e.g. log file analysis. Might not matter all that much for one-liners, though.
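
To make the portability point concrete, here is a minimal sketch of pulling a match out of a line using only POSIX awk features (match(), RSTART, RLENGTH, substr()) rather than a gawk-only convenience such as the three-argument form of match(). The function name first_number and the sample input are made up for illustration.

```shell
# Print the first run of digits found on each input line.
# Uses only POSIX awk: match() sets RSTART and RLENGTH on success,
# and substr() cuts the matched text out of the line.
# first_number is a hypothetical helper name, not from the book.
first_number() {
    awk 'match($0, /[0-9]+/) { print substr($0, RSTART, RLENGTH) }'
}
```

For example, `printf 'id=42 ok\nnone\nx7y88\n' | first_number` prints 42 and then 7; the line without digits produces no output. This should behave the same under gawk, mawk, and other POSIX-conforming awks.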

Agree about portability issues. I cover only gawk, because I don't know all the differences between various versions. This started as a chapter in my command line text processing repo, where I cover various tools. I had come across various posts on stackoverflow/unix.stackexchange about implementation differences. I use Ubuntu, so I made a choice of sticking to GNU/Linux to make my life simpler.

I'm not sure about your point saying "only very minor enhancements". When I posted about my book on reddit, I got this comment [1] noting feature differences.

[1] https://www.reddit.com/r/commandline/comments/fqkc6r/just_pu...

The way to see if something works in AWK is to read the 2004 Open Group POSIX standard on it:


Here’s my AWK (OK, shell + AWK) one liner:

  while echo -n '] ' ; do read a; awk 'BEGIN{print '"$a"'}' ; done

It’s a calculator; type in something like 2 + 2 and it will give you 4. Since standard AWK has advanced math functions like log, it’s a full blown scientific calculator.

The only tricky part is that you hit Ctrl + C (not Ctrl + D) to exit it.
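
The one-liner needs Ctrl + C because the loop condition is the echo, which keeps succeeding even after read fails at end of input. A variant sketch, assuming a POSIX shell, that puts read in the loop condition so Ctrl + D also exits (calc_repl is a name made up here, and the prompt is sent to stderr so piped output holds only results):

```shell
# Hypothetical variant of the calculator loop above.
calc_repl() {
    # The loop ends when read hits end of input (Ctrl + D at a terminal).
    while printf '] ' >&2 && IFS= read -r expr; do
        # As in the original, the expression is spliced into the awk
        # program text, so only type input you trust.
        awk "BEGIN { print $expr }"
    done
}
```

Run `calc_repl` at a prompt for interactive use, or pipe expressions in: `printf '2 + 2\n' | calc_repl` prints 4.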

gawk is pretty much everywhere that awk sees substantial use, and presents rather great enhancements over POSIX awk[1]. I usually don't think that GNU tools make much sense to focus on, but gawk is, generally speaking, worth focusing on. Especially given that it runs almost everywhere under the sun.

[1] http://www.skeeve.com/awk-sys-prog.html

“It runs everywhere” (gawk) is very different from “it’s installed out of the box everywhere” (awk). The latter can be used in provisioning scripts, shell script libraries/modules, etc. where the former usually can’t.

(Not saying gawk isn’t significantly better than awk for some tasks.)
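
For scripts that must run where gawk may not be installed, one common pattern is to prefer gawk when it is on PATH and fall back to the system awk otherwise. A minimal sketch, assuming a POSIX shell; pick_awk is a hypothetical helper name, and the awk program itself still has to stick to POSIX features for the fallback to be safe:

```shell
# Print the path of gawk if it is on PATH, otherwise the path of awk.
# command -v prints nothing and returns non-zero when a command is missing.
pick_awk() {
    command -v gawk || command -v awk
}
```

A script could then do `AWK=$(pick_awk)` and call `"$AWK" 'BEGIN { print 1 + 1 }'`.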

My claims were that GNU awk is everywhere that awk sees substantial use, and that it runs everywhere.

These were two separate claims, and are very different.

It is very rare that you see a system that awk is frequently used on that doesn't have GNU awk installed.

> It is very rare that you see a system that awk is frequently used on that doesn't have GNU awk installed.

Wait, what? Ubuntu server and macOS both have awk preinstalled (which is true for any POSIX system) and gawk not preinstalled. Those are what I (and many other developers) spend 95% of my time on. And I do use awk frequently.

Edit: Okay, /usr/bin/awk is actually gawk on Ubuntu, so I was wrong about that. Still, macOS isn't very rare.

How many OS X users do you think are actively using awk in any of its variants? What percentage of total awk usage is on OS X, do you think? OS X may not be rare, but it seems highly unlikely that many OS X systems see frequent usage of awk.

To be fair, /usr/bin/awk was mawk for a long time with Ubuntu. I remember getting in a heated discussion with GAWK’s maintainer that he should allow a way for [A-Z] to never match lower case letters (it does in some locales) so that we didn’t have to use stuff like [[:upper:]] and [[:lower:]], which do not work in mawk.
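
One workaround that avoids both the locale surprise and the [[:upper:]] classes is to pin the locale for the awk call, since in the C locale [A-Z] covers only the ASCII uppercase letters. A minimal sketch, assuming ASCII input; upper_only is a made-up helper name:

```shell
# Keep only lines made up entirely of ASCII uppercase letters.
# LC_ALL=C applies to this awk invocation only, so the range [A-Z]
# is not affected by the user's locale.
upper_only() {
    LC_ALL=C awk '/^[A-Z]+$/'
}
```

For example, `printf 'abc\nABC\nAbc\n' | upper_only` prints only ABC. The trade-off is that non-ASCII uppercase letters are not matched.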

How often does a MacOS user who frequently uses awk lack the ability to install gawk?

The claim under discussion in this deep subthread is "it is very rare that you see a system that awk is frequently used on that doesn't have GNU awk installed", so this is irrelevant.

That seems like an awfully pedantic position to take. The ability to install a program is irrelevant?

I don't see Unix tools as lean and mean, but as brittle and slow. It's never just a "one liner", you end up piping text data through dozens of programs with inconsistent unstructured interfaces.

I don't understand the reverence for these tools. I think it's a bunch of junk that somehow can't get replaced because it's not quite bad enough and it's "already installed" everywhere.

Debian Stable runs GNU Awk. Issuing the command awk --version gives the output:

GNU Awk 4.2.1, API: 2.0 (GNU MPFR 4.0.2, GNU MP 6.1.2)
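
A script can use the same check to find out whether the default awk is gawk. This is only a sketch: it assumes the gawk banner starts with "GNU Awk" as in the output above, and it lumps everything else (including awks that reject --version) into "other". awk_flavor is a hypothetical name.

```shell
# Report "gawk" if the default awk identifies itself as GNU Awk, else "other".
# stdin is redirected from /dev/null in case an awk treats --version as
# program text and tries to read input; error messages are discarded.
awk_flavor() {
    case "$(awk --version 2>/dev/null </dev/null | head -n 1)" in
        "GNU Awk"*) echo gawk ;;
        *)          echo other ;;
    esac
}
```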

Very nice. I had cobbled up a "one liner" (sort of) gawk webserver on here a couple of months ago: https://news.ycombinator.com/item?id=22085459

I tried making it scalable, but unfortunately, the server sockets in gawk don't set SO_REUSEPORT. So, I can't fork usable children. It does work if you use LD_PRELOAD tricks, or edit the gawk binary to change SO_REUSEADDR to SO_REUSEPORT, but both are pretty hacky.

If gawk would separate the listen() and accept() calls out, you could do a lot more with their server socket code.

Very cool. Works on iOS with iSH surprisingly.

    FOUND index.html
    FOUND favicon.ico

Oh wow. I wouldn't have guessed iSH would have packaged up the extensions I'm using.

Had to `apk add gawk` is all.

Impressive. If I use the -o option to get a pretty-printed version, it comes to 68 lines.

I want a Perl one. I never saw the need to do AWK because I knew Perl fairly well.

In saying that, the youngsters should learn some of these old-school tools. Python is a nice language but the regexes are crap compared to Perl. I always need to look up the documentation. Perl's are built in, clean and concise.

(capturing, groups) = string =~ /regex/

I remember that having not touched Perl in a while. I miss it.

Have yet to learn Perl, but I've frequently not seen it preinstalled on systems, and additionally you sometimes need CPAN to be able to run scripts. awk might not be as powerful, but at least you know it's small, self-contained, and likely to be available in some form on most systems. That's part of its value I think... likely a consequence of not trying to do as much.

re: why awk is almost always available, it's part of POSIX: https://pubs.opengroup.org/onlinepubs/009695399/utilities/aw...

Someone else already asked for perl one-liners in this thread. I started with command line text processing repo [1] about three years back. That has a chapter on grep/sed/awk/perl/ruby one-liners along with many other tools. I may convert perl one-liners to a book as well later.

Python's default 're' module does indeed lack many features, but there's the third-party 'regex' module, which would be easier for Perl users to adapt to.

[1] https://github.com/learnbyexample/Command-line-text-processi...

There is an available replacement for Python's standard re library, regex[1], which adds a long list of features and enhancements. It is too little known IMO, despite having existed and been continually maintained and enhanced for nearly a decade.

[1] https://pypi.org/project/regex/

There is a Perl one-liners book from No Starch Press.

It takes way less time to learn awk than perl.

Yes, but I already know Perl. I assume more people know Perl than AWK, but maybe that's a bias from places I worked.

This is great. Studying the work of masters is the best way to learn a trade.

Thank you, even if publishing it at this time was somewhat Awk-ward...

Yeah, these are troubling times for sure. I was about half way done with the book when things became serious in my country early March. I did think about delaying the release but then made the opposite decision. I made all my books free, released markdown sources for them and then published this book early (cutting down on some topics, no exercises yet, etc).

Purchased! Thank you for writing these. My wishlist is for a POSIX awk version of your book.

Very cool.

I have the Bash equivalent (sorta) on my bookshelf and it has seen a lot of use. ('GNU Introduction to the Command-Line').

It introduces a command, describes the options in detail, and then the next few pages for each command are useful bash (mostly) one-liners.

Are there any equivalents for Perl one-liners?

I started with command line text processing repo [1] about three years back. That has a chapter on grep/sed/awk/perl/ruby one-liners along with many other tools. I may convert perl one-liners to a book as well later.

[1] https://github.com/learnbyexample/Command-line-text-processi...

Perhaps something like this:


The thing about cookbooks is that you rarely find something that matches what you need in the moment - you really need to learn and use these tools regularly and then you'll be prepared without documentation.

No Starch Press sells a Perl One-Liners book. https://nostarch.com/perloneliners

The book is based on this article series: https://catonmat.net/perl-one-liners-explained-part-one

Thank you very much for this. My partner lost her job, and she will use your books to help her on a new career path.

Tnx, that's nice. I like your 1 condition!

Will an epub version be available?

Some others have asked for an epub version too. I have this article [1] bookmarked, so if it goes well, I'll add epub as well this month. If you know pandoc, you could use the markdown source in the repo to generate a basic version and see if it works with your reader.

[1] https://cmichel.io/how-to-create-beautiful-epub-programming-...

awk is great. Started using it in 1987, still using it in production today!

you forgot to attach your amazon referral link!

I haven't published these books on amazon. The github repo has the links for leanpub/gumroad, they allow PDF which amazon doesn't as far as I know.
