Show HN: Prig – like AWK, but uses Go for “scripting” (benhoyt.com)
77 points by benhoyt on March 1, 2022 | 17 comments



I don't like the Go requirement when compared to awk, which is essentially no requirement at all.

And I generally don't like reimplementations of things that already exist without a significant justification. (For real, I mean, vs. for the fun of it, or just experimenting, or as an exercise to use a new language.)

Don't care much either way about the verbosity because the extra verbosity can also be more consistent and easier to read because of fewer special implicit things to have to just know and resolve mentally.

But the execution time is undeniable, and it's probably a lot more convenient for typical awk-ish jobs to use something like this than to make a dedicated little C or Go program. Not to mention, this still lets you treat your code as an interpreted, readable, hackable script, vs. an actual C or Go program that would have to be an executable.

All in all, walking in the door I was predisposed to find this pointless, and to be annoyed that some number of people might buy into it and eventually I'd have to care because something I want some day will require it.

But in fact I find it not pointless at all. Nice.

And I actually like and use awk. I'm comfortable and productive with it for the occasional jobs that fit naturally. So comfortable that I've accidentally written things that ended up becoming mostly just a big END{} block, which could have been done exactly the same way in bash or anything else.


I don’t really see the value proposition other than runtime. For most awk commands the syntax is succinct and works well. This unnecessarily makes it much more tedious and verbose to… save time not learning basic awk syntax?

The awk manual is a pretty light read.


The Nim variant (yes, knowing Nim first matters) has terser syntax, but retains the strong typing that allows compiling to fast running code. E.g., from the rp --help, to total just the positive integers:

    rp -b'var t=0' -w'0.i>0' t+=0.i -e'echo t'

Missing is the awk-like implicit declaration/number stuff (divisive "magic"), but 'var t=0' is pretty succinct. With more work the harness could possibly infer the type of 't' from parameters to eliminate the need for that. Just with backslash instead of single quote you can get it down to 39 bytes vs 33 for awk '{if($1>0)t+=$1}END{print t}'.

If you have some CSV with headers you can cobble together something kind of like ghetto SQL:

    rp 'echo CustomerName,City' -w'Country=="USA"' < data

Part of the value prop is that you can directly author/access any library code in your native prog.lang in your query as naturally as native imports and routine calls. Awk is pretty established and for small programs you may not need any libs, but there are a lot more Go/Python/etc. libs than awk libs, AFAICT.

Beyond lib existence there is lib API understanding. There is not only language learning to be saved, but also lib ecosystem learning. I'm sure 'perl -na' folks could rattle off dozens of examples with CPAN package PDQ. I know many people who would say learning the ecosystem is what takes the most time...

There is also little barrier to using a similar approach to "compile" your data as well as the code. This is basically how 'nio qry' works as a kind of open architecture ghetto DB. [1] { Yeah, yeah, Big Iron, corporate database XYZ probably also has some ugly, non-portable extension language with 1970s prog.lang aesthetics. }

[1] https://github.com/c-blake/nio


Yeah, I don't disagree:

> Should you use Prig? I’m not going to stop you! But to be honest, you’re probably better off learning the ubiquitous (and significantly terser) AWK language.


totally agree


It's probably a familiarity thing, but to me the beauty of tools like awk and sed is that they're relatively simple, compact, and usually everywhere. The DSL is tricky to get past the basics with, and I usually reach for another tool at that point anyway.

But, whatever language you know already is probably the fastest.


I like this because I don't have to learn AWK to use it, and I already know Go. Even if AWK is simple, I use it so rarely that every time I do, it's a return to the manual. With this, it won't be so hard for me to use.


AWK is real easy to learn. I repeat the process from scratch every time I use it.


I put together a script using JBang (Scripts in Java https://jbang.dev)

https://gist.github.com/evacchi/7fb37056d92f72ae88157adcbb2f...


I couldn't resist :-) Shameless plug, I wrote a blog post https://news.ycombinator.com/item?id=30518152


TIL about pz (kind of like this but using Python) -- that's very cool! https://github.com/CZ-NIC/pz


That's a nice project idea. Made me try it in Common Lisp, which comes with a compiler built in. It's fairly doable. The only downsides are the big executable sizes for SBCL and CCL, and the relatively slow startup times for ECL.


I agree. Your comment got me thinking about a Lisp DSL for awk-like operations. TXR Lisp[0] comes with one of these.

[0] https://www.nongnu.org/txr/txr-manpage.html#N-000264BC


I would love to buy a TXR book. It seems like such a powerful tool, and I totally lack any context or understanding of how to get started with it!

I got started with Awk and Sed by just reading through the manuals and putting in the effort to understand the various code snippets I saw online. But that was when I was much younger and "fresher" than I am now. Is TXR worth it?


I hear the D compiler has pretty fast compile times. I'm not sure how low the syntactic noise can go. Of course, with very slow compiles the whole approach would mostly seem pointless.

It works with a C backend [1], too, and TinyCC/tcc can compile & launch programs in 1-2 milliseconds - faster than the start up time of most interpreters, actually - with a quick changeover to 'gcc -O3' for bigger data inputs or a more saved-away tool. Personally, I find the C backend too syntactically noisy, and Nim programs run as fast as optimized C anyway. So, I only ever use the Nim one.

Anyway, I think it's probably at least a good teaching tool for kids to get comfortable with code generation thinking, etc. A DSL you invent yourself might be easier to remember than a brand new foreign prog.language like sed/awk/'perl -na' that also have implicit loops. With Python you could probably get pretty far just doing 'eval'. At the least, even if you cannot remember your own DSL, well then this will help you appreciate the challenge prog.language designers face. So, either way it's kind of a pedagogical win.

And even if you don't connect CSV column headers with variables/etc, as Ben observes it's not an awful way to test out little "How do I use this <MY PROG LANG> library?" utterances. Bash/zsh become your repl with their shell history/etc.

[1] https://news.ycombinator.com/item?id=30191905


Next in the trilogy: awk2go, like awk2c.


Funny you should say that... this piece from the same author describes just that:

https://benhoyt.com/writings/awkgo/



