I used to write a fair amount of AWK code back in the day. It was great for one-liners, but I wrote some more elaborate programs in it too. I treated it seriously, like a real programming language, and it served me well.
Years ago I wrote a compiler (for bytecode) in awk. Ummm... http://cowlark.com/mercat, although you'll need to wade through zip files to get at the source. It was 1.6kloc for a fully typed algolish language producing stack-based bytecode.
awk's a lovely little language, and deserves to be better known than it is. Its two big failings are local variable syntax and absence of structured types... and the standard library's a bit mad in places (gsub, sigh). But it's expressive and concise and still readable, and meets its core competency of doing easy text processing beautifully.
That's a C file which is also an executable shell script which contains an embedded awk script. The whole thing's a Forth interpreter. Running the file uses the awk script to compile a Forth subset into bytecode and patch the source file with the new bytecode, which allows me to keep the whole thing in a single source file. It's not what I'd call good awk, but it's incredibly effective awk...
It looks like you fished the awk file out - I found http://cowlark.com/mercat/com.awk.txt linked directly on that page, which, at 1610 total lines (1518 SLOC counting commented-out code), sounds like exactly what you're referring to.
As for fforth.... your signoff at the end of the comments sums it up much better than anything I could say.
# No evil was harmed in the making of this file. Probably.
This thing is absolutely awesome... a self-modifying tri-language source file, implementing Forth in just 22KB (or 34KB on x64). Very nice.
Now to go read the, um,
panic: unrecognised word: help
...documentation? :P
It actually happens that I've recently become really interested in Forth implementations and systems, so discovering this is especially cool... and on that note, what sources would you recommend I study to get an overview of Forth history and development? I've read enough historical anecdotes to understand there are conflicting opinions (as always), but nothing thus far has shown the evolution of the language itself, how ANS became a thing, and so forth.
PS. clang-3.7 -Os is the winner on i386, gcc-5.3 -Os on x64. tcc-0.9.26, interestingly, comes second on both (26KB and 36KB respectively). (Using Slackware-current.)
PPS. Your site's About section might want to know the Antix website seems to have been taken over by a spam system.
Yup! COM() is a varargs macro that actually assembles the data in memory --- the actual word layout is not the traditional one Forth uses (to make it C friendly). But the end result is a linked list of Forth words in exactly the same format that user words have, which the user dictionary extends.
It all means that the C source can just be compiled in a single step --- gcc -o fforth fforth.c --- without needing a precompilation stage, which makes it vastly easier to manage.
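For anyone wondering what "a varargs macro that assembles a linked list of words" can look like, here's a rough sketch of the general idea. To be clear, this is not fforth's code: the `DEFWORD` macro, the `struct word` layout, the fixed-size payload and the helper names are all made up for illustration, and the real COM() may work quite differently.

```c
#include <stddef.h>
#include <string.h>

struct word;
typedef void (*code_fn)(const struct word *self);

/* Hypothetical word layout (an assumption, not fforth's): a name, a link
 * to the previously defined word, a C function, and a payload of words. */
struct word {
    const char *name;
    const struct word *link;        /* previous dictionary entry, or NULL */
    code_fn code;
    const struct word *payload[8];  /* NULL-terminated; fixed size for simplicity */
};

static long stack[16];
static int sp;                      /* number of items currently on the stack */

static void do_dup(const struct word *self) { (void)self; stack[sp] = stack[sp - 1]; sp++; }
static void do_add(const struct word *self) { (void)self; sp--; stack[sp - 1] += stack[sp]; }

/* Run each word in the payload in turn, like a colon definition. */
static void do_colon(const struct word *self)
{
    for (size_t i = 0; i < 8 && self->payload[i] != NULL; i++)
        self->payload[i]->code(self->payload[i]);
}

/* The varargs part: everything after fn becomes the payload initializer. */
#define DEFWORD(ident, text, prev, fn, ...) \
    static const struct word ident = { text, prev, fn, { __VA_ARGS__ } }

DEFWORD(w_dup,    "dup",    NULL,   do_dup,   NULL);
DEFWORD(w_add,    "+",      &w_dup, do_add,   NULL);
DEFWORD(w_double, "double", &w_add, do_colon, &w_dup, &w_add, NULL);

/* Head of the dictionary; words defined at run time link onto this. */
static const struct word *latest = &w_double;

static const struct word *find(const char *name)
{
    for (const struct word *w = latest; w != NULL; w = w->link)
        if (strcmp(w->name, name) == 0)
            return w;
    return NULL;
}

/* Look a word up and execute it; returns -1 if it isn't in the dictionary. */
static int run(const char *name)
{
    const struct word *w = find(name);
    if (w == NULL)
        return -1;
    w->code(w);
    return 0;
}
```

Because the built-in words are ordinary `struct word` values, a word created at run time just sets its `link` to `latest` and becomes the new head, and `find()` can't tell the two kinds apart. With 21 on the stack, `run("double")` leaves 42.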
My favorite AWK program was probably my LaserJet II code listing program from 1991. I wrote this out of frustration with the terrible default listings I got when I printed source code. The AWK code did a "two-up" printout of source code: two pages of code per page of paper in landscape mode. It used a nice font and drew graphic boxes around the pages. I was in the habit of using separator lines like this in my code:
//-------------------------------------
So the program found these and changed them to graphic line separators. It also avoided splitting a function across two pages if it could: it would leave whitespace at the bottom of a page instead, filled in with a faint dot pattern.
Somewhere I have a printout I made with this program; I was hoping to find it and scan it in to show what it looked like. I know it's here somewhere, but in the meantime the source code will have to do:
https://github.com/geary/awk/blob/master/LJPII.AWK
Of course, these days I hardly ever print out any code. But back then we printed everything.
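The "don't split a function across pages if you can help it" rule is simple enough to sketch. This is not taken from LJPII.AWK and it's in C rather than AWK; the function names, the idea of passing in block lengths, and the page height are all assumptions for illustration.

```c
#include <stddef.h>

/* Decide which logical page each block (e.g. a function) starts on.
 * block_len[i] is the block's length in lines and page_lines is the page
 * height. A block moves to a fresh page only when that keeps it whole;
 * a block taller than a page has to split anyway, so it stays put.
 * Returns the number of logical pages used. */
static int paginate(const int *block_len, size_t n, int page_lines, int *start_page)
{
    int page = 0;  /* current logical page, 0-based */
    int used = 0;  /* lines already placed on the current page */

    for (size_t i = 0; i < n; i++) {
        int len = block_len[i];

        if (used == page_lines) {   /* current page is exactly full */
            page++;
            used = 0;
        }
        /* The skipped space is what would get the filler pattern. */
        if (used > 0 && len > page_lines - used && len <= page_lines) {
            page++;
            used = 0;
        }
        start_page[i] = page;

        used += len;
        while (used > page_lines) { /* oversized block spills onto later pages */
            used -= page_lines;
            page++;
        }
    }
    return page + (used > 0 ? 1 : 0);
}

/* Two-up printing: two logical pages share one sheet of paper. */
static int sheet_of(int page)
{
    return page / 2;
}
```

With 60-line pages and blocks of 25, 30, 20, 70 and 10 lines, the third block gets pushed to the second page rather than being split, the 70-line block has to spill over, and the whole thing takes three logical pages on two sheets.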