
Show HN: Awk-JVM – A toy JVM in Awk - rethab
https://github.com/rethab/awk-jvm
======
rethab
Author here.

As I wrote in the README, this uses GAWK instead of plain AWK. Conveniences of
GAWK over AWK in a nutshell:

- functions
- several additional functions (e.g. bit shifting)

But even GAWK lacks some things that are very common in other languages:

- no variable scope: imagine calling another function in a for loop, and the
  other function again running a for loop. If both loops use 'i' as the
  counter, good luck. The workaround is to declare the local variables as
  parameters that are not passed (and separate them with four spaces).
- cannot return an array from a function. The workaround is to use
  pass-by-reference (not sure if the precise definition is applicable here).
- arrays cannot be assigned to another variable. The workaround is to loop
  over the array and assign it value by value.

If anybody knows better workarounds, please let me know :)
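
A minimal sketch of the three workarounds above, written in portable awk and
wrapped in a shell function so it can be pasted into a terminal. The names
`demo_locals`, `fill_squares` and `copy_array` are hypothetical and are not
taken from awk-jvm:

```shell
# Hypothetical demo, not code from awk-jvm: awk "locals" and array helpers.
demo_locals() {
  awk '
    # Parameters after the wide gap are never passed by callers,
    # so they behave as function-local variables.
    function fill_squares(out, n,    i) {
      for (i = 1; i <= n; i++)
        out[i] = i * i          # arrays are passed by reference
    }

    # awk cannot assign one array to another, so copy element by element.
    function copy_array(src, dst,    k) {
      split("", dst)            # portable way to empty dst
      for (k in src)
        dst[k] = src[k]
    }

    BEGIN {
      split("", sq); split("", cp)   # make sure both names are arrays
      for (i = 1; i <= 2; i++) {
        fill_squares(sq, 3)     # the callee loops over its own local i
        copy_array(sq, cp)
        print i, cp[1], cp[2], cp[3]
      }
    }
  '
}

demo_locals
# prints:
# 1 1 4 9
# 2 1 4 9
```

Both loops use `i`, but the one inside `fill_squares` is local, so the outer
counter survives each call. Dropping the extra `i` parameter would make the
two loops share one global variable.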

~~~
mzs
Nice project! It's only very old awks that don't support functions (BSD awk
supports them, as does nawk on SunOS, but /usr/bin/awk there doesn't), and the
GNU docs have an implementation of strtonum:
[https://www.gnu.org/software/gawk/manual/html_node/Strtonum-...](https://www.gnu.org/software/gawk/manual/html_node/Strtonum-Function.html)

~~~
wahern
POSIX awk supports user-defined functions. I'm surprised Solaris' default awk
doesn't, but /usr/xpg4/bin/awk does support user-defined functions. (I just
confirmed both behaviors on Solaris 11.4.)

------
tyingq
Not the main point, and this is a very cool dancing bear. But, on this point:

 _" since none of the awks can read binary, you first need to pipe the
classfile through hexdump"_

Gawk works fine with binary files for me. Using FIELDWIDTHS for fixed length
records or the readfile() extension to slurp in a whole file works fine. The
readline() function can also be paired with FIELDWIDTHS to read a fixed number
of bytes. Newline separated records with nulls in them also read as expected.
I'm curious what problems the author saw with binary and gawk.

~~~
rethab
I actually simply didn't know it was possible, based on some googling. But
thanks for the hint. Someone also opened an issue on GitHub giving me a hint
on how to replace hexdump[0]. I'll definitely give this a try :)

[0] [https://github.com/rethab/awk-jvm/issues/1](https://github.com/rethab/awk-jvm/issues/1)

------
WFHRenaissance
This is definitely an accomplishment. Someone show this to Kernighan.

------
siraben
The AWK Programming Language book is great not just for learning AWK but also
has chapters on data processing, generating tables and graphs, relational
databases and even a VM! I've implemented the assembler and VM from the book
and extended the instruction set.[0]

[0] [https://github.com/siraben/awk-vm](https://github.com/siraben/awk-vm)

------
ketanmaheshwari
METHODS[m]["attributes"][a]["data"][4]

Is this a five dimensional array? How does it get populated?

EDIT: I see it now in the code. Excellent!

~~~
leoh
Yes, a bit tricky. You don't need to declare arrays in AWK, amazingly.
Interesting how long it takes to get ergonomic gestures just right in
languages.

[https://www.gnu.org/software/gawk/manual/html_node/Array-Exa...](https://www.gnu.org/software/gawk/manual/html_node/Array-Example.html)

~~~
tyingq
The technical term is a funny word..."autovivification". Perl does this as
well.
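
A small sketch of that behaviour, assuming nothing beyond portable awk (the
function names `demo_autoviv` and `count` are hypothetical). Gawk's true
arrays of arrays, as in the `METHODS[m]["attributes"][a]` example above,
create their subarrays on first use in a similar way; this demo sticks to a
flat array so it runs on any awk:

```shell
# Hypothetical demo: referencing an awk array element creates it.
demo_autoviv() {
  awk '
    # Count the elements of arr; k and n are locals.
    function count(arr,    k, n) {
      n = 0
      for (k in arr) n++
      return n
    }

    BEGIN {
      split("", seen)                      # start with an empty array
      print count(seen)                    # 0

      seen["a", "b"] = 1                   # no declaration; key is "a" SUBSEP "b"
      print count(seen)                    # 1

      if (seen["x"] == "") print "empty"   # this mere reference creates seen["x"]
      print count(seen)                    # 2

      if (!("y" in seen)) print "absent"   # the in operator does not create it
      print count(seen)                    # still 2
    }
  '
}

demo_autoviv
```

The practical consequence is that membership tests should use `in` rather
than comparing an element with the empty string, otherwise the test itself
grows the array.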

------
zserge
AWK is a very underappreciated language from the past, simple and fun to use.
You did a very nice job, thanks!

~~~
enriquto
What do you mean by "from the past"? All existing languages are from the past.

EDIT: what would be a language that is "not from the past"? Certainly
Python (1991) is from the past if Awk (1977) is. The origin of Python is twice
as close to the origin of awk as it is to the present time.

~~~
yjftsjthsd-h
I'd argue that changes and improvements count almost as much as start date.
Today, in 2020, "Python" mostly means "python3", and practically means
"python>3.5 or so", which is not _super_ recent but is a lot younger than
1991. Has AWK changed significantly in the last 30 years, or does current awk
basically look the same as it did in System V?

~~~
acqq
> Has AWK changed significantly in the last 30 years

Actually it did. Gawk, exactly the dialect used by the author of OP, recently
got some very new and, in the context of awk language dialects, advanced
features.

[https://www.gnu.org/software/gawk/manual/html_node/Namespace...](https://www.gnu.org/software/gawk/manual/html_node/Namespace-Example.html)

The introduction of the "namespaces" construct keeps gawk working with older
programs while also allowing the construction of "libraries" in separate
files.

Anyway, I really like the OP implementation, everybody should take a look at:

[https://github.com/rethab/awk-jvm/blob/master/jvm.awk](https://github.com/rethab/awk-jvm/blob/master/jvm.awk)

It's based on
[https://github.com/zserge/tojvm/blob/master/vm.go](https://github.com/zserge/tojvm/blob/master/vm.go)
but I consider the awk version really more "elegant" in some my-own-taste
sense.

------
loudmax
My first thought reading the headline was that this was an implementation of
Awk that ran on a JVM. But no, this is the reverse of that. This way is far
less useful but infinitely more interesting. Bravo!

~~~
userbinator
Such a thing does exist:
[http://jawk.sourceforge.net/](http://jawk.sourceforge.net/)

...and that naturally leads to pondering whether an implementation of AWK that
runs on the JVM can then run an implementation of the JVM that runs on AWK,
and vice-versa...

~~~
tennineeight
If it's all Turing complete, I see no reason why it can't be possible,
theoretically speaking.

------
tyingq
Added a pull request that wraps typeof() and calls a "polyfill" if the
typeof() function doesn't exist. Gawk didn't have that until v4.2.

------
exabrial
Google is failing me, but I believe someone wrote a JVM in Excel too. When all
you have is a hammer... :)

