
ngn/k, an AGPL K interpreter - kick
https://bitbucket.org/ngn/k/
======
warpech
If you find it unbelievable that code like this is written by hand, read this
short explanation about “a very unusual style of C” [1].

Fastai has a guide on how to write Whitney-style code in Python [2].

There is a relevant thread on HN [3].

[1] [https://github.com/kevinlawler/kona/wiki/Coding-Guidelines#t...](https://github.com/kevinlawler/kona/wiki/Coding-Guidelines#this-is-a-very-unusual-style-of-c)

[2] [https://docs.fast.ai/dev/style.html](https://docs.fast.ai/dev/style.html)

[3]
[https://news.ycombinator.com/item?id=19481505](https://news.ycombinator.com/item?id=19481505)

~~~
saagarjha
This isn't even unusual "I want to write K in C" style, it's specifically
being annoying:
[https://bitbucket.org/ngn/k/src/master/m.c](https://bitbucket.org/ngn/k/src/master/m.c).
I mean, for goodness' sake, _it's pivoting the stack in x86_64 assembly to
implement _start_!

~~~
ngnk
any ideas for a more compact and efficient way to get argc and argv on both
linux and freebsd without bringing in libc?

~~~
pja
Nope :)

But [http://dbp-consulting.com/tutorials/debugging/linuxProgramSt...](http://dbp-consulting.com/tutorials/debugging/linuxProgramStartup.html)

does suggest that zeroing the top-most frame (%ebp) and making sure the stack
is 16-byte aligned might also be worth doing. (Possibly it’s already 16-byte
aligned because you jmp to main instead of calling it, though?)

~~~
ngnk
that's a good read, thanks. afaik zeroing %rbp here plays no role other than
to assist debuggers. i'm not sure if alignment makes any measurable
difference. if it does, it'll be worth doing. i jmp to main() because i wrote
it and i know it never returns. it exit()-s.
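
For readers following along: on Linux x86-64 the kernel leaves argc, the argv pointers, a NULL, and then the envp pointers at the initial stack pointer, which is why a libc-free `_start` has to look at the stack before anything else touches it. Below is a minimal sketch of decoding that layout in plain C. It is not the code in m.c; `entry_args` and `parse_entry_stack` are hypothetical names, the stack pointer is assumed to have been obtained elsewhere (in a real `_start`, via a line of assembly), and pointers are assumed to be one machine word wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of what sits at the initial stack pointer. */
typedef struct {
    long   argc;
    char **argv;
    char **envp;
} entry_args;

/* sp points at the word holding argc. argv[0..argc-1] follow it,
   then a NULL terminator, then the environment pointers. */
static entry_args parse_entry_stack(char **sp) {
    entry_args a;
    a.argc = (long)(intptr_t)sp[0];   /* first word is a count, not a pointer */
    a.argv = sp + 1;
    a.envp = a.argv + a.argc + 1;     /* skip the NULL that terminates argv */
    assert(a.argv[a.argc] == NULL);   /* sanity check on the layout */
    return a;
}
```

Feeding it a hand-built array such as `{(char *)2, "k", "x.k", NULL, "HOME=/", NULL}` is enough to try it without writing a real entry point.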

~~~
saagarjha
> afaik zeroing %rbp here plays no role other than to assist debuggers.

Yep, it marks function frame boundaries. If you don't particularly care, you
may want to consider compiling with -fomit-frame-pointer and the compiler will
return this register back to the set of general purpose ones for it to use.

> i'm not sure if alignment makes any measurable difference. if it does, it'll
> be worth doing.

It will if you end up emitting any SSE instructions, because they will cause
loads and stores that need to be aligned.
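
To make the alignment point concrete: aligned SSE moves such as `movaps` fault when their memory operand is not on a 16-byte boundary, so a hand-written entry point usually rounds the stack pointer down before handing control to compiled code. A small sketch of that arithmetic follows; `align_down` and `is_aligned` are illustrative names, not anything from ngn/k.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr down to a multiple of align, which must be a power of two.
   The stack grows downward on x86-64, so aligning a stack pointer
   means rounding down, never up. */
static uintptr_t align_down(uintptr_t addr, uintptr_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & ~(align - 1);
}

/* True when addr is already a multiple of align. */
static int is_aligned(uintptr_t addr, uintptr_t align) {
    return align_down(addr, align) == addr;
}
```

In assembly the same thing is typically a single `and` with -16 on the stack pointer.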

------
scottlocklin
Yes, you need to understand a bit of APL to grok the style, but I don't know
why it bothers people so much. I'm guessing because it involves people
actually using the macro system for something interesting. Somehow lisp gets a
pass.

Personally I think stuff like the DO macro (F/Fj in this code base
[https://bitbucket.org/ngn/k/src/master/k.h](https://bitbucket.org/ngn/k/src/master/k.h))
should be used everywhere to the extent it is part of the C standard. If
compilers were aware of it, there are significant optimizations which could be
made. It's dirt simple and saves considerable typing and eliminates lots of
potential for error.
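
For anyone who hasn't seen one, here is a rough sketch of a DO-style macro in the Kona/Whitney tradition. The definitions are my own guess at the general shape; the actual F/Fj macros in k.h may well differ. `DO`, `DO2`, `sum` and `sum2d` are illustrative names.

```c
#include <assert.h>

/* Run body n times with an implicit index i. The bound is evaluated once. */
#define DO(n, body)  { long i = 0, n_ = (n); for (; i < n_; ++i) { body; } }

/* Same idea with index j, so the two can be nested. */
#define DO2(n, body) { long j = 0, m_ = (n); for (; j < m_; ++j) { body; } }

/* Sum of a vector. */
static long sum(const long *x, long n) {
    long s = 0;
    DO(n, s += x[i]);
    return s;
}

/* Sum of a row-major rows-by-cols matrix. */
static long sum2d(const long *x, long rows, long cols) {
    long s = 0;
    DO(rows, DO2(cols, s += x[i * cols + j]));
    return s;
}
```

The loop variable and the bound never get retyped, which is where the "eliminates potential for error" part comes from: there is no off-by-one or wrong-variable mistake left to make.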

~~~
msla
> Somehow lisp gets a pass.

OK, which is more readable:

    (setf (car lst) "value")

Or:

    (rplaca lst "value")

I think the one which allows me to learn "setf's destination is the first
parameter" and forget the rest is easier by far. setf is glorious macrology:
It allows people to write arbitrarily complicated datatypes and have
assignment just work, reliably.

But the problem with this code isn't the macros. It's the fact the author
types like they're being charged by the character and taxed for whitespace.
The macros are amazing to the extent they allow this sort of thing. Blaming
them for it is wrong, but judging by the only standard that matters,
readability, this is not a good coding style.

~~~
RodgerTheGreat
Your example neatly demonstrates that your definition of readability is based
on familiarity. What does variable assignment have to do with the Contents of
Address Register, anyway?

In the absence of familiarity, minimizing the number of moving parts and
factoring out repeated patterns still has intrinsic value for aiding
understanding.

~~~
msla
> Your example neatly demonstrates that your definition of readability is
> based on familiarity. What does variable assignment have to do with the
> Contents of Address Register, anyway?

I explained myself _right_ below my examples.

> In the absence of familiarity, minimizing the number of moving parts and
> factoring out repeated patterns still has intrinsic value for aiding
> understanding.

And now you're repeating what I said.

------
emmanueloga_
The author has a page [1] listing a few different implementations of "K" [2]:

* A proprietary array processing language

* A variant of APL with elements of Scheme

* K serves as the foundation for kdb+, an in-memory, column-based database

* Advocates of the language emphasize its speed, facility in handling arrays, and expressive syntax.

1: [https://ngn.bitbucket.io/k.html](https://ngn.bitbucket.io/k.html)

2:
[https://en.wikipedia.org/wiki/K_(programming_language)](https://en.wikipedia.org/wiki/K_\(programming_language\))

Edit: some examples of idiomatic code (I assume...)

* [https://github.com/KxSystems/cookbook](https://github.com/KxSystems/cookbook)

* [https://code.kx.com/phrases/](https://code.kx.com/phrases/)

* [https://github.com/KxSystems/javakdb](https://github.com/KxSystems/javakdb)

~~~
3xblah
"64-bit spyware"

Telemetry?

~~~
ngnk
i changed it to "telemetry". i'm not really sure what to call it. they are
upfront about it in the clickwrap eula but the data is not anonymized (see
section 1.5a - in k ".z.u" means username) and you are not allowed to obstruct
connectivity. it's not quite spyware either, as it happens with your consent.

~~~
userbinator
_it's not quite spyware either, as it happens with your consent._

All the famous spyware I'm aware of which you had to have initiated the action
to install includes an EULA which does mention the data collection. That
doesn't make it any less spyware.

------
chrispsn
ngn wrote:

i'm not trying to purposefully copy [Arthur's] style. i've always been trying
to write shorter code - there are public traces of how that evolved in ngn/apl
and in dyalog's ride. i acknowledge that after seeing some of arthur's code,
including b, my mental threshold of what is considered acceptable dropped
significantly. once you accept certain principles about the code, for instance
that it's more important to be able to hold it at once in your head than to be
able to explain it to the uninitiated, this style becomes more efficient and
more pleasant

...

if you seek simplicity, this style is to a large extent discovered (as opposed
to invented)

complicated things can be arranged in many ways. simple in few. that's why it
always looks like entropy is increasing in the physical world :)

[https://chat.stackexchange.com/transcript/message/52953241#5...](https://chat.stackexchange.com/transcript/message/52953241#52953241)

~~~
RodgerTheGreat
A fun and somewhat illuminating exercise in C is to simply try minimizing the
number of semicolons in your programs. Pick any comfortable, well-defined task
and spend a few hours refining an implementation.

Fewer semicolons means fewer statements. Apart from simplifying the structure
of one's code overall, this also drives writing in a somewhat more functional
style. You will find yourself taking advantage of the comma and ternary
operators more, if you don't already. Macros are another tool for decreasing
repetition. So is expressing programs in a "data-oriented" style, using lookup
tables instead of explicit conditionals. If something is used only once,
inlining it can reduce the number of distinct statements.
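
As a small illustration of the lookup-table point, here is a sketch where a table plus one boolean expression replaces what would otherwise be a switch or an if-chain. `is_leap` and `days_in_month` are just example names.

```c
#include <assert.h>

/* Gregorian leap-year rule as a single expression. */
static int is_leap(int y) {
    return (y % 4 == 0 && y % 100 != 0) || y % 400 == 0;
}

/* Days in month m (1..12) of year y. The table replaces a twelve-way
   conditional; the leap day is added as a 0-or-1 boolean. */
static int days_in_month(int y, int m) {
    static const unsigned char days[] = {31, 28, 31, 30, 31, 30,
                                         31, 31, 30, 31, 30, 31};
    assert(m >= 1 && m <= 12);
    return days[m - 1] + (m == 2 && is_leap(y));
}
```

Each function body is one return statement, and the data that varies lives in the table rather than in control flow.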

------
EdSchouten
You could argue that the license of the code, in this case AGPL, is
irrelevant. I don’t think the source code is any easier to read than
disassembled output.

~~~
beagle3
You could say a similar thing about Japanese text if you don’t read Japanese -
“copyright isn’t relevant, who would ever want to copy this gibberish?”.

As people in this and similar threads mention - it is easier to read; in fact,
it is easier to read than even the preprocessed output, if you are familiar
with the style and concepts.

------
kick
'tlack:

It only took ten years for a non-toy one to come about! (Less if you count the
one that got sued off the internet from the Morgan Stanley employee.)

[https://news.ycombinator.com/item?id=944961](https://news.ycombinator.com/item?id=944961)

~~~
saagarjha
Somehow, I don't think they would appreciate this as much as you might have
hoped:
[https://news.ycombinator.com/item?id=945093](https://news.ycombinator.com/item?id=945093)

~~~
tlack
For the record, I'm more open to this style now so I'd like permission to take
a few steps back from that statement in 2009 (and probably most of my others
that year!).

Coding in a terse style really does have some benefits. You can understand
more program flow at one time and you can easily see larger patterns.

Another recent discussion
[https://news.ycombinator.com/item?id=21890259](https://news.ycombinator.com/item?id=21890259)

~~~
kick
I'm really happy you've changed your mind on this! Thanks for being open-
minded!

------
3xblah
Download links:

[https://bitbucket.org/ngn/k/downloads/k](https://bitbucket.org/ngn/k/downloads/k)
(Linux binary 64-bit)

[https://bitbucket.org/ngn/k/get/master.tar.gz](https://bitbucket.org/ngn/k/get/master.tar.gz)
(Source code)

No Javascript required

~~~
ngnk
i'd encourage people to compile from source. the binary in "downloads" was
compiled without the "-march=native" flag, in order to support older cpus, so
it's a bit slower. also, i don't intend to update it regularly - i'll probably
remove it in a couple of days.

------
chrispsn
Bizarre that this thread contains virtually no discussion of the technical
merits of the end product (performance and size) - just the coding style.

~~~
kick
Right? I found it immensely disappointing.

------
7thaccount
Any doc on how IO and other things work in this lang? The official K doc is
pretty bad, so I can't exactly go there lol.

~~~
ngnk

        0:"path"                  /read  lines
        "path"0:("line1";"line2") /write lines
        1:"path"                  /read  bytes
        "path"1:"content"         /write bytes

~~~
7thaccount
Thanks ngnk, but what if I can't read the entire file into memory (e.g., a
20 GB file)? Are there any streaming words?

~~~
ngnk
reading is done with mmap. it returns instantly and then loads memory pages
from disk only when you use them.

~~~
7thaccount
So it uses memory mapped files for all operations? So whether the file is 1 MB
or 1 TB, it'll read what it can via streaming? Sorry, not knowledgeable here.
Sounds neat though.

~~~
ngnk
1:x uses mmap, so it would return instantly no matter how large a file you
give it

0:x uses 1:x and then it splits the content into lines. unfortunately
splitting requires copying, so you'd be limited by the amount of ram (let's
ignore "swap").

i don't use mmap for writing/amendment yet. i'll be working on it.

the way modern hardware works is like this: every process run by the os has
its own view of the (typically 48-bit) address space. a process can request
from the os that a part of that address space be "backed" by a certain file.
this means that every time the process touches (i.e. tries to read or write) a
virgin memory page there (usually page=4k, always aligned), the os will be
automatically notified and will make sure to fill it in with actual content
from the file, before the process even knows. from that point on, the page
will occupy physical ram. if the os is low on memory later, it may decide to
free up the page and return it to its previous state.

in effect, data from disk (or any disk-like storage) can be "streamed" while
your program uses ordinary array indexing. the word "streaming" though implies
reading from start to end in order (which is additionally sped up by prefetch,
but that's a different story..); memory mapping is more general - it allows
random access.
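
for the curious, here is roughly what that looks like from c. this is a sketch using the standard posix calls, not the interpreter's actual code; `mapped_file`, `map_file` and `unmap_file` are made-up names, and error handling is kept to a minimum.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

typedef struct {
    const unsigned char *data;
    size_t               len;
} mapped_file;

/* Map a whole file read-only. Returns {NULL, 0} on failure or for an
   empty file. No bytes are read here; pages are faulted in on first touch. */
static mapped_file map_file(const char *path) {
    mapped_file m = {NULL, 0};
    struct stat st;
    int fd = open(path, O_RDONLY);
    if (fd < 0) return m;
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            m.data = p;
            m.len  = (size_t)st.st_size;
        }
    }
    close(fd);   /* the mapping remains valid after the descriptor is closed */
    return m;
}

/* Release a mapping obtained from map_file. */
static void unmap_file(mapped_file m) {
    if (m.data) munmap((void *)m.data, m.len);
}
```

after `map_file` returns, `m.data[i]` is ordinary array indexing for any `i < m.len`, and only the pages you actually touch get loaded.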

~~~
7thaccount
I'd love to use this on Windows at some point as your language matures.

Is there any legal reason you'd be worried? It seems like oK and Kona are
meant to exist strictly as toys.

~~~
ngnk
ianal. to the best of my knowledge the project is clean from a legal
perspective. ofc, you don't have to be doing something wrong to become a
target of copyright trolling.

~~~
kick
Yeah, the reason oK was a toy was because John Earnest works for a company
that has their own fork of the K3 source, if I remember correctly, which puts
him in an area that won't end with one of Kx's harsh lawsuits getting
dismissed immediately.

(Not that if he made a high-performance interpreter it'd be a bad thing
legally: the reason GNU's programming guidelines are so archaic is partially
because everyone implementing GNU early-on had seen the UNIX source code yet
they could get around getting sued by writing esoterically.)

------
posterboy
This must be the famous _write once read never_ coding style I've heard so
much about, a bit like my FINO list of bookmarks, where projects like this one
are always welcome.

~~~
dang
This is precisely explained by the quote elsewhere in the thread:

_once you accept certain principles about the code, for instance that it's
more important to be able to hold it at once in your head than to be able to
explain it to the uninitiated, this style becomes more efficient and more
pleasant_

------
svan99
this is awesome. Congratulations!

------
saagarjha
The code is unreadable, formatted in the style of an IOCCC entry–and this
looks intentional. I wonder why.

~~~
smabie
It’s just a different style. After using kdb+/q for aoc2019, I quite like the
no-space, one-line style. I find the code easier to read and remember. I think
most people who have tried an APL-derived language also agree.

It’s important to remember that many/most cultural norms are not optimal; they
are arbitrary. So don’t immediately discount something for looking weird.

~~~
saagarjha
I refuse to believe that the call to main here is a matter of style:
[https://bitbucket.org/ngn/k/src/master/m.c](https://bitbucket.org/ngn/k/src/master/m.c).

~~~
smabie
While I wouldn’t write C like that, that’s how Arthur Whitney and others write
code. The entire kdb+/q codebase is structured like that. And considering the
success and insane performance of the system, I believe there is something to
it.

The point of this style isn’t obfuscation, it’s clarity and compactness. The
compactness boosts the signal to noise ratio and allows the programmer to
retain more information in working memory. I haven’t programmed C like that
and wouldn’t, but I definitely understand and respect the theory behind it.

I have on the other hand written q like that, at first skeptical of the style.
But after a couple days, I wouldn’t have it any other way. Being able to fit
your entire program on a single screen has pretty profound productivity
benefits. It also makes IDEs unnecessary.

For other code structured like this, check out Kona, J or kdb+/q.

In general, I don’t think refusing to believe is a very useful attitude. See:
Chesterton’s Fence.

------
OskarS
Is K [0] in some sort of competition with J [1] about who can write the worst,
most unreadable and unmaintainable C code possible?

[0]:
[https://bitbucket.org/ngn/k/src/master/m.c](https://bitbucket.org/ngn/k/src/master/m.c)

[1]:
[https://github.com/jsoftware/jsource/blob/master/jsrc/v2.c](https://github.com/jsoftware/jsource/blob/master/jsrc/v2.c)

~~~
kick
Snide remarks are dull, and HN's guidelines explicitly recommend against cheap
humor. Every time K is posted to HN someone makes this exact style of comment
in an attempt to be funny, but it's entirely unoriginal.

There's good reason as to why C written by APL programmers looks as it does,
and shallowly dismissing it does a disservice to everyone.

~~~
msla
I still wonder if this is the form the author of the code uses to write the
program.

