I cannot comprehend the mindset of people who decide to spam (because that is what your comment is) any forum with a page of bullshit GPT slop. Do you think it's helpful or interesting?
Personally I believe it is interesting since it shows the current state of the art for the 3 LLMs.
I was surprised that Claude was able to identify the language and explain the code.
But, please feel free to downvote my comments. I guess the aggregate in the end will demonstrate whether it was a useful comment or spam + bullshit GPT slop.
There are a lot of things in various programming languages which are hard to remember, but k and array languages have such a small surface area that not being able to remember them while working with them daily amounts to learned helplessness.
(source: mostly amateur k programmer, also worked with it in a bank, find it vastly easier to read/write/remember than most mainstream languages)
I don't know what you mean by the q array operations being defined in the standard library. Yes there are things defined in .q, but they're normally thin wrappers over k which has array operations built in.
I don't consider an interpreted language having operations "built-in" be significantly different from a compiled language having basic array operations in the stdlib or calling a compiled language.
It is syntactically different, not semantically different. If you gave me any reasonable code in k/q I'm pretty confident I could write semantically identical Julia and/or numpy code.
In fact I've seen interop between q and numpy. The two mesh well together. The differences are aesthetic more than anything else.
There are semantic differences too with a lot of the primitives that are hard to replicate exactly in Julia or numpy. That's without mentioning the stuff like tables and IPC, which things like pandas/polars/etc don't really come close to in ergonomics, to me anyway.
Do you have examples of primitives that are hard to replicate? I can't think of many off the top of my head.
> tables and IPC
Sure, kdb doesn't really have an equal, though it is very niche. But for IPC I disagree. The facilities in k/q are neat and simple in terms of setup, but it doesn't have anything better than what you can do with cloudpickle, and the lack of custom types makes effective, larger-scale IPC difficult without resorting to inefficient hacks.
None of the primitives are necessarily too complicated, but off the top of my head things like /: \: (encode, decode), all the forms of @ \ / . etc, don't have directly equivalent numpy functions. Of course you could reimplement the entire language, but that's a bit too much work.
Tables aren't niche, they're very useful! I looked at cloudpickle, and it seems to only do serialisation, I assume you'd need something else to do IPC too? The benefit of k's IPC is it's pretty seamless.
I'm not sure what you mean by inefficient hacks, generally you wouldn't try to construct some complicated ADT in k anyway, and if you need to you can still directly pass a dictionary or list or whatever your underlying representation is.
> None of the primitives are necessarily too complicated, but off the top of my head things like /: \: (encode, decode), all the forms of @ \ / . etc, don't have directly equivalent numpy functions. Of course you could reimplement the entire language, but that's a bit too much work.
@ and . can be done in numpy through ufuncs. Once you turn your unary or binary function into a ufunc using foo = np.frompyfunc(f, 2, 1), you then have foo.at(a, np.s_[fancy_idxs], (b?)) which is equivalent to @[a, fancy_idxs, f, b?]. The other ones are, like, 2 or 3 lines of code to implement, and you only ever have to do it once.
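A minimal runnable sketch of that amend-at analogy, using the built-in add ufunc rather than a frompyfunc one (same .at semantics either way):

```python
import numpy as np

# k/q amend: @[a; i; +; b] bumps a at indices i by b, accumulating repeats.
a = np.array([10, 20, 30, 40])
np.add.at(a, [0, 0, 2], 5)   # index 0 appears twice, so a[0] gets +5 twice
print(a)                     # [20 20 35 40]
```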
vs and sv are just pickling and unpickling.
> Tables aren't niche,
Yes, sorry, I meant that tables are only clearly superior in the q ecosystem in niche situations.
> I looked at cloudpickle, and it seems to only do serialisation, I assume you'd need something else to do IPC too? The benefit of k's IPC is it's pretty seamless.
Python already does IPC nicely through the `multiprocessing` and `socket` modules of the standard library. The IPC itself is very nice in most use cases if you use something like multiprocessing.Queue. The thing that's less seamless is that the default pickling operation has some corner cases, which cloudpickle covers.
> I'm not sure what you mean by inefficient hacks, generally you wouldn't try to construct some complicated ADT in k anyway, and if you need to you can still directly pass a dictionary or list or whatever your underlying representation is.
It's a lot nicer and more efficient to just pass around typed objects than dictionaries. Being able to have typed objects whose types allow for method resolution and generics makes a lot of code so much simpler in Python. This in turn allows a lot of libraries and tricks to work seamlessly in Python and not in q. A proper type system and colocation of code with data makes it a lot easier to deal with unknown objects - you don't need nested external descriptors to tag your nested dictionary and tell you what it is.
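A toy illustration of the colocation point (Quote is a hypothetical type, not from any library):

```python
from dataclasses import dataclass

# With a typed object, behaviour travels with the data, so receiving
# code needn't consult an external descriptor to know what it holds.
@dataclass
class Quote:
    sym: str
    bid: float
    ask: float

    def mid(self) -> float:
        return (self.bid + self.ask) / 2

q = Quote("ABC", 99.0, 101.0)
print(q.mid())                        # 100.0

# The untyped equivalent: the keys and the applicable function are
# out-of-band knowledge the receiver has to carry separately.
d = {"sym": "ABC", "bid": 99.0, "ask": 101.0}
print((d["bid"] + d["ask"]) / 2)      # 100.0
```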
Again, I'm not saying anything is impossible to do, it's just about whether or not it's worth it. 2 or 3 lines for all types for all overloads for all primitives etc adds up quickly.
I don't see how k/q tables are only superior in niche situations, I'd much rather (and do) use them over pandas/polars/external DBs whenever I can. The speed is generally overhyped, but it is significant enough that rewriting something from pandas often ends up being much faster.
The last bits about IPC and typed objects basically boil down to python being a better glue language. That's probably true, but the ethos of array languages tends to be different, and less dependent on libraries.
But -- (and forgive me if I'm totally wrong) -- this isn't just "non-English" but "non-phonetic", which is a smaller set of written languages, and the underlying language is ... math ... so understanding the underlying grammar itself relies on having decades of math education to really make it jibe.
If this code is just a final result of "learn math for 2-3 decades, and spend years learning this specific programming language" -- my statement stands. Interacting with this kinda binary blob as a programming language is impressive. I think I read somewhere that Seymour Cray's wife knew he was working too hard when he started balancing the checkbook in hex...
The underlying language isn't really very mathematical, at most there's a bit of linear algebra in the primitives but that's it. You certainly don't need any sort of formal maths education to learn APL. There are about 50 or so new symbols, which is not a big ask, with any sort of focus the majority of the syntax etc can be learned very quickly. The "bugs" in your original code stand out very clearly because things like "∘}" don't make sense, ∘ being "dyadic" (infix).
And it bears mentioning that a decent chunk of those symbols are things nearly everyone is familiar with from other languages (+, -, =, etc), symbols you've probably seen in math class or on your graphing calculators (÷, ×, ≠, ⌈, ←, etc), and symbols with very strong mnemonic associations once you've seen them explained (≢, ⍋, ⍳, ⌽, etc).
The volatility calculation looks like it should be doable in q/k, I'm not sure about the more complicated stuff but at the end of the day it's a general purpose language too so anything is possible. KDB being columnar means thinking in terms of rows can often be slower. Sounds like you have a keyed table? If the KDB guys you have aren't that good/helpful you could check out some other forums. Could be useful for the future to be able to circumvent issues you have with the kdb devs.
Doesn't work on empty arrays, whereas the K does and gives the reasonable result of 0 (for total water) and -inf for maximum height. Maybe it doesn't matter, but sensible defaults are a good thing!
Edit: didn't look at the error closely enough to see that the only thing that breaks is the maximum height. That's not that bad, but still not ideal imo.
The result may be reasonable, but this is just a fluke.
Numpy complains that: "ValueError: zero-size array to reduction operation maximum which has no identity".
This error makes sense. The only meaningful result for the maximum of an empty array would be minus infinity. It should not be zero. If it's zero, it means the programming language makes the silent assumption that the elements are non-negative.
I very much prefer to have the code throw back a meaningful error at me than a reasonably looking result, and then fail silently in a different situation.
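In numpy terms: the error is the default, and -inf is available but only when you ask for it explicitly via the `initial` keyword:

```python
import numpy as np

empty = np.array([])
try:
    np.max(empty)                      # raises: max has no identity element
except ValueError as e:
    print(e)

# If -inf really is the identity you want, numpy makes you say so:
m = np.max(empty, initial=-np.inf)     # returns -inf, explicitly requested
print(m)
```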
Would you agree with the statement, "the maximum of an array should be contained in that array?" That seems like a reasonable constraint that most languages uphold.
No. I think the maximum of (x:xs) should be equal to max x (maximum xs). (Using haskell notation). Logically this implies maximum [] should be -infinity if it is defined. On finite lists, maximum is also the same as the supremum, and on infinite sets the supremum doesn't necessarily have to be contained in the set. So there's at least some precedent for not having the maximum in the list.
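In Python terms, a sketch of the same law (maximum here is my own helper, not stdlib):

```python
from functools import reduce
import math

def maximum(xs):
    # -inf is the identity for max, so the law
    # maximum([x] + xs) == max(x, maximum(xs)) holds even when xs is empty
    return reduce(max, xs, -math.inf)

print(maximum([3, 1, 2]))  # 3
print(maximum([]))         # -inf
```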
Going by your definition, maximum (x:xs) = max x (maximum xs) = max x (max (head xs) (maximum (tail xs))) = ... and so on. The basic problem with that definition is that it only defines maximum for non-empty lists: when the recursion reaches [], the pattern-match against (x:xs) throws an exception.
In mathematics maximum and minimum of a set are elements of said set. So the empty set has neither a maximum nor a minimum.
There is another pair of terms that obeys the rules you want max and min to obey: supremum and infimum: <https://en.wikipedia.org/wiki/Infimum_and_supremum>. And indeed supremum of the empty set is -inf. And infimum of the empty set is +inf. That’s because supremum and infimum of a set don’t need to belong to said set. They only need to belong to the set of bounds (upper for supremum; lower for infimum) of said set.
That definition strikes me as intellectually satisfying but less useful for engineering software systems around (for the same reasons as the other commenter). But maybe it's more appropriate in the domains K is used for, I wouldn't know.
If you want a serious answer: as a k programmer, ||\|x (which I assume your example is based on) is instantly readable as producing a bitmask of ones up to and including the last 1, i.e. 0 1 0 0 1 0 -> 1 1 1 1 1 0. Or a max scan from the right on integer arrays. I don't know how to even describe that in English succinctly, let alone in most mainstream programming languages. (Haskell: reverse . scanl1 max . reverse - ok, not too bad)
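For comparison, a numpy sketch of the same right-to-left max scan (my translation, not from the thread):

```python
import numpy as np

x = np.array([0, 1, 0, 0, 1, 0])
# running maximum over the reversed array, then reversed back:
# ones everywhere up to and including the last 1
mask = np.maximum.accumulate(x[::-1])[::-1]
print(mask)   # [1 1 1 1 1 0]
```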
You've sort of answered your own question. None of those three would obviously mean the same thing to me. And that's just one possible combination. If you consider all possible combinations of 4 symbols in K, most of them will have a distinct but obvious (and useful) meaning that is extremely hard to summarise in a single word.
But to me as a relatively experienced APLer, it reads very clearly as "return if there are no non-ASCII chars in `w`" (and if there are, the indices of those chars will be stored in `i` and the code point of every char will be stored in `c`).