k on pdp11 (ktye.github.io)
140 points by tosh 8 months ago | 91 comments



I've been knee deep in APL for the past year, working on a YAML parser: https://github.com/xelxebar/dayaml. It's still very prototypy, but for ~30 lines of code it's surprisingly complete.

The process of getting that project to where it is now really revolutionized how I think of software development in general and how I manage my dev teams at work. Our best practices and wisdom around software dev really are much more strongly tied to historical habits and tooling whims than we'd like to believe. I believe we as a community have a lot of unpicked fruit lying around if we shift focus onto the human side of Human-Computer Interfacing for a while.

About K, though, Arthur Whitney sounds like such a beast, and I would love to mingle with the K community. Within the array programming circles, APL/J/BQN form one cluster and K looks like it's off on its own a bit with different ideas and a laser focus on high performance. Extremely enticing.


I laughed when I saw the source: "just" 30 lines, to be sure, but 30 lines of dense, alien glyphs! This is probably 300 (or 3000?) in a less expressive language, right?

What's a good place to start for languages like this? k is proprietary, isn't it? I found unrelated languages with the same name on GitHub.


>What's a good place to start for languages like this?

I recommend:

https://xpqz.github.io/learnapl/intro.html#but-it-s-unreadab... (for apl)

and

https://xpqz.github.io/kbook/Introduction.html (for k)

... the author admits they are not an expert in either, but the pace and tone are very beginner friendly, which I have found is necessary if you do not have any experience in array languages.


I’m happy to have contributed in some small way to growing the APL community. Thanks for the shoutout.


You're absolutely right that the kind of APL in my parser looks alien. And you're right that there would be a 100X blowup in code size for an equivalent C implementation.

Part of the issue is that you're seeing the entire application architecture on a single page. Zero libraries, to boot. I'd expect a similar feeling from looking at a detailed architectural diagram for some business application in a domain that you're only partially familiar with.

The code is written specifically for the domain expert, trading off ease of onboarding everywhere it would incur a cognitive cost. The reason I do this is specifically to eliminate any incidental complexity and force myself to think only and directly about the problem domain.

It's a funny feeling. Almost every character in the code reifies a concrete feature of the parsing architecture. Said another way, I can't go in and change anything _without_ having to think about deep and relevant questions in YAML itself or the way I'm choosing to decompose the parsing problem.

One could argue that something similar is true about spaghetti code, since changing one piece of code affects every other, but I've really worked hard to orthogonalize out and compose salient features of YAML. Implicitly, every line of code is a parser pass that defines a type transformation on the AST. Each pass introduces specific invariants that later passes can rely on to express higher-level ideas.
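
To make the "every line is a pass with its own invariants" idea a bit more concrete, here is a toy Python sketch. It is not how dayaml works; the pass names, the two-space indentation rule, and the parent-index representation are all assumptions made up for illustration.

```python
from functools import reduce

def to_lines(text):
    # Pass 1: str -> list[str]. Invariant afterwards: no blank lines remain.
    return [ln for ln in text.splitlines() if ln.strip()]

def to_depths(lines):
    # Pass 2: list[str] -> list[(depth, text)].
    # Assumes two spaces per nesting level (an illustrative simplification).
    return [((len(ln) - len(ln.lstrip(" "))) // 2, ln.strip()) for ln in lines]

def to_parents(rows):
    # Pass 3: list[(depth, text)] -> list[(parent_index, text)].
    # Rows with no shallower row before them point at themselves.
    last_at_depth = {}
    out = []
    for i, (depth, text) in enumerate(rows):
        parent = last_at_depth.get(depth - 1, i)
        last_at_depth[depth] = i
        out.append((parent, text))
    return out

# The whole "architecture" is the ordered list of passes.
PASSES = [to_lines, to_depths, to_parents]

def run(text):
    # Feed the output of each pass into the next one.
    return reduce(lambda value, p: p(value), PASSES, text)

tree = run("a:\n  b:\n    c: 1\n  d: 2\n")
```

Each pass can rely on what the earlier ones guarantee: `to_parents` never has to think about blank lines or raw indentation, only depths.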

Anyway, I set out on the parser project specifically to verify whether such a development experience is possible. Aaron Hsu has videos on youtube expounding similar ideas, but it's hard to believe unless you directly feel it yourself.

> What's a good place to start for languages like this?

Come over to the APL Orchard (https://apl.chat). It's pretty active, and the main Dyalog member there, Adám, is both incredibly knowledgeable as well as very available for helping interested beginners.


I was about to ask if Aaron had inspired you, considering how similar your comments are, as well as the type of problem you're solving. Neat to see others doing the same!


there are at least three reasonably comprehensive open-source k implementations:

https://github.com/kevinlawler/kona (k3)

https://github.com/JohnEarnest/ok (k6, js-based)

https://codeberg.org/ngn/k (k6, c-based)

even today, many k programmers learn from the k2 reference manual. k2/k3 are extremely similar:

http://web.archive.org/web/20050504070651/http://www.kx.com/...

oK has a briefer manual which describes k5/k6:

https://github.com/JohnEarnest/ok/blob/gh-pages/docs/Manual....


you may also enjoy uiua (https://www.uiua.org/) which uses these alien glyphs but is even more alien because it's a concatenative language (stack oriented), like forth or postscript, but to make it even more alien it's written right to left. For example 1+2 is written "+ 1 2" (in forth it would be "1 2 +")

The language and the site are brilliant and I think worth 30m of your time skimming through and trying out the examples in the online editor / tutorial.


The only reason stack-based languages don't seem like alien tech to me is the HP48G+ graphing calculator I had in high school.

> Uiua can be embedded in Rust programs as a library.

Cool! I was wondering if there was any viability beyond short scripts.


I still use a reverse Polish calculator (on my phone), but I'm not used to typing an expression left to right and having it executed right to left.

For example in uiua:

    * + 1 2 5
Pushes 5 on the stack, then pushes 2, then pushes 1. Then it pops 1 and 2, adds them together and pushes 3, then pops 3 and 5 and multiplies them together.

Reverse polish:

    5 2 1 + *
EDIT: it does make sense in the context of an array language
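
For anyone who wants to poke at the two evaluation orders, here is a minimal Python sketch of the stack walk described above. It is not Uiua's actual implementation: the function names are made up, and only `+` and `*` are handled, so operand order for non-commutative operators is deliberately not modeled.

```python
import operator

# Only commutative operators, so argument order does not matter in this toy.
OPS = {"+": operator.add, "*": operator.mul}

def eval_tokens(tokens):
    """Process tokens in the order given: numbers are pushed, operators pop two values."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            a = stack.pop()
            b = stack.pop()
            stack.append(OPS[tok](a, b))
        else:
            stack.append(int(tok))
    return stack.pop()

def eval_right_to_left(src):
    # Uiua-style reading: the rightmost token is executed first.
    return eval_tokens(reversed(src.split()))

def eval_rpn(src):
    # Reverse Polish: the leftmost token is executed first.
    return eval_tokens(src.split())

uiua_style = eval_right_to_left("* + 1 2 5")
rpn_style = eval_rpn("5 2 1 + *")
```

Both calls go through exactly the same stack operations; the only difference is which end of the text you start reading from.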


The uiua version makes more sense to me than the other even without arrays. It's closer to how my brain wants to understand what is happening: "multiply (the result of...) and 5" instead of "5 (2 and 1 added) multiplied."

The latter is kinda like "the old man the boat" where information occurring later in time changes the meaning of what comes prior.

I'm sure once things get complicated enough I would understand neither RtL or LtR with equal confusion, but the smaller cases seem very readable.


Interestingly the uiua order is the same as lisp (if you add some parentheses)

    (* (+ 1 2) 5)


Out of interest, which calculator do you use for your phone?

For whatever reason over 90% of them have the same silly design flaw that pressing 'log' just appends 'log(' to whatever is in screen, which makes no sense. It effectively makes it impossible to ever calculate the logarithm of something.


HiPER Calc Pro

I mostly use the RPN mode though.

In expression mode you're supposed to first enter log and then the inner expression


Ah thanks, looks like it has some interesting features. They seem to have chosen to add an 'Ans' variable in there, which I suppose also works. At least it avoids the issue where you calculate something, need its logarithm, and you just can't.

For now I'm using RealCalc, which is about the only one that I can find that has a 'normal' calculator mode (i.e. infix for binary operators, suffix for unary). Though it also has an RPN mode, which is quite useful.


Yup, UIUA is Polish notation rather than Reverse Polish notation.


From experience, my take is that most laymen devs are unwilling and/or uninterested in taking the dive into apl/k/array langs due to the, in their own words, deliberately abstruse and obscurantist syntax/glyphs/typing; it's just too theoretical and/or cognitively/neurologically daunting for average engineers that just want to get paid.


Imagine if someone wanted to play music in an orchestra, but couldn't read music because it was "too confusing", but they'd never even tried to learn. They would be laughed out the door.

I'm not sure why so many laymen devs you mention (who, incidentally, tend to have an affinity for learning music and other "esoteric" skills) seem to apply that same logic. Like, why not learn and try the basics before throwing your hands up and saying it's too hard. It isn't... although it is very, very different.


The comparison is not fair though. Modern music notation, despite all its faults, is the de facto standard. Many attempts have been made to improve it. Some of those attempts may have been objectively better but the cost to change the standard is too high.

You wouldn't be surprised if somebody criticized an alternative music notation as being too confusing or requiring too much effort to learn for little benefit if they already know the mainstream music notation.


There has long been an ideological war against knowledge and learning.


> abstruse and obscurantist syntax/glyphs/typing

I've often wondered what are the real blocks to having it both ways? Have the source saved as some lowest common denominator - IL, tagged AST, or somesuch - and then read and written in whatever pleases the individual. It would be hard to make it work between languages with drastically different semantics that don't map to anything in another language, I suppose would be the real reason... but I always think about the fact that it makes it down to machine code eventually, so it seems intuitive that the process could be reversed.

Or less drastically, an IDE for APL beginners: symbols have human language helpers or UI overlays, but the source is still APL.


The concise notation of vector languages is not just an affectation; it's a crucial property. Obscuring the notation with long names and "more familiar" syntax diminishes its value. Translating it into something "easier" leaves beginners unable to communicate clearly with APL experts or engage with resources from the broader community.

Imagine suggesting to a Prolog programmer that they make the language more accessible to beginners with tools that dynamically translate it into the equivalent Fortran, which is, after all, a Turing-complete language...


I'd need some stronger evidence to believe the claim that in-place translation of individual glyphs into words is equivalent to translating between languages with wildly different paradigms.


Indeed, here's the reference manual for kx systems k 2.0[1]

One thing you'll notice is that most of the operations are given names, because trying to communicate about @[d; i; f; y] doesn't really work, while "Amend Item" does. I'd equally need some very compelling evidence that @[d; i; f; y] is inherently superior to d.amend_item(i,f,y) to anyone but diehard APL/J/K fans. More likely, if we were to translate this into a syntax more acceptable to others, it would be something like d[i] = f y (but I haven't bothered to figure out the exact semantics of Amend Item; I just picked an item from the manual).

Most of it can be decomposed and expanded fairly straightforwardly. E.g. after lengthy back and forth a few days back it turns out ",/+|(8#2)\textb" that someone offered up translates into a more familiar OO-syntax roughly to:

    textb.encode(8.reshape(2)).reverse.transpose.concat
... where the most unfamiliar parts to most of us would be: 1) a standard library where operations that we usually expect to apply to an array instead apply to each item. So e.g. "reverse" here is not "reverse the array" but "reverse each item in the array", and the same goes for the rest. 2) The two operations that don't have "reasonable" definitions. One is "reshape", which in this case (two int arguments) just creates an array of 8 elements of the value 2. The other is "encode", which repeatedly takes a modulus from one array, applies it to each element in the other array, passes the quotient forward, and repeats until the "modulus array" has been processed. Putting in (8#2), which translates to [2,2,2,2,2,2,2,2], basically decomposes bytes into their constituent bits, but you can also do e.g. (24 60 60)\somenumberofseconds to "encode" a number of seconds into hours, minutes and seconds, which might give a better idea of how it works.
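
The "encode" description above is easier to follow with something runnable. This is a rough Python sketch for a single number, based only on the description in this comment and not checked against any particular k implementation; the function name and the sample inputs are made up.

```python
def encode(radices, n):
    """Split n into mixed-radix digits, most significant digit first.

    Works from the last radix backwards: take the remainder, pass the
    quotient forward, repeat until every radix has been used.
    """
    digits = []
    for radix in reversed(radices):
        n, digit = divmod(n, radix)
        digits.append(digit)
    return digits[::-1]

# (24 60 60)\3725 in the comment's notation: hours, minutes, seconds.
hms = encode([24, 60, 60], 3725)

# (8#2)\65: eight radices of 2 give the eight bits of one byte.
bits = encode([2] * 8, 65)
```

Anything that overflows the leading radix is simply dropped, which is why eight 2s are enough for one byte but not for larger numbers.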

I'd read the "decoded" version over the terse one any day. I think a "midpoint" that introduced some of these operators in a more "regular" language might well be able to remain readable, but too much density and I can't scan code and get a rough idea of what it does anymore, and that to me is a hard no.

Because while some aspects of thinking in terms of arrays this way are also very foreign, the syntax is frankly more than half the battle.

[1] https://www.nsl.com/k/k2/k295/kreflite.pdf

EDIT: Here are some of my notes from trying to figure the above out, and some horrific abuse of Ruby to modify some of the core classes to make something similar-ish to the above actually work:

https://gist.github.com/vidarh/3cd1e200458758f3d58c88add0581...


Trying to communicate about @[d; i; f; y] works once you know what that does. Like with any mathematical notation, it's given a name just like '+' has a name (plus).

I don't know if it's "inherently" superior (everything is subjective preference), but @[d;i;f;y] seems more amenable to algebraic manipulation. Also, because it is a function that takes four arguments rather than some property of d, k's projections (similar to currying in functional languages) let you leave any of the 4 arguments empty to create a curried function.

For example you could do something like @[d;;:;x]'(a; b) which will replace the items at indices a and b with x. Or any number of other possibilities. Compare to python where to cover all possible behaviour you'll need various lambdas + for + zip + map + ...
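
Here is a rough Python sketch of what a projection with a hole in the middle might look like. None of this is a real library API: `HOLE`, `project`, and `amend` are hypothetical names, and `amend` only approximates the Amend Item idea (apply f to the item at index i together with y). As far as I understand the each (') form, it produces one amended copy per index, which is what the list comprehension below mimics.

```python
HOLE = object()  # hypothetical placeholder for an argument that is left empty

def project(fn, *fixed):
    """Return a function that fills the HOLE positions of `fixed`, left to right."""
    def projected(*rest):
        supplied = iter(rest)
        args = [next(supplied) if a is HOLE else a for a in fixed]
        return fn(*args, *supplied)
    return projected

def amend(d, i, f, y):
    # Non-destructive: returns a copy of d with item i replaced by f(d[i], y).
    out = list(d)
    out[i] = f(out[i], y)
    return out

def assign(old, new):
    # Stand-in for the ':' in the example above: ignore the old value.
    return new

d = [10, 20, 30, 40]
set_item_to_zero = project(amend, d, HOLE, assign, 0)   # like @[d;;:;0]
results = [set_item_to_zero(i) for i in (1, 3)]         # applied to each index
```

The explicit `HOLE` object is exactly the kind of ugly placeholder that an empty slot in the argument list avoids.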


So that to me seems separate from the minimalist syntax. The problem isn't the flexibility, but the sheer combined density. The problem isn't even necessarily the odd extra operator, but that there are so many of them that are heavily overloaded.

I don't think the projections example is possible to reproduce in an ergonomic way in e.g. Ruby or Python, but you could certainly imagine a syntax extension to allow something similar. Since a closure in Ruby is just another object, capturing arguments and returning a modified closure is trivial, but the syntax won't allow leaving out arbitrary unnamed parameters, and so you'd need an ugly placeholder value; amending the syntax to allow e.g. "f(x,,z)" with an omitted value in between set to some placeholder object wouldn't break anything, and might well be useful. There are certainly parts like that in k that seem valuable.

The dot-syntax vs. a more functional syntax I think is purely subjective - all of us understand both, I just happen to prefer an OO syntax. d.amend(i, f, y) or amend(d, i, f, y) isn't what'd break us. Not the "@" either if it was one of a smaller number...


There aren't really that many, not when you compare it to the total number of python/ruby functions that would be roughly equivalent. (Ignoring the fact a lot of k functions operate by default on matrices and other nested structures in ways that python/ruby/etc can't really do at all)

Yes, it's dense, but that's by design. And in a vacuum it may seem like replacing @ with amend wouldn't make much of a difference, which is true. But replacing every character with a word in the same way (which is tricky - because it's hard to precisely define things, and some characters have multiple meanings) would end up with a line of k turning into a page of prose, losing the mathematical nature of it in the first place.


The nice thing about the "decoded" syntax is that it's extensible - you can design your own operations like "encode" and "reshape". But if you don't care about that, then a terse syntax allows you to read more code at a glance once you've become familiar with it. (Of course APL-like languages have other pitfalls, such as most operators being overloaded based on arity, such that meaning can change completely depending on the context. This is barely acceptable for a "terse" syntax where every operation has to be denoted by a single character but becomes quite silly otherwise.)
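
As an illustration of the arity overloading mentioned above: in APL the iota glyph with one argument generates indices, while with two arguments it looks up positions. A loose Python sketch of the same idea, assuming 1-based indexing and flat lists only (the function name is made up):

```python
def iota(*args):
    """One argument: the indices 1..n. Two arguments: index of each needle in haystack."""
    if len(args) == 1:
        (n,) = args
        return list(range(1, n + 1))
    haystack, needles = args
    # Items that are not found get one past the last valid index.
    return [
        haystack.index(x) + 1 if x in haystack else len(haystack) + 1
        for x in needles
    ]

indices = iota(5)
positions = iota([10, 20, 30], [30, 10, 99])
```

Spelled out as a named function, having one name do two unrelated jobs looks odd, which is the point being made about overloading outside of a terse notation.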


Yeah, I experimented with Ruby for it, and frankly I'm not averse to golfing it down a bit, but I find there is a big gulf here. At one end are people like me, who treat text (whether code or not) as something to skim, find elements of to study in isolation, "zoom in and out" of, and navigate by shape.

I'm extremely picky about syntax and formatting because of how I process code, because even layout and formatting matters for me in how I digest and work with code, and my recall of code is often "visual" in that I remember the shape of different parts of the code I work with.

APL/J/K etc. throw off everything about that for me. I need to sit down and decode symbol by symbol. I get that I'll get faster at recognising them the more I try, but it's too visually compact for me.

On the other end of the spectrum you have at least a subset of users of these languages who see code as something to read symbol by symbol beginning to end with perhaps some referencing back. I get that if you treat code that way, then it's nice for it to be tiny and compact, and it doesn't matter as much whether or not the layout is conducive to skimming and jumping around if that's not how you process code.

The same groups, to me, also seem to correlate with a preference for language vs. maths, though maybe I'm just biased because I'd firmly be in the language camp.

I'm not sure how reconcilable those two ways of reading code really are.

On the other hand, I also deeply care about being able to see the important bits of the code at the same time, and so I tend to spend a lot of time decomposing my projects into small pieces and focus on size more than many "in the language camp", and I guess that's part of the reason why these languages still fascinate me, though I'd have to scratch my eyes out if I had to read code in them on a daily basis.


Ah, that makes sense. Okay, you've convinced me my original idea doesn't make much sense after all!


> Or less drastically, an IDE for APL beginners: symbols have human language helpers or UI overlays

Dyalog, which maintains a commercial APL, provides a free development environment which does exactly what you want[0]. You can hover over all of the available glyphs for documentation.

It’s a really fun development environment in that it does something that I’ve never really experienced in any other: you can modify prior inputs to experiment with new ideas.

Playing with APL (it often does feel like play, or even sculpting with clay) is really fun. Be careful, you might get attached!

0: https://www.dyalog.com/


> you can modify prior inputs to experiment with new ideas

having trouble understanding what is meant. like, changing things like in excel?


Try that experience at https://tryapl.org

Enter 1+2 where the cursor is below the copyright notice.

Now click or use arrow keys to navigate up and change the 2 to a 4 and hit Enter again.


This thing of having a screen editor where you could move the cursor and execute any command instead of a linear terminal was a staple of 1970's and 1980's computers. Oberon also is notable for letting you execute code written anywhere (and so allowing code to be used as command palettes or menus)


i mean that's cool, but from my background i would say since the input is "variable", assign it to a named variable and run the function with the variable assigned to 2, then again with it assigned to 4. that's doable in most any REPL.

or even take the function, and map a range of values to the function, to get a range of outputs. again, doable in most REPLs where the language supports some sort of mapping syntax.

can you help me understand what makes this unique? i'm not an APL user, so maybe it's just a different mindset altogether that i'm missing.


APL was one of the first interactive environments along with Lisp. They've had this back when you had a typewriter with a rotating ball with the symbols connected to the time share computer. So you'd type an expression on the typewriter, computer would run it, and it would be printed on actual paper lol.

With APL you can also do the stuff you're referring to by creating a function and mapping it to a list of numbers. If you're familiar with REPLs, the experience is similar.


You know how in a typical REPL, you’ll make use of the up arrow to repeat a previously executed expression?

Dyalog allows you to click on the previously printed expression, modify it in place and when you hit enter, it’ll evaluate as if entered on the read line.

It’s a simple little feature but it makes everything feel so malleable.


You beat me to saying the same thing lol. Their IDE is pretty helpful.


> "I've often wondered what are the real blocks to having it both ways?"

That it's not particularly useful. Here's a classic APL: {(~T∊T∘.×T)/T←1↓⍳⍵}

And in words: direct function (not T member T jot dot multiply T) compress T gets one drop iota omega.

Or in Pythonic words: lambda w: [n for n in range(2, w) if n not in [i * j for i in range(2, w) for j in range(2, w)]]

That is, the APL in Englishy words is no clearer. The Python is mostly clear because it's familiar (there are Python programmers who don't know or understand list comprehensions). And isn't it dull to have to type out the "for in range if n in for in range for in range" and a lambda and two list comprehensions? Just to say "numbers which aren't in the multiplication table of those numbers and themselves" Wouldn't it be clear enough if you could write more like a hybrid with JavaScript or C# anonymous functions with curly braces?

    lambda w: [n for n in range(2, w) if n not in [i * j for i in range(2, w) for j in range(2, w)]]

    w=>{nums=range(w).Drop(1); nums.where(n=>n not in outer_product(__op__.mul, nums, nums))}

    w=>{T=range(w).Drop(1); T.where(n=>n not in outer_product(__op__.mul, T, T))}   ⍝ no idea why the original APL code used T

    w=>{T=range(w).Drop(1); T.where(n=>n not in T ∘.× T)}  ⍝ outer product as a symbol with normal multiply symbol

    w=>{T=iota(w).↓(1); T.where(n=>n not in T ∘.× T)}    ⍝ drop as a symbol ↓

    w=>{T=1↓⍳w; T.where(n=>n not in T ∘.× T)}      ⍝ range as a symbol (iota ⍳)

    w=>{T=1↓⍳w; (T not in T ∘.× T)/T}      ⍝ where as a symbol (compress /)

    w=>{T=1↓⍳w; (~ T ∊ T ∘.× T)/T}    ⍝ not ~ and in ∊ as symbols

    w=>{T←1↓⍳w ⋄ (~ T ∊ T ∘.× T)/T}   ⍝ APL statement separator ⋄

    {T←1↓⍳⍵ ⋄ (~ T∊T∘.×T)/T}     ⍝ implicit lambda argument as a symbol (omega ⍵)

    {(~T∊T∘.×T)/T←1↓⍳⍵}      ⍝  inlined two statements into one, reads right to left
Presumably you're fine with symbols in Python like the colon when introducing functions, brackets for lists, asterisk for multiplication, parens and commas for tuples and function arguments, equals for assignment. You're likely fine with ! for Boolean 'not' in C-likes. If you have symbols for range, outer product, filter, in, drop, then you start to realise how annoying it is to type them out. We shorten the things we do often, or make them implicit. And we often do loops and data transforms when programming. It isn't the symbols which are the hard bit, and making them words doesn't make them usefully easier.
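
For anyone who wants to run the Python side of this, here is a step-by-step sketch that mirrors the APL one-liner piece by piece. It assumes APL's default 1-based iota, so the range runs from 2 up to and including w (hence `w + 1`); the function name is made up.

```python
def primes_upto(w):
    T = list(range(2, w + 1))                        # T←1↓⍳⍵   (2..w inclusive)
    table = {a * b for a in T for b in T}            # T∘.×T    (multiplication table, as a set)
    keep = [t not in table for t in T]               # ~T∊T∘.×T (boolean mask)
    return [t for t, k in zip(T, keep) if k]         # mask/T   (compress)

small_primes = primes_upto(20)
```

Like the original, this builds the whole multiplication table, so it is meant as a reading aid and not as an efficient sieve.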


The Dyalog IDE lets you hover over each symbol and get an English name, description, and example of it being used monadically and dyadically iirc. The former is like using the symbol as a prefix to an array input and the latter in between two arrays. It doesn't take too long before a lot of symbols just click. It's easier to learn than you think.


I think it's a combination of that and being taught for so long the "right way" to do things with languages like APL appearing "too clever".


Afaik most undergrad education is still explicitly procedural/imperative/OOP with a dash of lisp/scheme in theory of comp.

Barring the exceptionally driven and programs that are unusually theoretical and/or historically based (i.e., SICP), many new grads enter industry having only ever used js/python/java/c++ it seems.


> for average engineers that just want to get paid

Εὐκλείδης' (fl 300 BC) hot take: https://mathshistory.st-andrews.ac.uk/Biographies/Euclid/#:~...


If you like podcasts you might check out the Array Cast podcast if you haven't.

>https://www.arraycast.com/


I'll third the ArrayCast. The host is Conor Hoekstra, a C++ programmer at NVIDIA Research. Regular panelists include Adám Brudzewsky[1], who works at Dyalog APL; Bob Therriault[2], who runs the J Wiki[3]; Stephen Taylor, who works with K and Q and cut his teeth on APL planning one of Chris Bonington's Mt. Everest expeditions[4]; and Marshall Lochbaum, who worked on J and APL and developed the BQN[5] language. All the episodes have transcripts on their site.

And their guests include Jeremy Howard the founder of fast.ai and Kaggle, Henry Rich implementer of the J language, Stevan Apter a K programmer and creator of No Stinking Loops site ( nsl.com ), Andrew Sengul who created April a blend of APL and Lisp, Morten Kromberg the CTO of Dyalog APL, Aaron Hsu creator of the co-dfns GPU compiler for APL, Brooke Allen a Wall Street entrepreneur and APL programmer, Eric Iverson son of Ken Iverson the APL creator and J designer/developer, Ashok Reddy former chemical engineer and employee at Rational (the UML people) now CEO of the KX company, Atilla Vrabecz a PhD Chemist turned K language entrepreneur, Leslie Goldsmith former APL developer at I.P. Sharp, John Earnest creator of the Ok language, Troels Henriksen creator of the GPU accelerated language Futhark, Lib Gibson former APL developer and consultant and project/developer manager at I.P. Sharp, programming polyglot Romilly Cocking, programming polyglot Vanessa McHale, and others.

[1] https://news.ycombinator.com/user?id=abrudz

[2] https://news.ycombinator.com/user?id=bobterryo

[3] https://code.jsoftware.com/wiki/Main_Page

[4] https://github.com/5jt/swf75

[5] https://news.ycombinator.com/item?id=24167804


It's a great podcast! I'm the 1st of the guests you listed, in fact. :D If folks are interested in learning APL and array programming, you might like the ~20 hour video series we created:

https://forums.fast.ai/t/apl-array-programming/97188

You might also like our complete Dyalog APL reference, which (AFAIK) is unique in never using symbols in examples which haven't been covered earlier in the reference. So you can read it top to bottom to learn the whole set.

https://fastai.github.io/apl-study/apl.html


Wanna second the Array Cast. Really, really solid stuff even if you're not specifically interested in the array languages.


Yes, I didn't mention that but they have a rotating schedule of some of the most elite programmers you have ever heard. I will also say that array programming is very seductive if you have the "puzzle solving" mentality that a lot of us have. It seems so elegant. I have never found a legitimate use for it in my paying work, but I am captivated by it privately.


What bars do you use to measure “legitimate use”?


I don't mean that the languages are not legitimate; I mostly mean what would be acceptable to either management or a client. Also, I feel that code someone might have to maintain if I leave should be easy for them to handle, and J, APL, or K would require them to find a rare developer.


Both are reasonable concerns. If you were running your own company the calculus might change though. Or not but I’d say it certainly could. Similarly for code which is not client facing.


Yeah, there's really great history and some amazing stories to be heard from the interviews.


I’ll second the request for elaboration! I’m interested in APLs but have never tried writing more than a tutorial.

Any particular examples of things that people {are,aren’t} doing, that they {shouldn’t,should} be doing that you felt like sharing, or just examples of things you realized while writing this, would be really interesting to hear!

I (and I know lots and lots of others) in tech feel like we could be doing better but don’t know how, and APL is interesting because it attracts such veneration despite being so weird.


Please do tell more. I'm a CTO struggling with managing my devs. What do you mean?


Not OP, but I can guess a bit

With modern software development there is a ton of boilerplate and focus on tooling (how many hours do we waste on IDEs or text editors), compilers, build tools... whatever.

With APL or K, you don't really have a lot of that. Once someone is proficient, they can do a ton with just a tiny amount of code. Aaron Hsu has a bunch of videos online and chats on HN (user arcfide I think) on the strengths of APL in general and how he built a compiler to run APL on the GPU and got it down to like a page of the alien glyphs. His point is that with APL you can eliminate huge amounts of things like helper functions and see the entire code at a more macro level. He literally uses windows notepad to code in as well, so a lot less time is wasted on learning/managing/configuring all the tools in the ecosystem.

Supposedly they've done some studies where people can be taught APL pretty easily if they go in with an open mind. I can code a bit in it after only playing with it a bit. The symbols are a LOT more intuitive than you think (a lot of effort went into making them make sense in representing the operations they do). The big problem is professional devs don't want to get locked into a niche technology or admit there are other ways to do things that don't involve Java-like hell. This is less of a problem with finance workers using Kdb+ (a very fast array database that you access with the k or q languages) that get paid $$$$$$$ to write stock ticker analysis software.

As a CTO, I'm not sure what the lessons are except keep code more reasonable. Don't write stuff you don't need...etc. I'm guessing OP has gotten used to the expressive power of APL and the interactive nature of the software and how to keep things lightweight and now sees a lot of modern software as being overly bloated. It reminds me of a vendor that shipped us what should have been a page long application (pretty simple data transformations), but instead it was like 50 class files and all sorts of other junk. I was blown away by the inefficiency and extreme complexity for what we later wrote ourselves in a short script with few functions. Assuming it wasn't outright on purpose, I think certain industry tools encourage over-architecture.


I think the problem is that most "big data" problems aren't really that big, or the latency requirements aren't that stringent. So people are OK with much slower solutions they can throw midlevel python devs & parallelize AWS compute at. With true time series problems you do get many "big data" problems that you can't actually parallelize (think sequential event based operations where the previous results are needed for deciding what to do with the next event).
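
A tiny illustration of the kind of sequential dependency meant here, as a hedged Python sketch: whether an order is accepted depends on the position left by all earlier orders, so the events cannot simply be split across workers. The position limit rule and the numbers are invented for the example.

```python
from itertools import accumulate

LIMIT = 100  # made-up position limit

def step(position, order):
    # Accept the order only if the resulting position stays within the limit;
    # otherwise the order is rejected and the position is unchanged.
    new_position = position + order
    return new_position if abs(new_position) <= LIMIT else position

orders = [60, 50, -30, 80, 40]

# Each output depends on the previous output, not just on the current order.
positions = list(accumulate(orders, step, initial=0))
```

Compare this with a plain running sum, where partial sums over chunks can be computed independently and stitched together afterwards.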

With data frames you get some of the "close to the data" interactivity you get with an APL/J/K/Q stack. Of course, K is ~100x faster than Polars which is what.. ~10x faster than Pandas. Meanwhile most people are still using Pandas in a Jupyter notebook despite Polars being right there.

You'd think as people look at their AWS bills they'd be considering solutions that use 1/10th, 1/100th and even 1/1000th the compute time.. and yet.


They may be breaking even after firing all their internal server people...idk.

Other solutions like Kdb+ may be way more efficient, but are also crazy expensive from what I've heard.

Is there a free lunch in this context?


If you mean "free" as in open-source/free, there is J, which has its own builtin database. I'm assuming it's similar to k/kdb, but that's just a guess.

Learning an APL variant... not a free lunch. Takes a while and commitment to grasp.


J language is free, J database isn't - "Jd is a commercial database product from Jsoftware" - https://code.jsoftware.com/wiki/Jd/Overview


I've played with J and read a book on it. It's expressive and powerful of course, but wasn't nearly as natural to me as APL or Kdb+.


Arguably "crazy expensive" software that minimizes crazy expensive hardware is something worth considering. Firms pay 10s of times more for things like DataDog than they do for a KDB+ license.

Also I found the argument that KDB+ devs are expensive to be laughable once I saw how much (same, or even more) we started to pay AWS/Python devs.


Crazy expensive may be affordable to firms like JP Morgan, but other industries just won't pay that when Postgres is free. It's not as good for the kind of analysis Kdb+ does IMO, but free is free and it's easier to just get a VM from IT.


Thanks! On a side note, do you think that APL-like languages like J or K are less expressive because they don't use the alien symbols?


I'm no expert, but they're all pretty darn expressive.

Personally I'd love to work with Kdb+ as you have a lightning fast database and you typically use the q-language instead of k. Q allows you a lot of the power of both SQL and a general purpose array language.

The Dyalog variant of APL does have support for common data formats like CSV & JSON, but I think it's all a lot more natural in Kdb+.


Excellent question. Thank you for asking.

This might sound strange, but the most salient lesson I've gotten is that value lies 10% in the software and 90% in understanding the problem domain. My customers don't usually care about _how_ we solve a problem for them, they just want a solution. They're often willing to discuss nitty-gritty details, but their ability to just get on with the stuff they're good at scales directly with our ability to simply remove a problem from their life.

Much miscommunication and incidental complexity stems from inverting that relationship, IMHO. Treating the product, i.e. the software, as the valuable asset and curating it thusly encourages us to focus on issues that are less relevant. It's harder to freely explore the problem domain and update our understanding when the landscape changes.

How this boils down in my day-to-day is that I have started giving my devs much more agency, allowing them to make bigger mistakes and learn harder-earned lessons, as long as it's clear that they're engaging directly with the core problem. In nitty-gritty discussions, I make sure that the discussion never loses sight of the core business problem and try to make sure we stay aligned on that core aspect.

It's a bit strange that simply writing a YAML parser could connect so directly with business management-level sentiments, but I went into the project knowing that the problem itself isn't information-theoretically massive and that APL should let me converge on some "set of equations" that distilled out its essential nature.

The process involved many rounds of completely throwing away my entire codebase and starting from scratch. Each time I found myself thinking about details not endemic to the problem, I would throw out the code and try again. It's psychologically easier to do that with 100 lines of APL than with the 10,000 line equivalent Python or C. Now everything is at a point where pretty much every character in the implementation externalizes some important piece of my understanding about YAML parsing.

That process contrasted starkly with the habits and culture at my work and got me to think about how much time, energy, and emotion is spent on potential irrelevancies, which then led me to think about where my company's true value lies and how we can keep that at the forefront.


I’ve found the notion of a parent vector to be useful for parsing problems. I like to call them Apter trees after this article http://archive.vector.org.uk/art10500340, but it’s also the core idea in Aaron Hsu’s thesis.
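For anyone who hasn't met the idea, here is a rough Python sketch of a parent vector: the tree is a flat list where entry i holds the index of node i's parent. The convention that the root points to itself is an assumption of this sketch (other write-ups use a sentinel such as -1), and the helper names `depths` and `children` are made up for illustration.

```python
def depths(parent):
    """Depth of every node; parent[i] is the index of node i's parent.

    Assumes the root is marked by pointing at itself.
    """
    result = [0] * len(parent)
    for i in range(len(parent)):
        j = i
        while parent[j] != j:  # walk up until we reach the root
            j = parent[j]
            result[i] += 1
    return result


def children(parent, node):
    """Indices of the direct children of `node`."""
    return [i for i, p in enumerate(parent) if p == node and i != node]


# Tree: node 0 is the root with children 1 and 2;
# node 1 has children 3 and 4; node 2 has child 5.
parent = [0, 0, 0, 1, 1, 2]
print(depths(parent))
print(children(parent, 1))
```

The appeal for parsing is that the whole tree is one flat integer array, so structural queries become array operations rather than pointer chasing.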


≠\ is an awesome trick. You might also like {2 | (1+⍳≢⍵) (⊣ - ⌈\⍤×) ~⍵} (with ⎕IO←0) for detecting which escape characters are themselves escaped. This is translated from BQN: https://github.com/mlochbaum/BQN/blob/master/md.bqn#L53

This last is more compactly expressed as <\ when you have left to right folds which (unfortunately) APL does not.
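For readers who don't speak APL, a rough Python rendering of both tricks follows. This is my reading of the expressions, not a verified translation, and the function names `in_quotes` and `active_escapes` are made up here. The first is an XOR scan over a quote mask, which gives the running parity and so marks the inside of a quoted string. The second measures the length of the backslash run ending at each position; odd positions in a run are live escapes, even positions are the escaped ones.

```python
from itertools import accumulate
from operator import xor


def in_quotes(text, quote='"'):
    """XOR scan of the quote mask, like APL's ≠\\ on a boolean vector.

    True from each opening quote up to (not including) its closing quote.
    """
    mask = [ch == quote for ch in text]
    return list(accumulate(mask, xor))


def active_escapes(text, esc="\\"):
    """1 where an escape character is itself unescaped, 0 elsewhere."""
    out = []
    last_other = 0  # 1-based index of the most recent non-escape character
    for i, ch in enumerate(text, start=1):
        if ch != esc:
            last_other = i
        # i - last_other is the length of the escape run ending here
        out.append((i - last_other) % 2)
    return out


print([int(b) for b in in_quotes('a "b" c')])
print(active_escapes("a" + "\\" * 3 + '"'))
```

In the second example the three backslashes come out as 1, 0, 1: the first escapes the second, and the third is live and escapes the quote.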


Back in the early 70s, I wrote software for the PDP11/20 (in ASM) to collect data from counters for accelerated particles. I recall handcrafting interrupt service routines to ensure they stayed ahead of the counters. And using paper tape to load the software after I hand-toggled the bootstrap loader.

For the next 15 years, I was an APL programmer, and later a Q/K developer. It would have been a hoot to use K back in the 70s to analyze the collected data!

What a fun project! Bravo


I've just discovered array/matrix languages, and especially J. It's such a pleasure to use. I've struggled for a long time with Python and R because of their mixed scalar/array style. K/J/APL are so clean and well designed in this regard.

For instance, the Sieve of Eratosthenes to get all primes under 100: " (-. ] */ ])2}.i.100 ". Another great property of J is its natural-language-like structure:

without =: -.

multiplied_by =: */

upto =: 4 : 'x}.i.1+y'

Sieve_100 =: (2 upto 100) without (2 upto 100) multiplied_by (2 upto 100)


That's not the Sieve of Eratosthenes, but rather a naive O(n²) algorithm, implementing "a prime never occurs in the multiplication table".
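To spell out the difference, here is a small Python sketch of both approaches. It is an illustration, not a translation of the J above, and the function names are made up. The first builds the full multiplication table and removes anything that appears in it; the second is the classic sieve, which crosses out multiples of each prime starting from its square.

```python
def primes_by_table(n):
    """Numbers in 2..n that never appear in the multiplication table of 2..n."""
    nums = range(2, n + 1)
    products = {a * b for a in nums for b in nums}  # O(n^2) products
    return [k for k in nums if k not in products]


def primes_by_sieve(n):
    """Sieve of Eratosthenes for n >= 2: cross out multiples of each prime."""
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:
        if is_prime[p]:
            for m in range(p * p, n + 1, p):
                is_prime[m] = False
        p += 1
    return [k for k in range(2, n + 1) if is_prime[k]]


print(primes_by_table(100) == primes_by_sieve(100))
print(primes_by_sieve(30))
```

Both give the same answer; the table version just does far more work to get there.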


> upto =: 4 : 'x}.i.1+y'

FYI: for a while now, J has supported the more convenient 'direct definition' syntax. You can now write that definition like:

upto =: {{ x}.i.1+y }}

It will automatically make a monad or a dyad depending on whether you use x or not (or an adverb or a conjunction if you use u or m or v or n).


> mixed scalar/array style

For whatever it's worth, R specifically is almost entirely arrays, with the exception of NULL.

Even language constructs like functions, including source code itself, are represented as heterogeneous arrays (called "lists"). I'd go so far as to claim that R is a homoiconic array language.


I think (-.*/~)2}.i.100 is a little more idiomatic.


That's fantastic. It's a shame there isn't more stuff written (and especially, videos/streams) about K, but I get the feeling that will improve since there are a good few open-source K-like implementations (at least ngn/k, oK, ktye/i, kona, kuc and Goal). Recently I stumbled upon iKe (http://johnearnest.github.io/ok/ike/ike.html) which is pretty amazing.


Beautiful. Try typing:

2 3 + 4 5

in the upper-left, under "ktye/k".


What's the deal with Arthur Whitney's Shakti these days? The download has been gone for a year or two at this point.

https://shakti.com/


No idea what the current state is. The other day I found an older version that runs in your browser at: https://kparc.io/k/


I too wonder about this. I guess you have to get in touch with them.


Judging by the Google group, they are in an early development phase and are onboarding their first customer.


I figured that would've happened ages ago considering Arthur's fame in the financial world.


People aren't really writing new systems in kdb+/q I feel like. And if you have a legacy kdb system why would you ever migrate to Shakti?

If you have to make the choice between Shakti or some timeseries db + Python, Shakti would be a very suspicious choice. And I say this as someone who likes K and strongly dislikes Python.


Just gonna say, this is why I wrote KlongPy - an array language in Python that lets you interop with Python while also getting some array lang. efficiencies. Klong isn't as smooth as K, but it's still quite useful.


Are they not? What makes you think that? I honestly don't know myself, but am curious.

Maybe they already found all the high paying customers and now there's nobody left that will pay that much? No idea.


People are, unfortunately, but because of inertia, and less and less. kdb/q are great for experimentation, but for large-scale systems they're a pain to maintain and often slower and worse at scaling than other solutions.


This is so perfect. Even just the PDP-11 color scheme hits right. The documentation is incredibly helpful in showing how both the language and the machine work. It looks like a great learning tool, but also you could play with it for hours.

My personal favorite piece of K lore is that the -entire language- supposedly fits in processor cache, which is why it's so mind-bendingly fast. Apparently it runs something like 85 - 90% of the algorithmic trading industry for this reason. I could have the details wrong, but...


Definitely does not run 85% of the algo trading industry. That would be, you guessed it, C++ (and fpgas).

I actually don't know anyone in HFT who's ever used it.

Also it's only quite fast when processing vectorized data. If you're processing market data off the network packet by packet, it's going to be much much slower than C++.


I do see it every now and then. We have a few kdb/q applications but on the whole, mostly C++ and C. Maybe a few percent, and mostly at some of the larger, much older firms.


Thanks for the correction. I figured it might be too extreme to be true.


The entire binary would still fit on a 3.5" floppy and still have room for your application.


If I could afford a career change I'd be soooo tempted. The idea of being on the ground floor of implementing K on edge computing / "IoT" for construction automation is really interesting.



