Since this flurry of APL/J/K/A+/Q articles started, I've been wondering about the relationship between APLs and literate programming.
They say that APL was developed to mimic mathematical notation, and I see the resemblance. But real works of mathematics consist of one or two lines of expressions separated by sentences and paragraphs explaining the goal of those expressions.
So far in K "advertising" I only see the expressions and not the explanations. The explanation is there, but it's not a comment in the codebase; it's in a blog post that explains it. The code has many lines of expressions on top of each other, and nobody writes math that way.
I feel like I'm missing something, basically. It's clear how to write K, but what are the best practices for making a large codebase readable? I shouldn't have to invent that myself, right?
> APL was developed to mimic mathematical notation
APL was developed to replace mathematical notation. Making it a computer language was almost an afterthought. Indeed, there have been some trivial mathematical proofs written in J[1][2][3]. Perhaps those give you a better idea of the connection?
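To give a flavor of what using the notation that way looks like (my own throwaway illustration, not one of the linked proofs), you can state a small identity and let J check instances of it, e.g. that the sum of the first n odd numbers is n squared:

    n =: 5
    (+/ 1 + 2 * i. n) = *: n            NB. 25 = 25, so the result is 1 (true)
    (+/\ 1 + 2 * i. 6) = *: 1 + i. 6    NB. check the first six cases at once: 1 1 1 1 1 1

Not a proof, of course, just executable notation you can poke at.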
Assuming you write your code for a certain domain, I think most of your colleagues could follow along after learning a little k.
I wrote a toy simulator for the domain I work in using something like 6 lines of APL code and sent it to someone good with APL. They replied pretty quickly with some tips on how to improve it and could generally get the gist of what I was trying to do while knowing nothing of the problem domain. It was really an interesting experience. I think that with zero documentation, a colleague of mine would understand what I was doing after going through a short APL tutorial. It isn't magic, but it's not nearly as cryptic as people make it out to be.
Edit: people do comment their APL code. I think a million-LOC project would be pretty difficult in APL, but the beauty is that you should be able to do some pretty powerful work in a few pages, I would imagine.
I think you're right. Aaron Hsu's work in APL is inspiring and awesome. The source code to his data parallel compiler hosted on the GPU is 17 lines of APL! Read the abstract to his paper here:
> The source code to his data parallel compiler hosted on the GPU is 17 lines of APL!
The co-dfns source I've seen[1] is much longer. Do you know if there's a write-up somewhere of the differences between that and Appendix B from the dissertation?
The 17 lines don't include the parser or the code generator, which most people would count as "part of a compiler" in a practical sense. They are usually the most mechanical parts of a compiler though, so there's relatively little to be excited about in them.
I think it's important to distinguish between something that is a core piece versus all the other things that make the system usable. For example, once you start adding error handling and good error reporting, the complexity goes up by an order of magnitude. And in many cases the approach taken for the core does not necessarily scale out to those other contexts.
The right tool for the job. If you are building a huge website with input forms, videos, data collection, and ML algorithms, then no, you wouldn't do the whole thing in APL or J even if you could. Python is big in ML because packages for working with data in array-language ways were developed. Pandas by Wes McKinney is one example; he studied J or q, and even tweeted: IMHO J is the best APL for data analysis because of its broad suite of hash-table-based functions.
I like APL and J as a scratchpad where arrays are the basic unit and not scalars. J is functional and it turned me on to that world before I touched Haskell or F#.
Aaron Hsu has a lot of great videos that speak to the usability and scaling concerns you mention:
I am able to grasp concepts, or really own them, after coding them in APL or J even if the code isn't as fast; for example, how well APL applies to Convolutional Neural Networks [1][2]. Working through this paper gave me a much better understanding of the mechanics of CNNs than the books I had read on ANNs in general since the late 80s/early 90s. By contrast, I have coded ANNs in C and Python, and I get lost in the language, not the concept, if that makes sense. Anyway, I am a polyglot and find that people criticize J/APL/k etc. after a brief look without really trying to learn the language. I learned assembler and BASIC between 1978 and 1982, and I felt the same way when I first looked at opcodes.
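To be concrete (a toy sketch of my own, not code from the paper): the heart of a convolutional layer, a 1-D convolution, is just a dot product applied to each sliding window, which J expresses directly with the infix adverb (\):

    kernel =: 1 2 1
    signal =: 0 1 2 3 4 5
    3 ([: +/ kernel * ])\ signal    NB. kernel applied to each length-3 window: 4 8 12 16

Once that reads naturally, the 2-D convolutions in a CNN are the same idea applied along more axes.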
Bahaha. It's a small world, fellow HN user. As soon as ACM opened up their digital library, I started looking for interesting APL papers, found that one, and thought it was beautifully done. My takeaway is that you can make purpose-built AI in APL with very little code, versus calling out to a large library like TensorFlow and having no idea what's going on.
I think someone has translated this to J, but I am implementing it in my own way to practice my J-fu. Then I usually open it up to the J experts on the mailing list, and my learning takes off. There are some awesomely smart people there who are generous with their time.
Yes, the takeaway is that with APL or J you can see the mechanics in a paragraph of code, and it is not a trivial example. If libraries or verbs are created to deal with some of the speed or efficiency issues, it is promising as a way of understanding the concept better.
The dataframes of R and Python (Pandas) were always a thing in APL/J/k/q; arrays, not a library, are their lingua franca, the basic unit of computation upon which the languages were built.
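As a rough illustration (plain J here, nothing like the real q/kdb+ machinery): a group-by-and-sum over two parallel columns is just the key adverb /. applied to ordinary arrays, no dataframe library in sight:

    sym =: ;: 'ibm msft ibm goog msft'   NB. a "column" of boxed symbols
    px  =: 100 50 102 1200 51            NB. a parallel "column" of prices
    (~. sym) ; sym +//. px               NB. unique keys next to the sum of px per key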
More importantly, almost along the lines of "the emperor has no clothes", there is a push to get away from black-box, minimal-domain-knowledge ML or DL that cannot be explained easily; see the newly proposed "Algorithmic Accountability Act" in the US legislature. Differentiable programming and AD (automatic differentiation) can be applied with domain knowledge to create a more easily explainable model and to avoid biases that may creep into a model and negatively affect health care and criminal justice systems [1][2].
And then there are those who use DL/ANNs for everything, even problems that are easily solved using standard optimization techniques; a can't-see-the-forest-for-the-trees kind of phenomenon. I have been guilty of getting swept up with them too. I started programming ANNs in the late 80s to teach myself about this new, cool-sounding thing called "neural networks" back then ;)
Yes, J does attempt to be a concise notation, like mathematical notation, for working with concepts in a manageable way. Like mathematical notation, it takes effort to learn the meanings of the symbols; however, once they are learned, you need less boilerplate to explain the meaning of those arranged symbols, unless you are teaching somebody how to code in J.
Because the code is concise and includes all the important information, you can view most useful programs on one page, so it doesn't take much to work through the logic again if you have not touched it in a while. I put comments inline for attribution to a source, or as a quick mnemonic to unravel some tacit J code.
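For example (trivial, just to show the style), a couple of tacit definitions with the mnemonic comments I would leave next to them:

    mean  =: +/ % #      NB. sum divided by count
    range =: >./ - <./   NB. max minus min
    mean 3 1 4 1 5       NB. 2.8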
I loved J the first time I saw it back in 2011/2012. I had played with APL in the 80s. I have played with many languages (asm, BASIC, C, Haskell, Joy, Forth, k, Lisp, Pascal, Ada, SPARK 2014, Python, Julia, F#, Erlang, R, etc.), and each paradigm shift has taught me to approach problems from different angles. I use the language that suits the need at hand. Frink is on my desktop at work for all of my engineering, unit conversion, and small input-program stuff. R/RStudio is there for my statistics work. Julia is replacing MATLAB for me. I wrote Blender 3D scripts in Python in the early 2000s to make 3D wood carvings from 2D photos.
J is always open on my desktop, and it is more than a desktop calculator. It is my scratchpad for mathematical and whimsical ideas or exploration. See Cliff Reiter's "Fractals, Visualization & J" for fun [1], or Norman J. Thomson's "J - The Natural Language for Analytic Computing" [1]. I just bought Thomson's book three months ago for about $35. There's now a crazy $925 posting on Amazon! Somebody's creating sales from HN!
"Mr. Babbage's Secret: The Tale of a Cypher and Apl" was also a fun book. It's not really an APL or programming book!
While I really like array programming languages in general, I think the Nile language and its application to rasterization strike an elegant balance of readability and conciseness. Highly concise, yet very readable.
Not a large K program, but a company I used to work for maintains a large, open-source q framework, https://github.com/AquaQAnalytics/TorQ, which is used in many large investment banks and hedge funds.
Funnily enough, the framework is actually a more expansive version of a tick system developed by Kx (the company that makes kdb+): https://github.com/KxSystems/kdb-tick
The Kx one is incredibly concise. When I first started working with it, it took me a while to figure out what was going on.
The larger K/q programs get, the more they tend to look like "normal" code, but you still see a lot of these clever one liners hidden away in there.
> The larger K/q programs get, the more they tend to look like "normal" code, but you still see a lot of these clever one liners hidden away in there.
Yep, that's what I suspected. TorQ confirms it... giving everything a single-letter name will no longer do :-)
I'm a bit surprised at the number of comments that explain what the next line does. I'm not sure what to think of that, but it reminds me of beginner tutorials explaining code for people who can't yet read it confidently; for obvious reasons, not a popular style among more conventional languages.