I wonder if array languages will find wider popularity some day with a bit more verbose notation. Take “Array Indexing” (#29):
{(,⍺)[(⊂⍴⍺)⊥¨⊃∘.,/⍵]}
Or “Inverted Table Index-Of” (#31):
(⍉↑x⍳¨x) ⍳ (⍉↑x⍳¨y)
These could be written:
{(reshape lhs)[(enclose shape lhs)
polynomial each disclose
outer product reshape reduce rhs]}
(transpose take x index each x)
index (transpose take x index each y)
Same terms, just with searchable words instead of symbols. (Don’t mind the specific words—I’m not much of an APL user!) It definitely sacrifices some important aspects of APL, but in exchange I think it gains something in terms of accessibility by explicitly writing the pronunciations you might have in your head anyway when reading & writing the symbolic notation. It also encourages extracting reusable named definitions instead of repeatedly writing “idioms”.
> It definitely sacrifices some important aspects of APL
It sacrifices the most important aspect: The notation.
From A Conversation with Arthur Whitney[1]
BC To ask a slightly broader question, what is the connection between computer language and thought? To what degree does our choice of how we express software change the way we think about the problem?
AW I think it does a lot. That was the point of Ken Iverson's Turing Award paper, "Notation as a Tool of Thought." I did pure mathematics in school, but later I was a teaching assistant for a graduate course in computer algorithms. I could see that the professor was getting killed by the notation. He was trying to express the idea of different kinds of matrix inner products, saying if you have a directed graph and you're looking at connections, then you write this triple nested loop in Fortran or Algol. It took him an hour to express it. What he really wanted to show was that for a connected graph it was an or-dot-and. If it's a graph of pipe capacities, then maybe it's a plus-dot-min. If he'd had APL or K as a notation, he could have covered that in a few seconds or maybe a minute, but because of the notation he couldn't do it.
Another thing I saw that really killed me was in a class on provability, again, a graduate course where I was grading the students' work. In the '70s there was a lot of work on trying to prove programs correct. In this course the students had to do binary search and prove with these provability techniques that they were actually doing binary search. They handed in these long papers that were just so well argued, but the programs didn't work. I don't think a single one handled the edge conditions correctly. I could read the code and see the mistake, but I couldn't read the proofs.
Ken believed that notation should be as high level as possible because, for example, if matrix product is plus-dot-times, there's no question about that being correct.
Sure, I’ve read Iverson’s paper. I think the notation is perhaps what enabled him to come up with what I feel is the important thing: the semantics. I care a lot about function-level programming, tacit programming, implicit iteration, and dimension-agnostic code; I don’t really care for APL/J notation specifically, and not because of the funny character set or terseness, but because it’s not compositional. Changing the symbols out for words wouldn’t help much with that; I just think it would make it more popular.
Isn't it? What do you mean by that? It seems compositional to me, but maybe you mean something else.
Edit: answered below, let's go read.
> I just think it would make it more popular.
I think many languages are not looking for 'more popular' per se, especially if they have to sacrifice the basic principles that users/fans like about them. Like Arthur/k, I hate scrolling, and I already ache seeing the masses of (for me unnecessary) code produced out of the beauty that is APL. I work most of my time in languages far more verbose, so please do not take the nice languages that avoid that away from me. It wouldn't be very hard to write a language on top of k/apl to do that if you wanted to, though; as long as you can translate back to and inspect the original, like q on k, it's not too bad I guess.
Could you elaborate what is not compositional about it? I always thought one of the biggest attractions of anything point-free is its compositionality (from which follows easy refactorability).
Composition lends itself to point-free code, but not all point-free code is built from composition. By “it’s not compositional” I mean a few things:
1. Tacit programming in APL-family languages is very much about application of functions to arguments, not function composition; it’s just that the arguments can be made implicit in various ways, e.g., forks & hooks in J.
2. You can’t always extract a subexpression verbatim into a definition and have it work the same way, for example because you need to disambiguate whether a verb is being used monadically or dyadically.
3. Conversely, you can’t always do evaluation by just substituting the definition of a word into an expression where that word is used. How an expression is parsed depends on the types (well, parts of speech—noun, verb, adverb) of intermediate results and the context of the sentence.
Part of the reason for all this complexity is just that the language tries to be purely function-level (à la Backus’s “Can Programming Be Liberated from the Von Neumann Style?”), but not all terms denote functions. That’s why I prefer concatenative programming languages—they’re much simpler in structure because all terms do denote functions and your program is just a flattened dataflow graph (typically a post-order traversal).
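To make that last point concrete, here's a toy sketch in Python (just an illustration, not any real concatenative language): every word denotes a function from stack to stack, composition is plain concatenation, and any contiguous slice of a program can be factored out verbatim into a named definition.

# Toy concatenative evaluator: a program is a list of words,
# and each word is a function from stack to stack.
def push(x):
    return lambda s: s + [x]          # a literal pushes itself

def dup(s): return s + [s[-1]]
def add(s): return s[:-2] + [s[-2] + s[-1]]
def mul(s): return s[:-2] + [s[-2] * s[-1]]

def run(program, stack=None):
    stack = [] if stack is None else stack
    for word in program:
        stack = word(stack)
    return stack

square = [dup, mul]                              # a reusable definition, extracted verbatim
program = [push(3), push(4)] + square + [add]    # "3 4 dup * +"
print(run(program))                              # [19], i.e. 3 + 4*4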
I can see that point. Concatenative languages are also languages people cry about when they see code in them (like APL/k/j); they are usually terse as well (words are usually kept short), which is indeed why I like them. When I encounter a new board, I first port a Forth.
There is some experience with this in the K community, and the answer is "it does not gain accessibility"; and "extracting reusable definitions" goes against the spirit of APL, at least for the simple stuff.
Everyone focuses on the syntax because that's the first thing you see, but it's just the tip of the iceberg. The real problem is that you have to think differently, and a weird representation actually HELPS you start thinking differently.
My favourite two examples (K syntax - K is of the APL family) are: maximum subarray sum, written in K as |/0(0|+)\ and in English as "max over 0 (0 max plus) scan"; and flatten, written in K as ,// and in English as "join over converged".
They are both good examples of K's orthogonality (which is shared by APL and J), and of the fact that the English doesn't make them more readable if you don't already have the right mindset.
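(For readers without the K background, here is a rough Python rendering of what those two expressions compute; it's not how K evaluates them, and it obviously loses all the terseness, which is rather the point:)

from itertools import accumulate, chain

def max_subarray_sum(xs):
    # "max over 0 (0 max plus) scan": running sums clamped at 0, then the max (at least 0)
    return max(accumulate(xs, lambda acc, x: max(0, acc + x), initial=0))

def flatten(xs):
    # "join over converged": join one level of nesting repeatedly until nothing changes
    prev = None
    while xs != prev:
        prev = xs
        xs = list(chain.from_iterable(x if isinstance(x, list) else [x] for x in xs))
    return xs

print(max_subarray_sum([1, -3, 2, 4, -1]))   # 6
print(flatten([1, [2, [3, [4]]], 5]))        # [1, 2, 3, 4, 5]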
K has a verbose English syntax called Q. People who grok the APL mindset quickly drop Q for K (which is terser and uses symbols), and people who don't rarely find Q more palatable than K.
Which in k5 and newer leads nicely to a way to find the mode of a list: k5 made group produce a dictionary instead of a list of lists, and the grade operators are overloaded to sort dictionary keys by values, producing a list.
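In Python terms the idea is roughly this (a sketch of the approach, not of the actual K expression):

from collections import Counter

def mode(xs):
    counts = Counter(xs)                                   # "group", then count each group
    graded = sorted(counts, key=counts.get, reverse=True)  # "grade down" the keys by their counts
    return graded[0]                                       # the first key is the mode

print(mode([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]))   # 5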
The version of this in the Lisp family is the many attempts to replace s-expressions or "drop the parens". The original notation always bounces back unscathed because it's a deep fit for the mindset.
That's very true. I'm having fun trying to wedge my brain into an APL-shaped hole right now because I did it with Lisp before and know it eventually pays off.
I like the permutation example: make an n^n array of 1...n, thereby ending up with every length-n selection from 1...n with replacement, create a boolean mask of which ones contain all n numbers, and return those.
There was a post here last week about BEAM needing an APL, which pointed out that the basic idiom of APL is: take an array, make a boolean mask, then select on that mask. I thought that was a very straightforward explanation of how APL differs from Matlab or NumPy.
And it is a very different approach: make everything and take what you want, versus make exactly (and nothing more than) what you want.
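A rough NumPy sketch of the permutation example above in that style (make everything, mask, select); an APLer would write it very differently, this is just the shape of the approach:

import numpy as np

def perms(n):
    # build all n^n length-n tuples over 1..n ("make everything")
    grids = np.meshgrid(*[np.arange(1, n + 1)] * n, indexing='ij')
    tuples = np.stack(grids, axis=-1).reshape(-1, n)
    # boolean mask: rows that contain every one of 1..n, i.e. the permutations
    mask = (np.sort(tuples, axis=1) == np.arange(1, n + 1)).all(axis=1)
    return tuples[mask]              # "take what you want"

print(perms(3))   # the 6 permutations of 1 2 3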
By using abstract symbols, it encourages one to skip the internal verbalization and go straight to the concept. Pretty neat.
But in the article, they use symbols that I don’t have on my keyboard. I guess one would learn the combination for such symbols and become fluent to the point that it didn’t matter. For example, I don’t think of the individual letters when I’m typing.
Although my high school typing class strongly encouraged doing so. I believe that was more for maintaining a rhythm, however.
Earlier this year I took a deep dive into APL (something I'd been meaning to do for a while). With Dyalog APL and the APL mode in emacs, typing these things in is pretty easy and becomes natural if you're already a touch typist.
For instance, iota is just `.i`, rho is `.r`, etc. Some are more natural (like those); others are arbitrary but logically consistent, like take and drop being next to each other on the keyboard.
After a few days I felt like I could type most programs without having to think about where keys were. Admittedly, I've lost it all now since I haven't touched it for a couple of months. If I had actually stuck with it for more than the 4-6 weeks I did, I imagine I wouldn't have lost it so quickly.
I understand that the compactness is a feature, and I don’t have any gripes with the symbolic notation personally, I just think a more verbose notation might avoid turning people off too early. By starting from familiar territory, you can get people accustomed to a new style of thinking before switching to a more compact syntax. Lots of new programming languages borrow syntax from familiar mainstream languages in order to reduce friction for beginners and experts alike.
Heck, as long as we’re on the subject, I’d take a symbolic notation that used small icons or emoji as a sort of middle ground—at least then they might be more mnemonic. For instance, there’s nothing about ↑ that suggests “take” to me, or ⍉ “transpose”, but I’d have a good guess at what such icons meant, and they could remain legible at fairly small sizes.
Why? Because when you aggressively use infix notation, there is operator-precedence ambiguity which is not there in Polish (prefix, functional) notation. The pipeline idiom creates a "main line of computation" which really covers about 80% of use cases. I'd rather not have to remember so many things (I am getting old), and some notations are far clearer than others.
Music notation is a form of assembly language in which the original compositional ideas are expanded and flattened out into reams of tiny instructions for a "virtual machine": an abstraction of a musical instrument.
Nothing in music notation indicates semantic concepts like "this is motif M3 from the first movement, but transposed into a minor key and played backward at half tempo". It's all just expanded into notes, which are like micro-instructions for the violin or oboe.
The composer, when engraving notation, rarely invents any new symbols, and if so, only in modest quantity.
Even in small-scale software engineering, the symbols unique to a given software system outnumber those in the programming language. We develop software by creating large dictionaries of new identifiers, such as functions and types. These symbols, which can number in the thousands and beyond, cannot have one-character names, or there will be chaos.
The main challenge is understanding a program's own identifiers, not those in the programming language.
Terseness in the programming language helps at the microscopic level. The leaf terms of the program's syntax tree become more compact. The advantage of this in the overall context of making a big program is modest and has diminishing returns. Giving the core functions of a language short names, say from around two to six characters, goes a long way toward bringing about a useful level of terseness.
What also helps understanding is giving meaningful names to intermediate values. A pipeline of 13 operations is hard to understand, whether their names are one-character glyphs or 4-8 character words. If it's broken into, say, three smaller pipes, the first two of which are evaluated into variables with meaningful names, it will be helpful.
So that is to say, we shouldn't be cramming so many open-coded steps into a computation that they need one-character names in order to fit into a reasonable space on the screen.
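A trivial Python illustration of the kind of split I mean (the example itself is made up):

text = "the quick brown fox jumps over the lazy dog the fox"

# one long chain, hard to scan:
result = sorted({w.lower() for w in text.split() if w.isalpha()}, key=len)[-3:]

# the same thing as named stages:
words         = [w.lower() for w in text.split() if w.isalpha()]
unique_words  = set(words)
longest_three = sorted(unique_words, key=len)[-3:]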
This is the problem with APL programs and programmers: failure to decompose complex operations into small functions with clear names.
I don't care that you can make some cool operation by stringing together seven letters. I am not impressed. What I don't want to see is you copying and pasting those seven letters all over the place whenever you wish to invoke this cool operation. If I have to maintain that, I would really appreciate it if that string of seven letters appeared in one place as a definition, and had a nice ten-letter name, which is then used in all the other places where that operation is required, even though (horror of horrors) it is 42% longer!
And, no, do not be giving it a one-character name using some Greek or Japanese symbol not yet claimed by APL. Do you really want your maintainers to have to make flashcards of the program's identifiers and do spaced repetition night after night?
> Terseness in the programming language helps at the microscopic level.
And even more at the sub-system level.
In practice I've found I can efficiently understand and work with chunks of code that I can immediately see + a dozen or so helper functions + 4-5 in-context/highly contextual other functions.
Everything else exists only at a very high level.
In C# I can see around 4-5 functions on my dual screens.
In APL it is closer to 100 functions.
> A pipeline of 13 operations is hard to understand, whether their names are one-character glyphs or 4-8 character words
I don't think this is true, and thankfully significantly fewer developers believe it than did 10 years ago, thanks to LINQ/streams.
By day I'm a C# dev, and most of my colleagues can easily understand a long LINQ chain.
The mental complexity in LINQ almost always comes from heavily nested LINQ rather than from the number of operations.
> This is the problem with APL programs and programmers: failure to decompose complex operations into small functions with clear names.
The clarity of a name comes from the context, density, domain, and usage.
For example a single letter variable in a 5 line loop is often crystal clear.
So is a 2 letter variable used consistently hundreds of times throughout a code base.
So is a single word only used once that comes from the business domain.
I've found APL allows you to get rid of many pointless intermediate names and in doing so it increases the clarity of the code.
In a language like C# this can cause significant and problematic code duplication; in practice, in APL it's not a big issue.
With IDEs, and soon 3D, programming UX design space extends far beyond language syntax. Concision is important for APL-style abstraction-by-eyeball. So one might enrich tooling instead. "Tell me about this expression... What does it mean? Highlight related expressions. What idioms are being used? Are there any relevant design notes? Where is it in the dataflow graph? ..."
We've not yet been discussing it much, but with VR/AR and cloud, the constraints under which programming languages have evolved for the last several decades are about to change. Perhaps unicode-enabled concision will be part of that.
I wonder if it is the same transition that an illiterate adult goes through if and when they finally learn to read. Most of us probably don't remember the feeling of doing that when we were young children.
The thing is, if you get rid of the pressure to be a notation, you get rid of the pressure that makes APL what it is. You could definitely make an array language with words, but you'd end up with something more like Factor with its cleave/spread/apply combinators (you work on a Factor-inspired concatenative language, right?).
Yeah, that’s fair. I don’t imagine such a language would be well liked by the APL/J/K purists (“You’re missing the point!”), just that it could bring the school of thought to the masses—and maybe as a side effect, encourage people to check out the “real deal”. More of a marketing decision than a technical one.
You wouldn’t necessarily have to use explicit dataflow combinators in a wordy array language, but if you did, you might as well go all-in on function-level programming and do a concatenative language. And yeah, I’m working on Kitten, a CPL with static types—apart from the paradigm and some similar standard APIs, it doesn’t actually have a whole lot in common with Factor; the feel I’m going for is more like postfix OCaml/Rust, whereas Factor is pretty Lispy/Smalltalky.
> I don’t imagine such a language would be well liked by the APL/J/K purists (“You’re missing the point!”)
Meh. I'm not interested in code golf. I find it mostly disappointing that discussions of APL and its descendants rarely include anything of substance about the array programming model. Everyone argues about whether the constant-factor size change from glyphs/words is worth it, but what I actually find interesting is the flexible code reuse you get from having every function automatically work on arbitrarily large data.
Exactly, I think the semantics & interactive mode of programming have a lot of value, and it’s a shame they seem to get less attention than they deserve just because the notation is inscrutable to newcomers.
English identifiers solve half the problem. The other half is figuring out how the code maps to a tree. For example, in "transpose take x index each x", what is applied to what?
The best solution for me is to use "f(x,y)" syntax instead of "f x y" etc. Can you rewrite your examples using "f(x,y)" syntax?
The tree isn't trivial. For example, +/x is over(plus,x), not plus(over(x)). And our example certainly isn't transpose(take(x(index(each(x))))), so what is it?
It is trivial -- mostly, and for IBM APL2 and GNU APL. I have quoted the GNU APL manual below. Note that J, K, etc. are richer, and have different rules. APL is standardized (ISO), but GNU APL attempts to be IBM APL2 and ISO compliant (ISO/IEC 13751:2001).
Note that each (¨) is an operator -- there is no ambiguity.
⍴¨(1 2 3)(1 2)(1 2 3 4 5)
3 2 5
which (I suspect) you would rather write as
each(rho, ((1 2 3) (1 2) (1 2 3 4 5)))
So your example would better be written as
transpose(take(each(index, x), x))
Quoting the GNU APL manual:
"There are 4 APL symbols / ⌿ \ and ⍀ that can be functions or operators, depending on their context (i.e. depending on the APL tokens left of them). Neither the ISO standard nor the otherwise excellent IBM APL2 language reference manual explain the rules that decide whether, for example /, should mean ’function compress’ or ’operator reduce’. In simple cases the meaning of / is obvious: +/B means ’operator reduce’ while 1/B means ’function compress’.
"
"
an become more complex if index brackets or parentheses are involved. The algorithm used in GNU APL for classifying / and friend is (simplified) this:
1. / ⌿ \ and ⍀ are tokenized as operators
2. Scan the tokenized statement from left to right to find / ⌿ \ or ⍀
3. If a / ⌿ \ or ⍀ is found, go backwards until a constant, or a function, or a user-defined name is found:
3a. if a constant, a function, or a user-defined name is found then →4
3b. if ) is seen then skip over it. That is, a parenthesized expression is entered.
3c. if ] is seen then skip until the corresponding [ is found. That is, an index list or axis is skipped over.
3d. otherwise: SYNTAX ERROR
4. Decide the role of / \ ⌿ or ⍀ based on the kind of token that was found in 3a:
4a. function: leave / as operator, as initially tokenized
4b. constant: change / from operator to function (compress)
4c. user-defined name: see below
This works fine except in case 4c. which cannot be resolved at ⎕FX time. For this case we introduce the following convention:
A user-defined name is assumed to be a function (and hence / is an operator) unless the name (and nothing else) is enclosed in parentheses. For example:
A / B ⍝ A is function, therefore / is operator
(A) / B ⍝ A is value, therefore / is function
This rule is moderately compatible with ISO and APL2; in APL2 (A) would be redundant parentheses so in that case there would be a small incompatibility between GNU APL and IBM APL2.
Thank you! That's indeed easier for me to read. A bit tricky to understand what it does, but that's just a matter of reading the reference for each function in turn.
I think this is a difference in background. A programmer I am friends with found Haskell a nightmare to read because (we finally realized) it looked more like math than ALGOL. Because of my background, I have parsers for both approaches that are equally strong, whereas his parser for ALGOL-like syntax was strong and his one for math was much weaker. Whenever we crossed that notational divide and switched mental parsers, the incidental mental load he was subject to skyrocketed.
I just upvoted you, but I don’t think it works like that.
It’s like judging how well you can paint Da Vinci’s “Monna Lisa” by whether you used more or fewer strokes than necessary.
Probably you can do it but it’s not the same thing the author intended.
Art, like programming, is entirely subjective.
And, as far as I know, J’s objective was not very dissimilar from yours, unless I badly misjudged what you are trying to achieve.
I would like to thank the HN user who has posted this a couple of times. I am grateful for this resource and glad to be reading it sooner rather than later.
Why I found this particular publication unique amongst all the others I have read:
It is many pages before the reader encounters any programming jargon. I believe the use of the term "primitives" may be the first slip.
It is possible the reader with absolutely no familiarity with programming would not be alienated by any of the terminology Iverson uses. That is uncommon for an expository text by an author who knows how to program, in my experience as a reader.
Perhaps terminology was carefully chosen with a view to avoiding programming jargon and letting the symbols (notation) and example input and output communicate the concepts.
Here is another question:
Matrix multiplication can play an important role in so-called "AI" or "Machine Learning". For example, Hopfield networks.
The extraordinarily popular Python language, specifically the "NumPy" library, is frequently cited on HN.
How suitable or unsuitable is APL for matrix multiplication and, by extension, "Machine Learning"?
Assuming in each case a programmer competent in her chosen language (i.e. setting aside differences in programmer skill), which language has faster execution times, Python or APL?
> How suitable or unsuitable is APL for matrix multiplication
Matrix-multiply is just plus-dot-times or +.×
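In NumPy terms, for comparison (a rough sketch, my own illustration): the ordinary matrix product is the plus-times instance of the inner product, and swapping in other verb pairs gives things like the or-and reachability product from the Whitney quote upthread.

import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(A @ B)                 # plus-dot-times: the ordinary matrix product

# or-dot-and: same shape of computation with (or, and) instead of (plus, times),
# e.g. two-step reachability in a directed graph
G = np.array([[0, 1], [0, 0]], dtype=bool)   # edge 0 -> 1
H = np.array([[0, 0], [1, 0]], dtype=bool)   # edge 1 -> 0
print((G[:, :, None] & H[None, :, :]).any(axis=1))   # [[ True False] [False False]]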
> which language has faster execution times, Python or APL?
This is tricky to answer. For most problems I'd say APL, but this isn't quite fair.
A competent Python programmer who has made it "as fast as possible" without resorting to extensions will be creamed by a novice APL programmer, so you need to consider third-party modules like NumPy just to compete.
Now NumPy is very well-optimised, but you'll really struggle to do well against k/q[1], an APL-ish language that actually focuses on performance, and if you run into a problem where it doesn't, there are extensions you can use to eke out even more performance.
At this point, you might try using Tensorflow -- which gives Python a great edge -- but are we really programming Python anymore? kdb+ can use Tensorflow as a library as well...
It's a style difference between "optimized C and ASM kernels wrapped in a slow heavyweight language" and "a very simple interpreter that fits in cache". Circumstances are probably going to drive which approach wins out.
NumPy is written in C and BLAS in Fortran, so for pure technical programming (which AI/ML is) Fortran or C is the best choice; any interpreted language is going to be at a disadvantage.
Arguably all the extra stuff C++ has doesn't really help, and with Fortran having built-in support for many things C needs external libraries for, Fortran would be the best.
APL is really good and fast with matrices. I mean it is an array language and a matrix is just an array with another dimension. The problem though is that it wouldn't be fast enough for actual scientific computation on non-trivial datasets.
Thanks for the link. Do you know if there is something similar for GNU APL?
Do you think it will be annoying to switch to GNU APL after going through this? I'd rather not spend time learning a proprietary interpreter if there's a viable free software alternative.
EDIT: Have started going through the tutorial using the GNU APL interpreter. I'll see how it goes I guess. 2+5 seems to work the same :)
An emacs mode for what? I don't use emacs much but would be up for using it if it helps me with something. Fairly happily plodding along with just the GNU APL interpreter and akt though (and akt seems to work with vim through invoking `akt vim`).
It has an input method that lets you type the non-ASCII characters more easily. The default uses the super key (Windows key), the way other layouts use Alt, but there's also a mode that lets you prefix things with `.`. So, e.g., to get rho you type .r, and to get an actual . you type .., which I find more ergonomic. And you can launch the REPL from inside emacs as well. I'm mostly a vim user myself, and have been using it with spacemacs for a more familiar experience.
When reading APL for the first time, how does one determine what the characters mean? E.g, the first example is
(+⌿÷≢) x
I can look up the '+' '⌿' '÷' characters at https://en.wikipedia.org/wiki/APL_syntax_and_symbols but I can't find the "≢)" characters which appear to be the '≢ ' and ')' characters when I copy and paste. However, I see it as '≡' '/)' when rendered. Is that broken Unicode rendering of the slash? I'm guessing this function means something like (divide (reduce sum array) (count array)) but I can't search for that triple equal sign on the Wikipedia page.
)help ≢
monadic function: Z ← ≢ B (Tally)
Returns the number of elements in the first dimension of B.
dyadic function: Z ← A ≢ B (Not match)
Returns 1 if A has not the same shape and ravel as B.
But, this is a Dyalog expression (Dyalog is not an ISO-standard APL). Still, the symbol meanings are pretty much the same.
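And the guess upthread about the whole expression is right: (+⌿÷≢) is a train (a fork), so (f g h) x evaluates as (f x) g (h x); here that is the sum divided by the tally, i.e. the mean. In Python terms, roughly:

def mean(x):
    # (+⌿÷≢) x  ~  (sum over the first axis of x) divided by (tally of x)
    return sum(x) / len(x)

print(mean([1, 2, 3, 4]))   # 2.5, same as (+⌿÷≢) 1 2 3 4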
After reading a few tiny intros, the core of APL is not that hard. That said, it would be hard to suffer dialects and more idiosyncrasies with single-symbol-heavy languages like that.
I remember that people who came from the other dialects, Sharp and STSC, had different ways of doing e.g. nested arrays and calling out to the OS which made their code hard to read.
I needed to print a keyboard mask to put over the keys to remember where to type. After some time you get used to it, but it is too much of a load on your working memory while you are learning it.
The vector operations were really nice and the notation made it succinct, but packages like pandas are far better.
It is impressive how usability was not a factor considered when some languages were developed.
Perhaps we should have different keyboards for programming. Traders have Bloomberg terminals. Why shouldn't we have an APL keyboard if we're using that language? If language is a tool of thought, then the input devices reduce time to putting that thought into action.
I took the same course! I remember that we had been taught to calculate the voltages and currents for all points in an electrical network as vector algebra operations. Then, for one of our assignments we were expected to just learn APL on our own to write a program to do these calculations. As I recall, it wasn't too difficult using APL.
I believe that we were given or used some sort of IBM manual to learn APL. I just searched my library and found APL/360 An Interactive Approach by Gilman and Rose, 1970[1]. However, APL was most definitely not the focus of the course so I think this was a book I bought a couple of years later in grad school for myself.
We wrote the programs on a fancy version of a teletype machine, made by IBM, called the IBM/360 APL Terminal System. See the photos of the paper based terminal in [2].
Framesets totally break the accessibility of a website. The main website is in one frame, and its navigation in another one. Both frames have different contexts and different URLs.
That's only true if screen readers haven't adapted to framesets.
With the popularity of framesets for 20+ years of the early web, screen readers are totally fine with them (as they are with tables for layout, and many other things people believe "break accessibility").
I've been puzzled about Kona for a while, since I don't really see a purpose for it. K/Q is specifically intended as a very high-efficiency language, and Kona doesn't say anything about benchmarks.
The original reason may have been that k3 was not open source and the free version was limited. But in general people write these things for all sorts of reasons. Just as there are dozens of Scheme interpreters.
Why would you admit that? If you don't know something, look it up. It's not someone else's fault if you don't know something. On HN and in the world, don't whine about it, do something about it.