
A deep dive into APL - sndean
https://curtisautery.appspot.com/5776042744610816
======
rebootthesystem
I am blown away to see how frequently APL seems to come up on HN these days, a
language I used professionally for over ten years.

As much as I love it, I have to say one of the issues with APL is that it was
way ahead of its time. Because of that, it struggled to run on computers of
that era.

I was introduced to the language around 1982. Like I said, I used it
extensively, attended and presented at APL conferences, and even have a picture
of my younger and dorkier self with Ken Iverson (creator of APL).

This struggle, on hardware that could not handle the huge expansions and
contractions of memory utilization inherent in the way APL works
multidimensional data sets like putty, meant it could not compete where other
languages had no issues at all. On PCs you were limited to 640K of RAM. It
would be years before machines had enough memory that you could almost stop
caring about utilization.

As the language didn't really catch the attention of the CS masses, it failed
to evolve as it could have over the years. And this is the reason I would
suggest that today it is not much more than a curiosity.

I'm sure there are some out there using it professionally. If I were to guess,
I'd say it must be mostly companies with a significant installed base of APL
software they don't dare try to rewrite.

I firmly believe that the future of computing has to be in an APL-like
language. By this I mean that we need to evolve to a notation for computing,
rather than text for computing. Iverson himself wrote an excellent paper
titled "Notation as a Tool of Thought".

The issue is that in order to evolve APL one has to have a deep understanding
of APL. And, frankly, there are but a few of us around who have that
understanding. Not sure how many of this group would embark on the mission to
create a true next-generation APL worthy of its lineage.

I read and I smile every time I see the language come up on HN. I taught my
kids some APL and they are always blown away by it. Not sure where to go from
here.

~~~
michaelfeathers
> Not sure where to go from here.

For the past few years I've been telling people about the APL family of
languages. The clincher for me was seeing the video of an APL version of
Conway's Life mentioned in the submission. The impressive part wasn't its
brevity; it was how they approached the problem.

In APL you have a 'rotate' operator that shifts data in a particular
direction. If you have a vector and you rotate it once, every element moves to
the next position and the last element then becomes the first.
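
A rough NumPy analogue (just a sketch, not APL; np.roll stands in for rotate):

    import numpy as np

    v = np.array([1, 2, 3, 4])
    print(np.roll(v, 1))  # [4 1 2 3]: the last element becomes the first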

The nice thing about APL is that most operations work on data regardless of
its dimensionality. So, to do Conway's Life, you take your 2D matrix of cells
and produce rotations of it in eight directions (N,NE,E,SE,S,SW,W,NW). You
then take those rotated versions of the matrix along with the original and
conceptually stack them. Then, for each grid point, you sum downward,
producing a new matrix that contains the neighborhood count of the original
matrix. From that you can create the next Life generation.
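
Here is a minimal sketch of that approach in NumPy (my translation, not the
APL original; np.roll wraps around just like APL's rotate):

    import numpy as np

    def life_step(grid):
        # Stack the original grid with its eight rotations and sum "downward".
        shifts = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
        counts = sum(np.roll(grid, (dr, dc), axis=(0, 1)) for dr, dc in shifts)
        # counts includes the cell itself: 3 means birth, or survival with two
        # neighbors; 4 means survival only if the cell is already alive.
        return ((counts == 3) | ((counts == 4) & (grid == 1))).astype(int)

    # One step of a glider on a small toroidal grid:
    glider = np.zeros((8, 8), dtype=int)
    glider[1, 2] = glider[2, 3] = glider[3, 1] = glider[3, 2] = glider[3, 3] = 1
    print(life_step(glider))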

This sort of problem doesn't come up every day, but the thing that I think is
profound is that the existence of these operations allows us to think about
problems in different, possibly simpler ways. They are untapped potential, and
they could be as well known as map and fold.

APL and its derived languages are hard to approach, but there isn't much that
keeps us from importing their data structures and operations into more
approachable languages.

~~~
nickpeterson
Hey Michael, been a fan since reading 'Legacy Code'.

Do you believe that refactoring a large APL codebase would be inherently
easier than, say, a large legacy C++ codebase?

What is the relative complexity of comparing, say, 1,000 lines of APL to
10,000 lines of C++?

I remember reading a conversation between Arthur Whitney and Bryan Cantrill
where they mentioned recognizing idioms in the dense code much more easily in
K than in something like C because it took so many fewer characters. Do you
see any sort of refactoring advantage in that?

I've also recently seen work on the Co-dfns compiler in APL, where the author
mentioned that APL allowed him to refactor easily compared to a traditional
codebase. Looking at the GitHub repository, there are something like 3 million
edited lines over the span of several years in a codebase that is a few
thousand lines of code. I find this fascinating because my hope is to
eventually start a small software business, and I think reducing complexity is
the number one priority in order to make it sustainable in the small (the goal
isn't to have a team of 50 developers).

Thoughts?

~~~
michaelfeathers
Nick, I think that the primary win is referential transparency; APL idioms
raise the level on that base. The tradeoff, though, is the same one we have
for functional programming, just further along the road: if you develop a
codebase that doesn't use common idioms, it's harder to hire people to work in
it.

I think array languages, or at least their operation sets and idioms, will
move into the mainstream the same way functional programming is, but it will
take time. Sustainability for your business would be more an issue of hiring
and retaining talent.

Reach out if you want to talk about this more: @mfeathers

------
ceautery
Sorry, everyone; it looks like hosting my site on appspot was incompatible
with high traffic, since I only get 1 gig of outgoing bandwidth per day.

Here's the cache:
[https://webcache.googleusercontent.com/search?q=cache:K-e7Jh...](https://webcache.googleusercontent.com/search?q=cache:K-e7JhLlJ4YJ:https://curtisautery.appspot.com/5776042744610816+&cd=1&hl=en&ct=clnk&gl=us)

------
leoc
I'm just going to leave this here:
[http://www.ccs.neu.edu/home/shivers/papers/rank-polymorphism...](http://www.ccs.neu.edu/home/shivers/papers/rank-polymorphism.pdf)

~~~
jtraffic
"APL, and its successor J [...] provide a notational interface to an
interesting model of computation: loop-free, recursion-free array processing."
How is APL loop-free, exactly?

Later they say: "Under this implicit lifting, the iteration space is the
argument frame rather than a sequence of loop indices." So if I understand
correctly, we have iteration, but no loop. But that doesn't seem like a
really important distinction... what am I missing?

~~~
RodgerTheGreat
The key feature is _abstract_ iteration, like functional maps, filters and
folds, and _implicit_ iteration, where operations "penetrate" to the items of
a vector or matrix automatically, rather than _explicit_ iteration like a
"for" loop.

Abstract iteration is useful because it results in programs with fewer "moving
parts": no loop induction variables to misplace or mutate in the middle of a
loop body. Programs are necessarily expressed in a more rigid manner, and some
irregular algorithms can be difficult to express.

Summing a vector with an explicit loop in K (very non-idiomatic!):

    r:0;i:0; do[#v;r+:v@i;i+:1]; r

The equivalent using the "over" adverb:

    +/v

Both examples perform the same calculation. The latter is more concise and
easier to reason about.
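
For comparison, here is the same contrast in Python (just a sketch, not K):

    from functools import reduce
    import operator

    v = [3, 1, 4, 1, 5]

    # Explicit iteration: bookkeeping variables to misplace or mutate.
    r, i = 0, 0
    while i < len(v):
        r += v[i]
        i += 1

    # Abstract iteration: the looping machinery disappears.
    r = reduce(operator.add, v)  # or simply: sum(v)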

~~~
scottlocklin
Stuff like +/v is easier to reason about, but it's also handled by special
code which runs a lot faster. At some point I need to do a blog on all the
horrible things that happen inside your computer (cache hits, memory being
swapped in and out, stacks popping) when you do an interpreted, explicit loop;
bytecode, AST or whatever. There are R&D interpretors which claim to remove
this overhead for trivial for loops,which are the main kind that end up
getting used in numeric code, but none of them ever seem to make it into
production (I'm sure someone will correct me if I am wrong; I am pretty sure
Art wasn't doing this in K4, though he was probably best positioned to do so).

The real reason we like +/v besides less typing; it can be handled with
special code which runs close to the theoretical machine speed. Lots of small
places and languages this fact can be exploited in. R and Matlab basically
have a subset of operations you can do +/v type things with. APL is the main
class of languages where this sort of thing is built into the semantics of the
language. If you're dealing with numerics in an interpreted language, it
should be built into the semantics of your language, and that's how you should
do things. Really it should be in compiled languages too, and that's how
people should reason about code, but it's probably asking too much since APL
has only been around since the 70s...
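
A quick way to see the gap for yourself in Python (a rough sketch; the exact
ratio depends on your interpreter and hardware):

    import timeit
    import numpy as np

    v = np.random.rand(1_000_000)

    def loop_sum(v):
        # Element-at-a-time interpreted loop: dispatch overhead at every step.
        total = 0.0
        for x in v:
            total += x
        return total

    print(timeit.timeit(lambda: loop_sum(v), number=3))  # interpreted loop
    print(timeit.timeit(lambda: v.sum(), number=3))      # one call into C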

~~~
kazinator
> _handled by special code which runs a lot faster_

This is just another way of saying "we don't have a compiler, so don't process
data at the element level if you want speed".

It is not an advantage of the language, but a disadvantage.

If you have a compiler, then the claim is only valid for aggressive,
machine-specific optimizations, as articulated by statements like "the library
function is marginally faster than an open-coded loop, because it uses some
inline assembly that takes advantage of vectorized instructions on CPU X".

If I write an FFT routine in C myself, it will not lose _that_ badly to some
BLAS/LAPACK routine.

~~~
scottlocklin
Forcing your compiler to figure out that you're doing something trivial on a
rank-n array is silly. So is writing all the overhead and logic (where a typo
can break things) that goes into a for or while loop instead of two
characters: +/

I encourage you to try writing an FFT routine in C yourself and compare it to
FFTW, where they basically wrote a compiler for doing FFTs. It's also worth
doing in an interpreted language in an array-wise fashion versus with a for
loop. You should get something like a factor of 100,000 speed-up.
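
To make the interpreted half of that experiment concrete, here is a sketch in
Python: a naive looped DFT against the library FFT (the actual speed ratio
will depend on the size n and your machine):

    import numpy as np

    def dft_loop(x):
        # Naive O(n^2) DFT with explicit element-at-a-time loops.
        n = len(x)
        out = []
        for k in range(n):
            s = 0j
            for t in range(n):
                s += x[t] * np.exp(-2j * np.pi * k * t / n)
            out.append(s)
        return np.array(out)

    x = np.random.rand(256)
    print(np.allclose(dft_loop(x), np.fft.fft(x)))  # True, just far slower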

~~~
kazinator
> _Forcing your compiler to figure out you're doing something trivial on a
> rank-n array is silly._

What is the alternative, if there is no canned procedure for it?

The procedure has to be written somewhere, somehow, in some language.

If compilers are silly, assembly, I guess?

~~~
kd0amg
The analysis to recognize whether a "canned procedure" is applicable is
nontrivial, to put it lightly.

------
cousin_it
Looks like the site is hugged to death :-(

My only experience with APL-like languages was playing with K for a few
months. It's amazing; no other language can do so much in so few characters.
Here's a list of hundreds of snippets:
[http://code.kx.com/wiki/Qidioms](http://code.kx.com/wiki/Qidioms)

To give a taste of K style, I'll try to explain one of those snippets here.
The problem statement is to merge three arrays x, y, z under control of
another array g. Don't worry, it will make sense in a moment. These are the
definitions of x, y, z and g, followed by one line of code solving the
problem, and the result:

    x:"abcd"
    y:"123456789"
    z:"zz"
    g:"101121211010101"
    (x,y,z)[<<g]
      "1a23z4z56b7c8d9"

First of all, (x,y,z) is just the concatenation of three arrays, and [] is the
familiar array indexing operator. The twist is that [] can also accept an
array of indices, so e.g. "ab"[0 1 1 0] returns "abba".
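
In NumPy terms (an illustration, not K):

    import numpy as np

    s = np.array(list("ab"))
    print("".join(s[[0, 1, 1, 0]]))  # abba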

But the really clever bit is <<g. A single invocation of < returns the sorting
permutation of an array, i.e. an array of indices p such that g[p] is sorted.
After two invocations of < you get the sorting permutation of the sorting
permutation. In other words, the inverse of the sorting permutation. In other
words, the permutation that turns sorted g back into regular g. In other
words, exactly the permutation that you need to mesh x, y and z together!
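
Here is the whole trick replayed in NumPy as a sanity check (argsort standing
in for grade-up; an illustration, not K):

    import numpy as np

    x, y, z = list("abcd"), list("123456789"), list("zz")
    g = list("101121211010101")

    # argsort is "grade up"; applied twice it yields the inverse of the
    # sorting permutation, i.e. the rank of each element of g.  A stable
    # sort mirrors K's grade, keeping ties in original order.
    ordinal = np.argsort(np.argsort(g, kind="stable"))

    print("".join(np.array(x + y + z)[ordinal]))  # 1a23z4z56b7c8d9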

By any reasonable programmer's standard, that's way too much cleverness. But
if you want a lot of functionality in a few characters, that's the price to
pay, and K programmers seem happy to pay it. It's also really fast, because
you're combining optimized bulk operations, and dropping down to individual
elements only in special cases.

~~~
pmoriarty
What do you gain from using the << syntax over a verbose but reasonably clear
function name like "sorted-permutation"?

Clarity in function names is one of the main reasons I prefer Lisp and Scheme
over languages like APL, K, and Haskell, which seem to favor something like
mathematical notation, which I've always found obfuscating. This obfuscation
is made worse by the aversion to comments that I've seen in some of these
languages, while in Lisp and Scheme the verbose and explicit function names
are in a way self-documenting.

For someone who prefers a clear and explicit programming style, the terseness
of some programming languages is a real turnoff.

~~~
RodgerTheGreat
The function "grade up" has the symbol "<". The composition of this function
with itself, "<<", is not a special case: it's "grade up of grade up".
Compositions of functions are first-class objects in K, so you could certainly
give it a name if you like (as well as eliding some of the brackets in the
above example):

      ordinal: <<:

      (x,y,z)[ordinal g]
    "1a23z4z56b7c8d9"

      (x,y,z)ordinal g
    "1a23z4z56b7c8d9"

The idiom << is common enough, though, that K programmers learn to recognize
it when they see it in code. I'd have to look up the definition of "ordinal"
to be certain it did what I think, but << is totally unambiguous. Contrary to
your argument, naming short idioms like ,/ << +| and so on actually makes a
program _less_ explicit.

------
fernly
Nice to see APL lives on and people still try to teach it. For software
historians, all the 1970s-era manuals are at bitsavers[1]. The language was
quite a bit simpler then; several of the primitives this tutorial thought
appropriate to introduce didn't exist yet.

[1]
[http://www.mirrorservice.org/sites/www.bitsavers.org/pdf/ibm...](http://www.mirrorservice.org/sites/www.bitsavers.org/pdf/ibm/apl/)

------
dang
This recent thread about an APL compiler for the GPU,
[https://news.ycombinator.com/item?id=13565743](https://news.ycombinator.com/item?id=13565743),
was so interesting that we invited the author to do an AMA about it:
[https://news.ycombinator.com/item?id=13797797](https://news.ycombinator.com/item?id=13797797).

------
adamaid
A+, a derivative of APL, is still being used in a certain investment bank.
Twenty years of projects to decommission it and replace it with something more
modern haven't managed to completely kill it off yet. The main issue I had
with it was the inability/cost to hire people with any experience, and the
off-putting, steep learning curve. Once you get the hang of it, though, it's a
great language for solving certain more numerically oriented problems.

~~~
osullivj
Morgan Stanley. They open-sourced it ~15 years ago at aplusdev.org. The
project was led by Arthur Whitney, who went on to become Mr. KDB.

~~~
throwaway7645
Mr. Moneybags...well earned from what I hear.

------
mmcclellan
I really liked the way that Iverson built up concepts in his [Arithmetic
(PDF)](http://www.jsoftware.com/books/pdf/arithmetic.pdf) manual for J. I
found it very intuitive and useful even if you never plan to use the language.
There are others at the site, like his manual for Exploring Math and one for
Calculus as well.

~~~
throwaway7645
I thought they were great too!

------
muraiki
I've slowly been going through "J Tutorial and Statistical Package", which
teaches J (Iverson's evolution of APL, which uses only ASCII characters) in
the context of building a library for statistics[1]. I've found that it's a
great way to learn the language, and that stats is a domain where APL-family
languages work very well. I also read an interesting paper about using APL to
create a notation for statistics, such as "normal prob between 0 2"[2].
Another nice thing about J is that it's GPL'd and free to use commercially.

[1]
[https://webdocs.cs.ualberta.ca/~smillie/Jpage/jtsp.pdf](https://webdocs.cs.ualberta.ca/~smillie/Jpage/jtsp.pdf)

[2]
[http://archive.vector.org.uk/art10501700](http://archive.vector.org.uk/art10501700)

------
sunkencity
The core language is good enough, but the object-oriented stuff they put on
top later is just a hell of an eyesore.

------
ColanR
Is APL still used for anything now? It seems like it could be a useful
language for _something_.

~~~
evincarofautumn
I dunno how _widely_ used it is, but there is Kdb+[1], “a column-based
relational time-series database” built on the K[2] language, a descendant of
APL. I mainly see J[3] used for code golf, but at least that means a decent
number of people know it.

[1]:
[https://en.wikipedia.org/wiki/Kdb%2B](https://en.wikipedia.org/wiki/Kdb%2B)

[2]:
[https://en.wikipedia.org/wiki/K_(programming_language)](https://en.wikipedia.org/wiki/K_\(programming_language\))

[3]:
[https://en.wikipedia.org/wiki/J_(programming_language)](https://en.wikipedia.org/wiki/J_\(programming_language\))

~~~
alfalfasprout
Kdb+ is incredible. It blows away any of the current attempts at time series
data stores out there (eg; InfluxDB, OpenTSDB). Unfortunately, Kx Systems
failed to allow it to reach mass popularity by keeping licenses extremely
expensive. Startups simply can't afford to outlay $100k+/year for a tiny
cluster.

~~~
avmich
I wonder if anybody has compared Kdb+ with Jdb, a J DB system.

~~~
throwaway7645
I tried googling this awhile back. I assume kdb+ is closed source?

------
rnhmjoj
The main problem with (GNU) APL for me is the line editor: I can't feel
comfortable using a REPL to write a program without keybindings like ctrl-a
and ctrl-w. Every time I try to learn APL, I eventually give up.

~~~
bjarkevad
I haven't tried it out, but maybe rlwrap can help here.

~~~
rnhmjoj
Unfortunately it doesn't seem to work.

------
jtraffic
I thought it was super cool when I found out you could overload operators in
Julia. I wonder if you could re-create some APL-like syntax that way.

~~~
KenoFischer
You might enjoy this one:
[https://www.youtube.com/watch?v=XVv1GipR5yU](https://www.youtube.com/watch?v=XVv1GipR5yU)

~~~
jtraffic
I really did enjoy that. It reminded me of how I use R and then drop down to
C++ when I need speed. This interpreter is analogous: use Julia, and then go
to APL when you want compact expression. Although, I'm not sure how
advantageous it is; at least _prima facie_ it's not as compelling as the speed
gains you get from dropping to C++ from R.

------
gionn
Is this sorcery?

~~~
Semaphor
It's APL. First time I heard about it was in those "how to shoot yourself in
the foot" lists [0]:

* You shoot yourself in the foot and then spend all day figuring out how to do it in fewer characters.

* You hear a gunshot and there's a hole in your foot, but you don't remember enough linear algebra to understand what happened.

[0] [http://www.toodarkpark.org/computers/humor/shoot-self-in-foo...](http://www.toodarkpark.org/computers/humor/shoot-self-in-foot.html)

~~~
throwaway7645
The second bullet point is pretty funny. I really liked linear algebra, but
I'm sure it could get frustrating if you're not an expert. I wonder if it
makes more sense philosophically for code to be mathematical, like APL, or
closer to a spoken language, like Python.

~~~
GregBuchholz
"SciPy – the embarrassing way to code"

[http://www.vetta.org/2008/05/scipy-the-embarrassing-way-to-c...](http://www.vetta.org/2008/05/scipy-the-embarrassing-way-to-code/)

~~~
throwaway7645
I enjoyed this article. Thanks!

