
Papers from the Lost Culture of Array Languages (2011) - tosh
http://prog21.dadgum.com/114.html
======
geophile
APL (which is the array language I had some exposure to) had three
interesting characteristics:

1\. It implements an algebra: Operators take arrays as input and yield arrays
as output.

2\. The special characters bound to those array operators.

3\. It is functional.

#2 definitely gave APL programs a distinct and elegant personality, but
probably didn't help with widespread adoption.

But #1 and #3 are powerful ideas, and have showed up in other forms. SETL was
a set-oriented language from the late 70s/early 80s. I took compiler courses
from the guy behind it, R. B. K. Dewar (of Spitbol fame). Fantastic teacher.
SETL programs were concise and powerful like APL programs, exactly because of
#1 and #3.

Later in my grad school career, I got into relational algebra, which also has
these characteristics. My advisor was T. H. Merrett, who engaged in a
many-year, quixotic quest to explore relational algebra as the basis of a
general-purpose programming language. I didn't buy into all of that, but as a database
guy, I find relational algebra to be powerful and useful on a day-to-day
basis.

Finally, I really like functional stream approaches as a great way to bridge
set-at-a-time programming (for lack of a better term) with ordinary, low-level
programming. Java 8 streams and lambdas are a great (if flawed) example of
this approach. Even Map/Reduce is, underneath the vast amounts of complexity.

I like this approach so much that I built a command-line tool, osh
([http://github.com/geophile/osh](http://github.com/geophile/osh)) based on
it. The idea there is the Unix idea of composing simple commands using pipes,
except in osh, the pipes carry streams of Python objects. And the commands
that operate on the objects in these streams are Python. E.g., to print the
pids and commands of processes whose commandline contains "emacs":

    
    
        $ osh ps ^ select 'p: "emacs" in p.commandline' ^ f 'p: (p.pid, p.commandline)' $
        (37268, '/usr/bin/emacs24 ')
        (42107, 'emacs ')
        (63758, '/usr/bin/python /usr/bin/osh ps ^ select p: "emacs" in p.commandline ^ f p: (p.pid, p.commandline) $ ')
        (113605, '/usr/bin/emacs24 ')
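The same stream-of-objects idea can be sketched in plain Python with
generators; the function names below mirror the osh commands but are
illustrative stand-ins, not osh's actual implementation:

```python
# A minimal sketch of pipes carrying Python objects instead of text.
# Process, ps, select, and f are all made up for illustration.
from collections import namedtuple

Process = namedtuple("Process", ["pid", "commandline"])

def ps():
    # Stand-in for enumerating real processes.
    yield Process(37268, "/usr/bin/emacs24 ")
    yield Process(42107, "emacs ")
    yield Process(99999, "/usr/sbin/sshd ")

def select(pred, stream):
    # Keep only the objects satisfying the predicate.
    return (x for x in stream if pred(x))

def f(fn, stream):
    # Map a function over the stream.
    return (fn(x) for x in stream)

# Equivalent in spirit to: osh ps ^ select ... ^ f ... $
pipeline = f(lambda p: (p.pid, p.commandline),
             select(lambda p: "emacs" in p.commandline, ps()))
for item in pipeline:
    print(item)
```

Because each stage is a generator, objects flow through one at a time, just
as lines flow through a Unix pipe.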

~~~
iheartmemcache
I read your post, thought 'hmm sort of powershelly with a bit of awk', clicked
on your Github project page and saw Powershell was listed as the first
'Software with Similar Goals to Osh'.

I've been searching for the 'panacea' of shells (and/or auxiliary shell
tools/hacks that dup() fd's 0,1,2 to enhance existing shells) that hits that
same 'sweet spot' you describe. As such, over the last ~15 years I've been
through a litany of setups, ranging from:

\- the standard "bash/zsh/fish" approach (where you extend the shell) to

\- the "scsh/ipython/eshell" approach (where you bring an inferior shell's
functionality into a language) to,

\- the screen/tmux approach (where you take a shell and then layer
functionality over it). E.g., for directory navigation, I'd written my own
f-recency+bookmark system that would hook 'cd <tab>' and generate a pane,
sort of like Midnight Commander, to nav around.

I'm not sure where I'm going with this other than, I feel your pain and I'd
imagine tons of other people do/did as well. Powershell is _painfully slow_
and RAM heavy, but the ability to add custom properties(!), providers, access
the registry, and manipulate all of these objects as you'd like makes up for
it. Your project
definitely looks like an interesting take on things as well. At least we're
making _some_ progress, I suppose ;)

===

(!) This is incredibly powerful since you can take a path, like
C:\users\foo\downloads\video\, take each file item, and then have Powershell
invoke an executable to extend its functionality. If Windows doesn't have
"Length" or "Encoder" as a property on the file out-of-the-box, you can just
use an auxiliary tool (say, ffprobe), "mapcar" the exec over the
list-of-files, grep out the Length: field, and bam, that file now has Length.
`ls | where Length -gt 15` ends up being pretty magical.
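That "attach a computed property, then filter" pattern can be sketched in
Python; the file names, durations, and the probe_length helper are all made
up for illustration (a real version would shell out to ffprobe via
subprocess and parse its output):

```python
# Sketch of: map an external probe over files to attach a new property,
# then filter on it (roughly `ls | where Length -gt 15`).

def probe_length(path):
    # Stand-in for subprocess.run(["ffprobe", ...]) plus output parsing;
    # the durations here are fabricated for the example.
    fake_durations = {"a.mp4": 12.0, "b.mp4": 47.5, "c.mp4": 3.2}
    return fake_durations[path]

files = ["a.mp4", "b.mp4", "c.mp4"]

# "mapcar" the probe over the files, attaching Length as a new property
annotated = [{"Name": f, "Length": probe_length(f)} for f in files]

# the equivalent of `ls | where Length -gt 15`
long_videos = [f["Name"] for f in annotated if f["Length"] > 15]
print(long_videos)  # ['b.mp4']
```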

~~~
geophile
osh gets extensibility by reading a file on startup (~/.oshrc by default).
That file is Python, and contains both configuration (e.g. database login
info, ssh login info), and functions that can be used in osh commands.

------
sumanthvepa
I had the good? fortune of programming in A+ (a derivative of APL) at Morgan
Stanley in the mid 90s. Not sure if it is used there anymore. For someone
coming from a background in object-oriented languages like C++ and Java (only
just introduced in 1995), A+ was utterly alien and incredibly powerful. A
single line could accomplish stuff that would take 50-100 lines of C++. But
that single line looked like Egyptian hieroglyphics! To even read or write
code in A+ required a special font add-on to XEmacs, as the glyphs would not
render in any other font. I never fully understood the art of programming in
A+ in the year or so I had to work on the product that was written in it, but
I was amazed at the things the experts in my team could achieve with it.

~~~
FractalLP
The glyphs are Unicode now, and I bet A+ is just legacy; new things are kdb+,
which is the successor to A+. Can anyone confirm?

~~~
tluyben2
It is still there and open source [0], and I guess you can say k (not kdb or
q!) is its successor in some ways. They are definitely all very much APLs,
with or without the glyphs. If you read the APL book by Iverson and then go
to implementations of APL, A+, J, K, you'll notice differences, but they are
all similar enough to pick up fast once you are proficient at one of them.
kdb+/q are different beasts, and that is where the issues start (and where
the big bucks are to be found): debugging (performance) issues related to
critical software written on top of k.

[0] [http://www.aplusdev.org/index.html](http://www.aplusdev.org/index.html)

~~~
ca01an
Do you have any experience with kdb+/q? I was considering taking a job offer
where I'd have the opportunity to learn kdb+/q but I don't have any experience
with it and I can't find a huge amount of information about it online.

~~~
oddthink
Despite my griping, I liked kdb+ and q. As a database query language, q is
awesome, if you are doing a lot of time-series analysis (running sums, etc.,
nothing fancy). As a language, the table type in q is very nice, a bit like
what pandas or data.table wish they could be.

For a while there was a freely downloadable version to try out, and you can
look at "q for mortals", [http://code.kx.com/q4m3/](http://code.kx.com/q4m3/),
to get some flavor.

~~~
tluyben2
They just made the 64-bit version free for experimenting with. It is annoying
because it will only start up and run if you are connected to the internet,
but it fully works. The 32-bit version can be downloaded as well, and that
fully works, with or without an internet connection.

------
segmondy
The key thing is "Notation as a tool of thought." What would mathematics or
music be without notation? It might be hard to believe, but there was once a
time when mathematics didn't have notation for "+, -, *, /" or even the
decimal point ".". Mathematics was done by writing out words; it was slow, it
was painful, and it didn't progress as fast. Likewise with music: sure, we
can write music by writing do re mi fa so la ti, but really?

Yet this is how we program today: with words, writing English words that
don't mean what they claim to mean. SomethingManager, SomethingDelegator,
etc. Bah! The very idea behind APL is notation: to represent code with
symbols which are unambiguous and always mean the same thing.

There's a lot that we have lost. It's still worth taking a look at. For those
interested, J (www.jsoftware.com) is a free version with lots of great
learning resources.

~~~
pasabagi
As somebody who started programming long after I learned to touch-type, I
feel like programming uses a lot more typographical fauna than normal
language (granted, not as much as maths), but I'm pretty sure half of maths
notation is the way it is because it looks cool.

Further, when you have a SomethingManager, is it really representable by a
general symbol? What symbol would that be?

------
sharpercoder
With many languages adopting LINQ object querying, I find it more and more
suitable to start using math's set operators for them. Many languages have
almost identical behavioral sequence operators (.find in js, .Where in C#,
etc) but math sort-of standardizes them all with a sound theoretical
background. I'm not entirely sure if they all map 1-to-1, but at least most
will.

    
    
        const items = ∅; // empty set
        if (myItem ∈ items) { ... } // is element of
        const union = items ∪ otherSet;
        assert items ≌ ∅; // items are all equal
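For comparison, Python's built-in set type already covers the same
operations with ASCII spellings (a direct translation of the snippet above):

```python
items = set()                  # ∅, the empty set
my_item = 3
print(my_item in items)        # ∈, membership test

other_set = {1, 2, 3}
union = items | other_set      # ∪, set union
print(union)

assert items == set()          # items equal the empty set
```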

~~~
b34r
Typing these characters is quite annoying, though. The characters for most
programming languages are readily visible on a 108-key keyboard, which means
they're easy to remember and teach.

~~~
guidoism
It's depressing that this is such a common argument against moving away from
the ASCII character set. We all use operating systems that allow us to define
key sequences for these characters. And it's really easy to remember them
after a week or so, I mean, how often do you actually look at the letters on
your keyboard anyways?

The main issue is that computers don't come with a standard default layer for
math and APL characters anymore so it's a huge leap of faith for someone to
start to use these characters.

~~~
posterboy
I had to paint my keyboard blank to learn to type without looking at the
printed-on symbols ...

------
pjc50
This reminds me of "[semantic] compression oriented programming":
[https://caseymuratori.com/blog_0015](https://caseymuratori.com/blog_0015)

The array languages have always struck me as the extreme end of this. The
semantics are very dense: each operator does a lot. Intermediate results
aren't granted names. The few people who can follow this find it pleasant as
the "extraneous" information is removed. Most people find it hard to adapt to.
And it doesn't readily suit a lot of business logic.

Possibly the one context where arrays-as-language-primitive has really taken
off is the 3D processing world, first with OpenGL's matrix stack computation
and later with shader programming. Note that shader programming is the other
way round: you don't start with a screen object and apply functions to it
mapped across all the pixels (fragments); you write a small program as if for
one pixel, even if the end result is executed in a highly parallel fashion.

~~~
guidoism
> Intermediate results aren't granted names.

I would argue that most programmers are fine with this. Unix pipes are a very
similar point-free way to program that we're all comfortable with. If people
came to APL thinking of it as something similar to pipes as opposed to
something similar to C they might have an easier time with it.

~~~
posterboy
It's funny to call this point-free when OOP notation like
_ps().grep(pid).kill(9).or(halt);_ comes down to the same thing. Just
rambling because I'm confused. lol.

~~~
uryga
You're right, method (OOP) notation lets you compose/pipeline functions. Pipes
enable the same thing. So does F#'s pipeline (|>) operator, Haskell's
composition (.) operator, Clojure's threading macro, concatenation in stack-
based languages like Forth, etc... It's not a concept unique to OOP, which is
why GP referred to 'point-free programming', which describes the general idea.

------
bitminer
I remember APL as part of learning how to program at my high school in the
early 1970s.

I could touch-type APL on the IBM 2741 text terminals (basically Selectric
typewriters with a serial interface), and could compose simpler APL code at
that speed.

The principal weakness was APL's narrow view of data types. Text was a 3rd
class citizen. There was no substring searching in the early IBM
implementation. As a result one had to:

    
    
      1. Form a cross product of the text with the substring, testing for
         equality of each character.
    
      2. Rotate each row of the cross product by n steps, where n is the row
         number. (This aligns the tests along columns.)
    
      3. Perform AND reduction along the columns to find the full matches.
    

None of this was intuitive, and no, I didn't invent that method.

~~~
geocar
> None of this was intuitive

It's extremely intuitive!

    
    
        ⌊/(⍳≢b)⊖a∘.=b
    

Let me illustrate why with q:

    
    
        q)a:"this is some cake i like"
        q)b:"cake"
    

Form a cross product of text with the substring, testing for equality of each
character

    
    
        q)a=/:b
        000000000000010000000000b
        000000000000001000000000b
        000000000000000100000010b
        000000000001000010000001b
    

Rotate each row of the cross product by n steps where n is the row number

    
    
        q)(til n:count b) rotate' a=/:b
        000000000000010000000000b
        000000000000010000000000b
        000000000000010000001000b
        000000001000010000001000b
    

Perform AND reduction along the columns to find the full matches.

    
    
        q)(&/) (til n:count b) rotate' a=/:b
        000000000000010000000000b
    

And there we have it. A naive string search in an array language.
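For readers more at home in Python, the same three steps translate nearly
word-for-word into NumPy (a sketch added for comparison, not part of the
original comment):

```python
import numpy as np

# The haystack and the needle, as byte arrays.
a = np.frombuffer(b"this is some cake i like", dtype=np.uint8)
b = np.frombuffer(b"cake", dtype=np.uint8)

# 1. "Outer product" with == instead of multiplication.
eq = b[:, None] == a[None, :]            # shape (len(b), len(a))

# 2. Rotate row i left by i steps so full matches line up in one column.
rolled = np.vstack([np.roll(row, -i) for i, row in enumerate(eq)])

# 3. AND-reduce down the columns: True where the whole substring starts.
starts = rolled.all(axis=0)
print(np.flatnonzero(starts))
```

Here "cake" begins at index 13 of the haystack, which is exactly where the
single 1 sits in the q output above.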

Now some interesting advantages fall out of rediscovering this:

\- It's obviously parallelisable: the "obvious" (iterative) solution in other
languages isn't, and it isn't clear, without some effort, where the work can
be split up.

\- It uses "outer product" with compare = instead of multiplication × -- an
experienced programmer might forget how deeply satisfying it is when first
learning how to compose operators

\- With some thought it gives a programmer ideas on how they can do string
search _even faster_. Fast string search in an iterative language doesn't look
anything like the naive solution.

b⍷a (b find-in a) is useful enough that an experienced programmer should
expect a "modern" language to implement it (in q it's called "ss", other
languages of course call it different things) but it is far from necessary,
and we might cheat ourselves of something. After all, _when_ we are
experienced enough to see these things as obvious, other things become obvious
as well.

------
pklausler
Arrays in array languages are functions of their indices, and many array
operations can be viewed as compositions of functions on those indices.
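A small NumPy illustration of this view (added for concreteness): structural
operations like transpose and reversal are just compositions with functions
on the indices.

```python
import numpy as np

# View an array as a function of its indices: A[i, j] is f(i, j).
A = np.arange(6).reshape(2, 3)

# Transpose composes f with the index swap (i, j) -> (j, i):
assert A.T[2, 1] == A[1, 2]

# Reversal along the first axis composes with i -> n-1-i:
n = A.shape[0]
assert (A[::-1] == A[[n - 1 - i for i in range(n)]]).all()

# Broadcasting a 1-D array over rows composes with (i, j) -> j:
v = np.array([10, 20, 30])
assert (A + v)[1, 2] == A[1, 2] + v[2]
```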

~~~
pacaro
:lightbulb: thank you

------
protomyth
F-Script was an interesting array language object combo that was originally
written on OPENSTEP. It fit great with Cocoa later and was quite nice to
program in.

[https://en.wikipedia.org/wiki/F-Script_(programming_language...](https://en.wikipedia.org/wiki/F-Script_\(programming_language\))

~~~
chc
F-Script was really interesting. It was basically array programming welded
onto Smalltalk welded onto Cocoa. It had a whole Smalltalky object browser,
and the array programming features felt surprisingly natural.

------
tabtab
They kind of got replaced by "table-oriented" query languages, like SQL.
Arrays can get very unruly to coordinate, manage, and/or comprehend. Tables
tend to ensure a bit more structure and can "do" most of the same things as
arrays if you normalize properly.

SQL is probably not the ideal "table processing" language, but it's good
enough and a strong enough standard such that competitors have yet to unseat
it (although I'd like to see more competition, such as Tutorial-D/Rel, SMEQL,
etc., fighting it out in the marketplace).

------
poster123
I have used Fortran 90+, Python with Numpy, R, and (rarely) Matlab/Octave, all
of which allow operations on whole arrays and array sections. Would they be
considered array languages? What is qualitatively different about APL, J, and
K?

~~~
geocar
Not by me. Not by most people, I reckon.

Consider the lone factorial problem: every array programmer will do some kind
of _the product of one plus the array of all ints up to x_:

    
    
        apl: ×/⍳X
        q: prd 1+til x
        j: */1+i.x
        k: */1+!x
    

Notice how similar they are?

Now, how do you solve this problem in Fortran? In Python? In R? In Matlab?
What's the first tool in your toolbox? Is it to iterate over the values?

    
    
             FUNCTION FACT(N)
             INTEGER N,I,FACT
             FACT=1
             DO 10 I=1,N
          10 FACT=FACT*I
             END
    
    
        def fact(n):
            result = 1
            for i in range(1, n+1):
                result *= i
            return result
    
        function b=fact(a)
          b=1;
          for i=1:a
            b=b*i;
          end
    
        function fact = iter_fact(n)
          fact = 1;
          for i = 2:n
            fact = fact * i;
          endfor
        endfunction
    
        fact <- function(n) {
          f = 1
          for (i in 2:n) f <- f * i
          f
        }
    

If the native programmer's first impulse looks something like that, then it's
not an array language because the programmer isn't thinking in terms of
arrays.
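For contrast, the array-style ×/⍳X is expressible in Python too via NumPy,
even if it is rarely a Python programmer's first impulse (a sketch added for
comparison):

```python
import numpy as np

def fact(x):
    # The array answer: reduce multiplication over the ints 1..x,
    # rather than iterating and accumulating by hand.
    return np.multiply.reduce(np.arange(1, x + 1))

print(fact(5))  # 120
```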

~~~
jerf
I wonder if some of the popularity of "functional" in imperative languages is
that it lets you put this sort of stuff together without needing an "array
language". In python it's pretty easy to

    
    
        import operator
        from functools import reduce
    
        def product(iterable):
            return reduce(operator.mul, iterable, 1)
    
        >>> product(range(1, 5))
        24
    

(note that range is [begin, end), so that's 4!, not 5!). It might not be the
first thing a Python programmer does, but it's certainly reasonable. Haskell
of course is

    
    
        Prelude> foldl (*) 1 [1..5]
        120
    

(inclusive this time, hence 5!) which is only slightly more verbose than the
"array" languages, but does have the advantage of clearly specifying what your
base case is if the list is empty.

Perhaps array languages are just getting subsumed as a special case of
"functional programming", be it either the relatively weakly-guaranteed FP
that gets embedded into otherwise OO/imperative languages or Haskell's
stronger-guarantees FP.

~~~
fusiongyro
It's worth noting that the behavior is totally different if you pass higher
dimensional matrices. The Fortran, Python, Haskell, etc. code will all break
or fail to typecheck, whereas the APL/J/K versions implicitly map over them.
Understanding this topic ("rank") is a major source of confusion for new users
and power for existing users, as lots of different mapping regimes can be
obtained without a lot of additional operators.

The rank operator in J (and probably Dyalog APL) allows you to treat a 3D
matrix as an array of 2D matrices, a 2D matrix of arrays, a single item or a
bunch of unit-sized boxes. This concept generalizes to higher dimensions. I
don't think this aspect of array programming has gotten as much airtime as it
deserves, probably because it is complex, but this is where the semantics of
array languages and conventional languages really differ.

~~~
eesmith
The Python-with-NumPy example I gave at
[https://news.ycombinator.com/item?id=16849500](https://news.ycombinator.com/item?id=16849500)
supports higher dimensional matrices:

    
    
      >>> import numpy as np
      >>> np.multiply.reduce(np.array([[1,2,3,4], [5,6,7,8]]))
      array([ 5, 12, 21, 32])
    

and it lets you reshape into different forms, like:

    
    
      >>> arr = np.array([[1,2,3,4], [5,6,7,8]])
      >>> np.multiply.reduce(arr.flatten())
      40320
      >>> np.multiply.reduce(arr.reshape((4,2)))
      array([105, 384])
      >>> np.multiply.reduce(arr.reshape((2,2,2)))
      array([[ 5, 12],
             [21, 32]])
    

I believe this was influenced by the array languages.

~~~
fusiongyro
Very cool, I did not know that! Thank you for posting this example!

