Notation as a Tool of Thought (1979) [pdf] (utoronto.ca)
131 points by rbanffy on Oct 23, 2021 | 65 comments



I think this is a fascinating subject that touches on how human general-purpose intelligence works. I've spent decades working with C-like languages, and by now I find them hugely helpful thinking tools.

I did some optimization work on a bilinear bitmap upscaler recently [1]. There were times in the middle of that work where I felt like I was sitting watching someone else do it for me. Not physically watching, but more like a part of my brain was doing it automatically. The amount of state in scope at once is too big to fit into my working memory (maybe I'm weak at this), so the work has to be symbolic manipulation. It feels like there are patterns in the symbol manipulation that my brain can do with minimal effort. Somehow the result is that I can write (hopefully) correct code that is beyond my ability to comprehend without the notation. The fact that I don't understand how I understand it contributes to the feeling that someone else wrote it for me.

The problem I have looking at APL is that it doesn't look like it would help with this kind of work. The examples in the paper are all mathematical and abstract. I'd like to see some real-world practical examples, where performance matters, IO is fiddly and memory layout is part of the API. And instead of the problem domain being "differentiating a polynomial", I'd rather it was decoding the Huffman data out of a JPEG, or implementing a video game's collision detection system. My feeling is that C-like notations are better for that type of thing. Maybe that isn't what APL is supposed to be good at. However, the paper starts by saying APL is needed because maths notation is not universal.

[1] https://stackoverflow.com/a/69633678/66088


One possible reason I don't think APL is a good notation is that it seems to avoid assigning names to things. I believe good names are a vital tool for a thinking notation. In my bilinear upscaler example, I was initially struggling to understand the code I had found on the web. The variable names were all things like "x1a", "x2". I changed them to things like "srcX" and "weightX". That helped a lot.


And another: types. They seem to be under-emphasised in APL. In my bilinear example the original code had lots of "(pixel >> 8) & 0xff" kinds of things to access the colour components. I created a type to allow those to become "pixel.g". That immediately made the code feel easier to understand. (Pleasingly, it made the code run faster too.)
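The same idea can be sketched in a few lines of Python (the 0xAARRGGBB packing and the Pixel name are my assumptions, not the commenter's actual code):

```python
# Sketch: wrap a packed 32-bit pixel so components get names.
# Assumes 0xAARRGGBB packing (an assumption, not from the post).
class Pixel:
    def __init__(self, value):
        self.value = value

    @property
    def r(self):
        return (self.value >> 16) & 0xFF

    @property
    def g(self):
        # Reads the same bits as "(pixel >> 8) & 0xff", but with a name.
        return (self.value >> 8) & 0xFF

    @property
    def b(self):
        return self.value & 0xFF

p = Pixel(0xFF336699)
p.g  # 0x66, the green component
```

The point stands either way: the bit fiddling is identical, but the name carries the intent.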


Prominent APLers think it doesn't need a type system. I agree, because APL tries to be close to math, and you don't write types when doing math. You use alternate and overloaded operators.

A video on the topic

https://youtu.be/z8MVKianh54


In math, not only are you dealing with an expectation of a multi-year commitment among participants in an academic conversation, but people also either define the terms they use or rely on established culture.


Try writing a math paper with thousands of expressions and you will see how ergonomic it is for the kind of work we do while programming.


The single-letter variable names and the lack of types (beyond some duck-typing using bold and uppercase) is what makes math so difficult to read (for me). There must be tons of errors in math papers that go undetected just because there’s no type checking.


Ok, but let's not forget that math is written by and for mathematicians and not software developers...


Also, any halfway decent math writers (including those writing for domains that use math, but are not strictly done by mathematicians) also define their terms or provide a context. If you're doing physics, the following is perfectly comprehensible (and is useless if you haven't studied at least a bit of physics):

  x(t) = x_0 + v_0 * t + 1/2 * a * t^2
In a physics book, all of those terms would have been defined prior to that statement, but you wouldn't exclude that equation because it's so useful later on (to actually do calculations with, or to solve for the other elements).

If you take any arbitrary APL program or mathematical statement out of context, it's going to be mostly meaningless. I mean, I can tell you what that does above if I've taken algebra but haven't taken the first couple weeks of Physics 101, but I don't know what it means so it's not useful. That's not the fault of the equation and it shouldn't be discarded just because it needs context to fully understand (either context fully stated with it, or an expectation that you've already studied the topic and so a minimal context can be provided and you can fill in the gaps).


Non-executable, untyped, unverifiable notation is a feature, not a bug, because it makes it harder for a peer reviewer to reject your paper.

Imagine if peer reviewer's pdf viewer put red squiggly lines under invalid formulas and equations...


You're wrong. Pro APLers use descriptive names.


I think probably C is a better fit for the kind of things you're doing. Or is it the other way around? The reference implementation for JPEG is in C, and ease of implementation in C was probably a significant criterion for this and other standards. I tend to see lots of structs, bit-packing, and bitwise operations that don't make as much sense in APL. I wouldn't say that addition modulo specific powers of two is inherently "practical" or "mathematical", it's just what a CPU provides. APL's not interested.

That said, Rosetta Code has many examples of common algorithms in J, an APL relative. I looked up Huffman Coding there and found a link[0] to an implementation that looks okay. But J's likely to stumble on some JPEG-specific details, and you won't get C-level performance with this sort of code.

[0] https://code.jsoftware.com/wiki/Essays/Huffman_Coding


Excellent link to the Huffman example. I think it will take me a lot of study to grok it. I'll consider investing that effort.

> I tend to see lots of structs, bit-packing, and bitwise operations that don't make as much sense in APL.

That's why I mentioned "performance matters". Those things are common in algorithms where performance matters. If APL isn't interested, then it isn't good for most algorithmic work. Those operations aren't some quirk of current CPU architecture. It is less work in our universe to do bit shifts than multiplies. And bit packing reduces storage and IO, which also saves work at a fundamental level. The same operations are used in FPGA/ASIC designs where the architecture is much more free. Maybe quantum or analogue computers would change that picture a little but that's not what APL is for.

I'd say the main use for a symbolic thought notation is when working on performance critical stuff. If performance isn't critical then I use a simple algorithm and a simple implementation. Higher level system design stuff is usually better done as diagrams than symbolic notation. I guess when working on a large codebase I'm using the structure of the symbolic programming language to help me think about how distant parts of the code interact. But that doesn't sound like something APL would be good at either.


Cool paper - thanks for posting! I love the typesetting of the math in these old papers - I have no idea how it was done, but it's ~good vibes~ for me.

If you're interested in reading a more recent take on notation, check out this post [1] by Terence Tao on MathOverflow. It's pretty cool, and a shorter read than this PDF - if you're in a rush.

While I'm posting overflow questions I like, check out this gem [2] on parsing HTML with a regex.

[1] https://mathoverflow.net/questions/366070/what-are-the-benef...

[2] https://stackoverflow.com/questions/1732348/regex-match-open...


> I love the typesetting of the math in these old papers

I agree, except that the first thing the paper does is define "1" to be a function that "produces a vector of the first N integers". Highly confusing. It turns out that symbol isn't a "1" but rather a weird squiggle that unfortunately looks like one.


It should be the Greek letter iota (i.e. I from indices). Besides iota, APL frequently uses the Greek letter rho (i.e. R from rank); rho applied once gives a vector with the array dimensions, and applied twice it gives the length of that vector, i.e. the tensor rank of the array.


Interesting. I don't think "ι" looks like "1" at all.


Incidentally, what font is being used for the main text?


APL is a write-only language. [1]

This will be a controversial statement to devotees, but there's a simple argument that should clarify why this is true:

1. APL's inventor claims that APL is a tool of thought. Specifically, it is designed to facilitate the development of algorithms.

2. Tools of thought, like mathematical notation or writing in general, are used wherever people think.

3. When people think about algorithms, they write code or pseudo-code.

4. Thus, if APL were truly a tool of thought, it would be adopted as a pseudo-code. Academics would use it to more easily, concisely, and precisely express algorithms and programs. Software engineers would use it in their whiteboard planning. Etc.

5. In reality, essentially no one uses APL as a pseudo-code. Even by its devotees, it is used exclusively as a programming language. The process of coding something into APL's idiosyncratic logography can only happen after the idea is already clear by other means. And virtually no one finds themselves understanding an algorithm via APL code which they did not already understand via more accessible languages or pseudo-code.

6. Therefore APL is not a tool of thought. On the contrary, it is what its notation indicates to virtually everyone who sees it: a write-only language.

Admittedly, there's some appeal in this idea that a super-concise notation might be a sort of life-hack for thinking about algorithms. If you could just get your brain to think in this notation, all sorts of new concepts would be accessible to you! But that's just not how reality has unfolded, at least in the case of APL. Perhaps there is a yet-undiscovered notation that truly realizes the promise of a Tool of Thought about algorithms. But I have no idea what it would be.

This letter from Dijkstra and the discussion underneath it [2] further strengthens the argument above. Note how APL is marketed as a tool of thought, but those who would teach it are distressed at the idea of teaching it without a terminal to execute it on. As Dijkstra notes, this is inconsistent with what we'd expect from a true tool of thought. Such a thing should have no trouble expressing its appeal through writing.

[1] https://en.wikipedia.org/wiki/Write-only_language

[2] https://www.jsoftware.com/papers/Dijkstra_Letter.htm


Point 5 is not true. APLers regularly communicate verbally and on paper in APL or pseudo-APL. See [0], paragraph after the bullet points. I've participated in many conversations that were as much spoken APL as English, as well as heard off-hand references to writing paper notes in APL-like notation.

[0] https://aplwiki.com/wiki/Edsger_W._Dijkstra#APL_by_Dijkstra....


I agree #5 may not apply to some people. If a small community of people is able to use APL to do things that can't be done by other means, that would be an effective refutation of this argument, or at least an important caveat to the scope of its claims. Maybe APL is truly an effective tool of thought for at least this small group of people.

I would be interested in your response to this post [1], which is on that point.

I noted the Ackermann function and inverted index results in Hui's response to Dijkstra. I can't understand the APL so can't be sure how novel or significant these results are, but I would predict that such properties of the Ackermann function and inverted tables/indices would have been known long before these derivations. These may just be examples of pre-existing knowledge being encoded into APL logography.

[1] https://news.ycombinator.com/item?id=28967730


I don't claim that APL is necessarily useful to everyone. I think the fraction of people who could use array-based notation effectively is not small, but there's no hard evidence for this.

APL has not to my knowledge been used to prove novel mathematical results. Has any programming language? It's not very interesting to note that a language that's unknown to working mathematicians doesn't find much use in cutting-edge mathematics. However, APL is closer to the notation that is used for these discoveries than nearly any other programming language (Mathematica is the exception that comes to mind).

There are some historical APL successes to point to. It was used to design IBM's highly-regarded System/360[0]; as the first APL implementation ran on the 360, this was of course done entirely on paper. I believe some early time-sharing systems were written in APL as well, though I don't have a source. STSC's Mailbox[1] was one of the earliest email systems[2]. More recently Aaron Hsu has used APL to create what seems to be the first data-parallel compiler[3]. While data-parallel is rigorously defined (polylogarithmic time given infinite processors), the vagueness of "compiler" means this isn't a result of interest to mathematicians.

I myself have used APL to make what I consider to be a significant discovery about sequences of natural numbers[4]. It unifies methods that I've come up with ad-hoc in the past, and I've now used this framework dozens of times to more quickly solve a variety of problems. I don't expect you to understand the APL-heavy description, but isn't that what you asked for? I and many others think in APL. It just seems natural to us. Here's a quote along those lines[5]:

> And I looked at the code that I’d written, and it’s about 150 lines or so, and it was complicated, complicated stuff. I went to sleep, and I had a dream, and in the dream, it told me how to approach this in a whole different way that I had done before. So I woke up, I sat down at the computer at six o’clock in the morning, and by noon I’d rewritten the whole thing from scratch, pretty much. I kept 7 lines of code that were tangential to that, and it was all correct. It passed the QAs like that. So sure, you can write for loops in your sleep, but I can write entire correct programs in my sleep!

[0] https://dl.acm.org/doi/10.1147/sj.32.0198

[1] https://forums.dyalog.com/viewtopic.php?f=30&t=1629&p=6415

[2] https://en.wikipedia.org/wiki/History_of_email

[3] https://scholarworks.iu.edu/dspace/bitstream/handle/2022/247...

[4] https://aplwiki.com/wiki/Partition_representations#Unificati...

[5] https://www.arraycast.com/episode-0-transcript


Your point 5 is completely wrong.

'The process of coding something into APL's idiosyncratic logography can only happen after the idea is already clear by other means.'

In fact, often I will write something in APL before translating it to other languages, not the other way round. When I was a beginner, I did translate things as I hadn't learned to properly 'think' in APL yet. Perhaps you have tried it, and not got past that initial barrier? In which case, my advice would be to stick with it for a bit.

'And virtually no one finds themselves understanding an algorithm via APL code which they did not already understand via more accessible languages or pseudo-code.' is also untrue.

An example of a time APL like notation has helped me to understand an algorithm is Kadane's algorithm for a maximum subarray sum. There are probably hundreds of articles and videos out there explaining it, suggesting that it's probably something quite a few people find unintuitive.

However, in k, an APL descendant, the entire algorithm is expressible as |/0(0|+)\

To anyone who knows APL/k this is instantly readable and intuitive. This was the first thing that really made it 'click' for me.
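For readers without k, a hedged Python transcription of the same scan-then-reduce idea (the function name is mine): the k expression scans a running sum floored at zero, then takes the maximum of that scan.

```python
# A direct Python rendering of the k idea |/0(0|+)\ :
# scan a cumulative sum that is floored at zero, then max-reduce.
def max_subarray_sum(xs):
    best = cur = 0
    for x in xs:
        cur = max(0, cur + x)   # the (0|+)\ step: running sum, floored at 0
        best = max(best, cur)   # the |/ step: maximum over the scan
    return best

max_subarray_sum([-2, 1, -3, 4, -1, 2, 1, -5, 4])  # 6
```

Like the k version, this allows the empty subarray, so an all-negative input gives 0.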

Ok, APL hasn't been widely adopted and there are definitely some downsides and problems with it, but it's still useful and definitely not write-only.


> However, in k, an APL descendant, the entire algorithm is expressible as |/0(0|+)\

The conclusion notwithstanding, an ability to concisely express the algorithm doesn't relate to an ability to actually understand it. Personally I found Kadane's algorithm intuitive as is and the concept of partial sum seems more important for the intuitive understanding of the algorithm. Array languages are surely benefited from built-in partial sum, but that'd be arguably same for any other language with readily accessible partial sum.


> an ability to concisely express the algorithm doesn't relate to an ability to actually understand it.

Exactly! Computationally, Kadane's algorithm is only a few simple steps. It's easy to write the steps down in a few lines of any language, and equally easy to understand what the individual steps are doing. The hard part about it isn't that it's too long and needs to be compressed into 8 symbols to be understood. The hard part is understanding why it is correct.


Maybe conciseness isn't the only thing that determines understandability, but it's definitely highly related. (A 1 page proof is generally easier to read than a 100 page proof, no?)

Kadane's algorithm was a simple example, and sure, it's not hard to understand in any language. My point was that the K version for me is the clearest and most obvious I've seen, and that understanding why it's correct is much easier starting from that than from the equivalent in say, Python or C++.


> Maybe conciseness isn't the only thing that determines understandability, but it's definitely highly related

Making something short is different from making it concise. The Python version is perfectly concise. There are no distractions or boilerplate. Making it shorter doesn't make it more concise beyond that point.

> My point was that the K version for me is the clearest and most obvious I've seen, and that understanding why it's correct is much easier starting from that than from the equivalent in say, Python or C++.

I think this very unlikely to be correct. The K and the Python are saying the same thing. Understanding why it works is the same problem from either starting point.

Now, I will admit that it's helpful to explicitly express "cumulative sum which is floored at zero" in your implementation, and K pushes you towards that solution, while Python and other languages don't force you into looking at it that way. But that's more about alignment of the most transparent formulation with the language primitives, than it is about shortness.


Generally, in all the imperative solutions I've seen, there are more 'moving parts': manually updating a maximum variable, explicit iteration, etc. These, to me, are distractions and boilerplate compared to a functional/array-language solution.

Sure, they're saying the 'same thing', but that doesn't mean both ways are equally understandable. They're definitely not to me, and probably not to others who know array languages either.


Maybe brains just work differently here. I agree that it's nice to have array primitives like sum(x) and cumsum(x) instead of just for loops, and you can indeed have those in most languages. And it's very cool that APL type languages can concisely express a wider range of array primitives than other languages. But... in this Kadane situation, I really, really don't find that the for loop makes the difference between understanding it and not understanding it. That's mind-blowing to me, quite honestly.

More generally, based on my MATLAB experience, I also find that expressing everything in array primitives clearly, clearly doesn't scale beyond a certain level of complexity, and it's very important to have the ability to resort to for loops beyond that. You can do a lot in the array processing paradigm but it quickly becomes obfuscated if the problem doesn't fit the primitives well.

The array paradigm can really be a straitjacket, and this helps to explain why APL and related languages have been successful in e.g. financial time series computation, where those primitives are well-suited, but have not gained traction in general-purpose programming.


That's a nice subtle insult there - I can obviously understand what a for loop does.

You can resort to explicit loops in array languages anyway. I doubt the lack of looping mechanisms is why they haven't been more successful in general-purpose programming; it's probably more to do with unfamiliarity plus the historical lack of open-source implementations, I would guess.


It's not meant to be an insult, I'm trying to reflect what you're saying. You seem to be saying it's much harder to understand the algorithm in the for loop formulation. That's mind-blowing to me. I feel like it's a very small difference, and the same concept comes across in both formulations, although I agree it's a bit nicer in the K version.

If that's how your brain works, and how a lot of APL lovers' brains work, I definitely get how APL would be strongly preferable to folks like yourself. To each his own, maybe.

I think there can be a lot of diversity in how smart people's brains work, so it's really not meant to be insulting or condescending. It's hard for me to understand but I can accept it if that's how it is.

We all want our languages to solve problems with concise, easy-to-understand, nice-looking code, but the considerations that go into that can perhaps be very different by individual.


Ok, my bad, I misinterpreted your comment a bit, sorry about that.


> A 1 page proof is generally easier to read than a 100 page proof, no?

No. You can see the reason from an information-theoretic angle: highly compressed data is essentially indistinguishable from random data. Such data can only be recovered with a huge amount of prior information (in this case a decompressor), and any chance of recovering it without that information is slim. Most human-readable data (not just code, but general language and so on) is highly redundant for this reason.


No one's suggesting to compress code. I was just saying in general, shorter things tend to be clearer and easier to read. (as long as it's not overly 'golfed' or obfuscated)


> as long as it's not overly 'golfed' or obfuscated

This restriction is very subjective. For example, APL and its descendants can be thought of as golfed by default (indeed they tend to be far shorter in competitive golfing; see for example [1]). I'm not saying that they are unusable! I just want to point out that being shorter doesn't generally mean much without further context.

[1] https://code.golf/rankings/langs/all/bytes


They can definitely be much shorter than other languages in golfing, no doubt about that. I wouldn't say they're golfed by default though really.


> APL is a write-only language. [1]

I've never seen an argument against APL that wouldn't also apply to Kanji and Chinese script - notations with billions of readers.


The main problem with your thesis is that Iverson notation was used for at least 5 years prior to being turned into the programming language APL, purely as a mathematical notation.

See also the 1964 IBM Systems Journal document in Volume 3, Number 3, "A Formal Description of System/360" with Iverson, Falkoff and Sussenguth as authors.


I don't see Iverson's intentions as relevant here. He may have wanted APL to be a tool of thought, but it wasn't really picked up as such. APL the computational tool, and its descendants, can claim some successes, but the "tool of thought" concept didn't gain traction for the vast majority of technical people.

It's like if an important human rights document was written in Esperanto. The document might be indisputably important, while its language and the intentions behind it are an incidental historical detail.


i regularly use K (an APL descendant) as "pseudo-code".

in fact, i regularly _think_ about problems in an array-programming fashion, heavily inspired by K. when jotting them down, i often use K. when turning them into actual code to run, it's mostly immaterial if i then use K, J, numpy, julia, etc. the general approach translates well into a multitude of array programming varieties.

as such, i don't believe your point 5 holds. this way of thinking is certainly off the mainstream, as is, e.g., constraint programming, but there are niches where people use it highly productively. you won't necessarily glean that from the output coming from those niches, though.


You can write APL or J in a way that is readable. Part of the problem is that most people's exposure to array languages is in the context of code golf or leetcode-style problems.

In some areas, if you're familiar with the language, it's much more readable than other languages due to being more concise.

For example, if you wanted to calculate the sample standard deviation

```

mean =: +/ % #

sum_of_squares =: +/@:*:@:(] - mean)

stdev =: %: @: (sum_of_squares % <:@:#)

```

Is pretty readable, I think. Of course, if you don't know the primitives, then it doesn't look readable.

(I'm on my phone, so I may have typed something wrong.)
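For comparison, a rough Python rendering of what the J above computes, under my reading that it is the sample standard deviation (divide by n-1, then take the square root):

```python
import math

def mean(xs):
    # J: mean =: +/ % #  (sum divided by count)
    return sum(xs) / len(xs)

def sum_of_squares(xs):
    # J: sum_of_squares =: +/@:*:@:(] - mean)
    # (subtract the mean, square, then sum)
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

def stdev(xs):
    # J: stdev =: %: @: (sum_of_squares % <:@:#)
    # (divide by count-minus-one, then square root)
    return math.sqrt(sum_of_squares(xs) / (len(xs) - 1))

stdev([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

The tacit J and the explicit Python describe the same pipeline; the J just names the composition rather than the intermediate values.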


Two spaces in front of each line of code to make a code block, HN does not use markdown. (_ as spaces):

  __mean =: +/ % #
Becomes:

  mean =: +/ % #
Don't use the surrounding ```...``` because it just becomes noise, and you don't need extra newlines, two lines of code can be adjacent to each other if you prefix each with two spaces:

  line 1
  line 2, no extra newline was used
https://news.ycombinator.com/formatdoc - a very short document describing all the formatting options for HN comments. There aren't many.


I appreciate the hint about code blocks. However, the point of markdown is to be readable without special rendering, so I really don't care that much.

I also disagree about not using ```. It's useful when writing markdown in an interface that swallows whitespace.


Except markdown is not easier to read if you clutter your comment (as I've seen other commenters here do) with something like:

```

foo

bar

baz

```

```

note all the

extra

vertical

whitespace because regular paragraphs get a slight

gap

```

```

let's

throw

in a

third block

for some reason

```

As code blocks:

  foo
  bar
  baz

  note all the
  extra
  vertical
  whitespace because regular paragraphs get a slight
  gap

  let's
  throw
  in a 
  third block
  for some reason
The former becomes a problem in the discussion because it clutters the space and is harder to read than the latter, especially when you have multiple blocks around the discussion (in one comment or spread out across comments). You may not care, but using code blocks that offer a more compact and clear expression is certainly more courteous.


> You may not care, but using code blocks that offer a more compact and clear expression is certainly more courteous.

I disagree, and so it's not likely "certain".

I do think that nitpicking over formatting is rather unproductive and rude, though. And it has wasted considerably more space than a format that you happen to dislike, but that others are fine with.

Like I said, I don't care. If you do, then that's on you.


Also, Markdown doesn't support ``` fenced code blocks; that's a common misconception. I don't know which Markdown derivative originated them (maybe GFM?) but they ended up in CommonMark.


> Markdown doesn't support ``` fenced code blocks

There isn't a single markdown. There are many different flavors, and all of them are markdown. Many include ``` (GitHub, KDoc, pandoc, reddit, CommonMark), especially as it allows you to specify the language if you want. I actually don't know of any tool I use that doesn't support them.


New Reddit (aka, Really Obnoxious Interface Reddit) uses that as well. Since Old Reddit is the only useable version of Reddit (IMHO), it's really annoying to see people make such extensive use of it as it gets butchered for Old Reddit users.


> 3. When people think about algorithms, they write code or pseudo-code.

Or draw boxes and arrows on a whiteboard or pieces of paper. But I agree, they usually don't write APL or anything like it.


It may be a tool, but you know? Jewelry hammers are also tools, but too specialized to be used by many. I guess the same happens with APL.


I do not agree. I have never used an APL interpreter or compiler, but nonetheless I use it as a pseudo-code, so instead of no one, there is at least one user.

Whenever people claim that APL is write-only they give some convoluted example with a bunch of obscure APL operators that are seldom used, so even for APL users it might be difficult to remember what they do.

For the majority of the problems that a programmer might need to solve every day, only a small fraction of APL might be needed, which is simple to understand, learn and remember.

As an unfamiliar notation, you might need to learn 5 or 6 extra operators at most, and for the rest you just use the same ones that you use in any programming language, only with additional rules that allow more freedom than in the popular programming languages.

The main difference between APL and conventional languages is that you no longer have to write "for" loops, you just write the same expressions that you would write for scalar variables.

I cannot see how one would claim that omitting all the redundant boilerplate code needed by "for" loops or "map" functions decreases the readability of a program.

If you do not want to learn 20 new operators, there is no need for that, the base APL language does not need them. There are only very few operators that are strictly needed, e.g. reduction, inner product, outer product, rank, index.

If you like verbose identifiers you could use names for them, the symbols are not the essence of APL, only its rules for writing expressions.

The 3 simplest rules are:

1. a function of 1 argument applied to an array is the array of the values obtained by applying the function to the array elements.

2. a function of 2 arguments applied to a scalar and to an array is the array of the values obtained by applying the function to the scalar and to each array element.

3. a function of 2 arguments applied to 2 arrays of identical ranks and dimensions is the array of the values obtained by applying the function to each pair of corresponding array elements.

I do not see anything obscure and unintuitive in such simple rules. (It should be obvious that these are particular cases of a general rule that in a function of N scalar arguments, you should be able to replace 1 or more of the scalars with arrays of the same rank and dimensions and then the function result shall also be an array of the same rank and dimensions.)
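The three rules above can be sketched in a few lines of plain Python, treating flat lists as "arrays" (the function names are mine, purely illustrative):

```python
# Rule 1: a unary function applied to an array maps over its elements.
def apply1(f, a):
    return [f(x) for x in a]

# Rule 2: a binary function applied to a scalar and an array pairs the
# scalar with each element.
def apply2(f, s, a):
    return [f(s, x) for x in a]

# Rule 3: a binary function applied to two arrays of identical shape
# pairs corresponding elements.
def apply3(f, a, b):
    assert len(a) == len(b), "arrays must have identical dimensions"
    return [f(x, y) for x, y in zip(a, b)]

apply1(lambda x: -x, [1, 2, 3])                    # [-1, -2, -3]
apply2(lambda s, x: s + x, 10, [1, 2, 3])          # [11, 12, 13]
apply3(lambda x, y: x * y, [1, 2, 3], [4, 5, 6])   # [4, 10, 18]
```

In APL no wrapper functions are needed: the same three behaviours are built into every scalar function.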

On the contrary I find it hard to understand why these rules are not valid in any programming language and you are forced in most of them to write loops for such trivial operations.

There is only one real barrier for using the APL notation, but in my opinion those who surpass it are rewarded.

The order of evaluation for APL expressions is different from the order learned in school, which is then extended in various ways in the programming languages.

In APL there are no precedence levels. The right operand of an operator is everything that is to the right, until the end of the expression.

So the operators are evaluated from the right to the left (i.e. in reverse order compared to the traditional order for e.g. addition), unless you use parentheses. This order is frequently more readable than the traditional, because usually when browsing a program you do not need to read the expressions until their end. The important parts are at the left end and it is normally enough to read only those. With the traditional order you might need to scan the line much farther to the right to be sure that you did not miss something important.
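As a toy illustration of that evaluation order (a sketch of the rule, not of how any APL implementation actually parses), here is a tiny right-to-left evaluator over flat token lists:

```python
# Toy evaluator for the APL rule: no precedence levels, and the right
# operand of each operator is the value of everything to its right.
OPS = {
    '+': lambda a, b: a + b,
    '-': lambda a, b: a - b,
    '*': lambda a, b: a * b,
}

def eval_rtl(tokens):
    # tokens alternate number, operator, number, ... e.g. [2, '*', 3, '+', 4]
    if len(tokens) == 1:
        return tokens[0]
    left, op, rest = tokens[0], tokens[1], tokens[2:]
    return OPS[op](left, eval_rtl(rest))   # recurse on everything to the right

eval_rtl([2, '*', 3, '+', 4])   # 14: read as 2 * (3 + 4)
# conventional school precedence would give (2 * 3) + 4 = 10
```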

This obstacle of the different evaluation order discourages many of those who could benefit by using APL as a pseudo-code, but once you understand that this modified order can really simplify most expressions and you become adept at reading it as easily as the traditional order, you can use APL as a pseudo-code with good results.


Thanks for these interesting thoughts. One quick question: how is rule 2 a special case of the general rule? I think I'm missing something in your explanation of rule 2. Is it just the scalar+array version of rule 3, for a function f(x, y) of two scalar arguments?

Could you then say that one of the big ideas of APL is to define a canonical way of extending functions of scalars f(x, y, ...) to take one or more array arguments, and then baking those ways into the syntax, so you don't have to use zip and map?

I agree there's something to that. It seems like a feature that other languages could benefit from. Maybe a decorator symbol could be used to tell the language when you want to do this, or a setting so it's turned on by default in a particular module. We probably wouldn't want it as an immutable universal language rule, since passing an array where a scalar argument is expected is a common programming mistake that you could want your compiler or runtime to throw an error for.
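The opt-in decorator idea could look something like this in Python (the `lift` name and the exact extension rules are assumptions based on the discussion above):

```python
def lift(f):
    """Hypothetical opt-in decorator: extend a two-argument scalar
    function to accept lists elementwise, rather than making scalar
    extension a universal language rule."""
    def wrapped(x, y):
        if isinstance(x, list) and isinstance(y, list):
            if len(x) != len(y):
                raise ValueError("arrays must have identical dimensions")
            return [wrapped(a, b) for a, b in zip(x, y)]
        if isinstance(x, list):
            return [wrapped(a, y) for a in x]
        if isinstance(y, list):
            return [wrapped(x, b) for b in y]
        return f(x, y)
    return wrapped

@lift
def add(x, y):
    return x + y

print(add(1, [10, 20]))       # scalar + array: [11, 21]
print(add([1, 2], [10, 20]))  # array + array: [11, 22]
```

Undecorated functions would keep the ordinary behavior, so passing an array to a scalar-only function would still be an error.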


Rules 2 and 3 describe how to derive a function with array arguments from a function with 2 scalar arguments.

You can replace one scalar with an array by rule 2 and both scalars with arrays by rule 3. This should be extended to an arbitrary number of scalar arguments, where you should be able to replace any scalar with an array whose elements have the same type as the scalar and any combination of scalar and array arguments should be valid.

This is the first step in using APL, by defining only scalar functions, but using them also with array arguments.

This only works with arrays of the same dimensions, where the function arguments are elements in the same positions, i.e. which would have had the same indices in a "for" loop.

This already covers the most frequent uses.

Only when you need more complex ways of combining arrays do you begin to need specific APL operators, e.g. reduction, which transforms an array into an array whose rank is 1 less (as when you sum the elements of a vector into a scalar value). But the APL operators are completely general, so with the reduction operator you can apply any function of 2 arguments to a vector or to the subvectors of a larger array.

Reduction operations are one of the places where it is obvious that the APL order of evaluation is better than the traditional one, because it provides useful operations for the non-commutative functions: e.g. reduction with subtraction is the sum with alternating signs, and reduction with division is the product of the fractions obtained from pairs of consecutive elements. Both operations are frequently useful, unlike what would have been obtained with the traditional order of evaluation.
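A right fold in Python makes the point (hypothetical `fold_right` helper; APL's `-/` and `÷/` behave this way):

```python
import operator

def fold_right(f, xs):
    """APL-style reduction f/xs: fold from the right, so that
    f/ a b c  means  f(a, f(b, c))."""
    acc = xs[-1]
    for x in reversed(xs[:-1]):
        acc = f(x, acc)
    return acc

# -/ 1 2 3 4  is  1-(2-(3-4)) = 1-2+3-4, the alternating sum:
print(fold_right(operator.sub, [1, 2, 3, 4]))      # -2
# ÷/ 8 4 6 3  is  8÷(4÷(6÷3)) = (8/4)*(6/3), a product of ratios:
print(fold_right(operator.truediv, [8, 4, 6, 3]))  # 4.0
```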

The same happens with the other APL operators: they are generalized versions of the traditional operations. For example, the inner product operator can do dot products or matrix multiplications, but it can be used with any pair of functions, not only with addition and multiplication.
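The generalized inner product f.g can be sketched the same way (hypothetical `inner` helper; `+.×` gives the ordinary dot product, but any pair of functions works, e.g. min-plus):

```python
import operator

def inner(f, g, xs, ys):
    """APL-style inner product f.g on two vectors: combine
    corresponding elements with g, then reduce with f (right fold)."""
    terms = [g(x, y) for x, y in zip(xs, ys)]
    acc = terms[-1]
    for t in reversed(terms[:-1]):
        acc = f(t, acc)
    return acc

# +.x  -- the ordinary dot product: 1*4 + 2*5 + 3*6
print(inner(operator.add, operator.mul, [1, 2, 3], [4, 5, 6]))  # 32
# min.+ -- the "min-plus" product used e.g. in shortest-path algebra
print(inner(min, operator.add, [1, 2, 3], [4, 5, 6]))           # 5
```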

Like any use of overloaded functions, you are right that this would no longer be flagged as an error when you use an array of e.g. int instead of an int, but usually this would be caught eventually, e.g. if you try to assign the result to a scalar, unless you do it in an initialization with auto inference of the declared type, which would probably still give a type conflict later.

I agree that this is a possible disadvantage, but I have never encountered an error of this kind, with overloaded functions or operators, that was not flagged eventually as a type error, even if possibly not in the first place where the error was introduced.


Thanks, this is great and much more meaningful to me than the generic "tool of thought" hype I'm used to. APL is extending scalar functions with implicit map and reduce operators, and doing it in a way that's more general, consistent, and concise than, say, MATLAB or R's system of rules for scalars/vectors/arrays, or NumPy's array broadcasting rules.


Yes, you have expressed the facts very well.


It is a tool of thought _for smart people_. But thanks for that, um, like, uh totally awesome Wikipedia link! +1


> It is a tool of thought _for smart people_.

As I said, if this were really true, there would be evidence of people using it to express thoughts that are too hard to express by other means. But in reality, people who think in APL just don't seem to be on the forefront of published algorithmic thought.

It might be helpful to contrast APL with nonstandard analysis, a mathematical tool of thought for doing rigorous real analysis more easily. People have proved new and useful results in nonstandard analysis, and the fact that these results were first proved in nonstandard analysis is at least some evidence that nonstandard analysis served as a real tool of thought to produce those results. (To be fair, the idea that nonstandard analysis was an essential tool in these proofs is controversial among mathematicians. But there is at least the hard fact of the results being proved first by this means.)

So, if smart people are able to use APL to push the boundaries of algorithmic thought, we'd expect APL-ers to have that sort of achievement to show for it. But we don't see that. After pre-existing thoughts are translated into APL's logography, there does not seem to be any demonstrable enhancement of the ability to reason about algorithms. Or maybe there is some enhancement for that individual person, relative to their abilities without APL, but it does not seem to cause that person to rise above the general human level of algorithmic ability.

"APL as a tool of thought" seems to be a beautiful theory whose predictions have not met with reality.


What sort of evidence are you expecting there to be? A tool of thought would usually, uh, stay as thought. I'd expect a good percentage of APL users are using it as such, but making a big deal out of that just doesn't really make any sense for any given individual. At least in the couple array language chat rooms I'm in, pseudocode-ish APL is somewhat frequently used to convey thoughts, and I certainly use arraylangs as such too.

APL doesn't magically double your ability to think though. Pushing boundaries takes a lot more than raw thought power. And APL is used so little that you couldn't make any correlations even if APL increased thinking ability tenfold anyways. But it still is a very useful tool (among many other tools) from time to time.

I personally don't see myself using any array language as a general purpose language for everything though. They're useful (extremely useful!) for things that map well to array operations or benefit from terseness, and anything from "meh" to awful for other things.


> As I said, if this were really true, there would be evidence of people using it to express thoughts that are too hard to express by other means.

Like Arthur Whitney you mean?


Nice opinion piece. I think in J when doing data munging, BTW...


point 5 should be:

5. In reality, essentially no one uses APL.

and your reasoning flies out the window as a result.


There are many reasons people might choose not to use APL as a programming language. There are fewer reasons why they are not choosing to use it as the "tool of thought" it is supposed to be.



Consider that without a good tool it can be hard to think about or discuss certain subjects. (And thus they become, to a degree, invisible.)

For example, I have seen conversations where altered states of consciousness are discussed. Drugs, meditation, etc.

I've seen it characterized various ways via cartoons, impressionistic drawings, frames from movies. But never really well done.

Such an important subject. But really no good tool for talking/thinking about it.

Such a tool would be a class of invention. A variety of technical writing.

Consider the creative metaphors used in Physics for talking about strange subjects like time, space, gravity. Consider the "Feynman Diagram".


(1979)



