
J Notation as a Tool of Thought - janvdberg
https://www.hillelwayne.com/post/j-notation/
======
nonbirithm
J and K remind me of the essay about the Lisp Curse[0], which argues that
the expressiveness of the language became a sort of Achilles heel for its
culture. It describes people writing their projects in Lisp without
expecting other people to adapt to their conventions or combine their efforts
on one library; every solution worked well enough initially, but only
for one person, its author.

In K the error messages are far harder to understand than in most languages,
and you'd need either enough comments or many hours of experience with the
language to understand what a piece of code does, due to its sheer
expressiveness. Also, there's comparatively little documentation for when
you're trying to do something practical. I can't imagine introducing J or K in an
open source project will be helpful if other contributors have to catch up to
my understanding of the language and need to internalize and remember which
single-character symbols mean "print" or "iter." I imagine that at least a
Python programmer could reasonably understand the gist of a similar Ruby
program, merely due to the fact that "class" and "map" and "length" are
sequences of characters that usually show up in both languages. Not so with K.

Still, I find the vastly different ways of doing things with K interesting to
see, even if I will only use it personally.

[0]
[http://winestockwebdesign.com/Essays/Lisp_Curse.html](http://winestockwebdesign.com/Essays/Lisp_Curse.html)

~~~
coliveira
That is fine. Most people simply cannot drive a Formula 1 car, but it would be
stupid to conclude that an F1 is a cursed car and nobody should want to drive
one. Similarly, a few languages can only be used by highly trained people who
understand well how they work. Lisp, Forth, and J are in that category of tools
that require highly trained people. In their hands, one of these languages can
do fantastic things; in the hands of an average programmer you have nothing.

~~~
mandus
That is actually a nice analogy! No one, not even a pro Formula 1 driver,
would want to drive an F1 car in regular traffic. Much the same with these
languages (APLs, Lisps, Forth, etc.); they have their place and role, but are
better not used in regular open source or commercial code.

~~~
dunefox
That is absolutely false. I have seen absolutely horrible "clever" code hacks
in Java production code - absolutely opaque and extremely complicated.

Lisp is definitely a good choice for 'regular open source or commercial code'.

------
siraben
This was a nice tour of how concise J can be. To see more of this stuff I
highly recommend the Concrete Mathematics Companion[0], which uses J as an
algebraic tool to reason about the correctness of derivations in that
excellent book by Knuth.

[0]
[https://www.jsoftware.com/books/pdf/cmc.pdf](https://www.jsoftware.com/books/pdf/cmc.pdf)

------
BiteCode_dev
J should be a DSL for arrays embedded in other languages, the way regexes are
a DSL for text.

------
gspr
The article has the footnote

> APL was the first language to use “monad” as a term. The popular FP meaning
> only appeared thirty years later.

in its (to me, odd) use of the word "monad". This is incorrect. The "popular FP
meaning" arose because it's a very special case of the standard concept from
mathematics [1]. The mathematical terminology harkens back to the 1950s, and
thus predates APL.

In Haskell, a monad is precisely a monad in the mathematical sense in the
special case of the category of Haskell types (which, due to technical
complications isn't exactly a perfect model for the actual Haskell types, but
it's pretty close). It thus seems dishonest to me to refer to this use of the
word "monad" as somehow being newer than APL's use.

[1]
[https://en.wikipedia.org/wiki/Monad_(category_theory)](https://en.wikipedia.org/wiki/Monad_\(category_theory\))

~~~
4bpp
> The mathematical terminology harkens back to the 1950s

Does it? The earliest instances I can find are in Mac Lane's 1971 book or
thereabouts; the older term was "triple". Wikipedia dates APL to 1966.

"Monad" has also seen nebulous use in philosophy at least since Leibniz.

~~~
notagoodidea
In fact it goes back to the Pythagoreans [1], for whom the _monad_ comes
first, then the _dyad_, then the numbers, and so on. It was also a term for
unicellular organisms before the modern biological terminology was coined.

"Monadic" was an adjective used in logic before the 20th century, derived
from the Greek root _monas_, which gave both monad and monadic (and likewise
dyad and dyadic).

If what is on the French [2] and English [3] Wikipedia pages about the
terminology's history is right, the mathematical usage seems to date to around
the 1970s, when Saunders Mac Lane named it in reference to the philosophical
term.

I feel people sometimes forget how interconnected mathematics and philosophy
are and were, and how terms from philosophy were used in a mathematical
context based on their philosophical definitions.

[1]
[https://en.wikipedia.org/wiki/Monad_(philosophy)](https://en.wikipedia.org/wiki/Monad_\(philosophy\))

[2]
[https://fr.wikipedia.org/wiki/Monade_(th%C3%A9orie_des_cat%C...](https://fr.wikipedia.org/wiki/Monade_\(th%C3%A9orie_des_cat%C3%A9gories\))

[3]
[https://en.wikipedia.org/wiki/Monad_(category_theory)#Termin...](https://en.wikipedia.org/wiki/Monad_\(category_theory\)#Terminological_history)

------
chrispsn
I agree with most of this article, except that - for me - grouping by table
columns is a more intuitive and flexible way to control subset application
than 'rank'.

k9 is in development, but blending k7 and k9 we'd get something like:

    
    
       t: [[]sensor:  1 2 1 2 1 2
             day:     1 1 2 2 1 1
             reading: 1 2 3 4 5 6]
      
       0 1 1 0 * select by sensor, day from t
      sensor day|             
      ------ ---|----------------
      1      1  |[[]reading: 0 0]
      1      2  |[[]reading: ,3]
      2      1  |[[]reading: 2 6]
      2      2  |[[]reading: ,0]

~~~
wvlia5
This is 'pivot' in python/pandas and excel
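
For the curious, a rough pandas sketch of the same grouping (my own construction, not from the thread; the data frame `t` mirrors the k example above, and the mask multiplication step is left out):

```python
import pandas as pd

# Hypothetical sensor data mirroring the k example above
t = pd.DataFrame({
    "sensor":  [1, 2, 1, 2, 1, 2],
    "day":     [1, 1, 2, 2, 1, 1],
    "reading": [1, 2, 3, 4, 5, 6],
})

# Group readings by (sensor, day); aggfunc=list collects each group's
# readings into a list, much like the k table of sub-tables
p = t.pivot_table(index="sensor", columns="day", values="reading",
                  aggfunc=list)
print(p)
```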

------
justusw
In J, I think it's very difficult to move on from the initial stage of dealing
with "line noise" as the author mentions.

For me, this was further complicated by the fact that error messages are very
hard to understand in the beginning.

I've written an article that describes how I typically deal with solving most
of the domain/rank/length errors:
[https://www.justus.pw/posts/2020-07-31-errors-in-j.html](https://www.justus.pw/posts/2020-07-31-errors-in-j.html)

------
ipsum2
Interesting blog post. All of these are implemented as trivial operations in
numpy, so while J or APL might be the originator of the syntax, this 'tool of
thought' is not as esoteric as the author makes it sound, and learning J if
you already know numpy (or TensorFlow or PyTorch or..) isn't that remarkable.

    
    
      >>> np.array([1, 2, 3]) * 2
      array([2, 4, 6])
    
      >>> np.array([1, 2, 3]) * np.array([4, 5, 6])
      array([ 4, 10, 18])
    
      >>> np.arange(0, 16).reshape(4, 4) + np.array([5, 5, 5, 5])
      array([[ 5,  6,  7,  8],
             [ 9, 10, 11, 12],
             [13, 14, 15, 16],
             [17, 18, 19, 20]])

~~~
moonchild
You've reproduced the most trivial example, which many languages make easy. I
would be very interested to see another language with a rank operator, for
instance.

    
    
    

Challenge for you: rewrite a nontrivial program in one of those frameworks,
with the following restrictions:

- No iteration (including implicit iteration: map, filter; reduce is ok)

- No loops whatsoever. Recursion is ok, but should be avoided wherever
possible.

- No explicitly named arguments; everything in point-free style.

(I know, map/filter can be implemented recursively. But compared with the
equivalent constructs in apl, they're about 10× as verbose, and harder to
reason about and understand. Even reduce is somewhat of a gimme.)

~~~
SuperCuber
Rank is actually done implicitly by numpy, using a mechanism called
broadcasting. For example:

    
    
      >>> np.array([10, 20, 30]) + np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
      array([[11, 22, 33],
             [14, 25, 36],
             [17, 28, 39]])
    

Sieves exist in numpy, called masks:

    
    
      >>> np.array([10, 20, 30]) > 15
      array([False,  True,  True])
    

Of course they can be operated on just like any other numpy array.

Grades exist in numpy:

    
    
      >>> np.array([5, 4, 3, 2, 1]).argsort()
      array([4, 3, 2, 1, 0])
    

Of course they are a little more verbose since all of those operations are
from the library and not native to python.
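
To round this out, a tiny runnable sketch (my own, not from the thread) showing a mask used as a filter and a grade used to sort:

```python
import numpy as np

a = np.array([10, 20, 30])
mask = a > 15            # boolean "sieve"
print(a[mask])           # selects the elements where the mask is True

b = np.array([5, 4, 3, 2, 1])
g = b.argsort()          # "grade": the indices that would sort b
print(b[g])              # indexing by the grade yields the sorted array
```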

~~~
moonchild
> Rank Is actually implicitly done by numpy using a mechanism called
> broadcasting.

Numpy's broadcasting is _scalar_ conformability, to which the rank operator
(and general conformability) provides a general case. Example in j:

    
    
         ] x =. i. 2 3 4                                                                                                                   
       0  1  2  3
       4  5  6  7
       8  9 10 11
    
      12 13 14 15
      16 17 18 19
      20 21 22 23
         ] y =. i. 3 4
      0 1  2  3
      4 5  6  7
      8 9 10 11
         x + y         NB. this will error because + expects that, if its arguments' shapes are not the same, one will be a prefix of the other
      |length error
      |   x    +y
                       NB. this is easy enough to fix, however
         x +"2 y       NB. +"2 is shorthand for +"2 2; meaning, choose rank-2 arrays from both the left and right arguments
       0  2  4  6
       8 10 12 14
      16 18 20 22
    
      12 14 16 18
      20 22 24 26
      28 30 32 34
    

Numpy will actually do this without the rank operator, because it uses suffix
agreement rather than prefix agreement (which is absolutely bonkers; j used
suffix agreement for about 5 minutes in 1990, before realising it was an awful
idea). For numpy, see if you can add:

    
    
      np.array([[[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]], [[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]]) + np.array([[0, 1, 2], [3, 4, 5]])
    

Intelligently. (I'm sure it's not overly difficult to come up with _a_
solution, but can you do it with a single higher-order function call which
generalises to other argument shapes?)

(The j equivalent, (i. 2 3 4) + (i. 2 3), also works without trouble.)
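
One numpy answer, offered as a sketch rather than the general higher-order solution the challenge asks for (it handles these particular shapes rather than generalising the way rank does), is to append a length-1 axis so the shared prefix lines up:

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)   # shape (2, 3, 4)
b = np.arange(6).reshape(2, 3)       # shape (2, 3)

# numpy aligns trailing axes, so a + b fails; appending a length-1
# axis to b makes the shared (2, 3) prefix line up instead
c = a + b[:, :, None]                # b[:, :, None] has shape (2, 3, 1)
print(c.shape)                       # (2, 3, 4)
```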

    
    
    

Another example, which may be more illustrative, is the ability to perform
reductions along arbitrary axes. For example:

    
    
         ] x =. i. 4 3
      0  1  2
      3  4  5
      6  7  8
      9 10 11
         +/ x       NB. sum reduced along leading axis, the default, producing an array of shape 3
      18 22 26
         +/"1 x    NB. sum each rank-1 array (vector); or, reduce last axis, producing an array of shape 4
      3 12 21 30
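
For comparison (my own sketch, not from the thread), numpy expresses the same two reductions with the `axis` keyword:

```python
import numpy as np

x = np.arange(12).reshape(4, 3)  # same values as i. 4 3 in j
print(x.sum(axis=0))   # reduce along the leading axis -> shape (3,)
print(x.sum(axis=1))   # reduce along the last axis    -> shape (4,)
```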
    
    

Another curiosity, which I have thus far neglected, is the extent to which
numpy's being 'a little more verbose' is actually incredibly important in
shaping the way you approach and think about problems. The great-uncle comment
also addresses this, but Iverson probably says it better than either of us
can: read
[https://www.jsoftware.com/papers/tot.htm](https://www.jsoftware.com/papers/tot.htm)

~~~
yuppiemephisto
What is prefix/suffix agreement? Could you give a simple example of why prefix
beats suffix?

I'm curious and have never heard these terms before, and googling didn't help.

~~~
00ajcr
To broadcast operations (such as addition) between arrays in NumPy, trailing
dimensions have to be equal (or be of length 1).

In the example given above the 3D array and 2D array have shape (lengths of
dimensions):

    
    
       (2, 3, 4)
          (2, 3)
    

That is - the _suffixes_ do not agree (4 != 3 and 3 != 2) and NumPy raises an
error.

However, for the same operation in J the _prefixes_ agree:

    
    
       (2, 3, 4)
       (2, 3)
    

and the addition gives the expected result.

To add the arrays with these shapes in NumPy, one method is to transpose each
array (reversing the order of the dimensions), add the transposed arrays, and
then transpose back:

    
    
      (a.T + b.T).T
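
Concretely, the transpose trick can be checked against the shapes from the example above (a small sketch of my own):

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)   # shape (2, 3, 4)
b = np.arange(6).reshape(2, 3)       # shape (2, 3)

# a.T has shape (4, 3, 2) and b.T has shape (3, 2): the suffixes now
# agree, so broadcasting works; transposing back restores the layout
c = (a.T + b.T).T
print(c.shape)                       # (2, 3, 4)
```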

~~~
repsilat
Thinking about these as a "verbose imperative language" programmer, I'd say
suffix agreement _seems_ to make more sense to me at first glance, and I'd
like to hear more about why it might be worse.

For why I think it makes sense: I think of multidimensional arrays as being
arrays of arrays, and "normal" index lookup operating on the first dimension.
If I have a

    
    
        float[100][3]
    

in some context I might think of it as 100 vec3s, and I might want to do some
vec3 operation on each of them. I might want to dot them all with some
other vector, or add them all to some other vector. I almost never have 100
scalars and want to apply one scalar to all elements of the corresponding
vec3.

But I guess maybe this is all widely agreed on, and maybe the contentious part
is just index order? Like, maybe you'd say "100 vec3s" is actually

    
    
        float[3][100]
    

in which case prefix agreement would make more sense.
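
The vec3 picture above is exactly the case numpy's suffix agreement handles; a small illustrative sketch (shrunk from 100 vectors to 4):

```python
import numpy as np

pts = np.arange(12, dtype=float).reshape(4, 3)  # 4 "vec3"s
v = np.array([1.0, 2.0, 0.0])

# suffix agreement: shapes (4, 3) and (3,) align on the trailing axis,
# so v is added to every row, i.e. to each vec3
print(pts + v)

# dotting every vec3 with v is a matrix-vector product
print(pts @ v)
```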

------
wodenokoto
This reminded me very much of working in R or working with Numpy.

However, I always thought having the array as the primitive in R was great
only because R is focused on statistics and data science.

My understanding is that J claims to be general-purpose, and as such I'm
surprised the paradigm holds.

~~~
ggrrhh_ta
If you run the J GUI, which is itself written in J, it is going to blow your
mind how general-purpose the language is...

~~~
mst
I wish somebody would write a walkthrough of writing something like a piece of
GUI to show that off.

(I keep trying to figure out J and ... sliding off ... every time I need to do
something "normal" ... which is annoying and the fact I'm reduced to
wishcasting in HN comments as a solution makes me disappointed in both myself
and the universe)

------
tsimionescu
This seems very interesting for numerical computation, but at first glance few
of these operations seem to apply directly to general computation (I'm sure
that there are isomorphisms that can be used to apply them indirectly).

The example with sorting was the oddest from this point of view. The author
praised the idea of the permutation vector as being a relatively direct
mathematical specification of sorting (return the permutation of the original
array in sorted order) while leaving out the 'sorted' part. Also, this focuses
entirely on the result, but there is no word on the algorithm - ' _how_ is the
array sorted' is the question that computer science is designed to solve, as
opposed to 'what are the fixed points of a sorted array'.

As such, as someone with no experience of J, I'm left wondering if this is
another example of the infamous Haskell quicksort - well-picked examples of
code which produces a desired result are extraordinarily terse and expressive,
but actually implementing well-known algorithms is often just as verbose as C,
give or take.

~~~
moonchild
> there is no word on the algorithm - ' _how_ is the array sorted' is the
> question that computer science is designed to solve, as opposed to 'what are
> the fixed points of a sorted array'

' _[H]ow_ is the array sorted' is one question that computer science can be
used to solve. However, assuming it _has_ been solved (which, for many cases,
it has), a much more interesting question is 'what can we do with a sorted
array'; or, 'given that sorting has such-and-such time or memory complexity,
how hard is this other algorithm?' The OP isn't showing off a particular
sorting algorithm, he's showing off a sorting _interface_ , because that's
more novel in context.

Sorting according to a predicate could be done with outer
product/table+reduction, but usually you don't need anything like that. Let's
say you have an array of 3-vectors, and you want to sort the vectors according
to the 2nd item in each vector. So your sort predicate is v1[1] < v2[1]. We
can just sort the whole array according to the vector of the 2nd items of each
vector; so:

    
    
         ] x =. ?@] 9 3$10                                                                                                                 
      2 4 9
      3 8 2
      8 0 4
      3 9 6
      3 1 8
      2 6 8
      1 2 5
      9 4 0
      5 0 9
         (<a:;1) { x       NB. index: the second element of every row of x
      4 8 0 9 1 6 2 4 0
         1 { |: x          NB. row 1 of the transpose of x; equivalent
      4 8 0 9 1 6 2 4 0
         x /: 1 { |: x     NB. x sorted according to the above.  Notice that the second column is now sorted
      8 0 4
      5 0 9
      3 1 8
      1 2 5
      2 4 9
      9 4 0
      2 6 8
      3 8 2
      3 9 6

~~~
tsimionescu
My point was that general programming usually looks more like writing the
sorting algorithm than simply calling it. Thus, I would be more interested in
seeing real examples of how the properties of J allow you to write algorithms
more concisely or otherwise better.

The article showed how J solves some mathematical problems with built in
functions or their built-in modifiers. But in general programming, you won't
find a function that does exactly what you wanted, maybe with a little adverb.
You'll usually have to write that function yourself, to describe how it works
in terms of some of those pre-existing functions.

Note that your second example could be expressed as something like

    
    
        x.OrderBy(y => y[1])
    

in C#, assuming x is int[N][M]. I understand that J's version can work with
many other shapes of x, but that is hard to appreciate without seeing an
example of how that comes in handy when writing some other (hopefully well-known)
algorithm.

~~~
moonchild
> My point was that general programming usually looks more like writing the
> sorting algorithm than simply calling it. Thus, I would be more interested
> in seeing real examples of how the properties of J allow you to write
> algorithms more concisely or otherwise better.

Ah - I see. Here[1] is quicksort in j. John Scholes's videos are excellent and
very illustrative, though they deal with apl rather than j: Conway's Game of
Life[2] and Sudoku[3].

> The article showed how J solves some mathematical problems with built in
> functions or their built-in modifiers. But in general programming, you won't
> find a function that does exactly what you wanted, maybe with a little
> adverb. You'll usually have to write that function yourself, to describe how
> it works in terms of some of those pre-existing functions.

Right. This is true, but you generally stay much closer to the builtins in j
than you do in other languages.

> Note that your second example could be expressed as something like

> x.OrderBy(y => y[1])

> In C#, assuming x is int[N][M]. I understand that J's version can work with
> many other shapes of X, but that is hard to appreciate without seeing
> example of how that comes in handy when writing some other (hopefully well
> known) algorithm.

Right; you mentioned sort predicates, so I was giving an example of a way in
which you could sort according to another relation. My point is: being able to
change the layout of your data so easily can obviate sort predicates. (Though
I do think a sort adverb taking a predicate would be cool.)

1.
[https://code.jsoftware.com/wiki/Essays/Quicksort](https://code.jsoftware.com/wiki/Essays/Quicksort)

2.
[https://www.youtube.com/watch?v=a9xAKttWgP4](https://www.youtube.com/watch?v=a9xAKttWgP4)

3.
[https://www.youtube.com/watch?v=DmT80OseAGs](https://www.youtube.com/watch?v=DmT80OseAGs)

------
yladiz
Out of curiosity, internally does a language like J `map` over arrays even
though the array is a primitive? Meaning, is `map` essentially a “primitive
function”?

