Array Languages Rock

crowding · on Sept 28, 2012

The idea appears to be that array languages have builtin functions for operations that would require map/reduce/etc in the applicative programming paradigm. I agree that is convenient, but the downside is that only array operations that are supported this way are the ones that are anticipated by the language designer. If you want to split up an array in some way other than intended, you're back to applicative programming, so your language needs good support for that too.

I would argue that the more important thing that makes something an "array language" is that scatter/gather operations on arrays have syntactic support, as described here:

http://prog21.dadgum.com/141.html
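To make "scatter/gather" concrete, here is a minimal sketch in plain Python, which has no syntax for either operation, so both are written out as functions. The names `gather` and `scatter` are made up for illustration. In R the same two steps would be roughly `x[c(5, 1, 3)]` and `x[c(2, 4)] <- c(-1, -2)` (1-based), which is the kind of syntactic support meant above.

```python
# Gather: read many positions at once. Scatter: write many positions at once.
# Both helper names are hypothetical; they are not from any library.

def gather(values, indices):
    """Return the elements of values at the given indices, in that order."""
    return [values[i] for i in indices]

def scatter(values, indices, updates):
    """Return a copy of values with values[i] replaced by the matching update."""
    result = list(values)
    for i, u in zip(indices, updates):
        result[i] = u
    return result

data = [10, 20, 30, 40, 50]
picked = gather(data, [4, 0, 2])            # [50, 10, 30]
patched = scatter(data, [1, 3], [-1, -2])   # [10, -1, 30, -2, 50]
```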

As a personal anecdote: In my thesis work I switched most of my data analysis workflow from MATLAB to R. In terms of paradigms I'd say R is a slightly worse array-language (although it does have builtin syntax for things that require messing with sub2ind() in MATLAB) but a much better functional language than MATLAB. The empirical result is that I finished my first analysis project -- including the time it took to learn R -- faster than it had taken me to do a similar task in MATLAB, using 1/3 the code. Add to that that there is presently more active development of open graphics and statistical analysis libraries in R, and I haven't really looked back, though I occasionally think of picking up an APL-derivative like J to play with.

kd0amg · on Sept 28, 2012

> the downside is that only array operations that are supported this way are the ones that are anticipated by the language designer. … I occasionally think of picking up an APL-derivative like J to play with.

You really should -- it provides a nice counterexample to the "downside" you worry about. Having mapping over arrays (even with mismatched dimensionality) built into the mechanics of function application means that any function the programmer writes is automatically supported by the array mapping.
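A rough imitation of that idea in Python, as a sketch only: J does this inside function application itself, whereas here a hypothetical `lift` decorator has to be applied by hand. The point it tries to show is that the user-written function knows nothing about arrays and still works on them.

```python
def lift(fn):
    """Wrap a scalar function so it maps over arbitrarily nested lists."""
    def lifted(x):
        if isinstance(x, list):
            return [lifted(item) for item in x]
        return fn(x)
    return lifted

@lift
def fahrenheit(celsius):
    # An ordinary user-written scalar function; nothing array-specific here.
    return celsius * 9 / 5 + 32

scalar = fahrenheit(100)                    # 212.0
vector = fahrenheit([0, 100])               # [32.0, 212.0]
matrix = fahrenheit([[0, 100], [-40, 10]])  # [[32.0, 212.0], [-40.0, 50.0]]
```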

crowding · on Sept 28, 2012

I can see that from this distance -- I should have been more specific about it being a downside of the way MATLAB is only half-assedly array-oriented, not necessarily a feature of all array-oriented languages.

w_t_payne · on Oct 5, 2012

That sounds really nice. I must try J.

batista · on Sept 28, 2012

>The idea appears to be that array languages have builtin functions for operations that would require map/reduce/etc in the applicative programming paradigm. I agree that is convenient, but the downside is that only array operations that are supported this way are the ones that are anticipated by the language designer. If you want to split up an array in some way other than intended, you're back to applicative programming

I can't see why this should hold for all array languages in general.

If the language has some mechanism that lets you add new operations on the same "class / level" as the built-ins, then the above does not hold.

klodolph · on Sept 28, 2012

> Having done a bit of Prolog programming in the dim and distant past, my intuition is that trying to make everything declarative is a mistake...

Calling Prolog "declarative" or "logical" is just marketing. Prolog is really built around two concepts: unification and backtracking. This produces a system that can be used as a general-purpose language, but you will probably get more mileage thinking about it as a database engine, capable of expressing non-finite relations. Compare it to SQL, which only supports finite relations (finite number of rows in each table). Indeed, optimizing Prolog compilers will remind you more of SQL engines -- optimizing Prolog compilers produce indexes, does that sound familiar?

lambda · on Sept 28, 2012

Yeah, I've always wished that I could query my SQL databases with Prolog. It's really so much more powerful than SQL, and allows you to actually create abstractions, rather than repeating yourself endlessly in joining tables together or using clunky views.
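A small sketch of the kind of abstraction meant here, written in Python rather than Prolog and only under loud assumptions: the facts, the relation names and the `None`-means-unbound convention are all invented for illustration, the data must be acyclic for `ancestor` to terminate, and real Prolog unification is far more general than this equality check. Generators stand in for backtracking.

```python
# Facts: parent(Parent, Child), stored like rows in a table (made-up data).
PARENT = {("ann", "bob"), ("bob", "cal"), ("cal", "dee")}

def parent(p=None, c=None):
    """Yield every (parent, child) fact matching the arguments; None means unbound."""
    for fp, fc in sorted(PARENT):
        if (p is None or p == fp) and (c is None or c == fc):
            yield fp, fc

def grandparent(g=None, c=None):
    """grandparent(G, C) :- parent(G, P), parent(P, C).  The join is written once."""
    for fg, mid in parent(g, None):
        for _, fc in parent(mid, c):
            yield fg, fc

def ancestor(a=None, d=None):
    """ancestor(A, D) :- parent(A, D).
    ancestor(A, D) :- parent(A, P), ancestor(P, D)."""
    yield from parent(a, d)
    for fa, mid in parent(a, None):
        for _, fd in ancestor(mid, d):
            yield fa, fd

descendants_of_ann = list(ancestor("ann", None))
ancestors_of_dee = set(ancestor(None, "dee"))
grandparents = list(grandparent())
```

Each rule is a named, reusable query that can be called with either argument bound, which is the part that is awkward to express with repeated joins or views.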

matthavener · on Sept 28, 2012

You might be interested in Datomic.

"Datomic embeds Datalog, a subset of Prolog, to move queries into the application." (http://www.infoq.com/news/2012/03/clojure-west)

w_t_payne · on Sept 28, 2012

Everything that you say is true. Thinking in Prolog does have something of the same flavor & texture as thinking in SQL. (Although my experience with both is limited & somewhat rusty at the moment). However, I seem to recall that one of the philosophical goals of Prolog was to provide the ability to write "executable specifications", and, if I recall correctly, Prolog programs seemed to me to be (at least on the small scale) pretty declarative in nature, although (as with SQL), attempting to write a fast, highly performant program required an understanding of what the machine was doing "under the hood".

Ingaz · on Sept 28, 2012

You are right, but here is what I tell my programmers: if you want a high-performance SQL query, you can do one of two things: 1. Think carefully about the data and write generic SQL without any DBMS specifics. 2. Try to optimize everything with your DBMS specifics.

The latter almost never works, be it Oracle, MSSQL, DB2 or PostgreSQL.

And even when it works, it's unacceptable.

We had a query in Oracle that ran for several hours. Our Oracle DBA found a way to tweak Oracle that shrank it to 30 minutes. It was interesting, but impossible to use in practice: we could not guarantee that other queries would run fine with those server settings. (Our DBA was the first to argue against using it in production.)

mbq · on Sept 28, 2012

You missed a significant language here -- R (http://www.r-project.org/). Its array model is way better than Matlab's, and it has all the functional goodies on board, excellent expressiveness, can flawlessly represent real data (there are built-in missing values, ordered and unordered categorical variables, data commenting, arbitrary metadata, even a "spreadsheet" type) and an absolutely huge library (biased toward data science, but it features linear algebra, optimization and DEs). It is also pretty fast, especially because it is easy to write C/Fortran accelerators, so they are present behind many library functions. And in contrast to Julia/Cobra it has existed for a while, so way more bugs/flaws are already fixed/understood.

riffraff · on Sept 28, 2012

if you haven't seen it yet, the concept of OOPAL[1] as seen in fscript[2] is rather interesting.

Basically it boils down to integration of array programming and objects through the concept of "messaging patterns", which change the behaviour of a message/method call so that it multiplexes over self/arguments via syntactic magic

So you have stuff like

    # at aggregate level
    [[0,1],["a"]] count #=> 2
    # content level, looping left
    [[0,1],["a"]] @count #=> [2, 1]
    # content level, looping right
    2 greaterThan@ [1,3] #=> [true, false]

you can combine these things for multiple levels of nesting, and there are other messaging patterns for indexing, reduction etc.
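For comparison, the three calls above correspond roughly to the following explicit Python, as a sketch that only mirrors the results and not the F-Script mechanism (the variable names are made up):

```python
nested = [[0, 1], ["a"]]

# Aggregate level: the message goes to the outer array itself.
aggregate_count = len(nested)                    # 2

# Content level, looping left: the message goes to each element of the receiver.
content_counts = [len(item) for item in nested]  # [2, 1]

# Content level, looping right: the receiver stays fixed, the argument is looped.
comparisons = [2 > item for item in [1, 3]]      # [True, False]
```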

This feels more declarative than performing explicit loops or using map/fold & co. for some things, but also a fair bit more obscure.

(Perl6 also has something similar via hyperoperators and metaoperators, I think)

[1] http://www.fscript.org/documentation/OOPAL.pdf [2] http://www.fscript.org

schme · on Sept 28, 2012

Some computer scientists (e.g. Martin Odersky, creator of Scala) don't see object-oriented programming as a true paradigm, since it can be quite easily combined with other paradigms (hence Scala and its OOP + functional programming approach).

I think this sounds reasonable, especially considering the pseudo-OOP of today's languages.

w_t_payne · on Sept 28, 2012

Yeah, but OOP is not just a class of languages, it is a whole different way of thinking about problems and their solutions.

niggler · on Sept 28, 2012

Where's the love for languages like APL and J?

w_t_payne · on Sept 28, 2012

If I had any experience with them, I am sure that I would love them. :-)

kd0amg · on Sept 28, 2012

They highlight what I would call a limitation in other array languages. Octave (and I would expect MATLAB as well) allows scalar operators to be lifted to multidimensional arrays only if one operand is scalar or both have the same dimensionality, which looks to a J programmer like it's just incomplete special-casing.

By comparison, J's lifting permits mismatched dimensionality ("rank"), e.g. using a scalar operator on a list and a matrix, though the dimensions of the particular axes ("shapes" of the arrays) must still match (one must be a prefix of the other). This generalizes to operators written for higher-rank arrays too.
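The prefix rule can be sketched in Python over nested lists. This is a toy model under assumptions, not how J is implemented: `apply_dyad` is a made-up helper, arrays are assumed rectangular, and only scalar (rank-0) operators are handled.

```python
import operator

def apply_dyad(fn, left, right):
    """Apply scalar fn cell by cell; the shorter shape must be a prefix of the longer."""
    left_is_list = isinstance(left, list)
    right_is_list = isinstance(right, list)
    if not left_is_list and not right_is_list:
        return fn(left, right)
    if left_is_list and right_is_list:
        # Shared leading axis: the lengths must agree, then pair up the cells.
        if len(left) != len(right):
            raise ValueError("length error: leading axes differ")
        return [apply_dyad(fn, l, r) for l, r in zip(left, right)]
    # One side ran out of axes: reuse that scalar for every remaining cell.
    if left_is_list:
        return [apply_dyad(fn, l, right) for l in left]
    return [apply_dyad(fn, left, r) for r in right]

matrix = [[1, 2, 3], [4, 5, 6]]                       # shape 2 3
list_plus_matrix = apply_dyad(operator.add, [10, 20], matrix)
# shape 2 is a prefix of shape 2 3 -> [[11, 12, 13], [24, 25, 26]]

try:
    apply_dyad(operator.add, [1, 2, 3], matrix)       # shape 3 vs 2 3: no prefix
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```

Each item of the list is paired with a whole row of the matrix, which is the list-plus-matrix case described above.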

w_t_payne · on Oct 5, 2012

This is making me seriously consider investing some time learning J.

flavy · on Sept 28, 2012

Totally agree with the author. For a Matlab-flavored large scale processing DSL, check out the matrix library in scalding: https://github.com/twitter/scalding/wiki/Introduction-to-Mat...

binarymax · on Sept 28, 2012

I could be misunderstanding here, but isn't this what monads are for?

w_t_payne · on Sept 28, 2012

Having done a little bit of reading, I would have to agree. Yes, this is exactly what monads are for.

w_t_payne · on Sept 28, 2012

I thought that monads were something to do with I/O? (I never really grokked Haskell, so please forgive my ignorance).

ajanuary · on Sept 28, 2012

Monads are a general concept that happens to be a handy way to wrap up IO and protect Haskell from the ugliness of its impure nature. At their core, monads have nothing to do with IO and are used in a lot of non-IO situations.

I'd recommend Googling "Maybe monad in {insert favorite language here}" to at least grok the idea that monads are a more general concept.
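As one possible answer to that search, here is a minimal Maybe sketch in Python. The class and the two `safe_*` functions are invented for illustration and are not from any library; the point is only that `bind` chains steps that may fail, with no IO anywhere.

```python
import math

class Maybe:
    """A value that is either present ("Just") or absent ("Nothing")."""

    def __init__(self, value=None, present=False):
        self.value = value
        self.present = present

    @classmethod
    def just(cls, value):
        return cls(value, True)

    @classmethod
    def nothing(cls):
        return cls()

    def bind(self, fn):
        # Run fn only when a value is present; otherwise pass Nothing along.
        return fn(self.value) if self.present else self

    def __repr__(self):
        return f"Just({self.value!r})" if self.present else "Nothing"

def safe_reciprocal(x):
    return Maybe.nothing() if x == 0 else Maybe.just(1 / x)

def safe_sqrt(x):
    return Maybe.nothing() if x < 0 else Maybe.just(math.sqrt(x))

ok = Maybe.just(4).bind(safe_reciprocal).bind(safe_sqrt)      # Just(0.5)
failed = Maybe.just(0).bind(safe_reciprocal).bind(safe_sqrt)  # Nothing
```

The second chain never calls `safe_sqrt`; the failure in the first step is carried through without any explicit checks at the call site.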

w_t_payne · on Sept 28, 2012

Oh. Is that what they are? Never knew that. Well, you learn something new every day. :-) (Thank you very much).

binarymax · on Sept 28, 2012

I'm not a Haskell guy either, but monads are a concept not restricted to Haskell.

A trivial example, if you know jQuery: $(".myclass").addClass("myclass2"). The selector returns zero or more objects, and you don't need to iterate over each one and manually add myclass2.

dbaupp · on Sept 28, 2012

Note that jQuery isn't necessarily a monad in the strictest sense (i.e. it's unclear whether it satisfies the laws), but it certainly has many attributes that are similar, and so is a good example nonetheless.

Reference: http://stackoverflow.com/questions/10496932/is-jquery-a-mona...

w_t_payne · on Sept 28, 2012

Oh. I did not know that. I will have to investigate further. Thank you.

dbaupp · on Sept 28, 2012

IO is just one example of a monad. (It might be called the "canonical" example, since monads provide a very neat functional solution to the highly non-functional problem of interacting with the real world.)

Other examples include lists and functions. (Anything which satisfies the monad laws can be called a monad, there isn't any restriction or definition other than that.)
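A sketch of the list case in Python, with the three monad laws spot-checked on sample values. The names `unit`, `bind`, `neighbours` and `twice` are made up here, and checking a few inputs illustrates the laws without proving them.

```python
def unit(x):
    """Put a single value into the list 'container'."""
    return [x]

def bind(xs, fn):
    """Apply fn (which returns a list) to each element and flatten one level."""
    return [y for x in xs for y in fn(x)]

def neighbours(n):
    return [n - 1, n + 1]

def twice(n):
    return [n, n]

# Left identity: wrapping a value and binding is the same as calling fn directly.
left_identity = bind(unit(3), neighbours) == neighbours(3)

# Right identity: binding with unit changes nothing.
right_identity = bind([1, 2, 3], unit) == [1, 2, 3]

# Associativity: how the binds are nested does not matter.
nested_left = bind(bind([1, 2], neighbours), twice)
nested_right = bind([1, 2], lambda x: bind(neighbours(x), twice))
associativity = nested_left == nested_right
```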

debacle · on Sept 28, 2012

Cobra seems to have the most potential of those bandied about, but it also looks like the most conventional language of the three.

verroq · on Sept 28, 2012

Did he misspell declarative as array?

w_t_payne · on Sept 28, 2012

Yes. Yes I did. :-)

pstuart · on Sept 28, 2012

Intriguing, but a small code example illustrating it would have been nice.

w_t_payne · on Sept 28, 2012

I will try to dig one up...