
What is the minimal basis for Futhark? - Athas
https://futhark-lang.org/blog/2019-04-10-what-is-the-minimal-basis-for-futhark.html
======
snrji
This reminds me of an old idea of a friend of mine, who said that pure,
unrestricted recursion (and likewise unrestricted iteration) is like goto: a
primitive that is too powerful and not abstract enough to be exposed directly
to the programmer in a high-level language. This post shows that there's an
additional reason for avoiding explicit recursion when possible: performance.
Sure, you can define any function using recursion, but the compiler won't be
smart enough to parallelise it.

Does anyone know if a language with only higher-order functions like map,
reduce and others could be Turing complete? Probably 80-90% of cases would
fit the scheme of some predefined functions like scanl. My intuition is that
in some cases you would still need explicit recursion or iteration, for
instance if you wanted to loop forever or until a certain condition is met,
but perhaps these cases could be modelled with some special higher-order
functions as well, or with explicit infinite recursion as in Rust's 'loop'.
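To make the distinction concrete, here is a small Python sketch of mine (not
from the thread): any loop with a known trip count can be phrased as a fold,
and scanl corresponds to a prefix scan, but a condition-driven loop has no
fixed iteration count and needs some unbounded primitive underneath.

```python
from functools import reduce
from itertools import accumulate

# A bounded loop is just a fold: summing squares without explicit recursion.
total = reduce(lambda acc, x: acc + x * x, range(10), 0)

# scanl corresponds to itertools.accumulate: all running prefix sums.
prefix_sums = list(accumulate(range(10)))

# A condition-driven loop ("until some predicate holds") has no fixed trip
# count, so it cannot be a plain fold over a finite sequence -- it needs an
# unbounded primitive, here an ordinary while loop.
def until(pred, step, x):
    while not pred(x):
        x = step(x)
    return x

halved = until(lambda x: x < 1, lambda x: x / 2, 100.0)
```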

~~~
Athas
You need some form of unbounded while loop for Turing completeness, but it can
take many forms.
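As a concrete illustration of "unbounded" (my example, not from the thread):
the Collatz iteration has no known a-priori bound on its step count, so it
cannot be written as a for loop with a precomputed trip count.

```python
def collatz_steps(n: int) -> int:
    # Count steps until n reaches 1. No closed-form bound on the step
    # count is known, so an unbounded while loop is the natural fit.
    steps = 0
    while n != 1:
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps
```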

~~~
snrji
I see, but you could still wrap all other cases in predefined functions or
higher-level primitives, and just have an "unsafe" mode for unrestricted
recursion. It could even be treated as a side effect.

~~~
Athas
Sure, and you can use things like co-recursion, which is how Agda (a total
language) models potentially non-terminating computations. It means that while
you can run forever, you need to perform productive work within finite time.
As an example, consider producing an infinite sequence of items, but where
each item is produced in finite time.
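A rough analogue in Python (a sketch of mine, not Agda's actual mechanism): a
generator can describe an infinite stream, yet it is "productive" in the
sense that each next element is delivered in finite time.

```python
from itertools import islice

def naturals():
    # An infinite, corecursively-defined stream: the computation as a
    # whole runs forever, but each element arrives in finite time.
    n = 0
    while True:
        yield n
        n += 1

# Consumers only ever force finitely many elements at a time.
first_five = list(islice(naturals(), 5))
```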

Also, in a high-performance setting, a surprising number of programs do not
need recursion at all (and hence do not need Turing completeness). In the
vast majority of the Futhark programs I've written, `while` loops are either
not used at all, or used for something fairly simple like an outermost
convergence loop (and typically with an iteration bound anyway, so it could
be written as a `for` loop with early exit).
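The pattern described above might be sketched like this (a hypothetical
fixed-point example of mine, not taken from any Futhark program): a
convergence loop with an iteration bound is just a for loop with early exit.

```python
def converge(step, x0, tol=1e-12, max_iter=100):
    # Outermost convergence loop with an iteration bound: a for loop
    # with early exit rather than an unbounded while loop.
    x = x0
    for _ in range(max_iter):
        nxt = step(x)
        if abs(nxt - x) < tol:
            return nxt
        x = nxt
    return x

# Example: Babylonian iteration converging to sqrt(2).
sqrt2 = converge(lambda x: 0.5 * (x + 2.0 / x), 1.0)
```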

~~~
snrji
I see, thanks for your comment.

------
tehsauce
From the conclusion:

"Performance does suffer for nontrivial programs, because the compiler will
not understand the algebraic structure of the custom functions, and so will
not perform important structural optimisations."

I wonder what it would take for the compiler to understand this algebraic
structure? Is this something feasible with more developer resources?

~~~
Athas
> I wonder what it would take for the compiler to understand this algebraic
> structure? Is this something feasible with more developer resources?

Possible, but not feasible. Map-reduce fusion in particular is difficult when
the reduction is expressed non-primitively, because the compiler has to be
careful not to duplicate work. The tree reduction is particularly challenging
because of the sequential loop that obscures the producer-consumer
relationship, but even the chunked reduction would require index analysis to
ensure that inlining the map does not duplicate work.
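To illustrate what is at stake (a simplified Python sketch of mine, not the
compiler's actual transformation): map-reduce fusion must apply the mapped
function exactly once per element while avoiding the intermediate array; a
hand-written reduction that reads elements through index arithmetic makes it
hard for a compiler to prove that inlining the map preserves this property.

```python
def map_then_reduce(f, op, xs, unit):
    # Unfused: materialises the intermediate mapped array.
    mapped = [f(x) for x in xs]
    acc = unit
    for y in mapped:
        acc = op(acc, y)
    return acc

def fused_map_reduce(f, op, xs, unit):
    # Fused: no intermediate array, and f is applied once per element.
    acc = unit
    for x in xs:
        acc = op(acc, f(x))
    return acc

# Count applications of f to check that fusion does not duplicate work.
calls = 0
def square(x):
    global calls
    calls += 1
    return x * x

result = fused_map_reduce(square, lambda a, b: a + b, range(10), 0)
# After this, square has been applied exactly once per input element.
```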

It gets even worse for more complex optimisations. For example, there is no
realistic chance that the compiler will be able to efficiently sequentialise
any of the hand-written reductions, in those cases where they are nested
inside other parallel constructs and their own parallelism is not necessary.

