

Fexl - a Function Expression Language - sheffield
http://fexl.com/basics

======
fexl
Ah, thanks for posting about my Fexl language. Although I'm the author of it,
and somewhat fond of it, I must emphasize this caveat:
<http://news.ycombinator.com/item?id=2717560> .

Although lazy evaluation is quite amazing, in certain circumstances I find
that it's kicking my ass. For example, consider a function that simply sums
the numbers from 1 to N:

    
    
      \sum == (\N
          long_le N 0
              0
              (long_add N (sum (long_sub N 1)))
      )
    

Or, more tersely, using the semicolon as a syntactic "pivot" to avoid right-
nesting:

    
    
        \sum == (\N long_le N 0 0; long_add N; sum; long_sub N 1)
    

The problem is that when you call (sum 100000) it builds up a giant chain of
(long_add 100000; long_add 99999; long_add 99998; ...) and then evaluates that
monstrosity recursively.
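(For readers without a Fexl build handy, the effect can be mimicked in Python
with explicit thunks -- zero-argument functions standing in for unevaluated
expressions. This is only an illustration of the problem, not Fexl itself.)

```python
# Lazy-style sum: each step returns a thunk (a deferred computation)
# instead of a number, so calling lazy_sum builds a chain of pending
# additions and no arithmetic happens until the outer thunk is forced.
def lazy_sum(n):
    if n <= 0:
        return lambda: 0
    return lambda: n + lazy_sum(n - 1)()

pending = lazy_sum(100)   # no additions performed yet
assert pending() == 5050  # forcing it walks the whole chain at once
```

Forcing the thunk for a large N recurses through the entire chain of pending
additions, which is exactly the memory blow-up described above.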

So I need to do some more work on forcing early evaluation. You can start by
using the standard accumulator trick:

    
    
      \sum == (\total\N
        long_le N 0
            total
            (sum (long_add total N) (long_sub N 1))
        )
    
      \sum = (sum 0)
    

But even then you still have the massive recursion problem because you're not
forcing the addition operation early.

I'm thinking I just need to allow basic types like "long" to be called as
functions with no effect (i.e. equivalent to the identity function), so you
can do things like this:

    
    
      \sum == (\total\N
        long_le N 0
            total
            (
            \total = (long_add total N)
            total;    # force evaluation; result is I (identity function)
            sum total (long_sub N 1)
            )
        )
    

However, _even that_ doesn't always do the trick, particularly when you're
dealing with higher-level values that aren't basic data types. The problem
occurs generally when you build up a big "chain" of calculations with
arbitrary values, and you have no simple way of forcing evaluation along the
way.

This is, of course, the classic struggle between eager and lazy evaluation.
But if I don't do the evaluation lazily, I can't define recursion in terms of
the closed-form Y combinator, namely (Y F) = (F (Y F)). Instead I'd have to
define it in terms of some kind of run-time "environment" using either key-
value pairs or the de Bruijn positional technique -- something I've managed to
avoid thanks to lazy evaluation in terms of pure combinators.

So I must say that although Fexl is an interesting pure-combinator lazy
evaluation language, the jury is still out on its practical utility, in my
humble opinion. I've used it in some projects as an embedded interpreter, but
the application was quite constrained, so I didn't encounter some of these
larger issues.

That is why I emphasized this caveat earlier today:
<http://news.ycombinator.com/item?id=2717560> .

~~~
eru
Have you looked at how people deal with too much laziness in Haskell? If so,
is it applicable to your language? Why, or why not?

In your example I'd use foldl' (the prime is important) in Haskell for a
strict sum.
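(For illustration outside Haskell, here is the same strict-fold idea in
Python: functools.reduce is a strict left fold, so the accumulator is an
ordinary number at every step rather than a pending expression.)

```python
from functools import reduce

# Strict left fold: the running total is forced at each step, so no
# chain of deferred additions can build up.
def strict_sum(n):
    return reduce(lambda total, i: total + i, range(1, n + 1), 0)

assert strict_sum(100000) == 5000050000
```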

~~~
fexl
Thanks for the tip -- I do see something in Haskell about marking things as
strict. Perhaps I can do something similar in Fexl. The general problem I'm
dealing with is long chains of state transformations like this:

    
    
      \chain = (\state
        \state = (event1 state)
        \state = (event2 state)
        \state = (event3 state)
        state)
    

That of course is simply equivalent to:

    
    
      \chain = (\state 
        event3; 
        event2; 
        event1; 
        state)
    

But the former is a way of showing the computation in a forward instead of
reverse direction. And I know I could rephrase the former in a monadic style,
but that in itself does not alleviate the problem of the lazy evaluation.
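The equivalence of the two forms can be checked with a toy Python version
(event1 through event3 here are hypothetical stand-ins; the originals are
Fexl functions):

```python
# Hypothetical stand-in events; any state-to-state functions would do.
event1 = lambda s: s + 1
event2 = lambda s: s * 2
event3 = lambda s: s - 3

# Reverse direction, like (event3; event2; event1; state):
chain_reverse = lambda state: event3(event2(event1(state)))

# Forward direction, like the repeated \state = (eventN state) form:
def chain_forward(state):
    state = event1(state)
    state = event2(state)
    state = event3(state)
    return state

assert chain_reverse(10) == chain_forward(10) == 19
```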

This certainly does not matter if you're just chaining three events together,
but try linking that chain together 20000 times to give you 60000 events. Oh
it works, but it's nasty with memory usage.

So maybe I can introduce something into Fexl, without sacrificing elegance,
which forces some level of evaluation of the event applications.

I did try forcing at least a top-level evaluation of each event along the way,
using a technique sort of like this:

    
    
      \eval = (\state state I \_\_ I)
    

(That's because I know the state is ultimately just a list. I have a really
efficient way of doing arbitrarily large key-value maps simply using nested
lists in just a few lines of Fexl.)

Then I did this bit of nastiness:

    
    
      \chain = (\state
          \state = (event1 state) eval state;
          \state = (event2 state) eval state;
          \state = (event3 state) eval state;
          state)
    

But I dunno, it still didn't quite do the trick. The jury's still out. So far
for most "real work" I'm still just using embedded simple token-based domain-
specific concatenative languages, with the enclosing interpreter written in
either ANSI C or Perl. Fexl is still mostly a lab toy.

~~~
chalst
You can control the order of execution of pure functions by using CPS (so
strict can be represented by lazy or vice versa). You can't force monadic
operations to occur out of order this way: you need to have some concurrency
between the pure expansion semantics and the action semantics.

Conal Elliott has written some nice things in this vein; he makes a relevant
point in [http://conal.net/blog/posts/can-functional-programming-be-
li...](http://conal.net/blog/posts/can-functional-programming-be-liberated-
from-the-von-neumann-paradigm/)

So why can't you interleave the add operations? Are the atomic arithmetic
operations side effects? Can you not represent CPS faithfully for some reason?
I'd really like to see the expansion phase of Fexl expressed using CPS.

BTW, borrow a notation from Haskell and have a dot operator be the transpose
of the semicolon operator.

~~~
fexl
(Intriguing suggestion about the dot operator by the way.)

On this question: "Are the atomic arithmetic operations side effects?" Not
really. Well, sort of. I mean, take a look at the reduction code for adding
two long values: <https://github.com/chkoreff/Fexl/blob/master/src/long_add.c>

In short, when you evaluate (long_add 2 3), that value is _replaced_ with the
number 5, right inside the machine data structure. So in that sense there is a
"side effect", but it's a purely functional referentially transparent side
effect only in the C internals -- nothing mutable going on at the Fexl level.
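Here is a rough Python sketch of that in-place replacement. (The real
mechanism lives in the C internals linked above; this only illustrates the
idea of a node overwriting itself with its result.)

```python
class App:
    """A reducible application node: the first time it is forced, it
    overwrites itself with its result, so every later reference sees
    the plain value -- a benign, referentially transparent mutation."""
    def __init__(self, fn, *args):
        self.fn, self.args = fn, args
        self.done = False
        self.calls = 0  # just to observe that fn runs only once

    def force(self):
        if not self.done:
            self.calls += 1
            self.value = self.fn(*self.args)
            self.fn = self.args = None  # the application node is gone
            self.done = True
        return self.value

node = App(lambda a, b: a + b, 2, 3)
assert node.force() == 5
assert node.force() == 5 and node.calls == 1  # replaced, not recomputed
```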

I'm all well-versed with CPS (continuation-passing style), e.g. I've done
stuff like this:

    
    
      \do_stuff = (\state\return
          do_this state \state
          do_that state \state
          return state)
    

But that doesn't in itself help me, yet.

By swapping the order of the parameters "state" and "return" in do_stuff,
do_this, and do_that, I can transform that function into a monadic style:

    
    
      \do_stuff = (\return
          do_this;
          do_that;
          return)
    

But as it turns out, that accomplishes nothing _essential_ -- it is merely a
syntactic difference.

Keep in mind that Fexl is purely combinatorial, and ultimately what's really
going on under the hood is the application of these two rules:

    
    
      C x y    =  x
      S x y z  =  x z; y z
    

So maybe that will give you some insight into just how irredeemably lazy this
language really is. :)

(Yes there are some other combinators such as I, L, R, and Y, but these are
ultimately shorthands for S and C forms.)

If by "interleave the add operations" you are suggesting a change to the core
evaluation strategy used in the interpreter, that is probably out of the
question -- I've made my bed there and I have to lie in it. There's not much I
can do at this point about my reliance on combinators, I mean, check out the S
combinator: <https://github.com/chkoreff/Fexl/blob/master/src/S.c> . That's
baked into the cake!

But if you mean there's something I can do different in my Fexl function
itself, that might be something to consider.

I tried the full gamut here, using both accumulator and CPS:

    
    
      \test_big_sum_4 =
      (
    
      \sum == (\N \total \return
          long_le N 0
              (return total)
              (sum (long_sub N 1) (long_add total N) return)
              )
    
      # TODO still a problem!!
      \N = 100000
      sum N 0 \total
      print "sum 1 .. "; print N; print " is "; print total; nl;
      )
    
      test_big_sum_4
    

But to no avail: it still uses up large amounts of memory.

However, I could force the evaluation of (long_sub N 1) and (long_add total
N), and that might do the trick. Then it'll be totally tail recursive with
machine integers at every turn, and run in constant memory.

~~~
chalst
_I'm all well-versed with CPS_

I'm talking about a particular application of CPS, the encoding of CBV lambda-
calculus in the CBN calculus. Check out Danvy & Filinski (1992) if you need
brushing up on this: look at what happens in your calculus when you code up
the CBV version of the foldl, which should force the first atomic operation to
happen before unwinding the next application of addition.

Danvy & Filinski, 1992, Representing control: a study of the CPS
transformation
[http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.46.8...](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.46.84)

------
mahmud
What is an "expression language"? I just started doing Java EE crap and there
is an assortment of "ELs" that you can embed in your java apps.

Outside of Java, an "expression language" doesn't make any sense to me. An
expression, statement, binding, assignment, application, and abstraction are
all constructs, or elements of programming languages. Most of them can be
implemented using others, sure, but I expect all useful languages to be
capable of _all_. So, my question is, what makes an expression language an
_expression_ language, to the exclusion of all other constructs? (IOW, why is
that particular part being made into a defining characteristic of the
language?)

~~~
fexl
:) It's a _function_ expression language, meaning a language for expressing
pure functions of an arbitrary nature. It's really just a variant of the
lambda calculus, and I compile those expressions into combinators to eliminate
all variables.

For example, the "flip" function, which applies its second argument to its
first, is this:

    
    
      \flip = (\x\y y x)
    

But Fexl converts that into pure combinators like so:

    
    
      \flip = (S (C (S I)) C)
    

Actually it uses the higher-level combinators L, R, I, etc., so that flip is
defined as:

    
    
      \flip = (L I)
    

But the higher-level combinators are shorthand for forms that use S and C only.
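One can check that compilation mechanically with Python closures standing in
for the combinators (a sketch only; Fexl's internals are C, not Python):

```python
# Curried combinators: C x y = x,  S x y z = (x z)(y z),  I x = x.
C = lambda x: lambda y: x
S = lambda x: lambda y: lambda z: x(z)(y(z))
I = lambda x: x

# \flip = (\x\y y x), i.e. flip applies its second argument to its first.
flip = S(C(S(I)))(C)

double = lambda n: n * 2
assert flip(3)(double) == 6  # flip x y = y x, so this is double(3)
```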

~~~
mahmud
Well, these things exist in the wild, and they're not as fun & cute as yours:

<http://en.wikipedia.org/wiki/Expression_Language>

[http://docs.jboss.org/seam/latest/reference/en-
US/html/elenh...](http://docs.jboss.org/seam/latest/reference/en-
US/html/elenhancements.html)

[http://java.sun.com/products/jsp/reference/techart/unifiedEL...](http://java.sun.com/products/jsp/reference/techart/unifiedEL.html)

~~~
fexl
I hear ya. I consider Fexl to be at the extreme end of computational
abstraction. The kind of things you reference are convenient tools for making
certain lower-level things accessible from a high level sort of script.

I've used that approach in my work for many years with great success. I use
ultra-simple domain-specific languages consisting of this syntax:

    
    
      token token token token ...
    

:) Seriously though. It's amazing what you can do in a script that consists
of nothing but tokens or "words", e.g.

    
    
      do_this 4 5
      do_that x "Hello there."
    

Where all filler such as #-comments and white space, even line breaks, is
completely insignificant. I've made a _lot_ of hay out of languages like that.

I even have a Turing-complete language in that form, and I can define "verbs"
which access any sorts of machine functions I care to provide. It ends up
looking a bit like Forth, but clearer, I think (coming from a biased judge of
course). I'm actually tending to favor that token-based Turing-complete
language _over_ Fexl for doing real work. I'm not devoted to the _idea_ of
functional programming, I'm devoted to simple, powerful, flexible, and secure
programming.

Strangely, the token-based languages make my application code more "language
independent" -- meaning that I can write the interpreter in any language I
want (C, Perl, etc.) and it really does not matter at all. I like to keep
pushing as much application logic into the scripting language as possible.
With the Turing-complete language it might even be feasible to do _all_ of my
application logic in it, leaving me with only a small residual interpreter
written in C. But it remains to be seen if my Turing-complete token language
can really scale well to that large body of application logic.

------
pwang
Can you contrast this with Joy?

<http://en.wikipedia.org/wiki/Joy_(programming_language)>

~~~
fexl
Ah -- concatenative languages, yes. I'm becoming a big fan of simple token-
based concatenative languages. See my summary here of my recent travails:
<http://fexl.com/second_thoughts> .

In short, instead of Fexl I've started using a simple token-based language.
Like Joy, my token-based language uses a stack. But my language also uses a
global mutable key-value space, and it only supports keys and values which are
strings (which may be interpreted as numbers).

I think my little language is easier to "execute" manually on a white-board in
a way that's easy for non-programmers to follow along. I don't need a lot of
arcane "stack flip-aroo" operators like Forth and Joy use.

So how do I do a sort you ask? I just jam a bunch of keys into the key-value
space and then use "next" to iterate through them. The key-value space does
the sorting for me.

What about strictly sequential lists you ask? I can just store a bunch of keys
using a scheme like this:

    
    
      item/a1 ... item/a9
      item/b10 ... item/b99
      item/c100 ... item/c999
      item/d1000 ... item/d9999
    

By prefixing a single character which represents the number of digits, I get
the proper numerical order when I iterate in ASCII character order.
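A quick Python check of the scheme (item_key is a hypothetical helper name,
not part of the actual token language):

```python
import string

def item_key(n):
    """Build a key whose single-letter prefix encodes the digit count,
    so plain ASCII (lexicographic) order agrees with numeric order."""
    digits = str(n)
    prefix = string.ascii_lowercase[len(digits) - 1]  # 1 digit -> 'a', 2 -> 'b', ...
    return "item/" + prefix + digits

nums = [9, 100, 2, 1000, 42]
assert sorted(item_key(n) for n in nums) == [item_key(n) for n in sorted(nums)]
```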

You know what my real goal is? I want to be able to give my non-programmer
colleagues the ability to embed little snippets of computation here and there,
making simple things simple, but also imposing no upper bound on the
possibilities of what they might do.

------
Gotttzsche
I don't understand the identity function that uses just C and S. :(

C x y = x

S x y z = (x z) (y z)

\I = (S C C)

so uh... that would evaluate to (\z (C z) (C z)), right? and then to (\z z z)?
but then you got z twice. what does that even mean? applying z to z?

~~~
eru
I z = (S C C) z = (C z) (C z) = C z _ = z

Where _ stands for a value that doesn't matter.

And actually for any x we have S C x == I.
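The same reduction can be checked mechanically with Python closures playing
the combinators (only an illustration; Fexl's C and S are not Python
functions):

```python
# C x y = x  and  S x y z = (x z)(y z), curried:
C = lambda x: lambda y: x
S = lambda x: lambda y: lambda z: x(z)(y(z))

I = S(C)(C)
assert I(42) == 42          # (C 42)(C 42) = 42; the second (C 42) is ignored

# And S C x is the identity for *any* x:
I2 = S(C)(lambda z: "irrelevant")
assert I2("hello") == "hello"
```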

~~~
fexl
Yes, I like that expository technique of using "_".

Often when I develop a Fexl function I actually use "_" as a placeholder for
"something I haven't implemented yet". So if I'm doing something with a list,
I might say:

    
    
      list
        _
        \head\tail _
    

And then I just go back and replace each "_" with an implementation of that
particular case. In doing so, I might create new "_" slots ("holes"), which I
then implement, and then I repeat this process until no more holes appear.
It's a really good "case analysis" approach to programming.

For example I used this approach to implement an associative key-value map
structure which uses nested lists, where a list at level N branches on the Nth
character of the key, so it's a radix-style algorithm but without splitting
the keys into pieces:

    
    
        # Helper function.
    
        \map_put == (\map\pos\key\val
            map             
                (item (pair key (pair val end)) map)
                \head\tail
                head \top_key\top_node
    
                # Do a three-way comparison of key char at pos
                # with the top key char at that pos.
    
                long_compare (string_at key pos) (string_at top_key pos)
    
                    # key char is less than top key char
                    (item (pair key (pair val end)) map)
    
                    # key char equals top key char
                    (
                    top_node \top_val\top_map
                    \new_head =
                        (
                        string_compare key top_key
                            (pair key (pair val
                                (map_put top_map (long_add pos 1) top_key top_val)))
                            (pair key (pair val top_map))
                            (pair top_key (pair top_val
                                (map_put top_map (long_add pos 1) key val)))
                        )
                    item new_head tail
                    )
    
                    # key char is greater than top key char
                    (item head (map_put tail pos key val))
            )   
        
        # The actual function (map_put map key val).  This just
        # calls the helper function with position 0 to start.
        # I should probably re-order the helper function so
        # pos is the last argument, then I can define this
        # function simply as \map_put = (map_put 0).
    
        \map_put = (\map\key\val map_put map 0 key val)

~~~
eru
Thanks!

By the way, if you like SKI calculus, you would have loved this year's ICFP
programming contest.

------
fexl
What a difference a day makes. The day after this discussion about my problems
with lazy evaluation, I figured out an incredibly simple solution:
<http://fexl.com/eager_and_lazy>

------
gmartres
So, how is this different/better than other functional languages?

~~~
fexl
OK, I'll discuss what's different and leave "better" out of it.

1. In Fexl, there is no distinction between data and function. All data
structures such as lists, pairs, etc. are represented as functions.

2. Fexl has a small grammar, about as small as I think is feasible for
expressing arbitrary functions. You could actually omit these rules:

    
    
      exp => \ sym = term exp
      exp => \ sym == term exp
      exp => ; exp
    

But the resulting forms would be far more difficult to write and understand.

3. Fexl has a very simple compilation and evaluation strategy, reducing
everything to combinatorial forms. There are no "environments" or "closures"
or "contexts" being whipped around at run-time. The resulting Fexl executable
program is about 35K in size.

4. Unlike other functional programming languages, Fexl does not rely on
"pattern matching" for branching on the different possible forms of a piece of
data. Instead, it simply _calls_ that piece of data as a function, passing in
the appropriate handlers for the various cases. For example, using excessively
verbose function names here:

    
    
      list
          handle_empty_list
          \head\tail handle_first_item head; handle_remainder tail
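
In Python, that handler-passing style looks roughly like this (a sketch with
hypothetical names; Fexl's lists are combinator terms, not Python closures):

```python
# A list is *itself* a function of two handlers: one for the empty
# case, one receiving head and tail. Branching is just calling it.
empty = lambda on_empty, on_item: on_empty
def item(head, tail):
    return lambda on_empty, on_item: on_item(head, tail)

def total(lst):
    # No pattern matching: pass a value for the empty case and a
    # handler for the head/tail case.
    return lst(0, lambda head, tail: head + total(tail))

nums = item(1, item(2, item(3, empty)))
assert total(nums) == 6
```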
    

5. Fexl doesn't really have a distinct concept for defining a name -- that's
all handled normally by lambda calculus. However it does provide the syntactic
shorthand "=". For example this function:

    
    
      \square = (\x mul x x)
      print (square 4)
    

Is equivalent to this function, which does not use the "=":

    
    
      (\square print (square 4)) (\x mul x x)
    

I'm not exactly sure if that's oh-so-different from other programming
languages, but I'll venture a guess that many of those other languages use a
symbol table to store function definitions at run-time, while Fexl does not.
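The "=" shorthand above has a direct Python analogue, for what it's worth:

```python
# \square = (\x mul x x)  print (square 4)
# desugars to applying the rest of the program, as a function of
# "square", to the definition itself:
result = (lambda square: square(4))(lambda x: x * x)
assert result == 16
```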

6. Fexl never creates circular data structures in memory. Now because of lazy
evaluation, you can create logically circular or infinite structures in Fexl,
but these are always closed forms and never involve literal circularity in
memory. Consequently it is possible to manage memory using reference counting
-- and Fexl does that. Some may not like that, but it's simple and it gets the
job done. The code even has a built-in assertion to ensure that all memory was
properly reclaimed at the end of a run.

(I'm not _certain_ that other languages create circular structures in memory,
but I am certain that Fexl does not so I thought it worth mentioning.)

~~~
gmartres
That's interesting, but excluding the "reducing to combinatorial forms" thing,
which only seems to be an implementation detail, can't this all be done in
Scheme or another Lisp dialect? "no distinction between data and function"
goes hand in hand with
<https://secure.wikimedia.org/wikipedia/en/wiki/Homoiconicity>

~~~
fexl
You can certainly do this sort of thing in Lisp, using forms like:

    
    
      (defun square (x) ...)
      (lambda (x y) ...)
      (add 2 (add 3 (add 4 5)))
    

In Fexl you would see instead:

    
    
      \square = (\x ...)
      \x\y ...
      add 2; add 3; add 4 5
    

So at least Fexl has the virtue of being more compact in those cases. :) Also,
in Fexl, whenever you see a name, it _always_ refers to a function, unlike in
Lisp, where names like "defun" and "lambda" and "prog" are meta-logical
syntactic devices and cannot be defined as functions in their own right. And
anything like "setq" or "setf" is strictly out of the question in Fexl.

You could of course implement lazy techniques in Lisp, even going so far as to
write a Fexl interpreter if you like. I just wanted to see what happened if
procedural, mutable, and meta-logical constructs were _completely_ eliminated
as possibilities in a language.

~~~
gmartres
So, Haskell without types and monads basically? ;)

~~~
fexl
Definitely without the types, yes. However, monads are completely do-able in
Fexl. I can write Fexl code that looks procedural and "side-effect-y", using
the monadic technique so that you never actually see the state variable that's
being chained through the functions. Monads are more a _style_ of code than a
feature of the language per se.

Personally I'm quite happy without the baggage of type declarations. As long
as I build up functions systematically, I have very few problems with run-time
type violations. Sure every once in a while I forget a semicolon or whatever
and my function gets "out of synch" like a T-1 line gone out of phase. But
it's usually pretty easy to see what went wrong.

~~~
gmartres
Yeah, I know how monads work; what I meant was that, judging from the
examples, you don't seem to enforce purity (nothing like the IO monad).

And I've never found types to be a burden in Haskell, the type inference works
great so that you can usually omit them if you can't be bothered, but I still
include them most of the time because they help you reason about your code and
see patterns.

~~~
fexl
Right, I don't enforce purity, but I do allow it. Ultimately there's a
"string_put" function which (1) produces an actual side effect and (2)
evaluates to the identity function. You can wrap monadic (monastic?) purity
around that if you like.

Also, strict typing is pretty much impossible in Fexl, since I'm using
combinators. You can't really assign a meaningful type to things like S, C, I,
Y, etc. So yes, Fexl is very "loosey-goosey" that way. I also didn't want to
bother with some ponderous PhD project like a "type inference engine" written
into my ANSI-C interpreter. I figure if you want to do high level things like
that, write those tools in Fexl itself (i.e. use meta-programming techniques).

