
Lang5 – A Stack-Based Array Language - kqr2
http://lang5.sourceforge.net/tiki-index.php
======
nils-m-holm
I see the attraction of stack-based languages. Composing functions by
concatenation is simple and expressive, but at some point it hits a limit and
turns into stack juggling, where you have to keep track of a lot of state on
the stack (or stacks). This is even worse in low-level FORTH, but still
noticeable in higher-level stack languages, like Joy and, probably, lang5.

Klong ([http://t3x.org/klong/](http://t3x.org/klong/)) also started as a stack
language, but I soon grew tired of keeping track of the stack, so I wrote a
compiler that translates K-like syntax to a stack machine program. You can
still see the underlying stack language when you start the interpreter with
the -d command line option.

------
mabynogy
Sadly, SourceForge is down at the moment.

[http://webcache.googleusercontent.com/search?q=cache:tPBms8N...](http://webcache.googleusercontent.com/search?q=cache:tPBms8N-M_cJ:lang5.sourceforge.net/+&cd=1&hl=fr&ct=clnk)

------
aeneasmackenzie
An article by the author on the language:

[http://archive.vector.org.uk/art10500710](http://archive.vector.org.uk/art10500710)

------
Volt
It would be nice to know why the submitter thought this was interesting.
Otherwise, I can only see it as just yet another dead language.

~~~
evincarofautumn
It’s interesting to me because it’s both concatenative—so the basic building
block of programs is composition—and array-oriented—so you can implicitly lift
operations over arrays like in APL. These are both “weird” families of
languages, but people who take the time to learn them tend to speak highly of
them, for good reasons that are hard to explain. :)

For example, here’s a program to toss a die 100 times and print the arithmetic
mean of the results:

    
    
        : throws(*)
          dup
          6 swap reshape
          ? int 1 +
          '+ reduce
          swap / ;
    
        100 throws .
    

“: name … ;” introduces a definition (like in Forth) with the given name, and
“.” prints a value.

This executes like so:

    
    
        # push number of tosses
        100
        # stack: 100
    
        # copy it
        dup
        # stack: 100 100
    
        # push number of sides of each die
        6
        # stack: 100 100 6
    
        # swap the number of sides and number of throws
        swap
        # stack: 100 6 100
    
        # reshape the scalar 6 to dimension 100
        # i.e., generate 100 copies of 6
        reshape
        # stack: 100 [ 6 6 6 … ]
    
        # replace each element n with a random
        # number in the range [0,n), here [0,6)
        ?
        # stack: 100 [ 0.347891 4.126314 2.314372 … ]
    
        # truncate each element
        # to an integer [0,5]
        int
        # stack: 100 [ 0 4 2 … ]
    
        # add 1 to each element to place it
        # within the range [1,6]
        1 +
        # stack: 100 [ 1 5 3 … ]
    
        # sum the array by reducing
        # with the addition function
        '+ reduce
        # stack: 100 351
    
        # retrieve the number of throws
        swap
        # stack: 351 100
    
        # calculate the mean
        /
        # stack: 3.51
    

Note that whenever we’re applying a _scalar_ function (“?”, “int”, and “1 +”)
but its argument is an _array_ , the operation is implicitly lifted over each
element of the array.
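
For readers who don't have lang5 at hand, here is a rough Python sketch of the same idea. It is not how lang5 implements lifting; `lift`, `rand_below`, `truncate`, and `add_one` are made-up names, and plain nested lists stand in for lang5 arrays.

```python
import random

def lift(f):
    """Turn a scalar function into one that also maps over (nested) lists."""
    def lifted(x):
        if isinstance(x, list):
            return [lifted(item) for item in x]
        return f(x)
    return lifted

# Scalar functions, written without any knowledge of arrays.
rand_below = lift(lambda n: random.random() * n)  # stands in for "?"
truncate = lift(int)                              # stands in for "int"
add_one = lift(lambda n: n + 1)                   # stands in for "1 +"

def throws(count, sides=6):
    dice = [sides] * count                        # "6 swap reshape"
    rolls = add_one(truncate(rand_below(dice)))   # "? int 1 +"
    return sum(rolls) / count                     # "'+ reduce swap /"

print(throws(100))
```

The difference is that in Python the lifting has to be requested explicitly with `lift`, whereas in an array language every scalar primitive behaves this way by default.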

Lang5 is cool because it basically combines the terse expressiveness of APL
with the compositional higher-order functional style of concatenative
languages like Joy and Factor. Since everything is based on composition, you
don’t need to use any local variables by default—the mantra is “name code, not
data”—and you can factor out any subexpression (“extract method”) and give it
a name just by cutting and pasting, like:

    
    
        : sum '+ reduce ;
        : randint ? int 1 + ;
        : throws(*)
          dup  6 swap reshape  randint  sum  swap / ;
    

(And even if you do use local variables, this is still an advantage in terms
of simplicity of reasoning about programs.)

Concatenative programming languages and array languages are basically two
different approaches to “function-level programming”, a style of functional
programming based on _combinators_ instead of lambda calculus, in which all
terms denote functions. A literal value like “100” is a function that accepts
a stack and returns a “new” stack with the value 100 on top.
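
A small Python sketch of that claim, under the assumption that a stack is modeled as an immutable tuple; `push`, `run`, and the word names are illustrative only and don't come from lang5 or any other real implementation.

```python
from functools import reduce

def push(value):
    """A literal: a function from stack to stack that adds `value` on top."""
    return lambda stack: stack + (value,)

def dup(stack):
    return stack + (stack[-1],)

def swap(stack):
    return stack[:-2] + (stack[-1], stack[-2])

def add(stack):
    return stack[:-2] + (stack[-2] + stack[-1],)

def mul(stack):
    return stack[:-2] + (stack[-2] * stack[-1],)

def run(program, stack=()):
    """Compose the words left to right and apply them to `stack`."""
    return reduce(lambda s, word: word(s), program, stack)

# A program is a list of words; concatenating lists composes the programs.
square = [dup, mul]
program = [push(3)] + square + [push(1), add]

print(run(program))  # (10,)
```

Because every term is a stack-to-stack function, pulling `square` out of the middle of `program` changes nothing about what the program computes, which is the cut-and-paste factoring described above.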

In a way, they’re the “most functional” languages—yet they also have a
straightforward imperative interpretation that makes them map nicely to real-
world hardware. You can think of a concatenative program as a series of pure
functions taking the current program state (a stack) and returning a new
state, _or_ as a series of imperative procedures mutating a stack in-place;
because the stack is “linear”, consumed on each call, these two views are
equivalent, so you can think about programs as _either_ pure mathematical
rewriting rules _or_ step-by-step procedures.
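
To make the two readings concrete, here is a hedged Python sketch with the same two words written both ways. The function names are invented for the example, and Python doesn't enforce linearity, so the "old stack is never reused" property is only a convention here.

```python
# Pure view: each word maps an immutable stack to a new one.
def dup_pure(stack):
    return stack + (stack[-1],)

def mul_pure(stack):
    return stack[:-2] + (stack[-2] * stack[-1],)

# Imperative view: each word mutates one shared stack in place.
def dup_mut(stack):
    stack.append(stack[-1])

def mul_mut(stack):
    right = stack.pop()
    left = stack.pop()
    stack.append(left * right)

def run_pure(words, stack):
    for word in words:
        stack = word(stack)  # the previous stack is never looked at again
    return stack

def run_mut(words, stack):
    for word in words:
        word(stack)
    return stack

pure_result = run_pure([dup_pure, mul_pure], (7,))
mut_result = run_mut([dup_mut, mul_mut], [7])
print(pure_result, mut_result)  # (49,) [49]
```

Since `run_pure` drops each intermediate stack as soon as the next word has consumed it, nothing can observe whether the stack was copied or updated in place.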

They also have a bunch of nice theoretical properties, especially when you add
static types, that make it easy to provide good tooling and achieve good
performance.

~~~
exikyut
Huh. What a weird form of functional programming.

(Cue proper realization of what "array language" means)

If only this could be smoothly transferred over to mainstream languages...

~~~
evincarofautumn
I’ve been working on a concatenative language, Kitten, which I hope eventually
bridges the gap to more mainstream programmers with a seamless blend of
functional and imperative semantics, useful language features that work best
in a concatenative setting, and straightforward reasoning about correctness
and performance.

Kitten is small, but not nearly as minimalistic as other concatenative
languages. It’s meant to be a “systems” language in the realm of C++ or Rust,
with unboxed data types by default and no GC required, but there are various
concessions to usability: a traditional tokenizer, local variables, infix
operators, an expressive static type system similar to Haskell’s, and a
compositional effect/coeffect system.

Moreover, thanks to static types, thinking of the program in terms of a “data
stack” is somewhat discouraged—instead of stack shuffling operations, it
encourages the judicious use of local variables and dataflow combinators
(e.g., patterns like “apply a function to two values and get both results”).
The stack isn’t even really an implementation detail: no stack is actually
present in memory at runtime in the latest iteration of the compiler, since
data lives in registers or on the call stack, just like in C.
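
As a rough illustration of what a dataflow combinator looks like, here is a Python sketch. The names `both` and `fanout` are hypothetical and are not Kitten's actual combinators; the point is only that the pattern gets a name instead of being spelled out with stack shuffling.

```python
def both(f, x, y):
    """Apply one function to two values and return both results."""
    return f(x), f(y)

def fanout(f, g, x):
    """Apply two functions to one value and return both results."""
    return f(x), g(x)

width, height = both(abs, -3, 4)
low, high = fanout(min, max, [5, 1, 9])
print(width, height, low, high)  # 3 4 1 9
```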

Anyway, in the meantime, you can toy around with an existing concatenative
language like Factor, which feels very Lisp-like and has a Smalltalk-like
object system and nice interactive environment; or you can get many of the
benefits of concatenative programming by preferring a compositional/dataflow
style (not necessarily stack-oriented) in languages where it’s reasonably
easy, such as Haskell and Clojure.

~~~
exikyut
Hmm, interesting.

I just had a slow look through
[http://kittenlang.org/intro/](http://kittenlang.org/intro/). In all honesty
my stupid ADHD decided to conk out precisely at the Lambdas section, I'm not
sure why. Up to that point, the only thing that I'd note is that "match (xs
head)" is not described anywhere (on the page at all).

But I think it's interesting you posit Kitten as a systems language. Huh.
(What's a "traditional" tokenizer?! Ah; I assume you mean "not pure Forth" as
opposed to "buy APL keycaps here" (heh).)

I guess the only advice I can think of right now is, offer batteries included,
and include the batteries you're most passionate about, because it'll mean the
implementations are complete and well-tested. If that's web templating (random
top-of-my-head thought), do that; if it's image processing, do that. It'll
raise the chances the language becomes $thing-heavy because nobody contributes
other code, but you can always stop advertising and spend time rounding things
out if that happens.

I meant to install Kitten, but that didn't quite work this time around due to
problems on my end. (First git was silently segfaulting due to inconsistently-
upgraded shared libraries on my Slackware box, then I realized I needed
Haskell and switched to Arch, then I was reminded that pacman doesn't _really_
know how to resolve dependencies properly ("if you install something that
needs the newest version of glibc, maybe I shouldn't need to have to upgrade
glibc (and reverse-resolve its dependencies) myself?!"), then after half-
installing Haskell I discovered (/) only had 19MB free, so now I have to go
figure out what caused that... (I argue it's my 80KB/s ADSL2+ upload speed
:P))

I'll definitely be looking at Kitten at some point though. Thanks for the
reply!

EDIT: Got past the previous issues, now Haskell is sad.
[https://github.com/evincarofautumn/kitten/issues/206](https://github.com/evincarofautumn/kitten/issues/206)

~~~
evincarofautumn
> "match (xs head)" is not described anywhere (on the page at all).

“A ‘match’ expression takes an instance of an ADT, matches on its tag, and
unpacks its fields (if any) onto the stack so they can be manipulated.” (Under
the “Algebraic Data Types” section.)

By a “traditional” tokenizer I mean “a tokenizer more like what you’d expect
from a mainstream C-like programming language, not just splitting on
whitespace like Forth does”. It has some superficial similarities to Forth
(postfix by default), but overall has more in common with the ML and C
families (Haskell, OCaml, Rust, C, C++).

The reason I decided to start labeling it a “systems” language recently is to
get the attention of the people from C++ and Rust who I’d like to try it out
eventually. The implementation is nowhere near this yet, but the language
itself is designed to be easy to implement efficiently (strict evaluation,
guaranteed tail call elimination, unboxed data types, no GC or nontrivial
runtime required) and has a lot of static structure available for the compiler
to do optimisations (static type and effect system).

Kitten’s docs are perpetually in an awkward state because I’m the only one
working on it regularly—they’re usually slightly behind the compiler version
I’m actually working on, and occasionally describe _language_ features that
aren’t yet _implementation_ features (i.e., they _should_ work but are known
to be buggy/incomplete in the current version). I’m focusing on the next major
round of changes to the compiler, which I hope is the last before a release,
and I plan to go through all the docs and flesh them out as part of the
eventual release process.

~~~
exikyut
(Sorry for the late reply, whoops)

Hm, thanks for the manual reference (heh, should have gone and looked). Now to
go figure out what "xs head" means...

I think the tokenizer choices are pragmatic and interesting. They make the
language less annoying to work with, and thus easier to learn.

The idea of a systems language with Kitten's focuses sounds good - oddball
alternatives always serve to (slowly) influence the mainstream, let's hope
that happens here :D

The approach to documentation is reasonable too, if unintuitive at first
glance; providing a language that implements roughly 95% of what is
advertised is actually not a bad approach.

~~~
evincarofautumn
“head” has the type:

    
    
        <T> (List<T> -> Optional<T>)
    

(“For any type T, function from list of T to optional T.”)

So assuming “xs” is a list, “xs head” is either a full Optional (“some”)
containing the first element of the list, or an empty Optional (“none”) if the
list was empty—typed, of course, according to the element type of the input
list, so if “xs” has type “List<Int32>” then “xs head” has type
“Optional<Int32>”.
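
If it helps, here is an approximate Python analogue of that signature using type hints. This is only a sketch: Python’s `Optional[T]` means “`T` or `None`” rather than a separate wrapper type, so unlike the Kitten version it can’t tell an empty list apart from a list whose first element is `None`.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")

def head(xs: Sequence[T]) -> Optional[T]:
    """Return the first element, or None for an empty sequence."""
    return xs[0] if xs else None

print(head([3, 1, 2]))  # 3
print(head([]))         # None
```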

------
3rdAccount
I haven't seen this in ages, but was always sad it didn't pick up a little
more steam. Neat project. I'd love to learn concatenative and array languages
using it.

