
The algebra and calculus of algebraic data types - alex_hirner
https://codewords.recurse.com/issues/three/algebra-and-calculus-of-algebraic-data-types
======
vajrabum
That's fun and a bit surprising, but maybe it shouldn't be. I'm reminded that
Dana Scott and Christopher Strachey showed that by using lattices (or complete
partial orders) with a bit of topology to model the graphs of possible
computations of a program, you can, just as in analysis, define least upper
and lower bounds and a notion of continuity, and so derive a construction of
the limit of a computation analogous to a limit in analysis. They called this
model a domain. That bit of math is the basis of denotational semantics of
programming languages, and it's necessary because sets are largely sufficient
as a basis for algebra and analysis, but not for programs, which have evils
like partial functions, side effects, and variable assignment. I believe
Strachey and Scott also introduced the formal notions of sum, product, and
recursive types, and showed how definitions (models) of recursive types and
functions could be well-founded through their construction of limits on
domains. An early tech report on it can be found here:

[https://www.cs.ox.ac.uk/files/3228/PRG06.pdf](https://www.cs.ox.ac.uk/files/3228/PRG06.pdf)

and here's a more recent free book from David Schmidt on the topic:

[http://people.cs.ksu.edu/~schmidt/text/DenSem-full-book.pdf](http://people.cs.ksu.edu/~schmidt/text/DenSem-full-book.pdf)

~~~
evincarofautumn
Another interesting area of program semantics that hasn’t seen much attention
afaik is in the design of languages whose programs form some algebraic
structure.

My area of research is concatenative programming, where concatenation of two
programs denotes the composition of those programs, and the empty program is
the identity function (on the “program state”, which is usually a stack). That
means that the syntax and semantics of the language both form monoids, and
there’s a homomorphism from the syntax onto the semantics.
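To make the monoid structure concrete, here's a minimal sketch in Haskell (the `push`/`add` words and the integer stack are my own illustrations, not from any particular concatenative language): a program is an endofunction on the stack, concatenation is the `Endo` monoid operation, and the empty program is `mempty`.

```haskell
import Data.Monoid (Endo(..))

type Stack = [Int]

-- A few primitive "words" (illustrative names, not a real language)
push :: Int -> Endo Stack
push n = Endo (n :)

add :: Endo Stack
add = Endo (\xs -> case xs of
  (x : y : rest) -> (x + y) : rest
  _              -> xs)  -- partial in a real language; identity here

-- Concatenating programs is the monoid operation. Endo composes
-- right-to-left, so we reverse to get left-to-right concatenative order.
run :: [Endo Stack] -> Stack -> Stack
run ws = appEndo (mconcat (reverse ws))

main :: IO ()
main = print (run [push 2, push 3, add] [])  -- [5]
```

The homomorphism is visible in `run`: it maps syntactic concatenation (list append) onto semantic composition (`mconcat` of `Endo`).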

Several years ago, I was trying to come up with languages with other more
interesting algebraic structures, so you could apply theorems from those
structures to programs. A particular challenge is a language whose structure
forms a _nontrivial_ ring¹, in particular a Euclidean ring—then you could find
the greatest common divisor of two programs, and because a Euclidean ring is a
unique factorisation domain, you could also factor a program into _prime
subprograms_. I was fascinated by what that might mean and whether it could be
useful.

Unfortunately, it was that “nontrivial” aspect that I could never get past.
The language always ended up insufficiently powerful to express anything of
interest, its algebraic structure was trivial (e.g. there’s only one program),
or it ended up having a weaker structure (e.g. an idempotent semiring). Chris
Pressey also worked on this concept a bunch around 2007, and produced some
results like Cabra² and Burro³, but ran into similar dead ends like Potro⁴.

¹
[https://en.wikipedia.org/wiki/Ring_(mathematics)](https://en.wikipedia.org/wiki/Ring_\(mathematics\))

²
[http://catseye.tc/article/Languages.md#cabra](http://catseye.tc/article/Languages.md#cabra)

³
[http://catseye.tc/article/Languages.md#burro](http://catseye.tc/article/Languages.md#burro)

⁴
[https://github.com/catseye/Chrysoberyl/blob/master/article/L...](https://github.com/catseye/Chrysoberyl/blob/master/article/List%20of%20Unfinished%20Interesting%20Esolangs.md#potro)

~~~
rntz
This algebraic approach to program semantics is basically how categorical
semantics works. The syntactic equational theory of the simply typed lambda
calculus with pairs, for example, is that of the "free" cartesian closed
category. It can be mapped (in a way that respects the cartesian closed
structure) into any cartesian closed category, giving you a semantics.

A monoid can be seen as a one-object category (the monoid elements are the
morphisms on that object, and the monoid operator is morphism composition), so
perhaps concatenative languages have categorical semantics too. (Although it
seems like the semantics of "quote", or whatever [ foo bar baz ] is in Forth,
might make things a little interesting, i.e. require more structure than a
monoid homomorphism. I expect you really want cartesian closed categories, and
quote is probably curry or apply.)

~~~
evincarofautumn
I don’t know how concatenative languages fit into category theory—I would
guess Cartesian closed categories are the place to start, but that’s not
really my area of expertise.

Maybe you can offer some insight if I tell you that the quotation _syntax_ “[
… ]” in a concatenative language corresponds to lambda abstraction:

    
    
        e : a
        ---------------------
        [ e ] : ∀s. s → s × a
    

And the quotation _operator_ “quote”, which takes a value on the stack and
returns it quoted, corresponds to eta-expansion (or lifting lambda abstraction
into a term, if you like) and has the type:

    
    
        quote : ∀sa. s × a → s × (∀t. t → t × a)
    

Generally the standard Turing-complete basis in concatenative calculus is the
following set of combinators:

    
    
        compose : ∀rstu. r × (s → t) × (t → u) → r × (s → u)
        swap : ∀sab. s × a × b → s × b × a
        drop : ∀sa. s × a → s
        dup : ∀sa. s × a → s × a × a
        quote : ∀sa. s × a → s × (∀t. t → t × a)
        apply : ∀st. s × (s → t) → t
    

(Although you can get away with fewer if they’re equivalent in power.)
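For what it's worth, most of this basis can be written down directly in Haskell by encoding the stack `s × a × b` as nested pairs `((s, a), b)`; only the inner ∀ in `quote` needs a newtype wrapper and RankNTypes. A sketch (the encoding and names are my own, not a standard library):

```haskell
{-# LANGUAGE RankNTypes #-}

compose :: ((r, s -> t), t -> u) -> (r, s -> u)
compose ((r, f), g) = (r, g . f)

swap :: ((s, a), b) -> ((s, b), a)
swap ((s, a), b) = ((s, b), a)

drop' :: (s, a) -> s   -- primed to avoid clashing with Prelude.drop
drop' (s, _) = s

dup :: (s, a) -> ((s, a), a)
dup (s, a) = ((s, a), a)

-- The "∀t. t → t × a" inside a pair needs a newtype in Haskell:
newtype Quoted a = Quoted (forall t. t -> (t, a))

quote :: (s, a) -> (s, Quoted a)
quote (s, a) = (s, Quoted (\t -> (t, a)))

apply :: (s, s -> t) -> t
apply (s, f) = f s
```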

The first four correspond to the standard B, C, K, and W combinators from
combinatory logic, which give you all the substructural rules from logic:
swapping is exchange, dropping is weakening, and copying is contraction.
Without swap/drop/dup, it's equivalent to ordered linear lambda calculus,
which can be interpreted in any category (since it's just B and I); what are
the categorical equivalents of linear (BC), affine (BCK), ordered (BKW), and
relevant (BCW) logics?

------
vignesh_m
This is almost identical to the study of combinatorial species
([https://en.wikipedia.org/wiki/Combinatorial_species](https://en.wikipedia.org/wiki/Combinatorial_species))

------
carapace
If you like this you'll want to look at "semiring programming" (just search on
that phrase) and Categorical Programming, e.g.: "Compiling to categories"
Conal Elliott: [http://conal.net/papers/compiling-to-categories/](http://conal.net/papers/compiling-to-categories/)

~~~
etatoby
I found the paper linked from the article very interesting. It explores one
possible meaning of negative and rational types. I'm still going through it:

[https://www.cs.indiana.edu/~sabry/papers/rational.pdf](https://www.cs.indiana.edu/~sabry/papers/rational.pdf)

------
rfw
Nat = Nat + 1 does end up being meaningful: since ultimately the numbers that
fall out (e.g. Bool = 2) end up describing the sizes of sets, an
interpretation of Nat is some infinite cardinal (aleph-null), which is the
size of the set of natural numbers.
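Concretely, the equation can just be read off the Peano definition; a sketch in Haskell:

```haskell
-- Nat = 1 + Nat: a natural is either Zero (the "1" term) or the
-- successor of another natural (the "Nat" term). No finite cardinality
-- satisfies |Nat| = |Nat| + 1, but aleph-null does.
data Nat = Zero | Succ Nat

toInt :: Nat -> Int
toInt Zero     = 0
toInt (Succ n) = 1 + toInt n
```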

~~~
firethief
That makes sense, but this case shows that arbitrary algebraic
simplifications apparently aren't allowed, because otherwise we could
subtract Nat from each side and get into trouble.

------
arithma
A cool aspect is how this algebra/calculus sheds light on the fact that maps
(dictionaries/associative arrays) have the same cardinality as functions, and
that in many ways they are the same thing.
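A sketch of the correspondence in Haskell (the function names here are made up for illustration): the map-to-function direction is total, while the reverse needs an explicit enumeration of the domain, since Haskell functions can't be inspected.

```haskell
import qualified Data.Map as Map

-- Any finite map is a partial function via lookup.
toFunction :: Ord k => Map.Map k v -> (k -> Maybe v)
toFunction m = \k -> Map.lookup k m

-- Going back requires enumerating the keys we care about; over a
-- finite domain the two views carry exactly the same information.
fromFunction :: Ord k => [k] -> (k -> Maybe v) -> Map.Map k v
fromFunction domain f =
  Map.fromList [ (k, v) | k <- domain, Just v <- [f k] ]
```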

~~~
notduncansmith
In Clojure, maps can be called as functions (equivalent to `get`). This
isomorphism is also reflected in K.

------
benrbray
I once tried to take the Taylor series idea further and define the "cosine" of
a type as the Taylor expansion of cos(x), but didn't get anywhere with it.

Can you do anything else weird with ADTs?

~~~
abecedarius
Huh. Thinking about this example I'm confused, because the power series for
exp(x) looks like the type for unordered lists of x: e.g. the term for lists
of length three is x^3, and if you ignore order you get x^3/3!. The cosine is
almost the same as the even terms of this series, but with a minus sign on
every other term.

What confuses me (besides what to make of the minus sign) is that the type of
sets of x ought to be 2^x -- that is, x -> Bool -- because it's isomorphic to
the characteristic function of a set, and a function type is an exponential
([https://bartoszmilewski.com/2015/03/13/function-types/](https://bartoszmilewski.com/2015/03/13/function-types/)). But 2^x
differs from exp(x) by a constant factor. So I seem to be messing with
surface-level analogies without real understanding.

~~~
etatoby
You're comparing unordered lists of unlimited length with subsets.

To define a subset of x, you simply choose whether any element is present or
not:

    
    
        type Set x = x → Bool
    

Cardinality: 2^x

Defining an unordered list (of unlimited length, with possible repetitions)
with algebraic data types is harder, if it's possible at all, especially once
you consider what "unordered" should mean in the face of repetitions. I
wouldn't be surprised if it were impossible, or if the expansion turned out
to have a strange cardinality such as e^x.

~~~
abecedarius
Thanks, I forgot about repetitions of the same element. How could I miss that?

------
Sharlin
One thing I’ve been wondering is how to interpret the function types `Void ->
a` and `a -> Void`, where Void is the uninhabited type (unique up to
isomorphism). The number of inhabitants of those types should be |a|^0 = 1 and
0^|a| = 0, respectively, but what does that mean? In the real world, functions
of type a -> Void certainly exist, namely those that diverge (loop forever or
halt via some non-local means).

~~~
hiker
`Void`, being the uninhabited type, stands for a false proposition in light
of the Curry-Howard isomorphism.

`a -> Void` gets interpreted as "not a", or "from a follows contradiction", or
equivalently "a is uninhabited". Combinatorially it's `0 ^ a`, which for
non-empty a is zero but equals 1 when a is empty (0^0 = 1). In other words,
there are no functions of type `a -> Void` for non-empty a, and there's
exactly one such function for uninhabited a (id :: Void -> Void).

`Void -> a` is interpreted "from falsehood, anything (follows)"
[https://en.wikipedia.org/wiki/Principle_of_explosion](https://en.wikipedia.org/wiki/Principle_of_explosion).
Combinatorially a^0 = 1 for all a so there's exactly one such function. An
easy way to define it is by induction on Void (which has no cases and you're
done).
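For what it's worth, this induction-with-no-cases is exactly `absurd` from `Data.Void` in base; a sketch with a locally defined `Void`:

```haskell
{-# LANGUAGE EmptyCase #-}

-- The uninhabited type: no constructors at all.
data Void

-- "From falsehood, anything follows": the unique function Void -> a,
-- written as a case expression with no branches, since Void has no cases.
absurd :: Void -> a
absurd v = case v of {}
```

Since no value of `Void` exists, `absurd` is never actually applied at runtime; it only ever appears in dead branches, e.g. `map absurd ([] :: [Void]) :: [Int]` typechecks and produces the empty list.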

~~~
uryga
Yeah, Void makes for great inductive definitions ;)

------
tromp
Very neat insights into data type differentiation! Minor nitpick: the article
uses

    
    
        data Unit = Unit
    

but the standard Haskell Prelude already has the empty tuple type (), with
its single element (), which functions as the standard unit type.

------
hyperpallium
Does distributivity hold? _a x (b + c)_ = _a x b + a x c_

~~~
evincarofautumn
Up to isomorphism, yes. That is, you can write a pair of functions that are
mutual inverses that witness the fact that those two types are isomorphic:

    
    
        distl :: (a, Either b c) -> Either (a, b) (a, c)
        distl (a, b_c) = case b_c of
          Left b -> Left (a, b)
          Right c -> Right (a, c)
    
        factl :: Either (a, b) (a, c) -> (a, Either b c)
        factl ab_ac = case ab_ac of
          Left (a, b) -> (a, Left b)
          Right (a, c) -> (a, Right c)
    

With the “TypeOperators” feature enabled (and “UnicodeSyntax” for pretty
arrows and such, because why not), you can write it more literally:

    
    
        type (×) a b = (a, b)
        infixl 7 ×
    
        type (+) a b = Either a b
        infixl 6 +
    
        distl ∷ a × (b + c) → a × b + a × c
        factl ∷ a × b + a × c → a × (b + c)

------
andrzejsz
What happened to Code Words? There are no new issues.

