
Algebraic Data Types: Things I wish someone had explained about FP - jrsinclair
https://jrsinclair.com/articles/2019/algebraic-data-types-what-i-wish-someone-had-explained-about-functional-programming
======
tel
While I really like these blog posts for keeping things light and intuitive,
they're also sprinkled with claims that go a bit too far.

> Arrays, Objects, Maps, WeakMaps, and Sets are all algebraic data types

This isn't really true on a couple of levels. Algebraic Data Types are
described through a small set of composition (data) or decomposition (codata)
operations. Arrays, Maps, and Objects might be explainable in this way
(though, really, we'd just be modeling them with ADTs). WeakMaps and Sets
really cross the line, though. WeakMaps, by their nature, need to appeal to
something much more than just ADTs, and Sets require "quotienting" to identify
sets that contain the same elements built up in different ways. You can build
the "raw stuff" from ADTs, but you won't have something that really behaves
much like these types at all until you add in something extra.

The emphasis on sum types is huge, but it leaves out something critical:
"JavaScript doesn’t have a lot of built-in features for sum types" is true of
user-defined sum types (and related to it just not having types altogether),
but it has great support for a few special sum types: integers, booleans,
characters! These sum types are important to talk about because they help make
clear that we already work with sum types all the time! Our basis for
distinction and choice is built atop them in any language.

Finally, while products and sums get you some wonderful modeling capabilities,
the place where Algebraic Data Types really shine is when you add two further
capabilities: function types (exponentials) and recursion (fixpoint, which
doesn't really map to high school algebra). Those are clearly built into
JavaScript too and would be great to discuss.
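
As a rough illustration of those four ingredients (sums, products, function
types, recursion), here is a minimal Haskell sketch. Every name in it
(`Answer`, `Point`, `Predicate`, `List`, `keep`) is made up for this example
and does not come from the article.

```haskell
-- Illustrative names only; none of these come from the article.

-- A sum type: a value is exactly one of the listed alternatives.
data Answer = Yes | No
  deriving (Show, Eq)

-- A product type: a value carries all of its fields at once.
data Point = Point Double Double
  deriving (Show, Eq)

-- A function type (an "exponential") used as a field.
newtype Predicate a = Predicate (a -> Answer)

-- A recursive type: the type being defined appears in its own definition.
data List a = Nil | Cons a (List a)
  deriving (Show, Eq)

-- Keep only the elements for which the predicate answers Yes.
keep :: Predicate a -> List a -> List a
keep _ Nil = Nil
keep p@(Predicate f) (Cons x xs) =
  case f x of
    Yes -> Cons x (keep p xs)
    No  -> keep p xs
```

`Answer` has the same shape as a boolean, which is the sense in which booleans
are already a sum type.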

~~~
AnimalMuppet
In what sense are integers and booleans sum types? What is the definition of
sum types that booleans fit? Is it just that it can be true or false, and
because there's the "or", it's a sum type?

~~~
naasking
> In what sense are integers and booleans sum types?

    Natural = Zero | Successor(Natural)
    Bool = True | False

The mapping from Natural to Integer is standard.

~~~
smabie
Assuming non-fixed width integers, how can you have an infinite sum type?

~~~
tel
There’s nothing wrong with an infinite sum, in principle. You can’t usually
define them unless you can use recursion. In the case of integers you can do

    data Int = Positive Nat | Negative Nat
    data Nat = Zero | Succ Nat

~~~
jolmg
Though, that has two zeros, a positive one and a negative one, but zero should
be neither. To avoid that, you can do:

    data Int = Zero | NonZero Sign PositiveInt
    data PositiveInt = One | Succ PositiveInt
    data Sign = Positive | Negative
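
To see that this encoding has exactly one representation per integer, it may
help to write the conversions to and from the built-in `Integer`. This is only
a sketch: the type is renamed to `Z` so it does not clash with the Prelude's
`Int`, and the helper names (`positiveToInteger`, `toInteger'`,
`fromInteger'`) are hypothetical.

```haskell
-- Same shape as the definitions above; the type is renamed to Z here
-- (a hypothetical name) so it does not clash with the Prelude's Int.
data Z = Zero | NonZero Sign PositiveInt
  deriving (Show, Eq)

data PositiveInt = One | Succ PositiveInt
  deriving (Show, Eq)

data Sign = Positive | Negative
  deriving (Show, Eq)

-- Interpret a PositiveInt as an ordinary Integer (always >= 1).
positiveToInteger :: PositiveInt -> Integer
positiveToInteger One      = 1
positiveToInteger (Succ n) = 1 + positiveToInteger n

-- Interpret a Z as an ordinary Integer.
toInteger' :: Z -> Integer
toInteger' Zero                 = 0
toInteger' (NonZero Positive n) = positiveToInteger n
toInteger' (NonZero Negative n) = negate (positiveToInteger n)

-- Go the other way; each Integer has exactly one representation.
fromInteger' :: Integer -> Z
fromInteger' 0 = Zero
fromInteger' k
  | k > 0     = NonZero Positive (positive k)
  | otherwise = NonZero Negative (positive (negate k))
  where
    -- Only ever called with an argument >= 1.
    positive 1 = One
    positive m = Succ (positive (m - 1))
```

Because `NonZero` can only wrap a `PositiveInt`, there is no way to write a
"negative zero" in this representation.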

~~~
tel
Good point.

------
Jeff_Brown
When I found sum types (in Haskell) I could not believe they had not been part
of any prior language I'd learned. You can kind of fake them with inheritance,
but it's awkward: you don't get totality checking, and you can't close the
set of alternatives, so a new descendant added later can break things.
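
A small sketch of what the totality checking looks like in GHC, assuming the
`-Wincomplete-patterns` warning is enabled (it is also part of `-Wall`). The
`Shape` type and `area` function are hypothetical.

```haskell
{-# OPTIONS_GHC -Wincomplete-patterns #-}

-- Hypothetical example type: the set of alternatives is closed.
data Shape
  = Circle Double            -- radius
  | Rectangle Double Double  -- width, height
  deriving (Show)

-- Every constructor of Shape is handled, so GHC reports nothing here.
area :: Shape -> Double
area (Circle r)      = pi * r * r
area (Rectangle w h) = w * h
```

If a third constructor were later added to `Shape` without a matching clause in
`area`, GHC would warn that the patterns are non-exhaustive, which is the check
the inheritance-based encoding does not give you.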

~~~
zozbot234
> I could not believe they had not been part of any prior language I'd
> learned.

Huh? They're part of Pascal (known as variant records) and plenty of other
languages besides. It's mostly C that lacks them, and even then you're just
expected to implement them yourself, for maximum efficiency.

~~~
akavi
They're not present in the dominant statically typed languages
(C/C++/C#/Java/Go) and not really a thing in dynamic languages generally, and
so the vast majority of programmers never encounter the concept.

~~~
thestoicattack
C++ has std::variant, but that's pretty new.

~~~
Sniffnoy
My understanding is that's not really a sum type, though -- e.g., you couldn't
use it to add a type to itself, only to add _distinct_ types to each other.

~~~
edflsafoiewq
No, you can add a type to itself (that is a pain in the ass though). You might
be thinking of the fact that it may not hold a value (e.g. if an exception is
thrown while moving into it), but they made it as close to a sum as they could
given the nature of C++.
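
For comparison, this is what "adding a type to itself" looks like in Haskell
(this is not C++ code, only an analogue; the names are hypothetical). In C++
the two same-typed alternatives of a `std::variant<int, int>` have to be
addressed by index rather than by type, which is where the awkwardness comes
from.

```haskell
-- Either Int Int is the sum of Int with itself: the constructor (the tag)
-- records which side a value came from, even though both sides hold an Int.
fromLeftSide, fromRightSide :: Either Int Int
fromLeftSide  = Left 3
fromRightSide = Right 3

-- The two sides stay distinguishable when pattern matching.
describe :: Either Int Int -> String
describe (Left n)  = "left " ++ show n
describe (Right n) = "right " ++ show n
```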

------
lidHanteyk
Where does the author insert themselves into their imaginary social
interactions? Are they the sneering functional programmer?

In particular, I wonder about their characterization of sum types. I had to
search to make sure, but they never say the word "union". This is curious,
because they do understand "tag" and "tagged" as words used to describe sum-
type behavior, and because the classic way to explain sum types to
"imperative" C programmers is as "tagged unions", or union types with a tag
value that explains which of the union's constructors is present. C or Java
programmers would instead prefer an analogy with subtyping, which the article
proceeds to use, but by overlooking the tagged-union analogy, the author is
actually missing out on both a useful analogy for reaching out to imperative
programmers, and also how low-level implementations of functional programming
languages usually implement sum types.

Pattern-matching is probably the killer part of ADTs, but only two paragraphs
were spent upon the concept. Worse, it's implied that pattern-matching is tied
deeply both to ADTs and to functional programming, when it also occurs outside
of those contexts. Languages like Python 3, Ruby, Racket, and Swift fall
outside of the "functional programming" tradition but still have interesting
pattern-matching abilities.

The author _still_ doesn't know Haskell. Want an effectful loop?
Control.Monad.Loops [0] contains many prebuilt loops, and they're written in
standard Haskell. The ST monad [1] provides imperative variables and mutation.
So does IO [2], if one insists on avoiding GHC for some reason. I agree that
it's good to learn an ML, and Haskell is a fine choice, but the author needs
to realize how drenched in Haskell memes they currently are, and to either
learn _more_ of Haskell _itself_ , or to take a break from it for a while.

[0] [https://hackage.haskell.org/package/monad-loops](https://hackage.haskell.org/package/monad-loops)

[1] [https://wiki.haskell.org/Monad/ST](https://wiki.haskell.org/Monad/ST)

[2] [https://hackage.haskell.org/package/base/docs/Data-IORef.html](https://hackage.haskell.org/package/base/docs/Data-IORef.html)
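
As a minimal sketch of the ST point, here is an imperative-style loop with a
mutable accumulator, using `runST`, `newSTRef`, `modifySTRef'`, and
`readSTRef` from `base`. The function name `sumTo` is made up for the example.

```haskell
import Control.Monad (forM_)
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Sum the integers 1..n with a mutable accumulator and an explicit loop.
-- The mutation is local: sumTo is still a pure function from the outside.
sumTo :: Int -> Int
sumTo n = runST $ do
  acc <- newSTRef 0
  forM_ [1 .. n] $ \i ->
    modifySTRef' acc (+ i)
  readSTRef acc
```

For example, `sumTo 10` evaluates to `55`.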

~~~
madsbuch
My thesis is that self-proclaimed functional programmers spend more time on
mathematical structures than on implementation details. Hence "sum type" and
not "tagged union" (how the target evaluator distinguishes the cases should
not name the type).

~~~
rectang
Software developers are allowed to have discussions about "integer" types
without being constantly interrupted by number theorists to remind them that
those aren't really integers.

Perhaps one day we will be able to have similar discussions about sum types
without constant interjections by functional programmers.

~~~
madsbuch
Yes indeed, and number theorists should be allowed to do likewise.

------
privethedge
There are also union types. And sum types can be implemented in terms of union
types:
[https://waleedkhan.name/blog/union-vs-sum-types/](https://waleedkhan.name/blog/union-vs-sum-types/)

------
didibus
A product type is just a closed group of values where you know in advance what
will be grouped and what type the values are going to have that are part of
the group.

A class in Java for example is a product type. As it describes a group of
fields and their types. Same for a struct in C++.

Where I disagree with the article is on JavaScript: I'd say it does not have
product types, because you do not have a definition of such a group of values
in advance. A JS object does not list what values it groups and what type they
each will have. A JS class comes closer by listing what values it will group,
but not their types. Also, product types must be closed, and a JS class
defines an open group: at runtime the object could group more than what it
specified, unless explicitly frozen. You could say a frozen class defines a
product of types where all types are the Object type and thus can be of any
value, but that's a stretch, because product types are only as useful as the
constraints they place on the possible values of the type.

A sum type is just a known list of possible types a value can take. That's the
one people aren't as familiar with, as most popular languages don't have a
construct like it. It lets you say, this variable can be a String or an Int,
but not both.

Often people say product type is AND while sum type is XOR. This variable
contains a String AND an Int. They will both be there. So it contains both.
That's a product type. This variable can contain a String XOR an Int, they
can't be both contained, it is only one or the other. That's a sum type.
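
The AND/XOR distinction above can be sketched in Haskell with a pair and with
`Either`. The type synonyms and function names are hypothetical, chosen only
for this example.

```haskell
-- Product: a value holds a String AND an Int at the same time.
type NameAndAge = (String, Int)

-- Sum: a value holds a String OR an Int, never both.
type NameOrAge = Either String Int

-- With a product, both fields are always available.
showBoth :: NameAndAge -> String
showBoth (name, age) = name ++ " is " ++ show age

-- With a sum, you must first find out which alternative you were given.
showEither :: NameOrAge -> String
showEither (Left name) = "name: " ++ name
showEither (Right age) = "age: " ++ show age
```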

You can think in Java that all types are a sum type, they are either null or
of their defined type. But null isn't a real type, more that it is an
allowable value of all types. So it's a bit of a stretch.

Like the article said, the best way to mimic them in popular languages is by
extending an abstract class. Each type the variable can be will be a child of
the parent. Thus a variable of type parent can be one and only one of its
children. It's not full featured: you can't make existing types extend from
the parent, so you can't arbitrarily define sum types using any available
type. You also often can't list all the possible subtypes of the parent;
depending on the reflection capabilities, finding everything that extends a
type isn't always feasible. Also, static type checkers don't treat these as
sum types, so they won't tell you that you forgot to handle the case where the
parent is one of its child types. Etc.

------
chapium
[https://nrinaudo.github.io/scala-best-practices/definitions/adt.html](https://nrinaudo.github.io/scala-best-practices/definitions/adt.html)

There is a lot of Scala reference material regarding ADTs.

------
hackermailman
There's an old book, Functional Programming: Practice and Theory by Bruce J.
MacLennan, to learn this if interested. It uses standard math notation instead
of being restricted to teaching an existing programming language.

------
tyri_kai_psomi
I found the best nugget of this entire article to be at the very end:

> Keep persevering. Don’t give up. If you find that a bunch of blog posts
> don’t explain things in a way that makes sense, skip them. Keep looking
> until you find some well-written ones. The same goes for courses and
> tutorials. Everyone comes from different backgrounds. What makes sense for
> someone else might not work for you. And that’s OK.

I feel like this is advice everyone should hear, especially junior or mid
level developers.

------
didibus
I actually think this is a great little series of articles.

Something I've found confusing with algebraic data types is the fact that they
are not mathematical concepts. They originate as a programming construct and
were invented as part of a programming language. They were then given a name
that makes them sound mathematical, but it was just an attempt at giving some
kind of metaphorical meaning to them. It's the same as if I named a search
engine Odin, because Odin was the god of wisdom: if you thought it had anything
to do with Norse mythology, you'd similarly get confused.

People tried to find correspondence for them with mathematical theories. Thus,
in Set theory they could correspond to Disjoint Unions of Cartesian Products.
Or in category theory, they could correspond to Coproducts of products.

I guess you could say they came out of type theory, but I'm not sure of the
history here fully. Did the extensions to the simply typed lambda calculus
first come out of the theory, or was a theory retroactively built from the
programming construct?

At any rate, my point is, these mathematical correspondences are very
confusing. And I think it's wrong to try and understand these things from the
mathematical angle unless you're a mathematician trained in one of the
corresponding mathematical theories.

As such I think the article does a good job.

~~~
cbdumas
While I don't know the history of the term "algebraic data type", I do know
that the algebraic structure of ADTs is much deeper than what this article
presented. For instance this structure supports some limited notion of a
derivative[1]! And purely algebraic manipulation of ADT "equations" can
actually be used to generate some non-obvious results [2]. So I think it's
probably wrong to say that ADTs were just "given a name that sounds
mathematical".

1\. [https://codewords.recurse.com/issues/three/algebra-and-calculus-of-algebraic-data-types](https://codewords.recurse.com/issues/three/algebra-and-calculus-of-algebraic-data-types)

2\. [http://www.math.lsa.umich.edu/~ablass/7trees.pdf](http://www.math.lsa.umich.edu/~ablass/7trees.pdf)
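
The most basic layer of that algebra is counting inhabitants: a sum adds, a
product multiplies, and a function type exponentiates. A small sketch that
checks this by listing the values, using `Bool` (2 values) and `Ordering`
(3 values) from the Prelude; the list names are made up for the example.

```haskell
-- Count inhabitants by listing them. Bool has 2 values, Ordering has 3.
bools :: [Bool]
bools = [minBound .. maxBound]

orderings :: [Ordering]
orderings = [minBound .. maxBound]

-- Sum: 2 + 3 = 5 values.
sumValues :: [Either Bool Ordering]
sumValues = map Left bools ++ map Right orderings

-- Product: 2 * 3 = 6 values.
productValues :: [(Bool, Ordering)]
productValues = [(b, o) | b <- bools, o <- orderings]

-- Functions Bool -> Ordering: 3 ^ 2 = 9, one per choice of result
-- for False and for True.
functionValues :: [Bool -> Ordering]
functionValues = [\b -> if b then t else f | f <- orderings, t <- orderings]
```

The lengths of the three lists are 5, 6, and 9 respectively.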

~~~
didibus
That's all true from the fact that there are real correspondences with math
theories. But like, that's the whole point of applied math no? To find
correspondences with other things so that we can then leverage and apply the
math principles to that thing?

I'm sure there are quite a few things in programming languages where you can
use such a correspondence to your advantage, not just ADTs. But we don't teach
them by starting from the mathematical correspondence and working back.

To put it more simply, most programming isn't taught by first teaching
learners about theoretical computer science. Generally it happens either in
parallel or you learn the theory after the fact. But for some reason, when it
comes to functional programming it seems most teachers want to start with
teaching you the theory.

And my above point is that, even historically, it is not always true that
theory came first. Oftentimes, constructs were invented and used in practice
while solving concrete problems, and a theory around them was developed later.
In the case of ADTs, I'm not 100% sure which came first, but I know they were
first implemented in the programming language Hope. Not sure if the theory was
there prior or not.

~~~
cbdumas
I think you might be right in this case that practical use preceded
theoretical understanding but that's not really my point. Software development
certainly has examples of pseudo-math (for example there was a trend a few
years back to write JavaScript that could run on both the server and the
client. For reasons passing understanding, this was called "isomorphic" by
some practitioners). All I meant was that ADTs are not that.

~~~
didibus
Ah I see. I believe you are correct. The correspondence seems to hold quite
well, and even the name algebraic was pretty aptly chosen in this instance.

I'd be curious to know if there's an original paper on ADTs somewhere about
it.

------
ww520
Union is a better example than Enum to explain what algebraic data type is.
It’s a natural progression from Struct as product type to Union as sum type.

------
leitasat
It should have been called 'How to implement Enum in JavaScript'.

------
jplayer01
Okay, maybe the design isn't great. Can we talk about algebraic data types
now? I come to HN for interesting discussion about interesting topics. I start
to wonder why I bother when the only comments on a post like this, which
actually concerns something that interests me, are complaints about design.

~~~
scott_s
(If you think comments have gone off-the-rails, I find the best solution is to
downvote the off-topic top level comments, and then make a substantive on-
topic top level comment.)

~~~
jplayer01
I don't have anything interesting to say about ADTs. Or even many topics I
read about. I'm mainly here to learn from others.

~~~
scott_s
Then just downvote, and come back later.

~~~
jplayer01
I felt there was value in pointing out yet another comment about the design
was unnecessary. Also, I was annoyed.

~~~
scott_s
I contend that not only was there no value, it _added_ to the problem. Hence
my suggestion.

