
Algebraic Structures: Things I wish someone had explained about FP - jrsinclair
https://jrsinclair.com/articles/2019/algebraic-structures-what-i-wish-someone-had-explained-about-functional-programming/
======
evmar
This post shows a Haskell-ish definition of Functor, then attempts to show the
same thing in TypeScript.

    
    
        interface Functor<A> {
            map<B>(f: (a: A) => B): Functor<B>;
        }
    

But the TypeScript definition loses something important: the point of a
Functor is that you get back the _same_ data type -- there's one `f` in the
Haskell defn both in the argument and return type, while this TS definition
can give you back any random data type. E.g. the implementation of Array.map
could give you back an Either.
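
To make this concrete, here is a hedged sketch (my own illustration, not from
the article) with a made-up `Box` type standing in for Array and a bare-bones
`Left` standing in for Either's failure branch. Both satisfy the interface, and
nothing stops `Box.map` from handing back a `Left`:

    
    
        // The interface from the post.
        interface Functor<A> {
            map<B>(f: (a: A) => B): Functor<B>;
        }
    
        // An Either-style "left" branch: mapping it ignores f.
        class Left<A> implements Functor<A> {
            constructor(readonly error: string) {}
            map<B>(_f: (a: A) => B): Functor<B> {
                return new Left<B>(this.error);
            }
        }
    
        // A container whose map returns a Left. The compiler is happy,
        // because the interface only promises "some Functor".
        class Box<A> implements Functor<A> {
            constructor(readonly value: A) {}
            map<B>(_f: (a: A) => B): Functor<B> {
                return new Left<B>("not a Box anymore");
            }
        }
    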

I don't mean this comment to be a random nitpick of the post. Trying to think
these things through is hard, and trying to use things you know to help
understand the new idea is not unreasonable. But in particular with PL each
thing has so much associated baggage (here in TS, subtyping) that reasoning
"by metaphor" often means you end up missing the critical point.

~~~
lacampbell
Haven't had a coffee yet so go easy on me - doesn't this solve your issue?

    
    
        interface Functor<A> {
            map<B>(f: (a: A) => A): Functor<A>;
        }
    

You map over an option, you get an option. You map over an either, you get an
either. etc etc.

~~~
tel
That gets closer to the problem, but the solution is further away. You don't
want to return "Functor" (which is like a vtable, the interface itself) but
instead the thing itself which is implementing the Functor interface.

So you end up with

    
    
        interface Functor<A> {
            map<B>(f: (a: A) => B): Self<B>;
        }
    

where `Self` needs to recognize that the type being defined to implement this
interface has a "slot". This tends to make things tough.
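
One way to fake that slot in TypeScript is a type-level registry, roughly along
the lines of the higher-kinded-type encoding used by libraries like fp-ts (a
sketch only; `TypeRegistry`, `Tag`, and `Kind` are names made up for this
comment):

    
    
        // Map a string tag to "that type constructor applied to A".
        interface TypeRegistry<A> {
            Maybe: Maybe<A>;
            List: A[];
        }
        type Tag = keyof TypeRegistry<unknown>;
        type Kind<F extends Tag, A> = TypeRegistry<A>[F];
    
        // The functor is indexed by the tag, so map on the "Maybe" functor
        // must give back a Maybe, not just any Functor.
        interface Functor<F extends Tag> {
            map<A, B>(fa: Kind<F, A>, f: (a: A) => B): Kind<F, B>;
        }
    
        type Maybe<A> = { tag: "none" } | { tag: "some"; value: A };
    
        const maybeFunctor: Functor<"Maybe"> = {
            map<A, B>(fa: Maybe<A>, f: (a: A) => B): Maybe<B> {
                return fa.tag === "some"
                    ? { tag: "some", value: f(fa.value) }
                    : { tag: "none" };
            },
        };
    

The price is that `map` now lives on a separate dictionary value rather than as
a method on the container itself, which is part of what makes this tough in
practice.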

~~~
lacampbell
You've lost me. Returning an interface isn't returning a concrete thing. It's
returning the thing that implements the interface

I'm a very low level functional programmer. I'm big on immutability, big on
not using loops and instead using map/flatMap/filter/fold, I tend to roll my
own either and option implementations when they don't exist because it's the
tidiest way of handling errors I've come across, etc etc. But when it comes to
stuff like functors I don't get what it's buying me. What interesting stuff
can I do if I know that both an option and a list can be mapped over?

I really need to look more deeply into it at some stage. I might be missing
out on some powerful tools. Or it might be a bunch of stuff that's
theoretically interesting but practically useless.

~~~
rockostrich
I think you may have misunderstood their point. They're just saying that
specifying `Functor` in the return type of `map` isn't enough to resolve the
issue because you could still have a case of `List#map` returning a `Maybe` or
`Either` since they are all implementations of `Functor`.

This gets handled by the Cats library in Scala
([https://typelevel.org/cats/typeclasses/functor.html](https://typelevel.org/cats/typeclasses/functor.html))
by making the type constructor the functor is being defined for a parameter of
the `Functor` type class itself.

------
tom_mellior
The
[https://en.wikipedia.org/w/index.php?title=Algebraic_structu...](https://en.wikipedia.org/w/index.php?title=Algebraic_structure&oldid=898454436132)
link in footnote 1 is broken, it takes me to a page saying 'The revision
#898454436132 of the page named "Algebraic structure" does not exist.'.

As for the quoted definition, I agree that if this is your first exposure to
abstract algebra, you'll need a moment to unpack the sentence, follow some
links, and especially, read on. But that's not just Wikipedia; the blog post
also takes considerably more than one sentence to explain what it is trying to
say. You can't explain everything in one sentence.

For whatever it's worth, as a data point relating to a recent discussion on
whether a university CS education makes you a better programmer or not: We
literally started learning about algebraic structures in the first math class
on the first morning of the first year of university. If you're going to
program in a setting where algebraic structures (or data types!) are relevant,
this university knowledge will help.

Finally, this blog looks very nice visually, but the "broken typewriter" font
effect makes the code examples much too hard to read. It would be great if the
ribbon in that typewriter could be replaced.

~~~
jordigh
I think Wikipedia is wrong here. I never knew that anyone referred to any
generic set with operations as an "algebra". That's a magma! An algebra has
much more structure than any generic set with operations.

Does this really happen? Do people really say generic "algebra" for any set
with operations?

(And as an aside, I don't like the "abstract algebra" moniker. It sounds so
immature and undergraddy. There isn't an ordinary algebra and an abstract
algebra. It's all just algebra.)

~~~
tom_mellior
> I never knew that anyone referred to any generic set with operations as an
> "algebra". That's a magma!

If I have a structure S with an associative operation, and another structure G
with an associative operation and a neutral element, I will say that S and G
are different algebras, not "different magmas". Others looking at S or G will
not ask "oh, what kind of magma do you have there", they will ask what kind of
_algebra_.

So... Yes, these are both (special cases of) magmas, but the general term used
for them is "algebra" or "algebraic structure". Don't you agree?

~~~
jordigh
But you added extra structure: associativity. Cohn's definition doesn't. It
just says "set with operations" (well, finitary operations, but that's kind of
always implicit in the definition of "operation").

So you're saying people shorten the phrase "algebraic structure" to "algebra";
this hasn't been my experience.

~~~
tom_mellior
> But you added extra structure: associativity.

Of course I added extra structure since I wanted to make a point about
_different kinds_ of algebraic structures which are all subsumed by the term
"algebraic structure" or "algebra". And it's only possible to distinguish
kinds of algebraic structures by differences in structure.

But adding extra structure in one example doesn't mean that I somehow exclude
magmas from the definition. Here is the example again, extended to include
a component with no extra structure:

If I have a structure M with nothing but an operation, a structure S with
an associative operation, and another structure G with an associative
operation and a neutral element, I will say that M and S and G are different
algebras, not "different magmas". Others looking at M or S or G will not ask
"oh, what kind of magma do you have there", they will ask what kind of
algebra.

Of course this extension by M doesn't change anything about the validity of
the example. Magmas are just as included in the term "algebraic structure" as
semigroups, groups, rings, and fields are.

> So you're saying people shorten the phrase "algebraic structure" to
> "algebra"; this hasn't been my experience.

<shrug> It has been mine. Wikipedia has lots of uses of the phrase "the
algebra of":
[https://en.wikipedia.org/w/index.php?search=%22the+algebra+o...](https://en.wikipedia.org/w/index.php?search=%22the+algebra+of%22&title=Special%3ASearch&go=Go&ns0=1),
always meaning something like "the algebraic structure of set X with
operations f, g, and h".

------
tel
Calling these things algebraic structures might help you win some confidence,
but the communities which talk about algebraic structures aren't going to be
helpful for learning how to program with these things. Mathematicians love
algebraic structures (and non-algebraic ones).

The advantage really plays out more with the first-order structures, too.
Things like monoid, semiring, torsor, group. You also have nice ones in more
standard data structures: a balanced tree is an excellent example of a
structure where the laws exist to cut out unbalanced trees.

In my opinion, there are two things to study here:

First, the practice of thinking about abstract structures that apply to
concrete data. Here the habit is to think of a type (or multiple interrelated
ones) which offers some set of "constructors" that create values of the type
or augment existing ones (gluing new items into a tree, merging two trees,
etc.) and some set of "laws" which all values of that type must uphold.

It turns out that you can do a lot of analysis of the behavior of these
structures in the abstract and then apply it wholesale throughout programs.
Many concrete values you work with are combinations of multiple structures in
natural ways. Sometimes you can replace whole APIs of hundreds of calls, each
named uniquely for that particular implementation of that particular type,
with just a small set of nicely orthogonal methods with completely standard
names.
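
To sketch that in code (my own example, not the article's), here is one
`Monoid` interface with its laws stated as comments, two instances, and a
single generic `fold` standing in for a pile of type-specific helpers:

    
    
        // A monoid: an associative concat plus an identity element.
        // Laws (not enforced by the types):
        //   concat(x, empty) === x, concat(empty, x) === x,
        //   concat(x, concat(y, z)) === concat(concat(x, y), z)
        interface Monoid<A> {
            empty: A;
            concat(x: A, y: A): A;
        }
    
        const sum: Monoid<number> = { empty: 0, concat: (x, y) => x + y };
        const joined: Monoid<string> = { empty: "", concat: (x, y) => x + y };
    
        // One generic fold instead of bespoke sumNumbers / joinStrings /
        // mergeWhatever helpers, each with its own unique name.
        function fold<A>(m: Monoid<A>, values: A[]): A {
            return values.reduce(m.concat, m.empty);
        }
    
        fold(sum, [1, 2, 3]);        // 6
        fold(joined, ["a", "b"]);    // "ab"
    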

Second, the use of higher order structures like Functor, Applicative, Monad.
These get a LOT of airtime because they're both challenging and offer
important capabilities. But they're also in a lot of senses their own realm of
study. Not only are they developed very uniquely in programming communities
(as opposed to what you'll find if you read about the category theoretic
definitions) but they are also "higher order" in that they involve functions
between types.

This higher-order nature not only makes their equations much more complex, it
also means that to see where they apply (in languages other than Haskell and
its ilk, where purity drives this) you have to get _really_ good at seeing
languages in an abstract fashion. It's a great skill to develop, but it ramps
up the difficulty greatly.

Master the "first order" ones first. Master concrete, interesting types like
Either, Maybe, List (as a source of non-determinism) first. Then come back and
see if you can see how the skills you develop with the "first order"
structures apply to these higher order, computationally minded types.

------
Vosporos
For the beginners among us, I love this blog post / cheatsheet by Julie
Moronuki on the matter of Algebraic Structures:
[https://argumatronic.com/posts/2019-06-21-algebra-cheatsheet...](https://argumatronic.com/posts/2019-06-21-algebra-cheatsheet.html)

------
3PS
> Functor isn’t the only algebraic structure either.

I actually wouldn't call it an algebraic structure at all. A functor really
isn't just a set with some finitary operations. A quick search online [1]
tells me I'm not alone.

[1] [https://www.quora.com/Why-is-functor-considered-to-be-an-alg...](https://www.quora.com/Why-is-functor-considered-to-be-an-algebraic-structure)

~~~
Ezku
I guess this is an easy point of confusion. Are specific instances of functor
algebras, then? (What about f-algebras?) What would be the more appropriate
word for the category theoretical things the author is trying to refer to
here, functor and monad and so on?

~~~
edflsafoiewq
A monad (on Set) is an "algebraic structure" (eg. the notion of a group). An
algebra for that monad is an "algebra" (eg. one individual group).

This is the original use for monads in universal algebra, before they were
interpreted as computational effects. An algebraic structure like a group or
monoid is traditionally given by a signature: a list of what operations it has
and what rules the operations need to follow. You can generalize from a
signature Sig to a monad M. If a is a set, then Ma is "the set of expressions
with constants from a": the set of formal expressions built from elements of a
and the operations in Sig, where two expressions are regarded as equal if
you can manipulate one into the other using the rules from Sig. The list monad
for example corresponds to the signature for monoids.

The monad laws can thus be read as expressing "how to do algebra" at the most
general level (ie. the level that is common to all algebraic structures). I
would gloss them as "the order you evaluate an expression does not matter".
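
As a concrete gloss of that reading for the list monad (my own sketch in
TypeScript, not anything from the article): join is just flattening one layer
of nesting, and the laws say it does not matter in which order you flatten.

    
    
        // The list monad's unit and join.
        const unit = <A>(x: A): A[] => [x];
        const join = <A>(xss: A[][]): A[] => xss.flat();
    
        const xsss = [[[1], [2, 3]], [[4]]];
    
        // Associativity: flattening outside-in or inside-out agrees.
        join(join(xsss));        // [1, 2, 3, 4]
        join(xsss.map(join));    // [1, 2, 3, 4]
    
        // Unit laws: wrapping and then flattening is a no-op.
        join(unit([1, 2]));      // [1, 2]
        join([1, 2].map(unit));  // [1, 2]
    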

------
carapace
I know it's not for everybody, but go back and read Backus' Turing Award
lecture introducing FP: "Can Programming Be Liberated from the von Neumann
Style? A Functional Style and Its Algebra of Programs"
[https://dl.acm.org/ft_gateway.cfm?id=1283933&type=pdf](https://dl.acm.org/ft_gateway.cfm?id=1283933&type=pdf)

The two main points (they're in the title) are eliminating the "von Neumann
bottleneck" between the CPU and RAM, and the _algebra_ of [FP] programs, the
potential to manipulate programs as one manipulates mathematical formulas.
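
A toy rendering of that second point (my own TypeScript paraphrase, not
Backus's FP notation): with combining forms like composition and
"construction", programs satisfy laws you can use to rewrite one program into
another, the way you rearrange a formula.

    
    
        // Two combining forms: composition and "construction"
        // (apply several functions to one input, collect the results).
        const compose = <A, B, C>(f: (b: B) => C, g: (a: A) => B) =>
            (a: A): C => f(g(a));
        const construct = <A, B, C>(f: (a: A) => B, g: (a: A) => C) =>
            (a: A): [B, C] => [f(a), g(a)];
    
        const double = (n: number) => n * 2;
        const square = (n: number) => n * n;
        const inc = (n: number) => n + 1;
    
        // One law of this algebra: construction distributes over composition,
        //   construct(f, g) . h  ===  construct(f . h, g . h)
        // so either side of a program can be substituted for the other.
        compose(construct(double, square), inc)(3);                  // [8, 16]
        construct(compose(double, inc), compose(square, inc))(3);    // [8, 16]
    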

~~~
jandrese
I just skimmed that paper but I don't see a section on how to avoid the
bottleneck when your functional program is trapped on a von Neumann machine.
It seems to me that functional constructs have to be mapped down to structures
that will suffer the von Neumann bottleneck if they are being executed on a
von Neumann machine. But there isn't any apparent discussion of an alternative
machine architecture, just the computer language.

Indeed, one of the complaints you sometimes see about functional programming
is the amount of memory churn it produces on a von Neumann architecture.

~~~
carapace
I'm actually working on that. I've come up with a way to dynamically create
dataflow graphs on banks of hardware interconnected by latching sort-nets and
programmed by a simple, pure, functional, "concatenative" programming language
called Joy.

~~~
jandrese
Didn't the MIT LISP machines achieve this back in the 80s?

------
haolez
I don’t have a need right now to master a new programming paradigm in order to
leverage my business.

However, if I happen to bump into this need, I’d focus first on Logic and
Array-Oriented programming languages. They seem more valuable to my industry
(finance).

For example: Prolog and J.

------
Razengan
Gotta say I love the style/aesthetics of that site.

~~~
mc3
You must have good vision, because it's annoying for me to read.

~~~
andrewnc
I had to turn off the background and change the fonts of the code samples. It
was impossible otherwise.

~~~
lebed2045
I found the code blocks very interesting-looking and authentic, but it's also
true for me that they were somewhat difficult to read.

------
privethedge
> a.map(g).map(f) ≣ a.map(x => f(g(x)))

> But the one on the left will be slower and use a lot more memory.

Is it really true? I mean, GC will clean the intermediate array, won't it? And
the speed won't be significantly slower. It's still linear complexity anyway.

~~~
tom_mellior
Dragging a large data structure through the cache only once rather than twice
can be beneficial. Also, GC will clean the intermediate array, but (depending
on the specifics of the language and the data types involved) it might first
have to make yet another scan through the data structure. So yes, it's linear,
but possibly 2-3x slower.

~~~
privethedge
Then why do I have the following results?

    
    
        const a = [...Array(1000000).keys()];
        const m = 8;
        let leftAvg = .0;
        for(let _ of Array(m)) {
            const t0 = performance.now();
            a.map(Math.tan).map(Math.sin);
            const t1 = performance.now();
            leftAvg += (t1 - t0)/m;
        }
        let rightAvg = .0;
        for(let _ of Array(m)) {
            const t0 = performance.now();
            a.map(x => Math.sin(Math.tan(x)));
            const t1 = performance.now();
            rightAvg += (t1 - t0)/m;
        }
        console.log(leftAvg, rightAvg, leftAvg/rightAvg);
        // JS Firefox 70:  264 360.75 0.7318087318087318
    
    
    
    
        var a = Enumerable.Range(0, 10000000).Select(x => (double)x).ToArray();
        double[] xs1 = null;
        double[] xs2 = null;
        var m = 16;
    
        var leftAvg = .0;
        foreach (var _ in Enumerable.Range(0, m))
        {
            var watch = System.Diagnostics.Stopwatch.StartNew();
            xs1 = a.Select(Math.Tan).ToArray().Select(Math.Sin).ToArray();
            watch.Stop();
            leftAvg += (double)watch.ElapsedMilliseconds / m;
        }
    
        var rightAvg = .0;
        foreach (var _ in Enumerable.Range(0, m))
        {
            var watch = System.Diagnostics.Stopwatch.StartNew();
            xs2 = a.Select(x => Math.Sin(Math.Tan(x))).ToArray();
            watch.Stop();
            rightAvg += (double)watch.ElapsedMilliseconds / m;
        }
    
        Console.WriteLine($"{leftAvg} {rightAvg} {leftAvg / rightAvg}");
        // C# Results: 505.75 602.25 0.839767538397675

~~~
tom_mellior
Hard to tell without more information, and probably more runs. Did the code
run often enough for the JIT compiler to warm up sufficiently? If you're
running in interpreted mode, or only baseline compiled mode, you have other
overheads. Also, and this is admittedly something I should not have glossed
over: The _reading and writing costs_ of fused maps might be sped up by the
factors I mentioned, but in your loops you also have _allocations_, which
have their own costs and can complicate things. And the computation is not
free either, though at sufficiently large sizes the memory accesses should
dominate sin and tan, I think.

It also matters how this code is compiled exactly. The C# version (I know
nothing about C# or how good its compiler is) looks like it must first
allocate some kind of dynamic stream, and only when ToArray() is called can it
allocate the final array, so there might be extra copying. Maybe the compiler
is smart enough to optimize a sequence of arr.Select().ToArray() to allocate a
target array of the size of arr right away, I don't know.

Also, the JavaScript version uses a smaller array than the C# version, is that
on purpose? 1000000 unboxed doubles are only 8 MB, which is not very big: On
the machine I'm typing this on, L3 cache is 6 MB.

My advice would be to run the JavaScript version many times, for many more
than 8 iterations, and with sizes increasing stepwise up to a GB or so. Also
try replacing the maps with preallocated arrays and hand-written loops that
contain only the computations, not the allocations. I know this sounds like
I'm trying to give you homework, which I'm not, but benchmarking is hard, and
there are many factors to take into account.

~~~
tom_mellior
> And the computation is not free either, though at sufficiently large sizes
> the memory accesses should dominate sin and tan, I think.

Looks like I was wrong about this! You might want to retry your experiments
with cheaper operations than sin and tan.

I wrote a little C benchmark to test this more:

    
    
        #include <stdio.h>
        #include <time.h>
        #include <math.h>
    
        extern void sinTanSeparate(double *a, double *b, int n) {
            for (int i = 0; i < n; i++) {
                b[i] = tan(a[i]);
            }
            for (int i = 0; i < n; i++) {
                b[i] = sin(b[i]);
            }
        }
    
        extern void sinTanFused(double *a, double *b, int n) {
            for (int i = 0; i < n; i++) {
                b[i] = sin(tan(a[i]));
            }
        }
    
        #define N (128 * 1024 * 1024)
        #define RUNS 5
    
        double a[N];
        double b[N];
    
        int main(void) {
            clock_t start, end;
    
            printf("will do %d runs over %zu MB of data\n\n",
                   RUNS, sizeof a / (1024 * 1024));
    
            for (int i = 0; i < RUNS; i++) {
                start = clock();
                sinTanSeparate(a, b, N);
                end = clock();
                printf("separate: %f sec\n", ((double) end - start) / CLOCKS_PER_SEC);
            }
    
            printf("\n");
    
            for (int i = 0; i < RUNS; i++) {
                start = clock();
                sinTanFused(a, b, N);
                end = clock();
                printf("fused:    %f sec\n", ((double) end - start) / CLOCKS_PER_SEC);
            }
    
            return 0;
        }
    
    

Compiling this with gcc -O3 gives:

    
    
        will do 5 runs over 1024 MB of data
        
        separate: 1.461349 sec
        separate: 1.020120 sec
        separate: 1.019002 sec
        separate: 1.019888 sec
        separate: 1.018454 sec
        
        fused:    1.014774 sec
        fused:    1.014724 sec
        fused:    1.013895 sec
        fused:    1.016440 sec
        fused:    1.013729 sec
    

So _almost_ no difference, though with enough runs I think this would be
significant. Interestingly, although C is not JIT compiled, even here there is
a "warmup" effect. I guess these are initial page faults or something.

But if we now comment out <math.h> and instead use some cheap "fake"
implementations of sin and tan:

    
    
        // #include <math.h>
        #define tan(x) (x + 1)
        #define sin(x) (x + 2)
    

we get very different behavior:

    
    
        will do 5 runs over 1024 MB of data
    
        separate: 0.548558 sec
        separate: 0.154741 sec
        separate: 0.151271 sec
        separate: 0.150542 sec
        separate: 0.151337 sec
    
        fused:    0.078880 sec
        fused:    0.074742 sec
        fused:    0.078313 sec
        fused:    0.076987 sec
        fused:    0.077729 sec
    

Here the computation is so cheap that it's really other effects that dominate,
and you get a 2x difference.

------
pierrebai
YAGNI

99% of problems one encounters while programming can be solved in C++ with
std::vector and functions taking a vector in and producing a vector out.
That's my main problem with FP and many other languages making bold claims:
oversell. The simple fact is that for most computing tasks, you don't need
much more than simple types.

~~~
talaketu
Heck, even a turing machine could solve most of these problems.

~~~
mc3
How highfalutin! It's x86 assembly for me!

~~~
temp1999
x86 is more complex than a Turing machine.

------
lidHanteyk
Unfortunately, the author doesn't actually understand functional programming;
they think that it has to do with loops and mutability. Also, they cannot get
outside of their "I worked really hard for my PhD" mindset for long enough to
consider programming as it actually is, rather than as they want to imagine it
to be.

~~~
KirinDave
May I humbly recommend you read the article before leveling a complaint about
its content, as at least some of this was clearly addressed in the second
section, starting above the fold.

