> If you are looking to learn the language, I recommend Learn You A Haskell.
I definitely don't recommend this. While cute, Learn You A Haskell is lacking a lot of pragmatic advice about how to write software in Haskell. It's good for getting you up to the FAM trio of type classes (Functor, Applicative, Monad), but it leaves out a lot of practical advice around things like navigating base (what to use and what to ditch from it), the available community libraries, topics like monad transformers, and pragmatic guidance around error handling, testing, and isolating effects from tests, all of which come up very quickly in real-world Haskell development.
Feels worth mentioning that Learn You A Haskell is about 1/3 the length of the book you mentioned.
I agree that it won't teach you all you need to know to be really proficient with the language, and also wish it didn't omit monad transformers in particular. I still like that it's an easy read, and that I could hand it to someone with pretty much zero knowledge of FP. I used sections of it as a first resource, and then learned the rest by reading documentation, which is how I usually deal with new languages.
I haven't read the book you suggested, so can't comment on the content, but seeing that the first chapter is on lambda calculus makes me a bit skeptical. Perhaps you can tell me, is the book similarly beginner friendly?
I have tried several times to learn Haskell. The only resource that actually worked for me was Philipp Hagenlocher's "Haskell for Imperative Programmers".
It is incredible. He does an amazing job of starting from the basics and then diving deep. After going through the series I had a solid enough grasp on the language to use it for practical things.
I'm currently reading Graham Hutton's "Programming in Haskell" (2nd edition), which is another great resource. He also has YouTube videos [1] for chapters in the book. Philipp's videos are great too; I watch them after finishing each book chapter.
Going to give this a try later. I've had the same experience, and every time hit a point of going "I'm X levels deep in syntax I don't understand and can't see a benefit in sight..."
Haskell feels like a language that's full of great ideas but overall nobody really likes it. Maybe it just needs to simmer for 30 years until someone comes up with whatever Haskell's equivalent of C++'s Rust is.
I learned Haskell around 2010 starting with LYAH followed by Real World Haskell. I too wouldn't recommend LYAH for actually learning Haskell in $CURRENT_YEAR, now that there are manifold and much better resources on the topic, such as Haskell From First Principles. LYAH was quite hilarious and entertaining, but I think its time has passed. I don't use Haskell a whole lot these days, but to this day Haskell is still one of my Blub languages.
This conversation reminds me I need to get back to No Starch's book about learning physics with Haskell. Grabbed it cheap but haven't gotten back to it.
And you definitely don't need to learn category theory or know anything about it to start writing programs in Haskell or to maintain and build large Haskell applications.
A lot of category theory is fun to learn and knowing it will help you in many ways but it is not a necessary requirement. I mention this because a lot of people seem to think it is.
Definitely a nice article for the math-oriented/curious/etc! It's neat seeing what Haskell isn't in order to understand better what it is.
> A lot of category theory is fun to learn and knowing it will help you in many ways
I agree with your statement except that part.
W.r.t. coding, learning functional programming will help you in many ways. Learning category theory will just allow you to say "ah, I've seen that concept before" - but it won't really help much.
The number of times I’ve heard someone say that some academic topic was not helpful and they were right is close to zero. It’s a pretty good heuristic to not adopt that attitude.
This statement reads as really weird. You seem to be saying that your opinion and/or experience on the usefulness of academic topics is a good heuristic to use to not adopt an attitude.
Indeed you’re free to ignore my perspective, which is just what my comment offered… But this is something I’ve seen play out over and over and over again.
It probably started all the way back in high school with people in the back of the class asking how all this high school maths would help them in real life. Then you get to university and people ask the same in just about every class. Then you get to industry and all the people who had that attitude growing up are calling you for help (and the people who still have that attitude seem to stop learning, or learn at much lower velocity).
It’s always more obvious in hindsight, and particularly with maths: stuff always seems kinda useless before you deeply understand it. That’s unfortunate, because it severely hampers the motivation to learn the stuff in the first place… But it is what it is, I guess.
Even pure mathematicians, who have every incentive to see their own work as immensely useful, are notorious for frequently having no idea just how useful and applicable their work will be in the future (albeit usually over longer time horizons).
My experience is that this is even more true for cross-disciplinary applications, as either discipline naturally is unaware of the perspective of the other discipline. It’s true that it’s important/necessary to specialize deeply and narrowly, but the value of cross-disciplinary skills (especially regarding maths) is probably under-appreciated, IMO.
While that is a great general rule, everyday life isn't generalities, it's specific cases.
Outside of game programming, topology isn't useful knowledge for software engineering. A clear exception to your rule above.
Being a strict adherent to a heuristic and not examining the specifics of a situation is, in my opinion, a poor habit. Superficial pattern matching will lead people down the wrong path more often.
I've found that knowing categories to the extent that Haskell teaches them through osmosis is useful. For example, knowing that functions are covariant functors in the return type and contravariant in the argument makes grasping co/contravariance in OO much easier.
I haven't found any utility to the CT that is above the scope of Haskell; it feels like if it's too advanced to implement in Haskell, it's too advanced to find a use case anywhere.
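To see that concretely in code (the helper names here are mine; Op comes from base's Data.Functor.Contravariant):

import Data.Functor.Contravariant (Contravariant (..), Op (..))

-- covariant in the result type: fmap for ((->) r) is post-composition
postCompose :: (a -> b) -> (r -> a) -> (r -> b)
postCompose = fmap

-- contravariant in the argument type: contramap pre-composes,
-- with Op flipping the arrow so the instance can exist
preCompose :: (b -> a) -> (a -> c) -> (b -> c)
preCompose g f = getOp (contramap g (Op f))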
You can pretty easily invert this principle in order to make it useful. When designing new systems, make it such that it mirrors that concept you've seen before. In making it such, you now know that certain properties will hold and the system will be easier to work with due to this, having less inelegant, sharp edges.
Agree. I haven’t seen a single practical example of Category Theory being used to come up with a solution that was better than a solution not inferred from Category Theory.
Enjoyed the article as a functional programmer and category theory enthusiast. Haskell does not have to be a strict encoding of category theory for it to be useful. We are grown-ups and can understand that not everything is perfect, and we can overlook some warts in order to benefit from a beautiful set of rules that help us think about composition as a first-class citizen in our work.
The first thing mathematicians tell you when talking about categories of, say, sets is that we have to do some hand-waving to avoid Russell's Paradox. In other words, even mathematicians have caveats when talking about different aspects of category theory.
In the case of the Functor, we use fmap and not a "function that lifts functions" because it is much more ergonomic. Yet, it is just as "correct" since it is isomorphic to lifting a function and applying it.
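Concretely, the "function that lifts functions" is just fmap read at a shifted type; a one-line sketch:

-- lift is fmap with its parentheses moved: (a -> b) -> (f a -> f b)
lift :: Functor f => (a -> b) -> (f a -> f b)
lift = fmap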
With monads we use bind and flatMap, although the most obvious categorical implementation would be a simple compose function. That maps more directly to the "category of endofunctors", but again it is isomorphic to the same code that uses flatMap/bind, and the same laws hold.
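For the curious: that compose function exists in base as Kleisli composition, (>=>) from Control.Monad, and each presentation is definable from the other; a quick sketch:

import Control.Monad ((>=>))

-- Kleisli composition, definable from bind...
kleisli :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)
kleisli f g = \x -> f x >>= g

-- ...and bind recovered from Kleisli composition
bind :: Monad m => m a -> (a -> m b) -> m b
bind m f = (const m >=> f) ()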
A lot of utility has been gotten from analogies with category theory, and as long as we hold our noses we can ignore the occasional whiff of inauthenticity.
I don't believe that you need to learn category theory to take advantage of the concepts we have borrowed, but it does make everything become more coherent to dig into it a little. It's also a lot of fun if you like that sort of thing.
LYAH is a fun read, but for someone trying to learn the language seriously — by this I just mean more than a casual tour to get an idea of what the language looks and feels like — there are, thankfully, some much better resources and books available these days. No shade to LYAH intended whatsoever, but it's just not what I'd recommend for someone undertaking really learning the language. https://haskellbook.com/ is a solid option IMO.
I'm confused by his argument that Haskell functors are only endofunctors because everything is Hask.
I thought that even if everything is part of one category, a functor from one subcategory of that category to a different subcategory of the same category could still be a regular functor instead of an endofunctor.
For example, let's say you had a function f(x) that takes unsigned ints and returns the signed int inverse (so f(1) = -1). -1 is not part of the unsigned category since it is below the initial value of an unsigned number (0). I'm a total newbie to category theory so I'm not sure if he's wrong or I'm missing a key detail here.
You could probably try to talk about functors between subcategories by using classes, though I think there's no way to define a functor from one class to another in Haskell.
Slightly more annoying is the fact that Haskell has no real product types and instead does more currying than an Indian takeaway. This makes various constructions in category theory a bit more annoying than they need to be. If you ever wondered what on earth Applicative is and why you need it to fmap a function with multiple arguments, this is why.
There's also the fact that Haskell doesn't really enforce the functor laws, nor does it give any way to really state them. Basically it uses Category theory as a close enough approximation for what it does, but various finer details get lost in translation.
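Concretely: fmapping a curried two-argument function leaves you holding a wrapped function, and Applicative's (<*>) is exactly the way out (the helper names are mine):

-- fmap over a curried function only gets you halfway...
halfway :: Functor f => f Int -> f (Int -> Int)
halfway = fmap (+)

-- ...and (<*>) supplies the second wrapped argument
addF :: Applicative f => f Int -> f Int -> f Int
addF fx fy = fmap (+) fx <*> fy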
What is an example of a category-theory functor in Haskell that isn't just an endofunctor?
A functor F from C to D is a mapping that associates each object X in C to an object F(X) in D.
In Haskell, C and D are both the same (Hask types), hence endofunctor (functor into self).
In general a functor could be something like `prime_factorization :: Int -> [(Prime Int, Multiplicity Int)]`, with morphisms being things like the operation

-- schematic, not literal Haskell: "square" stands for the same
-- morphism in both representations
square :: a -> a
square (n :: Integer) = n ^ 2
square (factorization :: [(Prime Int, Multiplicity Int)]) =
  map (\(p, m) -> (p, 2 * m)) factorization

-- functoriality: the functor commutes with the morphism
prime_factorization (square x) === square (prime_factorization x)

A type constructor f :: Type -> Type with a lawful fmap is a Haskell Functor (endofunctor), mapping (->) to (->).
Either :: Type -> (Type -> Type)
is not an endofunctor, it maps a type to a type constructor and thus maps (->) to (~>) (natural transformations).
If you uncurried Either, you would also get a non-endofunctor which maps from (-×>) (product category) to (->).
Either' :: (Type, Type) -> Type
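Partially applying it lands you back in an endofunctor, which is why base's instance is Functor (Either a) rather than anything for Either itself:

-- Either a :: Type -> Type is an endofunctor again;
-- its fmap touches only the Right component
mapRight :: (b -> c) -> Either a b -> Either a c
mapRight = fmap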
Every function is a functor between equality, since equality satisfies congruence:
isPrime :: Integer -> Bool
is a functor between (:~:) @Integer and (:~:) @Bool.
Boolean negation is a functor between the "less than or equal" and "greater than or equal" categories, mapping (<=) to (>=).
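A sketch of the object part (this is just the Prelude's not, renamed; the morphism part can only live in a comment, since Haskell can't state it):

-- object part: Bool -> Bool
neg :: Bool -> Bool
neg True  = False
neg False = True

-- morphism part, as a law: if a <= b then neg a >= neg b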
Hmm, well calling it a problem might be a bit much, but it means that functions with more than one argument require some additional care, and technically a stronger notion of functor.
Say you have a multivariate function a: X x Y -> Z, and a functor F. This would give you a function F(a): F(X x Y) -> F(Z). Now this is not really multivariate any more as F(X x Y) is not a product, but products behave quite well in category theory so this is not usually that big a problem, and it turns out that quite a few functors are 'continuous' which means they preserve 'limits' and a product is one such limit, in which case F(X) x F(Y) ~= F(X x Y). In particular all right adjoint functors have this property, which is why adjointness is such an important concept.
Now in Haskell things get slightly more difficult, because you don't have a function a: X x Y -> Z, you've got a function a: X -> (Y => Z), where Y => Z is an object representing all maps from Y to Z (such an object may or may not exist for any particular category). And to use currying you need something like (X x Y) -> Z and X -> (Y => Z) to be isomorphic. So you don't just need F(X x Y) = F(X) x F(Y), which is fairly innocuous; you need to deal with monoidal categories and monoidal functors between them, which is a heck of a lot more complex. This is where the Applicative stuff comes from.
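In Haskell terms the monoidal structure is exactly what Applicative hands you; a minimal sketch (the helper names are mine):

import Control.Applicative (liftA2)

-- the monoidal piece: F(X) x F(Y) -> F(X x Y)
pairF :: Applicative f => f a -> f b -> f (a, b)
pairF = liftA2 (,)

-- which is what lets a curried function act under f
lift2 :: Applicative f => (a -> b -> c) -> f a -> f b -> f c
lift2 g fa fb = g <$> fa <*> fb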
Now some of the stuff you don't get as easily in Haskell is the stuff relying on basic properties of sum and product types, such as
# The following are the obvious projections for a 2-tuple
proj1 :: (X x Y) -> X
proj2 :: (X x Y) -> Y
# Assuming F(X) x F(Y) exists we get the following for free
(F(proj1) x F(proj2)) :: F(X x Y) -> F(X) x F(Y)
No clue how you'd state that in Haskell. And the following
# The following are the obvious injections for a sum type
anX :: X -> (X + Y)
anY :: Y -> (X + Y)
# Assuming F(X) + F(Y) exists we get the following for free
(F(anX) + F(anY)) :: F(X) + F(Y) -> F(X + Y)
where you can interpret X + Y as an 'Either X Y'.
Sure you could probably write a function
Functor f => Either (f a) (f b) -> f (Either a b)
but it's not that obvious that it should exist, whereas in category theory it's one of the most obvious constructions imaginable and its existence is clear from the definition. And in Haskell you wouldn't think about using products, so the first property is probably barely used anywhere (there's probably a right inverse for zip somewhere: unzip :: [(a,b)] -> ([a],[b]), but Haskell generally doesn't seem to like functions with multiple outputs).
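For what it's worth, the Either direction is a two-line fmap (the name sumIn is made up):

-- the "obvious" construction: push Either inside the functor
sumIn :: Functor f => Either (f a) (f b) -> f (Either a b)
sumIn (Left  fa) = fmap Left  fa
sumIn (Right fb) = fmap Right fb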
I believe the heart of what you're asking is "why do we distinguish between codomain and range?" and, relatedly, why do we even need the concept of "surjectivity?" For example, the function f(x) = x + x could be thought of as a non-surjective "endo-"function (endomorphism) `N -> N` (where N is the natural numbers), or it could be thought of as a surjective function from `N -> E` (where E is the even natural numbers). Why do we need the "looser" version of `N -> N` when we can do with the tighter bound `N -> E`?
As with many definitional things in pure mathematics, the answer is "there's no absolute iron-clad reason why you can't think of things that way, but there are a variety of reasons both mechanistic and motivated by intuition why you might not want to do that."
To motivate it via intuition, let me start with another example. Take the function that maps every natural number k to the function f(x) = x + k (so it's a function whose codomain is `(N -> N)`, also expressed as `N^N`, allowing again for the fact that N^N is "too loose" since we only care about functions of the form f(x) = x + k). Now someone could come along and say "well, f(x) = x + k can simply be identified with just k, so really the function we have is the identity function on N", but I think most people would object and say that `k` is not the same thing as `f(x) = x + k`. The former can be thought of as an encoding of the latter, but it's not the same thing. So saying that the codomain of our function is N^N is in some sense more accurate than N, and instead we can say that there's a way of embedding a subset of N^N into N via encoding f(x) = x + k as just k.
The crucial thing here is that deciding whether we are dealing with an artificial encoding of something, or the real thing itself is what lets us distinguish between saying that the function's codomain is N vs N^N, and in some sense the former feels "wrong" even though it's a "tighter bound."
Let's develop that intuition a little bit further. Imagine you have a function that takes a rational number and maps it to the largest integer smaller than it (that is the floor function). You could think of this function in one of at least four ways:
1. Q -> Q: A function that maps a rational number back to a rational number
2. Q -> Z: A function that maps a rational number to an integer
3. Z -> Z: Because rational numbers are countable, every rational number is "just" the indexed form of an integer, so "really" what we have is a function from Z -> Z
4. N -> N: Well by the same logic, every function on integers is just a function on natural numbers!
I think the last two probably feel the most "unnatural" for folks, and most representative of the problem of "representation vs what an object 'really' is" I talked about earlier. But the first two have what is basically the same choice, just in a context where the answer isn't as clear. Do you think the integer 2 really is just the same number as the rational number "2 / 1?" Or do you think that they are different from each other and an encoding process is necessary to move from one to the other? In some cases it's useful to say "yes every integer is exactly a rational number" and in that case we can say that Q -> Q reflects this intuition best. In other cases it might be more useful to say "no an integer is not a rational number, but there is a canonical way of identifying every integer with a certain rational number" and in that case we can say that Q -> Z reflects that intuition.
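In Haskell terms the same choice shows up as nothing more than a choice of type signature; a tiny illustration (the names are made up):

-- reading 2: Q -> Z, the codomain really is the integers
floorZ :: Rational -> Integer
floorZ = floor

-- reading 1: Q -> Q, the result is "really" just another rational
floorQ :: Rational -> Rational
floorQ = fromInteger . floor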
So your choice of codomain reflects what you think an object "really is" as opposed to just being an encoding. In this case we say that choosing Hask as the codomain (or "target") of our functor, hence making it an endofunctor, is a choice that reflects an intuition that the result of mapping using a Haskell functor is really just another type, not some other object that can be thought of as a type.
From this intuition there flows a variety of practical, mechanistic ramifications. Almost always when talking about mappings we are concerned about how they affect the structure of the source and target. When we say the codomain is the same as the domain, we also almost always mean that structurally they are the same and this considerably simplifies the analysis we need to do later, because we can reuse a lot of the same machinery in the domain and codomain. If you say that the codomain and domain are not the same, then all of a sudden you have to concern yourself with the structure of the codomain and domain and prove that all the equivalences and isomorphisms you want to hold and preserve the properties you care about do actually exist and do actually preserve everything you want preserved and it's just a lot more annoying.
So at a mechanistic level, what we gain from saying that "Haskell functors are only endofunctors because everything is Hask" is that we know we never map into something that has more structure than Hask; it always has the same structure. So we don't have to concern ourselves with a bunch of exploration into what happens if we map into a category with more structure than Hask.
Thank you for the incredibly thorough writeup. I've read it once, but I can tell I'm going to need a few more passes at least before I fully grasp everything, because there is a TON of nuance (which does not surprise me; I know just enough to realize category theory is by nature full of nuance).
A question for you, do you think there would be any value in a programming language that DID include the need to concern itself with the structural differences between two categories or are we getting into issues like too much boilerplate to encode anything useful territory and similar?
> A question for you, do you think there would be any value in a programming language that DID include the need to concern itself with the structural differences between two categories or are we getting into issues like too much boilerplate to encode anything useful territory and similar?
Maybe but I'm skeptical. There's something nice about bringing some notions of mathematical rigor to code, but I'd be cautious.
Most day-to-day engineering consists of broad, but shallow insights. Only a few parts of an application really require the deep insights that abstract mathematics would provide.
I think developers as a whole would benefit from a stronger theoretical math background, because there are indeed a few crucial spots in an application where it's extremely helpful. But I'm not bullish on building a programming language explicitly atop, say, category theory, precisely because for the majority of the work, overly insightful code would probably only make it more brittle to future modifications (in the sense of this article: https://www.hillelwayne.com/post/cleverness/)
Haskell is a pretty fun language to learn and if you’re interested in trying to translate concepts from category theory into programming, it’s probably a good language to use.
Haskell is not category theory in the same sense that arithmetic is not group theory: there’s only really one category. Well sort-of because type constructors let you work in different subcategories.
Category theory is about different categories and the nature of the relationships between them. It usually operates at a level of ‘let C be a category…’ with concrete categories or functors being examples. Haskell operates at a level even more concrete than single objects in a single category: the objects are types and Haskell operates on the values of those types. Of course Haskell can work at slightly higher levels too, but it never really escapes categories that look a lot like Hask.
Maybe that statement is too strong though. I’d be keen to see some interesting actual-category-theory level things in Haskell. (Maybe I’m being unfair and these do exist but I am considering them to not count as ‘actual category theory’)
I thought that the ‘categorical theory of patches’ thing was a good example of category theory applied to programming but note that it didn’t require any kind of general notion of e.g. push-outs in a programming language.
From my late-beginner Coq student perspective, I'm amused to think that the loop detection in Haskell (which allows you to distinguish divergence from a specific instance of a type) might have made Haskell less sound from a type theory perspective, because it is in some way intervening in the calculation to add or detect distinctions that would otherwise not be a part of the underlying theory!
Well, maybe that's not the best way to put it: but the example of using the <<loop>> detection to distinguish a loop from a concrete instance is something that Haskell added presumably for users' convenience; if it didn't have that feature, you would not, in fact, be able to distinguish the infinite loop from a specific value in finite time, except by manually examining the implementation!
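A minimal demo of what I mean (GHC's blackholing only catches some direct cycles, so treat this as best-effort behaviour):

-- forcing x re-enters its own thunk; GHC's runtime throws the
-- NonTermination exception, which prints as <<loop>>
main :: IO ()
main = print x
  where
    x :: Int
    x = x + 1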
IMO after the first section the headings in the post are misleading:
- A functor is "not a functor" because not all endofunctors on Hask can be written as Haskell Functors.
- A monad is "not a monad" because you can implement a monad without satisfying all the monad laws.
I don't think either of these statements _really_ evidence the claims the headings make. Good post though, I always find that it's useful to think about the subtleties between mathematical objects and their implementation.
I do actually agree with you. The goal for me was just to understand the connection to CT a bit more precisely, not to imply that it isn't useful. I guess I didn't convey that part very clearly, which together with the tongue-in-cheek title might make me sound like quite a pedant. The post is the culmination of me already knowing Haskell, googling category theory one day, and being utterly confused.
I might be a counterexample to the claim that "this is pedantic and the differences they're mentioning are meaningless in practice". I learned about functors in uni and only later took a look at Haskell. I was genuinely very confused by the naming of Functor: functors map both morphisms and elements, but this isn't the case in Haskell. So I was left wondering whether I had missed something.
ah, sorry! right – the type constructor and fmap together make up a functor.
I meant functors where the objects being mapped aren't types, but type inhabitants, and the morphisms being mapped are not arrows between types, but arrows between type inhabitants. (I suppose this also has to do with "the only category in Haskell is Hask", rather than types also being themselves categories?)
What are the prereqs for learning category theory? I looked up a definition online and it had a lot of stuff I've never heard of before. What exactly is a mapping in this context?
Haskell Programming From First Principles is a much better book IMO: https://haskellbook.com/
I haven't written Haskell in a while, but there may be even better books today as well.