Hacker News new | comments | show | ask | jobs | submit login
Why category theory matters in programming (iheart.com)
129 points by kailuowang 5 months ago | hide | past | web | favorite | 45 comments

Eh, as someone who works in scala day-to-day, and has studied actual category theory, I think the emphasis on category theory itself is silly. And in fact, most of the time, more category theory and more monads will make your code worse (e.g. MonadTransformers are D: )

Future, which is really a souped up promise abstraction, is the main star, and for-comprehensions are nice syntactic sugar. Being a "Monad" doesn't help you do anything with it though. And Future technically doesn't even truly obey the Monad laws, because of exceptions.

It's useful that a couple of container classes in scala use implement this trait:

    trait ChainableContainer[A] {
      def map(fn: A => B): ChainableContainer[B]
      def flatten(nested: ChainableContainer[ChainableContainer[A]]):ChainableContainer
But beyond making it a little easier to guess what's going on, the underlying math isn't useful in actual programming.

That has a little to do with Scala just being a pretty bad place to implement most of these things. E.g., Monad transformers work just great in Haskell but are terrible in Scala due to the sort of terrible ways type parameters and inference work together.

Future also doesn't obey the monad laws for not being RT and that causes a lotttt of complexity.

OK, I'm another data point for "studied category theory; program in an ML dialect; think it's silly".

Most all the theorist/logician/algebraist-turned-programmer folks I know all feel more-or-less the same way.

Just explain the pattern in english. Might as well call it a FooDeBar. Giving axioms and such is usually a waste of The Man's money and my time.

That said, I think it's extremely valuable to see this sort of thing in an academic or side project/enrichment activity setting. The mindset and mental model are fantastically helpful. See dshnkao's post, for example.

Just to be clear, you're advocating:

1. Don't use terminology from academic category theory / abstract algebra in computer programming

2. But if you haven't studied these things, and you are a programmer, it might be very enlightening to do so.

Is that right?

> category theory / abstract algebra

Woah there!

But otherwise, yes.

Most working mathematicians have seen category theory and borrow bits and pieces of notation and such. But most of their work is done in far more concrete settings.

Working programmers should adopt the same arrangement.

It's a subtle position, but not an inconsistent one. I promise :-)

It's not just monad transformers, monads themselves are pretty awkward to use in any language that's not Haskell. And actually, a lot of FP languages simply don't even have monads at all (OCaml, F#).

Monads should never have escaped Haskell.

> Monads should never have escaped Haskell.

They should have put them in a monad.

Isn't a Monad just the fancy name for a type/structure that implements map and flatMap in a lawful manner?

Close: the required operations are `flatMap` and bind/pure/point/lift/return/whatever-you-want-to-call-it.

You can derive `map` from these two other functions so it doesn't need to be part of the interface.

Bind is flatMap (>>=)

Confusingly, map in scala is fmap in Haskell

Oops you're right of course. And I can't edit my post.

I was reading very famous paper[1] today, and I stopped at the following:

    "To get an adequate definition of g [<] f we must therefore consider all possible algorithms for computing f."
You see, I was reading this paper because I'm currently looking into various ways of that the symmetries of Turing machines are modeled (I.e., the fact that a TM can emulate any other TM would seem to imply that the concept of an 'algorithm' must have fundamental features that are preserved across this kind of virtualization. Closely related to the concepts of an Oracle, a Non-deterministic TM, and recursion.)

Now, this passage stood out to me because it seemed to be a set-theoretic, indirect phrasing of the exact symmetry I'm talking about. We have to consider all algorithms for computing f in order to recover the features that must be true for all f, and thus give identity to the notion of a general algorithm for f. Which in turn is used to establish theorems about such a general algorithm notion; (critically, its "relative complexity".)

My point being that I would not have expressed this notion in set theoretic language because my intuition wants it expressed in algebraic language. Things are identified by their symmetries, and building theories around symmetries directly is more efficient and insightful than building theories around how those symmetries are embedded in set theory. And when I sit around thinking about how to design a good API or an efficient program, I want to do it by stating what must always be true about the problems I'm solving, in a high-level, abstract way. (I'm not so good at it in practice.)

Set theory may be a simple model of many things, but it is only really intuitive for things that can be easily described in terms of unions and intersections. It is often just extra work elsewhere, at least if you have a frame of mind to see other ways.

[1] https://www.cs.toronto.edu/~sacook/homepage/rabin_thesis.pdf

as someone who also works in scala, MonadTransformer is so much better than gigantic nested flatMaps and pattern matchings

Functional Programming and Category Theroy are great, there's a strong level of robustness in the whole approach to programming.

The author slams imperative programming, it isn't backboned on a robust compositional system, as is often done by FPers

BUT :)

there's lots of good programs in both styles....to me, this implies that there are more important things that matter to programming, the design to achieve a purpose. It's nice to express your designs in a robust system ( and be influenced by it), but what matters most is the design choices you make. When I look at well programmed things it's always the design choices that impress me.

Coming from HPC, I think imperative programming will stay as long as we don't have an AGI that does the actual programming for us, as it's the model with the highest possible abatraction that still gives you the whole truth about the underlying memory operations (and thus allows you to optimize for it without having your hands tied behind your back and just trusting the compiler gods). Not saying that this is how one should usually program in a context where performance is not the highest goal, but there's always going to be these usecases.

Agreed. That said, the problem is that imperative programming is still used widely in contexts (eg, 95% of web programming) where performance is not an issue and high-level abstractions such as those used in FP are more appropriate. This is almost entirely due, imo, to the paradigm's inertia and to bad education.

The wonderful thing is that we can mix the two.

One great method is to write your program using a nice functional language (Haskell, CL/racket/scheme, etc.), and where you want optimizations, replace functions with lower level imperative language implementations (C/C++, rust, etc.) and FFI.

Category theory is fine, I guess, but there's no need to invoke a sledge-hammer when talking about 99% of most work here.

It's a little like saying, yes, fine we can reference the hyperreals to derive infinitesimal calculus in a beautiful way, but it's mostly abstract and highly involved, even to just understand the foundational aspects (which one may argue is the case for normal calculus, but the notion of local linearity I believe is much more obvious than an extension to the reals). Though it's a nice academic exercise, which may even yield some insights, it's rather obscure and unintuitive for almost all purposes.

This is the way I view category theory and functional programming; though I'd love to be corrected if I'm wrong. There's no need to invoke almost all of it, since most things can be expressed quite clearly in the language of set theory need there be rigour.

But infinitesimal calculus is all about local linearity, in the analytic formulation you're taking the closest linear approximation to the function at that point. Using the dual numbers approach, there is an infinitesimal (nilpotent) region around each point which is linear.

Category theory matters to programming because, among other things, it teaches us how to abstract over programming languages. What better definition of abstract programs than "things with types which compose"?

Now consider categories with various types of constructions (products, limits, exponentials, etc.) and you'll notice they correspond to requiring certain features in your language.

That's nice, but form a hard-nosed engineering perspective, not just sometimes, but in general 19 lines of yucky procedural code that pretty much anyone can understand (and debug) are usually way, way better than 9 lines of "elegant" functional code comparatively far fewer people can (really) understand - and which can be comparatively far more sinister and nefarious to debug.

Given that, after all, it's just posting likes we're talking about. You'd think that with such a bold proposition as "Why category theory matters -- no really, it does!" this would be about hot-synching whole data centers, or getting drones to deliver medicine in Africa, or something like that. You know, something perhaps unthinkable (or maybe just far less tractable) with procedural (or just less sophistical functional) code.

But no, in this case apparently it's about... updating your FB likes.

> 19 lines...pretty much anyone can understand

> 9 lines of...functional

First of all, those numbers are more like 1000 vs 200, but lines of code isn't a useful metric here.

The thing functional programming does for us is not hide the functionality of our code, it is to make it more consistent.

Category theory shows us ways that our code can be more consistently organized, so that we don't need to reason about the entire codebase when writing the code that guess it together.

This means we can create modules that are so compatible, all it takes to compose them is a simple map or fold.

While it may take more effort to understand the functional language itself, you may find that effort is comparatively less than learning what someone's imperative code does.

I always know right away what map and foldl will do, but I can never know what a loop does without reading through that loop.

> and which can be comparatively far more sinister and nefarious to debug.

This is kind of a false dichotomy as well written fp, type code is tends to have fewer bugs. Also your argument that more people understand it doesn't mean much as few people understand something they haven't been exposed to.

> hot-synching whole data centers

It actually does. There was a podcast with Paul Chiusano who is a big proponent of this type of programming http://futureofcoding.org/episodes/10-unisons-paul-chiusano-.... IIRC Spark is based on Algebird which is somewhat close to this type of programming. Commutativity is a life saver for distributed computing.

Commutativity is a life saver for distributed computing.

Nice to know, and I wish I understood these issues, and these kinds of arguments better.

I'm just saying - the original blog post didn't come anywhere near to making that kind of an argument.

> This is kind of a false dichotomy as well written fp, type code is tends to have fewer bugs.

I tentatively agree but I would caution you against making such claims in public because they are hard to rigorously justify and can end up making functional programmers just look arrogant.

In my experience FP code is more buggy, but I imagine there's a lot that goes into it

What particular language are you talking about?

Mainly scala, but I've used some clojure as well. Ever since learning Go, I just haven't looked back. I think FP does well with streaming problems, but outside of that it seems to have a cognitive overhead that doesn't pay

It's a cheap and shallow shot you're making, criticizing the article for using facebook likes as an example. I'm a programmer with some math background but who has never tried hard to understand the more theoretical sides of languages like scala. The article is a very good, even slightly inspirational, introduction to how category-theory type thinking can arise when you start to abstract some common patterns in programming. The author clearly went to a lot of effort to make this explanation available, and obviously not in her/his native language either.

Anyone put off by some of the negative responses in this thread, note that among the people saying "FP isn't necessary/helpful in real-world programming" the same people are saying "but understanding this stuff is very helpful to your development as a programmer".

> The reasonable programmer adapts his code to be read by others; the unreasonable one persists in trying to adapt others to be able to read his code. Therefore all progress depends on the unreasonable programmer.

-- Oscar Wilde

I disagree that "far fewer people can (really) understand" functional code. It's more like far fewer people care to. Either way, that's a separate matter from whether category theory matters to programming or not. I happen to think it matters in the same way theoretical physics matters to engineering: that is, eventually engineering will make use of it, and sometimes an engineering need drives the research in the first place, but as a matter of day-to-day practice they're effectively completely separate fields.

Although apparently some coders desperately want to think their glorified bit-shuffling is some deeply abstract mathematical endeavor.

> hot-synching whole data centers

That is actually what Facebook is doing with their spam detection code - their purely functional architecture makes it easier and more reliable for them to hot-swap out parts of their codebase when the need to update it.


Er... I think that people tend to use the if then else approach, because it is actually more flexible, in the sense that you may actually want to do something else than return false when something wrong happens in one of the functions in the composition chain...


  category * -> *


> So, is there a simple and universal way to compose these functions?

proceeds to add layers of complexity

Why not just write a function to compose them? Argue against the competition, not strawmen.

Most code that is written has effects - null returns, error returns, exceptions, IO, etc., etc. We can write a custom function to compose every pair of functions we write, but some clever people figured out there were patterns to this madness - map, flatMap, lift, return, etc. - and it's far more efficient to write those patterns once and reuse them.

Essentially, it's DRY at a higher level of abstraction.

You still have to write each composition operator separately; writing the code to compose null returns doesn't give you the code to compose IO. Only difference is whether they have separate names.

That's not the only difference. You get to reuse any code written using only those names (that interface).

What language are these coffee examples written in?

The article doesn't give me a reference for what syntax I'm trying to parse, and while most of it is easy to guess, it would be helpful to have an explicit reference.

That would be Scala. (http://www.scala-lang.org/)

This kind of contribution always make me hesitate betweeen two reactions:

1.Man I must be dumb I can't even begin to understand what matter

2.Can't we achieve/explain the same goal by keeping it simple stupid

> 2.Can't we achieve/explain the same goal by keeping it simple stupid

We should keep it "as simple as possible but no simpler". Sometimes "as simple as possible" is actually somewhat challenging ...

Applications are open for YC Summer 2018

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact