Hacker News new | comments | show | ask | jobs | submit login
Category Theory for the Sciences (mit.edu)
209 points by 0xmohit on July 23, 2016 | hide | past | web | favorite | 75 comments

This comes up every so often and people argue over the merits of learning category theory or not or how useful or abstract it is. The answer is it is very abstract and you shouldn't learn it expecting great dividends in your day to day work. Like with most mathematics you should learn it because it will change the way you think and solve problems. I definitely don't use calculus in my day to day work but the idea of infinitesimal quantities, continuity, and linear approximations have more than once helped me come up with an approximation to a problem that has made the solution tractable.

If you are expecting to apply something immediately to your work then combinatorics and probability theory are way more relevant and practical for day to day programming work.

In my line of work, I've been well-served by first-order logic, graph theory, and automata theory. But what to learn depends as much on what you know as it does on what you do. For example, if you don't know set theory, you should probably start there because practically every other branch of discrete mathematics rests on a foundation of set theory. If you do know set theory, you can pretty much take your pick among the branches of discrete math. Of course, there's more to math than just the discrete variety, and although I feel as though discrete math has served me the most in my programming career, it would never hurt to study numeric analysis/methods, the infinitesimal calculus, etc.

Abstract algebra has also been a fun subject for me, but hasn't really affected my programming as much as I'd anticipated. I suppose by the time I learned about abstract algebra, I'd already been well-versed in ADTs (both kinds - abstract and algebraic data types) and object-oriented design, and, apart from some new terminology, the underlying concepts didn't feel particularly new or profound to me.

But category theory has direct application to programming: (ignoring non-termination), types and functions form a category and most of the concepts are directly applicable. Functors/map are of everyday use and other CT results are commonly used. Even JavaScript is getting functional, so CT is getting more relevant and practical everyday

I love category theory, but I feel that its penetration of JavaScript or most other languages doesn't (and likely won't) get much farther than monads and functors. Many people who use functional languages don't even know about applicatives, even though monads can be viewed as a specialization them.

If you're thinking about new programming paradigms I think the theory behind CT versions is worth knowing, but if most of your monads are maybes or lists it'll be of little practical value.

> but if most of your monads are maybes or lists it'll be of little practical value.

I think even this is maybe too much: if all you use are monads and functors, then it's easy enough to learn monads and functors on their own, without any "real" category theory.

Then you should learn type theory proper. Although type theory can be expressed in terms of categories that is mostly because type theory and category theory, alongside set theory, higher-order logic, etc. are foundational theories. So even though it is applicable, concepts from type theory are much more relevant, with or without the categorial phrasing.

I know a friend of mine used category theory heavily in his Ph.D. thesis which was related to GIS and optimization. I keep meaning to learn more category theory so I can have a chance of understanding his thesis.

Category theory has served mathematicians well, but only because they had a great mass of concrete detail which needed to be abstracted away.

An abstract framework like category theory can actually be harmful where there isn't that much concrete detail to begin with. A personal example I faced was being overwhelmed by category theory jargon when starting to learn Haskell a couple of years back. My confidence returned only when I realized that there was just one category in play with types as the objects and functions as the arrows. The jargon was unnecessary. Today Haskellers discuss so many interesting issues about the language and its implementation which do not fit into the category theoretic framework at all.

In mathematics, I would describe category theory as an offshoot of structuralism. The main idea is that instead of starting with some "premodel" of what you want to study ("I want to study geometry on surfaces, so I'm looking at subsets of R^3!") and then taking away structure ("...but what is the preferred center of the universe? I better only look at properties which are invariant under translations..."), you try to identify structure in your problem and then determine the consequences. This is obvious in retrospect. If your models describe reality and everything is built out of sets then is 0 an element of 2?

Category theory provides useful tools for identifying structure. It's not necessarily a good vocabulary to talk about these things. Forcing everything into a first-order mold is typically confusing. On the other hand, there has been a lot of work using category theory to solve real problems. This has resulted in some sophisticated tools and a very "scalable" way of looking at things (e.g. universal properties and adjunctions are everywhere).

Category theory is needlessly complicated. There are many redundant side conditions, and the definitions are ... let's be charitable and call them idiosyncratic. This is just more apparent when you apply catgeory theory to programming languages. In semantics you already have a precise way of writing statements (type theory) and you know that everything you can write down is parametric. This eliminates almost all side conditions on most definitions in category theory. So yes, coming from Haskell, catgeory theory probably looks extremely complicated and not worth the effort to learn it.

However, some of the tools from category theory are as useful to computer science as differentiation is to engineering. There are plenty of questions to which you can calculate the answer using some elementary category theory. What is the dual concept to the reader monad transformer? What is the eta law for sum types? Can this data structure be a monad? If not, can I embed it into a monad (e.g. using codensity)? These are all simple questions, but if you don't know the general strategy for dealing with them it seems like I'm asking you to solve a puzzle instead of asking you to compute 7 * 4.

I think type theory is really nice for writing down certain things. Quantified statements are a perfect example! But Category Theory has nice properties, too. You already mention universal properties and adjunctions. Those are obvious examples in the other way.

Much of math is this way. Find a subject that you can see from two perspectives and abuse the best parts of each!

> you know that everything you can write down is parametric. This eliminates almost all side conditions on most definitions in category theory

That's interesting. Can you give an example of a side condition that parametricity would allow you to avoid?

Any function f of type

  f : forall a b. (a -> b) -> F a -> F b
Such that f id = id is automatically a functor (respects composition).

In type theory, every equivalence is automatically natural.

There's a lot more and it's made worse by the fact that most of the side conditions are redundant even in normal category theory. For instance, you can specify an adjunction using two functions F : A -> B and G : B -> A, together with an equivalence phi : forall a b. Hom(F a, b) = Hom(a, G b) such that phi satisfies one additional equation. It follows from this that F, G are functors and that the equivalence is natural. Since a lot of definitions in category theory mention adjunctions you can usually simplify the definitions like this. For example the definition of a Cartesian closed category in most textbooks expands to something like 20 equations, but you can simplify it down to the usual few equations which specify a model of simply typed lambda calculus.

> the definition of a Cartesian closed category in most textbooks expands to something like 20 equations, but you can simplify it down to the usual few equations which specify a model of simply typed lambda calculus

Has that been written down somewhere convenient? I'd like to have a look.

By the way, I found lots of your comments on your post history also very interesting!

I agree that Haskellers tend to get ridiculously over-excited about category theory, but it's simply not true that there's only "one category in play". Categories abound in Haskell, for example the Kleisli category, and any instance of Category that isn't (->).

Thanks. I should have been more careful.

As someone who formally studied abstract algebra.

One of the most applicable use-case I remember being constantly mentioned about category theory was classification of some viruses.

Category theory makes sense when your business is large enough and you are looking to make everything streamlined. You could use some Category theory to rearrange everything from employee, customers and products into better types.

But the idea that you absolutely need to understand this stuff to get going does not make sense to me.

Does anyone here have any experience using the Ologs proposed in the book for doing knowledge modeling or other things? Some anecdotes, or in lieu of that, opinions of this formulation compared to other knowledge model representations would be interesting.

The FQL project uses ologs for data integration. There are many papers and presentations at http://categoricaldata.net/fql.html .

So. Just to clear up which math is good for a programmer to learn.

All computer science is divided into Theory A (algorithms and complexity) and Theory B (logic and programming language design). Applications of category theory are part of Theory B. If you're a programmer who wants to have real world impact, you should study tons of Theory A and completely ignore Theory B.

That sounds inflammatory, but it is unfortunately 100% true. All of Theory B combined has less impact than a single hacker using Theory A to write Git or BitTorrent.

Sadly, Turing wasn't aware of this at the time -- think of the real world impact he could have had.

Turing's work was in algorithms, models of computation, and cryptanalysis. All that is considered part of Theory A, see e.g. http://blog.computationalcomplexity.org/2003/03/theory-and-t....

Take a minute and reflect on the fact that you're arguing Turing had nothing to do with mathematical logic.

His doctoral advisor was Church, his work based on Gödel, his thesis named Systems of Logic Based on Ordinals.

Comments like these show the perils of treating mathematics like a cookbook. It's a very limiting mindset.

I'm arguing that Turing's impact on computing (most importantly, defining the Turing machine) didn't have much to do with logic. His work on logic is also important, but I don't think it's useful for a programmer.

On the other hand the fact that you're reaching all the way to Turing to exhibit a clear impact on practice suggests that for the programmer without such grand aspirations there isn't much to be gained.

I wonder how the Theory A hackers would impacting the world without the programming languages, databases and concepts the Theory B guys made for them. And this progress is still happening today. Technologies like Golang, Apache Kafka or React are changing the face of the industry because they bring new programming models, not just algorithms.

Golang came from the systems programming world, not the Theory B world. In this regard it's similar to C or Java. If you want a language that came from the Theory B world, try Haskell.

It has been noted that the relative poverty of Europe in the post-war period led to emphasis on what you call Theory B in those countries. Hardware or allocated processor time was not necessary to make theoretical progress.

The wealthy US was doing much more in the way of computations on real hardware and algorithms and complexity (Theory A) became a crucial focus for equally pragmatic reasons. You still design the new theoretical object on paper, of course...

I think I've mentioned this theory before, and I'd love some corroboration or counterexamples.

This does surprise me (I never studied computer science). Isn't Theory A mostly about efficiency of implementation? In the real world, surely the ability to model a problem domain well and to map it into code neatly (Theory B) is going to be more valuable than the ability to make some particular implementation super-efficient?

Just on a formatting note, it would be great to have a static generator that takes markdown input and generates this kind of html book with footnotes and references. Similar to [spec-md](http://leebyron.com/spec-md/). Anyone know of interesting projects along those lines?

The only thing that comes to mind is Pandoc [0], [1]. It is worth adding that it was originally created by John MacFarlane, a philosophy professor at the University of California, Berkeley. It's written in Haskell.

[0] http://pandoc.org/

[1] https://github.com/jgm/pandoc

As noted, pandoc. Using pandoc-markdown(5), you have footnotes/endnotes, and can export directly to latex and PDF:

    pandoc -f markdown -t latex -o foo.pdf foo.txt
A footnote is indicated in text as [^tag], say [^1].

    [^tag]: Is the footnote denoted by the corresponding tag.

    [^1]: Is the footnote '1' above.

     Multi-line footnotes may be designated by indenting subsequent paragraphs 1 or more spaces.
I've created long (70+ page) documents using only these elements.

At other times, exporting to LaTeX and adding additional structure or elements may prove useful.


Example: a markup of Dugald Stewart's "Account of the Life and Times of Adam Smith, LLD" I did a few days ago:


I've been having similar thoughts recently, and decided that the best option is probably too use Sphinx. It's reStructuredText rather than Markdown, but that that turns out to be an advantage as RST has a much cleaner extension story than Markdown. And the syntax is pretty similar to Markdown anyways.

One option is to translate markdown to DocBook[0] (such as this tool[1]) and then generate an HTML book from there.

0 - http://wiki.docbook.org/WhyDocBook

1 - https://github.com/MSmid/markdown2docbook

Incidentally, the document here smells very stongly of latex2html or a similar generator.

It's not that grotty. It looks more like TeX2page, from the navigation style in particular (like the online SICP and How to Design Programs).

I didn't mean to imply grotty at all, and apologise if I'd left that impression. I was just noting that this had the hallmarks of LaTeX to HTML translation.

Also the index.

More heavyweight than markdown, but MathBookXML does a lot. https://mathbook.pugetsound.edu/

Particularly of note:

    1.3   What is requested from the student

    The only way to learn mathematics is by doing
    exercises. One does not get fit by merely
    looking at a treadmill or become a chef by
    merely reading cookbooks, and one does not
    learn math by watching someone else do it.
    There are about 300 exercises in this book.
    Some of them have solutions in the text,
    others have solutions that can only be
    accessed by professors teaching the class.

    A good student can also make up his own
    exercises or simply play around with the
    material. This book often uses databases as
    an entry to category theory. If one wishes
    to explore categorical database software,
    FQL (functorial query language) is a great
    place to start. It may also be useful in
    solving some of the exercises.[0]
This is not a novel, and you don't learn by just reading things. You need to engage topics like this in hand-to-hand combat, and if you're not willing to do that - give up now. Your knowledge will be superficial, and ultimately useless.

Added in edit: To reply to both (currently) comments, you don't learn how to program by just reading and not doing. You can learn about programming, but you won't actually be able to program, and your knowledge will be superficial. If that's your objective, just to know about this topic, and others in math, then fine. If you want to be able to use your knowledge in any meaningful way, my comment stands. It's not enough just to read. You have to engage, and do the work.

[0] http://category-theory.mitpress.mit.edu/chapter001.html#lev_...

To be fair, you can and do learn a lot about things by simply reading and trying to understand the material. Notice I said "about", and that is an important distinction. Not everybody needs and or wants to have a working knowledge of the category theory or any other subject. Rather, many either want to satisfy their curiosity or get a basic understanding of a subject before deciding whether it would be worthwhile for them to invest more time in learning it. Philosophers of science, for instance, do need to know a lot about, say, modern physical theories, and many of them do - down to a level of astonishing mathematical detail; yet, no one would expect them to be able to work as a theoretical physicist.

I learned a ton about my car from reading the car's manual. And, can you believe it, the manual has no exercises in it!

But did you ever drive the car?

In theory, I did. There was a lot of commuting, so I can't say for sure that category theory wasn't involved.

"Some of them have solutions in the text, others have solutions that can only be accessed by professors teaching the class."

And this is why I don't usually end up doing the exercises.

That's your choice, but in my experience that simply means that you don't end up able to apply the knowledge you think you have acquired by reading about a topic.

That might be fine for you, but it's an inherent limitation in reading about something, rather than actually engaging in it.

I've edited my original comment to make a comparison, and expand on why I say this.

But to take you up on this - some exercises have answers in the text. You could do just those. Other exercises don't have the answers, just like code that you write doesn't already exist. When you write the code you then have the ability to check that it does what it's supposed to do.

The idea of having exercises without answers it to foster the ability to understand when your answers are right. Most of the time it's easy to know if you got the right answers, but it's finding the answer that's the challenge. If you choose not to learn this topic, fine, that's your choice. My comment just says: don't expect to understand the material at anything other than a superficial level if all you do is read about it.

As it says in the quoted section: One does not get fit by merely looking at a treadmill.

Usually in college-and-up mathematics, if you get the answer, you know when you have the answer, because you have a correct proof.

You don't need the solutions, the whole point is to write your own, which is much more work and which will invariably differ from the official ones -- sometimes slightly, sometimes drastically.

The only way to learn mathematics is by doing exercises.

This, in my experience, is utterly false.

It does seem to be a pervasive bias in the community, though --cf. the ubiquitous "[missing step] is left as an exercise to the reader", and so forth.


Can you tell me how you learned mathematics without doing any exercises?

    > without doing _any_ exercises?
I suggest you ask a question about what he actually said (wrote) if you want an answer. In the current form the question doesn't seem to be that (a question) at all but a cheap rhetorical method to pretend the other party said something ridiculous by taking a few words but adding exaggeration until the original contents no longer exists.

    > without doing _any_ exercises?
I suggest you ask a question about what he actually said (wrote) if you want an answer. In the current form the question doesn't seem to be that (a question) at all but a cheap rhetorical method to pretend the other party said something ridiculous by taking a few words but adding exaggeration until the original contents no longer exists.


OK, so I said:

    The only way to learn mathematics
    is by doing exercises.
To this, gone35 replied:

    This, in my experience, is utterly false.
To me, that says that it's not true that the only way to learn mathematics is by doing only exercises. Logically, that means there is some way to learn mathematics by doing something other than exercise, possibly in addition to exercise, admittedly, but certainly something that is not exercises.

The word "utterly" seems additionally to imply that this extra something completely outweighs the exercises, perhaps even to the point of no longer requiring exercises at all. And so my question.

So I believe my question to have been completely reasonable. Rephrasing gone35:

    In my experience, is utterly false that
    the only way to learn mathematics is by
    doing exercises.
So let me be very much more specific.

    gone35: I would be very interested to
            know what methods of learning
            mathematics you espouse other
            than, or perhaps in addition
            to, doing exercises.
Is that better?

I would be very interested to know what methods of learning mathematics you espouse other than, or perhaps in addition to, doing exercises.

Read carefully, go to seminars, ask people, and think through stuff.

I'm in lowly applied math/theoretical CS though, so it might be different in other, more abstract fields.

> think through stuff

Perhaps you are simply talking past each other? If "think through stuff" involved coming up with your own questions about the material and answering them, this would fall under what many mathematicians would call "exercise". In fact, I've found the belief that this is often more valuable than in-book excercises to be pretty pervasive in mathematics.

I think the original proposition is just a tautology. What does it mean to "know" something? I think it is equivalent to being able to complete an "exercise".

The fallacy here is thinking that the exercises in a book are special and must be completed and/or that you must have completed one before you could complete another outside the book.

[...]by taking a few words but adding exaggeration until the original content no longer exists.

Or just a case of mixed up disjunctions... It happens.

Also I happen to be born with two X chromosomes... Not that it should matter tho.

> You need to engage topics like this in hand-to-hand combat

Fortunately, doing the exercises in books isn't the only way to do that.

for the sake of precision it should be pointed out that learning about mathematics by reading alone is not the logical compliment of learning about mathematics through reading and book exercises. the term "exercise" by its nature implies something that is "practice". one may, however, simply learn by doing (rather than "exercising"). for a programmer, there is a clear distinction. this might not be the case for a mathematician.

Category theory , among with algebraic geometry, is one of the most complicated concepts ever developed. Trying to extend it to a wide range of applications seems like overkill. Yes, technically you can use it, but it's like using an archaeology kit to dig a hole when a simple spade will do.

> most complicated concepts ever developed

I disagree. Much of Category Theory isn't complicated -- it is just really abstract. So you need to have some concrete knowledge that you are going to apply it to before you start learning it or alternatively "mathematical maturity" (or as I like to call it "the ability to just accept abstract statements and go with it").

As a tool it clarifies and unifies relationships between mathematical concepts and domains like nothing else has.

To me it looks like a simple theory made needlessly harder by bad UX design. For instance, arrows are written left to right as f: X -> Y, g: Y -> Z, and composed right to left, g . f. There a lot of unmnemonic names like epic and monic morphisms, which are dual to each other but with unrelated names. The beginner, at least if they're me, must struggle to fit all these difficulties into their head together long enough to even follow what's being claimed, before they can work with the formalism.

If category theory is as cool as it sounds (and as simple as it doesn't) then somebody could do us a neat service by working out a game like http://worrydream.com/AlligatorEggs/ did for the lambda calculus. (Added: I guess it must've already been done, as formalizations of category theory in interactive proof assistants. Any tips on what's the most accessible?)

When I look at category theory diagrams, I always think that the visual embedding is underutilized. It seems like the diagrams could be enriched with more symbols to encode more information. For example (and probably a bad example) differentiating between automorphisms and identity functions using different symbols on their arrows). Somehow making these diagrams completely self-contained could be potentially interesting, and I wonder what the tradeoffs would be.

Maybe the current formulation of the diagrams is just right for people who use them to do a lot of heavy lifting. It would be interesting to hear some opinions on this.

You might have a look at section 1.39 in "Categories, Allegories", by Freyd and Scedrov. They introduce a language of diagrams and show how common definitions can be represented this way. Not a particularly easy read.

but isn't abstract the same as complicated


Complicated means there are many strongly coupled parts that you have to understand before you can understand the whole. Computer programs tend to be complicated (one sign of a good computer program is that you can understand and work on pieces of it without having to understand the entire thing).

Abstract means that the idea itself isn't tied to a particular instance of what it is describing. Interfaces in Java are abstract. The idea of a group in mathematics is abstract. Abstraction is usually an attempt to make an idea less complicated so you can focus on its essence.

No. The idea of a monoid is abstract but extremely simple. Conversely, the Navier-Stokes equations are very complicated but very concrete.

I know I'm probably going to keep getting down-voted ,but if it's merely abstract and not complicated ,why is category theory considered a difficult course? They don't teach it in undergrad. Is it because it is poorly taught, or students don't like abstractions, or students are ill-prepared, etc. The question I'm digging at is what makes something hard to understand versus easy

Maybe because category theory makes sense only after you have sufficient math background? Just like it's easier to make sense of the definition of algebraic structures when you're already familiar with enough of their instances.

An other reason, maybe category theory isn't that useful? I'm just speculating here as I have only basic knowledge of category theory, maybe someone else can enlighten us! While all the maths taught at undergraduate level are extremely useful and pervasive in all areas of science, I'm still unsure about category theory. It's certainly nice and elegant as it allows us to reframe different theories and definitions in a same context, but is it useful for an engineer or an applied mathematician?

Category theory is very useful for certain branches of pure mathematics (and it instils a very useful mindset in the mathematician), but much less so for applied mathematics.

Also you're quite right that category theory only really starts to make sense when you have a big stock of examples, like "group theory" and "poset theory" and "topology" and "set theory". Once you've got those examples, you can detect these common threads between them and call them "category theory".

On the other hand, at least some of the ideas from the category theory have already made their way into the _practice_ of programming. I am talking about the notion of a monad. It has already become an important tool in functional programming; but even in mostly-imperative programming languages such as C# one can discern important patterns as instances of this concept. Contrary to the common perception, it is not hard to understand. Two basic examples: the optional (a.k.a. nullable) type construct is an example of a monad (called 'Maybe' monad in Haskell); a generic list construct is another one. I think it is enlightening to be able to see a common pattern behind these, seemingly different, things. And this is exactly where the power and the usefulness of the category theory lies - in mathematics or elsewhere.

> the optional (a.k.a. nullable) type construct is an example of a monad

Nullable (in the C# sense) does not form a monad, because it does not form a functor, because you cannot write fmap, because it does not nest.

Most people find abstract hard. I've seen how hard it is for people to really grok groups, and groups are comparatively trivial, with many concrete examples. The examples in Category Theory tend to be things that are themselves abstractions of things that are already quite abstract.

It's hard to appreciate because to make the abstract nature valuable you need a body of concrete examples. You could probably do this halfway through an undergrad course, but it's not traditional. You can also do it with a body of programming knowledge.

You need to know a few interesting categories and functors before you can appreciate it. Without that, it is really abstract nonsense.

As an example, concretely I might have a list of people's heights and ages. I can attempt to fit a line to match heights as a function of age. Abstractly, I might have a list of numbers x and y. I can attempt to fit a line to them too. More abstract, but no more complicated.


Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact