Functor-Oriented Programming (r6.ca)

Seems to be similar to the direction taken by John De Goes (using algebras to describe your problem domain, building programs from those algebras via free monads/applicatives/etc., interpreting those programs using natural transformations). I think there is definitely something to this kind of thinking.

Both authors argue that modern programming languages do not have the best ergonomics for this style of programming, though it is at least possible to do most of it in languages such as Haskell and Scala.

1. Data Structures Are Antithetical to Functional Programming [http://degoes.net/articles/kill-data]

2. A Modern Architecture for FP [http://degoes.net/articles/modern-fp]

3. Modern Functional Programming: Part 2 [http://degoes.net/articles/modern-fp-part-2]


It depends on whether reusing your data or reusing your code is more important; and that's application specific.

Reusing code means you need to shoehorn your data into as generic a container as possible, because types are what make one bit of code incompatible with some other piece of data. Reduce the number of types, and you naturally increase the reusability of code. But there's irreducible complexity; the structure of data doesn't go away, if it's not explicit in types, it'll be implicit in code. There's a risk of it being so implicit that it's not until runtime that a description of the data being manipulated is known - this is a logical consequence of building your program by composing functions.

It might be that you need to do lots of computation, and you need to heavily optimize your computation primitives. Reusing this code, and the right code, is very important. So the data needs to change shape. This is what you get in databases - the constraints of the relational model - and in monadic code dealing with lists, etc.

On the other hand, programs are usually easier to understand when the data structures are mapped out explicitly, and the operations programs perform are understood in terms of projections between types or deltas within data structures. Sometimes, when projecting between types, the input and output types alone tell you what a function does.

Most line of business applications don't have much reused code, outside of libraries for UI and accessing external services. Data is more important. And that's fine.


Also, he discusses some of the limitations of this approach here: https://m.youtube.com/watch?v=A-lmrvsUi2Y He also calls for another language, although what he asks of that language seems different.

Those posts by De Goes are more polemical than reflective of techniques he actually uses in production.

> Data Structures Are Antithetical to Functional Programming

Interestingly, I've heard a similar point of view from the "pure OO" (Smalltalk-style) world as well: specifically, avoiding "data structures" and embracing late binding.


On the other hand, from Linus Torvalds:

> I will, in fact, claim that the difference between a bad programmer and a good one is whether he considers his code or his data structures more important. Bad programmers worry about the code. Good programmers worry about data structures and their relationships.

https://lwn.net/Articles/193245/


I don't think Torvalds' quote applies in this case: it's not really "killing datastructures", so much as "using abstract datastructures"; i.e. it's like programming to an interface rather than a concrete instance. In fact, that's exactly what it is, since he's using typeclasses (which can be thought of as a sort of interface). We could achieve a similar thing with existential types (AKA modules).
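
In Haskell terms, a minimal illustration of that distinction (the names here are made up, not from the article): the first function commits to one concrete structure, the second only to an interface.

    -- Tied to a concrete data structure:
    countList :: [a] -> Int
    countList = length

    -- Programmed against an interface (a typeclass); any Foldable
    -- container works, so callers can swap the concrete structure
    -- without touching this code:
    countAny :: Foldable t => t a -> Int
    countAny = foldr (\_ n -> n + 1) 0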

I understood the Torvalds quote as being more like: would it be worse if some chunk of your library code disappeared, or if some chunk of your datastructures disappeared? If it's the former, then your library functions are "too smart", i.e. they're probably doing non-obvious things to their inputs. If it's the latter, then the data is more "self-describing".

Note that I use the phrase "library functions", since (first-class) functions are a perfectly good way to represent data (e.g. Church encoding and its variants).


First-class functions can perfectly represent data, but that doesn't mean they're a perfectly good way to represent data.

For example, if you use Church encoding for your numbers, how do you do static analysis of bounds checking? If you implement if-statements by passing two functions to a function which calls one or the other for truth and falsity, how do you determine dead code at compile time? You just move a bunch of problems from compile time to run time, where they're harder to reason about, in most Turing-complete languages.
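
For concreteness, a rough sketch of Church numerals in Haskell (illustrative only): the "number" is nothing but a fold, so there is no inspectable structure left for a compiler to bounds-check statically; the only way to learn the value is to run it.

    {-# LANGUAGE RankNTypes #-}

    -- A Church numeral is "apply a function n times"; the count
    -- exists only as behaviour, not as data.
    type Church = forall a. (a -> a) -> a -> a

    zero, one, two :: Church
    zero _ x = x
    one  f x = f x
    two  f x = f (f x)

    -- Recovering an ordinary Int means evaluating the function.
    toInt :: Church -> Int
    toInt n = n (+ 1) 0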


> move a bunch of problems from compile time to run time

Perhaps this is an artificial dichotomy? What we really want is verification that some properties hold. These properties should be specified in a uniform way, irrespective of the time that the check can apply.

When I specify a property, why do I also have to pick the time ('run time' or 'compile time'), and why is the language used to specify exactly the same property so different in each case? I think there is some unnecessary conflation here.


> For example, if you use Church encoding for your numbers, how do you do static analysis of bounds checking?

I dunno; I wouldn't use Church encoding in that situation. For the same reason I wouldn't use infinite-precision rationals to represent 24-bit RGB colours on a microcontroller, and I wouldn't use floats in a computer algebra system.

You seem to have strawmanned "functions are a perfectly good datastructure" into "always use functions for everything", which is (a) obviously not what I claimed and (b) not even the point I was trying to make; it was just a footnote pointing out that "code vs data" has nuances because sometimes our "data" is code.

> If you implement if-statements by passing two functions to a function which calls one or the other for truth and falsity, how do you determine dead code at compile time?

Supercompilation. Still, there's not much point taking something as expressive as functions, and using them to implement something impoverished and abstract like booleans. Much better to implement something actually relevant for your domain; for example, using functions to implement booleans, then using those booleans to implement permission checks (e.g. `if(isAdmin, doAction, permissionDenied)`), is harder and more complicated than just implementing the permissions checks directly using functions.
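
A hedged sketch of that last point in Haskell (`doAction`, `permissionDenied` and the rest are the hypothetical names from above, not a real API): the "decision" is just which action a user value selects, with no boolean in between.

    -- Hypothetical domain actions, IO () only for illustration.
    doAction, permissionDenied :: IO ()
    doAction         = putStrLn "action performed"
    permissionDenied = putStrLn "permission denied"

    -- A "user" is a selection function; the if-statement has been
    -- specialised away into the domain.
    admin, guest :: IO () -> IO () -> IO ()
    admin allowed _      = allowed
    guest _      denied  = denied

    checkedAction :: (IO () -> IO () -> IO ()) -> IO ()
    checkedAction user = user doAction permissionDenied
    -- checkedAction admin  performs the action
    -- checkedAction guest  prints "permission denied"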

Note that this is something that actually happens all the time! Any time an OO programmer has two objects with different implementations of the same method (say, an `Admin` object whose `doAction` method does the action, and an `UnprivilegedUser` class whose `doAction` method shows a "permission denied" message), they're essentially "implementing if-statements by passing two functions to a dynamic-dispatcher which calls one or the other for truth and falsity". It's not a problem.

Note that Smalltalk actually does implement "if statements" using objects with different `ifTrue` methods. Yet the only reason they do so, and in fact why any language has things as inherently-meaningless as booleans (and functions, strings, arrays, etc.), is because the language designers/implementers aren't able to predict the domains that we'll be dealing with. The best they can do is provide generic building blocks which are sufficiently expressive for us to construct semantically-meaningful domains for ourselves. Hence any attempt to implement <generic building block A> using <generic building block B> is usually pointless from a practical perspective (although is useful for learning): it's basically a constructive proof that <generic building block A> is expressive enough that we don't need <generic building block B> at all, and hence is self-defeating!


> You seem to have strawmanned "functions are a perfectly good datastructure" into "always use functions for everything"

You said that using functions for everything was, and I quote, "perfectly good". That seems to imply to me that it cannot be improved, and it conveys an unwarranted absolute position.

> Note that Smalltalk actually does implement "if statements"

Smalltalk is of course what I had in mind when I wrote that. And of course Smalltalk's lasting contribution is the dynamic optimization gymnastics it inspired to compensate for its design choices, and not the design choices themselves.

> language designers/implementers aren't able to predict the domains that we'll be dealing with. The best they can do is provide generic building blocks which are sufficiently expressive for us to construct semantically-meaningful domains for ourselves

I'll deny this to my dying day. The most expressive language with the most generic primitives is not the best language, and it is far, very far from the best we language designers can do. The languages that give people the most ability to construct domains for themselves are the ones that end up with the lowest common denominators between codebases and programmers. It's the reason Lisp and Smalltalk didn't take over the world. Too much freedom is a bad thing; form is liberating, and more importantly, prosocial.


"perfectly good" is a colloquial idiom for "acceptable", "decent", "useable".

Example: "Someone in my building threw out a perfectly good audio amplifier. It just needed a blown fuse replaced inside! I hooked it up to some speakers and now using it for my TV."

It doesn't mean that the hardware is so advanced, it cannot be improved.


> You said that using functions for everything was, and I quote, "perfectly good".

The "perfectly good" quote is accurate. I never said "for everything".


And both styles lead to equally unreadable code. If you can't take an average new graduate, put them in front of your code, and have them be productive in your codebase in a few days, it's no good.

An average new graduate wouldn't be productive in any codebase in a few days. And even if they were, that's a terribly low bar to set. Software development education sucks and most people have to learn on the job, so why not teach them ways to write more reliable and maintainable code?

That is probably good advice, but this style of coding (which comes down to lawless type classes without inherent utility) does make it much more difficult to read unfamiliar code.

Usually documentation isn't fleshed out in internal modules, so I end up flipping between various instances and call sites just to figure out what the type class is supposed to do. This doesn't mean that there aren't valid use cases for this, but please don't use it as the default.


> If you can't take an average new graduate, put it in front of your code and have him be productive in your codebase in a few days it's no good.

So if a piece of code for securely transferring funds from one bank account to another is not readable by a new graduate in a few days, it's no good?

Should we remove safety abstractions from this code just so the new graduate can understand it in a week?

A large part of software is safety-critical. How can you argue that safe abstractions, which reduce or eliminate faults, are not worth it if they can't be learned in a week or so?

You wouldn't happen to work as an Equifax code-reviewer, would you?


Sounds like you have a bias for hiring inexperienced, cheap developers and not providing them with any mentoring.

> And both styles lead to equally unreadable code.

Perhaps because the languages used aren't a good fit for these styles?


There is an easy design choice - which Haskell didn't take - which would allow us to have transparent Identity functors, associative Compose and many others: Add a conversion rule to your language and don't eagerly expand definitions at the type level.

For example, in Coq you would write

  Definition Identity a := a.
  Definition Compose F G a := F (G a).

  Instance: Functor Identity. Proof[...]
and so on. This works, since explicit conversion during type checking allows you to associate type classes to otherwise transparent definitions.

Of course, this is no silver bullet and complicates other things. Everything works out beautifully though, if you combine conversion with bidirectional type checking, at the expense of less powerful type inference.
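
For contrast, a small Haskell sketch of the status quo (illustrative, not from the post): because instances attach to type constructors and synonyms are expanded away, a transparent Identity cannot carry a Functor instance, and one is forced into a newtype.

    -- A transparent synonym: fine as a type, but it cannot be made
    -- an instance of Functor.
    type Identity' a = a

    -- instance Functor Identity' where ...   -- rejected by GHC

    -- The standard workaround: wrap it, at the cost of explicit
    -- wrapping and unwrapping everywhere.
    newtype Identity a = Identity { runIdentity :: a }

    instance Functor Identity where
      fmap f (Identity x) = Identity (f x)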


What does Idris do? That seems to be the best hope of combining Coq-like advanced typing with Haskell-like practicality.

Interestingly, the author did a dissertation using Coq, so I wonder if he thinks that it's not really suitable.

I don't know, but Russell O'Connor got his PhD in 2008, the same year that type classes were introduced in Coq.

There has been a lot of development on the Coq proof assistant since then. I don't imagine that working with Coq in 2008 was particularly pleasant. :)


I don't think working with Coq has ever or will ever be particularly pleasant :-)

O'Connor gave a nice talk at TPHOLs which I attended: https://arxiv.org/abs/cs/0505034

Keeping up with Coq shouldn't have been too difficult for him.


This is great. It's essentially asking you to work in a 2-category[1], which, once you get used to it, is an amazingly intuitive way to make sense of constructions in ordinary 1-categories.

[1] https://ncatlab.org/nlab/show/2-category


Gabriel Gonzalez: The functor design pattern

http://www.haskellforall.com/2012/09/the-functor-design-patt...

is also a well-known post.


Pardon my ignorance, but can't this be more easily translated into terms of (modern) object oriented programming, where you avoid mutating state?

Here, the "natural transformation" is a function that takes an input object which adheres to a specific interface (i.e. provides a set of specific methods), operates only on that input interface, and returns another object that conforms to the expected output interface.

Avoiding mutating state means that either those objects have no attributes, or that the state can only be set on construction. (In FP speak, this corresponds to currying of common parameters, or to an internal abstract type, if the constructor is private and you can only create it via a specific set of static methods.)

Is this modern style of OOP equivalent to "Functor-Oriented Programming", or am I missing something?


> Avoiding mutating state means that either those objects have no attributes, or that the state can only be set on construction. (In FP speak, this corresponds to currying of common parameters, or to an internal abstract type, if the constructor is private and you can only create it via a specific set of static methods.)

I don't think the style you're talking about is something that most people would call "object oriented programming" ("modern" or otherwise). Indeed the usual name for the programming style in which you avoid mutating state is "functional programming".


Maybe, but the terms "class", "method" and "interface" are more commonly understood than "functor" or "natural transformation".

So for better understanding by the general public, it may make sense to offer an explanation in terms of OOP, even if what is described is more FP style than OO style.


Functors are functors, natural transformations are natural transformations; neither has much in common with classes, methods or interfaces. Functors are being used to achieve composition here, but the use of functors is an essential part of the pattern; talking about "composition-oriented programming" would be vague to the point of meaninglessness (every programmer believes they're being compositional), and there simply isn't an OO equivalent of open recursive types as far as I know.

One important aspect of programming with functors is the ability to quantify over them, i.e. higher-kinded types. This is crucial for building reusable components on the functor-level of abstraction.

In modern OOP this would amount to quantifying over interfaces which is typically not even possible in those languages.
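
A small Haskell sketch of what that quantification buys (names here are illustrative): the type of a natural transformation mentions the functors but never the payload, and a component can abstract over the functors themselves.

    {-# LANGUAGE RankNTypes, TypeOperators #-}

    -- A natural transformation between functors f and g.
    type f ~> g = forall a. f a -> g a

    safeHead :: [] ~> Maybe
    safeHead []      = Nothing
    safeHead (x : _) = Just x

    -- A reusable, functor-polymorphic component: apply a natural
    -- transformation underneath any outer Functor layer.
    mapUnder :: Functor h => (f ~> g) -> h (f a) -> h (g a)
    mapUnder nt = fmap nt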


I really want to see examples of this.

I've seen the "streaming" package mentioned as an example of this: http://hackage.haskell.org/package/streaming

"streaming" provides a Stream type which is parameterized by a Functor. The main module in the package provides very general functions which work for any functor and often involve natural transformations and functor compositions and sums. http://hackage.haskell.org/package/streaming-0.1.4.5/docs/St...

The "Streaming.Prelude" instantiates the functor with a pair type and provides the more concrete API that one would expect from a streaming library. http://hackage.haskell.org/package/streaming-0.1.4.5/docs/St...


The unification-fd library referenced in the article is probably a pretty good example. I have not looked at it, though, and it would be nice to have explicit bits of code pointed out in an explanation.

Writing DSLs with free monads is an example of this. Recursion schemes are another example; see kmett's recursion-schemes library in Haskell, or Matryoshka in Scala.

Anyone got some good examples of this?

> With functor oriented programming, polymorphism is not necessarily about using functions polymorphically. Instead, polymorphism provides correctness guarantees by ensuring that a particular function can only touch the specific layers of functors in its type signature and is independent of the rest of the data structure. One benefits from polymorphism even when a function is only ever invoked at a single type.


Imagine you have two functions that you only call with lists of integers:

    def summarize1[A](list: List[A]) = ...
    def summarize2(list: List[Int]) = ...
Even though you don't call summarize1 polymorphically, the signature tells you that it's only looking at the structure of the list rather than what's inside it - a reader can immediately see that it's going to do something like return the length of the list. Whereas summarize2 could do other things, e.g. sum the elements of the list, add and subtract alternating elements...

A list is a pretty simple structure; these kinds of guarantees become more useful when the structure itself is complicated. E.g. in the project I'm currently working on I have a tree-like datatype that only ever contains Strings, but I've written it generically so I can be confident that my structural functions only work on the structure, and I have a clear separation between functions that operate on the structure of the tree and functions that operate on the content of the tree.
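
A hedged sketch of that separation (the tree here is hypothetical, not the actual project type): structural code is polymorphic in the payload, so the types guarantee it cannot touch the Strings; content code goes through the Functor instance.

    {-# LANGUAGE DeriveFunctor #-}

    data Tree a = Leaf a | Branch (Tree a) (Tree a)
      deriving (Show, Functor)

    -- Structural: cannot inspect or invent payloads.
    mirror :: Tree a -> Tree a
    mirror (Leaf x)     = Leaf x
    mirror (Branch l r) = Branch (mirror r) (mirror l)

    -- Content: touches only the payloads, via fmap.
    shout :: Tree String -> Tree String
    shout = fmap (++ "!")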


I'm not 100% sure of what the author had in mind, but here's a possible (and possibly dumb) example.

Imagine you need a 2-parameter "Lasagna" datatype which is a non-empty list of alternating "a" and "b" elements. It could be used to model something like a timeline, perhaps.

Instead of constructing it by hand, you could express it as a composition of other stuff:

    import Data.List.NonEmpty
    import Data.Bifunctor
    import Data.Bifunctor.Tannen

    type Lasagna = Tannen NonEmpty (,)

    example :: Lasagna Char Int
    example = Tannen $ ('a',3) :| [('b',4),('d',6)]
http://hackage.haskell.org/package/base-4.10.0.0/docs/Data-L...

https://hackage.haskell.org/package/bifunctors-5.4.2/docs/Da...

That type has a whole bunch of useful instances without even having to auto-derive them.

And you could have functions that manipulated the pairs without caring what shape the "outer functor" has.
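
For instance (a small usage sketch, assuming the imports and definitions above): the Bifunctor instance that comes with Tannen already lets you rewrite the pairs while staying agnostic about the outer NonEmpty layer.

    -- Edit the Int side of every pair without mentioning NonEmpty:
    bumped :: Lasagna Char Int
    bumped = second (+ 1) example
    -- Tannen $ ('a',4) :| [('b',5),('d',7)]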


It would be nice to know what kind of code OP writes, i.e. are we talking application code or library code, and if it's libraries, are we talking networking libraries, graphics libraries, or math libraries?

Usually I find that Haskellers write very "low-level" code, i.e. implement math abstractions or the like, while I prefer to write applications (in Haskell), where this abstraction technique may not be suitable.


This style is plenty useful in "higher level" stuff. Consider Reflex: http://docs.reflex-frp.org/en/latest/

I don’t think I understand the reference. Reflex is a library, used to create applications.

The actual application code doesn’t seem to adhere to any type of functor-oriented programming, as far as I can see: http://docs.reflex-frp.org/en/latest/architecture.html


> I don’t think I understand the reference. Reflex is a library, used to create applications.

Yes, but typical uses would be mobile apps and web apps, pretty "high level" stuff.

> The actual application code doesn’t seem to adhere to any type of functor-oriented programming, as far as I can see: http://docs.reflex-frp.org/en/latest/architecture.html

https://hackage.haskell.org/package/reflex-0.4.0/docs/Reflex...

The entire library abstracts over its implementation via the "Reflex" class. This is not _exactly_ the style advocated in the original post, but it's basically the same spirit.


> application code or library code

Do the two call for different styles?


I’d argue that they do to some extent.

With application code you’re interfacing with some real-life system, and need to adapt to all the corner cases of this, whereas with library code you might be implementing something entirely theoretical, where everything fits together perfectly in some abstraction because there are no corner cases to consider (it’s just a clean math concept).

With a math concept, the elegant concept already exists beforehand, and you just implement it. With application code, there might be 27 different but-ifs, and-also-remembers that you need to consider, where developing an elegant, coherent concept of the entire app is very challenging, or even impossible.

With some application code, I find I’m often the first one to make sense of it all, taking all the corner cases and creating a coherent model out of it, i.e. making it all fit into some well-defined abstraction. And then a new corner case appears and my model falls on the floor, and needs to be redesigned.

The definition of exact real arithmetic doesn’t change over time. There’s no customer who calls you up and asks you to add a feature that doesn’t fit into your existing model/abstraction.


Low level stuff, at least in public: http://r6.ca/code.html

It's not the same thing, but it triggered something. A related programming language is Koka (see "An overview of Koka": https://koka-lang.github.io/koka/doc/kokaspec.html#sec-an-ov...).

It's basically a functional programming language where functions can be applied with dot syntax: the value before the dot is passed as the first parameter. So instead of `afunction(param1, param2)` you would write `param1.afunction(param2)`.

Example: (https://koka-lang.github.io/koka/doc/kokaspec.html#sec-struc...)

    struct person(age: int, name: string)

    fun birthday(p: person) { return p(age = p.age + 1) }

    val me = Person(27, "Albert")           // { age: 27, name: "Albert" }
    val meWithBirthday = me.birthday()      // { age: 28, name: "Albert" }


That's just OO without encapsulation.

No, it's just some sugar over plain old functions.

It's basically how Python methods work, and while I'm sure you could no-true-scotsman Python classes/OO, I have yet to see it.

Right, but there's more to classes in Python than just methods (inheritance etc.).

Inheritance is not essential to OO; it is merely a method of sharing code.

Sure, and functions are just sugar over gotos.

Are you suggesting that the difference between a.foo(b) and foo(a, b) is analogous to the difference between goto and functions? Functions give structure and restrictions to goto. Method syntax is just a different way to write the exact same thing.

I'm suggesting all of programming is syntactic sugar. Calling something syntactic sugar is not a valid critique.

> I'm suggesting all of programming is syntactic sugar.

I wasn't critiquing anything, just suggesting that moving the first argument of a function call to come before a . isn't enough to consider something OO.

There's typically more to OO than just method syntax. Dynamic dispatch, subtyping, etc. And saying something is OO without encapsulation isn't really saying anything, given that encapsulation is arguably the most important part of OO.

I'd argue that it's not remotely the most important part, nor is subtyping. Encapsulation encourages better OO, but you can take a well-encapsulated OO program, remove all encapsulation by making all the data public, and that program is still a well-structured OO program. OO is about the way objects talk to each other to accomplish work: the messages, the API between objects. It's not about the many little things, like encapsulation, that encourage better messaging. There are dozens of practices that encourage better OO, but you can write good OO without them if you know what good OO is and don't need those training wheels.

If you don't have encapsulation, then you're not just using messages to talk between objects. If you are, then you have encapsulation.

Well, now we have to get semantic: if by encapsulation you mean keeping your data private and only exposing behavior publicly, then you're wrong. If by encapsulation you mean that all behavior that uses an object's data lives in the object and no outside code relies on that object's data even when it is public, then you're agreeing with me.

You clearly didn't play out my thought experiment above or you wouldn't be making the statement you just made.


I agree that in your thought experiment, the program remains OO, but the language is not. You can write OO programs in many languages, but I would argue that you don't have an OO language unless it has special support for OO programming.

I wasn't arguing it was an OO language, merely that it was OO code. OO used to be done in C in exactly the same basic way, passing the struct that is the object as the first arg to all the member functions; Python does this explicitly as well.

OK, I was talking about whether or not the language was OO. So I guess we're in agreement then :)

It reminds me of the S3 single-dispatch polymorphism in R.

A strict type system makes sure your program is statically correct and catches many bugs caused by the programmer. The real world, though, is dynamic, and I prefer to check all input data at runtime, e.g. each function checks whether what's passed into it is correct. It not only catches a lot of bugs, both those caused by the programmer and those caused by interacting with the outside world, it also makes the code easier to test. This probably sounds like I'm swearing in church to a Haskell programmer, though.

I'm not sure what you mean, or perhaps you have an incorrect understanding of how Haskell works. Of course data is still checked at runtime; for example, a parser will check that a string is formatted in a valid way, or a lookup in a hash table will check that a key is valid. This is done at runtime as it would be in virtually any language.

However there's no way in Haskell (short of low-level type system tricks) to check that a function with an argument of type `Int` actually receives an `Int` when it is called. There's no need to, because it is guaranteed by the type system. The strength of these guarantees, coupled with the ability to define rich, expressive types, are what makes Haskell programs very robust. For example you can represent all possible results of a valid JSON parse:

    data Value
      = Object (HashMap Text Value)
      | Array (Vector Value)
      | String Text
      | Number Float
      | Bool Bool
      | Null
Which variant of this type is actually encountered will not be determined until runtime, so the program is required to respond to the different cases; however a function which takes type `Value` will never need to worry that it doesn't actually receive one of the above variants at runtime.
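
As a small illustration of that last point (assuming the Value type just defined): a consumer simply covers the six cases, and the type system guarantees no other shape can reach it at runtime.

    describe :: Value -> String
    describe (Object _) = "an object"
    describe (Array _)  = "an array"
    describe (String _) = "a string"
    describe (Number _) = "a number"
    describe (Bool _)   = "a boolean"
    describe Null       = "null"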

Any chance someone can provide code examples, maybe in a bunch of different languages (Elm, Haskell, and even functional JS)? Would love to understand this more.

> Instead of writing transformations between data structures, one writes natural transformations between functors, where a natural transformation between functors F and G is a polymorphic function of type forall a. F a -> G a.

There's nothing natural about this.

;)



