
Codata in action, or how to connect FP and OOP - siraben
https://www.javiercasas.com/articles/codata-in-action/
======
willtim
This paper on "Object Algebras":
[https://www.cs.utexas.edu/~wcook/Drafts/2012/ecoop2012.pdf](https://www.cs.utexas.edu/~wcook/Drafts/2012/ecoop2012.pdf)

is a great example of using ideas from functional programming and category
theory to improve on the visitor pattern, such that the expression problem can
be solved.

The value of such mathematics is that it enables us to formalise some of these
patterns and generalise them.

~~~
DonaldPShimoda
> such that the expression problem can be solved.

I'll add this paper to my reading list (because this is very much something
I'm interested in), but in the meantime could you explain what you mean by
"solving" the expression problem?

My understanding has been that EP is more of a lens through which to evaluate
the usefulness of solutions along the objects-or-functions spectrum, rather
than an actual problem to be solved in a traditional sense. Does the paper you
linked design a system in which the full breadth of that spectrum is made
expressible with no concessions?

~~~
willtim
The "expression problem" from Wikipedia (quoting Phil Wadler):

 _" The expression problem is a new name for an old problem. The goal is to
define a datatype by cases, where one can add new cases to the datatype and
new functions over the datatype, without recompiling existing code, and while
retaining static type safety (e.g., no casts)."_

With a sufficiently expressive language, the expression problem can be solved,
sometimes in multiple ways. However, the solutions often involve boilerplate,
trickery or advanced type system features. A proposed solution to the
expression problem may be impractical, but that is subjective and really a
separate discussion.

I guess you could view the problem as a sort of lens with which to evaluate a
language. E.g. Can this language solve the expression problem and what does
the best solution look like?
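
As a rough illustration of what a "solution" in Wadler's sense can look like,
here is a minimal Haskell sketch in the finally-tagless style, an encoding
closely related to object algebras. It is not the Java code from the paper
linked above, and every name in it (`ExpAlg`, `MulAlg`, `Eval`, `Pretty`,
`example`) is made up for the example.

```haskell
-- Hypothetical names throughout; a sketch, not the paper's implementation.

-- The "datatype" is a class: each method plays the role of one case.
class ExpAlg r where
  lit :: Int -> r
  add :: r -> r -> r

-- An operation is an instance of the class at some carrier type.
newtype Eval = Eval { runEval :: Int }

instance ExpAlg Eval where
  lit n   = Eval n
  add a b = Eval (runEval a + runEval b)

-- A new operation: added without touching Eval or ExpAlg.
newtype Pretty = Pretty { runPretty :: String }

instance ExpAlg Pretty where
  lit n   = Pretty (show n)
  add a b = Pretty ("(" ++ runPretty a ++ " + " ++ runPretty b ++ ")")

-- A new case: a separate class, added without touching ExpAlg.
class MulAlg r where
  mul :: r -> r -> r

instance MulAlg Eval where
  mul a b = Eval (runEval a * runEval b)

instance MulAlg Pretty where
  mul a b = Pretty ("(" ++ runPretty a ++ " * " ++ runPretty b ++ ")")

-- A term using old and new cases, polymorphic in how it is interpreted.
example :: (ExpAlg r, MulAlg r) => r
example = add (lit 1) (mul (lit 2) (lit 3))
```

Here `runEval example` evaluates the term and `runPretty example` renders it.
Both the new operation and the new case are added without editing the existing
definitions and without casts. The price is the kind of boilerplate mentioned
above: terms have to be written against class methods rather than as a plain
datatype.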

~~~
DonaldPShimoda
Oh true! I've definitely read that definition before but I guess I just
internalized it with a lossy interpretation. :) Thanks for reminding me what
it was really about!

------
dustingetz
> Codata looks a lot like objects and methods, where the codata value is the
> object, and the eliminators are the methods. ... the usual Functional
> Programmer has been an Object Oriented Programmer before. Because of this,
> he is likely to know codata from OO, and now is starting to understand data
> from FP

To me Python, Java, C++ are about encapsulation of imperative effects, but the
article is not about _that_ kind of OOP.

I think the article is talking about v.f() "eliminator" vs f(v) "constructor"
and that objects and closures are equivalent, but I am missing the deep insight
beyond that. I don't understand how this helps with the expression problem
(that is, how it lets me add data constructors without recompiling my code to
add the new match cases).

~~~
jacinabox
Functions (or closures) are the prototypical codata... a co-inductive codata
can be presented as a function space from an inductive datatype into some
other type. It has a coinduction principle, derived from the induction
principle of the parameter type. Traditional imperative code can be
represented through the use of a free monad over a functor representing some
command language, and the free monad is a codata as well.
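
A small Haskell sketch of the "function space out of an inductive datatype"
reading, assuming a stream is represented as a function from natural numbers
to elements. The names (`Nat`, `Stream`, `at`, `headS`, `tailS`, `nats`) are
hypothetical and chosen only for this example.

```haskell
-- Hypothetical names; a sketch of codata as a function out of an inductive type.

-- The inductive parameter type: unary natural numbers.
data Nat = Z | S Nat

-- A stream is defined by how it answers the observation "element at index n".
newtype Stream a = Stream { at :: Nat -> a }

-- The usual stream observations are derived from indexing.
headS :: Stream a -> a
headS s = at s Z

tailS :: Stream a -> Stream a
tailS s = Stream (\n -> at s (S n))

-- The stream 0, 1, 2, ... given by structural recursion on the index.
nats :: Stream Int
nats = Stream toInt
  where
    toInt Z     = 0
    toInt (S n) = 1 + toInt n
```

Nothing is ever built up front: `headS (tailS (tailS nats))` just asks the
function for index two. Reasoning about two streams being equal then comes
down to induction on the `Nat` index.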

------
mirekrusin
"Category Theory is a branch of Mathematics mainly used by haskellers to annoy
everyone else." :)

~~~
ivan_ah
Heheh, I guess that _is_ as good an application as anything else.

Last month there was a conference on applied category theory
[https://act2020.mit.edu](https://act2020.mit.edu) and as part of it there
were some excellent introductory tutorials, which I highly recommend for
anyone interested in CT:
[https://www.youtube.com/playlist?list=PLCOXjXDLt3pYPE63bVbsV...](https://www.youtube.com/playlist?list=PLCOXjXDLt3pYPE63bVbsVfA41_wa3sZOh)

It's the first time I've watched lectures on CT and actually understood
something ;)

------
ivan_ah
Here is the article mentioned in the blog post (Open Access):

[ Codata in Action ]
[https://link.springer.com/chapter/10.1007/978-3-030-17184-1_...](https://link.springer.com/chapter/10.1007/978-3-030-17184-1_5)

~~~
infogulch
Wow that whole issue is packed with interesting topics.

------
mirekrusin
" (...) \- In an OO language, you can add a new case by creating another
subclass and implementing the methods. But, if you want to add a new function
to a datatype, you have to add it to the superclass and implement it on all
the existing subclasses, which means modifying a lot of files and recompiling
most of the program.

\- In a FP language you can add a new function to an existing datatype just by
adding a function. But, if you want to add a new case, you have to add the
constructor and modify all the functions that use the datatype, which means
modifying a lot of files and recompiling most of the program.

(...)"

...and by using multiple dispatch (as in Julia) you can do both. And it just
works. True code reuse. Why did it take us so long to recognise this?

~~~
tsimionescu
I don't think multiple dispatch fixes this problem in any way.

The idea here is that you can have an opaque interface. Then, adding new
implementations of that interface is very easy. But, if you want to extend the
interface, you need to modify all of the existing implementations. Multiple
dispatch suffers from this problem just as much as single dispatch.

In the other case, you have a transparent interface - anyone can build code
based on your data type. But, if you add more cases to your data type, anyone
who had functionality built on it needs to take into account the new cases.

I think that this is inherently an unavoidable problem. You can of course
expose both kinds of constructs in your language, but you can't make interface
implementation easily modifiable, and you can't make sum types easily
extensible. The programmer ultimately has to choose the one that they believe
will best capture the domain and its probable evolution.
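
To make the two options concrete, here is a minimal Haskell sketch with a sum
type playing the transparent interface and a record of observations playing
the opaque one. The shape example and all names in it are invented for
illustration.

```haskell
-- Hypothetical example; names are illustrative only.

-- Transparent: a sum type. Anyone can pattern match on it.
data Shape = Circle Double | Square Double

area :: Shape -> Double
area (Circle r) = pi * r * r
area (Square s) = s * s

-- A new function needs no change to existing code...
perimeter :: Shape -> Double
perimeter (Circle r) = 2 * pi * r
perimeter (Square s) = 4 * s
-- ...but a new constructor would force edits to both area and perimeter.

-- Opaque: a record of observations. Clients see only the fields.
data ShapeObj = ShapeObj { areaOf :: Double }

circleObj :: Double -> ShapeObj
circleObj r = ShapeObj { areaOf = pi * r * r }

squareObj :: Double -> ShapeObj
squareObj s = ShapeObj { areaOf = s * s }

-- A new implementation needs no change to existing code...
rectObj :: Double -> Double -> ShapeObj
rectObj w h = ShapeObj { areaOf = w * h }
-- ...but a new field would force edits to circleObj, squareObj and rectObj.
```

Each side is cheap to extend in one direction and expensive in the other,
which is the choice described above.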

What multiple dispatch can do is take some cases where sum types are obviously
preferable to single dispatch (operations on related objects) and apply the
interface abstraction to those cases as well (without reaching for horrible
solutions like the visitor pattern).

Edit: thinking a bit more about the article, what they are showing is that you
can essentially use the visitor pattern (and multiple dispatch) to implement
GADTs instead of using them for regular multiple dispatch with inheritance.
Basically, you can use them to bring completely disparate types under the same
umbrella, just like a GADT. I'm not sure that this still allows subtyping of
that GADT as if it were a regular class.

~~~
codr7
There is no need to modify existing implementations when extending an
interface using multiple dispatch.

You would need to add the required implementations, but there's really no way
around that since the logic has to go somewhere.

And as soon as you start actually dispatching on multiple arguments, using sum
types to solve the same problem quickly turns into spaghetti.

~~~
tsimionescu
> There is no need to modify existing implementations when extending an
> interface using multiple dispatch.

Sure there is. If I have a function that takes a Foo and a Bar, and it calls a
generic method on them, the generic method must have methods that cover the
real subtypes of Foo and Bar. If someone had created subtypes of Foo and Bar
and defined methods on all the generic methods that they knew about, and now
you create a new generic method, you have absolutely broken the interface just
as much as adding a new method to Iterable in Java.

~~~
codr7
It must have enough methods to get the behavior you want, that doesn't always
mean all subtypes.

Assuming you're running a version of Java that supports default
implementations, which may often be used to avoid breaking the interface.

Sum types are a different story, as you would have to dive into existing code.

------
guerrilla
While we're on this subject, if we were to keep extending the syntax, is this
what introProduct would have to look like?

    
    
            data Coproduct a b where
                InjL :: a -> Coproduct a b
                InjR :: b -> Coproduct a b
    
            elimCoproduct :: (a -> c) -> (b -> c) -> Coproduct a b -> c
            elimCoproduct f g (InjL x) = f x
            elimCoproduct f g (InjR y) = g y
    
            codata Product a b where
                ProjL :: Product a b -> a
                ProjR :: Product a b -> b
    
            introProduct :: (c -> a) -> (c -> b) -> c -> Product a b
            ProjL (introProduct f g z) = f z
            ProjR (introProduct f g z) = g z

~~~
lapinot
Exactly! Maybe you've seen them before but what you just wrote are so-called
"copatterns". See:

[https://www.cs.mcgill.ca/~bpientka/papers/unnesting-copatter...](https://www.cs.mcgill.ca/~bpientka/papers/unnesting-copatterns-2015.pdf)

[https://doi.org/10.1017/S0956796819000182](https://doi.org/10.1017/S0956796819000182)

[https://agda.readthedocs.io/en/latest/language/copatterns.ht...](https://agda.readthedocs.io/en/latest/language/copatterns.html)

Agda even has extra sugar on top for postfix projectors: you could write
`myPair .ProjL` (as an expression and as a copattern). The syntax for
pattern-matching anonymous functions crucially uses that (yup, we have
multi-clause anonymous functions!). You'd write:

    
    
      introProduct f g z .ProjL = f z
      introProduct f g z .ProjR = g z
    

or

    
    
      introProduct = λ { f g z .ProjL → f z
                       ; f g z .ProjR → g z }
    

Note that most languages already have that "codata" thing but it's usually
called "record" or "struct".
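
For comparison, a minimal sketch of the same `introProduct` in plain Haskell,
with an ordinary record standing in for the hypothetical `codata` declaration.
The field names `projL` and `projR` are lowercased because Haskell requires
it; this is an emulation, not real copattern syntax.

```haskell
-- A record emulating the hypothetical `codata Product a b` declaration.
-- The fields are the observations (projections) of the product.
data Product a b = Product
  { projL :: a
  , projR :: b
  }

-- Defined by stating what each projection returns, one field per copattern clause.
introProduct :: (c -> a) -> (c -> b) -> c -> Product a b
introProduct f g z = Product
  { projL = f z
  , projR = g z
  }
```

Since record fields are lazy by default in Haskell, `f z` and `g z` are only
computed when the corresponding projection is actually observed.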

~~~
guerrilla
I only found Setzer's slides on this yesterday after trying to figure this out,
as I haven't been keeping up on Agda. So this is basically "copattern"
matching on record projections. Records/structs are then the dual of
variants/enums, where we name and type the destructors instead of the
constructors. Thank you for confirming it. Agda's syntax then is meant to
mimic descendants of C a little bit with that dot.

I'm still fuzzy but it seems like this can be directly translated into an
underlying term rewriting system and would just work almost unmodified, and all
we've done is open a second route for type checking terms based on
orthogonal conditions? I'll have a look at the nested copatterns paper as that
probably answers my question. Thanks for the resources. They seem much clearer
than my search results. I wonder why people don't start explaining codata with
records instead of coNats and coLists. The latter was confusing.

------
carterschonwald
For a more concrete example, look at my nearly 5 year old post to Haskell
reddit for embedding/faking copatterns in Haskell
[https://www.reddit.com/r/haskell/comments/4aju8f/simple_exam...](https://www.reddit.com/r/haskell/comments/4aju8f/simple_example_of_emulating_copattern_matching_in/)

It was based on working to understand Agda's notion of copatterns, plus wanting
to understand how best to model protocols.

Def the case that copatterns capture the spirit and essence of oop done right.
And with immutable or linear logic flavor codata, it becomes possible to also
do stronger polymorphic updates than you can traditionally do with oop. Also
you can nicely model type state varying apis for a single object

Edit: I also did a related live coding lecture 2-3 years ago which some might
like:
[https://github.com/cartazio/symmetric-monoidal/blob/master/s...](https://github.com/cartazio/symmetric-monoidal/blob/master/src/Control/Monoidal.hs)

------
TeMPOraL
> _This is no different to how car and cdr blow up in flames when you pass as
> parameter an empty list._

Depends on the Lisp in question, I guess. I don't know how it is in Scheme,
but in both Common Lisp and Emacs Lisp CAR and CDR have well-defined behavior
when used on an empty list: they return NIL, which in both Lisps is equivalent
to an empty list.

------
zvrba
Freely accessible paper cited in the OP:
[https://www.microsoft.com/en-us/research/uploads/prod/2020/0...](https://www.microsoft.com/en-us/research/uploads/prod/2020/01/CoDataInAction.pdf)

------
drol3
Cool :)

Sometimes seeing the same concept from different perspectives makes it easier
to understand. For a Scala treatment of the same topic see

[https://underscore.io/blog/posts/2017/06/02/uniting-church-a...](https://underscore.io/blog/posts/2017/06/02/uniting-church-and-state.html)

Also in Scala; an interesting solution to the expression problem:

[https://i.cs.hku.hk/~bruno/papers/Modularity2016.pdf](https://i.cs.hku.hk/~bruno/papers/Modularity2016.pdf)

It's a paper, but it's very light on dense academic prose and very heavy on
copy-pastable code :)

------
dustingetz
So if I have a Java value object, data is the constructors and codata is the
getters ("eliminators")?

------
newen
Aren’t typeclasses just a better version of codata?

------
rjsw
I know people who are members of CODATA [1], didn't know they were working on
this kind of thing. Maybe people could search for previous uses when they are
picking a name for a concept, I think there are websites that let you do that
/s.

[1]
[https://en.wikipedia.org/wiki/Committee_on_Data_for_Science_...](https://en.wikipedia.org/wiki/Committee_on_Data_for_Science_and_Technology)

~~~
wk_end
Category theory, which is where the convention of using a "co-" prefix to
refer to the "arrow-flipped" dual of a category comes from, predates CODATA by
nearly twenty years. Maybe the CODATA folks shouldn't have presumed that they
were the category-theoretic dual of data?

