
Contravariant Functors Are Weird - gbrown_
https://sanj.ink/posts/2020-06-13-contravariant-functors-are-weird.html
======
vore
I was super confused by contravariant functors until someone gave me a
concrete example of an instance of it as "the left side of an arrow".

With a regular functor you have:

    
    
      fmap :: Functor f => (a -> b) -> f a -> f b
    

and you can think of a relatively straightforward type that fulfills it,
e.g.:

    
    
      newtype Identity a = Identity a
    
      instance Functor Identity where
          fmap f (Identity x) = Identity (f x)
    

With a contravariant functor you have:

    
    
      contramap :: Contravariant f => (a -> b) -> f b -> f a
    

The Identity type can't fulfill Contravariant, because it makes no sense to
apply some transformation a -> b to a value and somehow get the preimage of
it. However:

    
    
      newtype Op a b = Op (b -> a)
    
      instance Contravariant (Op a) where
          contramap f (Op g) = Op (g . f)
    

Unwrapping this all, you get something like:

    
    
      contramap :: (a -> b) -> (b -> c) -> (a -> c)
    

which makes a lot more sense! You can apply some mapping from a -> b before
going from b -> c, giving you a -> c: the contravariant mapping of a function
is just the reverse composition of a function!
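
To see it in action, here is a small, self-contained sketch (the
Contravariant class is written out inline to match base's
Data.Functor.Contravariant; `describe` and `describeLength` are made-up
names):

    
    
      class Contravariant f where
          contramap :: (a -> b) -> f b -> f a
    
      newtype Op a b = Op (b -> a)
    
      instance Contravariant (Op a) where
          contramap f (Op g) = Op (g . f)
    
      -- An Op String Int wraps a function Int -> String.
      describe :: Op String Int
      describe = Op (\n -> "the number " ++ show n)
    
      -- Pre-compose with length to describe a list by its length instead.
      describeLength :: Op String [x]
      describeLength = contramap length describe
    
      -- let Op f = describeLength in f "hello"  ==  "the number 5"
    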

Of course, there are other kinds of contravariant functors, but this was the
one that stuck out the most to me.

~~~
bollu
A slightly more concrete version of this is

    
    
         data Predicate a = Predicate (a -> Bool)
    
    

this naturally has a contravariant functor instance: if you can tell me
whether something is true for "a", and you can convert "b -> a", then how do
you tell me whether something is true for "b"? Convert the "b" to an "a" and
see if it's true for "a".

Formally, you get the instance

    
    
         contramap :: (b -> a) -> Predicate a -> Predicate b
    
         contramap :: (b -> a) -> (a -> Bool) -> (b -> Bool)
    
    
    
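Filling in the instance body (a sketch, using the Contravariant class from
the comment above; base's Data.Functor.Contravariant ships an equivalent
Predicate):

    
    
      newtype Predicate a = Predicate (a -> Bool)
    
      instance Contravariant Predicate where
          -- To test a "b", first convert it to an "a", then test that.
          contramap f (Predicate p) = Predicate (p . f)
    
      -- Pull a predicate on Int back to a predicate on String.
      hasEvenLength :: Predicate String
      hasEvenLength = contramap length (Predicate even)
    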

The picture to have in mind is to imagine A as a space, and then to know that
some things are true in A (color them green) while others are false (color
them red). If you now want to color another space B using this space A, should
you have A -> B, or B -> A?

Some thought reveals that a map A -> B may give us inconsistent colourings.
For example, say we have a map {red, green} -> { b } where both "red" and
"green" map to "b". What color do we assign "b"? There is no reasonable
choice.

On the other hand, say we have a function B -> A. Since each element of B
maps to _one_ element of A, we can say

    
    
        color(b) = color of element that b maps to.
    
    

We need the fact that a function maps one value in the domain to exactly one
value in the codomain for this to work.

I tend to imagine the function from B to A as threads, whose endpoints in A
are soaked with dye. This dye "moves backwards" towards B. The uniqueness in
colors assigned to B is given by the fact that we can only have one thread
from each point in B.
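
In code, the dye picture is contramap again (a sketch; Colouring and
recolour are made-up names):

    
    
      data Color = Red | Green
    
      -- A colouring of a space a is a function a -> Color.
      newtype Colouring a = Colouring (a -> Color)
    
      -- Pull a colouring on a back along a thread map b -> a:
      -- each b gets the colour of the single point it maps to.
      recolour :: (b -> a) -> Colouring a -> Colouring b
      recolour f (Colouring c) = Colouring (c . f)
    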

------
foxes
Contravariant functors are actually really nice.

The classic example from maths is the spectrum (Spec) of a ring R. As a set,
Spec(R) is the set of prime ideals of R. An example anyone can understand:
the ideals pZ of all multiples of a prime p, inside the integers Z.

If R -> S is a ring homomorphism, then there is an induced map Spec(S) ->
Spec(R).

Spec establishes a connection between the category of rings and topological
spaces. Algebraic geometry is a whole area of maths that deals with this
connection.

~~~
bollu
While I know precisely what you are saying (I have been learning scheme
theory this summer), this is hardly an accessible example for _pure math
undergrads_, let alone someone attempting to learn some functional
programming with no heavy experience in abstract algebra.

First of all, why use Spec? Use ideals/varieties: they contain roughly the
same data while being far easier to intuit. I'll put my money where my mouth
is and give it a shot.

Say we have some collection of points in |R^2, and we want to find equations
which define this set. We do this by creating a function

f: sets of points in |R^2 -> sets of polynomials whose common zeros are
exactly those points.

For example,

1. f(unit circle at the origin) = { x^2 + y^2 - 1 }, because all points on
the unit circle satisfy x^2 + y^2 - 1 = 0.

2. f(the full plane |R^2) = { 0 }, because the constant zero / the zero
polynomial is zero on the entire plane.

3. f(empty set) = { 1 }, because the polynomial/constant 1 is nonzero on the
entire plane.

4. f({ all points on either the x-axis or the y-axis }) = { xy }, because
points on either axis satisfy x = 0 or y = 0, which is the same as xy = 0.

5. The intersection of the two axes and the unit circle, which is the set of
points { (±1, 0), (0, ±1) }, is cut out by the _common roots_ of the
polynomials { xy, x^2 + y^2 - 1 }.

After some rumination, one will notice that as we increase the number of
"points", we need to decrease the number of polynomials: each polynomial is
a _constraint_, so having more polynomials means fewer points satisfy these
constraints.

This is the crux of the _contravariance_ between algebra and geometry:
geometry describes the thing in itself, algebra describes how to get at the
thing using constraints. These will always be dual to each other.
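
In symbols (a standard formulation, not spelled out above): writing V(S) for
the common zeros of a set of polynomials S,

    
    
      V(S) = { p in |R^2 : f(p) = 0 for all f in S }
    
      S ⊆ T  implies  V(T) ⊆ V(S)
    

so V is inclusion-reversing, which is exactly the contravariance.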

How did I do at an attempt at an explanation?

~~~
Sharlin
I have an undergrad-level understanding of both abstract algebra and
functional programming, and I have _absolutely no idea_ how either your
example or the GP's is connected to contravariant functors as understood
through the lens of functional programming.

~~~
foxes
Maybe a good example that ties both together is some sort of filter on a list?
If you add more constraints (more processing) the list gets smaller.

Suppose I have a function f : x -> y. The type x -> y is also a functor in
y. If you have an x -> y but want an x -> z, you can use a y -> z and adjust
your original function by post-processing the output of f. However, suppose
you want a w -> y but you still only have f. You need to post process with
w->x. So f is covariant in y, contravariant in x. Things change in the
opposite direction of the adaptor.

Perhaps serialisation is a useful example?

Perhaps contravariance becomes even more useful if you use it in some
profunctor dimap, where you have a producer, a consumer, and a process in
between, e.g. moving through a data structure and pre/post-processing
things.

[https://www.youtube.com/watch?v=OJtGECfksds](https://www.youtube.com/watch?v=OJtGECfksds)
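
A minimal sketch of that dimap shape for plain functions (standalone, not
the profunctors package itself; doubleString is a made-up example):

    
    
      -- Pre-process the input (contravariant side), post-process
      -- the output (covariant side).
      dimap :: (w -> x) -> (y -> z) -> (x -> y) -> (w -> z)
      dimap pre post f = post . f . pre
    
      double :: Int -> Int
      double n = n * 2
    
      -- Adapt double to read and write Strings.
      doubleString :: String -> String
      doubleString = dimap read show double   -- doubleString "21" == "42"
    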

~~~
magicalhippo
> You need to post process with w->x

Did you mean pre-process?

------
joppy
Any function defines a covariant functor by postcomposition. Take for example
the string-length function (len : string -> int). We can take any other
function which outputs a string, say (f : X -> string) where X is any fixed
type, and produce a new function (g : X -> int) by g(x) = len(f(x)). So the
len function defines a functor from the set of functions with signature (X ->
string) to the set of functions with signature (X -> int). For concreteness,
one can imagine this functor as replacing functions which output strings by
functions which output the lengths of those strings instead.

However, we can also treat (len : string -> int) as a _contravariant_ functor
by pre-composing instead of post-composing. Say we have a function (h : int ->
Y); then we can form (k : string -> Y) by setting k(s) = h(len(s)). This could
be useful if we only cared about whether a string's length were a multiple of
5, say.

Some of the above are lies: in order to call these functors in the
mathematical sense (or the Haskell sense) you need to phrase things in just
the right way. But I think it gets the idea across about how simple the
difference between co- and contra- variance can be, with the example of
replacing f(x) by g(f(x)) or by f(g(x)).
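
In Haskell, both directions are just specialised composition (a sketch;
postLen and preLen are made-up names):

    
    
      len :: String -> Int
      len = length
    
      -- Covariant: post-compose, g(x) = len(f(x)).
      postLen :: (x -> String) -> (x -> Int)
      postLen f = len . f
    
      -- Contravariant: pre-compose, k(s) = h(len(s)).
      preLen :: (Int -> y) -> (String -> y)
      preLen h = h . len
    
      -- preLen (\n -> n `mod` 5 == 0) "hello"  ==  True
    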

~~~
magicalhippo
Thank you for the example, this really helped me to grasp the concept.

So given that I'm clueless when it comes to this topic, what kind of functions
aren't functors? Those that take multiple parameters or something like that?

------
bikenaga
Here's a description which is close to the way this arose in math. It's pretty
simple. Maybe it will be helpful for some people to see it removed from
programming issues. (Maybe not.)

For concreteness, let's suppose we're talking about sets and functions
between sets. (The same thing works in an arbitrary category.) Thus, if X and
Y are sets, Hom(X, Y) denotes the set of functions (morphisms) from X to Y.
Suppose you have a fixed function f: A -> B. You can compose it with functions
_into_ A or _out of_ B.

If g: C -> A maps into A, then f o g (using "o" to denote composition) gives a
function from C to B: that is, f o g: C -> A -> B. We started with a function
from C to A and wound up with a function from C to B. So we actually have a
function (functor) Hom(C, A) -> Hom(C, B), which is often denoted Hom(C, f).
We say that Hom(C, -) is covariant, because it respects the direction of
arrows in the second slot. (f went from A to B, and Hom(C, f) goes from Hom(C,
A) to Hom(C, B).)

If h: B -> D maps out of B, then h o f gives a function from A to D: that is,
h o f: A -> B -> D. So we actually have a function Hom(B, D) -> Hom(A, D),
which is often denoted Hom(f, D). We say that Hom(-, D) is contravariant,
because it reverses the direction of arrows in the first slot. (f went from A
to B, but Hom(f, D) goes from Hom(B, D) to Hom(A, D).)

    
    
         g      f      h
      C ---> A ---> B ---> D
    

Thus, Hom(-, -) is actually a _bifunctor_ which is covariant in one variable
and contravariant in the other. Contravariant functors can be regarded as
covariant functors on the _opposite category_. What is happening with Hom is a
prototype for many of the ways that "covariance" and "contravariance" occur in
math; for example, covariant and contravariant tensors. (The dual-space
functor Hom(-, K) [where K is the ground field] is contravariant.)
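
In Haskell terms, the two Hom functors look like this (a sketch; the type
names are made up, and base's Data.Functor.Contravariant calls the second
shape Op):

    
    
      -- Hom(C, -): functions out of a fixed c; covariant, post-compose.
      newtype HomFrom c a = HomFrom (c -> a)
    
      instance Functor (HomFrom c) where
          fmap f (HomFrom g) = HomFrom (f . g)   -- this is Hom(C, f)
    
      -- Hom(-, D): functions into a fixed d; contravariant, pre-compose.
      newtype HomInto d a = HomInto (a -> d)
    
      class Contravariant f where
          contramap :: (a -> b) -> f b -> f a
    
      instance Contravariant (HomInto d) where
          contramap f (HomInto h) = HomInto (h . f)   -- this is Hom(f, D)
    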

One of the earliest descriptions of category theory (including variance) is
in: Samuel Eilenberg and Saunders MacLane, "General Theory of Natural
Equivalences". Transactions of the American Mathematical Society, Vol. 58, No.
2, (Sep., 1945), pp. 231-294

It's actually fairly readable. There have been many books and articles on
category theory since then, and many specifically directed toward computer
science (e.g. Michael Barr and Charles Wells, "Category Theory for Computer
Science" \- [https://www.math.mcgill.ca/triples/Barr-Wells-
ctcs.pdf](https://www.math.mcgill.ca/triples/Barr-Wells-ctcs.pdf)).

------
TheAsprngHacker
One thing I've never understood is polarity. To my understanding, positive
types are defined in terms of their introduction rules and negative types are
defined in terms of their elimination rules. However, don't types have both
introduction and elimination rules, making them positive or negative based on
how you choose to define them?

Also, how does polarity (emphasis on introduction versus elimination rules)
relate to variance, as this article presents?

~~~
maxiepoo
The idea of polarity comes from the category-theoretic notion of a universal
property. Nice types have both introduction and elimination rules, but for
negative types the introduction rule is "reversible", whereas for positive
types the elimination rule is.

As an example, the function type `A -> B` is negative because the function
introduction rule

    
    
      G, x:A |- M : B
      ----------------------
      G |- lam x. M : A -> B
    

is a bijection: the inverse is

    
    
      G |- N : A -> B
      ----------------------
      G, x:A |- N x : B
    

The beta and eta equations encode exactly the two properties of this being a
bijection.

Positive types, like sums/alternatives/coproducts, have their elimination
rule as the reversible one, i.e. "pattern matching". So the rule

    
    
      G, x1:A1 |- K1 : B        G, x2:A2 |- K2 : B
      ----------------------------------------------------------
      G, x:A1 + A2 |- case x of { in1 x1. K1 | in2 x2. K2 } : B
    

has an inverse

    
    
      G, x:A1 + A2 |- N : B
      --------------------------------------------------------------
      G, x1:A1 |- N[in1 x1/x] : B        G, x2:A2 |- N[in2 x2/x] : B
    

The reason people say the positive types are "defined in terms of their
introduction rules" is that you say "here are all the ways to build a term of
this type" (in1 and in2 for sums) and then the elimination rule is exactly
"pattern match on all of those possibilities". There is a dual way to think of
the negative types which is "here are all the ways to use a term of this type"
and the introduction form is a "co-pattern match" where you say "inspect all
of the ways I can be used and say what to do in each case".
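
A loose Haskell rendering of the pattern / co-pattern split (a sketch, not a
formal account):

    
    
      -- Positive: Either is given by its introductions (Left, Right);
      -- eliminating one pattern-matches on every way to build it.
      elimSum :: (a -> c) -> (b -> c) -> Either a b -> c
      elimSum f _ (Left x)  = f x
      elimSum _ g (Right y) = g y
    
      -- Negative: a function is given by the one way to use it
      -- (application); introducing one says what each use returns.
      introFun :: (a -> b) -> a -> b
      introFun body = \x -> body x   -- eta: this is just body
    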

If you know about category theory then the idea is that some types are defined
by representing a functor C -> Set (positives) and others by representing a
functor C^op -> Set (negatives).

Variance, I would say, is an orthogonal concept, except that the only
primitive contravariant type former in the lambda calculus is the function
arrow, which is negative.

~~~
TheAsprngHacker
Thank you for the explanation of polarity, I found it helpful.

I just remembered that people use the +/- notation to denote covariance and
contravariance (such as in OCaml syntax and Scala syntax). I think it's
possible that the author saw this and then related the +/- notation to
polarity, even though variance is unrelated.

------
heinrichhartman
Let me give you another example:

Let X be a set, and

    
    
        Fun(X) = { real valued functions on X }.
    

Then X -> Fun(X) is contravariant.

Indeed, if $F: X -> Y$ is a map between sets and $f \in Fun(Y)$, then you
have a natural function $f \circ F: X -> IR$ in Fun(X). This is sometimes
called the pull-back of $f$ along $F$.

Functors of the kind X -> { stuff that lives on X } are often contravariant,
e.g. functions, vector bundles, sheaves, differential forms, etc.
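
As Haskell, this is one more contramap (a sketch; Fun is a made-up name,
with the value type kept polymorphic as r):

    
    
      -- Fun r x plays the role of Fun(X): r-valued functions on x.
      newtype Fun r x = Fun (x -> r)
    
      class Contravariant f where
          contramap :: (a -> b) -> f b -> f a
    
      instance Contravariant (Fun r) where
          -- The pull-back: a map F : X -> Y turns f on Y into f . F on X.
          contramap bigF (Fun f) = Fun (f . bigF)
    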

------
magicalhippo
I'm not a functional programmer, but I like reading about it to expand my
horizons.

Some days I even think "hey functional programming looks fun, I should try
it!"

Then I read articles like this and realize that ship has sailed...

I'll be sticking to my procedural code, now with a light sprinkling of
functional-ish concepts.

~~~
madushan1000
Honestly, concepts like functors are not that hard, they're way more intuitive
than some of the OOP concepts when you get used to them a little bit.

~~~
magicalhippo
Yeah I guess it's mostly about not knowing Haskell syntax so the code samples
don't really do much for me, and not knowing any category theory so that's no
use either.

I mean the Haskell wiki[1] is no use to a guy like me. Not a complaint, it's a
reference after all. Wikipedia[2] isn't much better, which I do find slightly
disappointing.

At least the Wikipedia article has some links to the concepts involved so
should be doable to interpret the terse article after a bit of extra reading.

[1]: [https://wiki.haskell.org/Functor](https://wiki.haskell.org/Functor)

[2]:
[https://en.wikipedia.org/wiki/Functor_(functional_programming)](https://en.wikipedia.org/wiki/Functor_\(functional_programming\))

~~~
dddbbb
The problem is also that in order to generically define functors, one needs
higher-kinded types. The Haskell/Scala code on the Wikipedia page is not
translatable to, say, Java.

------
theschwa
How would I use contravariants to improve the architecture of some code?

I geek out on articles like this, but I frequently struggle to see where I
would actually apply it.

~~~
DanielMcLaury
I think in a lot of cases this is sort of like asking "how can I use an
abstract factory to improve my COBOL code?"

Like, if your code isn't already written in terms of classes, you probably
can't just jump in and use OOP design patterns to do anything.

Similarly, you may need to start from a reasonably functional codebase to
apply most functional programming concepts.

------
brmgb
TLDR: If you have a function from b to c and a function from a to b then if
you run the second function before the first one you can see the whole thing
as a function from a to c. It even works with multiple arguments of type b by
converting them all before hand, no kidding.

