Corecursion and coinduction: how they relate to recursion and induction [pdf] (cam.ac.uk)

> One way to define codata is by corecursion: for example CountFrom(n) = cons(n, CountFrom(n + 1)) defines CountFrom(n) to be the infinite list starting from n:

TXR Lisp:

  1> (defun countfrom (n) (lcons n (countfrom (succ n))))
  countfrom
  2> (take 5 (countfrom 1))
  (1 2 3 4 5)
So that's what I was doing there: codata and corecursion. Who woulda figured?

It makes sense to have some terminology, I suppose. The definition of the list is clearly recursive. But the implementation doesn't actually recurse: it's not the case that countfrom calls itself, in spite of superficial appearances. A closure produced inside countfrom is (potentially) invoked, which happens after countfrom has terminated. If that happens, that closure then calls countfrom.
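Here is a rough Python rendering of those mechanics (Python rather than TXR, and the names are mine, so treat it as an illustration only): the self-call sits inside a thunk, so countfrom returns before any "recursive" call can happen:

    # sketch of a lazy cons cell: the tail is a zero-argument closure
    # (thunk) that is only called when a consumer asks for it
    def lcons(head, tail_thunk):
        return (head, tail_thunk)

    def countfrom(n):
        # returns immediately; the closure captures n and calls
        # countfrom only if/when the consumer forces the tail
        return lcons(n, lambda: countfrom(n + 1))

    def take(k, lazy_list):
        out = []
        while k > 0:
            head, tail_thunk = lazy_list
            out.append(head)
            lazy_list = tail_thunk()  # the forcing happens here, long
            k -= 1                    # after the outer countfrom returned
        return out

    print(take(5, countfrom(1)))  # [1, 2, 3, 4, 5]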


Another way to look at countfrom is that you have built--manually, as opposed to with any special syntax in your compiler--a "generator" for the sequence using a "coroutine". Had you written it in a language with syntactic support for coroutines, it would have looked directly recursive... but it wouldn't have been, as the implementation of that coroutine, expressed in terms of normal routines, would have had this same property of storing thunks that are then called iteratively by the consumer.

It thereby makes sense that the coroutine (which sits at a higher level than the routines used to implement it), not being a normal routine, doesn't have normal recursion: it instead has co-recursion. (Now, what is amazing to me is that I open this document and not once does it even use the word "coroutine", even though all of this terminology makes a lot more sense if you are willing to do so ;P.)
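Something like this Python sketch (mine, not the parent's TXR code) is what I mean; with generator syntax the definition reads as directly recursive, even though the consumer drives everything by resuming suspended frames:

    from itertools import islice

    def countfrom(n):
        yield n
        yield from countfrom(n + 1)  # reads as a self-call, but this frame
                                     # is suspended until the consumer pulls

    print(list(islice(countfrom(1), 5)))  # [1, 2, 3, 4, 5]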

It does look directly recursive.

Furthermore, in a lazily evaluated language, we could use the regular cons function, instead of an explicitly lazy lcons version.

No coroutines are used here, only lexical closures.

I changed my mind; lazy semantics plus recursion doesn't give rise to a new kind of recursion.


> No coroutines are used here, only lexical closures.

So first off, I just want to provide some clarity on the language I used, as this statement makes me think I was misunderstood: I appreciate that what you typed wasn't "using a coroutine", any more than if you typed a structure and put some functions near it in a nice and simple pattern you "used a class". I am saying the thing you typed "is" a coroutine, which is very different.

I then explicitly said that what you were doing was "manually" building a coroutine, as you are in a language that doesn't use coroutines: it uses lazy execution. Like, I disagree with that sentence: what I said was that you manually wrote what a compiler for a different language would have output, had you been able to use that feature directly, which you couldn't have, because you didn't have it.

> It does look directly recursive.

However, that aside, you are correct here, and I noticed this a few hours ago during my tenth read of your code; it confused the hell out of me, if it makes you feel any better. In particular, what I realized is that the version of your code you would type if you were to do this as a generator in a language with coroutines actually *doesn't* look recursive, and so I had played myself by getting the entire argument backwards.
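For concreteness, a sketch of my own in Python: the generator you would naturally type is a loop, with no self-call in sight:

    def countfrom(n):
        # the coroutine/generator version: iteration, not self-reference
        while True:
            yield n
            n += 1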

> I changed my mind; lazy semantics plus recursion doesn't give rise to a new kind of recursion.

Ok, so the real issue to grok here is that this function wasn't "recursive", no matter how recursive it looked. An algorithm is recursive if it conceptually takes a large thing and breaks it down into smaller things in order to achieve some goal. If your thing is infinitely large by accident, this might never end, but the idea that there is an elusive "base case" looming in the future is a premise of recursion.

The opposite of this is something which starts with something small and builds something large. You often can't even directly express this (though, again, you can write something similar) in a language without either coroutines or laziness (and while I don't think the language you are working with here has either, the DSL formed by the lcons macro and the take function is close enough), as the function you are writing is highly unlikely to naturally terminate.

Your list is something you are building up, not tearing down, so you are doing the opposite of recursion. FWIW, if you read the Wikipedia page on corecursion, which I just skimmed, it sets up a bunch of its examples using generators, as they are a natural way of expressing these kinds of algorithms in programming languages people are more likely to use... but I no longer actually believe that the version of your code expressed with coroutine syntax is "using" recursion at all (and yet, I think it "is" corecursive?), so I dunno: welcome to hell ;P.
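For instance, the Fibonacci stream, sketched by me in Python in the spirit of those Wikipedia examples: it starts from a small seed and builds upward, with no base case looming anywhere:

    from itertools import islice

    def fib_stream():
        # corecursion: start small and build up; nothing is torn down
        a, b = 0, 1
        while True:
            yield a
            a, b = b, a + b

    print(list(islice(fib_stream(), 8)))  # [0, 1, 1, 2, 3, 5, 8, 13]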


"To make a list of numbers starting from N, we make a list starting from N+1 and cons N onto the front".

How is that not obviously recursive?

If we make it a list from N to M, we can use eager evaluation, with ordinary non-tail recursion on a conventional run-time stack:

  1> (defun count-from-to (n m)
       (cond
         ((> n m) nil)
         ((= n m) (list n))
         (t (cons n (count-from-to (succ n) m))))) ;; no lcons
  count-from-to
  2> (count-from-to 1 10)
  (1 2 3 4 5 6 7 8 9 10)
What looks "funny" about the original is the lack of any termination test; how can you return a list of all the numbers from N+1, and then tack N onto the front?

However, that doesn't require any new kind of recursion. The mathematical definition given in English at the top of my comment certainly has no issue; math uses whatever evaluation semantics is needed to make things work out, and frequently describes infinities.


It is just that you are working with a more general definition of recursion than the actual one: as I said, recursion isn't merely "a function which calls itself"; it implies a certain problem-solving strategy.

Recursion is "a method of solving a problem where the solution depends on solutions to smaller instances of the same problem". If it isn't being broken down into smaller instances towards some base case--an instance small enough to solve on its own--it isn't recursion.

The new function you have described is just different, because you have moved the bound into the code, and so now the function itself is solving a different problem: it includes the definition of take inside of it.
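To make that concrete, a Python sketch of my own (using itertools for the lazy stream): the bound can live in the consumer, or it can be fused into the function itself, and fusing it in changes which problem the function solves:

    from itertools import islice

    def count_from(n):            # corecursive: unbounded stream, bounded
        while True:               # only by the consumer's take (islice)
            yield n
            n += 1

    def count_from_to(n, m):      # recursive: the take is fused in, which
        if n > m:                 # is what gives the function a base case
            return []
        return [n] + count_from_to(n + 1, m)

    assert list(islice(count_from(1), 10)) == count_from_to(1, 10)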


Recursion can solve problems using larger objects than the input.

For instance, a parser for telephone numbers of the form (123)-456-7890 might plausibly convert the case 1234567890 into this form, which is several characters longer, and then call itself recursively on that form.
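A hypothetical sketch in Python (the format check and names are mine): the recursive call receives a strictly longer string, yet the remaining work shrinks:

    def parse_phone(s):
        # bare 10-digit case: rewrite into the longer canonical form,
        # then recurse on a string several characters *larger*
        if s.isdigit() and len(s) == 10:
            return parse_phone(f"({s[:3]})-{s[3:6]}-{s[6:]}")
        # canonical (123)-456-7890 case: extract the three groups
        return (s[1:4], s[6:9], s[10:])

    print(parse_phone("1234567890"))      # ('123', '456', '7890')
    print(parse_phone("(123)-456-7890"))  # ('123', '456', '7890')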

So that is to say, a recursive term-rewriting system can be used to solve a problem, and in a term-rewriting system, the path through the rewrites can involve growth as well as shrinkage.

It meets the definition of solving a smaller instance of the sub problem if we take "smaller" as being, simply: "fewer steps remaining".

Recursion can also divide into smaller objects which represent overlapping subdivisions of the input space, but that are together larger than the input and end up creating more work. For instance, naive Fibonacci recursion.

Though, pedantically speaking, the numbers starting from N and the numbers from N+1 are sets of the same size ("countable infinity"), it's still the case that {N+1, ...} is "simpler" than {N, ...}. The latter contains all of the elements of the former, and also N, making it a proper superset, in spite of equivalent cardinality.

Recursion that proceeds from a superset to a proper subset (though they be of the same cardinality) meets the definition of identifying and solving a "smaller instance", though we cannot insist that "smaller" is interpreted in terms of raw set cardinality.


> So that's what I was doing there: codata and corecursion. Who woulda figured?

I can't quite make out the tone of it, and guessing tone is a fool's game, so I'll take it literally. I think this is one thing that people who complain about the proliferation of mathematical terminology in modern functional programming tend to miss. I see no reason to think that you're such a person, so, just to be clear, when I say 'you' below, I am addressing only a hypothetical such person, not actually you.

No-one wants to force you to think in terms of codata, coinduction, and corecursion if you're comfortable with your mental model of what lazy lists are doing; the point is rather to allow you to think in those terms—to leverage the vast theoretical power that already exists in the mathematics, rather than requiring it to be re-discovered in an ad hoc way.


The real upshot of coinduction is that it allows you to write *proofs* about objects that are hard to reason about using induction.

In an inductive proof, we:

1. prove that the base case holds, step(0).

2. show that if step(n) holds, so does step(n+1).
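As a tiny concrete instance, here is that schema in Lean 4 (my sketch, not from the article): prove the base case, then show the property is carried from step(n) to step(n+1):

    -- the induction schema in miniature: base case, then successor step
    theorem zero_add_eq (n : Nat) : 0 + n = n := by
      induction n with
      | zero => rfl                         -- 1. base case: 0 + 0 = 0
      | succ n ih => rw [Nat.add_succ, ih]  -- 2. step: step(n) gives step(n+1)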

So if we were reasoning about finite lists of numbers, we would show that some property P holds for:

1. the empty list, P([]).

2. assuming that P(xs) holds, show that P(cons(x, xs)) holds.

Now assume we want to prove something about an infinite list. What is the base case here? There is no "empty list" to start from, is there?

So the proof strategy is as follows:

0. We are trying to prove P(generator), where "generator" generates an infinite list.

1. Let the generator produce one piece of data. Say it gives us (data, <<rest of generator>>).

2. Assume (this is the coinductive hypothesis) that P(<<rest of generator>>) holds.

3. Show that P((data, <<rest of generator>>)) holds.

So, now, if we want to use the above "co-inductive" proof to prove something about a program that uses data from a generator, here's how we do it:

1. Let the program run and halt in finite time. Since it spent finite time, it can only have pulled a finite amount of data from the generator. So the generator is generator = (data_1, data_2, ..., data_n, <<rest of generator>>).

2. Assume that P(<<rest of generator>>) holds --- we don't know whether it holds or not, and our program can't inspect it anyway, so it doesn't matter! (sneaky, eh?)

3. We know that, given P(<<rest of generator>>) and the last datum the generator produced, data_n, we can show P((data_n, <<rest of generator>>)).

3b. Repeat to get P((data_{n-1}, data_n, <<rest of generator>>)).

3c. Keep repeating to get P((data_1, data_2, ..., data_n, <<rest of generator>>)).
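In code, the "finite observation" idea looks something like this Python sketch (the property and the names are mine, purely for illustration): a program that halts only ever sees a finite prefix, so P only ever has to survive finitely many observation steps:

    from itertools import count, islice

    # hypothetical property P: "every element observed so far is positive"
    def P_on_prefix(generator, n_observed):
        # a program that halts in finite time pulled only n_observed items,
        # so this finite check covers everything the program could "see"
        return all(x > 0 for x in islice(generator, n_observed))

    print(P_on_prefix(count(1), 1000))  # True: no finite observation falsifies P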

This is a really powerful idea. If I had to write down a quip to express it, here's what I'd say:

- Induction proves that _something is true_: it shows that a property holds for the whole object. It can do this because the object is finite.

- Coinduction proves that something is _not false_: it shows that a property is not falsified by any finite prefix of the infinite object we actually observe. We don't know what happens in the part that is "not observed", and we don't care, either.

The distinction really doesn't matter inside a programming language like Haskell or Lisp, because they don't really have proofs. It only matters inside something like Coq or Agda, where we want to reason about programs.

There is more I could get into, like how coinduction/copattern matching allows us to prove that a program "will be productive", just as we can show that a recursive program "will halt" by looking for an argument that becomes smaller.

For a taste of this, consider the programs:

    def fib(n):
        if n == 0: return 1
        return fib(n-1) + fib(n-2)

    def loop(n): return loop(n)
How do we prove that fib(n) halts while loop(n) does not? Note that fib(n) has a decreasing argument: the number n passed as input gets smaller on each recursive call. On the other hand, the argument of loop(n) stays the same size. We can formalize this to get a proof that fib(n) will always halt, as long as the inputs are natural numbers.

Now consider a similar case with two coinductive programs:

    def gen_const():
        yield 1
        yield from gen_const()

    def gen_loop():
        yield from gen_loop()
Which coinductive program will produce output? Clearly, it's going to be gen_const. How do we prove this? The idea is that the recursive (yield from gen_const()) is "guarded" by a (yield 1). So we are guaranteed to make progress (yield a 1) before we recursively call ourselves. On the other hand, in the case of gen_loop(), there's no production happening before the recursive yield, so we won't produce anything in finite time!
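You can see the difference by consuming both; a quick check of my own in Python:

    from itertools import islice

    print(list(islice(gen_const(), 5)))  # [1, 1, 1, 1, 1]: each recursive
                                         # step is guarded by a yield

    # list(islice(gen_loop(), 1)) never produces a value: in CPython the
    # unguarded delegation recurses until it dies with RecursionError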

All of this is completely lost without the reasoning ability of coinduction and a language like Coq, Agda, or Idris that forces you to think about how to prove things about programs.

I hope this gives some idea of "why coinduction" --- it's not something that LISPers had figured out in the 70s that the FP people are now fetishizing and renaming.

Coinduction is a powerful set of proof techniques to prove things about programs that don't terminate, but make progress: web servers, operating systems, etc. None of these are supposed to "halt"; they're supposed to "make progress". Coinduction gives us the mathematical calculus to talk about such objects.

It's an open problem how to make coinduction pleasant to use inside these proof systems. See, for example, paco in Coq, copatterns in Agda, etc. This is still bleeding-edge research.


Thank you, that was a good explanation. However…

> How do we prove that fib(n) halts while loop(n) does not?

Unless your number system involves saturating subtraction (which I admit is a possibility, though this wasn't stated), your definition of fib(n) will never halt. No matter what natural number you start with, it will eventually descend into negative numbers, since the fib(n-2) term skips over the fib(0) base case when n == 1:

    fib(2)
      = fib(1) + fib(0)
      = (fib(0) + fib(-1)) + 1
      = (1 + (fib(-2) + fib(-3))) + 1
      = ...

Argh :) Well, this is why we verify code ;) Thanks for pointing this out.
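(For the record, a minimal fix in the same sketchy spirit: widen the base case so fib(n-2) can't jump past it.)

    def fib(n):
        # base cases cover both n == 0 and n == 1, so fib(n-2)
        # can no longer skip past the base case into negatives
        if n <= 1: return 1
        return fib(n-1) + fib(n-2)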


