
Seemingly impossible functional programs (2007) - ogogmad
http://math.andrej.com/2007/09/28/seemingly-impossible-functional-programs/
======
ridiculous_fish
Here's my attempt to translate. I think it's probably wrong and welcome
corrections.

You have a function `p` which accepts an "element of the Cantor space"
(infinite sequence of bits) and returns true or false. You want to find some
value where it returns true, or prove that it returns false for _every_ value.

Crucially, `p` is not a mathematical function; it is a Haskell function. So you
can't say "p is true iff its argument is pi," because you can't write that in
Haskell. In fact, all `p` can do is look at finitely many bits: if it looked at
infinitely many bits, there would be an input on which it never returns. And
because it looks at finitely many bits, there is necessarily a largest bit
index which it looks at.
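
In the article's representation, a point of Cantor space is literally a
function from indices to bits. A sketch (taking `Bit` to be `Integer` for
brevity; `exampleP` is my own illustration, not the article's):

        type Natural = Integer
        type Bit     = Integer          -- 0 or 1
        type Cantor  = Natural -> Bit   -- an infinite sequence of bits

        -- A sample total predicate: it inspects only bits 2, 3 and 5,
        -- so 5 is the largest index it ever looks at.
        exampleP :: Cantor -> Bool
        exampleP a = a 2 == 1 && a 3 == 0 && a 5 == 1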

Now a generic-imperative programmer might attempt:

    
    
        for (bignum x = 0; x < infinity; x++)
            if (p(x)) return x;
        return NotFound;
    

but the problem is knowing when to stop, so as to return NotFound.

However, if we can detect when `p` demands more bits than we have produced,
then the search can terminate:

    
    
        int bits = 0;
        bignum x = 0;
        while (x < (1 << bits)) {
          try {
             if (p(x)) return x;
             x++;
          } catch (BitAccessOutOfBounds) {
             bits++;
          }
        }
        return NotFound;
                
    

Haskell's laziness lets the predicate "drive the loop," but the idea is not so
different.
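
For reference, the article's mutually recursive construction looks roughly
like this (a sketch using the `Cantor` type above, not the article's verbatim
code):

        -- (#) prepends a bit to a sequence.
        (#) :: Bit -> Cantor -> Cantor
        x # a = \i -> if i == 0 then x else a (i - 1)

        -- find p builds a sequence satisfying p, if any exists;
        -- evaluating p on it is what forces each bit into existence.
        find :: (Cantor -> Bool) -> Cantor
        find p = if forsome (\a -> p (0 # a))
                 then 0 # find (\a -> p (0 # a))
                 else 1 # find (\a -> p (1 # a))

        -- Does any sequence satisfy p?
        forsome :: (Cantor -> Bool) -> Bool
        forsome p = p (find p)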

~~~
shachaf
Laziness isn't the essential thing here -- the article's construction
specifically doesn't rely on laziness, and would work in a strict language
(almost verbatim -- you might need to write "lambda i: f(i)" instead of "f" in
a few places).

I'd say that the thing that makes the article's version work is that you can
pass a closure to the predicate, which is a channel for communicating extra
information that makes it not quite a black box. In an imperative language,
you could imagine passing in a function that mutates some state to get
information out (or throws an exception, the way you set it up). The article
shows that even that isn't necessary: You can pass in a sequence (represented
as a function from index to value) that closes over the predicate itself, and
recursively uses it to build a bit sequence that satisfies it, if one exists.

By the way, there's a clearer (I think) version of the core idea here that
uses conatural numbers (the naturals + infinity) that people don't usually
present, but might be worth writing up.

~~~
ridiculous_fish
In the article, `p` really does take a list, not a function! Haskell lists may
be "backed by" functions, but those functions must be pure - no back-channels.
I'm not sure whether this fact illuminates or obscures the point.

Anyway, you can't pass a closure to the predicate, and the Haskell version has
no "back channels" other than termination. The idea is to arrange the
recursive search so that the predicate (not the search) demands the bits. This
is what terminates the recursion, and it's hard to arrange without laziness
(except by introducing more functions as you say).

~~~
shachaf
No, the article's p takes a function:

> type Cantor = Natural -> Bit
> (#) :: Bit -> Cantor -> Cantor
> x # a = \i -> if i == 0 then x else a(i-1)

(You could represent bit streams with something like data Stream = Cons Bit
Stream in Haskell, of course, but that's not what the article does. The code
in the article works in e.g. Python almost as-is.)
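
For comparison, here's a sketch (mine, not the article's code) of that Stream
representation; this version does lean on laziness to terminate:

    data Bit = Zero | One deriving (Eq, Show)
    data Stream = Cons Bit Stream

    -- find p produces a stream satisfying p if one exists; the
    -- recursive tail is only forced as far as p actually looks.
    find :: (Stream -> Bool) -> Stream
    find p = if p (Cons Zero (find (\a -> p (Cons Zero a))))
             then Cons Zero (find (\a -> p (Cons Zero a)))
             else Cons One  (find (\a -> p (Cons One  a)))

    forsome :: (Stream -> Bool) -> Bool
    forsome p = p (find p)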

And you can certainly pass closures to the predicate. Not closing over a
mutable variable, of course -- I just mean that the bit stream function itself
has to have access to p, which is the trick that makes this possible. If you
couldn't do that -- if you could only query it with bit sequences that were
defined independently of p -- this wouldn't be possible.

~~~
ridiculous_fish
Coming back to this very late!

> No, the article's p takes a function

Yeah it does, but it's just misdirection.

A lot of this theory is gamified "you pick YOUR function, THEN I'll pick
MINE..." and the order is very important. Here representing sequences as
functions only obscures. The important idea is that the sequence is lazy, BUT
predetermined. It can't know what the predicate will be. That's where the
black magic comes in.

> You could represent bit streams with something like data Stream = Cons Bit
> Stream in Haskell

I wish it did, it would be much clearer. The article makes that point itself:

"For the type of infinite sequences, we could use the built-in type of lazy
lists..."

which is the point you made (which the article dismisses as insufficiently
mathematical).

> If you couldn't do that -- if you could only query it with bit sequences
> that were defined independently of p -- this wouldn't be possible.

No, there's the rub! The bit sequence and `p` are independent. You choose your
bit sequence, I choose `p`, and somehow execution finishes. That's exactly
what makes this "seemingly impossible."

There is a back-door dependence where `p` forces a lazy sequence to
materialize bits, but nothing more. The sequence cannot introspect `p`, and
vice-versa.

I think this is my third time encountering this post, and it took me a few
tries to get it!

~~~
shachaf
I didn't see this reply until today either.

On laziness, I only mean that the language feature isn't important. You can
certainly represent infinite data using laziness (and you must represent it as
some form of codata, such as a function), but it's not really the crucial
thing here.

Here's a simpler version of seemingly impossible programs, which is better for
explaining what I mean: Instead of bit streams, use conatural numbers, i.e.
the natural numbers + infinity. You can define these in various ways (e.g.
monotonic bit streams), but we'll use a regular Haskell type:

    
    
      > data Conat = Z | S Conat
    

Ignoring bottoms (which we won't permit), this has all the natural numbers as
inhabitants, but it also has infinity:

    
    
      > inf :: Conat
      > inf = S inf
    

Note that, if I give you a conatural, all you can do with it is peel off
successors; you can't distinguish infinity from "a really big number",
extensionally.

A predicate on conaturals is a total function that takes a conatural and
returns a boolean. As in the bit stream case, totality is a very strong
requirement. For example, "is n even" isn't a valid predicate, because the way
to check for evenness is to peel off pairs of successors until you reach
either 0 or 1, but that doesn't halt on infinity. In particular, to be total,
every predicate must commit eventually: after seeing at most k successors (for
some unknown k), it has to stop peeling and return true or false. If there
were no such k, the predicate wouldn't terminate on infinity.
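
For instance (two illustrative predicates of my own, not from the article):

      > -- A valid total predicate: "is n at least 3?"  It peels off at
      > -- most three successors and then commits, so it halts on inf too.
      > atLeastThree :: Conat -> Bool
      > atLeastThree (S (S (S _))) = True
      > atLeastThree _             = False
      >
      > -- NOT a valid predicate: on inf it peels successors forever.
      > isEven :: Conat -> Bool
      > isEven Z         = True
      > isEven (S Z)     = False
      > isEven (S (S n)) = isEven n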

Now you can ask: Given a predicate, how can I decide whether there's any input
that it holds for? And this is "seemingly impossible": You can test the
predicate on infinity, and, say, on every finite number up to some N, but that
doesn't give you any guarantee about the numbers you haven't tested. If all
you can do is query p with any finite set of Conats -- even interactively,
deciding what to ask it based on its previous responses! -- you don't get
enough information to decide if the predicate is satisfiable.

The trick is to construct a special conatural which is defined in terms of p
itself. If you think of this more imperatively, you can imagine a program that
queries the predicate about increasingly larger natural numbers -- 0, 1, 2, 3,
... -- and prints out an "S" each time the predicate returns false, and a "Z"
when the predicate returns true. If the predicate is false for all inputs, it
prints out "S"s forever -- but that's still a valid (productive) conatural
number.

This conatural is -- by construction -- the smallest input that satisfies p,
if there is one, and infinity if there isn't. Now you can apply p to this
number, and, if there's any input that satisfies p, p will return true.

The point I was making above is that it's crucial for this conatural to be
able to call p. It doesn't introspect p, in the sense of being able to access
its source code, but it's not independent either. And, although laziness
expresses this elegantly, I wouldn't really say it's the core of the trick
here. You could express the same thing using Turing machines that write out
"S"s to their output tape, for example (as long as you don't require them to
terminate overall, only to be productive). You do need to guarantee that the
"conatural" Turing machine that you give to p has access to p (either by
embedding its source code or by being able to call it as an oracle).

Here's the implementation in Haskell (which is simpler than the bit stream
case given in the article, and doesn't need mutual recursion):

    
    
      > -- find p returns a conatural that satisfies p, if one exists.
      > -- (If one doesn't exist, it returns infinity.)
      > find :: (Conat -> Bool) -> Conat
      > find p = if p Z then Z else S (find (p . S))
    
      > -- Does any conatural satisfy the predicate?
      > exists :: (Conat -> Bool) -> Bool
      > exists p = p (find p)
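
A quick sanity check (using the illustrative atLeastThree predicate from
above):

      > -- exists atLeastThree  == True   (find returns S (S (S Z)), i.e. 3)
      > -- exists (\_ -> False) == False  (find returns inf, but this
      > --                                 predicate never inspects its argument)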

------
amicin
I took Martin Escardo’s course on functional programming at university.
Nothing but positive things to say about the man. A fantastic teacher with a
real care for his students.

Makes me happy to see him on HN.

~~~
m_j_g
You may be interested in his brand-new course on HoTT in Agda:

[https://www.cs.bham.ac.uk/~mhe/HoTT-UF-in-Agda-Lecture-Notes...](https://www.cs.bham.ac.uk/~mhe/HoTT-UF-in-Agda-Lecture-Notes/index.html)

------
ogogmad
In a similar vein, you can write functional programs that do exact Riemann
integration:
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.54....](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.54.395&rep=rep1&type=pdf)

------
cvoss
I think I have a half-baked understanding of what's going on here. Let me talk
through it.

Take a function f : unit_interval -> bool. I believe that if there is no bound
on the execution time of f, then f must not be a total function. (So,
contrapositively, if f is total, then it has bounded execution time, and is
therefore only considering prefixes of the digit streams up to some bounded
length. From there, searching for an example input x s.t. f(x) = true is
easy.)

Naively, it would seem plausible that there exists a total function which,
nevertheless, takes longer and longer to run on a particular sequence of
inputs. So let such a sequence be given, where the runtime of f(x_i) tends to
infinity, even though it is finite for each x_i.

But from topology, we know that a sequence of points in a compact space, like
the unit interval, has a convergent subsequence. The interesting thing happens
when you look at f(y) where y is the limit point of such a subsequence. It is
the case that as i increases, x_i and y increasingly look the same for longer
and longer prefixes of their decimal (binary) expansions. Therefore, f has to
do more and more work to tell apart x_i and y, which gives us an argument that
there is no bound on the runtime of f(y); i.e., f doesn't terminate for that
particular input. (Except, you have to get nitpicky about redundant
representations of reals like 0.0099999... vs 0.010000..., but we can always
choose the representation of y that suits us: the one that looks more like the
x_i's.)
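
The standard fact lurking behind this argument (stated here for reference,
with f a total computable predicate on Cantor space) is uniform continuity:

    \exists m\, \forall \alpha, \beta \in 2^{\mathbb{N}}.\;
      (\forall i < m.\ \alpha_i = \beta_i) \Rightarrow f(\alpha) = f(\beta)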

~~~
contravariant
Yeah I think there are some small problems with your argument, unless it was
only meant to illustrate the idea.

For the sake of argument let's say we simply input both decimal
representations of each number. A program that simply tries to find the first
decimal that isn't 9 for a number in [0,1] would take an arbitrary amount of
time, but is still total because 1 can be identified in a single step.

The problem here is that while the function is total it isn't _continuous_ and
so the compactness of the topology is irrelevant. As far as I can tell the
article uses some trickery to ensure that computable functions _are_
continuous, in which case it does work.

~~~
cvoss
You're right, it's not a precise line of reasoning. I was considering only
computable functions (else "runtime" isn't a meaningful concept) and treating
the input domain as actually being the Cantor space, where 0.9999... and
1.0000... are actually distinct values. I just found it helpful to think about
the Cantor space in terms of the unit interval.

~~~
JadeNB
> the Cantor space, where 0.9999... and 1.0000... are actually distinct values

What do you mean here? The Cantor set is a subspace, not a cover, of [0, 1];
values in the Cantor set that are the same in [0, 1] are also the same in the
Cantor set.

~~~
thaumasiotes
Sure, the Cantor _set_ is the set of real numbers in [0,1] with ternary
expansions that do not contain the digit 1 at any point.

A Cantor _space_ is any topological space homeomorphic to the Cantor set.

_The_ Cantor space is the set of infinitely long bitstrings; instead of a
ternary number 0.200220222200..., you'd conceive of the same entity as the
string "100110111100...". The mapping is pretty obvious.
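
A minimal sketch of that mapping in Haskell, with both expansions represented
as digit lists (`toTernary` is my own name for it):

    -- Bit i of the string becomes ternary digit i of the Cantor-set
    -- point: 0 stays 0, 1 becomes 2.
    toTernary :: [Int] -> [Int]
    toTernary = map (* 2)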

0.99999... and 1.0000... are neither of them bitstrings, so it's not really
clear what cvoss was thinking of. But it's certainly true that "0999999..."
and "1000000..." are distinct strings.

Going back to the Cantor set, you'd usually represent 1 as 0.22222222... for
obvious reasons of consistency (that is the only format matching the
representation of the rest of the set). It would be weird to represent it as
1.000000000...; that would look like you were considering the left end of the
Cantor set in [1,2] rather than the right end of the Cantor set in [0,1].

------
dang
If curious see also

2015
[https://news.ycombinator.com/item?id=10575626](https://news.ycombinator.com/item?id=10575626)

------
Kednicma
One way to imagine this space is as an indexed bank of light switches. Each
switch starts out unset, and our goal is to find a configuration on a finite
number of switches which will satisfy our search. The search can only check
one switch at a time and must commit after each check. This allows a basic
nemesis to compute a configuration by running the search repeatedly and, each
time it fails, rescuing it by setting the last switch it checked.

A gorgeous immediate consequence is that each search is looking for a solution
to a SAT problem! Each SAT solution describes the edge of a region of Cantor
space whose elements all have the searched-for property. And thus, while the
search can proceed in finite time, it must sadly be NP-complete.

------
billforsternz
Sadly this goes over my head. But I've got to say, this is a very nice-looking
website. It's nicely organised on GitHub too.

------
ogogmad
Here is an application of the technique to proving Conway's cosmological
theorem:
[https://www.cs.cmu.edu/~kw/pubs/conway.pdf](https://www.cs.cmu.edu/~kw/pubs/conway.pdf)

