You're right about diagonalization when it's given our choices as input. When I said 'writing down' I meant 'in a self-contained way', i.e. without taking any input (such as our choices).
This rules out diagonalization, since that can't be run until it has a list of numbers to diagonalize. In the 'reverse' game, we're first running the counter-example-outputting program that Cantor provides, and then using its output to pick our numbers. If Cantor gives us a diagonalization program, we'll get a deadlock. In fact, the strategy to beat Cantor in this reverse game is simply the identity function: if Cantor's program claims that "X" isn't in our list, we can refute it by outputting "X".
The way I think about it is this: if we give Cantor our program up-front, he can run it to see what we're going to choose, then pick his 'move' accordingly (e.g. via diagonalization); if Cantor gives us his (self-contained) program up-front, we can run it to see what 'move' he's going to choose, and pick ours accordingly (e.g. via the identity function).
Both games are a win for whoever goes second.
Alternatives to the 'up-front' approach include:
- Interleaving or concurrent outputs, in which case there would be no winner: each time a number is added to the list, diagonalization extends its counter-example with another digit; each time the counter-example is extended, a number with matching prefix is added to the list.
- Interleaving or concurrent programs, with access to each others' source. I think the winner would be whoever has the most computing power (and can hence simulate their opponent further into the future).
Sorry if I wasn't making myself clear: I'm not at all trying to refute Cantor's proof by diagonalization of the uncountability of the reals. There seem to be others in this thread who do take issue with it, but I don't.
It looks like my point's been misunderstood. I'm pointing out that the whole reason that Cantor's proof works (and it does work) is because we're forced to provide our complete list up-front. Or, for those who don't like the idea of 'writing down' something infinite, we must provide a 'description' or 'program' which uniquely defines/generates our list.
You say as much here:
> you can't run his algorithm until you've committed to yours.
Exactly. In Cantor's 'game', his counter-example (a real number not in our list) is generated with full knowledge of our list. Our list is generated with no knowledge of his counter-example.
That's Cantor's proof, and it is not what I'm talking about.
I'm talking about a different 'game', where the dependency is reversed: the "counter-example" is generated with no knowledge of the list; the list is generated with full knowledge of the "counter-example", and hence is able to refute it (by including the "counter-example" in the list).
This 'inverted' setup has two important differences from Cantor's setup:
1) We must replace the program/description of 'a list of reals' with a program/description of 'a function from a real to a list of reals'.
2) We must replace the program/description of 'a function from a list of reals to a real' with a program/description of 'a real number'
In particular, point (2) means that diagonalization is irrelevant in this setup (which I repeat is not Cantor's setup!). It is irrelevant because it doesn't have the right type: it is a function from a list of real numbers to a real number but there is no place for such a function in this alternative setup; instead, we need a plain old real number. This is what I was trying to get across by saying it must be 'written down up-front' (i.e. not after waiting for us to think of a list), or being 'self-contained' (i.e. not requiring an input/reference to our list).
I hope that makes it clear why the following refutations aren't applicable:
> > if Cantor gives us his (self-contained) program up-front,
> Which he did. Diagonalization is an algorithm (and a trivial one at that).
Given what I've said above, we see that diagonalization is not self-contained/provided-up-front/fully-written-down or any of the other phrases I've tried to use to get my point across. Yes, we can write a program for diagonalization, but such a program encodes a function, which has no place in this scenario (which is not the one in Cantor's proof!).
> one of the inputs to his algorithm is your algorithm
Again, the algorithm cannot take my/your/anyone's algorithm as an input, since it cannot take anything as an input, since it is not a function, since I am not talking about Cantor's proof by diagonalization of the uncountability of the real numbers. I am talking about a different situation, in which the "counter-example" is just a number, and hence cannot be provided with any inputs, whether they be algorithms or unicorns.
> > Both games are a win for whoever goes second.
> That's right. But Cantor has to go second.
By "both games" I mean Cantor's proof and this different situation. Cantor doesn't have to go second; that's my point! Cantor wins because he chooses to go second. If we choose to go second, then we will win, by constructing a list which includes the supposed "counter-example".
> That's why his proof is correct.
I know it is; but I've never claimed otherwise, and that is completely beside the point.
The point is that Cantor's proof isn't actually about the real numbers! Instead, it's about how some infinite structures (like "all numbers") are too rich to be completely captured by any particular finite representation (there will always be missing numbers). It suffices to use a countable infinity, like the computable numbers.
Alternatively, we can think of Cantor's proof as talking about computational resources: diagonalization is trivial to state, as you say, but it is computationally difficult to run. This is because it runs the list-generator as a subroutine (to find the digits), which can be made arbitrarily hard by generating the list in an arbitrarily complex way. Since diagonalization then adds 1 mod 10, it always ends up doing slightly more work than it takes to generate the list. Hence we can see Cantor's proof as statement that more powerful computers can always beat less powerful computers, assuming the code is all globally known, because the faster computers can simulate the slower ones to see what they'll do. That the essence of the second player always winning.
> diagonalization is not self-contained/provided-up-front/fully-written-down or any of the other phrases I've tried to use to get my point across
But it is. The fact that one of its inputs happens to be a function doesn't make it any less self-contined.
> The point is that Cantor's proof isn't actually about the real numbers!
It isn't just about the reals, but it is definitely about the reals. The claim is: there exist well-defined mathematical structures that cannot be put into a one-to-one correspondence with the naturals. The proof is constructive, with the reals as the canonical example. Cantor's proof is about the real numbers in the exact same sense that Turing's proof of the non-computability of the halting problem is about Turing machines.
> diagonalization is not self-contained/provided-up-front/fully-written-down or any of the other phrases I've tried to use to get my point across
> But it is.
No it's not. It's a function. That is exactly the opposite of what I mean by "self-contained" (and all of the various rephrasings of how I defined it).
> The fact that one of its inputs happens to be a function doesn't make it any less self-contined.
True, it doesn't become less self-contained, because it's already not self-contained because it has an input. A function whose input is a function is just as self-contained as one whose input is a natural; or whose input is a teapot: they're all, precisely, not "self-contained" as per my definition.
I can't think of any clearer way of stating it. Maybe you're getting derailed by this phrase because you're interpreting it in some way other than the various equivalent, precise definitions I have given over and over?
Exactly. One way to define the same notion of "self-contained" is "not a function" , so it makes perfect sense that there are no self-contained functions.
Free variables are also not allowed, e.g. defining diagonalization in terms of some free variable "x" to stand for our list. That's effectively the same as being a function of "x", so again it doesn't count as self-contained.
Hence, a program for diagnonalization is not self-contained, since it would be a function, e.g.
function (List<Real> x) { return ... }
Some examples of programs which are self-contained are:
42
sqrt(2)
// Difficult to compute, but still a well-defined number
if (riemannHypothesis) then 1 else 2
// Doesn't halt, evaluates to 1.1111111...
let f = function(n) { return (1 / n) + f(n * 10); }
in f(1)
If we want to allow classical logic, then hypercomputation can be included too, e.g.
if (haltingProblem) then 1 else 2
So by "program" or "description" we just mean a finite representation of a (potentially infinite) object, e.g. like the `1.11111...` example, which we could never write down in 'fully expanded' form.
We need this idea of "programs" to give meaning to statements like "we must write down our entire list before Cantor chooses his counter-example", or "if we force Cantor to write down his supposed counter-example before we write down our list".
Since we're talking about infinite objects (real numbers with a never-ending list of digits, or lists of real numbers), we can't write them down in a 'fully expanded' form, since we'll never finish writing, and hence statements about what happens "after" are vacuous (for the same reason that proof assistants don't allow (non-productive) non-termination).
If we write down a program instead, then 'the entire thing' can be written down in finite time, and hence it makes sense to talk about what happens "after".
Consider the case where a real number is written down first and then a list is written down (i.e. the opposite of Cantor's proof):
- The number cannot be written down in 'fully expanded' form, since its digits never end, so we'd never get to write down the list.
- The number can be written down as a self-contained program. For example, `pi`.
- That program cannot be/use diagonalization, since diagonalization is not self-contained. We can't write down the diagonalization program as our number, since diagonalization itself is not a number (it's a function) so that's a type error. We can't apply diagonalization to anything, since we don't have access to any list of numbers (assuming that we don't have a time machine); we could make up a list of numbers to diagonalize, but in that case we might as well skip the diagonalization altogether and just make up our number.
The theorem we can prove about this scenario is that it's always possible for us to write down a list containing that number. This is trivial, since we're writing our list after the number has been written down (as a program), so we can just put that at the head of our list.
This "opposite situation" gives a result which is the "opposite" of that in Cantor's proof (in Cantor's proof the number is never in the list). Hence "the second player always wins", and the reason the second player wins is that they get to look at the first player's choice before making their own.
We could make a winning strategy for rock/paper/scissors in a similar way: no matter how complex the strategy of the first player is (even if it requires solving the Riemann Hypothesis!), we can always beat them because we get to look at their choice.
All of that is a red herring. The basic structure of Cantor's proof applies equally to finite (what you call "self-contained") lists. Consider:
Theorem: there are an infinite number of the integers.
Proof: Assume to the contrary that there are a finite number of integers. Then it is possible to construct a finite list L that contains all of the integers. Find the largest integer in L and add one. The result will be larger (by one) than the largest integer in L, and hence larger than all of the integers in L, and hence not in L, contradicting our assumption that L contains all of the integers. Hence no finite list can contain all of the integers. QED.
You are saying, "But if you give me any integer, I can produce a finite list containing that integer." That's true. But if you don't see at this point why that is irrelevant then you are beyond my ability to help.
Wow, I thought it was pretty clear when I said that self-contained == not a function. I don't know how you've interpreted that to mean 'is finite'...
The thing that must be finite isn't our list, it's the representation/program/description of our list. Here's a finite representation/program/description of an infinite list:
The list beginning with zero, where each successive element is one more than its predecessor
That represents/computes/describes an infinite list (containing all of the naturals in ascending order), but it itself is not infinite (it only contains 92 characters) and is not a list (it is a program/description/whatever).
That has nothing to do with being "self-contained". All I meant by "self-contained" is, as I've said, "not a function".
The program above is not a function; it has no inputs; it has no free variables; etc. hence it is a finite, self-contained program.
The list it describes/computes/represents is also not a function (it's a list); it has no inputs; it has no free variables; etc. hence it is an infinite, self-contained list.
If you like, we could flesh out examples elsewhere on these axes:
- An example of an infinite, self-contained program is: "the list containing the number zero, followed by the number one, followed by ..." and so on ad infinitum. That's self-contained, since it's not a function (it has no inputs), but it's infinite (of course, to represent it in a Hacker News comment I've had to use a finite description at a meta-level (the '...'), but such is the nature of these things!)
- An example of an infinite, not-self-contained program is: "given a number N, return the list containing N, followed by N+1, followed by N+2, followed ...". It's not self-contained because it is a function: it has an input (N). Again, we have to use a meta-level (the '...') to 'represent the infinite representation' ;)
- An example of a finite, not-self-contained program is diagonalization: "given a list L, return the number whose Nth digit is equal to the Nth digit of the Nth element of L, plus one mod 10." That's finite (it's only 117 characters long), and it's not-self-contained since it's a function (it takes the input 'L'). Note that in this case, we're computing a real number rather than a list, but that's irrelevant to the concepts of "self-contained" (i.e. not a function) and the distinction between representations and objects.
Just for fun, here's another not-self-contained program: "Given a binary tree, return its left-most leaf". That's got nothing to do with numbers, lists, diagonalization, etc. but we can immediately see that it's not self-contained, since it's a function.
The program/object distinction is needed because Cantors proof talks about "after" we've given/written-down/etc. an infinite list. We cannot write down such a list in 'unexpanded' form (e.g. [0, 1, 2, 3, ...and so on), since we would keep going forever; and hence any statement about what happens "after" (like Cantor finding a number that's not in our list) is meaningless, since there's no "after".
On the other hand, it's trivial to write down a finite representation/program/description of this list; that's exactly what I've written above as "The list beginning with zero...".
> The basic structure of Cantor's proof applies equally to finite lists... Theorem: there are an infinite number of the integers. Proof: ...
Yes. It also applies to Euclid's proof that there are infinitely many primes ("given a list of primes, I can write down a prime that's not in that list"). We can use it to prove that there's no "best" choice in rock/paper/scissors: given a choice, I can beat it". It's a very useful technique (as is diagonalization).
> All of that is a red herring
From what? Paying the bills? In that sense, all pure maths (including Cantor's proof) is a 'red herring'. I was pointing out (what I find to be) some interesting aspects of Cantor's proof:
- If we do the choices in the opposite order, we get the opposite result.
- If we interleave the choices (e.g. pick one more list element, pick one more digit, pick one more list element, etc.) then it's a "stalemate".
- If we make the choices concurrently, in a perfect information setting (i.e. each "player" knows their own "program" and their competitor's) then it's a win for whoever has more computing power: since they can simulate their opponent and perform the 'extra step' of add-one-mod-10, or append-number-with-diagonalized-prefix (depending on which "player" they are).
Isn't 'noticing and discussing interesting things about an abstract/formal system' basically the process of mathematics? So why is it a "red herring"? What were you hoping to achieve, other than (hopefully) understanding these aspects of the proof that I've tried to point out (which you can then ignore, or take in another direction, or whatever you like)?
> if you don't see at this point why that is irrelevant then you are beyond my ability to help.
Again, irrelevant to what and help with what? I don't plan on taking these observations any further (e.g. writing a paper), since I would imagine they're pretty clear to those with a mathematical mindset. I was pointing them out here because:
- The HN audience seems to like mathematical ideas (for example, this article!)
- The HN audience isn't all highly trained mathematicians, for whom this might be too trivial to discuss
- The HN audience seems to like computational ideas, which these aspects of the proof are related to (although classical mathematics isn't limited to turing computability; hence why I said a "program"/"description" could use hypercomputation, if we like)
> Wow, I thought it was pretty clear when I said that self-contained == not a function. I don't know how you've interpreted that to mean 'is finite'...
Because your definition doesn't make sense. Everything in mathematics is a function. I was trying to do you the courtesy of interpreting what you said in a way that made sense rather than just pointing out that you don't seem to understand what a function is.
> The program above is not a function; it has no inputs; it has no free variables; etc. hence it is a finite, self-contained program.
Now it sounds like what you're trying to get at is the concept of a "constant function" (https://en.wikipedia.org/wiki/Constant_function). Constant functions are functions, and they can have inputs.
So now it seems that you've just spent all this time and effort to point out that diagonalization is not a constant function. That's true, but I don't think it's a particularly interesting observation.
No, everything in mathematics is precisely what it is defined to be by whoever is doing the defining. If you want to introduce extraneous extra concepts into a theory, then that's fine; for example, Russell and Whitehead seemed to enjoy contorting set-based constructions into areas where they're completely unnecessary.
However, care should be taken to remain compatible with the original definitions. In this case, there is a clear distinction between "a foo" and "a function which takes some input and returns a foo which depends on which particular value was provided as input". To introduce functions as a 'universal substrate' such that everything is a function, then complain that every "foo" is a function, then introduce the concept of 'constant function' serves no purpose other than to obfuscate perfectly clear definitions under a mountain of irrelevant one-upmanship.
> So now it seems that you've just spent all this time and effort to point out that diagonalization is not a constant function.
That's what most of the time and effort have been spent on, yes. That's not at all what I was trying to "point out" though. I was simply assuming that this was common knowledge (for those familar with Cantor's proof, at least), and so I defined some terms like "self-contained" precisely so I that I could ignore algorithms like diagonalization.
> That's true, but I don't think it's a particularly interesting observation.
I agree it's not interesting. But I wouldn't even call it an "observation". Again, it's simply a trivial consequence of the definitions I gave, and I gave those definitions precisely so that I can ignore diagonalization, for this reason.
My actual observations are along the lines of "the steps of Cantor's proof can be rearranged in these ways, which give rise to these alternative results". Diagonalization doesn't matter for any of that, but you seem to keep trying to:
- Insert diagonalization into those rearrangements, despite it being a type error to do so (because diagonalization is a function, not a number; modulo whatever 'constant function' axioms would make you feel better)
- Complain that those rearrangements are "wrong" (they're not; they're my definitions, so they're "right" by definition!).
- Restate over and over various ways that my definitions do not correspond to Cantor's proof. Which is the entire point.
- "Prove wrong" those definitions, by showing that they're incompatible with various aspects of Cantor's proof (e.g. the order of steps) and diagonalization (e.g. its type). Which is completely backwards, since those definitions were chose precisely so that Cantor's proof and diagonalization will not work for them.
This rules out diagonalization, since that can't be run until it has a list of numbers to diagonalize. In the 'reverse' game, we're first running the counter-example-outputting program that Cantor provides, and then using its output to pick our numbers. If Cantor gives us a diagonalization program, we'll get a deadlock. In fact, the strategy to beat Cantor in this reverse game is simply the identity function: if Cantor's program claims that "X" isn't in our list, we can refute it by outputting "X".
The way I think about it is this: if we give Cantor our program up-front, he can run it to see what we're going to choose, then pick his 'move' accordingly (e.g. via diagonalization); if Cantor gives us his (self-contained) program up-front, we can run it to see what 'move' he's going to choose, and pick ours accordingly (e.g. via the identity function).
Both games are a win for whoever goes second.
Alternatives to the 'up-front' approach include:
- Interleaving or concurrent outputs, in which case there would be no winner: each time a number is added to the list, diagonalization extends its counter-example with another digit; each time the counter-example is extended, a number with matching prefix is added to the list.
- Interleaving or concurrent programs, with access to each others' source. I think the winner would be whoever has the most computing power (and can hence simulate their opponent further into the future).