Great article. Since it looks like a lot of folks are interested, here's some extra background.
First, what is forcing? The article actually has a great description of ultrapowers (a key part of the construction) but it goes by a little fast, so you might like Tim Chow's "A beginner's guide to forcing" [1] which does a good job not only laying out the mathematical details at a high level, but also really clearly explaining the philosophy of how you prove something independent of the axioms. I found it very illuminating.
Second, the article has some interesting notes on how mathematicians go about deciding which axioms to adopt and which not to. Penelope Maddy's "Believing the Axioms" [2] is the classic on this topic (it has two parts). It is focused on set theory, so it has a nice description of Martin's axiom and the arguments for and against. It was nice to read because it is a deeply technical argument (set theory is hard!) but at the same time there is no "right answer"—all of these axioms, after all, are independent of ZFC, and it's "ok" to add any of them. The arguments are sometimes aesthetic ("rules" versus "surprises"), sometimes pragmatic (what they can and cannot prove), and sometimes involve deep values of what the universe should be like (should higher cardinalities be like lower ones? weirder? simpler?).
It might all seem abstract, but if your day job is programming, imagine an argument over how to architect a large and complex system. Perhaps both architectures are possible, but which one is "right"? What arguments would you deploy? Set theorists are also building a large and complex system (the universe of sets) and are having arguments over how it should be built, which things it should make easy and which hard, which technologies should be supported natively (forcing?) and which should not.
> but at the same time there is no "right answer"—all of these axioms, after all, are independent of ZFC, and it's "ok" to add any of them.
Minor quibble: Just because a potential axiom is independent of ZF(C) doesn't make it necessarily "okay" to add. Potential axioms can be unsound, for example if they prove new / untrue Sigma_1 statements. As an example, even in the likely circumstance that ¬Con(ZFC) is independent of ZFC, it wouldn't be "okay" to add ¬Con(ZFC). While the resulting system would be consistent, and does have models, the resulting system is unsound (in the sense of Tarski) because it asserts the existence of natural numbers that have no "written form" (i.e. the existence of natural numbers that are larger than any term you can write to denote a natural number).
That said, Martin's axiom, like the CH (or the axiom of choice), does not have any arithmetic consequences, and thus doesn't fall into this category of problematic axioms.
> because it asserts the existence of natural numbers that have no "written form"
I don’t see why that should imply it wouldn't be "okay" to add ¬Con(ZFC).
It may be highly counterintuitive, but the history of mathematics is full of counterintuitive results that nowadays are accepted as true in mainstream mathematics.
Well-known examples are the existence of irrational numbers, the claim that the set of natural numbers has the same size as that of the rational numbers, the existence of hyperbolic geometry, and the Banach-Tarski paradox.
The Banach-Tarski paradox is fine with me: Not all subsets of the real line or R^n (for the set of real numbers R and positive integer n) are measurable. Yes, the usual example of a non-measurable set uses the axiom of choice. The sets in the Banach-Tarski paradox are not measurable -- okay.
Of course the natural numbers can be put into 1-1 correspondence with the rationals -- how to show that is the classic Cantor zigzag enumeration (sometimes called his first diagonal argument).
At a tea before a seminar, I asked Max Zorn what Paul Cohen had just proved. Zorn didn't explain it and instead just loaned me his copy of Cohen's paper -- I lost it in a move! If Zorn wasn't strongly interested in Cohen's proof, then neither should I be.
For any set of axioms (at least, any consistent, effectively listable set strong enough for arithmetic), there will be statements we can neither prove nor disprove. Surprising, interesting, got it.
Zermelo-Fraenkel set theory with the axiom of choice seems fine to me -- now back to the real work!
ZFC+¬Con(ZFC) proves "ZFC+¬Con(ZFC) is inconsistent".
So if it were the case that ZFC+¬Con(ZFC) was sound, then what it proves would be true, and "ZFC+¬Con(ZFC) is inconsistent" would be true. But that would mean ZFC+¬Con(ZFC) is actually inconsistent, and thus ZFC+¬Con(ZFC) would actually be unsound after all.
It's not "okay" because such a system proves that various particular Turing machines halt when, in fact, those machines do not halt. See https://news.ycombinator.com/item?id=27847719.
But to be a bit more specific, ¬Con(ZFC) says that the Turing machine that searches for a contradiction in ZFC does indeed halt. However (in all likelihood) such a machine does not actually halt, in the sense that it does not halt in 1 step, and it does not halt in 2 steps, and it does not halt in 3 steps, etc., and indeed (in all likelihood) for each numeral n, ZFC even proves that the machine does not halt in n steps.
(Now there is a small possibility that maybe such a machine does halt in some particular number of steps. If it does actually halt, that means it has found a proof that ZFC is inconsistent. But this scenario is even worse, because it means that ZFC itself is not only unsound, but inconsistent (and hence ZFC+¬Con(ZFC) is also inconsistent). In particular ZFC+¬Con(ZFC) being inconsistent means it proves that every Turing machine halts, which is even more wrong in general, even if it happens to be right about this particular machine.)
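For intuition, here is a minimal Python sketch (my own, not from the thread) of the machine being discussed above. The checker is a hypothetical stub: proof checking is decidable, so such a function exists and is mechanical to write, but far too long to spell out here.

```python
from itertools import count, product

def binary_strings():
    """Yield every finite binary string, shortest first."""
    for length in count(0):
        for bits in product("01", repeat=length):
            yield "".join(bits)

def is_zfc_proof_of_contradiction(candidate: str) -> bool:
    """Hypothetical stub: True iff `candidate` encodes a valid ZFC proof
    of 0 = 1. Checking a proof is a purely mechanical, decidable task,
    but a real checker is far too long for a sketch."""
    raise NotImplementedError  # assumption: a real proof checker goes here

def search_for_zfc_contradiction():
    """Halts iff ZFC is inconsistent: it checks every candidate proof of
    0 = 1 in turn and stops as soon as one passes the checker."""
    for candidate in binary_strings():
        if is_zfc_proof_of_contradiction(candidate):
            return candidate
```

¬Con(ZFC) is (an arithmetized version of) the claim that `search_for_zfc_contradiction` halts, which is why a theory proving it has to be wrong about some machine's behaviour if ZFC is in fact consistent.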
> such a system proves that various particular Turing machines halt when, *in fact*, those machines do not halt.
Er, no. The fact is that there is no fact of the matter as to whether those particular Turing machines either a: do not halt at all, or b: halt after a (colloquially) infinite number of steps. (An implication of there being no fact of the matter is that, empirically, we can't tell the difference by running them, but we can't tell the difference for a machine that halts in 2^(2^(2^256)) steps (or not at all) either, so that's not very interesting on its own.)
(As you note, we don't actually know that (for example) a Turing machine looking for contradictions in ZFC is one of those particular machines; indeed I'm not sure offhand if we actually know of any specific example of such a machine. But that's presumably not the issue here.)
ZFC+¬Con(ZFC) either wrongly proves that the machine that searches for an inconsistency in ZFC halts, or it wrongly proves that `while(true)` halts. Either way ZFC+¬Con(ZFC) is wrong about something.
ZFC+¬Con(ZFC) definitely proves that the machine that searches for an inconsistency in ZFC halts. That is a direct consequence of ¬Con(ZFC). Keep in mind that by "proves" I'm only saying that there exists a logical deduction from the axioms. I'm not saying that it is true or false. Remember that logical deductions are "truth preserving", but we can only conclude that the conclusions are true if the axioms at the start are all true.
So again, we have a proof in some sketchy axiom system that the machine that searches for an inconsistency in ZFC halts. The question is whether that conclusion is a true statement or not, whether that machine actually does or does not halt. You can start up that program on your laptop while we consider it. Maybe it will halt during our discussion. I also want to point out that whether that conclusion is a true statement or not has no bearing on whether the deduction system was sound to begin with. Unsound systems can prove true statements, simply by accident rather than by design.
Case (1): that machine doesn't really halt. Then ZFC+¬Con(ZFC) proves an untrue statement, and thus the system is unsound.
Case (2): that machine really does halt. Then this particular conclusion wasn't untrue, but we have another problem. We can take the output of that machine, decode it, and we have a proof of an inconsistency in ZFC. Again this is just a deduction; it doesn't mean that math is wrong or the world ends. Just that the ZFC axiom system is so unsound that it is inconsistent, and it has some untrue axioms. Nevertheless we can use such a proof to derive a proof in ZFC that `while(true)` halts, and in turn transform that into a proof in ZFC+¬Con(ZFC) that `while(true)` halts. And it is definitely the case that `while(true)` doesn't halt. So in this case we have found a different but still untrue statement that ZFC+¬Con(ZFC) proves. Our conclusion is the same: ZFC+¬Con(ZFC) is unsound.
So in either case `ZFC+¬Con(ZFC)` proves that some machine halts that doesn't. I haven't concluded which of the two machines it is wrong about, though I have my personal suspicions, but it is definitely wrong about one of them.
Axioms are true by definition; that's what "axiom" means. (They might also be false, but (if and) only if your system is inconsistent.)
> Case (1) that machine doesn't really halt
This is not a state of affairs: all you know at any given time is that the machine hasn't halted yet (after some specific finite number of steps), or that it has acquired some termination-precluding invariant like an infinite loop (which you can detect only strictly less than all of).
Regardless, the particular technical definition of truth I'm using is Tarski's definition of truth[1]. We don't have to agree whether Tarski's definition is "correct", and I'm happy to qualify my use of the word "truth" here, but I'm referring to a specific technical definition here.
Many entry-level real analysis courses will cover cardinality after introducing sets and functions. There's also a short article on Wikipedia. [0]
Your intuition about there being more rational numbers might be based on viewing the rationals as a proper superset of the natural numbers. You might similarly consider that there are more natural numbers than even natural numbers. However, by "renaming" every even number to half its value (which shouldn't change how many there are), we obtain the natural numbers. Formally, there exists a bijection between N and 2N, as between N and Q, and this is what mathematicians mean when they say that sets have the same size, or cardinality.
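As a concrete sketch of that renaming (my own illustration): the map n -> 2n matches every natural number with an even number, and halving undoes it.

```python
def to_even(n: int) -> int:
    """Bijection N -> 2N: rename each natural number n as the even number 2n."""
    return 2 * n

def from_even(m: int) -> int:
    """Inverse map 2N -> N: rename each even number back to half its value."""
    assert m % 2 == 0
    return m // 2

# Round-tripping shows the two sets are matched up one-to-one.
assert all(from_even(to_even(n)) == n for n in range(1000))
```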
In this case, at most one of these two a priori quite reasonable statements can be true:
- if set B is a strict superset of A, B is larger than A.
- if you can map the items in A to the items in B in 1:1 fashion, A and B have the same size.
Giving up the first is deemed less problematic than giving up the second (probably because giving up the second would mean giving up comparing sizes of infinite sets at all, but I'm not familiar enough with the area to be sure).
Hence, we've just mapped all rationals (with duplicates) to a single linear list. Since any linear list can always be mapped to the naturals (1, 2, 3, ...), they have the same cardinality.
The proof is that if you can create a bijective function between two sets, then they have the same cardinality. A zigzag-style pairing ("zip") can do this between the integers and the rationals.
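Presumably "zip" here refers to the zigzag (pairing) enumeration; here is a minimal sketch of it for the positive rationals (my own illustration), skipping fractions that are not in lowest terms so that each rational appears exactly once.

```python
from fractions import Fraction
from itertools import islice
from math import gcd

def positive_rationals():
    """Enumerate every positive rational exactly once by walking the
    diagonals p + q = 2, 3, 4, ... of the p/q grid, skipping fractions
    not in lowest terms (they already appeared earlier)."""
    s = 2
    while True:
        for p in range(1, s):
            q = s - p
            if gcd(p, q) == 1:
                yield Fraction(p, q)
        s += 1

# Pairing each output with its position 0, 1, 2, ... is the bijection.
first_ten = list(islice(positive_rationals(), 10))
assert first_ten[:3] == [Fraction(1, 1), Fraction(1, 2), Fraction(2, 1)]
```

Extending this to all rationals (zero and the negatives) just means interleaving 0, q, -q, which doesn't change the cardinality.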
> the resulting system is unsound (in the sense of Tarski) because it asserts the existence of natural numbers that have no "written form" (i.e. the existence of natural numbers that are larger than any term you can write to denote a natural number).
Isn't that basically the definition of the natural numbers, i.e. if you write down any natural number (say n) I can always construct a natural number that is larger than it (like n+1)?
This surprisingly doesn't mean repeatedly adding 1 will exhaust all natural numbers -- there are models for the natural numbers with elements that can't be reached this way!
The ultrafilter construction gives one such model. You take the set of all sequences of natural numbers (n1, n2, n3, ...) then use an ultrafilter to decide which of these sequences are considered to be equal. The usual operations of natural numbers are defined term-by-term. You can think of the usual natural numbers as being the sequences (0,0,0,...), (1,1,1,...), (2,2,2,...) and so on. However, there are many additional numbers in this system, like (0,1,2,...) that are strictly greater than all the usual numbers, and these cannot be reached by repeatedly adding one. (The reason (0,1,2,...) is greater than (n,n,n,...) is that if we do the comparison term-by-term we get (false,false,...,false,true,true,true,...), and the ultrafilter will decide the comparison is true because it is true for all but finitely many terms. Ultrafilters are devices to consistently turn infinite sequences of trues and falses into a single decision, but every ultrafilter will make a true decision in the case there are only finitely many falses.)
Counter-intuitively, proofs by induction still work for this system... speaking with no authority here, maybe an intuition is that it's doing a hypercomputation.
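To make the termwise operations a bit more concrete, here is a toy Python sketch (my own illustration; a genuine nonprincipal ultrafilter is not computable, so the comparison below only covers the easy case where the termwise answers are eventually constant, which is exactly the situation in the example above, where every nonprincipal ultrafilter must answer "true").

```python
# A (possibly nonstandard) natural is modelled as a sequence of ordinary
# naturals, represented here as a function from index to value.
def standard(n):
    """Embedding of an ordinary natural: the constant sequence (n, n, n, ...)."""
    return lambda i: n

omega = lambda i: i  # the sequence (0, 1, 2, ...): a nonstandard element

def add(a, b):
    """Arithmetic is defined term by term."""
    return lambda i: a(i) + b(i)

def eventually_less(a, b, probe=10_000):
    """Decide a < b in the easy case where the termwise comparison is
    eventually constant: 'true for all but finitely many indices' means
    every nonprincipal ultrafilter declares a < b. The probe range is
    only a heuristic stand-in for 'all sufficiently large indices'."""
    return all(a(i) < b(i) for i in range(probe, 2 * probe))

# (5,5,5,...) < (0,1,2,...): false at finitely many indices, true afterwards,
# so the ultrafilter (which ignores finite sets) answers yes.
assert eventually_less(standard(5), omega)
# Adding one term by term gives a strictly larger nonstandard number.
assert eventually_less(omega, add(omega, standard(1)))
```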
> This surprisingly doesn't mean repeatedly adding 1 will exhaust all natural numbers -- there are models for the natural numbers with elements that can't be reached this way!
Isn't that just true for models of a first-order axiomatisation of the natural numbers? As far as I know, if you write down the Peano axioms in second order logic (as is required), you do get the natural numbers uniquely, and applying the successor function over and over again does exhaust all the natural numbers eventually.
The reason why you get non-standard models of arithmetic in first-order logic is the Löwenheim-Skolem theorem, but that isn't available for second-order logic.
Yeah, my understanding is that second-order arithmetic picks out the usual natural numbers, but it is a much stronger system since you can additionally quantify over subsets of naturals in your formulas.
The context of the comment is ZFC and its concept of the natural numbers (in particular, what I had in mind was its minimal infinite inductive set). ZFC is a first-order theory, so "model" means model of a first-order theory.
> The reason why you get non-standard models of arithmetic in first-order logic is the Löwenheim-Skolem theorem
The ultrafilter construction I described is a way around the compactness theorem, in that it constructs new models semantically rather than syntactically. Ultrafilters are also a good way to prove the compactness theorem. But in any case, it's essentially the first half of the upwards Löwenheim-Skolem theorem before applying the downwards Löwenheim-Skolem theorem -- I think the significance of that theorem is that you can get models of any infinite cardinality you want, rather than just a strictly larger one. (Though I'm no logician.)
>The ultrafilter construction gives one such model.
It's worth noting that this construction relies on the axiom of choice, so exists in ZFC but not ZF. Generally a lot of counter-intuitive constructions disappear when eliminating the axiom of choice (such as the Banach–Tarski paradox and non-continuous functions).
I think the step function doesn't depend on choice. Say f(x) = 0 if x < 0 and f(x) = 1 if x >= 0. Also, suppose we define the reals through Dedekind cuts. We can define f(x) by checking whether 0 is an element of x as a cut. We can prove f is discontinuous in the usual way.
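Spelled out, with the convention that the cut representing x is the set of rationals strictly below x (so the test "0 ∈ x" picks out x > 0; either convention yields a discontinuous step):

```latex
x \;\text{is represented by the cut}\; \{\, q \in \mathbb{Q} : q < x \,\},
\qquad
f(x) =
\begin{cases}
1 & \text{if } 0 \in x,\\
0 & \text{otherwise.}
\end{cases}
```

Then f(1/n) = 1 for every n while f(0) = 0, so f is discontinuous at 0, and no appeal to choice was made anywhere.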
What I had heard is that Brouwer proved, using some kind of constructive/intuitionistic logic, that every function is continuous, but ZF is still based in classical logic. I believe the step function is not definable constructively.
Ok, but I'm not sure what your point is. Throwing out only the axiom of choice (which is the difference between ZFC and ZF) does not make things constructive. If you're thinking of Diaconescu's theorem, it's that choice implies the law of the excluded middle, but the converse is certainly not true.
I was objecting to this:
> Generally a lot of counter-intuitive constructions disappear when eliminating the axiom of choice (such as [...] non-continuous functions).
Can you explain what use these have? If natural numbers are (n, n, n, ...) then you have just made a new type of number not comparable to natural numbers.
A theoretical use is that it shows that the axioms of the natural numbers (the Peano axioms) aren't enough to pin down what we think the natural numbers should be -- there are these crazy non-standard natural numbers out there!
In the context of the discussion, this non-standard model of the natural numbers gives some intuition about what happens if the consistency of ZFC is independent of ZFC and you add in the axiom that ZFC is inconsistent: your natural numbers will have to be something similarly weird.
> you have just made a new type of number not comparable to natural numbers
But they are comparable. The everyday natural numbers are embedded in this new system, which is what these (n,n,n,...) elements represent. One way to interpret what we've done here is to add in infinities to the number system so that all the normal operations still work, and there's even a total ordering on them!
Using the real numbers instead of the naturals, you get the so-called nonstandard real numbers, which is the number system used in nonstandard analysis. Nonstandard analysis is a way to do analysis without epsilon-delta proofs. Nonstandard reals include infinitesimals and infinities, similar to nonstandard naturals having infinities. There are even textbooks on the subject -- I haven't read them, but they claim it makes analysis easier.
The last thing that comes to mind is that nonstandard numbers of this exact type show up when you study certain infinite-dimensional number systems and calculate the points (like in the end of my comment).
Dropping down to Peano Arithmetic for a moment, we can consider adding a new constant 'c' for a natural number to the language, along with the following infinite list of axioms about this remarkable constant:
- 0 < c
- 1 < c
- 2 < c
- 3 < c
...
Adding all these axioms is consistent. I.e. you can do induction up to 'c', whatever it is. Why is it consistent? Because if there was a contradiction, the proof of such a contradiction would be finite, and hence can only use a finite number of these new axioms (this is a so-called compactness argument). But clearly any finite subset of this list of axioms is consistent, because it has a model where c is just defined to be 1 more than the largest numeral appearing in that list.
But all those infinite number of axioms taken together creates an unsound system because it claims that 'c' denotes a natural number that is larger than every written numeral.
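In symbols, the theory being described is

```latex
T \;=\; \mathrm{PA} \;\cup\; \{\, \underline{n} < c \;:\; n = 0, 1, 2, \ldots \,\}.
```

Any finite subset of T mentions only finitely many of the new axioms, so interpreting c as one more than the largest numeral mentioned gives a model; by compactness T itself has a model, but in any such model c lies above every numeral.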
Heading back to ZFC land, it turns out that (assuming ¬Con(ZFC) is independent of ZFC) adding ¬Con(ZFC) to ZFC is similarly unsound in that it yields only models that have elements that are larger than every written numeral.
This sounds a little bit like an argument against systems with an infinite number of axioms to me. Or maybe it's fine but only under certain conditions?
You can find a natural number that is bigger than any one natural number, but you can't write one down that's bigger than every natural number in the ZFC sense.
Do you mean that given finite resources (like time and matter) we would be unable to express the number? Or are you talking about a sort of recursive thing where writing down a number which is a sum of all natural numbers up to and including itself is impossible since that number is bigger each time you look at it?
> that's bigger than every natural number in the ZFC sense.
I'm not sure what you mean by this.
Even for nonstandard models of the natural numbers there will never be a single number larger than all other numbers, since that would violate the Peano axioms.
Did you mean by "in the ZFC sense" the standard model?
For this entire argument I will be operating under the assumption that ¬Con(ZFC) is independent of ZFC.
ZFC+¬Con(ZFC) proves "there EXISTS a natural number x such that x is an encoding of a proof in ZFC of 0=1".
Now suppose there is actually a numeral n, i.e. a specific term written using symbols, e.g. (+,*,1), such that ZFC+¬Con(ZFC) proves that "n is an encoding of a proof in ZFC of 0=1". We can mechanically decode such an n and check whether it is indeed a proof in ZFC of "0=1" or not.
There are two possibilities: (a) the check is successful and thus we have found a proof of a contradiction in ZFC. But if ZFC is inconsistent, then it proves everything. In particular ZFC proves ¬Con(ZFC), which violates our assumption that ¬Con(ZFC) is independent of ZFC.
Alternatively (b) our check fails and n does not encode a proof in ZFC of 0=1. But that statement "n does not encode a proof in ZFC of 0=1" is a true Delta_0 statement, and we can prove by induction that ZFC proves every true Delta_0 statement. Thus we can prove that ZFC proves "n does not encode a proof in ZFC of 0=1", and hence ZFC+¬Con(ZFC) proves "n does not encode a proof in ZFC of 0=1". But now we have found a statement Q such that ZFC+¬Con(ZFC) proves both Q and also ¬Q. This means that ZFC+¬Con(ZFC) is inconsistent. But if ZFC+¬Con(ZFC) is inconsistent, by the deduction theorem, ZFC proves ¬¬Con(ZFC), or equivalently ZFC proves Con(ZFC). This contradicts our assumption that ¬Con(ZFC) is independent of ZFC (it also implies that ZFC is inconsistent).
Thus we are left with the conclusion that there is no such numeral n. However it is still the case that ZFC+¬Con(ZFC) proves "there EXISTS a natural number x such that x is an encoding of a proof in ZFC of 0=1", and the only way this can be the case is that any such x is a "natural number" which has no numeral that denotes it.
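A compact restatement of the upshot, writing Prf_ZFC(x, ⌜0=1⌝) for "x encodes a ZFC proof of 0=1", under the running assumption (which, note, implies ZFC is consistent):

```latex
\mathrm{ZFC}+\neg\mathrm{Con}(\mathrm{ZFC}) \;\vdash\; \exists x\;\mathrm{Prf}_{\mathrm{ZFC}}(x,\ulcorner 0{=}1\urcorner),
\qquad\text{while for every numeral } \underline{n}:\qquad
\mathrm{ZFC}+\neg\mathrm{Con}(\mathrm{ZFC}) \;\vdash\; \neg\,\mathrm{Prf}_{\mathrm{ZFC}}(\underline{n},\ulcorner 0{=}1\urcorner).
```

That is, the theory is ω-inconsistent: the existential statement is provable, but in any model it can only be witnessed by an element that no numeral denotes.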
On any storage system you will eventually run out of memory, so there will always be a maximum integer that you can write down with what you have, and we can only write down a vanishingly small subset of the natural numbers.
Of course, this isn't what people mean when they talk about "writing down" a number, but I've long thought that assuming the existence of natural numbers that you can't write down is poetically apt.
In some sense, the set of natural numbers is “too big” to be about everyday finite numbers alone and it seems fitting to acknowledge that our usual axioms can’t exclude weirdo large finite numbers that are simply too big to ever reach.
I wonder what you mean by "what's wrong with an unwritable natural"... It is sound to assume one! And it doesn't make any true mathematical facts false. But of course it's sound to assume many silly things. But why assume those things?
The reality is that mathematicians didn't first come up with the Peano axioms and then study the interesting consequences of them. Both as a matter of history and also as a matter of why most mathematicians do math, mathematicians first came up with numbers, and only later came up with axioms that allow them to do rigorous reasoning about them. The same is actually true of almost all math—real numbers were invented to do rigorous reasoning about calculus, which by that point had been around for a century plus; set theory likewise.
In other words, if the goal is to come up with any old consistent set of axioms, you can assume an unwritable natural number, that's fine, go ahead. But if your goal is to study natural numbers—which, like, I have a lot of experience with natural numbers in my day to day life, I'm pretty sure I could write all of them if I had enough time and space and so on—then you want to study natural numbers, not some other weird system with unwritable things in it.
This is not unsound in the usual sense of "logical soundness" (which applies to logical systems such as first-order logic, rather than specific theories in the system such as ZFC).
This is unsound in that it runs counter to certain philosophical commitments, namely that there should be some tangible, physical realization of all the natural numbers (although if you really go far in that direction you end up with ultrafinitism, which most set theorists would find unpalatable, so the philosophical implications of all this are rather tricky).
As you allude, it's rather ridiculous to make certain philosophical commitments but only apply them to natural numbers. I think ordinary finitism will do fine though, no need to go to ultrafinitism.
Mostly because it implies that some Turing machines halt that actually do not halt. That is unless you are willing to accept that a Turing machine can halt in some number of steps that is beyond any number that can be written. And I don't mean can't be written in the sense that we don't have enough paper. Just cannot be written in principle at all by our notation for numbers.
The difference here is that with busy beaver numbers, e.g. BB(101) we can, presumably, write their values with our notation; it's just that we often cannot prove that any particular value written in our notation does indeed denote the value for that function. So if we write 100000...0000 with an unholy number of 0s there, it might be the value of BB(101), in particular we might not be able to prove that it isn't.
On the other hand, for a non-standard number, c, it is definitely the case that 100000...0000 is not c, because, whatever c is, it is strictly greater than 100000...0000, or any other number we can write down.
And thus when it comes to proofs about the termination of Turing machines that do not actually terminate, the unsound system is claiming that some machine terminates, but it doesn't terminate in 1 step, nor 2 steps, nor 3 steps, nor ... nor 100000...0000 steps, nor 100000...0001 steps, nor 100000...0002 steps, nor .... However, regarding the BB(101), the (presumably) sound systems we use such as PA, or ZFC, do not claim that BB(101) isn't 100000...0000. They just may not be able to prove anything one way or the other.
They exist in that it is easy to prove "for ALL x there EXISTS y such that (there EXISTS a Turing machine M with at most x states and some n such that M prints y 1's and halts after n steps) AND (for ALL Turing machines M with at most x states, if M halts after some number of steps, then M prints no more than y 1's when it halts)". I expect you can prove this in systems as weak as Primitive Recursive Arithmetic.
It is true though that various (decidable) proof systems are unable to prove that the Busy Beaver function has any specific value beyond a certain point. However, where that cut-off is varies from proof system to proof system, with stronger proof systems being able to prove more and more values of the Busy Beaver function.
Maybe you find it weird that we can prove (Exists x. P x) without being able to prove P n for any particular numeral n? Welcome to the strange world of classical (i.e. non-constructive) mathematics.
> It is true though that various (decidable) proof systems are unable to prove that the Busy Beaver function has any specific value beyond a certain point.
Isn't the result stronger than that though? The value of BB(30) might be X. Or it might be Y. Neither value would cause any problems.
Thus, "the value" doesn't exist. There is no value that is the value of BB(30).
Yes, what you are saying is indeed the case, in the sense that if BB(30)=n for some particular numeral n is independent of whatever proof system we are focusing on (for the sake of argument let's say ZFC, though 30 feels a bit on the low side for ZFC), then we can consistently add BB(30)=n or BB(30)≠n as an axiom to the system, the same way we can for any other independent statement. And there will be arbitrarily large values of n that are independent, so we could add BB(30)=n, or BB(30)=m, and so forth, for any numeral i so long as BB(30)=i is independent of ZFC (or whatever axiom system we are considering).
But! that does not mean that all these systems are sound, for if you add the incorrect value as an axiom, specifically if you add anything but the smallest value 'm' such that BB(30)=m is independent of ZFC, then you are in a similar situation to adding ¬Con(ZFC) whereby you only have non-standard models.
The way this works is that, suppose BB(30)=m for some particular m is independent of ZFC, but BB(30) is not actually equal to m. What we will find is that there is some Turing machine M and some non-standard natural number q such that, while M doesn't actually halt in reality, according to some non-standard model it does halt in 'q' number of steps, and, in this model, when it does halt, it leaves 'm' 1's on the output tape.
That said, your comment did push me towards the limits of my comfort zone, so it might be helpful to double check what I'm saying. I don't see how it could be any other way though.
To elaborate a little further on why BB(30) does have a true value.
If BB(30) truly equals some particular numeral m, then there actually does exist a Turing machine M that has no more than 30 states and does halt with m 1's on its tape. Not only does this machine really exist, but the fact that it halts in this state is a true Delta_0 statement, and thus it is trivially provable; you don't even need induction. It is provable in Robinson Arithmetic, it is provable in Peano Arithmetic, and it is provable in ZFC. Thus we have that RA proves "BB(30) ≥ m" and PA proves "BB(30) ≥ m" and ZFC proves "BB(30) ≥ m", etc.
By transitivity each of these systems also proves "BB(30) ≥ m₀" for every numeral m₀ less than m. But if m₁ is a numeral larger than m, it is no longer the case that RA proves "BB(30) ≥ m₁"; PA doesn't prove "BB(30) ≥ m₁", and (assuming ZFC is sound) ZFC doesn't prove "BB(30) ≥ m₁" either.
Thus we "can tell" what the true value of BB(30) is because it is the greatest value in which RA (or PA or (presumably) ZFC) proves "BB(30) ≥ m" for the numeral m denoting that value. Equivalently the value of BB(30) is the least value m for which RA, PA, or any other sound system, does not prove "BB(30) ≠ m".
We "can tell" this value even if we might not be able prove it in PA, or ZFC, or "ZFC + there exists a Mahlo cardinal". Indeed each different proof system likely can extend the range of BB values that are provable.
If you add an axiom "BB(30) = m" for some particular numeral m as an axiom to ZFC and it is the wrong value of m, you will either get an inconsistent system if "m" is too small, or an unsound system if "m" is too large. In case "m" is too large, the unsound system will wrongly prove that certain Turing machines halt that do not actually halt. Specifically it will falsely prove that the Turing machine that searches for (a Turing machine with no more than 30 states that halts with "m" 1's on its tape) will halt, when such a machine does not actually halt.
Yes they are independent of ZFC but they still exist. From the original article talking about CH which is also independent:
> This independence is sometimes interpreted to mean that these questions have no answer, but most set theorists see that as a profound misconception.
> They believe the continuum has a precise size; we just need new tools of logic to figure out what that is. These tools will come in the form of new axioms.
I believe the same applies for BBs. They have precise values outside of ZFC but we still cannot ever know them because they are incomputable (??)
> but we still cannot ever know them because they are incomputable (??)
That doesn't mean you can't know them. We know BB(2). It means there is no single algorithm which is capable of yielding them all. But there could, theoretically, be an algorithm for BB(10), a different algorithm for BB(11), etc.
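For concreteness, a minimal simulator (my own sketch), run on the standard 2-state, 2-symbol champion machine; it halts after 6 steps with four 1s on the tape, which is the value usually quoted for BB(2) in the "number of 1s" convention used upthread.

```python
def run(delta, max_steps=1_000_000):
    """Simulate a Turing machine on a blank (all-0) tape.
    `delta` maps (state, symbol) -> (write, move, next_state).
    Returns (steps_taken, number_of_1s_on_tape) once the machine enters
    the HALT state; raises if it doesn't halt within max_steps."""
    tape, head, state = {}, 0, "A"
    for step in range(1, max_steps + 1):
        write, move, state = delta[(state, tape.get(head, 0))]
        tape[head] = write
        head += move
        if state == "HALT":
            return step, sum(v == 1 for v in tape.values())
    raise RuntimeError("did not halt within max_steps")

# The standard 2-state, 2-symbol champion: halts after 6 steps with four 1s.
bb2 = {
    ("A", 0): (1, +1, "B"), ("A", 1): (1, -1, "B"),
    ("B", 0): (1, -1, "A"), ("B", 1): (1, +1, "HALT"),
}
assert run(bb2) == (6, 4)
```

Verifying a lower bound like this is just finite simulation; the part that outruns any fixed proof system is the upper bound, i.e. showing that no other machine of that size does better.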
In fact, those individualized algorithms don't exist either.
By number that cannot be written, I don't mean cannot be written due to lack of paper; I mean a value that there is no notation for.
You cannot run a machine for "that many" steps because such a value is unreachable by steps. Non-standard models have a set of values that begin with a copy of the actual natural numbers, followed by some ordered set of copies of the integers in the sense that any values "beyond" the initial natural numbers have an infinite number of successors and an infinite number of predecessors, like integers do. By counting in steps it is not possible to move from a value in the initial natural number fragment to one of these non-standard values, because there are an infinite number of values in between them that you would be required to step through.
> By number that cannot be written, I don't mean cannot be written due to lack of paper; I mean a value that there is no notation for.
That's true for most real numbers in ordinary mathematics too, so if you accept the existence of those numbers, no reason why you can't do the same for the natural numbers in these non-standard models.
> You cannot run a machine for "that many" steps because such a value is unreachable by steps. Non-standard models have a set of values that begin with a copy of the actual natural numbers, followed by some ordered set of copies of the integers in the sense that any values "beyond" the initial natural numbers have an infinite number of successors and an infinite number of predecessors, like integers do. By counting in steps it is not possible to move from a value in the initial natural number fragment to one of these non-standard values, because there are an infinite number of values in between them that you would be required to step through.
Soundness in this sense (sigma_1 soundness) is defined by reference to the standard model (specifically, all sigma_1 sentences provable in the system must be true in the standard model).
It is defined by reference to a standard model. Whether there is a single standard model is a philosophical question (granted, most, but not all, set theorists believe there is). Hence, from a purely mathematical point of view, sigma_1 soundness is a relative notion.
You're absolutely right. ZFC + ¬Con(ZFC) is a weird set of axioms, and no mathematician really studies it. I skipped this point, because we weren't really discussing such axioms, but it's an important point that there are different levels of mathematical / philosophical commitments, and realism and PA are way stronger commitments than the ones people have about cardinalities.
> the resulting system is unsound (in the sense of Tarski) because it asserts the existence of natural numbers that have no "written form" (i.e. the existence of natural numbers that are larger than any term you can write to denote a natural number).
Isn't this a straightforward implication of nonstandard analysis?
'because it asserts the existence of natural numbers that have no "written form"'
I mean, mainstream axiomatic systems assert the existence of real numbers that have no written form (after all, only countably many entities can have a written form), so how is this any different?
Why does forcing work? To me it seems flawed (which obviously means I don't understand it fully).
For the diagonalization argument: 1) Assume every real can be assigned a natural number. 2) Do a bunch of steps that essentially find a new real that differs from every real you have listed in step 1. 3) Conclude that either your steps are flawed, or your initial assumption is wrong. 4) Because your steps aren't flawed, your initial assumption (that every real can map to a natural number) is wrong.
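A minimal sketch of step 2 (my own illustration, with the purported listing represented as a function from (row, digit position) to a decimal digit):

```python
def diagonal(listing):
    """Given `listing(n, k)` = k-th decimal digit of the n-th listed real
    in [0, 1), return the digit function of a real not in the listing:
    its n-th digit differs from the n-th digit of the n-th real.
    (Digits 5 and 4 are used to dodge the 0.0999... = 0.1000... ambiguity.)"""
    def digit(n):
        return 5 if listing(n, n) != 5 else 4
    return digit

# Example listing: the n-th "real" has every digit equal to n % 10.
listing = lambda n, k: n % 10
d = diagonal(listing)
# The constructed real differs from real #3 at digit #3, from real #5 at digit #5, etc.
assert d(3) != listing(3, 3) and d(5) != listing(5, 5)
```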
That all makes sense to me. Forcing seems broken though:
1) Start with a list/set of all real numbers. 2) Do a bunch of steps that essentially find a new real that differs from every real you have listed in step 1. 3) Conclude that either your steps are flawed, or your initial assumption is wrong. 4) Because your steps aren't flawed, your initial assumption (that you can produce a list/set of every real number) is wrong.
Now you're left in an odd situation. Didn't you just conclude that it's impossible to have a set of all real numbers? Which isn't what you're trying to prove at all.
"Didn't you just conclude that it's impossible to have a set of all real numbers?"
Cantor's diagonalization proof proves that it's impossible to list all the real numbers with a list of size aleph-0, which is the cardinality of the set of natural numbers.
The forcing proof is an attempt to prove you also can't do it with a list of size aleph-1, which is the size of the power set of aleph-0. It purports to prove that you need a list of size aleph-2, which is the size of the powerset of aleph-1.
You can kind of think of this not so much as "Can Crysis run?" ("can you have a set of all real numbers?") but as "Can Crysis run on this machine?" Do the real numbers need aleph-1 "resources" or aleph-2 "resources" to list?
> aleph-1, which is the size of the power set of aleph-0
In ZFC, aleph-1 is not the size of the powerset of aleph-0. Instead, aleph-1 is the next larger cardinal number after aleph-0. The size of the powerset of aleph-0 is called the continuum, or beth-1. In ZFC, we can show that the size of the set of real numbers equals the continuum, but it is not possible to pin the continuum down to a specific aleph-k (though some values, such as aleph-omega, can be ruled out using König's theorem).
Now, the statement that aleph-1 is the size of the powerset of aleph-0 is known as the Continuum Hypothesis (CH) and is independent of the axioms of ZFC. Therefore, your claim that aleph-1 is the size of the powerset of aleph-0 cannot be proven in ZFC. The statement can be made in ZFC+CH, but then the question of which aleph-k is the size of the real numbers has a straightforward answer: aleph-1.
Thanks for the reply. Question, though: in Cantor's argument we explicitly mapped the reals to aleph-0, so it makes sense that we conclude that aleph-0 is too small and that the size of the reals must be larger.
Where in the forcing process do we even "use" aleph-1? If we used aleph-1 then I could see the parallels and the argument would make sense - but all I see in the forcing process is "start with a set of all reals". Nothing about trying to map them to aleph-1. Maybe it's implicit, but couldn't I have just changed step 1 to instead be "start with a list of size aleph-90, then use forcing to prove that the reals are larger than aleph-90." In Cantor's argument it feels like we "used" the natural numbers. Here it seems more like we said take an arbitrary number of them and see this one isn't in it. But that arbitrary number of them (aleph-1) could've been any arbitrary number of them because we never really used any property of the set being aleph-1 vs aleph-90.
The article explained forcing in such a way as to simply restate what I thought we already knew: given a real, there is no "next" real (i.e., there are uncountably many reals between any two reals).
I don't see the newness that forcing brings to this.
The question is not whether there are reals between reals; it's a question of the size of sets. In particular, is there a set strictly larger than the natural numbers but strictly smaller than the reals? Forcing allows us to construct a model of ZFC in which such a set exists.
Not a mathematician, so this probably has an obvious answer, but who says that "size" is a necessary universal property of a set? It doesn't seem any more rational to speak of size as a required property of a set than it would be to treat color or weight that way. Some sets have those attributes, but not all.
It seems perfectly reasonable to say that the set of real numbers is one of those sets that doesn't have a "size" property.
Size is a hand-wavy way of saying it. What we're really asking is given two sets A and B, is there a function from A to B such that no two elements from A are mapped to a single element in B?
It turns out if you use ZFC (in particular choice) then one direction must hold. Either such a function exists from A to B or from B to A. This means that we can order all sets using the existence of these functions, and so in that sense size is a "universal property."
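In the usual notation, what's being compared is the existence of injections (this is just the parent's statement written out):

```latex
|A| \le |B| \;\iff\; \text{there is an injection } f : A \to B.
```

The Cantor–Schröder–Bernstein theorem (no choice needed) makes this antisymmetric: injections both ways yield a bijection. What choice adds is totality: for any two sets, at least one of |A| ≤ |B| or |B| ≤ |A| holds, which is why cardinality behaves like a genuine "size".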
That's weird. I thought that it was proven that the existence of sets larger than aleph-0 but smaller than the number of real numbers is undecidable and you can add it (or its negation) as an additional axiom to math.
"The continuum hypothesis, which asserts that there are no sets whose cardinality is strictly between aleph-0 and c=aleph-1. The truth or falsity of this hypothesis is undecidable and cannot be proven within the widely used ZFC system of axioms."
The illustration in the article doesn't start with the whole set of all real numbers, but with just "a" set containing an uncountable number of reals, so there is no contradiction in that sense.
I'm assuming there's some unmentioned technical condition on which sets you are allowed to choose in order for the procedure to work, otherwise the argument would indeed seem to lead to a contradiction when choosing the set of all real numbers.
This assumption is correct - the Quanta article left out quite a bit of detail about how forcing actually works, while trying to give some hint of the philosophy behind it.
As it turns out, you can still apply the forcing technique even if you start with the whole set of all real numbers. What happens in this case is a bit surprising: since there's no legitimate way to add new real numbers that weren't there previously, the forcing procedure cheats by pretending that real numbers have nonstandardly long decimal expansions.
To make the way it cheats a bit more concrete, we start by first imagining that there are "nonstandard integers" n which are bigger than all of the "true" integers from our starting model. Once we assume that these integers exist, we have to commit ourselves to providing answers to questions like, "what is the nth digit of pi?", "what is the n+1th digit of pi?", and so on in a consistent way. By answering these questions, we embed all of the original real numbers from our starting universe into the new universe. But the point is that now that we have all of these extra digits to play with, all of the original real numbers only fill up a tiny subset of the potential "real" numbers in the new universe, and we can happily go back to forcing new ones in.
Of course, adding nonstandard integers to our universe is quite a violent change, so we also have to worry about whether we've accidentally screwed up the way that the ordinals work in the process (for instance, the "first infinite ordinal" in the new universe now contains all of the nonstandard integers we added), and we need to worry about this again when we get around to forcing the new reals in. So a whole lot of technical details need to be ironed out carefully, using a few clever combinatorial arguments, to make the whole charade come together.
All of this cheating happens under the hood in a way that is not obvious at all when you first go through the technical details of the forcing construction. I only learned about the full perspective from Joel David Hamkins's "naturalistic account of forcing".
In a forcing-based consistency proof of a proposition P, you start with a model of ZFC (or a sufficiently large subset of ZFC, depending on the exact formalism being used), approximate a witness to P (e.g. a bijection between a set of reals and aleph_2) within the starting model, and then apply the forcing machinery to construct a new model of ZFC where P holds. Unlike the diagonalization argument, it's inherently metamathematical.
Maddy’s work is great and worth reading (even as an antirealist!), but it’s worth keeping in mind that debates about foundations are basically irrelevant to the working lives of the large majority of mathematicians. If you’re studying, say, extremal graph theory or the Langlands program or low-dimensional topology, the cardinality of the continuum is simply not relevant.
@pavpanchekha: thank you indeed for this great post and great references. I took ML and ran into CH, Ultrafilters but never really got my head around it. Reading with interest!
[1] http://timothychow.net/forcing.pdf [2] https://www.cs.umd.edu/~gasarch/BLOGPAPERS/belaxioms1.pdf