If PA proves that it proves a statement, PA cannot conclude from that fact that it proves that statement.
If PA proves that it proves a statement, and then fails to prove it, PA is unsound.
There exist collections of statements such that PA proves that it proves each statement, and PA does prove each statement, but PA does not prove the collection of statements.
Understanding the last point is helped by recognizing that "PA proves that PA proves S" is logically not the same statement as "PA proves S", even though they always have the same truth value.
If PA proves that a number exists with some mathematical property - including being a Gödel number of a proof of something - then some number with that property must exist in every model, including the standard model. So there would have to be a standard number encoding a proof, and the proof that it encodes would have to be correct, assuming your Gödel numbering is correct.
The Gödel numbers for all of the standard statements in that collection will indeed exist.
But if it is an infinite collection, then a nonstandard model of PA will have statements in that collection that are not in the standard model, and they generally don't encode for correct proofs. (For one thing, those proofs tend to be infinitely long.)
We are talking past each other. I am responding to this:
"If PA proves that it proves a statement, PA cannot conclude from that fact that it proves that statement."
When you say "PA proves that it proves a statement," this usually means that it proves the existence of a Gödel number of the proof of the statement. If PA proves such a Gödel number exists, then via completeness one must exist in the standard model, and this number will be a standard natural number encoding a valid finite length proof.
If the above somehow doesn't apply to the argument you are making: how?
The existence of a Gödel number in a model of PA does not imply the existence of a proof corresponding to that Gödel number. PA can only prove the existence of the Gödel number.
Now consider a function, definable from PA, named prove-gn. It is defined such that (prove-gn n) is the Gödel number for a proof from PA that G(n) terminates. This function is somewhat complicated, but it absolutely can be constructed. And from PA we can prove that the function works as advertised.
Suppose that we are in a nonstandard model of PA. For every standard natural n, (prove-gn n) will be a standard natural, and will correspond to an actual proof. So far, so good.
But for any nonstandard natural n, and this model has many, (prove-gn n) will be a nonstandard natural. That Gödel number does not correspond to a valid proof. And therefore, in this model, PA will prove that it proves a statement that it does not actually prove.
Therefore, from PA, we may prove that PA proves a statement, even though it does not prove that statement. In fact the statement may even be false. And absolutely none of this causes any contradiction or consistency problem, so long as the statement in question is a statement about a nonstandard natural number, and not a standard one.
Because there is no way, from within PA, to determine whether we are in the standard model or a nonstandard model, from PA we cannot conclude that "PA proves that it proves" implies "PA proves". They really are different concepts. And we will cause ourselves no end of confusion if we confuse them.
> There exist collections of statements such that PA proves that it proves each statement, and PA does prove each statement, but PA does not prove the collection of statements.
I think this is true (although I would like a significantly more succinct proof than one using Goodstein sequences as I mention here https://news.ycombinator.com/item?id=44277809), but regardless this is substantially weaker than your assertion that there are statements PA can prove that it can prove, but cannot itself prove.
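For context, the base-bumping operation behind Goodstein sequences is short to implement. A toy Python sketch (the function names here are my own, and the sequence starts in base 2 as usual):

```python
def bump(n, b):
    """Rewrite n in hereditary base-b notation, replace every
    occurrence of b (including in the exponents) by b + 1,
    and evaluate the result."""
    if n == 0:
        return 0
    total, power = 0, 0
    while n:
        digit = n % b
        # exponents are themselves rewritten hereditarily
        total += digit * (b + 1) ** bump(power, b)
        n //= b
        power += 1
    return total

def goodstein(n, max_steps=100):
    """Initial terms of the Goodstein sequence starting at n."""
    seq, b = [n], 2
    while n > 0 and len(seq) <= max_steps:
        n = bump(n, b) - 1
        b += 1
        seq.append(n)
    return seq

print(goodstein(3))  # [3, 3, 3, 2, 1, 0]
```

Small starting values terminate quickly; already goodstein(4) takes astronomically many steps, which is what makes PA's inability to prove termination for all n plausible.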
In particular I can prove that for all propositions P, PA proves P if and only if PA proves "Provable(P)" (i.e. "P is provable in PA").
Given PA proves P, take the finite proof of P in PA and do the usual Gödel encoding to get a single number. That is a witness to the implicit existential statement behind "Provable(P)".
In the other direction, given PA proves "Provable(P)", take the natural number witness of the implicit existential statement. Because the standard natural numbers satisfy PA (this is basically equivalent to the assumption that PA is consistent), we know that this natural number is in fact a standard, finite natural number. Hence we can transform this natural number back into a proof of P in PA.
This equivalence is precisely the reason why the provability relation (more accurately for our purposes stated as "provability in PA") "Provable" is valuable. If the equivalence didn't hold it would be very weird to call this relation "Provable".
In particular, this means PA proves P if and only if PA proves "Provable(Provable(P))" (and so on for as many nestings of Provable as you'd like).
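The "usual Gödel encoding" step above can be made concrete. A toy sketch of one standard scheme (prime-power Gödel numbering; the symbol codes are arbitrary positive integers I chose for illustration):

```python
def first_primes(k):
    """First k primes by trial division (fine for toy inputs)."""
    primes, n = [], 2
    while len(primes) < k:
        if all(n % p for p in primes):
            primes.append(n)
        n += 1
    return primes

def godel_encode(codes):
    """Encode a sequence of positive symbol codes as the single
    number 2**c1 * 3**c2 * 5**c3 * ... (codes must be >= 1)."""
    n = 1
    for p, c in zip(first_primes(len(codes)), codes):
        n *= p ** c
    return n

def godel_decode(n):
    """Recover the sequence by reading off prime exponents."""
    codes, p = [], 2
    while n > 1:
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        codes.append(e)
        p += 1  # advance to the next prime
        while any(p % q == 0 for q in range(2, p)):
            p += 1
    return codes

assert godel_encode([1, 2, 3]) == 2 * 3**2 * 5**3  # 2250
assert godel_decode(2250) == [1, 2, 3]
```

The round trip is exactly the two directions of the argument: a finite proof becomes one number, and any standard number with the right factorization decodes back to a finite symbol sequence.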
> If PA proves that it proves a statement, PA cannot conclude from that fact that it proves that statement.
I kind of think I know what you mean by this, but definitely your conclusion is way too strong. In particular, it is true in a certain intuitive sense that PA cannot "use" the fact of "Provable(P)" to actually prove P. It is in a sense a coincidence that for every P for which PA can prove P, PA can also prove "Provable(P)", since that equivalence is carried "outside" of PA. But that's not really anything to do with PA itself; it's just the nature of any encoded representation.
Even what tromp is saying is quite strange to me (in apparently implying that it's normal for PA to have an omega-consistency issue). It would be very strange to assert PA is omega-inconsistent. It is tantamount to saying PA is inconsistent, since stating that PA is omega-inconsistent is saying that the standard natural numbers are not a valid model of PA.
> Even what tromp is saying is quite strange to me
Indeed I mis-spoke. While for all n, PA can prove P(n), it cannot prove "for all n: P(n)".
I don't know if there is a name for this, but it's not an omega-inconsistency.
It would only be omega-inconsistent to have PA contradict "for all n: P(n)", i.e. to prove "there exists an n: not P(n)".
When Kolmogorov complexity is expressed directly in terms of the pure untyped lambda calculus, the oldest and simplest model of computation (except perhaps for SK combinatory logic, which is less expressive bit-for-bit), you cannot really complain of hidden complexity in the axioms. Even numbers have to be built from scratch.
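For a taste of "even numbers have to be built from scratch": Church numerals, sketched here in Python lambdas rather than raw lambda calculus syntax:

```python
# Church numerals: the number n is "apply f, n times".
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add  = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
mul  = lambda m: lambda n: lambda f: n(m(f))

# Convert back to an ordinary int only for inspection;
# inside the calculus there is nothing but functions.
to_int = lambda n: n(lambda k: k + 1)(0)

three = succ(succ(succ(zero)))
print(to_int(add(three)(three)))  # 6
print(to_int(mul(three)(three)))  # 9
```

Every data type in pure lambda calculus (booleans, pairs, lists) is built the same way, out of nothing but abstraction and application, so there is no hidden axiomatic machinery to smuggle complexity into.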
> Kolmogorov complexity (wikipedia.org) is: "… the length of a shortest computer program (in a predetermined programming language) that produces the object as output."
> Small programs are interesting. But I’m also interested in small programming languages. Generally speaking, the smaller the language, the less expressive it is.
It was the study of Kolmogorov complexity that motivated me to come up with the smallest and most expressive language for writing minimal size programs, and to define the various flavors of Kolmogorov complexity in terms of it [1].
Hi, I like your work. Do you think there could be a universal combinator (like iota) and some binary encoding that would produce even smaller interpreters than BLC?
I read your paper, and you show that a simple encoding of the S and K combinators seems to yield a longer interpreter. However...
What I don't fully understand: the complexity measure seems to depend on the binary encoding. So in theory, you could have a primitive combinator such that the interpreter is really short to express, but the combinators for pair, true and false are rather long (when expressed in that primitive), thus hiding complexity in them.
Makes me wonder if you should only measure the complexity of the interpreter combinator, and not also the complexity of the "quoting" combinator, which would transform an already existing encoded combinator into a doubly encoded one, forcing you to represent the pair, true and false combinators as well.
> Do you think there could be a universal combinator (like iota) and some binary encoding that would produce even smaller interpreters than BLC?
No, I don't think so. An iota-based code (1 for application, 0 for iota) loses out badly to the SK-based code (1 for application, 00 for K, 01 for S). It would instead be better to pick slightly larger bases, picking up some combinators like I, B, C, or W.
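The size comparison can be made concrete by counting bits for the standard iota expansions K = ι(ι(ιι)) and S = ι(ι(ι(ιι))) under the two codes. A toy sketch (my own term representation; this counts term sizes only, not whole interpreters):

```python
# Terms: 'i' is the iota combinator; ('app', a, b) is application.
I = 'i'
def app(a, b):
    return ('app', a, b)

def iota_bits(t):
    """Bit length under the iota code: 0 for iota, 1 for application."""
    if t == 'i':
        return 1
    return 1 + iota_bits(t[1]) + iota_bits(t[2])

# Standard definitions: K = iota(iota(iota iota)), S = iota(K-term).
K_iota = app(I, app(I, app(I, I)))
S_iota = app(I, K_iota)

# Under the SK code, K is "00" and S is "01": 2 bits each.
print(iota_bits(K_iota), iota_bits(S_iota))  # 7 9
```

So every occurrence of K costs 7 bits instead of 2, and every S costs 9 instead of 2, while applications cost 1 bit in both codes, which is why the iota code loses.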
> So in theory, you could have a primitive combinator such that the interpreter is really short to express but the combinators for pair, true and false are rather long (when expressed in that primitive), thus hiding complexity in them.
Of course, we're not just optimizing for self-interpreter size, but for overall size across a large set of tasks, such as the important theorems of AIT, or such as the milestones in a busy beaver (surpassing Graham's Number / TREE(3) / Loader's Number [1]).
The terms you mention, namely pair, true and false, are not tasks, but the tiniest of data structures that you commonly use in programming tasks. Optimizing just for those is not as effective as optimizing for larger tasks that need to balance lots of data types and control structures.
> He guesses the ice in the berg is at least 1,000 years old, but could also be exponentially more ancient — even formed as many as 100,000 years ago.
That's not exponentially more (which would be a preposterous 2^1000 or 10^1000 years old). It's just 100 times more. Should I stop being annoyed at how media use the word and just accept their alternative meaning of "a lot"?
High variance / wide confidence interval. It probably needs some C14 / O18 dating by field researchers gathering samples to narrow it down, rather than us speculating from afar.
This is how language develops, I’m afraid. But imagine that the age is 10^k where k is something like “age class”. Then indeed the age grows exponentially :)
Journalists tend to just think of it as "a lot more", but since they didn't specify the base of the exponential we can at least find a way to make the article technically correct. There are fun classes that admit incomparable values, such as the Surreal games. If they'd said "the game {1 | -1} is exponentially more than { | }" then it'd be impossible to find a base to make the statement true. There's lots of fun to be had with this sort of math, as you know.
If we want to express ourselves using exponents, consider that 1000 years (1×10^3) and 9000 years (9×10^3) would be of the same "degree" of ancestry, while 100,000 years (1×10^5) would be of completely different (exponential) significance.
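A toy sketch of that "degree" idea (using digit count rather than a float log to avoid rounding concerns):

```python
def degree(n):
    """Order of magnitude of a positive integer: floor(log10(n))."""
    return len(str(n)) - 1

print(degree(1_000), degree(9_000), degree(100_000))  # 3 3 5
print(100_000 // 1_000)  # 100: only 100x more, not "exponentially"
```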
Of course you're using gravity on a device built out of little slopes, just as the Turing Tumble is.
Add a hand-cranked marble lift to a Turing Tumble and the two are much more similar. One is gate cranked, while the other is circuit cranked.
While the Roons device has much more mechanical overhead for distributing the clock signal, it does allow for more looping paths.