Probably more accurate to say that BB(3, 3) looks hard; that is, it encodes a Collatz-type problem, and many Collatz-type problems are very hard to solve (including, of course, the classic Collatz conjecture).
However, this instance might not necessarily be hard. For one, the behaviour seems to be heavily biased; for another, we only have to consider a single trajectory instead of the trajectories for all integers (as with the classic Collatz problem).
Yeah, I've oversimplified a bit with this title. The more accurate statement is in the first paragraph of the article: "Solving the BB(3, 3) problem is at least as hard as solving this Collatz-like problem."
I also agree somewhat on the one trajectory vs. multiple trajectories point. However, note that (assuming we live in the world where this TM never halts) proving a single trajectory in this system is "harder" than a single trajectory in the classic Collatz conjecture. Specifically, (assuming the Collatz conjecture is true) proving any single trajectory is "simply" a finite computation. However, proving a single trajectory from the article requires showing that it never halts which will require some more fancy math!
Anywho, I don't want to oversell it. This does not prove that BB(3, 3) requires proving the Collatz conjecture or any existing well-studied open problem in math. But I think it's sort of a "second best" result: it shows the problem is at least as hard as a problem akin to a well-studied one.
How hard is this Collatz-like problem? Well, let's see if anyone can solve it :)
> "Solving the BB(3, 3) problem is at least as hard as solving this Collatz-like problem."
> How hard is this Collatz-like problem? Well, let's see if anyone can solve it :)
I thought John Conway already proved that all instances of the halting problem can be converted to a Collatz-like problem [1]. So one could say this about all BB values, not just BB(3, 3). Some will be easy, some will be hard, but all are reducible to Collatz-like problems IIUC.
You are right that every TM can be converted into a Collatz-like problem using Conway's Fractran compilation. So technically the statement "Solving the BB(n, k) problem is at least as hard as solving a Collatz-like problem." is true in general.
However, the Collatz-like problems you will get from this compilation will be gigantic; they will not have distilled the problem into a similar description of its behavior, but instead created a more complicated way of observing that behavior. The Collatz-like problem I present here is a simplification of the behavior of this TM. If you observe the machine running, you will see that it is effectively performing these transitions.
In other words, I am not arbitrarily choosing to convert this to a Collatz-like problem simply because it is possible. I am looking at the behavior of this machine and that behavior turns out to be Collatz-like naturally.
Of course none of this proves that my Collatz-like problem really is hard ... but as someone else here mentioned, being hard is not a mathematical thing, it is a belief we have about certain problems we cannot solve after considerable effort.
I think the point here is that it was not known whether all 3,3 halting problems are reducible to trivial Collatz-like problems. Unless someone is able to observe that sligicko's is actually trivial, then this provides a counterexample.
Some reductions via Conway's method give non-trivial Collatz problems even though other simple techniques show the halting question for those machines to be trivial; in other words, the Conway reduction can turn a trivial Turing machine into a non-trivial Collatz problem.
I'm hoping someone can enlighten me here. My understanding is that there is a turing machine of 748 states [0], which halts iff ZFC is inconsistent (Thm 1). But this machine is a "physical" object, in the sense that we can materialize it on a computer and run it. Though we don't have the computing power for this currently, there is nothing in principle stopping us from running this machine for BB(748) steps: if it halts, we have proven by Thm 1 that ZFC is inconsistent. If not, we have similarly proven that ZFC is consistent.
I want to stress that this is key to my confusion. This is not just some abstract result; this is a computation that we can perform and draw a real value from.
Of course, I'll now fall back on godel's second incompleteness theorem and say that one cannot prove, inside ZFC, that ZFC is consistent. But if the above turing machine halts, then we proved ZFC is consistent - a contradiction!
Where is the mistake here? My current guess is there is a technical detail in the proof of Thm 1 which uses a stronger metatheory than ZFC to show that the 748-state turing machine halts iff ZFC is inconsistent. This is not a contradiction, because yes, we can run the turing machine for BB(748) steps, but that will only show that ZFC+ proves ZFC is consistent, which is already well known - ZFC + there exists an inaccessible cardinal does the job.
However, I haven't looked at the paper in detail to know whether this is the case. Does anybody who has thought deeply about this problem have insight to offer?
The issue is the "run the turing machine for BB(748) steps" part. We don't know what BB(748) is. If the god of busy beavers came to us and told us that value, then we could (in theory) run the TM that long and just like you say, that would prove whether ZFC is consistent. But in order for us mere mortals to compute BB(748) we would effectively need to figure out if this specific 748-state TM ever halted (along with all the other 748-state TMs).
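To make this concrete, here is a minimal sketch (in Python rather than as a Turing machine) of what we could do if some oracle handed us a number N together with a proof that N >= BB(748). The names `zfc_machine`, `initial_configuration`, and `step` are hypothetical placeholders, not anything from the paper:

    # Hedged sketch: decide whether the 748-state "ZFC machine" halts,
    # ASSUMING we were handed a provable upper bound N >= BB(748).
    def halts_within(machine, n_steps):
        config = machine.initial_configuration()   # hypothetical API
        for _ in range(n_steps):
            if config.is_halted():
                return True
            config = machine.step(config)           # hypothetical API
        return config.is_halted()

    # If halts_within(zfc_machine, N) is False, the machine never halts, so by
    # the paper's Thm 1 ZFC is consistent. The catch, as said above: no such N
    # is available to us, and no bound on BB(748) is provable in ZFC (assuming
    # ZFC is consistent).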
A lot of math draws conclusions from something not possible in practice (e.g. "let's take all the natural numbers and ...").
All the parent says is: if it's possible even theoretically to learn the answer to the BB problem, then we've proven something that cannot be proven, as shown by Gödel incompleteness.
> This is not just some abstract result; this is a computation that we can perform and draw a real value from.
No, this isn't a computation we can perform. There isn't enough energy in the visible Universe we can use to increase entropy to run this computation.
Even if we built a computer that would use all matter and energy in the Universe, even if the computer only had one task and even if it ran the task as efficiently as is physically possible, it would not complete the computation.
So this is kinda where mathematics gets disconnected from physics and reality in that we can talk and reason about those things but they no longer have physical meaning.
It doesn't matter whether we can physically do it, as long as we can mathematically do it. In math running a TM for an arbitrary, finite number of steps is not a problem. The actual answer to OP's question was given in one of the other subthreads, namely: the result tells us that BB(754) is not computable in ZFC. Some 754-state TMs halt, others run forever, but there is no way to figure out (without an oracle) which of them runs the longest while still eventually halting.
>"It doesn't matter whether we can physically do it,"
Hmm. I'm not sure that I agree - philosophically if you could prove that something is computable mathematically and yet at the same time know that physically it cannot be done due to the laws of physics, I would say that imposes a "second order" halting problem.
By which I mean if you had some mathematical algorithm that could be proven to tell you some property of a system like halting, and you knew the number of steps required by that algorithm to produce that answer, if that number of steps is beyond the capacity of the physical world to provide, then there is a set of algorithms that can be proven to never be decidable.
I'm sure you are thinking, oh maybe we get more efficient, but to solve the bounding problem itself you must provide a constant time solution within the bounds of the universe, or else the "mathematical" solution is itself undecidable.
And even if you found a fast, low constant time solution (the holy grail), it still must be shown that the categories of problems are themselves infinite variations of finite categories, so that your low constant time solution could conceivably be applied within the limits of computation capacity, or else there will always be for some large enough set of problems some that are not decidable.
If you insist that only tractable problems are decidable (or computable), you don't really get a nice or useful theory of computation. You need a more fine-grained hierarchy.
Almost all real numbers are uncomputable (computable reals have measure zero; in other words, the probability of an arbitrary real number being computable is exactly 0). BB(754) is uncomputable in this sense even though its definition is extremely concise. Still, its value could be computed by a machine stronger than a TM (but it would again be unable to compute its own BB problem).
Of the countable subset of reals that are computable, almost all are like 𝜋, in that computability does not mean being able to actually calculate or represent its exact numeric value in the physical world. Only that it's in principle possible to compute it to any precision desired – but of course this isn't possible in reality either. Only in math.
Then there are numbers that are unbelievably large, like 3^^^^3 or (the immensely larger) Graham's number G, which nevertheless admit an incredibly compact definition, and we can prove various properties of them. Yet their numeric representation would not fit in our future light cone (either in space or in time). Where do you draw the line? After all, G is an entirely ordinary natural number, and indeed almost all natural numbers are greater than G.
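As a side note, here is a minimal sketch of the up-arrow notation behind "3^^^^3"; the recursion is only evaluable for toy inputs, which is rather the point:

    # Knuth up-arrow notation: knuth_arrow(a, k, b) computes a with k up-arrows applied to b.
    # 3^^^^3 is knuth_arrow(3, 4, 3) -- far too large to ever evaluate;
    # this is only meant to make the definition concrete.
    def knuth_arrow(a, k, b):
        if k == 1:
            return a ** b
        if b == 0:
            return 1
        return knuth_arrow(a, k - 1, knuth_arrow(a, k, b - 1))

    print(knuth_arrow(2, 2, 3))  # 2^^3  = 2^(2^2) = 16
    print(knuth_arrow(2, 3, 3))  # 2^^^3 = 2^^4    = 65536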
Complexity theory draws a line, somewhat arbitrarily, between polynomial-time and superpolynomial-time problems, the former being tractable and the latter intractable. But a number might be easily defined in terms of a tractable function and still be entirely unfeasible to ever calculate or represent in the physical world. (Such a number wouldn't even have to be particularly large if the function that defines it grows sufficiently slowly.)
Nowhere is there a distinction made between "computable in reality" and "only computable in math", because it's impossible to formally define such a distinction. Some numbers are obviously within our reach, almost all others obviously aren't. In between there's a huge gray area.
Hmm, I think it's the same problem: you are somewhat arbitrarily drawing the line between "math" and the universe (or maybe I'm agreeing with you and don't realize it).
A halting/decidable problem is one that is precisely about tractability - does it end, and can you decide that. The maths is about whether there is a tractable solution, in theory ignoring any energy/mass requirements.
The maths of the maths of the tractability issue is also one of tractability - if you provide an infinite solution to an infinite problem then is it a solution or an admission of defeat? You can't even decide if you can decide that you can decide.
If there is no maths for the maths for the decidability question of whether it is tractable (not even energy-computable), then in my mind there is no difference between the two, i.e. you have no higher order reduction with which to express the lower order problem. Without a constant time solution, there may not be enough math to express the solution to the problem, i.e. this may just be another set of infinities, as you mentioned the integers, irrational numbers, precision, etc. (which are lower order entities for which we have tractable mathematical solutions).
And the lower energy bound of the universe to the set of problems to me remains interesting, in that maths itself is an invented set of relationships, in the sense that the set is invented and less in that the relationships are invented (they exist but we value certain relationships over others, so the set is invented), so any maths that cannot survive and exist within the information space that is the universe is to some degree irrelevant - if there is no way to relate to it, does it exist? I think so, we simply lack the ability to process or observe it.
Consider, for instance, a constant time solution to the decidability problem where the maths to prove it requires a set of equations and theorems larger than the processing capacity of the universe. Then a solution exists, perhaps, but we will never discover it. It doesn't mean it doesn't exist, but it does not exist for us, and we can neither prove nor disprove that one exists.
I think that it would be fairly easy (meaning that I think I could do it in 8 hours) to build a Turing Machine that prints 10^1000 1's on its tape and then halts. We would not be able to construct and run a physical Turing machine that could do every move physically (by moving electrons) within 100 billion years, but we can still write the proof that it halts, probably with less than 3 pages. (In fact, we could probably tell you which state it is in for any i in the set {1, ..., halting step} without ever constructing and running a physical machine.)
On the other hand, if the "shortest" proof of halting is more than 10^20 pages long, then we would have major problems "writing" it.
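As a sketch of the same idea in Python rather than as a literal Turing machine (my own toy example, not the parent's construction): a program whose halting, and exact running time, can be proven on paper even though no physical computer could ever finish running it.

    # Provably halts after exactly n iterations: the quantity n - count strictly
    # decreases each pass and the loop exits at 0. No physical machine could run
    # this to completion for n = 10**1000, but the termination proof doesn't care.
    def print_many_ones(n=10**1000):
        count = 0
        while count < n:
            count += 1        # stand-in for "write a 1 on the tape"
        return count

    print(print_many_ones(10**6))   # small n: actually runnable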
I agree with your overall point, but note that computation doesn't actually increase entropy (we can do reversible computation, e.g. using quantum gates, Toffoli gates, billiard-ball computers, etc.)
Even if the universe is infinite, you can't use its infiniteness because you can't communicate partial results across infinite distances.
It is natural to think that the more time you have, the further you can potentially travel. But when it comes to the universe, the opposite is actually true. The more time passes, the less of the universe you can reach. A lot of the universe you can see today is actually not reachable at all, meaning that even if you shine a light toward it now, that light will never reach its destination.
Computational capacity must not be able to travel faster than the speed of light, yes?
We are told that quantum entanglement cannot transmit information; I am unaware of how rigorously this has been established. On the off chance there is something the scientists have missed, the requirement to travel might not apply, either through some advanced use of entanglement (a novel approach or a not-yet-understood state of matter) or, reaching for a more exotic theory, some additional dimension enabling warp- or wormhole-like behavior.
Then computational capacity would depend on some ratio of power (to entangle, set up facilities, etc.) to time spent not doing these things, and might possibly have an upper limit, or not. Perhaps this is a halting problem in itself?
Only because our universe is currently expanding. If it weren't, or if the expansion stopped in the future, then given enough time you could eventually distribute the task over a large enough region to perform the computation and afterwards combine the results in one place. And one would probably have to throw in a couple of technical requirements, for example that the energy density does not decrease faster than the future light cone expands.
It can be finite, but you have to definitively prove that the non-halting cases run forever (which, being infinite, can't be checked by brute force).
Look, you're arguing the Collatz conjecture can be settled by counting up by 1 and checking whether a number hits a loop without reaching 1 (which, if there is such a case, would eventually disprove it by refutation; but if not, you'd never know whether you were about to hit an answer).
Mathematical uncomputability for problems with answers is a thing. Read up on Gödel's Incompleteness Theorem
Spacetime (might be) infinitely large, but that doesn't imply that there is infinite matter or energy.
That said, this is kind of a dumb argument because far lower k values already have BB(k) values that exceed the apparent information content of the universe at any given instant. Maybe there is infinite energy and matter, but that's also irrelevant if we can't perceive more than a finite subset of it.
Edit: well, i guess what would matter would be the information value of time, multiplied by enough bits to store the machine—I'm not sure i'm literate enough in the area to compute that. But, assuming that the heat death of the universe reaches a single (possibly compressed) end-state, it should still be finite—it's seeming quantized, anyway.
It does not matter how much energy is available in the Universe. Even if there is an infinite amount of it, you still can't use it, because only a finite amount can ever be reached / affected by any single observer. Only a finite amount of the universe can ever reach any single observer.
So for example, there is a limit on the mass of the computer that can be constructed and still send its result to a single spot in space in the future. And the longer you wait, the smaller the limit because less and less of the universe is available to you to build the computer.
> Of course, I'll now fall back on godel's second incompleteness theorem and say that one cannot prove, inside ZFC, that ZFC is consistent. But if the above turing machine halts, then we proved ZFC is consistent - a contradiction!
No, the machine halts iff ZFC is inconsistent -- as you correctly stated up top. Somewhere along the way you got this reversed, looks like. There's the problem.
You're right, I misstated this - but I don't think this is fatal. The other sibling commenters pointed out the real issue with my thinking.
The argument goes the same even though I misspoke here. If the machine {halts, runs forever} then ZFC is consistent. But this is a contradiction; so ZFC must be inconsistent. Tada, I have an inconsistency proof!
That was the implied next step which made me think my logic was clearly incorrect (which, it was).
It's simpler than this still. If it runs forever (likely), then you will never be able to say anything about ZFC.
If you see it halt, ZFC is inconsistent. If you never see it halt, you CAN'T conclude anything.
But we could already do that under Gödel incompleteness, so there's nothing unusual there!
If you write down random proofs on paper and find a correct proof that leads to contradiction, you've proved ZFC inconsistent, without using BB. If you keep trying forever and never find one, you'll never be able to conclude anything at any point, just like with watching the machine run
Yep. But I think it's easy to show that this is circular, since you can't know BB(754) without knowing whether it runs forever.
And you can't prove that it'll run forever without seeing it go past BB(754) and still keep going
BB(754) is X if ZFC is consistent, Y otherwise
Since you can't prove that ZFC is consistent (only disprove), you can't know BB(754), which is the thing we were trying to use to determine whether ZFC is consistent in the first place!
The definition doesn't make it obvious, but this is just the same as plain Gödel incompleteness, we can't get any extra info about ZFC even in principle (unless we happen to see it halt, by chance)
> You're right, I misstated this - but I don't think this is fatal.
It is crucial for these types of results, when you search for a proof.
There are a lot of true statements here; better to make a list of them instead of a wall of text:
- If you observe the machine halting, ZFC is inconsistent.
- If the machine hasn't halted yet, you don't know if ZFC is consistent or not.
- If ZFC is inconsistent, the machine will eventually halt. (You have an upper bound for this, given a contradiction.)
- If ZFC is consistent, then the machine won't halt ever.
Also it is consistent with ZFC that this machine halts, since it is consistent with ZFC that ZFC has a contradiction. This means that if ZFC happens to be consistent, and you work in ZFC + "ZFC is inconsistent" (which, by Gödel, is then also consistent), then you can prove that your machine will eventually halt, yet it won't ever halt.
I don’t think that’s it. I think it’s easy to take such a machine and make a new one that halts iff ZFC is consistent.
Call the machine that halts iff ZFC is inconsistent A. Now consider the machine which computes the following algorithm:
Run A for BB(754) steps. If A has halted by then, run forever; else halt.
If A halts then our machine runs forever, and vice versa. Thus, our machine halts when ZFC is consistent and our machine runs forever when A halts, so ZFC is consistent iff our machine halts.
“Halts” doesn’t seem like a real word after writing that.
How do you make the new machine compute BB(754)? BB is the canonical example of an uncomputable function, precisely because you can decide the halting problem if you can compute it (or any upper bound). Granted, BB may be computed for specific arguments, as OP mentions for 1–4, but the existence of the ZFC-dependent machine is, at least to me, a very good argument that the boundary of what's possible is much lower.
Oh, sure. I was just pointing out that the hardness is in determining the busy beaver number and that it didn’t matter if your algorithm halts iff ZFC is consistent or if it’s an algorithm that halts iff ZFC is inconsistent.
No, if you had an algorithm that (you could prove) halts iff ZFC is consistent, then if that algorithm halts, you’ll have a proof that ZFC is consistent, which isn’t possible. Thus, the existence of such an algorithm would be a contradiction that proves the inconsistency of ZFC.
The problem with your construction is that it relies on knowing the value of BB(754), which is impossible to know so long as ZFC is consistent, since its value is dependent on the consistency of ZFC.
Conversely, if ZFC is inconsistent, then there exists a (finite) proof of this fact, so the opposite case isn’t a problem.
Essentially it’s like saying define X to be the length of the shortest proof of the inconsistency of ZFC, if one exists. If I could prove any upper bound on X, I could prove the consistency of ZFC, which, according to Gödel’s incompleteness theorem, would itself prove the inconsistency of ZFC.
The intuition is that a monkey typing randomly on a typewriter will produce texts, each of which either is or isn't a valid ZFC proof, and each valid proof either proves a contradiction or doesn't. Checking either of these things is mechanical. If ZFC is inconsistent, eventually the monkey should hit on the inconsistency.
It’s an unknown integer, whose value depends on the consistency of ZFC. Let me show you why this is circular.
I can define another integer N which is 1 if there exists a proof of the inconsistency of ZFC and 0 if there doesn’t (note that BB(754) already encodes this information). Then I can define a program that determines the consistency of ZFC thusly: if N=1, I define the program to immediately return false. If N=0, I define the program to immediately return true. Thus, there exists a program that can determine the consistency of ZFC, it’s one of the two programs I’ve defined.
The fact that there exists a program that returns the consistency of ZFC isn’t in question. The trick is proving that a particular program does so. Or if you like, proving that there exists a program along with a proof that it does so. What you’ve defined is an oracle: it depends on already knowing the answer to what you’re asking so it doesn’t have to compute it.
BB(754) is an uncomputable number. It's independent of ZFC, so an enumeration of all consequences of the axioms of ZFC doesn't contain it. How is that supposed TM of yours supposed to know whether it has run BB(754) steps or not?
Oh, but other slightly bigger TMs exist – let's say in class TM(860) for the sake of an example – that might halt after more steps than BB(754). This _sounds_ intuitive. But: how do you prove that? It might be that all TM(860)s either halt within BB(754) steps or else run forever. There indeed might be some that halt in finitely many steps beyond BB(754), but that is not guaranteed! You need to prove it. But with what?
Oops, never mind, there exists a simple construction that you can perform on each TM(754) that clearly extends BB(754) by a finite amount. Maybe you are correct that such a Turing machine exists. But it seems that identifying it isn't possible in ZFC.
I’m saying that all you’ve proven is that if you know a priori whether ZFC is consistent, you can construct a Turing machine that returns this value. If you consider that to be window dressing I don’t know what else I can tell you.
I was having a hard time reconciling this with the intuition that BB(n) is in principle "computable" (colloquially speaking) for any n - my thinking went that if I want to compute BB(n), I can enumerate turing machines and run them until they halt, since infinitely looping machines are excluded from BB(n). But of course I have now reduced this to the halting problem! How do you know when you're "done" for that n? You don't.
There's a difference between a TM/algorithm/etc. that computes a function, like BB(n) (for all Natural numbers n); versus computing a particular value, like BB(748).
For comparison, there is no TM which computes the halting function halts(p) (for all programs p); but it's easy to compute particular values like halts("exit") or halts("while(true){}")
BB(n) for any particular n is always computable, no exceptions. There simply is no single computable function that can compute BB(n) for every n, but for any particular n there absolutely is a TM that computes it.
Indeed, the value of BB(n) for each n is just a number; and it's easy to construct a TM which outputs some particular number. However, we don't know what those TMs are, for the same reason we don't know what those BB(n) numbers are.
The same applies to more limited settings too, e.g. boolean questions like whether the Collatz conjecture holds: I know that one of the following programs calculates the right answer, but I don't know which:
- PRINT "true"; HALT
- PRINT "false"; HALT
(Perhaps we should also allow `PRINT "independent of the given axioms"; HALT` as well!)
How can you come up with an "upper bound estimate" without having some idea of the structure of the computation of the specific 748 state Turing Machine in question?
Imagine you had an oracle telling you BB(748) excluding this machine. How do you get an upper bound on the runtime of this machine?
(There is an answer: this is surely not the optimal construction, and so such an oracle would give you an answer for some smaller ZFC machine which you could likely use to extrapolate a value for this machine. However, eventually you'll find a minimal state ZFC machine and that won't work anymore.)
If we've proved that a number is an upper bound on BB(748), then running for that many steps without halting means it has also run BB(748) steps without halting.
> Though we don't have the computing power for this currently
I don't think you get how big BB(748) is. To use a metaphor:
If you stuffed the entire observable universe full of computronium that can do more calculations in a fragment the size of a human cell than our entire civilization, then shrank that universe down to the size of a grain of sand and filled our entire universe with that, THEN did this once for every possible distinguishable person (i.e. if you can say after a lifetime of detailed observation that person A isn't identical to person B then they're distinguishable), you still aren't even close to BB(748). In fact, I'd be surprised if you're over BB(10), and you're definitely under BB(20).
BB(6) might be too large even to store (let alone compute) using all the mass and energy of the universe. I think the computable limit if you converted all the mass in the observable universe to energy (E = mc^2) would be BB(5), assuming adherence to the Landauer limit[0].
After calculating this I found a quote in wikipedia[1]: "There is not enough computational capacity in the known part of the universe to have performed even S(6) operations directly." That cited this paper[2], which is probably better than my model at utilizing the total available physics of the universe for calculation purposes. Anyways, the mass of the known universe is on the order of 10^56 grams[3]. Converting this all to energy using E=mc^2 yields on the order of 10^70 Joules (10^88 electron volts). Setting or clearing a single bit of information requires at minimum 0.018 eV. That allows about 10^90 bits.
BB(6) may require on the order of 10^90^2 bit flips. So it is absolutely not computable using all the mass and energy in the universe. In fact, I don't think it's even storable using all the mass and energy in the universe.
I don't really understand Busy Beaver, so if I got any of this wrong please correct me for the record.
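For what it's worth, the arithmetic above checks out as an order-of-magnitude estimate; here's a quick back-of-the-envelope script (note that the 0.018 eV figure is the Landauer bound k·T·ln 2 at roughly room temperature, ~300 K, which is an assumption about the operating temperature):

    import math

    # Rough check of the numbers quoted above (orders of magnitude only).
    mass_universe_kg = 1e56 / 1000               # ~10^56 g of mass
    c = 3e8                                      # speed of light, m/s
    energy_J = mass_universe_kg * c**2           # E = m c^2 -> ~10^70 J
    energy_eV = energy_J / 1.602e-19             # -> ~10^88-10^89 eV

    landauer_eV = 1.381e-23 * 300 * math.log(2) / 1.602e-19   # ~0.018 eV per bit
    bit_flips = energy_eV / landauer_eV                        # -> ~10^90

    print(f"{energy_J:.1e} J, {energy_eV:.1e} eV, {bit_flips:.1e} bit flips")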
His point is if you know the value of BB(748) then you don't have to wait forever, just BB(748) steps, as after that the Turing machine is guaranteed not to halt. The problem with his argument is that we don't know the value of BB(748). Not only that, it is incomputable, which resolves the contradiction.
Right, we don't know the value BB(748), and moreover, "knowing" it is not enough, we would still need a proof that a certain number matches BB(748). And such a proof is not easier than a direct proof of ZFC consistency, I presume.
BTW, a value can't be uncomputable, only a function can.
It is categorically false that BB(748) is not computable. On the contrary any particular BB(n) can be computed by some Turing Machine even though there is no Turing Machine that can compute BB(n) for every n.
A proof of existence is not the same as a construction, so the fact that we know that there exists a TM that computes BB(748) does not mean that we know which specific TM does it or how to construct such a TM.
It's easy to generate all proofs in any system: just pick any axiom, then apply any inference rule, and repeat. So if you ever generate both P and not P, then you've proven it's inconsistent.
So if it's inconsistent, it's straightforward to prove. It's only if it's consistent that we can't prove it in any finite time.
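A minimal sketch of that enumeration, assuming hypothetical `is_valid_zfc_proof` and `proves_contradiction` checkers (mechanical proof checking is possible, but these exact functions are placeholders): the search halts iff ZFC is inconsistent.

    from itertools import count, product

    ALPHABET = "0123456789()=<>-+*/,; abcdefghijklmnopqrstuvwxyz"   # toy proof alphabet

    def find_inconsistency():
        # Enumerate every finite string, shortest first, and check each one.
        for length in count(1):
            for chars in product(ALPHABET, repeat=length):
                candidate = "".join(chars)
                if is_valid_zfc_proof(candidate) and proves_contradiction(candidate):
                    return candidate   # a proof of a contradiction: ZFC is inconsistent
        # if ZFC is consistent, the loop simply never returns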
A Turing machine which halts iff ZFC is consistent seems like it would be more interesting. In theory it would be possible to run it and show ZFC to be consistent that way.
Although I imagine it suffers from the problem that it can only be shown to work within ZFC or any theory powerful enough to prove the consistency of ZFC, which tells us nothing.
Sort of. Uncomputable functions exist because the halting problem exists. So any function whose definition includes anything to do with halting also becomes uncomputable. e.g. BB is uncomputable because you have to determine all the TMs that would halt given n and m.
The entire rest of mathematics is smuggled into BB via the halting problem: you can write programs that only halt if arbitrary mathematical conjectures are true or false, so anything that wants to solve the halting problem or solve BB has to be able to know all maths[0]. Of course, this is possible because Turing-completeness is the boundary of computability. Anything that can have a computer in it is itself a computer.
[0] This isn't actually the thing that makes halting undecidable. Undecidability comes from programs that "pull themselves into the halting problem" by only halting if a hypothetical halting problem decider would claim that they don't halt.
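A minimal sketch of that "smuggling" trick, using Goldbach's conjecture as my example (the parent doesn't name one): the program below halts iff the conjecture is false, so deciding whether it halts settles the conjecture.

    def is_prime(n):
        if n < 2:
            return False
        d = 2
        while d * d <= n:
            if n % d == 0:
                return False
            d += 1
        return True

    def goldbach_counterexample():
        # Halts iff some even number >= 4 is not a sum of two primes.
        n = 4
        while True:
            if not any(is_prime(p) and is_prime(n - p) for p in range(2, n - 1)):
                return n      # counterexample found: the program halts
            n += 2            # conjecture holds so far: keep searching forever

This works cleanly for conjectures whose failure has a finite, checkable witness; as a reply below notes, it is not as straightforward for statements where a counterexample could be an infinite object (e.g. a divergent Collatz trajectory).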
> you can write programs that only halt if arbitrary mathematical conjectures are true or false
I don't think this is quite right - you can write programs that halt iff arbitrary mathematical conjectures are provable (in a suitable system of logic/axioms such as ZFC), but there are problems (such as questions about Turing machines with a halting problem oracle[0]) for which we can't straightforwardly construct a program which halts iff they are true.
> Uncomputable functions exist because the halting problem exists.
Uncomputable functions exist because there are only countably many Turing machines. There are problems that stay uncomputable even if the halting problem were computable.
There are math problems that we "know" we can't prove or disprove, unless we can prove every statement both true and false (this is Godel's first incompleteness theorem). If we can prove every statement both true and false then our proof system just sucks and proving something means nothing so we should choose another proof system where that's not the case, so we assume we're in the first case (there are math problems that we know we can't prove or disprove). Incidentally Godel's second incompleteness theorem is that we can't ever prove we are in the first case.
And then ya, BB being incomputable means that as the number of states grows, at some point the machines start to be able to encode programs that halt if and only if a problem we can't prove or disprove is true. So we can't prove that such a program does or doesn't halt.
Now, technically, proving/disproving something you can't prove or disprove would be proving false, which in turn could be used to "prove" every statement, and would therefore "encompass all maths", so what you said is in a sense correct. It's a threshold condition though, and one that kicks in well before you have turing machines large enough to encode "every" math problem. And in fact there is no finite number of states sufficient to encode every math problem (I can just make longer and longer strings of arithmetic for instance)...
So is this (BB) then just an exercise demonstrating our inability to reasonably solve the problem?
I suppose what I’m asking is what is its significance? Or why does this matter?
As a self taught programmer, I realize I have definite gaps in the more theoretical realms of computer science.
The collatz conjecture isn't a problem that we think is unsolvable, it probably (maybe even definitely?) has a solution. It is a problem that a lot of really good mathematicians have sunk a lot of time into and made basically no progress. Collectively the mathematical community has more or less decided that we don't know how to even approach solving this kind of problem yet, and probably won't be able to solve it in the foreseeable future.
This article is saying that solving for BB(3, 3) requires solving a Collatz-conjecture-like problem, so it's probably also beyond our current abilities in mathematics. It's not saying it's unsolvable, just that we probably won't figure out the solution anytime soon.
As for why solving BB(3, 3) matters... it doesn't really. It's just an intellectual curiosity. We've figured out some busy beaver numbers. We've proven some upper bounds on the biggest ones we can figure out, but there's a big gap between what we've solved and what we've shown we can't solve for. Making that gap smaller is something of a game.
The Busy Beaver problem sits somewhere on the range from "intellectual curiosity" to "lens that allows us to view the edges of uncomputability". I would guess that the majority of people doing work here are hobbyists (including myself). In fact, when Tibor Rado first introduced BB, he introduced it as the "Busy Beaver Game", so it has had a playful energy since the beginning :)
Greatly simplifying, the most important result in computability is "the halting problem", which means we have no way of telling, for any program, if given some input it will halt or run forever.
After that, it’s not surprising there is some BB we can’t solve, and it’s interesting to investigate which we can, and can’t, solve.
Not "for any program", but "for an arbitrary program".
We can trivially prove that a program that always goes to the next step, without branching, always terminates. With more effort we can prove termination of certain segments with conditional jumps, certain forms of recursion and loops, etc. This is not purely theoretical: e.g. eBPF programs in the Linux kernel are only admitted if the kernel can prove that they terminate.
What we can't do is to prove it in an arbitrary case, without putting limits on the program's structure.
If BB is a computable function, then you can solve the halting problem by running every Turing machine with n states for BB(n) steps; the ones that don't halt by then don't halt at all.
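A minimal sketch of that reduction, with `bb`, `num_states`, and `run_for` as hypothetical placeholders (the whole point being that a computable `bb` cannot exist):

    def halts_on_blank_tape(machine):
        bound = bb(num_states(machine))     # the impossible step: a computable BB
        result = run_for(machine, bound)    # simulate `bound` steps
        return result.halted                # not halted within BB(n) steps => never halts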
> Therefore, solving the BB(3, 3) problem is at least as hard as solving this Collatz-like problem
I don't understand why this is surprising, in fact it seems trivially provable. Aren't all BB(x, y) problems reducible to Collatz-like problems?
BB(x, y) problems are trivially converted to the halting problem:
- find all machines with x states and y symbols that halt, set all the non-halting ones aside.
- run all the halting machines together one step at a time until they've all halted. The number of steps you've executed is the value of BB(x, y).
I believe Conway presented a reduction from the halting problem to collatz-like problems, so finding BB(x, y) for any value of x and y can be reduced to a collatz-like problem with this two step reduction (BB to halting problem to Collatz problem).
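A minimal sketch of the first step of that reduction, assuming a hypothetical halting oracle `halts` plus placeholder helpers `machines` and `run_to_halt`:

    def busy_beaver_steps(x, y):
        # Step 1: use the halting oracle to set aside the non-halting machines.
        halters = [m for m in machines(x, y) if halts(m)]
        # Step 2: run the remaining (halting) machines to completion;
        # the longest run is BB(x, y), measured in steps.
        return max(run_to_halt(m).steps for m in halters)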
I think you got the direction reversed. Here you have a reduction of BB(x, y) to the halting problem, which only shows that (a subset of) halting is at least as hard as BB(x, y). You need a reduction from Collatz to halting to BB(x, y) instead. Collatz to halting is trivial; halting to BB(x, y) seems a bit less so - you need to precisely define which subset of halting problems can be reduced from Collatz yet is no harder than BB(3, 3).
My understanding is that this blog post converts a specific instance of the halting problem into a Collatz-like problem. It starts with a description of a specific Turing machine, and presents a Collatz-like problem that, if solved, also answers whether that Turing machine halts or not.
Isn't this almost exactly what Conway did, with the only difference being that Conway's proof works for all Turing machines, and this proof only works for a specific Turing machine?
The halting problem being reduced to [a subset of Collatz-like problems] says nothing about the uncomputability of [all the rest of the Collatz-like problems], and even less about [whether a single specific Collatz-like problem halts].
The only consequence is that the "most general" definition of Collatz-like problems as a whole is uncomputable.
The halting problem often seems to “block” a lot of approaches to algorithmic information theory and induction that are based upon computable programs. However, is there any research on whether the halting problem actually has any sort of material influence on our ability to perform real-world induction?
For instance, if an oracle could tell us whether any given monotone universal Turing machine has reached a point in its execution where it will never print anything else to the output tape, would the results of induction using this oracle differ considerably from the case where we simply “skip” to the next program in an exhaustive search of program space after its fails to produce output for n consecutive steps (for sufficiently large n)? I’m referring to induction on cases that are not contrived as edge cases or adversarial examples (e.g., BB(3,3))—just induction on “regular” compressible data.
I'm not educated in mathematics, so I can't know for sure whether what I'm about to say in response to your question is really relevant. But here it goes anyway.
I'm a security researcher, and I write my own fuzzers. Fuzzers are tools that automatically search for inputs that have security implications for the program that you're testing. They algorithmically generate and mutate inputs, feed them to a program, and observe what happens, tens/hundreds/thousands of times a second.
If an input crashes a program you can think of this as "halting" the program. To have a fuzzer that could find any or all bugs in any program you run it on, in a feasible amount of time, would surely involve solving the halting problem I think. And sure, even after billions of tests, people still manage to find bugs in image decoders, so the fuzzers we do have are not flawless.
At the same time, I have experienced in the real world that fuzzers manage to penetrate unexpectedly deep into complex programs if given enough time. The input validation performed by the target under test and the limited memory and storage of a modern PC sort of help to keep your fuzzer on the rails. Unless cryptography is involved, which acts like a computational tar-pit for fuzzers. Any well guarded/specified program acts as its own guardrail against the fuzzer needing to solve the halting problem.
It convinced me of the following, regarding the finding of security bugs in programs:
1) Fuzzers are good at targeting programs that perform rigorous input validation, barring cryptography.
2) You don't need a fuzzer for a program that does not perform rigorous input validation (where the fuzzer doesn't necessarily work well.)
I'm not sure this is what you mean but obviously in practice uncomputable problems don't differ from problems which are perfectly computable in theory but far too slow to be computed in practice.
Is there any intuition for why BBB (beeping busy beavers) can run for much longer (before quasihalting)?
One thing I see is that they effectively don't need to use a halt state. So maybe a 3-state BBB is sort of similar to a 4-state BB in that sense. What is there more to it?
A program can only halt once, but can beep any number of times. So you can make a size X program (or turing machine) that simulates running every program (or turing machine) up to some size Y >> X, and whenever any of those programs halts it beeps. The last time it beeps will be when it simulates the halting of BB(Y) after more than BB(Y) steps, so BBB(X) > BB(Y) >> BB(X).
IIRC basically this same construction means that while knowing BB(N) lets you (very slowly) compute the halting problem for programs up to size N, knowing BBB(N) lets you (even more slowly) compute the halting problem for turing machines supplied with a halting oracle up to that size.
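A minimal sketch of that construction (dovetailing every machine up to size Y and "beeping" each time one of them halts); `machines_up_to`, `initial`, `step`, and `beep` are placeholders:

    def beeper(Y):
        # Simulate all machines of size <= Y in parallel (dovetailing), one step
        # each per round. This function never halts; it merely stops beeping
        # ("quasihalts") once the BB(Y) champion has halted, after > BB(Y) rounds.
        sims = [(m, initial(m)) for m in machines_up_to(Y)]
        while True:
            for i, (m, config) in enumerate(sims):
                if config is None:
                    continue                    # this machine already halted
                config = step(m, config)
                if config.halted:
                    beep()
                    sims[i] = (m, None)
                else:
                    sims[i] = (m, config)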
Can anyone tell me what is the required pre-requisite of knowledge to understand things like this? Like will knowing basic calculus be enough? What specific topics or subjects will give me good fundamentals on these things?
A first or second year (undergrad) student doing theoretical computer science to this level is astounding to me. It was an upper-division course for me, albeit a prerequisite for things like compilers and cryptography, so I'm sure it could be put earlier in the journey.
Sipser's "Introduction to the Theory of Computation" was the book I had to read, and it certainly makes this post more accessible.
I've just checked with a few math books I like, and they are using ℕ and ℕ₀ throughout. What I don't like about ℤ⁺ is that 1) it talks about integers, although negative numbers might not be relevant in the current context at all, and 2) it's not really clear if ℤ⁺ denotes ℕ or ℕ₀. Of course, you have the same ambiguity problem with ℕ.
Oh, see how I numbered this 1) and 2), not 0) and 1)?
> Oh, see how I numbered this 1) and 2), not 0) and 1)?
So what?
> 2) it's not really clear if ℤ⁺ denotes ℕ or ℕ₀.
On the contrary. ℕ is ambiguous between the positive integers and nonnegative integers, though in my experience it's usually the nonnegative integers.
But ℤ⁺ is absolutely unambiguous. It's the positive integers.
And since you frequently need to refer to the nonnegatives, it makes sense to have a symbol for them too. There's almost no benefit to having ℕ available as a synonym for ℤ⁺. There's a lot of benefit to having it available as a synonym for ℤ\ℤ⁻.
Well, it means I usually need either ℕ or ℤ, not ℕ₀.
> And since you frequently need to refer to the nonnegatives
It depends on the context. You don't use ℕ₀ much, if you agree that counting starts at 1. You then use either ℕ or ℤ, which just looks cleaner. No reason to complicate such a simple concept as {1, 2, 3, ...} with something complicated such as ℤ⁺. As I said before, to understand {1, 2, 3, ...}, you don't need the concept of negative numbers at all, so why drag it in through confusing symbols?
> But ℤ⁺ is absolutely unambiguous. It's the positive integers.
Or the non-negative integers. And that's not just my opinion. Here is what ChatGPT has to say about that:
*system*:
You are a mathematics professor, and happy to chat about mathematics and the philosophy of mathematics.
--------------------
*user*:
What would you say ℤ⁺ stands for? Are there multiple options?
--------------------
*assistant*:
ℤ⁺ typically stands for the set of positive integers. This usually includes all the whole numbers greater than zero. However, there can be some variation in its definition depending on the context.
In some contexts, ℤ⁺ might be defined to include zero, so it would represent the set of non-negative integers. This is less common, but it does occur in some areas of mathematics.
The symbol ℤ comes from the German word "Zahlen," which means "numbers." The superscript "+" is used to denote that we're only considering the positive members of this set (or non-negative, in some cases).
So, yes, there can be multiple options for what ℤ⁺ represents, but they are all closely related. It's always a good idea to clarify these kinds of notational issues when they arise, to avoid any potential confusion.
I guess it is the difference between measuring the size of a set, and labelling the elements of the set with a number. Both can be called "counting", but the meaning is different. The first of something should always be labelled 1, but obviously the empty set has size 0.
That's the main problem I have with using ℕ for {0, 1, 2, ...}. It's easy to get into the habit of writing stuff like x_0, ..., x_{n-1} for n elements, and that's just ugly. x_1, ..., x_n is much better and clearer. On the other hand, 0 is useful when it comes to measuring the size of something (an offset, for example, or age).
I think ℕ and ℕ₀ is a good way out of this dilemma. ℕ is the natural numbers, and ℕ₀ is the natural numbers with, well, 0.
The other way out of this dilemma is what most here prefer, I guess: It's to say a label is just a label, and starting with label 0 when labelling the elements of a set, is just as good as starting with label 1. Then you just need ℕ = {0, 1, ...}, and ℤ for the integers, and you will not have much use for ℕ⁺ = {1, 2, ...}, because now sizing something and labelling something is one and the same. So you will now use x_0, x_1, ..., x_{n-1}. So you start counting from 0. I don't know, I just don't like it, but in the long run, maybe it is less confusing, because you unified the concepts of sizing and labelling, and now you can call both of them just counting.
I added a bit to my previous answer before seeing your reply. But yes, it does not really matter in terms of notation if you use ℕ and ℕ₀, or ℕ⁺ and ℕ. But both ℕ₀ and ℕ⁺ are slightly annoying compared to just ℕ, and so it changes where you start counting from: 0 or 1. If you start counting from 0, you will mostly not need ℕ⁺, and mostly just use ℕ and ℤ. If you start counting from 1, you will use ℕ more than ℕ₀, but you will use ℕ₀ often enough so that you need ℕ, ℕ₀ and ℤ.
Logic likes to unify things, so formal definitions usually start with 0, and conflate sizing and labelling. Note that Peano first started counting from 1. Later on he changed it to 0. Doesn't mean that's the right thing to do, though. Maybe these two concepts should be kept separate: ℕ for (default) labelling, and ℕ₀ for sizing.
More like discrete math and automata theory https://en.m.wikipedia.org/wiki/Automata_theory or theory of computation.
Hopcroft & Ullman's intro book is nice. But there's so much related stuff here, it's just a starting point.
Calculus has little to do with this kind of problem. You should study theoretical computer science, in particular computability theory, to understand the article. Many CS undergraduate programs will have courses with public resources you can follow.
There is a more readable state table, but in this compact format each underscore separates the group of transitions for one state, written as groups of 3 characters, one group per symbol read: the symbol to write, the direction to move, and the new state. A '---' entry means the machine halts on that transition.
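A minimal sketch of parsing that compact format, assuming the usual bbchallenge-style layout (underscore-separated state groups, 3 characters per read symbol, '---' for an undefined/halting transition); the example string in the comment is made up, not the article's machine:

    def parse_machine(s):
        # e.g. "1RB2LA1RA_2LA2RB1LB_---2RA1LA" -> {(state, read): (write, move, next_state)}
        table = {}
        for state_index, group in enumerate(s.split("_")):
            state = chr(ord("A") + state_index)
            for read in range(len(group) // 3):
                cell = group[3 * read : 3 * read + 3]
                if cell == "---":
                    continue            # undefined transition: the machine halts here
                table[(state, str(read))] = (cell[0], cell[1], cell[2])
        return table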
Click on the link in the article [0] and you'll get a page that expands it to a human-readable table. Each current (state, tape value) pair maps to a (new tape value, direction to move the tape head, new state) triple.
His survey 3 years ago has kicked off quite a flurry of Busy Beaver activity of which my entire blog and https://bbchallenge.org/ are a couple examples.
BB(3,3) is a busy beaver (BB) problem over a subset of Turing machines (TM) [0]: those with 3 states (A,B,C) and three symbols (0,1,2) that can be written onto the infinitely long tape. An additional criteria for a Busy Beaver-like Turing machine is that there is a "halt" state, which can be seen in the article's table for state C reading input 0.
The table in the article (under "The Machine" section) describes the state on the left hand column (A,B,C) and the input on the top row (0,1,2). The entry in the table describes, as far as I can tell, what the Turing machine writes back to the tape, the direction it goes and the state it transitions to.
So for entry 'A', '2', the entry is '1LC', meaning it writes '1' at the tape position it's in, moves one to the left, then transitions to state 'C'. State 'C' on reading symbol '0' will halt, as is the requirement for it being a busy beaver machine.
I very weakly understand the 'A(a,b,c)' notation in the 'Analysis' section but I don't quite understand what it's saying. The line that says "A(a,b,c) = 0^\inf 12^a 11^b <A 11^c 0^\inf" is describing the input tape state, with s^x denoting symbol s repeated x times, but I don't quite understand what the "<A" is.
The table below it describes what I believe are essentially reduction rules. That is, "if the state of the TM is like this, we can deduce it will be reduced to this state" which is how it relates back to the Collatz conjecture.
Anyway, nice article. Wish there was a "glossary" section that described the esoteric notation.
> An additional criteria for a Busy Beaver-like Turing machine is that there is a "halt" state, which can be seen in the article's table for state C reading input 0.
Technically, that's a "halt" _transition_. There's supposed to be an implied Halt state - but it's straight up easier to treat halting as something like an "invalid instruction" hardware exception.
> The line that says "A(a,b,c) = 0^\inf 12^a 11^b <A 11^c 0^\inf" is describing the input tape state
While yes, it can be an input, it is mostly treated as an "intermediate" state of the tape. TM is quite low level, so various abstractions are welcome. If you can prove that the input you're interested in (empty tape) gets to your abstraction and stays there - stuff gets simpler.
> but I don't quite understand what the "<A" is.
One very useful abstraction without much cost is to keep track of the direction the head was travelling in. 1) Instead of "head is on this cell and state kept outside", it allows inline notation, while visually pointing to the cell the head is on; 2) it's mostly explained as a representation of a TM with 2 stacks, which is easy to implement and run; 3) the direction is already encoded in the transitions: "1LC == <C 1".
That allows nice rewrite rules like "B> 1 -> 1 B>" that simplify manual playthrough, and it works nicely with intuition - e.g. B> keeps going right all the way until it encounters 0, like a ship with a pointy nose cutting through the water.
And, well, additional state helps a lot of the time.
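A minimal sketch of that two-stacks / directed-head view (my own toy encoding, not the article's): the scanned cell sits on top of whichever stack the head is facing, and one step pops it, pushes the written symbol on the appropriate side, and updates the facing.

    def step(state, facing_right, left, right, table):
        # left / right are stacks (lists, top = end of list). When facing right
        # the scanned cell is right[-1]; when facing left it is left[-1].
        scanned_stack = right if facing_right else left
        scanned = scanned_stack.pop() if scanned_stack else "0"   # blank tape = 0s
        write, move, new_state = table[(state, scanned)]
        if move == "R":
            left.append(write)    # written cell ends up left of the head; head faces right
        else:
            right.append(write)   # written cell ends up right of the head; head faces left
        return new_state, move == "R", left, right

    # With table[("B", "1")] == ("1", "R", "B"), one step just slides the head
    # right over a 1 -- exactly the rewrite rule  B> 1  ->  1 B>.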
BB(n) refers to Turing machines that have n states and 2 symbols. The version with two arguments lets you specify the number of states and the number of symbols respectively.
A universal Turing machine can be made with two symbols, so in the sense that there's a reduction from one to the other, yes.
I would say a better question is to determine if there's some type of polynomial time/space reduction from BB(x,y) to BB(z), in construction and runtime. I suspect yes though I wouldn't be able to rattle off a proof without a lot of effort. See [1] which might answer this question.