With fifth busy beaver, researchers approach computation's limits (quantamagazine.org)
530 points by LegionMammal978 2 days ago | 136 comments

Some comments on this result by Scott Aaronson https://scottaaronson.blog/?p=8088

And for leisure-class beavers, some big related threads from earlier this year:

https://news.ycombinator.com/item?id=40453221

https://news.ycombinator.com/item?id=38113792

https://news.ycombinator.com/item?id=37910297


> leisure-class beavers

I can't not find this phrase funny. Out of context, it reads like something out of Terry Pratchett or Douglas Adams.


frankly, leisure beaver sounds like a raunchy innuendo

Vvvv

Exactly. Those are also not funny at all.

> There are many variants of the original busy beaver problem, and some Busy Beaver Challenge contributors plan to keep working on these.

One such variant is a functional busy beaver defined in terms of the lambda calculus [1]. Since it measures program size in bits rather than states, it allows many more values to be determined (37 so far versus only 6 for TMs), and the gap between the largest known value and values beyond Graham's Number is a mere 13 program bits. A closely related variant [2] can be directly expressed in terms of Kolmogorov complexity, which Mikhail Andreev argues [3] is crucial for applications in Information Theory.

[1] https://oeis.org/A333479

[2] https://oeis.org/A361211

[3] https://arxiv.org/pdf/1703.05170


A bit unrelated, but you posted OEIS links, so I'm hoping you know: the article mentions there are 17 trillion possible 5-state, 2-symbol Turing machines. I tried finding the sequence for this but couldn't.

I found this https://oeis.org/A141475, but it gives 27 trillion for 5.


While the official formula [1] is (4(n+1))^(2n), if one ignores the symbol written and head movement for transitions to halt, then this becomes (4n+1)^(2n), which is ~17 trillion for n=5.

[1] https://oeis.org/A052200
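
To make the arithmetic concrete, here is a quick Python sanity check of both formulas (my own sketch; the second value is the exact 16,679,880,978,201 quoted elsewhere in the thread):

    n = 5
    # Each of the 2n (state, symbol) sub-rules picks a symbol to write (2),
    # a head direction (2), and a next state including halt (n + 1),
    # giving 4(n + 1) options per sub-rule.
    full = (4 * (n + 1)) ** (2 * n)
    # Ignoring the symbol written and head movement on halting transitions
    # collapses the 4 halt variants into 1, leaving 4n + 1 options.
    reduced = (4 * n + 1) ** (2 * n)
    print(full)     # 63403380965376  (~63 trillion)
    print(reduced)  # 16679880978201  (~17 trillion)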


Isn't there another formulation of BB where we count the shifts (from left to right, and from right to left) a TM makes, instead of strings of contiguous 1s? I remember seeing a video about this definition.

The variant of BB discussed in the article is counting steps, not 1s. (I guess "shifts" is equivalent to steps, although that seems like a more roundabout way of specifying it.) Also, I don't think anyone counts strings of contiguous 1s, although of course one could define such a thing.

I worked for a couple years with a formidable and incomprehensibly smart engineer who ascended the IC ranks faster than anyone I’ve seen at an elite tech company. He quit the job a few years ago, and when I asked him his plans he told me he was going to work on the busy beaver problem. I can’t help but wonder if he is mxdys, the pseudonymous contributor mentioned in the article who wrapped up the formal proof of BB(5). I’ll probably never know.

If it were them would you be surprised that they wanted to remain anonymous?

I didn’t know him that well, but could see him being the kind of person who wants to avoid attention. He was happy to spend some time explaining the problem to me and why it was interesting and difficult.

I'm sorry, I don't get it. What are you suggesting to be the motivation to remain anonymous in this case?

I didn't mean to suggest anything. I was just interested in whether they thought that remaining anonymous was in keeping with their ex-colleague's character. Written communication is hard!

Prevents embarrassment if it turns out one’s idea is wrong, I suppose.

The person in question formalized the entire proof in Coq, which certifies its correctness.

To me, that makes your conjecture very unlikely to be true...


A message on LinkedIn or email never hurt!

[flagged]


This is just the same extremely poorly thought out argument against any type of pure research.

Pursuing knowledge for knowledge's sake always results in improvement down the line, even if it's impossible to predict when and how.

Do you think Benjamin Franklin expected a payoff from the experiments that led to the discovery of electricity? Or Newton researching gravity? Much of our knowledge about the universe comes from researchers picking at a topic just for the hell of it.

> I would rather someone of the intellectual calibre you describe work on problems more relevant to improving the world.

Impressive amount of entitlement here. Nobody owes you anything, and people are free to do anything they want. Not everyone wants to be a hero or great inventor, even if they have the skill for it. Nor is it some universal moral obligation for skilled people to spend their lives producing something profitable or beneficial to society.

Do you have hobbies? If so, you should feel bad because you're wasting your talents on useless unprofitable activities. Better go out and solve the world's problems.


Category Theory was called abstract nonsense by theoretical mathematicians, but it has helped improve type theory for computer language design.

G.H. Hardy was a British mathematician who expressed pride in the "uselessness" of his work, believing that pure mathematics was an art form that should not be tainted by practical applications. Ironically, his contributions to analytic number theory now underpin modern cryptography.

It’s weirdly difficult to study something for years, no matter how abstract, and successfully avoid any practical applications!


> It’s weirdly difficult to study something for years, no matter how abstract, and successfully avoid any practical applications!

This reeks of survivorship bias, and I'm interested to hear why you're so confident this is the case, without resorting to the obviously very well-known survivors (as opposed to the thousands of research projects that ended apparently nowhere and are long forgotten...)


I am not aware of any result of Hardy with applications to cryptography, but I'd be curious to be found wrong

He made significant contributions to the theory of elliptic curves.

> What's the payoff here?

Personal happiness? It's also more than short-sighted to assume that just because there is no immediately obvious overall benefit, no such thing exists.

The techniques and insights developed and found during the pursuit of such fundamental and elusive problems can have profound side effects that may materialize only decades or even centuries later.


Right.

Why is it that a smart person should be expected to use their capabilities for "the betterment of humanity"? There's a level of entitlement that goes with statements like that.


It might be worth flipping the question and asking what we can do to entice smart people to use their powers for the betterment of society!

Would you also ask "what could we do to entice artists away from art to instead use their powers for the betterment of society?"

I think we already do pressure artists into “productive” work - it’s very hard to make a living as an artist. I personally would like more art around my life!

One could voice the opinion that ("good") art is already a means for betterment of society. ;)

Curiosity-driven basic research is the most reliable way of bettering society in unpredictably transformative ways.

At the time when they were first explored, you could've easily said the same thing about subatomic particles, imaginary numbers and many other concepts of the like. Scientific advancement isn't always linear and the value of many discoveries is often only apparent after the fact, often after a considerable amount of time has passed.

Eh, imaginary numbers were invented to be useful. Cardano needed them as intermediate steps when plugging in his cubic formula.

One could argue that they weren't invented, merely found. Nothing is invented in math, just discovered :-)

Also, the modern complex-plane definition is not from Cardano.


I would classify advancing the state of the art knowledge in math/computers/science to be a noble feat in itself that doesn't need any justification (especially to those who have done less).

What’s the payoff of writing a poem?

Chicks dig it

Being done with it

I’m sure he cares very little what you think he should do.

The original Busy Beaver paper by Tibor Radó ("On Non-Computable Functions") is actually quite easy and fun to read.

For a modern version of the paper with some additional notes, see https://data.jigsaw.nl/Rado_1962_OnNonComputableFunctions_Re...


One notable thing here is that the proof is a Coq proof. I wonder if it is the first significant proof that starts out implemented in a theorem prover, instead of being a known proof translated to such a system.

Note that there have been computer-assisted proofs before (four-color theorem, Kepler's conjecture), but those were not done in a formally verified setting until later.


As far as I am aware, all of the proofs and techniques for each machine were present before mxdys managed to get the whole theorem into Coq: the main problem was that the deciders and manual proofs were disorganized and somewhat suspect. The worst offenders here were Skelet #1, which needed a bespoke program to accelerate it to its ultimate pattern [0], and Skelet #17, which took Xu seven pages' worth of dense reasoning to prove non-halting [1]. The full Coq proof put a much-needed degree of confidence into these results.

[0] https://www.sligocki.com/2023/03/13/skelet-1-infinite.html

[1] https://discuss.bbchallenge.org/t/skelet-17-does-not-halt/18...


Apparently, this is the proof in 19,000 lines of Coq: https://github.com/ccz181078/Coq-BB5/blob/main/BB52Theorem.v

> [The four color theorem] was the first major theorem to be proved using a computer.

https://en.m.wikipedia.org/wiki/Four_color_theorem

I guess maybe I don't understand what you mean by "formally verified setting", but I believe the four color theorem was first proven using a computer.

> Although flawed, Kempe's original purported proof of the four color theorem provided some of the basic tools later used to prove it.

It sounds like Kempe laid some of the groundwork, but then the theorem was ultimately proved using a computer.

I could be wrong though, I'm not an expert in this area.


The original four color theorem proof used a computer as a computational aid for some nasty casework: the procedure for checking each case and the list of cases that needed to be checked were found by hand.

Proving something in a theorem prover means the proof itself is an object constructed in the prover's language.


I think that's a bit too harsh. An entire complicated proof concocted solely looking at a computer screen in Coq? Every mathematician will have plenty of hand written sketches, ideas and parts of proofs. Does that mean it was "ported" to Coq?

> I think that's a bit too harsh. An entire complicated proof concocted solely looking at a computer screen in Coq?

Nobody is suggesting this, and in this case, it was indeed "ported" to Coq from existing sketches.

The distinction here isn't between on-paper-first vs computer-first. The distinction here is between using a custom computer program to perform computations for a mostly-paper proof, versus taking an existing general-purpose theorem-checker (Coq, in this case) and writing down the entire proof in its language so it can check it.


No, the point is that the proof itself should be written in the Coq language. The original 4-color proof in 1976 was written in English. They used computer programs to do certain computations, but the proof that those programs were correct and that they were computing the right thing was written in English and checked by humans.

I'm no expert either, but it's interesting to note that the original computer-aided proof was erroneous. ctrl-f for Schmidt on the Wikipedia page.

Attempts to prove BB(5) would have begun long before theorem provers were around. The BB(5) champion machine was discovered in 1990, and all machines of size 5 would have been enumerated not long after.

Enumerating machines is trivial. Understanding and analyzing their behavior is hard.

Congratulations to the team! So the (blank tape) halting problem is solved for 5-state 2-color Turing machine programs. Has anyone tried applying these same techniques to the 2-state 4-color case? That seems like it would probably be tractable, although generally speaking colors are more powerful than states, so there might be some surprises there. (6-state 2-color and 2-state 5-color both seem intractable, perhaps even provably so.)

By the way, there is an extremely stupid but disturbingly widespread idea that humans are able to just intuit solutions to the halting problem, using the mind's eye or quantum mechanics in the brain or whatever. Needless to say, this did not factor into the proof.


> Has anyone tried applying these same techniques to the 2-color 4-state case?

I assume you mean the 4-color case. As I understand it, the deciders currently in use are sufficient to prove all the 2×4 holdouts non-halting. So the current champion gives us Σ(2,4) = 2,050 and S(2,4) = 3,932,964, barring some big errors in the decider design. The result just hasn't been organized in one place.

> (6-state 2-color and 2-state 5-color both seem intractable, perhaps even provably so.)

Yes, 2×5 has the Hydra, and 6×2 has the Antihydra, which compute the same iteration, but with different starting points and halting conditions. The standard conjecture (related to Mahler's 3/2 problem) is that this iteration is uniformly distributed mod 2, and a proof of that conjecture would very likely prove both machines non-halting, by yielding suprema and infima on the cumulative ratio of 0s to 1s. But of course there is no known method of proof.


> By the way, there is an extremely stupid but disturbingly widespread idea that humans are able to just intuit solutions to the halting problem, using the mind's eye or quantum mechanics in the brain or whatever. Needless to say, this did not factor into the proof.

The year is 52,000 CE and humans have solved BB(18), in the sense of exhaustively categorizing halting vs non-halting 18-state no-input programs. They have used a proof generator based on a logical theory called Aleph*, and at that time it had been known for 1.5k years that ZFC is incapable of establishing BB(18).

Compare with the year 2024 CE, many millennia before Aleph* came into use: it is clear that no program written at that point in history was capable of even using brute-force proof checking to solve BB(18) in theory (like how we can enumerate and check ZFC proofs today to solve BB(??) in theory).

That's what is meant by the "humans intuit solutions to the halting problem" position. AFAIK, there's no known hard, theoretical reason why the future history laid out above cannot take place. And due to BB being incomputable, humans had to develop new theory to be able to construct the programs required. Something has to be credited with the results, and it can't be computation, since the programs did not exist.


> AFAIK, there's no known hard, theoretical reason why the above laid out future history cannot take place.

Probably the biggest issue is that they'd have no method to establish that Aleph* is consistent. To continue this BB chain indefinitely, you must invent further and further first-order theories, each of which might not be consistent, let alone Σ₁-sound. And with a Σ₁-unsound theory, any halting proof might not hold up in the standard model of the integers. You'd effectively have to postulate an indefinite amount of oracular knowledge.

Also, another physical issue: you can show that within any consistent, recursively axiomatizable theory, the length of the shortest proof of "the longest-running n-state machine is M" must grow at an uncomputable rate in terms of n. Ditto for the shortest proof of "machine M halts", where M is factually the longest-running n-state machine. Otherwise, a machine could use a computable bound on the proof length to solve the halting problem. Therefore, the proof should very quickly become too large to fit within our light cone.

In any case, the BB-related evidence for that position rested on BB(5) being determinable by extending the techniques used for BB(4). But in fact, it turns out that similar extensions don't even get you to BB(6). So there isn't anything to support the position, other than the pure speculation that anything is physically achievable given enough time.


Thanks for sharing interesting info!

How do we know that there would be consistency issues or Σ₁-soundness issues?

Your claim about proof size categorizing n-state machine halting status is new to me. Do you have any links to read more about this?

The argument doesn't make sense to me. Rather it seems like more of a consequence of BB being incomputable in the first place. The proof sizes for each BB(n) aren't expected to be computable at all. There is necessarily a different theory for each n (or intervals of n where each theory applies with limits on each).

> So there isn't anything to support the position, other than the pure speculation that anything is physically achievable given enough time.

Something something burden of proof something. It would be extremely fascinating to have a conclusive argument that large BB numbers cannot be solved.


> How do we know that there would be consistency issues or Σ₁-soundness issues?

From Gödel's second incompleteness theorem, no consistent first-order recursively-axiomatizable theory (i.e., a theory that can have its proofs validated by a Turing machine) can prove its own consistency. Thus, to prove that your current theory (e.g., ZFC) is consistent, you must move to a stronger one (e.g., Aleph*). But then you can't prove the consistency of that without an even stronger theory, and so on. Thus, you end up with an infinite regression, and you can't ultimately prove the consistency of any of these theories.

> Your claim about proof size categorizing n-state machine halting status is new to me. Do you have any links to read more about this?

Not really, other than some of my own ramblings on the bbchallenge Discord server. But it's not that long:

Suppose that the longest-running n-state machine M can always be proven to halt using a proof of under f(n) symbols, where f(n) is some fixed computable function. Then, you could construct a TM that, given n, enumerates every valid proof of length less than f(n) symbols, and checks whether it shows that a particular n-state machine halts. The TM then simulates all of the proven halters to see how long they run. By assumption, the longest-running machine M is included in this list. So this TM can compute BB(n), which is an impossibility. Therefore, the function f(n) cannot exist.
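
In code form, the construction would look roughly like this (a sketch only; every helper here is a hypothetical stand-in, and the whole point is that no such computable f can actually exist):

    # Hypothetical: if a computable bound f on halting-proof length existed,
    # this procedure would compute BB(n), contradicting its uncomputability.
    def impossible_bb(n, f, proofs_up_to, machine_proven_halting, run_to_halt):
        halters = []
        for proof in proofs_up_to(f(n)):        # finitely many proofs < f(n) symbols
            machine = machine_proven_halting(proof, states=n)  # TM proven to halt, or None
            if machine is not None:
                halters.append(machine)
        # By assumption, the true champion is among the proven halters,
        # so the longest running time found here equals BB(n).
        return max(run_to_halt(m) for m in halters)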

As a corollary, since it's "uncomputably difficult" to prove that the longest-running machine halts at all, it's no less difficult to fully establish the value of BB(n).


> Thus, you end up with an infinite regression, and you can't ultimately prove the consistency of any of these theories.

There is a similar issue with even ZFC and PA. It’s not really a dealbreaker imo.

> Suppose that the longest-running n-state machine M can always be proven to halt using a proof of under f(n) symbols, where f(n) is some fixed computable function. Then, you could construct a TM that, given n, enumerates every valid proof of length less than f(n) symbols

The issue with that argument is that the TM which enumerates every valid proof can’t exist in the first place.

If you fix an axiomatic theory, it’s already known that the theory has a limit.[1]

If every theory has a limit, you need countably infinitely many axiomatic theories together to prove BB(n) for all n. So there’s no TM which can even enumerate all the proofs, since a TM must have finite states, and thus can’t enumerate infinitely many proof systems.

(In fact for similar reasons I believe a Halt program, which has infinite states but which works for all TMs with finite states, platonically exists. It’s an emulator and an infinitely long busy beaver number lookup table. The diagonalization argument doesn’t apply, since the infinite Halt doesn’t accept itself as input.

This Halt would have countably many states since each busy beaver number is finitely calculated and there’s only countably many of them.)

So it’s not clear that f(n) is uncomputable. If f(n) is the symbol count and not the binary encoded length of the symbols, it even seems that it’s trivially bounded by some constant for all n. The proof could be one symbol the meaning of which is encoded in the theory.

It is a fascinating question though. I’m sure there is some function of axiomatic theory proof checker TM size and binary encoded proof length which does grow with n. It’s unclear if it would be uncomputable though.

The consequence of it being uncomputable is that the universe doesn’t have the resources to even encode the theory and/or represent its proofs.

In fact, I suppose that as long as it grows at all, there is a limit to which BB(n) can possibly be determined. Very fascinating.

[1]: page 5 https://www.scottaaronson.com/papers/bb.pdf


> There is similar issue with even ZFC and PA. It’s not really a dealbreaker imo.

We obtain PA from our basic intuitions about how the standard integers work, derived from empirical evidence. Everything past that involves increasing levels of intuition. So to continue it indefinitely, you must postulate an infinite amount of correct intuition, in some magical fashion that can never be captured in a computer. You can claim unlimited ingenuity all you want, but there's no a priori reason that it should indefinitely yield the truth, especially when it goes far, far past what our finite empiricism can provide.

We just haven't hit these limits yet, since very weak inductive theories are still sufficient for proving BB(5): we don't even need the full power of PA yet for our non-halting proofs. That is why it looks like it should be so easy.

> If you fix an axiomatic theory, it’s already known that the theory has a limit.[1]

Not quite. Fix some consistent axiomatic theory T which proves PA. Then there will be infinitely many TMs which do not halt (in the standard model), but T cannot prove that they cannot halt, due to incompleteness. (Therefore T cannot settle the BB(n) question past a certain point, as Aaronson correctly says.)

But for every TM that does halt (in the standard model), T can prove that it halts, and the proof is to list out a trace of the TM's execution. Thus, every halting machine of every length has a halting proof in T.

The only benefit of a more powerful theory T is that it can "compress" this maximal BB(n)-sized proof into something more physically manageable. But once we fix a certain T, we find (by my earlier argument) that it can only compress the proof so far, and the compressed size still must be an uncomputable function.

We can also see this by a forward argument, instead of by contradiction. Suppose that we'll accept any halting proof in a theory T. Then we can write a TM that lists through all proofs in T that are smaller than some bound N. (Notice that this is a finite set, since I've put an upper bound on it!) Then, for every proof that is a valid halting proof, the TM runs the corresponding machine. Then, the TM will halt, and its halting time will be greater than the halting time of any machine that can be proven to halt in T within N symbols. Set N to Graham's number (which is easily definable), and now the halting proof of the TM in T will not fit in our light cone.

(Notice how our TM clearly halts if T is Σ₁-sound! But since T cannot prove its own Σ₁-soundness, it doesn't have any way to prove our TM halting other than by the brute-force method.)

> In fact for similar reasons I believe a Halt program, which has infinite states but which works for all TMs with finite states, platonically exists. It’s an emulator and an infinitely long busy beaver number lookup table. The diagonalization argument doesn’t apply, since the infinite Halt doesn’t accept itself as input.

In that case, you just end up with the well-known oracle halting problem, where you equip a TM with access to this "infinite-state" machine. Then the problem is that you have a more powerful model of computation, but still with no way of solving its own halting problem. Much like how a consistent theory can only prove the consistency of weaker theories, not its own consistency.

> So it’s not clear that f(n) is uncomputable. If f(n) is the symbol count and not the binary encoded length of the symbols, it even seems that it’s trivially bounded by some constant for all n. The proof could be one symbol the meaning of which is encoded in the theory.

Of course I'm fixing a particular theory and a particular alphabet of constant size, the alternative would be absurd. The important part is about the ultimate behavior as n varies.


Thanks for typing all that out. It is very fascinating.

I’m just not convinced that fixing a theory and disallowing soundness axioms is any practical impairment for discovering busy beaver numbers.

Of course things get out of hand, but then why not collect evidence for soundness and let proofs avoid including an actual execution of the TM?

We don’t need a printout of the computation of Graham’s number to know that a TM which computes it halts. Why impose that here?

Has anybody done a compilation of such a PA-proof-generating/checking and simulating TM? It must have an enormous number of states and almost certainly wouldn’t be the BB of its cohort. Similarly, proving its step count shouldn’t require emulating it.


> We don’t need a printout of grahams number computation to know a TM which computes it halts. why impose that here?

Of course, the TM which "computes Graham's number" can trivially be proven to halt, without running it fully.

But it is much harder to show that the TM (let's call it M) which "computes Graham's number, then plugs it into a big halting-proof generator (in a fixed theory T) as an upper bound on the number of symbols, then executes each proven-halting TM in order" also halts.

The problem is, just because a theory T gives a halting proof for a machine, doesn't necessarily mean that it halts. (That is, T isn't necessarily Σ₁-sound.) And if that doesn't hold, then M might run into a "fake halter" that can be proven halting in T, but doesn't truly halt in the standard model. Thus, the only ways to show that M halts are to either establish that T is Σ₁-sound, which can only be done by appealing to a stronger theory, or to run through each of the Graham's number of proofs, which takes astronomically long.

This is similar to an instance of the Berry paradox: if you could easily prove that M halts, then it would have a relatively short halting proof within T. But then it would find its own halting proof, and simulate itself. But then it would simulate itself simulating itself, etc., etc., and never actually halt. So T wouldn't be Σ₁-sound, since M doesn't halt even though you proved that it halts. The only way out of this trap is if M doesn't simulate itself, i.e., it takes more than Graham's number of symbols in its own halting proof in T.


Thanks for explaining this, it truly is fascinating.

This appears to be an argument that almost every axiomatic theory isn't Σ₁-sound. So I can only ask a few things:

1. How do I even learn about Σ₁-soundness? I've tried for 30 minutes now to search for it.

2. Is my impression correct? Does this argument show that for ex ZFC is not Σ₁-sound? If not, which axiomatic theories exactly fall into this trap?


> They have used a proof generator based on a logical theory...

I don't understand your scenario. If they're using a proof generator, that sounds like the opposite of intuiting or using the human mind. Maybe they used "intuition" to come up with new axioms for a logical theory, but that is not the same as determining of some particular concrete TM program whether or not it halts.


You got it: the creative development of a stronger theory. This allows the creation of tools which can categorize TMs, tools which wouldn’t exist otherwise.

It’s fascinating that the entire space of finite amounts of random gibberish contains every such stronger theory.

As a thought experiment it does well. Interestingly the Church-Turing thesis seems to exclude ingenuity. That is, it doesn’t try to say there aren’t functions on natural numbers which are uncomputable but can be calculated with ingenuity. Seems that a ton of people conflate those things.


Why would Aleph* be necessary or relevant?

ZFC and larger theories are only relevant for infinite objects, not finite objects like BB.


I don’t have the understanding, but apparently there are finitary statements which are independent of ZFC. This has been used to prove that BB(745) is independent of ZFC.

https://mathoverflow.net/a/26605

Furthermore, Scott Aaronson jokingly conjectured that even BB(20) is independent of ZFC. That’s my reference here, where we end up needing Aleph* for BB(18).


> extremely stupid

This is simply a test of whether consciousness has infinite computational resources.


> the 2-color 4-state case?

You mean the 2-state 4-color case...


Fixed, thanks

So we were just lucky that all non-halting programs of length 5 just happened to be provably non-halting?

Yes. In fact, Allen Brady feared as early as 1988 that there would be a totally intractable machine with 5 states [0]:

> Prediction 5. It will never be proved that Σ(5) = 1,915 and S(5) = 2,358,064. (Or, if any larger lower bounds are ever found, the new values may be substituted into the prediction.)

> Reason: Nature has probably embedded among the five-state holdout machines one or more problems as illusive as the Goldbach Conjecture. Or, in other terms, there will likely be nonstopping recursive patterns which are beyond our powers of recognition.

Luckily, this prediction did not come to pass, but only by a margin of one extra state.

[0] Allen Brady, "The Busy Beaver Game and the Meaning of Life", in Rolf Herken (ed.), The Universal Turing Machine: A Half-Century Survey, Oxford University Press, 1988, pp. 259–277. This chapter can also be found in the 2nd ed., Springer, 1995, pp. 237–254.


When you say "provably", do you mean that in the mathematical sense or in a practical sense?

For the practical sense, other commenters have already replied.

For the mathematical sense, I would say it would be pretty surprising to me if BB(5) had been undecidable -- 5 states and 2 symbols is just so few with which to try to encode undecidable behavior.

However, it's worth noting that as a consequence of the incompleteness theorem, there must be some n such that standard mathematics cannot prove the value of BB(n). And in recent years various people have been working on finding such n and seeing how low they can get the number. The current record[0] is 745.

I expect that record can probably be brought lower (more easily than the record for computing BB can be brought higher!), but even so, that's a lot of distance between 5 (the highest we know) and 745 (the lowest we know to be unknowable).

[0]This is the current record both for ZFC and for PA, in case you're wondering what I mean by "standard mathematics"... so it at least ought to be possible to bring it down further for PA! I think so far nobody's found a way to do better for PA than for ZFC, even though, like, that has to be possible, right?


Are there any interesting mathematical problems that can be encoded with 5 states and 2 symbols, which now we can prove using BB(5)?

Mostly no: we did find some non-halting TMs that required new proofs, but none of those had the flavor of new math, per se. Indeed, we found that all but 30 of them could be proved by finite automata methods, meaning the TM's state/tape at any step could be reduced to one of finitely many states and we'd still know all we needed to know about future steps. I would argue that such a non-halting proof can't have much mathematical content. (Maybe a bit, in about the same way that an integer equation is sometimes proved unsolvable by considering it modulo n and checking every case.) Also, I learned some math I wasn't personally familiar with from the analysis of a particular machine: https://www.sligocki.com/2023/03/14/skelet-10.html (Zeckendorf's Theorem).

Tangential, but fun fact about Zeckendorf: In addition to Zeckendorf representation, there's also dual Zeckendorf (sometimes also called lazy Fibonacci), where instead of requiring no two consecutive ones, you require no two consecutive zeroes. (Not counting the implicit zeroes at the big end, of course.) It was surprising to me that this also works, but it does!

Actually, just as you can do bijective base-b [0], you can also do bijective Zeckendorf (using 1 and 2 with no two consecutive 1s). Although, as happens with bijective binary, bijective Zeckendorf is closely tied to ordinary Zeckendorf, so it doesn't offer much new. But bijective dual Zeckendorf doesn't work -- lots of numbers can't be represented!

One more fun fact about Zeckendorf and dual Zeckendorf: Write n>0 in Zeckendorf, and count how many zeroes it ends in. This will be even if the dual Zeckendorf representation of n ends in a 1, and odd if it ends in a 0. Similarly, if you write n in dual Zeckendorf and count how many 1s it ends in, this will be even if the (ordinary) Zeckendorf representation ends in a 0 and odd if it ends in a 1.

[0] https://en.wikipedia.org/wiki/Bijective_numeration
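
For anyone who wants to check those last two fun facts numerically, here is a small brute-force sketch (my own code, not from the thread; it relies on both representations being unique):

    from itertools import product

    def fib_weights(k):
        # Fibonacci digit weights: 1, 2, 3, 5, 8, ...
        fs = [1, 2]
        while len(fs) < k:
            fs.append(fs[-1] + fs[-2])
        return fs[:k]

    def value(s):
        # s is a digit string, most significant digit first
        ws = list(reversed(fib_weights(len(s))))
        return sum(w for w, d in zip(ws, s) if d == '1')

    def rep(n, forbidden):
        # The unique digit string with leading '1' avoiding `forbidden`:
        # no "11" gives Zeckendorf, no "00" gives dual (lazy) Zeckendorf.
        for length in range(1, 20):
            for bits in product('01', repeat=length):
                s = ''.join(bits)
                if s[0] == '1' and forbidden not in s and value(s) == n:
                    return s

    for n in range(1, 50):
        zeck, dual = rep(n, '11'), rep(n, '00')
        zeros = len(zeck) - len(zeck.rstrip('0'))
        ones = len(dual) - len(dual.rstrip('1'))
        # odd count of trailing zeros in Zeckendorf <=> dual ends in '0'
        assert (zeros % 2 == 1) == dual.endswith('0')
        # odd count of trailing ones in dual <=> Zeckendorf ends in '1'
        assert (ones % 2 == 1) == zeck.endswith('1')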


I don't believe so. But even if there were, unfortunately, these things basically work the other way around.

If you can encode a problem in n states, and you know BB(n), then, as you say, you could use that to solve the problem. Trouble is, how do you know BB(n)? In reality, the only way to determine BB(n) is to go and solve all such problems; there isn't any other easier method that you can apply that would then let you get answers for these problems as a consequence.

So, the value of BB(n) will always be a summation of "we did all the hard work of solving all the n-state problems", not something you do separately to get those answers out.


So I know we can prove that we cannot solve the halting problem for all TMs.

But for any given TM, can we decide whether its halting status is provable or not? Or will we meet some for which we will never know whether we can solve them?


Sadly, we can't, for such a test would already be enough to solve the halting problem: if a TM's status is provable, enumerate possible proofs (of halting and non-halting) until we find one and know the result; if the status is not provable, then the machine certainly cannot halt.

My memories of theoretical computer science are quite rusty, but I seem to see an issue with your argument: if you need enumeration, you run into semidecidability. In other words, if you start generating the proofs and you find the one you were looking for, then the problem is solved. But if you can't find the proof, you would need to keep generating them forever to find the one you need. You can't conclude the result this way without enumerating all possible proofs. Unless you have a way of limiting the number of proofs that need to be generated?

You are bounded by the fact that the statement is provable. If a statement M is provable at all, its proof is finite; by contradiction, if no finite proof existed, the statement would not be provable. Thus, there must be some positive integer K such that the proof length is < K, and it suffices to enumerate all proofs of length < K.

Insofar as we are given provability, we can solve halting.


> 5 states and 2 symbols is just so few with which to try to encode undecidable behavior.

Would you be brave enough to say the same thing about BB(6) or BB(7)?


Yeah, that would definitely surprise me.

Would you consider the Collatz conjecture as potentially undecidable?

Because that one needs to be solved for BB(6) already: https://wiki.bbchallenge.org/wiki/Antihydra

Considering how we made basically no real progress on it mathematically in a whole generation, solving BB(6) within the next decades would be a miracle, and I would bet a lot against it.

I can't see us EVER getting to BB(10), no matter how advanced humanity grows (and it would be meaninglessly large anyway).

I think 765 is just a huge overestimation based on the fact that it is somewhat straightforward to construct.


First off, it's not Collatz, it's a Collatz-like problem.

But yes, I'd be very surprised if Collatz were actually undecidable, even if it's well beyond the reach of current mathematics.

I agree with your statements about BB(10) and BB(6), but they just aren't very relevant. I agree those involve extremely difficult problems likely well beyond the reach of current mathematics, but I'd still be very surprised to find anything undecidable in there. There's a big difference between being truly undecidable and merely well beyond the reach of current mathematics!

(Also, the current record for undecidability is 745, or 765.)


BB(745) being independent of ZFC isn’t due to incompleteness theorems. It encodes a different theorem which is also independent of ZFC.

https://scottaaronson.blog/?p=2725


Well, that post is about 8000, not 745; but yes, in general, you are not going to use a Gödel sentence when trying to drive down the record.

This isn't disagreeing with what I was saying, though; I said incompleteness requires there is some such n, and then also said what we know about the record, I didn't say the record comes from a Gödel sentence.


You are right, sorry about that :-)

Hope my comment addresses a common confusion despite its unnecessarily corrective tone.


The article touches on that:

> But just four days ago, mxdys and another contributor known as Racheline discovered a barrier for BB(6) that seems insurmountable: a six-rule machine whose halting problem resembles a famously intractable math problem called the Collatz conjecture. Connections between Turing machines and the Collatz conjecture date back to a 1993 paper by the mathematician Pascal Michel, but the newly discovered machine, dubbed “Antihydra,” is the smallest one that appears unsolvable without a conceptual breakthrough in mathematics.


I'm curious how the Turing machines can resemble problems which take input? BB(n) is defined as a n-state Turing machine that starts off with an empty tape. Collatz(n) is how many steps are taken before it terminates when starting with input 'n'.

Does this mean a BB(6) machine which resembles Collatz is testing all possible values as part of its program, and not as part of anything on the tape (since the tape starts out empty)?


It's not testing all values, but one particular starting point. In all likelihood, it will never reach its stopping condition from this starting point, but proving this even for a single value is currently intractable. Compare with the "5x + 1" variant of the Collatz conjecture, where many values are believed (but not proven) to run off to infinity, never to return.

Edit: nvm, see thread

For Collatz, the empty-input machine loops over all natural numbers and halts if it finds one which doesn't eventually reach 1.

To prove that it never halts, you'd have to prove the Collatz conjecture. Otherwise, you'd have to find the smallest counterexample to the Collatz conjecture.
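
A rough Python sketch of that machine's logic (my own illustration, not the actual TM; as the reply below points out, divergence is the catch, so this version can only ever halt on a cycling counterexample):

    def orbit_reaches_one(n):
        # Follow the Collatz map from n, detecting a repeated value
        # (a cycle avoiding 1). A diverging orbit makes this loop forever.
        seen = set()
        while n != 1:
            if n in seen:
                return False  # cycling counterexample found
            seen.add(n)
            n = n // 2 if n % 2 == 0 else 3 * n + 1
        return True

    def searcher():
        # Halts iff some orbit is caught cycling without reaching 1.
        n = 1
        while True:
            if not orbit_reaches_one(n):
                return n
            n += 1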


Suppose that there exists a natural number that diverges to infinity under the Collatz map. Then the Collatz conjecture would be false, but your machine would still run forever on that diverging number. As far as I am aware, there is no known machine that halts iff the Collatz conjecture is false.

You are correct, thank you

Bit of a side note, but the article is a bit misleading here.

Collatz-TM connections have been investigated much earlier.

My fav is John Conway in 1972, who showed that the generalized Collatz problem is undecidable.[1]

[1]: https://en.wikipedia.org/wiki/Collatz_conjecture#Undecidable...


All things considered, the proof is pretty short! It’s 19,000 lines of Coq (including white space and comments). And in my experience, if this were compiled into a traditional paper it would be significantly shorter than the Coq version. (Of course, length of the proof really isn’t a measure of difficulty or complexity, but it is a very rough yardstick we can use.)

When talking about the limits of human knowledge, I often think about the theorems that are provable, but so complex that no human being could understand them. Probably the most complex proof we have is the classification of finite simple groups, which is thousands upon thousands of pages, and there are probably very few people on Earth, if any, who fully understand all of it.

As the article suggests, BB(6) could be undecidable. But it could also have a proof millions of pages long, beyond the reach of humanity.


I wrote program to solve the cutting stock problem (https://en.wikipedia.org/wiki/Cutting_stock_problem) for a personal project. I couldn't (or didn't want to) use any existing program for it because my stock involved cutting pieces shaped like either /---/, /---|, or |---| and I didn't want to waste material on the 45 cut.

I find it interesting that the description of Brady's program to optimize search for BB(4) by cutting out search subtrees whose differences don't matter is fairly close to a description of what I did to make my program fast.


According to Scott Aaronson's blog post on this, there are 16,679,880,978,201 5-state Turing machines. I wonder if we know what percentage of them halt?

Edit: the number of TMs with n states is (4n + 1)^(2n). Found this (for smaller n), which is the kind of analysis I was curious about: https://github.com/LukasKalbertodt/beaver


We must know the percentage which halt! I can’t find it on the bbchallenge.org site, but every machine is categorized.

Not necessarily. The seed database only contains TMs in tree normal form (TNF), which already contains several reductions. To convert the percentage of TNF halters to the percentage of total halters, you'd have to account for the multiplicity of each TNF machine, which is a bunch of fiddly combinatorics. Also, you'd have to add the machines that have no TNF form due to never writing a 1 to the tape.

Maybe I’m just an unsophisticated code monkey, but I read a little about Busy Beaver from time to time and I just don’t get it. Why is this an interesting problem? What do we hope to learn from it?

IMO, because it gives you a way to know whether a program will terminate or not. If it consists of 5 states and goes beyond 47,176,870 steps of execution, then it will never halt. However, we are currently limited to Turing machines with 1 to 5 states. And you can make correspondences between some Turing machines and some theorems in mathematics, so it gives you a way to prove them, I guess.
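
Concretely, knowing S(5) = 47,176,870 turns 5-state halting into a bounded simulation. A minimal sketch (the table format and names here are my own):

    def halts_on_blank(delta, bound=47_176_870):
        # delta maps (state, symbol) -> (write, move, next_state);
        # next_state None means halt. Any 5-state, 2-symbol machine
        # still running after S(5) = 47,176,870 steps can never halt.
        tape, pos, state = {}, 0, 'A'
        for _ in range(bound):
            write, move, state = delta[(state, tape.get(pos, 0))]
            tape[pos] = write
            pos += 1 if move == 'R' else -1
            if state is None:
                return True   # halted within the bound
        return False          # exceeded S(5): never halts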

The one thing I didn't understand is the distribution of 1s and 0s on the tape. Does each Turing machine have to be solved for all possible combinations of 1s and 0s?

The game is defined to start with an infinite all-zeroes tape.

An important observation is that it doesn't really matter what you start with, because you have two ways to "reduce" each Turing machine to one that can start with all zeroes:

- You can "flip" rules/initial tape values, trying to get a machine that can start with all 0s. This is not always possible. A trivial case where it is possible is a machine that wants to start with all 1s (just flip everything).

- You can insert more rules to set the tape up as needed. This makes it a BB(n+x) busy beaver candidate, but at least it's not your problem at BB(n) anymore.

The last way can get a bit complicated because you need to insert ad-hoc logic, since the tape is infinitely long. So it's not just a case of inserting setup at the beginning of the program (because you can't set up an infinite tape in finite steps). Also, to be fair, this second point is a non-trivial assertion and requires proof that it is actually possible in every case. It is obviously possible for every machine that runs in finite steps, though (since it can only consider a finite amount of tape).

Luckily actual computers have finite memory which is generally initialized with 0 anyways.


Thank you! I’m trying to read up on it now also because this is definitely not something I know much about.


> Why is this an interesting problem? What do we hope to learn from it?

Because it's a complicated puzzle, not necessarily because we hope to learn something from it. Which could happen, but it's definitely not the primary goal here.


Can someone explain how there can be a finite number of rules? Could the rule "If it's a 0, change it to 1 and move 3 squares to the left" produce a different result than when moving 4 squares, or is that not the case?

A Turing machine rule can only:

- set the current cell’s value

- move by one cell

- switch to the next rule (or halt)

(A TM rule has one sub-rule for each valid value / symbol)

Turing machines are characterised by the number of values (or symbols) and the number of rules (or states) (and technically the number of directions, but TMs are generally one-dimensional). The Busy Beaver game fixes the number of symbols to 2 and only varies the number of states, e.g. BB(3) is played on 3-state, 2-symbol Turing machines.

Thus the number of possible machines in BB(n) (n-state, 2-symbol Turing machines) is necessarily limited, to (4n + 4)^(2n).
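
As a concrete illustration, a whole machine is just a finite table with one (write, move, next state) sub-rule per (state, symbol read) pair. Below is the commonly cited 2-state champion (it runs S(2) = 6 steps and leaves Σ(2) = 4 ones), in a made-up Python encoding:

    # One sub-rule per (state, symbol read): (symbol to write, move, next state).
    # A move is always exactly one cell; None as next state means halt.
    bb2_champion = {
        ('A', 0): (1, 'R', 'B'),
        ('A', 1): (1, 'L', 'B'),
        ('B', 0): (1, 'L', 'A'),
        ('B', 1): (1, 'R', None),  # halting transition
    }

Each of the 2n sub-rules picks one of 2 symbols, 2 directions, and n + 1 next states, which is where the (4n + 4)^(2n) count above comes from.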


You can only ever move one step to the left or right. If you want to move multiple steps, you need multiple states to use as a counter, or set up the tape a specific way ahead of time so you know when to stop moving.
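
To make the "multiple states as a counter" point concrete, here is what moving three cells left costs in the same kind of table as above (hypothetical state names):

    # Each helper state writes back what it read, moves left, and chains on:
    move3_left = {
        ('M1', 0): (0, 'L', 'M2'), ('M1', 1): (1, 'L', 'M2'),
        ('M2', 0): (0, 'L', 'M3'), ('M2', 1): (1, 'L', 'M3'),
        ('M3', 0): (0, 'L', 'A'),  ('M3', 1): (1, 'L', 'A'),  # then resume at A
    }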

Who else thinks mxdys is an AI program gone rogue? Thank you sweet mxdys!

https://scottaaronson.blog/?p=8088

> You need to find a mathematical reason why it can’t halt, and there’s no systematic method for finding such reasons—that was the great discovery of Gödel and Turing nearly a century ago.

That's only true for sufficiently large N, large enough to encode the halting paradox. (How large is that N? It's larger than 5!) Smaller N can and do fall to systematic methods.

> f(x) = (5x+18)/3 if x = 0 (mod 3),

The second = should be \equiv. This is a rare case where that actually matters, because x is being used in both non-modular integer operations (dividing by 3) and modular operations (where division by 3 is not defined).


BB(0) = 0

BB(1) = 1

BB(2) = 4

BB(3) = 6

BB(4) = 13

They just proved that BB(5) = 47,176,870.

It is known that BB(6) must be at least 10^10^...^10 (a tower of exponents fifteen levels high).

https://en.wikipedia.org/wiki/Busy_beaver#Known_values_for_%...


The definitions of the "BB" function don't all appear to be the same. The article referenced in this overall discussion says BB(2) = 6. In the notation of the Wikipedia article you reference, this would be S(2) = 6; S is the number of steps. What has now been proved is that S(5) = 47,176,870.

However, your BB values for 0, 1, 2, 3, and 4 match the Wikipedia article's notation for Sigma; Sigma is the number of 1s written on the tape at halting. In that notation, what has now been proved is Sigma(5) = 4098.


Yeah, this is a bit confusing, and the subject of repeated internal controversy. Most of the twentieth-century authors focused on the number of 1s, Σ(n), following Radó's original practice of treating σ(M) as the score of a machine M in the "Busy Beaver game". But when Aaronson re-popularized it in 2020 [0], he used BB(n) to denote the number of steps (which Radó called S(n)), and the bbchallenge project has been using this latter convention for publicity. Pascal Michel's website [1] has all the Σ(n) and S(n) bounds up to n = 7.

Personally, I think both functions have their strengths and weaknesses. Σ(n) is easier to calculate for machines that run too long to be simulated directly (e.g., Skelet #1 from the article) but leave a known pattern on the tape, and it also has historical priority. But S(n) has a simpler argument for being undecidable, since it provides a trivial filter for testing if a candidate machine cannot halt. Also, σ(M) is a bit weird in that it has no lower bound in terms of s(M), since an adversarial machine could do a colossal amount of work before wiping its tape at the end.

Regardless, past BB(3), there isn't any known size where the champion machines for Σ(n) and S(n) are different. (At least, the sets of champion machines aren't disjoint: Σ(5) = 4098 is shared by both the S(5) champion and another machine that runs a quarter as long.) The score of a machine is dominated by googological strength rather than technicalities in the definition.

[0] https://scottaaronson.blog/?p=4916

[1] https://bbchallenge.org/~pascal.michel/ha


> past BB(3), there isn't any known size where the champion machines for Σ(n) and S(n) are different.

My feeling is that this trend cannot continue forever, and for infinitely many N they are different. If they are always the same, then you could find the steps champion just by finding the marks champion. This would be convenient, because as you pointed out, steps are more logically important, while marks are more practically important. But this feels too good to be true, and so it probably isn't.


Hmm, around n = 2 or k = 2, there are only 2 added transitions for a machine to do "the next big thing" googologically, so that doesn't leave much slack for many different machines at the same level. But maybe that could happen closer to the n = k line, where each increment adds many new transitions. Or to the contrary, maybe each increment just does several "next big things".

Or the next big things were implementable at n-3, but finally pay off at n+1?

What does "practically" mean here?

Busy Beaver champion programs are said to run for super-exponentially many steps. But nobody has actually run their simulators for that many steps. Instead, simulators can prove tape fast-forwarding rules. Basically, you look for repeating tape patterns. If you can prove the pattern is correct, then you can apply that transformation again if some tape circumstance shows up again.

For example (using run-length encoding), 1^n 0 1^m might become 1^(n-1) 0 1^(m+2)

When the rule is applied, the transformation is applied directly to the tape, generally by manipulating some count variables.
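
As a toy illustration of such a count-manipulating step (my own sketch, using the example rule above):

    # Tape as run-length-encoded blocks: [(symbol, count), ...]
    def apply_rule(tape):
        # The rule 1^n 0 1^m  ->  1^(n-1) 0 1^(m+2), valid when n >= 1
        (s1, n), (s0, c0), (s2, m) = tape
        assert (s1, s0, s2, c0) == (1, 0, 1, 1) and n >= 1
        return [(1, n - 1), (0, 1), (1, m + 2)]

    print(apply_rule([(1, 5), (0, 1), (1, 3)]))  # [(1, 4), (0, 1), (1, 5)]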

Now, how many machine steps does it take to apply this transformation? Well, TBH I'm not really sure. It seems kinda complicated, especially when the rules get more elaborate. If you are trying to run your simulator as fast as possible, you probably don't want to bother calculating it at all anyway, since you can always rerun the analysis at a more leisurely pace later.

So when I say that marks are "more practically important", I mean that marks are central to the operation of advanced simulators, whereas steps are a derived afterthought value.

Logically, the steps are more important, since they give you an easy method for solving the halting problem for the state/color class.

So far, the markiest programs are also the steppiest. My conjecture is that they will turn out to be different in infinitely many classes. If they were always the same, you would be able to get the logical primacy of steps just from working with marks. And that sounds too good to be true.


Pascal Michel's website hasn't been updated with the latest results yet, right?

Under "Turing machines with 5 states and 2 symbols", there's no mention of the definitive results


Thanks for this information, it's very helpful! Also I had no idea that "googological strength" was a thing. :-)


Looking at the pictures on Wikipedia regarding those programs, it's interesting how they seem to result in fractals.

This mainly has to do with the BB(5) champion having "bouncing" behavior: it moves from the left side of the current pattern to the right side and back again, each time extending it slightly. Since the time it takes to grow the pattern is strictly proportional to its current size, you end up with a bunch of parabolas that appear self-similar.

That is because these are small. Small machines cannot help but have those sorts of patterns in them. See all the simple 1-D automata of interest, for instance: https://atlas.wolfram.com/01/01/

As you step up I am fairly confident you'll see the winners get noisier and noisier. They'll have some sort of pattern, just ones you won't be able to comprehend. For some suitable definition of "noisier" this might even be a provable claim.


Ha, the funny thing is, there are conjectures both ways! Some people think winners get noisier and noisier, the spaghetti code conjecture. Others actually think the winners get cleaner and cleaner, the clean code conjecture.

Personally, I am slightly leaning towards clean code being more powerful. Chaos is hard to control.

https://nickdrozd.github.io/2022/03/12/formal-theory-of-spag...


I think humans' attempts to make busy beavers may be clean, but chaos being hard to control doesn't matter to the "real" busy beavers, because we pick from the entire range. I think that's one of the hardest things to wrap human brains around when trying to intuit how the sequence behaves. We are selecting for the maximally pathological machines. If there is a frontier between chaos, order, and simply never halting, we are deliberately hunting the one that is just barely on that line. I don't think clean patterns can ever hope to compete with one that exploits chaos just enough to walk right up to the line and, after a number of steps beyond any human ability to write down with any notation we have, not quite cross it.

But yeah, it's an opinion. :)


> We are selecting for the maximally pathological machines.

This is not quite right. We're selecting for maximally long-running (or mark-producing, etc.) programs. Whether those programs turn out to be "pathological" in some sense is an open question, and a question that can only be answered (to a limited extent) by observing and analyzing actual champion machines. A priori, there's nothing we can say about what a champion program will do or what its code will be like. The Spaghetti Code Conjecture says that these champion programs will tend to be complicated. I have argued in the past that this may not be the case. It's entirely possible that the code of a champion program will be written more or less straightforwardly, and its long-running-ness comes from executing some bizarre and exotic mathematical function.

Actually, I think the more likely outcome is that some champions will be spaghetti and some will be clean. If they were all spaghetti or all clean, then that fact could be exploited to speed up the search process by discarding all programs not matching the property. And that sounds too good to be true, so it probably isn't. Maybe BB(8) champion is clean, BB(9) is spaghetti, etc, and there are no reliable trends.


"Actually, I think the more likely outcome is that some champions will be spaghetti and some will be clean."

Ah, now, that's thinking with pathologies.


Clean code isn't the same as clean behavior/output, though. A Collatz calculator would have rather unclean output.

Wouldn't it make more sense for the longest running ones to be fractal in a sense given that the self-similarity is sort of compressible? I guess we'll probably never know, because it's not clear that we'll ever find another BB winner.

Fractals or fractal-like things seem to be the main way you get "interesting" behaviour out of very simple rules. Procedurally generated art/content tends to use them a lot (look at e.g. demoscene programs).

> "After decades of uncertainty, a motley team of programmers has proved precisely how complicated simple computer programs can get."

my team has also proved this via our production codebase


These researchers clearly haven't looked at our NodeJS ERP system.

Aaronson’s post on BB(5): https://scottaaronson.blog/?p=8088

From the article:

> To distill the essence of the halting problem into a simpler form, Radó imagined sorting Turing machines into groups based on how many rules they had — one group for all one-rule Turing machines, another for all two-rule machines, and so on. Sure, that leaves infinitely many such groups, since a Turing machine can have any number of rules. But the number of distinct machines in every group is finite, since there are only so many possible combinations of rules.

> With two rules, there are already over 6,000 distinct Turing machines to consider; that number swells to millions with three rules, and to billions with four.

I'm pretty sure the standard terminology is "states", not "rules". This deviation made it harder to understand.

Each state produces 2 transition rules, depending on the symbol at the tape head.



