Hacker News
Terence Tao Proves Result on the Collatz Conjecture (quantamagazine.org)
361 points by datashrimp on Dec 13, 2019 | 153 comments

I got sucked into this problem as an undergraduate in mathematics. I similarly wanted to limit my search space, and my initial work led me down the path of prime factorization. If every prime number follows this pattern, then logically every number also follows this pattern. Then you get to the hard part, which is why or why not a prime number would follow this pattern. So then you go looking for the reasons why prime cycles can or cannot exist, your brain melts, and then you work on something else. Fun times though.

As a computer science (Informatics) student, I shy away from any path in a proof that requires me to make some kind of statement about primes. To me, they are the death of every approach they are involved in.

From my one hour of thinking about this problem, I think sums of finite sequences are a much more promising approach. But obviously you'd know way more about all of this than me.

Many of the smartest mathematicians who have ever lived have spent collective man-decades thinking about this problem. Any idea any of us can come up with in an hour of thought will have been considered and rejected a thousand times already by others.

This is partly why it's such a trap. It's like trisecting an angle or squaring a circle -- easy to state the problem, easy to start playing around with it. In my amateur opinion, any proof related to Collatz would be similar in complexity and non-obviousness to Wiles's proof of Fermat, if such a proof is even possible.

For most, it's far more fruitful to jump on https://math.stackexchange.com/ and work on questions there. Low impact, but high probability it is within anyone's realm of capabilities. Answering 100 questions people want to know is bigger impact than making 0% progress towards a problem you have no realistic chance of solving.

I know, I know. But I kind of want to figure out why what I'm thinking doesn't work. I want to run up against these problems myself.

There are tons of theorems that are most easily proved by showing that some statement holds for primes, and extending it through some sort of multiplicative structure to all numbers.

Sieve of Aristothenes + induction seem most likely to me.

"For every binary sequence of length X, the Collatz iteration ends at 1. Adding more zeros clearly does not change the result, therefore adding more ones is the only path which might disprove the conjecture." Eventually you build up a catalog of proven sequences that all other numbers must necessarily step into at some point.

You just made it as hard as proving generalized Riemann hypothesis. Not good progress with that task. Might as well start working on Goldbach conjecture.

*Eratosthenes, best known for being the first person to calculate the circumference of the Earth.

I don't think primes are a factor - if they were, 3n+1 would also have to be prime, and it is easy to show that 3n+1 can be factorized into 2 and something else.

The starting number has to be a prime, and the result is that you would cycle to that number and no further.

Alternative result would mean literal mathematical chaos. The number would be an origin point of an attractor. Approaches to find such a counterexample would require something akin to guessing a fractal.

Trying to evaluate the Lyapunov exponent of this system could be fun. Applying Poincare-Bendixson and Bendixson-Dulac to a dual system that is so defined as to reject the final limit cycle and is differentiable, perhaps... Too big math for me. It smells of Hilbert's sixteenth problem.

Can you write more about prime cycles and your brain melting? That sounds neat.

i wandered into a talk at MAA ~2006, where an old professor spoke deadpan on how he “wasted his life” on this problem. it was a poignant comedy, and he delivered it well. dont fuck with this problem.

> Tao used this weighting technique to prove that almost all Collatz starting values — 99% or more — eventually reach a value that is quite close to 1. This allowed him to draw conclusions along the lines of 99% of starting values greater than 1 quadrillion eventually reach a value below 200.

Isn't this equivalent to saying that "99% of starting values greater than 1 quadrillion eventually reach 1"?

I mean, once you reach a value below 200 then you will continue and reach 1. Not only below 200, but below any limit that has been experimentally verified (i.e. around 10^20).
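A quick sanity check (a throwaway Python sketch, nothing official) confirms the first step of that argument: every starting value below 200 verifiably reaches 1, so "reaches a value below 200" really does give you "reaches 1":

```python
def reaches_one(n, max_steps=10_000):
    """Iterate the Collatz map from n; True if the orbit hits 1."""
    while n != 1 and max_steps > 0:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        max_steps -= 1
    return n == 1

# Every starting value below 200 reaches 1, so "reaches a value
# below 200" does imply "reaches 1".
assert all(reaches_one(n) for n in range(1, 200))
```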

If Colmin(N) is the lowest value reached by iterating the process from initial value N, Tao proved that

Colmin(N) < f(N)

for arbitrary f provided f tends to infinity, for almost all N. Here, "almost all" means something like "exceptions(N) / N tends to 0 for large N", where exceptions(N) is the count of values up to N that do not obey the inequality [0]

So it's an asymptotic result. The first million integers could all be exceptions - but eventually the proportion of exceptions dies out.

The ability to pick arbitrary f is very powerful. Pick the slowest-growing function you can think of. e.g. Tenfold-iterated logarithm. The inequality says for all but a negligible fraction of integers, Colmin grows slower than that function.
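For anyone who wants to play with Colmin concretely, here's a minimal Python sketch (the function name is just borrowed from this comment). Note that for every N anyone has actually tested, the orbit bottoms out at 1, so Colmin is 1 there; the theorem is about what can be proven for almost all N:

```python
def colmin(n, max_steps=100_000):
    """Lowest value reached by the Collatz orbit of n (stops at 1)."""
    best = n
    while n != 1 and max_steps > 0:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        best = min(best, n)
        max_steps -= 1
    return best

# For every n ever checked, the orbit bottoms out at 1:
assert all(colmin(n) == 1 for n in range(1, 10_000))
```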

[0] Nitpick: I'm describing the natural density, but Tao needs the logarithmic density, where each exception n is weighted by 1/n.

Try the inverse Ackermann function instead: https://en.m.wikipedia.org/w/index.php?title=Ackermann_funct...

If non-computable functions are on the table, define f(n) to be the smallest k such that BB-k >= n, where BB-n is the n-th Busy Beaver number: https://en.wikipedia.org/wiki/Busy_beaver

Edit: looks like @tromp managed to reply before I added the bit about BB-n. :-)

Or the inverse TREE() function [1]. There's also the inverse Busy Beaver function if you don't insist on computability...

[1] https://en.wikipedia.org/wiki/Kruskal%27s_tree_theorem#TREE(...

Non-computable functions are indeed on the table as long as they tend to infinity in the limit... at the cost that figuring out when you start to hit that limit is undoubtedly itself non-computable.

Actually it's a fair nitpick because logarithmic density 0 does not imply natural density 0 - so the claim in the article is misleading in more than one way.

For example, the set of numbers with a perfect square number of digits has log density 0 but natural density non-existent (inf 0 and sup 0.9). I believe it holds in the other direction though - natural density 0 implies log density 0.

If you choose giant starting numbers (like TREE(3)), then maybe "quite close" is 10^100.

Yes, I believe the key phrase was ”along the lines of” and Tao’s result doesn’t actually prove any particular constant bound which would indeed be a much stronger result.

I don't like the phrasing the article used. The domain of the Collatz function is the set of positive natural numbers, which is countably infinite. Unless I'm misunderstanding the quote, it doesn't make sense to refer to percentages of infinite quantities.

In the formal sense of the term, "almost always" means that a property applies to all elements of an infinite set, with all exceptions comprising a subset with a smaller cardinality than the original set.

I usually see this term applied in the sense of measure theory, where the sets under study are uncountably infinite (or greater). Being that the naturals are countably infinite, the only way I can interpret this statement is that Tao has proven all but finitely many elements do not have an orbit terminating in 1 under the Collatz map.

Is that correct? If not, would someone mind clarifying the use of percentages here?

You can make sense of a percentage of natural numbers with the concept of natural density: https://en.wikipedia.org/wiki/Natural_density

The natural density of a set is the limit as N goes to infinity of the proportion of the numbers up to N which are in the set.
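In code, that definition is just the following (a throwaway finite-N approximation; the true density is the limit as N grows):

```python
def natural_density(in_set, N):
    """Finite-N approximation: |{n <= N : n in the set}| / N.
    The true natural density is the limit of this ratio as N -> infinity."""
    return sum(1 for n in range(1, N + 1) if in_set(n)) / N

# Even numbers: exactly half of 1..100000
assert natural_density(lambda n: n % 2 == 0, 100_000) == 0.5
# Perfect squares: the ratio tends to 0
assert natural_density(lambda n: int(n**0.5) ** 2 == n, 100_000) < 0.01
```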

For instance the natural density of primes, squares and cubes is 0, and the natural density of even numbers is 1/2. But all these sets are infinite, so they have the same cardinality.

Thank you, but your last sentence is not really correct. The concept of cardinality captures different sizes of infinity. This is why I mentioned countable and uncountable infinities upfront: a countably infinite set and an uncountably infinite set do not have the same cardinality, but both are infinite.

This is not correct.

I think your basic mistake here is thinking that sizes of infinite sets have to be measured by cardinality. Yes, that's one way, but usually not the most useful way; it's a very crude measure. Its only real advantage is the fact that it doesn't need any context to work. In most cases, though, more context-dependent measures of size are more appropriate.

For instance, typically, "almost everywhere" doesn't mean "except on a subset of smaller cardinality", it means "on a subset of measure 0".

It's easy to see why you might get these confused. In many of the typical cases of measure theory -- let's say R^n with Lebesgue measure for concreteness -- you're looking at a set of cardinality 2^(aleph_0), and any countable subset will have measure 0. (Indeed, any set of intermediate cardinality will also have measure 0, if such a thing exists, although famously the question of whether it does cannot be resolved.)

But you can also have subsets also of cardinality 2^(aleph_0) which nonetheless have measure 0. E.g., in R, the Cantor set has measure 0, although its cardinality is equal to that of R itself. If a certain statement was true except on the Cantor set, we'd still say it was true "almost everywhere".

(And all this is assuming we're using "almost everywhere" in the measure sense. Sometimes it's instead used to mean, except on a meagre set, which is a different notion; but that's not the usual meaning, so someone using it that way would hopefully say what they mean explicitly.)

In the case of the natural numbers, we frequently measure sizes of subsets by their natural density. The natural density of a subset S of the natural numbers is simply the limit of |S ∩ {1,...,n}|/n as n goes to infinity (this limit is not guaranteed to exist, of course). So when talking about the natural numbers, if we say something holds "almost everywhere", typically this means it holds except on a set of natural density 0.

(Remember, math terms are heavily overloaded!)

I hear there may be some people who use "almost everywhere" when talking about the natural numbers to mean "except on a finite set", but I'd consider this use confusing; if that's what you mean, just say that.

Hope that clears things up.

Edit: It turns out that Tao is actually using the logarithmic density here, not the natural density. Oy. The importance of actually reading the paper, I guess...

Thanks, this is a great comment! You've cleared things up for me a lot.

From Tao’s blog post, it sounds like the result is more in terms of things like: for almost all n<N, the minimum value of n’s Collatz orbit is bounded by extremely slow-growing functions of N (e.g. log log log log N).

Nice. This means that any resulting limit cycle will be tiny, thus hard to hit. So he actually proved that it's hard to disprove the conjecture.

Speculation only, but the method of proof might, in its key step, only bound to ~200. This is a more meaningful result to present if your reader knows that 200 is as good as 1, because it loses nothing and also highlights something about the bound produced.

One of the aspects of math that I find beautiful is the deep connection between often seemingly unrelated topics. Connecting PDEs, which is a branch of analysis and what I consider "continuous" mathematics to something that seems to me like a discrete number theory problem is beautiful. The bit of intuition for why this might be the case in the article is nice:

> With a PDE, you plug in some values, get other values out, and repeat the process — all to understand the future state of the system. For any given PDE, mathematicians want to know if some starting values eventually lead to infinite values as an output or whether an equation always yields finite values, regardless of the values you start with.

> For Tao, this goal had the same flavor as investigating whether you always eventually get the same number (1) from the Collatz process no matter what number you feed in. As a result, he recognized that techniques for studying PDEs could apply to the Collatz conjecture.

My favourite story on this is Feynman's use of PDEs in the Connection Machine design.

I really want to buy the CM-1 T-Shirt, but with duty and shipping it's too much for a shirt the comments describe as being of poor t-shirt quality....

I've always wanted to see his analysis of the behaviour of the router in terms of PDEs.

Thank you so much for sharing this!

The connection is actually very well known. See https://en.wikipedia.org/wiki/Dynamical_system.

I don't doubt that. I'm not a mathematician, just a guy who took some classes in college. For a noob like me, this is all cool to see for the first time.

I don't actually think it is that well-known, you were completely right to point out. Also, I suspect the "dynamical systems" wikipedia page is completely irrelevant.

The Collatz map defines a discrete dynamical system on N^+. I don't know about your background, but the connection isn't hard to see for anyone with a tangentially related background. I have studied https://en.wikipedia.org/wiki/Arithmetic_dynamics (that page does mention Collatz incidentally) myself in a past research project, so the connection is known to me. The dynamical systems Wikipedia page serves as an entry point to the broad topic, but it's a huge field and intersects with many others so apparently a single short page isn't gonna cover everything.

What I meant was that "Collatz's got something in common with PDEs" isn't the unique insight of Terence. Of course the connection isn't obvious to, say, the average coder, but it's universally known among mathematicians who have at least studied related fields for a little bit. Terence's magic is on a much higher level.

I did my PhD in theoretical physics, and part of my thesis was on nonlinear eigenvalue problems in PDEs. I wasn't aware of the connection; though I guess it really depends on what you mean by "it's well known".

where on that page does it discuss the connection?

Now that was a great example of an accessible, technically and culturally accurate article. For stuff like this, Quanta magazine is generally the world leader.

What does "culturally accurate" mean?

I guess they mean it accurately captures the "mathematical culture" surrounding this problem, which I can confirm.

I assume that the characterisation of the problem as 'dangerous', 'alluring', 'tempting', etc. to mathematicians is accurate.

meh. No, I think this is the sensationalization we expect from pop-sci accounts.

I understand it as that people with expert knowledge from the field, mathematicians in this case, agree with the classification of the problem on a social and cultural level.

Was thinking the same thing! A catchy title that actually described the content.

I can totally see how one could be sucked into this problem.

The solution seems immediately reachable. After giving it some thought, I can prove that it is equivalent to "every number will at some point reach a power of 2" because from a power of two you then go straight to 1 through divisions.

This result seems even more reachable since I no longer need to prove that numbers eventually get smaller.

I'm sure this is not a new discovery to people who have worked on the problem. But part of me really wants to spend the rest of the day puzzling around with where that approach goes.

Edit: Thought about it more. The two ways it doesn't hit 1 would be for it to find a loop that never hits a power of 2, or to grow infinitely.

Thinking about how a loop with 3n+1 and n/2 would have to function, I think it's probably pretty easy to prove that 4-2-1 is the only loop that is possible.

That leaves us with infinite growth, but without ever hitting a power of 2.

Edit2: Turns out 4-2-1 being the only loop that is possible is actually not easy to prove at all and isn't proven. But no matter, that wasn't how I was going to continue working on the problem anyway.

The most promising solution to me is an inductive proof. Since I know all powers of 2 converge to 1 and only have /2 applied to them, my next step is to find the properties of numbers n such that 3n+1 is a power of 2.

Only for every second power of 2 is this a whole number. The powers of 2 [4, 16, 64, 256, 1024] have the possible predecessors [1, 5, 21, 85, 341]. Interestingly, every subsequent number can be built from the previous one by 4n+1. Clearly it's 4 because it's every second power of 2. I feel like, if you recognize the structure of every number in the tree, you can prove that every natural number has one of those structures. And then you are done.
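That pattern is easy to check numerically. A small sketch (mine): the odd predecessor of 4^k under the 3n+1 rule is (4^k - 1)/3, and consecutive predecessors satisfy the 4n+1 recurrence:

```python
# n is an odd predecessor of the power of two 4**k exactly when 3*n + 1 == 4**k.
preds = [(4**k - 1) // 3 for k in range(1, 6)]
assert preds == [1, 5, 21, 85, 341]

# Each is built from the previous one by 4n + 1, as observed above:
assert all(b == 4 * a + 1 for a, b in zip(preds, preds[1:]))
```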

Once again, I'm completely aware none of this is original thought. But I enjoy it anyway.

Edit3: Every even number will be divided by 2 until it is an odd number. Now we only have to look at the structure of the odd numbers.

> I think it's probably pretty easy to prove that 4-2-1 is the only loop that is possible.

Conjecture 2: https://terrytao.wordpress.com/2011/08/25/the-collatz-conjec... turns out it's not :p

Damn, I was so sure it would be. This problem really is fascinating.

Had the same thought further down the comments.

Since you're getting even numbers at least 50% of the time, eventually you should go on a run.

You also benefit from hitting a y = 2^n × N number, because you get to greatly reduce the size of y

Further, every time you make it down to 1, you can append the values you hit on your way down to a list of seen values, as they're part of a convergence chain, meaning if you see them in a future cycle, you can stop your iteration early because you have a known outcome.

So this doesn't necessarily become a computationally heavy problem with large numbers.

Yeah, this is definitely at least a semi-decidable problem. By following the process of the video in the beginning you can generate all values that converge towards 1, and you will reach each value at some point. Now, all that is missing is the other direction. How would you test whether a number doesn't converge towards 1 in an algorithmic way that always finishes?

And since 3n+1 will never be prime, you'll eventually reach a number which has factors which converge to 2^n

The only way to grow infinitely is to hit more odd numbers than even, but you're hitting even at least 50% of the time.

And eventually you'll get a number 2^n x N which will string together multiple divisions.

So you're hitting n/2 more than 50% of the time, meaning your number is getting smaller more often than not, and will eventually go to 1.

If you hit even exactly 50% of the time you will grow to infinity - that is, a sequence of even/odd will grow for as long as that alternation continues; it is only when the sequence isn't strictly even/odd that you will shrink. So you need to prove that you are hitting even more than x% of the time, where x > 50. I'm not sure how to calculate the exact value of x though.

This is one case where a single counter example would end the whole debate in about 5 minutes (I'm going to guess that is about how long it will take to calculate your whole sequence). However in absence of a counter example, and given the large number of numbers that pass: it seems like it must be true - but we don't know how to prove it. We don't even know if it is possible to prove it (a proof that you can't prove it could be the greatest advance in math to this date - but again it is extremely hard to do that)

You would first have to prove that the number is actually a counterexample. I don't know what you mean by calculating the whole sequence, as it would be an infinite sequence. Proving a counterexample seems like a halting problem kind of problem.

Unless you find a counterexample which results in a loop N1 -> N2 -> ... -> Nn -> N1

Err... right. I stand corrected.

But the question of a counterexample (even though it probably doesn't exist) is interesting. Is there a number for which we don't know whether it is a counterexample because it would take too long to compute? I feel like any single counterexample you'd propose would be proven wrong pretty quickly.

The problem is that if my counter example is a counter example, it diverges to infinity. (Someone else proposed a proof that there are no cycles other than at 1; if this proof is wrong, my counter example could be a cycle and thus "easy" to show.) However, if it goes to infinity, it is really hard to see how you show it doesn't eventually converge if you just went a little bit longer.

Note that every counter example you propose is actually a sequence of counter examples. It would be interesting to examine what properties the numbers in that sequence have in common. Though this is an obvious thing that I suspect someone has already done to no effect.

If you change the multiplier from 3 to 5 or 7, then you get sequences which diverge to infinity, whereas your argument would apply equally well, suggesting that the intuition you're leveraging is wrong.

A better intuition that usually comes up is to turn the problem into steps of (kn + 1)/2^a, and ask whether the next number is larger or smaller. For k = 3, the resulting map is usually smaller, but for k ≥ 5 it is usually larger, so rather than orbiting, numbers are more likely to spiral off to infinity.
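That heuristic can be checked empirically. The sketch below (my own, with arbitrarily chosen sample sizes) estimates the average log-change of one accelerated step n → (kn+1)/2^a over random odd n; it comes out negative for k = 3 and positive for k = 5 and 7, matching the expected drift of roughly log(k/4):

```python
import math
import random

def avg_log_drift(k, trials=20_000, seed=0):
    """Average of log(T(n)/n) over random odd n, where T(n) = (k*n+1)/2^a
    and 2^a is the full power of two dividing k*n+1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        n = 2 * rng.randrange(1, 10**6) + 1  # random odd starting point
        m = k * n + 1
        while m % 2 == 0:
            m //= 2
        total += math.log(m) - math.log(n)
    return total / trials

assert avg_log_drift(3) < 0 < avg_log_drift(5)  # shrink for k=3, grow for k=5
```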

> The Collatz conjecture is quite possibly the simplest unsolved problem in mathematics — which is exactly what makes it so treacherously alluring.

IMO that distinction actually belongs to Goldbach's conjecture which is:

Every even integer greater than 2 can be expressed as the sum of two primes.

I think Goldbach is easier for people who know a bit about math (i.e., remember the definition of a prime number.)

Collatz is easier to explain for someone with no background.

Is it your opinion that understanding the meaning of "two primes" is simpler than understanding the meaning of even, odd, tripling, halving, and adding one?

In my opinion it's not simpler, primes are a difficult concept. Tripling and halving is simpler.

What tripped me up for a second is that primes don't have to be distinct. So 4 is sum of 2 + 2 and 6 is sum of 3 + 3.

See https://terrytao.wordpress.com/2019/09/10/almost-all-collatz... for a more detailed explanation.

His actual result is this.

Suppose that f(N) is a function from the natural numbers to the reals that has infinity as a limit. Let X be the set of natural numbers whose Collatz orbits eventually go below f(N). Then the logarithmic density of X is 1.

Here logarithmic density is the limit of the following ratio:

    sum(1/n for n in X and n < N)
  / sum(1/n for n < N)

Now you just have to pick an f that grows very, very slowly. For example f(N) = log(log(log(N))).
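For concreteness, here is a rough Python sketch (mine; a finite-N approximation only, since the real density is a limit) of that ratio, sanity-checked on the even numbers, which have logarithmic density 1/2:

```python
def log_density(in_set, N):
    """Finite-N approximation of logarithmic density:
    sum(1/n for n in the set, n <= N) / sum(1/n for n <= N)."""
    num = sum(1.0 / n for n in range(1, N + 1) if in_set(n))
    den = sum(1.0 / n for n in range(1, N + 1))
    return num / den

d = log_density(lambda n: n % 2 == 0, 100_000)
assert 0.45 < d < 0.5  # slowly approaching the true density 1/2
```

The convergence is slow (the error shrinks like 1/log N), which is in the spirit of why logarithmic density is the natural notion here.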

A few years ago I got interested enough in this that I bought the book mentioned in the article "The Ultimate Challenge: The 3x + 1 Problem."

If you want to see dead ends other people have gone down, read some actual results, and learn some history, it is a good read for the mathematically inclined. Some of the maths went over my head but I enjoyed it!

I must admit to being curious about this book. I don't see how this hasn't been proven already (though I can't write mathematical proofs myself)

If you look at this as a series of consecutive operations, every odd operation grows by a little over three, and is guaranteed to be followed by an even operation. Even operations shrink by a little more than three, so all that's required is to compare the magnitude of those two offsets to see whether continued operations are growing or shrinking. In the case of odd operations, as the infinite series of operations continues, the offset gets smaller and smaller. In contrast, the size of the offset for even operations doesn't change as the series of operations continues, which should mean that numbers shrink more than they grow, just ever so slightly.

(To understand the even operations, you have to look at expected values. 1/2 of all even operations will produce an odd operation after dividing by two, 1/4 after dividing by four, 1/8 after dividing by eight. The infinite sum of (1/2)^(2n) converges to 1/3, with an actual value of 1/3 - (1/3)*(1/2)^(2n).)

By using expected value, you are assuming that the distribution of the even numbers is uniform. You will need to prove that first, and I suspect that will be difficult.

This argument is called a heuristic. It's not actually a proof.

It is an extremely persuasive one, however, and as a result of it, nearly all mathematicians believe the conjecture is true.

The persuasiveness of the heuristic is one of the things that makes the problem so confounding. It's obviously true; it's silly to argue that it isn't.

There are some good ideas in there, but I can see a few problems with this proof:

1. It doesn't show the result for all natural numbers, only some "high probability" fraction of them (similar to what Tao did).

2. The expected value of the ratio being 1 doesn't imply that the actual number goes to 1, since it could be stuck in an endless loop with itself as the minimum.

3. 3n+1 doesn't necessarily have a uniform distribution over the even numbers. Tao gets around this by picking some "stable" subset of numbers that don't move too much under the transformation.

This was the problem that turned me from a music major to a math major and eventually led me down the path to computer science. It's a beautiful and cold problem and I love to see any progress made on it.

It is hilarious to me that the thread about this "Dangerous" problem is full of approaches to solving it :)

It's an interesting puzzle, but is there any significance to it?

Are there an infinite number of sequences like this, but with different numbers or different operations? Is there any value proving things about them? Seems like a shot in the dark looking at an arbitrary sequence and proving things about it.

It isn’t about the result, it’s about the tools that have to be discovered to get that result. It’s likely there’s an as yet unknown branch of number theory required or perhaps a hidden link to something apparently unrelated.

I feel like you've asked this in an unnecessarily adversarial tone, but the question is a great one whose answer is valuable for understanding this type of thing.

I don't know what the answer is, I'm just saying it's a good question.

I have a stupid question. If binary representations of numbers in the hailstone sequence for the Collatz conjecture can be written as a 2-tag system (a → bc, b → a, c → aaa), and m-tag systems are Turing complete for m>1, and the Collatz conjecture is a sort of halting problem in this framework (there is a halting criterion), and "deciding whether a particular algorithm for a Turing machine will/won't halt on every input" is a totality problem, which is also undecidable — can a proof of undecidability of the Collatz conjecture be approached this way? What is the major problem with this line of thought?

If I understand your comment correctly, "deciding whether a particular algorithm ..." means an arbitrary algorithm. For a specific algorithm, of course sometimes it is possible to prove that it halts for every input. It's only undecidable if the algorithm is regarded as an input.

That video at the top is very neat. Was it manually made or is there software that makes it easy to do?

Also which is the comment they're talking about? I can't seem to find an anonymous one talking about this on the original post https://terrytao.wordpress.com/2011/08/25/the-collatz-conjec...

About the video:

One of the first results if you Google "collatz online". That's also probably why it's confusing - because it wasn't made for the article.

> That video at the top is very neat.

It doesn't contain number 9, so I somehow miss what it was supposed to demonstrate.

The animation is a bit strange. I think it's meant to show which numbers will eventually end the cycle at 1. Having the numbers appear in the reverse order compared to what the rules are does not help, in my opinion.

Take 16 and 5. While 16 is shown first and 5 after, the meaning is actually that you can get from 5 to 16 by applying the odd rule (3*x+1) and from 10 to 5 by applying the even rule (x/2).

I understood the rule used. What I question is the:

- usefulness of animation

- the non-obviousness of the numbers that are displayed, vs. all others remaining in the observed range.

At least, a non-animated version of the end frame allows better comprehension of what is shown.

Oh, I see. It's about creating the set of numbers that do end up at 1 by starting at 1 and applying the operations in reverse order. That is neat. They should have explained that in the article.

Probably made with d3.js.

Huh, neat. I just read about this after seeing it as an example in UPenn's CS 194 "Introduction to Haskell" notes.

  hailstone :: Integer -> Integer
  hailstone n
    | n `mod` 2 == 0 = n `div` 2
    | otherwise      = 3*n + 1

Now see how fast you can make it run! 22-year-old me was the king of the Project Euler forum for that specific problem ("find the longest sequence below 1000000") until my post got removed (as all newer posts for old problems are).

Doing those stupid things actually made me learn a lot about how computers work
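For anyone curious, a memoised version of that Project Euler problem ("find the longest sequence below 1000000") might look like this sketch; the shared cache is the whole trick:

```python
def chain_len(n, cache={1: 1}):
    """Collatz chain length from n down to 1, memoised across calls."""
    path = []
    while n not in cache:
        path.append(n)
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    length = cache[n]
    for m in reversed(path):
        length += 1
        cache[m] = length
    return length

assert chain_len(13) == 10  # 13, 40, 20, 10, 5, 16, 8, 4, 2, 1

# Longest chain for a start below one million:
best = max(range(1, 1_000_000), key=chain_len)  # 837799
```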

it really is fantastic how simple it is to express, i sketched it in the console as i was reading the article

My initial idea is to go the other way around: We can count natural numbers using the successor function, starting at 1. E.g.: 1, s(1) = 2, s(s(1)) = s(2) = 3, and so on. So if we see a number n, we know it's s(n-1), which is s(s(n-2)) and so on, until we reach 1. So we have a nice, linear chain, and it's trivial to construct an algorithm f(n) [or more mathematically, a function f: N -> {s,ss,sss,...}] that (i) produces the necessary operations to reach any given natural number n in that chain, starting at 1, and (ii) terminates for every input n.

Now we don't use s(x), but instead two operations that are inverse to the Collatz rules: a(x) = 2*x and b(x) = (x - 1)/3 (with b of course not always being possible). Starting at 1, this can now be used to construct a tree containing "some" numbers, and we can use it to describe a path to a number from the root.

If the conjecture holds, the tree constructed from these functions has to contain all natural numbers.

My question is then: If I have a number n, can I give an algorithm/function f': N -> {a,b,aa,ab,ba,bb,...} that (i) finds a path to n, and (ii) always terminates? Answer: I have no idea =)

E.g.: f'(5) = aaaab <=> b(a(a(a(a(1))))).

So we have f'(1)=no idea how to represent, f'(2)=a, f'(3)=aaaabab, f'(4)=aa, f'(5)=aaaab, f'(6)=aaaababa, f'(7)=aaaabaaab[...], f'(8)=aaa,...

(However, maybe it is not possible to even construct f': I believe the target space of f is of another infinity than the target space of f' -- obviously this isn't my area of expertise).
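For what it's worth, the tree and the op-strings are easy to generate breadth-first; this Python sketch (helper names are mine) reproduces the examples above, with `a` meaning doubling and `b` meaning (x-1)/3:

```python
from collections import deque

def inverse_paths(max_len=12):
    """BFS over the inverse-Collatz tree rooted at 1.
    a(x) = 2*x; b(x) = (x-1)//3, valid only when x = 3k+1 for odd k > 1.
    Returns {n: op-string}, shortest string first (BFS order)."""
    paths = {1: ""}
    q = deque([1])
    while q:
        x = q.popleft()
        p = paths[x]
        if len(p) >= max_len:
            continue
        children = [("a", 2 * x)]
        if x % 3 == 1 and x > 4:   # x = 4 would give b(4) = 1, the trivial loop
            y = (x - 1) // 3
            if y % 2 == 1:         # forward rule 3n+1 only applies to odd n
                children.append(("b", y))
        for op, y in children:
            if y not in paths:     # keep the first (shortest) path found
                paths[y] = p + op
                q.append(y)
    return paths

paths = inverse_paths()
assert paths[2] == "a" and paths[5] == "aaaab" and paths[3] == "aaaabab"
```

(The path for 1 itself comes out naturally as the empty string.)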

The algorithm is easy: enumerate all possible strings in increasing order of length, and return the first string you find that produces n.

If there is such a string for n, then this will find it. It will terminate if and only if there is such a string for every integer (i.e., the Collatz conjecture).

If only it were that simple :) since we cannot solve the halting problem, pure breadth-first search doesn't help. We need to find another algorithm that makes use of the tree's structure.

If bfsearch could be proven to be the best algorithm, this would reduce the conjecture to the halting problem.

It's interesting that it seems that some very short propositions require very long proofs.

Are there any results on the length (or maybe Kolmogorov complexity) of the shortest proof for provable propositions of length N? (in a specific logic system, I suppose)

I.e., how hard to prove is the hardest proposition of length N, and how does that grow with N? How about the average random provable proposition?

The question as I, not a mathematician, read it is

> For a given number, can using the operations 3n+1 and n/2 make it converge to a value of 2^i

Because that's what you need. If you can get on the 2^i branch, you've won.

Another way to look at it that doesn't require a convergence proof is to show that ℕ = S, where S = {1} ∪ {n×2 : n∈S} ∪ {(n-1)÷3 : n∈S ∧ 3|(n-1)}.
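A sketch of that fixed-point construction in Python (my own truncated version: capping the set at `limit` means numbers whose reverse path climbs above `limit` get missed, e.g. 27 needs intermediates up to 9232):

```python
def reachable_up_to(limit):
    """Smallest set containing 1 and closed under n -> 2n and
    n -> (n - 1) / 3 (when exact), truncated at `limit`."""
    s = {1}
    while True:
        new = set(s)
        for n in s:
            if 2 * n <= limit:
                new.add(2 * n)
            if n % 3 == 1 and (n - 1) // 3 >= 1:
                new.add((n - 1) // 3)
        if new == s:
            return s
        s = new
```

If the conjecture holds, this set contains every number whose reverse path stays under `limit`; proving ℕ = S without the cap is, of course, the whole problem.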

Not a mathematician, but that is close enough to Mersenne primes to suggest a fun possible connection. With Mersenne numbers you get points that 3n+1 cannot hit.

But it's 3n+1, and since it's only applied to odd n, the result is always even.

And for every number you hit along the way, assuming you make it down to 1, you can stop your iteration if you've previously seen the number

> And for every number you hit along the way, assuming you make it down to 1, you can stop your iteration if you've previously seen the number

You can stop your iteration if you come to a previously encountered value, because the (n+1)th number in the sequence is strictly a function of the nth number. Assuming that you make it down to 1 gives you nothing.

The whole point is to prove that you always make it to 1, or -- equivalently -- to show that you can never reach a previously encountered number without first reaching 1.

But you'll always reach a previously seen number because 3n+1 will never be a prime number.

No. If you could prove that you'll always reach a previously seen number, then you could simply say "Let n be the smallest number for which the Collatz Conjecture is false, we have shown that we'll always reach a number < n for which the Collatz Conjecture is true, hence n does not exist."

I don't follow this either. ((3n+1) / 2) is always greater than n. If all Collatz sequences are periodic (or periodic after a finite prefix), how does that prove that they all achieve a value less than some constant k?

(Or even that they all achieve a value which is less than their own first element?)

(3n+1)/2 is always greater than n, yes, but you'd need to find a loop where you keep alternating 3n+1 and n/2 often enough to sustain a rate of increase that could not be overcome. Which I take to mean that if n/2 occurs more than 50% of the time, the sequence will eventually drop below its starting value and eventually reach 1.

Maybe the threshold isn't 50%, but the proof then becomes finding that threshold
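For what it's worth, the heuristic threshold can be computed (a back-of-the-envelope sketch, not a proof: treat each odd step as multiplying by 3 and each halving as multiplying by 1/2, ignoring the +1 terms):

```python
import math

def growth_factor(p):
    """Average per-step growth if a fraction p of steps are halvings
    (x 1/2) and the remaining 1 - p are odd steps (x ~3)."""
    return 3 ** (1 - p) * 0.5 ** p

# Break-even: 3^(1-p) = 2^p  =>  p = ln 3 / ln 6 ~= 0.613,
# so halvings must make up more than ~61.3% of steps, not 50%.
threshold = math.log(3) / math.log(6)
```

At p = 0.5 the factor is √1.5 ≈ 1.22 > 1, so a 50/50 mix actually grows; the threshold is about 61.3%.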


I am satisfied that, under the assumption that all Collatz sequences are periodic[1], they must all achieve a value less than their own initial value. (Unless that initial value is 1.)

Because the sequence is periodic[1], the ratio of (3n+1) applications to (n/2) applications must be constant[1]. This implies that the nth term of the sequence will look like (3^(pn) · k) / 2^(qn), where p/q is approximately equal to the ratio of (3n+1) applications to (n/2) applications. Since no integer greater than 1 is simultaneously a power of 2 and a power of 3, this must trend toward positive infinity or toward zero as n increases. But for the sequence to increase without bound violates the assumption that it is periodic[2] -- so the terms of the sequence must instead trend toward zero.

[1] I'll use the simplifying assumption that all Collatz sequences are periodic after a prefix of 0 terms. For sequences that are periodic after a prefix of more than 0 terms, just treat them as equivalent to the same sequence with the prefix removed.

[2] A periodic sequence can take on only a finite number of values (equal to the period). But if the sequence increases without bound, it must take on an infinite number of distinct values.
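The "no integer greater than 1 is both a power of 2 and a power of 3" step is easy to check mechanically for small exponents (a trivial sketch; the full statement follows from unique factorization):

```python
def powers_never_coincide(max_exp):
    """Check 3^p != 2^q for all exponents 1 <= p, q <= max_exp."""
    powers_of_two = {2 ** q for q in range(1, max_exp + 1)}
    return all(3 ** p not in powers_of_two for p in range(1, max_exp + 1))
```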

I can see that 3n+1 is never prime. Why does that guarantee that you will always reach a previously seen number?

The idea is that with powers of 2 you always go down to 1. But some of those values happen to be a Mersenne prime + 1, and they cannot be reached by 3*(whatever) + 1, only from above.

So focusing on the points that this conjecture reaches only that way can probably give a faster way to find Mersenne primes.
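The unreachability part checks out numerically: an odd step lands on n only if n = 3m + 1 with m odd, and 2^k ≡ 2 (mod 3) whenever k is odd, so powers of 2 with odd exponent (which includes 2^p for every Mersenne-prime exponent p > 2) can only be entered by halving. A small sketch (function name is mine):

```python
def hit_by_odd_step(n):
    """True if n = 3m + 1 for some odd m >= 1, i.e. an odd step can land on n."""
    if n % 3 != 1:
        return False
    m = (n - 1) // 3
    return m >= 1 and m % 2 == 1

# 2^k with odd k is never hit by an odd step; with even k it is
# (via m = (2^k - 1) / 3, which is odd).
odd_exponents_unhit = all(not hit_by_odd_step(2 ** k) for k in range(3, 41, 2))
even_exponents_hit = all(hit_by_odd_step(2 ** k) for k in range(2, 41, 2))
```

The speculated shortcut to Mersenne primes is, as the commenter says, just a hunch.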

Since we're multiplying by 3 and adding 1 only if n is odd, i.e. for some n = 2k + 1, the result will be even 100% of the time, i.e. 3n + 1 = 3(2k + 1) + 1 = 6k + 4 = 2(3k + 2). Since this result is even, we'll immediately divide by 2 and we've gone from n = 2k + 1 to 3k + 2, a larger number. Of course this number could be even or odd but it's not immediately obvious that we'll ever encounter a number smaller than n.
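The step described above is simple enough to verify exhaustively for small odd n (a quick sketch):

```python
def next_after_odd(n):
    """One 3n+1 step plus the forced halving: odd n = 2k + 1 maps to 3k + 2."""
    assert n % 2 == 1
    m = 3 * n + 1          # even, since 3(2k + 1) + 1 = 2(3k + 2)
    return m // 2

# For every odd n = 2k + 1 the result is 3k + 2, strictly larger than n.
all_larger = all(next_after_odd(2 * k + 1) == 3 * k + 2 > 2 * k + 1
                 for k in range(1000))
```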

> In the 1970s, mathematicians showed that almost all Collatz sequences — the list of numbers you get as you repeat the process — eventually reach a number that’s smaller than where you started

I don’t understand why this doesn’t prove the conjecture. Like if you start with x, and you reach y: y < x, then couldn’t one reäpply the quoted statement to show there exists z: z < y < x, and wouldn’t iterating this statement eventually lead to 1 < … < z < y < x?

Reread the text you pasted: It says they proved it for _almost_ all sequences.

Ohh I’m stupid lol

Don't put yourself down. I (and I'm sure many others) have thought the same at first. But maybe we're all stupid?

Just reading the summary of Collatz reminds me a whole lot of the Josephus problem. 3x+1 is a similar expression you'd see when you're trying to enumerate indices in that problem. Dividing by two is similar to going once around the circle killing every other index. I wonder if that's been explored?

I suspect that about the only other class of numbers that could loop starts from a prime. (Because otherwise, after a series of steps, you could essentially run the procedure in parallel.) So you would have to find a specific class of prime numbers. That would be darn hard to find.

> Then this past August an anonymous reader left a comment on Tao’s blog. The commenter suggested trying to solve the Collatz conjecture for “almost all” numbers, rather than trying to solve it completely.

maybe a time traveller nudging a result that will be significant for humanity's future?

Well, studying a conjecture for "almost all" numbers doesn't actually prove anything about the conjecture itself, because just a single exception would disprove it. There are a few conjectures that had particularly large exceptions [1]. It's possible that a Collatz exception will be so large that it's effectively uncomputable, or that its existence is unprovable in ZFC [2].

Somebody probably tried it, but I wonder if describing particularly long sequences where r_(n+1) = 3*r_n+1 is always odd would help. There are results for arbitrarily large arithmetic progressions having some specific property [3], so there's no reason why other progressions couldn't be studied in that way.

[1] https://en.wikipedia.org/wiki/P%C3%B3lya_conjecture

[2] https://en.wikipedia.org/wiki/Zero_sharp

[3] https://en.wikipedia.org/wiki/Green%E2%80%93Tao_theorem

The rule is that if r_n is odd, r_{n+1} = 3 * r_n+1, so because of that condition, 3 * r_n+1 will always be even.

Yeah, I forgot that little fact. It should be (3*r_n + 1) / 2 then.
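With that corrected shortcut map T(n) = (3n + 1)/2 for odd n, one family with arbitrarily long runs of odd values is easy to exhibit: starting from n = 2^k - 1, the jth iterate is 3^j · 2^(k-j) - 1, which stays odd for exactly k steps. A small numerical sketch (whether such runs help is, of course, the open question):

```python
def odd_run_length(n):
    """Count how many consecutive values stay odd under T(n) = (3n + 1) // 2."""
    count = 0
    while n % 2 == 1:
        count += 1
        n = (3 * n + 1) // 2
    return count

# n = 2^k - 1 gives a run of exactly k consecutive odd values.
runs_ok = all(odd_run_length(2 ** k - 1) == k for k in range(1, 20))
```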

Almost all would be a good result, especially if the exceptions could be identified and characterized. That'd just change the conjecture e.g. "All numbers except those with more than 37 prime factors above 18 quadrillion" or some such.

That depends on how well we can pin down the "almost all". If it leaves just 10 numbers not covered, it is easy to test those 10, and now you have a proof by exhaustion. If "almost all" still leaves an infinite set, though, that method fails. It can still be useful, because now we know to focus on the small set that doesn't fit the previous proof, which might lead someplace. Or it might not -- this is a difficult problem.

Had the same thoughts. This is also not the first time that some comment on Terry Tao's blog nudged him or his research.

Or the first time he's achieved an important result. I feel like investing in Terry Tao futures...

Well, it's kind of his job to come up with interesting results…

Why don't they instead work from the opposite direction: create an abstract model of a Collatz loop that doesn't end at 1 (instead looping back to the same number) and prove it can't exist (reductio ad absurdum)?

This approach has been explored. It has only been helpful for showing that certain types of (short-ish) cycles can't occur [1]. Showing that the sequence can't get larger and larger forever (without repeating) with this approach has been a non-starter.

[1] See https://en.wikipedia.org/wiki/Collatz_conjecture#Cycle_lengt... for a few references. The most recent appears to be the paper "Theoretical and computational bounds for m-cycles of the 3n + 1 problem" by Simons and de Weger.

I know this is a fluff comment and meta, but I know many of us came here from /. years ago. This article was put on /. and comparing the comments of /. and HN on this article really highlights how far /. has fallen.

Is slashdot full of people who think they can solve it in the comment thread?

No it is filled with people who think that the mathematician should do something more useful with their time, like make targeted ad algorithms, or something.

I want to know who made that comment, what a unit.

Tao will be remembered 500 years from now.

The proof, if it exists, will involve math that is very complicated, abstract, and cutting-edge, similar to the proof of Fermat's Last Theorem. It will involve some sort of result, I am guessing, from complex analysis and number theory, generalized in such a way as to prove this result.

Proving "almost all" is not a rigorous proof of the conjecture itself, but I'm looking forward to further positive results.

The title seems to have been changed. Did he prove it for good or for most of the values?

What would it mean if they found a number — one number — that didn’t adhere to this rule?

You cannot find just ONE such number. Either there is a loop that doesn't include 1, in which case you've found a whole sequence of them, or the sequence goes to infinity, in which case you've found an infinite number of counterexamples.

Of course that's one way of proving this wrong. But you have to start above 18 quadrillion. Because all the numbers less than that have been tried.

Has anyone tried really, really large numbers picked at random?

One approach. But I'd suggest it has less than a one-in-18-quadrillion chance of working. From the statistics so far.

It's trivial to do. Here: https://js.do/code/383036
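Same idea in Python, where arbitrary-precision integers make really large random starting points painless (a sketch; the step cap just bounds the experiment and proves nothing):

```python
import random

def steps_to_one(n, max_steps=1_000_000):
    """Iterate the Collatz map; return the step count on reaching 1,
    or None if the cap is hit first (inconclusive, not a counterexample)."""
    steps = 0
    while n != 1:
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
        if steps > max_steps:
            return None
    return steps

random.seed(0)
trials = [steps_to_one(random.randrange(10 ** 100, 10 ** 101)) for _ in range(10)]
```

Hundred-digit starting points typically reach 1 in a few thousand steps; none of this says anything about the numbers we didn't try.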

How would you calculate the number sequence out to infinity?

I guess my question was more “what could that mean?”

Also known as hailstone problem from GEB

The Collatz conjecture predates GEB by many years. Renaming it to something else by the author of GEB was a way to misappropriate someone else's work.

Um, no. Collatz's conjecture was that the sequence will always reach one, and that conjecture was entirely avoided in GEB. Any number that would result in the 1-4-2-1 closed loop was referred to as "a wondrous number" in the GEB dialogs, and the concentration was on the idea that while you could test any number for the property of wondrousness, nobody knew how to prove that any given number would be wondrous without testing it. It was used as a soft introduction to the halting problem.

I'm pretty sure the "hailstone numbers" term also predates GEB.

1 is the only number for which 3n + 1 = 4n. So a number that didn’t reduce to 1 would have to get stuck in a loop of multiple repeated values. That seems unlikely.

any even number can be divided by 2 until it reaches 1. any odd number ends in 1,3,5,7,9. Any of those numbers multiplied by 3 and added to 1 ends in an even number. I don't get why this is hard?

> any even number can be divided by 2 until it reaches 1.

6 wants to have a word with you.

thank you

>any even number can be divided by 2 until it reaches 1.

It's hard because this is wrong.

18 is an even number. 18/2 is 9. 9 can no longer be divided by 2 and is not one.

>any even number can be divided by 2 until it reaches 1.

What do you mean by that?

They thought all even numbers are powers of 2.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact