# What does proving the Collatz Conjecture entail?

From the get go: i’m not trying to prove the Collatz Conjecture where hundreds of smarter people have failed. I’m just curious.

I’m wondering where one would have to start in proving the Collatz Conjecture. That is, based on the nature of the problem, what’s the starting point for attempting to prove it? I know that it can be represented in many forms as an equation(that you’d have to recurse over):
\begin{align*} f(x) &= \left\{ \begin{array}{ll} n/2 &\text{if }n \bmod2=0 \\ 3n+1 &\text{if }n \bmod2=1 \end{array} \right.\\ \strut\\ a_i&= \left\{ \begin{array}{ll} n &\text{if }n =0\\ f(a_i-1)&\text{if }n>0 \end{array} \right.\\ \strut\\ a_i&=\frac{1}{2}a_{i-1} – \frac{1}{4}(5a_{i-1} + 2)((-1)^{a_i-1} – 1) \end{align*}
Can you just take the equation and go from there?

Other ways I thought of would be attempting to prove for only odd or even numbers, or trying to find an equation that matches the graph of a number vs. its “Collatz length”

I’m sure there’s other ways; but I’m just trying to understand what, essentially, proving this conjecture would entail and where it would begin.

#### Solutions Collecting From Web of "What does proving the Collatz Conjecture entail?"

Proving this conjecture indirectly would entail two things:

1. Proving that there is no number n which increases indefinitely

2. Proving there is no number n which loops indefinitely (besides the 4, 2, 1) loop

If one does these things then you have an answer to the collatz conjecture (and if you find a case of either of these things you have disproven the collatz conjecture obviously)

Of course this is just one approach that comes to mind, there are other possible methods which are beyond my own knowledge

Adding to @Adam B.’s answer: examinig the conditions of possible cycles leads to the relations of powers of 3 and powers of 2, focusing on problems which are still not solved either.

One can look at it in terms of approximation : what is the smallest difference between perfect powers of 3 and perfect powers of 2, relative to the magnitude of one of them or of the rational approximation of log(3)/log(2) where we find an unsolved detail in the Waring-problem (see mathworld, “power fractional parts”). Some first steps to the proof of nonexistence of cycles (in the positive integers, in the negative integers we have at least 3 additional cycles) were provided by Ray Steiner 1996 and later by John Simons and Benne de Weger who proved the nonexistence of a certain type of cycles using that rational approximation approach.

Or one can look at the problem of cycles in terms of modular conditions, and arrive at other unsolved properties of the relation of powers of 3 to powers of 2. There is, for instance, the formulation in terms of “z-numbers” done by Kurt Mahler.

Unfortunately, even if that 3/2-problems were solved, that would not mean that the Collatz-problem was also solved and vice versa; for instance the solution of the Waring-problem-detail would only solve the “1-cycle” problem but not the general “m-cycle” problem with m bigger than roughly 70 (using the notation of Simons/De Weger): the mentioned conditions are not including each other. (The Steiner/Simons/De Weger articles are linked in the wikipedia, a more basic, amateurish article of mine adressing these aspects a bit less cryptic can be found here )

I think, for the question “which approach should be proposed to attack the problem”, the best answer is “read Lagarias”. Lagarias has a very good overview about many various attempts to the handling of the problem and gives also comments.

Just a simple one, but which can at least lead to the disproof of the so-called “1-cycle”,”2-cycle” … up to “68-cycle” as named by John Simons/Benne de Weger, so this approach at least showed to be not completely useless:

Let’s denote a step $a_{k+1}= (3a_k+1)/2$ as “u” and one step $a_{k+1} = a_k/2$ as “d” for simpler symbolical notation. If at a certain first number $a_0$ such steps follow as “uududdud…dud” and if the resulting $a_n$ is then equal to $a_0$ then we have a cycle.

Let’s now denote a sequence of arbitrarily many equal operations “uuuu…u” as “U” and accordingly “D” . So beginning at $a_0=1$ followed by a “ud” results in $a_2=1$ forming a cycle, and this is also “UD” with $\small \text{length}(U)=\text{length}(D) = 1$ . First Ray Steiner (see references at the wikipedia-entry) looked at the question whether cycles of the form “UD” exist besides that $1 \to 1$ cycle (which he calls the “trivial cycle” or “circuit“) . He could prove that indeed there is no non-trivial cycle of this form – but although the attempt is simple, the tool to prove the nonexistence needed heavy machinery from transcendence number theory… John Simons and Benne de Weger could extend the method to disprove also the existence of “UD UD“, “UD UD UD” cycles, and to introduce names they gave that general forms the names “1-cycle” for “UD“, “2-cycle” for “UD UD” and “m-cycle” for the general case. Of course, any cycle of arbitrary length has the form “UD UD … UD” , but the technical aspect behind that approach is powerful enough to disprove the existence of “m-cycles” for m up to 68 (and possibly now a bit more). Simons/deWeger concluded their 2006-work on this with the remark, that despite the remarkable success of this method it cannot be extended much and a complete new idea needs to be introduced to achieve any more progress with this ansatz.

Now despite that this ansatz did not finally lead to success I think it is an answer to your question “what is a good starting point” because first Steiner, later Simons/deWeger believed it were “a good starting point” and looked deeper at the mathematical properties, implications and consequences of when the analysis of the problem was attacked with the according formulae.

To study the Collatz-problem with this ansatz it seems to me, the reformulation of the transformation rule as
$$a_{k+1} = {3a_k +1\over 2^{A_k} } \qquad \qquad \small \text{on odd numbers a_{k+1} and a_k }$$
is much useful. For the introductory discussion I also like it more to replace the indexing of the argument $a_k$ by simply taking the consecutive letters a,b,c…, with z for the last one (if needed) and $N$ the number of transformations and $S$ the sum of exponents $A_k$, so for one transformation I write
$b= {3a+1 \over 2^A}$ or $c = {3b+1 \over 2^B}$ and so on.

So let’s look whether a cycle with one transformation exists. It needed the form
$b = (3a+1)/2^A$ and $b=a$ so we could look for solutions
$$a= {3a + 1 \over 2^A} \\ 1= {3 + 1/a \over 2^A} \\ 2^A = 3 + 1/a$$
and we see immediately, that $|a|$ cannot be different from $1$ because otherwise the result were fractional, and also that $a=1$ gives a solution for $A=2$
(and of course $a=-1$, but originally L. Collatz considered only positive $a$ here) .

Now to extend this to the question, whether a cycle of two transformations can exist, we write first
$$b= {3a + 1 \over 2^A} \qquad \qquad a= {3b + 1 \over 2^B}$$
and assuming that a solution exists in $a$ and $b$ we could write their product
$$b \cdot a = {3a + 1 \over 2^A} \cdot {3b + 1 \over 2^B} \\ 2^A \cdot 2^B = \left(3+{ 1 \over a}\right) \cdot \left(3+{ 1 \over b}\right)$$
and we see,
* if a and b assume the smallest possible value 1 we get
$$2^{A + B} \overset{?}{=} (3+1)(3+1) = 4^2 = 16$$
which has of course the solution of the trivial cycle as before $A=B=2$.
* If now a and b are greater, let’s insert the largest values or even let’s look at the limit when they go to infinity we’ll have
$$2^{A + B} = 2^S \overset{?}{=} (3+0 )(3+0) = 3^2 =3^N = 9$$
Of course, there is no natural A,B allowing this, so if there were any solution at all we needed a perfect power of 2 between $3^2=9$ and $4^2=16$ and this doesn’t exist, and so a cycle of $N=2$ transformations does not exist.

We could now proceed and look for a cycle of $N=3$ transformations with different positive odd numbers a,b,c and would find the question, whether there exist a perfect power of 2 between $3^N=27$ and $4^N=64$. Here we find, that there is one, namely $2^5=32$ and so we have to do more.
The equation is simple enough to enumerate possible solutions and disprove it this way; but we still can argue with a certain generality:
– none of a,b,c can be divisible by 3 (because they all result of a transformation $(3x+1)/2^X$),
– they must be odd and
– none can equal 1 (otherwise we had the trivial cycle) , and
– they must all pairwise be different (because otherwise the cycle would be shorter).

Thus the minimal values were $a=5,b=7,c=11$. If we insert them into the product-formula we get
$$2^5 \overset{?}{=} \left(3+{1 \over 5} \right) \cdot \left(3+{1 \over 7} \right) \cdot \left(3+{1 \over 11} \right) = {16 \cdot 22 \cdot 34 \over 5 \cdot 7 \cdot 11}=32 \cdot { 34 \over 35 } \lt 2^5$$
and we find two things:

1) the given minimal values for a,b,c allow no solution

2) and because increasing the numbers a,b,c would decrease the rhs there cannot be any more solution, because already with the minimal values attempted, the rhs is smaller than the lhs.

While the type of transformation in the above examples are of the typee “uD“, “uD uD” and “uD uD uD” where always one ascending step is followed by one or more descending steps, it shows more interesting structure if we look at the “UD” transformation (or “1-cycle”). One “UD” transformation can be characterized by the number N of “u” steps and the number S for the number of “d” steps plus N. Let for notational convenience write L for N-1 then we get then for
$$b = {3\left({ 3^L \over 2^L}a+{ 3^L – 2^L \over 2^L}\right)+1 \over 2^A } \\ b = {3\left({ 3^L }a+{ 3^L – 2^L }\right)+2^L \over 2^{L+A} } \\ b = {{ 3^N }a+{ 3^N – 3 \cdot 2^L }+2^L \over 2^{L+A} } \\ b = {{ 3^N }a+ ( 3^N – 2^N) \over 2^{N+A-1} } \\$$
and for the question of cycles we can derive:
$$a \overset{?}{=} {{ 3^N }a+ ( 3^N – 2^N) \over 2^{N+A-1} } \\ \to 2^{N+A-1} =2^S \overset{?}{=} \left( 3^N + { 3^N – 2^N \over a} \right)$$
and for the 1-cycle with $\text{length }(U)=N$ and $\text{length }(D)+N = S$ we get the “critical formula” which must be solvable in natural numbers
$$a = {3^N-2^N \over 2^S-3^N } \\ \qquad \qquad \text{with S satisfying } 2^{S-1} \lt 3^N \lt 2^S \to S= \lceil N \log(3) / \log(2) \rceil$$

This gives of course the small-number-solution $\small N=1,a=1,S=2$ (the trivial cycle) but also solutions in the negative numbers: for $a$ we find $\small N=1,S=1 \to a=-1$ and $\small N=2,S=3 \to a=-5$ by $\small a={3^2-2^2 \over 2^3-3^2 } = {5 \over -1} =-5$ (Another cycle in the negative numbers is a “2-cycle”, including $a=-13$ I think).

This formula is the problem of the 1-cycle which was solved by Ray Steiner and (extended to the m-cycles) by John Simons and Benne de Weger up to $m=68$ or $m=72$. The possibility of existence of solutions depends on the smallest distances of perfect powers $2^S-3^N$ which has not yet a fluent formula and which makes it a challenging attempt on its own.

The reason for why I show here this considerations is that it made me myself stepping into the deeper formalisms of number theory with that powers of 2 and 3 and their modular conditions, approximations etc which I found much interesting for the self study. The first three examples above are of the form “uD” “uDuD” and so on. Looking at “UD“, “UDUD” in the fourth example seems even more interesting – and I think, that there is still something more interesting general numbertheoretic structure in that way of calculations – and that’s why I propose such an attempt here in trying to answer your question.

The way I interpret the problem is

For there not being cycles:

That Collatz (What else? Important question) iterations change the prime factors of some arbitrary number such that future (Collatz) iterations of that number can never return to the original, or any preceding, set of prime factors. In essence, that prime factors, through at least these specific functions, get “mangled beyond return.” Or that the only path to a previous set of prime factors is by inverting your previous operations, though this is probably too strong.

For no diverging numbers:

The aforementioned set of prime factors converges to a set of solely {2^n}, and thus to the empty set.

The problem can be just those two points:

1) is there a loop?
2) is there a sequence that increases without bound?

However, another way to solve it would be to show there cannot be two distinct < families >. Up to quite high values of n, we know empirically that starting with any given n less than that value, repeatedly choosing n/2 for even n and 3n+1 for odd n gives a sequence ending in 1.

Call this set of sequences ending in 1 the < terminate-in-1 family >.

The task then amounts to testing if there can be a < deviant family > where either a loop or a sequence increasing without bound would amount to a deviant family. Then the challenge (in order to prove Collatz/Ulam/Thwaites correct) is to show that any other family of sequences must somewhere produce a number that is within the terminate-in-1 family.

The terminate-in-1 family contains, and if it existed any deviant family would contain, infinitely many natural numbers each.

One approach could be to use the inverse function. Start with unity and apply the inverse function $g(n)$ recursively. Visualize each positive integer as a node and each iteration as a path. To prove the conjecture, one has to show that every node will be reached eventually. To disprove the conjecture, one has to show that there exist at least one node which will never be reached.
\begin{align*}
g(n) &=
\left\{
\begin{array}{ll}
2n &\text{if }n \bmod2=0 \\
(n-1)/3 &\text{if }n \bmod2=0 \hspace{5pt} \& \hspace{5pt} n \bmod3=1 \\
2n &\text{if }n \bmod2=1
\end{array}
\right.\\
\strut\\
\end{align*}
Note that while $f(n)$ is a one to one function, $g(n)$ is not.

See this question.

Proving the conjecture is equivalent to finding a measure of “simplicity” of positive integers such that repeatedly applying the function in the $3x+1$ problem eventually simplifies any input.

That is precisely what would be entailed in any proof, but apparently it is extremely hard to find such a thing. The $3x+1$ iteration can increase all the commonly used height measures (such as ones based on the size of an integer or its prime factorization) quite far before bringing them back down, and bounding those excursions is the same problem of finding a quantity that controls the process.

By the same logic, a disproof by showing an infinite orbit exists would necessarily entail finding a complexity measure such that the iteration, on some inputs, makes the integer increasingly complex. This is far beyond current ability for the same reason, and also because the conjecture appears to be true.

The other possibility is to find a finite cycle. Finding one does not necessarily entail more than a giant computation, but there could, hypothetically, be other methods than searching directly for a cycle.