Sum-of-Three-Cubes Problem Solved for ‘Stubborn’ Number 33 178 points by furcyd 9 months ago | hide | past | web | favorite | 81 comments

 Numberphile video on the topic with Andrew Booker: https://www.youtube.com/watch?v=ASoz_NuIvP0which is a followup from the 2015 episode https://www.youtube.com/watch?v=wymmCdLdPvM when the 33 problem was unsolved.The announcement elicited this Mathoverflow discussion about "mic drop" moments in mathematics: https://mathoverflow.net/questions/325105/what-are-some-note...
 Some additional links: Sander Huisman's 2016 paper about solving the three cubes problem for 74, using a supercomputer in Lyon [0]. The approach for 33 is quite similar, just augmenting the range searched (and some additional tricks to further reduce the search space).Andrew Booker's paper about 33 [1].It's quite funny how Numberphile inspired these searches.
 Interesting that the second number is prime and the others have very large prime factors (only 5 and 4 distinct prime numbers for 1st and 3rd numbers respectively):`````` (8,866,128,975,287,528)³ + (–8,778,405,442,862,239)³ + (–2,736,111,468,807,040)³ = 33 696950821015779435648178972565490929714876221952 - 676467453392982277424361019810585360331722557919 - 20483367622797158223817952754905569383153664000 = 33``````
 In a way it makes sense - if the numbers all had smaller prime factors, they probably would have been found sooner than now, and so wouldn't be as remarkable
 I briefly glanced at the paper and they talked about prime numbers a bunch.
 The last unsolved integer under 100 is 42...
 Indeed. I suspect somewhere out there, Douglas Adams must be smiling. :-) [1]Interesting that Lewis Carroll also makes references to 42.
 Perhaps its cubic summands will have a more transparent encoding.
 Spooky.
 Haha Nice!
 Can someone ELI5 what the issue is with numbers that have a remainder of 4 or 5 after being divided by 9? I'm assuming it's something simple that I'm just not seeing, and having trouble googling for it.
 The only possible values of a cube mod 9 are 0, 1 and 8. (If you expand (9n+r)^3 you'll see that all the terms divide by 9 except r^3. So the value of a cube mod 9 depends only on its own mod 9 value. So we only need to check that 0, 1 and 8 are the only values achieved by the first 9 cubes.) If you add three of these values the only possible answers are 0, 1, 2, 3, 6, 7 and 8 mod 9.
 The issue is that each term n^3 equals +1, 0 or -1 mod 9. (Test the numbers 0-9). When you add three of these values, you can't get 4 or 5 (mod 9). Thus, you can't get a result that equals 4 or 5 mod 9.
 Anybody have an idea of how much it would cost to have done this search on AWS?
 At \$0.0425/core/hour for Compute-Optimized instances (c5), 23 core-years costs \$8562.9
 You could probably reach out to someone at a local university and get access to a cluster.
 Does anyone know of a useful application for this fact? Serious question.
 The mathematician Edmund Landau said (translation from German): Number theory is useful, since one can graduate with it.(from his Foreword to Vorlesungen über Zahlentheorie (Lectures on Number Theory) (1927).)
 Not right now... but maybe in the future, who knows. There was a time number theory was viewed as pointless: "The theory of Numbers has always been regarded as one of the most obviously useless branches of Pure Mathematics." - G.H.Hardy.He might be very disappointed that number theory underpins modern cryptography.
 Perhaps the algorithm used to discover the solution may be of application. As is often the case in pure math, the methods used to find solutions to otherwise inconsequential problems later find use years down the line.
 If the solutions are that hard to find, you could probably make a commit/ reveal scheme out of it.
 Apparently not. It more or less says so in the article.
 > other times, a solution is known not to exist, as with all whole numbers that leave behind a remainder of 4 or 5 when divided by 9Curious how we are able to prove that and with certainty say this is a rule?
 Off the top of my head: the cubes of 0,1,2,3,4,5,6,7,8 mod 9 are 0,1,-1,0,1,-1,1,-1.The sums of three of these, mod 9, are 0, 1, 2, 3, -1, -2, -3 or 0, 1, 2, 3, 6, 7, 8, so the sum of 3 cubes can not be 4 or 5 mod 9.
 I would have liked to see a bit more comment on the algorithm itself in the article. But I suppose I can find that elsewhere
 I've been able to find vague descriptions of the algorithm used but not any actual code. Might have to ask the author for it.
 See it here: https://youtu.be/ASoz_NuIvP0?t=393 rewind a bit if you want more explanation.
 So simple to check, yet so hard to find. I love these kind of mathematical problems.In irb or python interpreter copy and paste`````` 8866128975287528**3+(-8778405442862239)**3+(-2736111468807040)**3 `````` Though another question, are there an infinite number of possible solutions to this problem? Or just 1 solution with numbers in this length?
 > are there an infinite number of possible solutions to this problem?Unknown. From the article:'A major result would be to prove the conjecture that k = x³ + y³ + z³ has infinitely many solutions for every whole number k, except those k that have a remainder of 4 or 5 after being divided by 9.'
 It works in your browser with javascript if it supports bigInt[0]`````` 8866128975287528n**3n+(-8778405442862239n)**3n+(-2736111468807040n)**3n `````` [0]:https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
 Does anyone know if there is a conjectural characterization of those numbers which can be expressed as the sum of three cubes? The article mentions not being congruent to 4 or 5 mod 9 is a necessary condition. Is it (conjecturally) sufficient?
 In the Numberphile video it is conjectured that there are infinitely many solutions for all those numbers, but that solutions are very sparse. The smallest solution for 42 might involve very high numbers.
 Thanks! It seems they mentioned this in the article as well:"A major result would be to prove the conjecture that k = x³ + y³ + z³ has infinitely many solutions for every whole number k, except those k that have a remainder of 4 or 5 after being divided by 9."
 (8866128975287528)^3 + (-8778405442862239)^3 + (-2736111468807040)^3If I command-space on OSX and put this into the search bar (which accepts math), I get exactly 30,000,000,000,000.
 echo '(8866128975287528)^3 + (-8778405442862239)^3 + (-2736111468807040)^3' | bc
 How many solutions are there?
 The conjecture (that hasn't been proven) is that every integer that does not equal 4 or 5 mod 9 has infinitely many solutions (integers that = 4 or 5 mod 9 are proven to have no solutions). Parametric equations for 1 and 2 have been found that generate infinitely many solutions for those values but it is otherwise an open question.
 > With 33 out of the way, the only one left is 42.I know this is just a coincidence, but it's a rather poetic one.
 The Masons will have a field day.
 The solution for 42 is coming soon
 Do you have a source for that? It took 7.5 million years for Deep Thought to discover the significance of 42, and it didn't even break it up into a sum of three cubes :) https://hitchhikers.fandom.com/wiki/Deep_Thought
 I can't believe no-one has mentioned out loud the reference to The Hitchhiker's Guide to the Galaxy: "The Answer to the Ultimate Question of Life, the Universe, and Everything is 42"
 > That problem, stated as k = x³ + y³ + z³, is what number theorists call a Diophantine equation — a kind of algebraic structure whose properties have fascinated mathematicians for millennia. “They’re rich enough to encode [other mathematical] statements that have nothing to do with Diophantine equations,” said Browning. “They’re rich enough to simulate computers.”You would certainly hope so, since every value a computer works with is integral.
 It says they can simulate computers, not that computers can calculate them.
 Yes, and I'm saying since they can represent every value a computer can represent, this doesn't come as a surprise.
 A computer does more than store values: it also performs computation. And “they’re rich enough to simulate computers” refers to the celebrated (and surely surprising!) theorem of Julia Robinson, Martin Davis, Hilary Putnam, and Yuri Matiyasevich, showing that (to state the result in a couple of ways):* given a computer program that prints integers, there is a Diophantine equation whose integer solutions are precisely the set of integers printed by that program,* given any computer program, one can represent it by a Diophantine problem: one can write down a polynomial equation such that it has integer solutions if and only if the program halts.For example, because we can write down a computer program to print the prime numbers, there exists a certain (explicitly known) polynomial whose integer values are precisely the prime numbers. And as we can write down a program which halts only if (say) the Riemann hypothesis is true, we can write down a Diophantine equation which has integer solutions only if the Riemann hypothesis is true.(What you're saying about being able to represent every value a computer can represent would be true for say linear or quadratic equations too, but those are not rich enough to simulate all of computation.)Essentially, you can build a computer out of diophantine equations.You can read more at https://en.wikipedia.org/w/index.php?title=Hilbert%27s_tenth... and good expositions by Poonen at http://www-math.mit.edu/~poonen/papers/sample_beamer.pdf (talk slides) or http://www.ams.org/notices/200803/tx080300344p.pdf (article).
 To clarify the comparison I made, consider the set of integer solutions to linear equations (or to be more precise, the Diophantine set of a linear polynomial). For example, for the equation x+y=k, the set of integer solutions k is merely the set of all integers. For the equation 10x+6y=k, you get the set of all even integers. For the equation 10x+6y+7z=k, again you get the set of all integers. So the sets of integers that you can obtain as the respective solutions to linear equations are all very simply structured. (If I'm not mistaken, they are always simply the multiples of the gcd of the coefficients of the variables, possibly translated by some constant offset e.g. if one is considering 6x+9y+15z+7=k, the set of obtainable k (by plugging in various values for (x, y, z)) is “all integers that are 1 mod 3” — so the set is always simply an arithmetic progression.)Similarly for quadratic equations — you still get sets that are easily parametrized. Something like this is what motivated Hilbert to ask the question in the first place. Nothing so far would have suggested that if you go up to higher-degree polynomials, you should eventually be able to obtain any computable (i.e. recursively enumerable) set as the set of integer solutions to some polynomial equation. Yet that's the case.The contrast was never with real numbers, so mentioning real numbers is neither here nor there. (Though I understand how one can get that impression, if one thinks that what is being highlighted is “Diophantine solutions” in the sense of “integer solutions” as opposed to “real solutions” or whatever.) The surprising fact being highlighted by the number-theorist in the quote (and one of the highlights of 20th-century mathematics) is that the structure “the set of integer solutions to a polynomial equation” is rich enough to encode all possible computation — i.e. polynomials are not as “simple” as one may think. (Please take a look at those links if you haven't; it's really beautiful and super surprising that a program can be encoded as the set of integer roots of an integer polynomial.)
 >Booker found this odd trio of 16-digit integers by devising a new search algorithm to sift them out of quadrillions of possibilities. The algorithm ran on a university supercomputer for three weeks straight.So...did he just brute force it?EDIT: Should've read further down. Looks like, yes:>But usually, solutions are “nontrivial.” In these cases, the trio of cubed integers — like (114,844,365)³ + (110,902,301)³ + (–142,254,840)³, which equals 26 — looks more like a lottery ticket than anything with predictable structure. For now, the only way for number theorists to discover such solutions is to play the mathematical “lottery” over and over, using the brute force of computer-assisted search to try different combinations of cubed integers, and hope for a “win.”
 The Numberphile video, mentioned elsewhere, explains how a search of O(n^3) is reduced to a linear search via some clever tricks. From there, the linear search is quite brute force. Several numbers are searched at the same time, indicating that no solution for 42 could be found under 10^16. Some supercomputer infrastructure seems to be necessary ("don't try this at home").
 Wow, linear? I had recognized this as a variant of the interview problem where you try to find a triple that sums to zero in a list, but that only shrinks it to quadratic.
 The paper ends with "The total computation used approximately 23 core-years over one month of real time.", so yes, I think running this on your 8-core desktop is not going to cut it.
 But what about a \$400 GPU with 4096 shaders?I haven't looked at the algorithm, but a linear search probably can be GPGPU computed in parallel. A Vega64 is around \$400 and runs 16384 threads on 4096 physical shaders at roughly 1.2GHz.The main issue is if whatever data-structures used for this search will actually fit inside the microscopic amount of memory that is available to each shader.
 You should probably assume that people using supercomputers have thought about "just using a GPU."
 Don't assume anything.They are not using GPUs and from the paper it looks like they haven't thought about it. I think a GPU cluster would be a speedup, but big number multiplication is relatively slow on GPUs, so maybe not as big speedup as with other algorithms.
 A GPU cluster, sure. But the discussion was about running the algorithm on a single GPU on your desktop.
 Sure, you're right I was comparing cost of the 2 systems, with 1 GPU it's an interesting question, but as the author had CPU cluster available, it was the logical choice for him.
 Why? GPU Programmers cost a lot more than a month-worth of supercomputer time.Its probably cheaper and easier to just port your code to Amazon cloud or Google Cloud for a month, rather than to port your code to a GPU.
 Generally yes, but actually porting this is probably a fun pet project that could be done in a few weeks for fun to find the secret number for 42 :). It's too bad the paper doesn't have a GitHub link.
 Oh yeah, that's why I'm interested in that algorithm :-)Honestly, the more and more I look into this, it seems like GPUs are the ideal computational platform for this problem. We're talking about a simple equation, with numbers that fit inside the 64-bit integer space (less than 10^19). GPUs can calculate that somewhat easily (The compiler probably turns 64-bit multiplies into 4x32-bit multiplies + adds, but that's still low in memory usage...)I'm not sure if one GPU would be enough to compute everything. But we're talking about 512-cpu cores at 1-month of computing. A GPU probably can solve it within 6 months.I'm talking about replicating the work done on "33" by the way, not on 42. If 10^19 is sufficient for a search space, then 64-bit ints can be used.------------It won't be an easy port btw. The three variables are written as x^2 - 2xy + y^2 == (33 - z^3) / d, where d = (x+y).The Numberphile video said that "d" is brute-force searched, because there are very few values of "z" that would satisfy the above formula. So every shader probably can brute-force d on its own in a GPU.Overall, seems pretty efficient to run this on a GPU. Very little memory, everything probably fits in GPU-register space.
 What you are talking about is to exploit the mass parallelism of GPU, but I suspect that there are huge differences between each of the calculation/evaluation, which is not a good pattern for GPU threads. As GPU threads work together in bundles, or warp, the cost on sync'ing each thread may be high.But still, worth a try if you have one, and don't forget SHOW HN once you have luck ;)
 > but I suspect that there are huge differences between each of the calculation/evaluationIts a (nearly) linear brute-force search. Every thread is going to run "guess d", and then calculate "(33 - z^3) / d = x^2 -2xy + y^2".All values must be integer, so (33 - z^3) / d must be an integer value. Which means the selected value "d" must be a factor of (33 - z^3). So "z" is going to be a small search space. I would expect that similar "d" values would have similar amounts of "z" values that satisfy this solution, so thread divergence doesn't seem to be a major issue.That's about as uniform as it gets. Its a guess-and-check search, very much akin to cryptocoin mining. It seems like the GPU is the best architecture to the algorithm.
 What about the RHS of the equation? x^2-xy+y^2 are three big-number multiplications and two big-number adds, where digit number do matters.
 Not as "big number" as you might think. The x, y, and z terms were: (8,866,128,975,287,528)³ + (–8,778,405,442,862,239)³ + (–2,736,111,468,807,040)³ = 33.The biggest number will fit inside of 53 bits, 54 if you include the sign bit. Which means all of the values fit inside of 64-bits (including the sign bit). The cubes will fit within 192-bits.If this truly were a "big number" problem, then brute-force wouldn't be a valid methodology!
 I got your point. It seems that we are just not on the same page. As the product of any cube can turn out to be 192-bit, it should be impossible to fit into a GPU instruction or even a single CUDA C statement.That's why I call it big-number calculation: manual carry/overflow handling, larger-than-register size operands, tedious indexing/bug prone if coding from scratch, ... but of course you can still argue that it is not "big."Or maybe now GPGPUs have big integer support? The last time I did actual GPU computing was CUDA 7 I guess, using cublas in daily basis so never ever been familiar with integer computation in GPU.
 > That's why I call it big-number calculation: manual carry/overflow handling, larger-than-register size operands, tedious indexing/bug prone if coding from scratch, ... but of course you can still argue that it is not "big."Its not a general big-integer implementation which can be arbitrarily sized. Since you know all results fit inside of 192 bits, and that all operands are less than 64-bits, you can easily code for this special case.192-bit x 64-bit is way easier to debug and test than 192-bit x 192-bit. Its probably the bulk of your work, but I don't foresee any major complication here. Its just grade-school arithmetic.`````` struct cube{ long long val[3]; } cube multiply(cube x, long long y){ cube toReturn; toReturn.val[0] = x.val[0] * y; toReturn.val[1] = x.val[1] * y + __umul64hi(x.val[0] * y); toReturn.val[2] = x.val[2] * y + __umul64hi(x.val[1] * y); return toReturn; } `````` This is grade-school arithmetic level. This isn't very difficult at all. The only thing you needed to know is that CUDA has an intrinsic to take the top 64-bits of a 128-bit multiplication.Without any if-statements, the GPU can calculate the above across different threads without any divergence. You can likely optimize this further with MAD instructions, but the above would be a good "first step" towards this problem.As I said earlier: this wouldn't be an "easy" port, but it would be possible, and it would likely be very efficient. I'm not going to think through all the edge cases, but the bulk of the algorithm seems very easy to do efficiently on the GPU architecture to me.
 And of course, code I type up in less than 5 minutes has a variety of bugs (compiler errors AND logic bugs). Despite the mistakes, I stand by my reasoning that the above code is relatively simple.And again, I'm not saying GPU-porting is easy. I'm just saying its possible, and likely is the most cost-effective use of hardware. Whether or not it is worth the programmer time is a separate question of course. But I don't see anything that's stopping a GPU from efficiently computing THIS problem, especially if its restricted to 192-bit as the biggest number in the whole algorithm (worst-case 64-bit cubed).I've tried to port other algorithms to the GPU and it didn't work. But THIS sum-of-three cubes problem looks like an easier job than most algorithms to me.
 > But I don't see anything that's stopping a GPU from efficiently computing THIS problem, especially if its restricted to 192-bit as the biggest number in the whole algorithm (worst-case 64-bit cubed).Fair enough. Also thanks for the demonstration, it updates my understanding.But still at some moment in the future, we may need a general big-number algorithms for calculation due to the inevitable expansion of search space. There is no guarantee that unsolved k's, says 100 < k < 200, can be found in min(|x|,|y|,|z|) <= 192 bit.> But THIS sum-of-three cubes problem looks like an easier job than most algorithms to me.Good luck and I'm looking forwards to any updates.
 X^2 doesn't fit in 64 bits, but after a little bit of search I found the __mul64hi PTX instruction that makes it easy :)
 > I haven't looked at the algorithmYou can see it here: https://youtu.be/ASoz_NuIvP0?t=393Rewind a couple of minutes if you want the explanation.
 That's actually really simple and clever. That definitely makes the search space much smaller and brute-force is now possible.
 `````` 1 core year = 12 core months 23 core years = 276 core months `````` So, 276 cores running for one month? Is this right?
 Yes, but I am left hanging on the specific details. He factors to yield`````` (x + y)(x^2 - xy + x^2) = 42 - z^3 `````` He brute force searches over values d and finds a value z such that d divides 42 - z^3 (evidently these are rare and easy to find) and then solves the following two equations for x and y (which is easy):`````` d = x + y x^2 - xy + x^2 = (42 - z^3)/d `````` The part I don't get is how to find the z's given d. Does anyone know where to find these details?
 Yes, but smartly. People have been brute forcing solutions to this problem using computers for decades now. The tricky part is reducing the search space.
 I feel like we need a better name for this type of approach, because many people (myself included) take "brute force" to mean naively checking every solution candidate. If you're doing something clever like reducing the search space and/or lowering the big-O, that doesn't seem "brute." Even if you're throwing dozens of core-years at the problem, that seems like "force" but not "brute."I can't think of anything snappy, though. "Clever brute force" is the best I've got, and it's a mouthful. I tried looking for antonyms for "brute," the options aren't great. Well-bred? Polished? Humanitarian?
 Civilised force is accurate but sadly colonial...I think "refined force" is a solid option. Selective works well but might be too literal since "select" is a common algorithm verb."Judicious force" is my favorite and "prudent force" comes second.
 Cute force
 Groot force
 Reduced force

Search: