Hacker News
Galactic Algorithm (wikipedia.org)
381 points by jonbaer 9 days ago | 71 comments

Ah. I still remember my computational geometry professor telling us about an algorithm to do triangulation (decomposing a polygon into triangles) in linear time: "It's very complex. I don't think anyone has actually implemented it".

It shocked me at the time that there were algorithms like this.


Fast and robust polygon triangulation was the hardest challenge we faced when we developed the OpenGL version of our 2D vector graphics engine (www.amanithvg.com) in the early 2000s.

I clearly remember all those research papers. After studying all of them, and after three complete rewrites of the engine, we finally implemented a classic two-step algorithm: sweepline decomposition of the polygon into monotone polygons, then triangulation of the monotone polygons.

We had really heavy headaches dealing with intersections, self-intersections, edge splitting, comparing and merging original vertices with vertices created by splitting edges, and managing edge connectivity and topology.

After some attempts, we completely discarded the idea of writing it in floating-point arithmetic, preferring fixed-point integers instead (and this was the real turning point for robustness!).
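A toy illustration of why fixed point helps here (the scale factor is an arbitrary choice of mine, not anything from AmanithVG): vertices produced by edge splitting must compare exactly equal to existing vertices, and floats break that.

```python
# Vertex coordinates created by splitting edges must merge exactly with
# existing vertices. In floating point, a value computed two ways need
# not compare equal; on scaled integers ("fixed point"), it does.
SCALE = 65536  # arbitrary scale factor for this sketch

def to_fixed(x):
    """Convert a coordinate to fixed-point (scaled integer) form."""
    return round(x * SCALE)

# Floating point: the classic failure.
print(0.1 + 0.2 == 0.3)                                # False
# Fixed point: the same comparison is exact.
print(to_fixed(0.1) + to_fixed(0.2) == to_fixed(0.3))  # True
```

This is only the simplest symptom; the same effect poisons vertex merging and edge-topology bookkeeping throughout a sweepline pass.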

Anyway, good times to remember!

They mention O(n log* n) being quasi-linear time; I had to look up log* n because I had never seen it before.
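For anyone else who hadn't seen it: log* n (the iterated logarithm) counts how many times you apply log before the value drops to 1 or below. It grows absurdly slowly; a quick sketch:

```python
# log*(n): the number of times log2 must be applied before n <= 1.
from math import log2

def log_star(n):
    count = 0
    while n > 1:
        n = log2(n)
        count += 1
    return count

print(log_star(65536))       # 4   (65536 -> 16 -> 4 -> 2 -> 1)
print(log_star(2 ** 65536))  # 5   (one more application gets back to 65536)
```

So for any input you could ever store, log* n is at most 5, which is why O(n log* n) is "quasi-linear" in practice.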

And then there's O(n alpha(n)), where alpha(n) is the inverse of the Ackermann function, which grows even more slowly than log* n. The famous union-find data structure has this complexity.
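For reference, the structure in question, with path compression (here, the path-halving variant) and union by rank; alpha(n) is at most 4 for any remotely physical input size:

```python
# Union-find with path halving and union by rank: amortized O(alpha(n))
# per operation, alpha being the inverse Ackermann function.
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
        self.rank = [0] * n

    def find(self, x):
        # Path halving: every node on the path points to its grandparent.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        # Union by rank: attach the shorter tree under the taller one.
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1

uf = UnionFind(5)
uf.union(0, 1)
uf.union(1, 2)
print(uf.find(0) == uf.find(2), uf.find(0) == uf.find(4))  # True False
```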

Yeah I discovered a fast algorithm for the travelling salesman problem with neighbourhoods (i.e. cities are shapes not points). Emailed the author to inquire about code, turns out they never wrote any.

What was shocking about this? Seems like something you could always solve by hand, even if you couldn't quite enumerate the steps you need to take to do it. I guess that's sort of my baseline for whether I expect an algorithm to exist.

The shocking thing was that it is a known algorithm but (presumably) nobody ever implemented it.

In linear time was the key part.

In addition to galactic algorithms, there are also galactically proven algorithms. One example is the algorithm for matching with mismatches which I presented in the first chapter of my doctoral thesis; I proved that it was faster than other algorithms for inputs of at least ~10^30 bytes.

In practice, it wins starting at around 10^4 bytes.

How do you prove that something is faster than everything else? Was it already running in linear-time?

There aren't many non-trivial lower bounds out there, and I think they all use one of three ideas (diagonalization, crossing sequences or oracle reconstruction)!

Sorry, I was unclear. I proved that my algorithm was faster than other known algorithms; I didn't prove that it was faster than all other possible algorithms.

I've never understood the "number of the atoms in the universe" argument. The number of states the universe can be in doesn't seem to be equal to the number of atoms. For example, just two atoms could encode lots of numbers simply by using their distance. Quantum physics would affect it, but I mean in principle: we are not switching atoms on and off to encode state.

It’s 2^1729 digits, (vastly) more digits than there are atoms in the universe.

The vastly bit is kind of an understatement.

Each atom of a stable isotope is indistinguishable from every other atom of that isotope. So you can probably encode a few bits per atom, assuming you can somehow read this data. However, they are talking about vastly larger numbers of digits here: there are only ~10^80 atoms in the visible universe, and even bumping that to 10^90 does not help.

Sure, you might encode 10^6, or hell, I'll give you 10^100 bits of data per atom, but that's not even close to helpful. It's still on the order of K atoms in the universe, and you want to encode 10^400+ * K bits of data. So each atom needs to encode 10^400+ bits, and remember, each atom of a stable isotope is indistinguishable from every other atom of that isotope.
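The scale gap can be sanity-checked in a couple of lines (taking the thread's figure of 2^1729 digits at face value): the digit count itself is a number with over 500 digits, versus the 81-digit atom count.

```python
# The number in question has 2^1729 digits. How big is that digit COUNT?
digit_count = 2 ** 1729
print(len(str(digit_count)))  # 521: the count of digits is a 521-digit number
print(len(str(10 ** 80)))     # 81: the atom count has only 81 digits
```

So even the *number of digits* outstrips the atom count by some 440 orders of magnitude, before anyone starts arguing about bits per atom.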

You forgot about ordering.

8 binary bits in order can encode 256 states, as if every bit were capable of encoding 32 states.

Yeah, that specific number fits into 2 kilobytes of memory, but it's so large it's hard to compare it to anything.

Which is why I suggest comparing it to the number of bits required to encode the universe. If all you wanted was to store a single arbitrary number and could read/write the universe, you could encode anything up to 2^(10^(80 * k)), where k is larger than 1 but, I suspect, below 100.

It's really just a phrase used nearly-rhetorically in order to convey the vast scope of a thing. Everyone understands that the number of atoms in the universe is a huge number, so it's a good benchmark.

When people use this phrase they're not trying to claim that the number of atoms in the universe is directly related to the problem at hand; they're just trying to convey scale.

It's not the number of states, it's the logarithm of the number of states (the number of dimensions of finite size). Think about writing down a number. One atom per digit is a pretty natural heuristic for the optimal spatial cost of information. Yes, you can get clever, but there's no point; we're already at astronomical levels of imprecision.

As I said, one atom per digit is not really natural. Just two atoms could encode an infinite number of numbers, by measuring their distance. Well except quantum physics might get in the way, not allowing us to measure with arbitrary precision. But that would be another argument.

Another argument would perhaps be the energy required to do the computation. Maybe that relates more directly to the number of atoms in the universe, via Einstein's equation?

> Well except quantum physics might get in the way,

This is a pretty huge well except. When people talk about information that can be stored in the universe this is exactly the limitation they have in mind.

It's just there for a reference; maybe a better one would be Seth Lloyd's ultimate laptop (made from the whole universe). But the interesting part is that I don't think the ultimate laptop takes into account interactions and state between atoms. AFAIR it assumes computation is done on every available dimension (like electron spin), but not on how atoms move and interact with each other. Intuitively it shouldn't make much difference, though.

The number of digits of the possible interactions between the atoms is still proportional to the number of atoms.

What counts as an interaction? It seems to me there can be an infinite number of interactions between just two atoms.

Number of states of the universe is a quantum value, not a classical one.

So what quantum value is it? Shouldn't the argument then at least say "the number is greater than the number of possible states of the universe", or something like that?

How would you read that? (Distance between two distinct atoms) doesn't seem practical

Doing computations with single atoms also doesn't sound very practical.

Measuring distances: I suppose there could be numerous ways, like measuring the gravitational or electric attraction (not sure what it is called in English). I think only quantum theory says we cannot measure to arbitrary precision, or at least that if we do, there are other issues. Still, that would be a different argument than simply pointing at the number of atoms.

Can’t wait to be told to implement some of these for my next whiteboard interview

Perfect candidate for primality testing! https://www.mersenne.org/various/math.php#lucas-lehmer
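For reference, the Lucas–Lehmer test behind that link is short enough to sketch (the exponent p must itself be prime for the test to apply):

```python
# Lucas-Lehmer test for Mersenne primes: M_p = 2^p - 1 is prime iff
# s_{p-2} == 0 (mod M_p), where s_0 = 4 and s_{k+1} = s_k^2 - 2.
def lucas_lehmer(p):
    if p == 2:
        return True  # M_2 = 3 is prime; the recurrence needs p > 2
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0

# M_11 = 2047 = 23 * 89 is the first composite Mersenne number with prime p.
print([p for p in [2, 3, 5, 7, 11, 13] if lucas_lehmer(p)])  # [2, 3, 5, 7, 13]
```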

> 1729

Why is this hyperlinked? I was hoping the link would help explain why 1729 and not some other number, but it’s just trivia...

> One immediate practical effect would be to earn the discoverer a million dollar prize from the Clay Mathematics Institute.

I mean, I can’t argue with the practicality of that.

Probably because it's a pretty famous number, being the Hardy–Ramanujan number, with a pretty famous story behind it. There are quite a few videos on it [0] [1].

[0] https://www.youtube.com/watch?v=_o0cIpLQApk

[1] https://www.youtube.com/watch?v=LzjaDKVC4iY

Yeah not sure why it's hyperlinked, it's a rather dull number.

“But it’s not dull at all,” Ramanujan replies, “it’s actually the first number that can be written as the sum of two cubes, two different ways!”

But imagine he had said, "it's the number of dimensions your Fourier transform needs for the fastest way to multiply two numbers" :)

BTW, who thinks the other person was really just baiting Ramanujan into saying something like this?

The full anecdote as told by Hardy is

> I remember once going to see him when he was ill at Putney. I had ridden in taxi cab number 1729 and remarked that the number seemed to me rather a dull one, and that I hoped it was not an unfavorable omen. “No,” Ramanujan replied, “it is a very interesting number; it is the smallest number expressible as the sum of two cubes in two different ways.”

Hardy just saw it on a taxi cab and expressed sadness; Ramanujan pointed out the mathematical beauty of the number.

The real question is, did he know of this fact already or did he come up with it on the spot after hearing what Hardy said?

I suspect he knew it already. It kind of stands out as an almost-repeated digit pattern if you look at a table of cubes (in base 10), as the cube sums concerned are 10³+9³ = 1000+729 and 12³+1³ = 1728+1. Showing that it's the smallest such number is not difficult, but would take a bit of thought to come up with on the spot. You can do it by checking a few combinations of terms from your table:

   n   n³
   1    1
   2    8
   3   27
   4   64
   5  125
   6  216
   7  343
   8  512
   9  729
  10 1000
  11 1331
  12 1728
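The check against the table can also be done mechanically; a brute force over the same range 1..12 (my own throwaway code, not from the thread):

```python
# Find the smallest number expressible as the sum of two positive cubes
# in two different ways, using cubes of 1..12 as in the table above.
from itertools import combinations_with_replacement
from collections import defaultdict

sums = defaultdict(list)
for a, b in combinations_with_replacement(range(1, 13), 2):
    sums[a ** 3 + b ** 3].append((a, b))

taxicab = min(n for n, pairs in sums.items() if len(pairs) >= 2)
print(taxicab, sums[taxicab])  # 1729 [(1, 12), (9, 10)]
```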

Because 1728 is an incredibly surprising number, showing up most prominently at the center of elliptic curves, modular forms, and L-functions: https://en.wikipedia.org/wiki/J-invariant

And 1729 is 1728 + 1. Not even kidding, that is a thing that happens.

I'd argue that it shouldn't be linked, because it's being used in the wrong context. It would be like linking a music album titled "Outer Space" to the Wikipedia article on outer space.

I also wondered why that number. But as others said, it's a pretty famous number all by itself, so it makes sense for some notice to be paid to that; if there were no hyperlink, I would've thought it was weird and that maybe someone had made a mistake.

1729 has no special meaning for the multiplication algorithm. In the paper Harvey writes:

> In Section 5 we establish (1.3) with the explicit constant K= 1728, and in Section 5.4 we list some optimisations that improve it to K= 8.

> In Section 5, we will simply take d:= 1729 (any constant larger than K would do)

So 1729 is the 42 of 4-digit numbers.

This is the curse: there are not enough small(ish) numbers for the enormous number of properties we come up with for them to have :)

No mention of the number of dimensions for Fourier transform multiplication though: https://en.wikipedia.org/wiki/1729_(number)

I can't explain it, but it's the algorithm discussed here, if you want to read up on it: https://news.ycombinator.com/item?id=19474280

The 1729 link has since been removed, likely for the reasons user saagarjha stated -- wrong context.

Interestingly, the change that added it was also quite recent: https://en.wikipedia.org/w/index.php?title=Galactic_algorith...

"The famous quantum factoring algorithm of Peter Shor may or may not be a galactic algorithm. It is of course one of the great results in theory, ever. It has sparked funding for research centers on quantum computation that have promoted many other advances.

If and when practical quantum computers are built Peter’s algorithm will be one of the first algorithms run on them. Right now it is a galactic algorithm. But, perhaps it is the best example of the importance of galactic algorithms."

Is Graham's Number a galactic number?


You could say so:

> As with these, it is so large that the observable universe is far too small to contain an ordinary digital representation of Graham's number, assuming that each digit occupies one Planck volume, possibly the smallest measurable space. But even the number of digits in this digital representation of Graham's number would itself be a number so large that its digital representation cannot be represented in the observable universe. Nor even can the number of digits of that number—and so forth, for a number of times far exceeding the total number of Planck volumes in the observable universe.

> for a number of times far exceeding the total number of Planck volumes in the observable universe.

Just so I'm clear. They're saying that not only can I not fit Graham's Number in all the Planck volumes of the universe; and I can't even count the digits of GN and write that in the Planck volume of the universe (and so on), but the number of "indirections" is itself so large as to not fit in the universe?


    1. GN (can't fit).  
    2. Number of digits in GN (can't fit).  
    3. Number of digits in #2 (can't fit).  
    4. Number of digits in #3 (can't fit).  
    N. <-- The numbered list item itself won't fit.
Am I understanding that right?

Yes. log(log(log(... GN))) applied X times (where X is the number of Planck volumes in the universe) is still greater than X.

Where log = base 10 logarithm.

Hofstadter talks a bit about this abstraction in his article "On Number Numbness":

> If, perchance, you were to start dealing with numbers having millions or billions of digits, the numerals themselves (the colossal strings of digits) would cease to be visualizable, and your perceptual reality would be forced to take another leap upward in abstraction, to the number that counts the digits in the number that counts the digits in the number that counts the objects concerned.

And despite that, Graham's Number is still in the countable set. :-)

It shouldn't be surprising, because Graham's Number is all about applying those indirections recursively into more indirections.

But have you heard of Rayo's number?

Just yesterday, I read the comments on the Physical Polynomial-Time complexity class on Complexity Zoo: https://complexityzoo.uwaterloo.ca/Complexity_Zoo:P#php

This sort of thing highlights the difference between theoretical and practical algorithms.

In practical algorithms, one does not ignore constants, one takes into account actual hardware (particularly cache effects), and one does not ignore real world distributions of inputs.

This reminds me of the spaghetti sort algorithm: https://en.m.wikipedia.org/wiki/Spaghetti_sort

Same thing, but a little more practical for computing in case your computer doesn't have a hand and a table: https://rosettacode.org/wiki/Sorting_algorithms/Sleep_sort
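A minimal sketch of the idea (my own toy code; the scale factor is arbitrary, and this is a joke algorithm, not a serious sort):

```python
# "Sleep sort": spawn a thread per value that sleeps proportionally to
# the value, then appends it to the output. Wall-clock time is O(max
# value), and values that are too close together can finish out of order.
import threading
import time

def sleep_sort(values, scale=0.1):
    result = []
    lock = threading.Lock()

    def worker(v):
        time.sleep(v * scale)
        with lock:
            result.append(v)

    threads = [threading.Thread(target=worker, args=(v,)) for v in values]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return result

print(sleep_sort([3, 1, 2]))  # [1, 2, 3]
```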

The godfather of galactic algorithms is the known (!) algorithm that solves SAT in polytime.

However, its runtime is only guaranteed to be polynomial if someone guarantees that P = NP.

Before an unbeliever comments (as I would do myself if I didn't know the algorithm): the algorithm involves enumerating f(M, t), where M is the Gödel number of a Turing machine T, t an integer, and f the output of T after t steps on the original input. This output is interpreted as a variable assignment. If it satisfies the formula, the SAT instance is satisfiable. Otherwise, the enumeration continues.

If P = NP, there is a Turing machine that always outputs a satisfying assignment, if one exists, with t polynomially bounded by the input size. As this Turing machine is constant, the algorithm runs in polynomial time.

So, in simpler terms (for me):

- start with a file size (let's say 4kb), and a duration (let's say 1 hour).

- generate all possible files of the given size (there are 2^32768 of them), mark them as executable, and run them. If they don't finish within the given duration, kill them.

- check the output of each program that didn't crash. If one matches a valid solution, OK. If none does, try again with a longer duration and a larger size.
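The last step, the polynomial-time check, is what makes the trick possible at all. A toy sketch (my own clause format; candidate assignments stand in for the enumerated programs, which collapses the scheme to brute force, but it shows the verifier):

```python
# A CNF clause is a list of ints, DIMACS-style: 3 means x3, -3 means NOT x3.
from itertools import product

def satisfies(clauses, assignment):
    # Poly-time verification: every clause must contain one true literal.
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)

def search(clauses, n_vars):
    # Stand-in for "run every program": try every candidate assignment.
    # The real construction dovetails actual programs with step budgets.
    for bits in product([False, True], repeat=n_vars):
        assignment = {i + 1: b for i, b in enumerate(bits)}
        if satisfies(clauses, assignment):
            return assignment
    return None

# (x1 OR x2) AND (NOT x1 OR x2)
print(search([[1, 2], [-1, 2]], 2))  # {1: False, 2: True}
```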

It doesn't just solve SAT. At galactic scales, it will either be the optimal solution to any problem, or as fast as checking the answer, whichever is slower. If the code-generation part doesn't depend on the input size, it is constant time. That the constant is many times the age of the universe doesn't change the complexity class.

Yeah, you got the point ;) This however only works for problems where you can validate a solution in polynomial time. There is still the problem of what happens when a solution does not exist; even in that case, the algorithm must terminate in polynomial time.

Interesting, I don’t remember this proof (maybe I learned it, but it’s been a couple of years since university). Do you remember the “stop condition”? At some point the algorithm would have to “give up” and declare the problem unsatisfiable.

I don't know. You could also enumerate all proofs and check whether P is a proof that a Turing machine T solves SAT in polytime (if P = NP, this requires a finite number of steps^), and then run T.

^ It would be funny, though, if P = NP but no proof exists for any Turing machine that it solves SAT. This would have to be ruled out for the algorithm to work.

Can you give a reference, or a hint on where to read more? This sounds really interesting, but Google has failed me.

Actually, I don't know anymore where I read about this idea - if you find out, please let me know! I've added a description of the algorithm.

If all that is required is a super large constant (as in some cases), why not assume the large constant, do the calculation, then factor out the large constant?

Because the “faster” algorithm only beats the “slow” algorithm beyond that huge constant; below it, just applying the “slow” algorithm directly wins. So the only thing you would achieve is slowing down the calculation overall.
