Hacker News new | past | comments | ask | show | jobs | submit login
The Best Algorithms of the 20th Century [pdf] (uta.edu)
87 points by sirchristian on Sept 2, 2012 | hide | past | web | favorite | 20 comments



For more on #5 (original Fortran optimizing compiler), including the source code, see http://www.softwarepreservation.org/projects/FORTRAN/ .

For more on #7 (Quicksort), I especially recommend Hoare's paper in the Computer Journal, which explains many details about the algorithm that are unfortunately not widely known (such as the correct way to partition for Quicksort): http://comjnl.oxfordjournals.org/content/5/1/10.abstract (the full text is freely downloadable).


Khawrazmi was a Persian Muslim. I stopped reading right there.


What does religion have to do with algorithms?!


Suppose the article was about the greatest boxers of the 20th century, and it talked about the Arab boxer Muhammad Ali. Would you object to someone pointing out that Ali was not Arab, but rather an American Muslim? His being Muslim is relevant not because it is his religion, but because it explains why an American of his era has such a foreign name--he changed his name when he converted to Islam.

Same thing is going on here. The article says al-Khwarizmi was an Arab. He was actually a Persian who followed the Arab religion (Islam), which probably explains the Arab-appearing aspects of his life and work.

Does it matter if he was Arab or Persian? Some might argue that all that matters is what he discovered. The counter argument is that by mentioning things like nationality, ethnicity, religion, etc., in these kind of articles, it helps reinforce the idea that science transcends national borders, ethnic boundaries, religion, and so on. I believe I read something by Asimov once where he said that was the reason he always included the nationality of scientists he wrote about.


So is it somewhat equivalent to saying someone is protestant when they are catholic, and being outraged at such a mix-up? I still don't quite understand why it has emoted such apparent disgust as to stop reading, but then I am not a religious person.


It's not about religion. Equivalent would be something like saying an important Mexican scholar was Canadian.


I doubt it has anything to do with religion. He is inferring from the mistake of saying a Persian Muslim was an Arab that the author did not put much effort into research for the article, and so has decided it is not worth reading.


The article's statement in the second sentence that al-Khwarizmi was an Arab is an error. The person you're responding to was almost certainly not being bigoted, he was reacting to a factual error in the introductory paragraph of the article.


I see your point and agree. However, many will not readily discern Persian from Arab and focus on the more interesting content: the algorithms


Fair enough. I commented because it seemed very unfair that he was downvoted. People are too eager to do that. The point he made might not seem very important, and might be misunderstood, but it's true.


Because someone didn't factcheck an irrelevant fact othside of his expertise, for which it is entirely normal to trust 'dubious' sources?

(In which I include the author's memory)


FFT FTW!


Hmm, let's see: One of the algorithms is the fast Fourier transform (FFT). It is sometimes described as a form of matrix factorization and multiplication.

Another algorithm is Dantzig's simplex algorithm for linear programming: It is basically just elementary row operations much as in Gauss elimination on a, usually, vastly 'under determined' system of linear equations. The math for the simplex algorithm is mostly nicely presented via matrix theory.

A third algorithm in the list is the QR algorithm for finding eigenvalues.

So, of the 10 algorithms, at least three are closely related to just matrix theory. Amazing.

Then, for computer science I would add an observation: A few days ago a venture firm principal asked me about my project, "So you have an algorithm?". I had to respond, "Well, yes, but by itself an algorithm doesn't have much to recommend it and, thus, doesn't mean much.".

Well, this list of 10 algorithms supports my observation: For a problem as complicated as those solved by the 10 algorithms in the list, an algorithm by itself doesn't mean much, really doesn't mean anything. Instead, to take any such algorithm seriously, we need something we can take seriously and logically prior to the algorithm.

Well, as in the 10 algorithms, what is prior is just some applied math, typically with theorems and proofs. Then due to the theorems and proofs, we take the applied math seriously. Then we take the algorithm seriously because and only because it is a careful implementation of the data manipulations in the applied math. So, net, what is really crucial for such an algorithm is the logically prior applied math.

Of course, at times we can proceed without any prior applied math and work just with an algorithm that we got, say, from just heuristics. Then we may be able to take the algorithms seriously after a lot of empirical testing.

So, here's my point: For reasonably complicated problems, the key is some applied math, and we take a corresponding algorithm seriously only because of the logically prior applied math.

Or, computer science: The most important work with algorithms is work with logically prior applied math complete with theorems and proofs. Work with algorithms without such prior applied math is close to hopeless.

Venture firms and limited partners: If an entrepreneur has some crucial, core 'secret sauce' in running code that can be called an 'algorithm', what is crucial (prior to already having a financially successful company based on that code) is the corresponding applied math, not the 'algorithm' and not just the code.

Information technology entrepreneurs: If your business is trying to solve a serious problem with an algorithms and some corresponding code, then don't start with the algorithm and, instead, start with some applied math.

Computer science students: If you want to do good work with algorithms, study appropriate topics in a math department, not 'algorithms' in a computer science department.

Computer science professors: Algorithms are crucial to your field, but your approach to algorithms skipping prior applied math complete with theorems and proofs is bankrupt, with no good reason to take any such algorithm seriously, and will lead to a long walk on a short pier and blocked progress for your field. What's just crucial for your interest in algorithms is in the math department and not in your department. Sorry 'bout that!


It's an article from SIAM. Of course it's heavily biased towards applied mathematics. It's not only biased against computer science, it's biased against pure mathematics. Do you doubt that Buchberger's algorithm will reverberate down through the millenia? Even within applied mathematics, the list leaves out multigrid methods, the only linear-time algorithms in their class, which seem a shoe-in given their criteria for inclusion.

It would hardly be difficult to make a very long list of pure CS algorithms and data structures that could stand head to head against the likes of the multipole method in both industrial application and scientific value.

Your knowledge of computer science is evidently shallow. Educate yourself and you might think twice before making such ignorant pronouncements.


I wonder what proportion of "pure" CS algorithms can be conceptualized as an application of one of the fixed point theorems and/or properties of monotone operators. There has got to be _something_ useful about the functional analysis perspective and people doing serious work in algorithms are certainly familiar with it as a group.


> I wonder what proportion of "pure" CS algorithms can be conceptualized as an application of one of the fixed point theorems and/or properties of monotone operators.

You might enjoy http://www.amazon.com/Graphs-Dioids-Semirings-Algorithms-Ope.... It has some cool ideas, but you'll first have to wade through a sea of abstract nonsense a la Bourbaki.


You make several points. It appears that you don't like my main point but have no good argument against it or replacement for it; I will respond and try to explain my main point again.

One of your points seems to be that the selection of 10 best algorithms is not very good. I would agree: I would have selected heap sort instead of quicksort because, given a positive integer n, the execution time of heap sort in sorting n items is proportional to n ln(n) in both average case and worst case and, thus, heap sort meets the Gleason bound for the fastest possible sort by comparing pairs of keys. The execution time of quicksort on average is n ln(n), and in practice faster than heap sort, but in worst case appears to run in n^2. Quicksort seems to do better on locality of reference for a virtual memory system, but there are ways to improve the locality of reference of heap sort.

For my main point, that the list of 10 best algorithms was well chosen is not very important.

Here is my main point again:

"For reasonably complicated problems, the key is some applied math, and we take a corresponding algorithm seriously only because of the logically prior applied math."

So, since you don't like this point, I will try to explain in more detail:

First, we are considering 'algorithms'. So, let's agree on what we mean by an algorithm: I will accept running code in a common programming language -- C/C++, Fortran, PL/I, etc. -- or something similar in 'pseudo-code'.

So, the issue is, given an algorithm, what do we need to take it "seriously", that is, be sure it does what it is intended to do?

Second, briefly let's return to sorting: Quicksort and heap sort are darned clever. But, given the algorithms, in the form I just mentioned, it's easy enough to trace the logic informally and confirm that the algorithms actually do sort. For the running times, finding those is more work but not very notable math.

So, net, for these sorting algorithms, it is easy enough for us to take them seriously for their ability to do what is promised -- sort or sort in n ln(n).

You also mentioned data structures. Well we can say much the same for AVL trees or the data structures used in a fast implementation of the network simplex algorithm for least cost capacitated network flows, etc. For some of the data structures used in, say, dynamic programming, more is needed to take the algorithms for those data structures seriously. Similarly for some uses of k-D trees.

So, for some algorithms and data structures, we can take them seriously just 'by inspection'.

Third, consider, as in the list of 10 algorithms, trying to solve a fairly complicated problem. Examples could include the discrete Fourier transform, finding eigenvalues and eigenvectors of a symmetric, positive definite matrix, least cost network flows, matching to minimize the most expensive single match used, linear programming, quadratic programming, iterative solution of large systems of linear equations (e.g., via Gauss-Seidel). Maybe we are trying to solve a problem in non-linear programming and want an algorithm to achieve the Kuhn-Tucker conditions.

Given an algorithm for one of these problems, to take the algorithm seriously we need more than just 'inspection'. So, since just 'inspection' no longer works, the question is, how can we take such an algorithm seriously?

Actually, there is some fairly tricky math needed for us to take seriously the simplex algorithm for linear programming: For the set of real numbers R, a positive integer n, and Euclidean R^n with the usual topology, let F be the intersection of finitely many closed half spaces of R^n, and let linear function z: R^n --> R. Claim: If z is bounded above on F, then z achieves its least upper bound. For us to take the simplex algorithm seriously, we need to know that this claim is true. Note: The proof is not just usual analysis with converging sequences from Rudin's 'Principles'.

For more, given a linear programming problem, it may be feasible or infeasible. If the problem is feasible, then it may be bounded or unbounded. If the problem is feasible and bounded, then we want to know that there is an optimal solution and that the simplex algorithm can find one in finitely many iterations. Since the simplex algorithm considers only extreme point solutions, we would like to know that, if there is an optimal solution, then there is an optimal extreme point solution. In the simplex algorithm, there is a sufficient condition for optimality, but there can be optimal solutions and optimal extreme point solutions without this sufficient condition. So, we need to know that the simplex algorithm can achieve the sufficient condition. The nicest solution to these issues I know of is via Bland's rule by R. Bland, long at SUNY. It's not obvious.

Again, let's be more clear: Given an algorithm as above, that is, just code or pseudo-code, for the simplex algorithm with Bland's rule, solution of linear equations with double precision inner product accumulation and iterative improvement (Forsythe and Moler), detecting and handling degeneracy (basic variables with value 0), detecting infeasibility, detecting unboundedness, considering the reduced costs as a sufficient condition for optimality, the code will look like just so much gibberish with no good reason to take it seriously. To take the code seriously, we need some prior math where the algorithm just implements the manipulations specified by the math.

Yes, apparently computer science regards the simplex algorithm as an algorithm in computer science: E.g., the algorithm is discussed in Chapter 29 of

Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein, 'Introduction to Algorithms, Second Edition', The MIT Press Cambridge.

commonly called 'CLRS'.

E.g., recently I encountered a question: For some positive integer n, given n client computers and one frequency band with bandwidth x Mbps, design a wireless network to provide all x Mbps to all n client computers simultaneously. Is such possible? Yes. Does this violate Shannon's information theory? No. So, how to take any such proposal seriously? Sure, some math. Basically the solution is one heck of a strange 'antenna' pattern with n nodes with each client getting just the node with just their data. It all boils down just to linear algebra after taking some Fourier transforms.

So, again, given a fairly complicated problem, when trying to evaluate an algorithm to solve this problem where just 'inspection' no longer works, the question is, how can we take such an algorithm seriously?

Well, to take the algorithm seriously, we need more than just the algorithm, that is, more than just the code or pseudo-code.

As I mentioned, one way to take the algorithm seriously is a lot of empirical evidence. Alas, this can be slow and is not very satisfactory.

So, what is left? You mentioned nothing. And computer science has nothing.

I gave a solution: Start with the real problem. Get a solution via some applied math complete with theorems and proofs. That is, properties of the real problem provide assumptions for the theorems, and the conclusions of the theorems provide the desired solution. Then the software, that is, the 'algorithm', implements the data manipulations specified by the math. We take the math seriously because of the theorems and proofs. Then we take the algorithm seriously because, and only because, we took the math seriously.

Net, without the math, there is little reason to take the algorithm seriously. With the math, to evaluate the algorithm, we should evaluate the math and then confirm that the algorithm does what the math specifies.

So, for your

"Your knowledge of computer science is evidently shallow. Educate yourself and you might think twice before making such ignorant pronouncements."

What is relevant here is what I wrote; whether my "knowledge of computer science" is "shallow" or not is irrelevant.

I said nothing "ignorant"; you gave no solution to the question of how to take an algorithm seriously when 'inspection' is not sufficient; and I gave apparently the only good solution we have.


I hope you don't expect me to respond to every statement of yours in that mountain of text. This response of mine has already grown too long.

You don't have to convince me that mathematics is important--my background is in pure mathematics. I just don't think CS people are hapless fools who need applied math people like you from the outside to help them out. In my book, the non-systems part of CS is already a part of mathematics.

Since you bring it up, I've always considered numerical methods as firmly a part of applied mathematics. Combinatorial search and optimization problems are in the heartland of CS. The simplex method straddles numerical optimization and combinatorial optimization and is a bit of a hermaphrodite. Low-dimensional LPs are important in computational geometry and that field has produced some beautiful algorithms. Here's Seidel's randomized algorithm: Throw out a random hyperplane and solve the smaller problem recursively. If this optimum satisfies the excluded constraint then we are done. Otherwise the true optimum must lie on the hyperplane, so we can reduce the dimension of the problem by 1.

> Net, without the math, there is little reason to take the algorithm seriously.

An algorithm _is_ mathematics. The reason I said you seemed ignorant of computer science is this apparent belief that computer science is a random bunch of algorithms without any supporting theory.

> Well, to take the algorithm seriously, we need more than just the algorithm

Your premise is based on a strawman. How do you think algorithms are designed? They aren't generated randomly. They are generated based on exactly the kind of insight that generates theorems and proofs. There may not be tight proofs of every property we'd like to know for certain. That is certainly true for the simplex method! Its worst-case running time is exponential (e.g. Klee-Minty), and there is a small cottage industry devoted to proving its running time is polynomial in the average case (for various definitions of 'average case'). People have been using the simplex method very effectively since its inception despite knowing that its pathological behavior can be very bad indeed. Had you wanted to make your point more effectively, you should have picked interior point methods, which have the additional benefit of working for a much wider class of convex problems like SDPs.

> Claim: If z is bounded above on F, then z achieves its least upper bound. Note: The proof is not just usual analysis with converging sequences from Rudin's 'Principles'.

This has nothing to do with linearity or convexity. It's true for any continuous function f : X -> R on a closed subset F of a complete topological space X. If f is bounded above on F then its supremum on F is achieved at some x in X. By the supremum property and continuity of f, you can find a net in F which accumulates around x. This Cauchy net converges to x since X is complete. Because F is closed it contains its own accumulation points. Hence x is in F.

This proof is exactly at the level of Baby Rudin if you drop back the generality a bit and work with metric spaces.

> For us to take the simplex algorithm seriously, we need to know that this claim is true.

Nonsense. Any implementation of the simplex method is already going to introduce round-off error, so the theoretical difference between optimizing on a set and its closure has no practical consequences. If your point is that the achievement of the supremum ensures termination of the simplex method, that is false. The simplex method is hill climbing on a polytope's vertex graph. Whenever there is a local tie between neighboring vertices you need a tie-breaking pivoting rule. Bad pivoting rules (e.g. lexicographic ordering) will lead to cycling and non-termination. In the absence of such local ties, all you need to know to prove termination is that the number of vertices is finite and that the objective function strictly decreases each step. Even the proof of termination when pivoting with Bland's rule is purely combinatorial.

An example of something that's actually important in practice is how strong Lagrange duality supplies non-heuristic stopping criteria for convex programs via dual feasible points. That way we know what we're giving up by being suboptimal. Duality theory is also much richer and subtler than the kind of mindless definition-chasing freshman homework problem you bought up above.


> You don't have to convince me that mathematics is important--my background is in pure mathematics. I just don't think CS people are hapless fools who need applied math people like you from the outside to help them out. In my book, the non-systems part of CS is already a part of mathematics.

Would, could, and should be, but so far in practice in the universities and elsewhere very much is not.

> Combinatorial search and optimization problems are in the heartland of CS.

No: They are in the "heartland" of optimization, 'operations research', and applied math. The CS people aren't good enough with the theorems and proofs to make progress with the math. E.g., not nearly enough CS profs went through a careful theorem proving course in abstract algebra or through Rudin's 'Principles'.

> An algorithm _is_ mathematics.

Well, an algorithm is necessarily mathematically something, but by itself what it is mathematically is unknown and, thus, not in any very meaningful sense mathematics.

I gave a definition of an 'algorithm': Again, yet again, just look at the code or pseudo-code, and you don't have much. To take such code seriously, need some logically prior math, some actual math, as in a math department and in the tradition of von Neumann, Halmos, Rudin, Birkhoff, Bourbaki, etc. CS tries hard to avoid such math and, thus, is stuck in making progress in algorithms.

> Your premise is based on a strawman. How do you think algorithms are designed? They aren't generated randomly. They are generated based on exactly the kind of insight that generates theorems and proofs.

No: Big themes now in CS are to come up with algorithms by whatever means -- genetic, intuitive, heuristic, 'clustering', neural network fitting, 'rules', 'machine learning', 'artificial intelligence', etc. -- often with no "insight" at all and, in particular, and most significantly, with no reason to take the algorithm at all seriously.

> there is a small cottage industry devoted to proving its running time is polynomial in the average case (for various definitions of 'average case').

K. H. Borgwardt.

Empirically the running time on usual problems has long been known to be about 3m iterations for a problem with m constraints.

> Had you wanted to make your point more effectively, you should have picked interior point methods, which have the additional benefit of working for a much wider class of convex problems like SDPs.

You are missing my point: I'm using the problems in the top 10 list, optimization, digital filtering, etc. just as sources of examples of my point. Again, yet again, just for you, one more time, please actually read it this time, my point, already repeated over and over and over, and here repeated yet again, about algorithms, and having nothing to do with optimization, is:

"For reasonably complicated problems, the key is some applied math, and we take a corresponding algorithm seriously only because of the logically prior applied math."

Here I said "For reasonably complicated problems" but said nothing, zip, zilch, zero, about optimization or digital filtering. My claim holds for algorithms, all algorithms, ALL of them, for whatever purposes or problems, in the full generality of algorithms "for reasonably complicated problems".

Why? Again, yet again, with just an algorithm, all we have is just the code, and for any very complicated algorithm that means next to nothing down to nothing at all. This point is the same for simplex for linear programming as it is for interior point methods for achieving the Kuhn-Tucker conditions or for anything else complicated.

So, again, yet again, given the algorithm, just the code, how to take it "seriously"? There ain't but just two ways: First, can evaluate the algorithm just empirically by running it on a lot of data. This way was long ago adopted essentially in whole by the entire CS 'artificial intelligence' community. Second, can have something 'logically prior' that do take seriously because it has theorems and proofs. For this second way, CS is stuck-o because they are not good enough with theorems and proofs because they didn't take the right courses in the math department in grad school.

You find another way, another means, another 'paradigm', for taking an algorithm seriously, and I will jump for joy, but I'm not holding my breath until then.

> This has nothing to do with linearity or convexity. It's true for any continuous function f : X -> R on a closed subset F of a complete topological space X. If f is bounded above on F then its supremum on F is achieved at some x in X. By the supremum property and continuity of f, you can find a net in F which accumulates around x. This Cauchy net converges to x since X is complete. Because F is closed it contains its own accumulation points. Hence x is in F.

You typed too fast, way too fast. You are flatly, badly, easily, trivially, clearly wrong!

Counterexample: Let R denote the set of real numbers and let x, y be in R and x, y > 0. Then the problem is to find x to minimize y subject to y >= 1/x.

Then y is bounded below but does not achieve its greatest lower bound, which is 0. Done.

Why? There is no compactness. In your argument, you omitted the assumption of compactness. And as we know for positive integer n, in Euclidean R^n, a subset is compact if and only if it is closed and bounded. Yes, I made As in the course in Rudin's 'Principles'!

The set I illustrated

{ (x, y) | x, y > 0 and y >= 1/x }

is closed in the usual topology for R^2 but not bounded.

So, your argument based on Rudin style convergence does not establish my linear programming

Claim: If z is bounded above on F, then z achieves its least upper bound.

Since F is an intersection of finitely many closed half spaces, F is closed. But F need not be bounded. Still my claim holds even when the feasible region F is unbounded. So, proofs need to make some crucial use of linearity, that is, that the function to be maximized is linear and the half spaces are from linear functions.

Beyond Bland's proof (which really came from some of his work in matroid theory), can also establish the results for linear programming and simplex by some careful use of 'separation theorems' or, if you will, theorems of the alternative. But, again, linearity is crucial.

We very much do need to know the properties I listed of the simplex algorithm and linear programming.

If you are concerned about floating point, then just do the arithmetic exactly. For this, consider the M. Newman approach of solving a system of linear equations by multiplying through by suitable powers of 10 to convert all the data to integers and then solving in exact machine single precision arithmetic in the field of integers modulo a prime for a sufficiently large set of prime numbers and constructing the multiple precision numerators and denominators via the Chinese remainder theorem. Besides, in practice, the floating point issue is usually not very serious.

What you meant about Lagrangians is not so clear, but, sure, there is some nonlinear duality theory where the dual of a nonlinear minimization problem is a maximization of a concave function which, then, can be approximated by supporting hyperplanes found from efforts with the primal problem. The proof is short and easy. To write out the proof, I'd want TeX but don't have that on HN.

Uses of this result are commonly called 'Lagrangian relaxation'. Once I had a 0-1 integer linear program with 40,000 constraints and 600,000 variables and got a feasible solution within 0.025% of optimality in 905 seconds on a 90 MHz computer from 500 iterations of Lagrangian relaxation. To maximize the approximations to the concave function, I used the old Watcom Fortran version of the IBM Optimization Subroutine Library (OSL).

This work with Lagrangian relaxation is a multidimensional generalization of H. Everett's single dimensional 'generalized Lagrange multipliers' that he used on DoD resource allocation problems at his Lambda Corporation (after he did his 'many worlds' version of quantum mechanics).

Again, such an example doesn't add or detract from my basic point that we have no good way to take an algorithm by itself seriously and, thus, for an algorithm to be taken seriously need some logically prior work we do take seriously, which is forced to be math with theorems and proofs, which the algorithm merely implements.

Again, find another way to take an algorithm seriously and I will jump for joy. So, in particular, as long as the CS people pursue algorithms without careful, powerful, logically prior work with math theorems and proofs, they will be stuck-o for progress. Sorry 'bout that.


> So, of the 10 algorithms, at least three are closely related to just matrix theory. Amazing.

"Mathematics is the art of reducing any problem to linear algebra." - William Stein




Applications are open for YC Winter 2020

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact

Search: