First improvement of fundamental algorithm in 10 years (mit.edu)
231 points by phreeza on Sept 27, 2010 | 37 comments

Their "represent it as a matrix" approach reminded me of this Richard Feynman story:

  [...] Richard had completed his analysis of the behavior
  of the router, and much to our surprise and amusement, he
  presented his answer in the form of a set of partial
  differential equations. To a physicist this may seem
  natural, but to a computer designer, treating a set of
  boolean circuits as a continuous, differentiable system is
  a bit strange. Feynman's router equations were in terms of
  variables representing continuous quantities such as "the
  average number of 1 bits in a message address." I was much
  more accustomed to seeing analysis in terms of inductive
  proof and case analysis than taking the derivative of "the
  number of 1's" with respect to time.

Great story. I'm a physicist and I've read most of the Feynman stuff, including many of his papers, but I wasn't aware he worked at Thinking Machines Corp.

I submitted it as a stand-alone link:


from http://theory.csail.mit.edu/toc-seminars/ (can't find the actual paper...)

Electrical Flows, Laplacian Systems, and Faster Approximation of Maximum Flow in Undirected Graphs

The maximum flow problem and its dual, the minimum s-t cut problem, are two of the most fundamental and extensively studied problems in Operations Research and Optimization. They have many direct applications and are also often used as subroutines in other algorithms.

In this talk, I'll describe a fundamentally new technique for approximating the maximum flow in capacitated, undirected graphs. I'll then use this technique to develop the asymptotically fastest-known algorithm for solving this problem. For graphs with n vertices and m edges, the algorithm computes epsilon-approximate maximum flows in time \tilde{O}(m^{4/3})poly(1/epsilon) and computes epsilon-approximate minimum s-t cuts in time \tilde{O}(m+n^{4/3})poly(1/epsilon).

We compute these flows by treating our graph as a network of resistors and solving a sequence of electrical flow problems with varying resistances on the edges. Each of these may be reduced to the solution of a system of linear equations in a Laplacian matrix, which can be solved in nearly-linear time.
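To make the "network of resistors" idea concrete, here's a toy sketch (my own, with a made-up graph and resistances, not the paper's algorithm): inject a unit of current at a source, solve the Laplacian system for node potentials, and read off the edge flows with Ohm's law.

```python
import numpy as np

# Treat an undirected graph as a resistor network and compute the electrical
# flow for one unit of current pushed from source s to sink t.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
resistance = [1.0, 1.0, 2.0, 1.0, 1.0]
n, s, t = 4, 0, 3

# Weighted Laplacian: L = sum over edges of c_e (e_u - e_v)(e_u - e_v)^T,
# where c_e = 1/r_e is the conductance of edge e.
L = np.zeros((n, n))
for (u, v), r in zip(edges, resistance):
    c = 1.0 / r
    L[u, u] += c; L[v, v] += c
    L[u, v] -= c; L[v, u] -= c

# Inject one unit of current at s, extract it at t, and solve L x = b for the
# node potentials (pseudoinverse, since L is singular with nullspace = ones).
b = np.zeros(n); b[s], b[t] = 1.0, -1.0
x = np.linalg.pinv(L) @ b

# Ohm's law gives the flow on each edge; Kirchhoff's current law should hold.
flows = [(x[u] - x[v]) / r for (u, v), r in zip(edges, resistance)]
net = np.zeros(n)
for (u, v), f in zip(edges, flows):
    net[u] -= f; net[v] += f
print(np.allclose(net, -b))  # flow conservation at every node
```

The paper's contribution is of course in how the sequence of resistances is chosen and in solving these Laplacian systems in nearly-linear time, not in this basic setup.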

This is joint work with Paul Christiano, Aleksander Madry, Daniel Spielman, and Shanghua Teng.

The article refers to the max flow problem. About all this article does is imply that the new method involves the adjacency matrix of the graph and provides little else in the way of details. A better article is needed, not this oversimplified press release.

Kelner is set to give a talk about it tomorrow[1]. Anyone know if they create videos for these?

[1]: http://www.csail.mit.edu/events/eventcalendar/calendar.php?s...

The method sounds very interesting, but the title seems to overstate the work a bit. This is an approximation algorithm, meaning that it's an improvement on an _approximate variant of_ a fundamental problem, right?

If you're able to find a rapidly converging sequence of approximations, then in finite time you can approximate arbitrarily well. Given the realities of finite numerical precision, this can be exactly as useful as an exact solution.

We'll know more tomorrow.

Sadly, "finite time" is not what we need - they are looking for approximations that will be much rougher than the imprecision of finite numerical precision, and they want the approximation algorithm to have a good provable run time.

In general, taking theoretical approximation algorithms and shrinking the approximation factor below the threshold where you can tell the difference is a dead end. For instance, this algorithm's running time has a poly(1/\epsilon) factor, so you pay a huge cost for it. It is better to think \epsilon = 1/3 or so.

Is there necessarily a difference? Some fundamental problems, including such simple things as the physical 3-body problem, don't have analytical solutions. That doesn't mean that approximate solutions solve an approximate problem. Algorithms can be designed to give an approximation to the optimal solution of the problem, which is known to be achievable only through ultimately numerical methods.

I don't know whether that is the case here: it is possible that they solve a different problem from the fundamental problem. However, even in that case: if the transformations made to turn the fundamental problem into a more tractable problem are generally applicable, then is there really a difference between solving the fundamental problem and solving the transformed problem?

I think the title is appropriate. The approximate solution is the interesting one, as the exact solution is believed to be intractable.

This is absolutely not true. The wikipedia page for the max flow problem lists several (slower) poly-time algorithms for solving exact max flow. Most theory-101 classes cover at least Ford-Fulkerson.
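For reference, here's a compact Edmonds-Karp sketch (Ford-Fulkerson with BFS augmenting paths, which runs in O(VE^2)); the graph and capacities are made up:

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp: repeatedly augment along shortest paths (BFS).
    cap is a dict-of-dicts of residual capacities, modified in place."""
    flow = 0
    while True:
        # BFS for an augmenting path with leftover capacity.
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow  # no augmenting path left: flow is maximum
        # Find the bottleneck along the path, then push flow through it.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= bottleneck
            cap[v][u] = cap[v].get(u, 0) + bottleneck  # residual edge
        flow += bottleneck

# Small example: edges 0->1 (3), 0->2 (2), 1->2 (5), 1->3 (2), 2->3 (3).
cap = {0: {1: 3, 2: 2}, 1: {2: 5, 3: 2}, 2: {3: 3}, 3: {}}
print(max_flow(cap, 0, 3))  # 5
```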

I don't think that's true. See, for example, Finding Maximum Flows in Undirected Graphs Seems Easier than Bipartite Matching (Karger-Levine, 1997): http://citeseerx.ist.psu.edu/viewdoc/summary?doi=

Any MIT people know if this is open to the public? Might be worth leaving work a bit early to listen to that talk.

The email note and notice in the elevator says "THEORY COLLOQUIUM: Open to the Public".

Adjacency matrices are taught in undergraduate linear algebra classes. This article makes it sound like they invented them.

They're taught in secondary schools in Scotland. (Or at least they were when I did CSYS Maths (Pure), now Advanced Higher...)

Yes, but the article tries to explain the work to people unfamiliar with graph theory or CS, thus the very simple terms and analogies; they even define what a graph is before proceeding. I guess that's what it's meant to be: a short, simple press release with (I'm assuming) a follow-up on the algorithm itself.
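For anyone who hasn't seen them: the adjacency matrix is just a 0/1 matrix of edges, and one classic payoff of the representation is that its powers count walks. Toy example (my own):

```python
import numpy as np

# Adjacency matrix of a 4-cycle: A[i][j] = 1 iff there is an edge {i, j}.
# Payoff of the matrix view: (A^k)[i][j] counts walks of length k from i to j.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
A = np.zeros((4, 4), dtype=int)
for u, v in edges:
    A[u, v] = A[v, u] = 1

walks2 = np.linalg.matrix_power(A, 2)
print(walks2[0, 0])  # length-2 walks from node 0 back to itself: 2
```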

Very little info, but I'd argue that the parallel qualities of the new algorithm are at least of similar importance to the improvement in complexity.

Old speed: (N + L)^(3/2). New speed: (N + L)^(4/3). (N = nodes, L = links.)

>For a network like the Internet, which has hundreds of billions of nodes, the new algorithm could solve the max-flow problem hundreds of times faster than its predecessor.

Granted, it's massively useful for Internet-scale calculations. But it's also massively useful that this is just operations on a matrix - throw the sucker at a video card, and watch it finish thousands of times faster on cheaper hardware.

edit: oh, and:

  100,000,000^(3/2) / 100,000,000^(4/3) ≈ 21.5
  1,000,000,000^(3/2) / 1,000,000,000^(4/3) ≈ 31.6
  1,000,000,000,000^(3/2) / 1,000,000,000,000^(4/3) ≈ 100
Anyone know the scale of N+L for the internet? I somewhat doubt it's in the trillions. Which is not in any way to suggest there are not significantly larger applications - just nitpicking at journalism's grasp of numbers.
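The ratios above are just (N+L)^(3/2) / (N+L)^(4/3) = (N+L)^(1/6), easy to check:

```python
# Speedup ratio from the exponent gap: (3/2) - (4/3) = 1/6.
for m in (1e8, 1e9, 1e12):
    print(f"{m:.0e}: {m ** (1 / 6):.1f}")
```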

> throw the sucker at a video card, and watch it finish thousands of times faster on cheaper hardware

This is nonsense even for dense operations. But this matrix is sparse, in which case GPUs are within a modest factor (2-5 or so depending on the matrix, whether you use multiple cores on the CPU, and whether you ask NVidia or Intel). And if the algorithm does not expose a lot of concurrency (as with most multiplicative algorithms), the GPU can be much slower than a CPU.

The right answer is in the middle. Of course the speedup depends on what you are comparing, but if you benchmark a GPU against a decent four-core CPU, the speedup is about an order of magnitude.

For dense matrices it is currently about eight times (http://forums.nvidia.com/index.php?showtopic=172176) on CUDA 3.1. That is before CUDA 3.2, which claims a 50-300% performance increase.

For sparse matrices the speedup against a multi-core CPU is about ten times (http://forums.nvidia.com/index.php?showtopic=83825).

All of this is available as public libraries (http://developer.nvidia.com/object/cuda_3_2_toolkit_rc.html).

jedbrown, please provide a source for your estimates. Of course I'm most interested in highly optimized libraries like BLAS; hand-written code would be several times slower on both systems.

"On the limits of GPU Acceleration" (short summary from Richard Vuduc): http://vuduc.org/pubs/vuduc2010-hotpar-cpu-v-gpu.pdf

"Understanding the design trade-offs among current multicore systems for numerical computations" (somewhat more technical): http://dx.doi.org/10.1109/IPDPS.2009.5161055

"Debunking the 100X GPU vs. CPU myth: an evaluation of throughput computing on CPU and GPU" (from Intel, but using the best published GPU implementations): http://doi.acm.org/10.1145/1815961.1816021

BTW, you may as well cite CUSP (http://code.google.com/p/cusp-library/) for the sparse implementation, it's not part of CUDA despite being developed by Nathan Bell and Michael Garland (NVidia employees).

Really great comment, thank you.

This sounds cool, but how is this not a solution that's already been considered? It seems intuitively obvious to me that you can use a circuit simulator to solve this problem pretty quickly by assigning conductances to the edges of the graph. Modern circuit simulators are pretty good at converging on a solution in the presence of strong nonlinearities, so in principle you could even have edges that are more than simple conductances.

(I have a CS degree, but I'm a circuit designer, so it's been a while since I've thought seriously about problems like this and I'm clearly not in touch with the state of the art. This question isn't snark, it's really a question.)

> This sounds cool, but how is this not a solution that's already been considered? … I have a CS degree

Well… did you think of it?

Fair enough :)

(In my defense, I've been solving other problems instead...)

The theme with these MIT announcements seems to be HCA: Hope, Certainty, and Assuredness by analogy to FUD and with similar use of misrepresentation, simplification, and appeal to authority.

Actually, another case: Vladimir Yaroslavskiy recently came up with a dual-pivot quicksort method where the number of swaps is reduced from 1·n·ln(n) to 0.8·n·ln(n). Yes, he's only twiddling the constant factor, but the benchmarks have been promising, with some sorts being reduced to 25-50% of their previous running time. Josh Bloch, one of the shepherds of Java, took the time to implement it as the default sorting algorithm in Java 7 (whenever that gets released). More details at: http://permalink.gmane.org/gmane.comp.java.openjdk.core-libs...

How Python's timsort stacks up, I don't know offhand. But that's a different beast, not an improvement to a fundamental algorithm.
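For the curious, here's a simplified dual-pivot partition sketch (my own, not Yaroslavskiy's tuned implementation, which adds pivot sampling and other engineering): pick two pivots p <= q and partition into three regions.

```python
def dual_pivot_quicksort(a, lo=0, hi=None):
    """Simplified dual-pivot quicksort: partition around pivots p <= q
    into (< p), (p..q), and (> q) regions, then recurse on all three."""
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return a
    if a[lo] > a[hi]:
        a[lo], a[hi] = a[hi], a[lo]
    p, q = a[lo], a[hi]
    lt, gt, i = lo + 1, hi - 1, lo + 1
    while i <= gt:
        if a[i] < p:                 # belongs left of p
            a[i], a[lt] = a[lt], a[i]
            lt += 1
            i += 1
        elif a[i] > q:               # belongs right of q
            a[i], a[gt] = a[gt], a[i]
            gt -= 1                  # don't advance i: swapped-in value unseen
        else:
            i += 1
    lt -= 1; gt += 1
    a[lo], a[lt] = a[lt], a[lo]      # move pivots into place
    a[hi], a[gt] = a[gt], a[hi]
    dual_pivot_quicksort(a, lo, lt - 1)
    dual_pivot_quicksort(a, lt + 1, gt - 1)
    dual_pivot_quicksort(a, gt + 1, hi)
    return a

print(dual_pivot_quicksort([5, 3, 8, 1, 9, 2, 7]))
```

The win in the real version comes from fewer swaps and better cache behavior, not from asymptotics.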

Here's a good background paper by Spielman that seems to lay the foundation for the methods used: http://www.cs.yale.edu/homes/spielman/PAPERS/icm10post.pdf

Although not the actual result itself. But looks like a really good read.

It's an approximate algorithm (nothing wrong with that).

I'm confused at how they claim O(m^{4/3}) (m is the number of edges) when even initializing an explicit n by n matrix is O(n^2), unless they assume that the graph isn't sparse (m>=k*n^1.5).

Or perhaps the matrix is sparsely represented.

On page 11 of the paper I reference above Spielman makes the following statement:

"The most obvious way that sparsification can accelerate the solution of linear equations is by replacing the problem of solving systems in dense matrices by the problem of solving systems in sparse matrices. Recall that the Conjugate Gradient, used as a direct solver, can solve systems in n-dimensional matrices with m non-zero entries in time O(mn). So, if we could find a graph H with O(n) nonzero entries that was even a 1-approximation of G, then we could quickly solve systems in LG by using the Preconditioned Conjugate Gradient with LH as the preconditioner, and solving the systems in LH by the Conjugate Gradient. Each solve in H would then take time O(n^2), and the number of iterations of the PCG required to get an ε-accurate solution would be O(log ε^{-1}). So, the total complexity would be O((m + n^2) log ε^{-1})."

I'm not really sure how you get O(n) 1-approximations of G (or even really what a 1-approximation is yet, as I haven't really read the paper), but that seems to be the idea, I think.
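For context, here's a plain (unpreconditioned) Conjugate Gradient sketch on a Laplacian system of the kind the quote builds on; the path graph and current-injection vector are made up for illustration, and one node is "grounded" (removed) to make the matrix positive definite.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Textbook CG for a symmetric positive definite system A x = b."""
    x = np.zeros_like(b)
    r = b - A @ x              # residual
    p = r.copy()               # search direction
    rs = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p   # conjugate direction update
        rs = rs_new
    return x

# Path graph 0-1-2-3 with unit conductances; ground node 3.
L = np.array([[1., -1, 0, 0], [-1, 2, -1, 0], [0, -1, 2, -1], [0, 0, -1, 1]])
Lg = L[:3, :3]                      # grounded Laplacian: SPD
b = np.array([1., 0, 0])            # one unit of current from node 0 to node 3
x = conjugate_gradient(Lg, b)
print(np.round(x, 6))               # potentials of nodes 0..2 relative to 3
```

Preconditioning with a sparsifier H, as in the quote, would mean solving systems in LH inside each iteration to cut the iteration count.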

A 1-approximation is an approximation within a factor of 1. Basically that means that the solution is nearly exact. A 2-approximation would be no more than twice as bad as the optimal solution, and an n-approximation would be no more than n times as bad as the optimal solution.

This is a common thing to use in randomized or approximate algorithms. For example, solving min-cut by assigning to 2 sets randomly is provably a 2-approximation.

I believe you mean Max-Cut can be 2-approximated with the trivial random algorithm. Min-cut can be solved exactly, and that algorithm would give an expected n^2 approximation factor (for min-cut).
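The corrected claim is easy to demo: a uniformly random assignment cuts each edge with probability 1/2, so its expected cut is m/2 >= OPT/2, i.e. a 2-approximation for Max-Cut. Quick simulation (my own toy example):

```python
import random

def random_cut(n, edges, rng):
    """Assign each vertex to a side uniformly at random; count cut edges."""
    side = [rng.random() < 0.5 for _ in range(n)]
    return sum(1 for u, v in edges if side[u] != side[v])

rng = random.Random(0)
# Complete graph on 6 vertices: 15 edges, so the expected cut is 7.5
# (while the true max cut, a balanced 3-3 split, cuts 9 edges).
edges = [(u, v) for u in range(6) for v in range(u + 1, 6)]
trials = 10000
avg = sum(random_cut(6, edges, rng) for _ in range(trials)) / trials
print(round(avg, 1))  # close to 7.5
```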

Anyone have a link to the actual paper(s)?

I looked around, but the only links I could find appear to be selected textbook chapters that describe old ways of doing it. Even Dr. Dobbs just reprinted the MIT press release. So far, the only useful link I can find is to the scheduled talk and someone already posted that.

If anyone can find anything more than this link, it would be interesting to me, too: http://www.csail.mit.edu/events/eventcalendar/calendar.php?s...

EDIT: Incidentally that page does tell us a bit more about the solution than the press release everyone is reprinting. Specifically, it says: "We compute these flows by treating our graph as a network of resistors and solving a sequence of electrical flow problems with varying resistances on the edges. Each of these may be reduced to the solution of a system of linear equations in a Laplacian matrix, which can be solved in nearly-linear time."

As far as the article goes, this new solution could very well be just a different version of the Simplex method, which can also solve this kind of problem.

As some already said: better article needed.

Couldn't this problem be simplified by considering the nodes and paths to be a resistor network, and solving for the effective resistance between the source and drain nodes?
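Effective resistance itself is cheap to get from the Laplacian pseudoinverse via R_eff(s, t) = (e_s - e_t)^T L^+ (e_s - e_t), though note it answers a different question than max flow. Toy check (made-up graph):

```python
import numpy as np

# Two unit resistors in series (0-2-1, total 2 ohms) in parallel with a
# direct unit resistor 0-1: expected effective resistance 2*1/(2+1) = 2/3.
edges = [(0, 2), (2, 1), (0, 1)]
n = 3
L = np.zeros((n, n))
for u, v in edges:
    L[u, u] += 1; L[v, v] += 1
    L[u, v] -= 1; L[v, u] -= 1

Lp = np.linalg.pinv(L)
e = np.zeros(n); e[0], e[1] = 1, -1
r_eff = e @ Lp @ e
print(round(r_eff, 4))  # 0.6667
```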

How about the processing overhead of converting the complete graph to a matrix in the first place? Wouldn't this offset (some of) the gains of the new algorithm?

