
New algorithm can dramatically streamline solutions to the ‘max flow’ problem - nkvl
http://web.mit.edu/newsoffice/2013/new-algorithm-can-dramatically-streamline-solutions-to-the-max-flow-problem-0107.html
======
Shish2k
For those (like me) who clicked because algorithms are fun but don't know the
max flow problem specifically:
[http://en.wikipedia.org/wiki/Maximum_flow_problem](http://en.wikipedia.org/wiki/Maximum_flow_problem)

TL;DR: how much data can you push per unit time from a source to a sink
through a mesh of many small, capacity-limited pipes?

As someone who knows little about this problem or the laws of physics, I
wonder if it could be solved using physics -- set up a series of physical
pipes, pump water from A to B, measure the flow at each point? If that would
work, how complicated would the mesh need to be before solving it with physics
becomes faster than brute-force computation?

~~~
seabee
Some people, when confronted with a problem, think "I know, I'll use a
physical model." Now they have two problems.

(The new problem is dealing with fluid dynamics.)

Water-based computation has already been done of course:
[http://en.wikipedia.org/wiki/MONIAC_Computer](http://en.wikipedia.org/wiki/MONIAC_Computer)

~~~
pmahoney
The Soap Film: An Analogue Computer

[http://www.americanscientist.org/issues/feature/the-soap-
fil...](http://www.americanscientist.org/issues/feature/the-soap-film-an-
analogue-computer-)

------
jamra
Could it be?

[http://arxiv.org/abs/1304.2338](http://arxiv.org/abs/1304.2338)

~~~
X4
Yes. Downloaded that too. Here is more:
[http://math.mit.edu/~kelner/publications.html](http://math.mit.edu/~kelner/publications.html)

I can't wait to see the C99 code for that! _(maybe I should try that myself)_

~~~
carterschonwald
I'm kinda hoping to have some time to hack out a Haskell implementation of the
SDD solver next month (not sure if I'll have the time then, but I hope to).

~~~
X4
That's cool! I've not seen an SDD solver used anywhere yet. What do you intend
to use it for, or could you explain how it could be used?

~~~
carterschonwald
Excellent question! Basically you could use the SDD solver
[http://math.mit.edu/~kelner/Publications/Docs/1301.6628v1.pd...](http://math.mit.edu/~kelner/Publications/Docs/1301.6628v1.pdf)
in any context where you'd use an SVD (pseudo-inverse / least-squares) solver
and your matrix is symmetric and diagonally dominant.

The key bit, however, is that the SDD solver above has really nice asymptotics
for sparse matrices.

There are a few fancy algorithms that need to be engineered along the way, and
there's also the question of how the constant factors work out in practice!
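
To make "symmetric and diagonally dominant" concrete, here is a minimal
Python/SciPy sketch (my own illustration, not the Haskell implementation
mentioned above): it checks the SDD property and solves the system with a
generic conjugate-gradient call. The point of the specialized solvers in the
paper is near-linear time in the number of nonzeros, which generic iterative
solvers don't guarantee.

    # Minimal sketch (NumPy/SciPy), not the solver from the paper: check that a
    # sparse matrix is symmetric diagonally dominant (SDD) and solve A x = b
    # with a generic conjugate-gradient call.
    import numpy as np
    from scipy.sparse import csr_matrix
    from scipy.sparse.linalg import cg

    def is_sdd(A):
        """Symmetric, and |A_ii| >= sum of |A_ij| over j != i, for every row."""
        A = csr_matrix(A)
        if (A != A.T).nnz != 0:
            return False
        diag = np.abs(A.diagonal())
        off_diag = abs(A).sum(axis=1).A1 - diag
        return bool(np.all(diag >= off_diag))

    # Graph Laplacians are the canonical SDD matrices; use a "grounded"
    # path-graph Laplacian (small diagonal shift) so the system is nonsingular.
    n = 5
    L = np.diag([1.0, 2.0, 2.0, 2.0, 1.0])
    for i in range(n - 1):
        L[i, i + 1] = L[i + 1, i] = -1.0
    A = csr_matrix(L + 0.5 * np.eye(n))

    b = np.ones(n)
    assert is_sdd(A)
    x, info = cg(A, b)   # info == 0 means the iteration converged
    print(x)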

------
erex78
An Almost-Linear-Time Algorithm for Approximate Max Flow in Undirected Graphs,
and its Multicommodity Generalizations:

[http://arxiv.org/pdf/1304.2338v2.pdf](http://arxiv.org/pdf/1304.2338v2.pdf)

~~~
Rizz
The paper also mentions someone else who recently found a near-linear-time
approximation for maximum flow:
[http://arxiv.org/pdf/1304.2077v1](http://arxiv.org/pdf/1304.2077v1)

------
yread
The abstract talks about an ϵ-approximation for m edges in O(m^(1+o(1)) · ϵ^(-2)).
If this algorithm is practical, that would be quite a speed-up!

------
murbard2
This could be quite useful for Ripple, which is basically solving a max-flow
min-cost problem over a credit lines graph.

~~~
jplewicke
Ripple is actually a generalized network flow problem, in which the flow
through an edge can be multiplied by an exchange rate when converting between
currencies. As a result, regular network flow algorithms don't apply. There's
also another issue with using a minimum cost criterion, since the costs are in
different currencies and it's not clear how to put them on the same footing.

~~~
murbard2
If you make the cost the log of the conversion rate plus the log of the
percentage fee, and split each node into a distinct node for each currency,
you do get max-flow min-cost.
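
A hypothetical sketch of that reduction (my own illustration; the node names
and numbers are made up, and this is not Ripple's actual data model): taking
-log turns multiplicative exchange rates and fees into additive edge costs,
and splitting each participant into one node per currency keeps the
conversions on the edges.

    # Made-up example of the log-cost reduction described above.
    import math

    def edge_cost(rate, fee_fraction):
        # Sending x units across the edge yields x * rate * (1 - fee_fraction),
        # so minimizing the sum of these costs maximizes the product of conversions.
        return -math.log(rate * (1.0 - fee_fraction))

    # Node splitting: ("alice", "USD") and ("alice", "EUR") are distinct nodes.
    graph = {
        ("alice", "USD"): [(("alice", "EUR"), edge_cost(0.92, 0.002))],   # FX conversion
        ("alice", "EUR"): [(("bob", "EUR"), edge_cost(1.0, 0.0))],        # same-currency credit line
    }

    # A path's cost is the sum of its edge costs, and exp(-total) recovers the
    # overall conversion factor, so a shortest path is the best route and a
    # min-cost flow over this expanded graph also respects credit-line limits.
    total = edge_cost(0.92, 0.002) + edge_cost(1.0, 0.0)
    print(math.exp(-total))   # ~0.918 EUR received per USD sent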

------
mrdmnd
Kelner was my advisor as an undergraduate - I loved listening to him talk
about graph problems. He always seemed to have a novel way of looking at
something like this. I'm glad to see him get some excellent new results!

------
piqufoh
This sounds really cool. Are there any links to the paper / presentation
anywhere? I would love to learn more.

~~~
stonesixone
Kelner's web site has the link (and confirms that it corresponds to what will
be discussed at the ACM-SIAM symposium):

[http://math.mit.edu/~kelner/publications.html](http://math.mit.edu/~kelner/publications.html)

[http://math.mit.edu/~kelner/Publications/Docs/1304.2338v2.pd...](http://math.mit.edu/~kelner/Publications/Docs/1304.2338v2.pdf)

------
igravious
Was the theoretical lower bound of this `max-flow' algo known? Just curious.

Also, is there a class of algorithms where the current practical state of the
art differs greatly from the theoretical one? Or how would I go about finding
out the answer to this question? Much obliged.

I kind of thought a lot of these routing problems were worked out or known not
to be work-out-able. Just shows what I know :/ *sigh*

------
DannyBee
I'm always wary of these kinds of research papers because they are often not
using comparative benchmarks sanely.

However, that said, there are all kinds of interesting max flow problems, so
hopefully it pans out in practice. Heck, even some profile-based compiler
optimizations can be formulated as max-flow problems.

~~~
ahomescu1
These papers improve time complexity, not actual running time. I'm not sure it
ever makes sense to do comparative benchmarking, since you can't test on
arbitrarily large inputs (which is what matters when looking at algorithmic
complexity). For smaller inputs (say, N between 1,000 and 1 million), constant
factors impact running time significantly. To compensate, you'd have to run a
benchmark of size 10^10,000,000.

EDIT: For example, let's say the current best algorithm for some problem takes
O(N^2) time with a small constant (say 100 * N^2 operations). Then someone
comes up with an O(N^1.9) algorithm, but with a larger constant (say
1,000,000 * N^1.9 operations). The newer algorithm will in practice be slower
for any reasonably-sized input you can throw at it, even though it's a major
theoretical breakthrough.
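
Plugging in the numbers from that example, a couple of lines of Python show
where the crossover would be:

    # The example's own numbers: 100 * N^2 vs 1,000,000 * N^1.9. The new
    # algorithm only wins once 1e6 * N**1.9 < 100 * N**2, i.e. N**0.1 > 1e4,
    # i.e. N > 1e40 -- far beyond any input anyone could ever run.
    crossover = (1_000_000 / 100) ** (1 / 0.1)
    print(f"{crossover:.1e}")   # 1.0e+40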

~~~
Rizz
As you are talking about reasonably sized input, your constant examples should
also be reasonable. A 10,000x difference in the constant is not reasonable;
you wouldn't even get that from going from all data in the L1 cache to random
memory access with 10 times more instructions.

And "these papers" often do care about running time and will say whether it is
possible to implement the algorithm efficiently, if you're lucky they'll even
have done an implementation and you can see at what dataset sizes they are
faster in practice.

And even if it were slower in practice, research like this is very useful
because it gives other researchers inspiration: ever since someone thought of
looking at these problems as electrical flows in 2010, there have been ever
faster approximations for them, and others will find ways to implement the
algorithm efficiently in case a naive implementation of the offered one isn't
efficient enough.

Dismissing such research without having found any actual extreme
inefficiencies in the proposed algorithm is not an attitude suited to anyone
who wants to see progress.

And finally, the actual speed increase over the past decades hasn't been from
O(N^2) to O(N^1.9) but from O(VE^2) to O(VE), which obviously does have a real
impact; even with an unrealistic 10,000x bigger constant it would still be far
faster on any graph with more than 10,000 edges. And graphs with billions of
edges aren't unusual anymore.

~~~
ahomescu1
> And even if it were slower in practice, research like this is very useful
> because it gives other researchers inspiration: ever since someone thought of
> looking at these problems as electrical flows in 2010, there have been ever
> faster approximations for them, and others will find ways to implement the
> algorithm efficiently in case a naive implementation of the offered one isn't
> efficient enough.

> Dismissing such research without having found any actual extreme
> inefficiencies in the proposed algorithm is not an attitude suited to anyone
> who wants to see progress.

I never dismissed this research or said it's not useful; on the contrary, it's
very interesting. I was just arguing against the usefulness of benchmarking
results for such algorithms, when the main improvement is the lower
complexity.

EDIT:

> And finally, the actual speed increase over the past decades hasn't been from
> O(N^2) to O(N^1.9) but from O(VE^2) to O(VE), which obviously does have a
> real impact; even with an unrealistic 10,000x bigger constant it would still
> be far faster on any graph with more than 10,000 edges. And graphs with
> billions of edges aren't unusual anymore.

One concrete example I had in mind is Strassen's algorithm, which reduces
matrix multiplication from O(N^3) to O(N^(log2 7)) ≈ O(N^2.807). It doesn't
seem to be worth using Strassen over the simple algorithm for relatively small
matrices.
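
As a rough illustration of why (my own sketch, not tuned code), here is
Strassen's recursion with a cutoff below which it falls back to the naive
product; for matrices smaller than the cutoff, the extra additions and
recursion overhead outweigh the saved multiplications:

    # Sketch of Strassen's recursion (square matrices with power-of-two size),
    # falling back to the ordinary product below a cutoff.
    import numpy as np

    def strassen(A, B, cutoff=64):
        n = A.shape[0]
        if n <= cutoff:
            return A @ B                      # naive O(n^3) product
        h = n // 2
        A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
        B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
        # Seven recursive multiplications instead of eight -> O(n^log2(7)).
        M1 = strassen(A11 + A22, B11 + B22, cutoff)
        M2 = strassen(A21 + A22, B11, cutoff)
        M3 = strassen(A11, B12 - B22, cutoff)
        M4 = strassen(A22, B21 - B11, cutoff)
        M5 = strassen(A11 + A12, B22, cutoff)
        M6 = strassen(A21 - A11, B11 + B12, cutoff)
        M7 = strassen(A12 - A22, B21 + B22, cutoff)
        C = np.empty_like(A)
        C[:h, :h] = M1 + M4 - M5 + M7
        C[:h, h:] = M3 + M5
        C[h:, :h] = M2 + M4
        C[h:, h:] = M1 - M2 + M3 + M6
        return C

    A = np.random.rand(256, 256)
    B = np.random.rand(256, 256)
    assert np.allclose(strassen(A, B), A @ B)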

There are also algorithms and data structures that offer O(N log log N) time
where the previous algorithm needed O(N log N) to solve the same problem. You
need a very large data set to justify using the former, especially if the
latter is much simpler to implement.

> As you are talking about reasonably sized input, your constant examples
> should also be reasonable. A 10,000x difference in the constant is not
> reasonable; you wouldn't even get that from going from all data in the L1
> cache to random memory access with 10 times more instructions.

10,000 was just an example; I guess it was unrealistic. In practice, I think a
100x slowdown from an algorithm with bad cache behavior makes some sense. [1]
says that an L3 cache hit can have a latency of 100-300 cycles, as opposed to
an L1 hit that takes only 4 cycles. That's almost a 100x difference, so
theoretically a cache-friendly algorithm can be that much faster (and that's
just the difference between L3 and L1; main memory takes even longer to
access).

1 - [http://stackoverflow.com/questions/4087280/approximate-
cost-...](http://stackoverflow.com/questions/4087280/approximate-cost-to-
access-various-caches-and-main-memory)

------
ckaygusu
I just skimmed through the paper thinking maybe I could pick up a thing or
two. What I found there is complete black magic.

~~~
GarvielLoken
Your method of learning is flawed. The specific lesson to be learned here is
that they changed their view of the solution to a max-flow problem from a
serial probing algorithm to a parallel electrical-probing solution, and
lowered the time complexity. The general lesson is that they changed their
view of the solution to a massively parallel multiple-choice problem from a
serial probing algorithm to a parallel probing solution, and lowered the time
complexity.

The actual implementation is uninteresting and will always vary according to
domain and optimization, but the intuition is a valid tool that we now have.
And in hindsight this was a really obvious solution, because one should try
all the routes and identify the bottlenecks.

Food for thought: single CPU core vs. GPGPU.

------
abc_lisper
It looks like this means we could redirect traffic on highways to avoid
congestion. I know current GPS units do this already, but in the future they
could suggest different paths to different users, even though they are
travelling to approximately the same destination.

~~~
mcintyre1994
That sort of thing seems like a real possibility for self-driving cars and
drones too. Swarm navigation sounds like a really interesting area actually.

------
bidev
Just a thought: could the use of drones for transportation make this problem
less useful?

I admire the solution presented here, just saying.

~~~
cliveowen
Optimizing the distribution of goods through a network of highways is just one
of literally hundreds of applications where this could (and probably will) be
useful.

~~~
tubbzor
You are very correct. Tons of other problems reduce to max-flow/min-cut, so a
faster max-flow algorithm speeds them all up.

Here's a nice little introduction I found with some reductions on the first
couple slides:
[http://www.cs.princeton.edu/courses/archive/spr04/cos226/lec...](http://www.cs.princeton.edu/courses/archive/spr04/cos226/lectures/maxflow.4up.pdf)

I just finished my algorithms undergrad class last semester; one of the
questions on the final was 'you are a terrorist looking to cut off supplies to
your enemy, which 3 nodes must you destroy to cut off the supply network?' And
there was a complicated graph with probably 10 nodes and twice as many edges
(very difficult to brute force in the allotted time), and we had to find the
min-cut which would sever the flow. Some really cool problems reduce to max-
flow.
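
For anyone who wants to play with that kind of exam question, here is a small,
illustrative textbook max-flow in Python (Edmonds-Karp, i.e. BFS augmenting
paths; not the new algorithm from the article): once the flow is maximal, the
min cut is exactly the set of edges leaving the residual graph's
source-reachable side. The graph at the bottom is made up.

    # Textbook max-flow (Edmonds-Karp) plus min-cut extraction, for illustration.
    from collections import deque, defaultdict

    def max_flow_min_cut(capacity, source, sink):
        # capacity: dict of dicts, capacity[u][v] = capacity of edge u -> v
        residual = defaultdict(lambda: defaultdict(int))
        for u in capacity:
            for v, c in capacity[u].items():
                residual[u][v] += c

        def bfs_parents():
            parent = {source: None}
            queue = deque([source])
            while queue:
                u = queue.popleft()
                for v, c in residual[u].items():
                    if c > 0 and v not in parent:
                        parent[v] = u
                        if v == sink:
                            return parent
                        queue.append(v)
            return None   # no augmenting path left

        flow = 0
        while True:
            parent = bfs_parents()
            if parent is None:
                break
            # Bottleneck capacity along the augmenting path, then push it.
            bottleneck, v = float("inf"), sink
            while parent[v] is not None:
                bottleneck = min(bottleneck, residual[parent[v]][v])
                v = parent[v]
            v = sink
            while parent[v] is not None:
                residual[parent[v]][v] -= bottleneck
                residual[v][parent[v]] += bottleneck
                v = parent[v]
            flow += bottleneck

        # Min cut: original edges from the residual-reachable side to the rest.
        reachable, stack = {source}, [source]
        while stack:
            u = stack.pop()
            for v, c in residual[u].items():
                if c > 0 and v not in reachable:
                    reachable.add(v)
                    stack.append(v)
        cut = [(u, v) for u in capacity for v in capacity[u]
               if u in reachable and v not in reachable]
        return flow, cut

    caps = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}}
    print(max_flow_min_cut(caps, "s", "t"))   # (5, [('s', 'a'), ('s', 'b')])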

~~~
mcintyre1994
Our lecturer told us that the problem was initially considered simultaneously
by Russians trying to optimise rail transport within the USSR and Americans
trying to work out the cheapest way to destroy that network. Not sure if it's
true, but it helps me think about the problem a lot.

Edit: I clicked the link; he used those exact slides. That set of slides comes
up a lot; they're great.

~~~
tubbzor
You are right! Edmonds-Karp was developed (really just a modified
Ford-Fulkerson approach with a BFS as opposed to an arbitrary augmenting
path... but we'll give it to them), yielding an O(VE^2) algorithm. A Russian
(Soviet) mathematician named Dinic nearly simultaneously, and independently,
developed an O(EV^2) algorithm, which we can see beats Edmonds-Karp as edge
density grows.

