
Dijkstra’s algorithm and the Fibonacci heap - tosh
http://maryrosecook.com/blog/post/the-fibonacci-heap-ruins-my-life
======
thegeomaster
Please do note that, depending on the dataset size, it may be faster to just
use a plain old binary heap. In this case, the big O cost for Dijkstra's
algorithm is O((E + V) log V), as opposed to the better O(E + V log V) for a
Fibonacci heap. Note, also, that Fibonacci heap operations are amortized time,
so if you're not operating on a large enough dataset, the costs will not
amortize enough and you may end up with a slower real time.
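
For concreteness, here's a minimal sketch of that binary-heap variant using
Python's heapq (my illustration; the adjacency-list format is an assumption).
It sidesteps decrease-key by pushing duplicate entries and skipping stale ones
on extraction:

    import heapq

    def dijkstra(graph, source):
        # graph: dict mapping node -> list of (neighbor, weight) pairs
        dist = {source: 0}
        heap = [(0, source)]            # (distance, node)
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue                # stale entry; a shorter path was found
            for v, w in graph[u]:
                nd = d + w
                if nd < dist.get(v, float('inf')):
                    dist[v] = nd
                    # push a duplicate instead of decrease-key
                    heapq.heappush(heap, (nd, v))
        return dist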

A binary heap also has better locality of reference, because you have more
chances of a cache hit, especially when you're near the root. The Fibonacci
heap, in contrast, keeps a number of pointers to memory locations that are all
allocated on-the-go and possibly very far apart. If the binary heap is
implemented as a B-heap, as described by PHK [1], it can be made even faster.

My point is: please benchmark. Always benchmark. "Stupid" algorithms and data
structures may work much better for your case than you would have thought.
Programmers are notoriously bad at finding the real bottlenecks... so please
benchmark and profile. Fibonacci heaps, AVL trees, KMP, Coppersmith-Winograd,
these are all little works of art and theoretically great solutions. With big
data, they _will_ perform better than more naive solutions. But that big data
doesn't happen as often as you might think. Computers are weird, have too many
quirks, and their operation will surprise you, no matter how skilled you are.
So profile your programs.

[1]:
[http://queue.acm.org/detail.cfm?id=1814327](http://queue.acm.org/detail.cfm?id=1814327)

~~~
anaphor
Thanks, this is a pet peeve of mine. Theoretical bounds are all nice and such,
and it's cool if you can prove a better running time using some algorithm, but
if it doesn't actually perform better in practice most of the time then it's
not really worth much. I blame computer scientists who are in a rush to
publish the best theoretical solution but never bother to consider whether it
has practical impact. I also blame textbooks that blithely claim you can do
better with such and such a data structure using a certain algorithm.

See this StackOverflow question for some nice insights:
[https://stackoverflow.com/questions/504823/has-anyone-actually-implemented-a-fibonacci-heap-efficiently](https://stackoverflow.com/questions/504823/has-anyone-actually-implemented-a-fibonacci-heap-efficiently)

~~~
alco
It is worth pointing out here that Quicksort has a worst-case complexity of
Θ(n²). Yet it has proven to be the best choice in practice in many
applications.

~~~
shanusmagnus
That's true, but it's also worth pointing out that the worst case can be
trivially prevented by shuffling the array before you Quicksort it. Which is
maybe a confirmation of your larger point.
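
For illustration, a minimal Python sketch of that guard (an out-of-place
quicksort, for brevity):

    import random

    def quicksort(xs):
        # Shuffle first so no fixed input can reliably trigger the O(n^2) case.
        xs = list(xs)
        random.shuffle(xs)
        return _qsort(xs)

    def _qsort(xs):
        if len(xs) <= 1:
            return xs
        pivot = xs[0]
        less = [x for x in xs[1:] if x < pivot]
        more = [x for x in xs[1:] if x >= pivot]
        return _qsort(less) + [pivot] + _qsort(more)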

~~~
ruggeri
Strictly speaking, it is not true that shuffling first avoids the worst case.
Given a deterministic shuffling function, some permutation of input needs to
shuffle to sorted order, which will again trigger the worst case.

Of course, shuffling is still potentially helpful, because ascending or
descending order is common in many applications.

~~~
brudgers
Randomization of the pivots produces O(n log n) expected running time, with a
very low probability of any other running time for any value of n at which the
worst-case running time matters.

~~~
ruggeri
Even without randomization of pivots expected run time is O(n log n). If the
order of the input data is random, I don't believe randomizing pivots changes
the distribution of running times.

What changes is that one very common permutation of data (ascending data) does
not have O(n^2) performance.

~~~
brudgers
There's little justification for expecting data to behave randomly unless our
algorithm introduces it.

~~~
mtdewcmu
Correct. In general, real data will seldom, if ever, look like random data.

------
ruggeri
Cool article.

Dijkstra's algorithm factors into two components: (1) time spent selecting
best paths and (2) time spent updating the best currently known paths.

In a naive implementation, a best path to every unvisited vertex is kept.
Selection of a best path to a vertex v_0 is made by a linear scan of the paths
(O(V) each time). Locking in a path causes us to possibly update paths to
every vertex v_1 that is touched by an edge from v_0 (constant time per out
edge of v_0).

Overall, time complexity is O(|V|^2) selecting paths, and O(|E|) updating
paths. Since |E| is bounded by |V|(|V| - 1), the total time is O(|V|^2)
(insensitive to density).

Using a min-heap to store best known paths, we spend O(log(|V|)) time
selecting each path, and each update also takes O(log(|V|)). This means the
total time complexity is O(|V|log(|V|) + |E|log(|V|)).

In the case of dense graphs, this is worse: O(|V|^2 log(|V|)). We're trying
to keep the heap organized to allow fast extraction, but there are too many
updates and the heap is changing too much from iteration to iteration. OTOH,
if the graph is sparse, |E| is in O(|V|), so we reduce the time complexity to
O(|V|log(|V|)).

A fib heap keeps the same extract time, but updates are O(1) amortized. Thus
the time complexity is O(|V| log(|V|) + |E|). If the graph is dense, this is
O(|V|^2) (as good as naive). If the graph is sparse, this is O(|V|log(|V|))
(as good as a bin heap).
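
To tabulate (a summary of the bounds above):

                    total                     dense (|E| ~ |V|^2)    sparse (|E| ~ |V|)
    naive           O(|V|^2 + |E|)            O(|V|^2)               O(|V|^2)
    binary heap     O((|V| + |E|) log|V|)     O(|V|^2 log|V|)        O(|V| log|V|)
    fib heap        O(|V| log|V| + |E|)       O(|V|^2)               O(|V| log|V|)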

This is useful if we do not know whether the graph is sparse or dense.
However, I am not sure what the constants are on Fib heaps. If you know the
density of the graph, I certainly expect using the appropriate one of the
first two approaches is superior. Another thought: you could always
speculatively execute both algorithms and see which finishes first :P

Anyway, I hope that's not too boring of me.


------
brudgers
I've been taking Roughgarden's Algorithms: Design and Analysis on Coursera and
implementing the assignments in Racket. So in a sense, I identified with the
author's plight through my own recent and current experience.

One of the early lessons I learned is that designing functional versions of
algorithms is a whole higher level of hard over and above implementing
conventional versions. For me, getting to a functional or less imperative
version is an iteration of an imperative preliminary design. There's a stage
at which it is easier not to get bogged down with data structures as values
and instead let them be variables because it allows me to simplify the
transition to the next state as (next-state). I don't have to add additional
parameters to every function and coordinate them between functions.

I think it is easy to lose sight of the fact that functional programming is
supposed to make our lives easier by making it easier to reason about our
code. Jumping from imperative pseudo-code to an idiomatic functional
implementation is not easier for me to reason about than going from the
pseudo-code to its direct implementation. And it's easier to reason from a
direct implementation to a more functional one after I've got a feel for the
direct implementation.

Probably it's an intellectual shortcoming on my part that functional
implementations don't come readily to me when looking at imperative pseudo-
code. I just have to live with the fact that even money says I am below
average and will probably have to be naughty and use mutation sometimes.

~~~
AndyKelley
I highly recommend Tim Roughgarden. Best class I've taken on Coursera by far.

~~~
brudgers
I took Discrete Optimization, and it's pretty good, too, but I was way out of
my depth. The cool thing, though, is that it gave me anchors for a lot of the
algorithms in Roughgarden: I look at an algorithm and see an application for
it from that class.

But Roughgarden is top drawer, too.

------
colanderman
This isn't a problem; you can get around it by indirecting such "pointers"
through an (immutable) array. When you want to reference something, make sure
it's in the array and reference its location. When you want to "mutate"
something, update the array with that thing's old entry pointing to its new
value instead.

(Side note: the union-find algorithm is another interesting algorithm with the
same "problem".)

~~~
jblow
"This isn't a problem"? You're so sure?

How many nodes are there? Let's presume there are a lot, otherwise the problem
is trivial and speed doesn't matter anyway. So now you have an N-long
immutable array that you are copying every time you want to change a node
pointer? So you have changed the operation of pointer-changing from O(1) to
O(N)? What does that do to the run time of the algorithm?

Also, your garbage velocity just went WAY up. What does this do for the
runtime of your program generally?

~~~
jblow
(The language implementation can of course engage in strategies to avoid a
full array copy, that you as the user have little insight into, but these are
often questionable as they slow down run time in other cases, and anyway, they
can only mitigate this problem, which is not going to go away.)

~~~
colanderman
Uh, most good implementations of languages with immutable arrays perform O(1)
or O(log n) array updates. The two I use most, Mercury and Erlang, both do
(O(1) and O(log n) respectively).

If you, the user, have "little insight" into the runtime of the data
structures you're using, you have bigger issues. Every standard library I've
ever seen guarantees the asymptotic behavior of its hash tables, sets, linked
lists, etc. Immutable arrays are no different.

------
usamec
There is no reason to implement a Fibonacci heap for Dijkstra's. Either you
have relatively small data (up to 10 million nodes and edges) and a normal
heap is way faster, or you have bigger data and you should start thinking
about things like A*, where a Fibonacci heap is again useless.

Note that the log n factor is small (at most about 40 for the largest data
you will ever see), so it can easily be dominated by other constant factors in
the implementation (like cache locality, ...).

~~~
m3koval
Why would you not use A* from the beginning? It's a trivial extension to
Dijkstra's and is often orders of magnitude faster when an informative
heuristic is available.

Also, both algorithms require identical data structures. After all, Dijkstra's
is just A* with a zero heuristic.
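
To illustrate how small the change is, here's a sketch (mine) building on a
heapq-based Dijkstra's, assuming a consistent heuristic h and an
adjacency-list graph format: the only difference is adding the heuristic to
the priority.

    import heapq

    def a_star(graph, source, goal, h):
        # h(node) estimates the remaining distance to goal; h = lambda n: 0
        # recovers Dijkstra's. graph maps node -> list of (neighbor, weight).
        dist = {source: 0}
        heap = [(h(source), source)]    # priority = known cost + heuristic
        while heap:
            _, u = heapq.heappop(heap)
            if u == goal:
                return dist[u]
            for v, w in graph[u]:
                nd = dist[u] + w
                if nd < dist.get(v, float('inf')):
                    dist[v] = nd
                    heapq.heappush(heap, (nd + h(v), v))
        return None                     # goal unreachable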

I do agree about the constant factor, though: it's likely that a binary heap
would be faster on most data sets.

------
ambrop7
Keep in mind that functional programming is inherently less efficient than a
random access machine[1]:

> _... And it is not easy to create their equally efficient general-purpose
> immutable counterparts. For purely functional languages, the worst-case
> slowdown is logarithmic in the number of memory cells used, because mutable
> memory can be represented by a purely functional data structure with
> logarithmic access time (such as a balanced tree)._

In this case I think you could use (a pair of?) balanced trees to keep a
mapping between the graph and the Fibonacci heap, as well as for any "mutable"
state you need to keep per node.

[1]
[http://en.wikipedia.org/wiki/Functional_programming#Efficien...](http://en.wikipedia.org/wiki/Functional_programming#Efficiency_issues)

~~~
xyzzyz
Fortunately, most functional languages (including Haskell) provide mutable
random access arrays, so the slow-down is only theoretical.

~~~
acqq
But then the code is not "pure" anymore, and it isn't even different from the
non-functional version.

~~~
xyzzyz
This just proves that "pure" in this sense is a useless concept.

~~~
lvh
How so? All I'm seeing is that some speedups aren't possible. It still seems
like a useful concept, particularly since "faster" was never one of its
promises (maybe "easier to parallelize", though).

~~~
lucian1900
Interestingly, in some cases it is actually faster. Modern hardware is
peculiar in that respect.

------
lelf
And zippers are just beyond cool. You can formally take the derivative of a
data structure to find one.

List is L(a) = 1 + a L(a) (≡ Nil or Cons(a, L(a)))

L(a) = 1/(1-a)

L'(a) = 1/(1-a)^2 = L(a)L(a) (≡ Pair of L(a) and L(a))

This is our zipper. One list holds the tail and one list remembers how we got
to where we are.

This one is trivial, but for e.g. trees it will be utterly fascinating.
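
For example, running the same recipe on binary trees with labels at the nodes
(my sketch, following the standard derivation):

T(a) = 1 + a T(a)^2 (≡ Leaf or Node(a, T(a), T(a)))

Differentiating: T'(a) = T(a)^2 + 2 a T(a) T'(a)

Solving: T'(a) = T(a)^2 / (1 - 2 a T(a)) = T(a)^2 L(2 a T(a))

That is, a one-hole context is the two subtrees hanging off the hole, plus a
list of steps back to the root, where each step records a direction (the
factor of 2), a node label, and the sibling subtree.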

~~~
TheLoneWolfling
Why does 1/(1-a)^2 = L(a)L(a) ?

~~~
nilkn
I'm not familiar with this calculus of data structures, but I'd surmise it's
because L(a) = 1 + a L(a) implies that (1 - a) L(a) = 1, i.e. L(a) = 1/(1-a);
squaring both sides gives 1/(1-a)^2 = L(a)L(a).

------
kylebrown
> _Immediately, I discovered that tree structures are more complicated in
> languages like Clojure that have immutable state. This is because changing a
> node requires rebuilding large parts of the tree._

This must have implications for implementing a Merkle tree as an immutable
data structure. Anyone know how a bitcoin client in clojure might deal with
this?

~~~
Volundr
You'd use mutable state, either in the form of java collections, or atoms and
family.

Even Haskell, where the State monad itself is actually immutable, provides an
escape hatch into mutable state for when it's really, truly required. Just if
you do use it when it's not, great shame shall be visited on you and your kin.
By which I mean someone in #haskell will very politely point out how you could
have done it purely.

~~~
kylebrown
Right, using a mutable structure is the easy way. But is there a way that
would maintain the advantages of immutability (e.g. memory efficiency and easy
undo)?

~~~
taeric
Are those truly the advantages of immutability?

Easy undo can be achieved just as easily by reversing operations, in many
cases. No need to keep two copies of a potentially large structure when you
could just keep the diff and reverse apply it. Oddly, I think typically you
would do both. Keep a few large snapshots with small diffs between stages.
That is digressing, though.

As for memory-efficiency... not sure how keeping many potentially large copies
is more efficient than just modifying a single copy.

Now, I can agree that in many cases it is nice that it prevents you from
worrying about race conditions across threads. Though, I'm also not convinced
that immutable things are any better than traditional locking strategies. Is
that race truly won?

~~~
dbaupp
One doesn't need to keep large copies, since immutability allows common
substructures to be shared. (That said, it still seems rather unlikely that an
immutable implementation will use less memory.)
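
A minimal Python sketch of that path copying (names are mine): replacing one
node copies only the O(log n) nodes on the search path, and everything off the
path is shared.

    class Node:
        def __init__(self, key, value, left=None, right=None):
            self.key, self.value = key, value
            self.left, self.right = left, right

    def insert(node, key, value):
        # Returns a new tree; copies only the nodes on the search path.
        if node is None:
            return Node(key, value)
        if key < node.key:
            return Node(node.key, node.value,
                        insert(node.left, key, value), node.right)
        if key > node.key:
            return Node(node.key, node.value,
                        node.left, insert(node.right, key, value))
        return Node(key, value, node.left, node.right)  # replace the value

    t1 = insert(insert(insert(None, 2, 'b'), 1, 'a'), 3, 'c')
    t2 = insert(t1, 3, 'C')          # t1 is unchanged
    assert t1.right.value == 'c' and t2.right.value == 'C'
    assert t1.left is t2.left        # the untouched subtree is shared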

~~~
Rapzid
That's not a property of immutability. That's a property of certain
specialized data structures, as implemented in Clojure; see "Purely Functional
Data Structures" for a start.

~~~
chowells
Well, it's a property of basically everything except arrays. So it's more than
just "certain specialized data structures".

~~~
taeric
To be fair, many folks learn immutable data structures via the so-called
"persistent" data structures.

Which are cool, to be sure. They are far from new, however. So, I'll be a
little surprised if it turns out they are a panacea.

------
sujayakar
In the opposite direction of pragmatically avoiding "fancy" data-structures,
you can find a description of a persistent min-heap via Brodal queues in Chris
Okasaki's wonderful book _Purely Functional Data Structures_. Otherwise, it
doesn't make too much sense to use the Fibonacci heap algorithm without
mutable update.

~~~
hvidgaard
Speaking of, I had Mr. Brodal in my algo class. That man thinks in algorithms.
He can make it seem so intuitive and easy to understand when he's explaining
it at the blackboard. That lasts until you're trying to implement it and get
bogged down by the details. I miss the algorithm classes.

~~~
sujayakar
That's so cool! Any fun stories or obscure algorithms that came up?

------
icarus127
You may be interested in this paper:
[http://www.cs.ox.ac.uk/ralf.hinze/publications/ICFP01.pdf](http://www.cs.ox.ac.uk/ralf.hinze/publications/ICFP01.pdf).
It implements Dijkstra's algorithm as an example application of a priority
search queue data structure, which is very similar to a priority search tree.
These structures act as a heap on one index and a search tree on a second
index, so you get both efficient update and efficient find-min.
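
As a crude Python stand-in for that interface (mine, not the paper's actual
structure): a dict plays the search-tree role (lookup and update by key) and a
lazily pruned heap provides find-min.

    import heapq

    class PSQ:
        def __init__(self):
            self.prio = {}       # key -> current priority (the "search" index)
            self.heap = []       # (priority, key) pairs, possibly stale

        def insert(self, key, p):
            # Also serves as update/decrease-key: record and re-push.
            self.prio[key] = p
            heapq.heappush(self.heap, (p, key))

        def find_min(self):
            # Discard entries whose priority has since changed.
            while self.heap and self.prio.get(self.heap[0][1]) != self.heap[0][0]:
                heapq.heappop(self.heap)
            return self.heap[0] if self.heap else None

        def delete_min(self):
            m = self.find_min()
            if m is not None:
                heapq.heappop(self.heap)
                del self.prio[m[1]]
            return m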

~~~
fbrusch
I think your link is the perfect answer to the problem highlighted in the post
(which is, incidentally, a problem I was facing myself, so thank you!).
There's also a Haskell implementation of priority search queues:
[http://hackage.haskell.org/package/PSQueue-1.1/docs/Data-PSQueue.html](http://hackage.haskell.org/package/PSQueue-1.1/docs/Data-PSQueue.html)

------
pcvarmint
Heaps are the most elegant data structure, IMO.

So many ways to prioritize. So many ways to support the same basic operations,
with different tradeoffs.

My semi-outdated site:
[http://leekillough.com/heaps/](http://leekillough.com/heaps/)

~~~
amitp
I love that page! I link to it from my own semi-outdated page:
[http://theory.stanford.edu/~amitp/GameProgramming/ImplementationNotes.html#data-structure-comparison](http://theory.stanford.edu/~amitp/GameProgramming/ImplementationNotes.html#data-structure-comparison)

~~~
shanusmagnus
I in turn love your page -- so useful how you show all the data structures
working together to enable a larger task. Bookmarked.

------
gelisam
How about adding new elements with lower keys instead of decreasing the
existing elements? Then, once more than half of the n elements in the heap are
stale duplicates, we could spend O(n log n) operations cleaning up the heap by
recreating it.
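
A Python sketch of that bookkeeping (names are mine; items are assumed
hashable, and comparable for tie-breaking):

    import heapq

    class RebuildingHeap:
        def __init__(self):
            self.heap = []     # (key, item) pairs, possibly stale
            self.keys = {}     # item -> its current (lowest) key
            self.stale = 0     # count of superseded entries still in the heap

        def decrease_key(self, item, key):
            # Also serves as insert: push a fresh entry, never touch old ones.
            if item in self.keys:
                self.stale += 1            # the old entry is now a duplicate
            self.keys[item] = key
            heapq.heappush(self.heap, (key, item))
            if self.stale > len(self.keys):
                # Duplicates are the majority: O(n log n) rebuild from live entries.
                self.heap = sorted((k, i) for i, k in self.keys.items())
                self.stale = 0

        def pop_min(self):
            while self.heap:
                key, item = heapq.heappop(self.heap)
                if self.keys.get(item) == key:     # live entry
                    del self.keys[item]
                    return key, item
                self.stale -= 1                    # discard a stale duplicate
            return None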

------
pilooch
Fibonacci heaps are very interesting beasts, but their implementation is
slightly complicated. For C++ developers there's one in Boost, but it is not
so easy to use quickly.

I've recently put an alternative one in C++11 here:
[https://github.com/beniz/fiboheap](https://github.com/beniz/fiboheap)

------
madaxe_again
Ah, I went on this journey a decade or so ago when building a social network -
"how are we connected" type feature, and tried to go down the same route with
a Fibonacci heap - and eventually threw in the towel and implemented A*, then
D* a while later when scale started to be an issue.

------
jmc734
Why can't the Fibonacci heap node store a pointer to the graph node, and, when
it is updated, use that pointer to store the new pointer to the heap node in
the graph node?

------
candu
Someone else mentioned this below, but it deserves repeating: Okasaki's Purely
Functional Data Structures is definitely worth reading.

------
Kiro
Why was the title changed? It was originally the same as the post's: "The
Fibonacci heap ruins my life".

~~~
tzar
From the guidelines: "Otherwise please use the original title, unless it is
misleading or linkbait." I think the original title qualifies as at least one
of those.

