
The Most Important Algorithms - iamanet
http://www.risc.jku.at/people/ckoutsch/stuff/e_algorithms.html
======
dododo
there's quite a lot of structure amongst these algorithms that the list misses out on:

\- dynamic programming comes from solving Bellman's equation and Q-learning is
an approximate means of doing dynamic programming.

\- it's possible to do beam A* search, of which beam search, A* search, and
best-first (greedy) search are all special cases.

\- in the continuous domain, greedy search is essentially gradient ascent.

\- you can use finite differences (discrete differentiation) to estimate the
gradients for gradient ascent (actually, i would replace the first-order method
suggestion on this list with Newton-Raphson or a quasi-Newton method like BFGS).

\- EM ties some bits together.

\- if you replace the max of the Viterbi algorithm with a summation, you get
the sum-product algorithm, which is essential in the E step of EM. if you
use Viterbi instead of sum-product, you get something known as zero-temperature
EM, which is an approximate form of EM (sketch of the max/sum swap at the end
of this comment).

\- the M step of EM is typically just gradient ascent!

\- you can use max flow to do a similar thing to viterbi on certain graphs

and so on...
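
a minimal sketch of that max/sum swap on a toy 2-state HMM (all the numbers
below are made up): the same one-line recursion gives viterbi scores with
np.max and forward/sum-product probabilities with np.sum.

    import numpy as np

    def chain_pass(init, trans, emit, obs, combine):
        # combine = np.max gives viterbi; combine = np.sum gives the
        # forward pass (sum-product on a chain)
        alpha = init * emit[:, obs[0]]
        for o in obs[1:]:
            alpha = combine(alpha[:, None] * trans, axis=0) * emit[:, o]
        return alpha

    init = np.array([0.6, 0.4])                 # made-up initial distribution
    trans = np.array([[0.7, 0.3], [0.4, 0.6]])  # made-up transition matrix
    emit = np.array([[0.9, 0.1], [0.2, 0.8]])   # made-up emission matrix
    obs = [0, 1, 1]

    print(chain_pass(init, trans, emit, obs, np.max))  # best-path scores
    print(chain_pass(init, trans, emit, obs, np.sum))  # marginal probabilities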

~~~
jules
I think the dynamic programming meant here is more general. They mean saving
results of a recursively formulated algorithm in a table so that you don't
have to recompute them. This can be used for example for finding longest
common subsequences, but it is much more general.
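
For example, a minimal memoized sketch of longest common subsequence, assuming
that's the kind of thing meant (the cache plays the role of the table):

    from functools import lru_cache

    def lcs(a, b):
        # save results of the recursive formulation so nothing is recomputed
        @lru_cache(maxsize=None)
        def go(i, j):
            if i == len(a) or j == len(b):
                return 0
            if a[i] == b[j]:
                return 1 + go(i + 1, j + 1)
            return max(go(i + 1, j), go(i, j + 1))
        return go(0, 0)

    print(lcs("ABCBDAB", "BDCABA"))  # 4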

Can you elaborate on the general search procedure?

Is there a good resource that ties algorithms together like you've done here,
but in more depth?

~~~
dododo
they do mean dynamic programming in a more general sense: but dynamic
programming was originally invented to solve Bellman's equation. it turns out
many other problems have a similar structure, which is quite surprising! before
Bellman's seminal work in the 1940s, it wasn't known how to solve these
problems efficiently. indeed, you can often use something like Q-learning to
find approximate solutions to dynamic programming problems even faster.

the general search procedure is just A* search, but with beam-search pruning
each time you expand the fringe of exploration (rough sketch below).
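
a rough sketch in python (interfaces are made up: neighbors(node) yields
(next, cost) pairs). with a huge beam this is plain A*; a finite beam prunes
the fringe after every expansion.

    import heapq

    def beam_a_star(start, goal, neighbors, h, beam=3):
        # fringe entries: (g + h(node), g, node, path so far)
        fringe = [(h(start), 0, start, [start])]
        visited = set()
        while fringe:
            _, g, node, path = heapq.heappop(fringe)
            if node == goal:
                return path
            if node in visited:
                continue
            visited.add(node)
            for nxt, cost in neighbors(node):
                if nxt not in visited:
                    heapq.heappush(fringe,
                                   (g + cost + h(nxt), g + cost, nxt, path + [nxt]))
            # the beam part: keep only the `beam` most promising fringe nodes
            fringe = heapq.nsmallest(beam, fringe)  # sorted, so still a valid heap
        return None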

i don't know a good source, i picked this up from a bunch of courses/research.

~~~
jules
Thanks! You should consider writing an article about this :)

------
nostrademons
This reads like a textbook, i.e. it's algorithms that are highly useful to CS
professors and students but not necessarily to practitioners. If I had to pick
a top-10 list based on my professional experience, it would be:

1\. Tree traversal. This comes up _all the time_, from directory walks to DOM
traversal to code analysis to folds over abstract data types. It's also
something that you need to know and can't always rely on having a library
function for, because many things exhibit tree-like structure without actually
being instances of your language's Tree data type.

2\. Hashing. Obviously as a base for hashtables (which are often your
dictionary data structure of choice), but also as a general technique for
generating a small fingerprint from a large set of data.

3\. Statistics - mean/median/mode, but also things like confidence intervals,
regressions, sample sizes, how to make sure your populations are unbiased, etc.
You are evaluating your product, right? Knowing how to make good inferences
from data is critical, particularly since there are lots of ways you can do it
wrong.

4\. Sorting. This can usually be hidden behind a library function, but it's
useful to understand because many other algorithms have very different (and
often better) performance characteristics if the input data is kept sorted.
Binary search falls into this bullet point too.

5\. Data compression algorithms. Another one that you'll rarely have to
directly implement, but knowing their characteristics helps you make good
speed/space tradeoffs for the rest of the system.

6\. Bloom filters. This is one that I never learned in school but tends to be
incredibly useful when dealing with massive data sets. Oftentimes, you want to
be able to quickly _reject_ elements if they're _not_ in a set, but don't
really care about it taking a while if they _are_ in a set (because you expect
that many more elements will be outside of the set than inside it). A Bloom
filter is one of the fastest, most space-efficient ways to do this.

7\. Topological sort. Dependency graphs come up all the time in practical
programming problems, and usually you want a way to linearize them into a list
of tasks that you can perform in order. It's also a shame that many languages'
standard libraries don't have a topological sort function, so oftentimes you
_will_ have to implement this one yourself (a sketch follows this list).

8\. Support vector machines. These are your bread & butter machine-learning
classifiers, and are really nice to know when you've got a bunch of data and
want to try to automate some rote classification job.

9\. Physics simulations. I've found it surprising how useful my intro physics
course knowledge has been. Things like vectors, position vs. velocity vs.
acceleration, damped harmonic oscillators, FFTs, etc. It's most useful when
building UIs and games - oftentimes, you can make a UI feel significantly more
natural by adding easing functions that behave like acceleration when you ease
in and frictional damping when you ease out.

10\. Unification. I'm probably biased because a lot of my hobby programming is
in compilers & type systems, but unification comes up all the time in
compilers, and is a really general technique for equation solving.
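
Since #7 is the one you're most likely to hand-roll, here's a minimal sketch
(Kahn's algorithm; the `deps` format, mapping each task to the tasks it
depends on, is just an assumption for the example):

    from collections import deque

    def topo_sort(deps):
        # indegree = number of unmet dependencies per task
        indegree = {t: len(d) for t, d in deps.items()}
        dependents = {t: [] for t in deps}
        for t, d in deps.items():
            for dep in d:
                dependents[dep].append(t)
        ready = deque(t for t, n in indegree.items() if n == 0)
        order = []
        while ready:
            t = ready.popleft()
            order.append(t)
            for nxt in dependents[t]:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    ready.append(nxt)
        if len(order) != len(deps):
            raise ValueError("dependency cycle")
        return order

    print(topo_sort({"link": ["compile"], "compile": ["parse"], "parse": []}))
    # ['parse', 'compile', 'link']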

~~~
kiba
_Physics simulations. I've found it surprising how useful my intro physics
course knowledge has been. Things like vectors, position vs. velocity vs.
acceleration, damped harmonic oscillators, FFTs, etc. It's most useful when
building UIs and games - oftentimes, you can make a UI feel significantly more
natural by adding easing functions that behave like acceleration when you ease
in and frictional damping when you ease out._

Any good physics book that I could look into? Preferably books that integrate
physics with programming?

~~~
nostrademons
I believe my course used Halliday, Resnick, and Walker, but at the intro
level, basically anything should do.

I wasn't even really thinking about the integration with programming - the
parts I've used have mostly been just knowing what the basic equations were,
along with representations like vectors and such. If you have a function of t,
then all you need to do to simulate it is advance t by a small timestep
(usually derived from the frame rate) and then compute your new positions.
I guess that when you get to more complex simulations, then things like fast
matrix multiplication and symbolic differentiation may be useful, but I've
never actually used them in my own programming.
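
To make that concrete, a tiny made-up example: explicit Euler integration of a
damped harmonic oscillator, the kind of motion behind natural-feeling easing.

    dt = 1.0 / 60.0            # one frame's worth of time
    k, damping = 40.0, 4.0     # made-up spring stiffness and friction
    x, v = 1.0, 0.0            # start displaced, at rest

    for frame in range(120):       # two seconds of simulation
        a = -k * x - damping * v   # spring pull plus frictional damping
        v += a * dt                # advance velocity...
        x += v * dt                # ...then position, by one timestep

    print(x)  # x has oscillated most of the way back toward 0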

------
miguelpais
A* is not just that. The way it's described seems to imply that it's equal to
best-first search. Best-first search relies only on the heuristic function, so
for the following graph, where A is the start node and h is the heuristic
function:

    A
    +-- B  (h: 1)
    |   +-- C  (h: 2)
    |       +-- D  (h: 3)
    +-- E  (h: 4)

The traversal order according to best-first search would be:

{A, B, C, D, E}

A*, meanwhile, will search by g(x) = h(x) + d(x), the latter being the distance
function from the start node to the given node. So, assuming the distance
function to be equal to the depth of the node in the tree, it would traverse it
this way:

{A, B, C, E, D}

g(D) = h(D) + d(D) = 3 + 3 = 6, while g(E) = h(E) + d(E) = 4 + 1 = 5, so E goes
first.
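
A quick script to double-check those orders (my assumptions: h(A) = 0 and unit
edge costs, so d(x) is just the depth):

    import heapq

    children = {"A": ["B", "E"], "B": ["C"], "C": ["D"], "D": [], "E": []}
    h = {"A": 0, "B": 1, "C": 2, "D": 3, "E": 4}

    def expand_order(score):
        fringe, order = [(0, "A", 0)], []   # (priority, node, depth)
        while fringe:
            _, node, depth = heapq.heappop(fringe)
            order.append(node)
            for c in children[node]:
                heapq.heappush(fringe, (score(c, depth + 1), c, depth + 1))
        return order

    print(expand_order(lambda n, d: h[n]))      # best-first: A B C D E
    print(expand_order(lambda n, d: h[n] + d))  # A*:         A B C E D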

------
SlyShy
Thanks for posting this, I have a bunch of algorithms to learn. :)

~~~
mahmud
Instead of going by a checklist, try to approach it by domain: Searching;
Sorting; floating point, integer computing, and bit-manipulation; numerical
analysis and computation; DSP; optimization and dynamic-programming;
information theoretic stuff like compression and encryption; graph theoretic
algorithms; symbolic algebra; geometric and hierarchic data structures and
algorithms; statistical, probabilistic and inferential algorithms; string and
sequence processing along with linguistic techniques, etc.

The whole point of big encyclopedic texts like Cormen et al. is to get your
feet wet and give you a broad exposure to various techniques. That way you have
some idea of what you might want to use next, and you know which domain to
focus your research on.

My one recommendation is to ditch programming languages with huge boot-times
when playing with algorithms. You want something that you can interact with
live and see results; in that regard, even a symbolic algebra system like
Maxima would be better than C, C++ and Java. Lisp and Python would be ideal
for most algorithms.

~~~
iamanet
I am going through Cormen et al. with the objective of getting my feet wet, but
the whole process is painfully slow. I thought it would be nice to have at
least some familiarity with the most widely used algorithms.
However, you are right about ditching programming languages with huge boot
times. I am happily trying out my algorithms in Python.

~~~
sb
CLR is an extremely well written algorithms textbook, but I use it more as a
reference than for self-study. My first algorithms book was Sedgewick's
Algorithms (where all algorithms were presented in Pascal), which is a very
good algorithms text that is much lighter in several respects.

Recently, however, I came across the following gem: Algorithms + Data
Structures = Programs, by Niklaus Wirth (an Oberon version from 1994 is
available for free: <http://www.oberon.ethz.ch/WirthPubl/AD.pdf>). I think
this is hands down one of _the_ best algorithm books. It amazes me how much
content Niklaus Wirth is able to present in concise, yet crystal clear
writing. Besides the usual algorithms, he includes very interesting
applications that probably no other book does: using the partitioning of
QuickSort to find the median (pg. 56 in the above PDF, Section 2.3.4), based on
an algorithm by C.A.R. Hoare, and an in-depth discussion of polyphase sort (pg.
70, Section 2.4.4), which might be interesting for heavily distributed sorting
(that is at least what I imagined as a possible application when I recently re-
read parts of it).
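
The median trick is short enough to sketch from memory (hedged: this is
quickselect in the spirit of Hoare's Find, not Wirth's exact Oberon code).
One Hoare partition pass narrows the range containing the k-th smallest
element, so you never fully sort:

    import random

    def quickselect(a, k):
        # returns the k-th smallest element; k = len(a)//2 gives a median
        a = list(a)
        lo, hi = 0, len(a) - 1
        while lo < hi:
            pivot = a[random.randint(lo, hi)]
            i, j = lo, hi
            while i <= j:                   # one Hoare-style partition pass
                while a[i] < pivot: i += 1
                while a[j] > pivot: j -= 1
                if i <= j:
                    a[i], a[j] = a[j], a[i]
                    i, j = i + 1, j - 1
            if k <= j:
                hi = j          # the k-th smallest is in the left part
            elif k >= i:
                lo = i          # ... in the right part
            else:
                return a[k]     # pinned between the two parts
        return a[k]

    print(quickselect([9, 1, 8, 2, 7, 3], 3))  # 7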

~~~
iamanet
Thanks. BTW, here is the correct link for the book that you referred to on
your comment - <http://www-old.oberon.ethz.ch/WirthPubl/AD.pdf>.

~~~
sb
Thanks for the correction!

------
keefe
Including dynamic programming is kind of like including divide and conquer.

nice list! I'd also nominate:

support vector machines, backprop neural net training, Delaunay triangulation,
Floyd-Warshall, Kruskal's MST, Newton's method, the edit distance algorithm,
Huffman coding or something with compression (I'd vote zip), some kind of
reduction algorithm, and we can't forget simulated annealing.

------
rayval
Good list.

I would include some more pragmatic algorithms, such as Frank Liang's
hyphenation algorithm (PhD thesis at Stanford in 1983)
<http://www.tug.org/docs/liang/>

Also De Casteljau's algorithm for Bezier curves from 1959, and de Boor's
algorithm for B-splines.

The above are the ones that I have had to implement in the past as part of my
work. There is a whole bunch of other algorithms in computer graphics that are
equally (or more) important, which many of us rely on but fortunately don't
have to implement.

Oh, another classic, pragmatic algorithm is John Carmack's implementation of
BSP-based pseudo-3D rendering for the Doom game engine in 1993. See
<http://doom.wikia.com/wiki/Doom_rendering_engine>

------
j_baker
Erm... How did this list include merge sort and heap sort, but not quicksort?

~~~
beagle3
I don't agree with that list, but I think quicksort is given way more weight
than it deserves. I guess that's a desired result for whoever named it
"quicksort".

It's not particularly quick, has horrible worst-case guarantees (which, if you
care to improve them, make it slower still and very complicated), and is easy
to get wrong on many counts (repeated elements; already-sorted input; unbounded
recursion depth).

heapsort is simpler than quicksort, and has the best worst-case complexity you
can get. It's not stable, but then neither are most quicksorts.

quicksort has its place, but it gets a lot more attention than it deserves.

~~~
_delirium
The main advantage to quicksort is that it's still the fastest average-case-
in-practice of the common algorithms. If you want an _O(n log n)_ worst case,
introsort combines quicksort, which is fast on average but _O(n^2)_ in the
worst case, with a fall-back to heapsort: it guarantees that no more than
_O(n log n)_ work is done before the fall-back kicks in, so the whole thing is
worst-case _O(n log n)_. The fall-back cases are of course a constant factor
slower than if you had just used heapsort directly, but asymptotically no
worse, and on average you still get the nice quicksort win.

Link: <http://en.wikipedia.org/wiki/Introsort>
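
A toy sketch of the idea (not the real in-place implementation, which also
switches to insertion sort for tiny ranges): quicksort until a depth budget of
about 2 log2 n is spent, then hand the range to heapsort.

    import heapq, math

    def introsort(a):
        def heapsort(xs):
            heapq.heapify(xs)
            return [heapq.heappop(xs) for _ in range(len(xs))]

        def sort(xs, depth):
            if len(xs) <= 1:
                return xs
            if depth == 0:
                return heapsort(xs)   # guaranteed O(n log n) fall-back
            pivot = xs[len(xs) // 2]
            lt = [x for x in xs if x < pivot]
            eq = [x for x in xs if x == pivot]
            gt = [x for x in xs if x > pivot]
            return sort(lt, depth - 1) + eq + sort(gt, depth - 1)

        return sort(list(a), 2 * int(math.log2(len(a) or 1)))

    print(introsort([5, 3, 8, 1, 9, 2, 7]))  # [1, 2, 3, 5, 7, 8, 9]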

~~~
beagle3
> The main advantage to quicksort is that it's still the fastest average-case-
> in-practice of the common algorithms.

I don't think that's true if you average over the available distribution of
_implementations_ in the wild. Almost all implementations commit one to three
of the "crimes" I mentioned above. And frankly, I've met too many real-world
cases that triggered n^2 behaviour in an existing implementation to ever
settle for anything with a worse than n log n worst case.

If you're looking for the fastest-in-both-theory-and-practice algorithm,
that's TimSort, but it's not in place. The places _in practice_ where
Quicksort is the right answer are few and far between. I haven't seen any in
the last 10 years.

------
known
<http://www.itl.nist.gov/div897/sqg/dads/terms.html>

------
mfukar
Since when is dynamic programming considered an algorithm?


