
What algorithm blows your mind? (Reddit compsci) - barrkel
http://www.reddit.com/r/compsci/comments/dw4l8/reddit_what_algorithm_blows_your_mind/
======
jgrahamc
I've always found error detecting/correcting codes to be lovely. A great
example, that's easy to understand, is the Hamming(7,4) code.

<http://en.wikipedia.org/wiki/Hamming(7,4)>

I got interested in these codes because of the problems involved with deep
space communication.

[http://en.wikipedia.org/wiki/Error_detection_and_correction#...](http://en.wikipedia.org/wiki/Error_detection_and_correction#Deep-space_telecommunications)

~~~
icegreentea
I learned Hamming codes in school this summer (digital systems). It was
probably the single most magical thing (even more than flip-flops! Loops make
memory!) I was exposed to in that class.

I remember upon learning it someone went "wow, who came up with that". The
prof answered dryly "some genius named Hamming".

------
mayank
All very nice suggestions below, but I found suffix trees missing:
<http://en.wikipedia.org/wiki/Suffix_tree>

And the reasons why it blows my mind:

(1) Knuth called it "Algorithm of the year 1973" (possibly because it beat an
intuitive lower bound that he had for a problem I can't remember).

(2) It's relatively new for a "core", first-principles algorithm. Ukkonen's
algorithm is from 1995, although there are earlier versions.

(3) This BOOK is largely devoted to the many, many applications of suffix
trees: [http://www.amazon.com/Algorithms-Strings-Trees-Sequences-Com...](http://www.amazon.com/Algorithms-Strings-Trees-Sequences-Computational/dp/0521585198)
It should be required reading for anyone
interested in advanced algorithms.

(4) Longest common substring in linear time.

(5) Lempel-Ziv decomposition in linear time.

(6) Longest repeated substrings in linear time.

And too many more to list.
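For a concrete feel of item (6), here's a deliberately naive Python sketch of longest repeated substring via sorted suffixes -- O(n^2 log n), nowhere near the linear time a suffix tree gives you, but it shows the problem the structure solves:

```python
def longest_repeated_substring(s):
    # Sort all suffixes; the answer is the longest common prefix
    # of some adjacent pair in sorted order.
    suffixes = sorted(s[i:] for i in range(len(s)))
    best = ""
    for a, b in zip(suffixes, suffixes[1:]):
        k = 0
        while k < min(len(a), len(b)) and a[k] == b[k]:
            k += 1
        if k > len(best):
            best = a[:k]
    return best

print(longest_repeated_substring("banana"))  # "ana"
```

A suffix tree answers the same query as "deepest internal node", in time linear in the string length.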

------
nl
Bloom Filters: <http://en.wikipedia.org/wiki/Bloom_filter>

Sort of a probabilistic hash, where you trade space for accuracy. But it's
also like a memory function - it can remember if it has seen a piece of data
before.
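A minimal Python sketch of the idea (the sizes and the salted-sha256 hashing scheme here are arbitrary choices for illustration, not from any particular library):

```python
import hashlib

class BloomFilter:
    def __init__(self, m=1024, k=4):
        self.m, self.k = m, k
        self.bits = bytearray(m)      # one byte per bit, for clarity

    def _positions(self, item):
        # k independent-ish hash positions via salted sha256
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1

    def __contains__(self, item):
        # all bits set -> "probably seen"; any bit clear -> "definitely not"
        return all(self.bits[p] for p in self._positions(item))

bf = BloomFilter()
bf.add("hello")
print("hello" in bf)   # True -- never a false negative
print("absent" in bf)  # almost certainly False (false positives are possible)
```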

~~~
daeken
What I love about bloom filters is that when checking to see if something is
in a bloom filter, you can only get false positives, not false negatives. That
property is just awesome to me.

~~~
bobds
Also the fact that you can use more bits per element to reduce the false
positive rate. Using around 3 bytes per element can bring the error rate down
to 0.001%.
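That figure checks out against the standard approximation p = (1 - e^(-kn/m))^k with the optimal hash count k = (m/n) ln 2; a quick sanity check:

```python
import math

bits_per_element = 24                           # ~3 bytes per element
k = round(bits_per_element * math.log(2))       # optimal number of hashes, ~17
p = (1 - math.exp(-k / bits_per_element)) ** k  # standard FPR approximation
print(f"k = {k}, false positive rate ~ {p:.6%}")  # ~0.001%
```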

~~~
signa11
also the fact that you can use multiple hash functions to reduce the error
rate even further is pretty nice. this is of course without incurring the
overhead of extra storage space...

~~~
cperciva
Using multiple hash functions requires that you use additional storage in
order to maintain the same occupancy ratio.

------
gamache
Singular value decomposition. It continually amazes me how many problems --
from noise reduction to Netflix prediction -- SVD can be usefully applied to.

<http://en.wikipedia.org/wiki/Singular_value_decomposition>

~~~
cdavid
The core ideas behind SVD and eigenvalues are very rich and profound. They
have always fascinated me, ever since I was studying maths in undergrad.

Someone else already mentioned compressed sensing, which expands on some of
those ideas. Terence Tao had a pretty good presentation on the topic:
[http://terrytao.files.wordpress.com/2009/08/compressed-sensi...](http://terrytao.files.wordpress.com/2009/08/compressed-sensing1.pdf)

------
jefffoster
Compressed Sensing is exciting - using a sequence of low res images to obtain
a higher resolution sample. It was discussed in Wired
(<http://www.wired.com/magazine/2010/02/ff_algorithm/all/1>) together with a
compelling example (minimizing the time a young patient spends in an MRI
machine).

~~~
thalur
That sounds far too good to be true. I guess if you don't care about the fine
details then it's a decent technique, but to take the MRI example, what if the
thing that was wrong was only visible in those small details?

~~~
cdavid
There are other things which sound too good to be true and yet are true (the
sampling theorem for exact reconstruction of periodic signals is pretty
unintuitive IMO, and even more "magic")

~~~
Retric
Filling in missing information with a reasonable guess is very different from
having accurate information. Looking at a low res impressionist painting you
are going to fill in based on reasonable estimates for the real world not what
is actually there.

~~~
nuitblanche
In compressed sensing, we don't do guesses. Guesses are reserved for
inpainting.

------
endtime
For me, it's boosting. Basically, if you make the assumption that your weak
classifier can achieve X% correct on any random distribution over a data set,
then you can create an ensemble of weak learners that together get Y% correct
on the same set, where Y > X.

Question 3 here will walk you through the proof: <http://bit.ly/cQ03na>
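Not the problem set's construction, but the standard AdaBoost recipe sketched on toy 1-D data with threshold "decision stumps" (the data and all helper names here are made up for illustration):

```python
import math

def stump(threshold, sign):
    # weak classifier: predicts sign * (+1 if x > threshold else -1)
    return lambda x: sign * (1 if x > threshold else -1)

def train_adaboost(xs, ys, rounds=3):
    n = len(xs)
    w = [1.0 / n] * n                         # distribution over examples
    ensemble = []                             # list of (alpha, stump)
    candidates = [stump(t, s) for t in xs for s in (1, -1)]
    for _ in range(rounds):
        # weak learner: pick the stump with the lowest weighted error
        h, err = min(((c, sum(wi for wi, x, y in zip(w, xs, ys) if c(x) != y))
                      for c in candidates), key=lambda pair: pair[1])
        if err >= 0.5:
            break                             # no weak learner left
        if err == 0:                          # a perfect stump: just use it
            ensemble.append((1.0, h))
            break
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, h))
        # reweight: boost the examples this stump got wrong
        w = [wi * math.exp(-alpha * y * h(x)) for wi, x, y in zip(w, xs, ys)]
        total = sum(w)
        w = [wi / total for wi in w]
    return lambda x: 1 if sum(a * h(x) for a, h in ensemble) > 0 else -1

# no single threshold separates this labeling, but three stumps voting do
xs = [1, 2, 3, 4]
ys = [-1, 1, 1, -1]
clf = train_adaboost(xs, ys)
print([clf(x) for x in xs])  # [-1, 1, 1, -1]
```

Each stump alone gets at best 75% of these examples right; the weighted vote gets all four.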

~~~
Figs
Please don't post links through bit.ly and other URL indirection sites. It
hurts the web by making it much more difficult to follow the link in the case
that an intermediary goes out of business. Hacker News and your post will
probably be around in a few years, but will bit.ly? (What's their business
model, exactly...?)

For future reference, the unshortened version of his link is:
<http://www.stanford.edu/class/cs221/handouts/cs221-ps2.pdf>

~~~
benmccann
Not to mention the real URL shows that the link is going to Stanford whereas I
have no clue whether the shortened link is going to Stanford, an XXX site, or
rick rolling me.

------
smcl
Quake3's Fast InvSqrt function:

<http://www.beyond3d.com/content/articles/8/>

It's been discussed here before, and no doubt it'll be cited numerous times in
the comments of that reddit post, but it was the first time I'd ever seen such
trickery. When I got into low level DSP programming I saw many more examples
of clever hacks, but this is the one which sticks in my mind above all of
those.
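For the curious, the trick transliterated into Python, with struct doing the float/int bit reinterpretation that the C original does with a pointer cast:

```python
import struct

def fast_inv_sqrt(x):
    i = struct.unpack(">I", struct.pack(">f", x))[0]   # float bits as a uint32
    i = 0x5F3759DF - (i >> 1)                          # the magic constant
    y = struct.unpack(">f", struct.pack(">I", i))[0]   # bits back to a float
    return y * (1.5 - 0.5 * x * y * y)                 # one Newton-Raphson step

print(fast_inv_sqrt(4.0))  # ~0.499, within ~0.2% of the exact 0.5
```

The shift on the integer representation halves the float's exponent (roughly a square root), the constant corrects the mantissa, and one Newton step polishes the estimate.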

------
cperciva
My mind is blown by the algorithm for matching with mismatches which I present
in the first chapter of my thesis. It shouldn't be, given that I discovered
this algorithm -- but somehow "I take the Fourier Transform of the FreeBSD
kernel" sounds more like the punch line to a joke than the first step in an
algorithm.

~~~
michael_nielsen
Speaking of incredibly cool algorithms based on Fourier transforms, the
Schönhage-Strassen algorithm for multiplying integers is also amazing:

[http://en.wikipedia.org/wiki/Sch%C3%B6nhage%E2%80%93Strassen...](http://en.wikipedia.org/wiki/Sch%C3%B6nhage%E2%80%93Strassen_algorithm)

At least to me, it seems hard to imagine that multiplying two N-bit integers
could be done in fewer than O(N^2) operations. Schönhage-Strassen lets you do
it using O(N log N log log N) operations!

~~~
lmkg
This is the one that blew my mind. "If we treat the numbers as polynomials,
then the multiplication can be reduced to a Fourier transform!" is also one of
those things that sounds like a punchline. Each step of the algorithm just
sounds stupider and stupider, and slower and slower... and then you get to the
last step, where all the multiplications by exponentials of complex numbers
get replaced by bitshifts. Suddenly the entire algorithm collapses in on
itself to reveal something efficient.

------
davidj
493108359702850190027577767239076495728490777215020863208075
018409792627885097658864557802013660073286795447341128317353
678312015575359819785450548115719393458773300380099326195058
764525023820408110189885042615176579941704250889037029119015
870030479432826073821469541570330227987557681895601624030064
111516900872879838194258271674564774816684347928464580929131
531860070010043353189363193439129486044503709919800477094629
215581807111691530318762884778783541575932891093295447350881
882465495060005019006274705305381164278294267474853496525745
368151170655028190555265622135314631042100866286797114446706
366921982586158111251555650481342076867323407655054859108269
562666930662367997021048123965625180068183236539593483956753
575575324619023481064700987753027956186892925380693305204238
149969945456945774138335689906005870832181270486113368202651
590516635187402901819769393767785292872210955041292579257381
866058450150552502749947718831293104576980909153046133594190
302588132059322774443852550466779024518697062627788891979580
423065750615669834695617797879659201644051939960716981112615
195610276283233982579142332172696144374438105648552934887634
921030988702878745323313253212267863328370279250997499694887
759369159176445880327183847402359330203748885067557065879194
611341932307814854436454375113207098606390746417564121635042
388002967808558670370387509410769821183765499205204368255854
642288502429963322685369124648550007559166402472924071645072
531967449995294484347419021077296068205581309236268379879519
661997982855258871610961365617807456615924886608898164568541
721362920846656279131478466791550965154310113538586208196875
836883595577893914545393568199609880854047659073589728989834
250471289184162658789682185380879562790399786294493976054675
34821256750121517082737107646270712467532102483678159400087505452543537

------
RiderOfGiraffes
Lenstra-Lenstra-Lovasz:

[http://en.wikipedia.org/wiki/Lenstra%E2%80%93Lenstra%E2%80%9...](http://en.wikipedia.org/wiki/Lenstra%E2%80%93Lenstra%E2%80%93Lov%C3%A1sz_lattice_basis_reduction_algorithm)

High dimensional work is bloody hard, and this algorithm works amazingly well.
I've spoken with Lenstra (one of them) and he's amazingly insightful on these
things. He helped to crystalise my understanding of why high-dimensional
spheres should be thought of as "spikey," rather than "round."

~~~
jules
Why are high dimensional spheres spikey?

~~~
RiderOfGiraffes
Ah. I really should write that up. Far too long to explain in a comment here,
far too interesting (to me!) to forget or ignore.

I'll write it up and submit it. Anyone who cares to email me can get an early
version to read, and your feedback would be useful.

Please.

Thanks.

~~~
orangecat
Please do. My guess would be similar to Zaak's: the n-dimensional volume of an
n-dimensional hypersphere approaches 0 as n approaches infinity.
(<http://www.mathreference.com/ca-int,hsp.html>). A hypersphere with radius 1
centered at the origin has to get out to 1 along each axis (e.g. x=1, y=z=0
for a 3 dimensional sphere), and in order for that not to contribute much
volume, it has to be a narrow spike rather than a gradual curve.

------
barrydahlberg
How is generation of the Mandelbrot Set not on this list? That something so
simple could produce something so indescribably beautiful...

~~~
8ren
Agreed, but while the notation's simple, the iterative effect is hard to
follow -

it's like a spinning rod, angle doubled and length squared each iteration,
with a rod of fixed length and angle added to it each iteration (if the
spinner is facing away, it gets shorter, but if towards, it gets longer).
Iterate til it goes over 2. If it doesn't then it's in the Set (or maybe you
need to iterate more...). [that's my current understanding]

It's pretty clear to me that I can't visualize what it will do over a few
iterations (apart from simple cases) - and I sure can't see how it would
produce self-similar (ie fractal) shapes. Agreed that that's amazing.
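That iteration, z -> z^2 + c starting from zero, is only a few lines; a quick Python sketch of the membership test described above:

```python
def in_mandelbrot(c, max_iter=100):
    # Squaring z doubles its angle and squares its length (the "spinning
    # rod"); adding c is the fixed rod. Escape once |z| > 2.
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return False
    return True  # still bounded -- probably in the Set (or iterate more)

print(in_mandelbrot(0j))       # True: the origin never escapes
print(in_mandelbrot(-1 + 0j))  # True: cycles 0, -1, 0, -1, ...
print(in_mandelbrot(1 + 0j))   # False: 0, 1, 2, 5, 26, ... escapes
```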

------
michael_nielsen
The quantum search algorithm: <http://en.wikipedia.org/wiki/Grovers_algorithm>

It lets you search a completely unstructured N-item search space using the
square root (!) of N queries, not the N queries you'd think were necessary.

Also, the algorithm is so simple that once you know it's possible, and
provided you're very comfortable with basic quantum mechanics, it's almost
trivial to find the algorithm.
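You can see the mechanics without any quantum hardware by simulating the state vector classically. For N=4 a single Grover iteration moves all the amplitude onto the marked item (a toy sketch, not how you'd program a real device):

```python
def grover(marked, n=4):
    amp = [1 / n ** 0.5] * n       # uniform superposition over n items
    amp[marked] = -amp[marked]     # oracle: flip the marked amplitude's sign
    mean = sum(amp) / n            # diffusion: reflect every amplitude
    amp = [2 * mean - a for a in amp]  # ...about the mean
    return [a * a for a in amp]    # measurement probabilities

print(grover(2))  # [0.0, 0.0, 1.0, 0.0] -- certainty after one "query"
```

For larger N you repeat the oracle-plus-diffusion step about (pi/4)*sqrt(N) times, which is where the square-root query count comes from.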

------
abhijitr
Ok, not quite an algorithm, but the PCP theorem
(<http://en.wikipedia.org/wiki/PCP_theorem>) is easily the most mind-blowing
result in comp sci in the past 20 years.

~~~
mzl
PCPs are true magic. I once learned how to construct a PCP-proof with
verification using just 7 bits, but it still feels like magic to me. The 3-bit
variants are just crazy.

------
rhettinger
The Burrows-Wheeler transform used in bzip2 is truly mind-blowing.

Many algorithms are ordinary genius (eventually you would have come up with it
because the problem space dictates the solution), but this one is
extraordinary genius (mind-blowingly original and non-obvious). Every time I
see it, I'm amazed that it works.
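A minimal sketch of the transform and its inverse (real bzip2 uses suffix sorting and a block index rather than this quadratic toy):

```python
def bwt(s, eof="\0"):
    # Sort all rotations of s+sentinel and take the last column.
    # The sentinel marks the original rotation so the transform inverts.
    s += eof
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(r[-1] for r in rotations)

def ibwt(r, eof="\0"):
    # Invert by repeatedly prepending the transform column and sorting:
    # after len(r) rounds the table holds every rotation again.
    table = [""] * len(r)
    for _ in range(len(r)):
        table = sorted(c + t for c, t in zip(r, table))
    row = next(t for t in table if t.endswith(eof))
    return row[:-1]

out = bwt("banana")
print(repr(out))  # equal characters cluster together, which helps compression
print(ibwt(out))  # "banana" -- fully reversible
```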

------
cosbynator
Welzl's algorithm for minimum enclosing disk does it for me. Pick a point,
recursively see if it is in the disk without that point - if not, it is part
of the definition of the minimum enclosing disk (forced to be on the
boundary). Randomized, simple, clean and expected O(n).

Not enough people know about it!
[http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.46....](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.46.1450&rep=rep1&type=pdf)
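A compact, unoptimized recursive Python sketch of the idea (the helper names are made up; the geometry follows the standard diameter-circle and circumcircle constructions):

```python
import random

def _circle2(a, b):
    # smallest circle with a and b on the boundary: a-b is a diameter
    cx, cy = (a[0] + b[0]) / 2, (a[1] + b[1]) / 2
    return (cx, cy, ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5 / 2)

def _circle3(a, b, c):
    # circumcircle of three non-collinear points
    ax, ay = a; bx, by = b; cx, cy = c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax * ax + ay * ay) * (by - cy) + (bx * bx + by * by) * (cy - ay)
          + (cx * cx + cy * cy) * (ay - by)) / d
    uy = ((ax * ax + ay * ay) * (cx - bx) + (bx * bx + by * by) * (ax - cx)
          + (cx * cx + cy * cy) * (bx - ax)) / d
    return (ux, uy, ((ux - ax) ** 2 + (uy - ay) ** 2) ** 0.5)

def _inside(p, circ, eps=1e-9):
    return (p[0] - circ[0]) ** 2 + (p[1] - circ[1]) ** 2 <= (circ[2] + eps) ** 2

def welzl(points, boundary=()):
    if not points or len(boundary) == 3:
        # the boundary set alone determines the disk
        if len(boundary) == 0:
            return (0.0, 0.0, 0.0)
        if len(boundary) == 1:
            return (boundary[0][0], boundary[0][1], 0.0)
        if len(boundary) == 2:
            return _circle2(*boundary)
        return _circle3(*boundary)
    p, rest = points[0], points[1:]
    circ = welzl(rest, boundary)
    if _inside(p, circ):
        return circ
    # p is outside the disk of the others, so it must lie on the boundary
    return welzl(rest, boundary + (p,))

pts = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)]
random.shuffle(pts)   # the random order is what gives the expected O(n) bound
cx, cy, r = welzl(pts)
print(cx, cy, r)      # center ~(0.5, 0.5), radius ~0.7071
```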

------
ardit33
Kruskal's, Prim's and reverse-delete algorithms for finding the Minimum
Spanning Tree of a graph are fun, not that hard, and very practical.
<http://en.wikipedia.org/wiki/Minimum_spanning_tree>

Think about it, every time you get a google map direction/route, one of them
is in play.
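Kruskal's is only a few lines once you have union-find; a minimal Python sketch:

```python
def kruskal(n, edges):
    # MST over vertices 0..n-1; edges is a list of (weight, u, v).
    # Greedily take the cheapest edge that doesn't close a cycle,
    # tracking components with a union-find.
    parent = list(range(n))

    def find(x):                      # find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:                  # different components: no cycle
            parent[ru] = rv
            mst.append((w, u, v))
    return mst

edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 2, 3), (5, 1, 3)]
tree = kruskal(4, edges)
print(tree, "total weight:", sum(w for w, _, _ in tree))
```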

~~~
varjag
Also a bit different, but perhaps even more widespread application of spanning
trees is STP family of L2 network bridging protocols.

<http://en.wikipedia.org/wiki/Rapid_Spanning_Tree_Protocol>

------
joe_bleau
CORDIC (<http://en.wikipedia.org/wiki/Cordic>), used to efficiently calculate
trig functions with only very basic hardware requirements (add, sub, shift,
and table lookup).
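A hedged Python sketch of CORDIC's rotation mode -- floats stand in here for the fixed-point shifts real hardware would use, and it converges for |theta| up to about 1.74 rad:

```python
import math

def cordic(theta, iterations=32):
    # table of micro-rotation angles atan(2^-i), and the accumulated gain
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    K = 1.0
    for i in range(iterations):
        K /= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0, 0.0, theta          # start on the x-axis; z = residual angle
    for i, a in enumerate(angles):
        d = 1.0 if z >= 0 else -1.0    # always rotate toward z = 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * a
    return x * K, y * K                # (cos theta, sin theta)

c, s = cordic(0.5)
print(c, s)  # ~cos(0.5), ~sin(0.5)
```

Each step only needs a shift, an add, and one table entry, which is the whole point for cheap hardware.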

~~~
hxa7241
. . . relatedly, from Jim Blinn, from Marvin Minsky -- drawing a circle (near
enough), or doing incremental rotation, using just integer add, sub, and two
shifts:

    
    
       N = 128 * 3.14159
       X = 1000
       Y = 0
    
       MOVE(X,Y)
       FOR I = 1 TO N
          X = X - (Y >> 6)
          Y = Y + (X >> 6)
          DRAW(X,Y)
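The same routine transliterated to Python, with a check that the orbit really does stay near the starting radius. The famous quirk is that Y is updated using the already-updated X, which makes the map area-preserving, so it neither spirals in nor out:

```python
def minsky_circle(steps=402):          # ~ 128 * 3.14159 iterations
    x, y = 1000, 0
    points = []
    for _ in range(steps):
        x = x - (y >> 6)               # >> 6 means eps = 1/64 per step
        y = y + (x >> 6)               # NOTE: uses the already-updated x
        points.append((x, y))
    return points

pts = minsky_circle()
radii = [(x * x + y * y) ** 0.5 for x, y in pts]
print(min(radii), max(radii))  # both stay near 1000: a (near-)circle
```

(Python's `>>` on negative integers is an arithmetic shift, matching the original's intent.)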

------
elblanco
Some of the sorting algorithms: as somebody learning them, it's almost
entirely opaque how anybody could come up with those.

It seems like they must have sprung forth whole cloth to the inventor in the
shower... it's almost impossible to have iteratively developed some of them
because even small changes in the algorithms produce terrible results. I
remember thinking over and over again, "how the hell could somebody come up
with this?"

Searching in comparison looks very engineered, very studied, something that
most people could come up with given need, motivation and time.

------
jeffmiller
BSP trees blew my mind when I discovered they were under the hood of Doom.
This was back in the era of 386 and 486 PCs.

------
dlnovell
The Genetic Algorithm. It totally blew my mind when I read about its
successful application to evolving satellite antennas for JPL, and the
automated design of an analog->digital signal converter that relies on a
component that isn't actually connected to anything else.

------
chrisaycock
Poul-Henning Kamp (creator of Varnish) recently published the B-heap
algorithm. It's a cache-friendly version of the binary heap storage we all
learned as freshmen.

PHK points out that mapping node n to nodes 2n and 2n+1 will cause cache
misses (and often page faults) as we vertically descend the tree. So instead,
rearrange the mapping so that the child nodes are often very near their parent
node. That way, heap comparisons are usually performed in the same virtual
page, which cuts down on the number of disk accesses the operating system must
perform.

<http://queue.acm.org/detail.cfm?id=1814327>

------
mzl
More of a complexity result than an algorithm, but it is a really fun result
anyway. In complexity class we were given the problem to simulate an n-tape
non-deterministic Turing machine with a 2-tape non-deterministic Turing
machine with just a constant slow-down (which turned out to be 3). It really
gave an insight into the power of non-deterministic choice.

------
alphaBetaGamma
The binomial heap <http://en.wikipedia.org/wiki/Binomial_heap>

It's a heap supporting merging in O(log n). While its usage might not be very
common, it is a beautiful data structure.

------
djmdjm
I might be cheating a little, but
<http://en.wikipedia.org/wiki/Grovers_algorithm> blows my mind.

------
program
The heapsort. It sounds very strange.

<http://www.youtube.com/watch?v=iXAjiDQbPSw>

------
kd0amg
I'd go with Tarjan and Hopcroft's planarity testing algorithm, whose runtime
is linear in the number of vertices.

------
nuitblanche
FWIW, some of these algorithms (Bloom filters, error-correcting codes) are
connected to compressive sensing.

------
cosmicray
while looking thru some 360 assembler code one day, I came across this ...

XC loc1,loc2

XC loc2,loc1

XC loc1,loc2

It took a day of head scratching, and various notes, before I understood that
it was a clever way of exchanging the contents of two storage locations,
without a third intermediate location. Three consecutive exclusive-Ors.
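The same three-step exclusive-OR swap, in Python integers:

```python
# swap two integer variables with three exclusive-ORs, no temporary
a, b = 10, 6
a ^= b   # a now holds A ^ B
b ^= a   # b = B ^ (A ^ B) = A
a ^= b   # a = (A ^ B) ^ A = B
print(a, b)  # 6 10
```

(One caveat the 360 code shares: if loc1 and loc2 are the same location, the first XOR zeroes it.)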

------
signa11
for me simplex (George Dantzig) is very very cool.

------
jules
It surprised me that matching is not NP-complete.

~~~
mikleb
But maximal matroid matching is NP-hard. ;)

Approximating the minimum spanning tree weight in sublinear time is a very
unexpected algorithm.

------
J3L2404
Arithmetic coding:
[http://en.wikipedia.org/wiki/Arithmetic_coding#Encoding_and_...](http://en.wikipedia.org/wiki/Arithmetic_coding#Encoding_and_decoding:_overview)

The entire message is coded into a single number and it approaches the
theoretical maximum for compression.

Too bad about software patents!
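A toy Python version using exact fractions shows the single-number idea (production coders use fixed-width integer arithmetic with renormalization instead; the alphabet and probabilities here are made up):

```python
from fractions import Fraction

PROBS = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}

def cumulative(probs):
    # assign each symbol a half-open slice of [0, 1)
    lo, out = Fraction(0), {}
    for sym, p in probs.items():
        out[sym] = (lo, lo + p)
        lo += p
    return out

def encode(msg, probs=PROBS):
    cum = cumulative(probs)
    low, high = Fraction(0), Fraction(1)
    for sym in msg:
        span = high - low
        s_lo, s_hi = cum[sym]
        # narrow the interval to the symbol's slice of the current one
        low, high = low + span * s_lo, low + span * s_hi
    return low  # any number in [low, high) identifies the whole message

def decode(code, length, probs=PROBS):
    cum = cumulative(probs)
    out = []
    for _ in range(length):
        for sym, (s_lo, s_hi) in cum.items():
            if s_lo <= code < s_hi:
                out.append(sym)
                code = (code - s_lo) / (s_hi - s_lo)  # zoom back out
                break
    return "".join(out)

code = encode("abca")
print(code, decode(code, 4))  # one fraction encodes the entire string
```

Likely symbols narrow the interval less, so they cost fewer bits, which is how the scheme approaches the entropy bound.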

------
noverloop
In heuristics, I like Asynchronous teams [
[http://www.cs.cmu.edu/afs/cs/project/edrc-22/project/ateams/...](http://www.cs.cmu.edu/afs/cs/project/edrc-22/project/ateams/WWW/home_page.html)
]

In multi-robot coordination I like the free market system [
<http://www.frc.ri.cmu.edu/projects/colony/architecture.shtml> ]

And regarding machine learning I like neural nets (MLP), however the
algorithms that currently blow my mind are convolutional neural networks (CNNs)
and deep belief networks trained with auto-encoders.
<http://www.youtube.com/watch?v=AyzOUbkUf3M> on the 21-minute mark to see it
in action.

