
O-notation considered harmful (use Analytic Combinatorics instead) - jgrant27
http://jng.imagine27.com/index.php/2013-02-10-121226_analytic-combinatorics-is-better-o-nation-considered-harmful.html
======
rcoh
I became irritated when quicksort was claimed to be O(n^2). Not only is that not
true in practice, it's just wrong for all but the simplest implementations of
the algorithm.

With minor tweaks to the selection of partitions, quicksort can be made O(n
lg n). Specifically, if combined with linear-time median finding, we can pick
perfect partitions every time. (Of course, this algorithm is painfully slow in
practice.)
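
For illustration, a minimal Python sketch of that combination (my code, not
from the article): quicksort picking every pivot with median-of-medians
selection, which makes the worst case O(n lg n) at the cost of large constants:

    def median_of_medians(xs, k):
        # k-th smallest element of xs (0-indexed) in worst-case linear time.
        if len(xs) <= 5:
            return sorted(xs)[k]
        # Medians of groups of five, then recurse to pick a good pivot.
        medians = [sorted(xs[i:i + 5])[len(xs[i:i + 5]) // 2]
                   for i in range(0, len(xs), 5)]
        pivot = median_of_medians(medians, len(medians) // 2)
        lows = [x for x in xs if x < pivot]
        mids = [x for x in xs if x == pivot]
        if k < len(lows):
            return median_of_medians(lows, k)
        if k < len(lows) + len(mids):
            return pivot
        highs = [x for x in xs if x > pivot]
        return median_of_medians(highs, k - len(lows) - len(mids))

    def quicksort_exact_median(xs):
        # Worst-case O(n lg n) because every pivot is the true median,
        # but the constant factor makes it slow in practice.
        if len(xs) <= 1:
            return xs
        pivot = median_of_medians(xs, len(xs) // 2)
        return (quicksort_exact_median([x for x in xs if x < pivot])
                + [x for x in xs if x == pivot]
                + quicksort_exact_median([x for x in xs if x > pivot]))

e.g. quicksort_exact_median([5, 3, 8, 1]) returns [1, 3, 5, 8].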

All this said, the idea of analytic combinatorics is interesting for a deeper
understanding of algorithm performance. It quantifies behavior that good
engineers understand is swept under the rug by Big-O notation.

Of course, if you really want good performance, you'll have to understand the
algorithm beyond just its combinatorial properties -- cache locality and
coherence, branch mispredictions, function call overhead, etc. play an enormous
role when it comes time to really make code fast.

To write fast code, you need the whole picture.

~~~
beagle3
> Not only is that not true in practice, it's just wrong for all but the
> simplest implementations of the algorithm.

I have found way too many implementations in the wild, with various
optimizations, that STILL had an O(n^2) worst case, one that was actually
triggered by real data.

> With minor tweaks to the selection of partitions, quick sort can be made O(n
> lg n). Specifically, if combined with linear time median finding, we can
> pick perfect partitions every time. (Of course, this algorithm is painfully
> slow in practice).

You are mostly contradicting yourself here. It is painfully slow in practice;
therefore no one does linear-time median finding; therefore partitions aren't
equal-sized, and the O(n lg n) guarantee CANNOT be made.

Have you ever seen a widely used quicksort implementation that actually has an
O(n lg n) guarantee? I haven't. The closest I've seen is median-of-five-random-
elements, which probabilistically is excellent, but STILL doesn't give an O(n
lg n) guarantee: if you have an adversary, and they build something like
<http://www.cs.dartmouth.edu/~doug/aqsort.c> adjusted to your algorithm,
you'll get O(n^2).

Also, it is quite surprising how many quicksort implementations out there will
do O(n^2) if they get a vector of ALL EQUAL VALUES. I have seen that happen in
practice (including the Java standard library at the time I tested it), and I
haven't seen a single textbook mention that the partitions should be (all < pivot)
(all == pivot) (all > pivot), which is the only practical way to avoid it.
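
For reference, a minimal in-place sketch of that three-way ("Dutch national
flag") partition, with a random pivot (my illustration, not any particular
library's code):

    import random

    def quicksort3(a, lo=0, hi=None):
        # Invariant: a[lo:lt] < pivot, a[lt:i] == pivot, a[gt+1:hi+1] > pivot.
        if hi is None:
            hi = len(a) - 1
        if lo >= hi:
            return
        pivot = a[random.randint(lo, hi)]
        lt, i, gt = lo, lo, hi
        while i <= gt:
            if a[i] < pivot:
                a[lt], a[i] = a[i], a[lt]
                lt += 1
                i += 1
            elif a[i] > pivot:
                a[i], a[gt] = a[gt], a[i]
                gt -= 1
            else:
                i += 1
        # Only the strictly-smaller and strictly-larger bands are recursed on.
        quicksort3(a, lo, lt - 1)
        quicksort3(a, gt + 1, hi)

On an all-equal input the middle band swallows everything, there is nothing
left to recurse on, and the run is linear.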

------
guard-of-terra
They can't be serious if they don't provide a cheat sheet - a page's worth of
notations for common algorithms demonstrating how to use the thing and how
it's superior.

Here in the real world we don't have time to read an 800+ page book on a
subject promising to bring marginal productivity benefits. Me, I'd rather read
some fiction instead. It's so much better for your soul than any CS topic.

~~~
stevvooe
Not to mention the condescending attitude:

 _Most engineers I know would still argue that Merge Sort is a better solution
and apparently Robert has had the same argumentative response even though he
is an expert in the field. In the lecture he kindly says the following: “…
Such people usually don’t program much and shouldn’t be recommending what
practitioners do”._

------
jules
Quick sort being O(n^2) and merge sort O(n log n) has _nothing_ to do with
O-notation (asymptotic analysis) vs analytic combinatorics (exact step
counting). With analytic combinatorics you may get the exact number of steps
the algorithm takes, yes, but the leading term in quicksort's worst-case step
count would still be n^2 and the leading term in mergesort's step count would
still be n log n. The problem this post is about has _everything_ to do with worst case
vs average case. You can analyze average case and worst case both
asymptotically and exactly.

As far as the practical utility of the techniques, for analyzing time,
O-notation is clearly much better. Analyzing the exact number of steps taken
is in most cases impossible, even with the powerful tools of analytic
combinatorics. Analyzing the asymptotic time complexity is in comparison
trivial. Even if you _do_ manage to stumble on a case where it's possible to
determine the exact number of steps, that still doesn't tell you anything
about the constant factors involved in the algorithms, since you just have the
number of steps for some definition of a step (those objecting that you could
determine exactly how many CPU cycles an algorithm will take are living in the
past -- with modern processors this is no longer feasible at all). The only
thing that an exact step count can get you is non-leading terms that the
O-notation hides. For example, an algorithm might be O(n^2) when in fact the
exact number of steps is n^2 + n. Knowing that extra +n is (almost) never of any
practical importance, especially given that we're counting steps not time in
seconds.
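
A concrete toy example of that last point (my example, not from the post):
selection sort makes exactly n(n-1)/2 comparisons on every input, i.e.
n^2/2 - n/2, so the exact count only refines the O(n^2) bound by a
lower-order term that rarely matters:

    def selection_sort_comparisons(a):
        # Sort a in place and return the number of comparisons made.
        comparisons = 0
        for i in range(len(a)):
            smallest = i
            for j in range(i + 1, len(a)):
                comparisons += 1
                if a[j] < a[smallest]:
                    smallest = j
            a[i], a[smallest] = a[smallest], a[i]
        return comparisons

    n = 100
    assert selection_sort_comparisons(list(range(n, 0, -1))) == n * (n - 1) // 2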

Don't get me wrong, analytic combinatorics is a beautiful subject and may even
come in handy in _some_ practical cases, but this post is vastly over-hyping
it. By the way, even if you do want to count combinatorial structures exactly,
instead of going the analytic combinatorics route it often makes more sense in
practice to just define the exact count recursively and memoize. You don't
get a closed-form solution this way, but it is much quicker to do and can
handle far more cases.
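
To illustrate the recursive-count-plus-memoization approach (a sketch with an
example of my own choosing): the exact expected comparison count of randomized
quicksort satisfies C(n) = n - 1 + (2/n) * sum of C(k) for k < n, and
memoizing that recurrence gives exact values with no generating functions:

    from functools import lru_cache
    import math

    @lru_cache(maxsize=None)
    def expected_comparisons(n):
        # Exact expected comparisons of randomized quicksort on n items.
        if n < 2:
            return 0.0
        return (n - 1) + 2.0 / n * sum(expected_comparisons(k) for k in range(n))

    print(expected_comparisons(1000))   # exact value: ~10985.9
    print(2 * 1000 * math.log(1000))    # asymptotic leading term 2n ln n: ~13815.5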

------
revelation
Of course O-notation is mostly misguided, but then there are places where it
comes back to bite you. Remember that DoS that worked on basically every web
framework out there, because everything is a clever hash table nowadays? O(n^2)
means very slow very quickly.

And I certainly wouldn't want to use a database with O(n) lookup or worse.
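
For anyone who hasn't seen the mechanics: with a chained hash table, n inserts
that all land in one bucket cost about n^2/2 probes in total. A toy sketch
(my illustration, not any real framework's table):

    class ChainedTable:
        # Toy hash table with separate chaining; no resizing, for illustration.
        def __init__(self, buckets=1024):
            self.buckets = [[] for _ in range(buckets)]
            self.probes = 0

        def insert(self, key, value):
            chain = self.buckets[hash(key) % len(self.buckets)]
            for i, (k, _) in enumerate(chain):  # scan for an existing key
                self.probes += 1
                if k == key:
                    chain[i] = (key, value)
                    return
            chain.append((key, value))

    class Colliding:
        # Adversarial key: every instance hashes to the same bucket.
        def __init__(self, x):
            self.x = x
        def __hash__(self):
            return 0
        def __eq__(self, other):
            return self.x == other.x

    t = ChainedTable()
    n = 2000
    for i in range(n):
        t.insert(Colliding(i), i)
    print(t.probes)  # n*(n-1)/2 = 1999000: each insert scans the whole chain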

~~~
praptak
Same goes for any implementation of quicksort with a deterministic choice of
pivot. It is possible to construct a DoS-type permutation that makes it
quadratic.

> O(n^2) means very slow very quickly.

You mean Theta(n^2) :-)

~~~
Derander
If we're being really picky, it is possible to deterministically find the
median of n elements in \Theta(n) time, so we can deterministically select
the median element as our pivot.

This gives a deterministic \Theta(n log n) quicksort.

As mentioned elsewhere this algorithm has a fairly large constant factor and
is not used in practice.
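
(For the curious, the linearity falls out of the standard median-of-medians
recurrence T(n) <= T(n/5) + T(7n/10) + cn; because 1/5 + 7/10 < 1, the work
shrinks geometrically level by level and the total is \Theta(n).)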

------
JD557
I can't say I completely agree with this. Even though big-O notation is far
from perfect, I assume analytic combinatorics has its problems as well: since
it's based on the scientific method, it should have some errors (although I
have not read about it, so I might be wrong).

Also, I've never seen anyone claim that quicksort is O(n^2). Usually only the
average case is considered, where it's O(n log n). This is where I believe
analytic combinatorics should come in: if you want to compare two algorithms
with the same order of growth on the average case. Otherwise, I think it's
better to use big-O (analytic combinatorics to compare linear vs. exponential
growth algorithms seems like a bit of overkill).

~~~
beagle3
Every ref to quicksort worth its salt mentions that it is O(n log n) average
case, and O(n^2) worst case.

The problem with "average case" analysis is that you must give your
assumptions; without precisely stating your assumptions, "average case" is
useless.

------
rck
Using complex analysis to study generating functions that describe time
complexity is definitely interesting, and there are some cases where using
generating functions can make a complicated analysis a bit easier to deal
with. But I don't see any reason to think that analytic combinatorics is more
practical on a day-to-day basis than big-O notation. Analytic combinatorics is
much closer in spirit to the techniques that Knuth uses to analyze performance
in The Art of Computer Programming, and I don't know many programmers who
prefer Knuth's style of analysis to big-O.

------
patrickg
Isn't that why we look at best/average _and_ worst case, not just the worst
case alone?

~~~
schabernakk
My thought exactly. Quicksort, for example, has an average case of n log n,
just like mergesort.

Also, Big-O notation is primarily useful for seeing how algorithms scale with
larger datasets, not for seeing which algorithm is faster. (Although in a lot
of cases, the algorithm with the better O-notation is also the faster one.)

~~~
tensor
It seems that analytic combinatorics (what is advocated instead of big-Oh) is
the modern study of things like average case analysis. That said, it seems
like bad advice to advocate ignorance of the worst case. Both average and
worst cases should be considered.

------
jiggy2011
IMO, understanding Big O helps you avoid doing stupid stuff like O(n!), where
the problem can grow faster than your ability to purchase hardware to keep up.
But you probably know intuitively when you are writing code like this.

If you have very performance-critical code you _probably_ have some sort of
range of inputs in mind. In that case it's more practical to just do
benchmarking and statistics-based evaluation.
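
A minimal sketch of what that looks like in Python (assuming you know the
realistic input sizes for your workload):

    import random
    import timeit

    def bench(sort_fn, sizes=(1_000, 10_000, 100_000), repeats=5):
        # Time sort_fn over the input sizes you actually expect in production.
        for n in sizes:
            data = [random.random() for _ in range(n)]
            t = min(timeit.repeat(lambda: sort_fn(list(data)),
                                  number=1, repeat=repeats))
            print(f"n={n:>7}: {t * 1000:8.2f} ms")

    bench(sorted)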

------
danek
What is the origin of these articles that take the form "X Considered
Harmful"?

What's wrong with adding the word "is" between "X" and "considered"?

Who is doing the considering?

What is X harmful towards?

Can anyone explain why this is a thing?

~~~
shadowfiend
This is a thing because it is an allusion to Edsger Dijkstra's letter “Go To
Statement Considered Harmful”. See more at
<http://en.wikipedia.org/wiki/Considered_harmful>. Evidently the phrase
predates that letter, but its pervasiveness in computing blog posts and such
is probably due to the relation to Dijkstra's article/letter.

------
freework
Most of the time, if you're thinking about big-O, you're practicing premature
optimization. Just write the code using your language's sort() function. If by
profiling you determine the call to sort() is a bottleneck, then _AND ONLY
THEN_ should you consider analyzing the big-O implications of the different
sort algorithms.

~~~
dinkumthinkum
No, that's not true. That may be true if your whole world is making CRUD apps,
but there is a whole world out there. People use the phrase "premature
optimization" as a crutch to not understand how computers work.

~~~
philhippus
So Donald Knuth was trying to get out of understanding computers, got it.

~~~
Derander
The actual quote reads: "We should forget about small efficiencies, say about
97% of the time: premature optimization is the root of all evil."

O(n) vs O(n^2) is not a small efficiency for non-trivial data sets. This is
the point that was being made. Many people forget that the quote hinges on the
word "small" and then use that as an excuse to disengage their brain when it
comes to basic things.

------
amit_m
I don't get this post.

O notation specifies the asymptotic behavior of (mathematical) functions, e.g.
sqrt(n) = O(n/log(n)).

In order to use O notation to describe the performance of an algorithm, one
must specify (1) what is being measured and (2) what the class of inputs is.
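
(To sanity-check that example: sqrt(n) = O(n/log(n)) holds because the ratio
sqrt(n) / (n/log(n)) = log(n)/sqrt(n) tends to 0 as n grows.)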

------
gnosis
Cached version:

<http://jng.imagine27.com.nyud.net/index.php/2013-02-10-121226_analytic-combinatorics-is-better-o-nation-considered-harmful.html>

------
tbirdz
Just in case you missed it, the book is available for free on the author's
website: <http://algo.inria.fr/flajolet/Publications/book.pdf>

