A Gentle Introduction to Algorithm Complexity Analysis (discrete.gr)
132 points by gtklocker on July 4, 2012 | 23 comments



For anyone interested in this subject I'd highly recommend the Algorithm Design Manual by Steven Skiena[0]. He goes over this in pretty good detail in the first ~1/4 of the book. This was my first introduction to Algorithm Complexity Analysis, and by the end of that section I already found myself looking at my code in a different light.

[0] http://www.amazon.com/dp/1849967202


I would recommend Introduction to Algorithms by Cormen. It covers not only complexity analysis but also the most important algorithms, data structures, and solutions to classical problems. http://www.amazon.com/dp/0262033844


Great article. I've understood the why and what of Big-O but never how to do the analysis.

I have a question for the informed however:

The article says that we only consider the fastest growing parts of an algorithm. So, counting instructions, n^2 + 2n would just be n^2. But why do we do that? Imagine an algorithm where we have n^12 + n^2 + n^2 + n^2 + n^2, etc. Do we really ignore each n^2 section?

[Edited]


I assume you mean n^2 + 2n, right?

Given your example, you ignore the n^2 because n^12 grows SO much faster. For any "normal" input size, n^12 wipes out 4*n^2 in terms of cost.
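
To make that concrete, here's a tiny illustrative Python sketch (the exponents are just the ones from the example above) showing how quickly the n^12 term dwarfs the four n^2 terms:

    # Illustrative sketch: compare the n^12 term against the sum of the
    # four n^2 terms from the example above, for a few input sizes.
    for n in (10, 100, 1000):
        big = n ** 12
        small = 4 * n ** 2
        print(f"n={n}: n^12 = {big:e}, 4*n^2 = {small:e}, ratio = {big / small:.1e}")

Even at n = 10 the dropped terms contribute less than a billionth of the total, which is why the analysis throws them away.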


Correct, I meant n^2.

Certainly there are cases where the "insignificant" parts of an algorithm are not so, correct? In a simple case n^12 dominates a few n^2 sections, but there must be exceptions to that.

Are there cases where a more complete analysis is done -- like comparing two similar algorithms? Or at that point are you actually implementing the algorithms to test and measure?


Yes, there are cases where the "insignificant" part is so significant that it's actually more efficient to use an asymptotically slower algorithm for everyday problem sizes. Big O notation is not so much about raw running speed as about scaling.

While the "insignificant" part would usually have to be really large to influence the running time, it's often the case that one algorithm is faster than another due to the second one having a large constant multiplier, even though the Big O notation would suggest otherwise (e.g. O(10nlogn) would be faster than O(1000*n))

It's also common that even if one algorithm is slower in the worst case, its average-case performance is better than that of asymptotically better solutions. A popular example is quicksort, which has a worst-case running time of O(n^2) but in practice is usually faster than most O(n log n) sorting algorithms.
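
A quick comparison-counting sketch of the quicksort point (a deliberately naive first-element-pivot quicksort, for illustration only): on already-sorted input it does roughly n^2/2 comparisons, while on shuffled input it stays close to n log n.

    import random

    comparisons = 0

    def quicksort(xs):
        # Naive quicksort, first element as pivot (illustration only).
        global comparisons
        if len(xs) <= 1:
            return xs
        pivot, rest = xs[0], xs[1:]
        comparisons += len(rest)  # count one comparison per element partitioned
        left = [x for x in rest if x < pivot]
        right = [x for x in rest if x >= pivot]
        return quicksort(left) + [pivot] + quicksort(right)

    for label, data in [("sorted", list(range(500))),
                        ("shuffled", random.sample(range(500), 500))]:
        comparisons = 0
        quicksort(data)
        print(f"{label} input of 500 items: {comparisons} comparisons")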


"Yes, there are cases where the 'insignificant' part is so significant it's actually more efficient to use asymptotically slower algorithm for everyday problem sizes. The Big O notation is not so much about running speed but about scaling."

I do understand that. My point was that the analysis is likely too optimistic in some cases. Maybe that's not an issue. I guess the whole thing is fairly arbitrary anyway, and conservative, since we're going to use "worst case" wherever it can be applied.


It's not about optimism, but about eventual dominance of functions.

If you have n^7 and 5(n^6 + 7n^2), then for every n >= 6 the first is larger. In general, for every pair of polynomials f and g of maximum degrees s and t respectively, f will eventually dominate g if and only if s > t. Constant multiples are taken into account by big-O, but in a sense they can only delay the inevitable: for any nonzero constants you pick to multiply f and g by, there will always exist a point n after which f dominates g forever.
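
A trivial sketch that finds the crossover for the example above:

    # Illustrative: find the first n at which n^7 exceeds 5*(n^6 + 7*n^2);
    # as stated above, this happens at n = 6.
    def f(n): return n ** 7
    def g(n): return 5 * (n ** 6 + 7 * n ** 2)

    crossover = next(n for n in range(1, 1000) if f(n) > g(n))
    print("f(n) first exceeds g(n) at n =", crossover)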

It might be that that n, however, is too large for your purposes; perhaps that's what you mean by optimism. Sedgewick has written about other types of analysis that take this into account; see Algorithms for the Masses: http://www.cs.princeton.edu/~rs/talks/AlgsMasses.pdf


I think you're getting tripped up on what the analysis is. It's a worst-case analysis. That is, at every step of the way, you're figuring out the worst possible behavior. Hence, you won't ever fall into the trap you're worried about. When doing worst-case analysis, if we say "this section of code is n^2" and "this section of code is n^3", then that remains true as n grows.

You're probably thinking not of worst-case analysis (which is what is typically done for Big O), but of average-case analysis. That is, you're not asking what the worst possible running time is, but what the typical running time will be on average. That requires a more complicated analysis. Specifically, we have to start assigning probability distributions to our inputs. In worst-case analysis, reasoning about the inputs is easy: just figure out which inputs give the worst performance, no matter how rare those inputs may be. In average-case analysis, we have to figure out what the likely inputs will be. That requires more work and more assumptions. Typically, you do average-case analysis with a particular problem domain in mind; otherwise, you can't reason about which kinds of inputs are frequent or rare.
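
A small sketch of that distinction using linear search (the uniform-target assumption here is mine, just to make the average case computable):

    import random

    def comparisons_for(xs, target):
        # Count how many elements we examine before finding the target.
        for i, x in enumerate(xs):
            if x == target:
                return i + 1
        return len(xs)

    n = 1000
    xs = list(range(n))

    worst = comparisons_for(xs, n - 1)  # worst case: target at the very end
    trials = 10_000
    avg = sum(comparisons_for(xs, random.choice(xs)) for _ in range(trials)) / trials

    print(f"worst case: {worst} comparisons, average case: ~{avg:.0f} comparisons")

The worst case is n comparisons no matter what; the average case is about (n + 1) / 2, but only under the assumed input distribution.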


It is just a simple kind of analysis. Normally people working on algorithm and data structure analysis would do large amounts of fairly advanced math to analyse the complexity in detail, including all the "insignificant" parts.

Apart from worst-case complexity, they would also analyse average complexity, amortized complexity, etc.


For example, binary search in an array of strings would be overkill and a performance hog if the list on average has a dozen elements (for example, a table of methods or commands).
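
A micro-benchmark sketch of that point (results will vary by machine and runtime; the command list is made up):

    import bisect
    import timeit

    # On a dozen-element list, a plain linear scan is already in the same
    # ballpark as binary search: asymptotics say little at this size.
    commands = sorted(["add", "commit", "diff", "fetch", "init", "log",
                       "merge", "pull", "push", "rebase", "status", "tag"])

    linear = timeit.timeit(lambda: "status" in commands, number=1_000_000)
    binary = timeit.timeit(lambda: bisect.bisect_left(commands, "status"),
                           number=1_000_000)

    print(f"linear scan: {linear:.3f}s, binary search: {binary:.3f}s")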


Logs and fractional exponents often require deeper analysis. If n = 1,000,000 then log n = 6 (using base-10 logs), so n log n would actually be faster than 7n on a data set of that size.

Certain special cases also require further analysis. Some algorithms run extremely fast, or extremely slow, on specific types of data. For example, with data that's expected to be nearly sorted, some sorting algorithms will run faster than expected while others can choke.
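
To illustrate the nearly-sorted case, here's a comparison-counting sketch with insertion sort, the classic algorithm that benefits from almost-sorted input (the amount of disorder below is arbitrary):

    import random

    def insertion_sort_comparisons(xs):
        # Sort a copy with insertion sort and return the number of comparisons.
        xs = list(xs)
        comparisons = 0
        for i in range(1, len(xs)):
            j = i
            while j > 0:
                comparisons += 1
                if xs[j - 1] <= xs[j]:
                    break
                xs[j - 1], xs[j] = xs[j], xs[j - 1]
                j -= 1
        return comparisons

    n = 1000
    nearly_sorted = list(range(n))
    for _ in range(10):  # introduce a handful of random swaps
        a, b = random.randrange(n), random.randrange(n)
        nearly_sorted[a], nearly_sorted[b] = nearly_sorted[b], nearly_sorted[a]

    print("nearly sorted:", insertion_sort_comparisons(nearly_sorted), "comparisons")
    print("shuffled:     ", insertion_sort_comparisons(random.sample(range(n), n)), "comparisons")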

Operations that run at very different speeds also require deeper analysis (or testing). If an algorithm requires O(n) disk reads, O(n^2) memory allocation, and O(n^4) calculations, it's possible for any one of those factors to dominate on real-life data sets.
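
Here's a purely illustrative sketch of that last point; the per-operation costs are made up, chosen only so that each term gets its turn to dominate:

    # Hypothetical per-operation costs in seconds (assumptions, not measurements).
    COST_DISK_READ = 1e-2
    COST_ALLOC = 1e-4
    COST_CALC = 1e-12

    for n in (10, 1_000, 100_000):
        terms = {
            "O(n) disk reads": COST_DISK_READ * n,
            "O(n^2) allocations": COST_ALLOC * n ** 2,
            "O(n^4) calculations": COST_CALC * n ** 4,
        }
        dominant = max(terms, key=terms.get)
        print(f"n={n}: total ~ {sum(terms.values()):.3g}s, dominated by {dominant}")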


In terms of just the expression, yes, you just count the largest term. In cases where you count more closely, the difference is either in the constant factor (quicksort vs mergesort) or in removing randomness or amortization.


I've been doing the Coursera course on Algorithms (from Stanford by Tim Roughgarden). He's going over concepts like Big-O notation as well as analyzing algorithms. This particular article is a refreshing read, and I would highly suggest the Coursera course as an accompaniment if someone wants to go deeper.


A gentle intro? It's a bloody novel!


Nope, it is just an introduction. I would love to see the next parts; my memory of complexity analysis is very rusty. Basically I only remember people _repeating_ that course, leaving the exam just after seeing the questions :)


I wonder if the title is in jest. Clicking on this link and then being greeted with a wall of text (literally! It fills my entire screen) is hardly gentle.

I'm sure it's a good read just not right now >_>


I'd say that's why it is gentle. Most CS books will compress that content into 2-3 pages and expect you to figure the rest out as you go. This article, on the other hand, doesn't leave you to work it out on your own; it's all right there.


The classic CLRS Algorithms book actually starts out with a five-chapter "Foundations" part that covers computational complexity in a good amount of detail. It goes over theta, big and small O, and big and small omega notations and their properties, much as this article seems to. It has an entire chapter on determining the computational complexity of recurrences and another on randomized algorithms. The five chapters together are about 130 pages and probably not what I would call gentle, but I found them surprisingly approachable.


CLRS has an exceptional introduction to complexity and in fact is an exceptional book as a whole. However, it is not always easy to follow for high school students who don't have a mathematical background, as it tends to get a bit formal. It's the same reason many programmers find it repulsive and won't read past the first few pages. For example, the proof of the Master Theorem is clearly not that easy to follow. That's exactly why I wrote this article: to be complete yet approachable to people without a mathematical background.


That being said, the section containing the proof is starred, if I remember correctly. I imagine most people, even those reading CLRS, skip over it.


Fair enough


A similar topic is covered in many computer algorithms books.



