
Big-O Algorithm Complexity Cheat Sheet - ashleyblackmore
http://bigocheatsheet.com/
======
wting
You can pass some interviews by blindly memorizing, but it's unnecessary. If
you understand a concept, you can reason out its big-O. Memorization implies
a superficial understanding that may be exposed later.

If you don't understand something, spend a few hours and implement it.

    
    
        "I hear and I forget. I see and I remember. I do and I understand."
    
        - Confucius

~~~
mixedbit
Agreed. Buy a good book (Cormen, for example), learn the algorithms, and
implement them to get a good understanding.

Try using the table to answer the following question: what is the time
complexity of finding the next item (by key) after a given one in a hash
table? Memorizing such stuff does not make much sense, but if you understand
the basic concepts, you will figure it out quickly.

There are basic errors in the table:

BFS and DFS are for graphs, not just trees, and their complexity is not b^d
(what are b and d anyway?).

Quicksort's expected complexity is O(n log n), which is different from its
average complexity. You can also make quicksort's worst-case complexity
O(n log n).

You can't sort anything using less space than the number of items you are
sorting.

You can use bubble and insertion sort not only on arrays but also on lists,
and the time complexity does not suffer.

~~~
rttlesnke
> You can also make quicksort's worst-case complexity O(n log n).

Quicksort in the worst case can take O(n^2) time, not O(n log n).
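
(The classic worst case: a first-element pivot on already-sorted input leaves
partitions of size n-1, n-2, ..., so the total comparison count is roughly

    n + (n-1) + ... + 1 = n(n+1)/2 = O(n^2).)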

~~~
gamegoblin
You can use a randomized selection algorithm to find the median in linear
time, and if you use the median as the pivot you will never get worst-case
n^2 behavior.

This is not used in practice because, with some clever and cheap tricks, the
probability of hitting worst-case behavior is already extremely slim.

~~~
kolistivra
The randomized selection algorithm is actually O(N^2) in the worst case.
Median of medians is the O(N) worst-case selection algorithm.
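
For reference, a minimal Python sketch of the median-of-medians idea (an
illustration, not a tuned implementation; the function name is mine):

    def select(items, k):
        # Return the k-th smallest element (0-indexed) in worst-case O(n).
        if len(items) <= 5:
            return sorted(items)[k]
        # Median of each group of 5; the median of those medians is a pivot
        # guaranteed to land between the 30th and 70th percentiles.
        groups = [items[i:i + 5] for i in range(0, len(items), 5)]
        medians = [sorted(g)[len(g) // 2] for g in groups]
        pivot = select(medians, len(medians) // 2)
        lows = [x for x in items if x < pivot]
        pivots = [x for x in items if x == pivot]
        if k < len(lows):
            return select(lows, k)
        if k < len(lows) + len(pivots):
            return pivot
        return select([x for x in items if x > pivot],
                      k - len(lows) - len(pivots))

Feeding select(a, len(a) // 2) in as quicksort's pivot at every step is what
yields the O(n log n) worst case mentioned upthread.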

------
algorias
This is why it's a terrible idea to ask for some random algorithm's runtime
in an interview. It says absolutely nothing about programming or reasoning
skill.

I'd rather ask a candidate to explain their favorite data structure to me and
derive its big-O complexity right then and there. Being able to regurgitate
the correct answer doesn't count for anything in my book.

~~~
fatemayasmine
Often the follow-up question to "what is the big-O of X?" is "why?" So if you
are just going to memorize the cheat sheet, it won't get you very far.

It is good to know what you should know about, though.

~~~
dllthomas
I was asked the O() of binary search. I said "log n". They asked "what base?"

... obviously they wanted 2, but O() doesn't work that way - changing the
base is only a constant factor. Sometimes there's a tension between figuring
out someone's understanding of the algorithm and someone's understanding of
the notation (and the math behind it). Of course, I gave them both answers...
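
(For the record, the change-of-base identity is why the base washes out:

    log_b(n) = log_2(n) / log_2(b) = constant * log_2(n)

for any fixed base b > 1.)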

~~~
crossroads091
What was the second answer?

~~~
bastijn
"2", and "no need for a base in big-O" - which makes 2 :).

------
vault_
This is a pretty limited list of algorithms. It should definitely include
linear-time sorting algorithms (e.g. bucket or radix sort), as well as graph
algorithms (shortest path at least, but probably also all-pairs shortest path
and minimum spanning tree).

There should also be a section on heaps and their operations. There are a
huge number of ways to implement a heap (e.g. a linked list, a binary tree,
or a more exotic structure like a binomial or Fibonacci heap), and there are
a lot of tradeoffs in the complexity of different operations between them.

~~~
pbiggar
Radix sort isn't linear (<https://en.wikipedia.org/wiki/Radix_sort>). I have
actually published research that said radix sort was linear, only to have it
explained to me later that it is not, for a subtle and hard-to-remember
reason involving number theory.

~~~
nostrademons
It's O(k * n), where k is the number of digits. It takes log_b(N) digits to
represent N distinct integers in base b, so O(k * N) reduces to O(N log N)
when there's no bound on the range of keys.

It's still pretty useful for sorting data where you know the keys are small
integers (say, less than a machine word).
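
A minimal LSD radix sort sketch in Python, assuming non-negative integer
keys (base 256 and the 4-byte key width are arbitrary illustration choices):

    def radix_sort(nums, key_bytes=4):
        # One stable bucketing pass per byte: O(key_bytes * n) overall.
        for shift in range(0, key_bytes * 8, 8):
            buckets = [[] for _ in range(256)]
            for x in nums:
                buckets[(x >> shift) & 0xFF].append(x)
            nums = [x for bucket in buckets for x in bucket]
        return nums

Each pass is O(n + 256) and there are key_bytes passes, which is exactly
where the k in O(k * n) comes from.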

~~~
pbiggar
Thank you!

So if you put a bound on the size of the keys, say 32 bits, it becomes
linear? Obviously it would be cheating to put a giant number here :)

~~~
nostrademons
Right. Basically, if the key size is bounded, then k becomes a constant and it
reduces to O(N).

------
anonymouz
I suppose the general idea for the colors is something like: green = best in
category, red = worst in category, yellow = neither best nor worst?

In that case bubble sort and insertion sort should be green for best-case
time complexity (O(n) vs. O(n log(n)) for quicksort/mergesort).

It might also be interesting to make the plot dynamic and allow the visitor to
play with different implicit constants for the individual asymptotic bounds.

~~~
pbiggar
When reading it, I felt that red was "don't use in production", green was
"fine to use in production" and yellow was "maybe use in production".

~~~
sbov
Except insertion sort is faster than quicksort for small N because of the
overhead involved in quicksort - quicksort has large constants, which big-O
notation isn't designed to show. This is why many library sorting algorithms
fall back to insertion sort once the range being sorted gets small enough.
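
A rough sketch of that hybrid pattern in Python (the cutoff of 16 is an
arbitrary illustration; real libraries tune it empirically):

    CUTOFF = 16

    def insertion_sort(a, lo, hi):
        # Sort a[lo..hi] in place; O(n^2) but with tiny constants.
        for i in range(lo + 1, hi + 1):
            x, j = a[i], i - 1
            while j >= lo and a[j] > x:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = x

    def hybrid_quicksort(a, lo=0, hi=None):
        if hi is None:
            hi = len(a) - 1
        if hi - lo < CUTOFF:
            insertion_sort(a, lo, hi)  # small range: low overhead wins
            return
        pivot = a[(lo + hi) // 2]
        i, j = lo, hi
        while i <= j:  # Hoare-style partition around the pivot
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                i += 1
                j -= 1
        hybrid_quicksort(a, lo, j)
        hybrid_quicksort(a, i, hi)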

~~~
pbiggar
All "don't use in production" tags should be accompanied by "unless you really
know what you're doing".

------
wfunction
If you need this, you're doing yourself a disservice by looking at it.

Go back and learn the concepts so that you're not memorizing anything.

~~~
bastijn
What if you do not use this for an interview, but already have a job that
doesn't require you to apply this every day? I know a lot of these
algorithms, but not all. I hardly ever need them, but when I do, this might
be a good starting point to browse from.

Thinking cheat sheets are only for interviews is limited. I would say knowing
everything by heart is useless with today's internet. Have a solid base,
understand a few, and google the rest as tailored to your situation. I like
cheat sheets as a quick reminder of what to explore.

~~~
wfunction
> Thinking cheat sheets are only for interviews is limited

I never even used the word "interview", are you sure you're responding to the
correct comment?

~~~
bastijn
True, the interview part was aimed at the larger discussion above. The
general point of my reply applies to your comment, though. I think your
statement is a bit harsh - or I'm misinterpreting what you are trying to say.
Saying cheat sheets are useless is not correct, imho.

------
alecbenzer
I'm not totally sure what you mean by "dynamic array", but the vector
algorithm for insertion (which you should probably at least include, if by
"dynamic array" you were implying a more naive insertion scheme) is O(1)
amortized.

~~~
gizmo686
If we are talking about the same thing (an array that you reallocate to twice
its size when you fill), then it has an amortized O(1) time to append
something to the end. If you want to insert something in the beggining, you
will need to move n elements down an index. A random location will average
n/2, which is O(n)
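
A toy doubling array in Python to make both costs concrete (purely
illustrative - Python's list type already does this internally):

    class DynArray:
        def __init__(self):
            self._cap, self._n = 1, 0
            self._data = [None]

        def append(self, x):  # amortized O(1)
            if self._n == self._cap:  # full: copy into a buffer twice as big
                self._cap *= 2
                new = [None] * self._cap
                new[:self._n] = self._data
                self._data = new
            self._data[self._n] = x
            self._n += 1

        def insert(self, i, x):  # O(n): everything after i shifts right
            self.append(x)  # reuse append to grow if needed
            for j in range(self._n - 1, i, -1):
                self._data[j] = self._data[j - 1]
            self._data[i] = x

The doubling copy costs O(n) when it fires, but it only fires after n
appends, so the copying averages out to O(1) per append - hence "amortized".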

~~~
alecbenzer
Ah, true. I guess the chart should probably distinguish between arbitrary
index insertion and appending.

------
Retric
I never understood why people look at 7-8 sorting methods and ignore radix
sort, which often beats everything else with its O(n) average case.
<https://en.wikipedia.org/wiki/Radix_sort> I mean, is the assumption that
people would never actually need a useful real-world understanding of the
topic?

~~~
nostrademons
Radix sort is technically O(k * n), where k is the number of digits. This is
very useful when you know k falls within a bounded range (e.g. sorting a
bunch of integer keys, all of which range from 0-255), but it reduces to
O(n log n) for arbitrary keys, because in general you need log n digits to
represent n distinct items.

~~~
Retric
By that definition, sorting strings with merge sort, for example, takes
O(k * n log n), which is still worse, because string comparison is
worst-case O(k), not O(1).

~~~
nostrademons
Whenever you talk big-O you have to be aware of what your primitive operations
are. When talking about normal sorting algorithms we usually assume comparison
is a primitive operation, and then we're measuring the number of comparisons.
This is not actually the case for strings (and several other data types), but
that cost is the same regardless of which comparison sort you use, and so it
usually doesn't matter in your analysis.

With radix sort, you're usually considering using it precisely _because_ K is
likely to be significantly smaller than log N, and so it's absolutely relevant
to the actual problem at hand.

(For that matter, multiplication is not constant time either - it's O(N) in
the number of bits, which is O(log N) in the size of the values stored - but
this is conveniently forgotten in most algorithm analysis. If you limit the
problem to integers that fit into a machine word, then this factor drops out
as a constant, and nobody cares.)

Regardless of what algorithm you're working with, you have to be aware of the
limits of the abstraction you use to analyze it. Fibonacci heaps have O(1)
amortized insert and decrease-key, but nobody uses them because the constant
factors swamp simpler structures with worse computational complexity. And
sometimes it's faster to use a red-black tree (or even linear search over an
array) than a hashmap, because hashmaps are technically O(k) in key size;
red-black trees are too, for comparisons, but in a sparse key space the
processor usually only has to examine the first 1-2 characters before it can
bail out of the comparison routine, while the hashmap has to examine every
character.

~~~
Retric
True enough. The idea behind big-O notation is really cost = O(whatever) *
(the algorithm's constant difficulty factor) + (the algorithm's overhead). My
point is that if you start adding difficulty factors, the same terms often
wind up in your other algorithms. Granted, string comparisons are generally
O(log k) and pure radix would end up as O(k), but you can also short-circuit
an MSD radix sort if the buckets are small enough, which effectively drops
things back to O(log k) assuming sparse inputs. (If the input isn't sparse,
you're not going to be doing anything past a depth of about 4 anyway.)

------
calebegg
Why is O(n) red and O(n log(n)) yellow? Clearly, O(n log(n)) is slower.

In general, whether a specific complexity is good or bad differs greatly based
on what you're doing. I don't think it's a good idea to have the colors be the
same everywhere. A particularly bad instance is how every single data
structure is O(n), which is red/"bad".

------
Jabbles
Very few commenters think this is a good idea. The majority of posts lament
the rote learning and lack of understanding involved. Why, then, is this
upvoted so much? Is it that people think the comments are worth reading so
much that they upvote the article in the hope that other readers will read
the comments? Are the people commenting negatively upvoting the article in
the hope their comments will be more widely read†? Are people afraid of
flagging articles?

†testable hypothesis, data requested

~~~
asafh
You're assuming the same set of people who are commenting are the ones
upvoting this article.

Another hypothesis is that those are two largely disjoint populations on HN,
with the smaller one displeased with the article and likely to express that
in comments, while the other, larger one is pleased with the article and
doesn't bother much with comments.

------
coldcode
If you only want to hire me based on answering big-O questions, I don't want
to work there anyway. In 32 years of working on highly complex and
performant stuff, not once did I think in terms of big-O. Optimizing is not
about knowing the math but about knowing how to measure and how to interpret
what you measure. Big-O might make you feel smart, but it's a tiny part of
actually constructing something complex and optimal.

~~~
rdtsc
I mostly agree. Big-O knowledge helps you pick the right data structures and
algorithms for the job. The idea is that it will often save you from having
to rediscover all those complexity classes by trial and error while staring
at systemtap traces or timestamps in the log stream.

For example, in real-time systems, picking a btree over a hash-based data
structure might sometimes work better, since there is less of a chance of a
sudden spike related to hash re-sizing; instead there is a small penalty paid
during insertion. I believe that. Have I actually measured it? No, because it
would involve re-writing a bunch of code and would take time. So I don't know
whether big-O has saved my ass here.

That is just one example.

Or, say, when it comes to large data storage, knowing the base data
structure used in the database will give you some expectation of how it
behaves as the size grows.

All that said, it is hard to look back over 7+ years and say, aha, I know
exactly how many times knowing big-O saved me from spending extra time and
effort debugging. I can think of maybe only one or two times recently when I
had to think about big-O, so I mostly agree with you.

It certainly seems that not knowing anything about big-O will not terribly
handicap someone who knows how to use debugging and profiling tools. There
are probably other, more practical bits of knowledge that are more important
to have.

Despite this, these kinds of questions are very popular. I see a few
reasons. 1) "Big Company" interviews. Big companies love hiring fresh college
grads from good schools. Those grads don't have a lot of relevant software
development experience, but they have to be selected and tested somehow, so
theoretical CS is the go-to tool. 2) Other companies just copy the interview
questions from big-company interviews, thinking "well, they are so big and
successful because they are using these kinds of questions to select
candidates". Whether that is true or not, I don't know, but I believe that
process goes on behind the scenes.

------
omershapira
Slightly academic, but this cheat sheet gives shorthand explanations of many
of the methods in the big-O document:

[http://www.scribd.com/doc/39557873/Data-Structures-Cheat-
She...](http://www.scribd.com/doc/39557873/Data-Structures-Cheat-Sheet)

~~~
0xdeadc0de
> To download or read the full version of this document you must become a
> Premium Reader.

~~~
omershapira
WHAT! I'm sorry about that. It's on my own website now:
<http://playground.omershapira.com/Notes/DS_CS.pdf>

~~~
mitchi
This is great! Gratitude

------
westurner
* [https://en.wikipedia.org/wiki/Computational_complexity_theor...](https://en.wikipedia.org/wiki/Computational_complexity_theory)

* <http://mathworld.wolfram.com/ComplexityTheory.html>

* <https://complexityzoo.uwaterloo.ca/Zoo_Glossary#O>

* <http://dbpedia.org/resource/Depth-first_search>

* <http://en.wikipedia.org/wiki/Category:Infobox_templates>

* <http://www.w3.org/TR/rdf-schema/>

------
lettergram
Actually walking through each problem is the only way to understand. There's
not even a point in memorizing; it takes longer to memorize the values than
to learn how to come up with them.

------
pbiggar
Can you annotate it with the subtype of the algorithm? For example, you have
O(log N) worst-case space complexity for quicksort - that is not true for
all variations, including the naive one.

------
andreasvc
[http://webcache.googleusercontent.com/search?q=cache:PPvTs45...](http://webcache.googleusercontent.com/search?q=cache:PPvTs45TpbEJ:bigocheatsheet.com/)

------
tlarkworthy
If you want to visualise big-O runtime, draw it on a log-log scale. The slope
on the log-log plot is the exponent, i.e. if it's at 45 degrees it's O(n),
and if it's at a gradient of 2:1 it's O(n^2). A handy fact for working out
your big-O without having to do the tedious math! (See
<http://jcsites.juniata.edu/faculty/kruse/cs2/ch12a.htm>)
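
A quick sketch of that trick (assumes numpy; work is whatever function you
want to profile, and the sizes here are arbitrary):

    import random
    import time
    import numpy as np

    def estimate_exponent(work, sizes=(1000, 2000, 4000, 8000, 16000)):
        # Fit log(time) against log(n); the slope approximates the exponent.
        times = []
        for n in sizes:
            start = time.perf_counter()
            work(n)
            times.append(time.perf_counter() - start)
        slope, _intercept = np.polyfit(np.log(sizes), np.log(times), 1)
        return slope

    # Sorting is O(n log n), so the fitted slope comes out a bit above 1
    # (timings this small are noisy, so expect some scatter):
    print(estimate_exponent(lambda n: sorted(random.random()
                                             for _ in range(n))))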

~~~
eru
A log-log plot is useful for polynomials, but has nothing to do with big O
notation in general.

~~~
tlarkworthy
I would argue the polynomial exponent is about the only thing that matters
in a practical performance setting. The difference between n log(n) and n is
barely discernible when looking at actual results (and the constant factor
is much more important then). If your algorithm is x^n or n!, you're f __*ed
anyway, so those aren't important cases for tuning. In high-performance
algorithms, after you add adaptive caching etc., your results are highly
dependent on your data; in that case you expect to get results like
O(n^2.34) and such, so you can't work it out through the analytical
approach. Your only recourse is empirical measurement, in which case log-log
plots are the only sensible choice. The author of the article only has a
linear plot on the page, which is almost always the worst choice for
graphing algorithmic performance, hence I brought up the issue.

------
dmead
Not once are theta or omega used, so this cheat sheet isn't all that
descriptive.

~~~
flebron
To be fair, people really aren't too interested in Omega, at least not in
conjunction with worst cases. Omega is more suitable for best cases, and that
in turn is slightly useless without any knowledge of how common it is. For
instance, telling you that bubblesort is Omega(n) in the worst case isn't
terribly useful, and telling you it's Omega(n) in the best case is somewhat
more enlightening (you now have an absolute asymptotic lower bound), but still
not really useful without knowing that the best case is going to be very rare
(most of your n! possible input permutations have a lot of inversions).

Theta is a bit more interesting, however. I think it speaks to the "tameness"
of the algorithm.

------
fmax30
I don't know why people don't use a balanced BST (std::map in C++) for
storing the adjacency lists of a graph. Sure, insertion would take O(log n)
time, but I think the overall benefit would be greater than the cost.
Correct me if I am wrong.

~~~
ufo
Storing the graph structure in a BST is only useful if your graph is very
sparse _and_ you need fast lookup for checking specific edges (say, given
two nodes, find the cost of the edge between them).

If your graph is dense, using an adjacency matrix is simpler and will be
faster most of the time. If you don't need to query specific edges, and all
you need to do is iterate over the edges of given vertices, then using
adjacency lists (or vectors) is simpler and does the job just as well. See
the sketch below.
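
A compact sketch of the representations being compared, using a hypothetical
4-node graph (a Python dict stands in for std::map here, giving O(1)
expected lookup rather than O(log n)):

    n = 4
    edges = [(0, 1, 5.0), (1, 2, 1.5), (0, 3, 2.0)]  # (u, v, cost)

    # Adjacency matrix: O(V^2) space, O(1) edge lookup; suits dense graphs.
    matrix = [[None] * n for _ in range(n)]
    for u, v, cost in edges:
        matrix[u][v] = cost

    # Adjacency lists: space proportional to E; suits iterating neighbours.
    lists = [[] for _ in range(n)]
    for u, v, cost in edges:
        lists[u].append((v, cost))

    # Map per node (the std::map idea): fast specific-edge lookup when sparse.
    maps = [dict() for _ in range(n)]
    for u, v, cost in edges:
        maps[u][v] = cost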

~~~
fmax30
If my graph is dense, then wouldn't using an adjacency matrix take O(V^2)
space? Anyway, I was just suggesting this because I use it in practice. Just
wanted to know the cons, if any.

~~~
gsg
The space taken by an adjacency matrix doesn't depend on edge count. That's
why it is usually the favoured representation for dense graphs.

------
montecarl
I would like to see the same type of complexity cheat sheet for algorithms
for common math problems: addition, multiplication, division, subtraction,
factorization, solving linear systems, solving eigensystems, matrix
inversion, etc.

~~~
ufo
Those depend a lot on the context. Firstly, the complexity of the arithmetic
operations you listed mostly only matters if you are working with big
numbers (which is not commonly the bottleneck). Factorization has no known
polynomial algorithm, so I don't see why the complexity matters anyway (it's
still going to take longer than your lifetime on hard inputs). As for linear
systems, it depends a lot on your input and the problem you want to solve.
If we talk about the simplex algorithm that most people use, empirically it
takes around cubic time, but it is still an open problem to find a
pivot-choice heuristic that does not have pathological exponential
worst-case performance. In addition to that, many important problems are
modeled as linear programs but have extra special structure that lets them
be solved with more efficient algorithms.

Finally, you got me when it comes to the numerical stuff (eigenvectors and
matrix inversion). I haven't looked into that in a while.

~~~
eru
> In addition to that, many important problems are modeled as linear programs
> but will have extra special structure that let them be solved with more
> efficient algorithms.

To be specific: those more efficient algorithms can surprisingly often be
expressed as variants of the simplex method, too.

------
kriro
Pretty cool, thanks.

I think the graph at the end is the most useful thing. It really helps with
understanding the complexity in relative terms.

Adding tree stuff would be cool (especially for search, which is often
implemented in either a tree or a graph version).

------
fatemayasmine
It is smart how you can edit the table and contribute via GitHub.

------
mmanfrin
Question: I'm a junior software developer that did not get a CS degree. What
would be the best way to learn and understand this sort of stuff?
Coursera/Khan? A book?

~~~
anonymoushn
The math behind this stuff is in the first third of most calculus textbooks.
The CS half can be found in CLRS. You can also probably learn it from TopCoder
tutorials or usacogate (I recall there being a mirror of usacogate that did
not require you to do all of the problems in order to advance).

~~~
henrikschroder
CLRS? Why you young whippersnapper! Back in my day it was called CLR, and
those three letters were good enough for us! Kids these days...

mmanfrin, it's this book:
<http://en.wikipedia.org/wiki/Introduction_to_Algorithms>

~~~
why-el
The book alone was a little terse for me. The videos from OCW, however, are
priceless - even entertaining sometimes. [1]

[1] [http://ocw.mit.edu/courses/electrical-engineering-and-
comput...](http://ocw.mit.edu/courses/electrical-engineering-and-computer-
science/6-046j-introduction-to-algorithms-sma-5503-fall-2005/index.htm)

------
zukhan
You should create an empty version of the cheat sheet so people can test their
understanding of various data structures and algorithms.

------
Dylan16807
I'm amused that every single data structure is proportional in size to the
amount of data stored and therefore marked as red.

------
graup
Well done!

As others pointed out, one might be better off really understanding them, but
for a quick overview this is a very usable website.

------
AYBABTME
This is very shallow and of limited use.

~~~
kintamanimatt
It's a cheatsheet, not a tutorial.

------
anonymoushn
Why don't your sorted trees support indexing? Mine support indexing in O(lg n)
time.
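
For anyone wondering how: the standard trick is to store a subtree size in
each node and steer on it. A minimal sketch (unbalanced BST for brevity; a
balanced tree is what actually gives the O(lg n) bound):

    class Node:
        def __init__(self, key):
            self.key, self.left, self.right, self.size = key, None, None, 1

    def tree_size(node):
        return node.size if node else 0

    def insert(node, key):
        # Ordinary BST insert, also bumping the size of every ancestor.
        if node is None:
            return Node(key)
        if key < node.key:
            node.left = insert(node.left, key)
        else:
            node.right = insert(node.right, key)
        node.size += 1
        return node

    def select(node, i):
        # Return the i-th smallest key (0-indexed) under node.
        left = tree_size(node.left)
        if i < left:
            return select(node.left, i)
        if i == left:
            return node.key
        return select(node.right, i - left - 1)

    root = None
    for k in [5, 2, 8, 1, 9]:
        root = insert(root, k)
    print(select(root, 2))  # -> 5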

------
oakaz
Is this only for interviews?

------
exabrial
/trolling on

Awesome. Next time I'm trying to pass some silicon valley interview, I'll have
to look this up. Till then, I think I have more practical problems to worry
about.

/trolling off

