If you don't understand something, spend a few hours and implement it.
"I hear and I forget. I see and I remember. I do and I understand."
Try to use the table to answer the following question: what is the time complexity of finding the next item (by key) after a given one in a hash table? Memorizing such stuff does not make much sense, but if you understand the basic concepts, you will figure it out quickly.
There are basic errors in the table:
BFS and DFS are for graphs, not just trees, and their complexity is not b^d (what are b and d anyway?).
Quicksort's expected complexity is O(n log n), and that is different from its average complexity. You can also make quicksort's worst-case complexity O(n log n).
You can't sort anything with less space than the number of items you are sorting.
You can use bubble and insertion sort not only for arrays but also for lists, and the time complexity does not suffer.
I assume it's referring to extra space used. Most analysis of space I've seen refers to this, not the space required to store the elements.
Breadth and depth. These could apply to graphs as well, since these algorithms draw a tree while traversing the graph.
No legal DRM free option as far as I can see.
The fifth link was a direct link to the PDF.
Quicksort in the worst case can take O(n^2) time, not O(n log n).
Of course, nobody does this, because the selection is not really free. The constant factor cost of choosing the optimal pivot hurts the average case, making the modified quicksort tend to do worse than heapsort or introsort.
This is not used in practice because the probability of getting worst case behavior is extremely slim if you do some clever, and cheap, tricks.
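To make "clever and cheap" concrete: the classic trick is picking the pivot at random, so no fixed input can reliably trigger the O(n^2) case. A minimal sketch (assuming a Lomuto partition; the details are my own, not from the article):

```cpp
#include <cstdlib>
#include <utility>
#include <vector>

// Quicksort with a uniformly random pivot. No fixed input can reliably
// trigger the O(n^2) worst case, so the expected running time is O(n log n).
void quicksort(std::vector<int>& a, int lo, int hi) {
    if (lo >= hi) return;
    // The cheap trick: pick the pivot at random instead of always taking a[hi].
    std::swap(a[lo + std::rand() % (hi - lo + 1)], a[hi]);
    int pivot = a[hi], i = lo;
    for (int j = lo; j < hi; ++j)            // Lomuto partition
        if (a[j] < pivot) std::swap(a[i++], a[j]);
    std::swap(a[i], a[hi]);
    quicksort(a, lo, i - 1);
    quicksort(a, i + 1, hi);
}
```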
Problem ill-stated as posed: it does not specify whether the 'next' key is hashed, or even what 'next' means in this context.
Well, keys in a hash table are hashed. This implies that unless you're searching for a specific key (e.g. "42") rather than for a condition (e.g. "smallest key greater than 42"), the time complexity is necessarily O(N).
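To make that concrete: hashing destroys key order, so a condition query has to visit every entry. A rough sketch (the container and names are my own choice, not from the thread):

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Finding the smallest key greater than `k` in a hash table: hashing scatters
// the keys, so there is nothing smarter than an O(N) scan over all entries.
std::optional<int> next_key(const std::unordered_map<int, std::string>& table,
                            int k) {
    std::optional<int> best;
    for (const auto& [key, value] : table)   // must touch every bucket
        if (key > k && (!best || key < *best))
            best = key;
    return best;
}
```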
Interestingly enough, this works both ways. To put it more accurately, memorization MAY imply superficial understanding. Memorization (and the associated intuitive pattern-matching) may also lead to understanding.
A cheat sheet like this is also a good shortcut for refreshing one's memory of the concepts if this knowledge is not used on a regular basis.
I remember learning the 9 times table as a child and suddenly realizing that n * 9 = (n * 10) - n, thinking "hmm, does n * x = (n * (x + 1)) - n?" Why yes it does!
I know what you are thinking: "Why would you post something in Chinese? How could non-Chinese speakers understand?" The reason is I want y'all to check out this super cool add-on for Firefox: https://addons.mozilla.org/en-us/firefox/addon/perapera-kun-... With this plugin, the meanings of the words show up on mouseover and you can pretty much get the meaning.
I'd rather ask a candidate to explain their favorite data structure to me and derive its big-O complexity right then and there. Being able to regurgitate the correct answer doesn't count for anything in my book.
It is good to know what you should know about though.
... obviously they wanted 2, but O() doesn't work that way - changing base is a constant factor. Sometimes there's a tension between figuring out someone's understanding of the algorithm and someone's understanding of the notation (and math behind it). Of course, I gave them both answers...
Only if the base is constant. For example B-trees have O(log_B n)-time operations, and you can't ignore the B because it is a parameter (hence non-constant).
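The identity behind both comments is change of base; the conversion factor is a constant only when the base is:

```latex
\log_a n = \frac{\log_b n}{\log_b a}
\;\Longrightarrow\;
O(\log_2 n) = O(\log_{10} n),
\qquad\text{but}\qquad
O(\log_B n) = O\!\left(\tfrac{\log n}{\log B}\right) \ne O(\log n)
\ \text{when $B$ is a parameter.}
```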
There should also be a section about heaps and their operations. There are a huge number of ways to implement a heap (e.g. linked list, binary tree, or a more exotic structure like a binomial or Fibonacci heap) and there are a lot of tradeoffs in the complexity of different operations between them.
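For the simplest of those, an array-backed binary heap, push and pop are O(log n) via sift-up/sift-down. A minimal sketch of my own (assuming a min-heap of ints), for comparison against the fancier variants:

```cpp
#include <utility>
#include <vector>

// Array-backed binary min-heap: push and pop are O(log n), peek is O(1).
// Binomial and Fibonacci heaps trade these bounds for cheaper merge and
// decrease-key, at the cost of much larger constant factors.
struct MinHeap {
    std::vector<int> v;
    void push(int x) {                       // sift up: O(log n)
        v.push_back(x);
        for (size_t i = v.size() - 1; i > 0 && v[i] < v[(i - 1) / 2];
             i = (i - 1) / 2)
            std::swap(v[i], v[(i - 1) / 2]);
    }
    int pop() {                              // sift down: O(log n)
        int top = v[0];                      // precondition: heap not empty
        v[0] = v.back();
        v.pop_back();
        for (size_t i = 0;;) {
            size_t l = 2 * i + 1, r = l + 1, m = i;
            if (l < v.size() && v[l] < v[m]) m = l;
            if (r < v.size() && v[r] < v[m]) m = r;
            if (m == i) break;
            std::swap(v[i], v[m]);
            i = m;
        }
        return top;
    }
};
```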
It's still pretty useful for sorting data where you know the keys are small integers (say, less than a machine word).
So if you put a bound on the size of the keys, let's say 32-bit, it becomes linear? Obviously it would be cheating to put a giant number here :)
If k ∈ Θ(log N), then O(Nk) becomes O(N log N), which is asymptotically no better than any optimal comparison-based sort.
However, if k ∈ o(log N), then we get an asymptotically better algorithm.
A comparison isn't always O(k). It's O(k) for strings. It's O(1) for machine-word integers. It can be much more than O(k) for user-defined types, e.g. it can be O(k^3) if you have to multiply a matrix out and take the determinant of the product.
So the fact is that quicksort is a factor O(log n) slower than radix sort. The flip side is that quicksort is more generally applicable to any comparison function whereas radix sort only works for lexicographic sorts. In almost all cases that is exactly what you want, but for example I'm not aware of any way to efficiently sort rational numbers with radix sort.
Number theory? Not seeing the connection.
The usual bound of Omega(n log n), proved using decision trees, is only applicable when your only operation is to compare two elements. Radix sort asks for more than this: by assuming a specific structure in its input, it can sidestep the lower bound.
Depending on which operations you assume, sorting can become more or less easy. In the extreme case, if you can ask the array "Please sort yourself." as a basic operation, sorting is O(1). Radix sort assumes bitmasking as a basic operation, which falls on the "make things easier" end of the spectrum, leading to an O(n) algorithm under the stated assumption of constant bit length (or any fixed-width encoding, really; it doesn't need to be bits).
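As a concrete instance of that assumption, here is a byte-at-a-time LSD radix sort on unsigned 32-bit keys, O(n) because the key width is fixed (a sketch of my own under that fixed-width assumption):

```cpp
#include <array>
#include <cstdint>
#include <vector>

// LSD radix sort on 32-bit unsigned keys: four stable counting-sort passes,
// one per byte. Each pass is O(n + 256), so the whole sort is O(n) for a
// fixed key width.
void radix_sort(std::vector<uint32_t>& a) {
    std::vector<uint32_t> buf(a.size());
    for (int shift = 0; shift < 32; shift += 8) {
        std::array<size_t, 257> count{};                 // bucket boundaries
        for (uint32_t x : a) ++count[((x >> shift) & 0xFF) + 1];
        for (int i = 0; i < 256; ++i) count[i + 1] += count[i];
        for (uint32_t x : a) buf[count[(x >> shift) & 0xFF]++] = x;
        a.swap(buf);
    }
}
```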
For example, one might implement a variant of InsertionSort that stores a duplicate count for distinct elements, and then an insertion sort on e.g. 32-bit integers would require at most 4294967296 comparisons per insertion -- a constant factor that can be technically ignored in the complexity analysis. (I did warn you that the constant factor would become unwieldy!)
Note that this doesn't require values to be integers -- it suffices for them to be comparable with a lot of duplicates. The variant of InsertionSort described above would require O(k×n) comparisons, where `n` is the length of the input list and `k` is the number of distinct values, or O(n) whenever `k` is bounded by a constant (as required for radix sort to work in linear time).
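A sketch of that variant as I read the description above (the code is my illustration, not the commenter's): keep the distinct values in sorted order, each with a duplicate count, so each insertion costs at most k comparisons.

```cpp
#include <utility>
#include <vector>

// InsertionSort variant from the comment above: store each distinct value
// once with a duplicate count. Each insertion scans at most k distinct
// values, so sorting costs O(k * n) comparisons -- O(n) for constant k.
std::vector<int> counted_insertion_sort(const std::vector<int>& input) {
    std::vector<std::pair<int, size_t>> runs;        // (value, count), sorted
    for (int x : input) {
        size_t i = 0;
        while (i < runs.size() && runs[i].first < x) ++i;
        if (i < runs.size() && runs[i].first == x) ++runs[i].second;
        else runs.insert(runs.begin() + i, {x, 1});
    }
    std::vector<int> out;                            // expand the counts back
    for (const auto& [value, count] : runs)
        out.insert(out.end(), count, value);
    return out;
}
```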
That's not to say that radixsort doesn't outperform other sorting algorithms in practice -- it usually does. However, that isn't obvious from a strictly complexity theoretic point of view.
In that case bubble sort and insertion sort should be green for the best-case time complexity (O(n) vs. O(n log n) for quicksort/mergesort).
It might also be interesting to make the plot dynamic and allow the visitor to play with different implicit constants for the individual asymptotic bounds.
O(log n) green
O(n log n) yellow
Which is weird, since green-green-red-yellow-red is a really confusing order. I don't know why they did it that way.
Go back and learn the concepts so that you're not memorizing anything.
Thinking cheat sheets are only for interviews is limited. I would say knowing everything by heart is useless with today's internet. Have a solid base, understand a few things, and google the rest tailored to your situation. I like cheat sheets to quickly remember what to explore.
I never even used the word "interview", are you sure you're responding to the correct comment?
With radix sort, you're usually considering using it precisely because K is likely to be significantly smaller than log N, and so it's absolutely relevant to the actual problem at hand.
(For that matter, multiplication is not constant time either - it's O(N) in the number of bits, which is O(log N) in the size of the values stored - but this is conveniently forgotten in most algorithm analysis. If you limit the problem to integers that fit into a machine word, then this factor drops out as a constant, and nobody cares.)
Regardless of what algorithm you're working with, you have to be aware of the limits of the abstraction you use to analyze it. Fibonacci heaps have O(1) amortized insert and decrease-key, but nobody uses them because the constant factors swamp simpler structures with worse computational complexity. And sometimes it's faster to use a red-black tree (or even linear search over an array) than a hashmap, because hashmaps are technically O(k) in key size; red-black trees are too, for comparisons, but in a sparse key space the processor usually only has to examine the first 1-2 characters before it can bail out of the comparison routine, while the hashmap has to examine every character.
Actually, multiplication is worse than O(bits). A naive approach will yield O(bits^2), but you can get something like O(bits^r) where r is not much bigger than 1. See https://en.wikipedia.org/wiki/Multiplication_algorithm#Fast_... for an introduction.
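The first improvement on that page is Karatsuba's trick: three half-size multiplications instead of four, which changes the recurrence:

```latex
\text{schoolbook: } T(n) = 4\,T(n/2) + O(n) = O(n^2),
\qquad
\text{Karatsuba: } T(n) = 3\,T(n/2) + O(n)
  = O\!\left(n^{\log_2 3}\right) \approx O\!\left(n^{1.585}\right).
```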
In general, whether a specific complexity is good or bad differs greatly based on what you're doing. I don't think it's a good idea to have the colors be the same everywhere. A particularly bad instance is how every single data structure is O(n), which is red/"bad".
†testable hypothesis, data requested
Another hypothesis is that those are two largely disjoint populations on HN. The smaller one is displeased with the article and likely to express that in comments; the other, larger one is pleased with the article and doesn't bother much with comments.
First of all... what difference does it make?
Beyond that, I'd say "it's impossible to know". Upvotes don't have strictly defined semantics on HN, and they also serve as a de facto "bookmark" mechanism. Given both of those factors, it's hard to justify assuming any correlation between "support for the content of the article" and the number of upvotes. Some people are signifying "This headline caught my eye, I want to save it to read later", some are endorsing the content, others don't endorse the content but are voting in favor of the resulting discussion, etc., etc.
Is it that people think the comments are worth reading so much that they upvote the article in the hope that other readers will read the comments?
Very possibly. Seems like a perfectly reasonable scenario to me.
Are people afraid of flagging articles?
Why would somebody flag this? It's not off-topic or spam. Just because you think a cheat sheet isn't a great idea is hardly a good reason to flag the post. But then again, flagging doesn't have particularly well-defined semantics either. :-(
There's no reason to assume they are the same parties.
For example, in real-time systems, picking a B-tree over a hash-based data structure might sometimes work better, since there is less of a chance of a sudden spike related to hash resizing; instead there is a small penalty to be paid during insertion. I believe that. Have I actually measured it? No, because it would involve rewriting a bunch of code and it would take time. So I don't know if big-O has saved my ass here.
That is just one example.
Or, say, when it comes to large data storage, knowing the base data structure used in the database will give you some expectation as to how it behaves when the size grows.
All that said, it is hard to look back over 7+ years and say, aha, I know exactly how many times knowing big-O saved me from spending extra time and effort debugging. I can think of maybe only one or two times recently when I had to think about big-O, so I mostly agree with you.
It certainly seems that not knowing anything about big-O will not terribly handicap someone who knows how to use debugging and profiling tools. There are probably other more practical bits of knowledge that are more important to know.
Despite this, these kinds of questions are very popular. I see a few reasons. 1) "Big Company" interviews. Big companies love hiring fresh college grads from good schools. Those grads don't have a lot of relevant software development experience, but they have to be selected and tested somehow, so theoretical CS is the go-to tool. 2) Other companies just copy the interview questions from big-company interviews, thinking "well, they are so big and successful because they use these kinds of questions to select candidates". Whether that's true or not, I don't know, but I believe that process goes on behind the scenes.
Theta is a bit more interesting, however. I think it speaks to the "tameness" of the algorithm.
If your graph is dense, using an adjacency matrix is simpler and will be faster most of the time. If you don't need to query specific edges and all you need to do is iterate over the edges for given vertices, then using adjacency lists (or vectors) is simpler and does the job just as well.
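The two representations side by side (a minimal sketch of my own, assuming an unweighted directed graph):

```cpp
#include <vector>

// Two standard graph representations. The matrix answers "is (u,v) an edge?"
// in O(1) but costs O(V^2) space; the lists cost O(V + E) space and let you
// iterate a vertex's neighbours in time proportional to its degree.
struct Graph {
    std::vector<std::vector<bool>> matrix;   // dense: adjacency matrix
    std::vector<std::vector<int>>  lists;    // sparse: adjacency lists
    explicit Graph(int n)
        : matrix(n, std::vector<bool>(n, false)), lists(n) {}
    void add_edge(int u, int v) {
        matrix[u][v] = true;                 // O(1) edge queries later
        lists[u].push_back(v);               // cheap neighbour iteration
    }
};
```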
From my experience, using vector<list<int>> behaves more poorly due to terrible locality of lists. My use of red-black trees for graphs is mostly limited to implementing Dijkstra using set<pair<cost, node_index>> as a queue, since the priority_queue in <queue> does not have a DecreaseKey operation. Using that set (and some map (needn't be std::map, could well be a vector) of node_index to cost for faster compare during Dijkstra's neighbor loop) can make for a very fast, short, and easy to implement Dijkstra.
My usage is mostly competitive programming, so YMMV.
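For reference, the shape of that set-based Dijkstra (a sketch following the comment above; the adjacency-list layout is my assumption): erasing and reinserting the (cost, node) pair stands in for the missing DecreaseKey.

```cpp
#include <limits>
#include <set>
#include <utility>
#include <vector>

// Dijkstra as described above: std::set<pair<cost, node>> as the priority
// queue, plus a dist vector for fast cost lookups in the neighbour loop.
// Erase + reinsert emulates the DecreaseKey that std::priority_queue lacks.
std::vector<long long> dijkstra(
        const std::vector<std::vector<std::pair<int, int>>>& adj,  // (to, weight)
        int source) {
    const long long INF = std::numeric_limits<long long>::max();
    std::vector<long long> dist(adj.size(), INF);
    std::set<std::pair<long long, int>> queue;                     // (cost, node)
    dist[source] = 0;
    queue.insert({0, source});
    while (!queue.empty()) {
        auto [d, u] = *queue.begin();        // cheapest unsettled node
        queue.erase(queue.begin());
        for (auto [v, w] : adj[u])
            if (d + w < dist[v]) {
                if (dist[v] != INF) queue.erase({dist[v], v});     // DecreaseKey
                dist[v] = d + w;
                queue.insert({dist[v], v});
            }
    }
    return dist;
}
```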
Just because std::set and map are completely awful doesn't mean you should completely give up on trees.
Finally, you got me when it comes to the numerical stuff (eigenvectors and matrix inversion). I haven't looked into that in a while.
To specify: Those more efficient algorithms can surprisingly often be expressed as variants of the simplex method, too.
I think the graph at the end is the most useful thing. It really helps with understanding the complexity in relative terms.
Adding tree stuff would be cool (especially for search, which is often implemented in either a tree or a graph version).
Free interactive Python textbook on Algorithms & Data Structures:
Robert Sedgewick and Kevin Wayne of Princeton have a great textbook (code samples in basic Java):
They also teach a two-part Algorithms course on Coursera:
Thomas Cormen (the C in CLRS) also has a new book called Algorithms Unlocked which introduces some popular algorithms in pseudocode.
There's a lot of good stuff.
mmanfrin, it's this book: http://en.wikipedia.org/wiki/Introduction_to_Algorithms
As others pointed out, one might be better off really understanding them, but for a quick overview this is a very usable website.
Awesome. Next time I'm trying to pass some Silicon Valley interview, I'll have to look this up. Till then, I think I have more practical problems to worry about.