A really cool algorithmic problem I've been obsessed with lately is stable sorting in O(n log n) time and O(1) extra space. It's possible (Trabb Pardo 1977, Katajainen and Pasanen, Huang and Langston, etc.) but all known algorithms are very complicated. As far as I understand now, there seems to be a "wall" at sqrt(n) extra space. If we have that much, it's not too hard to write a stable mergesort or quicksort with the right time complexity. But if we are restricted to O(1) or O(log n) space, the algorithms become much more involved, using a sqrt(n) buffer inside the array to encode information about the rest. There are working implementations on GitHub (GrailSort and WikiSort), both are over 500 lines.
Here's a couple innocent-looking special cases that are still surprisingly hard:
1) Given an array [a1, a2, ..., an, b1, b2, ..., bn], rearrange it into [a1, b1, a2, b2, ..., an, bn] using O(n) time and O(1) extra space.
2) Given an array of zeroes and ones, sort it stably using O(n) time and O(1) extra space.
I'd be very interested to hear about any advances in this area.
Nit-pick: I believe there are no sorting algorithms in O(1) space: just keeping the index of an element takes log2(N) space. Most algorithms hide this fact by assuming numbers have an infinite range. If you do that, then you can get away with a single datum for any algorithm, given a sufficiently clever encoding :) (Just use a number in base N and you've got as much easy-to-access, directly-addressable storage as you want per element.)
When people talk about these things, they usually assume something like the transdichotomous model (machine word size is unknown, n fits in a machine word, arithmetic on machine words is O(1)). It gives the right complexities for everyday algorithms.
> 2) Given an array of zeroes and ones, sort it stably using O(n) time and O(1) extra space.
I'm missing something here. If the array's values truly are all in {0,1}, then all the zeroes are indistinguishable and you get stability for free (same for the ones).
mysort all01list = let ones = sum all01list in replicate (length all01list - ones) 0 ++ replicate ones 1
At first glance, the first problem seems quite impossible.
For example, if n is a power of two, then for each prime number < n there will be a cycle starting with that number. If n is not a power of two, I haven't yet seen any good explanation of the cycles.
Any hints? We can't use a or b in any way?
So far I have only one idea: we could permute the elements somehow (in an O(1), reversible way, like a Gray code) so that the cycles form something computable in O(n) time and O(1) space.
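For what it's worth, here is a minimal sketch of that cycle-leader idea in C++, assuming the special case where 2n - 1 is a power of three (the function name and that restriction are mine). With 0-indexing, the element at index i must move to index 2i mod (2n - 1), the last index stays put, and when 2n - 1 = 3^k each cycle of that permutation contains exactly one of the indices 1, 3, 9, ..., 3^(k-1). General n needs an extra decomposition step (e.g. via rotations, as in the published in-place shuffle algorithms), which is omitted here.

    #include <cstddef>
    #include <utility>
    #include <vector>

    // Interleave [a1..an, b1..bn] into [a1, b1, ..., an, bn] in place.
    // Assumes v.size() == 2n with 2n - 1 a power of three.
    void out_shuffle_pow3(std::vector<int>& v) {
        std::size_t m = v.size();
        if (m < 2) return;
        std::size_t mod = m - 1;               // assumed to be 3^k
        // Indices 0 and m-1 are fixed points; every other cycle of
        // i -> 2i mod (m-1) contains exactly one power of three.
        for (std::size_t leader = 1; leader < mod; leader *= 3) {
            std::size_t cur = leader;
            int carried = v[leader];
            do {
                std::size_t next = (2 * cur) % mod;  // destination of cur
                std::swap(carried, v[next]);         // drop off, pick up displaced
                cur = next;
            } while (cur != leader);
        }
    }

For example, with n = 5 (array size 10, 2n - 1 = 9) the leaders are 1 and 3, and the two cycles together cover indices 1 through 8.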
Interesting problems. But is there a practical reason why you'd want to put such a stringent restriction on extra space? I mean, space is relatively cheap nowadays (except of course if locality of reference/cache size is an issue, where it translates back into time).
b) you might not have extra space. For example, if you are implementing an out-of-core sort (i.e. sorting huge on-disk datasets), you want to maximize the size of the chunk you will sort in memory and don't want to waste any memory on temporary storage. (This is related to locality of reference, of course; you can make a similar argument for out-of-cache sorting.)
On one hand, amelius is right, there are already simple fast stable sorts using O(sqrt(n)) extra space and that should be okay for all practical purposes. (Though they aren't very well known, and inventing one will take you a week or two.) Allocating 30K elements to help sort a billion elements is no big deal, you don't need to squeeze further. My interest is more academic, how come the known O(1) solutions are so complex compared to O(sqrt(n)) and can they be simplified.
I didn't know about this problem before but now I'm incredibly curious. I love these problems where you find a trade-off between two seemingly unrelated things, here how going from O(sqrt(n)) to O(log(n)) memory increases the code complexity by an order of magnitude. If you take the code for an O(1) or O(log(n)) memory sorting algorithm, I wonder if you can identify a subroutine that, if memoized, would give you an O(sqrt(n)) memory sorting algorithm you already know.
I mentioned that in the first comment. Fast in-place stable sorts usually have a step where they must freely rearrange some things (blocks, medians, etc.) and remember their original order, to preserve stability in case some of them were equal. It seems like no one can squeeze the number of these things below sqrt(n). If you have sqrt(n) extra space, you can store a permutation there. If you don't, you use part of the array itself as temporary bit storage, by changing the relative positions of elements. That's where the complexity comes from.
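To make that last trick concrete, here's a toy sketch (my own illustration, not code from GrailSort or WikiSort): if you can find 2k pairwise-distinct elements inside the array, the relative order of each pair stores one bit, readable and writable with nothing but comparisons and swaps.

    #include <cstddef>
    #include <utility>
    #include <vector>

    // buf is the start of an internal buffer of 2*k pairwise-distinct elements;
    // bit i lives in the pair (a[buf + 2i], a[buf + 2i + 1]).
    void write_bit(std::vector<int>& a, std::size_t buf, std::size_t i, bool bit) {
        int& x = a[buf + 2 * i];
        int& y = a[buf + 2 * i + 1];
        // Encode 0 as (smaller, larger) and 1 as (larger, smaller).
        if ((x > y) != bit) std::swap(x, y);
    }

    bool read_bit(const std::vector<int>& a, std::size_t buf, std::size_t i) {
        return a[buf + 2 * i] > a[buf + 2 * i + 1];
    }

Of course the real algorithms also have to find those distinct elements and merge the buffer back stably at the end, which is presumably where much of the 500+ lines goes.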
I must be missing something, because problem 2 seems easy. It's just an array of boolean values, right? So you pass over the array once, counting the number of zeroes. Then pass over the first n elements (for n the number of zeroes) and write zero to them, and write one to all the rest of the array. That's clearly stable (right?), requires only two passes over the array so it's O(N) time, and the only memory used is an index for looping and the count of the number of zeroes, so it's O(1) memory. I'm guessing I'm missing something important, what is it?
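In code, that two-pass overwrite would be something like this (my sketch; as the reply below points out, it only works when the values really are bare zeroes and ones):

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    void sort01_by_counting(std::vector<int>& v) {
        // Pass 1: count the zeroes.
        std::size_t zeros = std::count(v.begin(), v.end(), 0);
        // Pass 2: overwrite the array with that many zeroes, then ones.
        std::fill(v.begin(), v.begin() + zeros, 0);
        std::fill(v.begin() + zeros, v.end(), 1);
    }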
Suppose you have a type A ('0' in the original description) and a type B ('1' in the original description). Suppose any instance of A < any instance of B, but any given instance of A is not necessarily even comparable to another instance of A. Similarly any instance of B is not necessarily comparable to another instance of B. Now stably sort an array containing instances of A and B.
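A hypothetical C++ rendering of that setup (the Item type and payload field are just for illustration): the comparator only sees the class tag, so the payloads' relative order survives only if the sort is stable. std::stable_sort gives a reference answer here, but it is neither O(n) time nor O(1) extra space, which is the whole challenge.

    #include <algorithm>
    #include <vector>

    struct Item {
        bool is_b;    // false = type A ('0'), true = type B ('1')
        int payload;  // the comparator never looks at this
    };

    int main() {
        std::vector<Item> v = {{true, 1}, {false, 1}, {true, 2}, {false, 2}};
        // Reference answer only: std::stable_sort is O(n log n) time and may use
        // O(n) extra space, so it doesn't meet the bounds in the challenge.
        std::stable_sort(v.begin(), v.end(),
                         [](const Item& x, const Item& y) { return x.is_b < y.is_b; });
        // Expected order: (A,1), (A,2), (B,1), (B,2).
    }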
OK, now I see why this is hard, thanks. Now I'm gonna spend the next few days mulling over this problem :)
I noticed the problem is relatively easy if you're dealing with a linked list instead of an array, because you can easily move elements around. If there's some way to amortize that in the array case...
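Right; for a singly linked list a sketch of problem 2 might look like this (hypothetical Node type): one pass splits the list into a chain of zeroes and a chain of ones, preserving relative order, then the chains are spliced, all with a constant number of extra pointers.

    struct Node { int key; Node* next; };

    // Stable 0/1 sort of a singly linked list in one pass, O(1) extra pointers.
    Node* stable_sort_01(Node* head) {
        Node zeros{0, nullptr}, ones{0, nullptr};  // dummy heads, live on the stack
        Node* z = &zeros;
        Node* o = &ones;
        for (Node* p = head; p != nullptr; p = p->next) {
            if (p->key == 0) { z->next = p; z = p; }
            else             { o->next = p; o = p; }
        }
        o->next = nullptr;      // terminate the ones chain
        z->next = ones.next;    // splice: all zeroes, then all ones
        return zeros.next;
    }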
A linked list has O(n) extra space to play with, compared to an array, so of course everything is easier. You can't transfer algorithms from one to the other.
I've spent the last hour browsing www.techiedelight.com and I would like to say this is a great resource for coding problems. I like that the methodology for solving them is discussed for each posting.
I would be interested in hearing recommendations for other such sites that specifically discuss the methodology for approaching these problems.
Geeksforgeeks.com is another such resource for learning the approaches, but I would be curious to hear any other suggestions as well.
The k'th largest element in an array can also be found by the nth_element() algorithm already included in the STL. It has linear complexity as opposed to the O(n log k) and O(k log n) variants proposed here.
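For example (my sketch, taking the array by value so the caller's order isn't disturbed; assumes 1 <= k <= v.size()):

    #include <algorithm>
    #include <cstddef>
    #include <functional>
    #include <vector>

    int kth_largest(std::vector<int> v, std::size_t k) {
        // Partially sorts so that the k-th largest ends up at index k-1.
        // Average-case linear time (typically implemented as introselect).
        std::nth_element(v.begin(), v.begin() + (k - 1), v.end(), std::greater<int>());
        return v[k - 1];
    }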
But std::distance(first, last) is how n is defined.
std::nth_element is typically implemented using a fancy version of quickselect called introselect. You can imagine quickselect as a version of quicksort where the recursive call is only executed on the side which contains the element. Introselect adds some fanciness to ensure worst-case linear running time if quickselect takes too long (bad pivot selection).
No, as per my previous comment, it does not use a heap. It's like quicksort, but only doing one of the recursive calls:
- choose a pivot p randomly from the elements
- partition into elements <= p and > p. This takes time O(n).
- if the number of elements <= p is at least k (the rank of the element we want), do a recursive call on these elements. Otherwise recurse on the other elements (those > p), looking for rank k minus the number of elements <= p.
Because a random pivot splits the elements well enough most of the time, this has an expected running time of O(n): 1 + ½ + ¼ + ⅛ + … < 2 in the best case, and similar constants if the partitioning is a little worse. But in the worst case, when the pivot is the smallest/largest element in each iteration, it's O(n²). That's where the fanciness of introselect comes in: if quickselect doesn't converge, it switches to an algorithm with worst-case O(n) time, which is slower in most cases but saves the day when quickselect has trouble.
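A bare-bones sketch of the quickselect part described above (my own code, no introselect fallback, random pivot, 0-based rank; the partition is done three-way to avoid degenerate behaviour on duplicates; assumes 0 <= k < v.size()):

    #include <algorithm>
    #include <cstdlib>
    #include <vector>

    int quickselect(std::vector<int>& v, std::size_t lo, std::size_t hi, std::size_t k) {
        // Invariant: the answer is the element of rank k within [lo, hi).
        while (true) {
            if (hi - lo == 1) return v[lo];
            int pivot = v[lo + std::rand() % (hi - lo)];   // random pivot, copied by value
            // Three-way partition: [< pivot][== pivot][> pivot], O(hi - lo) time.
            auto mid_lo = std::partition(v.begin() + lo, v.begin() + hi,
                                         [&](int x) { return x < pivot; });
            auto mid_hi = std::partition(mid_lo, v.begin() + hi,
                                         [&](int x) { return x == pivot; });
            std::size_t lt = mid_lo - (v.begin() + lo);    // # elements < pivot
            std::size_t eq = mid_hi - mid_lo;              // # elements == pivot
            if (k < lt)           hi = lo + lt;            // keep the left part
            else if (k < lt + eq) return pivot;            // pivot has rank k
            else { k -= lt + eq; lo += lt + eq; }          // keep the right part, adjust rank
        }
    }
    // Usage: quickselect(v, 0, v.size(), k) returns the (k+1)-th smallest element.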
And median of medians can be used to implement quicksort with O(N log N) worst case. It is usually not worth it, as other hybrid quicksorts like introsort are faster in practice.
On an unrelated note, I was once asked to prove the upper bound for median of medians in an interview...
Yup, exactly. It's a fun area that gets even more interesting when you look at distributed systems, where every node only stores a small portion of the overall data and you want to keep communication volume low.
>"It's a fun area that gets even more interesting when you look at distributed systems"
Can you elaborate a little bit, specifically how selection algorithms help reduce "chatter" in distributed systems? I am not familiar with this and this sounds like an interesting context.
I have heard of Merkle Trees for similar but that's obviously hashing and not selection algorithms, or is there some connection I am not making?
Well, if you'd like to select the k-th biggest element from a distributed array it's impractical - if not impossible - to send everything to one machine, so another approach is needed. A fairly simple algorithm would select a pivot randomly on a random node and broadcast it to all other nodes. Each node then does the partitioning locally and counts partition sizes. Now you do a sum all-reduction on the sizes to decide which side you need to recurse on. This takes logarithmically many rounds in expectation, just like the sequential algorithm, but it could result in very unbalanced work if the data isn't distributed randomly. It also has even more trouble if the pivot selection is bad, because of the communication overhead. There are various techniques to overcome these challenges, but it's a bit much to go into here.
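To make one round of that concrete, here's a hypothetical MPI sketch (names are mine): after the pivot has been broadcast, every rank counts its local elements on one side and a sum all-reduction tells everyone which side to keep.

    #include <mpi.h>
    #include <algorithm>
    #include <vector>

    // One round of distributed quickselect: count, across all ranks, how many
    // elements are <= pivot. If the global count is >= k, recurse on the
    // "<= pivot" side; otherwise recurse on the other side with k reduced.
    long long count_leq_global(const std::vector<int>& local, int pivot, MPI_Comm comm) {
        long long local_count = std::count_if(local.begin(), local.end(),
                                              [&](int x) { return x <= pivot; });
        long long global_count = 0;
        MPI_Allreduce(&local_count, &global_count, 1, MPI_LONG_LONG, MPI_SUM, comm);
        return global_count;
    }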
Thought this was going to be "problems in STL data structures", not "solving problems using the STL". A better title might be "Solving algorithm problems in C++", as the STL isn't really important to the solutions.
Sure it is. It's as much a showcase of practical applications of the STL as it is of algorithms. Any algorithms book has implementations; the great thing about this site is that it shows you how to use the STL in real-world applications.
Does anyone know of a website that teaches you data structures and algorithms, preferably in C, in an interactive way? I've always found that's the best way for me to learn.
But it can be implemented iteratively instead of recursively, so that there is only one stack frame. O(1) space for binary search is entirely possible. The link demonstrates both versions; the iterative one uses O(1) auxiliary space and the recursive implementation uses O(log n), unless the compiler uses tail call optimization :)
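For reference, an iterative version along those lines (my sketch): two indices, no recursion, O(1) auxiliary space.

    #include <cstddef>
    #include <vector>

    // Returns the index of key in the sorted vector v, or -1 if it's absent.
    std::ptrdiff_t binary_search_index(const std::vector<int>& v, int key) {
        std::size_t lo = 0, hi = v.size();          // search the half-open range [lo, hi)
        while (lo < hi) {
            std::size_t mid = lo + (hi - lo) / 2;   // written this way to avoid overflow
            if (v[mid] < key)      lo = mid + 1;
            else if (key < v[mid]) hi = mid;
            else                   return static_cast<std::ptrdiff_t>(mid);
        }
        return -1;
    }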
Since C++11, elision of copies of named local lvalues being returned is mandatory. In practice this was elided by older smart compilers, and new compilers elide a whole lot more when returning.
That seems entirely unrelated to tail call optimisation though? For one, "return foo(bar, baz)" isn't returning a named value, it's returning an rvalue. Secondly, return value optimisation doesn't optimise away the stack frame. Thirdly, the standard isn't as strict as you make it sound: compilers may elide the copy if they can, but if they cannot, they have to move, never copy.
Funny how I focused on "problems" and "using STL" when I read the headline the first time. I remember C++ fondly but only because my mind has now forgotten all the "fun" of debugging STL.
Just looked at the kth largest algorithm and the author claims the max heap method is O(k log n)! His method is actually O(n log n). The min heap is O(n log k) and the right way to do it.
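The min-heap version, for reference (my sketch, assumes 1 <= k <= v.size()): keep a min-heap of the k largest elements seen so far, so each of the n pushes/pops costs O(log k).

    #include <cstddef>
    #include <functional>
    #include <queue>
    #include <vector>

    int kth_largest_minheap(const std::vector<int>& v, std::size_t k) {
        // std::greater turns priority_queue into a min-heap: top() is the smallest kept.
        std::priority_queue<int, std::vector<int>, std::greater<int>> heap;
        for (int x : v) {
            heap.push(x);
            if (heap.size() > k) heap.pop();  // evict the smallest of the k+1
        }
        return heap.top();                    // the k-th largest overall
    }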
It has a ton of warts and complexity but it also has the features you need. If you program in C++ you can have simple solutions 95% of the time and when the intrinsic complexity is higher the language has the tools to save the day. Many other languages will solve the simple parts slightly cleaner but the complexity explodes on the difficult bits.
However, if you are looking for fun I really like Rust; it has a nice C++ feel without having to worry so much. Personally I still think it is a bit too new for "enterprise coding", but it is maturing nicely and learning it will really teach you a lot.
Most modern general purpose languages will provide you with equivalent functionality. But the C++ standard library and STL, not to mention Boost, have taken this very far.
"The Algorithm Design Manual", by Steven Skiena, is very readable and it's generously free to download. I was going to post the link, but I couldn't unmangle it from Google. It's easy to find.
I don't see any links to it from any of Skiena's or the publisher's webpages, so it's not clear how legitimate it is.
Incidentally, I wouldn't really suggest Skiena as an alternative to Cormen et al; they're extremely different in style and content, and in situations where you need one of them the other probably won't help you. I recommend getting both. (For more verbosity on this, see my review of the first edition of Skiena's book at https://www.mccaughan.org.uk/g/books/alg-design.html .)
CLR is information-dense and authoritative. I have a copy from my college algorithms course and I refer to it occasionally. Skiena is a lighter read, so, I figure, you're more likely to actually read it. Whatever algorithms text you read, you will want to continue to augment your knowledge if you're going to confidently adapt and engineer algorithms.
Appears to be C++11, which is contained twice in your list. C++20 doesn't exist yet. C++17 isn't finished either.
It doesn't appear to use new language features, only some library things like std::unordered_map, from what I gathered in a quick look. Most things will probably work just fine with C++03.