227 points by coder007 on March 10, 2017 | hide | past | favorite | 76 comments

 A really cool algorithmic problem I've been obsessed with lately is stable sorting in O(n log n) time and O(1) extra space. It's possible (Trabb Pardo 1977, Katajainen and Pasanen, Huang and Langston, etc.) but all known algorithms are very complicated. As far as I understand now, there seems to be a "wall" at sqrt(n) extra space. If we have that much, it's not too hard to write a stable mergesort or quicksort with the right time complexity. But if we are restricted to O(1) or O(log n) space, the algorithms become much more involved, using a sqrt(n) buffer inside the array to encode information about the rest. There are working implementations on GitHub (GrailSort and WikiSort), both are over 500 lines.Here's a couple innocent-looking special cases that are still surprisingly hard:1) Given an array [a1, a2, ..., an, b1, b2, ..., bn], rearrange it into [a1, b1, a2, b2, ..., an, bn] using O(n) time and O(1) extra space.2) Given an array of zeroes and ones, sort it stably using O(n) time and O(1) extra space.I'd be very interested to hear about any advances in this area.
 Nit-pick: I believe there are no sorting algorithm in O(1) space: just keeping the index of an element takes log2(N) space. Most algorithms hide this fact by assuming numbers have an infinite range. If you do, then you can use a single data for any algorithm, given a sufficiently clever encoding :) (Just use a number base-N and then you got as much easy-to-access, direct-addressable storage as you want per elements.)
 When people talk about these things, they usually assume something like the transdichotomous model (machine word size is unknown, n fits in a machine word, arithmetic on machine words is O(1)). It gives the right complexities for everyday algorithms.
 > 2) Given an array of zeroes and ones, sort it stably using O(n) time and O(1) extra space.I'm missing something here. If the array's values truly are all in {0,1}, then all the zeroes are indistinguishable and you get stability for free (same for the ones).`` mysort all01list = let ones = sum all01list in replicate (length all01list - ones) 0 ++ replicate ones 1``
 On the first glance the first problem seems quite impossible. For example, if n is power of two, then for each prime number < n there will be a cycle starting with that number. If n is not a power of two, I haven't yet seen any good explanation of cycles. Any hints? We can't use a or b in any way?
 I think you're on the right track (i.e. viewing it as a permutation, and looking at the cycle decomposition of that permutation).Try indexing from 0 instead of 1 if you're not. Then the cycle containing 1 will start with (1, 2, 4, 8, ...). What happens when it wraps around?
 throwaput on March 10, 2017 Even for power of two it's not correct, forget my statement about cycles.
 Yet I have only one idea - that we can permute the elements somehow (in a O(1) reversible way like a Gray code) so that the cycles would form something computable in O(n) time and O(1) space
 Interesting problems. But is there a practical reason why you'd want to put such stringent restriction on extra space? I mean, space is relatively cheap nowadays (except of course if locality of reference/cache size is an issue, where it translates back into time).
 a) allocating the extra space might be expensive.and most importantly:b) you might not have extra space. For example, if you are implementing an out of core sort (i.e. sorting huge on-disk datasets), you want to maximize the size of the chunk you will sort in memory and do not want to waste any memory for the temporary storage. (this is related to locality of reference of course, you can a make similar argument for out-of-cache sorting).
 On one hand, amelius is right, there are already simple fast stable sorts using O(sqrt(n)) extra space and that should be okay for all practical purposes. (Though they aren't very well known, and inventing one will take you a week or two.) Allocating 30K elements to help sort a billion elements is no big deal, you don't need to squeeze further. My interest is more academic, how come the known O(1) solutions are so complex compared to O(sqrt(n)) and can they be simplified.
 I didn't know about this problem before but now I'm incredibly curious. I love these problems where you find a trade-off between two seemingly unrelated things, here how going from o(sqrt(n)) to o(log(n)) memoey increases the code complexity by order of magnitude. If you take the code for an o(1) or o(log(n)) memory sorting algorithm, I wonder if you can identify a subroutine that if memoized would give you an o(sqrt(n)) memory sorting algorithm you already know.
 I mentioned that in the first comment. Fast in-place stable sorts usually have a step where they must freely rearrange some things (blocks, medians, etc.) and remember their original order, to preserve stability in case some of them were equal. It seems like no one can squeeze the number of these things below sqrt(n). If you have sqrt(n) extra space, you can store a permutation there. If you don't, you use part of the array itself as temporary bit storage, by changing the relative positions of elements. That's where the complexity comes from.
 The both problems that you mentioned are quite similar, IMO
 The first has a simple solution if you know some university-level math. The second has no simple solution AFAIK.
 I must be missing something, because problem 2 seems easy. It's just an array of boolean values, right? So you just pass over the array once, counting the number of zeroes. Then pass of the first n elements (for n the number of zeroes) and write zero to them, and write one to all the rest of the array. That's clearly stable (right?), requires only two passes over the array so it's O(N) time, and the only memory used is an index for looping and the count of the number of zeroes so it's O(1) memory. I'm guessing I'm missing something important, what is it?
 Suppose you have a type A ('0' in the original description) and a type B ('1' in the original description). Suppose any instance of A < any instance of B, but any given instance of A is not necessarily even comparable to another instance of A. Similarly any instance of B is not necessarily comparable to another instance of B. Now stably sort an array containing instances of A and B.
 OK, now I see why this is hard, thanks. Now I'm gonna spend the next few days mulling over this problem :)I noticed the problem is relatively easy if you're dealing with a linked list instead of an array, because you can easily move elements around. If there's some way to amortize that in the array case...
 A linked list has O(n) extra space to play with, compared to an array, so of course everything is easier. You can't transfer algorithms from one to the other.
 I've spent the last hour browsing www.techiedelight.com and I would like to say this is a great resource for coding problems. I like that the methodology for solving them is discussed for each posting.I would be interested in hearing recommendations for other such sites that specifically discuss the methodology for approaching these problems.Geeksforgeeks.com is another such resource for learning the approaches as well but I would be curious to hear any other suggestions as well.
 I suppose you meant geeksforgeeks.org
 Indeed, yes, sorry.
 The k'th largest element in an array can also be found by the nth_element() algorithm already included in the STL. It has linear complexity as opposed to the O(n log k) and O(k log n) variants proposed here.
 It is only linear in std::distance(first, last), so it very well might be (and likely is) nlogk .Edit: see comment below for why it actually is O(n) and not O(n logk) as I had thought.
 But std::distance(first, last) is how n is defined.std::nth_element is typically implemented using a fancy version of quickselect called introselect. You can imagine quickselect as a version of quicksort where the recursive call is only executed on the side which contains the element. Introselect adds some fanciness to ensure worst-case linear running time if quickselect takes too long (bad pivot selection).
 very cool! I stand corrected, I didn't think of this when I considered it. Sorry for the mix-up
 bogomipz on March 10, 2017 Its O(n logk) then? It maintain a heap I'm guessing?
 No, as per my previous comment, it t does not use a heap. It's like quicksort, but only doing one of the recursive calls:- choose a pivot p randomly from the elements- partition into elements <= p and > p. This takes time O(n).- if the number of elements <= p is at most k (the rank of the element we want), do a recursive call on these elements. Otherwise recurse on the other elements (those >p).Because a random pivot splits the elements well enough most of the time, this has an expected running time of O(n): 1 + ½ + ¼ + ⅛ + … < 2 in the best case, and similar constants if the partitioning is a little worse. But in the worst case, when the pivot is the smallest/largest element in each iteration, it's O(n²). That's where the fanciness of introselect comes in: if quickselect doesn't converge, it switches to an algorithm with worst-case O(n) time, which is slower in most cases but saves the day when quickselect has trouble.
 See "median of medians algorithm": https://en.wikipedia.org/wiki/Median_of_mediansIt's hairier than a partial quicksort, but (if you erase the proof and weigh the chalk dust) guaranteed linear time.
 and median of medians can be used to implement quicksort with O(NlogN) worst case. It is usually not worth it as other hybrid quicksorts like introsort are faster in practice.On an unrelated note, I was once asked to prove the upper bound for median of median on an interview...
 bogomipz on March 10, 2017 Thanks, right, the idea to is to only recurse down one side on each pass right? This is the (sub)family of "selection" algorithms?This is interesting I had not heard of introselect before, I will have to read up on this.
 Yup, exactly. It's a fun area that gets even more interesting when you look at distributed systems, where every node only stores a small portion of the overall data and you want to keep communication volume low.
 >"It's a fun area that gets even more interesting when you look at distributed systems"Can you elaborate a little bit, specifically how selection algorithms help reduce "chatter" in distributed systems? I am not familiar with this and this sounds like an interesting context.I have heard of Merkle Trees for similar but that's obviously hashing and not selection algorithms, or is there some connection I am not making?
 Well if you'd like to select the k-th biggest element from a distributed array it's impractical - if not impossible - to send everything to one machine so another approach is needed. A fairly simple algorithm would select a pivot randomly on a random node and broadcast it to all other nodes. Each node then does the partitioning locally and counts partition sizes. Now you do a sum all-reduction on the sizes to decide which side you need to recurse on. This takes logarithmically many rounds in expectation just like the sequential algorithm, but it could result in very unbalanced work if the data isn't distributed randomly. It also has even more trouble if the pivot selection is bad because of the communication overhead. There are various techniques to overcome these challenges but it's a bit much to go into here.
 Is the nth_element() function an implementation of "quick select" then"
 Thought this was going to be "problems in STL data structures" not "solving problems using with STL". A better title might be "Solving algorithm problems in C++" as STL isn't really important to the solutions.
 "as STL isn't really important to the solutions."Sure it is. It's as much as showcase of practical applications of the STL as it is of algorithms. Any algorithms book has implementations, the great thing about this site is that it shows you how to use the STL in real world applications.
 I was thinking the same, but don't agree on the latter: using the STL makes code quite compact and easier to follow.
 Does anyone know of a website that teaches you data structures and algorithms, preferably in C, in an interactive way? I've always found that's the best way for me to learn.
 There's a great course on Coursera that uses Java by Bob Sedgewick.
 It claimed that recursive binary search has a space complexity of O(1) - it has O(log n) as the stack frames are your used memory
 But it can be implemented iteratively instead of recursively, so that there is only one stack frame. O(1) space for binary search is entirely possible. The link demonstrates both versions, the iterative one uses O(1) auxiliary space and the recursive implementation uses O(log n), unless the compiler uses tail call optimization :)
 Since C++11 elision of copies of named local lvalues being returned is mandatory. In practice this was elided by older smart compilers and new compilers elide a whole more when returning.
 That seems entirely unrelated to tail call optimisation though? For once, "return foo(bar, baz)" isn't returning a named value, it's returning an rvalue. Secondly, return value optimisation doesn't optimise away the stack frame. Thirdly, the standard isn't as strict as you make it sound: compilers may elide the copy if they can, but if they cannot they have to move, never copy.
 Funny how I focused on "problems" and "using STL" when I read the headline the first time. I remember C++ fondly but only because my mind has now forgotten all the "fun" of debugging STL.
 Thanks for sharing this. I was looking into http://www.geeksforgeeks.org/top-10-algorithms-in-interview-... but it's not very idiomatic.Then, make sure you use std::make_unique or std::make_shared rather than operator new if you are going for a C++ job.
 Very useful work! I would love to see it implemented in many other languages. Any takers? :-)
 Just looked at the kth largest algorithm and author claims max heap method is k log n ! His method is actually n log n. The min heap is n log k and the right way to do it.
 it is clearly O(klogn). We are popping k times from a heap of size n.
 I'm very used with this kind of stuff with C++.Which modern language should I try if the first thing I miss in a language is the STL and the C-like syntax?
 Call me crazy but stick with c++.It has a ton of warts and complexity but it also has the features you need. If you program in C++ you can have simple solutions 95% of the time and when the intrinsic complexity is higher the language has the tools to save the day. Many other languages will solve the simple parts slightly cleaner but the complexity explodes on the difficult bits.However if you are looking for fun I really like Rust, it has a nice c++ feel without having to worry so much. Personally I still think it is a bit too new for "enterprise coding" but it is maturing nicely and learning it will really teach you a lot.
 Some what analogous: OCaml has Modules and Functors (not the normal FP functors but more like signature mappers).OCaml I guess isn't that modern but they did just add First Class modules [1].
 Most modern general purpose languages will provide you with equivalent functionality. But the C++ standard library and STL, not to mention Boost, has taken this very far.
 C#, Java, Swift, Rust, D probably.
 D?
 Is there something similar that's language agnostic or in Python?
 Well there is one of the best books on the subject."Introduction to Algorithms"
 "The Algorithm Design Manual", by Steven Skiena, is very readable and it's generously free to download. I was going to post the link, but I couldn't unmangle it from Google. It's easy to find.
 This looks like it: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.471....I don't see any links to it from any of Skiena's or the publisher's webpages, so it's not clear how legitimate it is.Incidentally, I wouldn't really suggest Skiena as an alternative to Cormen et al; they're extremely different in style and content, and in situations where you need one of them the other probably won't help you. I recommend getting both. (For more verbosity on this, see my review of the first edition of Skiena's book at https://www.mccaughan.org.uk/g/books/alg-design.html .)
 CLR is information-dense and authoritative. I have a copy from my college algorithms course and I refer to it occasionally. Skiena is a lighter read, so, I figure, you're more likely to actually read it. Whatever algorithms text you read, you will want to continue to augment your knowledge if you're going to confidently adapt and engineer algorithms.I agree with you. Get both.
 http://interactivepython.org/runestone/static/pythonds/index...It's free. It doesn't have the formalism of CLRS which some might not like but its still a great practical resource if you want Python.
 Sure there is, but I doubt you'll be able to get the same performance as C++ without doing heavy optimization.
 I don't think the question had anything to do with speed.To answer the question, there are many general books on algorithm design that would be appropriate for any language.
 Is there an offline/e-book version?
 Website is unusable on mobile. Text falls off right side, horizontal scrolling disabled. How is this still a thing?
 Looks fine on my phone. Do you have desktop request on by accident?
 No I do not. Code is fine. Comments in code are cut off mid word. PixelXL. Chrome.
 It doesn't state what standard it is using. Is it with:`````` C++98 or C++03 or C++11 or C++14 or C++11 or C++17 or C++20 ?``````
 Appears to be C++11, which is contained twice in your list. C++20 doesn't exist yet. C++17 isn't finished either.It doesn't appear to use new language features, only some library things like std::unordered_map, from what I gathered in a quick look. Most things will probably work just fine with C++03.
 It compiles with -std=c++11, graph one fails with 03 on WSL at least
 I see "nullptr" in a few examples, which is C++11.
 Yeah but it should work in C++03 if you just '#define nullptr NULL'. Not saying it's good practice but C++11 usage is very light.
 If this would be in Python, for example, for starters we would need to know if it is 2 or 3 variant.Then we would need to know which of the following versions is being used:`````` Python 1.0 - January 1994 Python 1.5 - December 31, 1997 Python 1.6 - September 5, 2000 Python 2.0 - October 16, 2000 Python 2.1 - April 17, 2001 Python 2.2 - December 21, 2001 Python 2.3 - July 29, 2003 Python 2.4 - November 30, 2004 Python 2.5 - September 19, 2006 Python 2.6 - October 1, 2008 Python 2.7 - July 3, 2010 Python 3.0 - December 3, 2008 Python 3.1 - June 27, 2009 Python 3.2 - February 20, 2011 Python 3.3 - September 29, 2012 Python 3.4 - March 16, 2014 Python 3.5 - September 13, 2015 Python 3.6 - December 23, 2016 `````` And for extra fun we can also add PyPy, CPython, IronPython, MicroPython, .... into the mix.I can gladly play this game with any other programming language.Languages that people care about, get new versions all the time, and not all tooling implementations or developers catch up at the same time.It is part of our job to deal with it.
 Email the author and ask? After he tells you, then use -std=c++XX (or similar) on your favorite compiler and you'll be fine.

Search: