To be pedantic, in-place sorting algorithms (typically?) take O(log n) space, not O(1). They don't copy the data to be sorted, but they do need some temporary space that isn't fixed and scales (slowly) with the size of the array being sorted. The usual sources of the log-n space growth are a recursion stack (explicit or implicit) and/or the array indices.
This particular implementation (looking at the C version) uses fixed-size 'long ints' for its temporary data storage, which means it only works on arrays up to LONG_MAX elements. If you had larger arrays, your need for temporary data would grow, e.g. you could upgrade all those long ints to long long ints and accommodate arrays up to LLONG_MAX. Of course, logarithmic growth is very slow.
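To make that concrete, here's a minimal sketch (mine, not the actual code of the implementation being discussed) of how the choice of index type caps the sortable array size; the names sort_index_t and insertion_sort are hypothetical:

    #include <limits.h>

    /* Hypothetical index type for an in-place sort's temporaries.
     * With 'long' the sort can address at most LONG_MAX elements; widening
     * the typedef to 'long long' raises the cap to LLONG_MAX at the cost of
     * a few more bytes of temporary storage -- that's the log(n) growth. */
    typedef long sort_index_t;
    #define SORT_INDEX_MAX LONG_MAX   /* keep in sync with the typedef */

    static void insertion_sort(int *a, sort_index_t n) {
        for (sort_index_t i = 1; i < n; i++) {   /* every temporary here is   */
            int key = a[i];                      /* index-sized, so its width */
            sort_index_t j = i - 1;              /* tracks the largest array  */
            while (j >= 0 && a[j] > key) {       /* we want to support        */
                a[j + 1] = a[j];
                j--;
            }
            a[j + 1] = key;
        }
    }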
The data types in the RAM model are integer and floating point (for storing real numbers). Although we typically do not concern ourselves with precision in this book, in some applications precision is crucial. We also assume a limit on the size of each word of data. For example, when working with inputs of size n, we typically assume that integers are represented by c lg n bits for some constant c >= 1. We require c >= 1 so that each word can hold the value of n, enabling us to index the individual input elements, and we restrict c to be a constant so that the word size does not grow arbitrarily. (If the word size could grow arbitrarily, we could store huge amounts of data in one word and operate on it all in constant time—clearly an unrealistic scenario.)
I like to think of the memory requirement as the number of integers required, not that the integers have to get larger. If we considered the size of the integers, then traditional O(log n)-memory algorithms (like the average case of quicksort) would be deemed to take O((log n)^2) memory.
In a nutshell: keeping an index into an array of size n requires log_2(n) bits. Thus any algorithm that keeps even just one index into the input takes Omega(log n) space. Yay for pedantic asymptotics.
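As a toy illustration (mine, not from the parent comment), the number of bits one index needs is just ceil(log_2(n)):

    #include <stdio.h>

    /* Bits needed to represent any index in 0..n-1: ceil(log2(n)). */
    static unsigned bits_for_index(unsigned long long n) {
        unsigned bits = 0;
        while (n > 1) {            /* each (rounded-up) halving = one more bit */
            n = (n + 1) / 2;
            bits++;
        }
        return bits;
    }

    int main(void) {
        printf("%u\n", bits_for_index(256ULL));      /* 8  */
        printf("%u\n", bits_for_index(1ULL << 32));  /* 32 */
        return 0;
    }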
Heap sort is an example of an in-place sorting algorithm that is constant space. Just because this implementation uses some kind of fixed-size buffer doesn't mean the algorithm has an upper limit on the amount of data it can sort. It's possible that it does, but it probably requires a deeper understanding of the algorithm to know for sure.
Maybe I'm missing some way of implementing heapsort without array indices, but all the implementations I've seen use O(log n) space. For example, the pseudocode on Wikipedia (https://en.wikipedia.org/wiki/Heapsort#Pseudocode) at a quick glance uses five temporary variables: 'count', 'end', 'start', 'child', and 'root'. How much temporary space is this? Well, proportional to the log of the size of the array, at a minimum. If you're sorting 256-element arrays, these can each be 8-bit integers, and you need 5 bytes of temporary space. If you're sorting a 2^32 element array, these need to each be at least 32-bit integers, and you need 20 bytes of temporary space. If you have a 2^64 element array, they need to be 64-bit integers, and you need 40 bytes of temporary space. If you have a 2^128 element array, they need to be 128-bit integers, and you need 80 bytes of temporary space. Therefore, temporary space needed grows with the log of the array size. Which is very slow growth, but still growth!
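For concreteness, here's a rough C transcription of that sift-down/heapsort pseudocode (my own sketch, not the Wikipedia code verbatim); the point is simply that every temporary is an index whose width has to keep up with the array size:

    #include <stddef.h>

    /* The width of this type is what drives the 5/20/40/80-byte figures
     * above: every temporary below is an index into the array being sorted. */
    typedef size_t idx_t;

    static void swap(int *a, idx_t i, idx_t j) { int t = a[i]; a[i] = a[j]; a[j] = t; }

    /* Sift a[root] down through the max-heap stored in a[0..end-1]. */
    static void sift_down(int *a, idx_t root, idx_t end) {
        while (2 * root + 1 < end) {
            idx_t child = 2 * root + 1;                 /* left child */
            if (child + 1 < end && a[child] < a[child + 1])
                child++;                                /* pick the larger child */
            if (a[root] >= a[child])
                return;
            swap(a, root, child);
            root = child;
        }
    }

    static void heap_sort(int *a, idx_t count) {
        if (count < 2) return;
        for (idx_t start = count / 2; start-- > 0; )    /* heapify */
            sift_down(a, start, count);
        for (idx_t end = count; end-- > 1; ) {          /* repeatedly pop the max */
            swap(a, end, 0);
            sift_down(a, 0, end);
        }
    }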
The two things you can do are: 1) pick a fixed-size int and cap the max size of array you can handle; or 2) use a bigint datatype, and the size of your bigints will (asymptotically) grow with log(n).
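For option 2, here's what bigint indices might look like using GMP's mpz_t (purely illustrative -- nobody would index an in-memory array this way -- and it assumes GMP is available; build with -lgmp):

    #include <gmp.h>

    /* The loop counter is an arbitrary-precision integer, so its storage
     * genuinely grows as O(log n) as the count approaches n. */
    int main(void) {
        mpz_t i, n;
        mpz_init_set_ui(n, 1000000UL);   /* pretend this is the array length */
        mpz_init_set_ui(i, 0UL);

        while (mpz_cmp(i, n) < 0) {      /* i < n */
            /* ... visit element i ... */
            mpz_add_ui(i, i, 1UL);       /* i++ */
        }

        gmp_printf("counted to %Zd\n", i);
        mpz_clear(i);
        mpz_clear(n);
        return 0;
    }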
But this is a really irrelevant point, as you can't even represent an array of size n without log n space (for the start pointer and the size/end pointer). So, every algorithm involving arrays needs at least log n space, if nothing else for the input arguments.
It's irrelevant in practice, mostly because log(n) is very small for essentially all values of n, so one could consider it "close enough to constant". Close enough in the sense that a reasonably small constant (like "64") bounds any value of it you could possibly encounter in a sorting algorithm. But that's true of many things in big-O analysis.
If your temporary space was used by some indices represented as unary-encoded integers (which take O(n) space), it'd suddenly be harder to ignore the indices, because O(n) is not "very small" for many commonly encountered sizes of n. So I don't think taking the temporary space used by indexing into account is conceptually wrong, it just happens not to matter here because log(n) factors often don't matter.
In practice, though, it makes sense in this case to assume integers representing array indices are constant-space and arithmetic on them takes constant time. The computers we're running these algorithms on use base 2 and the size of integer they can operate on in a single operation has scaled nicely with the amount of storage they have access to.
If people are using abstract data types and/or built-ins, in practice, presumably they can use a fixed int type for O(1) runtime and just update the data type every 20 years or so.
On that note, has anyone actually created a 2^128 element array yet? I suspect such an array would be too large to represent in the memory of all the world's computers at the moment.
Big-Oh notation deals with theory, and in theory the cost of a variable is fixed.
Furthermore, an array of size N is normally storing N variables (pointers), not N bits, so calculating the storage required in bits relative to the input size (without a unit) is disingenuous.
This is incorrect under delirium's pedantic interpretation of space complexity. Some regular expressions require accepting a specific number of repetitions of a character. That means storage space for the count is required, which scales logarithmically in the count.
This is why my comment began with the words "particular, individual". A regex like (1{5}0{3})\* can be implemented in constant space, but a language like

    matches(n, m, string):
        return matches((1{n}0{m})*, string)

cannot.
Just think about it -- the former is just a DFA, and so of course you can do it in constant space (provided your input stream is abstracted away, or you use a TM).
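Here's a throwaway C sketch of that DFA, just to make the constant-space point concrete. The bounds 5 and 3 are baked into the states, which is exactly why a runtime-chosen n and m can't be handled the same way:

    #include <stdio.h>

    /* Hand-rolled DFA for (1{5}0{3})*: five 1s then three 0s, repeated.
     * The only storage is 'state', a small fixed-range integer, so the space
     * is constant no matter how long the input is. */
    static int matches_repeated_block(const char *s) {
        int state = 0;                    /* 0 = between blocks (accepting) */
        for (; *s; s++) {
            if (*s == '1' && state <= 4)
                state++;                                  /* counting the five 1s  */
            else if (*s == '0' && state >= 5 && state <= 7)
                state = (state == 7) ? 0 : state + 1;     /* counting the three 0s */
            else
                return 0;                                 /* dead state: reject */
        }
        return state == 0;                /* accept only between complete blocks */
    }

    int main(void) {
        printf("%d\n", matches_repeated_block("1111100011111000"));  /* 1 */
        printf("%d\n", matches_repeated_block("111110001"));         /* 0 */
        printf("%d\n", matches_repeated_block(""));                  /* 1 */
        return 0;
    }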
How many times would you need to execute line 2? The more times you need to do this, the more values x needs to be able to represent. So if x++ runs O(n) times, x has to hold values up to n, and the space needed to store that accumulated count grows as O(log_2 n)... under delirium's "pedantic" interpretation of space complexity.
Unknowingly you have allowed me to stumble onto an actual algorithm with O(1) storage space: one where there are n numbers and you want to calculate their modulo-k sum. In this case, the algorithm scales logarithmically in the constant parameter k, but O(1) in n.
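A tiny C sketch of that (mine; it streams the numbers from stdin, so there isn't even an array index to account for):

    #include <stdio.h>

    int main(void) {
        const unsigned long k = 10;      /* the constant modulus */
        unsigned long acc = 0, x;
        /* acc always stays in 0..k-1, so the working storage is O(log k),
         * which is O(1) in the number of inputs n. */
        while (scanf("%lu", &x) == 1)
            acc = (acc + x % k) % k;     /* reduce as we go */
        printf("sum mod %lu = %lu\n", k, acc);
        return 0;
    }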