

Cycle Sort - suraj
http://corte.si/posts/code/cyclesort/index.html

======
mfukar
I believe that the author is referring to the case when the array to be sorted
contains only duplicates of a small number of items, where a perfect hash
function can speed up insertion; this turns cycle sort's time complexity into
Θ(n+k), with _k_ being the number of hashes. In such a case, _k_ is not
negligible compared to _n_, so I wouldn't feel comfortable saying cycle sort
is O(n), as he does.

In the general case, cycle sort is Θ(n^2) with a total space complexity of
Θ(n).

edit:typos.
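
The general-case Θ(n^2) figure comes from the position-finding scan, not the
cycling itself; a minimal Python sketch of textbook cycle sort (not code from
the article, just an illustration) makes the nested scans visible:

```python
def cycle_sort(arr):
    """In-place cycle sort: minimizes writes, but each placement needs an
    O(n) scan to count smaller elements, giving Theta(n^2) comparisons."""
    writes = 0
    for start in range(len(arr) - 1):
        item = arr[start]
        # Find item's sorted position by counting smaller elements: O(n) scan.
        pos = start
        for i in range(start + 1, len(arr)):
            if arr[i] < item:
                pos += 1
        if pos == start:
            continue  # already in place
        while arr[pos] == item:  # skip past equal elements (duplicates)
            pos += 1
        arr[pos], item = item, arr[pos]
        writes += 1
        # Rotate the rest of the cycle until it closes at `start`.
        while pos != start:
            pos = start
            for i in range(start + 1, len(arr)):
                if arr[i] < item:
                    pos += 1
            while arr[pos] == item:
                pos += 1
            arr[pos], item = item, arr[pos]
            writes += 1
    return writes
```

Each element is written at most once per cycle, which is the algorithm's
selling point; the quadratic cost is all in the counting scans.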

~~~
kingkilr
This doesn't sound remotely right. If n is the number of elements, k is the
number of distinct elements, and it's O(n + k) as the author describes, then k
is trivially bounded by n (you can't have more distinct elements than you have
elements), resulting in O(n) complexity. How did you get O(n^2)?

------
tsewlliw
If you must restrict the input to permutations of [0,...,N], you already know
the result! Finding cycles is neat, but this works for the same data, minus
having bounds:

(define (trivialsort vals) (lambda (i) i))
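
The joke aside, the cycle-following idea is only slightly less trivial for a
permutation: each value names its own target index, so following cycles sorts
in O(n) time with at most n swaps. A Python sketch (my own illustration,
assuming a 0-indexed permutation of [0, ..., N-1]):

```python
def sort_permutation(a):
    """Sort a list known to be a permutation of 0..len(a)-1 by following
    cycles: a[i]'s value is its target index, so every element is swapped
    into place at most once -- O(n) time, O(1) extra space."""
    for i in range(len(a)):
        while a[i] != i:
            j = a[i]
            a[i], a[j] = a[j], a[i]  # send a[i] to its home index j
    return a
```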

------
stingraycharles
It's nice to say this algorithm is virtually O(n) in practice, but that claim
covers the "cycling" mechanism only. The algorithm needs to prepare a
dictionary of offsets, which has to loop over all the keys, perform an
O(log n) insert operation for each key, and allocate memory on the heap. That
alone makes it (almost) O(n log n), before the actual sorting even starts.

It's a nice idea, but it's not O(n).

~~~
jemfinch
> It needs to prepare a dictionary with offsets, which has to loop over all
> the keys, perform an O (log n) insert operation on all the keys

Surely you've heard of hash tables.
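
Assuming the preparation step in question is counting distinct keys and
computing where each key's block starts in the sorted output, a hash table
keeps it to one O(n) expected-time pass plus work on the k distinct keys
(a hypothetical sketch, not the article's code):

```python
from collections import Counter

def build_offsets(vals):
    """Count each distinct key in one O(n) expected-time pass over a hash
    table, then compute starting offsets for each key's block in the sorted
    output. Sorting the k distinct keys costs O(k log k), which is
    negligible when k << n."""
    counts = Counter(vals)  # hash-table counting: O(n) expected
    offsets, total = {}, 0
    for key in sorted(counts):
        offsets[key] = total
        total += counts[key]
    return offsets
```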

~~~
Deestan
As with all other dictionary/map implementations, hash tables are O(log n) in
the general case.

~~~
jemfinch
No, they're not. Why do I have to contend with arguments like this every time
this topic comes up?

"But," you say, "Hashing a value is O(k), where k is at least log n. Therefore
hash tables only support O(log n) access and update, not O(1)." It's become a
quite fashionable gotcha, as your upvotes indicate.

The problem is, it's wrong. It's correct in a vacuous, put-it-in-a-footnote
sense, but not in any real sense, the way we actually talk about data
structures in computer science.

We have a longstanding tradition in computer science of ignoring the O(k)
operations that you want to ascribe to hashing. The most relevant example of
where we ignore that factor is in--you guessed it--balanced binary trees used
as dictionaries. Comparison, like hashing, is also O(k), where k is >= log n.
So in the technical sense you're espousing, a balanced binary tree would offer
O(log n log n) access and update, rather than the O(log n) access and update
that everyone describes it as.

Of course, in reality, everyone considers comparison to be O(1), and thus they
say that balanced binary trees have O(log n) lookup. Likewise, everyone
considers hashing to be O(1) since it's in the same class of operations as
comparison, and thus they say that hash tables have O(1) lookup. This is how
the real world of computer science actually talks about things, fashionable
Internet objections notwithstanding.

(This is all covered in CLRS, of course, but no one seems to be able to look
things up in books anymore. "We assume that the hash value h(k) can be
computed in O(1) time...If the number of hash-table slots is at least
proportional to the number of elements in the table, we have `n = O(m)` and,
consequently, `alpha = n/m = O(m)/m = O(1)`. Thus, searching takes constant
time on average. Since insertion takes O(1) worst-case time and deletion takes
O(1) worst-case time when the lists are doubly linked, all dictionary
operations can be supported in O(1) time on average.")
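
A minimal chained hash table (my illustration of the scheme the CLRS passage
analyzes, not production code) shows where the O(1) average comes from: with
table size proportional to the element count, the chain scanned per operation
has expected length alpha = n/m = O(1).

```python
class ChainedHashTable:
    """Hash table with collision resolution by chaining. With the number
    of buckets proportional to the number of elements (constant load
    factor), search and insert touch an O(1)-length chain on average."""

    def __init__(self, nbuckets=16):
        self.buckets = [[] for _ in range(nbuckets)]

    def _chain(self, key):
        # Hashing the key is treated as O(1), as in the CLRS analysis.
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, value):
        chain = self._chain(key)
        for pair in chain:
            if pair[0] == key:
                pair[1] = value  # overwrite existing key
                return
        chain.append([key, value])

    def search(self, key):
        for k, v in self._chain(key):
            if k == key:
                return v
        return None
```

(A real implementation would also resize to keep the load factor bounded;
that's the amortized part discussed below.)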

~~~
eleusive
In particular, a good hash table implementation will have O(1) amortized
complexity [1] (i.e. average complexity over a worst-case sequence of
operations). Since in the Real World we deal with sequences of operations
rather than single ones, saying that hash tables provide O(1) operations is
quite correct.

[1] <http://videolectures.net/mit6046jf05_leiserson_lec13/>

------
mise
Nice use of Slovenia's TLD.

------
ancymon
I wonder how it sounds ;)

