Lew
Michael said:
The usual problem with heapsort-based algorithms, besides having O(n
lg n) as a lower bound, is that they have lousy locality of reference.
In traditional heapsort, once the heap is built, you proceed by taking
the leftmost element in the array (the root of the heap) and swapping
it with the last element of the heap part of the array. If the heap
size is not trivial, you're doing a load from one cache line and a
store into another.
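In code, that extraction loop looks roughly like this (a rough Java sketch,
assuming heapify() has already built a max-heap over the array; siftDown is
the usual re-heapify helper):

    // Rough sketch of the extraction phase described above.
    // Assumes a[0..a.length-1] is already a max-heap.
    static void extractAll(int[] a) {
        for (int heapSize = a.length; heapSize > 1; heapSize--) {
            int last = heapSize - 1;
            int tmp = a[0];        // load from the root's cache line
            a[0] = a[last];
            a[last] = tmp;         // store into a (usually) different cache line
            siftDown(a, 0, last);  // re-heapify: touches ~lg n widely scattered elements
        }
    }

    static void siftDown(int[] a, int i, int heapSize) {
        while (true) {
            int left = 2 * i + 1, right = left + 1, largest = i;
            if (left < heapSize && a[left] > a[largest]) largest = left;
            if (right < heapSize && a[right] > a[largest]) largest = right;
            if (largest == i) return;
            int tmp = a[i]; a[i] = a[largest]; a[largest] = tmp;
            i = largest;
        }
    }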
Smoothsort appears to fix this problem, because it keeps the largest
elements at the right end of the forest of heaps, so the move from the
forest of heaps to the sorted part of the array is local. It also
avoids having to re-heapify large heaps, since it always operates on
the smallest heap left in the forest, and it splits large heaps as it
dequeues from them. In fact, while I've never tried implementing
smoothsort, much less analyzed its behavior, it looks like it has
quite good locality of reference.
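For what it's worth, the heap sizes in smoothsort's forest are Leonardo
numbers, and dequeuing the root of an order-k heap simply exposes its
order-(k-1) and order-(k-2) sub-heaps. A sketch of the sizes only, not the
sort itself:

    // Leonardo numbers: L(0) = L(1) = 1, L(k) = L(k-1) + L(k-2) + 1.
    // Smoothsort's forest consists of heaps of these sizes; removing the root
    // of an order-k heap leaves the order-(k-1) and order-(k-2) sub-heaps in
    // place, so large heaps shrink without any wholesale re-heapify.
    static long leonardo(int k) {
        long a = 1, b = 1;               // L(0), L(1)
        for (int i = 2; i <= k; i++) {
            long next = a + b + 1;
            a = b;
            b = next;
        }
        return b;
    }
    // leonardo(0..8) -> 1, 1, 3, 5, 9, 15, 25, 41, 67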
But Timsort has great locality of reference, because it deals in runs
(including reversed-order runs), and when it doesn't have a decent run
at the current position, it sorts a small fragment of the array to
create one.
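The run detection goes roughly like this (my own simplified sketch, not the
JDK's TimSort code; it leaves out extending short runs to the minimum run
length with binary insertion sort):

    // Length of the run starting at lo (hi is an exclusive bound).
    // Descending runs must be strictly descending so that reversing them
    // in place keeps the sort stable.
    static int countRun(int[] a, int lo, int hi) {
        int i = lo + 1;
        if (i >= hi) return 1;
        if (a[i] < a[lo]) {                          // strictly descending run
            while (i + 1 < hi && a[i + 1] < a[i]) i++;
            reverse(a, lo, i);                       // make it ascending in place
        } else {                                     // ascending run
            while (i + 1 < hi && a[i + 1] >= a[i]) i++;
        }
        return i - lo + 1;
    }

    static void reverse(int[] a, int lo, int hi) {
        while (lo < hi) {
            int tmp = a[lo]; a[lo] = a[hi]; a[hi] = tmp;
            lo++; hi--;
        }
    }

(This is also the sort you get from Arrays.sort on object arrays in Java,
which has used a TimSort implementation since Java 7.)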
On the other hand, when sorts are used in most modern applications,
they're comparing keys that are located elsewhere in memory and
probably invoking a user-supplied comparison function, so locality of
reference may not have much effect on performance.
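For instance, a typical sort-by-key call in Java chases a reference or two
out of the array on every comparison, so the layout of the array itself
hardly matters (Order and customer here are made-up names):

    import java.util.Arrays;
    import java.util.Comparator;

    class KeyedSortExample {
        record Order(long id, String customer) {}

        // Every comparison dereferences two Order objects and their String
        // keys, none of which need be anywhere near the Order[] being sorted,
        // so the comparator's pointer chasing dominates the cache behavior.
        static void sortByCustomer(Order[] orders) {
            Arrays.sort(orders, Comparator.comparing(Order::customer));
        }
    }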
Proving once again that theory can lead you to town, but it can't find you the
concierge at the hotel. For that you need measurement under realistic
conditions. Evidence will tell you if the difference between Timsort,
Quicksort, Smoothsort, Bubblesort, or Stupidsort even matters, much less which
one wins.
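(And if you do that measuring in Java, use a real harness like JMH; the naive
version below, with made-up names, shows the shape of it but ignores JIT
warmup and all the other micro-benchmarking traps.)

    // Naive timing sketch; trust a proper harness such as JMH for real numbers.
    static long timeSortNanos(int[] data, java.util.function.Consumer<int[]> sort) {
        int[] copy = java.util.Arrays.copyOf(data, data.length);  // same input every time
        long start = System.nanoTime();
        sort.accept(copy);
        return System.nanoTime() - start;
    }
    // e.g. timeSortNanos(realisticInput, java.util.Arrays::sort)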
--
Lew
Stupidsort: scan the array looking for the largest element. Copy it to one
end of a temporary array. Repeat with the next-largest element, placing it in
the next position; repeat until the whole array is copied. Quicksort the
result. Copy it back to the original array. Cache the results.
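Spelled out (a throwaway sketch, naturally):

    // Selection-sorts into a temporary array in descending order... and then
    // sorts the already-ordered result anyway, for good measure.
    static int[] stupidSortCache;                      // "cache the results"

    static void stupidSort(int[] a) {
        int[] tmp = new int[a.length];
        boolean[] used = new boolean[a.length];
        for (int out = 0; out < a.length; out++) {
            int best = -1;
            for (int i = 0; i < a.length; i++)         // scan for the largest unused element
                if (!used[i] && (best < 0 || a[i] > a[best])) best = i;
            used[best] = true;
            tmp[out] = a[best];                        // copy it to one end of the temp array
        }
        java.util.Arrays.sort(tmp);                    // "quicksort the result"
                                                       // (dual-pivot quicksort for int[] in the JDK)
        System.arraycopy(tmp, 0, a, 0, a.length);      // copy it back
        stupidSortCache = tmp.clone();                 // cache the results
    }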