Sunday, May 31, 2009

A (very slightly) shorter version of Heapsort

I regard Heapsort as the most beautiful of the O(n.logn) sorting routines.
It also has very low space requirements. But in practice it isn't a very
good sorting routine if the array it is sorting doesn't fit in the cache
(on an Intel machine, it does very well indeed if the array fits in the L1 cache,
and still does pretty well if the array fits in the L2 cache, but for larger
arrays it is a dog).

The routine presented here isn't noticeably faster than a standard
binary Heapsort (it does roughly N fewer record moves), but it's (very slightly) shorter. Standard Heapsort treats the first element of the array as a special
case. This version does not. Instead of the code inside the loop of selection phase of the sort having
the pseudo-code:

  • Exchange the element that is to be removed with the apex of the heap

  • Reestablish the heap property for the remaining elements

it has the pseudo-code:

  • Pretend that the element to be removed is the apex of the heap

  • Reestablish the heap property



template<class T> int heapSortTopless(T *a, int count)
{
  int i; 
  int j; 
  T v;
  for (int h=count/2; h>=0; h--)  
  {
    v=a[i=h];
    j = i + i + 2;  //standard heapsort has +1 instead of + 2
    while (j<count)
    {
      if (j<count-1) 
        j+= (a[j]<a[j+1]) ? 1 : 0;
      if (a[j]<=v) 
        break;
      a[i]=a[j];
      i = j;
      j = i + i + 2;  //standard heapsort has +1 instead of +2
    }
    a[i]=v;
  }
  for (--count;count>=0;count--)
  {
    v=a[count];  //standard heapsort must also set a[count]=a[0]  
    i = count;  //standard heaposrt sets i=0
    j = 0;      //standard heapsort sets j=1

    while (j<count)
    {
      if (j<count-1) 
        j+= (a[j]<a[j+1]) ? 1 : 0;
      if (a[j]<=v) 
        break;
      a[i]=a[j];
      i = j;
      j = i + i + 2;  //standard heapsort has +1 instead of +2
    }
    a[i] = v;
  }
}

When I first came up with this change, I thought it would have a cache advantage over standard Heapsort for some array types (my reasoning was that, for record size r, if the cache line size is a multiple of 2*r, and the array starts on a cache line boundary, slightly fewer cache lines would be accessed on average), but in practice this doesn't seem to make much of a difference, even if the cache line size is exactly 2*r.

In practice, however, Heapsort is usually my second choice of sorting routine (Combsort is my first choice, because it is easier to code and easier to get right), and even when I use Heapsort I tend to use the standard binary Heapsort. Although ternary (and higher radix) heapsorts work better in practice, I don't want to waste the time of maintenance programmers who mightn't be familiar (let alone comfortable) with them.

No comments:

Post a Comment