Heapsort

Steven J. Zeil

Last modified: Oct 26, 2023

Priority queues are useful structures in their own right. They often arise in scheduling problems where things must be serviced based upon their importance. We’ll also see, in a later section, that they are valuable in many graph manipulation algorithms.

Heaps have another important application besides implementing priority queues. They can be used to provide a simple and efficient sorting algorithm.

1 Heapsort – Conceptual

template <class RandomIterator, class Compare>
void pop_heap (RandomIterator first, 
               RandomIterator last,
               Compare comp)
//Pre: The range [first,last) is a valid heap.
//Post: Swaps the value in location first with the value in the location
//      last-1 and makes [first, last-1) into a heap.

Take another look at the description of the std::pop_heap function.

The position last-1 winds up holding the former root of the heap (the largest element that had been in the heap).

template <class Container, class Compare> 
class  priority_queue 
{
   ⋮
  void pop() 
    { 
      pop_heap(c.begin(), c.end(), comp);
      c.pop_back(); 
    }
};

When we use this routine to implement a priority queue, we simply discard that element.

But suppose that we kept it instead of throwing it away.

pop_heap (c.begin(), c.end(), comp);
pop_heap (c.begin(), c.end()-1, comp);
pop_heap (c.begin(), c.end()-2, comp);
    ⋮

Assume that c contains a heap.

The first call here would put the largest element of the container c in position c.end()-1.

The second call would put the largest remaining element in the container c (the second largest in the original container) in position c.end()-2.

The third call would put the largest remaining element in the container c (the third largest in the original container) in position c.end()-3.

If we keep this up, we will eventually have sorted all the elements of c into ascending order.
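
To make the idea concrete, here is a minimal sketch of those repeated calls written as a loop. The function name sortByRepeatedPops, the int element type, and the use of std::less as comp are assumptions made for illustration; none of them come from the notes above.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: pops the heap repeatedly but keeps every popped
// element, so the values accumulate at the back in ascending order.
std::vector<int> sortByRepeatedPops(std::vector<int> c)
{
  std::less<int> comp;
  std::make_heap(c.begin(), c.end(), comp);  // ensure c really contains a heap

  // heapEnd marks the end of the part of c that is still a heap.
  for (auto heapEnd = c.end(); heapEnd != c.begin(); --heapEnd)
    {
      // Moves the largest value of [begin, heapEnd) to position heapEnd-1
      // and restores the heap property on [begin, heapEnd-1).
      std::pop_heap(c.begin(), heapEnd, comp);
    }

  assert(std::is_sorted(c.begin(), c.end(), comp));  // sanity check only
  return c;
}
```

Each pass shrinks the heap by one and grows the sorted tail by one, which is the same pattern as the three explicit calls shown earlier.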

2 Implementing Heapsort

hs1.cpp
template <class Iterator, class Compare> 
void heapsort (Iterator first, Iterator last, Compare comp)
{
  make_heap (first, last, comp);
  while (last != first)
    {
     pop_heap (first, last, comp);
     --last;
    }
}

A heap sort is really pretty simple. First we form the array into a heap. Then we repeatedly pop the heap, collecting the successive maximum values at the end of the container.
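
As a usage sketch, the function can be called with any random-access range and any ordering. The block below restates heapsort with explicit std:: qualifiers so that it compiles on its own, and it pops the full remaining range [first, last), as the precondition of pop_heap requires. The helper names sortedAscending and sortedDescending are hypothetical and exist only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <string>
#include <vector>

template <class Iterator, class Compare>
void heapsort (Iterator first, Iterator last, Compare comp)
{
  std::make_heap (first, last, comp);
  while (last != first)
    {
     std::pop_heap (first, last, comp);  // top of [first,last) moves to last-1
     --last;                             // that position is now in final order
    }
}

// Hypothetical helper: sorts a copy into ascending order.
std::vector<int> sortedAscending (std::vector<int> v)
{
  heapsort (v.begin(), v.end(), std::less<int>());
  assert (std::is_sorted (v.begin(), v.end()));
  return v;
}

// Hypothetical helper: with std::greater the heap keeps its smallest value
// at the root, so the smallest values collect at the end (descending order).
std::vector<std::string> sortedDescending (std::vector<std::string> v)
{
  heapsort (v.begin(), v.end(), std::greater<std::string>());
  return v;
}
```

The comparison object decides the direction of the sort: the same code yields ascending or descending order without any other change.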

The code discussed here is available as an animation that you can run to see how it works.

3 Analysis of Heapsort

template <class Iterator, class Compare> 
void heapsort (Iterator first, Iterator last, Compare comp)
{
  make_heap (first, last, comp);
  while (last != first)
    {
     pop_heap (first, last, comp);
     --last;
    }
}


As always, we work from the inside out. --last is obviously O(1).

Question: What is the worst-case complexity of the call to pop_heap?

Answer:
template <class Iterator, class Compare> 
void heapsort (Iterator first, Iterator last, Compare comp)
{
  make_heap (first, last, comp);
  while (last != first)              // cond: O(1)  #: n
    {
     pop_heap (first, last, comp);   // O(log (last-first))
     --last;                         // O(1)
    }
}

The value of last changes each time around the loop, so we can’t use the multiplicative shortcut. But let $n$ stand for the value of last-first when we first entered the heapsort function. Then it’s clear that, each time around the loop, $\mbox{last} - \mbox{first} \leq n$.

At some risk, therefore, of obtaining an overly loose complexity bound, we can treat the body as $O(\log n)$. So, we have …

template <class Iterator, class Compare> 
void heapsort (Iterator first, Iterator last, Compare comp)
{
  make_heap (first, last, comp);
  while (last != first)            // cond: O(1)  #: n
    {
     // O(log n)
    }
}

… and the loop reduces to …

template <class Iterator, class Compare> 
void heapsort (Iterator first, Iterator last, Compare comp)
{
  make_heap (first, last, comp);
  O(n*log n)
}

Now, we just need to figure out the cost of the make_heap call.

Question: What is the worst-case complexity of the make_heap call?

Answer:
template <class Iterator, class Compare> 
void heapsort (Iterator first, Iterator last, Compare comp)
{
  make_heap (first, last, comp);  // O(n)
  O(n*log n)
}
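
One way to see the $O(n)$ bound is sketched here, under the assumption that make_heap builds the heap bottom-up by sifting each non-leaf node down (the usual approach, though no implementation is shown in this section). A node at height $h$ costs $O(h)$ to sift down, and a heap of $n$ elements has at most $\lceil n/2^{h+1} \rceil$ nodes at height $h$. Summing over all heights:

$$\sum_{h=0}^{\lfloor \log_2 n \rfloor} \left\lceil \frac{n}{2^{h+1}} \right\rceil O(h) = O\!\left( n \sum_{h=0}^{\infty} \frac{h}{2^{h}} \right) = O(n),$$

because $\sum_{h \ge 0} h/2^{h} = 2$. Most of the nodes sit near the bottom of the tree, where sifting down is cheap.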

The $O(n)$ time for make_heap is dominated by the $O(n \log n)$ time for the loop, so the entire algorithm is $O(n \log n)$.
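
The shortcut of treating every pop_heap call as $O(\log n)$ turns out not to loosen the bound. As a quick check, sum the per-call bound $\log k$ over the heap sizes $k = n, n-1, \ldots, 1$ that the loop sees:

$$\sum_{k=1}^{n} \log k \le n \log n, \qquad \sum_{k=1}^{n} \log k \ge \sum_{k=\lceil n/2 \rceil}^{n} \log k \ge \frac{n}{2} \log \frac{n}{2}.$$

Both sides grow as $n \log n$, so adding up the individual bounds exactly gives the same $O(n \log n)$ total.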

Heapsort has an advantage over merge sort (which also has an $O(n \log n)$ worst case) in that heapsort sorts in place with only $O(1)$ extra memory, while merge sort needs $O(n)$ additional storage.

Heapsort has a better worst-case complexity than quicksort ($O(n \log n)$ versus $O(n^2)$), but experiments have shown that heapsort tends to be slower on average because it moves more elements than quicksort does.

4 Introspective Sort

The sorting algorithm used in most implementations of the C++ std::sort function is an algorithm called the introspective sort.

Introspective sorts combine two algorithms we have already studied: quicksort and heapsort.

An introspective sort starts as an ordinary quicksort, but monitors the depth of the quicksort recursion (that is, the size of the stack used to control it). If that depth grows much larger than $\log(N)$, the sort switches over to heapsort for the portion of the data it is currently working on.

The net result is a sorting algorithm with worst-case and average-case complexity of $O(N \log(N))$ that usually runs with the low constant multiplier of quicksort.
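
The following is a minimal sketch of that idea, not the code of any particular standard library. The names introsort and introsortLoop, the middle-element pivot, and the depth limit of roughly $2 \log_2 N$ are assumptions chosen for illustration; production versions of std::sort differ in these details and typically finish small ranges with an insertion sort as well.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical helper: quicksort that counts down a depth budget and
// hands the current range to heapsort once the budget is used up.
template <class Iterator, class Compare>
void introsortLoop (Iterator first, Iterator last, Compare comp, int depthLimit)
{
  while (last - first > 1)
    {
      if (depthLimit == 0)
        {
          // Partitioning has gone too deep: heapsort this range instead.
          std::make_heap (first, last, comp);
          std::sort_heap (first, last, comp);
          return;
        }
      --depthLimit;

      // Three-way partition around a copy of the middle value:
      // [first,lessEnd) < pivot, [lessEnd,greaterBegin) == pivot, rest > pivot.
      auto pivot = *(first + (last - first) / 2);
      Iterator lessEnd = std::partition (first, last,
          [&](const auto& x) { return comp (x, pivot); });
      Iterator greaterBegin = std::partition (lessEnd, last,
          [&](const auto& x) { return !comp (pivot, x); });

      introsortLoop (first, lessEnd, comp, depthLimit);  // recurse on the left part
      first = greaterBegin;                              // iterate on the right part
    }
}

template <class Iterator, class Compare>
void introsort (Iterator first, Iterator last, Compare comp)
{
  // Depth budget of about 2*log2(N): add 2 for each halving of N.
  int depthLimit = 0;
  for (auto n = last - first; n > 1; n /= 2)
    depthLimit += 2;

  introsortLoop (first, last, comp, depthLimit);
  assert (std::is_sorted (first, last, comp));  // sanity check only
}
```

Because the middle group always contains at least the pivot itself, both remaining parts shrink on every pass, and the heapsort fallback caps the total work at $O(N \log(N))$ even when the pivots are chosen badly.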