
Heaps

Steven J. Zeil

Last modified: Mar 11, 2025

Problem: Given a collection of elements that carry a numeric “score”, find and remove the element with the smallest [largest] score. New elements may be added at any time.

In an earlier lesson, we saw that this collection is called a priority queue. Now we will look at an efficient way of implementing it.

1 Recap: the std Priority Queue Interface

This is the essential part of the priority queue interface.

public class PriorityQueue<E> implements Queue<E> {

    // Create an empty priority queue.
    public PriorityQueue() { ... }

    // Create a priority queue using a specific comparator.
    public PriorityQueue(Comparator<E> comparator) { ... }

    // Create a priority queue initialized with all of the elements
    // from another Collection.
    public PriorityQueue(Collection<? extends E> c) { ... }

    public void clear() { ... }

    public boolean isEmpty() { ... }

    public int size() { ... }

    // Add an element to the queue.
    public boolean add(E e) { ... }

    // Look at the front element without removing it.
    public E peek() { ... }

    // Remove and return the front element.
    public E poll() { ... }

       ⋮
}
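
Before looking at the implementation, here is a small, hypothetical usage example. It uses java.util.PriorityQueue, which presents this same interface; the Job type and its scores are invented for illustration. Like the implementation below, it returns the element with the smallest score first.

import java.util.Comparator;
import java.util.PriorityQueue;

public class PriorityQueueDemo {
    // A made-up element type: a job with a name and a numeric score.
    record Job(String name, int score) {}

    public static void main(String[] args) {
        // Order jobs by ascending score, so the smallest score is at the front.
        PriorityQueue<Job> queue =
                new PriorityQueue<>(Comparator.comparingInt(Job::score));

        queue.add(new Job("backup", 30));
        queue.add(new Job("render", 5));
        queue.add(new Job("index", 12));

        System.out.println(queue.peek().name()); // render (score 5), still in the queue
        System.out.println(queue.poll().name()); // render, now removed
        System.out.println(queue.poll().name()); // index (score 12)
        System.out.println(queue.size());        // 1
    }
}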

1.1 The Priority Queue Implementation

public class PriorityQueue<E> implements Queue<E> {

    private MinHeap<E> heap;

    // Create an empty priority queue.
    public PriorityQueue() {
        heap = new MinHeap<>();
    }

    // Create a priority queue using a specific comparator.
    public PriorityQueue(Comparator<E> comparator) {
        heap = new MinHeap<>(comparator);
    }

    // Create a priority queue initialized with all of the elements
    // from another Collection.
    public PriorityQueue(Collection<? extends E> c) {
        heap = new MinHeap<>(c);
    }

    public void clear() { heap.clear(); }

    public boolean isEmpty() { return heap.heapSize() == 0; }

    public int size() { return heap.heapSize(); }

    public boolean add(E e) {
        heap.insert(e);
        return true;
    }

    public E peek() { return heap.peek(); }

    public E poll() { return heap.remove(); }

    ⋮
}

We have added an implementing data structure, a heap. You can then see that the priority queue functions are pretty much one-liners, all passing the buck to similar functions provided by the heap.

We will look at this heap data structure in just a bit. First, though, let’s look at what we could do to implement priority queues using the data structures we already know.

One possibility would be to use a sorted sequential structure (an array or a linked list). For example, using an ArrayList, we could keep the elements in descending order by priority. Then peek() on the priority queue would simply return get(size()-1) of the implementing ArrayList.
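
As a sketch of that idea (not code from the lecture; the class name and details are invented), add would search for the insertion point and shift the later elements over, while poll would simply remove the last element:

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;

// A sketch of a priority queue backed by a sorted ArrayList.
// The list is kept in descending order, so the front of the queue
// (the element with the smallest score) is always at index size()-1.
class SortedListPriorityQueue<E> {
    private final ArrayList<E> list = new ArrayList<>();
    private final Comparator<E> descending;

    SortedListPriorityQueue(Comparator<E> comparator) {
        descending = comparator.reversed();
    }

    public void add(E e) {
        int pos = Collections.binarySearch(list, e, descending);
        if (pos < 0) {
            pos = -pos - 1;      // convert the "not found" code into an insertion point
        }
        list.add(pos, e);        // every element after pos must shift over
    }

    public E peek() {
        return list.get(list.size() - 1);    // the front is at the end of the list
    }

    public E poll() {
        return list.remove(list.size() - 1); // removing the last element shifts nothing
    }
}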

Question: With this data structure, what would the complexities of the priority queue add and poll operations be?

Answer: add would be O(n), because inserting a new element into its proper place in the sorted ArrayList requires shifting, on average, half of the elements. poll, however, would be O(1), because the front element is simply removed from the end of the list.

We can do better than that.

We might consider instead using a balanced binary search tree to store the priority queue. This time, it will be a little easier if we store the items in ascending order by priority.

Question:

Using a balanced binary search tree as the underlying data structure, what would the complexities of the priority queue add and poll operations be?

Answer: Both add and poll would be O(log n): each is a single insertion or removal in a balanced search tree of n elements.

That sounds pretty good.
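
As a sketch of that approach (again, not the lecture's code; the class name and the duplicate-counting scheme are my own), we could sit a min-priority queue on top of java.util.TreeMap, which is a red-black tree, so put, remove, and firstEntry each take time proportional to the height of the tree:

import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// A sketch of a min-priority queue backed by a balanced search tree.
class TreeBackedPriorityQueue<E> {
    private final TreeMap<E, Integer> tree;   // element -> number of copies
    private int size = 0;

    TreeBackedPriorityQueue(Comparator<E> comparator) {
        tree = new TreeMap<>(comparator);
    }

    public void add(E e) {
        tree.merge(e, 1, Integer::sum);       // insert, or bump the count of a duplicate
        ++size;
    }

    public E peek() {
        return tree.firstKey();               // smallest element: leftmost node of the tree
    }

    public E poll() {
        Map.Entry<E, Integer> first = tree.firstEntry();
        if (first.getValue() == 1) {
            tree.remove(first.getKey());
        } else {
            tree.put(first.getKey(), first.getValue() - 1);
        }
        --size;
        return first.getKey();
    }

    public int size() { return size; }
}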

We can't actually hope to improve on the worst-case times offered by balanced search trees, but we can match those worst-case times (and improve on the multiplicative constant), and we can achieve O(1) average-case time for insertion, by using a new data structure called a "heap".

2 Implementing Priority Queues - the Heap

We can implement priority queues using a data structure called a heap, sometimes known more specifically as a "binary heap".

2.1 Binary Heaps

A binary heap is a binary tree with two properties:

  • Shape: the tree is complete. Every level is full except possibly the last, which is filled from left to right.

  • Ordering: each child's value is smaller than (or equal to) its parent's value.

Important: A heap is a binary tree, but not a binary search tree. The ordering rules for heaps are different from those of binary search trees.

What I have defined here is sometimes called a max-heap, because the largest value in the heap will be at the root. We can also have a min-heap, in which every child has a value larger than its parent.

Max-heaps always have their very largest value in their root. Min-heaps always have their smallest value in the root.

In this course, we will always assume that a “heap” is a “max-heap” unless explicitly stated otherwise.

Let’s look at the implications of each of these two properties.

2.2 Heaps are complete trees

 

Here’s an example of a complete binary tree.

Complete binary trees have a very simple linear representation, allowing us to implement them in an array or vector with no pointers.
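
For example, a complete tree holding six values (chosen arbitrarily here) can be stored level by level, left to right, using nothing but the array itself:

// A hypothetical complete tree with six values, stored level by level:
//
//            A                     index: 0  1  2  3  4  5
//          /   \                   value: A  B  C  D  E  F
//         B     C
//        / \   /
//       D   E F
//
// No child or parent pointers are needed: the children of the node at
// index i (if they exist) are at indices 2*i+1 and 2*i+2, and its
// parent is at index (i-1)/2.
String[] tree = { "A", "B", "C", "D", "E", "F" };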

 

2.3 Children’s Values are Smaller than Their Parent’s

 

Using the same tree shape, we can fill in some values to show an example of a heap.

Each parent has a value larger than its children’s values (and, therefore, larger than the values of any of its descendants).

So when we ask for the front (largest value) of a priority queue, we find it in the root of the heap, which in turn will be in position 0 of the array/vector.

2.4 The Data Structure

The code in the textbook is not generic, so I will present a generic version.

class MaxHeap<E> {

    private Object[] heap; // The array holding the heap contents
    private Comparator<E> compare; // comparator to use when comparing elements
    private int n; // Number of things now in heap

Our basic data structure will be an array. In the textbook, this array has a fixed size that constitutes the maximum size of the queue. I will instead use an ArrayList-style doubling of the array whenever an add operation threatens to overflow it.

We are using the array to store a tree, but most of our “thinking” about this code will be in terms of a tree. So some useful utility functions will find the parents and children of any node.

    // Return true if pos is a leaf position, false otherwise
    private boolean isLeaf(int pos) {
        return (n / 2 <= pos) && (pos < n);
    }

    // Return position for left child of pos
    private static int leftChild(int pos) {
        return 2 * pos + 1;
    }

    // Return position for right child of pos
    private static int rightChild(int pos) {
        return 2 * pos + 2;
    }

    // Return position for parent
    private static int parent(int pos) {
        return (pos - 1) / 2;
    }
    ⋮
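
For example, in a hypothetical heap with n = 10 elements, the node at position 3 has its children at positions 7 and 8, and positions 5 through 9 are the leaves. Inside the class we could check:

    // Sanity checks of the index arithmetic, assuming n == 10:
    assert leftChild(3) == 7 && rightChild(3) == 8; // 2*3+1 and 2*3+2
    assert parent(7) == 3 && parent(8) == 3;        // (7-1)/2 and (8-1)/2
    assert isLeaf(5) && !isLeaf(4);                 // n/2 == 5, so positions 5..9 are leaves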

2.5 Sifting Up and Sifting Down

Before looking in detail at how to add and delete elements from a heap, let’s consider a situation in which we have a “damaged” heap with one node out of position.

How do we “fix” the heap? There are two cases to consider.

2.5.1 Sifting Up

 

When we have a node that is larger than its parent, we sift it up (sometimes called "bubbling up") by swapping it with its parent until it has reached its proper position.

    // Moves an element up to its correct place
    private void siftUp(int pos) {
        while (pos > 0) {
            int parent = parent(pos);
            if (isGreaterThan(parent, pos)) {
                return; // stop early
            }
            swap(pos, parent);
            pos = parent; // keep sifting up
        }
    }

    // swaps the elements at two positions
    private void swap(int pos1, int pos2) {
        Object temp = heap[pos1];
        heap[pos1] = heap[pos2];
        heap[pos2] = temp;
    }

    // does comparison used for checking heap validity
    private boolean isGreaterThan(int pos1, int pos2) {
        E e1 = (E) heap[pos1];
        E e2 = (E) heap[pos2];
        return compare.compare(e1, e2) > 0;
    }

 

In this case, starting with pos = 8, we swap node 8 with its parent 3 …


Note that we have repaired the heap. The final arrangement satisfies the ordering requirements for a heap.
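
Traced on a hypothetical array of values (invented here for illustration; only position 8 violates the heap ordering), the repair looks like this:

// A damaged max-heap: the 95 at position 8 is larger than its ancestors.
int[] before = { 90, 80, 85, 40, 70, 60, 50, 20, 95 };

// siftUp(8) swaps 95 with its parent 40 (position 3), then with 80
// (position 1), then with 90 (position 0), and stops at the root:
int[] after  = { 95, 90, 85, 80, 70, 60, 50, 20, 40 };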

2.5.2 Sifting Down

 

When we have a node that is smaller than one or both of its children, we sift it down (also known as “percolate down” or “drip down”) by swapping it with the larger of its children until it has reached its proper position.

    // Moves an element down to its correct place
    private void siftDown(int pos) {
        while (!isLeaf(pos)) {
            int child = leftChild(pos);
            if ((child + 1 < n) && isGreaterThan(child + 1, child)) {
                child = child + 1; // child is now the index with the greater value
            }
            if (!isGreaterThan(child, pos)) {
                return; // stop early
            }
            swap(pos, child);
            pos = child; // keep sifting down
        }
    }

This is only a little more complicated than bubbling up. The main complication is that the current node might have 0 children, 1 child, or 2 children, so we need to be careful that we don’t try to access the value of non-existent children.

 

In this case, starting with pos = 0, we swap node 0 with its larger child, 2 …
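
Traced on a similar hypothetical array (only the root is out of position), sifting down looks like this:

// A damaged max-heap: the 30 at the root is smaller than both of its children.
int[] before = { 30, 80, 85, 40, 70, 60, 50, 20, 10 };

// siftDown(0) swaps 30 with its larger child 85 (position 2), then with
// that node's larger child 60 (position 5), which is a leaf, so it stops:
int[] after  = { 85, 80, 60, 40, 70, 30, 50, 20, 10 };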


If you understand the ideas of sifting up and sifting down, then almost all the things you would want to do to a heap become a variant of those two ideas.

2.6 Inserting into a heap

 

Suppose we have this heap and we want to add a new item to it.

Now, after we add an item to the heap, it will have one more tree node than it currently does. Because heaps are complete trees, we know exactly how the shape of the tree will change, even if we can’t be sure how the data values in the tree might be rearranged.

 

Question: How will the shape of the tree shown above change?

Answer: Because the tree must remain complete, the new node must appear in the leftmost unoccupied position of the bottom level; in array terms, that is the next unused position at the end of the array.

 

Well, suppose that we just go ahead and put the new value into that position.

We've got two possibilities:

  • The new value might be smaller than (or equal to) its parent's value. In that case the tree is already a valid heap, and we are done.

  • The new value might be larger than its parent's value. Then it would be the only node that was out of position, and we know how to "repair" a heap with a single node out of position that is larger than its parent: we sift up!

    // Insert key into the heap
    public void insert(E key) {
        expandArrayIfNecessary();
        // Add the new value
        heap[n] = key;
        n++;
        siftUp(n - 1);
    }

    // Make sure that we have room to add one more element
    private void expandArrayIfNecessary() {
        if (n >= heap.length) {
            // If we are about to overflow the array, double its size.
            int newCapacity = Math.max(1, 2 * heap.length);
            Object[] newHeap = new Object[newCapacity];
            System.arraycopy(heap, 0, newHeap, 0, n);
            heap = newHeap;
        }
    }

 

For example, suppose we wanted to add 54 to the heap. First we would add 54 onto the end of the array, in effect adding it to the complete tree.
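
Traced on a hypothetical heap array (values invented for illustration), the insertion looks like this:

// A max-heap with 7 elements, before the insertion:
int[] before = { 63, 50, 47, 20, 31, 40, 12 };

// insert(54) places 54 at position 7, as a child of the 20 at position 3.
// siftUp(7) then swaps it with 20 and with 50, and stops below the 63:
int[] after  = { 63, 54, 47, 50, 31, 40, 12, 20 };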


2.7 Removing from Heaps

When we remove the largest element from a heap, we know that the value being removed is the value currently in the root.

We also know how the tree shape will change. The rightmost node in the bottom level will disappear.

Now, unless the heap only has one node, the node that's disappearing does not contain the value that we're actually removing. So, we have two problems:

  • the root contains the value that we want to remove, so it will be left, in effect, with no data, and

  • the last node of the tree must disappear, but it still contains data that we need to keep.

So, we've got a node with no data, and data that needs a node. The natural thing to do is to put the data in that node.

That data value will almost certainly be out of position, being smaller than one or both of its children, but, again, that’s only a single node that’s out of position. We know how to fix that.

    // Remove and return root
    public E remove() {
        n--;
        swap(0, n); // Swap maximum with last value
        if (n > 0)
            siftDown(0); // Put new heap root val in correct place
        return (E) heap[n];
    }

 

Suppose we wanted to remove the maximum value from this heap.

The first step is to replace the root value with 47, the value from the last node.

Then we drop that last node from the array (by decrementing n) and sift the new root value, 47, down to its proper position.
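
Traced on a hypothetical heap array whose last element happens to be 47 (the other values are invented for illustration), the removal looks like this:

// A max-heap with 9 elements; the last element is 47.
int[] before = { 63, 54, 50, 48, 31, 40, 12, 20, 47 };

// remove() decrements n to 8, swaps the 63 at the root with the 47 at the
// end, and will return the 63 now sitting just past the end of the heap.
// siftDown(0) then swaps 47 with 54 (the larger child of the root) and
// with 48, restoring heap order among the remaining 8 elements:
int[] after  = { 54, 48, 50, 47, 31, 40, 12, 20 };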


3 Analysis

A binary heap has the same shape as a balanced binary search tree.

Therefore its height, for n nodes, is log(n).

3.1 insert and remove

insert and remove do O(1) work on each node along a path that runs, at worst, between a single leaf and the root.

Hence both operations are O(log n), worst case.

The average case for insert is O(1). The proof of this is beyond the scope of this class.

3.2 buildHeap

A single insertion is O(log n) worst case and O(1) average.

What happens if we start with an empty heap and do n inserts? The resulting total could be O(n log n).

As it happens, we can do better with a special build operation to build an entire heap from an array (or array-like structure such as a vector).

    // Heapify contents of the heap array
    protected void buildHeap() {
        for (int i = parent(n - 1); i >= 0; i--) {
            siftDown(i);
        }
    }

  • Start with the data in any order.

  • Force heap order by percolating each non-leaf node.

Since each siftDown takes, in worst case, a time proportional to the height of the node being sifted, the total time for buildHeap is proportional to the sum of the heights of all the nodes in a complete tree.

 

Consider an array with N elements. Let h be the height of the complete tree representing that array, so h = ⌊log N⌋.

We apply percolate (siftDown) to the first N/2 elements, i.e., the non-leaf nodes. A complete tree has at most N/2^(i+1) nodes of height i, and sifting down a node of height i takes at most i swaps, so the total work is

∑_{i=0}^{h} i · N / 2^(i+1)

= ∑_{i=0}^{h} N · i / 2^(i+1)

= N · ∑_{i=0}^{h} i / 2^(i+1)

Using one of our simplifications from the FAQ, ∑_{i=0}^{∞} i / 2^(i+1) = 1, so the total work is

< N

= O(N)

Therefore buildHeap is O(n).

So it’s cheaper to build a heap all at once than to do it one insert at a time, although neither approach is terribly expensive.

In our MaxHeap, MinHeap, and PriorityQueue classes, this buildHeap function is used when we construct a new heap or priority queue from an existing collection of objects, e.g.,

    // Constructor supporting preloading of heap contents
    public MinHeap(Collection<? extends E> h) {
        heap = new Object[h.size()];
        compare = ...
        n = 0;
        for (E e : h) {
            heap[n] = e;
            ++n;
        }
        buildHeap();
    }
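
For example (with made-up scores, and assuming the no-comparator case falls back to natural Integer ordering), a priority queue preloaded from an existing list is heap-ordered by a single O(n) buildHeap pass rather than by n separate O(log n) inserts:

import java.util.List;

List<Integer> scores = List.of(42, 7, 19, 3, 88);            // made-up data
PriorityQueue<Integer> queue = new PriorityQueue<>(scores);   // MinHeap's constructor calls buildHeap()
System.out.println(queue.poll());                             // 3: the MinHeap returns the smallest score first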

4 From MaxHeap to MinHeap

We can create a min heap (that returns the smallest value first) with a simple change to the MaxHeap code:

class MinHeap<E> {
       ⋮
    // Moves an element down to its correct place
    private void siftDown(int pos) {
        assert (0 <= pos && pos < n) : "Invalid heap position";
        while (!isLeaf(pos)) {
            int child = leftChild(pos);
            if ((child + 1 < n) && isLessThan(child + 1, child)) {
                child = child + 1; // child is now index with the lesser value
            }
            if (!isLessThan(child, pos)) {
                return; // stop early
            }
            swap(pos, child);
            pos = child; // keep sifting down
        }
    }

    // Moves an element up to its correct place
    private void siftUp(int pos) {
        assert (0 <= pos && pos < n) : "Invalid heap position";
        while (pos > 0) {
            int parent = parent(pos);
            if (isLessThan(parent, pos)) {
                return; // stop early
            }
            swap(pos, parent);
            pos = parent; // keep sifting up
        }
    }

    // does comparison used for checking heap validity
    private boolean isLessThan(int pos1, int pos2) {
        E e1 = (E) heap[pos1];
        E e2 = (E) heap[pos2];
        return compare.compare(e1, e2) < 0;
    }

      ⋮
}

In each place where our MaxHeap called isGreaterThan, the MinHeap calls isLessThan instead.

Note that our PriorityQueue implementation uses MinHeap.

5 Recap: Complexity of Heap Operations