Heaps
Steven J. Zeil
Problem: Given a collection of elements that carry a numeric “score”, find and remove the element with the smallest [largest] score. New elements may be added at any time.
In an earlier lesson, we saw that this collection is called a priority queue. Now we will look at an efficient way of implementing it.
1 Recap: the Standard Priority Queue Interface
Here is the core of the priority queue interface:
public class PriorityQueue<E> implements Queue<E> {
// Create an empty priority queue.
public PriorityQueue() { ... }
// Create a priority queue using a specific comparator.
public PriorityQueue(Comparator<E> comparator) { ... }
// Create a priority queue initialized with all of the elements
// from another Collection.
public PriorityQueue(Collection<? extends E> c) { ... }
public void clear() { ... }
public boolean isEmpty() { ... }
public int size() { ... }
// Add an element to the queue.
public boolean add(E e) { ... }
// Look at the front element without removing it.
public E peek() { ... }
// Remove and return the front element.
public E poll() { ... }
⋮
}
-
We can add a new element into the priority queue.
Unlike add into a stack or queue, however, the element does not automatically become the first or last thing we will next retrieve. Exactly when we will see this element again depends on its priority value.
-
We can check the size of the priority queue or ask if it isEmpty.
-
We can peek at the front (smallest) element.
-
We can remove the smallest element by polling.
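For example, here is how those operations behave using java.util.PriorityQueue, whose interface is the one recapped above (smallest element first by default):

```java
import java.util.PriorityQueue;

public class PQDemo {
    public static void main(String[] args) {
        PriorityQueue<Integer> pq = new PriorityQueue<>();
        pq.add(42);
        pq.add(7);
        pq.add(19);
        // peek looks at the front (smallest) element without removing it
        if (pq.peek() != 7) throw new AssertionError();
        // poll removes elements in ascending order, regardless of the
        // order in which they were added
        if (pq.poll() != 7) throw new AssertionError();
        if (pq.poll() != 19) throw new AssertionError();
        if (pq.size() != 1 || pq.isEmpty()) throw new AssertionError();
        if (pq.poll() != 42) throw new AssertionError();
    }
}
```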
1.1 The Priority Queue Implementation
public class PriorityQueue<E> implements Queue<E> {
private MinHeap<E> heap;
// Create an empty priority queue.
public PriorityQueue() {
heap = new MinHeap<>();
}
// Create a priority queue using a specific comparator.
public PriorityQueue(Comparator<E> comparator) {
heap = new MinHeap<>(comparator);
}
// Create a priority queue initialized with all of the elements
// from another Collection.
public PriorityQueue(Collection<? extends E> c) {
heap = new MinHeap<>(c);
}
public void clear() { heap.clear(); }
public boolean isEmpty() { return heap.heapSize() == 0; }
public int size() { return heap.heapSize(); }
public boolean add(E e) {
heap.insert(e);
return true;
}
public E peek() { return heap.peek(); }
public E poll() { return heap.remove(); }
⋮
}
We have added an implementing data structure, a heap. You can then see that the priority queue functions are pretty much one-liners, all passing the buck to similar functions provided by the heap.
We will look at this heap data structure in just a bit. First, though, let’s look at what we could do to implement priority queues using the data structures we already know.
One possibility would be to use a sorted sequential structure (array or linked list). For example, using an ArrayList, we would try to keep the elements in descending order by priority. Then we could peek() at the priority queue as the get(size()-1) of the implementing ArrayList.
Question: With this data structure, what would the complexities of the priority queue add and poll operations be?
-
O(1) and O(1)
-
O(1) and O(n)
-
O(n) and O(1)
-
O(n) and O(n)
We can do better than that.
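To make the costs concrete, here is a sketch (class and method names are mine, not from the lecture) of that sorted-ArrayList approach: peek and poll touch only the last slot, but add must shift elements over to make room.

```java
import java.util.ArrayList;
import java.util.Collections;

// Hypothetical sketch: a min-first priority queue backed by an
// ArrayList kept in descending order, so the smallest element
// sits at the end of the list.
class SortedListPQ {
    private final ArrayList<Integer> data = new ArrayList<>();

    // Binary search finds the slot quickly, but ArrayList.add(index, e)
    // must shift every later element over -- O(n) in the worst case.
    public void add(int e) {
        int pos = Collections.binarySearch(data, e, Collections.reverseOrder());
        if (pos < 0) pos = -pos - 1;  // convert "not found" to insertion point
        data.add(pos, e);
    }

    // O(1): the smallest element is last.
    public int peek() { return data.get(data.size() - 1); }

    // O(1): removing the last element shifts nothing.
    public int poll() { return data.remove(data.size() - 1); }

    public int size() { return data.size(); }
}
```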
We might consider instead using a balanced binary search tree to store the priority queue. This time, it will be a little easier if we store the items in ascending order by priority.
Question: Using a balanced binary search tree as the underlying data structure, what would the complexities of the priority queue add and poll operations be?
-
O(1); O(log n)
-
O(log n); O(1)
-
O(log n); O(log n)
-
O(log n); O(n)
-
O(n); O(log n)
-
O(n); O(n)
That sounds pretty good.
While we can’t actually hope to improve on the worst-case times offered by balanced search trees, we can match those worst-case times (and improve on the multiplicative constant), and actually achieve O(1) average-case complexity for insertion, by using a new data structure called a “heap”.
2 Implementing Priority Queues - the Heap
We can implement priority queues using a data structure called a heap (more precisely, a “binary heap”).
2.1 Binary Heaps
A binary heap is a binary tree with the properties:
-
The tree is complete (entirely filled, except possibly on the lowest level, which is filled from left to right).
-
Each non-root node in the tree has a value smaller than or equal to its parent’s.
Important: A heap is a binary tree, but not a binary search tree. The ordering rules for heaps are different from those of binary search trees.
What I have defined here is sometimes called a max-heap, because the largest value in the heap will be at the root. We can also have a min-heap, in which every child has a value larger than or equal to its parent’s.
Max-heaps always have their very largest value in their root. Min-heaps always have their smallest value in the root.
In this course, we will always assume that a “heap” is a “max-heap” unless explicitly stated otherwise.
Let’s look at the implications of each of these two properties.
2.2 Heaps are complete trees

Here’s an example of a complete binary tree.
Complete binary trees have a very simple linear representation, allowing us to implement them in an array or vector with no pointers.

-
The parent of node i is in slot ⌊(i−1)/2⌋.
-
The children of node i are in slots 2i+1 and 2i+2.
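These index formulas can be checked directly; here is a small sketch (names mine) verifying that the parent and child computations invert one another:

```java
public class HeapIndexDemo {
    static int parent(int i)     { return (i - 1) / 2; }
    static int leftChild(int i)  { return 2 * i + 1; }
    static int rightChild(int i) { return 2 * i + 2; }

    public static void main(String[] args) {
        // Level-order layout: slot 0 is the root, slots 1-2 its
        // children, slots 3-6 the grandchildren, and so on.
        if (parent(1) != 0 || parent(2) != 0) throw new AssertionError();
        if (leftChild(0) != 1 || rightChild(0) != 2) throw new AssertionError();
        if (leftChild(2) != 5 || rightChild(2) != 6) throw new AssertionError();
        // Every node is either the left or the right child of its parent.
        for (int i = 1; i < 100; i++) {
            int p = parent(i);
            if (leftChild(p) != i && rightChild(p) != i) throw new AssertionError();
        }
    }
}
```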
2.3 Children’s Values are Smaller than Their Parent’s

Using the same tree shape, we can fill in some values to show an example of a heap.
Each parent has a value larger than its children’s values (and, therefore, larger than the values of any of its descendants).
So when we ask for the front (largest value) of a priority queue, we find it in the root of the heap, which in turn will be in position 0 of the array/vector.
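This ordering property is easy to state as code. The checker below is my own sketch (array values chosen for illustration), walking the linear representation and comparing each element to its parent:

```java
public class HeapCheck {
    // Returns true iff a[0..n-1], read as a complete binary tree in
    // the linear representation, satisfies max-heap ordering:
    // every non-root element is <= its parent.
    static boolean isMaxHeap(int[] a, int n) {
        for (int i = 1; i < n; i++) {
            if (a[(i - 1) / 2] < a[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // Valid: every parent dominates its children.  (Note that it is
        // NOT a binary search tree: 80 > 60, yet 80 sits in the right subtree.)
        if (!isMaxHeap(new int[]{90, 60, 80, 20, 50, 70}, 6)) throw new AssertionError();
        // Invalid: 75 is larger than its parent 60.
        if (isMaxHeap(new int[]{90, 60, 80, 75, 50, 70}, 6)) throw new AssertionError();
    }
}
```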
2.4 The Data Structure
The code in the textbook is not generic, so I will present a generic version.
class MaxHeap<E> {
private Object[] heap; // Pointer to the heap array
private Comparator<E> compare; // comparator to use when comparing elements
private int n; // Number of things now in heap
⋮
Our basic data structure will be an array. In the textbook, this array has a fixed size that constitutes the maximum size of the queue. I will instead use an ArrayList-style doubling of the array whenever an add operation threatens to overflow it.
We are using the array to store a tree, but most of our “thinking” about this code will be in terms of a tree. So some useful utility functions will find the parents and children of any node.
⋮
// Return true if pos is a leaf position, false otherwise
private boolean isLeaf(int pos) {
return (n / 2 <= pos) && (pos < n);
}
// Return position for left child of pos
private static int leftChild(int pos) {
return 2 * pos + 1;
}
// Return position for right child of pos
private static int rightChild(int pos) {
return 2 * pos + 2;
}
// Return position for parent
private static int parent(int pos) {
return (pos - 1) / 2;
}
⋮
2.5 Sifting Up and Sifting Down
Before looking in detail at how to add and delete elements from a heap, let’s consider a situation in which we have a “damaged” heap with one node out of position.
How do we “fix” the heap? There are two cases to consider.
-
The out-of-place node is too large (i.e., larger than its parent).
-
The out-of-place node is too small (i.e., smaller than one or both of its children).
2.5.1 Sifting Up

When we have a node that is larger than its parent, we sift it up (sometimes called “bubbling up”) by swapping it with its parent until it has reached its proper position.
// Moves an element up to its correct place
private void siftUp(int pos) {
while (pos > 0) {
int parent = parent(pos);
if (isGreaterThan(parent, pos)) {
return; // stop early
}
swap(pos, parent);
pos = parent; // keep sifting up
}
}
// swaps the elements at two positions
private void swap(int pos1, int pos2) {
Object temp = heap[pos1];
heap[pos1] = heap[pos2];
heap[pos2] = temp;
}
// performs the comparison used for checking heap ordering
private boolean isGreaterThan(int pos1, int pos2) {
E e1 = (E) heap[pos1];
E e2 = (E) heap[pos2];
return compare.compare(e1, e2) > 0;
}
Note that we have repaired the heap. The final arrangement satisfies the ordering requirements for a heap.
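To see the repair concretely, here is a standalone sketch of sifting up over a plain int array (the class above works on Object[] with a comparator; this version hard-codes int ordering for brevity):

```java
import java.util.Arrays;

public class SiftUpDemo {
    // Move the element at pos up toward the root until its parent
    // is at least as large (max-heap order).
    static void siftUp(int[] heap, int pos) {
        while (pos > 0) {
            int parent = (pos - 1) / 2;
            if (heap[parent] >= heap[pos]) return; // already in place
            int tmp = heap[parent]; heap[parent] = heap[pos]; heap[pos] = tmp;
            pos = parent;
        }
    }

    public static void main(String[] args) {
        // A heap damaged at position 4: 95 is larger than its parent 60.
        int[] heap = {90, 60, 80, 20, 95, 70};
        siftUp(heap, 4);
        // 95 swaps with 60, then with 90, ending at the root.
        if (!Arrays.equals(heap, new int[]{95, 90, 80, 20, 60, 70})) {
            throw new AssertionError(Arrays.toString(heap));
        }
    }
}
```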
2.5.2 Sifting Down

When we have a node that is smaller than one or both of its children, we sift it down (also known as “percolate down” or “drip down”) by swapping it with the larger of its children until it has reached its proper position.
// Moves an element down to its correct place
private void siftDown(int pos) {
while (!isLeaf(pos)) {
int child = leftChild(pos);
if ((child + 1 < n) && isGreaterThan(child + 1, child)) {
child = child + 1; // child is now the index of the larger child
}
if (!isGreaterThan(child, pos)) {
return; // stop early
}
swap(pos, child);
pos = child; // keep sifting down
}
}
This is only a little more complicated than bubbling up. The main complication is that the current node might have 0 children, 1 child, or 2 children, so we need to be careful that we don’t try to access the value of non-existent children.
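Here is the same kind of standalone sketch for sifting down on an int array, restoring a heap whose root is too small:

```java
import java.util.Arrays;

public class SiftDownDemo {
    // Move the element at pos down, swapping with its larger child,
    // until max-heap order is restored.
    static void siftDown(int[] heap, int n, int pos) {
        while (2 * pos + 1 < n) {            // while pos is not a leaf
            int child = 2 * pos + 1;         // left child
            if (child + 1 < n && heap[child + 1] > heap[child]) {
                child = child + 1;           // right child is larger
            }
            if (heap[child] <= heap[pos]) return; // already in place
            int tmp = heap[child]; heap[child] = heap[pos]; heap[pos] = tmp;
            pos = child;
        }
    }

    public static void main(String[] args) {
        // A heap damaged at the root: 10 is smaller than both children.
        int[] heap = {10, 90, 80, 20, 60, 70};
        siftDown(heap, heap.length, 0);
        // 10 swaps with 90 (the larger child), then with 60.
        if (!Arrays.equals(heap, new int[]{90, 60, 80, 20, 10, 70})) {
            throw new AssertionError(Arrays.toString(heap));
        }
    }
}
```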
If you understand the ideas of sifting up and sifting down, then almost all the things you would want to do to a heap become a variant of those two ideas.
2.6 Inserting into a heap

Suppose we have this heap and we want to add a new item to it.
Now, after we add an item to the heap, it will have one more tree node than it currently does. Because heaps are complete trees, we know exactly how the shape of the tree will change, even if we can’t be sure how the data values in the tree might be rearranged.
Question: How will the shape of the tree shown above change?
-
A new child will be added to the node that currently contains 48.
-
A new child will be added to one of the nodes that currently contain 48, 60, or 11.
-
A new child will be added to one of the current leaves.
-
None of the above.

Well, suppose that we just go ahead and put the new value into that position.
We’ve got two possibilities.
-
We might get lucky – maybe this is where the new value belongs.
-
If the new value is out of position, it must be because it is larger than its parent.
It would be the only node that was out of position, and we know how to “repair” a heap with a single node out of position that is larger than its parent — we sift up!
// Insert val into heap
public void insert(E key) {
expandArrayIfNecessary();
// Add the new value
heap[n] = key;
n++;
siftUp(n - 1);
}
// Make sure that we have room to add one more element
private void expandArrayIfNecessary() {
if (n >= heap.length) {
// If we are about to overflow the array, double its size.
int newCapacity = Math.max(1, 2 * heap.length);
Object[] newHeap = new Object[newCapacity];
System.arraycopy(heap, 0, newHeap, 0, n);
heap = newHeap;
}
}
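Putting insertion together, here is a standalone sketch over an int array (pre-sized, so the doubling step is omitted; values are my own):

```java
public class InsertDemo {
    static int[] heap = new int[10];
    static int n = 0;

    static void insert(int key) {
        heap[n] = key;       // place the new value in the next free leaf slot
        n++;
        int pos = n - 1;     // then sift it up to its proper position
        while (pos > 0) {
            int parent = (pos - 1) / 2;
            if (heap[parent] >= heap[pos]) break;
            int tmp = heap[parent]; heap[parent] = heap[pos]; heap[pos] = tmp;
            pos = parent;
        }
    }

    public static void main(String[] args) {
        for (int key : new int[]{20, 70, 90, 60, 80}) insert(key);
        // Whatever the insertion order, the maximum ends at the root,
        // and every parent dominates its children.
        if (heap[0] != 90) throw new AssertionError();
        for (int i = 1; i < n; i++) {
            if (heap[(i - 1) / 2] < heap[i]) throw new AssertionError();
        }
    }
}
```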
2.7 Removing from Heaps
When we remove the largest element from a heap, we know that the value being removed is the value currently in the root.
We also know how the tree shape will change. The rightmost node in the bottom level will disappear.
Now, unless the heap only has one node, the node that’s disappearing does not contain the value that we’re actually removing. So, we have two problems:
-
What value goes into the root to replace the one being removed?
-
What do we do with the value currently in the node that’s going to disappear?
So, we’ve got a node with no data, and data that needs a node. The natural thing to do is to put the data in that node.
That data value will almost certainly be out of position, being smaller than one or both of its children, but, again, that’s only a single node that’s out of position. We know how to fix that.
// Remove and return root
public E remove() {
n--;
swap(0, n); // Swap maximum with last value
if (n > 0)
siftDown(0); // Put new heap root val in correct place
return (E) heap[n];
}
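Since each remove returns the current maximum, repeated removal yields the elements in descending order. Here is a standalone sketch checking that (int array again, values my own):

```java
public class RemoveDemo {
    static int[] heap = {90, 80, 70, 20, 60, 10};  // a valid max-heap
    static int n = heap.length;

    static int remove() {
        n--;
        int tmp = heap[0]; heap[0] = heap[n]; heap[n] = tmp; // swap max with last
        int pos = 0;                                         // sift new root down
        while (2 * pos + 1 < n) {
            int child = 2 * pos + 1;
            if (child + 1 < n && heap[child + 1] > heap[child]) child++;
            if (heap[child] <= heap[pos]) break;
            int t = heap[child]; heap[child] = heap[pos]; heap[pos] = t;
            pos = child;
        }
        return heap[n];
    }

    public static void main(String[] args) {
        // Successive removals come out largest-first.
        int prev = Integer.MAX_VALUE;
        while (n > 0) {
            int next = remove();
            if (next > prev) throw new AssertionError();
            prev = next;
        }
    }
}
```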
3 Analysis
A binary heap has the same shape as a balanced binary search tree.
Therefore its height, for n nodes, is ⌊log(n)⌋.
3.1 insert and remove
insert and remove do O(1) work on each node along a path that runs, at worst, between a single leaf and the root.
Hence both operations are O(log n), worst case.
The average case for insert is O(1). The proof of this is beyond the scope of this class.
3.2 buildHeap
A single insertion is O(log n) worst case and O(1) average.
What happens if we start with an empty heap and do n inserts? The resulting total could be O(n log n).
As it happens, we can do better with a special build operation to build an entire heap from an array (or array-like structure such as a vector).
// Heapify contents of the heap array
protected void buildHeap() {
for (int i = parent(n - 1); i >= 0; i--) {
siftDown(i);
}
}
-
Start with the data in any order.
-
Force heap order by percolating each non-leaf node.
Since each siftDown takes, in the worst case, time proportional to the height of the node being sifted, the total time for buildHeap is proportional to the sum of the heights of all the nodes in a complete tree.

Consider an array with N elements. Let h be the height of the complete tree representing that array: h = ⌊log N⌋.
We apply siftDown to the first N/2 elements (the non-leaves).
- The first element can move at most h−1 times.
- The next two elements can move at most h−2 times.
- The next four elements can move at most h−3 times.
- The next eight elements can move at most h−4 times.
⋮
- The last N/2 elements (the leaves) don’t move at all.
In general, at most n/2^(i+1) of the nodes can move i times, so the total work is proportional to
∑_{i=0 to h} i ⋅ n/2^(i+1)
= n ⋅ ∑_{i=0 to h} i/2^(i+1)
Using one of our simplifications from the FAQ, the sum ∑_{i=0 to ∞} i/2^(i+1) converges to the constant 1, so the total is
≤ n ⋅ 1
= O(n)
Therefore buildHeap is O(n).
So it’s cheaper to build a heap all at once than to do it one insert at a time, although neither approach is terribly expensive.
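Here is a standalone sketch of the bottom-up build on an int array: sift down each non-leaf, last-to-first, and a valid max-heap results no matter the initial order.

```java
public class BuildHeapDemo {
    static void siftDown(int[] heap, int n, int pos) {
        while (2 * pos + 1 < n) {
            int child = 2 * pos + 1;
            if (child + 1 < n && heap[child + 1] > heap[child]) child++;
            if (heap[child] <= heap[pos]) return;
            int t = heap[child]; heap[child] = heap[pos]; heap[pos] = t;
            pos = child;
        }
    }

    // Heapify in place: leaves are already (trivial) heaps, so only
    // the non-leaves -- positions parent(n-1) down to 0 -- need work.
    static void buildHeap(int[] heap, int n) {
        for (int i = (n - 2) / 2; i >= 0; i--) siftDown(heap, n, i);
    }

    public static void main(String[] args) {
        int[] a = {10, 20, 70, 90, 80, 60};   // arbitrary order
        buildHeap(a, a.length);
        // The result satisfies max-heap ordering, largest value at the root.
        if (a[0] != 90) throw new AssertionError();
        for (int i = 1; i < a.length; i++) {
            if (a[(i - 1) / 2] < a[i]) throw new AssertionError();
        }
    }
}
```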
In our MaxHeap, MinHeap, and PriorityQueue classes, this buildHeap function is used when we construct a new heap or priority queue from an existing collection of objects, e.g.,
// Constructor supporting preloading of heap contents
public MinHeap(Collection<? extends E> h) {
heap = new Object[h.size()];
compare = ...
n = 0;
for (E e : h) {
heap[n] = e;
++n;
}
buildHeap();
}
4 From MaxHeap to MinHeap
We can create a min-heap (one that returns the smallest value first) with a simple change to the MaxHeap code:
class MinHeap<E> {
⋮
// Moves an element down to its correct place
private void siftDown(int pos) {
assert (0 <= pos && pos < n) : "Invalid heap position";
while (!isLeaf(pos)) {
int child = leftChild(pos);
if ((child + 1 < n) && isLessThan(child + 1, child)) {
child = child + 1; // child is now the index of the smaller child
}
if (!isLessThan(child, pos)) {
return; // stop early
}
swap(pos, child);
pos = child; // keep sifting down
}
}
// Moves an element up to its correct place
private void siftUp(int pos) {
assert (0 <= pos && pos < n) : "Invalid heap position";
while (pos > 0) {
int parent = parent(pos);
if (isLessThan(parent, pos)) {
return; // stop early
}
swap(pos, parent);
pos = parent; // keep sifting up
}
}
// performs the comparison used for checking heap ordering
private boolean isLessThan(int pos1, int pos2) {
E e1 = (E) heap[pos1];
E e2 = (E) heap[pos2];
return compare.compare(e1, e2) < 0;
}
⋮
}
Each place where our MaxHeap called isGreaterThan, we instead call isLessThan.
Note that our PriorityQueue implementation uses MinHeap.
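The same min-vs-max flip is available without writing a second heap class: java.util.PriorityQueue’s comparator constructor lets one queue run min-first and another max-first:

```java
import java.util.Comparator;
import java.util.PriorityQueue;

public class MaxFirstDemo {
    public static void main(String[] args) {
        // Default constructor: min-first (backed by a min-heap).
        PriorityQueue<Integer> minFirst = new PriorityQueue<>();
        // Reversed comparator: max-first behavior from the same class.
        PriorityQueue<Integer> maxFirst = new PriorityQueue<>(Comparator.reverseOrder());
        for (int v : new int[]{42, 7, 19}) {
            minFirst.add(v);
            maxFirst.add(v);
        }
        if (minFirst.poll() != 7) throw new AssertionError();
        if (maxFirst.poll() != 42) throw new AssertionError();
    }
}
```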
5 Recap: Complexity of Heap Operations
- Building a heap from an array of N items: O(N), worst-case and average-case
- Inserting one element into a heap of size N: O(log N) worst-case, O(1) average
- Removing the largest element from a heap of size N: O(log N), worst-case and average-case