Sorting --- Insertion Sort

Steven J. Zeil

Last modified: Oct 26, 2023

Sorting: given a sequence of data items in an unknown order, rearrange the items to put them into ascending (or descending) order by key.

Sorting algorithms have been studied extensively. There is no one best algorithm for all circumstances, but the big-O behavior is a key to understanding where and when to use different algorithms.

The insertion sort divides the list of items into a sorted and an unsorted region, with the sorted items in the first part of the list.

Idea: Repeatedly take the first item from the unsorted region and insert it into the proper position in the sorted portion of the list.
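
For example, on the array $[29 \; 10 \; 14 \; 37 \; 13]$ (which appears again later in these notes), the successive passes look like this, with the bar marking the boundary between the sorted region (left) and the unsorted region (right):

$[29 \;|\; 10 \; 14 \; 37 \; 13]$ (initially, the one-element prefix is trivially sorted)

$[10 \; 29 \;|\; 14 \; 37 \; 13]$ (after inserting 10)

$[10 \; 14 \; 29 \;|\; 37 \; 13]$ (after inserting 14)

$[10 \; 14 \; 29 \; 37 \;|\; 13]$ (after inserting 37)

$[10 \; 13 \; 14 \; 29 \; 37]$ (after inserting 13; fully sorted)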

1 The Algorithm

This is the insertion sort:

// Weiss 7.2
//
#include <utility>   // std::move
#include <vector>
using std::vector;

template <typename Comparable>
void insertionSort(vector<Comparable>& a)
{
   for( int p = 1; p < a.size(); ++p)
   {
     // place a[p] into the sublist
     //   a[0] ... a[i-1], 1 <= i < p,
     //   so it is in the correct position
     Comparable tmp = std::move( a[p] );
     
     int j;
      // locate insertion point by scanning downward as long
      // as tmp < a[j-1] and we have not encountered the
      // beginning of the list
     for( j = p; j > 0 && tmp < a[j-1]; --j)
         a[j] = std::move (a[j-1]);
      // the location is found; insert target
     a[j] = std::move( tmp );
   }
}
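
A minimal driver (not from Weiss or the original notes) showing one way to exercise the function above; it assumes the listing above is available in the same file:

// Hypothetical test driver for the insertionSort listing above.
#include <iostream>

int main()
{
    vector<int> a {29, 10, 14, 37, 13};
    insertionSort(a);                 // sorts in place, ascending by operator<
    for (int x : a)
        std::cout << x << ' ';        // prints: 10 13 14 29 37
    std::cout << '\n';
    return 0;
}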

2 Insertion Sort: Worst Case Analysis

// Weiss 7.2
//
template <typename Comparable>
void insertionSort(vector<Comparable>& a)
{
   for( int p = 1; p < a.size(); ++p)
   {
     // place a[p] into the sublist
     //   a[0] ... a[i-1], 1 <= i < p,
     //   so it is in the correct position
     Comparable tmp = std::move( a[p] );         // O(1)
     
     int j;                                      // O(1)
      // locate insertion point by scanning downward as long
      // as tmp < a[j-1] and we have not encountered the
      // beginning of the list
     for( j = p; j > 0 && tmp < a[j-1]; --j)
         a[j] = std::move (a[j-1]);              // O(1)
      // the location is found; insert target
     a[j] = std::move( tmp );                    // O(1)
   }
}

Looking at the inner loop,

// Weiss 7.2
//
template <typename Comparable>
void insertionSort(vector<Comparable>& a)
{
   for( int p = 1; p < a.size(); ++p)
   {
     // place a[p] into the sublist
     //   a[0] ... a[i-1], 1 <= i < p,
     //   so it is in the correct position
     Comparable tmp = std::move( a[p] );         // O(1)
     
     int j;                                      // O(1)
      // locate insertion point by scanning downward as long
      // as tmp < a[j-1] and we have not encountered the
      // beginning of the list
     for( j = p; j > 0 && tmp < a[j-1]; --j)
         a[j] = std::move (a[j-1]);              // O(1)
      // the location is found; insert target
     a[j] = std::move( tmp );                    // O(1)
   }
}

Question: In the worst case, how many times do we go around the inner loop (to within plus or minus 1)?

**Answer:** In the worst case (the new item is smaller than everything already in the sorted region), the condition fails only when j reaches 0, so the inner loop body executes p times.

With that determined,

// Weiss 7.2
//
template <typename Comparable>
void insertionSort(vector<Comparable>& a)
{
   for( int p = 1; p < a.size(); ++p)
   {
     // place a[p] into the sublist
     //   a[0] ... a[i-1], 1 <= i < p,
     //   so it is in the correct position
     Comparable tmp = std::move( a[p] );         // O(1)
     
     int j;                                      // O(1)
      // locate insertion point by scanning downward as long
      // as tmp < a[j-1] and we have not encountered the
      // beginning of the list
     for( j = p; j > 0 && tmp < a[j-1]; --j)  // cond: O(1) #: p
         a[j] = std::move (a[j-1]);              // O(1)
      // the location is found; insert target
     a[j] = std::move( tmp );                    // O(1)
   }
}

Moving on…

Question: So what is the complexity of the inner loop?

**Answer:** The loop condition and the loop body are each O(1), and in the worst case they execute p times, so the inner loop is $O(p)$.

Now, looking at the outer loop body,

// Weiss 7.2
//
template <typename Comparable>
void insertionSort(vector<Comparable>& a)
{
   for( int p = 1; p < a.size(); ++p)
   {
     // place a[p] into the sublist
     //   a[0] ... a[i-1], 1 <= i < p,
     //   so it is in the correct position
     Comparable tmp = std::move( a[p] );         // O(1)
     
     int j;                                      // O(1)
      // locate insertion point by scanning downward as long
      // as tmp < a[j-1] and we have not encountered the
      // beginning of the list
      O(p)
      // the location is found; insert target
     a[j] = std::move( tmp );                    // O(1)
   }
}

So the entire outer loop body is $O(p)$.

// Weiss 7.2
//
template <typename Comparable>
void insertionSort(vector<Comparable>& a)
{
   for( int p = 1; p < a.size(); ++p)
   {
      O(p) 
   }
}

Let $n$ denote a.size(). The outer loop executes $n-1$ times.

Question: What, then, is the complexity of the entire outer loop?

**Answer:**
// Weiss 7.2
//
template <typename Comparable>
void insertionSort(vector<Comparable>& a)
{ // let n = a.size()
   for( int p = 1; p < a.size(); ++p) // cond: O(1) #: n total: O(n^2)
   {
      O(p)
   }
}

If you gave any answer involving p, you should have known better from the start. The complexity of a block of code must always be described in terms of the inputs to that code. p is not an input to the loop: any value it might have held prior to the start of the loop is ignored and overwritten.

Question: What, then, is the complexity of the entire function?

**Answer:** The function body is just the outer loop (plus O(1) overhead), so the entire function is $O(n^2)$.

Insertion sort has a worst case of $O(N^2)$ where $N$ is the size of the input vector.
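
That bound is just the sum of the per-iteration costs of the outer loop:

$$\sum_{p=1}^{n-1} O(p) = O\left(\sum_{p=1}^{n-1} p\right) = O\left(\frac{n(n-1)}{2}\right) = O(n^2)$$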

3 Insertion Sort: Special Case

As a special case, consider the behavior of this algorithm when applied to an array that is already sorted.

// Weiss 7.2
//
template <typename Comparable>
void insertionSort(vector<Comparable>& a)
{
   for( int p = 1; p < a.size(); ++p)
   {
     // place a[p] into the sublist
     //   a[0] ... a[i-1], 1 <= i < p,
     //   so it is in the correct position
     Comparable tmp = std::move( a[p] );
     
     int j;
      // locate insertion point by scanning downward as long
      // as tmp < a[j-1] and we have not encountered the
      // beginning of the list
     for( j = p; j > 0 && tmp < a[j-1]; --j)
         a[j] = std::move (a[j-1]);
      // the location is found; insert target
     a[j] = std::move( tmp );
   }
}

When the array is already sorted, the test tmp < a[j-1] fails on the very first comparison of every pass, so the inner loop body never executes. Each pass then does only a constant amount of work, and the whole sort runs in $O(n)$ time.

This makes insertion sort a reasonable choice when adding a few items to a large, already sorted array.
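
To see the difference concretely, here is a small experiment of my own (the name insertionSortCounted is hypothetical, not from Weiss): an instrumented copy of the sort that counts how many times the inner loop body runs, applied to a sorted and a reverse-sorted vector.

// Hypothetical instrumented variant: same algorithm as above, but it
// returns the number of times the inner loop body executed.
#include <cstddef>
#include <iostream>
#include <utility>
#include <vector>
using std::vector;

template <typename Comparable>
std::size_t insertionSortCounted(vector<Comparable>& a)
{
    std::size_t moves = 0;                       // inner-loop body executions
    for (int p = 1; p < (int) a.size(); ++p)
    {
        Comparable tmp = std::move(a[p]);
        int j;
        for (j = p; j > 0 && tmp < a[j-1]; --j)
        {
            a[j] = std::move(a[j-1]);
            ++moves;
        }
        a[j] = std::move(tmp);
    }
    return moves;
}

int main()
{
    vector<int> sorted   {1, 2, 3, 4, 5, 6, 7, 8};
    vector<int> reversed {8, 7, 6, 5, 4, 3, 2, 1};
    std::cout << insertionSortCounted(sorted)   << '\n';  // 0  (already sorted)
    std::cout << insertionSortCounted(reversed) << '\n';  // 28 (= 8*7/2, the maximum)
    return 0;
}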

4 Average-Case Analysis for Insertion Sort

Instead of doing the average case analysis by the copy-and-paste technique, we’ll produce a result that works for all algorithms that behave like it.

Define an inversion of an array a as any pair (i,j) such that i<j but a[i]>a[j].

Question: How many inversions in this array?

$ [29 \; 10 \; 14 \; 37 \; 13] $

**Answer:** 5. The out-of-order pairs are (29,10), (29,14), (29,13), (14,13), and (37,13).
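
As a cross-check, here is a brute-force counter that applies the definition directly (the helper name countInversions and the driver are mine, not from the notes):

// Hypothetical brute-force inversion counter: examines every pair (i,j) with i < j.
#include <cstddef>
#include <iostream>
#include <vector>
using std::vector;

template <typename Comparable>
std::size_t countInversions(const vector<Comparable>& a)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i + 1 < a.size(); ++i)
        for (std::size_t j = i + 1; j < a.size(); ++j)
            if (a[j] < a[i])          // a[i] > a[j] means (i,j) is an inversion
                ++count;
    return count;
}

int main()
{
    vector<int> a {29, 10, 14, 37, 13};
    std::cout << countInversions(a) << '\n';   // prints 5
    return 0;
}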

4.1 Inversions

In an array of n elements, the most inversions occur when the array is in exactly reversed order. The inversions are then:

| Inversions | Count |
| --- | --- |
| (1,2), (1,3), (1,4), …, (1,n) | n-1 |
| (2,3), (2,4), …, (2,n) | n-2 |
| (3,4), …, (3,n) | n-3 |
| ⋮ | ⋮ |
| (n-1,n) | 1 |

Counting these (starting from the bottom row), we have $\sum_{i=1}^{n-1} i$ inversions, so the total number of inversions is $\frac{n(n-1)}{2}$.

We’ll state this formally:

Theorem: The maximum number of inversions in an array of $n$ elements is $\frac{n(n-1)}{2}$.

We have just proven that theorem. Now, another one, describing the average:

Theorem: The average number of inversions in an array of $n$ randomly selected elements is $\frac{n(n-1)}{4}$.

We won't prove this, but it is intuitively plausible: the minimum number of inversions is 0 and the maximum is $\frac{n(n-1)}{2}$, so it is natural that the average falls at the midpoint of those two values.

4.2 A Speed Limit on Adjacent-Swap Sorting

Now, the result we have been working toward:

Theorem: Any sorting algorithm that only swaps adjacent elements has average time no faster than $O(n^2)$.

Proof

Swapping two adjacent elements of an array removes at most one inversion.

But on average there are $\frac{n(n-1)}{4}$ inversions to remove, so on average at least that many swaps are required.

Hence, on average, the algorithm as a whole can run no faster than $O(n^2)$.
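
In symbols (this chain is my paraphrase of the argument, not from the original notes):

$$\text{average running time} \;\ge\; \text{average number of swaps} \;\ge\; \text{average number of inversions} \;=\; \frac{n(n-1)}{4},$$

which grows as $n^2$.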

QED

And,

Corollary: Insertion sort has average case complexity $O(n^2)$.

Proof

Insertion sort is often written like this:

// Weiss 7.2, rewritten to use swaps of adjacent elements
//
template <typename Comparable>
void insertionSort(vector<Comparable>& a)
{
   for( int p = 1; p < a.size(); ++p)
   {
      // bubble a[p] downward into the sorted region
      //   a[0] ... a[p-1] by repeatedly swapping it with
      //   its left-hand neighbor until it is in position
      for( int j = p; j > 0 && a[j] < a[j-1]; --j)
          std::swap(a[j], a[j-1]);
   }
}

and it is clear that this version only exchanges adjacent elements.

By the theorem just given, the best average case complexity we could therefore get is $O(n^2)$.

The theorem does not preclude an average case complexity even slower than that, but we know that the worst case complexity is also $O(n^2)$, and the average case can’t be any slower than the worst case.

So we conclude that the average case complexity is, indeed, $O(n^2)$.
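
Putting the two bounds together (the notation here is mine, not the notes'):

$$\frac{n(n-1)}{4} \;\le\; \text{average-case cost} \;\le\; \text{worst-case cost} \;=\; O(n^2)$$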

Our actual algorithm replaces each swap (three move assignments) with a single move assignment, reducing the cost of the inner loop body by a constant factor. But a reduction by a constant multiplier cannot affect the overall complexity, so the actual algorithm given at the top of the page is also $O(n^2)$ on average.