Sorting --- Insertion Sort
Steven J. Zeil
Sorting: given a sequence of data items in an unknown order, re-arrange the items to put them into ascending (or descending) order by key.
Sorting algorithms have been studied extensively. There is no one best algorithm for all circumstances, but the big-O behavior is a key to understanding where and when to use different algorithms.
The insertion sort divides the list of items into a sorted and an unsorted region, with the sorted items in the first part of the list.
Idea: Repeatedly take the first item from the unsorted region and insert it into the proper position in the sorted portion of the list.
1 The Algorithm
This is the insertion sort:
// Weiss 7.2
//
template <typename Comparable>
void insertionSort(vector<Comparable>& a)
{
    for( int p = 1; p < a.size(); ++p)
    {
        // place a[p] into the sublist
        // a[0] ... a[i-1], 1 <= i < p,
        // so it is in the correct position
        Comparable tmp = std::move( a[p] );
        int j;
        // locate insertion point by scanning downward as long
        // as tmp < a[j-1] and we have not encountered the
        // beginning of the list
        for( j = p; j > 0 && tmp < a[j-1]; --j)
            a[j] = std::move( a[j-1] );
        // the location is found; insert target
        a[j] = std::move( tmp );
    }
}
- At the beginning of each outer iteration, items 0 … p-1 are properly ordered.
- Each outer iteration seeks to insert item a[p] into the appropriate position within 0 … p.
- The std::move calls allow a speedup for those data types that have implemented move constructors and move assignment operators. For those data types, data is moved instead of copied. For types that do not support those rather arcane move functions, the std::move call does nothing and a normal copy assignment is performed.

Try out the insertion sort in an animation.
2 Insertion Sort: Worst Case Analysis
- Assume comparisons & copying are $O(1)$.
// Weiss 7.2
//
template <typename Comparable>
void insertionSort(vector<Comparable>& a)
{
    for( int p = 1; p < a.size(); ++p)
    {
        // place a[p] into the sublist
        // a[0] ... a[i-1], 1 <= i < p,
        // so it is in the correct position
        Comparable tmp = std::move( a[p] );    // O(1)
        int j;                                 // O(1)
        // locate insertion point by scanning downward as long
        // as tmp < a[j-1] and we have not encountered the
        // beginning of the list
        for( j = p; j > 0 && tmp < a[j-1]; --j)
            a[j] = std::move( a[j-1] );        // O(1)
        // the location is found; insert target
        a[j] = std::move( tmp );               // O(1)
    }
}
Looking at the inner loop,
// Weiss 7.2
//
template <typename Comparable>
void insertionSort(vector<Comparable>& a)
{
    for( int p = 1; p < a.size(); ++p)
    {
        // place a[p] into the sublist
        // a[0] ... a[i-1], 1 <= i < p,
        // so it is in the correct position
        Comparable tmp = std::move( a[p] );    // O(1)
        int j;                                 // O(1)
        // locate insertion point by scanning downward as long
        // as tmp < a[j-1] and we have not encountered the
        // beginning of the list
        for( j = p; j > 0 && tmp < a[j-1]; --j)
            a[j] = std::move( a[j-1] );        // O(1)
        // the location is found; insert target
        a[j] = std::move( tmp );               // O(1)
    }
}
Question: In the worst case, how many times do we go around the inner loop (to within plus or minus 1)?
- 0 times
- 1 time
- p times
- j times
- n times
With that determined,
// Weiss 7.2
//
template <typename Comparable>
void insertionSort(vector<Comparable>& a)
{
    for( int p = 1; p < a.size(); ++p)
    {
        // place a[p] into the sublist
        // a[0] ... a[i-1], 1 <= i < p,
        // so it is in the correct position
        Comparable tmp = std::move( a[p] );        // O(1)
        int j;                                     // O(1)
        // locate insertion point by scanning downward as long
        // as tmp < a[j-1] and we have not encountered the
        // beginning of the list
        for( j = p; j > 0 && tmp < a[j-1]; --j)    // cond: O(1)  #: p
            a[j] = std::move( a[j-1] );            // O(1)
        // the location is found; insert target
        a[j] = std::move( tmp );                   // O(1)
    }
}
Moving on…
Question: So what is the complexity of the inner loop?
- O(1)
- O(p)
- O(j)
- O(n)
- None of the above
Now, looking at the outer loop body,
// Weiss 7.2
//
template <typename Comparable>
void insertionSort(vector<Comparable>& a)
{
    for( int p = 1; p < a.size(); ++p)
    {
        // place a[p] into the sublist
        // a[0] ... a[i-1], 1 <= i < p,
        // so it is in the correct position
        Comparable tmp = std::move( a[p] );        // O(1)
        int j;                                     // O(1)
        // locate insertion point by scanning downward as long
        // as tmp < a[j-1] and we have not encountered the
        // beginning of the list
        O(p)
        // the location is found; insert target
        a[j] = std::move( tmp );                   // O(1)
    }
}
So the entire outer loop body is $O(p)$.
// Weiss 7.2
//
template <typename Comparable>
void insertionSort(vector<Comparable>& a)
{
    for( int p = 1; p < a.size(); ++p)
    {
        O(p)
    }
}
Let $n$ denote a.size(). The outer loop executes $n-1$ times.
Question: What, then, is the complexity of the entire outer loop?
- $O(p)$
- $O(n)$
- $O(p*(n-1))$
- $O(p*n)$
- $O(p^2)$
- $O(n^2)$
- $O((n*(n-1))/2)$
// Weiss 7.2
//
template <typename Comparable>
void insertionSort(vector<Comparable>& a)
{   // let n = a.size()
    for( int p = 1; p < a.size(); ++p)    // cond: O(1)  #: n  total: O(n^2)
    {
        O(p)
    }
}
If you gave any answer involving p, you should have known better from the start. The complexity of a block of code must always be described in terms of the inputs to that code. p is not an input to the loop: any value it might have held prior to the start of the loop is ignored and overwritten.
Question: What, then, is the complexity of the entire function?
- $O(n)$
- $O(n^2)$
- $O((n*(n-1))/2)$
- None of the above
Insertion sort has a worst case of $O(n^2)$, where $n$ is the size of the input vector.
3 Insertion Sort: Special Case
As a special case, consider the behavior of this algorithm when applied to an array that is already sorted.
// Weiss 7.2
//
template <typename Comparable>
void insertionSort(vector<Comparable>& a)
{
    for( int p = 1; p < a.size(); ++p)
    {
        // place a[p] into the sublist
        // a[0] ... a[i-1], 1 <= i < p,
        // so it is in the correct position
        Comparable tmp = std::move( a[p] );
        int j;
        // locate insertion point by scanning downward as long
        // as tmp < a[j-1] and we have not encountered the
        // beginning of the list
        for( j = p; j > 0 && tmp < a[j-1]; --j)
            a[j] = std::move( a[j-1] );
        // the location is found; insert target
        a[j] = std::move( tmp );
    }
}
- Note that if the array is already sorted, then we never enter the body of the inner loop. The inner loop is then $O(1)$ and insertionSort is $O(\mbox{a.size()})$.
This makes insertion sort a reasonable choice when adding a few items to a large, already sorted array.
4 Average-Case Analysis for Insertion Sort
Instead of doing the average-case analysis by the copy-and-paste technique, we’ll produce a result that applies to all algorithms that, like insertion sort, rearrange data by exchanging adjacent elements.
Define an inversion of an array a as any pair (i,j) such that i<j but a[i]>a[j].
Question: How many inversions in this array?
$ [29 \; 10 \; 14 \; 37 \; 13] $
4.1 Inversions
In an array of n elements, the most inversions occur when the array is in exactly reversed order. Inversions then are
| inversions | count |
| --- | --- |
| (1,2), (1,3), (1,4), … , (1,n) | n-1 |
| (2,3), (2,4), … , (2,n) | n-2 |
| (3,4), … , (3,n) | n-3 |
| ⋮ | ⋮ |
| (n-1,n) | 1 |
Counting these (starting from the bottom), we have $\sum_{i=1}^{n-1} i$ inversions. So the maximum total number of inversions is $\frac{n*(n-1)}{2}$.
We’ll state this formally:
Theorem: The maximum number of inversions in an array of $n$ elements is $(n*(n-1))/2$.
We have just proven that theorem. Now, another one, describing the average:
Theorem: The average number of inversions in an array of $n$ randomly selected elements is $(n*(n-1))/4$.
We won’t prove this, but note that it is plausible: in a randomly ordered array, each pair (i,j) is equally likely to be in order or inverted, so on average half of the $(n*(n-1))/2$ pairs are inversions.
4.2 A Speed Limit on Adjacent-Swap Sorting
Now, the result we have been working toward:
Theorem: Any sorting algorithm that only swaps adjacent elements has average time no faster than $O(n^2)$.
Proof
Swapping 2 adjacent elements in an array removes at most 1 inversion.
But on average there are $(n*(n-1))/4$ inversions, so on average the total number of swaps required is proportional to $n^2$.
Hence the algorithm as a whole can be no faster than $O(n^2)$.
QED
And,
Corollary: Insertion sort has average case complexity $O(n^2)$.
Proof
Insertion sort is often written like this:
// Weiss 7.2, rewritten to use adjacent swaps
//
template <typename Comparable>
void insertionSort(vector<Comparable>& a)
{
    for( int p = 1; p < a.size(); ++p)
    {
        // bubble a[p] downward into the sorted sublist
        // a[0] ... a[p-1] by repeatedly swapping adjacent
        // elements that are out of order
        for( int j = p; j > 0 && a[j] < a[j-1]; --j)
            std::swap(a[j], a[j-1]);
    }
}
and it is clear that this version only exchanges adjacent elements.
By the theorem just given, the best average case complexity we could therefore get is $O(n^2)$.
The theorem does not preclude an average case complexity even slower than that, but we know that the worst case complexity is also $O(n^2)$, and the average case can’t be any slower than the worst case.
So we conclude that the average case complexity is, indeed, $O(n^2)$.
Our actual algorithm replaces each swap call with a single move assignment, cutting the cost of the inner loop body by a constant factor. But a reduction by a constant multiplier cannot affect the overall complexity. So the actual algorithm given at the top of the page is also $O(n^2)$.