Average Case Analysis

Steven J. Zeil

Last modified: May 2, 2024

1 Introduction

Earlier, we looked at the process of analyzing the worst-case running time of an algorithm, and the use of worst-case analysis and big-O notation as a way of describing it.

In this section, we will introduce average-case complexity. Just as the worst-case complexity describes an upper bound on the worst-case time we would see when running an algorithm, average case complexity will present an upper bound on the average time we would see when running the program many times on many different inputs.

Worst-case complexity gets used more often than average-case, for a number of reasons: worst-case analysis is generally simpler to carry out, and in many settings what matters most is whether any single run, such as the response to a single user action, can be unacceptably slow.

In those circumstances, it makes sense to focus on worst-case behavior and to do what we can to improve that worst case.


On the other hand, suppose we’re talking about a batch program that will process thousands of inputs per run, or we’re talking about a critical piece of an interactive program that gets run hundreds or thousands of times in between each response from the user.

In that situation, adding up hundreds or thousands of worst-cases may be just too pessimistic. The cumulative time of thousands of different runs should show some averaging out of the worst-case behavior, and an average case analysis may give a more realistic picture of what the user will be seeing.

2 Definition

Definition: Average Case Complexity

We say that an algorithm requires average time proportional to $f(n)$ (or that it has average-case complexity $O(f(n))$) if there are constants $c$ and $n_{0}$ such that the average time the algorithm requires to process an input set of size $n$ is no more than $c*f(n)$ time units whenever $n \geq n_{0}$.

This definition is very similar to the one for worst case complexity. The difference is that for worst-case complexity, we want $T_{\mbox{max}}(n) \leq c*f(n)$ where $T_{\mbox{max}}(n)$ is the maximum time taken by any input of size $n$, but for average case complexity we want $T_{\mbox{avg}}(n) \leq c*f(n)$ where $T_{\mbox{avg}}(n)$ is the average time required by inputs of size $n$.

The average case complexity describes how quickly the average time increases when n increases, just as the worst case complexity describes how quickly the worst case time increases when n increases.

In both forms of complexity, we are looking for upper bounds, so our big-O notation (and its peculiar algebra) will still apply.
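One consequence is worth spelling out. No input of size $n$ takes longer than the worst case for that size, so the average over inputs of size $n$ can never exceed the worst case:

\[ T_{\mbox{avg}}(n) \leq T_{\mbox{max}}(n) \leq c*f(n) \mbox{ whenever } n \geq n_{0} \]

In other words, any worst-case bound is automatically a (possibly loose) average-case bound.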


Question: Suppose we have an algorithm with worst case complexity $O(n)$.

True or false: It is possible for that algorithm to have average case complexity $O(n^2)$.

Answer

Strictly speaking, yes: big-O provides only an upper bound, and anything that is $O(n)$ is also, loosely, $O(n^2)$. Keep in mind, though, that the average time for inputs of size $n$ can never exceed the worst-case time for those inputs, so a worst-case bound of $O(n)$ already applies to the average case as well.

This is an important idea to keep in mind as we discuss the rules for average case analysis. The worst-case analysis rules all apply, because they do provide an upper bound. But that bound is sometimes not as tight as it could be. One of the things that makes average case analysis tricky is recognizing when we can get by with the worst-case rules and when we can gain by using a more elaborate average-case rule. There’s no hard-and-fast way to tell; it requires the same kind of personal judgment that goes on in most mathematical proofs.

3 Probably Not a Problem

We’re going to need a few basic facts about probabilities for this section.

You can find a fuller tutorial here.
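For our purposes, the facts we need amount to little more than this: each probability is a number between 0 and 1, and the probabilities of all the possible outcomes of some event add up to 1:

\[ 0 \leq p_i \leq 1, \qquad \sum_i p_i = 1 \]

We will also lean heavily on the idea of an expected value, introduced below.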

4 What’s an Average?


For some people, average case analysis is difficult because they don’t have a very flexible idea of what an “average” is.

Example:

Last semester, Professor Cord gave out the following grades in his CS361 class:

A, A, A-, B+, B, B, B, B-, C+, C, C-, D, F, F, F, F

Translating these to their numerical equivalent,

4, 4, 3.7, 3.3, 3, 3, 3, 2.7, 2.3, 2, 1.7, 1, 0, 0, 0, 0

what was the average grade in Cord’s class?

According to some classic forms of average:

Median
the middle value of the sorted list, (the midpoint between the two middle values given an even number of items)

\[ \mbox{avg}_{\mbox{median}} = 2.5 \]

Mode
the most commonly occurring value

\[ \mbox{avg}_{\mbox{modal}} = 0 \]

Mean
Computed from the sum of the elements

\[ \begin{align} \mbox{avg}_{\mbox{mean}} &= (4 + 4 + 3.7 + 3.3 + 3 + 3 + 3 + 2.7 + 2.3 + 2 + 1.7 + 1 + 0 + 0 + 0 + 0) / 16 \\ &= 2.11 \end{align} \]

4.1 The Mean Average

The mean average is the most commonly used, and corresponds to most people’s idea of a “normal” average, but even it comes in many varieties:

Simple mean
$\bar{x} = \frac{\sum_{i=1}^N x_i}{N}$
Weighted mean
$\bar{x} = \frac{\sum_{i=1}^N w_i * x_i}{\sum_{i=1}^N w_i}$

The $w_i$ are the weights that adjust the relative importance of the scores.

Example:

Last semester Professor Cord gave the following grades

Grade # students
4.0 2
3.7 1
3.3 1
3.0 3
2.7 1
2.3 1
2.0 1
1.7 1
1.3 0
1.0 1
0.0 4

The weighted average is $\frac{2*4.0 + 1*3.7 + 1*3.3 + 3*3.0 + 1*2.7 + 1*2.3 + 1*2.0 + 1*1.7 + 0*1.3 + 1*1.0 + 4*0.0}{2 + 1 + 1 + 3 + 1 + 1 + 1 + 1 + 0 + 1 + 4}$ $= 2.11$

Another example of weighted averages:

When one student asked about his overall grade for the semester, Professor Cord pointed out that assignments were worth 50% of the grade, the final exam was worth 30%, and the midterm exam was worth 20%. The student had earned a B, an A, and a C-, respectively, on these.

Category Score Weight
Assignments 3.0 50
Final 4.0 30
Midterm 1.7 20

So the student’s average grade was $$\frac{50*3.0 + 30*4.0 + 20*1.7}{50+30+20} = 3.04$$
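If it helps to see the arithmetic spelled out, here is a small code sketch of the weighted-mean formula applied to this example (the class and method names are made up for illustration):

// WeightedMeanDemo: illustrative sketch; class and method names are made up
public class WeightedMeanDemo {

    // weighted mean = sum(w_i * x_i) / sum(w_i)
    static double weightedMean(double[] scores, double[] weights) {
        double numerator = 0.0, denominator = 0.0;
        for (int i = 0; i < scores.length; ++i) {
            numerator   += weights[i] * scores[i];
            denominator += weights[i];
        }
        return numerator / denominator;
    }

    public static void main(String[] args) {
        double[] scores  = {3.0, 4.0, 1.7};   // assignments, final, midterm
        double[] weights = {50, 30, 20};
        System.out.printf("%.2f%n", weightedMean(scores, weights));  // prints 3.04
    }
}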

4.2 Expected Value

The expected value is a special version of the weighted mean in which the weights are the probability of seeing each particular value.

If $x_1, x_2, \ldots$ are all the possible values of some quantity, and these values occur with probabilities $p_1, p_2, \ldots$, then the expected value of that quantity is

\[ E(x) = \sum_{i=1}^N p_i * x_i \]

Note that if we have listed all possible values, then

\[ \sum_{i=1}^N p_i = 1 \]

so you can regard the $E(x)$ formula above as a special case of the weighted average in which the denominator (the sum of the weights) becomes simply “1”.

Example:

After long observation, we have determined that Professor Cord tends to give grades with the following distribution:

Grade probability
4.0 2/16
3.7 1/16
3.3 1/16
3.0 3/16
2.7 1/16
2.3 1/16
2.0 1/16
1.7 1/16
1.3 0/16
1.0 1/16
0.0 4/16

So the expected value of the grade for an average student in his class is

$$\begin{align} &((2/16)*4.0 + (1/16)*3.7 + (1/16)*3.3 + (3/16)*3.0 \\ &+ (1/16)*2.7 + (1/16)*2.3 + (1/16)*2.0 + (1/16)*1.7 + (0/16)*1.3 \\ &+ (1/16)*1.0 + (4/16)*0.0) \\ &= 2.11 \end{align}$$

The expected value is the kind of average we will use throughout this course in discussing average case complexity.
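Here is the same expected-value computation as a short code sketch (again, the class name is made up); it simply applies $E(x) = \sum_i p_i x_i$ to the distribution in the table above:

// ExpectedGradeDemo: illustrative sketch; the class name is made up
public class ExpectedGradeDemo {
    public static void main(String[] args) {
        // Professor Cord's grade distribution (probabilities given as counts out of 16)
        double[] grades = {4.0, 3.7, 3.3, 3.0, 2.7, 2.3, 2.0, 1.7, 1.3, 1.0, 0.0};
        double[] counts = {  2,   1,   1,   3,   1,   1,   1,   1,   0,   1,   4};

        double expected = 0.0;
        for (int i = 0; i < grades.length; ++i) {
            double p = counts[i] / 16.0;    // p_i
            expected += p * grades[i];      // p_i * x_i
        }
        System.out.printf("%.2f%n", expected);   // prints 2.11
    }
}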

5 Determining the Average Case Complexity

In many ways, determining average case complexity is similar to determining the worst-case complexity.

5.1 Why Might Average-Case Be Smaller than Worst-Case?

It boils down to 3 possible reasons:

  1. Your code calls another function whose average case complexity is smaller than its worst case.
  2. You have a loop (or recursion) that, on average, does not repeat as often as it would in the worst case.
  3. You have an if statement with different complexities for its then and else parts, that if statement is inside a loop (or recursion), and, on average, the cheaper option is taken more often than it would be in the worst case.

5.2 It Still All Boils Down to Addition

If you want to know how much time a complicated process takes, you figure that out by adding up the times of its various components.

That basic observation is the same for average times as it is for worst-case times.

When in doubt, just add things up.

  • Just keep in mind that you want to add up their average times instead of their worst-case times.

5.3 All of Your Variables Must be Defined

In math, as in programming, all of your variables must be declared/defined before you can use them.

5.4 Complexity is Written in Terms of the Inputs

The complexity of a block of code must be a function of the inputs (only!) to that block.

5.5 The Complexity of Any Block of Code Must be Numeric

No reason for this to change.

5.6 Surprises Demand Explanation

The vast majority of algorithms we will look at will have the same average-case complexity as worst-case. So, if you come up with a different value for the average-case, make sure that you understand why.

By the same token, though, if you make it to the end of an average-case analysis and never once took the time to even consider how the “average input” is different from the “worst-case input”, you may need to start over.

6 Why Would Average-Case Complexity Be Different from Worst-Case?

There are basically only three reasons why a piece of code would run in an average case complexity faster than its worst case:

  1. The code calls some function that is known to have a faster average case than worst case.
  2. The code contains a conditional statement that chooses between two alternatives, one of which is faster than the other, and that, on average, the faster choice is taken more often.
  3. The code contains a loop (or recursive call) that, on average, repeats far fewer times than it does in the worst case.

That’s pretty much it.

6.1 Exiting Early from a Loop

The total time for a loop is found by adding up the times of all of its iterations. But, what do we do if the number of iterations can vary depending on subtle properties of the input?

If we can say that each iteration of the loop runs in time $O_{\mbox{iteration}}(f(N))$, i.e., the time of an iteration does not depend on which iteration we are in, then we can write

\[ T_{\mbox{loop}} = \sum_{i=1}^k O_{\mbox{iteration}}(f(N)) \]

where $k$ is the number of times that the loop repeats. Now, if we are doing worst case analysis, we figure out what the maximum value of $k$ would be. But if we are doing average case analysis, we might ask if the average (or expected) value of $k$ is significantly less than that maximum.

6.1.1 Example 1: Searching an Ordered Array

Consider this code for searching an ordered (sorted) array of integers. We will assume that both the numbers inserted into the array and the values we use when searching are drawn randomly from the integers in some range $0 \ldots M$.

int orderedSearch (int[] array, int key) {
    int i = 0;
    // Advance past the elements that are still smaller than the key.
    while (i < array.length && array[i] < key) {
        ++i;
    }
    if (i < array.length && array[i] == key)
        return i;    // found it
    else
        return -1;   // not present: we ran past where it would have been
}

This search takes advantage of the fact that the array is ordered by stopping the loop as soon as we get to an array value larger than or equal to the key. For example, if we had the array

0 42 101 252 568 890

and if we were searching for 97, we would stop with i==2 because array[2] > 97, and so we know that, if we have not found 97 yet, there’s no point in looking through the even larger numbers in the rest of the array.

So if we let $k$ denote the number of iterations of this loop,

What is $k$ in the worst case?
What does that tell us about the worst case complexity?
Now, what is $k$ in the average case?
What does that tell us about the average case complexity?

6.1.2 Example 2: Simulating a Rolling Die

Java has a useful class for generating pseudo-random numbers.

package java.util;

public class Random {
    // Creates a new random number generator.
    public Random() {...}
      ⋮
    // Returns a pseudorandom int value between 0 (inclusive) and bound (exclusive).
    public int nextInt(int bound) {...}
      ⋮
}

The distinction between a “pseudorandom” and true “random” integer is not particularly important to us.

The nextInt(bound) function returns a random integer in the range 0...bound-1. It does this in O(1) time.

We can simulate the roll of a six-sided die by taking nextInt(6) + 1. This gives us a uniform random selection in the range 1...6.

Consider the following code:

    Random rand = new Random();
    int roll = rand.nextInt(6) + 1;
    while (roll != 2) {
       roll = rand.nextInt(6) + 1;
    }

What is the worst-case complexity of this code?
What is the average-case complexity of this code?
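Before answering, it can be instructive to measure. The following sketch (the class name and trial count are arbitrary choices for illustration) runs the loop above many times and reports the average number of calls to nextInt per run; comparing that measurement to your worst-case answer should suggest why the two complexities can differ.

import java.util.Random;

// DieLoopEstimate: illustrative sketch; the class name and trial count are arbitrary
public class DieLoopEstimate {
    public static void main(String[] args) {
        Random rand = new Random();
        final int trials = 1_000_000;
        long totalRolls = 0;

        for (int t = 0; t < trials; ++t) {
            int rolls = 1;                       // count the roll made before the loop
            int roll = rand.nextInt(6) + 1;
            while (roll != 2) {
                roll = rand.nextInt(6) + 1;
                ++rolls;
            }
            totalRolls += rolls;
        }

        // Average number of calls to nextInt per run of the loop above
        System.out.println((double) totalRolls / trials);
    }
}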


7 Extended Example: Ordered Insertion, Different Input Distributions

We’ll illustrate the process of doing average-case analysis by looking at a simple but useful algorithm, exploring how changes in the input distribution (the probabilities of seeing various possible inputs) affect the average case behavior.

Here is our ordered insertion algorithm.

template <typename Iterator, typename Comparable>
Iterator addInOrder (Iterator start, Iterator stop, const Comparable& value)
{
  Iterator preStop = stop;    // stop marks the first unused position
  --preStop;                  // preStop points to the last stored element
  while (stop != start && value < *preStop) {
    *stop = *preStop;         // shift the larger element up one position
    --stop;
    --preStop;
  }
  // Insert the new value
  *stop = value;
  return stop;                // position where the new value was placed
}

We will, as always, assume that the basic iterator operations are $O(1)$. For the sake of this example, we will also assume that the operations on the Comparable type are $O(1)$. (We’ll discuss the practical implications of this at the end.)

We start, as usual, by marking the simple bits O(1).

template <typename Iterator, typename Comparable>
Iterator addInOrder (Iterator start, Iterator stop, const Comparable& value)
{
  Iterator preStop = stop;    // O(1)
  --preStop;                  // O(1)
  while (stop != start && value < *preStop) {
    *stop = *preStop;         // O(1)
    --stop;                   // O(1)
    --preStop;                // O(1)
  }
  // Insert the new value
  *stop = value;              // O(1)
  return stop;                // O(1)
}

Next we note that the loop body can be collapsed to O(1).

template <typename Iterator, typename Comparable>
Iterator addInOrder (Iterator start, Iterator stop, const Comparable& value)
{
  Iterator preStop = stop;    // O(1)
  --preStop;                  // O(1)
  while (stop != start && value < *preStop) {
    // O(1)
  }
  // Insert the new value
  *stop = value;              // O(1)
  return stop;                // O(1)
}

The loop condition is O(1):

template <typename Iterator, typename Comparable>
Iterator addInOrder (Iterator start, Iterator stop, const Comparable& value)
{
  Iterator preStop = stop;    // O(1)
  --preStop;                  // O(1)
  while (stop != start && value < *preStop) { //cond: O(1)
    // O(1)
  }
  // Insert the new value
  *stop = value;              // O(1)
  return stop;                // O(1)
}

Because the loop condition and body are $O(1)$, we can use the shortcut of simply analyzing this loop on the expected number of iterations.

That, however, depends on the values already in the container, and how the new value compares to them.

The loop might execute anywhere from zero times (when the new value is at least as large as everything already in the container) up to $n$ times (when it is smaller than everything already there), where $n$ is the number of values already stored.

What we don’t know are the probabilities to associate with these different numbers of iterations.

Consider using this algorithm as part of a spell checking program. We can envision two very different input patterns: the words to be inserted might arrive already in sorted (alphabetical) order, for example when loading a previously sorted word list, or they might arrive in arbitrary order, as they are encountered in the text being checked.

Let’s analyze each of these cases in turn and see how they might differ in performance.

7.1 Input in Sorted Order

 

If the input arrives in sorted order, then each call to the function will execute the loop zero times, because the word being inserted will always be alphabetically greater than all the words already in the container.

So if $p_k$ denotes the probability of executing the loop $k$ times, then \( p_0=1, p_1 = 0, p_2 = 0, \ldots \) .

So the time is

\[ \begin{align} t_{\mbox{loop}} & = t_L(0) \\ & = O(1) \end{align} \]

For this input pattern, the entire algorithm has an average-case complexity of $O(1)$.

7.2 Input in Arbitrary Order

 

In this case, we are equally likely to need 0 iterations, 1 iteration, 2 iterations, … , n iterations, where $n$ is distance(start,stop). So the possible numbers of iterations from 0 to $n$ are all equally likely:

\[p_k = \left\{ \begin{array}{ll}\frac{1}{n+1} & \mbox{if } 0 \leq k \leq n \\ 0 & \mbox{otherwise}\end{array}\right. \]

The cost of the loop condition and of the body is constant for each iteration, however, so we can use the special case

\[ t_{\mbox{loop}} = t_L(E(k)) \]

where $E(k)$ is the expected number of iterations of the loop.

What is $E(k)$?

Intuitively, if we are equally likely to repeat the loop 0 times, 1 time, 2 times, … , $n$ times, the average number of iterations would seem to be $n/2$.

Formally,

\[ \begin{eqnarray*} E(k) & = & \sum_{k=0}^{\infty} p_k k \\ & = & \sum_{k=0}^{\mbox{n}} p_k k \; \; (\mbox{because } p_k=0 \mbox{ when } k > \mbox{n})\\ & = & \sum_{k=0}^{\mbox{n}} \frac{1}{\mbox{n}+1} k \\ & = & \frac{1}{\mbox{n}+1} \sum_{k=0}^{\mbox{n}} k \\ & = & \frac{1}{\mbox{n}+1} \frac{\mbox{n}(\mbox{n}+1)}{2} \\ & = & \frac{\mbox{n}}{2} \\ \end{eqnarray*} \]

Chalk one up for intuition!

So the loop is $\frac{n}{2} O(1) = O(n)$

template <typename Iterator, typename Comparable>
Iterator addInOrder (Iterator start, Iterator stop, const Comparable& value)
{
  Iterator preStop = stop;    // O(1)
  --preStop;                  // O(1)
  while (stop != start && value < *preStop) { //cond: O(1) #:  n/2  total: O(n)
    // O(1)
  }
  // Insert the new value
  *stop = value;              // O(1)
  return stop;                // O(1)
}

And we can then replace the entire loop by $O(\mbox{n})$.

template <typename Iterator, typename Comparable>
Iterator addInOrder (Iterator start, Iterator stop, const Comparable& value)
{
  Iterator preStop = stop;    // O(1)
  --preStop;                  // O(1)
  // O(n)
  // Insert the new value
  *stop = value;              // O(1)
  return stop;                // O(1)
}

And now, we add up the complexities in the remaining straight-line sequence, and conclude that the entire algorithm has an average case complexity of $O(n)$, where $n$ is the distance from start to stop, when presented with randomly arranged inputs.

This is the same result we had for the worst case analysis. Does this mean that it runs in the same time on average as it does in the worst case? No, on average, it runs in half the time of its worst case, but that’s only a constant multiplier, so it disappears when we simplify.

Under similar randomly arranged inputs, the average case complexity of ordered search is $O(n)$ and the average case complexity of binary search is $O(\log n)$. Again, these are the same as their worst-case complexities.

7.3 Inputs in Almost-Sorted Order


We’ve already considered the case where the inputs to this function were already arranged into ascending order. What would happen if the inputs were almost, but not exactly, already sorted into ascending order?

For example, suppose that, on average, one out of $n$ items is out of order. Then the probability of a given input repeating the loop zero times would be $p_{\mbox{0}} = \frac{n-1}{n}$, and some single $p_{\mbox{i}}$ would have probability $1/n$, with all the other probabilities being zero.

Assuming the worst (because we want to find an upper bound), let’s assume that the one out-of-order element is the very last one added, and that it actually gets inserted into position 0. Then we have $p_0 = (n-1)/n, p_1 = 0, p_2 = 0, … , p_{n-1} = 0, p_n = 1/n$

So the average number of iterations would be given by \begin{eqnarray*} E(k) & = & \sum_{k=0}^{n} k p_k \\ & = & 0 * (n-1)/n + n * 1/n \\ & = & n/n \\ & = & 1 \\ \end{eqnarray*} and the function is $O(E(k)) = O(1)$

7.4 Almost Sorted - version 2

Now, that’s only one possible scenario in which the inputs are almost sorted. Let’s look at another. Suppose that we knew that, for each successive input, the probability of it appearing in the input $m$ steps out of its correct position is proportional to $1/(m+1)$ (i.e., each additional step out of its correct position is progressively more unlikely). Then we have $p_{0}=c, p_{1}=c/2, p_{2}=c/3, … p_{n-1}=c/n, p_{n}=c/(n+1)$.

The constant $c$ is necessary because the sum of all the probabilities must be exactly 1. We can compute the value of $c$ by using that fact:

\begin{align} \sum_{i=0}^{n} p_i = & 1 \\ \sum_{i=0}^{n} \frac{c}{i+1} = & 1 \\ c \sum_{i=0}^{n} \frac{1}{i+1} = & 1 \end{align}

This sum, for reasonably large n, is approximately $\log n$.

So we conclude that $c$ is approximately $= 1/\log(n)$.

So the function, for this input distribution, is

\begin{align} t_{\mbox{loop}} = & O(E(k)) \\ = & O\left(\sum_{i=0}^n (i + 1)p_i\right) \\ = & O\left(\sum_{i=0}^n (i + 1) \frac{c}{i+1}\right) \\ = & O\left(\sum_{i=0}^n c\right) \\ = & O((n+1)c) \\ = & O\left(\frac{n}{\log n}\right) \end{align}

So the average case is slightly smaller than the worst case, though not by much (remember that $\log n$ is nearly constant over large ranges of $n$, so $n/\log(n)$ grows only slightly slower than $n$).

8 The Input Distribution is Key

You can see, then, that average case complexity can vary considerably depending upon just what constitutes an “average” set of inputs.

Utility functions that get used in many different programs may see different input distributions in each program, and so their average performances in different programs will vary accordingly.
