I Remember MergeSort
(Memory Overhead)

So far, we have used big-O notation to describe program run times. But the idea of big-O as describing an upper bound on the rate of growth can be extended to the worst or average case of any numeric quantity, not just run time.

Define the "memory overhead" of a function as the amount of memory required for the data of that function. Two things contribute to memory overhead:

  1. The number of "activation records" on the system run-time stack. An activation record is created whenever a function is called and exists until we return from that call. A certain system-dependent number of bytes go into every activation record to hold information like the return address or the values of certain critical hardware registers. Whatever this number of bytes is, it is constant for any given system. But what does vary from one activation record to another is...
  2. The memory required for the parameters and local variables of the function. Each call, and therefore each activation record, gets its own set of local variables.

To determine the memory overhead of a function, we need to determine how many and what other functions it calls, and then add up "c + size(local variables)" over the combinations of activations that could be in effect at the same time. Big-O notation comes in handy here, because we may not know (and may not care) about the system-dependent constants. Furthermore, in some functions the size of the local variables may vary depending upon the function parameters (e.g., a function foo(N) might allocate a local array of size N). Also, in recursive routines, the number of activations on the stack at one time may depend on the function parameters.
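
The bookkeeping described above can be sketched as a small model. This is only an illustration: the constant kFrameConstant, the helper names chainOverhead and peakOverhead, and the byte counts are all assumptions made for the example, not measurements of any real system.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical per-activation constant "c" in bytes.
// The real value is system-dependent; 32 is an arbitrary stand-in.
const std::size_t kFrameConstant = 32;

// Memory for one chain of simultaneously active calls:
// the sum of (c + size of locals) over every activation in the chain.
std::size_t chainOverhead (const std::vector<std::size_t>& localSizes)
{
  std::size_t total = 0;
  for (std::size_t s : localSizes)
    total += kFrameConstant + s;
  return total;
}

// Overall overhead: the largest chain that can be in effect at one time.
std::size_t peakOverhead (const std::vector<std::vector<std::size_t> >& chains)
{
  std::size_t peak = 0;
  for (const auto& chain : chains)
    peak = std::max(peak, chainOverhead(chain));
  return peak;
}
```

For instance, given three possible chains with local sizes {16, 8}, {16, 24}, and {16}, the peak comes from the second chain: 2*32 + 40 = 104 bytes under the assumed constant.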

Example 1: The following function

void printSqrts (double a[], int n)
{
  for (int i = 0; i < n; ++i)
    cout << sqrt(a[i]) << endl;
}
would have an O(1) memory overhead. The function itself has only one local variable (i), and its two parameters are a pointer (arrays are passed as pointers) and an int. So its own activation record size would be a system-dependent constant plus the size of two integers plus a pointer. (Compilers often generate temporary variables that we don't see, but for a function like this there is a fixed number of them, so they contribute only O(1).) So we can claim that its activation record has size O(1). The function calls 3 other functions: sqrt(), <<(double), and <<(E), where E is the type of endl (strictly speaking, endl is an I/O manipulator function rather than a constant). Only one of those is active at any time, so we need to look at maxSize(printSqrts and sqrt, printSqrts and <<(double), printSqrts and <<(E)). It's a pretty safe bet that these also have O(1) overhead. So the total memory overhead is max(O(1)+O(1), O(1)+O(1), O(1)+O(1)), which works out to just O(1).

Example 2: The following function

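void printSqrts (double a[], int n)
{
  // Heap-allocated scratch array: n doubles, i.e., O(n) bytes
  double* sqrts = new double[n];
  for (int i = 0; i < n; ++i)
    sqrts[i] = sqrt(a[i]);
  for (int i = 0; i < n; ++i)
    cout << sqrts[i] << endl;
  delete [] sqrts;  // release the array before returning
}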
would have an O(n) memory overhead. The function allocates an array of O(n) bytes (the exact value depends upon the machine-dependent size of a double and the machine-dependent rules for packing elements into arrays). Strictly speaking, an array created with new lives on the heap rather than inside the activation record, but it is still memory required for the data of this call, so the memory charged to this activation is O(n). The function calls 3 other functions, but their O(1) overhead is dominated by the O(n) term. So the total memory overhead is O(n).
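
To make the "O(n) bytes" claim concrete, here is a minimal sketch of the arithmetic. The helper name sqrtsArrayBytes is hypothetical, and the figure ignores any bookkeeping the memory allocator may add, which is implementation-dependent.

```cpp
#include <cstddef>

// Bytes occupied by the elements of an n-element array of double.
// sizeof(double) is machine-dependent, but it is a constant on any
// given system, so the result grows linearly in n.
std::size_t sqrtsArrayBytes (std::size_t n)
{
  return n * sizeof(double);
}
```

Doubling n doubles the result, whatever sizeof(double) happens to be, which is all that the O(n) bound requires.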

Example 3: The following function

struct ListNode {
   Data data;
   ListNode* next;
};

int length (ListNode* theList)
{
  if (theList == 0)
    return 0;
  else
    return 1 + length(theList->next);
}
would have an O(N) memory overhead, where N is the length of the list. The function has an activation record of size O(1), but the recursive calls may result in as many as N+1 simultaneous activations: one per node, plus the final call on the null pointer. So we have (N+1)*O(1) = O(N).
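
One way to see the simultaneous activations is to instrument the function with a depth counter. The sketch below is an illustration only: the counters currentDepth and maxDepth are hypothetical additions, and Data is assumed to be int because the example above does not define it.

```cpp
#include <algorithm>

typedef int Data;  // assumption: the original leaves Data unspecified

struct ListNode {
   Data data;
   ListNode* next;
};

int currentDepth = 0;  // activations of length() on the stack right now
int maxDepth = 0;      // largest value currentDepth has reached

int length (ListNode* theList)
{
  ++currentDepth;                       // entering a new activation
  maxDepth = std::max(maxDepth, currentDepth);
  int result = (theList == 0) ? 0 : 1 + length(theList->next);
  --currentDepth;                       // this activation is about to return
  return result;
}
```

After calling length on a 3-node list, maxDepth is 4: one activation per node plus the call that sees the null pointer. The count grows linearly with the list, matching the O(N) bound.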


Now, try applying this idea of memory overhead to some functions we have studied.
