Pointers and References

Steven Zeil

Last modified: Dec 26, 2016

In this lesson, we examine the data types that C++ provides for storing addresses of data.

These are important because they

1 Indirection

It’s rather like answering a question about, say, the meaning of the word “Ragnarok” by pointing to a nearby dictionary instead of explaining it directly.

In our code, sometimes we will employ this indirection for efficiency. In other cases we use it for flexibility or to simplify our code.

A pointer or reference is the “name” or “address” of an object, NOT the object itself.

Most variables have names assigned to them by the programmer at the time the program is written (e.g. “int numberOfCourses;”). Every variable in a program also has a machine address where that variable is stored in the main memory of the computer. Think of this as the “machine’s name” for the variable.

Every variable in a program has a value, which is the data stored at the variable’s address.

2 References

Reference types are introduced by the use of “&” in a type expression.

Example

double z[1000];
   ⋮
int k = i + 200*j;
double& zk = z[k];

zk holds the address of z[k].


Initializing References

 
When reference variables are declared, they must be immediately initialized to the location of some existing data value, e.g.,

double& zk = z[k];  // zk gets the address of z[k]

Once initialized, a reference cannot be reset to point to a different location.


Accessing Data via References

 
Once initialized, we can use the reference much like any ordinary variable:

double& zk = z[k];  // zk gets the address of z[k]
cout << zk; // accesses data stored 
            // at that location


Assignment and References

Subsequent assignments to a reference variable will store new values at that location, but will not change the location.

double& zk = z[k];  // zk gets the address of z[k]
zk = 1.0; // changes the value of z[k]
++k;
zk = 2.0; // changes the value of z[k-1]
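The behavior described above can be checked with a small sketch. The function name referenceDemo and the three-element array are assumptions made for illustration only.

```cpp
#include <vector>

// Hypothetical demo: returns the array contents after assigning through a reference.
std::vector<double> referenceDemo()
{
    double z[3] = {0.0, 0.0, 0.0};
    int k = 1;
    double& zk = z[k];  // zk is bound to z[1]
    zk = 1.0;           // stores 1.0 in z[1]
    ++k;                // k is now 2, but zk is still bound to z[1]
    zk = 2.0;           // overwrites z[1]; z[2] is untouched
    return std::vector<double>(z, z + 3);
}
```

Changing k after the reference is initialized has no effect on which element zk refers to.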


Example: working with indirection

Question: What would the output of the following code be?

int a = 1;
int b = a;
cout << a << " " << b << endl;
a = 2;
cout << a << " " << b << endl;
b = 3;
cout << a << " " << b << endl;

**Answer:**

1 1
2 1
2 3

Here b is an independent copy of a, so changing one has no effect on the other.

Example: working with indirection (cont.)

Let’s make a one-character change …

Question: What would the output of the following code be?

int a = 1;
int& b = a;
cout << a << " " << b << endl;
a = 2;
cout << a << " " << b << endl;
b = 3;
cout << a << " " << b << endl;

**Answer:**

1 1
2 2
3 3

Here b is a reference to a, so both names refer to the same int.

References and Loops

Once initialized, a reference cannot be reset to a different location.

The above example suggests one reason why we may use references – to avoid repeating long and complicated calculations to select array elements or struct members.
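A minimal sketch of using a reference inside a loop to avoid repeating a long access expression. The Item and Order types and the applyDiscount function are hypothetical, introduced only for this illustration.

```cpp
struct Item  { double price; int quantity; };  // hypothetical type
struct Order { Item items[3]; };               // hypothetical type

// Applies a discount to every item without repeating order.items[i] each time.
void applyDiscount(Order& order, double fraction)
{
    for (int i = 0; i < 3; ++i)
    {
        Item& item = order.items[i];  // a new reference is created on each iteration
        item.price = item.price * (1.0 - fraction);
    }
}
```

Because the reference is declared inside the loop body, each iteration initializes a fresh reference. No existing reference is ever reset to a different location.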


Const References

When we modify a reference type by pre-pending “const”:


References and Functions

You’ve seen lots of functions using reference parameters

void foo (Money& m, const Time& t);
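As a sketch of what such a pair of parameters permits, assume simplified Money and Time structs and a hypothetical function addLateFee with the same parameter types as foo.

```cpp
struct Money { int dollars; int cents; };   // assumed layout
struct Time  { int hours; int minutes; };   // assumed layout

// m may be changed by the function; t can be read but not modified.
void addLateFee(Money& m, const Time& t)
{
    if (t.hours >= 17)
        m.dollars += 5;     // changes the caller's Money object
    // t.hours = 0;         // would not compile: t is a const reference
}
```

Neither argument is copied when the function is called. The const on the second parameter documents, and enforces, that the function only reads it.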


References and Functions (returns)

References are also sometimes used as return types

const Money& getAmountBid (Bid& b)
{
   return b.amountBid;
}
   ⋮
cout << getAmountBid(myBid).dollars;

Slight improvement in efficiency – the Money value does not need to be copied.


References and Functions (returns) cont.

Money& getAmountBid (Bid& b)
{
   return b.amountBid;
}
   ⋮
getAmountBid(myBid).dollars = 0;
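The two return styles can be compared side by side in a short sketch. The struct layouts are assumed, and the second function is given the hypothetical name amountBidOf so that both versions can coexist in one file.

```cpp
struct Money { int dollars; int cents; };  // assumed layout
struct Bid   { Money amountBid; };         // assumed layout

// Read-only access: callers can look at the amount but not change it.
const Money& getAmountBid(const Bid& b) { return b.amountBid; }

// Read-write access: the returned reference aliases the member inside b.
Money& amountBidOf(Bid& b) { return b.amountBid; }
```

An assignment such as amountBidOf(myBid).dollars = 0; changes the data stored inside myBid itself, because no copy of the Money value is ever made.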

3 Pointers


Pointers

Pointers, like references, store the location or address of data.

3.1 Working with Pointers


Declaring Pointer Variables

A pointer declared like this

double *p;

contains, essentially, random bits.

To be useful, it must be initialized

double *p =  ... 

or, later, re-assigned

p =  ... 


Initializing Pointer Variables

Where do new pointer values come from?

NULL is actually a bit of a problem. Not only do you have to include a header to get it, but there are some rare circumstances where passing it to functions that take a pointer as a parameter will not compile properly. Hence the C++11 standard introduced a better-behaved universal null pointer constant, nullptr.

nullptr is available in any compiler that supports C++11. Some older compilers need an option such as -std=c++11 to enable it.
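The better-behaved constant introduced by C++11 is spelled nullptr. One situation where the difference shows up is overloading, sketched here with two hypothetical functions named describe.

```cpp
#include <string>

// Two hypothetical overloads: one takes an integer, one takes a pointer.
std::string describe(int)         { return "int"; }
std::string describe(const char*) { return "pointer"; }

std::string callWithZero()    { return describe(0); }        // 0 is an int first
std::string callWithNullptr() { return describe(nullptr); }  // only a pointer fits
```

A literal 0 selects the int overload even when the programmer meant "no pointer", whereas nullptr can only ever be converted to a pointer type.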


Dereferencing a Pointer

Accessing data whose address is stored in a pointer is called dereferencing the pointer.
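Dereferencing is written with a leading * in front of the pointer. A minimal sketch, using a hypothetical helper named increase:

```cpp
// Hypothetical helper: adds 'amount' to whatever int the pointer refers to.
void increase(int* p, int amount)
{
    *p = *p + amount;   // read through the pointer, then write through it
}
```

A call such as increase(&x, 3) passes the address of x, so the function changes x itself rather than a copy.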


Dereferencing and Structs
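For a pointer to a struct, C++ provides the -> operator as shorthand for dereferencing the pointer and then selecting a member. The sketch below assumes a simplified Money struct, and the two function names are hypothetical.

```cpp
struct Money { int dollars; int cents; };  // assumed layout

// Both functions return the same member; they differ only in notation.
int dollarsLongForm(const Money* m)  { return (*m).dollars; }
int dollarsShortForm(const Money* m) { return m->dollars; }
```

The parentheses in (*m).dollars are required because the member-selection dot binds more tightly than the dereferencing *.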


Assignment and Pointers

Subsequent assignments to a pointer variable will change the location it points to.

double* zk = &(z[k]);  // zk gets the address of z[k]
*zk = 1.0; // changes the value of z[k]
zk = &(z[k+1]);
*zk = 2.0; // changes the value of z[k+1]


Example: working with pointers

Question: What would the output of the following code be?

int a = 1;
int b = 2;
int* pa = &a;
int* pb = &b;
cout << a << " " << *pa << " " << b << endl;
a = 3;
cout << a << " " << *pa << " " << b << endl;
*pb = 4;
cout << a << " " << *pa << " " << b << endl;
pa = pb;
cout << a << " " << *pa << " " << b << endl;

**Answer:**

1 1 2
3 3 2
3 3 4
3 4 4

The final assignment, pa = pb, makes pa point to b. It does not change a.

Const Pointers

When we modify a pointer type by pre-pending “const”:

3.1.1 Memory and C++ Programs


Where is Data Stored?

The memory of a running C++ program is divided into three main areas:


How Functions Work

int foo(int a, int b)
{
  return a+b-1;
}

would compile into a block of code equivalent to

   stack[1] = stack[3] + stack[2] - 1;
   jump to address in stack[0]


The Runtime Stack


An Example of Function Activation

Suppose that we were executing this code, and had just come to the call to resolveAuction within main.

#include "time.h"

void resolveAuction (Item item)
{
  ⋮
  int h = item.auctionEndsAt.getTime();
  ⋮
}

int main (int argc, char** argv)
{
  ⋮
  resolveAuction (item);
  ⋮
}


 

The runtime stack (a.k.a. the activation stack) would, at this point in time, contain a single activation record for the main function, as that is the only function currently executing:

 

When main calls resolveAuction, a new record is added to the stack with space for all the data required for this new function call.

 
When resolveAuction calls getTime, another new record is added to the stack with space for all the data required for that new function call.

 

But once getTime returns (to resolveAuction), that activation record is removed from the stack. resolveAuction is once again the active function.

 
And when resolveAuction returns, its record is likewise removed from the stack.

3.1.2 Allocating Data


Example of Overall Memory layout

 


Allocating Data on the Heap

We allocate data with new and remove it with delete:

int *p = new int;
int *pa = new int[100];
    ⋮
delete p;
delete [] pa;

Note the slightly different forms for arrays versus single instances.
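A minimal sketch of the array forms in use. The function sumFirst is hypothetical and assumes n is not negative.

```cpp
// Hypothetical helper: sums 1..n using a heap-allocated array (assumes n >= 0).
int sumFirst(int n)
{
    int* values = new int[n];      // array form of new
    for (int i = 0; i < n; ++i)
        values[i] = i + 1;
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += values[i];
    delete [] values;              // array form of delete matches new[]
    return total;
}
```

Data allocated with new[] must be released with delete [], and data allocated with plain new must be released with plain delete. Mixing the two forms has undefined behavior.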


Dynamic Allocation

Programmers often distinguish between “dynamic” and “static” activities:

int *p = new int;
int *pa = new int[100];

Allocation of data via new is called dynamic allocation of data.

Dynamically allocated memory is controlled through the two operators new and delete.


Summary: Pointers versus References

                  References                 Pointers
Type Declaration  &                          *
Initialization    must be initialized;       optional;
                  points to existing data    may be null
Dereferencing     automatic                  *, ->
Management        automatic                  new, delete
Dangerous?        minimal                    very

3.2 Pointers Can Be Dangerous

Because pointers provide access to a memory location, and because data and executable code exist in memory together, misuse of pointers can lead to both bizarre effects and very subtle errors.


Potential Problems with Pointers


Uninitialized pointers


Memory Leaks

A memory leak occurs when all pointers to a value allocated on the heap have been lost, e.g.,

int isqrt (int i)
{
   int* work = new int;
   *work = i;
   while ((*work) * (*work) > i)
     -- (*work);
   return *work;
}

Over time, memory leaks can cause programs to slow down and, eventually, crash.

Worse, a leaky program may come to take up so much of a system’s memory that it interferes with the operation of other programs on the same system.
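One way to plug the leak in isqrt above is to copy the answer out of the heap storage and then delete that storage before returning. The name isqrtNoLeak is hypothetical, and the sketch assumes i is non-negative and small enough that squaring it does not overflow.

```cpp
// Same algorithm as isqrt above, but the heap storage is released.
// Assumes 0 <= i and that i * i fits in an int.
int isqrtNoLeak(int i)
{
    int* work = new int;
    *work = i;
    while ((*work) * (*work) > i)
        -- (*work);
    int result = *work;   // copy the answer before giving the memory back
    delete work;          // the only pointer to the int is about to disappear
    return result;
}
```

In a function this small, an ordinary local int would avoid the heap altogether. The pointer is kept here only to mirror the original example.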


Dangling Pointers

A dangling pointer is a pointer to an object that has been deleted. In the code below, both p and q are dangling once the delete has executed.

int* p = new int;
int* q = p;
  ⋮
delete p;

4 The Secret World of Pointers

4.1 Pointers and Arrays


What’s in a Name? (of an array)

int a[100];
double b[1000];


Arrays are Pointers

int a[100];
double b[1000];

You may have observed examples of passing arrays to functions as parameters. They are usually passed as pointers. That’s possible because, in most expressions, the name of an array is automatically converted to a pointer to its first element.


Pointer Arithmetic

> g++ pointerArith.cpp 
> ./a.out 
i 0x1eea010  d 0x1eea030
i 0x1eea014  d 0x1eea038
> 

Pointer Arithmetic 2

> g++ pointerArith2.cpp 
> ./a.out 
p 0x7fff2065c380  q 0x7fff2065c38c  q-p 3
> 
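The outputs above show that adding 1 to a pointer moves it by one whole element (4 bytes for the int pointer, 8 for the double pointer), and that subtracting two pointers yields a count of elements. The sketch below is not the original pointerArith programs. It is a hypothetical pair of functions illustrating the same two facts.

```cpp
#include <cstddef>

// Number of int elements between two positions in the same array.
std::ptrdiff_t elementsBetween()
{
    int a[10] = {0};
    int* p = &a[2];
    int* q = p + 3;      // advances by 3 ints, i.e. 3 * sizeof(int) bytes
    return q - p;        // the difference is counted in elements, not bytes
}

// Number of bytes between the same two positions.
std::ptrdiff_t bytesBetween()
{
    int a[10] = {0};
    char* p = reinterpret_cast<char*>(&a[2]);
    char* q = reinterpret_cast<char*>(&a[5]);
    return q - p;        // char pointers count in single bytes
}
```

On a machine with 4-byte ints, the first function yields 3 and the second 12, matching the address gap of 0xc in the output above.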


OK, Why do Pointer Arithmetic?

Pointer arithmetic has undefined behavior, and is pretty much useless, unless the addresses involved all lie within a single array.

double b[1000];
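Within an array, indexing and pointer arithmetic are two notations for the same access. A small sketch, using a hypothetical function sameElement and assuming 0 <= i < 1000:

```cpp
// Hypothetical check: indexing and pointer arithmetic reach the same element.
// Assumes 0 <= i < 1000.
bool sameElement(int i)
{
    double b[1000] = {0.0};
    b[i] = 3.5;
    double* p = b;                 // the array name converts to &b[0]
    return *(p + i) == b[i]        // same value...
        && (p + i) == &b[i];       // ...at the same address
}
```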


Pointers, Arrays, and Functions

This is why, when arrays are passed to functions, they are generally passed as pointers:

double sumOverArray (double* a, int n)
{
   double s = 0.0;
   for (int i = 0; i < n; ++i)
      s += a[i];
   return s;
}
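A usage sketch for sumOverArray. The function is repeated so that the example is self-contained, and the two callers and their data are hypothetical.

```cpp
double sumOverArray (double* a, int n)   // repeated from the text
{
   double s = 0.0;
   for (int i = 0; i < n; ++i)
      s += a[i];
   return s;
}

// Sums a whole array: the array name converts to a pointer to element 0.
double sumOfSample()
{
    double data[4] = {1.0, 2.0, 3.0, 4.5};
    return sumOverArray(data, 4);
}

// Sums only the last two elements by passing a pointer into the middle.
double sumOfTail()
{
    double data[4] = {1.0, 2.0, 3.0, 4.5};
    return sumOverArray(data + 2, 2);
}
```

Because the function receives only a pointer, it cannot tell how long the array is, which is why the element count must be passed separately.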

We’ll have more on this shortly when we look at dynamically allocated arrays.

4.2 Pointers and Strings


Not all strings are created equal.


String Literals

The most common place where strings and character arrays meet is in string literals.
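A string literal is stored as an array of const char with a terminating '\0' character, and it can be used to initialize either a character pointer or a std::string. The two function names in this sketch are hypothetical.

```cpp
#include <cstring>
#include <string>

// A literal used as a C-style string: a pointer to its first character.
std::size_t literalLength()
{
    const char* greeting = "Hello";   // points at the 'H'
    return std::strlen(greeting);     // counts characters up to the '\0'
}

// A std::string can be compared directly against a literal with ==.
bool matchesLiteral(const std::string& s)
{
    return s == "Hello";
}
```

The literal "Hello" occupies six chars in memory, five letters plus the terminator, although strlen reports a length of five.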


main

The main function in C++ programs has the prototype

int main (int argc, char** argv)

Question: Why are there two asterisks in char**?

The first ’*’ indicates that each parameter is a C-style character array (passed, as is common for arrays, as a pointer).

The second ’*’ indicates that this is an array of those character arrays (again, passed as a pointer), because there can be multiple command line parameters.
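A sketch of walking through the two levels of argv. The helper collectArguments is hypothetical. It copies each C-style character array into a std::string.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: copies the C-style strings in argv into std::strings.
std::vector<std::string> collectArguments(int argc, char** argv)
{
    std::vector<std::string> args;
    for (int i = 0; i < argc; ++i)
        args.push_back(argv[i]);   // argv[i] is a char*: one C-style string
    return args;
}
```

A program would typically call this as collectArguments(argc, argv) at the start of main. By convention, argv[0] is the name used to launch the program.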

4.3 Pointers and Member Functions


Hide the Parameter

Remember that when we convert standalone functions to member functions, one parameter becomes implicit:

struct Money {
   ⋮
};
Money add (Money left, Money right);

becomes

struct Money {
   ⋮
   Money add (Money right);
};


Revealing the Hidden Parameter

That parameter really does exist:

struct Money {
   ...
   Money add (/* Money* this,*/ Money right);
};


Using this

Sometimes we need to make explicit reference to the implicit parameter.

Suppose that we had

Money Money::add (Money right)
{
  Money result;
  result.dollars = dollars + right.dollars;
  result.cents = cents + right.cents;
  return result;
}

and wanted to add some debugging output …


Explicit this

Money Money::add (Money right)
{
  cerr << "Entering Money::add, adding "
       <<  *this
       <<  " to " << right << endl;
  Money result;
  result.dollars = dollars + right.dollars;
  result.cents = cents + right.cents;
  return result;
}

There’s really no other way to pass the whole “left” value to another function or operator.

The need to explicitly refer to this is unusual, but not all that rare.
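Another common use of an explicit this is handing the whole object to a standalone function. In this sketch the Money layout is assumed, and both totalCents and toCents are hypothetical names.

```cpp
struct Money {
    int dollars;
    int cents;
    int totalCents() const;   // hypothetical member, not from the text
};

// A standalone function that needs the whole Money value.
int toCents(const Money& m) { return 100 * m.dollars + m.cents; }

int Money::totalCents() const
{
    return toCents(*this);    // *this is the object the member was called on
}
```

Inside a member function, this is a pointer to the object, so *this is the object itself and can be passed wherever a Money is expected.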