Commentary: Functions and Parameter passing
Steven Zeil
In every programming language, functions are a basic building block that packages statements together into convenient units. Functions play multiple roles in a program:
-
Sometimes we can use the function multiple times at different places in the program, saving ourselves the need to duplicate the calculation.
This isn’t just saving ourselves typing. It also means we don’t have to debug and fix errors in the repeated code. It’s all too common to find a bug once, fix it, and forget that you had typed out the same code, with the same bug, in two or three other places in the program.
-
Sometimes we can reuse the function in other programs we will be writing in the future. Again, this doesn’t just save the effort of typing out the code, but saves a lot of effort testing and debugging.
-
They provide an opportunity to label a block of code, often in lieu of comments. For example, a programmer might feel compelled to label a messy computation with a comment:
⋮ main calculation
// Estimate the square root of x
double s0 = x/2;
double theSqrt = s0;
do {
    s0 = theSqrt;
    theSqrt = (s0 + x / s0) / 2.0;
} while (abs(theSqrt - s0) > 0.01);
⋮ main calculation continues
Instead, we could replace that code by
⋮ main calculation
theSqrt = estimateSquareRoot(x);
⋮ main calculation continues
in the main calculation, and move the detailed code into a function body elsewhere in the program. This may actually add a few lines of code to the program, but is often considered an improvement because the flow of the main calculation is not interrupted visually by the fine details of this messy bit of code.
1 Declaring Functions
1.1 Declare before use
Unlike many other programming languages, C++ has an important rule:
A function must be declared before any code tries to call that function.
So this is OK:
double estimateSquareRoot (double x);
⋮
theSqrt = estimateSquareRoot(x);
but this is not:
theSqrt = estimateSquareRoot(x); // error: function has not been declared yet
⋮
double estimateSquareRoot (double x);
Actually, this is just an instance of the more general rule in C++:
Every identifier must be declared before any code tries to use it.
Most programming languages have that rule for type names, variable names, etc., but some relax it for functions.
1.2 Function Prototypes
There are two ways to declare a function in C++.
-
You can give the whole function, including its body, e.g.,
double estimateSquareRoot (double x)
{
    double s0 = x/2;
    double theSqrt = s0;
    do {
        s0 = theSqrt;
        theSqrt = (s0 + x / s0) / 2.0;
    } while (abs(theSqrt - s0) > 0.01);
    return theSqrt;
}
Technically, this both declares and defines the function, but we’ll discuss the difference between those two ideas in a later section.
-
You can give just the opening header, called the prototype of the function, e.g.,
double estimateSquareRoot (double x);
(Note the ‘;’ at the end.) The prototype provides just enough information for a programmer to use the function, and for the compiler to process a call to that function. Declaring a function as a prototype is also a kind of a promise by the programmer to provide the actual function body later.
A prototype has three essential components:
-
The return type (or void if this function does not return a value).
-
The name of the function.
-
Within parentheses, a list of parameters that the function will accept.
This is called the formal parameters list.
Each parameter is described by giving its data type and its name.
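As a rough sketch of how the prototype and the full definition work together, the fragment below declares estimateSquareRoot with a prototype, calls it from another function, and only afterwards supplies the body. The caller halfTheSqrt is a hypothetical function added purely for illustration, and the use of std::abs from <cmath> is an assumption made here so that the floating-point version of abs is chosen.

```cpp
#include <cmath>

// Prototype: return type, name, and formal parameter list, ending in ';'.
// It promises that the function body will be supplied later.
double estimateSquareRoot (double x);

// Hypothetical caller, added for illustration. This compiles because the
// prototype above has already declared estimateSquareRoot.
double halfTheSqrt (double x)
{
    return estimateSquareRoot(x) / 2.0;
}

// The promised definition. The body follows the version in the text,
// except that std::abs is written out explicitly (an assumption).
double estimateSquareRoot (double x)
{
    double s0 = x / 2;
    double theSqrt = s0;
    do {
        s0 = theSqrt;
        theSqrt = (s0 + x / s0) / 2.0;
    } while (std::abs(theSqrt - s0) > 0.01);
    return theSqrt;
}
```

Moving the prototype below halfTheSqrt should produce a compilation error at the call, because the name would be used before it is declared.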
2 Calling functions
A function is called by writing its name within an expression followed by a pair of parentheses, within which we supply expressions to be evaluated and passed to the function as its parameters, e.g.,
halfTheSqrt = estimateSquareRoot(x) / 2.0;
theSqrtOfHalf = estimateSquareRoot(x/2.0);
The parameters supplied in the calls (“x” and “x/2.0” in this example) are referred to as the actual parameters of the call.
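To make the distinction between formal and actual parameters concrete, here is a minimal sketch. The function addOne and its parameter name value are hypothetical, chosen only for this illustration.

```cpp
// 'value' is the formal parameter: the name used inside the function body.
double addOne (double value)
{
    return value + 1.0;
}

// Hypothetical caller showing several kinds of actual parameters.
double demoCalls (double x)
{
    double a = addOne(x);          // actual parameter: the variable x
    double b = addOne(x / 2.0);    // actual parameter: the expression x/2.0
    double c = addOne(addOne(x));  // actual parameter: the result of another call
    return a + b + c;
}
```

In each call the actual parameter is evaluated first, and the resulting value is what the function sees under the name value.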
3 Parameter Passing
Programming languages tend to vary widely in the mechanisms by which parameters are passed from a function call to the function body.
3.1 Direction versus mechanism
Partly, this is a struggle to provide flexibility to programmers while still keeping function calls relatively efficient. Partly this is also a struggle to allow a function’s header/prototype to clearly express the intentions of the programmer and, just maybe, to enlist the aid of the compiler in enforcing those intentions.
One of the first things that a function’s designer needs to communicate to the compiler and to any programmers who might want to use the function is the direction of each parameter.
- Some parameters are intended purely to supply input to the function.
- Some parameters are intended purely to provide output from the function.
- Some parameters are intended as “in/out” parameters, supplying an initial input value that will be changed or updated by the function.
Many years ago, the programming language Ada took an admirably direct approach to this question. Each function parameter in Ada had to be labeled as either “in”, “out”, or “in out”. For some reason that rather admirably obvious approach has not been picked up by programming languages designed since then.
Instead, programming language designers have focused on the mechanism by which parameters are supplied to the function, as if most of us programmers really cared how the compiler accomplishes its underlying magic. The two most common mechanisms are
- pass by copy: the function receives its own private copy of the actual parameter value. This extra copy disappears when we return from the function. (This mechanism is also sometimes known as pass by value.)
- pass by reference: the function receives the address in which the caller is keeping the “original” actual parameter value. Via this address, the function could alter the value being held by the caller.
As a general rule, the mechanisms interact with the direction:
Directions each mechanism can be used for:

mechanism | input | output | in/out |
---|---|---|---|
copy | yes | no | no |
reference | yes* | yes* | yes |
The asterisks indicate that, although the mechanism can be used for that purpose, it does not prevent the function from doing both input and output with a parameter. In other words, if we want to be a bit more restrictive:
Directions each mechanism can be used for, read strictly:

mechanism | input only | output only | in/out |
---|---|---|---|
copy | yes | no | no |
reference | no | no | yes |
Often, programmers designing a new function want to make a promise that their new function
- will not make changes to parameters that are intended as pure input.
- will not look at the value previously stored in an output parameter.
That’s crucial information to a programmer who wants to write a call to the function, and neither of the common mechanisms works well for all three possible direction cases.
3.2 The mechanisms and directions of C++
-
If a formal parameter is declared by simply giving the data type followed by a name, then it will be passed by copy, e.g., in
double estimateSquareRoot (double x);
x will be passed by copy. The function gets its own copy of that double value. If the function body changes the value of x, it is only changing its own copy and the changes will never be seen by the caller.
-
If a formal parameter is declared by giving the data type followed by ‘&’ and then the name, then it will be passed by reference, e.g., in
void doubleTheValue (double& x)
{
    x = 2.0 * x;
}
x will be passed by reference. The function gets the address of that double value. If the function body changes the value of x, it changes the value in the caller. For example:
double z = 1.0;
doubleTheValue (z);
cout << z << endl;
will print “2” because the function is given the address of z and makes its changes to the caller’s variable.
-
If a formal parameter is declared by giving the word “const”, then the data type followed by ‘&’ and then the name, then it will be passed by const reference. The function receives the address of the actual parameter, but the compiler will flag as compilation errors any code in the function body that tries to change the value.
For example, the code
void doubleTheValue (const double& x)
{
    x = 2.0 * x;
}
will receive a compilation error on the attempt to assign a new value to x, but this code:
double estimateSquareRoot (const double& x)
{
    double s0 = x/2;
    double theSqrt = s0;
    do {
        s0 = theSqrt;
        theSqrt = (s0 + x / s0) / 2.0;
    } while (abs(theSqrt - s0) > 0.01);
    return theSqrt;
}
compiles just fine.
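The three declarations can be compared side by side in a small sketch. The function doubleTheValue is taken from the text; doubleACopy and doubled are hypothetical names introduced here only for contrast.

```cpp
// Pass by copy: the assignment changes only the function's private copy.
void doubleACopy (double x)
{
    x = 2.0 * x;
}

// Pass by reference: the assignment changes the caller's variable.
void doubleTheValue (double& x)
{
    x = 2.0 * x;
}

// Pass by const reference: the function may read x but not assign to it.
double doubled (const double& x)
{
    // x = 2.0 * x;   // uncommenting this line causes a compilation error
    return 2.0 * x;
}
```

Starting from double z = 1.0, a call to doubleACopy(z) leaves z at 1.0, a call to doubleTheValue(z) changes z to 2.0, and doubled(z) then returns 4.0 without touching z.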
The C++ direction versus mechanism breakdown is
Directions each C++ mechanism can be used for:

mechanism | input only | output only | in/out |
---|---|---|---|
copy | yes | no | no |
reference | no | no | yes |
const reference | yes | no | no |
There is no way to achieve a pure output parameter in C++.
The yes/no patterns for “copy” and “const reference” are identical. Both work when we want a pure input parameter. The difference is essentially one of time. It takes a little bit of extra time for each access of a reference parameter.
But there is a one-time (per call) time penalty to make a copy of a parameter. That one-time penalty can be small for data types like char, int, or double that take only a few bytes and that can be copied using simple byte-by-byte copies. Or it can be huge for data types that take a lot of space or that require elaborate copying algorithms, both of which may be true for some of the structured data types that you will encounter in later lessons. So the rule of thumb in C++ is:
Use pass-by-copy for input direction with primitive data types or when the function body would be making and then changing its own copy of the input anyway.
Use pass-by-const-reference for input direction with structured data types.
Use pass-by-reference for output or input-output directions.
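The rule of thumb might be applied as in the sketch below. std::vector<double> stands in for a structured data type, and all four function names are hypothetical, chosen only for illustration. Because C++ has no pure output mechanism, the “output” intention of minAndMax is documented only in its comment, and its handling of an empty vector is an arbitrary choice made for this example.

```cpp
#include <cstddef>
#include <vector>

// Input, primitive type: pass by copy.
double square (double x)
{
    return x * x;
}

// Input, structured type: pass by const reference, so the vector is not copied.
double sum (const std::vector<double>& values)
{
    double total = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        total += values[i];
    }
    return total;
}

// In/out: pass by reference. The old contents are read and then replaced.
void scaleAll (std::vector<double>& values, double factor)
{
    for (std::size_t i = 0; i < values.size(); ++i) {
        values[i] *= factor;
    }
}

// Output: pass by reference. The old values of smallest and largest are
// never read. For an empty vector both are set to 0.0 (arbitrary choice).
void minAndMax (const std::vector<double>& values,
                double& smallest, double& largest)
{
    if (values.empty()) {
        smallest = 0.0;
        largest = 0.0;
        return;
    }
    smallest = values[0];
    largest = values[0];
    for (std::size_t i = 1; i < values.size(); ++i) {
        if (values[i] < smallest) smallest = values[i];
        if (values[i] > largest) largest = values[i];
    }
}
```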
3.2.1 For “Language Lawyers” Only
Because I know there are going to be some out there…
Actually all parameters in C++ are passed by copy. But in C++ the data types T, T& (reference to T), and const T& (const reference to T) are all distinct data types. C++ includes rules, however, instructing the compiler to inject automatic conversions among them whenever it makes sense to do so.
So, if I declare a C++ function with a reference or const reference parameter, the compiler actually converts my original value to a reference type by getting the address to that value, and passes that address by copy. But the effect is the same as passing the original value by reference or by const reference, and in most circumstances both the function’s designer and the designers of the C++ language would like you to think about it that way.
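One way to observe the practical effect is to compare addresses. The sketch below uses three hypothetical functions that each report whether their parameter x lives at the same address as the caller’s variable. Pointers appear here only as a way to carry and compare an address.

```cpp
// Pass by copy: x is a separate object, so its address differs
// from the address of the caller's variable.
bool copySharesStorage (double x, const double* callerAddress)
{
    return &x == callerAddress;
}

// Pass by reference: x names the caller's variable itself.
bool referenceSharesStorage (double& x, const double* callerAddress)
{
    return &x == callerAddress;
}

// Pass by const reference: the same storage again, but read-only.
bool constReferenceSharesStorage (const double& x, const double* callerAddress)
{
    return &x == callerAddress;
}
```

Given double z = 1.0, the call copySharesStorage(z, &z) returns false, while the two reference versions return true when called the same way.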
4 An Experiment
Enter the following code into your own C++ IDE, compile and run it.
#include <iostream>
using namespace std;
void swap (int x, int y)
{
    int temp = x;
    x = y;
    y = temp;
    cout << "x:" << x << " y:" << y << endl;
}

int main()
{
    int a = 12;
    int b = 63;
    cout << "a:" << a << " b:" << b << endl;
    swap(a,b);
    cout << "a:" << a << " b:" << b << endl;
    return 0;
}
Are you surprised at what didn’t happen?
Now make a one-line change:
void swap (int& x, int& y)
Recompile, and run again. Notice the change.
Finally, try changing that same line to:
void swap (const int& x, int& y)
Try recompiling. It won’t succeed, but take careful note of just which line gets flagged. We made a promise in the new swap prototype, reneged on that promise in the function body, and got caught doing so.