Generic Programming
Steven J. Zeil
When we combine iterators with template functions, we get a powerful tool for writing programs. Because the iterator interface is the same no matter what kind of container the data is really in, many algorithms can be written as function templates to work with data in almost any kind of container.
This combination is called generic programming.
1 Iterators + Templates = Generic Programming
One benefit of designing our own classes to follow the “standard” form of iterators is that the C++ standard library is packed with small function templates for using iterators to do common tasks. These can be found in the header file <algorithm>.
For example, we can search any range of data for a particular element using std::find:
#include <algorithm>
⋮
pos = find(startingPosition, stoppingPosition, x);
This searches a sequence of data, beginning at startingPosition, up to but not including stoppingPosition, for the value x.
If it finds x, it returns the position where it was found. If it doesn’t find it, it returns stoppingPosition. Because stoppingPosition is, as I was careful to note, not one of the positions actually searched, we can unambiguously determine whether we found x or not:
#include <algorithm>
⋮
pos = find(startingPosition, stoppingPosition, x);
if (pos != stoppingPosition)
{
cout << "Found it!" << endl;
}
else
{
cout << "It's not in there." << endl;
}
Now, pos, startingPosition, and stoppingPosition are all iterators of some kind. They must all be of the same iterator type, and that type has to denote positions of elements that can be compared to x (usually elements of the same type as x).
The std::find function will work with iterators taken from an array, a vector, a list, … , or whatever.
How does this happen? std::find is implemented as a template:
template <typename Iterator, typename T>
Iterator find (Iterator start, Iterator stop, T x)
{
while (start != stop && !(x == *start))
++start;
return start;
}
- This template makes no assumptions about the iterators passed to it, except that they support the operations !=, *, and ++, which all iterators, no matter what container they come from, are supposed to support.
- It also makes minimal assumptions about the type of x. It simply assumes that x will be from a data type that supports comparison via ==.
So this template can be applied to iterators from arrays, vectors, lists, Books, PersonnelRecords, MyFavoriteDataTypeNumber241, or whatever. As long as we can give a starting position and a stopping position, the code in find is valid.
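As a rough illustration of that claim, the sketch below applies std::find to an array, a vector, and a list through one small helper. The helper name contains and the three demonstration functions are invented for this example and are not part of the standard library.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Hypothetical helper: reports whether x occurs in the range [start, stop).
// It relies only on std::find, so it accepts any kind of iterator.
template <typename Iterator, typename T>
bool contains(Iterator start, Iterator stop, const T& x)
{
    return std::find(start, stop, x) != stop;
}

// The same helper applied to three different sources of iterators.
bool foundInArray()
{
    int a[5] = {3, 1, 4, 1, 5};
    return contains(a, a + 5, 4);              // pointers act as iterators
}

bool foundInVector()
{
    std::vector<std::string> v = {"alpha", "beta", "gamma"};
    return contains(v.begin(), v.end(), std::string("beta"));
}

bool foundInList()
{
    std::list<double> l = {1.5, 2.5};
    return contains(l.begin(), l.end(), 9.0);  // 9.0 is not in the list
}
```

Calling foundInArray() and foundInVector() yields true, while foundInList() yields false because find returned the stopping position.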
One of the hallmarks of the generic style of programming is that we always try to work on ranges of positions (iterators) with no explicit references to the container that those positions were drawn from.
We’ve already seen another such generic function template when we looked at Searching via Iterator Variants:
template <typename Iterator, typename Value>
Iterator lower_bound (Iterator start, Iterator stop, const Value& key);
as well as a handful of other useful function templates that are, strictly speaking, not “generic” because they are not based on ranges of iterator positions.
In this lesson, we want to look at more such generic functions and get a little better feel for how they can influence C++ programming style. If you have taken CS333 Principles of Programming Languages or a similar course, you may also recognize that generic programming shares a lot of ideas with “functional programming” as well.
2 Copying
All of our std:: containers support copying via copy constructors or assignment. But those cases involve copying between two containers of exactly the same type, e.g., a vector<int> to another vector<int>, or perhaps a list<string> to another list<string>.
But what if we wanted to copy between different kinds of containers, say, a vector of strings to an array or a list of strings? We can use std::copy for that.
For example, we can copy a vector into an array this way:
vector<string> ws(50);
std::string str[50];
⋮
copy (ws.begin(), ws.end(), str);
(copies ws into str)
This works because copy is written entirely in terms of iterator operations, and iterators can be applied to almost any container.
template <class InputIterator,
class OutputIterator>
OutputIterator copy(InputIterator first,
InputIterator last,
OutputIterator result)
{
while (first != last)
{
*result = *first;
result++; first++;
}
return result;
}
Notice how we have two template parameters, InputIterator and OutputIterator, that get replaced when copy is used. Of course, the names for these parameters are arbitrary. We could just as well have called them George and Martha instead of InputIterator and OutputIterator (at least, if we ignore documentation quality). How, then, does the compiler know that this copy algorithm is supposed to work with iterators?
It doesn’t, really. But the copy operation is written in terms of operator*, operator++ and operator!=, all of which are part of the conventional iterator interface. So the compiler will allow us to use copy with any data type that is sufficiently iterator-like to provide those operations.
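To make the "different container types" case concrete, here is a minimal sketch that copies a vector of strings into a list of strings. The function name toList is made up for this illustration. The list is created with as many elements as the vector holds, so that every position copy writes to already exists.

```cpp
#include <algorithm>
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Hypothetical helper: copies a vector of strings into a list of strings.
std::list<std::string> toList(const std::vector<std::string>& ws)
{
    // One (empty) string per source element, so each output position exists.
    std::list<std::string> result(ws.size());
    // copy uses only !=, *, and ++ on the iterators it is handed.
    std::copy(ws.begin(), ws.end(), result.begin());
    return result;
}
```

The call works even though vector and list iterators are unrelated types, because each one is substituted separately for InputIterator and OutputIterator.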
2.1 Using std::copy
So, for example, in one version of Book (using a dynamically allocated array), we implemented the Book constructor and assignment operator as shown here:
class Book {
public:
Book (int nAuthors, Author* a,
string theTitle, string theID);
Book (const Book& b);
Book& operator= (const Book& b);
⋮
private:
std::string title;
int numAuthors;
Author* authors; // dynamic array of authors
std::string identifier;
};
⋮
Book::Book (int nAuthors, Author* a,
string theTitle, string theID)
{
authors = new Author[nAuthors];
numAuthors = nAuthors;
title = theTitle;
identifier = theID;
for (int i = 0; i < numAuthors; ++i)
authors[i] = a[i];
}
Book& Book::operator= (const Book& b)
{
delete [] authors; // assumes this != &b
authors = new Author[b.numAuthors];
numAuthors = b.numAuthors;
title = b.title;
identifier = b.identifier;
for (int i = 0; i < numAuthors; ++i)
authors[i] = b.authors[i];
return *this;
}
With the standard copy function, this can be written:
Book::Book (int nAuthors, Author* a,
string theTitle, string theID)
{
authors = new Author[nAuthors];
numAuthors = nAuthors;
title = theTitle;
identifier = theID;
copy (a, a+nAuthors, authors);
}
Book& Book::operator= (const Book& b)
{
delete [] authors; // assumes this != &b
authors = new Author[b.numAuthors];
numAuthors = b.numAuthors;
title = b.title;
identifier = b.identifier;
copy (b.authors, b.authors+b.numAuthors, authors);
return *this;
}
OK, big deal. We saved two whole lines of code. Is that worth worrying about?
Well, that’s really not the point. What we did gain is the instant recognition that what is going on is a “copy”.
Someone reading the original version would need to study the loop and recognize the pattern of a copy.
That might only save a few seconds, but multiply that by all the places in a real program where copies and similar common, trivial programming patterns occur. The total gain in readability may be substantial.
And, as you gain experience using these standard algorithms, you will probably find that you avoid a lot of diddly little programming mistakes that you would have made in endlessly rewriting the more detailed explicit loops.
2.1.1 Generics and Clean Coding
This use of std functions is very much in keeping with one of the practices that make up Clean Coding: A function should do only one thing. Robert Martin, the guru of Clean Coding, suggests that
“The first rule of functions is that they should be small. The second rule of functions is that they should be smaller than that.”
Martin Fowler explains the reasoning for this:
“If you have to spend effort into looking at a fragment of code to figure out what it’s doing, then you should extract it into a function and name the function after that ‘what’. That way when you read it again, the purpose of the function leaps right out at you”
At its extreme, this practice of Clean Coding suggests that functions should rarely have nested control flow constructs – no nested loops, no ifs inside loops or vice versa. If you can’t look at a loop body or the then- or else-part of an if and say what it does in a simple phrase, do you really understand it? Can you possibly understand what the nested construct that includes that block of code does? And if you can describe that block of code in a simple phrase, why not pull it out into a separate (private) function named with that descriptive phrase?
2.2 copy and I/O iterators
We can use the standard template ostream_iterator to get an output iterator that writes items to an output stream. All iterators represent a position within a container. In this case, the container is the output stream and the “position” is the place where we are set to write our next output. In effect, then, copying to that position will let us write a whole series of items.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
using namespace std;
int main ()
{
int v[5] = {-1, 5, 5, 5, 8};
string s[4];
s[0] = "zero";
s[1] = "one";
s[2] = "two";
s[3] = "three";
copy (v, v+5, ostream_iterator<int>(cout, "\n"));
// writes -1 5 5 5 8, each number on a separate line
copy (s, s+4, ostream_iterator<string>(cout, "!="));
// writes zero!=one!=two!=three!=
return 0;
}
Similarly, there is an input iterator called istream_iterator, which can be used to read from an input stream. Try compiling and running the program shown here.
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
using namespace std;
int main ()
{
copy (istream_iterator<string>(cin), // input iterator reading strings from cin
istream_iterator<string>(), // end-of-file position
ostream_iterator<string>(cout, "\n") // output iterator writing to cout
);
return 0;
}
Note that it doesn’t quite exactly copy its input to its output. Can you figure out why?
2.3 Copying, Overwriting, and Inserting
This is dangerous:
int a[5] = {1, 2, 3, 4, 5};
int b[3];
copy (a, a+5, b);
It is the generic equivalent of
int a[5] = {1, 2, 3, 4, 5};
int b[3];
for (int i = 0; i < 5; ++i)
b[i] = a[i];
which “obviously” writes past the end of the array b.
For similar reasons, this does not work:
int a[5] = {1, 2, 3, 4, 5};
vector<int> v;
copy (a, a+5, v.begin());
copy writes data into existing positions - you need to be sure that the data slots actually exist.
Pretty much the same is true for the output range of any generic function - they all assume that the data positions you name as the output range already exist.
2.3.1 Making room
This works:
vector<int> v;
⋮
int a[5] = {1, 2, 3, 4, 5};
v.resize(5);
copy (a, a+5, v.begin());
because resize actually creates the data positions.
This does not:
vector<int> v;
⋮
int a[5] = {1, 2, 3, 4, 5};
v.reserve(5);
copy (a, a+5, v.begin());
because reserve merely sets things up so that, if we later try to create some data slots, we won’t need to allocate new memory for them. The data slots may exist in the sense that the space for them has been allocated, but they have not actually been initialized and the vector’s size() indicates that those positions are still unused.
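The difference between the two calls can be observed directly through size(). The following sketch uses two illustrative function names, sizeAfterReserve and sizeAfterResize, that are not from the original notes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// reserve sets storage aside but creates no elements.
std::size_t sizeAfterReserve(std::size_t n)
{
    std::vector<int> v;
    v.reserve(n);
    assert(v.capacity() >= n);   // the memory is there...
    return v.size();             // ...but the vector is still empty
}

// resize actually creates n (zero-valued) elements.
std::size_t sizeAfterResize(std::size_t n)
{
    std::vector<int> v;
    v.resize(n);
    return v.size();
}
```

For n equal to 5, the first function returns 0 and the second returns 5, which is why only the resized vector is a safe target for copy.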
2.3.2 Copying and expanding
resize is a useful trick, but it is not available for every kind of container. There’s a more general solution to this problem: special iterators that are used to create new data slots as they are being filled.
The special iterators are
- back_inserter
- front_inserter
- inserter
All three are contained in <iterator>.
2.3.3 back_inserter
back_inserter(container) returns an iterator on that container.
- This is an output iterator - you can assign to it but not look at it.
- Each assignment to the element at this iterator results in a push_back() call on the container.
int a[5] = {1, 2, 3, 4, 5};
vector<int> v;
assert (v.size() == 0);
copy (a, a+5, back_inserter(v));
assert (v.size() == 5); // Five push_back's were done
2.3.4 front_inserter
front_inserter(container) returns an iterator on that container.
- This is an output iterator - you can assign to it but not look at it.
- Each assignment to the element at this iterator results in a push_front() call on the container.
int a[5] = {1, 2, 3, 4, 5};
list<int> alist;
assert (alist.size() == 0);
copy (a, a+5, front_inserter(alist));
assert (alist.size() == 5);
assert (alist.front() == 5);
assert (alist.back() == 1); // note that the order of the data is reversed
Obviously, this only works for containers that actually provide the push_front operation, so we can’t do this, for example, with vectors.
2.3.5 inserter
inserter(container,iter) returns an iterator on that container denoting the same position as iter.
- This is an output iterator - you can assign to it but not look at it.
- Each assignment of foo to the element at this iterator results in an insert(iter,foo) call on the container, after which the iterator moves past the newly inserted element, so successive values keep their original order.
int a[5] = {1, 2, 3, 4, 5};
list<int> alist (3, 0);
list<int>::iterator pos = alist.begin();
++pos;
copy (a, a+5, inserter(alist, pos));
// alist contains 0 1 2 3 4 5 0 0
3 Other Useful Generic Functions in the std:: Library
3.1 equal
Another useful function is equal, which tests to see if corresponding elements in two position ranges are equal:
int v[5] = {-1, 5, 5, 5, 8};
int w[3] = {5, 5, 5};
assert (!equal(w, w+3, v));
assert (equal(w, w+3, v+1));
assert (equal(v+1, v+4, w));
There are actually two forms of equal. The three parameter form
equal (start1, stop1, start2)
checks to see if the elements in the range of positions start1...stop1 are equal to the same number of elements in positions starting at start2.
The four parameter form (added in C++14)
equal (start1, stop1, start2, stop2)
checks to see if the elements in the range of positions start1...stop1 are equal to the corresponding elements in positions start2...stop2 and that the number of elements in the two ranges is the same.
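The practical difference between the two forms shows up when the ranges have different lengths. This is a small sketch, with the invented names prefixMatches and sameSequence, and it assumes a C++14 compiler for the four parameter form.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Three parameter form: compares only as many elements as the first range has.
bool prefixMatches(const std::vector<int>& shorter, const std::vector<int>& longer)
{
    // Without this guarantee, equal would read past the end of 'longer'.
    assert(shorter.size() <= longer.size());
    return std::equal(shorter.begin(), shorter.end(), longer.begin());
}

// Four parameter form: the two ranges must also have the same length.
bool sameSequence(const std::vector<int>& a, const std::vector<int>& b)
{
    return std::equal(a.begin(), a.end(), b.begin(), b.end());
}
```

Given w = {5, 5, 5} and v = {5, 5, 5, 8}, prefixMatches(w, v) is true but sameSequence(w, v) is false, because the second range has an extra element.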
For example, suppose that we wanted to know if two books have the same list of authors. We could have written it this way:
bool sameAuthors (const Book& left, const Book& right)
{
if (left.numberOfAuthors() == right.numberOfAuthors())
{
auto lpos = left.begin();
auto rpos = right.begin();
for (int i = 0; i < left.numberOfAuthors(); ++i)
{
if (*lpos != *rpos)
return false;
++lpos; ++rpos;
}
return true;
}
else
return false;
}
but we can simplify this to
bool sameAuthors (const Book& left, const Book& right)
{
if (left.numberOfAuthors() == right.numberOfAuthors())
{
return equal(left.begin(), left.end(), right.begin());
}
else
return false;
}
or
bool sameAuthors (const Book& left, const Book& right)
{
return (left.numberOfAuthors() == right.numberOfAuthors())
&& equal(left.begin(), left.end(), right.begin());
}
or
bool sameAuthors (const Book& left, const Book& right)
{
return equal(left.begin(), left.end(), right.begin(), right.end());
}
I might prefer the next-to-last version as being slightly faster. It’s often a good practice to do the cheap $O(1)$ test first before diving in to a test that requires looping through the whole container.
Similarly, we could easily implement a comparison operator for sorting books by author lists via the function lexicographical_compare. See the references at the end of these notes for details.
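A minimal sketch of that idea follows. Since the full Book interface is not shown here, the example stands in a plain vector of names for a book's author list; the type name AuthorList and the function authorsLessThan are assumptions made only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a Book's author list: a sequence of names.
typedef std::vector<std::string> AuthorList;

// True if left comes before right in dictionary-style order: the first
// differing author decides, and a proper prefix precedes the longer list.
bool authorsLessThan(const AuthorList& left, const AuthorList& right)
{
    return std::lexicographical_compare(left.begin(), left.end(),
                                        right.begin(), right.end());
}
```

With a real Book class, the same call could be made on left.begin(), left.end(), right.begin(), and right.end(), provided the author type supports operator<.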
3.2 find and lower_bound
The find function performs an unordered sequential search for a value in some range of positions.
p = find(ws.begin(), ws.end(), "foobar");
searches a container for the indicated string.
If find cannot locate the indicated string, it returns the end position of the search range (e.g., ws.end() in the example above). Remember that iterator ranges are always inclusive on the starting position, exclusive on the ending position, so the end position of a search range could not possibly be returned by a successful search.
We can provide the equivalent of an ordered insert for containers that support the insert operations using another standard template function:
Container<Element> container;
⋮
cin >> x;
Container<Element>::iterator p = lower_bound (container.begin(), container.end(), x);
container.insert (p, x);
lower_bound returns the first location where x could be inserted if the collection is being maintained in sorted order.
What makes lower_bound a really nice function is that it adapts to the iterators it is given. With random-access (e.g., std::vector) or trivial (array) iterators, it performs a true binary search in $O(\log n)$ time. With merely forward or bi-directional (e.g., std::list) iterators, it still makes only $O(\log n)$ comparisons, but it has to step through the positions one at a time, so the overall time is linear, like a sequential search.
There is a related function, upper_bound, which is called the same way, that returns the last position where key could be inserted. For example, suppose we had a container with 3 copies of key already in it. Then lower_bound would point to the first of these three, and upper_bound would point to the position just after the third copy.
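The relationship between the two bounds can be sketched with a sorted vector of ints. The helper names copiesOf and insertInOrder are invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// The copies of key occupy exactly the positions from lower_bound up to
// (but not including) upper_bound, so their distance is the count.
std::ptrdiff_t copiesOf(const std::vector<int>& sorted, int key)
{
    auto lo = std::lower_bound(sorted.begin(), sorted.end(), key);
    auto hi = std::upper_bound(sorted.begin(), sorted.end(), key);
    return hi - lo;
}

// Ordered insert: put x at the first position that keeps the vector sorted.
void insertInOrder(std::vector<int>& sorted, int x)
{
    sorted.insert(std::lower_bound(sorted.begin(), sorted.end(), x), x);
}
```

For the sorted data 1 3 3 3 7, copiesOf reports 3 for the key 3 and 0 for the key 4, and insertInOrder with 5 produces 1 3 3 3 5 7.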
Sometimes we don’t need the position – we just want to know if the value is in there or not. If the data is kept in sorted order, we can use binary_search, which is called like lower_bound but returns a boolean. (It is most efficient when we have random access iterators.)
Container<Element> container;
⋮
cin >> x;
if (binary_search (container.begin(), container.end(), x))
{
cout << "We found " << x << "!" << endl;
}
3.3 count
std::string a[50];
⋮
int k = count(a, a+50, "xxx");
counts the number of occurrences of "xxx" in the array a.
3.4 fill
std::string a[50];
fill_n (a, 5, "Hello");
fills the first 5 positions of a with "Hello".
fill(a+5, a+50, "GoodBye");
fills the remaining positions of a with "GoodBye".
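Putting fill_n, fill, and count together gives a quick way to check what each call did. This is only a sketch; the function name fillThenCount and the constant N are made up for the illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

const int N = 50;   // size of the array, as in the text

// Fills an array the way the text describes, then counts how many
// positions hold the given word.
std::ptrdiff_t fillThenCount(const std::string& word)
{
    std::string a[N];
    std::fill_n(a, 5, "Hello");           // the first 5 positions
    std::fill(a + 5, a + N, "GoodBye");   // the remaining 45 positions
    return std::count(a, a + N, word);
}
```

Under these assumptions fillThenCount("Hello") is 5, fillThenCount("GoodBye") is 45, and any other word gives 0.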
4 std:: Library Generics That Take Functions
4.1 Passing Functions as Parameters
In C++, we can pass functions as parameters to other functions. For example, this is legal (though a bit silly):
typedef int (*FunctionType) (int); // FunctionType is declared as a type name
// for pointers to any function that
// takes 1 int and returns an int.
int doItTwice(FunctionType f, int i)
{
return f(f(i));
}
int mult2 (int x) {return 2*x;}
⋮
int Twelve = doItTwice(mult2, 3);
The function doItTwice will actually call mult2 twice, passing the result of the first call as the parameter of the second.
Functions can be useful as parameters in various applications. They are often used in conjunction with templates.
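The sketch below shows both styles side by side: the function-pointer version and a template version that lets the compiler work out the function's type. The names doItTwiceGeneric and addOne are additions for this illustration.

```cpp
#include <cassert>

// Pointer to a function that takes one int and returns an int.
typedef int (*FunctionType)(int);

// Function-pointer version: f must match FunctionType exactly.
int doItTwice(FunctionType f, int i)
{
    return f(f(i));
}

// Template version: Function can be any type that can be called with an int.
template <typename Function>
int doItTwiceGeneric(Function f, int i)
{
    return f(f(i));
}

int mult2(int x)  { return 2 * x; }
int addOne(int x) { return x + 1; }
```

Both doItTwice(mult2, 3) and doItTwiceGeneric(mult2, 3) evaluate to 12, and doItTwice(addOne, 3) evaluates to 5. The template form is the one the standard library algorithms use for their function parameters.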
4.2 for_each
Some of the most commonly used generics (in my own coding, only copy gets used more often) are for applying a function to every element in a range:
for_each applies a unary function (i.e., a function taking a single parameter) to each element in a range.
void printLength(string s)
{
cout << s << " is of length "
<< s.length() << endl;
}
⋮
for_each (ws.begin(), ws.end(), printLength);
In this case, if ws contains the words [“aardvarks”, “are”, “furry”], then the output would be:
aardvarks is of length 9
are is of length 3
furry is of length 5
4.3 transform
for_each applies a function to every element in a range and disregards the return values, if any.
transform, on the other hand, applies a function to every element in a range and collects the returned values by copying them into an output range.
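#include <string>
⋮
string intToString (int i) {return to_string(i);}
⋮
int v[5] = {-1, 5, 5, 5, 8};
string s[5];
transform (v, v+5, s, intToString);
In this example, each of the five v values will be passed, one at a time, to intToString and the five resulting string values, [“-1”, “5”, “5”, “5”, “8”], stored in s. (intToString is a small wrapper around the standard to_string function, which converts numbers to strings. We cannot pass to_string itself because it is overloaded, and the compiler would not know which version we meant.)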
The first two parameters give the input range, and the third parameter is the beginning of the output range. We don’t have to specify the end of the output range, because there will be just as many outputs as there are input values. The final parameter is, of course, the function we want to apply to each element.
transform can be used to replace each element in a range by some function of itself. To do this, we simply make the output range the same as the input range.
For example, suppose we had an array myNumbers of N floating point numbers and were really interested in working with the absolute value of all those numbers. We could write
transform(myNumbers, myNumbers+N, myNumbers, fabs);
and thereby replace every element in myNumbers by its absolute value (the fabs function).
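A compilable sketch of this in-place use of transform is shown here. It wraps the library call in a function named absoluteValue (a name chosen for this example), because passing std::fabs directly can fail to compile when <cmath> supplies several overloads of it.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Wrapper with a single, unambiguous signature for use as a parameter.
double absoluteValue(double x)
{
    return std::fabs(x);
}

// Replaces every element by its absolute value. The output range starts
// at the same position as the input range, so results overwrite inputs.
void makeNonNegative(std::vector<double>& numbers)
{
    std::transform(numbers.begin(), numbers.end(), numbers.begin(),
                   absoluteValue);
}
```

Applied to the values -1.5, 2.0, -3.0, this leaves 1.5, 2.0, 3.0 in the same vector.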
4.4 "_if" Generics and Predicates
Many of the generic functions that do searching or comparisons of some kind (including some we have already discussed) have an alternate "_if" version that can take a function that is used in place of the obvious defaults.
For example, we earlier looked at this example of find:
list<string> ws;
⋮
p = find(ws.begin(), ws.end(), "foobar");
to search a container for a specific string.
Suppose, however, that we were interested in searching for a string that contained “foobar”. We could do this by supplying the appropriate test as a function:
bool containsFooBar (const string& s)
{
return s.find("foobar") != string::npos;
}
⋮
list<string> ws;
⋮
p = find_if(ws.begin(), ws.end(), containsFooBar);
The function passed to find_if must return a bool, for reasons that should be apparent. This kind of function is sometimes called a predicate.
Or, suppose that we wanted to find out if any of the strings in our container were exactly one character long:
bool oneCharLong (const string& s)
{
return s.size() == 1;
}
⋮
list<string> ws;
⋮
p = find_if(ws.begin(), ws.end(), oneCharLong);
There is also a useful variation find_if_not.
Similarly,
int k = count_if (ws.begin(), ws.end(), oneCharLong);
would count how many strings in the container ws are one character long.
We can use copy_if to copy selected strings to a new sequence:
list<string> shortStrings;
copy_if (ws.begin(), ws.end(),
back_inserter(shortStrings),
oneCharLong);
would copy all of the 1-character strings from ws into a list.
And, the rather quixotically named
remove_copy_if (ws.begin(), ws.end(),
ostream_iterator<string>(cout, "\n"),
oneCharLong);
copies all the words in ws except for the single-character ones to the output stream (i.e., writing them on cout, one per line).
There is also a remove_if function, but its name is misleading because it doesn’t actually “remove” anything. It just shifts the elements that are to be kept toward the front of the range and returns the position just past the last of them. The leftover positions at the end can then be removed in one erase call.
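The usual pairing of remove_if with erase can be sketched as follows, using a vector of strings. The function name dropOneCharStrings is invented for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

bool oneCharLong(const std::string& s)
{
    return s.size() == 1;
}

// Deletes every one-character string from ws.
void dropOneCharStrings(std::vector<std::string>& ws)
{
    // remove_if packs the strings we are keeping at the front and
    // returns the position just past the last one kept...
    std::vector<std::string>::iterator newEnd =
        std::remove_if(ws.begin(), ws.end(), oneCharLong);
    // ...and erase discards everything from there to the old end.
    ws.erase(newEnd, ws.end());
}
```

Starting from the strings a, bee, c, dog, the vector ends up holding just bee and dog, in their original order. Before the erase call, the vector would still have had four elements.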
4.5 all_of, any_of, none_of
These functions offer the equivalent of the logical quantifiers $\forall$, $\exists$, and $\not{\exists}$.
These take a range of positions to examine, and a unary predicate (“unary” == one parameter, “predicate” == function with a bool return type).
bool containsFooBar (const string& s)
{
return s.find("foobar") != string::npos;
}
⋮
list<string> ws;
⋮
if (all_of(ws.begin(), ws.end(), containsFooBar))
cout << "Every element in ws contains 'foobar'." << endl;
if (any_of(ws.begin(), ws.end(), containsFooBar))
cout << "At least one element in ws contains 'foobar'." << endl;
if (none_of(ws.begin(), ws.end(), containsFooBar))
cout << "No element in ws contains 'foobar'." << endl;
Of course, we could do these same tests using find_if and checking the position it returned, for example:
// These two tests are the same.
bool test1 = any_of(ws.begin(), ws.end(), containsFooBar);
bool test2 = find_if(ws.begin(), ws.end(), containsFooBar) != ws.end();
// These two tests are the same.
bool test3 = none_of(ws.begin(), ws.end(), containsFooBar);
bool test4 = find_if(ws.begin(), ws.end(), containsFooBar) == ws.end();
// These two tests are the same.
bool test5 = all_of(ws.begin(), ws.end(), containsFooBar);
bool test6 = find_if_not(ws.begin(), ws.end(), containsFooBar) == ws.end();
But the all_of, any_of, and none_of functions are easier to read and convey the programmer’s intent more clearly.
5 C++11 Lambda Expressions
One thing that you may have noticed is that using generics that take function parameters works only if we already have a suitable function or are willing to write one.
The need to provide these small functions can result in an explosion of short, used-only-one-time functions in our code. And, because C++ does not allow functions to be nested within other functions, these small functions will be separated from the generic function call that uses them. This can impair the readability of the code.
To address this problem, C++ allows you to write an anonymous description of a short function right in the place where you would call it or, more often, pass it to other functions. This description is called a lambda expression.
One of the most common mistakes that I see students make when trying to work with generics is to try and take a short-cut by simply writing an expression in place of a “proper” function. For example, instead of
bool oneCharLong (const string& s)
{
return s.size() == 1;
}
⋮
list<string> ws;
⋮
p = find_if(ws.begin(), ws.end(), oneCharLong);
I often see students do
list<string> ws;
⋮
p = find_if(ws.begin(), ws.end(), s.size() == 1); // No!
which doesn’t work because “s” is undeclared.
In effect, the lambda expression provides a legal way to do what those students were attempting.
A lambda expression has three components:
capture-description parameter-list function-body
Of these, the parameter-list and the function-body are pretty much the same as they would be in an ordinary, non-member function.
Here’s our “find a single-character string” search using a lambda expression:
list<string> ws;
⋮
p = find_if(ws.begin(), ws.end(),
[] (const string& s) {return s.size() == 1;});
The capture-description explains what to do about variables that are “captured” – used in the function body but not passed as parameters. There are several options for what to do here, but the most common are likely to be:
- [] if the function won’t use any variable names that are not declared, as s is above, as a function parameter,
- [&] if the function should capture variables as references to identically named variables in the current scope,
- [this] for functions that should capture the this pointer of the current scope (in effect turning the lambda expression into a member function).
What you don’t see in a lambda expression is a name for the function, because the whole point is to use these for one-shot functions that aren’t going to be referenced anywhere else in the program. Nor will you usually see a description of the function’s return type, because the compiler will deduce this from examining the return statements in the function body.
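To see what capturing buys us, here is a small sketch using the [&] form. The function names countShort and totalLength are made up for this example. In both, the lambda body refers to a variable of the enclosing function that is not one of the lambda's own parameters.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Counts the strings of at most maxLength characters. The lambda reads
// maxLength, a variable of the enclosing function, via the [&] capture.
std::ptrdiff_t countShort(const std::vector<std::string>& ws,
                          std::size_t maxLength)
{
    return std::count_if(ws.begin(), ws.end(),
        [&] (const std::string& s) { return s.size() <= maxLength; });
}

// Because [&] captures by reference, the lambda can also update a local
// variable of the enclosing function.
std::size_t totalLength(const std::vector<std::string>& ws)
{
    std::size_t total = 0;
    std::for_each(ws.begin(), ws.end(),
        [&] (const std::string& s) { total += s.size(); });
    return total;
}
```

With the strings a, bb, ccc, countShort with a limit of 2 gives 2 and totalLength gives 6. Writing [] instead of [&] in either lambda would be a compilation error, since nothing would then be captured.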
Here’s some of the examples from the previous section, redone using lambda expressions:
before
void printLength(string s)
{
cout << s << " is of length "
<< s.length() << endl;
}
⋮
for_each (ws.begin(), ws.end(), printLength);
with lambda
for_each (ws.begin(), ws.end(),
[] (string s) {
cout << s << " is of length "
<< s.length() << endl;
});
before
string convertToString(int i)
{
ostringstream obuffer; // requires #include <sstream>
obuffer << i;
return obuffer.str();
}
⋮
int v[5] = {-1, 5, 5, 5, 8};
string s[5];
transform (v, v+5, s, convertToString);
lambda
int v[5] = {-1, 5, 5, 5, 8};
string s[5];
transform (v, v+5, s,
[] (int i) {
ostringstream obuffer; // requires #include <sstream>
obuffer << i;
return obuffer.str();
});
before
bool containsFooBar (const string& s)
{
return s.find("foobar") != string::npos;
}
⋮
list<string> ws;
⋮
p = find_if(ws.begin(), ws.end(),
containsFooBar);
lambda
list<string> ws;
⋮
p = find_if(ws.begin(), ws.end(),
[] (const string& s) {
return s.find("foobar")
!= string::npos;
});
Lambda expressions are something of an acquired taste. Some people use them all the time. Others feel that they complicate the code, arguing, as we have earlier, that it makes more sense to pull that code out into a separate function so that the function name can serve as documentation of what it actually does.
6 Functors
Function types seem a bit awkward, and there are times when we want to pass a “behavior” to a function or to save a “behavior” in a data structure, but a true function or a lambda expression just won’t do. That takes us into the strange and peculiar idiom of programming called “functors”.
A functor is an object that is created to simulate a function. Why would we want to do that? Well, objects can do things that functions can’t. Objects can store information, and can be stored in other data structures. Functions can’t do these, at least not with the same ease and flexibility. A functor can often have the best of both worlds.
6.1 Example: Functors and User Interface Programming
An example of functors can be found in many windowing libraries. Think of the problem of building a menu bar, like the one you see across the top of most windows in user interfaces. A menu bar is probably just an ordered collection of menus:
class MenuBar {
⋮
vector<Menu> menus;
⋮
};
A menu would have a name (e.g., “File”, “Edit”), but would itself contain a number of menu items.
class Menu {
⋮
string menuName;
vector<MenuItem> items;
⋮
};
And MenuItems? Well, they certainly have names, but they also will typically have a place to store an object or a pointer to an object that actually performs the desired function.
class MenuItemAction {
public:
virtual void perform() {/* by default, do nothing */}
};
class MenuItem {
string itemName;
MenuItemAction* action; // a pointer, so that the derived class's perform() is the one called
public:
MenuItem (string name, MenuItemAction& act)
: itemName (name), action(&act) {}
void setAction (MenuItemAction& act) {action = &act;}
void itemWasSelected () {action->perform();}
};
When a menu is being built, the MenuItems are created with appropriate actions:
class FileReader: public MenuItemAction
{
void perform()
{
⋮
code to read from a file
⋮
}
};
class FileSaver: public MenuItemAction
{
void perform()
{
⋮
...code to write to a file ...
⋮
}
};
class Quitter: public MenuItemAction
{
void perform()
{
⋮
...code to close window and shut down program ...
⋮
}
};
⋮
FileReader rdr;
FileSaver svr;
Quitter quit;
// build a typical file menu
fileMenu.items.push_back (MenuItem("load", rdr));
fileMenu.items.push_back (MenuItem("save", svr));
fileMenu.items.push_back (MenuItem("exit", quit));
When a user actually selects one of these items from the menu, the windowing system calls the item’s itemWasSelected function, which in turn calls the perform() function of its action.
rdr, svr, and quit are examples of functors; they are objects created for the sole purpose of providing a single function, which in this case is called perform.
6.2 operator()
Now in the previous example, the functors are called by calling their perform function member. But C++ has special support for functors. We can write functors that are called just like regular functions. We do this by defining an operator().
Now, we’ve seen that you can define operators like <, ==, =, and *.
But it is a truly strange feature of C++ that () is considered an operator. It is a postfix operator (appearing to the right of the object that it operates on, e.g., x()). And, it can be defined to take any number of parameters of any legal C++ type, e.g., x(23,"abcdef").
So when you see something like w(z) written in C++, the only way to tell if you are looking at
- a function w applied to a parameter z, or
- a call to the operator() member of an object w
is to find the declaration of w and see if it really is a function or an object.
6.3 A Predicate Functor
Let’s see how this works. When we introduced the notion of iterators, we looked briefly at the standard function template find_if.
The code shown here, for example, searches a vector for the first string containing no more than 4 characters.
vector<string> v;
⋮
bool isShort(const std::string& s)
{
return (s.size() <= 4);
}
⋮
p = find_if(v.begin(), v.end(), isShort);
Now, let’s write the same thing replacing isShort
by a functor.
vector<string> v;
⋮
class IsShort {
public:
bool operator() (const std::string& s)
{
return (s.size() <= 4);
}
};
IsShort isShort;
⋮
p = find_if(v.begin(), v.end(), isShort);
6.4 Why Do Functors Work Where Functions Are Expected?
The code for find_if looks like:
template <class InputIterator, class Predicate>
InputIterator find_if(InputIterator first, InputIterator last,
Predicate pred) {
while (first != last && !pred(*first)) ++first;
return first;
}
So when we call find_if( ... ,isShort)
, isShort
is passed to find_if
as the parameter pred
, and the body of find_if calls pred(*first)
(shown in the highlighted code).
In the original version, isShort
(and therefore pred
) was a function, so pred(*first)
was an ordinary function call.
Now, however, isShort
is an object that happens to define an operator()
taking a single string
parameter, so pred(*first)
is a call to pred
’s operator()
.
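The following sketch shows both cases side by side. It uses a stand-in template, myFindIf, with the same body as the find_if shown above (renamed here only to avoid clashing with the library version). The names isShortFunction, IsShortFunctor, and the two helper functions are assumptions made for this example.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Stand-in for std::find_if with the same body as shown in the text.
template <class InputIterator, class Predicate>
InputIterator myFindIf(InputIterator first, InputIterator last,
                       Predicate pred)
{
    while (first != last && !pred(*first)) ++first;
    return first;
}

// The predicate written as an ordinary function.
bool isShortFunction(const std::string& s) { return s.size() <= 4; }

// The same predicate written as a functor.
class IsShortFunctor {
public:
    bool operator()(const std::string& s) const { return s.size() <= 4; }
};

// Index of the first short string; Predicate is deduced as a
// pointer to function, so pred(*first) is an ordinary call.
std::size_t firstShortViaFunction(const std::vector<std::string>& v)
{
    return static_cast<std::size_t>(
        myFindIf(v.begin(), v.end(), isShortFunction) - v.begin());
}

// Index of the first short string; Predicate is deduced as
// IsShortFunctor, so pred(*first) calls its operator().
std::size_t firstShortViaFunctor(const std::vector<std::string>& v)
{
    return static_cast<std::size_t>(
        myFindIf(v.begin(), v.end(), IsShortFunctor()) - v.begin());
}
```

The template body is identical in both instantiations. The compiler simply deduces a different type for Predicate each time, and the expression pred(*first) is legal for either one.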
6.5 Functors Can Have Memory
OK, so what? What does the isShort
functor do that the isShort
function did not? Absolutely nothing.
But now, suppose that we’re not always interested in strings of length 4 or less. Sometimes we may want strings of length 2 or less, or 8 or less, …
vector<string> v;
⋮
int length;
cout << "What's the longest acceptable string length?" << flush;
cin >> length;
p = find_if(v.begin(), v.end(), ????);
There’s no good way to write an ordinary function that we can pass to find_if
that would search for strings of variable lengths. But a functor can fill the bill very nicely, by storing the critical length inside the object.
vector<string> v;
⋮
class IsShort {
int length;
public:
IsShort (int len): length(len) {}
bool operator() (const std::string& s) const
{
return (s.size() <= length);
}
};
vector<string> v;
⋮
int length;
cout << "What's the longest acceptable string length?" << flush;
cin >> length;
IsShort isShort (length); ➀
p = find_if(v.begin(), v.end(), isShort); ➁
Once we know the length we want to hunt for, we create an object that remembers that length (➀) and that, when its operator()
is called with some string, compares that string’s length to the value it has saved.
Then we can pass that object to find_if
to be applied to every element in the range we are searching (➁).
Finally, we note that we can actually do without the isShort
variable by using the constructor to create a temporary functor object to pass to find_if
.
vector<string> v;
⋮
class IsShort {
int length;
public:
IsShort (int len): length(len) {}
bool operator() (const std::string& s) const
{
return (s.size() <= length);
}
};
vector<string> v;
⋮
int length;
cout << "What's the longest acceptable string length?" << flush;
cin >> length;
p = find_if(v.begin(), v.end(), IsShort(length));
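As a self-contained sketch of the same idea, the version below wraps the search in a helper so that one class can serve several different length limits. The helper name firstShort is hypothetical, and the stored length is declared as std::size_t here (an assumption that differs from the int used above) so that it compares cleanly with s.size().

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

class IsShort {
    std::size_t length;   // the limit this particular object remembers
public:
    explicit IsShort(std::size_t len) : length(len) {}

    bool operator()(const std::string& s) const
    {
        return s.size() <= length;
    }
};

// Hypothetical helper: returns the first string with at most
// maxLength characters, or an empty string if there is none.
std::string firstShort(const std::vector<std::string>& v,
                       std::size_t maxLength)
{
    // A temporary functor carries maxLength into find_if.
    std::vector<std::string>::const_iterator p =
        std::find_if(v.begin(), v.end(), IsShort(maxLength));
    return (p != v.end()) ? *p : std::string();
}
```

Each call to firstShort builds a fresh IsShort object holding its own limit, so the same class handles a limit of 2, 4, or 8 without any change to the predicate code.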
6.6 std Functors for Comparisons
The C++ standard library provides a number of functors. The most commonly used are functors for comparing pairs of objects using the relational operators.
Suppose we have a vector of strings that we want to sort into ascending order. The three-parameter form of the standard sort function does this. The first two parameters are iterators denoting the range of items to be sorted. The third is a comparison function or functor that is used to compare two objects and return “true” if the first object should come before the second in the desired sorted order.
We want the strings arranged into ascending order, so we would like to use the ordinary <
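for comparisons. We might be tempted to write this:
sort(v.begin(), v.end(), operator<);
taking advantage of the “real” name of the less-than operator. In practice this rarely compiles: operator< names a whole family of overloaded functions, so the compiler cannot tell which one we mean to pass to sort. And for some data types operator<
is a member function, which cannot be passed this way at all.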
A safer alternative that will work for any data type that provides an operator<
, member function or standalone function, is provided by the C++ standard library. The standard library provides less<T>
for this purpose, so we could write:
sort (v.begin(), v.end(), less<string>());
less
is not particularly complicated:
template <class T>
struct less : public binary_function<T, T, bool> {
bool operator()(const T& x, const T& y) const
{ return x < y; }
};
As you can see, the less
class simply provides an operator()
that uses the <
operator on its parameters.
In addition to less
, the standard library provides equal_to
, not_equal_to
, greater
, greater_equal
, and less_equal
, all declared in the header <functional>
.
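A short sketch of two of these functors in use follows. The helper names sortedAscending and sortedDescending are hypothetical. Each takes its vector by value so that the caller's copy is left untouched.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// Returns a copy of v sorted with std::less, i.e., using <.
std::vector<std::string> sortedAscending(std::vector<std::string> v)
{
    std::sort(v.begin(), v.end(), std::less<std::string>());
    return v;
}

// Returns a copy of v sorted with std::greater, i.e., using >.
std::vector<std::string> sortedDescending(std::vector<std::string> v)
{
    std::sort(v.begin(), v.end(), std::greater<std::string>());
    return v;
}
```

Swapping one functor for another changes the resulting order without touching the sorting algorithm itself.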
A bit of a challenge: what does this function do, and can you think of a better name for it?
template <typename T>
void mysteryFunction (list<T>& aList, const T& x)
{
typename list<T>::iterator pos = find_if (aList.begin(), aList.end(),
bind2nd(greater<T>(), x));
aList.insert (pos, x);
}
bind2nd
turns a two-parameter functor into a one-parameter functor by supplying a fixed value for the second parameter. (It was deprecated in C++11 and removed from the standard in C++17.)
If you change “list” to “vector”, the code would still compile and would run correctly. Why would that be a bad idea?
6.7 Substituting Your Own Comparison Functions
Sometimes none of the standard relations will do. In those cases, we just define our own functors.
Suppose you were keeping a vector of PersonnelRecord
, but there is no <
for entire PersonnelRecord
s.
- You need to pick some appropriate key, a data member or group of members that uniquely defines each record.
For example, we might use a combination of name and address.
- Provide a functor that compares the keys:
class CompareByNameAddress {
public:
bool operator()
(const PersonnelRecord& p1,
const PersonnelRecord& p2) const
{return (p1.name() < p2.name())
|| ((p1.name() == p2.name())
&& (p1.address() < p2.address()));}
};
set<PersonnelRecord, CompareByNameAddress> employees;
Then
sort (v.begin(), v.end(), CompareByNameAddress() );
would sort your personnel records into order by name, with any people with the same name being sorted by address.
The () following “CompareByNameAddress” in the call above are important. CompareByNameAddress
is a class, but we don’t pass classes as parameters to functions, we pass objects. So what is CompareByNameAddress()
? It’s a call to the default constructor for the CompareByNameAddress
class, which returns an object of that type which, in turn, we pass to the sort
function.
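For a version that can be compiled and tried, the sketch below supplies a minimal PersonnelRecord. That class is purely hypothetical: the notes only assume it offers name() and address(), so the data members and constructor shown here are assumptions, as is the helper sortByNameAddress.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical minimal record; only name() and address()
// are implied by the text.
class PersonnelRecord {
    std::string theName;
    std::string theAddress;
public:
    PersonnelRecord(const std::string& n, const std::string& a)
        : theName(n), theAddress(a) {}
    const std::string& name() const { return theName; }
    const std::string& address() const { return theAddress; }
};

// Orders records by name, breaking ties by address.
class CompareByNameAddress {
public:
    bool operator()(const PersonnelRecord& p1,
                    const PersonnelRecord& p2) const
    {
        return (p1.name() < p2.name())
            || ((p1.name() == p2.name())
                && (p1.address() < p2.address()));
    }
};

// Sorts the records in place, passing a temporary
// comparison object to std::sort.
void sortByNameAddress(std::vector<PersonnelRecord>& v)
{
    std::sort(v.begin(), v.end(), CompareByNameAddress());
}
```

Declaring operator() as const lets the same class be used as the ordering for a set, since a container may need to call the comparison through a const object.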
7 References
This has not been an exhaustive list of all the generic functions in the C++ standard library. There are many others, but these should be enough to give you a taste. Others are scattered through your textbook.
For a more compact listing, look at this summary sheet.