ADTs
Steven J Zeil
1 Abstraction
In general, abstraction is a creative process of focusing attention on the main problems by ignoring lower-level details.
In programming, we encounter two particular kinds of abstraction:
- procedural abstraction and
- data abstraction.
1.1 Procedural Abstraction
A procedural abstraction is a mental model of what we want a subprogram to do (but not how to do it).
Example:
double hypotenuse = sqrt(side1*side1 + side2*side2);
We can write this, understanding that the sqrt function is supposed to compute a square root, even if we have no idea how that square root actually gets computed.
- That’s because we understand what a square root is.
1.2 Data Abstraction
A data abstraction is a mental model of what can be done to a collection of data. It deliberately excludes details of how to do it.
Example: calendar days
A day (date?) in a calendar denotes a 24-hour period, identified by a specific year, month, and day number.
That’s it. That’s probably all you need to know for you and me to agree that we are talking about a common idea.
Example: cell names
Every cell in a spreadsheet has a unique name. The name has a column part and a row part.
- The row indicators are integer values starting at 1.
- The column indicators are case-insensitive strings of alphabetic characters as follows: A, B, …, Z, AA, AB, AC, …, AZ, BA, BB, …, ZZ, AAA, AAB, … and so on.
- Optional $ markers may appear in front of each part to “fix” the row or column during copying.
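The column naming scheme above can be made precise with a pair of conversion functions. What follows is only a sketch of one possible reading, assuming 0-based column numbers (A is 0, Z is 25, AA is 26). The names `columnToIndex` and `indexToColumn` are hypothetical and are not taken from the course code.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: convert a column indicator such as "AB" to a
// 0-based column number (A -> 0, Z -> 25, AA -> 26, ...).
// Precondition: column is non-empty and entirely alphabetic.
int columnToIndex(const std::string& column)
{
    assert(!column.empty());
    int value = 0;
    for (char c : column) {
        assert(std::isalpha(static_cast<unsigned char>(c)));
        // Treat each letter as a "digit" in the range 1..26, ignoring case.
        int digit = std::toupper(static_cast<unsigned char>(c)) - 'A' + 1;
        value = 26 * value + digit;
    }
    return value - 1;   // shift from 1-based to 0-based
}

// Hypothetical helper: the inverse conversion.
// Precondition: index >= 0.
std::string indexToColumn(int index)
{
    assert(index >= 0);
    std::string column;
    int n = index + 1;                // work with 1-based numbers
    while (n > 0) {
        int digit = (n - 1) % 26;     // 0..25 stands for 'A'..'Z'
        column.insert(column.begin(), static_cast<char>('A' + digit));
        n = (n - 1) / 26;
    }
    return column;
}
```

Under these assumptions, `columnToIndex("AA")` yields 26 and `indexToColumn(702)` yields `"AAA"`, matching the sequence listed above.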
Example: a book
How to describe a book?
- If we are implementing a card catalog and library checkout, it is probably enough to list the metadata (e.g., title, authors, publisher, date).
- If, however, we are going to be working on a project involving the full text of the document (e.g., automatic metadata extraction and indexing), then we might need all the pages and all the text.
- Of course, if we were building bookshelves, we might need more physical attributes such as size and weight!
Example: positions within a container
Many of the abstractions that we work with are “containers” of arbitrary numbers of pieces of other data.
Any time you have an ordered sequence of data, you can imagine the need to look through it. That then leads to the concept of a position within that sequence, with notions like
- finding the first and last position,
- going forward to the next position, etc.
2 Abstract Data Types
Adding Interfaces
- The mental model offered by a data abstraction gives us an informal understanding of how and when to use it.
- But because it is simply a mental model, it does not tell us enough information to program with it.
- An abstract data type (ADT) captures this model in a programming language interface.
Definition of an Abstract Data Type
Definition (traditional): An abstract data type (ADT) is a type name and a list of operations on that type.
It’s convenient, for the purpose of this course, to modify this definition just slightly:
Definition (alternate): An abstract data type (ADT) is a type name and a list of members (data or function) on that type.
- an ADT corresponds, more or less, to the public portion of a typical class
- the “list of members” includes
- names
- data types
- expected behavior
- a.k.a. an ADT specification
ADT Members: attributes and operations
Commonly divided into
- attributes: the things that we think of as being data stored in the ADT
  - Actual interface is often through getAttr() and setAttr() functions,
    which, in turn, might or might not actually involve direct access to a “data member”
- operations: the functions or behaviors of the ADT
  - the “type” of a function consists of its return type and an ordered list of its parameters’ types
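To illustrate the point that an attribute need not correspond to a data member, here is a minimal sketch. The `TimeOfDay` class is hypothetical (it is not one of the course examples): it offers “hour” and “minute” attributes through get/set functions, yet stores neither one directly.

```cpp
#include <cassert>

// Hypothetical ADT: a time of day with "hour" and "minute" attributes.
// Neither attribute is stored directly; both are derived from a single
// private count of minutes since midnight.
class TimeOfDay {
public:
    // Precondition: 0 <= hour < 24 and 0 <= minute < 60
    TimeOfDay(int hour, int minute)
        : minutesSinceMidnight(60 * hour + minute)
    {
        assert(0 <= hour && hour < 24);
        assert(0 <= minute && minute < 60);
    }

    // Attributes
    int getHour() const   { return minutesSinceMidnight / 60; }
    int getMinute() const { return minutesSinceMidnight % 60; }

    void setHour(int hour)
    {
        assert(0 <= hour && hour < 24);
        minutesSinceMidnight = 60 * hour + getMinute();
    }

    void setMinute(int minute)
    {
        assert(0 <= minute && minute < 60);
        minutesSinceMidnight = 60 * getHour() + minute;
    }

private:
    int minutesSinceMidnight;  // the only data member
};
```

Application code that calls `getHour()` or `setMinute()` cannot tell whether the hour and minute are stored separately or computed on demand, which is exactly the freedom the attribute idea is meant to leave open.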
2.1 Examples
Calendar Days
Nothing in the definition of an ADT says that the interface has to be written out in a programming language.
UML diagrams present classes as a 3-part box: name, attributes, & operations
Calendar Days: alternative
But we can use a more programming-style interface:
class Day {
public:
// Attributes
int getDay();
void setDay (int);
int getMonth();
void setMonth(int);
int getYear();
void setYear(int);
// Operations
Day operator+ (int numDays);
int operator- (Day);
bool operator< (Day);
bool operator== (Day);
⋮
See also the interface developed in sections 3.1 and 3.2 of your text (Horstmann).
- It’s essentially the same ADT, but lots of details are different
Notations
- Disadvantages of moving early to programming-style interfaces:
- getting lost in language details
- prematurely committing to those details
Cell Names
Here is a possible interface for our cell name abstraction.
class CellName
{
public:
CellName (std::string column, int row,
bool fixTheColumn = false,
bool fixTheRow=false);
//pre: column.size() > 0 && all characters in column are alphabetic
// row > 0
CellName (std::string cellname);
//pre: exists j, 0<=j<cellname.size()-1,
// cellname.substr(0,j) is all alphabetic (except for a
// possible cellname[0]=='$')
// && cellname.substr(j) is all numeric (except for a
// possible cellname[j]=='$') with at least one non-zero
// digit
CellName (unsigned columnNumber = 0, unsigned rowNumber = 0,
bool fixTheColumn = false,
bool fixTheRow=false);
std::string toString() const;
// render the entire CellName as a string
// Get components in spreadsheet notation
std::string getColumn() const;
int getRow() const;
bool isRowFixed() const;
bool isColumnFixed() const;
// Get components as integer indices in range 0..
int getColumnNumber() const;
int getRowNumber() const;
bool operator== (const CellName& r) const;
⋮
private:
⋮
Arguably, the diagram presents much the same information as the code
Example: a book
If we were to try to capture our book abstraction (concentrating on the metadata), we might come up with something like:
class Book {
public:
Book (Author); // for books with single authors
Book (Author[], int nAuthors); // for books with multiple authors
std::string getTitle() const;
void putTitle(std::string theTitle);
int getNumberOfAuthors() const;
std::string getISBN() const;
void putISBN(std::string id);
Publisher getPublisher() const;
void putPublisher(const Publisher& publ);
AuthorPosition begin();
AuthorPosition end();
void addAuthor (AuthorPosition at, const Author& author);
void removeAuthor (AuthorPosition at);
private:
⋮
};
- What are Author and Publisher in this interface?
  - They are simply other ADTs in this library world, and will need to have designed interfaces of their own.
2.1.1 Example: positions within a container
Coming up with a good interface for our position abstraction is a problem that has challenged many an ADT designer.
- A look at our Book interface may suggest why.
class Book {
public:
Book (Author); // for books with single authors
Book (Author[], int nAuthors); // for books with multiple authors
std::string getTitle() const;
void putTitle(std::string theTitle);
int getNumberOfAuthors() const;
std::string getISBN() const;
void putISBN(std::string id);
Publisher getPublisher() const;
void putPublisher(const Publisher& publ);
typedef int AuthorPosition;
Author getAuthor (AuthorPosition authorNum) const;
void addAuthor (AuthorPosition at, const Author& author);
void removeAuthor (AuthorPosition at);
private:
⋮
};
- One intuitive idea might be to simply number the authors and treat the number as a position indicator, as shown here.
Iterators
The solution adopted by the C++ community is to have every ADT that is a “container” of a sequence of other data provide a special type for positions within that sequence.
- The container itself provides functions, begin() and end(), to return
  - the beginning position in the sequence and
  - the position just after the last data item in the sequence.
- The position ADT must provide, at a minimum:
  - A function to fetch the data item at the given position.
  - A function to advance from the current position to the next position in the sequence.
  - A function to compare two positions to see if they are the same.
A Possible Position Interface
In theory, we could satisfy this requirement with an ADT like this:
class AuthorPosition {
public:
AuthorPosition();
// get data at this position
Author getData() const;
// get the position just after this one
AuthorPosition next() const;
// Is this the same position as pos?
bool operator== (const AuthorPosition& pos) const;
bool operator!= (const AuthorPosition& pos) const;
};
which in turn would allow us to access authors like this:
void listAllAuthors(Book& b)
{
for (AuthorPosition p = b.begin(); p != b.end();
p = p.next())
cout << "author: " << p.getData() << endl;
}
The Iterator ADT
For historical reasons (and brevity), however, C++ programmers use overloaded operators for the getData() and next() operations:
class AuthorPosition {
public:
AuthorPosition();
// get data at this position
Author operator*() const;
// get a data/function member at this position
Author* operator->() const;
// move forward to the position just after this one
AuthorPosition operator++();
// Is this the same position as pos?
bool operator== (const AuthorPosition& pos) const;
bool operator!= (const AuthorPosition& pos) const;
};
so that code to access authors would look like this:
void listAllAuthors(Book& b)
{
for (AuthorPosition p = b.begin(); p != b.end();
++p)
cout << "author: " << *p << endl;
}
This ADT for positions is called an iterator (because it lets us iterate over a collection of data).
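To see all of the pieces working together, here is a self-contained sketch. It is not the course's Book code: the `AuthorNames` container and its nested `Position` type are hypothetical, and the container holds plain strings because the `Author` interface is not shown in these notes. The position is assumed to be implemented as a pointer into a fixed-size array.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical container: a fixed-capacity list of author names.
class AuthorNames {
public:
    // The position (iterator) type for this container.
    class Position {
    public:
        explicit Position(const std::string* where = nullptr) : current(where) {}
        // get data at this position
        const std::string& operator*() const { return *current; }
        // get a data/function member at this position
        const std::string* operator->() const { return current; }
        // move forward to the position just after this one
        Position& operator++() { ++current; return *this; }
        // Is this the same position as pos?
        bool operator==(const Position& pos) const { return current == pos.current; }
        bool operator!=(const Position& pos) const { return current != pos.current; }
    private:
        const std::string* current;
    };

    AuthorNames() : count(0) {}

    // Precondition: the list is not full.
    void add(const std::string& name)
    {
        assert(count < CAPACITY);
        names[count] = name;
        ++count;
    }

    Position begin() const { return Position(names); }
    Position end() const   { return Position(names + count); }

private:
    static const std::size_t CAPACITY = 12;
    std::string names[CAPACITY];
    std::size_t count;
};

// Application code: visits every name using only the iterator operations.
std::string joinNames(const AuthorNames& authors)
{
    std::string result;
    for (AuthorNames::Position p = authors.begin(); p != authors.end(); ++p) {
        if (!result.empty())
            result += "; ";
        result += *p;
    }
    return result;
}
```

Note that `joinNames` never mentions the array or the pointer. If the container later switched to a different data structure, only `Position` and the private section would need to change.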
2.2 Design Patterns
Iterator as a Design Pattern
- The iterator is an example of a design pattern: a reusable concept that is not so much a piece of code as it is a design idea.
Pattern, not ADT
In C++, our application code does not work with an actual ADT named “Iterator”.
- Instead, we typically have a lot of different ADTs, all of which share the common pattern of supporting an operator++ to move forward, an operator* for fetching a value at the position, etc.
- Each of these iterator ADTs is related to some kind of collection of data.
  - These collections have nothing to do with one another except that they share the common idea of supplying begin() and end() functions to provide the beginning and ending position within that collection.
Realizing a Design Pattern
- Iterator and Collection are general patterns for interfaces.
- We will have many actual classes that are unrelated to one another but that implement or realize those patterns.
  - Such classes are called concrete realizations of the general patterns.
  - It’s these concrete classes that our application actually works with.
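As a small sketch of what “unrelated classes realizing the same pattern” means in practice, the function below is written purely against the pattern: it assumes only that its argument has begin() and end() and that the resulting positions support `++`, `*`, and `!=`. The function names `sumAll` and `sumAllDemo` are made up for this illustration; `std::vector`, `std::list`, and `std::set` are standard library containers that share no common base class.

```cpp
#include <cassert>
#include <list>
#include <set>
#include <vector>

// Works with any "collection" whose begin()/end() return positions that
// support ++, *, and !=. No common base class is involved.
template <typename Collection>
int sumAll(const Collection& c)
{
    int total = 0;
    for (auto p = c.begin(); p != c.end(); ++p)
        total += *p;
    return total;
}

// The same function applied to three unrelated concrete realizations.
void sumAllDemo()
{
    std::vector<int> v = {1, 2, 3};
    std::list<int> l = {10, 20};
    std::set<int> s = {5, 5, 7};   // a set keeps only one copy of 5
    assert(sumAll(v) == 6);
    assert(sumAll(l) == 30);
    assert(sumAll(s) == 12);
}
```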
3 ADTs as contracts
An ADT represents a contract between the ADT developer and the users (application programmers).
The Contract
Application writers (the “users” of the ADT) are expected to alter/examine values of this type only via the operations and members provided.
The creator of the ADT promises to leave the operation specifications unchanged.
The creator of the ADT is allowed to change the code of the operations at any time, as long as it continues to satisfy the specifications.
The creator of the ADT is also allowed to change the data structure actually used to implement the type.
Why the Contract
What do we gain by holding ourselves to this contract?
- Application programmers can be designing and even implementing the application before the details of the ADT implementation have been worked out. This helps in
- top-down design
- development by teams
- The ADT implementors know exactly what they must provide and what they are allowed to change.
- ADTs designed in this manner are often re-usable. By reusing code, we save time in
- implementation
- debugging
- We gain the flexibility to try/substitute different data structures to actually implement the ADT, without needing to alter the application code.
- By encouraging modularity, application code becomes more readable.
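The flexibility to substitute data structures can be sketched with a deliberately tiny example. Everything here is hypothetical: a “bag of integers” ADT whose specification is given in the opening comment, two implementations of it (`VectorBag` and `MapBag`), and a check that is written only against the specification and so runs unchanged against either one.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical ADT "bag of integers":
//   void add(int x)                 -- put one more copy of x in the bag
//   std::size_t count(int x) const  -- how many copies of x are in the bag
//   std::size_t size() const        -- total number of items in the bag

// First implementation: an unsorted vector of every item added.
class VectorBag {
public:
    void add(int x) { items.push_back(x); }
    std::size_t count(int x) const
    {
        std::size_t n = 0;
        for (int item : items)
            if (item == x) ++n;
        return n;
    }
    std::size_t size() const { return items.size(); }
private:
    std::vector<int> items;
};

// Second implementation: a map from each distinct value to its tally.
class MapBag {
public:
    MapBag() : total(0) {}
    void add(int x) { ++tallies[x]; ++total; }
    std::size_t count(int x) const
    {
        std::map<int, std::size_t>::const_iterator it = tallies.find(x);
        return (it == tallies.end()) ? 0 : it->second;
    }
    std::size_t size() const { return total; }
private:
    std::map<int, std::size_t> tallies;
    std::size_t total;
};

// "Application" code written only against the specification above.
template <typename Bag>
void checkBagContract()
{
    Bag b;
    b.add(3);
    b.add(7);
    b.add(3);
    assert(b.size() == 3);
    assert(b.count(3) == 2);
    assert(b.count(42) == 0);
}
```

Calling `checkBagContract<VectorBag>()` and `checkBagContract<MapBag>()` exercises both implementations with identical application code, which is the situation the contract is meant to guarantee.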
3.1 Information Hiding
Every design can be viewed as a collection of “design decisions”.
-
David Parnas formulated the principle: “Every module [procedure] should be designed so as to hide one design decision from the rest of the program.”
-
He argued that such information hiding made future changes more economical.
Encapsulation
Although ADTs can be designed without language support, they rely on programmers’ self-discipline for enforcement of information hiding.
Encapsulation is the enforcement of information hiding by programming language constructs.
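A brief sketch of what that enforcement looks like in C++, using a hypothetical `DayOfMonth` class invented for this illustration. Because the data member is private, the compiler rejects any application code that tries to bypass the operations, so the range restriction cannot be violated by accident.

```cpp
#include <cassert>

// Hypothetical ADT: a day of the month restricted to 1..31.
class DayOfMonth {
public:
    // Precondition: 1 <= d <= 31
    explicit DayOfMonth(int d) : day(d) { assert(isValid(d)); }

    int get() const { return day; }

    // Returns false and leaves the value unchanged if d is out of range.
    bool set(int d)
    {
        if (!isValid(d))
            return false;
        day = d;
        return true;
    }

private:
    static bool isValid(int d) { return 1 <= d && d <= 31; }
    int day;   // hidden: application code cannot write "x.day = 99;"
};
```

Without the `private:` label the same class would still describe the same ADT, but keeping `day` in range would depend entirely on every programmer remembering to go through `set()`.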
4 ADT Implementations
An ADT is implemented by supplying
- a data structure for the type name.
- coded algorithms for the operations.
We sometimes refer to the ADT itself as the ADT specification or the ADT interface, to distinguish it from the code of the ADT implementation.
In C++, implementation is generally done using a C++ class.
- Uses public/private to enforce the ADT contract
4.1 Examples
Calendar Day Implementations
CellName implementation
class CellName
{
public:
CellName (std::string column, int row,
bool fixTheColumn = false,
bool fixTheRow=false);
//pre: column.size() > 0 && all characters in column are alphabetic
// row > 0
CellName (std::string cellname);
//pre: exists j, 0<=j<cellname.size()-1,
// cellname.substr(0,j) is all alphabetic (except for a
// possible cellname[0]=='$')
// && cellname.substr(j) is all numeric (except for a
// possible cellname[j]=='$') with at least one non-zero
// digit
CellName (unsigned columnNumber = 0, unsigned rowNumber = 0,
bool fixTheColumn = false,
bool fixTheRow=false);
std::string toString() const;
// render the entire CellName as a string
// Get components in spreadsheet notation
std::string getColumn() const;
int getRow() const;
bool isRowFixed() const;
bool isColumnFixed() const;
// Get components as integer indices in range 0..
int getColumnNumber() const;
int getRowNumber() const;
bool operator== (const CellName& r) const
{return (columnNumber == r.columnNumber &&
rowNumber == r.rowNumber &&
theColIsFixed == r.theColIsFixed &&
theRowIsFixed == r.theRowIsFixed);}
private:
⋮
int rowNumber;
bool theRowIsFixed;
bool theColIsFixed;
int alphaToInt (std::string columnIndicator) const;
std::string intToAlpha (int columnIndex) const;
};
inline
bool CellName::isRowFixed() const {return theRowIsFixed;}
inline
bool CellName::isColumnFixed() const {return theColIsFixed;}
#endif
There are some options here that have not been explored:
- Do we want the column info stored as a number, a string, or both?
Book implementation
We can implement Book in book.h:
#ifndef BOOK_H
#define BOOK_H
#include <string>
#include "author.h"
#include "publisher.h"
class Book {
public:
typedef const Author* AuthorPosition;
Book (Author); // for books with single authors
Book (const Author[], int nAuthors); // for books with multiple authors
std::string getTitle() const;
void setTitle(std::string theTitle);
int getNumberOfAuthors() const;
std::string getISBN() const;
void setISBN(std::string id);
Publisher getPublisher() const;
void setPublisher(const Publisher& publ);
AuthorPosition begin() const;
AuthorPosition end() const;
void addAuthor (AuthorPosition at, const Author& author);
void removeAuthor (AuthorPosition at);
private:
std::string title;
int numAuthors;
std::string isbn;
Publisher publisher;
static const int MAXAUTHORS = 12;
Author authors[MAXAUTHORS];
};
#endif
and in book.cpp:
#include "book.h"
// for books with single authors
Book::Book (Author a)
{
numAuthors = 1;
authors[0] = a;
}
// for books with multiple authors
Book::Book (const Author au[], int nAuthors)
{
numAuthors = nAuthors;
for (int i = 0; i < nAuthors; ++i)
{
authors[i] = au[i];
}
}
std::string Book::getTitle() const
{
return title;
}
void Book::setTitle(std::string theTitle)
{
title = theTitle;
}
int Book::getNumberOfAuthors() const
{
return numAuthors;
}
std::string Book::getISBN() const
{
return isbn;
}
void Book::setISBN(std::string id)
{
isbn = id;
}
Publisher Book::getPublisher() const
{
return publisher;
}
void Book::setPublisher(const Publisher& publ)
{
publisher = publ;
}
Book::AuthorPosition Book::begin() const
{
return authors;
}
Book::AuthorPosition Book::end() const
{
return authors+numAuthors;
}
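void Book::addAuthor (Book::AuthorPosition at, const Author& author)
{
// pre: numAuthors < MAXAUTHORS and at is in the range begin()..end()
int atk = at - authors;
// Shift the authors in positions atk..numAuthors-1 up by one slot,
// starting from the last occupied slot.
int i = numAuthors - 1;
while (i >= atk)
{
authors[i+1] = authors[i];
i--;
}
authors[atk] = author;
++numAuthors;
}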
void Book::removeAuthor (Book::AuthorPosition at)
{
int atk = at - authors;
while (atk + 1 < numAuthors)
{
authors[atk] = authors[atk + 1];
++atk;
}
--numAuthors;
}
We’ll explore some of the details and alternatives of these implementations in the next lesson.