Discovering and Documenting Classes
Steven Zeil
1 Classification
Classification: Where do Classes Come From?
A key step in any OO analysis or design is identifying the appropriate classes.
- In practice, this process is
  - incremental: We tend to add a few classes at a time.
  - iterative: We may revisit earlier decisions and change them. We often identify classes and then later add details about their relationships to other classes.
Grouping Objects into Classes
We may identify groups of objects as a class because they have common
- properties, e.g., we regard all things that have titles, authors, and textual content as documents, regardless of whether they are in a print medium, a file, or even chiseled into a set of stone tablets.
  - Don’t make the mistake of grouping things into a class because they have common property values.
    - A “collection of documents” can be a class. The values of that class can be collections that were selected by many different criteria.
    - By contrast, the collection of documents written by Mark Twain (i.e., whose author property has the value MarkTwain) is not a class. It’s just a particular value of the “collection of documents” class.
- behaviors, e.g., the set of all documents that can be loaded from and saved to a disk might represent a distinct class ElectronicDocuments.
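The distinction between common properties and common property values can be sketched in C++. This is only an illustration: the names Document, DocumentCollection, and writtenBy are hypothetical and do not come from any library model described in these notes.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical class: every document has the same properties
// (title, author, text), whatever values those properties hold.
struct Document {
    std::string title;
    std::string author;
    std::string text;
};

// Hypothetical class: a collection of documents, however it was selected.
class DocumentCollection {
public:
    void add(const Document& doc) { docs.push_back(doc); }
    std::size_t size() const { return docs.size(); }

    // Selecting by a property value produces another *value* of this
    // same class. It does not call for a new class.
    DocumentCollection writtenBy(const std::string& author) const {
        DocumentCollection result;
        for (const Document& d : docs) {
            if (d.author == author) {
                result.add(d);
            }
        }
        return result;
    }

private:
    std::vector<Document> docs;
};
```

Under these assumptions, "the documents written by Mark Twain" is simply the object returned by writtenBy("Mark Twain"), which has the same type as any other collection.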
2 Driving the classification process
Where do we get the information from which we can identify classes during analysis?
The “program as simulation” philosophy suggests that we should be looking for a model of the “real world”.
- Initially we will build that model by looking at informal English descriptions of the world.
- Later, we will build it from use cases (scenarios).
Our first pass is often informal, based on the documents at hand.
We will eventually formalize this process by writing, in conjunction with our domain experts, use-cases (scenarios) of sequences of actions in the world and analyzing those scenarios to see what they suggest about classes of objects in the world and the responsibilities of those classes.
Working from Informal Descriptions
Generally, this is done at the start of construction of a domain model or, if no domain model is needed, of the analysis model.
A fairly simple way to get started is to scan the informal statement looking for noun phrases and verb phrases:
- Nouns represent candidate objects.
- Verbs represent the messages/operations upon them (responsibilities).
This doesn’t scale well to large projects/documents, but it is simple and often a useful starting point.
After that, we move on by exploiting our knowledge to assign the responsibilities to the appropriate classes and to refine our choice of classes and responsibilities.
Use-Case analysis
A use-case is a particular pattern of usage, a scenario that begins with some user of the system initiating a transaction or sequence of related events.
We analyze use-cases to discover the
- objects that participate in the scenario
- responsibilities of each object
- collaborations with other objects
More on this in later lessons.
3 Documenting our Knowledge: UML classes
We have previously introduced the notation for class in UML. A class is shown as a rectangle with three components:
- a name
- a list of attributes (properties that we think of as being simply observed or directly changed).
  - An attribute is generally written in the form
    name : datatype
    e.g.,
Librarian |
---|
name: string |
  - Early in the analysis process, when we have more unresolved questions than firm conclusions, we might have only a name for some attributes, or only a data type, e.g.
Librarian |
---|
name: string |
id |

Librarian |
---|
name: string |
: BranchLibrary |
  - Later in the design, we may choose to add visibility marks (‘+’ for public, ‘-’ for private),
Librarian |
---|
+name: string |
+id: string |
+assignment: BranchLibrary |
-schedule: vector of DateTime |
    but in the early stage of analysis, we are really only concerned with public behaviors of our classes.
    - Even during design, take note that marking an attribute as public does not mean that it will be implemented as a public data member. For example
CalendarDay |
---|
+year: int |
⋮ |
      is much more likely to be implemented as
class CalendarDay {
    int year;
public:
    ⋮
    int getYear() const;
    void setYear(int theYear);
    ⋮
};
      than as a public data member, because that’s how public attributes are usually handled in ADTs.
- A list of operations supplied by the class.
  Operations may be simple text names very early in the analysis, but eventually are written in a pseudo-function declaration:
  operationName (list-of-parameters)
  or
  operationName (list-of-parameters) : return-type
  e.g.,
Librarian |
---|
name: string |
id: string |
assignment: BranchLibrary |
checkOut (:Book, for: Patron) |
  Parameters are written in the same “name : type” style as attributes.
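As a rough guide to how this notation might eventually map onto code, here is one possible C++ rendering of the Librarian box. It is a sketch under several assumptions: Book, Patron, and BranchLibrary are empty placeholders, the attribute types and the void return type are guesses (the UML above gives no return type), and the checkOutsHandled counter is a hypothetical stand-in so that the operation has an observable effect.

```cpp
#include <string>
#include <utility>

// Placeholder types for classes the analysis has not detailed yet.
class Book {};
class Patron {};
class BranchLibrary {};

class Librarian {
public:
    Librarian(std::string name, std::string id, BranchLibrary* assignment)
        : name_(std::move(name)), id_(std::move(id)), assignment_(assignment) {}

    // UML attributes: name: string, id: string, assignment: BranchLibrary
    const std::string& name() const { return name_; }
    const std::string& id() const { return id_; }
    BranchLibrary* assignment() const { return assignment_; }

    // UML operation: checkOut (:Book, for: Patron)
    // "for" is a C++ keyword, so the second parameter is renamed here.
    void checkOut(Book& book, Patron& borrower) {
        (void)book;      // a real body would record the loan
        (void)borrower;
        ++checkOuts_;    // hypothetical stand-in behavior
    }

    int checkOutsHandled() const { return checkOuts_; }

private:
    std::string name_;
    std::string id_;
    BranchLibrary* assignment_;
    int checkOuts_ = 0;
};
```

The unnamed UML parameter ":Book" needs a name in C++ only where the body uses it; the analysis-level diagram can leave it out.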
4 Informal Documentation: CRC Cards
An alternative to UML diagrams, CRC cards are a popular tool for use in early analysis, particularly in team discussions and brainstorming sessions.
They are not exactly a high-tech tool:
- 4x6 index cards
- used to take notes during analysis,
- as a concrete symbol for an object during discussion
  - cards can be stacked, moved, etc. to illustrate proposed relationships
A low-tech (no-tech?) approach is often useful in early brainstorming sessions. The index cards can serve as a concrete symbol for an object during discussion. People trying to make a point may stack the cards, move them around, etc., while discussing proposed relationships.
CRC Cards are Informal Documentation
The point of CRC cards is to capture info about an analysis discussion without slowing down that discussion so someone can take nicely formatted notes. (Every now and then I see someone announce a new online tool for letting you type into a PC to write CRC cards that will then be printed out in nice neatly formatted output. That really misses the point. Ever been in a meeting where some single person was trying to type up all the important stuff being said? What does that usually do to the discussion dynamic?)
- They aren’t pretty.
  - They aren’t something you ever want to show your customers or even your own upper-management.
  - If you come out of a group meeting and your CRC cards aren’t smudged, dog-eared, with lots of scratched-out bits, you probably weren’t really trying.
4.1 CRC Card Layout
- labeled with class name
- divided into two columns
  - responsibilities: a high-level description of a purpose of the class
    - attributes
    - behaviors
  - collaborators: other classes with which this class must work (send messages to) to fulfill this class’s responsibilities
ClassName | |
responsibility 1 | collaborator 1 |
responsibility 2 | collaborator 2 |
responsibility 3 | |
CRC Card Example
Librarian | |
Handles checkout and checkin of publications | Patrons |
Reshelves books | Publications |
Manages new acquisitions of publications | Inventory |
Has name, ID#, branch assignment | Catalog |
4.2 Assigning Responsibilities
- The responsibilities will eventually evolve into messages that can be sent to this class and then into member functions of an ADT.
- Being aware of that intention can be a good indicator of which class should receive a particular responsibility.
5 Example
5.1 Checking Out a Book
For example, if I were told “a library patron will give a librarian a book to be checked out”, I would model this as
Librarian |
---|
... |
checkOut (:Book, for: Patron) |
Patron |
---|
|
|
or, if I were using CRC cards,
Librarian | |
Handles checkout of books for patrons | |
... | |
Patron | |
Librarian | |
but not as
Librarian |
---|
|
|
Patron |
---|
|
askToCheckOut (:Book, :Librarian) |
or
Librarian | |
Patron | |
Asks librarian to check out book | Librarian |
But, truthfully, there are a lot of people who would be OK with the latter set of CRC cards. No one in their right mind, however, would accept the equivalent UML.
The problem is that the idea of “responsibility” in a CRC card is often described as “something that these objects do”, without distinguishing between things that objects do because something else asked them to (i.e., it’s a part of their public interface) and things that they do as internal steps for responding to some other, possibly more general request.
If we accepted “asks…to check out” as a Patron responsibility or operation, what would be the implementing steps? Assuming that it is really the expertise of the Librarian to know how to check out a book, then the Patron’s function almost certainly would be:
void Patron::askToCheckOut (Librarian& librarian, Book& book)
{
    // Forward the request: checking out a book is the Librarian's expertise.
    librarian.checkOut(book, *this);
}
In other words, we are immediately forced to add the preferred Librarian-based option anyway.
Then we also have to ask, if Patron provides this askToCheckOut function, what in this simulated world will call it? For all we know, that “ask…” is just an internal step of some larger process (simulating a patron walking into a library, returning some books, browsing, selecting a new book, asking a librarian to check it out) and is never used by objects other than the Patron itself.
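The forwarding argument can be made concrete with a small compilable sketch. The class bodies here are assumptions made purely for illustration: the loans record, the loansRecorded accessor, and the Patron name are hypothetical, and only the checkOut and askToCheckOut operations come from the discussion above.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class Patron;  // forward declaration, needed by Librarian::checkOut

struct Book {
    std::string title;
};

class Librarian {
public:
    // The expertise lives here: the librarian knows how to check out a book.
    void checkOut(Book& book, Patron& borrower);

    std::size_t loansRecorded() const { return loans.size(); }

private:
    std::vector<std::string> loans;  // hypothetical record of "title -> patron"
};

class Patron {
public:
    explicit Patron(std::string name) : name_(std::move(name)) {}
    const std::string& name() const { return name_; }

    // The disputed operation: all it can do is forward to the Librarian.
    void askToCheckOut(Librarian& librarian, Book& book) {
        librarian.checkOut(book, *this);
    }

private:
    std::string name_;
};

// Defined after Patron so that borrower.name() is available.
void Librarian::checkOut(Book& book, Patron& borrower) {
    loans.push_back(book.title + " -> " + borrower.name());
}
```

Calling patron.askToCheckOut(librarian, book) and calling librarian.checkOut(book, patron) directly leave the librarian in the same state, which is why the Patron-side operation adds nothing to the model's public interface.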
5.2 When A does B to C
A useful rule of thumb is that if “A does B to C”, then
- “doing B” is an operation.
- But it is usually an operation provided by class C, not by A.
- A presumably calls that operation, but we may need more analysis to determine why or in what context.
Natural language being the flexible and imprecise tool that it is, there are many exceptions to this. But that’s where thinking of responsibilities as future functions can help. In programming terms, statements like “A does B to C” often occur in contexts where we are describing a series of steps being enacted because someone or something else asked A to fulfill some higher-level responsibility.
In pseudo-code terms, we might say
void A::fulfillSomeOtherResponsibilityOfA(C c1)
{
⋮
c1.B();
⋮
}
in which case it is clear that B is a responsibility of class C, and C should be listed as a collaborator on A’s card (along with that “some other responsibility”).
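A concrete instance of the same pattern, using a hypothetical statement such as “the librarian marks the book as on loan”, might look like the sketch below. The operation names markOnLoan and isOnLoan, the patron ID string, and the checkOutsHandled counter are all assumptions introduced for this example.

```cpp
#include <string>

// C: the thing that something is done to.
// "Marking as on loan" (B) is therefore an operation of Book.
class Book {
public:
    void markOnLoan(const std::string& patronId) { borrower_ = patronId; }
    bool isOnLoan() const { return !borrower_.empty(); }
    const std::string& borrower() const { return borrower_; }

private:
    std::string borrower_;  // empty when the book is on the shelf
};

// A: does B to C, but only as one step of a higher-level
// responsibility that someone else asked A to fulfill.
class Librarian {
public:
    void checkOut(Book& book, const std::string& patronId) {
        // ... other steps would go here, e.g., checking the patron's account ...
        book.markOnLoan(patronId);  // "A does B to C": B belongs to C
        ++handled_;
    }

    int checkOutsHandled() const { return handled_; }

private:
    int handled_ = 0;
};
```

On the corresponding CRC cards, Book would appear as a collaborator of Librarian next to the checkout responsibility, and “can be marked as on loan” would be a responsibility of Book.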
5.2.1 Containers
A variation on this rule of thumb occurs when managing collections.
If I told you that “the librarian adds the book’s metadata to the catalog”, I would expect you to model that as
Catalog |
---|
|
add (metadata) |
and not as
Metadata |
---|
|
addTo (:Catalog) |
(Metadata refers to the set of properties that identify and describe a document or other collection of data. Typical metadata fields are author, title, date of publication, etc. “Metadata” is a perfect example of the type of specialized vocabulary that the people working in the Library World would understand and that you as a software developer assigned to that world would need to learn to use when communicating with them.)
Again, by analogy with programming, we understand that if you had something like
vector<int> v;
set<int> myList;
⋮
v.push_back(23);
myList.insert (23);
you would regard the ability to push data onto the end of a vector or to add data to a set as operations of the vector and the set, not as operations of the int data type.
You would never say:
23.insertInto (myList);
Not only is this ugly, but it suggests that we would need to predict all the possible container classes that will ever be written to hold integers, and to somehow add their specific add/insert/push operations to the set of permitted operations of the int type.
That’s just not possible.
Similarly, we’d expect that metadata could be added to many other kinds of containers as well as the library catalog. The ability to hold multiple instances of metadata is a responsibility of the container, not of the thing contained.
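Carrying the analogy back to the library, a minimal sketch might look like this. The fields of Metadata follow the typical fields mentioned earlier, but the exact members, the containsTitle query, and the vector-based storage are assumptions made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// The thing contained: it knows nothing about who might hold it.
struct Metadata {
    std::string author;
    std::string title;
    int yearPublished;
};

// The container: adding is the Catalog's responsibility.
class Catalog {
public:
    void add(const Metadata& m) { entries.push_back(m); }

    std::size_t size() const { return entries.size(); }

    bool containsTitle(const std::string& title) const {
        for (const Metadata& m : entries) {
            if (m.title == title) {
                return true;
            }
        }
        return false;
    }

private:
    std::vector<Metadata> entries;  // hypothetical storage choice
};
```

Because Metadata has no addTo operation, the same value can be placed in a Catalog, a std::vector<Metadata>, or any container written later, without changing the Metadata class.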
6 Brainstorming
CRC cards are often touted as the tool of choice for brainstorming sessions when a lot of people are hashing out ideas about possible classes.
Certainly, the informal act of scrawling text on an index card is consistent with that style of meeting. (I often see ads for apps and programs for drawing nice pretty CRC cards, but to me that kind of misses the whole point.)
But UML class diagrams don’t have to be particularly clumsy in that kind of setting.
- You can scrawl them on a sheet of paper or a whiteboard (you really just need the horizontal dividing lines to give everyone the right idea).
- In any text editor, you can type a couple of lines of hyphens to set off the three areas of the class, e.g.,
    Librarian
    ---------
    name
    id
    asst: BranchLibrary
    ---------
    checkOut(:Book, for: Patron)

    BranchLibrary
    -------------
    name
    location
    : list of staff
    -------------
- Most word processors will let you create a 1-column by 3-row table with just a couple of clicks.
Whether you work with CRC cards or UML class diagrams, the early process is much the same:
- Examine your available documents for noun phrases describing real-world things that are essential to describing the system being contemplated. These are your candidate classes.
  - Don’t worry about whether you think you are going to be automating these or not. As we will see later, we need to model the real-world external things that interact with our automated systems as well as the things that we will be simulating within the code.
- Also watch for verb phrases describing things that happen in that real world. These are your candidate operations.
- Try to match your candidate operations up with the candidate classes. Watch also for obvious attributes that contribute to those operations.
Some classes may turn out to be trivial. Some may be uninteresting because they don’t provide any of the candidate operations, nor are they likely to call on any of those operations.
What’s left is the core of your domain and/or analysis model.