Discovering and Documenting Classes

Steven Zeil

Last modified: Sep 3, 2018

1 Classification

Classification: Where do Classes Come From?

A key step in any OO analysis or design is identifying the appropriate classes.


Grouping Objects into Classes

We may identify groups of objects as a class because they have common

2 Driving the classification process

Where do we get the information from which we can identify classes during analysis?

The “program as simulation” philosophy suggests that we should be looking for a model of the “real world”.

Our first pass is often informal, based on the documents at hand.

We will eventually formalize this process by writing, in conjunction with our domain experts, use-cases (scenarios) of sequences of actions in the world and analyzing those scenarios to see what they suggest about classes of objects in the world and the responsibilities of those classes.


Working from Informal Descriptions

Generally, this is done at the start of construction of a domain model or, if no domain model is needed, of the analysis model.

A fairly simple way to get started is to scan the informal statement looking for noun phrases (candidate classes) and verb phrases (candidate operations).

This doesn’t scale well to large projects/documents, but it is simple and often a useful starting point.

After that, we move on by exploiting our knowledge to assign the responsibilities to the appropriate classes and to refine our choice of classes and responsibilities.


Use-Case analysis

A use-case is a particular pattern of usage, a scenario that begins with some user of the system initiating a transaction or sequence of related events.

We analyze use-cases to discover the

More on this in later lessons.

3 Documenting our Knowledge: UML classes

We have previously introduced the notation for a class in UML. A class is shown as a rectangle with three components:

  1. a name

  2. a list of attributes (properties that we think of as being simply observed or directly changed).

    • An attribute is generally written in the form

      name : datatype

      e.g.,
      Librarian
      name: string

      • Early in the analysis process, when we have more unresolved questions than firm conclusions, we might have only a name for some attributes, or only a data type, e.g.

        Librarian
        name: string
        id

        or

        Librarian
        name: string
        : BranchLibrary

      • Later in the design, we may choose to add visibility marks (‘+’ for public, ‘-’ for private),

        Librarian
        +name: string
        +id: string
        +assignment: BranchLibrary
        -schedule: vector of DateTime

        but in the early stage of analysis, we are really only concerned with public behaviors of our classes.

        • Even during design, take note that marking an attribute as public does not mean that it will be implemented as a public data member. For example
          CalendarDay
          +year: int

          is much more likely to be implemented as

          class CalendarDay {
              int year;
          public:
                 ⋮
              int getYear() const;
              void setYear(int theYear);
                 ⋮
          };
          

          than as a public data member, because that’s how public attributes are usually handled in ADTs.

  3. A list of operations supplied by the class.

    Operations may be simple text names very early in the analysis, but eventually are written in a pseudo-function declaration:

    operationName ( list-of-parameters )

    or

    operationName ( list-of-parameters ) : return-type

    e.g.,
    Librarian
    name: string
    id: string
    assignment: BranchLibrary
    checkOut (:Book, for: Patron)

    Parameters are written in the same “name : type” style as attributes.
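
To make the mapping concrete, here is a minimal sketch of how the Librarian box above might eventually look as a C++ class. It is only one possible rendering, not a prescribed implementation: the Book, Patron, and BranchLibrary structs and their members are hypothetical stand-ins, the getter names are assumptions, and the parameter that UML calls "for" is renamed forPatron because "for" is a C++ keyword.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for classes that the UML box only names.
struct Book {
    std::string title;
    bool checkedOut;
};

struct Patron {
    std::string name;
    std::vector<std::string> borrowedTitles;
};

struct BranchLibrary {
    std::string branchName;
};

class Librarian {
public:
    Librarian(std::string theName, std::string theId, BranchLibrary theAssignment)
        : name(std::move(theName)),
          id(std::move(theId)),
          assignment(std::move(theAssignment)) {}

    // The three attributes from the UML box, exposed through member functions.
    const std::string& getName() const { return name; }
    const std::string& getId() const { return id; }
    const BranchLibrary& getAssignment() const { return assignment; }

    // checkOut (:Book, for: Patron)
    // Assumed behavior: mark the book as out and note it on the patron's record.
    void checkOut(Book& book, Patron& forPatron) {
        book.checkedOut = true;
        forPatron.borrowedTitles.push_back(book.title);
    }

private:
    std::string name;
    std::string id;
    BranchLibrary assignment;
};
```

A caller would construct a Librarian and then write something like librarian.checkOut(book, patron), which reads the same way as the operation in the diagram.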

4 Informal Documentation: CRC Cards

An alternative to UML diagrams, CRC cards are a popular tool for use in early analysis, particularly in team discussions and brainstorming sessions.

They are not exactly a high-tech tool:

A low-tech (no-tech?) approach is often useful in early brainstorming sessions. The index cards can serve as a concrete symbol for an object during discussion. People trying to make a point may stack the cards, move them around, etc., while discussing proposed relationships.


CRC Cards are Informal Documentation

The point of CRC cards is to capture info about an analysis discussion without slowing down that discussion so someone can take nicely formatted notes. (Every now and then I see someone announce a new online tool for letting you type into a PC to write CRC cards that will then be printed out in nice neatly formatted output. That really misses the point. Ever been in a meeting where some single person was trying to type up all the important stuff being said? What does that usually do to the discussion dynamic?)

4.1 CRC Card Layout

ClassName
responsibility 1        collaborator 1
responsibility 2        collaborator 2
responsibility 3

CRC Card Example

Librarian
Handles checkout and checkin of publications        Patrons
Reshelves books                                     Publications
Manages new acquisitions of publications            Inventory
Has name, ID#, branch assignment                    Catalog

4.2 Assigning Responsibilities

5 Example

5.1 Checking Out a Book

For example, if I were told “a library patron will give a librarian a book to be checked out”, I would model this as

Librarian
...
checkOut (:Book, for: Patron)
Patron


or, if I were using CRC cards,

Librarian
Handles checkout of books for patrons   
...   
Patron
    Librarian

but not as

Librarian


Patron

askToCheckOut (:Book, :Librarian)

or

Librarian
       
Patron
Asks librarian to check out book        Librarian

But, truthfully, there are a lot of people who would be OK with the latter set of CRC cards. No one in their right mind, however, would accept the equivalent UML.

The problem is that the idea of “responsibility” in a CRC card is often described as “something that these objects do”, without distinguishing between things that objects do because something else asked them to (i.e., it’s a part of their public interface) and things that they do as internal steps for responding to some other, possibly more general request.

If we accepted “asks…to check out” as a Patron responsibility or operation, what would be the implementing steps? Assuming that it really is the expertise of the Librarian to know how to check out a book, then the Patron’s function almost certainly would be:

void Patron::askToCheckOut (Book& book, Librarian& librarian)
{
   // Delegate to the Librarian, who actually knows how to check out a book.
   librarian.checkOut(book, *this);
}

In other words, we are immediately forced to add the preferred Librarian-based option anyway.

Then we also have to ask, if Patron provides this askToCheckOut function, what in this simulated world will call it? For all we know, that “ask…” is just an internal step of some larger process (simulating a patron walking into a library, returning some books, browsing, selecting a new book, asking a librarian to check it out) and is never used by objects other than the Patron itself.
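
As a rough sketch of that last point, the "ask" step can live as a private helper inside a larger Patron operation, while checkOut stays in the Librarian's public interface. Everything here beyond the names Librarian, Patron, Book, checkOut, and askToCheckOut is an assumption made for illustration: visitLibrary, loanRecords, and the string format of the loan record are hypothetical.

```cpp
#include <string>
#include <utility>
#include <vector>

struct Book {
    std::string title;
};

class Patron;  // forward declaration so Librarian can mention it

class Librarian {
public:
    // Public interface: other objects ask the librarian to do this.
    void checkOut(const Book& book, const Patron& forPatron);

    // Hypothetical accessor so the effect of checkOut can be observed.
    const std::vector<std::string>& loanRecords() const { return records; }

private:
    std::vector<std::string> records;
};

class Patron {
public:
    explicit Patron(std::string theName) : name(std::move(theName)) {}

    const std::string& getName() const { return name; }

    // The larger process (hypothetical): this is what the simulated world calls.
    void visitLibrary(Librarian& librarian, const std::vector<Book>& selections) {
        for (const Book& book : selections) {
            askToCheckOut(book, librarian);
        }
    }

private:
    // Internal step only: nothing outside Patron ever calls this.
    void askToCheckOut(const Book& book, Librarian& librarian) {
        librarian.checkOut(book, *this);
    }

    std::string name;
};

// Defined after Patron so that getName() is visible.
inline void Librarian::checkOut(const Book& book, const Patron& forPatron) {
    records.push_back(book.title + " -> " + forPatron.getName());
}
```

In this arrangement the Patron's card or class box would list the visit, not the asking, and Librarian would appear as a collaborator.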

5.2 When A does B to C

A useful rule of thumb is that if “A does B to C”, then B is usually a responsibility of C rather than of A, and A merely calls on it.

In pseudo-code terms, we might say

void A::fulfillSomeOtherResponsibilityOfA(C c1)
{
   ⋮
  c1.B();
   ⋮
}

in which case it is clear that B is a responsibility of class C, and C should be listed as a collaborator on A’s card (along with that “some other responsibility”).

5.2.1 Containers

A variation on this rule of thumb occurs when managing collections.

If I told you that “the librarian adds the book’s metadata to the catalog”, I would expect you to model that as

Catalog

add (metadata)

and not as

Metadata

addTo (:Catalog)

(Metadata refers to the set of properties that identify and describe a document or other collection of data. Typical metadata fields are author, title, date of publication, etc. “Metadata” is a perfect example of the type of specialized vocabulary that the people working in the Library World would understand and that you as a software developer assigned to that world would need to learn to use when communicating with them.)


Containers (cont.)

Again, by analogy with programming, we understand that if you had something like

vector<int> v;
set<int> myList;
   ⋮
v.push_back(23);
myList.insert (23);

you would regard the ability to push data onto the end of a vector or to add data to a set as operations of the vector and the set, not as operations of the int data type.

You would never say:

23.insertInto (myList);

Not only is this ugly, but it suggests that we would need to predict all the possible container classes that will ever be written to hold integers, and to somehow add their specific add/insert/push operations to the set of permitted operations of the int type.

That’s just not possible.

Similarly, we’d expect that metadata could be added to many other kinds of containers as well as the library catalog. The ability to hold multiple instances of metadata is a responsibility of the container, not of the thing contained.
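
Carrying the analogy back to the library, a minimal sketch might look like the following. The Metadata fields come from the "typical fields" mentioned earlier; the names yearPublished, size, and containsTitle are hypothetical, and the vector-based storage is just one assumed implementation.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// The thing contained: it knows nothing about any container.
struct Metadata {
    std::string author;
    std::string title;
    int yearPublished;
};

// The container: holding metadata is its responsibility.
class Catalog {
public:
    void add(const Metadata& metadata) { entries.push_back(metadata); }

    std::size_t size() const { return entries.size(); }

    // Hypothetical query, included only so that add() has a visible effect.
    bool containsTitle(const std::string& title) const {
        for (const Metadata& m : entries) {
            if (m.title == title) {
                return true;
            }
        }
        return false;
    }

private:
    std::vector<Metadata> entries;
};
```

Because Metadata has no addTo operation, the same record can also be pushed onto a plain vector<Metadata> (a wish list, say) without any change to Metadata itself.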

6 Brainstorming

CRC cards are often touted as the tool of choice for brainstorming sessions when a lot of people are hashing out ideas about possible classes.

Certainly, the informal act of scrawling text on an index card is consistent with that style of meeting. (I often see ads for apps and programs for drawing nice pretty CRC cards, but to me that kind of misses the whole point.)

But UML class diagrams don’t have to be particularly clumsy in that kind of setting.

Whether you work with CRC cards or UML class diagrams, the early process is much the same:

  1. Examine your available documents for noun phrases describing real-world things that are essential to describing the system being contemplated. These are your candidate classes.

    • Don’t worry about whether you think you are going to be automating these or not.

      As we will see later, we need to model the real-world external things that interact with our automated systems as well as the things that we will be simulating within the code.

  2. Also watch for verb phrases describing things that happen in that real world. These are your candidate operations.

  3. Try to match your candidate operations up with the candidate classes. Watch also for obvious attributes that contribute to those operations.

Some classes may turn out to be trivial. Some may be uninteresting because they don’t provide any of the candidate operations, nor are they likely to call on any of those operations.

What’s left is the core of your domain and/or analysis model.