Documentation and Documentation Generators

Steven J Zeil

Last modified: Dec 27, 2023

… because everyone loves writing documentation.

1 Source Code Documentation

1.1 Comments

1.1.1 Do Comments Matter?

McConnell has a good, balanced discussion of this question.

The modern focus has shifted considerably away from commenting code bodies and towards API documentation.

1.1.2 Which is better?

double m; // mean average
double s; // standard deviation

double meanAverage;
double standardDeviation;

1.1.3 Which is better?

// Sum up the data
double sum = 0.0;
double sumSquares = 0.0;
// Add up the sums
for (double d: scores)
{
   sum += d;
   sumSquares += d*d;
}

// Compute the average and standard
//  deviation
double meanAverage = sum / numScores;
double standardDeviation =
   sqrt ((sumSquares - sum*sum/numScores)
            /(numScores - 1.0));

// Subtract the average from each data
// item and divide by the standard
// deviation.
for (int i = 0; i < numScores; ++i)
{
   scores[i] = (scores[i] - meanAverage)
       / standardDeviation;
}

// Compute summary statistics
double sum = 0.0;
double sumSquares = 0.0;

for (double d: scores)
{
   sum += d;
   sumSquares += d*d;
}

double meanAverage = sum / numScores;
double standardDeviation =
   sqrt ((sumSquares - sum*sum/numScores)
            / (numScores - 1.0));

// Normalize the scores
for (int i = 0; i < numScores; ++i)
{
   scores[i] = (scores[i] - meanAverage)
       / standardDeviation;
}

1.1.4 Which is better?

// Compute summary statistics
double sum = 0.0;
double sumSquares = 0.0;

for (double d: scores)
{
   sum += d;
   sumSquares += d*d;
}

double meanAverage = sum / numScores;
double standardDeviation =
   sqrt ((sumSquares - sum*sum/numScores)
            /(numScores - 1.0));

// Normalize the scores
for (int i = 0; i < numScores; ++i)
   scores[i] = (scores[i] - meanAverage)
       / standardDeviation;

void computeSummaryStatistics (
   const double* scores,      // inputs
   int numScores,
   double& meanAverage,       // outputs
   double& standardDeviation)
{
  double sum = 0.0;
  double sumSquares = 0.0;
  for (int i = 0; i < numScores; ++i)
  {
     sum += scores[i];
     sumSquares += scores[i]*scores[i];
  }

  meanAverage = sum / numScores;
  standardDeviation =
     sqrt ((sumSquares - sum*sum/numScores)
              /(numScores - 1.0));
}


void normalizeData (double* data,
                    int numData,
                    double center,
                    double spread)
{
  for (int i = 0; i < numData; ++i)
    data[i] = (data[i] - center) / spread;
}

    ⋮

double meanAverage;
double standardDeviation;
computeSummaryStatistics (scores, numScores,
    meanAverage, standardDeviation);
normalizeData (scores, numScores,
    meanAverage, standardDeviation);

1.1.5 Which is better?

void computeSummaryStatistics (
   const double* scores,      // inputs
   int numScores,
   double& meanAverage,       // outputs
   double& standardDeviation)
{
  double sum = 0.0;
  double sumSquares = 0.0;

  for (int i = 0; i < numScores; ++i)
  {
     sum += scores[i];
     sumSquares += scores[i]*scores[i];
  }

  meanAverage = sum / numScores;
  standardDeviation =
     sqrt ((sumSquares - sum*sum/numScores)
              /(numScores - 1.0));
}


void normalizeData (double* data,
                    int numData,
                    double center,
                    double spread)
{
  for (int i = 0; i < numData; ++i)
	 data[i] = (data[i] - center) / spread;
}
    ⋮

double meanAverage;
double standardDeviation;
computeSummaryStatistics (scores, numScores,
    meanAverage, standardDeviation);
normalizeData (scores, numScores,
    meanAverage, standardDeviation);

void computeSummaryStatistics (
   const double* scores,      // inputs
   int numScores,
   double& meanAverage,       // outputs
   double& standardDeviation)
{
  double sum = accumulate(
     scores, scores+numScores, 0.0);
  double sumSquares = accumulate(
     scores, scores+numScores, 0.0,
     [](double x, double y)
       {return x + y*y;});

  meanAverage = sum / numScores;
  standardDeviation =
     sqrt ((sumSquares - sum*sum/numScores)
              /(numScores - 1.0));
}


    ⋮

// Normalize the scores
double meanAverage;
double standardDeviation;
computeSummaryStatistics (scores, numScores,
    meanAverage, standardDeviation);
transform (
    scores, scores+numScores,
    scores,
    [=] (double d) {
      return (d - meanAverage)
               / standardDeviation;});

1.1.6 Kinds of Comments

1.2 Self-Documenting Code

Self-documenting code relies on good programming style to carry most of the documentation burden.

1.2.1 Characteristics of Self-Documenting Code

Classes

  • Does the class’s interface present a consistent abstraction?

  • Is the class well named, and does its name describe its central purpose?

  • Does the class’s interface make obvious how you should use the class?

  • Is the class’s interface abstract enough that you don’t have to think about how its services are implemented? Can you treat the class as a black box?

Routines

  • Does each routine’s name describe exactly what the routine does?

  • Does each routine perform one well-defined task?

  • Have all parts of each routine that would benefit from being put into their own routines been put into their own routines?

  • Is each routine’s interface obvious and clear?

Data Names

  • Are type names descriptive enough to help document data declarations?

  • Are variables named well?

  • Are variables used only for the purpose for which they’re named?

  • Are loop counters given more informative names than i, j, and k?

  • Are well-named enumerated types used instead of makeshift flags or boolean variables?

  • Are named constants used instead of magic numbers or magic strings?

  • Do naming conventions distinguish among type names, enumerated types, named constants, local variables, class variables, and global variables?

Data Organization

  • Are extra variables used for clarity when needed?

  • Are references to variables close together?

  • Are data types simple so that they minimize complexity?

  • Is complicated data accessed through abstract access routines (abstract data types)?

Control

  • Is the nominal path through the code clear?

  • Are related statements grouped together?

  • Have relatively independent groups of statements been packaged into their own routines?

  • Does the normal case follow the if rather than the else?

  • Are control structures simple so that they minimize complexity?

  • Does each loop perform one and only one function, as a well-defined routine would?

  • Is nesting minimized?

  • Have boolean expressions been simplified by using additional boolean variables, boolean functions, and decision tables?

Layout

  • Does the program’s layout show its logical structure?

Design

  • Is the code straightforward, and does it avoid cleverness?

  • Are implementation details hidden as much as possible?

  • Is the program written in terms of the problem domain as much as possible rather than in terms of computer-science or programming-language structures?

(McConnell, ch 32)
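
A few of the checklist items above can be illustrated with a small sketch. Everything in it (the ShippingQuote class, its constants, and the rates) is hypothetical and invented purely for illustration. It tries to show named constants in place of magic numbers, an enumerated type in place of a makeshift flag, and an extra boolean variable used to simplify a condition.

```java
// Hypothetical example: all names and rates are invented for illustration.
final class ShippingQuote {

    // An enumerated type instead of a makeshift int or boolean flag.
    enum DeliverySpeed { STANDARD, EXPRESS }

    // Named constants instead of magic numbers.
    static final double BASE_RATE_PER_KG = 2.50;
    static final double EXPRESS_SURCHARGE = 10.00;
    static final double FREE_SHIPPING_THRESHOLD = 100.00;

    // One routine, one well-defined task, named for what it returns.
    static double shippingCost(double orderTotal,
                               double weightKg,
                               DeliverySpeed speed) {
        // An extra boolean variable names the condition being tested.
        boolean qualifiesForFreeShipping =
            orderTotal >= FREE_SHIPPING_THRESHOLD
                && speed == DeliverySpeed.STANDARD;
        if (qualifiesForFreeShipping) {
            return 0.0;
        }

        double cost = weightKg * BASE_RATE_PER_KG;
        if (speed == DeliverySpeed.EXPRESS) {
            cost += EXPRESS_SURCHARGE;
        }
        return cost;
    }
}
```

Compare this with a version that takes a boolean "flag" parameter and tests "total >= 100 && !flag": the logic is identical, but the reader must reconstruct the meaning that the names above state directly.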

1.3 Charting

How many forms of software documentation charting do you know?

1.3.1 From Code to Charts

1.3.2 From Charts to Code

A hallmark of so-called CASE (Computer-Aided Software Engineering) systems

2 API Documentation

As stated earlier, modern views on code documentation have shifted considerably away from commenting code bodies and instead emphasize API documentation.

At the same time, API documentation tools are now very common. These tools

2.1 javadoc

Perhaps the best known tool in this category

2.1.1 Javadoc Comments
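
A Javadoc comment is a block comment that opens with /** and is placed immediately before the declaration it describes. The following is only a minimal sketch: the ScoreStatistics class and its methods are hypothetical, chosen to echo the statistics code from earlier in these notes.

```java
/**
 * Computes summary statistics for an array of scores.
 * The first sentence of each comment serves as its summary.
 */
final class ScoreStatistics {

    /**
     * Returns the mean average of the scores.
     *
     * @param scores the data values; assumed to be non-empty
     * @return the sum of the values divided by their count
     */
    static double meanAverage(double[] scores) {
        double sum = 0.0;
        for (double d : scores) {
            sum += d;
        }
        return sum / scores.length;
    }

    /**
     * Returns the sample standard deviation of the scores.
     *
     * @param scores the data values; assumed to hold at least two
     * @return the square root of the summed squared deviations
     *         from the mean, divided by one less than the count
     */
    static double standardDeviation(double[] scores) {
        double mean = meanAverage(scores);
        double sumSquaredDeviations = 0.0;
        for (double d : scores) {
            sumSquaredDeviations += (d - mean) * (d - mean);
        }
        return Math.sqrt(sumSquaredDeviations / (scores.length - 1.0));
    }
}
```

By default javadoc documents only public and protected classes and members, so in real code ScoreStatistics would normally be declared public in its own ScoreStatistics.java file. It is left package-private here only to keep the snippet independent of file names.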


Common Javadoc Markup
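
As a rough illustration of the markup seen most often, the sketch below combines block tags (lines beginning with @), inline tags (written in braces), and a little embedded HTML. The ScoreNormalizer class and the author name are hypothetical. This is a sample of commonly used tags, not a complete list.

```java
/**
 * Rescales scores around a center value.
 *
 * <p>HTML such as this paragraph tag is passed through to the
 * generated page. Inline tags sit inside braces: {@code spread}
 * is set in code font, and
 * {@link #normalize(double[], double, double)} becomes a
 * hyperlink to that method's documentation.
 *
 * @author A. Student (hypothetical)
 * @version 1.0
 * @since 1.0
 * @see java.util.Arrays
 */
final class ScoreNormalizer {

    /**
     * Normalizes the data in place.
     *
     * @param data   the values to rescale; modified in place
     * @param center the value subtracted from each element
     * @param spread the value each difference is divided by
     * @throws IllegalArgumentException if {@code spread} is zero
     */
    static void normalize(double[] data, double center, double spread) {
        if (spread == 0.0) {
            throw new IllegalArgumentException("spread must be nonzero");
        }
        for (int i = 0; i < data.length; ++i) {
            data[i] = (data[i] - center) / spread;
        }
    }
}
```

The @author and @version tags appear in the generated pages only when javadoc is asked to include them, which is what the Author and Version attributes in the Ant example later in this section request.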


Running javadoc

2.1.2 Javadoc and build managers

Ant

Ant has a javadoc task among its default task set.

A typical invocation might be:

<javadoc packagenames="edu.odu.cs.*"
         destdir="target/javadoc"
         classpathref="javadoc.classpath" Author="yes"
         Version="yes" Use="yes" defaultexcludes="yes">
   <fileset dir="." defaultexcludes="yes">
      <include name="extractor/src/main/java/**" />
      <exclude name="**/*.html" />
   </fileset>
   <doctitle><![CDATA[<h1>ODU CS Extract
                    Project</h1>]]></doctitle>
</javadoc>


Gradle

The ‘java’ plugin for Gradle provides a javadoc task.

The Javadoc report will be found in build/docs/javadoc. (Remember, all files produced by the Gradle build process are stored in build/.)

If you want the Javadoc regenerated every time you run a build, make the build task depend on the javadoc task. To do that, add this to the build.gradle file:

build.dependsOn javadoc

The default Javadoc processing in Gradle does not include linking to other libraries, such as the Java API. You can add that ability by changing some options on the javadoc task:

javadoc {
   options.with {
       links 'https://docs.oracle.com/javase/8/docs/api/', 'gradle/javadocs/jdk'
       }
}

The links option takes two parameters. One identifies the URL where the external library has published its own Javadocs. The other identifies the library that is documented at that URL.

Finally, you may find that, if you have any unit test failures, your Javadoc reports (and other reports that we will be producing in the coming weeks) do not get updated because the build stops at the first test failure.

But, if we are doing test-driven development, we expect to be failing unit tests pretty much from day one up until the completion of the project. So it often makes sense to not halt Gradle after a test failure. We can indicate this by modifying an option on the test task:

test {
    ignoreFailures = true
}

Here is the full build.gradle for a simple Java build with Javadoc generation.


2.2 doxygen
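
Doxygen reads Javadoc-style comment blocks and also provides its own commands, which may be written with either an @ or a backslash prefix. The sketch below uses Java to stay consistent with the Javadoc examples, although doxygen is probably more often seen on C and C++ code. The ScoreTally class is hypothetical.

```java
/**
 * @brief Running tally of scores (hypothetical example class).
 *
 * Text after the brief line forms the detailed description.
 */
final class ScoreTally {
    private double sum = 0.0;  ///< Sum of all scores added so far.
    private int count = 0;     ///< Number of scores added so far.

    /**
     * @brief Adds one score to the tally.
     * @param score the value to add
     */
    void add(double score) {
        sum += score;
        ++count;
    }

    /**
     * @brief Reports the mean of the scores added so far.
     * @return the mean, or 0.0 if no scores have been added
     * @note The zero result for an empty tally is a choice made
     *       for this sketch, not a doxygen requirement.
     */
    double mean() {
        return (count == 0) ? 0.0 : sum / count;
    }
}
```

The trailing ///< form documents the member declared on the same line. To the Java compiler it is an ordinary line comment.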


Running doxygen

2.3 Other API Documentation Generators

Because a documentation generator needs to understand module and function structure, including function parameters, a distinct parser is needed for each programming language.

This leads to a variety of language-specific tools, e.g.,