System Testing

System testing adds particular challenges to testing because it is constrained to work with the actual system inputs and outputs. Compared to unit testing, we have less control over both the input supply and the capture and evaluation of the output.

In this lesson we look at tools for measuring quality of system testing. We will explore the difficulties that arise when dealing with non-text input and output, particularly graphical interfaces, and will look at some of the support available for testing at this level.

1 Integration and system testing

exercises larger portions of code than unit testing does.
validates the interactions between separate code features.

1.1 Unit to Integration

Start with your unit tests.
- Replace stubs and mocks by real code.
- Now they are integration tests!
More to the point, the problem is usually not converting from unit tests to integration tests. As we saw when looking at stubs & mocking, we often come up with integration tests first and then have to work hard to turn them into properly isolated unit tests.

So those “first drafts” at unit tests can often be saved to serve as integration tests.
Supplement with tests of “interesting” interactions among modules.
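
A minimal sketch of that conversion. All names here (PriceSource, StubPriceSource, TablePriceSource, Invoice) are hypothetical and chosen only for illustration. The same call is exercised first with a stub and then with the real collaborator; the only thing that changes is the object handed to the code under test.

```java
import java.util.List;
import java.util.Map;

class UnitToIntegrationDemo {
    public static void main(String[] args) {
        List<String> order = List.of("apple", "pear");
        // Unit test: Invoice in isolation, collaborator replaced by a stub.
        System.out.println("unit: " + new Invoice(new StubPriceSource()).total(order));
        // Integration test: the same call, but with the real collaborator.
        System.out.println("integration: " + new Invoice(new TablePriceSource()).total(order));
    }
}

/** Hypothetical collaborator interface: price of an item, in cents. */
interface PriceSource {
    int priceOf(String item);
}

/** Stub used for isolated unit testing: every item costs 100 cents. */
class StubPriceSource implements PriceSource {
    @Override
    public int priceOf(String item) {
        return 100;
    }
}

/** The "real" implementation (still tiny here): prices come from a table. */
class TablePriceSource implements PriceSource {
    private final Map<String, Integer> prices = Map.of("apple", 50, "pear", 75);

    @Override
    public int priceOf(String item) {
        Integer price = prices.get(item);
        if (price == null) {
            throw new IllegalArgumentException("unknown item: " + item);
        }
        return price;
    }
}

/** Code under test: totals an order using whatever PriceSource it is given. */
class Invoice {
    private final PriceSource source;

    Invoice(PriceSource source) {
        this.source = source;
    }

    int total(List<String> items) {
        int sum = 0;
        for (String item : items) {
            sum += source.priceOf(item);
        }
        return sum;
    }
}
```

With the stub, a failure can only come from Invoice itself. With the real table, a failure might come from either class or from the way they interact, which is exactly what an integration test is meant to probe.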

1.2 Integration to System

System testing is the limiting case of integration testing.
Works with entire program’s inputs and outputs
- A challenge when inputs and outputs involve GUIs, databases, and other non-text, non-API components.
Generally fewer tests than under unit and integration testing.
- But may run a lot longer

2 Testing Phases and the Build Manager

Generally, unit tests should run quickly (a matter of seconds)

So we design the build to re-run these every time the code is recompiled.
Integration tests and systems tests may take considerably longer.

Might only want to rerun these during more occasional “reporting” runs when we want a full report on the overall state of the project.
- Suggests distinct targets in the build manager.
- Probably easiest if we keep the different kinds of tests in separate directories

2.1 How often should we run integration/system tests?

Plausible answers:

Once daily.
When we push a set of local changes to the central repository.
When merging changes into the master branch.

These kinds of runs are often triggered automatically.

Daily runs can be a simple timed script.

Runs triggered by repository actions are typically launched by a continuous integration server.

2.2 Separating the Tests

For Java projects, we already have a separation of code into src/main and src/test. We might consider adding src/integrationTest and src/systemTest.

I sometimes use src/itest and src/systest.

Each test directory typically has source code (e.g., src/itest/java) and maybe data files (e.g., src/systest/data).

2.3 Building the Tests

In Gradle, we can add test directories by declaring additional source sets, either directly, as below, or via the TestSets plugin.

In build.gradle:

plugins {
   id 'java'
}

repositories {
    mavenCentral()
}

dependencies {
    testImplementation("junit:junit:4.12")
    testRuntimeOnly("org.junit.vintage:junit-vintage-engine:5.5.2")
}

test {
    useJUnit()
}

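// Add integration test directory src/iTest
sourceSets {
    iTest {
        compileClasspath += sourceSets.main.output
        runtimeClasspath += sourceSets.main.output
    }
}

configurations {
    iTestImplementation.extendsFrom implementation
    iTestRuntimeOnly.extendsFrom runtimeOnly
}

dependencies {
    iTestImplementation 'junit:junit:4.12'
}

// Register a task that runs the tests compiled from the iTest source set
tasks.register('iTest', Test) {
    description = 'Runs integration tests.'
    group = 'verification'
    testClassesDirs = sourceSets.iTest.output.classesDirs
    classpath = sourceSets.iTest.runtimeClasspath
}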

This gives the Java project new tasks that compile and run the tests in src/iTest/java.

You can configure these tasks independently, e.g.,

iTest.mustRunAfter test  ➀

iTest {
    ignoreFailures = true          ➁
}
test {
    ignoreFailures = false         ➂
}

➀ Makes sense to run the integration tests only after the normal unit tests.
➁ If we want to generate reports of how many tests passed and failed, we probably need to make sure the build keeps going (so that we can get to the reporting tasks) even if some tests fail.

Keep in mind also that our integration tests are likely to start as things that fail, and will continue to fail until we actually get far enough in the project development for lots of the missing pieces to have been finally implemented.
➂ This is for the purpose of illustration only. I don’t know why you would want to kill your reporting after unit test failures. After all, in TDD, we expect such failures to be common and to persist for some time.

3 Test Coverage

Although we can monitor test coverage during unit test, it’s more common to do this during integration and system test.

During Unit test, we are working with a lot of “fake” code (drivers and stubs/mocks).
- We certainly don’t care how well our drivers and stubs were covered!
Integration and system testing get at more realistic “combinations” of operations.

3.1 Coverage Measures

We have previously reviewed:

Black-Box Testing
- Equivalence partitioning
- Boundary-value testing
- Special-values testing
White-Box Testing
- Structural Testing (a.k.a., “path testing”)
  - Statement Coverage
  - Branch Coverage
  - Cyclomatic coverage (“independent path testing”)
  - Data-flow Coverage
- Mutation testing

3.2 C/C++ - gcov

Monitoring Statement Coverage with gcov

A coverage tool included with the GNU compiler suite (gcc, g++, etc.)

As an example, look at testing the three search functions in

arrayUtils.h

#ifndef ARRAYUTILS_H
#define ARRAYUTILS_H



//  Add to the end
//  - Assumes that we have a separate integer (size) indicating how
//     many elements are in the array
//  - and that the "true" size of the array is at least one larger 
//      than the current value of that counter
template <typename T>
void addToEnd (T* array, int& size, T value)
{
   array[size] = value;
   ++size;
}



// Add value into array[index], shifting all elements already in positions
//    index..size-1 up one, to make room.
//  - Assumes that we have a separate integer (size) indicating how
//     many elements are in the array
//  - and that the "true" size of the array is at least one larger 
//      than the current value of that counter

template <typename T>
void addElement (T* array, int& size, int index, T value)
{
  // Make room for the insertion
  int toBeMoved = size - 1;
  while (toBeMoved >= index) {
    array[toBeMoved+1] = array[toBeMoved];
    --toBeMoved;
  }
  // Insert the new value
  array[index] = value;
  ++size;
}


// Assume the elements of the array are already in order
// Find the position where value could be added to keep
//    everything in order, and insert it there.
// Return the position where it was inserted
//  - Assumes that we have a separate integer (size) indicating how
//     many elements are in the array
//  - and that the "true" size of the array is at least one larger 
//      than the current value of that counter

template <typename T>
int addInOrder (T* array, int& size, T value)
{
  // Make room for the insertion
  int toBeMoved = size - 1;
  while (toBeMoved >= 0 && value < array[toBeMoved]) {
    array[toBeMoved+1] = array[toBeMoved];
    --toBeMoved;
  }
  // Insert the new value
  array[toBeMoved+1] = value;
  ++size;
  return toBeMoved+1;
}


// Search an array for a given value, returning the index where 
//    found or -1 if not found.
template <typename T>
int seqSearch(const T list[], int listLength, T searchItem)
{
    int loc;

    for (loc = 0; loc < listLength; loc++)
        if (list[loc] == searchItem)
            return loc;

    return -1;
}


// Search an ordered array for a given value, returning the index where 
//    found or -1 if not found.
template <typename T>
int seqOrderedSearch(const T list[], int listLength, T searchItem)
{
    int loc = 0;

    while (loc < listLength && list[loc] < searchItem)
      {
       ++loc;
      }
    if (loc < listLength && list[loc] == searchItem)
       return loc;
    else
       return -1;
}


// Removes an element from the indicated position in the array, moving
// all elements in higher positions down one to fill in the gap.
template <typename T>
void removeElement (T* array, int& size, int index)
{
  // Shift each element above index down one position
  int toBeMoved = index + 1;
  while (toBeMoved < size) {
    array[toBeMoved-1] = array[toBeMoved];
    ++toBeMoved;
  }
  --size;
}



// Search an ordered array for a given value, returning the index where 
//    found or -1 if not found.
template <typename T>
int binarySearch(const T list[], int listLength, T searchItem)
{
    int first = 0;
    int last = listLength - 1;
    int mid;

    bool found = false;

    while (first <= last && !found)
    {
        mid = (first + last) / 2;

        if (list[mid] == searchItem)
            found = true;
        else 
            if (searchItem < list[mid])
                last = mid - 1;
            else
                first = mid + 1;
    }

    if (found) 
        return mid;
    else
        return -1;
}





#endif

with test driver

gcovDemo.cpp

#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

#include "arrayUtils.h"

using namespace std;



// Unit test driver for array search functions





int main(int argc, char** argv)
{
  // Repeatedly reads tests from cin
  // Each test consists of a line containing one or more words. 
  // The first word is one that we want to search for. The
  // remaining words are placed into an array and represent the collection
  // we will search through.

  string line;
  getline (cin, line);
  while (cin)
    {
      istringstream in (line);
      cout << line << endl;
      string toSearchFor;
      in >> toSearchFor;
      int nWords = 0;
      string words[100];
      // Stop reading before we run past the end of the array
      while (nWords < 100 && in >> words[nWords])
        ++nWords;
      
      cout << seqSearch (words, nWords, toSearchFor)
	   << " "
	   << seqOrderedSearch (words, nWords, toSearchFor)
	   << " "
	   << binarySearch (words, nWords, toSearchFor)
	   << endl;

      getline (cin, line);
    }

  return 0;
}

which reads data from a text stream (e.g., standard in), uses that data to construct arrays, and invokes each function on those arrays, printing the results of each.

Compiling for gcov Statement Coverage

To use gcov, we compile with special options
- -fprofile-arcs -ftest-coverage
When the code has been compiled, in addition to the usual files there will be several files with endings like .gcno
- These hold data on where the statements and branches in our code are.

Running Tests with gcov

Run your tests normally.
As you test, a *.gcda file will accumulate data on your test coverage.

Viewing Your Report

Run: gcov mainProgram (substituting the name of your main source file)
- The immediate output will be a report on the percentages of statements covered in each source code file.
- Also creates a *.gcov detailed report for each source code file. e.g.,

Sample Statement Coverage Report

     -:   69:template <typename T>
     -:   70:int seqSearch(const T list[], int listLength, T searchItem)
     -:   71:{
     1:   72:    int loc;
     -:   73:
     2:   74:    for (loc = 0; loc < listLength; loc++)
     2:   75:        if (list[loc] == searchItem)
     1:   76:            return loc;
     -:   77:
 #####:   78:    return -1;
     -:   79:}

Report lists number of times each statement has been executed
- Lists ##### if a statement has never been executed

Monitoring Branch Coverage with gcov

gcov can report on branches taken.

Just add options to the gcov command:
- gcov -b -c mainProgram

Reading gcov Branch Info

gcov reports
- Number of times each function call successfully returned
- Number of times a branch was executed (i.e., how many times the branch condition was evaluated)
- and number of times each branch was taken
  - For branch coverage, this is the relevant figure

But What is a “Branch”?

A “branch” is anything that causes the code to not continue on in straight-line fashion
- Branch listed right after an “if” is the “branch” that jumps around the “then” part to go to the “else” part.
- && and || operators introduce their own branches
- Other branches may be hidden
  - Contributed by calls to inline functions
  - Or just a branch generated by the compiler’s code generator
In practice, this can be very hard to interpret
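
To see why && introduces its own branches, it can help to spell a short-circuit condition out by hand. The sketch below is a source-level illustration only: the method names are invented, and the branches a real compiler generates (and gcov numbers) will not line up exactly with the ones marked here.

```java
class ShortCircuitDemo {
    /** Ordered search written with &&, in the style of seqOrderedSearch. */
    static int search(int[] list, int item) {
        int loc = 0;
        while (loc < list.length && list[loc] < item) {
            ++loc;
        }
        return (loc < list.length && list[loc] == item) ? loc : -1;
    }

    /** The same search with each && written as a separate branch. */
    static int searchExpanded(int[] list, int item) {
        int loc = 0;
        while (true) {
            if (!(loc < list.length)) break;   // branch: left operand false
            if (!(list[loc] < item)) break;    // branch: right operand false
            ++loc;
        }
        if (loc < list.length) {               // branch: left operand of second &&
            if (list[loc] == item) {           // branch: right operand of second &&
                return loc;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] data = {2, 4, 6};
        // Both versions should agree on every probe value.
        for (int item = 1; item <= 7; item++) {
            System.out.println(item + ": " + search(data, item)
                    + " " + searchExpanded(data, item));
        }
    }
}
```

Two lines of source containing && turn into four separate decisions, each of which can be taken or not taken, which is roughly why a branch report shows several entries under a single source line.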

Example: gcov Branch Coverage report

        -:   84:template <typename T>
        -:   85:int seqOrderedSearch(const T list[], int listLength, T searchItem)
        -:   86:{
        1:   87:    int loc = 0;
        -:   88:
        1:   89:    while (loc < listLength && list[loc] < searchItem)
branch  0 taken 0
call    1 returns 1
branch  2 taken 0
branch  3 taken 1
        -:   90:      {
    #####:   91:       ++loc;
branch  0 never executed
        -:   92:      }
        1:   93:    if (loc < listLength && list[loc] == searchItem)
branch  0 taken 0
call    1 returns 1
branch  2 taken 0
        1:   94:       return loc;
branch  0 taken 1
        -:   95:    else
    #####:   96:       return -1;
        -:   97:}

Report is organized by basic blocks, straight-line sequences of code terminated by a branch or a call
Hard to map to specific source code constructs
- lowest-numbered branch is often the leftmost condition
- Fact of life that compilers insert branches and calls that are often invisible to us

3.3 Java

Java Coverage Tools

Clover
JaCoCo
- Part of the EclEmma project (Eclipse plugin for Emma)
- Emma, an older coverage tool, now replaced by JaCoCo

Clover

Originally a commercial product from Atlassian; open-sourced in 2017 and now maintained as OpenClover
- integrates with Ant, Maven
- lots of reporting features
Works in “traditional” coverage tool fashion
- Requires a “fork” of the build process to build a monitoring version
- Injects monitors into compiled code
Test optimization: can re-run only those tests that covered changed code
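
To make “injecting monitors” concrete, here is a hand-instrumented sketch. It is not what Clover actually generates; the hits array and the probe positions are assumptions made for illustration, showing only the general idea of counting how often each point in the code is reached.

```java
import java.util.Arrays;

class ProbeDemo {
    // One counter per monitored statement; a real tool would generate these.
    static final int[] hits = new int[4];

    /** Sequential search with hand-inserted coverage probes. */
    static int seqSearch(int[] list, int searchItem) {
        hits[0]++;                                // probe: function entry
        for (int loc = 0; loc < list.length; loc++) {
            hits[1]++;                            // probe: loop body
            if (list[loc] == searchItem) {
                hits[2]++;                        // probe: "found" return
                return loc;
            }
        }
        hits[3]++;                                // probe: "not found" return
        return -1;
    }

    public static void main(String[] args) {
        seqSearch(new int[] {3, 5, 8}, 5);
        // A zero marks a statement that no test has reached yet.
        System.out.println(Arrays.toString(hits));   // prints [1, 2, 1, 0]
    }
}
```

The zero in the last position plays the same role as an unexecuted-line marker in a coverage report: no test so far has searched for a value that is absent.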

JaCoCo

Java Code Coverage

line and branch coverage
Instrumentation is done on the fly
- An “agent” monitors execution of normally compiled bytecode
  - No special build required
Works with Eclipse
- JaCoCo “started” as an Eclipse plug-in
Works with Maven & Ant
- In Ant, wrap normal <java> and <junit> tasks inside a <jacoco:coverage> element
Works with Gradle
- Just apply the plug-in.

Example: JaCoCo in Eclipse

Using JaCoCo in Eclipse

Once you have the plugin installed,

Open any Eclipse Java project.
Right-click on a unit test or executable. Look for “Coverage As”
- Function of this is the same as “Run As” and “Debug As”, but monitors coverage during execution.
After execution has completed, coverage results are shown
- A summary in the console area under the “Coverage” tab
- Details in the Java editor, as color coding
  - Green means that all branches were covered
  - Red means that none were covered.
  - Yellow means that some were covered.

Example: JaCoCo in Gradle

In build.gradle:

plugins {
   id 'java'
   id 'jacoco'
}
⋮
check.dependsOn jacocoTestReport

The last line is because I typically have a task named “check” that is my target for report generation. In other words, I plan to use

./gradlew check

to prepare all of my project reports, so I add a dependency between each kind of report I add and the check task.


4 Oracles

A testing oracle is the process, person, and/or program that determines if test output is correct

4.1 expect

Covered previously, expect is a shell for testing interactive programs.

An extension of Tcl (a portable scripting language).
Largely confined to text streams as input/output

4.2 *Unit

Can we use *Unit-style frameworks as oracles at the system test level?

The very question is heresy to many *Unit advocates
- Particularly runs counter to the goals of the various Mock Objects projects
But, why not?
- Such tests need not (and should not) come at the expense of having done earlier “proper” unit testing.
- Particularly in Java, MyClass.main(String[]) can be called just like any other function
  - And System.in (cin) and System.out (cout) can be rerouted to/from files or internal strings
- Major limitation is the accessibility of system inputs & outputs.
  - GUIs, databases, etc.
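
A minimal sketch of the idea, assuming a hypothetical program class Shout that upper-cases each line of its standard input. The helper runMain is also an invented name. It reroutes System.in and System.out around a call to main and hands back the captured output, which a *Unit assertion could then compare against the expected text.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Scanner;

class SystemLevelOracleDemo {
    /** Runs Shout.main with the given stdin text and returns what it printed. */
    static String runMain(String stdin) {
        InputStream savedIn = System.in;
        PrintStream savedOut = System.out;
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        try {
            System.setIn(new ByteArrayInputStream(stdin.getBytes(StandardCharsets.UTF_8)));
            System.setOut(new PrintStream(captured, true, StandardCharsets.UTF_8));
            Shout.main(new String[0]);
        } finally {
            // Always restore the real streams, even if main throws.
            System.setIn(savedIn);
            System.setOut(savedOut);
        }
        return captured.toString(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String output = runMain("hello\nworld\n");
        String expected = "HELLO" + System.lineSeparator()
                + "WORLD" + System.lineSeparator();
        System.out.println(output.equals(expected) ? "PASS" : "FAIL");
    }
}

/** Hypothetical program under test: echoes each input line in upper case. */
class Shout {
    public static void main(String[] args) {
        Scanner in = new Scanner(System.in, StandardCharsets.UTF_8);
        while (in.hasNextLine()) {
            System.out.println(in.nextLine().toUpperCase(Locale.ROOT));
        }
    }
}
```

This only works because the whole interface of Shout is a pair of text streams. Once the program talks to a GUI or a database, there is no equally simple place to intercept its inputs and outputs.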

4.3 Testing GUI systems

Scripting or record/playback: playing back input events for
- convenience & efficiency
- consistent reproducibility
Capture of results
- Can occur at different levels
  - event/message level
  - graphics level

Some Open Alternatives

Marathon - free in limited version
Jemmy

Marathon

For Java GUIs

Recorder captures AWT/Swing events as JRuby scripts
Scripts can then be edited to alter inputs, add assertions, etc.

def test

   $java_recorded_version = "1.6.0_24"

   with_window("Simple Widgets") {
       select("First Name", "Jalian Systems")
       select("Password", "Secret")
       assert_p("First Name", "Text", "Jalian Systems")
    }
end

Jemmy

Also for Java GUIs

Tests scripted as Java
Integrates with JUnit

4.4 Web systems

A subproblem of GUI testing
- Simpler because input structure more constrained
- Output detail level is fixed (HTTP events)

Some Open Alternatives

4.5 Selenium

Browser automation (SeleniumIDE - Firefox add-on)
- Record & playback
- Or scripted (Selenium Webdriver)
  - Firefox, IE, Safari, Opera, Chrome

Selenium Scripting

Actions do things to elements.

E.g., click buttons, select options
Accessors examine the application state
Assertions validate the state

Each assertion has 3 modes
- assert: failure aborts the test
- verify: test continues, but failure is logged
- waitFor: conditions that may be true immediately or may become true within a specified time interval
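
The difference between the first two modes can be sketched in a few lines of plain Java. This is not Selenium code; assertThat, verifyThat, and the log list are invented names, and the strings stand in for values that would normally be read from a web page.

```java
import java.util.ArrayList;
import java.util.List;

class AssertionModesDemo {
    /** Failures recorded by verify-style checks. */
    static final List<String> log = new ArrayList<>();

    /** "assert" mode: a failed check aborts the test immediately. */
    static void assertThat(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    /** "verify" mode: a failed check is logged and the test continues. */
    static void verifyThat(boolean condition, String message) {
        if (!condition) {
            log.add(message);
        }
    }

    public static void main(String[] args) {
        String title = "Downloads";            // stand-in for a page title
        String heading = "Terms of Service";   // stand-in for an <h2> text

        verifyThat(heading.equals("Terms and Conditions"), "unexpected heading");
        System.out.println("still running; logged failures: " + log);

        assertThat(title.equals("Downloads"), "unexpected title");   // passes
        try {
            assertThat(title.equals("Product Selection"), "unexpected title");
            System.out.println("not reached");
        } catch (AssertionError e) {
            System.out.println("test aborted: " + e.getMessage());
        }
    }
}
```

A reasonable rule of thumb is to use the assert mode for conditions that make the rest of the test meaningless if they fail (we are on the wrong page) and the verify mode for details that are worth reporting but do not block later steps.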

Selenese

A typical scripting statement has the form

command parameter1 [parameter2]

Parameters can be

locators for finding a UI element within a page (xpath)
text patterns
variable names

A Sample Selenium Script

<table>
  <tr><td>open</td><td>http://mySite.com/downloads/</td><td></td></tr>
  <tr><td>assertTitle</td><td></td><td>Downloads</td></tr>
  <tr><td>verifyText</td><td>//h2</td><td>Terms and Conditions</td></tr>
  <tr><td>clickAndWait</td><td>//input[@value="I agree"]</td><td></td></tr>
  <tr><td>assertTitle</td><td></td><td>Product Selection</td></tr>
</table>

That’s right – it’s an HTML table:

open	http://mySite.com/downloads/
assertTitle		Downloads
verifyText	//h2	Terms and Conditions
clickAndWait	//input[@value="I agree"]
assertTitle		Product Selection

A Selenium “test suite” is a web page with a table of links to web pages with test cases.

Selenium Webdriver

An alternate version of Selenium is more code-oriented. It provides APIs for a variety of languages, allowing for very similar capabilities:

Select select = new Select(driver.findElement(
       By.tagName("select")));
select.deselectAll();
select.selectByVisibleText("Edam");

Selenium works by interacting with a “real” web browser (Firefox, Chrome) to simulate actions like clicking on or sending keystrokes to a web page element.

I’ve used Selenium Webdriver to implement some very nice web scraper applications.

Waiting

A tricky thing about testing web applications is the unknown amount of time that may be required to respond to a click or other interaction.

Valid tests often need to give very specific instructions to wait for pages to be loaded and for elements to become visible and clickable.

WebDriver driver = new FirefoxDriver();
driver.get("http://somedomain/url_that_delays_loading");
// Selenium 4 takes the timeout as a java.time.Duration
WebElement myDynamicElement = (
  new WebDriverWait(driver, Duration.ofSeconds(10)))
  .until(ExpectedConditions.elementToBeClickable(
            By.id("myDynamicElement")));

Waits up to 10 seconds for an expected element to load and become active.
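
The underlying idea is a polling loop with a deadline. The sketch below shows that idea in plain Java. It is not Selenium's implementation; waitUntil and its parameters are hypothetical names, and the "element" is simulated by a condition that becomes true after 200 milliseconds.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

class PollingWaitDemo {
    /**
     * Polls the condition until it holds or the timeout expires.
     * Returns true if the condition became true in time.
     */
    static boolean waitUntil(BooleanSupplier condition, Duration timeout,
                             Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;                     // timed out
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();   // preserve interrupt status
                return false;
            }
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        // Stand-in for "the element has become clickable": true after 200 ms.
        BooleanSupplier ready = () -> System.nanoTime() - start >= 200_000_000L;

        System.out.println(waitUntil(ready, Duration.ofSeconds(2), Duration.ofMillis(50)));
        System.out.println(waitUntil(() -> false, Duration.ofMillis(300), Duration.ofMillis(50)));
    }
}
```

Polling with a deadline is usually preferable to a fixed sleep: the test proceeds as soon as the page is ready, and a page that never becomes ready produces a clear timeout rather than a confusing failure further on.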