# System Testing

Abstract

System testing adds particular challenges to testing because it is constrained to work with the actual system inputs and outputs. Compared to unit testing, we have less control over both the input supply and the capture and evaluation of the output.

In this lesson we look at tools for measuring quality of system testing. We will explore the difficulties that arise when dealing with non-text input and output, particularly graphical interfaces, and will look at some of the support available for testing at this level.

# 1 Integration and system testing

• Exercise larger portions of code than unit testing does.
• Validate the interactions between separate code features.

## 1.1 Unit to Integration

• Replace stubs and mocks by real code.
• Now they are integration tests!

More to the point, the problem is usually not converting from unit tests to integration tests. As we saw when looking at stubs & mocking, we often come up with integration tests first and then have to work hard to turn them into properly isolated unit tests.

So those “first drafts” at unit tests can often be saved to serve as integration tests.

• Supplement with tests of “interesting” interactions among modules.
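As a minimal sketch of "replace stubs and mocks by real code", the same check below is run twice: once against a stub and once against a real collaborator. All of the names (`PriceSource`, `StubPriceSource`, `TablePriceSource`, `OrderTotaler`) are hypothetical and chosen only for illustration.

```java
import java.util.Map;

class UnitToIntegrationDemo {
    public static void main(String[] args) {
        // Unit test: the collaborator is a stub, so only OrderTotaler is exercised.
        int unit = new OrderTotaler(new StubPriceSource()).total("apple", "pear");
        // Integration test: the same call, but with the real collaborator plugged in.
        int integration = new OrderTotaler(new TablePriceSource()).total("apple", "pear");
        System.out.println(unit + " " + integration);   // prints "200 125"
    }
}

// Hypothetical collaborator interface.
interface PriceSource {
    int priceInCents(String item);
}

// Stub used for isolated unit testing: every item costs 100 cents.
class StubPriceSource implements PriceSource {
    public int priceInCents(String item) {
        return 100;
    }
}

// "Real" implementation backed by a small price table.
class TablePriceSource implements PriceSource {
    private final Map<String, Integer> table = Map.of("apple", 50, "pear", 75);

    public int priceInCents(String item) {
        Integer price = table.get(item);
        if (price == null) {
            throw new IllegalArgumentException("unknown item: " + item);
        }
        return price;
    }
}

// The code under test: sums the prices of the requested items.
class OrderTotaler {
    private final PriceSource prices;

    OrderTotaler(PriceSource prices) {
        this.prices = prices;
    }

    int total(String... items) {
        int sum = 0;
        for (String item : items) {
            sum += prices.priceInCents(item);
        }
        return sum;
    }
}
```

Only the constructor argument changes between the two runs, which is why a "first draft" test written against real code can usually be kept as an integration test.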

## 1.2 Integration to System

• System testing is the limiting case of integration testing.

• Works with entire program’s inputs and outputs

• A challenge when inputs and outputs involve GUIs, databases, and other non-text, non-API components.

• Generally fewer tests than under unit and integration testing.

• But may run a lot longer

# 2 Testing Phases and the Build Manager

• Generally, unit tests should run quickly (a matter of seconds)

So we design the build to re-run these every time the code is recompiled.

• Integration tests and systems tests may take considerably longer.

Might only want to rerun these during more occasional “reporting” runs when we want a full report on the overall state of the project.

• Suggests distinct targets in the build manager.
• Probably easiest if we keep the different kinds of tests in separate directories

## 2.1 How often should we run integration/system tests?

• Once daily.
• When we push a set of local changes to the central repository.
• When merging changes into the master branch.

These kinds of runs are often triggered automatically.

## 2.2 Separating the Tests

For Java projects, we already have a separation of code into src/main and src/test. We might consider adding src/integrationTest and src/systemTest.

• I sometimes use src/itest and src/systest.

Each test directory typically has source code (e.g., src/itest/java) and maybe data files (e.g., src/systest/data).

## 2.3 Building the Tests

In build.gradle:

plugins {
    id 'java'
}

repositories {
    mavenCentral()   // jcenter() is deprecated
}

dependencies {
    testImplementation("junit:junit:4.12")
    testRuntimeOnly("org.junit.vintage:junit-vintage-engine:5.5.2")
}

test {
    useJUnit()
}

// Add integration test directory src/iTest
sourceSets {
    iTest {
        compileClasspath += sourceSets.main.output
        runtimeClasspath += sourceSets.main.output
    }
}

configurations {
    iTestImplementation.extendsFrom implementation
    iTestRuntimeOnly.extendsFrom runtimeOnly
}

dependencies {
    iTestImplementation 'junit:junit:4.12'
}

// Task that runs the tests compiled from src/iTest/java
task iTest(type: Test) {
    testClassesDirs = sourceSets.iTest.output.classesDirs
    classpath = sourceSets.iTest.runtimeClasspath
}


This adds to a Java project new tasks to compile and run tests from src/iTest/java.

You can configure these tasks independently, e.g.,

iTest.mustRunAfter test            // ➀

iTest {
    ignoreFailures = true          // ➁
}
test {
    ignoreFailures = false         // ➂
}

• ➀ It makes sense to run the integration tests only after the normal unit tests.

• ➁ If we want to generate reports of how many tests passed and failed, we probably need to make sure the build keeps going (so that we can get to the reporting tasks) even if some tests fail.

Keep in mind also that our integration tests are likely to start as things that fail, and will continue to fail until we actually get far enough in the project development for lots of the missing pieces to have been finally implemented.

• ➂ This is for the purpose of illustration only. I don’t know why you would want to kill your reporting after unit test failures. After all, in TDD, we expect such failures to be common and to persist for some time.

# 3 Test Coverage

Although we can monitor test coverage during unit test, it’s more common to do this during integration and system test.

• During Unit test, we are working with a lot of “fake” code (drivers and stubs/mocks).

• We certainly don’t care how well our drivers and stubs were covered!

• Integration and system testing get to more realistic “combinations” of operations.

## 3.1 Coverage Measures

We have previously reviewed:

• Black-Box Testing

• Equivalence partitioning
• Boundary-value testing
• Special-values testing
• White-Box Testing

• Structural Testing (a.k.a. “path testing”)

• Statement Coverage
• Branch Coverage
• Cyclomatic coverage (“independent path testing”)
• Data-flow Coverage
• Mutation testing
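To recall why statement coverage and branch coverage are different measures, here is a small sketch. The function `clampToZero` is hypothetical, not part of the lesson's code.

```java
class CoverageDemo {
    // Hypothetical function under test: negative values are replaced by zero.
    static int clampToZero(int x) {
        int result = x;
        if (x < 0) {
            result = 0;
        }
        return result;
    }

    public static void main(String[] args) {
        // This one test executes every statement, so statement coverage is 100%...
        System.out.println(clampToZero(-5));   // prints 0
        // ...but the case where "x < 0" is false is exercised only by a second
        // test, so branch coverage needs both calls.
        System.out.println(clampToZero(7));    // prints 7
    }
}
```

An `if` without an `else` is the classic case: a single test can touch every line and still leave one of the two branch outcomes untested.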

## 3.2 C/C++ - gcov

Monitoring Statement Coverage with gcov

• Coverage tool included with the GNU compiler suite (gcc, g++, etc.)

As an example, look at testing the three search functions in

arrayUtils.h
#ifndef ARRAYUTILS_H
#define ARRAYUTILS_H

// Add value to the end of the array.
//  - Assumes that we have a separate integer (size) indicating how
//     many elements are in the array
//  - and that the "true" size of the array is at least one larger
//      than the current value of that counter
template <typename T>
void addToEnd (T* array, int& size, T value)
{
    array[size] = value;
    ++size;
}

// Add value into array[index], shifting all elements already in positions
//    index..size-1 up one, to make room.
//  - Assumes that we have a separate integer (size) indicating how
//     many elements are in the array
//  - and that the "true" size of the array is at least one larger
//      than the current value of that counter

template <typename T>
void addElement (T* array, int& size, int index, T value)
{
    // Make room for the insertion
    int toBeMoved = size - 1;
    while (toBeMoved >= index) {
        array[toBeMoved+1] = array[toBeMoved];
        --toBeMoved;
    }
    // Insert the new value
    array[index] = value;
    ++size;
}

// Assume the elements of the array are already in order
// Find the position where value could be added to keep
//    everything in order, and insert it there.
// Return the position where it was inserted
//  - Assumes that we have a separate integer (size) indicating how
//     many elements are in the array
//  - and that the "true" size of the array is at least one larger
//      than the current value of that counter

template <typename T>
int addInOrder (T* array, int& size, T value)
{
    // Make room for the insertion
    int toBeMoved = size - 1;
    while (toBeMoved >= 0 && value < array[toBeMoved]) {
        array[toBeMoved+1] = array[toBeMoved];
        --toBeMoved;
    }
    // Insert the new value
    array[toBeMoved+1] = value;
    ++size;
    // Report where the value was placed
    return toBeMoved+1;
}

// Search an array for a given value, returning the index where
//    it was found, or -1 if it was not found.
template <typename T>
int seqSearch(const T list[], int listLength, T searchItem)
{
    int loc;

    for (loc = 0; loc < listLength; loc++)
        if (list[loc] == searchItem)
            return loc;

    return -1;
}

// Search an ordered array for a given value, returning the index where
//    it was found, or -1 if it was not found.
template <typename T>
int seqOrderedSearch(const T list[], int listLength, T searchItem)
{
    int loc = 0;

    while (loc < listLength && list[loc] < searchItem)
    {
        ++loc;
    }
    if (loc < listLength && list[loc] == searchItem)
        return loc;
    else
        return -1;
}

// Removes an element from the indicated position in the array, moving
// all elements in higher positions down one to fill in the gap.
template <typename T>
void removeElement (T* array, int& size, int index)
{
    int toBeMoved = index + 1;
    while (toBeMoved < size) {
        // Shift each later element down by one position
        array[toBeMoved-1] = array[toBeMoved];
        ++toBeMoved;
    }
    --size;
}

// Search an ordered array for a given value, returning the index where
//    it was found, or -1 if it was not found.
template <typename T>
int binarySearch(const T list[], int listLength, T searchItem)
{
    int first = 0;
    int last = listLength - 1;
    int mid;

    bool found = false;

    while (first <= last && !found)
    {
        mid = (first + last) / 2;

        if (list[mid] == searchItem)
            found = true;
        else
            if (searchItem < list[mid])
                last = mid - 1;
            else
                first = mid + 1;
    }

    if (found)
        return mid;
    else
        return -1;
}

#endif


with test driver

gcovDemo.cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

#include "arrayUtils.h"

using namespace std;

// Unit test driver for array search functions

int main(int argc, char** argv)
{
    // Repeatedly reads tests from cin
    // Each test consists of a line containing one or more words.
    // The first word is one that we want to search for. The
    // remaining words are placed into an array and represent the collection
    // we will search through.

    string line;
    getline (cin, line);
    while (cin)
    {
        istringstream in (line);
        cout << line << endl;
        string toSearchFor;
        in >> toSearchFor;
        int nWords = 0;
        string words[100];
        // Stop reading once the array is full
        while (nWords < 100 && in >> words[nWords])
            ++nWords;

        cout << seqSearch (words, nWords, toSearchFor)
             << " "
             << seqOrderedSearch (words, nWords, toSearchFor)
             << " "
             << binarySearch (words, nWords, toSearchFor)
             << endl;

        getline (cin, line);
    }

    return 0;
}


which reads data from a text stream (e.g., standard in), uses that data to construct arrays, and invokes each function on those arrays, printing the results of each.

Compiling for gcov Statement Coverage

• To use gcov, we compile with special options

• -fprofile-arcs -ftest-coverage
• When the code has been compiled, in addition to the usual files there will be several files with endings like .gcno

• These hold data on where the statements and branches in our code are.

Running Tests with gcov

• As you test, a *.gcda file will accumulate data on your test coverage.

• Run: gcov _mainProgram_
• The immediate output will be a report on the percentages of statements covered in each source code file.
• Also creates a *.gcov detailed report for each source code file. e.g.,

Sample Statement Coverage Report

        -:   69:template <typename T>
        -:   70:int seqSearch(const T list[], int listLength, T searchItem)
        -:   71:{
        1:   72:    int loc;
        -:   73:
        2:   74:    for (loc = 0; loc < listLength; loc++)
        2:   75:        if (list[loc] == searchItem)
        1:   76:            return loc;
        -:   77:
    #####:   78:    return -1;
        -:   79:}


• Report lists number of times each statement has been executed
• Lists ##### if a statement has never been executed

Monitoring Branch Coverage with gcov

gcov can report on branches taken.

• Just add options to the gcov command:
• gcov -b -c _mainProgram_

• gcov reports
• Number of times each function call successfully returned
• Number of times a branch was executed (i.e., how many times the branch condition was evaluated)
• and number of times each branch was taken

• For branch coverage, this is the relevant figure

But What is a “Branch”?

• A “branch” is anything that causes the code to not continue on in straight-line fashion

• Branch listed right after an “if” is the “branch” that jumps around the “then” part to go to the “else” part.
• && and || operators introduce their own branches
• Other branches may be hidden
• Contributed by calls to inline functions
• Or just a branch generated by the compiler’s code generator
• In practice, this can be very hard to interpret
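The point about && and || can be seen without a coverage tool. The sketch below is in Java rather than C++, and the counters and helper functions (`inRange`, `isSmaller`, `countSmaller`) are hypothetical additions; the loop condition mirrors the one in seqOrderedSearch. It counts how often each half of the condition is evaluated, which shows that the single `while` test really contains two separate decision points.

```java
class ShortCircuitDemo {
    static int leftEvaluations = 0;
    static int rightEvaluations = 0;

    // Left operand of the &&: is loc still inside the array?
    static boolean inRange(int loc, int length) {
        ++leftEvaluations;
        return loc < length;
    }

    // Right operand of the &&: evaluated only when the left operand was true.
    static boolean isSmaller(int[] list, int loc, int key) {
        ++rightEvaluations;
        return list[loc] < key;
    }

    // Same loop shape as seqOrderedSearch: advance while elements are smaller than key.
    static int countSmaller(int[] list, int key) {
        int loc = 0;
        while (inRange(loc, list.length) && isSmaller(list, loc, key)) {
            ++loc;
        }
        return loc;
    }

    public static void main(String[] args) {
        int[] data = {1, 3, 5};
        int loc = countSmaller(data, 9);
        // The left operand is tested once more than the right one, because the
        // final left test fails and short-circuits past the right operand.
        System.out.println(loc + " " + leftEvaluations + " " + rightEvaluations);  // prints "3 4 3"
    }
}
```

A branch-coverage tool has to report on each of those decision points, which is one reason a single source line can show several "branch" entries.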

Example: gcov Branch Coverage report

        -:   84:template <typename T>
        -:   85:int seqOrderedSearch(const T list[], int listLength, T searchItem)
        -:   86:{
        1:   87:    int loc = 0;
        -:   88:
        1:   89:    while (loc < listLength && list[loc] < searchItem)
branch  0 taken 0
call    1 returns 1
branch  2 taken 0
branch  3 taken 1
        -:   90:      {
    #####:   91:       ++loc;
branch  0 never executed
        -:   92:      }
        1:   93:    if (loc < listLength && list[loc] == searchItem)
branch  0 taken 0
call    1 returns 1
branch  2 taken 0
        1:   94:       return loc;
branch  0 taken 1
        -:   95:    else
    #####:   96:       return -1;
        -:   97:}


• Report is organized by basic blocks, straight-line sequences of code terminated by a branch or a call

• Hard to map to specific source code constructs

• Lowest-numbered branch is often the leftmost condition
• Fact of life that compilers insert branches and calls that are often invisible to us

## 3.3 Java

Java Coverage Tools

• Clover

• JaCoCo

• Part of the EclEmma project (Eclipse plugin for Emma)
• Emma, an older coverage tool, now replaced by JaCoCo

Clover

• Originally a commercial product from Atlassian; open-sourced in 2017 and continued as OpenClover

• integrates with Ant, Maven
• lots of reporting features
• Works in “traditional” coverage tool fashion

• Requires a “fork” of the build process to build a monitoring version
• Injects monitors into compiled code
• Test optimization: can re-run only those tests that covered changed code
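The idea behind test optimization can be sketched in a few lines. This is not Clover's implementation; it only illustrates the principle, and the class name, method name, and coverage data below are all hypothetical. Given a record of which source files each test touched on the last full run, we select only the tests whose covered files overlap the set of changed files.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TestSelectionDemo {
    // coverage maps each test name to the source files it executed on the last full run.
    // Returns, in alphabetical order, the tests that touched at least one changed file.
    static List<String> selectTests(Map<String, Set<String>> coverage, Set<String> changedFiles) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : coverage.entrySet()) {
            if (!Collections.disjoint(entry.getValue(), changedFiles)) {
                selected.add(entry.getKey());
            }
        }
        Collections.sort(selected);
        return selected;
    }

    public static void main(String[] args) {
        // Hypothetical coverage data from an earlier run.
        Map<String, Set<String>> coverage = new HashMap<>();
        coverage.put("SearchTest", Set.of("ArrayUtils.java"));
        coverage.put("ReportTest", Set.of("Report.java", "ArrayUtils.java"));
        coverage.put("LoginTest", Set.of("Login.java"));

        System.out.println(selectTests(coverage, Set.of("Report.java")));      // prints [ReportTest]
        System.out.println(selectTests(coverage, Set.of("ArrayUtils.java")));  // prints [ReportTest, SearchTest]
    }
}
```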

JaCoCo

Java Code Coverage

• line and branch coverage

• Instrumentation is done on the fly

• An “agent” monitors execution of normally compiled bytecode
• No special build required
• Works with Eclipse

• JaCoCo “started” as an Eclipse plug-in
• Works with Maven & Ant

• In Ant, wrap normal <java> and <junit> tasks inside a <jacoco:coverage> element

• Works with Gradle: just apply the plug-in.

Example: JaCoCo in Eclipse

Using JaCoCo in Eclipse

Once you have the plugin installed,

1. Open any Eclipse Java project.
2. Right-click on a unit test or executable. Look for “Coverage As”.
• Function of this is the same as “Run As” and “Debug As”, but monitors coverage during execution.

3. After execution has completed, coverage results are shown:

• A summary in the console area under the “Coverage” tab
• Details in the Java editor, as color coding
• Green means that all branches were covered
• Red means that none were covered.
• Yellow means that some were covered.

In build.gradle:

plugins {
    id 'java'
    id 'jacoco'
}
⋮
check.dependsOn jacocoTestReport


The last line is because I typically have a task named “check” that is my target for report generation. In other words, I plan to use

./gradlew check


to prepare all of my project reports, so I add a dependency between each kind of report I add and the check task.

Example: JaCoCo Report

# 4 Oracles

A testing oracle is the process, person, and/or program that determines if test output is correct.
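When the oracle is a program, it does not have to compare against a stored expected output; it can check properties that any correct output must have. The following is a minimal sketch of such an oracle for a hypothetical sorting program (the class and method names are illustrative only): the output is accepted if it is in order and contains exactly the same values as the input.

```java
import java.util.Arrays;

class SortOracleDemo {
    // Oracle: decides whether output is a correct result of sorting input.
    static boolean isCorrectSort(int[] input, int[] output) {
        if (input.length != output.length) {
            return false;
        }
        // Property 1: the output is in non-decreasing order.
        for (int i = 1; i < output.length; i++) {
            if (output[i - 1] > output[i]) {
                return false;
            }
        }
        // Property 2: the output holds the same values as the input.
        // Sorting copies of both gives a canonical form to compare.
        int[] a = input.clone();
        int[] b = output.clone();
        Arrays.sort(a);
        Arrays.sort(b);
        return Arrays.equals(a, b);
    }

    public static void main(String[] args) {
        int[] input = {3, 1, 2};
        System.out.println(isCorrectSort(input, new int[] {1, 2, 3}));  // prints true
        System.out.println(isCorrectSort(input, new int[] {1, 3, 2}));  // prints false (out of order)
        System.out.println(isCorrectSort(input, new int[] {1, 2, 4}));  // prints false (wrong values)
    }
}
```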

## 4.1 expect

Covered previously, expect is a shell for testing interactive programs.

• An extension of Tcl (a portable scripting language).

• Largely confined to text streams as input/output

## 4.2 *Unit

Can we use *Unit-style frameworks as oracles at the system test level?

• The very question is heresy to many *Unit advocates

• Particularly runs counter to the goals of the various Mock Objects projects
• But, why not?

• Such tests need not (and should not) come at the expense of having done earlier “proper” unit testing.

• Particularly in Java, MyClass.main(String[]) can be called just like any other function

• And System.in (cin) and System.out (cout) can be rerouted to/from files or internal strings

• Major limitation is the accessibility of system inputs & outputs.

• GUIs, databases, etc.
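Here is a minimal sketch of calling a program's main and rerouting System.in and System.out, written without JUnit so that it stands alone. The program under test (`Shout`, which echoes its input in upper case) and the helper `runWithInput` are hypothetical; in a real project the body of the demo's main would become a JUnit test method with an assertion.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.Scanner;

class SystemTestDemo {
    // Hypothetical "system under test": echoes each input line in upper case.
    static class Shout {
        public static void main(String[] args) {
            Scanner in = new Scanner(System.in, "UTF-8");
            while (in.hasNextLine()) {
                System.out.println(in.nextLine().toUpperCase(Locale.ROOT));
            }
        }
    }

    // Runs Shout.main with the given text as standard input and
    // returns everything the program wrote to standard output.
    static String runWithInput(String input) {
        InputStream oldIn = System.in;
        PrintStream oldOut = System.out;
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        try {
            System.setIn(new ByteArrayInputStream(input.getBytes(StandardCharsets.UTF_8)));
            System.setOut(new PrintStream(captured, true, StandardCharsets.UTF_8));
            Shout.main(new String[0]);
        } finally {
            // Always restore the real streams, even if the program throws.
            System.setIn(oldIn);
            System.setOut(oldOut);
        }
        return captured.toString(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String expected = "HELLO" + System.lineSeparator() + "WORLD" + System.lineSeparator();
        System.out.println(runWithInput("hello\nworld\n").equals(expected));  // prints true
    }
}
```

Restoring the original streams in a `finally` block matters: otherwise one failing system test would silently swallow the output of every test that runs after it.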

## 4.3 Testing GUI systems

• Scripting or record/playback: playing back input events for

• convenience & efficiency
• consistent reproducibility
• Capture of results

• Can occur at different levels
• event/message level
• graphics level
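To make event-level record/playback concrete, here is a toy sketch that involves no real GUI toolkit. Everything in it is hypothetical: `Event` stands for one recorded input event, and `FakeForm` stands in for a window with a text field and a button. Playback feeds the recorded events to a fresh form and captures the resulting state for checking.

```java
import java.util.List;

class RecordPlaybackDemo {
    // One recorded input event: which widget, what action, and an optional value.
    static final class Event {
        final String widget;
        final String action;
        final String value;

        Event(String widget, String action, String value) {
            this.widget = widget;
            this.action = action;
            this.value = value;
        }
    }

    // Hypothetical stand-in for a GUI form: a name field and a "greet" button.
    static final class FakeForm {
        private String name = "";
        private String greeting = "";

        void dispatch(Event e) {
            if (e.widget.equals("nameField") && e.action.equals("type")) {
                name = e.value;
            } else if (e.widget.equals("greetButton") && e.action.equals("click")) {
                greeting = "Hello, " + name + "!";
            } else {
                throw new IllegalArgumentException("unhandled event: " + e.widget + "/" + e.action);
            }
        }

        String greeting() {
            return greeting;
        }
    }

    // Playback: replay a recorded script against a fresh form and capture the result.
    static String playback(List<Event> script) {
        FakeForm form = new FakeForm();
        for (Event e : script) {
            form.dispatch(e);
        }
        return form.greeting();
    }

    public static void main(String[] args) {
        List<Event> recorded = List.of(
            new Event("nameField", "type", "Ada"),
            new Event("greetButton", "click", ""));
        System.out.println(playback(recorded));  // prints "Hello, Ada!"
    }
}
```

Because the script is ordinary data, it can be edited after recording to alter inputs, and the captured result can be passed to an assertion, which is the workflow the tools below support.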

Some Open Alternatives

Marathon

For Java GUIs

• Recorder captures AWT/swing events as JRuby scripts

• Scripts can then be edited to alter inputs, add assertions, etc.

def test

    $java_recorded_version = "1.6.0_24"

    with_window("Simple Widgets") {
        select("First Name", "Jalian Systems")
        assert_p("First Name", "Text", "Jalian Systems")
    }
end



Jemmy

Also for Java GUIs

• Tests scripted as Java

• Integrates with JUnit

## 4.4 Web systems

• A subproblem of GUI testing
• Simpler because input structure more constrained
• Output detail level is fixed (HTTP events)

Some Open Alternatives

## 4.5 Selenium

• Browser automation (Selenium IDE - Firefox add-on)
• Record & playback
• Or scripted (Selenium WebDriver)
• Firefox, IE, Safari, Opera, Chrome

Selenium Scripting

• Actions do things to elements.

E.g., click buttons, select options

• Accessors examine the application state

• Assertions validate the state

Each assertion has 3 modes

• assert: failure aborts the test
• verify: test continues, but failure is logged
• waitFor: conditions that may be true immediately or may become true within a specified time interval
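The difference between the three modes can be sketched in plain Java. This is not Selenium's implementation; the class and its methods (`assertThat`, `verifyThat`, `waitFor`) are hypothetical and exist only to show the three behaviors side by side.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BooleanSupplier;

class AssertionModesDemo {
    final List<String> log = new ArrayList<>();

    // assert mode: a failure aborts the test by throwing.
    void assertThat(String what, BooleanSupplier condition) {
        if (!condition.getAsBoolean()) {
            throw new AssertionError(what);
        }
    }

    // verify mode: a failure is logged, but the test continues.
    boolean verifyThat(String what, BooleanSupplier condition) {
        boolean ok = condition.getAsBoolean();
        if (!ok) {
            log.add("FAILED: " + what);
        }
        return ok;
    }

    // waitFor mode: poll until the condition holds or the time limit passes.
    boolean waitFor(String what, BooleanSupplier condition, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() >= deadline) {
                log.add("TIMED OUT: " + what);
                return false;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        AssertionModesDemo t = new AssertionModesDemo();

        t.verifyThat("title is Home", () -> false);          // logged, test goes on

        int[] polls = {0};                                    // condition becomes true on the 3rd poll
        boolean appeared = t.waitFor("element appears", () -> ++polls[0] >= 3, 1000);
        System.out.println(appeared + " " + t.log);           // prints "true [FAILED: title is Home]"

        try {
            t.assertThat("1 + 1 == 3", () -> 1 + 1 == 3);
            System.out.println("not reached");
        } catch (AssertionError e) {
            System.out.println("aborted: " + e.getMessage()); // prints "aborted: 1 + 1 == 3"
        }
    }
}
```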

Selenese

A typical scripting statement has the form

command parameter1 [parameter2]


Parameters can be

• locators for finding a UI element within a page (xpath)

• text patterns

• variable names

A Sample Selenium Script

<table>
<tr><td>verifyText</td><td>//h2</td><td>Terms and Conditions</td></tr>
<tr><td>clickAndWait</td><td>//input[@value="I agree"]</td><td></td></tr>
<tr><td>assertTitle</td><td></td><td>Product Selection</td></tr>
</table>



That’s right: it’s an HTML table.

A Selenium “test suite” is a web page with a table of links to web pages with test cases.

Selenium Webdriver

An alternate version of Selenium, Selenium WebDriver, is more code-oriented. It provides APIs for a variety of languages, allowing for very similar capabilities:

Select select = new Select(driver.findElement(
        By.tagName("select")));
select.deselectAll();
select.selectByVisibleText("Edam");



Selenium works by interacting with a “real” web browser (Firefox, Chrome) to simulate actions like clicking on or sending keystrokes to a web page element.

I’ve used Selenium Webdriver to implement some very nice web scraper applications.

Waiting

A tricky thing about testing web applications is the unknown amount of time that may be required to respond to a click or other interaction.

Valid tests often need to give very specific instructions to wait for pages to be loaded and for elements to become visible and clickable.

WebDriver driver = new FirefoxDriver();
// In Selenium 4 and later, use: new WebDriverWait(driver, Duration.ofSeconds(10))
WebElement myDynamicElement =
    new WebDriverWait(driver, 10)
        .until(ExpectedConditions.elementToBeClickable(
            By.id("myDynamicElement")));



Waits up to 10 seconds for an expected element to load and become active.