System Testing
Steven J Zeil
Abstract
System testing adds particular challenges to testing because it is constrained to work with the actual system inputs and outputs. Compared to unit testing, we have less control over both the input supply and the capture and evaluation of the output.
In this lesson we look at tools for measuring quality of system testing. We will explore the difficulties that arise when dealing with non-text input and output, particularly graphical interfaces, and will look at some of the support available for testing at this level.
1 Integration and system testing
- exercise larger portions of code than does unit testing.
- validates the interactions between separate code features.
1.1 Unit to Integration
- Start with your unit tests.
  - Replace stubs and mocks by real code.
  - Now they are integration tests!
More to the point, the problem is usually not converting unit tests into integration tests. As we saw when looking at stubs & mocking, it’s common to come up with integration tests first and then have to work hard to turn them into properly isolated unit tests.
So those “first drafts” at unit tests can often be saved to serve as integration tests.
- Supplement with tests of “interesting” interactions among modules.
1.2 Integration to System
- System testing is the limiting case of integration testing.
  - Works with the entire program’s inputs and outputs
  - A challenge when inputs and outputs involve GUIs, databases, and other non-text, non-API components.
- Generally fewer tests than under unit and integration testing.
  - But they may run a lot longer.
2 Testing Phases and the Build Manager
- Generally, unit tests should run quickly (a matter of seconds)
  - So we design the build to re-run these every time the code is recompiled.
- Integration tests and system tests may take considerably longer.
  - We might only want to rerun these during more occasional “reporting” runs, when we want a full report on the overall state of the project.
- This suggests distinct targets in the build manager.
- Probably easiest if we keep the different kinds of tests in separate directories.
2.1 How often should we run integration/system tests?
Plausible answers:
- Once daily.
- When we push a set of local changes to the central repository.
- When merging changes into the master branch.
These kinds of runs are often triggered automatically.
- Daily runs can be a simple timed script.
- Runs triggered by repository actions are handled by continuous integration (CI) tools.
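The “once daily” option can be as simple as a timed script. A minimal sketch, assuming a crontab-style scheduler (the project path and the `check` task name here are invented for illustration):

```
# crontab entry: run the full test/reporting build at 2:00 AM every day
# (project path and Gradle task name are hypothetical)
0 2 * * * cd /home/build/myProject && ./gradlew check
```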
2.2 Separating the Tests
For Java projects, we already have a separation of code into src/main and src/test. We might consider adding src/integrationTest and src/systemTest.
- I sometimes use src/itest and src/systest.
Each test directory typically has source code (e.g., src/itest/java) and maybe data files (e.g., src/systest/data).
2.3 Building the Tests
In Gradle, we can add test directories either with the third-party TestSets plugin or, as shown here, by declaring additional source sets by hand.
In build.gradle:
plugins {
id 'java'
}
repositories {
mavenCentral()
}
dependencies {
testImplementation("junit:junit:4.12")
testRuntimeOnly("org.junit.vintage:junit-vintage-engine:5.5.2")
}
test {
useJUnit()
}
// Add integration test directory itest
sourceSets {
iTest {
compileClasspath += sourceSets.main.output
runtimeClasspath += sourceSets.main.output
}
}
configurations {
iTestImplementation.extendsFrom implementation
iTestRuntimeOnly.extendsFrom runtimeOnly
}
dependencies {
iTestImplementation 'junit:junit:4.12'
}
// Register a task to run the tests compiled from src/iTest/java
task iTest(type: Test) {
testClassesDirs = sourceSets.iTest.output.classesDirs
classpath = sourceSets.iTest.runtimeClasspath
}
This adds to a Java project new tasks to compile and run tests from src/iTest/java.
You can configure these tasks independently, e.g.,
iTest.mustRunAfter test ➀
iTest {
ignoreFailures = true ➁
}
test {
ignoreFailures = false ➂
}
- ➀ It makes sense to run the integration tests only after the normal unit tests.
- ➁ If we want to generate reports of how many tests passed and failed, we probably need to make sure the build keeps going (so that we can get to the reporting tasks) even if some tests fail.
  Keep in mind also that our integration tests are likely to fail at first, and will continue to fail until we get far enough into project development for lots of the missing pieces to have finally been implemented.
- ➂ This is for the purpose of illustration only. I don’t know why you would want to kill your reporting after unit test failures. After all, in TDD, we expect such failures to be common and to persist for some time.
3 Test Coverage
Although we can monitor test coverage during unit test, it’s more common to do this during integration and system test.
- During unit test, we are working with a lot of “fake” code (drivers and stubs/mocks).
  - We certainly don’t care how well our drivers and stubs were covered!
- Integration and system testing gets to more realistic combinations of operations.
3.1 Coverage Measures
We have previously reviewed:
- Black-Box Testing
  - Equivalence partitioning
  - Boundary-value testing
  - Special-values testing
- White-Box Testing
  - Structural Testing (a.k.a. “path testing”)
    - Statement Coverage
    - Branch Coverage
    - Cyclomatic coverage (“independent path testing”)
    - Data-flow Coverage
  - Mutation testing
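To see how the structural measures differ in practice, consider a small sketch (the function here is invented for illustration): a single test can achieve 100% statement coverage while leaving a branch outcome unexercised.

```java
public class CoverageDemo {
    // Illustrative function: clamps negative values up to zero.
    static int clampNegativeToZero(int x) {
        int result = x;
        if (x < 0) {       // a single test with x < 0 executes every statement...
            result = 0;
        }
        return result;     // ...yet never takes the branch where the test is false
    }

    public static void main(String[] args) {
        // Statement coverage: this one call alone touches every statement above.
        System.out.println(clampNegativeToZero(-1));
        // Branch coverage additionally demands a case where (x < 0) is false:
        System.out.println(clampNegativeToZero(5));
    }
}
```

Statement coverage is satisfied after the first call; a branch-coverage tool would still flag the untaken “false” outcome until the second call runs.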
3.2 C/C++ - gcov
Monitoring Statement Coverage with gcov
- gcov is a coverage tool included with the GNU compiler suite (gcc, g++, etc.)
As an example, look at testing the three search functions in arrayUtils.h:
#ifndef ARRAYUTILS_H
#define ARRAYUTILS_H
// Add to the end
// - Assumes that we have a separate integer (size) indicating how
// many elements are in the array
// - and that the "true" size of the array is at least one larger
// than the current value of that counter
template <typename T>
void addToEnd (T* array, int& size, T value)
{
array[size] = value;
++size;
}
// Add value into array[index], shifting all elements already in positions
// index..size-1 up one, to make room.
// - Assumes that we have a separate integer (size) indicating how
// many elements are in the array
// - and that the "true" size of the array is at least one larger
// than the current value of that counter
template <typename T>
void addElement (T* array, int& size, int index, T value)
{
// Make room for the insertion
int toBeMoved = size - 1;
while (toBeMoved >= index) {
array[toBeMoved+1] = array[toBeMoved];
--toBeMoved;
}
// Insert the new value
array[index] = value;
++size;
}
// Assume the elements of the array are already in order
// Find the position where value could be added to keep
// everything in order, and insert it there.
// Return the position where it was inserted
// - Assumes that we have a separate integer (size) indicating how
// many elements are in the array
// - and that the "true" size of the array is at least one larger
// than the current value of that counter
template <typename T>
int addInOrder (T* array, int& size, T value)
{
// Make room for the insertion
int toBeMoved = size - 1;
while (toBeMoved >= 0 && value < array[toBeMoved]) {
array[toBeMoved+1] = array[toBeMoved];
--toBeMoved;
}
// Insert the new value
array[toBeMoved+1] = value;
++size;
return toBeMoved+1;
}
// Search an array for a given value, returning the index where
// found or -1 if not found.
template <typename T>
int seqSearch(const T list[], int listLength, T searchItem)
{
int loc;
for (loc = 0; loc < listLength; loc++)
if (list[loc] == searchItem)
return loc;
return -1;
}
// Search an ordered array for a given value, returning the index where
// found or -1 if not found.
template <typename T>
int seqOrderedSearch(const T list[], int listLength, T searchItem)
{
int loc = 0;
while (loc < listLength && list[loc] < searchItem)
{
++loc;
}
if (loc < listLength && list[loc] == searchItem)
return loc;
else
return -1;
}
// Removes an element from the indicated position in the array, moving
// all elements in higher positions down one to fill in the gap.
template <typename T>
void removeElement (T* array, int& size, int index)
{
int toBeMoved = index + 1;
while (toBeMoved < size) {
array[toBeMoved-1] = array[toBeMoved];
++toBeMoved;
}
--size;
}
// Search an ordered array for a given value, returning the index where
// found or -1 if not found.
template <typename T>
int binarySearch(const T list[], int listLength, T searchItem)
{
int first = 0;
int last = listLength - 1;
int mid;
bool found = false;
while (first <= last && !found)
{
mid = (first + last) / 2;
if (list[mid] == searchItem)
found = true;
else
if (searchItem < list[mid])
last = mid - 1;
else
first = mid + 1;
}
if (found)
return mid;
else
return -1;
}
#endif
with test driver
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>
#include "arrayUtils.h"
using namespace std;
// Unit test driver for array search functions
int main(int argc, char** argv)
{
// Repeatedly reads tests from cin
// Each test consists of a line containing one or more words.
// The first word is one that we want to search for. The
// remaining words are placed into an array and represent the collection
// we will search through.
string line;
getline (cin, line);
while (cin)
{
istringstream in (line);
cout << line << endl;
string toSearchFor;
in >> toSearchFor;
int nWords = 0;
string words[100];
while (in >> words[nWords])
++nWords;
cout << seqSearch (words, nWords, toSearchFor)
<< " "
<< seqOrderedSearch (words, nWords, toSearchFor)
<< " "
<< binarySearch (words, nWords, toSearchFor)
<< endl;
getline (cin, line);
}
return 0;
}
which reads data from a text stream (e.g., standard in), uses that data to construct arrays, and invokes each function on those arrays, printing the results of each.
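For example, tracing the driver on the single input line `b a b c` (search for “b” in the array {a, b, c}): all three search functions find the value at index 1, so the driver echoes the line and then prints the three results:

```
b a b c
1 1 1
```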
Compiling for gcov Statement Coverage
- To use gcov, we compile with special options: -fprofile-arcs -ftest-coverage
- When the code has been compiled, in addition to the usual files there will be several files with endings like .gcno
  - These hold data on where the statements and branches in our code are.
Running Tests with gcov
- Run your tests normally.
- As you test, a *.gcda file will accumulate data on your test coverage.
Viewing Your Report
- Run: gcov _mainProgram_
- The immediate output will be a report on the percentages of statements covered in each source code file.
- Also creates a *.gcov detailed report for each source code file, e.g.,
Sample Statement Coverage Report
-: 69:template <typename T>
-: 70:int seqSearch(const T list[], int listLength, T searchItem)
-: 71:{
1: 72: int loc;
-: 73:
2: 74: for (loc = 0; loc < listLength; loc++)
2: 75: if (list[loc] == searchItem)
1: 76: return loc;
-: 77:
#####: 78: return -1;
-: 79:}
- Report lists number of times each statement has been executed
- Lists ##### if a statement has never been executed
Monitoring Branch Coverage with gcov
gcov can report on branches taken.
- Just add options to the gcov command: gcov -b -c _mainProgram_
Reading gcov Branch Info
- gcov reports
  - the number of times each function call successfully returned
  - the number of times each branch was executed (i.e., how many times the branch condition was evaluated)
  - and the number of times each branch was taken
    - For branch coverage, this is the relevant figure
But What is a “Branch”?
- A “branch” is anything that causes the code to not continue on in straight-line fashion
  - The branch listed right after an “if” is the branch that jumps around the “then” part to go to the “else” part.
  - The && and || operators introduce their own branches
- Other branches may be hidden
  - Contributed by calls to inline functions
  - Or just a branch generated by the compiler’s code generator
- In practice, this can be very hard to interpret
Example: gcov Branch Coverage report
-: 84:template <typename T>
-: 85:int seqOrderedSearch(const T list[], int listLength, T searchItem)
-: 86:{
1: 87: int loc = 0;
-: 88:
1: 89: while (loc < listLength && list[loc] < searchItem)
branch 0 taken 0
call 1 returns 1
branch 2 taken 0
branch 3 taken 1
-: 90: {
#####: 91: ++loc;
branch 0 never executed
-: 92: }
1: 93: if (loc < listLength && list[loc] == searchItem)
branch 0 taken 0
call 1 returns 1
branch 2 taken 0
1: 94: return loc;
branch 0 taken 1
-: 95: else
#####: 96: return -1;
-: 97:}
- The report is organized by basic blocks: straight-line sequences of code terminated by a branch or a call
- It is hard to map to specific source code constructs
  - The lowest-numbered branch is often the leftmost condition
  - It is a fact of life that compilers insert branches and calls that are often invisible to us
3.3 Java
Java Coverage Tools
- JaCoCo
  - Part of the EclEmma project (originally an Eclipse plugin for Emma)
  - Emma, an older coverage tool, has now been replaced by JaCoCo
Clover
- Commercial product, currently free for open-source projects
  - Integrates with Ant, Maven
  - Lots of reporting features
- Works in “traditional” coverage tool fashion
  - Requires a “fork” of the build process to build a monitoring version
  - Injects monitors into compiled code
- Test optimization: can re-run only those tests that covered changed code
JaCoCo
Java Code Coverage
- Line and branch coverage
- Instrumentation is done on the fly
  - An “agent” monitors execution of normally compiled bytecode
  - No special build required
- Works with Eclipse
  - JaCoCo “started” as an Eclipse plug-in
- Works with Maven & Ant
  - In Ant, wrap the normal <java> and <junit> tasks inside a <jacoco:coverage> element
- Works with Gradle
  - Just apply the plug-in.
Example: JaCoCo in Eclipse
Once you have the plugin installed,
- Open any Eclipse Java project.
- Right-click on a unit test or executable. Look for “Coverage As”
  - This functions the same as “Run As” and “Debug As”, but monitors coverage during execution.
- After execution has completed, coverage results are shown
  - A summary in the console area under the “Coverage” tab
  - Details in the Java editor, as color coding
    - Green means that all branches were covered
    - Red means that none were covered.
    - Yellow means that some were covered.
Example: JaCoCo in Gradle
In build.gradle
:
plugins {
id 'java'
id 'jacoco'
}
⋮
check.dependsOn jacocoTestReport
The last line is because I typically have a task named “check” that is my target for report generation. In other words, I plan to use ./gradlew check to prepare all of my project reports, so I add a dependency between each kind of report I add and the check task.
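By default the plug-in’s jacocoTestReport task writes an HTML report under build/reports/jacoco. If a CI server also needs a machine-readable report, the task can be configured; this sketch assumes the Gradle 7+ property names:

```
jacocoTestReport {
    reports {
        xml.required = true    // machine-readable, e.g., for CI tools
        html.required = true   // human-readable, under build/reports/jacoco
    }
}
```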
Example: JaCoCo Report
4 Oracles
A testing oracle is the process, person, and/or program that determines whether test output is correct.
4.1 expect
Covered previously, expect is a shell for testing interactive programs.
- an extension of Tcl (a portable scripting language)
- Largely confined to text streams as input/output
4.2 *Unit
Can we use *Unit-style frameworks as oracles at the system test level?
- The very question is heresy to many *Unit advocates
  - It particularly runs counter to the goals of the various Mock Objects projects
- But, why not?
  - Such tests need not (should not) come at the expense of having done earlier “proper” unit testing.
  - Particularly in Java, MyClass.main(String[]) can be called just like any other function
  - And System.in (cin) and System.out (cout) can be rerouted to/from files or internal strings
- The major limitation is the accessibility of system inputs & outputs.
  - GUIs, databases, etc.
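As a sketch of those last two points (every name here is invented for illustration; programMain stands in for a real application’s MyClass.main), a *Unit-style test can reroute the standard streams, run the whole program, and use an assertion as its oracle:

```java
import java.io.*;
import java.nio.charset.StandardCharsets;

public class MainAsSystemTest {

    // Stand-in for a real application's main(): echoes each input line,
    // prefixed by its length. (Invented for illustration.)
    static void programMain(String[] args) {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8));
        try {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line.length() + ": " + line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reroute System.in/System.out, run the whole program, capture its output.
    static String runWithInput(String input) {
        InputStream savedIn = System.in;
        PrintStream savedOut = System.out;
        ByteArrayOutputStream captured = new ByteArrayOutputStream();
        try {
            System.setIn(new ByteArrayInputStream(
                    input.getBytes(StandardCharsets.UTF_8)));
            System.setOut(new PrintStream(captured, true));
            programMain(new String[0]);
        } finally {
            System.setIn(savedIn);
            System.setOut(savedOut);
        }
        return captured.toString();
    }

    public static void main(String[] args) {
        // The assertion is our oracle: whole-program input vs. expected output.
        String out = runWithInput("hello\nworld!\n");
        if (!out.equals("5: hello\n6: world!\n")) {
            throw new AssertionError("unexpected output:\n" + out);
        }
        System.out.println("system-level test passed");
    }
}
```

In a real JUnit test, runWithInput would sit in a @Test method with assertEquals as the oracle; the stream rerouting is the same.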
4.3 Testing GUI systems
- Scripting or record/playback: playing back input events for
  - convenience & efficiency
  - consistent reproducibility
- Capture of results can occur at different levels
  - event/message level
  - graphics level
Some Open Alternatives
Marathon
For Java GUIs
- Recorder captures AWT/Swing events as JRuby scripts
- Scripts can then be edited to alter inputs, add assertions, etc.
def test
  $java_recorded_version = "1.6.0_24"
  with_window("Simple Widgets") {
    select("First Name", "Jalian Systems")
    select("Password", "Secret")
    assert_p("First Name", "Text", "Jalian Systems")
  }
end
Jemmy
Also for Java GUIs
- Tests scripted as Java
- Integrates with JUnit
4.4 Web systems
- A subproblem of GUI testing
- Simpler because input structure more constrained
- Output detail level is fixed (HTTP events)
Some Open Alternatives
4.5 Selenium
- Browser automation (SeleniumIDE - Firefox add-on)
- Record & playback
- Or scripted (Selenium Webdriver)
- Firefox, IE, Safari, Opera, Chrome
Selenium Scripting
- Actions do things to elements.
  - E.g., click buttons, select options
- Accessors examine the application state
- Assertions validate the state
  - Each assertion has 3 modes:
    - assert: failure aborts the test
    - verify: the test continues, but the failure is logged
    - waitFor: conditions that may be true immediately or may become true within a specified time interval
Selenese
A typical scripting statement has the form
command parameter1 [parameter2]
Parameters can be
- locators for finding a UI element within a page (e.g., XPath)
- text patterns
- variable names
A Sample Selenium Script
<table>
<tr><td>open</td><td>http://mySite.com/downloads/</td><td></td></tr>
<tr><td>assertTitle</td><td></td><td>Downloads</td></tr>
<tr><td>verifyText</td><td>//h2</td><td>Terms and Conditions</td></tr>
<tr><td>clickAndWait</td><td>//input[@value="I agree"]</td><td></td></tr>
<tr><td>assertTitle</td><td></td><td>Product Selection</td></tr>
</table>
That’s right – it’s an HTML table:
| open | http://mySite.com/downloads/ | |
| assertTitle | | Downloads |
| verifyText | //h2 | Terms and Conditions |
| clickAndWait | //input[@value="I agree"] | |
| assertTitle | | Product Selection |
A Selenium “test suite” is a web page with a table of links to web pages with test cases.
Selenium Webdriver
An alternate version of Selenium is more code-oriented. It provides APIs to a variety of languages allowing for very similar capabilities:
Select select = new Select(driver.findElement(
By.tagName("select")));
select.deselectAll();
select.selectByVisibleText("Edam");
Selenium works by interacting with a “real” web browser (Firefox, Chrome) to simulate actions like clicking on or sending keystrokes to a web page element.
I’ve used Selenium Webdriver to implement some very nice web scraper applications.
Waiting
A tricky thing about testing web applications is the unknown amount of time that may be required to respond to a click or other interaction.
Valid tests often need to give very specific instructions to wait for pages to be loaded and for elements to become visible and clickable.
WebDriver driver = new FirefoxDriver();
driver.get("http://somedomain/url_that_delays_loading");
WebElement myDynamicElement =
    (new WebDriverWait(driver, 10))
        .until(ExpectedConditions.elementToBeClickable(
            By.id("myDynamicElement")));
Waits up to 10 seconds for an expected element to load and become active.