The diagram here illustrates the steps involved in testing code.
Beginning from an overall test plan (or test specification),
we eventually seek to discover a collection of failures.
These failures become the input to the process of debugging, where we seek to find the faults in the code responsible for those failures.
Terminology
Failure: An execution on which incorrect behavior occurs
Fault: A defect in the code that (may) cause a failure
Related to these, though not part of our testing process:
alternatively, the difference between the expected output and the actual output on a failed test.
A test plan (more properly, a test specification) describes a set of test cases.
Derive inputs for each test case.
In most cases, you will also need to record the expected outputs or behavior for your test inputs.
The inputs and expected outputs may be recorded in a database of regression tests for later use. But the most obvious use for the new inputs is to…
Execute the tests
The test inputs are fed into the program being tested and the actual outputs collected.
Determine which tests have failed.
The test inputs, actual outputs obtained from their execution, and the expected outputs are passed on to the testing oracle. The oracle is the person, program, or process used to determine if a test has failed.
Pass the failures on for debugging.
The purpose of debugging is to determine the faults in the code that are actually responsible for the failures observed during testing.
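The steps above can be sketched in a few lines of Python. This is only an illustrative outline, not a real testing framework: the `Case` record, the deliberately faulty `buggy_abs` function, and the exact-match `oracle` are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Case:
    name: str
    inputs: tuple      # arguments fed to the program under test
    expected: object   # expected output recorded alongside the inputs


def buggy_abs(x: int) -> int:
    """Hypothetical program under test; it contains a deliberate fault."""
    return x if x > 0 else x   # fault: negatives should be returned as -x


def oracle(expected: object, actual: object) -> bool:
    """A very simple automated oracle: pass only on an exact match."""
    return expected == actual


def run_tests(program: Callable, cases: List[Case]) -> List[Case]:
    """Execute each test and return the cases the oracle judges to have failed."""
    failures = []
    for case in cases:
        actual = program(*case.inputs)          # execute the test
        if not oracle(case.expected, actual):   # determine whether it failed
            failures.append(case)
    return failures


cases = [
    Case("positive", (5,), 5),
    Case("zero", (0,), 0),
    Case("negative", (-3,), 3),
]

# The failures are what would be passed on for debugging.
failures = run_tests(buggy_abs, cases)
print([case.name for case in failures])
```

Note that the list of failures says only which executions behaved incorrectly. Locating the fault (the wrong `else` branch) is still the job of debugging.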
The testing oracle is the person, program, or process used to determine if a test has failed.
Common oracles:
the “eyeball” oracle
human inspection of output
notoriously unreliable
“head to head” oracles
comparison against outputs of an existing system
automated oracles
covered in an upcoming lesson
verification but not validation
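As a rough illustration of a "head to head" oracle, the sketch below compares a new implementation against an existing one on randomly generated inputs. It assumes Python's built-in `sorted` plays the role of the existing system; `new_sort` and `head_to_head_failures` are hypothetical names made up for this example.

```python
import random


def new_sort(items):
    """Hypothetical new implementation under test (a simple insertion sort)."""
    result = []
    for item in items:
        pos = len(result)
        # Walk left past every element larger than the new item.
        while pos > 0 and result[pos - 1] > item:
            pos -= 1
        result.insert(pos, item)
    return result


def head_to_head_failures(candidate, reference, inputs):
    """Return the inputs on which the candidate and the reference disagree."""
    return [
        data for data in inputs
        if candidate(list(data)) != reference(list(data))
    ]


# Generate reproducible test inputs; the seed and sizes are arbitrary choices.
rng = random.Random(0)
inputs = [
    [rng.randint(-50, 50) for _ in range(rng.randint(0, 10))]
    for _ in range(200)
]

disagreements = head_to_head_failures(new_sort, sorted, inputs)
print(len(disagreements))
```

An oracle like this can only tell us that the two systems disagree. If the existing system is itself wrong on some input, the comparison will not reveal it.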
The regression log or regression database is a collection of tests and expected outputs from past testing.
It is used, during regression testing, to quickly rerun old tests. Regression databases can quickly grow to thousands or tens of thousands of cases or more. It becomes particularly important that we not rely on the eyeball oracle for evaluating regression tests.
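A regression log can be as simple as a file of inputs paired with the outputs accepted in past testing. The sketch below assumes a JSON format and a hypothetical `word_count` function under test. Both are illustrative choices, not a prescribed layout.

```python
import json


def word_count(text: str) -> int:
    """Hypothetical function under test."""
    return len(text.split())


# A tiny regression log: inputs plus the outputs accepted in past testing.
# In practice this would be stored in a file or database.
regression_log = json.dumps([
    {"input": "", "expected": 0},
    {"input": "one", "expected": 1},
    {"input": "two  words", "expected": 2},
])


def rerun(log_text: str, program) -> list:
    """Rerun every logged test and return the entries whose output has changed."""
    changed = []
    for entry in json.loads(log_text):
        actual = program(entry["input"])
        if actual != entry["expected"]:
            changed.append({**entry, "actual": actual})
    return changed


print(rerun(regression_log, word_count))
```

Because the comparison is automated, rerunning thousands of logged cases costs little, which is the reason not to rely on the eyeball oracle here.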
We recognize several different stages of testing. These differ in scope (how much of the program is involved) and purpose (who conducts the testing and what information do they derive from it).
Unit Test: Tests of individual subroutines and modules.
Integration Test: Tests of “subtrees” of the total project hierarchy chart (groups of subroutines calling each other).
System Test: Test of the entire system.
Regression Test: Unit/Integration/System tests that are repeated after a change has been made to the code.
Acceptance Test: A test conducted by the customers or their representatives to decide whether to purchase/accept a developed system.
Testing goals
Focusing on the differing purposes of testing, …
Unit Test: does it work?
Integration Test: does it work?
System Test: does it work?
Regression Test: has it changed?
Acceptance Test: should we pay for it?
Regression testing is particularly interesting. We regression test after a change to make sure we have not inadvertently broken anything else. In fact, we really are looking for unintended effects of our changes.
So, while most testing has possible outcomes “pass” or “fail”, regression testing has outcomes
Unexpected fail: this test used to pass, and now it fails.
We’re going to spend a lot of time talking about unit testing this semester, so it deserves some special attention now.
By testing modules in isolation from the rest of the system
Easier to design and run extensive tests
Much easier to debug any failures
Errors caught much earlier
Main challenge is how to test in isolation
To do Unit tests, we have to provide replacements for parts of the program that we will omit from the test.
Scaffolding is any code that we write, not as part of the application, but simply to support the process of Unit and Integration testing.
Scaffolding comes in two forms
Drivers
Stubs
A driver is test scaffolding that calls the module being tested.
Stubs are replacements for code being called from the unit under test.
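To make the two forms of scaffolding concrete, here is a minimal sketch. Every name in it (`needs_cooling`, `get_temperature_stub`, `driver`) is hypothetical, and the unit under test is assumed to receive its dependency as a parameter so that a stub can be swapped in.

```python
def get_temperature_real(sensor_id):
    """Stands in for a part of the program that is omitted from the unit test."""
    raise RuntimeError("real sensor hardware is not available during unit testing")


def get_temperature_stub(sensor_id):
    """Stub: replaces the omitted code with canned answers."""
    return {"kitchen": 30.0, "cellar": 10.0}[sensor_id]


def needs_cooling(sensor_id, read_temperature, limit=25.0):
    """Unit under test: calls whichever temperature reader it is given."""
    return read_temperature(sensor_id) > limit


def driver():
    """Driver: calls the unit under test with test inputs and collects the results."""
    return {
        "kitchen": needs_cooling("kitchen", get_temperature_stub),
        "cellar": needs_cooling("cellar", get_temperature_stub),
    }


print(driver())
```

The driver sits above the unit under test and the stub sits below it, so the unit can run without the rest of the program.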
Integration testing is testing that combines several modules, but still falls short of exercising the entire program all at once.
Integration testing usually combines a small number of modules that call upon one another.
Integration testing can be conducted
bottom-up
relieves the need for stubs
or top-down
It’s worth noting that unit testing and integration testing can sometimes use some of the same test inputs (and maybe the same expected outputs), because we are testing the software in different configurations.
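One way to picture this reuse is shown below: the same inputs and expected output are applied first to a module in isolation (with a stub) and then to the same module combined with the real lower-level code. The functions `parse_price`, `parse_price_stub`, and `total_cents` are hypothetical examples invented for this sketch.

```python
def parse_price(text):
    """Lower-level module (hypothetical): converts a string like '$1.50' to cents."""
    return round(float(text.lstrip("$")) * 100)


def parse_price_stub(text):
    """Stub used when testing total_cents in isolation."""
    return {"$1.50": 150, "$0.25": 25}[text]


def total_cents(prices, parse=parse_price):
    """Higher-level module: sums the prices using the parser it is given."""
    return sum(parse(p) for p in prices)


# The same test inputs and expected output serve both configurations.
shared_inputs = ["$1.50", "$0.25"]
expected = 175

# Unit configuration: total_cents with the stub in place of parse_price.
unit_result = total_cents(shared_inputs, parse=parse_price_stub)

# Integration configuration: total_cents together with the real parse_price.
integration_result = total_cents(shared_inputs, parse=parse_price)

print(unit_result == expected, integration_result == expected)
```

If the unit configuration passes but the integration configuration fails, the stub has narrowed the search: the fault is more likely in the lower-level module or in how the two modules interact.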
In theory, as integration testing progresses, you eventually will integration test the entire system.
However,
Integration tests are generally
System tests