A General Regression Test Selection Technique

Published: November 21, 2015 Words: 2682

Software maintenance is an expensive phase, accounting for nearly 60% of the overall cost of the software life cycle [3]. Regression testing is an important step in software development that ensures modifications do not break previously working functionality. However, regression testing is often expensive and time-consuming. Regression test suites can be very large, e.g., tens of thousands of test cases that require days or weeks to execute [1].

Regression test selection is the activity of choosing, from an existing test set, the test cases that can and need to be rerun to ensure that changed parts behave as intended and that the changes did not introduce unexpected faults. Reducing the number of regression test cases to execute is an obvious way of reducing the cost of regression testing. The main objective of selection is to identify the regression test cases that exercise modified parts of the system. A selection is referred to as safe when it identifies all test cases in the original test set that can reveal one or more faults in the modified program [12].

Many techniques address regression test selection; some are based on source code and others on design.

Techniques based on source code are safer and easier to build, but they require that the changes already be implemented. They are also very specific to the programming language used to develop the software. For example, a tool built for applications written in a procedural language such as C is not suitable for analyzing applications written in C# or Java, because it cannot identify indirect changes caused by object-oriented features of these languages such as dynamic binding and exceptions.

Techniques based on specification are more general, because the design is represented in the Unified Modeling Language (UML), which is independent of the programming language. However, some changes to the source code may not be detectable from UML documents, so these techniques cannot identify all test cases affected by the changes.

In this paper we present a new approach that overcomes these shortcomings. The approach combines the code-based and model-based techniques to produce a safe and general regression test selection technique. It captures and analyzes the dynamic behavior of the software application from UML diagrams, then identifies the impact of the changes made to the software and, based on this impact, selects the test cases to be re-executed. These test cases are fewer in number than the complete system test suite.

The rest of the paper is organized as follows. Section 2 discusses the regression test selection techniques available in the literature. Section 3 presents the proposed approach to regression test selection in detail. The results of the case studies are presented in Section 4. Conclusions and future work are summarized in Section 5.

II. RELATED WORK

Typically regression test selection techniques are either code-based or model-based. Code-based techniques use the information obtained from two different versions of the code to analyze the change impact and select the tests. In the case of model based techniques, change information is obtained through two versions of models constructed during the requirements analysis phase or system design phase.

Code-based techniques [2], [5], [6], [7], [11], [12] select tests based on changes between two versions of the code. These techniques are very specific to the programming language used to develop the code. Chianti [10] and JDiff [5] are comprehensive techniques for managing changes in Java programs. Chianti selects regression tests after performing change impact analysis, whereas JDiff performs only change impact analysis. Because both tools analyze changes at the statement level and are specific to the Java programming language, they are neither generic nor efficient.

Model-based techniques [3], [4], [8], [9] are based on UML design models used during the design phase of the system. Reference [15] uses UML activity diagrams to detect changes in design and then uses a traceability matrix between the activity diagram and the test suite, but it does not support object-oriented features. Reference [8] proposes a regression testing technique based on UML sequence and class diagrams. That approach does not take into account the pre- and post-conditions of the operations, which affect the behavior of a class, and it does not handle concurrency.

III. OUR REGRESSION TEST SELECTION TECHNIQUE

Our proposed approach to regression test selection is based on changes made to the software specification, represented in UML diagrams, and to the code, written in any programming language. The approach consists of three functions, as shown in Fig. 1: (1) capture dynamic behavior, (2) identify changes, and (3) select the regression test suite. Each of these functions is described in detail below.

[Figure 1 (block diagram; image not reproduced). Blocks: Capture dynamic behavior from UML; Model dynamic behavior as a FIG; Compare codes in new and old versions; Affected methods set; Identify affected functions from FIG against the altered methods set; Select regression test suite.]

Figure 1. Block diagram of our approach

A. Capturing Dynamic Behavior of the Application

Dynamic behavior of software is the set of interactions among system components, along with the classes and functions they invoke, across all application processes. We capture the dynamic behavior of the system from the UML class diagram and sequence diagram. The captured behavior is modeled as an Interclass Relation Graph (IRG) and a Functional Interaction Graph (FIG). The IRG for a program is a triple {N, IE, UE}:

N is the set of nodes, one for each class.

IE is the set of inheritance edges. An inheritance edge between a node for class C1 and a node for class C2 indicates that C1 is a direct sub-class of C2.

UE is the set of use edges. A use edge between a node for class C1 and a node for class C2 indicates that C1 contains an explicit reference to C2.

Program P

public class SuperA {
    public void F6( ) { System.out.println("aa"); }
}

public class A extends SuperA {
    public void F4( ) { F6( ); }
    public void F5( ) { …. }
}

public class SubA extends A {}

public class B {
    A a = new A( );

    public void F1( ) { a.F4( ); }
    public void F2( ) { a.F4( ); }
    public void F3( ) { a.F5( ); }
}

public class SubB extends B {}

public class C {
    public static void main( ) {
        B b = new B( );
        b.F1( ); b.F2( ); b.F3( );
    }
}

Figure 2. Example program P

Fig. 3 shows the algorithm for building an IRG, buildIRG. For simplicity, in defining the algorithm, we use the following notation: ne indicates the node for a type e (class or interface); GN, GIE, and GUE indicate the set of nodes N, inheritance edges IE, and use edges UE of a graph G, respectively. Algorithm buildIRG first creates a node for each type in the program (lines 2-5). Then, for each type e, the algorithm (1) connects ne to the node of its direct super-type through an inheritance edge (lines 7-8), and (2) creates a use edge from each nc to ne such that c contains a reference to e (lines 9-11).

Algorithm buildIRG

Input: program P

Output: IRG G for P

Begin buildIRG
create empty IRG G
for each class or interface e in P do
    create node ne
    GN = GN ∪ {ne}
end for
for each class or interface e in P do
    get direct super-type of e, s
    GIE = GIE ∪ {(ne, ns)}
    for each class c in P that references e do
        GUE = GUE ∪ {(nc, ne)}
    end for
end for
return G
End buildIRG

Figure 3. Algorithm for building an IRG
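The steps of buildIRG can be sketched in Java as follows. This is only an illustration under stated assumptions: the paper does not say how program P is parsed, so each type is described here by a hypothetical TypeInfo object (name, direct super-type, referenced types), and the class and method names are ours. The sketch walks each referencing type and its references rather than each referenced type, which yields the same set of use edges.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical description of one class or interface in program P.
final class TypeInfo {
    final String name;
    final String superType;        // null when there is no user-defined super-type
    final List<String> references; // types this type explicitly refers to

    TypeInfo(String name, String superType, String... references) {
        this.name = name;
        this.superType = superType;
        this.references = Arrays.asList(references);
    }
}

// IRG as the triple {N, IE, UE}; edges are stored as "from->to" strings.
final class Irg {
    final Set<String> nodes = new LinkedHashSet<>();
    final Set<String> inheritanceEdges = new LinkedHashSet<>();
    final Set<String> useEdges = new LinkedHashSet<>();

    static Irg build(List<TypeInfo> program) {
        Irg g = new Irg();
        for (TypeInfo e : program) {           // one node per type
            g.nodes.add(e.name);
        }
        for (TypeInfo e : program) {
            if (e.superType != null) {         // sub-type -> direct super-type
                g.inheritanceEdges.add(e.name + "->" + e.superType);
            }
            for (String used : e.references) { // referencing type -> referenced type
                g.useEdges.add(e.name + "->" + used);
            }
        }
        return g;
    }

    // Program P of Fig. 2, written out by hand.
    static List<TypeInfo> programP() {
        return Arrays.asList(
            new TypeInfo("SuperA", null),
            new TypeInfo("A", "SuperA"),
            new TypeInfo("SubA", "A"),
            new TypeInfo("B", null, "A"),
            new TypeInfo("SubB", "B"),
            new TypeInfo("C", null, "B"));
    }

    public static void main(String[] args) {
        Irg g = build(programP());
        System.out.println(g.nodes);
        System.out.println(g.inheritanceEdges);
        System.out.println(g.useEdges);
    }
}
```

For program P this produces six nodes, three inheritance edges and two use edges (B to A and C to B).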

Fig. 4 shows the IRG for program P in Fig. 2. The IRG represents the six classes in P and their inheritance and use relationships: three inheritance relationships and two use relationships. Class A inherits from class SuperA, class SubA inherits from class A, and class SubB inherits from class B. So if class A is changed, we need to test class A, class SuperA and class SubA.

[Figure 4 (graph; image not reproduced). Nodes: SuperA, A, SubA, B, SubB, C. Legend: inheritance edge, use edge.]

Figure 4. IRG for program P of Figure 2.

After we draw the IRG we can draw the sequence diagram of the original program to detect the relations between functions as shown in Fig. 5.

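[Figure 5 (sequence diagram; image not reproduced). Lifelines: :C, :B, :A, :SuperA. Messages, top to bottom: F1( ), F4( ), F6( ), F2( ), F4( ), F3( ), F5( ).]

Figure 5. Sequence diagram for Program P in Figure 2.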

Fig. 6 shows the FIG for program P in Fig. 2 where we use UML sequence diagram in Fig. 5 to capture the relationship between functions.

[Figure 6 (graph; image not reproduced). Nodes: Main, F1, F2, F3, F4, F5, F6.]

Figure 6. FIG for Program P of Figure 2.
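As an illustration, the messages of the sequence diagram in Fig. 5 can be recorded as (caller, callee) pairs to obtain a FIG. The representation below (an adjacency map, with the hypothetical class name FunctionInteractionGraph and method addCall) is an assumption of this sketch and is not prescribed by the approach.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// FIG stored as an adjacency map: caller -> functions it invokes.
final class FunctionInteractionGraph {
    private final Map<String, Set<String>> callees = new LinkedHashMap<>();

    void addCall(String caller, String callee) {
        callees.computeIfAbsent(caller, k -> new LinkedHashSet<>()).add(callee);
        // Register the callee too, so that leaf functions still appear as nodes.
        callees.computeIfAbsent(callee, k -> new LinkedHashSet<>());
    }

    Set<String> calleesOf(String function) {
        return callees.getOrDefault(function, Collections.<String>emptySet());
    }

    Set<String> functions() {
        return callees.keySet();
    }

    // Messages of Fig. 5 for program P, read as (caller, callee) pairs.
    static FunctionInteractionGraph forProgramP() {
        FunctionInteractionGraph fig = new FunctionInteractionGraph();
        fig.addCall("Main", "F1");
        fig.addCall("F1", "F4");
        fig.addCall("F4", "F6");
        fig.addCall("Main", "F2");
        fig.addCall("F2", "F4");
        fig.addCall("Main", "F3");
        fig.addCall("F3", "F5");
        return fig;
    }

    public static void main(String[] args) {
        FunctionInteractionGraph fig = forProgramP();
        for (String f : fig.functions()) {
            System.out.println(f + " -> " + fig.calleesOf(f));
        }
    }
}
```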

B. Identify Affected Methods

Affected methods are identified by comparing the original program with the modified program. Changes to code occur at the syntactic and semantic levels. Syntactic changes are the textual differences between corresponding statements in the two versions of a program.

A syntactic difference does not necessarily cause a change in the semantics of the program. For example, suppose the statement int sum = a + b + c; is replaced in the new version of the software with two statements: (1) int sum = a + b; and (2) sum = sum + c;. The corresponding lines of code differ syntactically. Semantically, however, the final value assigned to the variable sum is the sum of a, b, and c in both cases, so this is considered no change. To resolve such cases, data flow analysis techniques based on program slicing have been devised [13], [14]. Although slicing of program statements is a safe and precise method, it is complex and requires heavy use of memory and processing time. Scaling slicing techniques to large programs would therefore be difficult and too costly in terms of performance.
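The sum example can be written out as two small Java methods. The method names sumOld and sumNew are ours, and running a handful of inputs only illustrates the point; it does not prove equivalence, which is what slicing-based analysis is meant to establish.

```java
final class SumVersions {
    // Old version: a single statement.
    static int sumOld(int a, int b, int c) {
        int sum = a + b + c;
        return sum;
    }

    // New version: the same computation split into two statements.
    static int sumNew(int a, int b, int c) {
        int sum = a + b;
        sum = sum + c;
        return sum;
    }

    public static void main(String[] args) {
        // The two versions differ textually but agree on every input tried here.
        int[][] inputs = { {1, 2, 3}, {-5, 5, 0}, {Integer.MAX_VALUE, 1, -1} };
        for (int[] in : inputs) {
            boolean same = sumOld(in[0], in[1], in[2]) == sumNew(in[0], in[1], in[2]);
            System.out.println(same);
        }
    }
}
```

A line-by-line comparison would flag sumNew as changed, although no test outcome can differ between the two versions.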

Semantic change analysis involves identifying indirectly affected methods, which may be invoked through polymorphism, dynamic binding, and exception handling.

Because of dynamic binding, an apparently harmless modification of a program may affect call statements in a different part of the program from the change point. For example, class-hierarchy changes may affect calls to methods in any of the classes in the hierarchy, and adding a method to a class may affect calls to methods with the same signature in its superclasses and subclasses. As shown in Fig. 7, Class B inherits from Class A, and a virtual method calc( ) is implemented in both classes. The method calc( ) in class A has been changed in the new version of the code. Method func1( ) of class D invokes calc( ) on an object from this hierarchy, and the method that actually runs is chosen at run time (a child is a subtype of its parent). The change is therefore treated as affecting func1( ), and we mark both A.calc( ) and D.func1( ) as changed.

Old Program P

Class A : System.Array {
    int sum;
    virtual int calc( ) { return sum; }
}

Class B : A {
    int calc( ) { return sum/3; }
}

Class D {
    int func1(B obj) { return obj.calc( ); }

    void func2( ) {
        try {…}
        catch(e1) {…}
        catch(e2) {…}
    }

    void func3( ) {
        try {…}
        catch(e1) { func3( ); }
        catch(e2) {…}
    }
}

New Program P'

Class A {
    int sum;
    virtual int calc( ) { return sum*sum; }
}

Class B : A {
    int calc( ) { return sum/3; }
}

Class D {
    int func1(B obj) { return obj.calc( ); }

    void func2( ) {
        try {…}
        catch(e1) {…}
    }

    void func3( ) {
        try {…}
        catch(e1) { return; }
        catch(e2) {…}
    }
}

Figure 7. Original program P and its modified P'

Changes to an inheritance tree also affect the execution of methods in that tree. For instance, consider the new version of Class A shown in Fig. 7, which no longer inherits from System.Array. This change influences the execution of all the methods in Class A, so all methods in Class A are marked as changed. It also influences the execution of Class B, since B inherits from the changed Class A; therefore, all the methods of Class B are also marked as changed. Changes to an inheritance tree are identified by a simple comparison of the inheritance tree objects of the old and new versions of a component. The methods of Classes A and B are then identified from the inheritance tree object and marked as changed.
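The comparison of inheritance trees can be sketched as follows. The sketch assumes, for illustration only, that each version's inheritance tree is available as a map from class name to direct superclass; the paper does not describe the actual inheritance tree object, and the class and method names below are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

final class InheritanceDiff {
    // superOld/superNew map each class to its direct superclass (null if none);
    // methods maps each class to the methods it declares in the new version.
    static Set<String> changedMethods(Map<String, String> superOld,
                                      Map<String, String> superNew,
                                      Map<String, List<String>> methods) {
        // Step 1: classes whose direct superclass differs between versions.
        Set<String> affected = new LinkedHashSet<>();
        for (String cls : superNew.keySet()) {
            if (!Objects.equals(superOld.get(cls), superNew.get(cls))) {
                affected.add(cls);
            }
        }
        // Step 2: propagate to subclasses until no new class is added.
        boolean grew = true;
        while (grew) {
            grew = false;
            for (Map.Entry<String, String> e : superNew.entrySet()) {
                if (affected.contains(e.getValue()) && affected.add(e.getKey())) {
                    grew = true;
                }
            }
        }
        // Step 3: every method of an affected class is marked as changed.
        Set<String> result = new LinkedHashSet<>();
        for (String cls : affected) {
            for (String m : methods.getOrDefault(cls, Collections.<String>emptyList())) {
                result.add(cls + "." + m);
            }
        }
        return result;
    }

    // Classes of Fig. 7; D is included to show that this step does not mark it.
    static Set<String> figure7Example() {
        Map<String, String> superOld = new LinkedHashMap<>();
        superOld.put("A", "System.Array");
        superOld.put("B", "A");
        superOld.put("D", null);
        Map<String, String> superNew = new LinkedHashMap<>();
        superNew.put("A", null);
        superNew.put("B", "A");
        superNew.put("D", null);
        Map<String, List<String>> methods = new LinkedHashMap<>();
        methods.put("A", Arrays.asList("calc"));
        methods.put("B", Arrays.asList("calc"));
        methods.put("D", Arrays.asList("func1", "func2", "func3"));
        return changedMethods(superOld, superNew, methods);
    }

    public static void main(String[] args) {
        System.out.println(figure7Example());
    }
}
```

For the classes of Fig. 7 the sketch marks A.calc and B.calc, matching the description above. Methods of D are marked by the other rules (dynamic binding and exception changes), not by this step.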

Changes to exceptions can occur at two levels: in the handling of exceptions and in the definition of exceptions. For instance, consider the changes made to exception handling shown in Fig. 7. In the new version, method func2( ) no longer handles exception e2, whereas method func3( ) still handles both exceptions but has changed its handling of exceptions of type e1. In such cases both func2( ) and func3( ) are marked as changed methods.

To make automation of change impact analysis complete, both syntactic and semantic changes to a program should be considered. Our technique identifies methods affected due to both syntactic and semantic changes made to software written using any programming language.

C. Selecting Smaller Regression Test Suite

Once the affected methods are identified, we find the impact of these changed methods by analyzing the FIG and IRG. For example, in Fig. 6, if method F4 is marked as changed in the new version of the software, then F4, F1 and F2 are marked as changed, so any test case that passes through these functions is selected.
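The F4 example can be sketched in Java under two assumptions that the text does not spell out: following the example, only the direct callers of a changed method are marked (Main is not), and each test case comes with the list of functions it passes through. The test names t1 to t3 and their traces are hypothetical, as are the class and method names.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TestSelector {
    // Changed methods plus their direct callers in the FIG (caller -> callees).
    static Set<String> affected(Map<String, List<String>> fig, Set<String> changed) {
        Set<String> result = new LinkedHashSet<>(changed);
        for (Map.Entry<String, List<String>> e : fig.entrySet()) {
            for (String callee : e.getValue()) {
                if (changed.contains(callee)) {
                    result.add(e.getKey());
                }
            }
        }
        return result;
    }

    // Keep every test that passes through at least one affected function.
    static Set<String> select(Map<String, List<String>> coverage, Set<String> affected) {
        Set<String> selected = new LinkedHashSet<>();
        for (Map.Entry<String, List<String>> e : coverage.entrySet()) {
            for (String f : e.getValue()) {
                if (affected.contains(f)) {
                    selected.add(e.getKey());
                    break;
                }
            }
        }
        return selected;
    }

    // FIG of program P (Fig. 6) as caller -> callees.
    static Map<String, List<String>> figOfProgramP() {
        Map<String, List<String>> fig = new LinkedHashMap<>();
        fig.put("Main", Arrays.asList("F1", "F2", "F3"));
        fig.put("F1", Arrays.asList("F4"));
        fig.put("F2", Arrays.asList("F4"));
        fig.put("F3", Arrays.asList("F5"));
        fig.put("F4", Arrays.asList("F6"));
        return fig;
    }

    // Hypothetical test-to-function traces; not taken from the case studies.
    static Map<String, List<String>> sampleCoverage() {
        Map<String, List<String>> cov = new LinkedHashMap<>();
        cov.put("t1", Arrays.asList("F1", "F4", "F6"));
        cov.put("t2", Arrays.asList("F3", "F5"));
        cov.put("t3", Arrays.asList("F2", "F4", "F6"));
        return cov;
    }

    public static void main(String[] args) {
        Set<String> changed = new LinkedHashSet<>(Arrays.asList("F4"));
        Set<String> aff = affected(figOfProgramP(), changed);
        System.out.println(aff);                           // F4 and its direct callers
        System.out.println(select(sampleCoverage(), aff)); // tests touching those functions
    }
}
```

With F4 changed, the affected set is F4, F1 and F2, and the tests t1 and t3 are selected while t2, which only exercises F3 and F5, is skipped.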

IV. CASE STUDIES

In this section we apply our approach to two case studies. The first is a program written in Java called AlarmClock, and the second is a program written in C++ called Schedule.

In the first case study the system test suite consists of 90 test cases. The case study was conducted on three upgrades released during the application's regression testing cycle. These upgrades consist mainly of bug fixes, such as changes to source code statements, deletion of methods, and addition of new methods. The original program has 6 classes and 20 methods. The results of applying our approach to this case study are tabulated in Table I.

TABLE I

RESULTS OF THE FIRST CASE STUDY

Version    # Test cases selected    % of test effort saved
V1         16                       81%
V2         42                       53%
V3         27                       70%
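The paper does not state how the percentage of test effort saved is computed. Assuming it is the share of the full suite that does not have to be rerun, the V2 and V3 rows of Table I can be reproduced with the following sketch (the class and method names are ours):

```java
final class EffortSaved {
    // Percentage of the full suite that does not need to be rerun.
    static double percentSaved(int totalTests, int selectedTests) {
        return 100.0 * (totalTests - selectedTests) / totalTests;
    }

    public static void main(String[] args) {
        // V2 and V3 of the first case study: 90 tests in the full suite.
        System.out.printf("%.1f%n", percentSaved(90, 42)); // about 53
        System.out.printf("%.1f%n", percentSaved(90, 27)); // 70
    }
}
```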

In the second case study the system test suite consists of 150 test cases. The case study was conducted on four upgrades released during the application's regression testing cycle. The results are tabulated in Table II.

TABLE II

RESULTS OF THE SECOND CASE STUDY

Version    # Test cases selected    % of test effort saved
V1         76                       49%
V2         84                       44%
V3         65                       65%
V4         72                       52%

V. CONCLUSION AND FUTURE WORK

In this paper we presented a new approach that combines the code-based and model-based techniques to produce a safe and general regression test selection technique. We capture the dynamic behavior of the software application from UML diagrams, then identify the impact of changes made to the software code, which may be written in any programming language, and based on these changes select the test cases to be re-executed. These test cases are fewer in number than the complete system test suite.

Software maintenance also includes the addition and deletion of user functionality. These modifications can be classified as major changes. Such changes often require test cases to be added, deleted, or modified. Our future research will focus on techniques that automatically identify major changes made to code and generate test cases that validate these changes.

ACKNOWLEDGMENT

This study was supported by a grant from Menoufia University, Egypt. The authors appreciate the good cooperation of the members of SQS Software Quality System Egypt.