Interprocedural Data Flow Testing*
Mary Jean Harrold and Mary Lou Soffa
Department of Computer Science
University of Pittsburgh
Pittsburgh, PA 15260
ABSTRACT
As current trends in programming encourage a high degree of modularity, the number of procedure calls and returns executed in a module continues to grow. This increase in procedures mandates the efficient testing of the interactions among procedures. In this paper, we extend the utility of data flow testing to include the testing of data dependencies that exist across procedure boundaries. An interprocedural data flow analysis algorithm is first presented that enables the efficient computation of information detailing the locations of definitions and uses needed by an interprocedural data flow tester. To utilize this information, a technique to guide the selection and execution of test cases, that takes into account the various associations of names with definitions and uses across procedures, is also presented. The resulting interprocedural data flow tester handles global variables, reference parameters and recursive procedure calls, and is compatible with the current intraprocedural data flow testing techniques. The testing tool has been implemented on a Sun 3/50 Workstation.
1. INTRODUCTION

Although a number of data flow testing methodologies have been developed and studied in recent years [8,11,14,15,18,19], their utility has been restricted to testing data dependencies that exist within a procedure (i.e., intraprocedural). Testing the data dependencies that exist among procedures (i.e., interprocedural) requires information about the flow of data across procedure boundaries, including both calls and returns. The data dependencies that exist between procedures both directly over single calls and returns and indirectly over multiple calls and returns are needed. The current data flow testing tools either use intraprocedural data flow analysis typically employed in compiler optimization to determine the data dependencies or determine the definition-use pairs from the source code by building and then searching the program's def-use graph. Although interprocedural data flow analysis algorithms do exist [2-7,9,17], they do not provide the detailed information (i.e., the locations of definitions and uses that reach across both procedure calls and returns) needed for interprocedural data flow testing. Also, the methods for guiding the actual data flow testing do not currently handle the renaming of variables that is required when performing interprocedural testing.

The underlying premise of all of the data flow testing criteria is that confidence in the correctness of a variable assignment at a point in a program is dependent on whether some test data has caused execution of a path from the assignment (i.e., definition) to points where the variable's value is used (i.e., use). Test data adequacy criteria are used to select particular definition-use pairs or subpaths that are identified as the test case requirements for a program. Then, test cases are generated that satisfy the requirements when used in a program's execution. Thus, interprocedural data flow testing consists of (1) determining the definition-use information for definitions that reach across procedure boundaries (both calls and returns) to meet the adequacy criteria and (2) guiding the selection and execution of test cases that meet the requirements. The problems of determining interprocedural definition-use information include the development of an efficient technique that is procedure call site specific and handles reference parameters, global variables and recursion for direct and indirect data dependencies. Direct dependencies exist when either (1) a definition of an actual parameter in one procedure reaches a use of the corresponding formal parameter in a called procedure or (2) a definition of a formal parameter in a called procedure reaches a use of the corresponding actual parameter in the calling procedure. Conditions for indirect dependencies are similar to direct
* This work was partially supported by the National Science Foundation under Grant CCR-88001104 to the University of Pittsburgh.
Permission to copy without fee all or part of this material is granted provided
that the copies are not made or distributed for direct commercial advantage,
the ACM copyright notice and the title of the publication and its date appear,
and notice is given that copying is by permission of the Association for
Computing Machinery. To copy otherwise, or to republish, requires a fee
and/or specific permission.
© 1989 ACM 089791-342-6/89/0012/0158 $1.50
dependencies except that multiple levels of procedure calls and returns are considered. When a formal parameter is passed as an actual parameter at a call site, an indirect data flow dependency may exist. In this case, a definition of an actual parameter in one procedure may have uses in procedures more than one level away in the calling sequence, considering both calls and returns. For the selection and execution of the test cases, the presence of reference parameters requires incorporating the renaming of variables as procedures are called and returned, which complicates the design of a testing tool.

This paper extends data flow testing to include interprocedural data flow testing by developing both an efficient interprocedural data flow analysis technique that gathers the necessary information about interprocedural data dependencies and a system that uses the information to guide the interprocedural testing. Through this extension, data flow testing can be used to test individual procedures in a module more accurately since data flow information about called procedures is known as well as the interactions of these procedures as indicated by the interprocedural data dependencies. In particular, we have designed an interprocedural data flow testing tool that first computes the required interprocedural definition-use information, for both direct and indirect dependencies, and then uses it to provide integration testing of the procedures in a module. For the interprocedural data flow analysis, our technique summarizes the procedure information at call sites and then propagates this information throughout the module to obtain the interprocedural definition-use information for both global variables and reference parameters. Our technique handles recursion and thus permits data flow analysis of recursive procedures. By computing the interprocedural data dependencies in an efficient way prior to testing, existing path selection techniques based on data flow [11,19] can be used for interprocedural testing. To guide the testing, the system recognizes the calls to, and returns from, procedures and handles the associations of various names with a definition as the execution path is being inspected.

With our interprocedural data flow testing tool, integration testing can be performed in several ways. One is to use 'incremental bottom-up' testing of the procedures in the module. As each procedure is tested, it is also integrated with any procedures that it calls directly or indirectly. Thus, the test case requirements include testing definitions that have uses within the procedure, definitions that have uses in a called procedure, and definitions in a called procedure that reach uses in a calling procedure. Another way is to use the 'big bang' approach which first tests each procedure in isolation and then integrates them all at once by providing test cases that test only those definitions that have interprocedural uses. In either case, the direct and indirect data dependencies are used to compute the required definition-use information for the testing.

Section 2 introduces an example to illustrate some of the problems that are inherent to interprocedural data flow testing. In Section 3, the issues involved in the design of the system are discussed. Section 4 describes the interprocedural data flow testing system. Our conclusions are given in Section 5.
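The two direct-dependency cases described above can be sketched in Python (our illustration, not the paper's code; Python has no reference parameters, so a one-element list stands in for one):

```python
# Hypothetical sketch of direct interprocedural data dependencies.
# A one-element list models a 'reference integer' parameter.

def callee(formal):
    total = formal[0] + 1   # use of the formal: the definition of the
                            # actual in the caller reaches it (case 1)
    formal[0] = total       # definition of the formal: it reaches the
                            # use of the actual after the return (case 2)

def caller():
    actual = [41]           # definition of the actual parameter
    callee(actual)
    return actual[0]        # use reached by the definition of the formal
```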
2. MOTIVATING EXAMPLE
To illustrate some of the problems involved in interprocedural data flow testing, we consider an example. Our data flow tester uses the data flow testing methodology proposed by Rapps and Weyuker [19] and later extended by Frankl and Weyuker [11]. One criterion, 'all-uses', requires that each definition be tested to all of its uses. If the use is in a computation statement (e.g., assignment), testing is from the block containing the definition to the block containing the use. If the use is in a predicate (e.g., conditional), testing is from the block containing the definition to the successors of the block containing the use. This ensures that both edges from a conditional statement containing a use are tested. The subpath traversed in the testing must contain no redefinition of the variable (i.e., it must be a definition-clear subpath). To illustrate, consider the module Main and the flow graph for procedure GetMax which are given in Figure 1. GetMax is a recursive procedure that inputs the index of the first and last elements of a list of integers and returns the maximum value in the list. S is a global array, initialized in Main, that stores the lists of integers for which the maximum (MX) is to be found. To simplify the example, only the definitions and uses of reference parameters that reach across procedure boundaries are considered. These include definitions and uses of the formal reference parameter for GetMax (i.e., MX) and actual parameters at the call sites that are bound to formal reference parameters in called procedures (i.e., MX, M1 and M2). The definition-use pairs required to test GetMax are identified from the graph and given in Table 1.

Table 1: Definition-Use Pairs for GetMax
Consider a test case whose elements for S are 3,5,1,6. When the procedure is executed with this test data as input, the path traversed through the basic blocks is 1,2,4,5,1,2,3,8,6,1,2,3,8,7,8. Comparison of this test path with the required definition-use pairs reveals that it satisfies all of the test case requirements. However, with this single
module Main
declare
  S: array 1..N of integer;
  I, MAX, MIN: integer;
begin
  for I := 1 to N do read(S[I]);
  GetMax(1,N,MAX);
  write(MAX);
end;

procedure GetMax;
input
  F, L: integer;
  MX: reference integer;
declare M1, M2, MD: integer;
begin
  if F+1 = L then PairMax(S[F],S[L],MX)
  else begin
    MD := (F+L) DIV 2;
    GetMax(F,MD,M1);
    GetMax(MD+1,L,M2);
    PairMax(M1,M2,MX);
  endif;
end;

procedure PairMax;
input I, J, K: reference integer;
begin
  if I > J then K := I
  else K := J;
end;

[Flow graph for GetMax, blocks B1-B8; the labeled blocks include B3: PairMax(S[F],S[L],MX), B4: MD := (F+L) DIV 2, B5: GetMax(F,MD,M1), B6: GetMax(MD+1,L,M2), B7: PairMax(M1,M2,MX)]

Figure 1: Example Program and Flow Graph for Procedure GetMax
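For readers who prefer an executable form, the example program translates roughly to the following Python (our paraphrase, not the paper's code; one-element lists model the reference parameters, and indexing is 0-based rather than the paper's 1..N):

```python
# Python paraphrase of the Figure 1 program. S is the global array.
S = [3, 5, 1, 6]

def pair_max(i, j, k):              # i, j, k: 'reference' parameters
    if i[0] > j[0]:
        k[0] = i[0]
    else:
        k[0] = j[0]

def get_max(f, l, mx):              # maximum of S[f..l], returned in mx
    if f + 1 == l:
        pair_max([S[f]], [S[l]], mx)
    else:
        md = (f + l) // 2
        m1, m2 = [None], [None]
        get_max(f, md, m1)          # maximum of the left half in m1
        get_max(md + 1, l, m2)      # maximum of the right half in m2
        pair_max(m1, m2, mx)

mx = [None]
get_max(0, len(S) - 1, mx)          # mx[0] now holds the maximum of S
```

Like the original, this version assumes the list splits evenly into pairs; it is intended only to make the recursion and the reference-parameter bindings visible.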
test case, several interprocedural definition-use pairs remain untested. For example, there are two definitions of K in procedure PairMax and both of them reach the use of MX in B8 but the test case tests only one of them. To test the integration of procedures, information about the locations of uses of definitions that reach across procedure boundaries is required. With this information, definition-use pairs include uses of a definition that occur as a result of procedure calls. This allows better selection of test data to test the integration of procedures.

In addition to the direct dependencies that exist between definitions and uses in GetMax and PairMax that are discussed above, indirect dependencies exist between Main and PairMax. For example, since formal parameter K in PairMax is returned to GetMax and subsequently to Main, there is a data dependency between the definitions of K in PairMax and the use of MAX in Main. Determining the indirect dependencies is accomplished by considering the possible uses of definitions along the calling sequences. For efficiency, this information must be computed prior to the testing by considering local information and propagating it throughout the module instead of computing it during testing.

Assuming that the interprocedural data dependencies have been computed, another problem involves guiding the testing of the required interprocedural definition-use pairs. Consider the definition of K in PairMax and the use of MAX in Main. To determine the definition-clear subpath from the definition to the use of a variable requires traversal through several procedures where the name of the variable changes. Specifically, in this example, K in PairMax, MX in GetMax, and MAX in Main are all names for the same memory location. Thus, access to information about the binding of actual and formal parameters at call and return sites is required.
3. DESIGN ISSUES
The design of an interprocedural data flow tester includes issues concerned with both a technique to compute the interprocedural data dependencies and a method to guide the testing process. This section discusses these issues.
Existing interprocedural data flow analysis techniques [2-7,9,17] vary in the type of the information provided and the efficiency of gathering the information. Most of the techniques concentrate on providing sets of variables that are used or modified by procedure calls using either flow sensitive or flow insensitive information.† Although this information is useful in optimizations and parallelization, it does not provide the locations of the definitions and uses of variables that reach across procedure boundaries. Thus, to determine the test case requirements for interprocedural data flow testing, a new method for computing the interprocedural definition-use information is required. This involves the development of an efficient representation of the procedures within a module and an algorithm to propagate data flow information throughout a module, taking into account reference parameters, global variables and recursive procedure calls.
One possible way to represent the program is by in-line substitution of procedures at call sites. In addition to the obvious problem of the memory requirements, in-line substitution has other inherent problems. Both scoping of local variables in procedures and binding of formal and actual parameters are difficult because the entire module is viewed as one procedure. Additionally, recursive procedures cannot be represented. Another possible representation is the traditional call graph of a module where nodes in the graph represent procedures and edges represent call sites. However, the call graph is not sufficient for computing the definition-use information across procedure boundaries because it has no return information and provides no information about the control flow in individual procedures. Other representations, such as the program summary graph [5] and the super graph [17], provide this return and control flow information. With the super graph, each call to a procedure accesses a single copy of the procedure. This eliminates some of the memory problems of in-line substitution but still may be prohibitive for large programs. The program summary graph summarizes some of the required information at call sites but this information does not indicate the locations of definitions and uses that reach across procedure boundaries.
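The limitation of the traditional call graph is easy to see in a sketch (a hypothetical encoding of the example module; the node and edge names are ours):

```python
# A traditional call graph for the example module: nodes are procedures,
# one edge per call site. It records neither returns nor intraprocedural
# control flow, so it cannot carry definition-use information across
# procedure boundaries by itself.
call_graph = {
    "Main":    ["GetMax"],                      # GetMax(1,N,MAX)
    "GetMax":  ["PairMax",                      # then-branch call
                "GetMax", "GetMax",             # the two recursive calls
                "PairMax"],                     # PairMax(M1,M2,MX)
    "PairMax": [],
}
```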
Closely related to the choice of the module representation is the design of an algorithm that propagates the summarized information throughout the module. Here, the requirements are that the algorithm be efficient in both memory requirements and execution time, and that it handles interprocedural definition-use computation for both recursive and nonrecursive procedures. Definitions that reach across procedure boundaries include definitions of global variables that reach a call or return site, definitions of actual parameters that reach a call site and definitions of formal parameters that reach a return site. A similar situation exists for uses. Most of the existing interprocedural data flow analysis techniques make worst case assumptions at the call sites that involve recursion due to the fact that incomplete information is known about the called procedure. Thus, new techniques are required to handle these problems.
Other problems deal with guiding the testing of a module to meet the test case requirements that were computed using the results of the interprocedural data flow analysis. If only global variables are to be tested, a tool similar to that employed for intraprocedural data flow testing [10] can be used. A procedure is instrumented to record the execution path that occurs when the procedure is executed with particular test data as input. The tool then searches the execution path for the desired definition and use, making sure that the variable is not redefined on the subpath between them. Thus, an important component of this tool is the information about the locations of all definitions of a variable being considered. With global variables, the names of the variables remain the same throughout the module and thus, the locations of the definitions can be easily computed during interprocedural analysis and used in the testing.

However, for reference parameters, this is not the case. While searching for a definition and use pair, the execution path moves from procedure to procedure, causing the name of a definition to change. In order to ensure that the subpath from the definition to the use has no redefinition of the associated variable, the pairings of the actual and formal parameters must be handled when the execution path reaches a call or return site. A problem occurs when a definition and use are separated by a number of procedure calls. The actual parameter associated with a use in a procedure must be bound to the appropriate formal parameter on the returns along the call chain.
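The renaming along a call chain can be sketched with hypothetical binding tables (the site labels and helper names are ours; the actual tool derives the bindings from its analysis phase):

```python
# Each call site maps actual parameter names to the callee's formals.
bindings = {
    "GetMax@Main":    {"MAX": "MX"},
    "PairMax@GetMax": {"M1": "I", "M2": "J", "MX": "K"},
}

def name_after_call(var, site):
    """Name the memory location carries inside the callee."""
    return bindings[site].get(var, var)

def name_after_return(var, site):
    """Name the location reverts to in the caller on return."""
    inverse = {f: a for a, f in bindings[site].items()}
    return inverse.get(var, var)
```

Following the example's location back out of the call chain, name_after_return("K", "PairMax@GetMax") yields "MX", and name_after_return("MX", "GetMax@Main") yields "MAX": the three names of one memory location.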
4. INTERPROCEDURAL DATA FLOW TESTING
The interprocedural data flow testing is performed
in two parts: (1) static analysis of the module to compute
the interprocedural definition-use information for the test
caserequirements and (2) dynamic testing that guides the
testing of the module to meet the requirements, Sections
t A technique is jbw sensifive if it incorporates information about
the flow of control in the called procedures and J’LW imemifive
otherwise.
4.1 and 4.2 describe our interprocedural tester. To simplify our discussion, we assume that the user has chosen the 'all-uses' criterion for the testing and that only the definitions of reference parameters that have interprocedural uses are considered. Global variables can be handled similarly.
4.1. Computing the Interprocedural Data Flow Information

Our technique to compute the interprocedural definition-use information [12] represents the module by a graph, the interprocedural flow graph (IFG), that is based on the program summary graph [5]. Our algorithm computes the interprocedural definition-use information using the IFG for modules with both recursive and nonrecursive procedures. The technique has four steps.

Step 1: Construction of IFG subgraphs to abstract control flow information for each procedure in the program. A subgraph is constructed for each procedure where nodes represent regions of code associated with points that are of interest interprocedurally, and edges represent the control flow in the procedure. Local information is computed for non-local variables and is attached to appropriate nodes in the graph.

Step 2: Construction of an IFG to represent the interprocedural control flow in the program. The subgraphs of the procedures, obtained in step 1, are combined to create the IFG, which is constructed by creating edges that represent the bindings of formal and actual parameters in both called and calling procedures. Preserved information is computed for each procedure, using the IFG, and edges that represent this information are added to the graph.

Step 3: Propagation throughout the graph to obtain global information. The local information at each node is propagated in two phases throughout the graph resulting in the interprocedural definitions that reach, and the interprocedural uses that can be reached from, the parts of the program represented by the node in the graph.

Step 4: Computation of the interprocedural def-use and use-def chains. Interprocedural def-use and use-def chains are computed using both the local information and the propagated global information.

4.1.1. Steps 1 and 2: Constructing the Interprocedural Flow Graph

As procedures are processed one at a time in any order, the results of the intraprocedural data flow analysis are used to construct the IFG. The IFG has four types of nodes: entry, exit, call and return. Entry and exit nodes represent procedure entry and procedure exit respectively. Both nodes are created for every formal reference parameter of every procedure. Call and return nodes represent procedure invocation and procedure return respectively and both are created for every actual parameter of every call site. Edges from call nodes to entry nodes and exit nodes to return nodes correspond to the binding of formal and actual parameters. Reaching edges from entry and return nodes to call and exit nodes summarize the control information in the procedure by indicating that a definition that reaches the source of an edge also reaches the sink of the edge. This reaching edge is strictly intraprocedural since it is computed without incorporating the control structure of called procedures by using 'best case' assumptions at call sites. (Best case is to assume that there is no definition or use of the variable in the called procedure and that the variable is not preserved over the call.) Inter-reaching edges from call nodes to return nodes that abstract the reaching information of the called procedure allow this reaching information to be incorporated into the calling procedure. This edge is added to the IFG if the variable is preserved [16] over a call to the procedure. The inter-reaching edges allow the calling context of the called procedures to be preserved during the propagation.

Figure 2 gives the partial IFG for the module in Figure 1. Here, we show the part of the graph that represents reference parameter MX in procedure GetMax and the reference parameters for procedure PairMax at the call sites in B5, B6 and B7. The rest of the IFG is similar. In the graph, circles represent call and return nodes, double circles represent entry and exit nodes, solid lines correspond to binding edges, dashed lines to reaching edges and bold lines to inter-reaching edges. Thus, nodes 3, 5, 7, 9 and 11 are call nodes while nodes 4, 6, 8, 10 and 12 are their corresponding return nodes respectively. Entry and exit pairs are nodes (1,2), (13,14), (15,16), and (17,18). Reaching edge (1,11) indicates that a definition of MX that reaches the entry to GetMax also reaches the call to PairMax where it is used as a parameter. Likewise, the reaching edge (12,2) indicates that a definition of MX that reaches the return from PairMax also reaches the exit from GetMax. The flow of control in PairMax for formal reference parameters I and J is summarized by reaching edges (13,14) and (15,16). The last set of reaching edges, (4,7) and (6,9), indicates that a definition of the parameter that reaches the return from one procedure subsequently reaches another call site where it is used as a parameter. Finally, inter-reaching edges (7,8) and (9,10) indicate that definitions of the actual parameter that reach the call to PairMax are still available on the return from PairMax.

In step 1, as procedures are processed and IFG nodes are created, definition and use information about
[Partial IFG for GetMax and PairMax; the graph itself is not reproducible here. The use sets attached to its nodes are:
U1 = (B11,(B10,B11),(B10,B12))
U2 = (B12,(B10,B11),(B10,B12))
U3 = (B8)
U4 = (B8,B11,B12,(B10,B11),(B10,B12))]

Figure 2: Partial IFG for Module in Figure 1
formal and actual parameters in a module is recorded. Sets containing information about definitions (i.e., DEF) and uses (i.e., UPEXP) are attached to the nodes in the IFG. These sets contain data flow information about a procedure. They are roughly analogous to the 'gen' and 'use' sets that are computed during intraprocedural data flow analysis for each basic block in the flow graph of a procedure [1] (i.e., they represent local information about certain parts of the module that is propagated throughout the graph). We limit our discussion to the computation of the UPEXP sets. Analogous techniques are used to compute the DEF sets. The UPEXP sets are attached to entry and return nodes. The UPEXP of an entry node is the set of uses of the formal reference parameter that can be reached from the beginning of the procedure; the UPEXP of a return node is the set of uses of the actual reference parameter that can be reached from the return from a procedure. To compute the UPEXP set of a procedure, the control flow graph of the procedure is constructed and intraprocedural data flow analysis is performed. Again, consider the IFG in Figure 2 where the UPEXP sets (i.e., U1,...,U4) are attached to the entry and call nodes. Consider entry node 13 in the IFG which represents the entry into procedure PairMax for formal reference parameter I. In the procedure, I is used in B11 and on edges (B10,B11) and (B10,B12) (refer to Figure 3). Thus, the UPEXP set attached to node 13 is (B11,(B10,B11),(B10,B12)). Another example is entry
node 1 which represents the entry into procedure GetMax for formal reference parameter MX. MX has a local use in B8 and thus, UPEXP, attached to node 1, contains B8.

[Flow graph for PairMax: B10: if I > J; B11: K := I; B12: K := J; B13: return(I,J,K)]

Figure 3: Flow Graph for PairMax

During this first step, the intraprocedural data flow information is also used to determine the existence of definition-clear subpaths that are required to create the reaching edges. Since there is a definition-clear subpath with respect to MX from the beginning of GetMax to the call to PairMax in B7, reaching edge (1,11) is created. Another reaching edge is (13,14) which represents the fact that there is a definition-clear subpath with respect to variable I from the beginning to the end of PairMax.

After all procedures have been processed, step 2 consists of constructing the IFG by creating the appropriate binding edges among actual and formal parameters in the procedures. In Figure 2, call binding edges (7,13), (9,15), (11,17) and return binding edges (14,8), (16,10), (18,12) are added at this time. The inter-reaching edges are obtained by processing the graph using an iterative algorithm [5] to determine whether, for each entry node, the formal parameter is preserved. Inter-reaching edges are created for each call-return pair whose associated entry node is preserved. In Figure 2, edges (7,8) and (9,10) are added in step 2.

4.1.2. Step 3: Computing the Reaching Definitions and Reachable Uses

The third step consists of propagating the local definitions throughout the module using the IFG to obtain the sets of interprocedural reaching definitions (IN_D and OUT_D) and reachable uses (IN_U and OUT_U) for each node in the graph. For reaching definitions, the algorithm propagates the DEF sets forward in the graph as far as they reach. Likewise, the algorithm propagates the UPEXP sets in the graph as far as they can be reached except the flow is in the backward direction in the graph. To preserve the calling context, propagation is restricted over certain edges in the graph. For example, propagation of the UPEXP sets is restricted to the reaching edges, the inter-reaching edges and the return binding edges. The call binding edges are excluded to preserve the calling context of the called procedures.

To illustrate, consider the results of the computation of the OUT_U of node 18 in Figure 2. Since there is a path from this node to node 13 (i.e., 18,12,2,4,7,13), the UPEXP set of node 13 is propagated throughout the graph and added to the OUT_U of node 18. Other nodes with UPEXP sets reachable from node 18 are nodes 15, 9, 6, 4, 7 and 13. The UPEXP sets of all of these nodes are added to the OUT_U of node 18. Thus, the OUT_U of node 18 is the set (B8,B11,B12,(B10,B11),(B10,B12)), which represents the uses of parameter I reachable from the definition of K in the call to PairMax.

4.1.3. Step 4: Computation of the Interprocedural Def-Use and Use-Def Chains

With the OUT_U attached to all nodes in the graph, the interprocedural def-use pairs can be computed. In procedure PairMax, there are definitions of K in B11 and B12 which reach the associated exit node 18 in the IFG. Thus, the uses that can be reached from this definition are those in the OUT_U of node 18. These are the uses of I, J and MX in B8, B10, B11 and B12. Thus, we get the following definition-use associations for testing.

variable | definition | uses
K        | B11        | B8, B11, B12, (B10,B11), (B10,B12)
K        | B12        | B8, B11, B12, (B10,B11), (B10,B12)

4.1.4. Complexity Analysis

The IFG inherits its space requirements from the program summary graph, whose size is proportional to the length of the program [5]. The time complexity of the algorithm is determined by considering each of the four steps. Clearly, in the first step, the creation of the graph requires one visit to each of the n nodes in the IFG. The last step in the algorithm is performed by considering the definition in each DEF set and combining the appropriate OUT_U sets to get the interprocedural def-use chains. This step also requires one visit to each node during the computation. In step two the preserved information that is required for the inter-reaching edges is computed. For modules with no recursion, this computation is linear in the number of nodes in the IFG. The propagation of the
recursive procedures, the algorithm performs in linear time.

For programs with recursive procedures, steps two and three depend on the number of procedures. For step two, in the worst case, the processing may visit each node p times where p is the number of procedures in the program and thus, the time is O(pn). In step three, p iterations may be required to obtain the IN and OUT sets. Thus, the algorithm is O(pn) when recursion is present.
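The step-three propagation analyzed above can be sketched as a simple fixed-point iteration (our illustration; the edge list is a hypothetical fragment of Figure 2, and the restriction to permitted edge kinds is assumed to have been applied already):

```python
# Fixed-point propagation of UPEXP sets over the permitted IFG edges
# (reaching, inter-reaching, return binding): OUT_U of a node collects
# the UPEXP of every node reachable from it along such edges.
def propagate_uses(edges, upexp):
    """edges: list of (source, sink) pairs; upexp: node -> set of uses."""
    out_u = {n: set() for n in upexp}
    changed = True
    while changed:                      # iterate until no set grows
        changed = False
        for src, dst in edges:
            new = upexp[dst] | out_u[dst]
            if not new <= out_u[src]:
                out_u[src] |= new
                changed = True
    return out_u
```

On the Figure 2 path 18,12,2,4,7,13, for example, the UPEXP set of node 13 ends up in the OUT_U of node 18.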
algorithm Accept
input
  TRACEFILE: file of block numbers of execution path;
  VAR: variable being defined;
  DBLK: block number of definition;
  UBLK1: block number of c-use (or source of edge of p-use);
  UBLK2: block number of sink of edge of p-use;
  TY: c-use or p-use;
declare
  B: block number;
  DEFSET: blocks in current procedure where VAR is defined;
  PROCSTACK: stack to store call chain;
  CONTINUE: boolean;
begin
  ACCEPT := false; PROCSTACK := NULL;
  DEFSET := SetOfBlocks(VAR);
  while not eof(TRACEFILE) and not ACCEPT do
    repeat
      B := GetNextBlock(TRACEFILE);
      if IsACall(B) then Push(PROCSTACK,B)
      elseif IsAReturn(B) then Pop(PROCSTACK);
    until eof(TRACEFILE) or B = DBLK;
    if B = DBLK then
4.2. INTERPROCEDURAL DATA FLOW TESTING
After the interprocedural data flow information is computed and the required definition-use pairs are determined, the tester guides the selection and execution of the module with test cases as input. This consists of choosing the required definition-use pairs according to the desired testing criterion and processing the test cases until the required pairs are satisfied. The required definition-use pairs depend on the desired testing criterion. For example, if the criterion is 'all-p-uses/some-c-uses', the tester runs the acceptor with all definition-p-use pairs. If for a particular definition, no p-use exists, then some definition-c-use pairs must be accepted.
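The pair selection for this criterion can be sketched as follows (hypothetical tuples of the form (definition block, use, kind); this is our illustration of the rule just stated, not the tool's code):

```python
# Select required pairs for 'all-p-uses/some-c-uses': every p-use pair,
# plus one c-use pair for each definition that has no p-use at all.
def required_pairs(pairs):
    p_uses = [p for p in pairs if p[2] == "p-use"]
    defs_with_p = {p[0] for p in p_uses}
    some_c = {}
    for d, use, kind in pairs:
        if kind == "c-use" and d not in defs_with_p and d not in some_c:
            some_c[d] = (d, use, kind)   # keep one c-use per such def
    return p_uses + list(some_c.values())
```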
Processing a test case consists of (1) executing the module with the test data as input to get the test path and (2) running the test case acceptor with the test path and a definition-use pair as input. Our tester instruments the module at the intermediate code level. Intermediate code statements are inserted that output to a file the number of each basic block that is traversed during module execution.

      B := GetNextBlock(TRACEFILE);
      CONTINUE := true;
      while CONTINUE do
        if IsACall(B) then
          Push(PROCSTACK,B);
          VAR := GetFormalName(VAR,B);
          DEFSET := SetOfBlocks(VAR);
        elseif IsAReturn(B) then
          VAR := GetActualName(VAR,Pop(PROCSTACK));
          DEFSET := SetOfBlocks(VAR);
        elseif B = UBLK1 then
          if TY = c-use then ACCEPT := true;
However, the execution path must also contain
information that signals procedure calls and returns so that
the renaming of formal and actual reference parameters
can be handled. We accomplish this by instrumenting the
intermediate code to indicate procedure calls and returns.
The algorithm for the acceptor is given in Figure 4. It
inputs the file containing the block numbers of the execution path (TRACEFILE) along with the information about
the definition-use pair that is to be tested (VAR, DBLK,
UBLKl, UBLK2, TY). VAR represents the variable
being considered and DBLK representsthe block number
containing the desired definition. TY is either c-use or puse. If TY is c-use, then UBLKl contains the block
number of the desired use and UBLK2 is not used; if TY
is p-use, then (UBLKl, UBLK2) represents the desired
edge. Since we also have accessto the data flow information that was computed in the first phase, it is used to
determine the set of blocks containing definitions, DEFSET, of the actual parameter at returns from procedures
and the formal parameterat the entries to procedures.The
entry into a procedure causesthe DEFSET to be changed
to reflect the renaming of the formal parameter that is
bound to the definition being tested. The exit from a procedure causes the DEFSET to revert back to its value in
elseif TY = p-use then
B := GetNextBlock(TRACEFILE);
if B = UBLK2 then ACCEPT := true;
endiP;
elseif B E DEFSET then CONTINUE := false;
endif;
B := GetNextBlock(TRACEF);
endwhile;
endif;
endwhile;
Accept := ACCEPT;
end
Figure 4: Algorithm Accept
local information throughout the graph is accomplishedin
step three. If no recursion is present, only one iteration of
the algorithm is required to compute the global information. This requires a reverse topological ordering of the
nodes in each procedure along with an invocation ordering on the procedures themselves. Thus, for non-
165
the calling procedure.
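The logic of the acceptor in Figure 4 can be sketched in executable form. This is our own simplification, not the paper's implementation: the trace is a list of events rather than a file of block numbers, and the hypothetical tables defsets, formal_of, and actual_of stand in for SetOfBlocks, GetFormalName, and GetActualName.

```python
def accept(trace, var, dblk, ublk1, ublk2, ty, defsets, formal_of, actual_of):
    """Decide whether a trace covers the pair (dblk, ublk1[/ublk2]) for var.

    trace     -- list of ("block", n), ("call", site) or ("return",) events
    defsets   -- variable name -> set of blocks that define it (SetOfBlocks)
    formal_of -- (name, call site) -> formal bound at the call (GetFormalName)
    actual_of -- (name, call site) -> actual restored on return (GetActualName)
    """
    stack = []                      # PROCSTACK: the call chain
    tracking = False                # have we reached the definition at dblk?
    cur, defset = var, defsets.get(var, set())
    for i, ev in enumerate(trace):
        if ev[0] == "call":
            stack.append(ev[1])
            if tracking:            # rename the actual to the callee's formal
                cur = formal_of.get((cur, ev[1]), cur)
                defset = defsets.get(cur, set())
        elif ev[0] == "return":
            site = stack.pop() if stack else None
            if tracking:            # restore the caller's actual name
                cur = actual_of.get((cur, site), cur)
                defset = defsets.get(cur, set())
        else:
            blk = ev[1]
            if tracking:
                if blk == ublk1:
                    if ty == "c-use":
                        return True                 # ACCEPT
                    # p-use: the next event must be the sink block of the edge
                    if i + 1 < len(trace) and trace[i + 1] == ("block", ublk2):
                        return True
                elif blk in defset:
                    tracking = False                # definition killed; rescan
                    cur, defset = var, defsets.get(var, set())
            if not tracking and blk == dblk:
                tracking = True                     # DBLK found
    return False
```

For instance, a definition of a formal f in a callee's block 10 that reaches a c-use of the corresponding actual x in the caller's block 4 after the return is accepted, while an intervening redefinition of x kills the pair.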
To illustrate, consider the test case with input 4,8,9,2, where the definition-use pair being processed is the definition of K in B12 of procedure PairMax and the use is the edge (B10,B11). This definition of K reaches the use of I on edge (B10,B11) since it reaches over the return to calling procedure GetMax, over the return to GetMax through the recursive call, over the call to GetMax again, and finally over the call to PairMax, where the variable is bound to I. The steps involved in processing the test path are shown below. The DEFSET is initialized to the set of definitions of the variable corresponding to the definition that is being tested. At each call to and return from a procedure, the DEFSET is changed to reflect the new name of the actual or formal parameter. Since there are no interprocedural definitions of the corresponding variable MX in GetMax, the DEFSET is empty while processing the blocks in that procedure. The partial test paths are indented to show the procedure nesting along the execution path.
[Trace table: its two columns, "execution subpath" and "action taken", are garbled in this reproduction. The recoverable subpaths are 1,2,4,5,GetMax; 9,10,12; 1,2,3,PairMax; 13,Return; and 9,10,11. The recoverable actions are Push(5); Push(3), reset VAR, assign DEFSET; DEFSET = {B11,B12}; DBLK found; Pop(3), reset VAR, assign DEFSET; an empty DEFSET { } while processing GetMax; and, at the final subpath, use edge found => accept test case.]
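The renaming performed at each call and return reduces to table lookups keyed by the current name and the call site. A minimal sketch; the binding tables, call sites, and path below are hypothetical stand-ins echoing the GetMax/PairMax example, not the tool's data structures:

```python
def track_name(var, path, formal_of, actual_of):
    """Follow a variable's name along a chain of call/return events.
    Each event is (kind, site); an unknown binding leaves the name unchanged,
    which models recursive calls and returns within the same procedure."""
    cur, names = var, [var]
    for kind, site in path:
        table = formal_of if kind == "call" else actual_of
        cur = table.get((cur, site), cur)
        names.append(cur)
    return names

# Hypothetical bindings: K in PairMax corresponds to actual MX in GetMax,
# and MX is bound to formal I on the call to PairMax.
actual_of = {("K", "B13"): "MX"}    # return from PairMax
formal_of = {("MX", "B3"): "I"}     # call to PairMax
path = [("return", "B13"), ("return", "B5"), ("call", "B5"), ("call", "B3")]
print(track_name("K", path, formal_of, actual_of))
# -> ['K', 'MX', 'MX', 'MX', 'I']
```

The two middle events model the recursive return from and call to GetMax, across which the name is unchanged.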
To avoid retesting all interprocedural data dependencies in a module when a modification is made to one procedure, algorithms have been developed that efficiently retest the modified procedure and its interactions by first identifying and then retesting only those interactions of the module affected by the change; the need for complete retesting is eliminated [12,13]. The nodes and edges of the portion of the IFG associated with the modified procedure are updated to reflect the changes. The control flow graph of no procedure other than the modified procedure needs to be recomputed, and the structure of the IFG remains unchanged for these unmodified procedures. This algorithm determines the affected definition-use pairs, which are then reported to the user for retesting.
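The selection of affected pairs can be viewed as a filter over the recorded definition-use pairs. A sketch under our own assumption that each pair records the procedures containing its definition and use and the procedures its path crosses; the paper derives this information from the IFG:

```python
def affected_pairs(pairs, modified_proc):
    """Select the definition-use pairs that must be retested after a change to
    modified_proc: those whose definition or use lies in it, or whose
    interprocedural path crosses it. The pair format is hypothetical."""
    return [p for p in pairs
            if modified_proc in (p["def_proc"], p["use_proc"])
            or modified_proc in p["crossed"]]

# Hypothetical pairs mirroring the example's shape.
p1 = {"var": "K", "def_proc": "PairMax", "use_proc": "PairMax",
      "crossed": {"GetMax"}}
p2 = {"var": "N", "def_proc": "Main", "use_proc": "Main", "crossed": set()}
print(affected_pairs([p1, p2], "GetMax"))   # only p1 crosses GetMax
```

Pairs entirely outside the modified procedure are left untouched, which is the source of the savings over complete retesting.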
The interprocedural data flow tester described in this paper illustrates our technique for testing with criteria other than 'all-du-paths'. Although requiring that the 'all-du-paths' criterion be satisfied may be prohibitive in most cases because of the numerous subpaths that would exist interprocedurally, some applications justify this level of testing. In this case, additional information about the subpaths involving the interprocedural definitions and uses is obtained during intraprocedural analysis and used to construct the interprocedural du-paths that are required for the testing.
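Constructing interprocedural du-paths from per-procedure subpath information amounts to joining one def-clear segment from each procedure along the call/return chain. A sketch of that join, with hypothetical segments and block labels (the paper does not give this construction in code):

```python
from itertools import product

def join_segments(segments):
    """Concatenate one subpath choice from each procedure on the chain into a
    full interprocedural du-path (our simplification of the construction)."""
    return [sum(choice, []) for choice in product(*segments)]

# Hypothetical def-clear segments: from the definition to the return in the
# callee, then from the return point to the use in the caller.
segments = [
    [["B12", "B13"]],                        # callee: definition to return
    [["B9", "B10"], ["B9", "B11", "B10"]],   # caller: two alternative subpaths
]
print(join_segments(segments))
# -> [['B12', 'B13', 'B9', 'B10'], ['B12', 'B13', 'B9', 'B11', 'B10']]
```

The Cartesian product makes explicit why the number of interprocedural du-paths grows quickly: each procedure on the chain multiplies in its own alternatives.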
Future work includes experimentation with the data flow tester to determine the cost of applying interprocedural data flow testing, especially when changes are made within a module. In addition, we intend to apply the same principle of summarizing information to the problem of developing an intermodular data flow testing technique. Also, although aliases can be incorporated into the original program summary graph, we have yet to incorporate them into our testing system.
The interprocedural data flow technique has been implemented and tested on a Sun 3/50 Workstation. Additionally, the interprocedural definition-use information has been used in the implementation of the interprocedural tester that utilizes the data flow information.
5. CONCLUSIONS

In this paper, we have extended data flow testing to include the testing of definitions that reach, and uses that can be reached, across procedure calls and returns. Doing so required the development of an interprocedural data flow analysis technique that computes the locations of interprocedural reaching definitions and use pairings, and the development of a testing methodology that associates actual and formal parameters in calls and returns. The benefit of the resulting testing system is that data flow testing can be uniformly applied to individual procedures, to the integration of procedures in a module, and to the interfaces of procedures.
References

1. A. V. Aho, R. Sethi, and J. D. Ullman, Compilers: Principles, Techniques, and Tools, Addison-Wesley Publishing Company, Massachusetts, 1986.
2. F. E. Allen, "Interprocedural data flow analysis," in IFIP Information Processing 74, North-Holland Publishing Company, 1974.
3. J. P. Banning, "An efficient way to find the side effects of procedure calls and aliases of variables," Sixth Annual ACM Symposium on Principles of Programming Languages, pp. 29-41, January 1979.
4. J. M. Barth, "A practical interprocedural data flow analysis algorithm," CACM, vol. 21, no. 9, pp. 724-736, September 1978.
5. D. Callahan, "The program summary graph and flow-sensitive interprocedural data flow analysis," Proceedings of the SIGPLAN'88 Conference on Programming Language Design and Implementation, pp. 47-56, Atlanta, GA, June 1988.
6. M. D. Carroll and B. G. Ryder, "An incremental algorithm for software analysis," Proceedings of the SIGSOFT/SIGPLAN Software Engineering Symposium on Practical Software Development Environments, SIGPLAN Notices, vol. 22, no. 1, January 1987.
7. M. D. Carroll and B. G. Ryder, "Incremental data flow analysis via dominator and attribute updates," Proceedings of the Fifteenth Annual ACM SIGACT/SIGPLAN Symposium on Principles of Programming Languages, San Diego, CA, January 1988.
8. L. A. Clarke, A. Podgurski, D. Richardson, and S. Zeil, "A comparison of data flow path selection criteria," Proceedings 8th International Conference on Software Engineering, pp. 244-251, London, UK, August 1985.
9. K. Cooper and K. Kennedy, "Interprocedural side-effect analysis in linear time," Proceedings of the SIGPLAN'88 Conference on Programming Language Design and Implementation, pp. 57-66, Atlanta, GA, June 1988.
10. P. G. Frankl, S. N. Weiss, and E. J. Weyuker, "ASSET: A system to select and evaluate tests," Proceedings of the IEEE Conference on Software Tools, New York, April 1985.
11. P. G. Frankl and E. J. Weyuker, "An applicable family of data flow testing criteria," IEEE Transactions on Software Engineering, vol. 14, no. 10, pp. 1483-1498, October 1988.
12. M. J. Harrold, "An approach to incremental testing," Technical Report 89-1, Department of Computer Science, University of Pittsburgh, January 1989.
13. M. J. Harrold and M. L. Soffa, "An incremental data flow testing tool," Proceedings of the Sixth International Conference on Testing Computer Software, Washington, DC, May 1989.
14. B. Korel and J. Laski, "A tool for data flow oriented program testing," ACM Softfair Proceedings, pp. 35-37, December 1985.
15. J. W. Laski and B. Korel, "A data flow oriented program testing strategy," IEEE Transactions on Software Engineering, vol. SE-9, no. 3, pp. 347-354, May 1983.
16. D. B. Lomet, "Data flow analysis in the presence of procedure calls," IBM Journal of Research and Development, vol. 21, no. 6, pp. 559-571, November 1977.
17. E. W. Myers, "A precise inter-procedural data flow algorithm," Conference Record of the Eighth Annual ACM Symposium on Principles of Programming Languages, pp. 219-230, Williamsburg, VA, January 1981.
18. S. C. Ntafos, "An evaluation of required element testing strategies," Proceedings 7th International Conference on Software Engineering, pp. 250-256, Orlando, Florida, March 1984.
19. S. Rapps and E. J. Weyuker, "Selecting software test data using data flow information," IEEE Transactions on Software Engineering, vol. SE-11, no. 4, pp. 367-375, April 1985.