27.03.2014 Views

SEKE 2012 Proceedings - Knowledge Systems Institute


Therefore, the calculation of RTS under Partial-CATESR can be given by the following process.

\[
RTS \leftarrow
\begin{cases}
RTS_{HC} & \text{if } TS_{HC} \neq \emptyset, \\
RTS_{MC} & \text{else if } TS_{MC} \neq \emptyset, \\
RTS_{LC} & \text{otherwise.}
\end{cases}
\tag{3}
\]

Full-CATESR is a naive strategy that simply combines the three reduced sub test suites into one test suite. The construction of RTS can be defined as follows:

\[
RTS \leftarrow RTS_{HC} \cup RTS_{MC} \cup RTS_{LC}
\tag{4}
\]
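Equations (3) and (4) can be sketched in Python as below. This is only an illustrative sketch: the function and parameter names are hypothetical, test cases are represented as strings, and the sub test suites (TS_HC, TS_MC) and their reduced counterparts (RTS_HC, RTS_MC, RTS_LC) are assumed to have been computed already.

```python
from typing import Set


def partial_catesr(ts_hc: Set[str], ts_mc: Set[str],
                   rts_hc: Set[str], rts_mc: Set[str],
                   rts_lc: Set[str]) -> Set[str]:
    """Equation (3): keep only the reduced suite of the first non-empty category."""
    if ts_hc:            # TS_HC is not empty
        return set(rts_hc)
    if ts_mc:            # TS_MC is not empty
        return set(rts_mc)
    return set(rts_lc)   # fall back to the LC category


def full_catesr(rts_hc: Set[str], rts_mc: Set[str],
                rts_lc: Set[str]) -> Set[str]:
    """Equation (4): union of the three reduced sub test suites."""
    return set(rts_hc) | set(rts_mc) | set(rts_lc)
```

Under these assumptions, Partial-CATESR never returns more test cases than Full-CATESR for the same inputs, since its result is always one of the three sets that Full-CATESR unites.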

IV. EMPIRICAL STUDIES

To evaluate the effectiveness of our CATESR approach, we perform empirical studies to answer the following research questions:

RQ1: To what extent can the CATESR approach decrease the size of the reduced test suite compared to the HGS approach?

RQ2: Can the FDA be maintained when the CATESR approach generates a smaller reduced test suite than the HGS approach?

RQ3: Is the FDA weakened by the partial coverage of test requirements?

A. Subjects and Experiment Setup

1) Experiment Subjects: We adopt the seven small C programs of the Siemens suite and one large-scale C program named space in our empirical study. The Siemens suite was originally contributed by Ostrand et al. for a study of the fault detection abilities of control-flow and data-flow coverage criteria [12], and has been partially modified by researchers for further studies. space is an interpreter for statements written in an array definition language (ADL). Each subject contains a single correct version and a set of faulty versions, each containing a single fault.

TABLE II
EXPERIMENT SUBJECTS

Subject     # Test Cases   # Versions   LOC    Description
printtok    4130           7            402    Lexical Analyzer
printtok2   4115           10           483    Lexical Analyzer
replace     5542           29           516    Pattern Replacement
schedule    2650           9            299    Priority Scheduler
schedule2   2710           9            297    Priority Scheduler
tcas        1608           40           138    Altitude Separation
totinfo     1052           23           346    Information Measure
space       13585          35           6218   ADL Interpreter

The characteristics of these subjects are summarized in Table II. Each subject has a large test pool, containing between 1052 and 13585 test cases, and between 7 and 40 single-fault versions. For the subjects in the Siemens suite, the number of lines of code (LOC) ranges from 138 to 516, which is small compared to practical programs. Therefore, we introduce space, which contains 6218 LOC, to verify the scalability of our CATESR approach. These subjects are available from the Subject Infrastructure Repository (SIR) at the University of Nebraska-Lincoln [4] (see footnote 2).

2) Experiment Setup: Since our approach involves randomness, we independently perform the experiment 1000 times on each faulty version of each subject. In each iteration, we first generate a test suite by randomly choosing test cases from the test pool, and then apply both Partial-CATESR and Full-CATESR to reduce it. To show the effectiveness of our approach, we also implement the HGS algorithm [7] as a baseline. To generate a test suite randomly, we first determine its size n by multiplying the LOC of the subject by a random number between 0 and 0.5. We then randomly choose n test cases from the test pool to construct TS. Finally, if the test cases in TS do not cover all feasible requirements covered by the test pool, we add further test cases chosen randomly from the remainder of the test pool. If no test case in TS can detect the fault in the faulty version under study, we discard TS and regenerate the test suite. This experiment design is motivated by Rothermel et al. [19].
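The generation procedure described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name, the `coverage` mapping from test case to covered requirements, the `detecting` set of fault-revealing test cases, and the `max_attempts` cap are all hypothetical. The text does not say whether the extra test cases must each add coverage, so the sketch assumes they are appended in random order until every requirement covered by the pool is covered.

```python
import random
from typing import Dict, List, Set


def generate_test_suite(pool: List[str],
                        coverage: Dict[str, Set[int]],
                        detecting: Set[str],
                        loc: int,
                        rng: random.Random,
                        max_attempts: int = 1000) -> List[str]:
    """Randomly build a test suite TS that covers every requirement covered
    by the pool and contains at least one fault-detecting test case."""
    all_reqs = set().union(*(coverage[t] for t in pool))
    for _ in range(max_attempts):
        # Size n = LOC times a random factor in [0, 0.5], capped at the pool size.
        n = min(len(pool), int(loc * rng.uniform(0.0, 0.5)))
        suite = rng.sample(pool, n)
        covered = set().union(*(coverage[t] for t in suite))
        # Top up from the remainder of the pool until coverage is complete.
        chosen = set(suite)
        remainder = [t for t in pool if t not in chosen]
        rng.shuffle(remainder)
        for t in remainder:
            if covered >= all_reqs:
                break
            suite.append(t)
            covered |= coverage[t]
        # Discard and regenerate if the suite cannot detect the fault.
        if any(t in detecting for t in suite):
            return suite
    raise RuntimeError("no fault-detecting test suite found")
```

Passing a seeded `random.Random` instance keeps each of the repeated runs reproducible, which is a design choice of this sketch rather than something stated in the text.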

B. Results and Analysis

In this subsection, we analyze the data gathered from our experiments to answer RQ1, RQ2, and RQ3.

1) Experiment Metrics: In our empirical study, we mainly focus on two metrics: time cost and FDA.

In real testing scenarios, time cost includes test environment setup, test case execution, and test result examination. Because test suite reduction lowers the time cost of regression testing by shrinking the test suite, we use only the size of the test suite to measure time cost. Consequently, we adopt the average reduction relative to the original test suite, as a percentage over the 1000 independent experiments, given by the following formula:

\[
TS_{Reduced} = \frac{|TS|_{avg} - |RTS|_{avg}}{|TS|_{avg}} \times 100
\tag{5}
\]
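As a small illustration of Equation (5), the sketch below computes the reduction percentage from the suite sizes recorded in each run. The function name and the input lists are hypothetical, and the sizes in the usage note are made up for illustration, not taken from the experiments.

```python
from typing import Sequence


def ts_reduced(original_sizes: Sequence[int],
               reduced_sizes: Sequence[int]) -> float:
    """Equation (5): average size reduction, as a percentage of |TS|_avg."""
    ts_avg = sum(original_sizes) / len(original_sizes)    # |TS|_avg
    rts_avg = sum(reduced_sizes) / len(reduced_sizes)     # |RTS|_avg
    return (ts_avg - rts_avg) / ts_avg * 100
```

For example, original sizes of 100 and 200 with reduced sizes of 20 and 40 give averages of 150 and 30, so the reduction is (150 - 30) / 150 * 100 = 80 percent. Note that the formula divides the averages rather than averaging the per-run ratios.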

To measure the effectiveness of our approach, we also need to check the FDA after test suite reduction. Since each faulty version contains only one fault, we define an indicator function $g_f$ as follows:

\[
g_f(TS) =
\begin{cases}
1 & \text{if fault } f \text{ can be detected by } TS, \\
0 & \text{otherwise.}
\end{cases}
\tag{6}
\]

Consequently, we take the average value $\overline{g_f}(TS)$ of the indicator over the 1000 experiments as the metric of fault detection ability, denoted by FDA.
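The indicator function and its average can be sketched as follows. The names are hypothetical, and the sketch assumes that the set of test cases able to detect fault f is known in advance, for instance from running the whole test pool on the faulty version.

```python
from typing import Iterable, Sequence, Set


def g_f(detecting: Set[str], suite: Iterable[str]) -> int:
    """Indicator: 1 if any test case in the suite detects fault f, else 0."""
    return 1 if detecting & set(suite) else 0


def fda(detecting: Set[str], suites: Sequence[Iterable[str]]) -> float:
    """FDA: mean of the indicator over all experiment runs."""
    return sum(g_f(detecting, s) for s in suites) / len(suites)
```

With four made-up runs in which only two of the suites contain a fault-detecting test case, `fda` returns 0.5. A value of 1 would mean that every suite in the sample still detects the fault.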

2) Test Suite Size Analysis: To answer RQ1, we analyze the data and extract the sizes of the reduced test suites. The reduction extent is measured by Equation (5) and visualized in Figure 2. For each subject, over all available faulty versions, we mark three key values on the same vertical drop line to show the reduction extent $TS_{Reduced}$ for HGS, Partial-CATESR, and Full-CATESR, respectively.

Footnote 2: http://sir.unl.edu/php/index.php
