27.03.2014 Views

SEKE 2012 Proceedings - Knowledge Systems Institute


results. To be more specific, they will lead to high false negative and false positive rates, respectively.

5 RELATED WORK

Voas [2] introduced the PIE model, which emphasizes that three conditions must be satisfied for a failure to be observed: “Execution”, “Infection”, and “Propagation”. W. Masri et al. [3] showed that coincidental correctness is responsible for reducing the safety of CBFL.
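The three PIE conditions can be illustrated with a toy program. The sketch below is not taken from the cited work: the function `program`, its seeded fault, and the labels are hypothetical, and it assumes the usual reading of coincidental correctness, namely a test that executes the fault yet still produces the expected output.

```python
def program(x, y, faulty=True):
    """Toy program (hypothetical). Returns (output, internal_state, fault_executed)."""
    executed = False
    if x > 0:
        executed = True                   # the faulty statement is reached
        y = y * 2 if faulty else y + 2    # seeded fault: '*' instead of '+'
    return y > 0, y, executed


def pie(x, y):
    """Check Execution, Infection and Propagation for one test input."""
    out_f, state_f, executed = program(x, y, faulty=True)
    out_c, state_c, _ = program(x, y, faulty=False)
    infected = executed and state_f != state_c    # internal state differs
    propagated = infected and out_f != out_c      # difference reaches the output
    return executed, infected, propagated


def label(x, y):
    """Assumed labelling: a test that runs the fault without failing is coincidentally correct."""
    executed, infected, propagated = pie(x, y)
    if propagated:
        return "failed"
    return "coincidentally correct" if executed else "passed"


for test_input in [(-1, 5), (1, 2), (1, 5), (1, -1)]:
    print(test_input, pie(*test_input), label(*test_input))
```

Here the input (1, 2) executes the fault without infecting the state, and (1, 5) infects the state without propagating to the output; both pass even though they cover the faulty statement, which is the situation that weakens coverage-based rankings.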

As shown in previous studies [4, 5], the efficiency and accuracy of CBFL can be improved by cleaning the coincidentally correct test cases. However, identifying coincidental correctness is challenging because the location of the fault is not known beforehand. X. Wang et al. [4] proposed the concept of context patterns to help refine coverage, so that the correlation between program failures and the coverage of faulty statements is strengthened. W. Masri et al. [5] presented variations of a technique that identifies the subset of passed test cases that are likely to be coincidentally correct. One of these techniques first identifies program elements (cc_e) that are likely to be correlated with coincidentally correct test cases. It then categorizes test cases that induce some cc_e's as coincidentally correct. The set of coincidentally correct test cases is further partitioned into two clusters, and the more suspicious subset is cleaned to improve the effectiveness of fault localization. The experimental results are promising. However, although their experiment used the same subject programs (the Siemens test suite) as ours, their technique is applicable to only 18 of the 132 versions, a smaller application scope than that of our approach (applicable to 66 of the 132 versions).

Previous empirical observations have shown that cluster analysis groups test cases with similar behaviors into the same clusters. Therefore, cluster analysis has been introduced for test case selection. Vangala et al. [12] used program profiles and static execution to compare test cases and applied cluster analysis to them, identifying redundant test cases with high accuracy. Dickinson et al. introduced the cluster filtering technique [7, 13]. It groups similar execution profiles into the same clusters and then selects a subset of test cases from each cluster according to a sampling strategy. Since test cases in the same cluster behave similarly, the selected subsets are representative of the test suite, so most faults can be found by using the selected subsets instead of the whole test suite.
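The general idea of cluster filtering can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure of [7, 13]: it assumes binary statement-coverage vectors as execution profiles, Euclidean distance, a plain K-means seeded with the first k profiles, and uniform random sampling of one test per cluster. The names `kmeans` and `sample_per_cluster` are hypothetical.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two execution profiles."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def kmeans(profiles, k, iterations=20):
    """Plain K-means; returns the cluster index assigned to each profile."""
    # Assumption: seed the centroids with the first k profiles (deterministic).
    centroids = [list(p) for p in profiles[:k]]
    assignment = [0] * len(profiles)
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in profiles
        ]
        for c in range(k):
            members = [p for p, a in zip(profiles, assignment) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


def sample_per_cluster(assignment, n_per_cluster=1, seed=0):
    """Select up to n_per_cluster test indices from every cluster."""
    rng = random.Random(seed)
    clusters = {}
    for test_id, c in enumerate(assignment):
        clusters.setdefault(c, []).append(test_id)
    selected = []
    for members in clusters.values():
        selected.extend(rng.sample(members, min(n_per_cluster, len(members))))
    return sorted(selected)


# Each row is one test case; each column is 1 if the statement was executed.
profiles = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
]
assignment = kmeans(profiles, k=2)
selected = sample_per_cluster(assignment)
print(assignment, selected)
```

With these profiles, tests 0, 2 and 4 fall into one cluster and tests 1 and 3 into the other, so one representative is drawn from each group. Other sampling strategies would replace only `sample_per_cluster`.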

6 CONCLUSIONS AND FUTURE WORK

In this paper, we proposed a clustering-based strategy to identify coincidental correctness in the set of passed test cases. To alleviate the adverse effect of coincidental correctness on the effectiveness of CBFL, two strategies, removing and relabeling, were introduced to deal with the identified coincidentally correct test cases. We conducted an experiment to evaluate the proposed approach. The experimental results suggest that it achieves results close to those of the ideal situation.
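The effect of the two strategies on a CBFL ranking can be sketched with the Tarantula metric [1]. The code below is an illustration under assumptions, not the implementation evaluated in this paper: the set `cc` of identified coincidentally correct tests is taken as given (the clustering step is omitted), relabeling is assumed to mean treating those tests as failed, and the coverage matrix is made up for the example.

```python
def tarantula(coverage, failed):
    """Tarantula suspiciousness per statement.

    coverage[t][s] is 1 if test t executes statement s;
    failed[t] is True if test t failed.
    """
    total_failed = sum(failed)
    total_passed = len(failed) - total_failed
    scores = []
    for s in range(len(coverage[0])):
        f = sum(1 for t, row in enumerate(coverage) if row[s] and failed[t])
        p = sum(1 for t, row in enumerate(coverage) if row[s] and not failed[t])
        fail_ratio = f / total_failed if total_failed else 0.0
        pass_ratio = p / total_passed if total_passed else 0.0
        denom = fail_ratio + pass_ratio
        scores.append(fail_ratio / denom if denom else 0.0)
    return scores


def remove_tests(coverage, failed, cc):
    """Removing strategy: drop the identified tests from the suite."""
    keep = [t for t in range(len(failed)) if t not in cc]
    return [coverage[t] for t in keep], [failed[t] for t in keep]


def relabel_tests(coverage, failed, cc):
    """Relabeling strategy (assumed semantics): count the identified tests as failed."""
    return coverage, [f or (t in cc) for t, f in enumerate(failed)]


# Hypothetical data: statement 1 is faulty; tests 1 and 3 execute it but pass.
coverage = [
    [1, 1, 0],  # test 0, failed
    [1, 1, 0],  # test 1, passed (coincidentally correct)
    [1, 0, 1],  # test 2, passed
    [1, 1, 1],  # test 3, passed (coincidentally correct)
]
failed = [True, False, False, False]
cc = {1, 3}

baseline = tarantula(coverage, failed)
removed = tarantula(*remove_tests(coverage, failed, cc))
relabeled = tarantula(*relabel_tests(coverage, failed, cc))
print(baseline, removed, relabeled)
```

In this made-up example the suspiciousness of the faulty statement rises from 0.6 to 1.0 under both strategies. Relabeling also raises the score of statement 2 from 0 to 0.25, because test 3 now counts as a failure that covers it.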

We intend to conduct more comprehensive empirical studies and explore the following issues in our future work:

1) Search for clustering algorithms better suited to this scenario. As noted in Section 4.4, failed test cases that are clustered either too centrally or too sparsely lead to poor results. We used K-means in our experiment for its simplicity, and many other clustering algorithms remain to be explored.

2) Conduct empirical studies on how multiple faults affect the results of our approach, and explore how to deal with this situation to minimize the adverse effects.

REFERENCES

[1] J.A. Jones and M. J. Harrold, “Empirical evaluation of the tarantula automatic fault-localization technique”, ASE, 2005, pp. 273-282.

[2] J.M. Voas, “PIE: A dynamic failure-based technique”, IEEE Trans. Softw. Eng., 1992, pp. 717-727.

[3] W. Masri, R. Abou-Assi, M. El-Ghali and N. Fatairi. Nour, “An empirical study of the factors that reduce the effectiveness of coverage-based fault localization”, ISSTA, 2009, pp. 1-5.

[4] X. Wang, S. Cheung, W. Chan and Z. Zhang, “Taming coincidental correctness: Coverage refinement with context patterns to improve fault localization”, ICSE, 2009, pp. 45-55.

[5] W. Masri and R. Abou-Assi, “Cleansing test suites from coincidental correctness to enhance fault-localization”, ICST, 2010, pp. 165-174.

[6] S. Yan, Z. Chen, Z. Zhao, C. Zhang and Y. Zhou, “A dynamic test cluster sampling strategy by leveraging execution spectra information”, ICST, 2010, pp. 147-154.

[7] W. Dickinson, D. Leon and A. Podgurski, “Finding failures by cluster analysis of execution profiles”, ICSE, 2001, pp. 339-348.

[8] C. Zhang, Z. Chen, Z. Zhao, S. Yan, J. Zhang and B. Xu, “An improved regression test selection technique by clustering execution profiles”, QSIC, 2010, pp. 171-179.

[9] R. Abreu, P. Zoeteweij, R. Golsteijn, A. J.C. van Gemund, “A practical evaluation of spectrum-based fault localization”, Journal of Systems and Software, Volume 82, Issue 11, 2009, pp. 1780-1792.

[10] B. Liblit, M. Naik, A. Zheng, A. Aiken and M. Jordan, “Scalable statistical bug isolation”, PLDI, 2005, pp. 15-26.

[11] C. Liu, X. Yan, L. Fei, J. Han and S. Midkiff, “SOBER: statistical model-based bug localization”, ESEC/FSE, 2005, pp. 286-295.

[12] V. Vangala, J. Czerwonka and P. Talluri, “Test case comparison and clustering using program profiles and static execution”, ESEC/FSE, 2009, pp. 293-294.

[13] W. Dickinson, D. Leon and A. Podgurski, “Pursuing failure: the distribution of program failures in a profile space”, ESEC/FSE, 2001, pp. 246-255.

[14] V. Debroy, W. Eric Wong, X. Xu, B. Choi, “A grouping-based strategy to improve the effectiveness of fault localization techniques”, QSIC, 2010, pp. 13-22.

[15] T. Denmat, M. Ducassé and O. Ridoux, “Data mining and cross-checking of execution traces: a re-interpretation of Jones, Harrold and Stasko test information”, ASE, 2005, pp. 396-399.

[16] L. Naish, H. J. Lee, and K. Ramamohanarao, “A model for spectra-based software diagnosis”, TOSEM, in press.

[17] Software-artifact Infrastructure Repository. http://sir.unl.edu/, University of Nebraska.

[18] http://gcc.gnu.org/onlinedocs/gcc/Gcov.html

[19] http://en.wikipedia.org/wiki/Euclidean_distance
