SEKE 2012 Proceedings - Knowledge Systems Institute

An Empirical Study of Execution-Data Classification Based on Machine Learning

Dan Hao, Xingxia Wu, Lu Zhang
Key Laboratory of High Confidence Software Technologies, Ministry of Education
Institute of Software, School of Electronics Engineering and Computer Science, Peking University,
Beijing, 100871, P. R. China
{haod, wuxx10, zhanglu}@sei.pku.edu.cn

Abstract

As it may be difficult for users to distinguish a passing execution from a failing execution for a released software system, researchers have proposed to apply the Random Forest algorithm to classify remotely-collected program execution data. In general, execution-data classification can be viewed as a machine-learning problem, in which a trained learner needs to classify whether each execution is a passing execution or a failing execution. In this paper, we report an empirical study that further investigates various issues in execution-data classification based on machine learning. Compared with previous research, our study further investigates the impact of the following issues: different machine-learning algorithms, the number of training instances used to construct a classification model, and different types of execution data.
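To make the machine-learning framing above concrete, the following is a minimal sketch, not the experimental setup of this paper: it assumes each execution is represented as a fixed-length vector of execution counts for instrumented program entities and is labeled as passing or failing, and it uses scikit-learn's Random Forest purely for illustration; the synthetic data and feature layout are assumptions.

# Minimal sketch: classifying executions as passing/failing with a Random Forest.
# Assumptions (not from the paper): each execution is a vector of execution
# counts over 50 instrumented entities; label 1 means failing, 0 means passing.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)

# Synthetic execution data: 200 executions x 50 instrumented entities.
X = rng.integers(0, 20, size=(200, 50))
# Synthetic labels: pretend executions that exercise entity 7 heavily tend to fail.
y = (X[:, 7] > 12).astype(int)

# Hold some executions out to play the role of unlabeled field executions.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Train the learner on labeled executions, then classify the unseen ones.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test), target_names=["passing", "failing"]))

Varying the classifier class, the size of the training split, and the kind of counts placed in X corresponds loosely to the three factors this study examines.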

1. Introduction

After a software system is delivered to its different users, it may still be necessary to analyze or measure the released software system to improve software quality. Although a software system usually has been tested before delivery, the testing process cannot guarantee the reliability of the release. Before delivering a software system, developers need to test the software system in-house, assuming that the software system will be used in the field as it is tested. However, this assumption may not always be satisfied [9], due to, for instance, varied running environments or unexpected inputs. Consequently, a software system still needs to be tested and analyzed after it is delivered.

However, as a released software system runs on the sites of remote users, the testing and analysis activities on a released software system differ from those performed in-house. For instance, developers can hardly detect failures directly based on the behavior of a released software system because its behavior occurs on the sites of remote users. Moreover, analysis of a released software system may require instrumentation, and only lightweight instrumentation is allowed so as to incur as little extra running cost as possible. Consequently, several techniques [6, 7, 11] have been proposed to support Remote execution Analysis and Measurement of Software Systems (abbreviated as RAMSS by Haran and colleagues [5]). RAMSS techniques usually collect data by instrumenting instances of a software system used by different remote users and then analyze the collected execution data.
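As a rough illustration of the kind of lightweight, count-based execution data such instrumentation might collect, the sketch below is an assumption for illustration rather than a technique from the cited work: it wraps selected functions of a program and records how often each one runs during a single execution, yielding the sort of per-execution profile that could be reported back for analysis.

# Illustrative sketch of lightweight counting instrumentation (not from the paper):
# wrap chosen functions so each execution yields a vector of call counts.
from collections import Counter
from functools import wraps

execution_profile = Counter()

def instrumented(fn):
    """Count how many times fn is invoked during the current execution."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        execution_profile[fn.__name__] += 1
        return fn(*args, **kwargs)
    return wrapper

@instrumented
def parse(line):
    return line.split(",")

@instrumented
def process(fields):
    return len(fields)

if __name__ == "__main__":
    for line in ["a,b,c", "d,e", "f"]:
        process(parse(line))
    # The counter below is the per-execution data a RAMSS-style tool could collect.
    print(dict(execution_profile))  # e.g. {'parse': 3, 'process': 3}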

In RAMSS, a fundamental issue is to distinguish a failing execution from a passing execution. Addressing this issue helps developers understand the behavior of a released software system. For instance, by counting the numbers of passing executions and failing executions, developers can learn how often failures occur in the field when remote users run the released software system. Moreover, distinguishing a failing execution from a passing execution is also important for developers to debug a released software system, because such information helps developers know in what circumstances failures occur.

To automate distinguishing failing executions from passing ones, Haran and colleagues [5] proposed to collect execution data in the field and distinguish a failing execution from a passing execution by applying a machine-learning algorithm (i.e., the Random Forest algorithm) to the collected execution data. According to the empirical study conducted by Haran and colleagues [5], their approach is promising, as it can reliably and efficiently classify the execution data. However, there are several important factors not addressed in their study. First, their study investigated only one machine-learning algorithm, but there are quite a few machine-learning algorithms for classification. Second, their study did not investigate the impact of the number of instances for constructing the classification model, but this number may be an important factor of the overhead
