20.01.2014 Views

thesis - Faculty of Information and Communication Technologies ...

Chapter 1. Introduction

1.2 Research Approach

Empirical research by its very nature relies heavily on quantitative information.
Our research is based on an exploratory study of forty nontrivial
and popular Java Open Source Software Systems and the results
and interpretation are from an empirical software engineering perspective.
The data set consists of over 1000 distinct releases encompassing
an evolution history comprising approximately 55000 classes. We investigate
Open Source Software Systems due to their non-restrictive
licensing, ease of access, and their growing use in a wide range of
projects.

Our approach involves collecting metric data by processing compiled
binaries (Java class files) and analysing how these metrics change over
time in order to understand both growth and change. Although we
use the compiled builds as input for our analysis, we also make use of
other artifacts such as revision logs, project documentation, and defect
logs as well as the source code in order to interpret our findings and
better understand any abnormal change events. For instance, if the size
of the code base has doubled between two consecutive releases within
a short time frame (as observable in the history), additional project
documentation and messages on the discussion board often provide an
insight into the rationale and motivations within the team that cannot
be directly ascertained from an analysis of the binaries alone.
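
As an illustration of how such an abnormal change event might be flagged for manual follow-up, the sketch below scans a release history for consecutive releases whose class count at least doubles within a short period. It is a minimal sketch under stated assumptions, not the tooling used in the study: the `Release` record, the function `flag_abnormal_growth`, the doubling ratio of 2.0, the 90-day window, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Release:
    """One release of a system (hypothetical record, for illustration only)."""
    version: str
    released: date
    class_count: int


def flag_abnormal_growth(releases, ratio=2.0, max_days=90):
    """Return (previous, current) version pairs whose class count grew by at
    least `ratio` within `max_days`. Both thresholds are assumed values."""
    ordered = sorted(releases, key=lambda r: r.released)
    flagged = []
    for prev, curr in zip(ordered, ordered[1:]):
        gap_days = (curr.released - prev.released).days
        if prev.class_count > 0 and gap_days <= max_days:
            if curr.class_count / prev.class_count >= ratio:
                flagged.append((prev.version, curr.version))
    return flagged


# Made-up release history; not taken from the data set described above.
history = [
    Release("1.0", date(2005, 1, 10), 120),
    Release("1.1", date(2005, 3, 1), 130),
    Release("2.0", date(2005, 4, 15), 270),
]
print(flag_abnormal_growth(history))
```

A flagged pair only marks where to look. The explanation for the jump would still have to come from the revision logs, documentation, and discussion board messages.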

In order to understand the nature of growth, we construct relative and
absolute frequency histograms of the various metrics and then observe
how these histograms change over time using higher-order statistical
techniques. This method of analysis allows us, for example, to identify
if a certain set of classes is gaining complexity and volume at the
expense of other classes in the software system. By analysing how developers
choose to distribute functionality, we can also identify if there
are common patterns across software systems and if evolutionary pressures
have any impact on how developers organise software systems.
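
The histogram construction described above can be sketched as follows. This is a minimal sketch, assuming a single per-class metric (here a made-up method count) and a fixed bin width. The function names and sample values are hypothetical, and the per-bin difference is only a simple stand-in for the higher-order statistical techniques, which this passage does not specify.

```python
from collections import Counter


def absolute_histogram(values, bin_width):
    """Count how many values fall in each bin [k*bin_width, (k+1)*bin_width)."""
    return Counter((v // bin_width) * bin_width for v in values)


def relative_histogram(values, bin_width):
    """Same bins as absolute_histogram, expressed as fractions of the total."""
    counts = absolute_histogram(values, bin_width)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {b: c / total for b, c in sorted(counts.items())}


def histogram_shift(before, after):
    """Per-bin change in relative frequency between two releases."""
    bins = sorted(set(before) | set(after))
    return {b: after.get(b, 0.0) - before.get(b, 0.0) for b in bins}


# Made-up method counts per class for two releases of a hypothetical system.
release_a = [2, 3, 5, 8, 12, 4, 7, 1]
release_b = [2, 3, 5, 8, 25, 4, 7, 1, 31, 6]

rel_a = relative_histogram(release_a, bin_width=10)
rel_b = relative_histogram(release_b, bin_width=10)
print(histogram_shift(rel_a, rel_b))
```

Relative frequencies make releases of different sizes comparable. In this toy data, a positive shift in the upper bins alongside a negative shift in the lowest bin would suggest that a few classes are absorbing volume while the rest stay small.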

We examine the nature of change by analysing software at two levels of
granularity: version level and class level. The change measures that we
