20.01.2014 Views

thesis - Faculty of Information and Communication Technologies ...


Chapter 3. Data Selection Methodology

needed and hence not included during the build process using settings
provided in the Makefile. Hence, when using source code files as input
into a study of evolution, ideally, the build scripts have to be parsed to
determine the set of files for a specific configuration, and then the
evolution of the system for this specific configuration has to be analysed.

Many previous studies [41, 100, 105, 120, 153, 217, 239, 256] that use
release histories do not explicitly indicate if the build scripts have been
pre-processed adequately to ensure that the correct set of source files
is used as the input.

In our study, we use compiled releases (Java classes packaged inside JAR
files) to construct our release history, and so our input data has already
gone through the build process, reducing the chance of encountering
code that is no longer in active use. This approach allows us to focus
on the set of classes that have been deemed fit for release by the
development team.
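As an illustration of working from compiled releases, the sketch below lists the classes contained in a JAR file. It relies only on the fact that a JAR is a ZIP archive; the function names and the demo entries (`org/example/app/...`) are hypothetical and are not taken from the tooling used in this study.

```python
import io
import zipfile


def class_names_in_jar(jar):
    """Return the sorted fully qualified class names stored in a JAR.

    `jar` may be a filesystem path or a binary file-like object. A JAR
    is a ZIP archive, so the standard zipfile module can read it.
    """
    with zipfile.ZipFile(jar) as archive:
        return sorted(
            # "org/example/Foo.class" -> "org.example.Foo"
            name[: -len(".class")].replace("/", ".")
            for name in archive.namelist()
            if name.endswith(".class")
        )


def build_demo_jar():
    """Build a tiny in-memory JAR with made-up entries (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
        archive.writestr("org/example/app/Main.class", b"")
        archive.writestr("org/example/app/util/Parser.class", b"")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for class_name in class_names_in_jar(build_demo_jar()):
        print(class_name)
```

Running the same function over the JAR of each release would give one set of class names per release, which is one possible starting point for a release history.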

Third Party Libraries<br />

Development teams generally use a number of third party Java libraries,
as well as the standard Java libraries (which are part of the Java runtime
environment), in order to improve their productivity. In our study, we
classify the set of classes created by the development team as the core
system and focus explicitly on how this core system evolves (see Figure
3.1). This scope allows us to gain a direct perspective into the efforts
of the development team. Our tighter focus has the advantage of ensuring
that the size and complexity measures that we collect are not influenced
by changes in the external libraries. Although the developers of the core
system potentially exert some evolutionary pressure on external libraries
as consumers, they do not directly influence the internal structure,
organisation and size of these external libraries.
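One simple way to separate a core system from external libraries is to partition class names by package prefix. The sketch below assumes that the core packages are known in advance. This excerpt does not state how core classes were identified, so the prefix rule, the function name and the example package names are all assumptions made for illustration.

```python
def split_core_and_external(class_names, core_prefixes):
    """Partition class names into (core, external) lists by package prefix.

    `core_prefixes` is an assumed, hand-supplied list of package names that
    belong to the development team. A trailing dot is added to each prefix
    so that "org.example.app" does not match "org.example.application".
    """
    prefixes = tuple(p if p.endswith(".") else p + "." for p in core_prefixes)
    core, external = [], []
    for name in class_names:
        if name.startswith(prefixes):
            core.append(name)
        else:
            external.append(name)
    return core, external


if __name__ == "__main__":
    # Hypothetical class names, not data from the study.
    classes = [
        "org.example.app.Main",
        "org.apache.commons.io.IOUtils",
        "org.example.application.Other",
    ]
    core, external = split_core_and_external(classes, ["org.example.app"])
    print("core:", core)
    print("external:", external)
```

Size and complexity measures computed over the `core` list alone would then be unaffected by changes in the `external` list, which is the property described above.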

Our intentional choice to ignore third party libraries distinguishes our
study from previous large scale studies into open source software
evolution where this choice was not explicitly made or stated
[39, 120, 129, 188, 217, 239, 306]. These third party libraries add
