20.01.2014 Views

thesis - Faculty of Information and Communication Technologies ...


Chapter 3. Data Selection Methodology

volume to the size measure and have the potential to distort the evolutionary patterns, and may indicate a faster rate of growth than would be possible if only the contributions of the core team were considered. Including the external libraries also has the potential to distort measures of complexity and may indicate that a project is far more complex than it really is. For example, if a project makes use of two complex libraries for visualization and signal processing, the structural and algorithmic complexity of these libraries will be counted as part of the actual project under investigation, and the core project will show far more complexity than the developers need to consider.

Although including third-party libraries provides another dimension into evolution, from the developers' perspective the effort is expended on selecting and learning the library rather than on constructing it. Furthermore, it is possible that even though a large library is included, only a small fraction of its features are directly used, which reduces the strength of any inferences derived from the observed evolution. We therefore focus on the set of code that can be considered to be directly contributed by the developers and hence potentially maintained by the development team as it evolves.

All systems that we analyzed made extensive use of additional Java-based third-party libraries, with a few systems making use of libraries written in C/C++. In our study, these third-party libraries, as well as all Java standard libraries, are treated as external to the software system under investigation, and for this reason we do not collect metric data for classes in these libraries (see Figure 3.1). For instance, if a software system makes extensive use of the Java Encryption API (provided as part of the standard Java framework), we do not extract metrics for classes in this external encryption library, as the effort that has gone into developing these libraries does not directly impact the software system under investigation.
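The exclusion rule described above can be illustrated with a minimal sketch. It assumes that the core code base can be recognized by its package prefix; this is an assumption made here for illustration, not a mechanism stated in the text. The class name `CoreClassFilter`, the prefix `org.example.app.`, and the sample class names are all hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

/**
 * Illustrative sketch only: keeps classes that belong to the system under
 * investigation and drops third-party and Java standard library classes.
 * Selecting by package prefix is an assumption, not the thesis's stated method.
 */
public class CoreClassFilter {

    // Hypothetical package prefixes identifying code contributed by the core team.
    private final List<String> corePrefixes;

    public CoreClassFilter(List<String> corePrefixes) {
        this.corePrefixes = List.copyOf(corePrefixes);
    }

    /** Returns true if the fully qualified class name starts with a core prefix. */
    public boolean isCore(String className) {
        for (String prefix : corePrefixes) {
            if (className.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    /** Returns only the classes for which metrics would be collected. */
    public List<String> select(List<String> classNames) {
        return classNames.stream()
                .filter(this::isCore)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The trailing dot prevents accidental matches such as "org.example.apple".
        CoreClassFilter filter = new CoreClassFilter(List.of("org.example.app."));

        List<String> allClasses = List.of(
                "org.example.app.ui.MainWindow",        // core (hypothetical)
                "org.example.app.model.Account",        // core (hypothetical)
                "javax.crypto.Cipher",                  // Java standard library
                "org.apache.commons.lang.StringUtils"); // third-party library

        // Only the two core classes remain in the selection.
        System.out.println(filter.select(allClasses));
    }
}
```

Under this assumption, a class such as `javax.crypto.Cipher` is never measured, so the effort invested in the encryption library does not enter the size or complexity figures of the system under investigation.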

We also noticed that many projects rely on the same set of libraries and frameworks. For example, the Apache Java libraries are extensively used for String, Math, Image, and XML processing. Though there are
