20.01.2014 Views

thesis - Faculty of Information and Communication Technologies ...


Chapter 3. Data Selection Methodology

The size and skill of development teams, though helpful, was a criterion that we removed after an initial pass at selecting systems, mainly because it was not possible to obtain this information accurately. In some of our projects, the software used to host the source control repositories changed during the evolutionary history of the project, and many projects chose to archive older contribution logs at regular intervals, removing access to this data. These aspects limited our ability to determine the number of active and contributing developers on a project, specifically during the early part of its evolution. Another facet that could not be accurately determined was the level of contribution from different developers. That is, we were not able to identify reliably whether some developers contributed more code than others. Further, some project members contributed artwork or documentation, or organised and conducted meetings, while others focused on testing. These non-code contributors were often not visible as active contributors in the source code repository. Another interesting finding during our investigation was that developers who have not contributed any material code for many years are still shown as members of the project. These limitations, including the observation that a small subset of developers is responsible for a large share of the changes and additions to the source code in open source software, have been noted by Capiluppi et al. [39].

The observation by Capiluppi et al. [39] that few developers contribute most of the code, together with the variance in contribution levels over time, indicates that we require a measure that can meaningfully identify the number of normalised developers working on a project at any given point in time. However, no such metric has yet been developed and widely accepted as effective, and hence we did not rely on development team size as a variable in our study.

3.5 Selected Systems - An Overview

Using the selection criteria, we initially identified hundreds of software systems that satisfied them. However, we focused on a smaller representative subset in order to study each of the selected systems in greater depth. Our final data set comprises forty software sys-
