20.01.2014 Views

thesis - Faculty of Information and Communication Technologies ...


Chapter 5. Growth Dynamics

mentally, with new functionality constructed on top of existing code. Though not surprising, the real value of our analysis is that it provides empirical support for the Yule process [210], also known as the preferential attachment growth model. In the simple form of this growth model, as classes are added, the probability that a new class will depend on an existing class is proportional to the popularity of that existing class [55]. Simply put, popular classes gain additional dependents, causing their popularity to increase further. The preferential attachment model has been used as one of the explanations for how certain web sites gain popularity on the world-wide web [16, 209]. An interesting statistical property of systems that grow via preferential attachment is that the data exhibit a highly skewed distribution and tend to fit a power-law distribution [210].
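The growth rule described above can be illustrated with a small simulation. The following is a minimal sketch, not the procedure used in this study: the function name, the `deps_per_class` parameter, and the choice to weight each class by its in-degree plus one (so that classes with no dependents can still be selected) are all assumptions made for illustration.

```python
import random


def simulate_preferential_attachment(n_classes, deps_per_class=2, seed=0):
    """Grow a dependency graph one class at a time and return the
    in-degree (number of dependents) of every class.

    Assumption: an existing class is chosen with probability
    proportional to (in-degree + 1).
    """
    rng = random.Random(seed)
    in_degree = [0]   # start with one class that has no dependents
    # Each class appears in `targets` once as a baseline plus once per
    # dependent, so uniform sampling from the list is proportional to
    # (in-degree + 1).
    targets = [0]
    for new_class in range(1, n_classes):
        # A new class cannot depend on more classes than exist.
        k = min(deps_per_class, new_class)
        chosen = set()
        while len(chosen) < k:
            chosen.add(rng.choice(targets))
        in_degree.append(0)
        targets.append(new_class)
        for cls in sorted(chosen):
            in_degree[cls] += 1
            targets.append(cls)
    return in_degree


if __name__ == "__main__":
    degrees = simulate_preferential_attachment(1000)
    print("largest in-degree:", max(degrees))
    print("classes with no dependents:", degrees.count(0))
```

Under these assumptions, a few early classes accumulate many dependents while most classes have few or none, which is the kind of skew the model predicts.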

Many researchers who study software have noted that In-Degree Count metric distributions follow a power-law [21,54,55,223,287,299], and have hypothesised that this distribution arises due to preferential attachment. That is, they have inferred the potential growth mechanism from the observed outcome. However, although preferential attachment can cause a power-law distribution, it is not the only model that can give rise to this category of skewed distributions [48,145]. This possibility implies that the hypothesis that preferential attachment generates skewed In-Degree Count distributions had not been fully validated empirically. Our observations show that, in general, there is an upward trend in the Gini coefficient for In-Degree Count (IDC), providing empirical support for the preferential attachment growth model as the most likely explanation for the observed highly skewed distributions of the In-Degree Count metric.
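To make the notion of an upward trend in the Gini coefficient concrete, the sketch below computes the coefficient for the In-Degree Counts of two releases. It uses a common textbook formula over sorted values; the exact estimator used in this work may differ, and the two lists of counts are hypothetical values chosen only for illustration.

```python
def gini(values):
    """Gini coefficient of a list of non-negative numbers.

    0 means every class has the same in-degree; values approaching 1
    mean a few classes hold almost all incoming dependencies.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return (2.0 * weighted) / (n * total) - (n + 1.0) / n


if __name__ == "__main__":
    # Hypothetical In-Degree Counts for the classes of two releases.
    release_1 = [0, 1, 1, 2, 2, 3]
    release_2 = [0, 0, 1, 1, 2, 9]
    print("release 1:", round(gini(release_1), 3))
    print("release 2:", round(gini(release_2), 3))
```

In this illustration the second release concentrates dependents on a single class, so its Gini coefficient is higher than that of the first release.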

In our data, we observed three systems (Struts, Tapestry, and Ant) where the Gini coefficients for In-Degree Count decreased over time. Why are these systems different? To answer this question, we studied the release notes as well as all available architecture and design documentation for these systems.

The developers of Struts and Tapestry performed major restructuring of the code base and, in both cases, abandoned a substantial amount of the code base during these restructuring phases (more detail is pre-
