20.01.2014 Views

thesis - Faculty of Information and Communication Technologies ...


Chapter 6. Change Dynamics

new classes with that of all classes, we can see (Figure 6.7) that this
proportion is consistently lower than the norm. New classes, therefore,
tend to start out with relatively lower popularity, and some of these gain
dependents over time. The observation that new classes are in general
less popular than existing classes is also supported by our finding from
the Gini coefficient analysis in the previous chapter. We stated that the
In-Degree Count Gini coefficient in general increases as systems mature,
which also shows that new classes do not have the same distribution
profile in terms of popularity as the existing classes; if they did, the
Gini coefficient would not change.
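The Gini argument above can be checked with a small numeric sketch. It assumes the standard mean-absolute-difference definition of the Gini coefficient, which may differ in detail from the computation used in the previous chapter, and every In-Degree Count value below is invented for illustration. Adding new classes that follow the same distribution as the existing ones leaves the coefficient unchanged, while adding new classes with few dependents raises it.

```python
def gini(values):
    """Gini coefficient via the mean absolute difference.

    G = sum_i sum_j |x_i - x_j| / (2 * n^2 * mean), and n^2 * mean = n * total.
    """
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    abs_diff_sum = sum(abs(a - b) for a in values for b in values)
    return abs_diff_sum / (2.0 * n * total)


# Hypothetical In-Degree Counts of the classes in an existing release.
existing = [0, 1, 1, 2, 3, 5, 8, 20]

# Case 1: the new classes have exactly the same popularity profile.
same_profile = existing + existing

# Case 2: the new classes start out with few or no dependents.
low_popularity = existing + [0, 0, 0, 1, 1]

print("existing release:        ", gini(existing))
print("new classes, same profile:", gini(same_profile))
print("new classes, low in-degree:", gini(low_popularity))
```

In this toy data the second value equals the first, and the third is larger, which is the pattern the text attributes to maturing systems.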

6.2.7 Structural Complexity of Modified Classes

In the previous section we showed that the popularity of a class
makes it more change-prone. But does the internal structural complexity
of a class increase the likelihood that the class will be modified?
In this section, we address this question by investigating the relationship
between the Number of Branches and change.

When a class is modified, there is a certain probability that its branching
statements are altered. To assess whether the number of branching
instructions affects the probability of change, we constructed
two logistic regression models. In the first model, Number of Branches
was the explanatory (independent) variable. For the second model, we
used the size-normalized Number of Branches as the explanatory variable.
The size-normalized Number of Branches is a ratio calculated as
the number of branches per bytecode instruction in a class. The size
normalization was applied to determine whether size causes a significant
distortion in the probability of modification, that is, whether a class is
modified because it is bigger rather than because it has more structural
complexity. In both models the response (dependent) variable was whether a
class was modified at least once in its release history (0 represented
classes that were never modified; 1 indicated that the class was modified
at least once in its evolution history). The logistic regression models
were constructed using the set of all classes across all versions of all
software systems in our input data set.
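The two-model setup described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the analysis pipeline used in this study: the per-class data are invented, the names `CLASSES` and `fit_logistic` are hypothetical, and the fit uses a plain Newton-Raphson iteration with step halving rather than a statistics package. Each model has the form P(modified) = 1 / (1 + exp(-(b0 + b1 * x))), where x is either the Number of Branches or the size-normalized Number of Branches.

```python
import math

# Hypothetical per-class data, invented for illustration only:
# (number_of_branches, bytecode_instruction_count, modified_at_least_once)
CLASSES = [
    (2, 40, 0), (3, 60, 0), (5, 55, 1), (8, 150, 0),
    (10, 120, 0), (12, 300, 1), (20, 260, 0), (25, 400, 1),
    (30, 350, 1), (45, 900, 1), (60, 700, 0), (80, 1000, 1),
]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def log_likelihood(b0, b1, xs, ys):
    """Bernoulli log-likelihood: sum of y*z - log(1 + exp(z))."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = b0 + b1 * x
        softplus = max(z, 0.0) + math.log1p(math.exp(-abs(z)))
        total += y * z - softplus
    return total


def fit_logistic(xs, ys, max_iter=50, tol=1e-10):
    """Fit intercept b0 and slope b1 by Newton-Raphson with step halving."""
    b0 = b1 = 0.0
    ll = log_likelihood(b0, b1, xs, ys)
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1.0 - p)
            g0 += y - p            # gradient w.r.t. b0
            g1 += (y - p) * x      # gradient w.r.t. b1
            h00 += w               # entries of the negated Hessian
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det <= 0.0:
            break
        # Newton direction: solve the 2x2 system H * d = g.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        step = 1.0
        new_ll = log_likelihood(b0 + d0, b1 + d1, xs, ys)
        while new_ll < ll and step > 1e-8:
            step /= 2.0
            new_ll = log_likelihood(b0 + step * d0, b1 + step * d1, xs, ys)
        if new_ll < ll:
            break
        b0, b1 = b0 + step * d0, b1 + step * d1
        converged = new_ll - ll < tol
        ll = new_ll
        if converged:
            break
    return b0, b1


ys = [modified for _, _, modified in CLASSES]
raw_branches = [float(branches) for branches, _, _ in CLASSES]
normalized_branches = [branches / size for branches, size, _ in CLASSES]

# Model 1: Number of Branches. Model 2: branches per bytecode instruction.
raw_model = fit_logistic(raw_branches, ys)
normalized_model = fit_logistic(normalized_branches, ys)
print("raw model (b0, b1):       ", raw_model)
print("normalized model (b0, b1):", normalized_model)
```

Comparing the slope b1 of the two fits is what separates a size effect from a complexity effect: if the raw slope is clearly positive but the normalized slope is not, larger classes rather than more densely branched classes are the ones being modified. The toy data are far too small to support any conclusion of that kind.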
