thesis - Faculty of Information and Communication Technologies ...
Chapter 2. Software Evolution
A limitation of relying on individual changes is that the change log data
needs to be carefully processed [86] in order to identify whether the changes
recorded are related to the aspects of the software system under study (for instance,
source code), and also to ensure that the changes are significant
(for example, minor edits in the code comments may need to be
eliminated if the emphasis of a study is to understand how developers
adapt the actual functional source code as they evolve the system).
Another constraint that studies relying on change logs face is raised by
Chen et al. [44], who found that developers in some open source projects
did not properly record all of the changes. In their study, Chen et al.
highlight that in two out of the three systems studied, over 60% of the
changes were not recorded and, as a consequence, the information provided
in the change logs cannot be considered to be representative of all
the changes that take place within a software system. The significant
drawback of change-based studies is their heavy reliance on developers
providing consistent and regular information about individual changes.
There is currently no evidence that shows that developers record individual
changes carefully. Furthermore, the definition of an individual
change is likely to vary from developer to developer, as well as from
project to project.
In our study, we focus on how software evolves post-release, both in
terms of growth and changes between the releases that developers have
made available to end users. We focus on releases because an understanding
of evolution from this perspective is of greater value to
managers and developers, as any post-release change, in general, has
a greater impact on the end users [220]. Furthermore, existing release-based
studies have mainly investigated very few software systems
(typically fewer than 20), including the seminal work by Lehman [174],
which investigated only one large software system. The restriction to
small data sets was potentially unavoidable in earlier work [85,148,284]
due to the reliance on commercial software systems, whose legal
restrictions make them challenging to investigate and make the
experiments difficult to replicate. The widespread and increasing availability of
open source software systems over the past decade has allowed researchers
to study distinct releases of a larger number of software systems
in order to understand evolution. However, even these studies