thesis - Faculty of Information and Communication Technologies ...
Chapter 6. Change Dynamics
ing it with another syntax tree. In contrast to the string matching approach,
Abstract Syntax Tree (AST) based matching is programming
language aware and can identify the types of changes more precisely.
Though this approach is comparatively more complex, it has
increasingly been adopted by contemporary IDEs such as Eclipse [73] to
perform an intelligent diff that lets developers compare two versions
of a class. The AST based matching approach has also been enhanced
to better detect structural changes in object oriented programs by combining
it with call graph analysis [157] and program behaviour analysis
[7].
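To make the contrast with string matching concrete, the following is a minimal sketch only. It assumes Python source and Python's standard `ast` module rather than the Java tooling used by IDEs such as Eclipse, and the helper names (`ast_changed`, `describe_change`) and the sample `area` function are hypothetical, introduced purely for illustration.

```python
import ast
from collections import Counter


def string_changed(old: str, new: str) -> bool:
    """String matching: any textual difference counts as a change."""
    return old != new


def ast_changed(old: str, new: str) -> bool:
    """AST matching: compare syntax trees, ignoring layout and comments."""
    return ast.dump(ast.parse(old)) != ast.dump(ast.parse(new))


def node_type_counts(source: str) -> Counter:
    """Count how often each kind of syntax node occurs in the source."""
    return Counter(type(node).__name__ for node in ast.walk(ast.parse(source)))


def describe_change(old: str, new: str):
    """Return (node kinds removed, node kinds added) between two versions."""
    before, after = node_type_counts(old), node_type_counts(new)
    return dict(before - after), dict(after - before)


# Hypothetical versions of one function.
original = "def area(w, h):\n    return w * h\n"
reformatted = "def area(w,h):\n    # width times height\n    return w*h\n"
edited = "def area(w, h):\n    return w + h\n"

print(string_changed(original, reformatted))  # text differs
print(ast_changed(original, reformatted))     # syntax tree is identical
print(describe_change(original, edited))      # which node kinds changed
```

In this sketch the reformatted version is reported as changed by string matching but unchanged by AST matching, while the edited version is reported as replacing a multiplication node with an addition node. A production tool would additionally need to match moved or renamed nodes, which this sketch does not attempt.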
The metric based fingerprinting approach involves computing a set of
measures from the abstraction being monitored and comparing
this set with the set collected after the change has been made. If the
two sets differ, the abstraction is considered to
have changed. Depending on the implementation, this approach
can be aware of the underlying programming language semantics, especially
if an AST is used to represent the program under analysis. A
key advantage of the metric based approach is its ability to provide
a consistent qualitative distance measure which can be used to determine
the magnitude of a change. One such distance measure is the
n-dimensional Euclidean distance, where the two sets of measures
are treated as input vectors from which a distance is
computed [34,67,155]. The resulting distance falls on the ordinal scale
and provides only enough information to order changes
by magnitude. That is, we can determine that a specific class has had
a greater level of change than another. However, these measures are
unbounded and hence need to be normalised if they are to be used in a
comparative analysis (discussed in additional detail in the next section).
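The metric based comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the measurement procedure used in this work: the three metrics (method count, field count, branch count) and all values are hypothetical, and no normalisation is applied.

```python
import math


def has_changed(before, after) -> bool:
    """A class is considered changed if any measure in its fingerprint differs."""
    return tuple(before) != tuple(after)


def euclidean_distance(before, after) -> float:
    """n-dimensional Euclidean distance between two metric vectors."""
    if len(before) != len(after):
        raise ValueError("metric vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(before, after)))


# Hypothetical fingerprints: (method count, field count, branch count).
class_a_v1, class_a_v2 = (10, 4, 25), (13, 4, 29)
class_b_v1, class_b_v2 = (6, 2, 9), (6, 2, 10)

distance_a = euclidean_distance(class_a_v1, class_a_v2)
distance_b = euclidean_distance(class_b_v1, class_b_v2)

print(has_changed(class_a_v1, class_a_v2))  # True
print(distance_a, distance_b)               # 5.0 1.0
print(distance_a > distance_b)              # class A changed more than class B
```

The distances here support only an ordering (class A changed more than class B). Because the raw values are unbounded and depend on the scale of each metric, they would need to be normalised before being compared across systems or releases.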
Change detection algorithms have applications beyond an intelligent
diff of the source code [242]. Variations of these algorithms have been
applied to detect different types of refactoring [288], as well as to identify
duplicate blocks of code (clones) in a software system [156, 195],
as these have been shown to reduce maintainability [236]. Clones are
in general caused by developers copying and pasting blocks of code,
which is considered a poor programming practice [236]. These clones