COMBINING INFORMATION RETRIEVAL MODULES AND ...
indicated how traditional information retrieval methods can be augmented with features
distilled from domain knowledge of JavaScript and software analysis to improve
classification performance. Their program comprehension study concluded that static and dynamic
features alone do not perform well, but that their combination greatly reduces individual mistakes.
The combined feature set does not outperform the simple lexical approach either, but it serves to
augment that approach's performance.
Kanellopoulos et al. 2007 presented a methodology and an associated model for
extracting information from object-oriented code by applying clustering and association rule
mining. K-means clustering produces system overviews and deductions, which then guide the
application of an improved version of Multiple Minimum Support (MMS) Apriori that
identifies hidden relationships between classes, methods, and member data. With this
algorithmic framework, which combines two different kinds of data mining algorithms, one for
clustering and one for mining association rules from code, developers can obtain a more
comprehensive view of the system under maintenance at various levels of abstraction.
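The multiple-minimum-support idea mentioned above can be illustrated with a minimal sketch. It assumes the standard MMS rule, in which each item carries its own minimum item support (MIS) and an itemset counts as frequent when its support reaches the lowest MIS among its items. This is not the improved algorithm of Kanellopoulos et al.: the clustering stage and Apriori's candidate pruning are omitted, pairs are enumerated by brute force, and the data, the `mis` values, and the function names are all hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def frequent_itemsets_mms(transactions, mis, size=2):
    """Brute-force search for itemsets of the given size whose support
    meets the lowest minimum item support (MIS) among their items."""
    items = sorted(set().union(*transactions))
    result = {}
    for combo in combinations(items, size):
        itemset = frozenset(combo)
        threshold = min(mis[i] for i in combo)  # the MMS rule
        s = support(itemset, transactions)
        if s >= threshold:
            result[itemset] = s
    return result


# Hypothetical data: each transaction is the set of code entities
# that one method touches.
transactions = [
    frozenset({"Account.balance", "Account.deposit", "Logger.log"}),
    frozenset({"Account.balance", "Account.withdraw", "Logger.log"}),
    frozenset({"Account.balance", "Account.deposit"}),
    frozenset({"Report.render", "Logger.log"}),
]

# Hypothetical per-item thresholds: rarely used entities get a lower MIS
# so that relationships involving them are not lost.
mis = {
    "Account.balance": 0.5,
    "Account.deposit": 0.5,
    "Account.withdraw": 0.25,
    "Logger.log": 0.5,
    "Report.render": 0.25,
}

pairs = frequent_itemsets_mms(transactions, mis)
for itemset, s in sorted(pairs.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), s)
```

With a single global threshold of 0.5, the pair involving `Account.withdraw` would be discarded; the per-item thresholds keep it, which is the motivation for using multiple minimum supports on code entities with very uneven usage.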
Hill et al. 2007 proposed a technique that retrieves neighborhood information for a
software component. Their tool, Dora, was compared with a structural technique, Suade, and two
baseline techniques, Boolean-AND (AND) and Boolean-OR (OR); Dora performed best.
Their integrated lexical and structural approach was significantly more effective in
helping programmers explore programs.
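The two lexical baselines named above can be sketched as follows. This is only an illustration of Boolean-AND and Boolean-OR retrieval in general, not of Dora or Suade, and it assumes that a method is matched against the terms of its name and body; how the study defined the baselines in detail is not stated here. The method table, the query, and the helper names are hypothetical.

```python
import re


def terms(text):
    """Split text into lowercase terms, breaking camelCase identifiers.
    Simplification: acronyms such as 'HTTP' are split into single letters."""
    words = re.findall(r"[A-Za-z][a-z]*", text)
    return {w.lower() for w in words}


def boolean_and(query, methods):
    """Return methods whose name and body contain ALL query terms."""
    q = terms(query)
    return [name for name, body in methods.items()
            if q <= terms(name + " " + body)]


def boolean_or(query, methods):
    """Return methods whose name and body contain ANY query term."""
    q = terms(query)
    return [name for name, body in methods.items()
            if q & terms(name + " " + body)]


# Hypothetical method table: name -> words appearing in the method body.
methods = {
    "addAuctionItem": "item list append",
    "removeItem": "item list delete",
    "renderPage": "html template",
}

and_hits = boolean_and("add item", methods)
or_hits = boolean_or("add item", methods)
print(and_hits)
print(or_hits)
```

AND tends to favor precision and OR tends to favor recall. Neither uses call-graph structure, which is the information an integrated lexical and structural approach adds.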
Maskeri et al. 2008 investigated latent Dirichlet allocation (LDA) in the context of
comprehending large software systems and proposed a human-assisted approach based on LDA
for extracting domain topics from source code. Their results indicate that the tool
satisfactorily extracted some, but not all, of the domain topics, and that some human input is needed in