15.08.2013 Views

COMBINING INFORMATION RETRIEVAL MODULES AND ...


indicated how traditional information retrieval methods can be augmented with features distilled from domain knowledge of JavaScript and software analysis to improve classification performance. Their study concluded that static and dynamic features alone do not perform well, but that their combination greatly reduces individual mistakes. The combined feature set does not by itself outperform the simple lexical approach, but serves to augment its performance.

Kanellopoulos et al. 2007 presented a methodology and an associated model for extracting information from object-oriented code by applying clustering and association rule mining. K-means clustering produces system overviews and deductions, which support the subsequent use of an improved version of Multiple Minimum Support (MMS) Apriori that identifies hidden relationships between classes, methods and member data. With this algorithmic framework, which combines two different kinds of data mining algorithms, one for clustering and one for mining association rules from code, developers can gain a more comprehensive view of the system under maintenance at various levels of abstraction.
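
The two-stage idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names, metric values, member names and minimum item supports (MIS) below are all hypothetical, the k-means routine is a plain textbook version, and the rule-mining step only counts item pairs under the usual multiple-minimum-support rule (a pair is kept when its support reaches the lowest MIS among its items) rather than the improved MMS Apriori the paper refers to.

```python
from itertools import combinations


def kmeans(points, k, iters=20):
    """Plain k-means on numeric tuples; the first k points seed the centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(d) / len(members) for d in zip(*members))
    return labels


def frequent_pairs(transactions, mis):
    """Keep item pairs whose support reaches the lowest MIS among their items."""
    n = len(transactions)
    counts = {}
    for t in transactions:
        for pair in combinations(sorted(t), 2):
            counts[pair] = counts.get(pair, 0) + 1
    return {
        pair: count / n
        for pair, count in counts.items()
        if count / n >= min(mis[item] for item in pair)
    }


# Hypothetical classes described by (number of methods, number of fields).
names = ["Order", "StringUtil", "Invoice", "Customer", "DateUtil"]
metrics = [(12, 5), (2, 0), (11, 6), (13, 6), (3, 0)]
# Hypothetical members that each class uses; one "transaction" per class.
uses = {
    "Order": {"getId", "save", "validate"},
    "Invoice": {"getId", "save", "total"},
    "Customer": {"getId", "save"},
    "StringUtil": {"trim"},
    "DateUtil": {"format"},
}

# Stage 1: cluster classes to obtain a coarse system overview.
labels = kmeans(metrics, k=2)

# Stage 2: mine associations inside the cluster that contains "Order".
cluster = [uses[n] for n, lab in zip(names, labels) if lab == labels[0]]
mis = {"getId": 0.6, "save": 0.6, "validate": 0.3, "total": 0.5}
rules = frequent_pairs(cluster, mis)
```

Giving the rarely used `validate` a lower MIS lets its pairs survive even though they occur in only one of the three clustered classes, which is the motivation for using several minimum supports instead of one global threshold.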

Hill et al. 2007 proposed a technique that retrieves neighborhood information for a software component. Their tool, Dora, was compared with a structural technique, Suade, and two baseline techniques, Boolean-AND (AND) and Boolean-OR (OR); Dora performed best. Their integrated lexical and structural approach was significantly more effective in helping programmers explore programs.
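
To make the combination of lexical and structural information concrete, the following Python sketch filters the structural neighbors of a seed method by lexical overlap with a query. It is only an illustration under stated assumptions: the call graph, method names, query and threshold are hypothetical, and the overlap score is a deliberately simple stand-in, not the scoring function used by Dora.

```python
import re


def terms(identifier):
    """Split a camelCase identifier into a set of lowercase terms."""
    return {t.lower() for t in re.findall(r"[A-Za-z][a-z]*", identifier)}


def relevant_neighbors(call_graph, seed, query, threshold=0.5):
    """Score each callee of `seed` by the fraction of query terms in its name."""
    query_terms = set(query.lower().split())
    scored = {}
    for callee in call_graph.get(seed, []):  # structural step: direct callees
        score = len(query_terms & terms(callee)) / len(query_terms)  # lexical step
        if score >= threshold:
            scored[callee] = score
    return scored


# Hypothetical call graph: caller -> list of callees.
call_graph = {
    "doAddAuction": ["saveAuctionItem", "logDebug", "updateAuctionList"],
    "saveAuctionItem": ["openFile"],
}

neighbors = relevant_neighbors(call_graph, "doAddAuction", "add auction")
```

A purely structural approach would return all three callees, including the unrelated `logDebug`, while a purely lexical search would ignore the call relationship altogether; the sketch keeps only callees that satisfy both views.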

Maskeri et al. 2008 investigated latent Dirichlet allocation (LDA) in the context of comprehending large software systems and proposed a human-assisted approach based on LDA for extracting domain topics from source code. Their results indicate that their tool was able to satisfactorily extract some, but not all, of the domain topics, and that certain human input is needed in
