COMBINING INFORMATION RETRIEVAL MODULES AND ...


indicated how traditional information retrieval methods can be augmented with features distilled from domain knowledge of JavaScript and software analysis to improve classification performance. Their study concluded that static and dynamic features alone do not perform well, but that combining them greatly reduces individual mistakes. The combined feature set also does not outperform the simple lexical approach, but it serves to augment that approach's performance.

Kanellopoulos et al. 2007 presented a methodology and an associated model for extracting information from object-oriented code by applying clustering and association rule mining. K-means clustering produces system overviews and deductions, which support the subsequent use of an improved version of the Multiple Minimum Support (MMS) Apriori algorithm to identify hidden relationships between classes, methods, and member data. With this algorithmic framework, which combines two kinds of data mining algorithms (one for clustering and one for mining association rules from code), developers can obtain a more comprehensive view of the system under maintenance at various levels of abstraction.

Hill et al. 2007 proposed a technique that retrieves neighborhood information for a software component. Their tool, Dora, was compared with a structural technique, Suade, and with two baseline techniques, Boolean-AND (AND) and Boolean-OR (OR); Dora performed best. Their integrated lexical and structural approach was significantly more effective in helping programmers explore programs.

Maskeri et al. 2008 investigated latent Dirichlet allocation (LDA) in the context of comprehending large software systems and proposed a human-assisted approach based on LDA for extracting domain topics from source code. Their results indicate that their tool satisfactorily extracted some, but not all, of the domain topics, and that some human input is needed to improve the quality of the extracted topics.

Hill et al. 2008 presented an automated approach to mining abbreviation expansions from source code in order to enhance software maintenance tools that use natural language information. Their tool, Automatically Mining Abbreviation expansions in Programs (AMAP), used contextual information at the method, program, and general software levels to automatically select the most appropriate expansion for a given abbreviation. AMAP helps developers understand the abbreviations in programs.

Hill et al. 2009 discussed an approach that automatically extracts natural language phrases from source code identifiers and categorizes the phrases and search results in a hierarchy. Their technique allowed developers to explore word usage in a piece of software, helping them quickly identify relevant program elements for investigation or recognize alternative words for query reformulation.

2.2.2 Impact analysis

Impact analysis is used to evaluate the impact that a proposed change in one part of a program may have on another (Arnold et al., 1993; Turver et al., 1994; Fyson et al., 1998).

Antoniol et al. 2000 proposed an IR-based method for impact analysis. The authors used a vector space model and a probabilistic model to trace maintenance requests onto the software components affected by those requests. In their case study of LEDA, they used change log files to extract 11 maintenance requests.

Canfora et al. 2006 provided an approach to predict impacted files from a change request definition. Their approach exploited information retrieval algorithms applied to code entities, such as source files and lines of code, indexed with free text contained in software repositories. Their results indicated that indexing fine-grained entities improved precision at the cost of