CoSIT 2017


138 Computer Science & Information Technology (CS & IT)

between results), we tried to incorporate the uncertainties to explain the variability, and we are not aware of any other factors that could bias the results.

External Validity. Six different projects from different companies and several industries were examined in this study. Our approach can be applied to almost any data set of defects, with a wide range of diversity. However, a possible threat is that the data sets in this study were all taken from the same implementation of ALM. Further studies on different systems are desirable to verify our findings.

Reliability Validity. The data set used in this work is commercial and cannot be made publicly accessible for the moment; therefore, this work cannot be replicated by other researchers using the exact data set. However, we made a significant effort to provide all relevant implementation details.

8. SUMMARY

In this paper, we presented a novel approach to predicting software defect handling time by applying data mining techniques to historical defects. We designed and implemented a generic system that extracts defect information, applies preprocessing, and uses a random forest regression algorithm to train a prediction model. We applied our method to six projects of different customers from different industries, with promising results. Our system was designed to handle flaws common in real-life data sets, such as missing and noisy data.

9. FUTURE WORK

We currently do not sufficiently exploit the content of the free-text fields. We would like to use text mining techniques to extract key terms that affect the defect handling time.

We are also considering an expanded approach based on state transition models, in which we calculate the probability of a defect transitioning between any given pair of states during a certain time period.
A similar idea was described in [18], but we want to expand it by computing a separate model for each defect, based on its fields, rather than assuming that all defects behave identically. This could be used to make a large number of predictions about the defect life cycle, for example, to predict how many defects will be reopened. Combining this with methods used for defect injection rates, such as those surveyed in [19], may provide a more realistic prediction for the situation in which new defects are detected within the project time frame.

Our algorithm can be easily generalized to other entities representing work to be done, such as product requirements and production incidents, and we would like to evaluate its accuracy in these cases. We also plan to generalize our system to extract data from different defect reporting systems.

To make our data publicly available for use in related research, we plan to obfuscate our data set by removing any identifying or private information and then publish it.
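To make the state transition idea concrete, the following is a minimal sketch and not the system described in this paper. It estimates a single shared transition model by counting observed transitions, which is the simpler case where all defects are assumed to behave identically; the per-defect model proposed above would condition these estimates on each defect's fields. The input format (one state snapshot per defect per fixed time period), the state names, and both function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def estimate_transition_probabilities(histories):
    """Estimate P(next_state | current_state) from observed state sequences.

    `histories` is a list of state sequences, one per defect, with one
    snapshot per fixed time period (a hypothetical input format).
    """
    counts = defaultdict(Counter)
    for states in histories:
        # Count every consecutive (current, next) pair in the sequence.
        for current, nxt in zip(states, states[1:]):
            counts[current][nxt] += 1

    probabilities = {}
    for current, next_counts in counts.items():
        total = sum(next_counts.values())
        probabilities[current] = {s: n / total for s, n in next_counts.items()}
    return probabilities


def expected_reopened(probabilities, n_closed, closed="Closed", reopened="Reopened"):
    """Expected number of currently closed defects reopened in one period."""
    return n_closed * probabilities.get(closed, {}).get(reopened, 0.0)


# Hypothetical per-period state snapshots for three defects.
histories = [
    ["New", "Open", "Fixed", "Closed"],
    ["New", "Open", "Open", "Fixed", "Closed", "Reopened"],
    ["New", "Open", "Fixed", "Closed", "Closed"],
]
model = estimate_transition_probabilities(histories)
```

With such a model, `expected_reopened(model, n_closed)` gives a rough one-period estimate of reopened defects. Counting over fixed periods keeps the estimate simple, but it needs enough history per state to be meaningful.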
REFERENCES

[1] B. Boehm and V. R. Basili, "Software defect reduction top 10 list," Foundations of empirical software engineering: the legacy of Victor R. Basili, vol. 426, 2005.
[2] S. McConnell, Software estimation: demystifying the black art. Microsoft Press, 2006.
[3] H. Zeng and D. Rine, "Estimation of software defects fix effort using neural networks," in Computer Software and Applications Conference, 2004. COMPSAC 2004. Proceedings of the 28th Annual International, vol. 2, pp. 20-21, IEEE, 2004.
[4] C. Weiss, R. Premraj, T. Zimmermann, and A. Zeller, "How long will it take to fix this bug?," in Proceedings of the Fourth International Workshop on Mining Software Repositories, p. 1, IEEE Computer Society, 2007.
[5] E. Hatcher, O. Gospodnetic, and M. McCandless, "Lucene in action," 2004.
[6] R. Hewett and P. Kijsanayothin, "On modeling software defect repair time," Empirical Software Engineering, vol. 14, no. 2, pp. 165-186, 2009.
[7] E. Giger, M. Pinzger, and H. Gall, "Predicting the fix time of bugs," in Proceedings of the 2nd International Workshop on Recommendation Systems for Software Engineering, pp. 52-56, ACM, 2010.
[8] H. Zhang, L. Gong, and S. Versteeg, "Predicting bug-fixing time: an empirical study of commercial software projects," in Proceedings of the 2013 International Conference on Software Engineering, pp. 1042-1051, IEEE Press, 2013.
[9] L. Davies and U. Gather, "The identification of multiple outliers," Journal of the American Statistical Association, vol. 88, no. 423, pp. 782-792, 1993.
[10] J. Friedman, T. Hastie, and R. Tibshirani, The elements of statistical learning, vol. 1. Springer Series in Statistics, Springer, Berlin, 2001.
[11] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5-32, 2001.
[12] D. S. Hochbaum and D. B. Shmoys, "Using dual approximation algorithms for scheduling problems: theoretical and practical results," Journal of the ACM (JACM), vol. 34, no. 1, pp. 144-162, 1987.
[13] T. Kawaguchi and S. Kyan, "Worst case bound of an LRF schedule for the mean weighted flow-time problem," SIAM Journal on Computing, vol. 15, no. 4, pp. 1119-1129, 1986.
[14] R. L. Graham, "Bounds on multiprocessing timing anomalies," SIAM Journal on Applied Mathematics, vol. 17, no. 2, pp. 416-429, 1969.
[15] R. Kohavi et al., "A study of cross-validation and bootstrap for accuracy estimation and model selection," in IJCAI, vol. 14, pp. 1137-1145, 1995.
[16] P. Bhattacharya and I. Neamtiu, "Bug-fix time prediction models: can we do better?," in Proceedings of the 8th Working Conference on Mining Software Repositories, pp. 207-210, ACM, 2011.
[17] R. K. Yin, Case study research: Design and methods. Sage Publications, 2013.
