
 
ACKNOWLEDGEMENTS 
This research is supported in part by the Bogazici
University Research Fund under grant number BAP-06HA104.
REFERENCES 
Alpaydin, E., “Introduction to Machine Learning”, The
MIT Press, October 2004.
Auer, M., Trendowicz, A., Graser, B., Haunschmid, E. and 
Biffl, S., “Optimal Project Feature Weights in 
Analogy Based Cost Estimation: Improvement and 
Limitations”, IEEE Transactions on Software 
Engineering, 32(2), 2006, pp. 83-92.
Basili, V. R., Briand, L. C., and Melo, W. L., “A 
Validation of Object-Oriented Design Metrics as 
Quality Indicators”, IEEE Transactions on Software 
Engineering, 22(10), 1996, pp. 751-761. 
Domingos, P. and Pazzani, M., “On the Optimality of the 
Simple Bayesian Classifier under Zero-One Loss”, 
Machine Learning, 29(2-3), 1997, pp. 103-130.
Fenton, N. E. and Neil, M., “A critique of software defect
prediction models”, IEEE Transactions on Software
Engineering, 25(5), 1999, pp. 675-689.
Fenton, N. and Ohlsson, N., “Quantitative Analysis of 
Faults and Failures in a Complex Software System”,
IEEE Transactions on Software Engineering, 2000,
pp. 797-814. 
Frank, E., Hall, M., Pfahringer, B., “Locally weighted 
naive Bayes”, In Proceedings of the Uncertainty in
Artificial Intelligence Conference, Acapulco, Mexico, 
Morgan Kaufmann, 2003, pp. 249-256. 
Hall, M., “A decision tree-based attribute weighting filter 
for naive Bayes”, Knowledge-Based Systems, 20(2),
2007, pp. 120-126. 
Harrold, M. J., “Testing: a roadmap”, In Proceedings of 
the Conference on the Future of Software Engineering, 
ACM Press, New York, NY, 2000, pp. 61-72. 
Khoshgoftaar, T. M. and Seliya, N., “Fault Prediction 
Modeling for Software Quality Estimation: Comparing 
Commonly Used Techniques”, Empirical Software 
Engineering, 8(3), 2003, pp. 255-283.
Lewis, D. D., “Naive (Bayes) at Forty: The Independence 
Assumption in Information Retrieval”, In Proceedings 
of the 10th European Conference on Machine 
Learning, C. Nedellec and C. Rouveirol, Eds. Lecture 
Notes In Computer Science, vol. 1398. Springer-
Verlag, London, 1998, pp. 4-15. 
Menzies, T., Stefano, J. D., Chapman, M., “Learning Early 
Lifecycle IV and V Quality Indicators”, In
Proceedings of the IEEE Software Metrics
Symposium, 2003. 
Menzies, T., DiStefano, J., Orrego, A., Chapman, R., 
“Assessing Predictors of Software Defects”, In
Proceedings of Workshop Predictive Software 
Models, 2004. 
Menzies, T., Greenwald, J., Frank, A., “Data mining static
code attributes to learn defect predictors”, IEEE 
Transactions on Software Engineering, 33(1), 2007, 
pp. 2–13.  
Mladenic, D. and Grobelnik, M., “Feature Selection for 
Unbalanced Class Distribution and Naive Bayes”, In 
Proceedings of the Sixteenth International Conference
on Machine Learning, I. Bratko and S. Dzeroski, Eds. 
Morgan Kaufmann Publishers, San Francisco, CA, 
1999, pp. 258-267. 
Munson, J. and Khoshgoftaar, T. M., “Regression 
modelling of software quality: empirical 
investigation”, Journal of Electronic Materials, 19(6),
1990, pp. 106-114. 
Munson, J. and Khoshgoftaar, T. M., “The Detection of 
Fault-Prone Programs”, IEEE Transactions on 
Software Engineering, 18(5), 1992, pp. 423-433.
Nagappan, N., Williams, L., Osborne, J., Vouk, M.,
Abrahamsson, P., “Providing Test Quality Feedback 
Using Static Source Code and Automatic Test Suite 
Metrics”, International Symposium on Software 
Reliability Engineering, 2005. 
NASA/WVU IV&V Facility, Metrics Data Program,
available from http://mdp.ivv.nasa.gov; Internet; 
accessed 2007. 
Padberg, F., Ragg, T., Schoknecht, R., “Using machine
learning for estimating the defect content after an 
inspection”, IEEE Transactions on Software 
Engineering, 30(1), 2004, pp. 17-28.
Quinlan, J. R., “C4.5: Programs for Machine Learning”,
Morgan Kaufmann, San Mateo, CA, 1993. 
Shepperd, M. and Ince, D., “A Critique of Three Metrics”,
Journal of Systems and Software, 26(3), 1994, pp.
197-210.
Song, Q., Shepperd, M., Cartwright, M., Mair, C.,
“Software Defect Association Mining and Defect
Correction Effort Prediction”, IEEE Transactions on
Software Engineering, 32(2), 2006, pp. 69-82.
Tahat, B. V., Korel, B., Bader, A., “Requirement-Based
Automated Black-Box Test Generation”, In
Proceedings of 25th Annual International Computer
Software and Applications Conference, Chicago,
Illinois, 2001, pp. 489-495.
Zhang, H. and Sheng, S., “Learning weighted naive Bayes
with accurate ranking”, In Proceedings of the
4th IEEE International Conference on Data Mining,
1(4), 2004, pp. 567-570.
Zheng, Z. and Webb, G. I., “Lazy Learning of Bayesian 
Rules”, Machine Learning, 41(1), 2000, pp. 53-84.