
Table 4: Point estimation results (%).

                            Using the mean of projects  Using the median of projects
Dataset         Classifier  MMRE   MdMRE  PRED          MMRE   MdMRE  PRED
coc81           LD          189    183    33            131    131    33.6
                K-NN        189    183    33            131    131    33.6
                DT          192    190    29.6          134    131    30.2
cocomonasa_v1   LD          69     45     42.2          51     32     54.8
                K-NN        69     45     42            51     32     54.6
                DT          76     50     26.8          58     40     39.4
desharnais_1_1  LD          13     12     84.14         13     12     86.42
                K-NN        13     12     84.14         13     12     86.71
                DT          16     15     79            15     15     81.85
nasa93          LD          70     52     55.5          52     40     57.7
                K-NN        69     52     55.5          52     40     57.7
                DT          72     52     51.2          55     41     53.4
sdr05           LD          45     28     45.5          37     26     52
                K-NN        45     28     45.5          37     26     52
                DT          59     44     28.5          52     38     35
sdr06           LD          31     31     50.5          25     23     67
                K-NN        30     31     50.5          24     23     67
                DT          34     36     44.5          27     25     61
sdr07           LD          14     14     84.66         14     14     79.6
                K-NN        14     13     81.33         14     14     76.3
                DT          14     13     81.33         14     14     76.3
 
problem and propose an approach that classifies new
software projects into one of the dynamically created
effort classes, each corresponding to an effort
interval. In our experiments, we obtain higher
hit rates than other studies in the literature. For point
estimation, the MMRE, MdMRE, and PRED(25) values
are comparable to those in the literature for most of
the datasets, even though we use simple methods
such as the mean and median of projects.
  Future work includes using different clustering
techniques to find effort classes and applying
regression-based models for point estimation.
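
The point estimation measures reported in Table 4 can be illustrated with a minimal Python sketch. It assumes the usual definitions: MRE = |actual - predicted| / actual, MMRE and MdMRE as the mean and median of MRE, and PRED(25) as the share of projects with MRE of at most 0.25. It also assumes that the point estimate for a project is the mean or median effort of the training projects in its predicted effort class. The function names and the sample efforts are hypothetical and are not taken from the datasets used in this study.

```python
from statistics import mean, median


def mre(actual, predicted):
    """Magnitude of relative error for a single project."""
    return abs(actual - predicted) / actual


def accuracy_measures(actuals, predictions, level=0.25):
    """Return MMRE, MdMRE and PRED(level) as percentages."""
    errors = [mre(a, p) for a, p in zip(actuals, predictions)]
    return {
        "MMRE": 100 * mean(errors),
        "MdMRE": 100 * median(errors),
        "PRED": 100 * sum(e <= level for e in errors) / len(errors),
    }


def point_estimate(class_efforts, use_median=False):
    """Assumed estimator: mean or median effort of the projects in a class."""
    return median(class_efforts) if use_median else mean(class_efforts)


# Hypothetical training efforts per effort class (illustrative values only).
training_efforts = {0: [10.0, 12.0, 20.0], 1: [100.0, 120.0, 200.0]}

# Hypothetical test projects: (predicted effort class, actual effort).
test_projects = [(0, 12.0), (1, 150.0), (0, 16.0), (1, 100.0)]

actuals = [effort for _, effort in test_projects]

mean_predictions = [point_estimate(training_efforts[c]) for c, _ in test_projects]
median_predictions = [
    point_estimate(training_efforts[c], use_median=True) for c, _ in test_projects
]

mean_results = accuracy_measures(actuals, mean_predictions)
median_results = accuracy_measures(actuals, median_predictions)

print("mean of projects:  ", mean_results)
print("median of projects:", median_results)
```

Lower MMRE and MdMRE and higher PRED indicate better estimates, which is how the mean-based and median-based columns of Table 4 are compared.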
ACKNOWLEDGEMENTS 
This research is supported by Boğaziçi University 
research fund under grant number BAP 06HA104 
and the Turkish Scientific Research Council 
(TUBITAK) under grant number EEEAG 108E014. 
SOFTWARE EFFORT ESTIMATION AS A CLASSIFICATION PROBLEM