
Table 4: Point estimation results (%).

                            Using the mean of projects    Using the median of projects
Dataset         Classifier  MMRE      MdMRE     PRED(25)  MMRE      MdMRE     PRED(25)
coc81           LD          189       183       33        131       131       33.6
                K-NN        189       183       33        131       131       33.6
                DT          192       190       29.6      134       131       30.2
cocomonasa_v1   LD          69        45        42.2      51        32        54.8
                K-NN        69        45        42        51        32        54.6
                DT          76        50        26.8      58        40        39.4
desharnais_1_1  LD          13        12        84.14     13        12        86.42
                K-NN        13        12        84.14     13        12        86.71
                DT          16        15        79        15        15        81.85
nasa93          LD          70        52        55.5      52        40        57.7
                K-NN        69        52        55.5      52        40        57.7
                DT          72        52        51.2      55        41        53.4
sdr05           LD          45        28        45.5      37        26        52
                K-NN        45        28        45.5      37        26        52
                DT          59        44        28.5      52        38        35
sdr06           LD          31        31        50.5      25        23        67
                K-NN        30        31        50.5      24        23        67
                DT          34        36        44.5      27        25        61
sdr07           LD          14        14        84.66     14        14        79.6
                K-NN        14        13        81.33     14        14        76.3
                DT          14        13        81.33     14        14        76.3
problem and propose an approach that classifies new
software projects into one of the dynamically created
effort classes, each corresponding to an effort
interval. In our experiments, we obtain higher
hit rates than other studies in the literature. For point
estimation, the MMRE, MdMRE, and PRED(25) values
are comparable to those in the literature for most of
the datasets, even though we use simple methods such
as mean and median regression.
Future work includes using different clustering
techniques to find effort classes and applying
regression-based models for point estimation.
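
As a rough illustration of how the point estimation measures reported above are usually computed, the following minimal sketch uses the standard definitions: MRE = |actual - predicted| / actual, MMRE and MdMRE are the mean and median of MRE over the test projects, and PRED(25) is the share of projects with MRE of at most 0.25. It assumes, based on the column headings of Table 4, that the estimate for a new project is the mean or the median effort of the training projects in its predicted effort class. The function names and the data are hypothetical and serve only as an example; they are not taken from the study.

```python
from statistics import mean, median


def mre(actual, predicted):
    """Magnitude of relative error for a single project."""
    return abs(actual - predicted) / actual


def evaluate(actuals, predictions, level=0.25):
    """Return MMRE, MdMRE and PRED(level), all expressed as percentages."""
    errors = [mre(a, p) for a, p in zip(actuals, predictions)]
    within = sum(1 for e in errors if e <= level)
    return {
        "MMRE": 100.0 * mean(errors),
        "MdMRE": 100.0 * median(errors),
        "PRED": 100.0 * within / len(errors),
    }


def class_estimate(efforts, use_median=False):
    """Point estimate for a class: mean or median effort of its projects.

    Assumption: this mirrors the "mean of projects" / "median of projects"
    columns of Table 4; it is not confirmed as the exact procedure.
    """
    return median(efforts) if use_median else mean(efforts)


# Hypothetical data: efforts of the training projects in each effort class,
# and test projects given as (predicted class, actual effort).
class_efforts = {0: [10.0, 12.0, 14.0], 1: [40.0, 50.0, 90.0]}
test_set = [(0, 11.0), (1, 50.0), (1, 80.0)]

actuals = [effort for _, effort in test_set]
for use_median in (False, True):
    predictions = [
        class_estimate(class_efforts[cls], use_median) for cls, _ in test_set
    ]
    label = "median" if use_median else "mean"
    print(label, evaluate(actuals, predictions))
```

With skewed effort classes, the median is less sensitive to a single very large project than the mean, which may partly explain why the median columns of Table 4 tend to show lower error values.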
ACKNOWLEDGEMENTS
This research is supported by Boğaziçi University
research fund under grant number BAP 06HA104
and the Turkish Scientific Research Council
(TUBITAK) under grant number EEEAG 108E014.
SOFTWARE EFFORT ESTIMATION AS A CLASSIFICATION PROBLEM