In the second part of our evaluation, we simulated the use case in a realistic assignment scenario. To this end, we applied the two best-performing approaches over the project history and predicted every assignment at exactly the project state in which it was originally made. Consequently, all approaches can process less information than in the first part of the evaluation, which was based on the final project state. As expected, the history-based evaluation leads to lower accuracy for all approaches. The model-based approach is less affected by this scenario than the SVM. A possible reason is that the model-based approach depends less on the amount of existing data and more on its quality. This assumption is supported by the behavior of the model-based approach during periods of massive model changes, which lead to lower results. In contrast, the SVM was less sensitive to changes in the model, but more sensitive to fluctuations in the project staffing.
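The replay procedure described above can be outlined as follows. This is a minimal sketch rather than the actual evaluation code; the recommender and history objects and their methods (train, predict_assignee, assignments_in_order, state_before) are hypothetical interfaces assumed only for illustration.

def replay_accuracy(recommender, history):
    """History-based replay: predict every assignment at exactly the
    project state in which it was originally made."""
    correct = total = 0
    for assignment in history.assignments_in_order():
        # Retrain on only the information that was available at that time.
        recommender.train(history.state_before(assignment))
        predicted = recommender.predict_assignee(assignment.work_item)
        correct += int(predicted == assignment.actual_assignee)
        total += 1
    return correct / total if total else 0.0

Running this loop once per approach yields the history-based accuracies discussed above.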
We conclude that the best solution would be a hybrid approach, i.e. a combination of the model-based approach and the SVM. This would yield high accuracy for linked work items while also being able to handle unlinked items.
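A minimal sketch of such a hybrid recommender, assuming hypothetical model_based and svm components with a common predict_assignee method and a has_model_links attribute on work items:

def predict_assignee_hybrid(work_item, model_based, svm):
    # Use the model-based approach whenever the work item is linked to
    # model elements, where it performs best; fall back to the
    # text-based SVM for unlinked items.
    if work_item.has_model_links:
        return model_based.predict_assignee(work_item)
    return svm.predict_assignee(work_item)

Dispatching on the presence of links is only the simplest possible combination; weighting or merging both recommendations would be an alternative design.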