Jonas Helming, Holger Arndt, Zardosht Hodaie, Maximilian Koegel, Nitesh Narayan


Many software development projects maintain repositories that manage work items such as bug reports or tasks. In open-source projects, these repositories are accessible to end users or clients, who can enter new work items. These artifacts then have to be triaged. The most important step is the initial assignment of a work item to a responsible developer. Consequently, a number of approaches exist to semi-automatically assign bug reports, e.g. using machine learning methods. We compare different approaches for assigning new work items to developers by mining textual content as well as structural information. Furthermore, we propose a novel model-based approach, which also considers relations from work items to the system specification for the assignment. The approaches are applied to different types of work items, including bug reports and tasks. To evaluate our approaches, we mine the model repositories of three different projects. We also include history data to determine how well the approaches work in different project states.



Paper Citation

in Harvard Style

Helming J., Arndt H., Hodaie Z., Koegel M. and Narayan N. (2010). SEMI-AUTOMATIC ASSIGNMENT OF WORK ITEMS. In Proceedings of the Fifth International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE, ISBN 978-989-8425-21-8, pages 149-158. DOI: 10.5220/0003000901490158
