
performance in the zero-shot setting compared to the other baseline
models. In-context learning reduces the risk of overfitting because no
training data is used. Our commit classification approach is therefore
only as general as the category definitions used in the prompt. In
the future, we plan to combine this approach with other commit
classification approaches to further improve classification performance.
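
To illustrate why the approach depends only on the category definitions, the following is a minimal sketch of how a zero-shot prompt could be assembled from such definitions. The category names follow the classical maintenance categories, but the definition texts, the prompt wording, and the helper names `build_zero_shot_prompt` and `parse_label` are assumptions made for illustration, not the prompts used in this work. No model call is shown.

```python
from typing import Dict, Optional

# Hypothetical category definitions; the wording is illustrative only.
CATEGORY_DEFINITIONS: Dict[str, str] = {
    "corrective": "fixes a fault or bug in existing behaviour",
    "adaptive": "adapts the software to a changed environment or requirement",
    "perfective": "improves structure, performance, or documentation",
}


def build_zero_shot_prompt(commit_message: str, definitions: Dict[str, str]) -> str:
    """Assemble a prompt from category definitions only (no labelled examples)."""
    lines = [
        "Classify the commit message into exactly one category.",
        "",
        "Categories:",
    ]
    for name, definition in definitions.items():
        lines.append(f"- {name}: {definition}")
    lines += ["", f"Commit message: {commit_message}", "Category:"]
    return "\n".join(lines)


def parse_label(model_output: str, definitions: Dict[str, str]) -> Optional[str]:
    """Return a known category found in the reply (checked in definition order), else None."""
    reply = model_output.strip().lower()
    for name in definitions:
        if name in reply:
            return name
    return None
```

Under these assumptions, changing or extending the classification scheme amounts to editing the definitions dictionary; no retraining step is involved.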
ENASE 2024 - 19th International Conference on Evaluation of Novel Approaches to Software Engineering