Breiman, L. (2001). Random forests. Machine Learning,
45:5–32.
Dam, H. K., Pham, T., Ng, S. W., Tran, T., Grundy, J.,
Ghose, A., Kim, T., and Kim, C.-J. (2018). A deep
tree-based model for software defect prediction.
Deerwester, S. C., Dumais, S. T., Landauer, T. K., Furnas,
G. W., and Harshman, R. A. (1990). Indexing by latent
semantic analysis. Journal of the American Society for
Information Science, 41:391–407.
Fawcett, T. (2006). An introduction to ROC analysis. Pat-
tern Recognition Letters, 27(8):861–874.
Fu, W. and Menzies, T. (2017). Easy over hard: A case
study on deep learning. In Proc. of the Joint Meeting
on Foundations of Software Engineering, pages 49–60.
Herbold, S., Trautsch, A., Trautsch, F., and Ledel, B.
(2022). Problems with SZZ and features: An empirical
study of the state of practice of defect prediction data
collection. Empirical Software Engineering, 27(2).
Howard, J. et al. (2018). fastai. https://github.com/fastai/
fastai.
Le, Q. V. and Mikolov, T. (2014). Distributed represen-
tations of sentences and documents. Computing Re-
search Repository (CoRR), abs/1405.4:1–9.
Li, J., He, P., Zhu, J., and Lyu, M. R. (2017). Software de-
fect prediction via convolutional neural network. In
IEEE International Conf. on Software Quality, Relia-
bility and Security, pages 318–328.
Majumder, S., Balaji, N., Brey, K., Fu, W., and Menzies,
T. (2018). 500+ times faster than deep learning: A
case study exploring faster methods for text mining
stackoverflow. In Proceedings of the 15th Interna-
tional Conference on Mining Software Repositories,
MSR ’18, pages 554–563.
Malhotra, R. (2015). A systematic review of machine learn-
ing techniques for software fault prediction. Applied
Soft Computing, 27:504–518.
Matloob, F., Ghazal, T. M., Taleb, N., Aftab, S., Ahmad,
M., Khan, M. A., Abbas, S., and Soomro, T. R.
(2021). Software defect prediction using ensemble
learning: A systematic literature review. IEEE Access,
9:98754–98771.
Miholca, D. and Onet-Marian, Z. (2020). An analysis of
aggregated coupling’s suitability for software defect
prediction. In 2020 22nd International Symposium on
Symbolic and Numeric Algorithms for Scientific Com-
puting, pages 141–148. IEEE Computer Society.
Miholca, D.-L. and Czibula, G. (2019). Software defect
prediction using a hybrid model based on semantic
features learned from the source code. In Knowledge
Science, Engineering and Management: 12th International
Conference, Part I, pages 262–274.
Miholca, D.-L., Czibula, G., and Tomescu, V. (2020).
COMET: A conceptual coupling based metrics suite for
software defect prediction. Procedia Computer Sci-
ence, 176:31–40.
Miholca, D.-L., Tomescu, V.-I., and Czibula, G. (2022). An
in-depth analysis of the software features’ impact on
the performance of deep learning-based software de-
fect predictors. IEEE Access, 10:64801–64818.
Narayanan, A., Chandramohan, M., Venkatesan, R., Chen,
L., Liu, Y., and Jaiswal, S. (2017). Graph2vec: Learn-
ing distributed representations of graphs.
Neto, E. C., da Costa, D. A., and Kulesza, U. (2018).
The impact of refactoring changes on the SZZ algorithm:
An empirical study. In 2018 IEEE 25th International
Conference on Software Analysis, Evolution
and Reengineering (SANER), pages 380–390.
Pan, C., Lu, M., and Xu, B. (2021). An empirical study on
software defect prediction using CodeBERT model.
Applied Sciences, 11(11).
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V.,
Thirion, B., Grisel, O., Blondel, M., Prettenhofer,
P., Weiss, R., Dubourg, V., Vanderplas, J., Passos,
A., Cournapeau, D., Brucher, M., Perrot, M., and
Duchesnay, E. (2011). Scikit-learn: Machine learning
in Python. Journal of Machine Learning Research,
12:2825–2830.
Řehůřek, R. and Sojka, P. (2010). Software framework for
topic modelling with large corpora. In Proceedings of
the LREC 2010 Workshop on New Challenges for NLP
Frameworks, pages 45–50. ELRA.
Rozemberczki, B., Kiss, O., and Sarkar, R. (2020). Karate
Club: An API Oriented Open-source Python Frame-
work for Unsupervised Learning on Graphs. In Proc.
of the ACM International Conf. on Information and
Knowledge Management, pages 3125–3132. ACM.
Sayyad, S. and Menzies, T. (2015). The PROMISE reposi-
tory of software engineering databases. School of In-
formation Technology and Engineering, University of
Ottawa, Canada.
Sikic, L., Kurdija, A. S., Vladimir, K., and Silic, M. (2022).
Graph neural network for source code defect predic-
tion. IEEE Access, 10:10402–10415.
Tantithamthavorn, C., McIntosh, S., Hassan, A. E., and
Matsumoto, K. (2019). The impact of automated
parameter optimization on defect prediction models.
IEEE Trans. on Software Eng., 45(7):683–711.
Uddin, M. N., Li, B., Ali, Z., Kefalas, P., Khan, I., and
Zada, I. (2022). Software defect prediction employ-
ing BiLSTM and BERT-based semantic feature. Soft
Computing, 26:1–15.
Wang, S., Liu, T., Nam, J., and Tan, L. (2020). Deep se-
mantic feature learning for software defect prediction.
IEEE Trans. on Software Eng., 46(12):1267–1293.
Wang, S., Liu, T., and Tan, L. (2016). Automatically learn-
ing semantic features for defect prediction. In Proc. of
the Int. Conf. on Software Eng., pages 297–308.
Zhang, D., Tsai, J., and Boetticher, G. (2007). Improving
credibility of machine learner models in software en-
gineering. In Advances in Machine Learning Applica-
tions in Software Engineering, pages 52–72.
ICSOFT 2023 - 18th International Conference on Software Technologies