6 CONCLUSIONS
The analysis of reported issues and related
comments showed that a deeper study of the impact
of non-NL elements is needed to explore the semantic
aspects of reports. This extends the scope of report
analysis. The introduced text pre-processing and
derived text features facilitate understanding of the
classification decisions and help to achieve better accuracy.
The main threat to the validity of our study is that the
results are restricted to a few projects. Nevertheless, the
presented methodology is broadly applicable owing to
similarities in the contents and structure of software
repositories.
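As a minimal illustration of the kind of pre-processing
referred to above, the following Python sketch separates a
few non-NL elements from a report text and counts them as
derived features. The regular expressions, the feature names
and the function extract_features are assumptions made for
illustration only. They do not reproduce the pre-processing
used in this study.

```python
import re

# Hypothetical patterns for a few non-NL elements (assumptions, not the
# patterns used in the study). Order matters: URLs are removed first so
# that their path part is not counted again as a file path.
NON_NL_PATTERNS = {
    "url": re.compile(r"https?://\S+"),
    "stack_frame": re.compile(r"^[ \t]*at[ \t]+[\w.$]+\(.*\)[ \t]*$",
                              re.MULTILINE),
    "file_path": re.compile(r"(?:/[\w.-]+){2,}"),
}


def extract_features(text):
    """Return (features, cleaned_text) for one issue report or comment.

    features maps each non-NL element type to its number of occurrences
    and adds the count of remaining natural-language words.
    """
    features = {}
    cleaned = text
    for name, pattern in NON_NL_PATTERNS.items():
        # subn replaces every match with a space and reports the count.
        cleaned, count = pattern.subn(" ", cleaned)
        features[name] = count
    # Collapse the whitespace left behind by the removed elements.
    cleaned = " ".join(cleaned.split())
    features["nl_words"] = len(cleaned.split())
    return features, cleaned
```

The counts can be appended to the text-derived feature
vector, while the cleaned text is passed on to the usual
text mining steps.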
Text mining is mainly targeted at classification
or clustering of the considered textual objects. Here
we can use diverse statistical and machine learning
techniques, which can be combined and adapted to
the specificity of the project and the properties sought.
This can be enhanced with contextual and correlation
analysis. Distinguishing between texts generated by bots
and those authored by users, developers or testers could
narrow semantic searches and extend the scope of
repository studies. Further research is targeted at
correlating issue handling process schemes and times with
semantic aspects of textual descriptions and other
issue features, based on our previous experience
(Sosnowski et al., 2017; Polaczek & Sosnowski,
2021). This can be complemented with questionnaire
studies involving project participants.
REFERENCES
Banerjee, S. et al. (2017). Automated triaging of very large
bug repositories. In Information and Software
Technology, 89.
Ebrahimi, N. et al. (2019). An HMM-based approach for
automatic detection and classification of duplicate bug
reports. In Information and Software Technology 113
(2019) 98–109.
Fan, Y., Xia, X., Lo, D., Hassan, A.E. (2018). Chaff from
the wheat: characterizing and determining valid bug
reports. In IEEE Transactions on Software
Engineering, August 2018.
Ferreira Gomes, L.A., et al. (2019). Bug report severity
level prediction in open source software: A survey and
research opportunities. In Information and Software
Technology 115 (2019) 58–78.
Herbold, S., Trautsch, A., Trautsch, F. (2020). On the
feasibility of automated prediction of bug and non-bug
issues. In Empirical Software Engineering 25, 5333–5369.
Hindle, A., Onuczko, C. (2019). Preventing duplicate bug
reports by continuously querying bug reports. In
Empirical Software Engineering, vol. 24, no. 2.
Huang, Y., et al. (2019). An empirical study on the issue
reports with questions raised during the issue resolving
process. In Empirical Software Engineering 24, 718–750.
Li, Q. et al. (2022). A survey on text classification: From
traditional to deep learning. In ACM Transactions on
Intelligent Systems and Technology vol. 13, no. 2.
Nadeem, A., Usman Sarwar, M., Zubair Malik, M. (2021).
Automatic issue classifier: a transfer learning
framework for classifying issue reports. In IEEE
International Symposium on Software Reliability
Engineering Workshops (ISSREW), October.
Nagvani, N.K., Verma, S. (2012). CLUBAS: An algorithm
and Java based tool for software bug classification
using bug attributes similarities. In Journal of Software
Engineering and Applications, vol. 5, no. 6, 436-447.
Panichella, S., Canfora, G., Di Sorbo, A. (2021).
‘Won’t we fix this issue?’ Qualitative characterization
and automated identification of wontfix issues on
GitHub. In Information and Software Technology, Vol.
139, Nov., 106665.
Polaczek, J., Sosnowski, J. (2021). Exploring the software
repositories of embedded systems: An industrial
experience. In Information and Software Technology,
vol. 131.
Sosnowski, J., Dobrzyński, B., Janczarek, P. (2017).
Analysing problem handling schemes in software
projects. In Information and Software Technology, vol.
91.
Umer, Q., Liu, H., Sultan, Y. (2019). Sentiment based
approval prediction for enhancement reports. In
Journal of Systems and Software, 155 (2019) 57-69.
Vidoni, M. (2021). A systematic process for Mining
Software Repositories: Results from a systematic
literature review. In Information and Software
Technology, vol 4 December.
Wenting, D.A, et al. (2019). Analysis and detection of
information types of open source software issue
discussions. In ICSE, IEEE/ACM 41st International
Conference on Software Engineering.
Yahav, I., Shehory, O., Schwartz, D. (2019). Comments
mining with TF-IDF: The inherent bias and its removal.
In IEEE Transactions on Knowledge and Data
Engineering, vol. 31, no. 3, pp. 437-450.
Zabardast, E., Gonzalez-Huerta, J., Šmite, D. (2020).
Refactoring, bug fixing, and new development effect on
technical debt: An industrial case study. In 46th
Euromicro SEAA Conference, pp. 376-384.
Zhang, T., et al. (2016). Towards more accurate severity
prediction and fixer recommendation of software bugs.
In Journal of Systems and Software 117 166–184.
Zhang, W. et al. (2019). FineLocator: A novel approach to
method-level fine-grained bug localization by query
expansion. In Information and Software Technology
110 (2019) 121–135.
Zimmermann, T. et al. (2010). What makes a good bug
report? In IEEE Transactions on Software Engineering,
vol. 36, no. 5, pp. 618-643.