6 CONCLUSIONS
Software defects must be tackled quickly and proactively, because the
quality and efficiency of a software product are essential for it to
survive on the market.
One of the main advantages of the proposed solution is that it can work
with a dataset from any industry, as long as the bug-related data is
organized in a format similar to the one used in our datasets. In
addition, the approach gathers the details associated with a specific
application component in a single place. It was developed to find
similarities between two or more software bugs reported on a particular
system, so that the data displayed refers to the same feature or
category of features. Moreover, most of the time in the development
process is spent on discovering why a bug is present in the software,
how the code base needs to be changed to fix it, and what similar issues
have been reported in the past. Answers to these questions can be
formulated by analyzing the data shown in the web application module of
the presented solution.
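As a rough illustration of the similarity idea summarized above, the following sketch ranks past bug reports against a new one by the Jaccard similarity of their token sets. It is a deliberately simple stand-in and not the BERT-based method used in this work; the function names, report identifiers, and sample texts are hypothetical.

```python
from __future__ import annotations

import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of the two token sets."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta and not tb:
        return 0.0  # two empty texts: define the similarity as 0
    return len(ta & tb) / len(ta | tb)


def rank_similar(query: str, reports: dict[str, str]) -> list[tuple[str, float]]:
    """Return (report id, score) pairs sorted from most to least similar."""
    scored = [(rid, jaccard(query, text)) for rid, text in reports.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical past reports, for illustration only.
past_reports = {
    "BUG-1": "Login page crashes when password is empty",
    "BUG-2": "Export to CSV produces empty file",
    "BUG-3": "Crash on login page with empty password field",
}
ranking = rank_similar("App crashes on login with empty password", past_reports)
for report_id, score in ranking:
    print(f"{report_id}: {score:.2f}")
```

Token overlap of this kind ignores word meaning and inflection ("crash" and "crashes" count as different tokens), which is one reason an embedding-based model may be preferable in practice.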
The presented solution can be improved and enriched with further
features so that users benefit even more from it in their day-to-day
activities. For example, in this research we used a pre-trained BERT
model. However, training the model on an industry-specific vocabulary
could further improve its accuracy, because the better it learns what
domain-specific terms mean, the better its predictions will be.
Furthermore, the product requirements represent another source of
relevant information on a specific topic. For the case analyzed in this
paper, the requirements reside in epic contracts and epic analysis
documents. Integrating this information into the dataset used for
prediction would increase the likelihood of finding the root cause of a
bug.
An Analysis of Improving Bug Fixing in Software Development