• Inspired by the work of Tan et al. (2021), the
ranking could be performed with machine learning
models such as RankNet. This would probably
increase resource usage in exchange for a possibly
more robust and accurate method.
• Manually validating all the extracted VICs would
improve the confidence in the dataset quality and
further strengthen the validity of our proposed
method.
• Building efficient just-in-time vulnerability detec-
tion algorithms based on machine learning models
trained on the extracted VICs dataset.
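As a rough illustration of the first item above, a RankNet-style ranker is trained on pairs of candidates with a pairwise cross-entropy loss. The sketch below shows only that loss. It is not the method of Tan et al. and not part of the approach presented here. The function names, the source of the scores, and the assumption that exactly one candidate commit per fix is labeled as the true VIC are all hypothetical.

```python
import math
from typing import Sequence


def ranknet_pair_loss(score_i: float, score_j: float) -> float:
    """RankNet cross-entropy for a pair where candidate i should rank above j.

    Equals log(1 + exp(-(score_i - score_j))), computed in a numerically
    stable way for large positive or negative differences.
    """
    diff = score_i - score_j
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def ranknet_candidate_loss(scores: Sequence[float], true_index: int) -> float:
    """Average pairwise loss of the labeled candidate against all others.

    Assumption (hypothetical setup): `scores` holds one model score per
    candidate commit, and `true_index` marks the single labeled VIC.
    """
    others = [s for k, s in enumerate(scores) if k != true_index]
    if not others:
        return 0.0
    true_score = scores[true_index]
    return sum(ranknet_pair_loss(true_score, s) for s in others) / len(others)
```

The loss is small when the labeled candidate already scores above the others and grows as it falls below them, which is what a trained ranker would minimize over many fixing commits.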
ACKNOWLEDGMENTS
The presented work was carried out within the
SETIT project (2018-1.2.1-NKP-2018-00004),⁵
supported by project TKP2021-NVA-09,⁶ and the
Ministry of Innovation and Technology NRDI Office
within the framework of the Artificial Intelligence
National Laboratory Program (MILAB). The research
was partly supported by the EU-funded project
AssureMOSS (Grant no. 952647) as well.
Furthermore, Péter Hegedűs was supported by the
Bolyai János Scholarship of the Hungarian Academy
of Sciences and the ÚNKP-21-5-SZTE-570 New
National Excellence Program of the Ministry for
Innovation and Technology.
REFERENCES
(2021. dec. 14.a). Cve 2016-3674: https://nvd.nist.gov/
vuln/detail/cve-2016-3674.
(2021. dec. 14.b). Jgroups fixing commit - https://
github.com/belaban/jgroups/commit/38a882331035ff
ed205d15a5c92b471fd09659c.
(2021. dec. 14.). Sap - project kb: https://github.com/sap/
project-kb/tree/master/vulnerability-data.
(2021. dec. 14.). The state of open source vulnerabili-
ties 2021: https://www.whitesourcesoftware.com/
resources/research-reports/the-state-of-open-source-
vulnerabilities/.
⁵ Project no. 2018-1.2.1-NKP-2018-00004 has been
implemented with the support provided from the National
Research, Development and Innovation Fund of Hungary,
financed under the 2018-1.2.1-NKP funding scheme.
⁶ Project TKP2021-NVA-09 was implemented with the
support provided by the Ministry of Innovation and
Technology of Hungary from the National Research,
Development and Innovation Fund, financed under the
TKP2021-NVA funding scheme.
(2021. dec. 14.). Szz unleashed: https://github.com/
wogscpar/szzunleashed.
(2021. dec. 14.). Xstream: https://github.com/x-
stream/xstream.
(2021. nov. 20.). The mitre corporation - common vulnera-
bilities and exposures: https://www.cve.org/.
(2021. nov. 20.). U.s. national institute of standards
and technology - national vulnerability database:
https://nvd.nist.gov/.
Amin, A., Eldessouki, A., Magdy, M. T., Abdeen, N.,
Hindy, H., and Hegazy, I. (2019). Androshield: Au-
tomated android applications vulnerability detection,
a hybrid static and dynamic analysis approach. Infor-
mation, 10(10).
Bhandari, G. P., Naseer, A., and Moonen, L. (2021).
Cvefixes: Automated collection of vulnerabilities
and their fixes from open-source software. CoRR,
abs/2107.08760.
Borg, M., Svensson, O., Berg, K., and Hansson, D. (2019).
Szz unleashed: an open implementation of the szz
algorithm - featuring example usage in a study of
just-in-time bug prediction for the jenkins project.
Proceedings of the 3rd ACM SIGSOFT International
Workshop on Machine Learning Techniques for Soft-
ware Quality Evaluation - MaLTeSQuE 2019.
Cao, S., Sun, X., Bo, L., Wei, Y., and Li, B. (2021).
Bgnn4vd: Constructing bidirectional graph neural-
network for vulnerability detection. Information and
Software Technology, 136:106576.
Dai, J., Zhang, Y., Jiang, Z., Zhou, Y., Chen, J., Xing, X.,
Zhang, X., Tan, X., Yang, M., and Yang, Z. (2020).
BScout: Direct Whole Patch Presence Test for Java
Executables. USENIX Association, USA.
Falleri, J.-R., Morandat, F., Blanc, X., Martinez, M., and
Monperrus, M. (2014). Fine-grained and accurate
source code differencing. ASE 2014 - Proceedings of
the 29th ACM/IEEE International Conference on Au-
tomated Software Engineering.
Gkortzis, A., Mitropoulos, D., and Spinellis, D. (2018).
Vulinoss: A dataset of security vulnerabilities in open-
source systems. In 2018 IEEE/ACM 15th Interna-
tional Conference on Mining Software Repositories
(MSR), pages 18–21.
Herzog, S. (2010). Xml external entity attacks (xxe). Re-
trieved October, 13:2013.
Li, F. and Paxson, V. (2017). A large-scale empirical study
of security patches. In Proceedings of the 2017 ACM
SIGSAC Conference on Computer and Communications
Security, CCS ’17, pages 2201–2215, New York,
NY, USA. Association for Computing Machinery.
Li, H., Kim, T., Bat-Erdene, M., and Lee, H. (2013).
Software vulnerability detection using backward trace
analysis and symbolic execution. In 2013 Interna-
tional Conference on Availability, Reliability and Se-
curity, pages 446–454.
Meneely, A., Srinivasan, H., Musa, A., Tejeda, A. R.,
Mokary, M., and Spates, B. (2013). When a patch
goes bad: Exploring the properties of vulnerability-
contributing commits. In 2013 ACM / IEEE Interna-
A Vulnerability Introducing Commit Dataset for Java: An Improved SZZ based Approach