
Construct Validity: Construct validity concerns the
tool used to extract Program Dependency Graphs
(PDGs), namely Joern. Although Joern is widely used,
it may have inherent flaws that affect the extracted
graphs. We nevertheless chose Joern for PDG
extraction and manually reviewed its output to
identify and address any issues. Modifying the
baseline approaches poses another potential threat;
we mitigate it by retrieving the original source code
directly from the GitHub repositories associated with
the analyzed techniques.
Criterion Validity: In vulnerability detection, metrics
such as precision, recall, and F-measure quantify how
well the identified vulnerabilities align with the
functions that are actually vulnerable. High criterion
validity indicates that our algorithm effectively
predicts vulnerabilities in line with widely accepted
standards.
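As an illustration of the metrics named above, the following minimal sketch computes precision, recall, and F-measure (F1) from binary function-level labels. The function name `precision_recall_f1` and the label lists are hypothetical and are not taken from the evaluation; the formulas are the standard definitions.

```python
def precision_recall_f1(predicted, actual):
    """Return (precision, recall, F1) for binary labels, where 1 = vulnerable."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)  # correctly flagged
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)  # flagged but safe
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)  # missed vulnerabilities
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical labels for six functions (illustrative only, not paper data).
predicted = [1, 1, 0, 1, 0, 0]
actual = [1, 0, 0, 1, 1, 0]
precision, recall, f1 = precision_recall_f1(predicted, actual)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Guarding the divisions keeps the sketch defined when a model flags nothing or when no function is vulnerable.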
6 CONCLUSION AND FUTURE WORK
This paper introduces VulBertCNN, an automated
software vulnerability detection approach that
combines CodeBERT with a Convolutional Neural
Network (CNN). It aims to overcome the limitations of
state-of-the-art approaches that rely on text-based
or graph-based representations alone.
The proposed approach integrates the CodeBERT
embedding model with multiple centrality measures
when generating images from PDGs, in order to assess
the overall impact of each line of code within a
function and thereby determine the function's
vulnerability status. The evaluation covers 16
centrality combinations derived from 5 centrality
measures and shows that the highest accuracy is
attained with a combination of 4 centrality measures.
Compared with the previous state-of-the-art
techniques, this combination raises accuracy from
78.6% to 95.7% and F1-score from 62.6% to 89%. These
results indicate that combining CodeBERT embeddings
with a CNN is effective for vulnerability detection.
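To make the idea of combining line embeddings with centrality concrete, the sketch below shows one plausible reading, not the implementation evaluated in this paper: each PDG node (a line of code) receives a centrality score, and that score scales the line's embedding vector to form one image channel. Degree centrality, the toy graph, the 2-dimensional vectors standing in for CodeBERT embeddings, and the helper names are all assumptions made for illustration.

```python
def degree_centrality(nodes, edges):
    """Degree centrality: (in-degree + out-degree) / (n - 1) for each node."""
    degree = {n: 0 for n in nodes}
    for src, dst in edges:
        degree[src] += 1
        degree[dst] += 1
    scale = len(nodes) - 1
    return {n: (degree[n] / scale if scale else 0.0) for n in nodes}


def weight_embeddings(embeddings, centrality):
    """Scale each line's embedding by its centrality score, in line order."""
    return [
        [centrality[line] * x for x in vec]
        for line, vec in sorted(embeddings.items())
    ]


# Hypothetical PDG: nodes are line numbers, edges are dependencies.
nodes = [1, 2, 3, 4]
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]

# Hypothetical 2-dimensional vectors standing in for CodeBERT embeddings.
embeddings = {1: [3.0, 0.0], 2: [0.0, 3.0], 3: [1.0, 2.0], 4: [3.0, 3.0]}

centrality = degree_centrality(nodes, edges)
channel = weight_embeddings(embeddings, centrality)  # one row per line
for line, row in zip(sorted(embeddings), channel):
    print(line, [round(x, 3) for x in row])
```

Under this reading, repeating the weighting with a different centrality measure would yield a further channel, and the stacked channels would form the image passed to the CNN.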
Future work will focus on reducing program
dependency graph generation time with tools such as
Frama-C and on incorporating dynamic analysis to
improve detection. In addition, we plan to narrow the
search space within a function by comparing source
code with vulnerability reports from the National
Vulnerability Database, with the aim of identifying
statement-level vulnerabilities.
ACKNOWLEDGEMENTS
This research is supported by a fellowship from the
Information and Communication Technology (ICT)
Division, Ministry of Posts, Telecommunications
and Information Technology, Bangladesh.
No-56.00.0000.052.33.001.23-09; Date: 04.02.2024.
ENASE 2024 - 19th International Conference on Evaluation of Novel Approaches to Software Engineering