Authors:
Rabaya Mim; Abdus Satter; Toukir Ahammed and Kazi Sakib
Affiliation:
Institute of Information Technology, University of Dhaka, Dhaka, Bangladesh
Keyword(s):
Source Code, Vulnerability Detection, CodeBERT, Centrality Analysis, Convolutional Neural Network.
Abstract:
As software programs continue to grow in size and complexity, software vulnerabilities have become a significant security threat, and detecting them is a major concern. Although Deep Learning (DL) approaches have shown promising results, previous studies have struggled to maintain detection accuracy and scalability at the same time. To address this challenge, we propose VulBertCNN, a method for automated software vulnerability detection using CodeBERT and a Convolutional Neural Network. The aim is to achieve both accuracy and scalability when identifying vulnerabilities in source code. The approach uses the pre-trained CodeBERT embedding model in a graphical analysis of source code and then applies complex network analysis theory to convert a function's source code into an image that takes into account both syntactic and semantic information. Subsequently, a text convolutional neural network is employed to detect vulnerabilities from the generated images of code. In comparison with three existing CNN-based methods, TokenCNN, VulCNN and ASVD, our experimental results demonstrate an improvement in accuracy from 78.6% to 95.7% and in F1 measure from 62.6% to 89%, which are increases of 21.7% and 26.3%, respectively. This underscores the effectiveness of our approach in detecting vulnerabilities in large-scale source code. Developers can therefore use these findings to promptly apply effective patches to vulnerable functions.
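The code-to-image step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which centrality measures, graph representation, or embedding dimensions VulBertCNN uses, so everything below is an assumption made for illustration: the statement graph `adj` is a hypothetical four-node chain, the vectors in `embeddings` are small stand-ins for real CodeBERT embeddings, and degree and closeness centrality are chosen only as two common measures. Each centrality score scales the embedding of its node, and each measure yields one channel of the resulting "image".

```python
from collections import deque

def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adj)
    return [len(neighbours) / (n - 1) for neighbours in adj]

def closeness_centrality(adj):
    """Reachable nodes divided by the sum of shortest-path distances (BFS)."""
    scores = []
    for src in range(len(adj)):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores.append((len(dist) - 1) / total if total > 0 else 0.0)
    return scores

def code_to_image(embeddings, adj):
    """Build one channel per centrality measure.

    Each channel is a (nodes x embedding_dim) matrix in which row i is the
    embedding of node i scaled by that node's centrality score.
    """
    channels = []
    for measure in (degree_centrality, closeness_centrality):
        scores = measure(adj)
        channels.append([[s * x for x in vec] for s, vec in zip(scores, embeddings)])
    return channels

# Hypothetical undirected statement graph for one function: 0 - 1 - 2 - 3.
adj = [[1], [0, 2], [1, 3], [2]]

# Stand-in vectors; a real pipeline would take these from CodeBERT (assumption).
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]

image = code_to_image(embeddings, adj)
print(len(image), len(image[0]), len(image[0][0]))  # channels, rows, columns
```

In this sketch the output has two channels of four rows and two columns. A classifier such as a text CNN would then take the stacked channels as input; that stage is omitted here because the abstract gives no architectural details.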