results, as the model may not generalize well to
real-world applications.
6 CONCLUSION AND FUTURE WORK
In this study, we address the problem of vulnerability samples being available for some programming languages but not for others. To overcome this problem, we design a method that extracts vulnerability prediction knowledge from the available data samples and then uses it to predict vulnerabilities in another programming language. We also add the flexibility to update the model once new samples are provided. Specifically, we built a model that is able to detect vulnerabilities in both Java and C source code. We trained a CNN-based model on C source code from the VDISC dataset and then modified the model to detect the learned vulnerabilities in Java source code, using Java code samples extracted from the SARD dataset. Our experiments show that, despite the many differences between programming languages, a single model can be trained to detect vulnerabilities in more than one programming language. This study could be extended to detect vulnerabilities in other commonly used programming languages such as Python and JavaScript. It could also be improved by training the model on other common vulnerability types from different programming languages.
ACKNOWLEDGEMENTS
This work was funded by the Scientific and Technological Research Council of Turkey under the 1515 Frontier R&D Laboratories Support Program, project no. 5169902.
REFERENCES
Bilgin, Z., Ersoy, M. A., Soykan, E. U., Tomur, E., Çomak, P., and Karaçay, L. (2020). Vulnerability prediction from source code using machine learning. IEEE Access, 8:150672–150684.
Black, P. E. (2018). A software assurance reference dataset: Thousands of programs with known bugs. Journal of Research of the National Institute of Standards and Technology, 123:1.
Coskun, T., Halepmollasi, R., Hanifi, K., Fouladi, R. F., De Cnudde, P. C., and Tosun, A. (2022). Profiling developers to predict vulnerable code changes. In Proceedings of the 18th International Conference on Predictive Models and Data Analytics in Software Engineering, pages 32–41.
Devlin, J., Chang, M.-W., Lee, K., and Toutanova, K. (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Duan, X., Wu, J., Ji, S., Rui, Z., Luo, T., Yang, M., and Wu, Y. (2019). VulSniper: Focus your attention to shoot fine-grained vulnerabilities. In IJCAI, pages 4665–4671.
Halepmollası, R., Hanifi, K., Fouladi, R. F., and Tosun, A. (2023). A comparison of source code representation methods to predict vulnerability inducing code changes. In ENASE. In press.
Hanif, H., Nasir, M. H. N. M., Ab Razak, M. F., Firdaus, A., and Anuar, N. B. (2021). The rise of software vulnerability: Taxonomy of software vulnerabilities detection and machine learning approaches. Journal of Network and Computer Applications, 179:103009.
Kalouptsoglou, I., Siavvas, M., Kehagias, D., Chatzigeorgiou, A., and Ampatzoglou, A. (2022). Examining the capacity of text mining and software metrics in vulnerability prediction. Entropy, 24(5):651.
Li, X., Wang, L., Xin, Y., Yang, Y., Tang, Q., and Chen, Y. (2021). Automated software vulnerability detection based on hybrid neural network. Applied Sciences, 11(7):3201.
Lin, G., Wen, S., Han, Q.-L., Zhang, J., and Xiang, Y. (2020). Software vulnerability detection using deep neural networks: a survey. Proceedings of the IEEE, 108(10):1825–1848.
Lin, G., Zhang, J., Luo, W., Pan, L., De Vel, O., Montague, P., and Xiang, Y. (2019). Software vulnerability discovery via learning multi-domain knowledge bases. IEEE Transactions on Dependable and Secure Computing, 18(5):2469–2485.
Palit, T., Moon, J. F., Monrose, F., and Polychronakis, M. (2021). DynPTA: Combining static and dynamic analysis for practical selective data protection. In 2021 IEEE Symposium on Security and Privacy (SP), pages 1919–1937. IEEE.
Russell, R., Kim, L., Hamilton, L., Lazovich, T., Harer, J., Ozdemir, O., Ellingwood, P., and McConley, M. (2018). Automated vulnerability detection in source code using deep representation learning. In 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pages 757–762. IEEE.
Zhou, Y., Liu, S., Siow, J., Du, X., and Liu, Y. (2019). Devign: Effective vulnerability identification by learning comprehensive program semantics via graph neural networks. Advances in Neural Information Processing Systems, 32.
Ziems, N. and Wu, S. (2021). Security vulnerability detection using deep learning natural language processing. In IEEE INFOCOM 2021-IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pages 1–6. IEEE.
ENASE 2023 - 18th International Conference on Evaluation of Novel Approaches to Software Engineering