Using Bigrams to Detect Leaked Secrets in Source Code

Anton V. Konygin, Andrey Kopnin, Ilya Mezentsev, Alexandr Pankratov

2023

Abstract

Leaked secrets in source code lead to information security problems. It is important to find sensitive information in a repository as early as possible and neutralize it. Many approaches now exist for detecting leaked secrets without human intervention. Often, these are heuristic algorithms based on regular expressions. Recently, a growing number of approaches based on machine learning have appeared. Nevertheless, the problem of detecting secrets in code remains relevant, since the available approaches often produce a large number of false positives. In this paper, we propose an approach to leaked secret detection in source code based on machine learning using bigrams. This approach significantly reduces the number of false positives. The model showed a false positive rate of 2.4% and a false negative rate of 1.9% on the test dataset.
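As a rough illustration of the two notions the abstract relies on, the sketch below counts bigrams in a candidate string and computes false positive and false negative rates from binary labels. It is not the authors' implementation: the abstract does not say whether bigrams are taken over characters or tokens, so character bigrams are an assumption here, and the function names `char_bigrams` and `error_rates` are hypothetical.

```python
from collections import Counter


def char_bigrams(text: str) -> Counter:
    """Count overlapping character bigrams in a string.

    Assumption: character-level bigrams; the abstract does not
    specify the unit the paper actually uses.
    """
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def error_rates(y_true, y_pred):
    """Return (false positive rate, false negative rate).

    Labels are binary: 1 marks a secret, 0 marks a non-secret.
    FPR = FP / (FP + TN), FNR = FN / (FN + TP).
    """
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    negatives = sum(1 for t in y_true if t == 0)
    positives = sum(1 for t in y_true if t == 1)
    # Guard against empty classes to avoid division by zero.
    fpr = fp / negatives if negatives else 0.0
    fnr = fn / positives if positives else 0.0
    return fpr, fnr
```

The bigram counts could serve as input features for a classifier, and `error_rates` corresponds to the kind of figures reported above (2.4% and 1.9%), although the exact evaluation protocol is described only in the paper itself.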

Paper Citation


in Harvard Style

Konygin A. V., Kopnin A., Mezentsev I. and Pankratov A. (2023). Using Bigrams to Detect Leaked Secrets in Source Code. In Proceedings of the 18th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE, ISBN 978-989-758-647-7, SciTePress, pages 589-596. DOI: 10.5220/0011983600003464


in Bibtex Style

@conference{enase23,
author={Anton V. Konygin and Andrey Kopnin and Ilya Mezentsev and Alexandr Pankratov},
title={Using Bigrams to Detect Leaked Secrets in Source Code},
booktitle={Proceedings of the 18th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE},
year={2023},
pages={589-596},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0011983600003464},
isbn={978-989-758-647-7},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 18th International Conference on Evaluation of Novel Approaches to Software Engineering - Volume 1: ENASE
TI - Using Bigrams to Detect Leaked Secrets in Source Code
SN - 978-989-758-647-7
AU - Konygin A. V.
AU - Kopnin A.
AU - Mezentsev I.
AU - Pankratov A.
PY - 2023
SP - 589
EP - 596
DO - 10.5220/0011983600003464
PB - SciTePress