An Asynchronous Federated Learning Approach for a Security Source Code Scanner
Sabrina Kall, Slim Trabelsi
2021
Abstract
Hard-coded tokens and secrets leaked through source code published on open-source platforms such as GitHub are a pervasive security threat and a time-consuming problem to mitigate. Scanners that identify leaks can speed up prevention and damage control, but such tools tend to have low precision, and attempts to improve them with machine learning have been hampered by a lack of training data, since the information the models need to learn from is by nature meant to be kept secret by its owners. This problem can be addressed with federated learning, a machine learning paradigm that allows models to be trained on local data without its owners having to share it. After local training, the personal models can be merged into a combined model, which has learned from all available data, for use by the scanner. To optimize local machine learning models so that they better identify leaks in code, we propose an asynchronous federated learning system that combines personalization techniques for local models with merging and benchmarking algorithms for the global model. We test this approach on leaks collected from the code-sharing platform GitHub. This use case demonstrates the impact of the proposed approach on the accuracy of the local models employed by the code scanners, balancing federation and personalization to handle datasets that are often highly diverse and unique.
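The merge-and-benchmark idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not spell out. It assumes a model is a flat list of weights, that merging is a simple weighted average, and that a merged model is kept only if a benchmark score does not drop. The names `blend`, `merge_update`, `personalize`, `mix` and `keep_local` are hypothetical.

```python
from typing import Callable, List

Weights = List[float]


def blend(a: Weights, b: Weights, share_of_b: float) -> Weights:
    """Element-wise convex combination of two weight vectors."""
    if len(a) != len(b):
        raise ValueError("weight vectors must have the same length")
    if not 0.0 <= share_of_b <= 1.0:
        raise ValueError("share_of_b must be in [0, 1]")
    return [(1.0 - share_of_b) * x + share_of_b * y for x, y in zip(a, b)]


def merge_update(global_w: Weights, client_w: Weights,
                 score: Callable[[Weights], float], mix: float = 0.5) -> Weights:
    """Merge one client's weights as soon as they arrive (asynchronous).

    Assumption: the candidate replaces the global model only if its
    benchmark score is at least as good as the current one.
    """
    candidate = blend(global_w, client_w, mix)
    return candidate if score(candidate) >= score(global_w) else global_w


def personalize(global_w: Weights, local_w: Weights,
                keep_local: float = 0.5) -> Weights:
    """Assumed personalization step: keep a share of the client's own weights."""
    return blend(global_w, local_w, keep_local)


# Toy benchmark (hypothetical): higher is better, best at weights [1.0, 1.0].
def toy_score(w: Weights) -> float:
    return -sum((x - 1.0) ** 2 for x in w)


global_model = [0.0, 0.0]
global_model = merge_update(global_model, [1.0, 1.0], toy_score)    # accepted
global_model = merge_update(global_model, [-3.0, -3.0], toy_score)  # rejected
local_model = personalize(global_model, [1.0, 0.0], keep_local=0.5)
```

In this sketch each client update is folded in on arrival rather than waiting for a synchronized round, and the benchmark gate keeps a poorly fitting update from degrading the shared model. Only the weights are exchanged, never the clients' training data.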
Paper Citation
in Harvard Style
Kall S. and Trabelsi S. (2021). An Asynchronous Federated Learning Approach for a Security Source Code Scanner. In Proceedings of the 7th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP, ISBN 978-989-758-491-6, pages 572-579. DOI: 10.5220/0010300305720579
in Bibtex Style
@conference{icissp21,
author={Sabrina Kall and Slim Trabelsi},
title={An Asynchronous Federated Learning Approach for a Security Source Code Scanner},
booktitle={Proceedings of the 7th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP},
year={2021},
pages={572-579},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010300305720579},
isbn={978-989-758-491-6},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 7th International Conference on Information Systems Security and Privacy - Volume 1: ICISSP
TI - An Asynchronous Federated Learning Approach for a Security Source Code Scanner
SN - 978-989-758-491-6
AU - Kall S.
AU - Trabelsi S.
PY - 2021
SP - 572
EP - 579
DO - 10.5220/0010300305720579