Malware Detection in PDF Files using Machine Learning
Bonan Cuan, Aliénor Damien, Claire Delaplace, Mathieu Valois
2018
Abstract
We present how we used machine learning techniques to detect malicious behaviours in PDF files. To this end, we first set up an SVM (Support Vector Machine) classifier that was able to detect 99.7% of malware. However, this classifier was easy to fool with malicious PDF files that we forged to look like clean ones. For instance, we implemented a gradient-descent attack to evade this SVM. This attack was almost 100% successful. Next, we provided counter-measures to this attack: a more elaborate feature selection and the use of a threshold allowed us to stop up to 99.99% of these attacks. Finally, using adversarial learning techniques, we were able to prevent gradient-descent attacks by iteratively feeding the SVM with malicious forged PDF files. We found that after 3 iterations, every gradient-descent forged PDF file was detected, completely preventing the attack.
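To make the idea of a gradient-descent evasion attack concrete, here is a minimal sketch in Python. It is not the authors' implementation: it assumes a linear SVM whose decision value is w·x + b, with positive values meaning "malicious". The weights, the bias, the feature vector and the function names are all hypothetical. It also assumes the attacker can only add content to a PDF, so feature values may only increase.

```python
from typing import List


def decision(w: List[float], b: float, x: List[float]) -> float:
    """Linear decision value; positive is read as 'malicious' (assumption)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def gradient_descent_evasion(w: List[float], b: float, x: List[float],
                             step: float = 1.0, max_iter: int = 100) -> List[float]:
    """Push x towards the 'clean' side of a linear classifier.

    For a linear model the gradient of the decision value with respect
    to x is w, so descending means moving along -w. Assumption: features
    can only grow, so only features with a negative weight are changed.
    """
    x = list(x)
    for _ in range(max_iter):
        if decision(w, b, x) < 0:
            break  # the sample is now classified as clean
        x = [xi + step * max(-wi, 0.0) for xi, wi in zip(x, w)]
    return x


# Hypothetical model and sample, chosen only for illustration.
w = [2.0, 1.0, -0.5]
b = -1.0
malicious = [3.0, 2.0, 0.0]

forged = gradient_descent_evasion(w, b, malicious)
print(decision(w, b, malicious), decision(w, b, forged), forged)
```

In this toy setting the forged sample keeps its two "malicious" features unchanged and only inflates the third one until the decision value turns negative. An adversarial-learning defence of the kind the abstract describes would add such forged samples, labelled as malicious, to the training set and retrain.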
Paper Citation
in Harvard Style
Delaplace C. and Valois M. (2018). Malware Detection in PDF Files using Machine Learning. In Proceedings of the 15th International Joint Conference on e-Business and Telecommunications - Volume 1: SECRYPT, ISBN 978-989-758-319-3, pages 412-419. DOI: 10.5220/0006884704120419
in Bibtex Style
@conference{secrypt18,
author={Claire Delaplace and Mathieu Valois},
title={Malware Detection in PDF Files using Machine Learning},
booktitle={Proceedings of the 15th International Joint Conference on e-Business and Telecommunications - Volume 1: SECRYPT},
year={2018},
pages={412-419},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0006884704120419},
isbn={978-989-758-319-3},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 15th International Joint Conference on e-Business and Telecommunications - Volume 1: SECRYPT
TI - Malware Detection in PDF Files using Machine Learning
SN - 978-989-758-319-3
AU - Delaplace C.
AU - Valois M.
PY - 2018
SP - 412
EP - 419
DO - 10.5220/0006884704120419