Authors:
Bonan Cuan (1), Aliénor Damien (2), Claire Delaplace (3) and Mathieu Valois (4)
Affiliations:
(1) INSA Lyon, CNRS, LIRIS, Lyon, France
(2) Thales Group, Toulouse, France; CNRS, LAAS, Toulouse, France
(3) Univ Rennes 1, CNRS, IRISA, 35000 Rennes, France; Univ. Lille, CRIStAL, 59655 Villeneuve d’Ascq, France
(4) Normandie Univ., UNICAEN, ENSICAEN, CNRS, GREYC, 14000 Caen, France
Keyword(s):
Malicious PDF Detection, SVM, Evasion Attacks, Gradient-Descent, Feature Selections, Adversarial Learning.
Related Ontology Subjects/Areas/Topics:
Information and Systems Security; Intrusion Detection & Prevention
Abstract:
We present how we used machine learning techniques to detect malicious behaviours in PDF files. To this end, we first set up an SVM (Support Vector Machine) classifier that was able to detect 99.7% of malware. However, this classifier was easy to fool with malicious PDF files that we forged to look like clean ones. For instance, we implemented a gradient-descent attack to evade this SVM. This attack was almost 100% successful. Next, we provided countermeasures to this attack: a more elaborate feature selection and the use of a threshold allowed us to stop up to 99.99% of these attacks. Finally, using adversarial learning techniques, we were able to prevent gradient-descent attacks by iteratively feeding the SVM with forged malicious PDF files. We found that after 3 iterations, every PDF file forged by gradient descent was detected, completely preventing the attack.
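To make the gradient-descent evasion idea concrete, here is a minimal sketch in Python. It is not the implementation used in the paper. The feature names, weights, bias and step size are hypothetical values chosen for illustration, and the classifier is assumed to be a linear SVM with decision function f(x) = w·x + b, where f(x) > 0 means "malicious". The sketch also assumes that an attacker can only add objects to a PDF, so feature counts may increase but never decrease, and it treats the counts as continuous values for simplicity.

```python
# Hypothetical linear SVM over PDF tag counts: f(x) = w.x + b.
# f(x) > 0 is classified as malicious. All values below are illustrative
# assumptions, not the model or features from the paper.
FEATURES = ["JavaScript", "OpenAction", "Font", "Page"]
W = [1.5, 1.0, -0.5, -0.25]
B = -0.5


def decision(x, w=W, b=B):
    """Signed score of the linear classifier for feature vector x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def gradient_descent_evasion(x0, w=W, b=B, step=1.0, max_iter=100):
    """Lower f(x) by gradient descent until the sample is scored as clean.

    For a linear model the gradient of f with respect to x is simply w,
    so each step moves x by -step * w. After every step x is projected
    back onto the constraint x >= x0 (features can only be added).
    Returns the forged vector and the number of steps taken.
    """
    x = list(x0)
    for i in range(max_iter):
        if decision(x, w, b) < 0:
            return x, i
        x = [max(x0j, xj - step * wj) for x0j, xj, wj in zip(x0, x, w)]
    return x, max_iter


if __name__ == "__main__":
    original = [2, 1, 0, 1]  # hypothetical malicious sample
    forged, steps = gradient_descent_evasion(original)
    print("score before:", decision(original))
    print("score after: ", decision(forged), "in", steps, "steps")
```

With these assumed values, only the features with negative weights (Font and Page) can move, and the forged vector crosses the decision boundary after 11 steps. In the adversarial learning setting described in the abstract, such forged vectors would be labelled malicious and added to the training set before retraining the SVM.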