Authors:
Muhammad Ali 1; Monem Hamid 1; Jacob Jasser 1; Joachim Lerman 1; Samod Shetty 1 and Fabio Di Troia 2
Affiliations:
1 Department of Computer Engineering, San Jose State University, San Jose, CA, U.S.A.
2 Department of Computer Science, San Jose State University, San Jose, CA, U.S.A.
Keyword(s):
PHMM, Malware Detection, Malware Obfuscation, API Calls, Dynamic Detection, Machine Learning.
Abstract:
Profile Hidden Markov Models (PHMMs) have been used to detect malware samples based on their behavior on the host system, with promising results. Because PHMMs are a novel way of categorizing malware and research on this detection method is limited, there is no data on the impact that particular obfuscation techniques have on PHMMs, and no obfuscation tool that could weaken PHMM-based detection has yet been proposed. Our approach trains the PHMM on API call sequences extracted dynamically by running malware samples in a sandbox, and then attempts to evade detection by the same model by applying different state-of-the-art API obfuscation techniques to the malware. By implementing sophisticated API call obfuscation techniques, we were able to reduce the PHMM detection rate from 1.0, without API call obfuscation, to 0.68.
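The detection rate quoted above can be read as the fraction of malware samples that the model still flags. The following is a minimal sketch of that arithmetic only, assuming each sample receives a numeric score (for instance, a PHMM log-odds score) that is compared against a fixed threshold. The function name, the threshold, and the score values are hypothetical illustrations and are not data or code from the paper.

```python
from typing import Sequence


def detection_rate(scores: Sequence[float], threshold: float) -> float:
    """Return the fraction of samples whose score is at or above the threshold.

    Assumption: a higher score means "more malware-like", so a sample
    counts as detected when score >= threshold.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    detected = sum(1 for s in scores if s >= threshold)
    return detected / len(scores)


# Hypothetical scores for four malware samples, before and after
# API call obfuscation (made-up numbers for illustration only).
plain_scores = [2.1, 1.7, 3.0, 2.4]
obfuscated_scores = [2.1, -0.5, 0.2, 1.9]
THRESHOLD = 1.0  # hypothetical decision threshold

print(detection_rate(plain_scores, THRESHOLD))       # 1.0
print(detection_rate(obfuscated_scores, THRESHOLD))  # 0.5
```

Under this reading, a drop from 1.0 to 0.68 would mean that roughly one in three obfuscated samples is no longer flagged by the same model.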