Table 2: Comparison of our method against well-known
antivirus products. Our tool achieves a detection rate
of 100%.
Antivirus      Detection rate    Antivirus      Detection rate
Our tool       100%              Panda          19%
Avira          16%               Kaspersky      81%
Avast          87%               Qihoo-360      96%
McAfee         96%               AVG            82%
BitDefender    87%               ESET-NOD32     87%
F-Secure       87%               Symantec       14%
6 CONCLUSION
The main contribution of this paper is the application of graph-kernel-based
learning techniques to malware detection in a completely static way (no
dynamic analysis). As far as we know, this is the first time that these
techniques have been applied to malware detection in a static manner. We
introduced an automatic malware detection algorithm based on SVMs. First, we
use static analysis to create abstract API graphs from control flow graphs.
Then, we build SVMs that learn malicious behaviors from these API graphs and
perform malware detection and category recognition. These SVMs are built upon
a dedicated random walk graph kernel (RDW) that measures graph similarity as
the number of common paths of increasing lengths and characterizes the
malicious behaviors shared across training and test data. This kernel is well
suited to the task because it lets us handle non-vectorial data (i.e., graphs)
without explicitly generating features from these graphs. Experiments show
that our RDW-based classifier achieves a TPR of almost 99% with only 1.24% FPR
for malware detection and an accuracy of 96.55% for malware category
recognition. Compared to other kernels (such as histogram intersection and
convolution), our RDW-based method obtains the best classification
performance.
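The path-counting idea behind a random walk kernel can be sketched in Python as follows. This is a minimal illustration and not the RDW implementation evaluated in this paper: the graph encoding (a label dictionary plus an edge list), the function names, the truncation length `max_len`, and the weight `decay` are all assumptions made for the example. Two walks are considered common when they traverse the same sequence of API labels in both graphs.

```python
from itertools import product


def product_graph(labels1, edges1, labels2, edges2):
    """Direct product graph: pair nodes with equal labels and keep
    an edge only when it exists in both input graphs."""
    nodes = [(u, v) for u, v in product(labels1, labels2)
             if labels1[u] == labels2[v]]
    node_set = set(nodes)
    succ = {n: [] for n in nodes}
    for (u1, u2) in edges1:
        for (v1, v2) in edges2:
            if (u1, v1) in node_set and (u2, v2) in node_set:
                succ[(u1, v1)].append((u2, v2))
    return succ


def random_walk_kernel(g1, g2, max_len=3, decay=0.5):
    """Weighted count of common label walks of length 0..max_len.
    `decay` (hypothetical parameter) down-weights longer walks."""
    succ = product_graph(g1[0], g1[1], g2[0], g2[1])
    # walks[n] = number of common walks of the current length ending at n
    walks = {n: 1 for n in succ}
    total = 0.0
    for k in range(max_len + 1):
        total += (decay ** k) * sum(walks.values())
        nxt = {n: 0 for n in succ}
        for n, count in walks.items():
            for m in succ[n]:
                nxt[m] += count
        walks = nxt
    return total


# Illustrative API graphs (made-up examples, not taken from the dataset).
labels = {0: "CreateFileW", 1: "WriteFile", 2: "CloseHandle"}
g_a = (labels, [(0, 1), (1, 2)])   # create -> write -> close
g_c = (labels, [(1, 0), (2, 1)])   # same calls, reversed order

k_same = random_walk_kernel(g_a, g_a, max_len=2, decay=0.5)
k_diff = random_walk_kernel(g_a, g_c, max_len=2, decay=0.5)
print(k_same, k_diff)
```

With these inputs, the graph compared with itself shares walks of every length up to 2, while the reversed graph shares only the length-0 walks (matching node labels), so it scores lower. A matrix of such values over all graph pairs could then be passed to an SVM library that accepts precomputed kernels.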
Note that we could have extracted vectorial features from the graphs and then
applied other learning techniques such as ANNs, but this would have led to a
loss of information. Thus, we believe that graph-kernel-based SVMs are the
best choice for learning from our malicious behavior graphs.
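One way to see the information loss is with a deliberately simple vectorization. The sketch below assumes that each graph is reduced to a histogram of its API labels, which is only one of many possible feature extractions and is not a method evaluated in this paper. The graph encoding and function names are hypothetical.

```python
from collections import Counter


def label_histogram(labels):
    """Vectorize a graph as a count of each API label, ignoring edges."""
    return Counter(labels.values())


def histogram_intersection(h1, h2):
    """Sum of the per-label minimum counts of two histograms."""
    return sum(min(h1[k], h2[k]) for k in h1.keys() | h2.keys())


# Two illustrative graphs with the same API calls in a different order.
labels_a = {0: "CreateFileW", 1: "WriteFile", 2: "CloseHandle"}
edges_a = [(0, 1), (1, 2)]
labels_b = {0: "CreateFileW", 1: "WriteFile", 2: "CloseHandle"}
edges_b = [(1, 0), (2, 1)]

h_a = label_histogram(labels_a)
h_b = label_histogram(labels_b)

# The histograms are identical even though the edge sets differ,
# so any learner working on these vectors cannot tell the graphs apart.
print(h_a == h_b, histogram_intersection(h_a, h_b), set(edges_a) == set(edges_b))
```

Richer vectorizations would retain more structure than label counts do, but each one fixes in advance which substructures are kept, whereas a graph kernel compares the graphs directly.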
ICISSP 2017 - 3rd International Conference on Information Systems Security and Privacy