5 CONCLUSIONS
In this paper, we proposed a malware classification method based on
sequence patterns generated from the network flows of malware samples.
The goal was to classify malware using only its network behavior. The
method begins by extracting flow data from traffic captured by a dynamic
malware analyser. We extract features from the flows and cluster them
with the K-means algorithm. On the basis of the clustering result,
sequence patterns are generated; these patterns represent the network
behavior of a malware family. Finally, we classify a malware sample's
behavior by using a sequence alignment algorithm. Although our experiment
is preliminary, its results show that the method can classify new types
of malware into appropriate families as their variants.
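The final classification step can be illustrated with a minimal sketch. It assumes that each flow has already been replaced by its K-means cluster label, so that a sample is a sequence of integers, and that a sample is assigned to the family whose stored pattern gives the highest global (Needleman-Wunsch) alignment score. The function names, the scoring values (match = 1, mismatch = -1, gap = -1), and the example patterns are illustrative assumptions, not the configuration used in our experiment.

```python
from typing import Dict, List, Sequence


def alignment_score(a: Sequence[int], b: Sequence[int],
                    match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Global (Needleman-Wunsch) alignment score of two label sequences.

    The scoring values are illustrative assumptions.
    """
    # prev[j] is the best score for aligning the processed prefix of a with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def classify(sample: Sequence[int],
             family_patterns: Dict[str, List[Sequence[int]]]) -> str:
    """Return the family whose best-matching pattern aligns best with the sample."""
    return max(
        family_patterns,
        key=lambda fam: max(alignment_score(sample, p)
                            for p in family_patterns[fam]),
    )


# Hypothetical patterns: each integer is a K-means cluster label of one flow.
family_patterns = {
    "family_a": [[0, 2, 2, 1], [0, 2, 1]],
    "family_b": [[3, 3, 4, 0]],
}
sample = [0, 2, 2, 2, 1]
predicted = classify(sample, family_patterns)
```

A local alignment (Smith-Waterman) could be substituted by clamping each cell at zero and taking the maximum over the whole table; which variant suits a given data set is left open here.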
Our future work will focus on classifying unknown malware against known
malware families using network behavior. We intend to continue developing
and testing the classification system while expanding our set of malware
samples and refining our classification algorithm. We will also compare
our method with other classification systems that use malware behavior.
Our classification method has the potential to analyse malware behavior
accurately, which should help developers of anti-malware software keep up
with the rapid evolution of malware.
ACKNOWLEDGEMENTS
This work is supported by R&D of detective and analytical technology
against advanced cyber-attacks, administered by the Ministry of Internal
Affairs and Communications.
We also thank Dr. Takeshi Yagi, a researcher at NTT Secure Platform Lab.,
for providing us with the traffic capture data of malware samples.