Authors:
Abhishek Joshi (1); Divyateja Pasupuleti (1); P. Nischith (1); Sarvesh Sutaone (1); Soumil Ray (1); Soumyadeep Dey (2) and Barsha Mitra (1)
Affiliations:
(1) Department of CSIS, BITS Pilani, Hyderabad Campus, Hyderabad, India; (2) Microsoft, India
Keyword(s):
Malware, Android Malware, IoT Malware, Transformer Models, Malware Classification.
Abstract:
The growing demand for connected, smart applications and the spread of high-speed communication technologies such as 5G have led to a surge in the use of Android and Internet-of-Things (IoT) devices. Their popularity has made these devices the target of a large number of malware attacks and infections, and cyber criminals relentlessly develop new strains of malware against them. To defend against these attacks, researchers have developed a variety of malware detection and categorization techniques. In this paper, we investigate the applicability and effectiveness of transformer-based models, which use self-attention to learn global dependencies and contextual information, for malware classification on two platforms: Android and IoT. We consider two types of input for malware analysis: images and sequences. For image-based analysis, we convert Android APKs and IoT traffic into images that reflect their structural and behavioral features. We compare various convolutional neural network (CNN) based models with and without transformer layers, as well as a pure transformer model that processes the images directly. For sequence-based analysis, we extract API call sequences from Android APKs and apply a transformer model to encode and classify them. We also explore the effect of pretraining and embedding initialization on the transformer models. Our experiments demonstrate the advantages and limitations of transformer-based models for malware classification and provide insights into the training strategies and challenges of these models. To the best of our knowledge, this is the first work that systematically explores and compares different transformer-based models for malware classification on both image and sequence inputs.
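For readers unfamiliar with the self-attention mechanism mentioned in the abstract, the following is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is an illustration only and not the architecture evaluated in the paper: the function names, the toy input, and the identity projection matrices are all assumptions made for the example. Each input row stands in for one token, which could be an image patch or an API-call embedding.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x:             n tokens, each a list of d features.
    w_q, w_k, w_v: hypothetical d x d_k projection matrices.
    Returns (output, weights), where weights[i][j] is how strongly
    token i attends to token j.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    scores = matmul(q, transpose(k))  # pairwise query-key dot products
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


# Toy input (assumed for illustration): three tokens with two features each.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for learned projections
output, weights = self_attention(tokens, identity, identity, identity)
```

Every row of `weights` sums to 1, and every token receives a non-zero weight from every other token. This is the sense in which self-attention captures global dependencies: each output row mixes information from the whole input rather than from a local neighbourhood, as a convolution would.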