Malware Analysis Using Transformer Based Models: An Empirical Study

Abhishek Joshi; Divyateja Pasupuleti; P. Nischith; Sarvesh Sutaone; Soumil Ray; Soumyadeep Dey; Barsha Mitra

doi:10.5220/0012855100003767

2024

Abstract

The massive demand for connected and smart applications and the growth of high-speed communication technologies like 5G have led to a surge in the use of Android and Internet-of-Things (IoT) devices. The popularity of these devices has made them the target of a huge number of malware attacks and infections. Cyber criminals relentlessly target Android and IoT devices by developing new strains of malware. To defend against these attacks, researchers have developed different types of malware detection and categorization techniques. In this paper, we investigate the applicability and effectiveness of different transformer-based models, which use self-attention to learn global dependencies and contextual information, for malware classification on two platforms: Android and IoT. We consider two types of inputs for malware analysis: images and sequences. For image-based analysis, we convert Android APKs and IoT traffic into images that reflect their structural and behavioral features. We compare various convolutional neural network (CNN) based models with and without transformer layers, and a pure transformer model that directly processes the images. For sequence-based analysis, we extract the API call sequences from Android APKs and apply a transformer model to encode and classify them. We also explore the effect of pretraining and embedding initialization on the transformer models. Our experiments demonstrate the advantages and limitations of using transformer-based models for malware classification, and provide insights into the training strategies and challenges of these models. To the best of our knowledge, this is the first work that systematically explores and compares different transformer-based models for malware classification on both image and sequence inputs.
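
The abstract does not say how APKs or IoT traffic are turned into images. As a rough illustration only, the sketch below shows one common convention from the malware-imaging literature: each raw byte is treated as a grayscale pixel intensity and the bytes are laid out row by row. The function name `bytes_to_grayscale`, the fixed `width`, and the zero-padding are assumptions made for this example, not the authors' pipeline.

```python
import math


def bytes_to_grayscale(data: bytes, width: int = 16) -> list[list[int]]:
    """Lay out raw bytes row by row as a 2-D grid of 0-255 pixel intensities.

    Assumption for illustration: one byte maps to one grayscale pixel, and
    the last row is zero-padded. The paper's actual conversion may differ.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    # Zero-pad the tail so the final row is complete.
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Hypothetical input: ten bytes standing in for a file's contents.
sample = bytes(range(10))
image = bytes_to_grayscale(sample, width=4)
for row in image:
    print(row)
```

A grid like this could then be resized and passed to a CNN or a vision transformer; the choice of width controls the aspect ratio and is a free parameter in this sketch.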

Paper Citation

in Harvard Style

Joshi A., Pasupuleti D., Nischith P., Sutaone S., Ray S., Dey S. and Mitra B. (2024). Malware Analysis Using Transformer Based Models: An Empirical Study. In Proceedings of the 21st International Conference on Security and Cryptography - Volume 1: SECRYPT; ISBN 978-989-758-709-2, SciTePress, pages 858-865. DOI: 10.5220/0012855100003767

in Bibtex Style

@conference{secrypt24,
author={Abhishek Joshi and Divyateja Pasupuleti and P. Nischith and Sarvesh Sutaone and Soumil Ray and Soumyadeep Dey and Barsha Mitra},
title={Malware Analysis Using Transformer Based Models: An Empirical Study},
booktitle={Proceedings of the 21st International Conference on Security and Cryptography - Volume 1: SECRYPT},
year={2024},
pages={858-865},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012855100003767},
isbn={978-989-758-709-2},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 21st International Conference on Security and Cryptography - Volume 1: SECRYPT
TI - Malware Analysis Using Transformer Based Models: An Empirical Study
SN - 978-989-758-709-2
AU - Joshi A.
AU - Pasupuleti D.
AU - Nischith P.
AU - Sutaone S.
AU - Ray S.
AU - Dey S.
AU - Mitra B.
PY - 2024
SP - 858
EP - 865
DO - 10.5220/0012855100003767
PB - SciTePress