REFERENCES
Aboaoja, F. A., Zainal, A., Ghaleb, F. A., Al-rimy, B. A. S.,
Eisa, T. A. E., and Elnour, A. A. H. (2022). Malware
detection issues, challenges, and future directions: A
survey. Applied Sciences, 12(17).
Akarsh, S., Simran, K., Poornachandran, P., Menon, V. K.,
and Soman, K. (2019). Deep learning framework
and visualization for malware classification. In 2019
5th International Conference on Advanced Computing
& Communication Systems (ICACCS), pages 1059–1063.
IEEE.
Alzammam, A., Binsalleeh, H., AsSadhan, B., Kyriakopoulos,
K. G., and Lambotharan, S. (2020). Comparative
analysis on imbalanced multi-class classification
for malware samples using CNN. In 2019 International
Conference on Advances in the Emerging Computing
Technologies (AECT), pages 1–6. IEEE.
Anderson, H. S., Kharkar, A., Filar, B., Evans, D., and
Roth, P. (2018). Learning to evade static PE machine
learning malware models via reinforcement learning.
arXiv, arXiv:1801.08917.
Birman, Y., Hindi, S., Katz, G., and Shabtai, A. (2022).
Cost-effective ensemble models selection using deep
reinforcement learning. Information Fusion, 77:133–148.
Catak, F. O., Ahmed, J., Sahinbas, K., and Khand, Z. H.
(2021). Data augmentation based malware detection
using convolutional neural networks. PeerJ Computer
Science, 7:e346.
Coscia, A., Dentamaro, V., Galantucci, S., Maci, A., and
Pirlo, G. (2023). YAMME: a YARA-byte-signatures
metamorphic mutation engine. IEEE Transactions on
Information Forensics and Security, 18:4530–4545.
de Oliveira, A. S. and Sassi, R. J. (2019). Behavioral
malware detection using deep graph convolutional
neural networks. TechRxiv.
Demirkıran, F., Çayır, A., Ünal, U., and Dağ, H. (2022). An
ensemble of pre-trained transformer models for imbalanced
multiclass malware classification. Computers &
Security, 121:102846.
Deng, X., Cen, M., Jiang, M., and Lu, M. (2023). Ransomware
early detection using deep reinforcement
learning on portable executable header. Cluster Computing,
pages 1–15.
Ding, Y., Wang, S., Xing, J., Zhang, X., Qi, Z., Fu, G.,
Qiang, Q., Sun, H., and Zhang, J. (2020). Malware
classification on imbalanced data through self-attention.
In 2020 IEEE 19th International Conference
on Trust, Security and Privacy in Computing and
Communications (TrustCom), pages 154–161. IEEE.
Fang, Z., Wang, J., Geng, J., and Kan, X. (2019a). Feature
selection for malware detection based on reinforcement
learning. IEEE Access, 7:176177–176187.
Fang, Z., Wang, J., Li, B., Wu, S., Zhou, Y., and Huang, H.
(2019b). Evading anti-malware engines with deep
reinforcement learning. IEEE Access, 7:48867–48879.
Fortunato, M., Azar, M. G., Piot, B., Menick, J., Osband,
I., Graves, A., Mnih, V., Munos, R., Hassabis, D.,
Pietquin, O., Blundell, C., and Legg, S. (2019). Noisy
networks for exploration. arXiv, arXiv:1706.10295.
Hasselt, H. v., Guez, A., and Silver, D. (2016). Deep
reinforcement learning with double Q-learning. In
Proceedings of the Thirtieth AAAI Conference on Artificial
Intelligence, volume 30, pages 2094–2100. AAAI
Press.
Lin, E., Chen, Q., and Qi, X. (2020). Deep reinforcement
learning for imbalanced classification. Applied Intelligence,
50:2488–2502.
Lu, Y. and Shetty, S. (2021). Multi-class malware classification
using deep residual network with non-softmax
classifier. In 2021 IEEE 22nd International Conference
on Information Reuse and Integration for Data
Science (IRI), pages 201–207. IEEE.
Maci, A., Santorsola, A., Coscia, A., and Iannacone,
A. (2023). Unbalanced web phishing classification
through deep reinforcement learning. Computers,
12(6).
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness,
J., Bellemare, M. G., Graves, A., Riedmiller, M.,
Fidjeland, A. K., Ostrovski, G., et al. (2015). Human-level
control through deep reinforcement learning.
Nature, 518(7540):529–533.
Schaul, T., Quan, J., Antonoglou, I., and Silver, D.
(2016). Prioritized experience replay. arXiv,
arXiv:1511.05952.
Sewak, M., Sahay, S. K., and Rathore, H. (2023). Deep
reinforcement learning in the advanced cybersecurity
threat detection and protection. Information Systems
Frontiers, 25(2):589–611.
Song, W., Li, X., Afroz, S., Garg, D., Kuznetsov, D., and
Yin, H. (2022). MAB-Malware: A reinforcement learning
framework for blackbox generation of adversarial
malware. In Proceedings of the 2022 ACM on Asia
Conference on Computer and Communications Security,
pages 990–1003. ACM.
Wang, Y., Stokes, J., and Marinescu, M. (2020). Actor critic
deep reinforcement learning for neural malware control.
In Proceedings of the AAAI Conference on Artificial
Intelligence, volume 34, pages 1005–1012. Association
for the Advancement of Artificial Intelligence
(AAAI).
Wang, Y., Stokes, J. W., and Marinescu, M. (2019). Neural
malware control with deep reinforcement learning.
In MILCOM 2019 - 2019 IEEE Military Communications
Conference (MILCOM), pages 1–8. IEEE.
Wang, Z., Schaul, T., Hessel, M., van Hasselt, H., Lanctot,
M., and de Freitas, N. (2016). Dueling network architectures
for deep reinforcement learning. In Proceedings
of the 33rd International Conference on Machine
Learning, volume 48, pages 1995–2003. PMLR.
Wu, Y., Li, M., Zeng, Q., Yang, T., Wang, J., Fang, Z., and
Cheng, L. (2023). DroidRL: Feature selection for Android
malware detection with reinforcement learning.
Computers & Security, 128:103126.
Yang, J., El-Bouri, R., O’Donoghue, O., Lachapelle, A. S.,
Soltan, A. A. S., and Clifton, D. A. (2022). Deep
reinforcement learning for multi-class imbalanced training.
arXiv, arXiv:2205.12070.
Deep Q-Networks for Imbalanced Multi-Class Malware Classification