Fine Tuning LLMs vs Non-Generative Machine Learning Models: A Comparative Study of Malware Detection
Gheorghe Balan, Ciprian-Alin Simion, Dragoş Teodor Gavriluţ
2025
Abstract
The emergence of Generative AI has opened up various scenarios in which Large Language Models can replace older technologies. The cybersecurity industry has been an early adopter of these technologies, but mainly in scenarios involving security operation centers, support, or cyber-attack visibility. This paper compares how well Large Language Models perform against traditional machine learning models for malware detection with respect to the constraints that apply to a security product, such as inference time, memory footprint, detection rate, and false positive rate. We fine-tuned three open-source models (Llama2-13B, Mistral, Mixtral) and compared them with 18 classical machine learning models (feed-forward neural networks, SVMs, etc.) using more than 135,000 benign and malicious binary samples. The goal was to identify scenarios in which large language models are suited to the task of malware detection.
Paper Citation
in Harvard Style
Balan G., Simion C. and Gavriluţ D. (2025). Fine Tuning LLMs vs Non-Generative Machine Learning Models: A Comparative Study of Malware Detection. In Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART; ISBN 978-989-758-737-5, SciTePress, pages 715-725. DOI: 10.5220/0013177300003890
in Bibtex Style
@conference{icaart25,
author={Gheorghe Balan and Ciprian-Alin Simion and Dragoş Gavriluţ},
title={Fine Tuning LLMs vs Non-Generative Machine Learning Models: A Comparative Study of Malware Detection},
booktitle={Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART},
year={2025},
pages={715-725},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013177300003890},
isbn={978-989-758-737-5},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 3: ICAART
TI - Fine Tuning LLMs vs Non-Generative Machine Learning Models: A Comparative Study of Malware Detection
SN - 978-989-758-737-5
AU - Balan G.
AU - Simion C.
AU - Gavriluţ D.
PY - 2025
SP - 715
EP - 725
DO - 10.5220/0013177300003890
PB - SciTePress