Large Language Models for Code Obfuscation: Evaluation of the Obfuscation Capabilities of OpenAI’s GPT-3.5 on C Source Code

Patrick Kochberger; Maximilian Gramberger; Sebastian Schrittwieser; Caroline Lawitschka; Edgar Weippl

doi:10.5220/0012167000003555

2023

Abstract

This study explores the efficacy of large language models, specifically GPT-3.5, in obfuscating C source code for software protection. We utilized eight distinct obfuscation techniques in tandem with seven representative C code samples to conduct a comprehensive analysis. The evaluation was performed using a Python-based tool we developed, which interfaces with the OpenAI API to access GPT-3.5. Our metrics of evaluation included the correctness and diversity of the obfuscated code, along with the robustness of the resultant protection. While the diversity of the resulting code was found to be commendable, our findings indicate a prevalent issue with the correctness of the obfuscated code and the overall level of protection provided. Consequently, we assert that while promising, the feasibility of deploying large language models for automatic code obfuscation is not yet sufficiently established. This study signifies an important step towards understanding the limitations and potential of AI-based code obfuscation, thereby informing future research in this area.

Paper Citation

in Harvard Style

Kochberger P., Gramberger M., Schrittwieser S., Lawitschka C. and Weippl E. (2023). Large Language Models for Code Obfuscation Evaluation of the Obfuscation Capabilities of OpenAI’s GPT-3.5 on C Source Code. In Proceedings of the 20th International Conference on Security and Cryptography - Volume 1: SECRYPT; ISBN 978-989-758-666-8, SciTePress, pages 7-19. DOI: 10.5220/0012167000003555

in Bibtex Style

@conference{secrypt23,
author={Patrick Kochberger and Maximilian Gramberger and Sebastian Schrittwieser and Caroline Lawitschka and Edgar Weippl},
title={Large Language Models for Code Obfuscation Evaluation of the Obfuscation Capabilities of OpenAI’s GPT-3.5 on C Source Code},
booktitle={Proceedings of the 20th International Conference on Security and Cryptography - Volume 1: SECRYPT},
year={2023},
pages={7-19},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012167000003555},
isbn={978-989-758-666-8},
}

in EndNote Style

TY - CONF
JO - Proceedings of the 20th International Conference on Security and Cryptography - Volume 1: SECRYPT
TI - Large Language Models for Code Obfuscation Evaluation of the Obfuscation Capabilities of OpenAI’s GPT-3.5 on C Source Code
SN - 978-989-758-666-8
AU - Kochberger P.
AU - Gramberger M.
AU - Schrittwieser S.
AU - Lawitschka C.
AU - Weippl E.
PY - 2023
SP - 7
EP - 19
DO - 10.5220/0012167000003555
PB - SciTePress