Automatic Analysis of App Reviews Using LLMs
Sadeep Gunathilaka, Nisansa de Silva
2025
Abstract
Large Language Models (LLMs) have shown promise in various natural language processing tasks, but their effectiveness for app review classification to support software evolution remains unexplored. This study evaluates commercial and open-source LLMs for classifying mobile app reviews into bug reports, feature requests, user experiences, and ratings. We compare the zero-shot performance of GPT-3.5 and Gemini Pro 1.0, finding that GPT-3.5 achieves superior results with an F1 score of 0.849. We then use GPT-3.5 to autonomously annotate a dataset for fine-tuning smaller open-source models. Experiments with Llama 2 and Mistral show that instruction fine-tuning significantly improves performance, with results approaching commercial models. We investigate the trade-off between training data size and the number of epochs, demonstrating that comparable results can be achieved with smaller datasets and increased training iterations. Additionally, we explore the impact of different prompting strategies on model performance. Our work demonstrates the potential of LLMs to enhance app review analysis for software engineering while highlighting areas for further improvement in open-source alternatives.
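To make the zero-shot setup described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. The prompt wording, the function names, the substring-based label parsing, and the per-class F1 helper are all assumptions introduced for illustration; only the four category names come from the abstract, and the abstract does not say how the reported F1 score was averaged.

```python
from typing import List, Optional

# The four categories named in the abstract; the exact label strings are an assumption.
LABELS = ["bug report", "feature request", "user experience", "rating"]


def build_zero_shot_prompt(review: str) -> str:
    """Build a zero-shot classification prompt (hypothetical wording)."""
    options = ", ".join(LABELS)
    return (
        "Classify the following mobile app review into exactly one of these "
        f"categories: {options}.\n"
        f"Review: {review}\n"
        "Category:"
    )


def parse_label(response: str) -> Optional[str]:
    """Map a free-text model response to one of LABELS, or None if nothing matches."""
    text = response.strip().lower()
    for label in LABELS:
        if label in text:
            return label
    return None


def f1_for_label(gold: List[str], predicted: List[Optional[str]], label: str) -> float:
    """Compute the F1 score for a single label from gold and predicted labels."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == label and p == label for g, p in pairs)
    fp = sum(g != label and p == label for g, p in pairs)
    fn = sum(g == label and p != label for g, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In use, the string returned by build_zero_shot_prompt would be sent to whichever model is under evaluation, the reply passed through parse_label, and the parsed labels compared against human annotations with f1_for_label.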
Paper Citation
in Harvard Style
Gunathilaka S. and de Silva N. (2025). Automatic Analysis of App Reviews Using LLMs. In Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART; ISBN 978-989-758-737-5, SciTePress, pages 828-839. DOI: 10.5220/0013375600003890
in Bibtex Style
@conference{icaart25,
author={Sadeep Gunathilaka and Nisansa de Silva},
title={Automatic Analysis of App Reviews Using LLMs},
booktitle={Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2025},
pages={828-839},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013375600003890},
isbn={978-989-758-737-5},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Automatic Analysis of App Reviews Using LLMs
SN - 978-989-758-737-5
AU - Gunathilaka S.
AU - de Silva N.
PY - 2025
SP - 828
EP - 839
DO - 10.5220/0013375600003890
PB - SciTePress