
REFERENCES
Abdin, M. I. et al. (2024). Phi-3 technical report: A highly capable language model locally on your phone. Technical Report MSR-TR-2024-12, Microsoft.
Allouch, M., Azaria, A., and Azoulay, R. (2021). Conversational agents: Goals, technologies, vision and challenges. Sensors, 21(24).
Alshahwan, N. et al. (2024). Automated unit test improvement using large language models at Meta. In 32nd ACM Symposium on the Foundations of Software Engineering (FSE '24).
Bauer, T. et al. (2020). #MeTooMaastricht: Building a chatbot to assist survivors of sexual harassment. In Machine Learning and Knowledge Discovery in Databases: International Workshops of ECML PKDD 2019, pages 503–521. Springer.
Cascella, M. et al. (2024). The breakthrough of large language models release for medical applications: 1-year timeline and perspectives. Journal of Medical Systems, 48(22).
Cox, S. R. and Ooi, W. T. (2024). Conversational interactions with NPCs in LLM-driven gaming: Guidelines from a content analysis of player feedback. In Chatbot Research and Design, pages 167–184, Cham. Springer Nature Switzerland.
Douze, M. et al. (2024). The Faiss library. arXiv preprint arXiv:2401.08281. https://github.com/facebookresearch/faiss.
Gallotta, R. et al. (2024). Large language models and games: A survey and roadmap. arXiv preprint arXiv:2402.18659.
Guan, Y., Wang, D., Chu, Z., Wang, S., Ni, F., Song, R., Li, L., Gu, J., and Zhuang, C. (2023). Intelligent virtual assistants with LLM-based process automation. arXiv preprint arXiv:2312.06677.
Hu, E. J. et al. (2022). LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations.
Huber, S. E. et al. (2024). Leveraging the potential of large language models in education through playful and game-based learning. Educational Psychology Review, 36(25):1–17.
Hui, L. and Belkin, M. (2020). Evaluation of neural architectures trained with square loss vs cross-entropy in classification tasks. arXiv preprint arXiv:2006.07322.
Inan, H., Upasani, K., et al. (2023). Llama Guard: LLM-based input-output safeguard for human-AI conversations. arXiv preprint arXiv:2312.06674.
Isaza-Giraldo, A. et al. (2024). Prompt-gaming: A pilot study on LLM-evaluating agent in a meaningful energy game. In Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (CHI EA '24), page 12. ACM.
Johnson, J., Douze, M., and Jégou, H. (2019). Billion-scale similarity search with GPUs. IEEE Transactions on Big Data, 7(3):535–547.
Li, R. et al. (2023). StarCoder: May the source be with you! arXiv preprint arXiv:2305.06161.
Loshchilov, I. and Hutter, F. (2019). Decoupled weight decay regularization. In 7th International Conference on Learning Representations, ICLR 2019.
Marvin, G., Hellen, N., Jjingo, D., and Nakatumba-Nabende, J. (2024). Prompt engineering in large language models. In Jacob, I. J., Piramuthu, S., and Falkowski-Gilski, P., editors, Data Intelligence and Cognitive Informatics, pages 387–402, Singapore. Springer Nature Singapore.
Munday, P. (2017). Duolingo: Gamified learning through translation. Journal of Spanish Language Teaching, 4(2):194–198.
OpenAI et al. (2024). GPT-4 technical report. arXiv preprint arXiv:2303.08774.
Paduraru, C., Cernat, M., and Stefanescu, A. (2023). Conversational agents for simulation applications and video games. In Proceedings of the 18th International Conference on Software Technologies, ICSOFT 2023, pages 27–36. SCITEPRESS.
Patil, S. G. et al. (2023). Gorilla: Large language model connected with massive APIs. CoRR, abs/2305.15334.
Peng, B. et al. (2023). Instruction tuning with GPT-4. arXiv preprint arXiv:2304.03277.
Radford, A., Kim, J. W., Xu, T., Brockman, G., McLeavey, C., and Sutskever, I. (2022). Robust speech recognition via large-scale weak supervision. arXiv preprint arXiv:2212.04356.
Rozière, B. et al. (2024). Code Llama: Open foundation models for code. arXiv preprint arXiv:2308.12950.
Schick, T. et al. (2023). Toolformer: Language models can teach themselves to use tools. In Advances in Neural Information Processing Systems, volume 36.
Siddiq, M. L. et al. (2022). An empirical study of code smells in transformer-based code generation techniques. In 2022 IEEE 22nd International Working Conference on Source Code Analysis and Manipulation (SCAM).
Team, G. et al. (2024). Gemma: Open models based on Gemini research and technology. arXiv preprint arXiv:2403.08295.
Touvron, H. et al. (2023). Llama 2: Open foundation and fine-tuned chat models. arXiv preprint arXiv:2307.09288.
Vaswani, A. et al. (2023). Attention is all you need. arXiv preprint arXiv:1706.03762.
Wankhade, M., Rao, A., and Kulkarni, C. (2022). A survey on sentiment analysis methods, applications, and challenges. Artificial Intelligence Review, 55:1–50.
Yao, S. et al. (2023). ReAct: Synergizing reasoning and acting in language models. In The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net.
Yunanto, A. A. et al. (2019). English education game using non-player character based on natural language processing. Procedia Computer Science, 161:502–508. The Fifth Information Systems International Conference, 23-24 July 2019, Surabaya, Indonesia.
Zhao, P. et al. (2024). Retrieval-augmented generation for AI-generated content: A survey. arXiv preprint arXiv:2402.19473.