Analysis of the Effectiveness of LLMs in Handwritten Essay Recognition and Assessment
Daisy Albuquerque da Silva, Carlos Luiz Ferreira, Sérgio dos Santos Cardoso Silva, Juliano Bruno de Almeida Cardoso
2025
Abstract
This study investigates the application of Large Language Models (LLMs) for handwritten essay recognition and evaluation within the Military Institute of Engineering (IME) selection process. In a two-stage methodology, 100 handwritten essays were first transcribed using LLMs and then evaluated against predefined linguistic and content criteria by both open-source and closed-source LLMs, including GPT-3.5, GPT-4, o1, LLaMA, and Mixtral. The evaluations were compared to those conducted by IME professors to assess reliability, alignment, and limitations. Results indicate that closed-source models like o1 demonstrated strong reliability and alignment with human evaluations, particularly in language-related criteria, though they exhibited a tendency to assign higher scores overall. In contrast, open-source models displayed weaker correlations and lower variance, limiting their effectiveness for nuanced assessment tasks. The study highlights the potential of LLMs as complementary tools for automated essay evaluation while identifying challenges such as variability in human and model evaluations, the need for advanced prompt engineering, and the necessity of incorporating diverse essay formats for improved generalizability. These findings provide insights into optimizing LLM performance in educational contexts.
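As a rough illustration of the alignment analysis described in the abstract, the sketch below (not the authors' code; all scores, scales, and names are hypothetical) computes the Pearson correlation and the mean score difference between model-assigned and professor-assigned scores on a single criterion, which would capture both alignment and a tendency to score higher.

```python
# Minimal sketch, assuming scores on a 0-10 scale for the same essays
# on one evaluation criterion. All values below are illustrative placeholders.
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x ** 0.5 * var_y ** 0.5)


# Hypothetical professor and model scores for eight essays.
human_scores = [6.0, 7.5, 5.0, 8.0, 6.5, 7.0, 4.5, 9.0]
model_scores = [6.5, 8.0, 5.5, 8.5, 7.0, 7.5, 5.5, 9.5]

r = pearson(human_scores, model_scores)          # alignment with human grading
bias = mean(m - h for m, h in zip(model_scores, human_scores))  # positive => model scores higher

print(f"Pearson r = {r:.2f}, mean model-human difference = {bias:+.2f}")
```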
Paper Citation
in Harvard Style
Albuquerque da Silva D., Ferreira C., Silva S. and Cardoso J. (2025). Analysis of the Effectiveness of LLMs in Handwritten Essay Recognition and Assessment. In Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART; ISBN 978-989-758-737-5, SciTePress, pages 776-785. DOI: 10.5220/0013353700003890
in Bibtex Style
@conference{icaart25,
author={Daisy Albuquerque da Silva and Carlos Ferreira and Sérgio Silva and Juliano Cardoso},
title={Analysis of the Effectiveness of LLMs in Handwritten Essay Recognition and Assessment},
booktitle={Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2025},
pages={776-785},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013353700003890},
isbn={978-989-758-737-5},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 17th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Analysis of the Effectiveness of LLMs in Handwritten Essay Recognition and Assessment
SN - 978-989-758-737-5
AU - Albuquerque da Silva D.
AU - Ferreira C.
AU - Silva S.
AU - Cardoso J.
PY - 2025
SP - 776
EP - 785
DO - 10.5220/0013353700003890
PB - SciTePress
ER -