Analysis of the Effectiveness of Large Language Models in Assessing Argumentative Writing and Generating Feedback
Daisy Albuquerque da Silva, Carlos Eduardo de Mello, Ana Garcia
2024
Abstract
This study examines the use of Large Language Models (LLMs) such as GPT-4 in the evaluation of argumentative writing, particularly opinion articles authored by military school students. It explores the potential of LLMs to provide instant, personalized feedback across different writing stages and assesses their effectiveness compared to human evaluators. The study utilizes a detailed rubric to guide the LLM evaluation, covering competencies from topic choice to bibliographical references. Initial findings suggest that GPT-4 can consistently evaluate technical and structural aspects of writing, offering reliable feedback, especially in the References category. However, its conservative classification approach may underestimate article quality, indicating a need for human oversight. The study also uncovers GPT-4's challenges with nuanced and contextual elements of opinion writing, as evidenced by variability in precision and low recall in recognizing complete works. These findings highlight the evolving role of LLMs as supplementary tools in education, ones that require integration with human judgment to enhance argumentative writing and critical thinking in academic settings.
Paper Citation
in Harvard Style
Albuquerque da Silva D., Eduardo de Mello C. and Garcia A. (2024). Analysis of the Effectiveness of Large Language Models in Assessing Argumentative Writing and Generating Feedback. In Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART; ISBN 978-989-758-680-4, SciTePress, pages 573-582. DOI: 10.5220/0012466600003636
in Bibtex Style
@conference{icaart24,
author={Daisy Albuquerque da Silva and Carlos Eduardo de Mello and Ana Garcia},
title={Analysis of the Effectiveness of Large Language Models in Assessing Argumentative Writing and Generating Feedback},
booktitle={Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART},
year={2024},
pages={573-582},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012466600003636},
isbn={978-989-758-680-4},
}
in EndNote Style
TY - CONF
JO - Proceedings of the 16th International Conference on Agents and Artificial Intelligence - Volume 2: ICAART
TI - Analysis of the Effectiveness of Large Language Models in Assessing Argumentative Writing and Generating Feedback
SN - 978-989-758-680-4
AU - Albuquerque da Silva D.
AU - Eduardo de Mello C.
AU - Garcia A.
PY - 2024
SP - 573
EP - 582
DO - 10.5220/0012466600003636
PB - SciTePress