The Comparison and Analysis of Different Large Language Models in Text Summarization Task

Shengzhi Chen

2023

Abstract

This study evaluates the performance of various large language models on the text summarization task, whose importance is increasingly apparent in applications such as news media, academic research, and business intelligence. The main objective is to compare the models through both quantitative and qualitative methods. Specifically, the author selected a test set from the CNN/DailyMail dataset and used Recall-Oriented Understudy for Gisting Evaluation (ROUGE) and Bidirectional Encoder Representations from Transformers (BERT) Score as evaluation metrics. After fine-tuning the parameters for each model, the author conducted a detailed analysis of their predictive performance, including scoring and evaluation using Generative Pre-trained Transformer (GPT)-4. The experimental results on the CNN/DailyMail dataset show that, without any constraints, the summaries generated by GPT-3.5 perform best in terms of accuracy and completeness but are slightly lacking in conciseness. Summaries generated by Pre-training with Extracted Gap-sentences for Abstractive Summarization (Pegasus)-large are relatively shorter and mostly accurate but occasionally include redundant information. Fine-tuned Language Net Text-To-Text Transfer Transformer (Flan-T5) models produce more concise summaries but fall short in accuracy and completeness. The outcomes of this research not only enrich the empirical understanding of text summarization but also offer guidance for practitioners employing large language models in this task.
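As a rough illustration of the evaluation setup the abstract describes, the sketch below scores model summaries on a CNN/DailyMail test slice with ROUGE and BERTScore. It is a minimal sketch, assuming the Hugging Face datasets, rouge_score, and bert_score Python packages; the summarize() stub is a hypothetical stand-in for whichever model (GPT-3.5, Pegasus-large, or Flan-T5) is being compared, since the paper does not specify its tooling.

# Minimal sketch of a ROUGE + BERTScore evaluation on CNN/DailyMail.
# Assumes the `datasets`, `rouge_score`, and `bert_score` packages;
# summarize() is a hypothetical placeholder, not the paper's code.
from datasets import load_dataset
from rouge_score import rouge_scorer
from bert_score import score as bert_score

# Load a small slice of the CNN/DailyMail test split (version 3.0.0).
test_set = load_dataset("cnn_dailymail", "3.0.0", split="test[:100]")

def summarize(article: str) -> str:
    """Placeholder: swap in the model under comparison here."""
    raise NotImplementedError

references = [ex["highlights"] for ex in test_set]
candidates = [summarize(ex["article"]) for ex in test_set]

# ROUGE measures n-gram overlap between candidate and reference.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                  use_stemmer=True)
rouge_l = [scorer.score(ref, cand)["rougeL"].fmeasure
           for ref, cand in zip(references, candidates)]
print("mean ROUGE-L F1:", sum(rouge_l) / len(rouge_l))

# BERTScore measures semantic similarity via contextual embeddings.
P, R, F1 = bert_score(candidates, references, lang="en")
print("mean BERTScore F1:", F1.mean().item())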


Paper Citation


in Harvard Style

Chen, S. (2023). The Comparison and Analysis of Different Large Language Models in Text Summarization Task. In Proceedings of the 1st International Conference on Data Analysis and Machine Learning - Volume 1: DAML; ISBN 978-989-758-705-4, SciTePress, pages 417-421. DOI: 10.5220/0012799300003885


in BibTeX Style

@conference{daml23,
author={Shengzhi Chen},
title={The Comparison and Analysis of Different Large Language Models in Text Summarization Task},
booktitle={Proceedings of the 1st International Conference on Data Analysis and Machine Learning - Volume 1: DAML},
year={2023},
pages={417--421},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012799300003885},
isbn={978-989-758-705-4},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 1st International Conference on Data Analysis and Machine Learning - Volume 1: DAML
TI - The Comparison and Analysis of Different Large Language Models in Text Summarization Task
SN - 978-989-758-705-4
AU - Chen S.
PY - 2023
SP - 417
EP - 421
DO - 10.5220/0012799300003885
PB - SciTePress
ER -