Authors: Amina Merniz; Anja Habacha Chaibi and Henda Hajjami Ben Ghézala

Affiliation: National School of Computer Science, University of Manouba, Tunisia

Keyword(s): Text Summarization, Multi-document Summarization, PageRank Algorithm, Thematic Annotation.

Abstract: Text summarization reduces one or more documents to their key, most significant sentences. It has long been studied in natural language processing and has improved over the years thanks to a considerable body of methods and research in the area. This paper proposes an approach to Arabic multi-document text summarization. Its originality is that the summary is based on thematic annotation: input documents are analyzed and segmented using LDA. The segments of each topic are then represented by a separate graph, to address the redundancy problem in multi-document summarization. In the last step, the approach applies a modified PageRank algorithm that uses the cosine similarity measure as edge weights. Vertices with high scores are considered essential, so they make up the final summary. To evaluate summarization systems, researchers have developed several metrics, divided into three categories: automatic, semi-automatic, and manual. This study uses automatic evaluation methods for text summarization, mainly the ROUGE measures (ROUGE-1, ROUGE-2, ROUGE-L, and ROUGE-SU4).
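
The ranking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how their PageRank is modified, so the code below uses a generic TextRank-style weighted PageRank, assumes the sentences have already been grouped into one topic (the LDA step is omitted), and uses English toy sentences with whitespace tokenization. The function names `cosine_similarity`, `rank_sentences`, and `summarize_topic` are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a weighted PageRank over a similarity graph.

    Assumption: a generic weighted PageRank, not the paper's modified variant.
    Simplification: vertices with no edges keep only the base score, and
    their rank mass is not redistributed.
    """
    n = len(sentences)
    if n == 0:
        return []
    vectors = [Counter(s.lower().split()) for s in sentences]
    # Edge weight between two distinct sentences is their cosine similarity.
    weights = [
        [cosine_similarity(vectors[i], vectors[j]) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Each neighbour j passes on its score in proportion to w(j, i).
            incoming = sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores


def summarize_topic(sentences, k=1):
    """Return the k highest-scoring sentences, in their original order."""
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

In a pipeline like the one the abstract outlines, a function such as `summarize_topic` would be called once per topic graph and the selected sentences concatenated into the final summary.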

CC BY-NC-ND 4.0

Paper citation in several formats:
Merniz, A.; Chaibi, A. and Ben Ghézala, H. (2021). Multi-document Arabic Text Summarization based on Thematic Annotation. In Proceedings of the 16th International Conference on Software Technologies - ICSOFT; ISBN 978-989-758-523-4; ISSN 2184-2833, SciTePress, pages 639-644. DOI: 10.5220/0010557906390644

@conference{icsoft21,
author={Amina Merniz and Anja Habacha Chaibi and Henda Hajjami {Ben Ghézala}},
title={Multi-document Arabic Text Summarization based on Thematic Annotation},
booktitle={Proceedings of the 16th International Conference on Software Technologies - ICSOFT},
year={2021},
pages={639-644},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0010557906390644},
isbn={978-989-758-523-4},
issn={2184-2833},
}

TY - CONF
JO - Proceedings of the 16th International Conference on Software Technologies - ICSOFT
TI - Multi-document Arabic Text Summarization based on Thematic Annotation
SN - 978-989-758-523-4
IS - 2184-2833
AU - Merniz, A.
AU - Chaibi, A.
AU - Ben Ghézala, H.
PY - 2021
SP - 639
EP - 644
DO - 10.5220/0010557906390644
PB - SciTePress