AUTOMATIC SUMMARIZATION OF ARABIC TEXTS BASED ON RST TECHNIQUE

Mohamed Hédi Mâaloul, Iskandar Keskes, Lamia Hadrich Belguith, Philippe Blache

Abstract

This paper presents a technique for automatically summarizing Arabic texts based on Rhetorical Structure Theory (RST) (Mann and Thompson, 1988). We first describe a corpus study that enabled us to specify, from empirical observations, a set of rhetorical relations and rhetorical frames. We then present our summarization method and, finally, the architecture of the ARSTResume system. The method uses linguistic knowledge and relies on three pillars. The first locates the rhetorical relations between the minimal units of the text by applying rhetorical rules. In each relation, one unit is the nucleus (the segment necessary to maintain coherence) and the other is either a nucleus or a satellite (an optional segment). The second pillar is the representation and simplification of the RST tree, which represents the source text in hierarchical form. The third pillar is the selection of sentences for the final summary, which takes into account the type of the rhetorical relations chosen for the extract.
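The three pillars can be illustrated with a minimal sketch. It is not the ARSTResume implementation: the cue markers, the relation names, the set of retained relations and all function names below are hypothetical placeholders (in English for readability), whereas the actual relations and rhetorical frames come from the corpus study. The tree is also reduced to one level, with each satellite attached to the nucleus that precedes it.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical cue-marker rules: marker at the start of a unit -> relation name.
# These are placeholders, not the rhetorical frames defined in the paper.
RULES = {
    "for example": "exemplification",
    "however": "concession",
    "because": "justification",
}

# Hypothetical choice of relations whose satellites are kept in the extract.
KEPT_RELATIONS = frozenset({"concession"})


@dataclass
class Nucleus:
    """A nucleus unit with the (relation, text) satellites attached to it."""
    text: str
    satellites: List[Tuple[str, str]] = field(default_factory=list)


def find_relation(unit: str) -> Optional[str]:
    """Pillar 1: apply the rules to one minimal unit."""
    lowered = unit.strip().lower()
    for marker, relation in RULES.items():
        if lowered.startswith(marker):
            return relation
    return None


def build_tree(units: List[str]) -> List[Nucleus]:
    """Pillar 2a: attach each satellite to the nucleus that precedes it."""
    tree: List[Nucleus] = []
    for unit in units:
        relation = find_relation(unit)
        if relation is not None and tree:
            tree[-1].satellites.append((relation, unit))
        else:
            # No marker, or nothing to attach to yet: treat the unit as a nucleus.
            tree.append(Nucleus(unit))
    return tree


def simplify(tree: List[Nucleus], kept=KEPT_RELATIONS) -> List[Nucleus]:
    """Pillar 2b: prune satellites whose relation is not retained."""
    return [
        Nucleus(n.text, [(rel, txt) for rel, txt in n.satellites if rel in kept])
        for n in tree
    ]


def select_sentences(tree: List[Nucleus]) -> List[str]:
    """Pillar 3: read the surviving units back in their original order."""
    summary: List[str] = []
    for nucleus in tree:
        summary.append(nucleus.text)
        summary.extend(txt for _, txt in nucleus.satellites)
    return summary


def summarize(units: List[str], kept=KEPT_RELATIONS) -> List[str]:
    return select_sentences(simplify(build_tree(units), kept))
```

Calling `summarize` on a list of already segmented units returns the nuclei plus the satellites whose relation belongs to `kept`; changing `kept` changes which optional segments survive, which is the sense in which the extract depends on the type of rhetorical relation.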

References

  1. Alrahabi, M., 2006. Annotation Sémantique des Énonciations en Arabe, XXIVème Congrès en INFormatique des Organisations et Systèmes d'Information et de décision, Hammamet-Tunisie.
  2. Belguith, L.H., Baccour, L., Mourad, G., 2005. Segmentation de textes arabes basée sur l'analyse contextuelle des signes de ponctuations et de certaines particules. Actes de la 12ème conférence sur le Traitement Automatique des Langues Naturelles TALN'2005, Vol. 1, p: 451-456, Dourdan-France.
  3. Luc, C., 2001. Une typologie des énumérations basée sur les structures rhétoriques et architecturales du texte. TALN, Tours-France.
  4. Mâaloul, M.H., 2007. Al Lakas El'eli: Un système de résumé automatique de documents arabes, IBIMA.
  5. Mann, W.C., Thompson, S.A., 1988. Rhetorical structure theory: Toward a functional theory of text organization. Text, 8(3), p: 243-281.
  6. Mathkour, H.I., Touir, A., Al-Sanie, W., 2008. Parsing Arabic Texts Using Rhetorical Structure Theory, Journal of Computer Science 4 (9), p: 713-720.
  7. Minel, J-L., 2002. Filtrage sémantique : du résumé automatique à la fouille de textes, Paris : Hermès Science Publications.
  8. Teufel, S., Moens, M., 1997. Sentence extraction as a classification task. In Proceedings of the ACL'97/EACL'97 Workshop on Intelligent Scalable Text Summarization, p: 58-65, Madrid-Spain.
  9. Hahn, U., Schauer, H., 2000. Phrases as carriers of coherence relations, In Lila R. Gleitman and Aravind K. Joshi, Proceedings of the 22nd Annual.


Paper Citation


in Harvard Style

Mâaloul M. H., Keskes I., Hadrich Belguith L. and Blache P. (2010). AUTOMATIC SUMMARIZATION OF ARABIC TEXTS BASED ON RST TECHNIQUE. In Proceedings of the 12th International Conference on Enterprise Information Systems - Volume 2: ICEIS, ISBN 978-989-8425-05-8, pages 434-437. DOI: 10.5220/0002976104340437


in Bibtex Style

@conference{iceis10,
author={Mohamed Hédi Mâaloul and Iskandar Keskes and Lamia Hadrich Belguith and Philippe Blache},
title={AUTOMATIC SUMMARIZATION OF ARABIC TEXTS BASED ON RST TECHNIQUE},
booktitle={Proceedings of the 12th International Conference on Enterprise Information Systems - Volume 2: ICEIS},
year={2010},
pages={434-437},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0002976104340437},
isbn={978-989-8425-05-8},
}


in EndNote Style

TY - CONF
JO - Proceedings of the 12th International Conference on Enterprise Information Systems - Volume 2: ICEIS
TI - AUTOMATIC SUMMARIZATION OF ARABIC TEXTS BASED ON RST TECHNIQUE
SN - 978-989-8425-05-8
AU - Mâaloul M. H.
AU - Keskes I.
AU - Hadrich Belguith L.
AU - Blache P.
PY - 2010
SP - 434
EP - 437
DO - 10.5220/0002976104340437