Authors: Karima Meftouh 1 ; Kamel Smaili 2 and Mohamed Tayeb Laskri 1

Affiliations: 1 Badji Mokhtar University, Algeria ; 2 INRIA-LORIA, France

Keyword(s): Statistical language modeling, Arabic, French, Smoothing technique, n-gram model, Vocabulary, Perplexity, Performance.

Related Ontology Subjects/Areas/Topics: Applications ; Artificial Intelligence ; Knowledge Engineering and Ontology Development ; Knowledge-Based Systems ; Natural Language Processing ; Pattern Recognition ; Symbolic Systems

Abstract: In this paper, we present a comparative study of statistical language models for Arabic and French. The objective is to understand how best to model each of the two languages. Several experiments using different smoothing techniques have been carried out. For French, trigram models are the most appropriate regardless of the smoothing technique used. For Arabic, higher-order n-gram models smoothed with the Witten-Bell method are more efficient. The tests use corpora and vocabularies of comparable size.
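
Illustrative sketch (not taken from the paper): the abstract refers to n-gram models, Witten-Bell smoothing and perplexity without showing how they fit together. The Python below is a minimal example of an interpolated Witten-Bell bigram model and its perplexity. It assumes a closed vocabulary of known size and a bigram order chosen only for brevity, and it backs off to a unigram distribution that is itself interpolated with a uniform distribution. The function names `train_bigram`, `witten_bell_prob` and `perplexity` are hypothetical and do not come from the authors or from any toolkit.

```python
import math
from collections import Counter, defaultdict


def train_bigram(sentences):
    """Collect unigram counts and, for each history, counts of following words."""
    unigrams = Counter()
    followers = defaultdict(Counter)
    for sent in sentences:
        tokens = ["<s>"] + list(sent) + ["</s>"]
        unigrams.update(tokens[1:])  # count predicted tokens only (never "<s>")
        for prev, word in zip(tokens, tokens[1:]):
            followers[prev][word] += 1
    return unigrams, followers


def witten_bell_prob(word, prev, unigrams, followers, vocab_size):
    """Interpolated Witten-Bell estimate of P(word | prev).

    Assumption: vocab_size counts every predictable type, including "</s>".
    """
    total = sum(unigrams.values())
    seen_types = len(unigrams)
    # Unigram level: Witten-Bell interpolation with a uniform distribution.
    p_uni = (unigrams[word] + seen_types / vocab_size) / (total + seen_types)
    hist = followers.get(prev)
    if not hist:
        return p_uni  # unseen history: fall back to the unigram estimate
    hist_count = sum(hist.values())  # c(prev)
    hist_types = len(hist)           # T(prev): distinct types seen after prev
    return (hist[word] + hist_types * p_uni) / (hist_count + hist_types)


def perplexity(sentences, unigrams, followers, vocab_size):
    """Per-token perplexity of the smoothed bigram model on the given sentences."""
    log_sum, n_tokens = 0.0, 0
    for sent in sentences:
        tokens = ["<s>"] + list(sent) + ["</s>"]
        for prev, word in zip(tokens, tokens[1:]):
            p = witten_bell_prob(word, prev, unigrams, followers, vocab_size)
            log_sum += math.log2(p)
            n_tokens += 1
    return 2 ** (-log_sum / n_tokens)


if __name__ == "__main__":
    # Toy data for illustration only; it has no relation to the paper's corpora.
    train = [["a", "b"], ["a", "c"]]
    unigrams, followers = train_bigram(train)
    vocab_size = 4  # a, b, c, </s>
    print(witten_bell_prob("b", "a", unigrams, followers, vocab_size))
    print(perplexity([["a", "b"]], unigrams, followers, vocab_size))
```

Under these assumptions, the mass reserved for unseen continuations of a history grows with the number of distinct word types observed after it. Comparing orders or smoothing methods then comes down to computing `perplexity` on the same held-out text for each configuration.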

CC BY-NC-ND 4.0

Paper citation in several formats:
Meftouh, K.; Smaili, K. and Tayeb Laskri, M. (2009). COMPARATIVE STUDY OF ARABIC AND FRENCH STATISTICAL LANGUAGE MODELS. In Proceedings of the International Conference on Agents and Artificial Intelligence - ICAART; ISBN 978-989-8111-66-1; ISSN 2184-433X, SciTePress, pages 156-160. DOI: 10.5220/0001537501560160

@conference{icaart09,
author={Karima Meftouh and Kamel Smaili and Mohamed {Tayeb Laskri}},
title={COMPARATIVE STUDY OF ARABIC AND FRENCH STATISTICAL LANGUAGE MODELS},
booktitle={Proceedings of the International Conference on Agents and Artificial Intelligence - ICAART},
year={2009},
pages={156--160},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0001537501560160},
isbn={978-989-8111-66-1},
issn={2184-433X},
}

TY - CONF

JO - Proceedings of the International Conference on Agents and Artificial Intelligence - ICAART
TI - COMPARATIVE STUDY OF ARABIC AND FRENCH STATISTICAL LANGUAGE MODELS
SN - 978-989-8111-66-1
IS - 2184-433X
AU - Meftouh, K.
AU - Smaili, K.
AU - Tayeb Laskri, M.
PY - 2009
SP - 156
EP - 160
DO - 10.5220/0001537501560160
PB - SciTePress