Authors:
Bilel Elayeb 1,2; Myriam Bounhas 3 and Mohamed Ettih 4
Affiliations:
1 L@bISEN, ISEN de Nantes, Yncréa Ouest, France
2 RIADI Research Laboratory, ENSI, Manouba University, Tunisia
3 LARODEC Research Laboratory, ISG Tunis, Tunis University, Tunisia
4 Université Paris-Est Créteil, Paris 12 Val de Marne, France
Keyword(s):
Morphological Disambiguation, Analogical Proportions, Machine Learning Algorithms, Deep Learning Algorithms, Feature Selection, Classification.
Abstract:
The Arabic language is known for its complexity, which encompasses extensive morphological and orthographic variations, as well as significant syntactic and semantic diversity. These characteristics often result in morphological ambiguity. In this paper, we tackle the challenge of morphological disambiguation in Arabic texts. We frame this task as a classification problem, where the possible values of morphological features represent the classes, and a classification algorithm assigns the appropriate class to each word based on its context. Specifically, we investigate the effectiveness of an analogy-based classifier for morphological disambiguation in Arabic texts. Analogical Proportions (AP) are statements that express the relationship between four elements A, B, C, and D such that "A differs from B as C differs from D". Leveraging AP-based inference, the AP classifier predicts the fourth, unknown element (D), given that the first three (A, B, and C) are known. We evaluate this analogical classifier using a corpus of Classical Arabic texts. With an average disambiguation rate of 74.80%, the AP classifier outperforms a set of well-established machine learning and deep learning classifiers.
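The inference step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes nominal (categorical) features, uses the standard definition under which a:b::c:d holds when (a = b and c = d) or (a = c and b = d), and predicts by majority vote over all training triples that form a proportion with the query on every feature. The function names (`holds`, `solve`, `ap_classify`), the two context features, and the toy data are hypothetical.

```python
from collections import Counter
from itertools import permutations


def holds(a, b, c, d):
    """True if the nominal analogical proportion a:b::c:d holds."""
    return (a == b and c == d) or (a == c and b == d)


def solve(a, b, c):
    """Return x such that a:b::c:x holds, or None if no solution exists."""
    if a == b:
        return c
    if a == c:
        return b
    return None


def ap_classify(train, query):
    """Predict the class of `query` by voting over analogical triples.

    `train` is a list of (features, label) pairs; `query` is a feature tuple.
    A triple (A, B, C) votes when the class equation is solvable and the
    proportion A:B::C:query holds on every feature.
    """
    votes = Counter()
    for (fa, la), (fb, lb), (fc, lc) in permutations(train, 3):
        label = solve(la, lb, lc)
        if label is None:
            continue  # class equation has no solution for this triple
        if all(holds(x, y, z, t) for x, y, z, t in zip(fa, fb, fc, query)):
            votes[label] += 1
    return votes.most_common(1)[0][0] if votes else None


# Hypothetical toy data: (tag of previous word, prefix) -> part of speech.
train = [
    (("particle", "al"), "noun"),
    (("particle", "ya"), "verb"),
    (("pronoun", "al"), "noun"),
]
print(ap_classify(train, ("pronoun", "ya")))  # verb
```

Enumerating every ordered triple costs cubic time in the size of the training set, so a sketch like this is only practical on small samples; any restriction of the candidate triples would be a further design choice that the abstract does not describe.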